COMPASS: Fit the COMPASS Model


View source: R/COMPASS.R

Description

This function fits the COMPASS model.

Usage

COMPASS(
  data,
  treatment,
  control,
  subset = NULL,
  category_filter = function(x) colSums(x > 5) > 2,
  filter_lowest_frequency = 0,
  filter_specific_markers = NULL,
  model = "discrete",
  iterations = 40000,
  replications = 8,
  keep_original_data = FALSE,
  verbose = TRUE,
  dropDegreeOne = FALSE,
  init_with_fisher = FALSE,
  run_model_or_return_data = "run_model",
  ...
)

Arguments

data

An object of class COMPASSContainer.

treatment

An R expression, evaluated within the metadata, that returns TRUE for those samples that should belong to the treatment group. For example, if the samples that received a positive stimulation were named "92TH023 Env" in a metadata variable called Stim, you could write Stim == "92TH023 Env". The expression should have the name of the stimulation vector on the left-hand side.

control

An R expression, evaluated within the metadata, that returns TRUE for those samples that should belong to the control group. See above for details.

subset

An expression used to subset the data. We keep only the samples for which the expression evaluates to TRUE in the metadata.

category_filter

A filter for the categories that are generated. This is a function that will be applied to the treatment counts matrix generated from the intensities. Only categories meeting the category_filter criteria will be kept.

filter_lowest_frequency

A number specifying how many of the least expressed markers should be removed.

filter_specific_markers

Similar to filter_lowest_frequency, but lets you explicitly exclude markers.

model

A string denoting which model to fit; currently, only the discrete model ("discrete") is available.

iterations

The number of iterations (per 'replication') to perform.

replications

The number of 'replications' to perform. In order to conserve memory, we only keep the model estimates from the last replication.

keep_original_data

Keep the original COMPASSContainer as part of the COMPASS output? If memory or disk space is an issue, you may set this to FALSE.

verbose

Boolean; if TRUE we output progress information.

dropDegreeOne

Boolean; if TRUE we drop degree one categories and merge them with the negative subset.

init_with_fisher

Boolean; if TRUE, initialize from Fisher's exact test. Any subset and subject with lower 95 ... Otherwise, initialize every subject and subset as a responder except those where ps <= pu.

run_model_or_return_data

Character; defaults to "run_model". Set it to "return_data" to skip the model fit and return only the data set needed for modeling. This is useful for extracting the boolean counts.

...

Other arguments; currently unused.
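The treatment, control and subset arguments are written as bare expressions that refer to metadata columns. The following is a minimal base R sketch of how such an expression can be captured and evaluated against a metadata data frame. It is not taken from the COMPASS source: the helper select_samples, the PTID column and the "negctrl" label are all hypothetical and serve only to illustrate the idea.

```r
# Toy metadata; the column names and the "negctrl" label are illustrative only.
meta <- data.frame(
  PTID = c("p1", "p1", "p2", "p2"),
  Stim = c("92TH023 Env", "negctrl", "92TH023 Env", "negctrl"),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of COMPASS): capture the expression
# unevaluated, then evaluate it with the metadata columns in scope,
# much as base R's subset() does.
select_samples <- function(meta, expr) {
  keep <- eval(substitute(expr), meta, parent.frame())
  stopifnot(is.logical(keep), length(keep) == nrow(meta))
  keep
}

is_treatment <- select_samples(meta, Stim == "92TH023 Env")
is_control   <- select_samples(meta, Stim == "negctrl")

meta[is_treatment, ]  # rows that would form the treatment group
meta[is_control, ]    # rows that would form the control group
```

Because the expression is evaluated inside the metadata, Stim does not need to exist as a variable in the calling environment.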

Value

A COMPASSResult is a list with the following components:

fit

A list of various fitted parameters resulting from the COMPASS model fitting procedure.

data

The data used as input to the COMPASS fitting procedure – in particular, the counts matrices generated for the selected categories, n_s and n_u, can be extracted from here.

orig

If keep_original_data was set to TRUE in the COMPASS fit, then this will be the COMPASSContainer passed in. This is primarily kept for easier running of the Shiny app.

The fit component is a list with the following components:

alpha_s

The hyperparameter shared across all subjects under the stimulated condition. It is updated through the COMPASS model fitting process.

A_alphas

The acceptance rate of alpha_s, as computed through the MCMC sampling process in COMPASS.

alpha_u

The hyperparameter shared across all subjects under the unstimulated condition. It is updated through the COMPASS model fitting process.

A_alphau

The acceptance rate of alpha_u, as computed through the MCMC sampling process in COMPASS.

gamma

An array of dimensions I x K x T, where I denotes the number of individuals, K denotes the number of categories / subsets, and T denotes the number of iterations. Each cell in a matrix for a given iteration is either zero or one, reflecting whether individual i is responding to the stimulation for subset k.

mean_gamma

A matrix of mean response rates. Each cell denotes the mean response of individual i and subset k.

A_gamma

The acceptance rate for the gamma. Each element corresponds to the number of times an individual's gamma vector was updated.

categories

The category matrix, showing which categories entered the model.

model

The type of model called.

posterior

Posterior measures from the sample fit.

call

The matched call used to generate the model fit.
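To make the relationship between gamma and mean_gamma concrete, here is a small base R sketch using a simulated I x K x T indicator array. It assumes that the mean is taken over all stored iterations; whether COMPASS applies any burn-in or thinning before averaging is not stated here, so treat this as an illustration of the shapes involved rather than the package's exact computation.

```r
set.seed(1)

n_ind  <- 4    # I: individuals
n_cat  <- 3    # K: categories / subsets
n_iter <- 200  # T: stored iterations

# Simulated 0/1 response indicators standing in for fit$gamma.
gamma <- array(
  rbinom(n_ind * n_cat * n_iter, size = 1, prob = 0.3),
  dim = c(n_ind, n_cat, n_iter)
)

# Average over the iteration dimension: one value per individual and subset.
mean_gamma <- apply(gamma, c(1, 2), mean)

dim(mean_gamma)  # n_ind rows by n_cat columns
```

Each entry of the resulting matrix lies between 0 and 1 and can be read as the fraction of iterations in which individual i was flagged as responding for subset k.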

The data component is a list with the following components:

n_s

The counts matrix for stimulated samples.

n_u

The counts matrix for unstimulated samples.

counts_s

Total cell counts for stimulated samples.

counts_u

Total cell counts for unstimulated samples.

categories

The categories matrix used to define which categories will enter the model.

meta

The metadata. Note that only individual-level metadata will be kept; sample-specific metadata is dropped.

sample_id

The name of the vector in the metadata used to identify the samples.

individual_id

The name of the vector in the metadata used to identify the individuals.

The orig component (included if keep_original_data is TRUE) is the COMPASSContainer object used in the model fit.
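The counts matrices and total cell counts in the data component can be combined into per-category proportions. The sketch below uses made-up numbers and assumes that ps and pu (as mentioned for init_with_fisher) denote the stimulated and unstimulated proportions n_s / counts_s and n_u / counts_u; that reading is an assumption, and the category names are illustrative.

```r
# Toy counts: 2 individuals (rows) by 3 categories (columns).
n_s <- matrix(c(12, 3,  985,
                 5, 0,  995),
              nrow = 2, byrow = TRUE,
              dimnames = list(c("p1", "p2"), c("A", "B", "neg")))
n_u <- matrix(c( 2, 3, 1995,
                10, 0,  990),
              nrow = 2, byrow = TRUE,
              dimnames = list(c("p1", "p2"), c("A", "B", "neg")))

# Total cells per individual, stimulated and unstimulated.
counts_s <- c(p1 = 1000, p2 = 1000)
counts_u <- c(p1 = 2000, p2 = 1000)

# A vector with one entry per row recycles down the columns,
# so row i is divided by the total for individual i.
p_s <- n_s / counts_s
p_u <- n_u / counts_u

# TRUE where the stimulated proportion does not exceed the unstimulated one.
no_excess <- p_s <= p_u
no_excess[, c("A", "B")]
```

With a fitted object, the same arithmetic would start from the n_s, n_u, counts_s and counts_u components instead of the toy values.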

Category Filter

The category filter is used to exclude categories (combinations of markers expressed by a particular cell) that are expressed very rarely. It is applied to the treatment counts matrix, which is an N samples by K categories matrix. Categories that are mostly unexpressed can be excluded here. For example, the default criterion,

category_filter=function(x) colSums(x > 5) > 2

indicates that we should only retain categories for which at least three samples had at least six cells expressing that particular combination of markers.
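The effect of the default filter can be checked on a small matrix in base R. The counts and category names below are invented for illustration; only the filter function itself comes from the Usage section.

```r
# Toy treatment counts matrix: 5 samples (rows) by 3 categories (columns).
counts <- matrix(
  c(10, 0, 6,
     8, 1, 7,
     9, 0, 2,
     0, 6, 3,
     7, 0, 5),
  nrow = 5, byrow = TRUE,
  dimnames = list(paste0("sample", 1:5), c("IFNg", "IL2", "IFNg&IL2"))
)

# Default filter: keep a category if more than 2 samples have more than 5 cells.
category_filter <- function(x) colSums(x > 5) > 2

keep <- category_filter(counts)
keep

# Retain only the columns that pass the filter.
filtered <- counts[, keep, drop = FALSE]
```

Here only the first category passes: four samples exceed five cells. The third category has exactly two samples above the threshold, which is not enough, since the comparison is strict on both counts.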

See Also

Examples

data(COMPASS) ## loads the COMPASSContainer 'CC'
fit <- COMPASS(CC,
  category_filter=NULL,
  treatment=trt == "Treatment",
  control=trt == "Control",
  verbose=FALSE,
  iterations=100 ## set higher for a real analysis
)

RGLab/COMPASS documentation built on Feb. 11, 2021, 3:23 p.m.