In drisso/zinbwave: Zero-Inflated Negative Binomial Model for RNA-Seq Data

zinbsurf

R Documentation

Perform dimensionality reduction using a ZINB regression model for large datasets.

Description

Given a data object, zinbsurf fits a ZINB regression model with gene- and cell-level covariates on a random subset of the data to perform dimensionality reduction. It then projects the remaining data onto the resulting lower-dimensional space.

Usage

zinbsurf(Y, ...)

## S4 method for signature 'SummarizedExperiment'
zinbsurf(
  Y,
  X,
  V,
  K,
  which_assay,
  which_genes,
  zeroinflation = TRUE,
  prop_fit = 0.1,
  BPPARAM = BiocParallel::bpparam(),
  verbose = FALSE,
  ...
)

Arguments

`Y`	The data (genes in rows, samples in columns). Currently implemented only for `SummarizedExperiment`.
`...`	Additional parameters to describe the model, see `zinbModel`.
`X`	The design matrix containing sample-level covariates, one sample per row. If missing, X will contain only an intercept. If Y is a SummarizedExperiment object, X can be a formula using the variables in the colData slot of Y.
`V`	The design matrix containing gene-level covariates, one gene per row. If missing, V will contain only an intercept. If Y is a SummarizedExperiment object, V can be a formula using the variables in the rowData slot of Y.
`K`	integer. Number of latent factors. Specify `K = 0` if only computing observational weights.
`which_assay`	numeric or character. Which assay of Y to use. If missing, the assay named "counts" is used when `assayNames(Y)` contains it; otherwise, the first assay is used.
`which_genes`	character. Which genes to use to estimate W (see details). Ignored if `fitted_model` is provided.
`zeroinflation`	Whether or not a ZINB model should be fitted. If FALSE, a negative binomial model is fitted instead.
`prop_fit`	numeric between 0 and 1. The proportion of cells to use for the zinbwave fit.
`BPPARAM`	object of class `BiocParallelParam` that specifies the back-end to be used for computations. See `bpparam` for details.
`verbose`	logical. Whether to print progress messages.

Details

This function implements an approximate strategy in which the full zinbwave model is fit only on a random subset of the samples, whose size is controlled by the `prop_fit` parameter. The remaining samples are subsequently projected onto the low-rank space. This strategy is much faster and uses less memory than the full zinbwave method, and is recommended for extremely large datasets.
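
The subset-then-project idea can be illustrated with a small base-R sketch. This is not zinbwave's implementation: PCA on log-transformed counts (`prcomp`) stands in for the ZINB fit, and `predict` stands in for the projection step. The names `counts`, `fit_idx`, `W_fit`, and `W_rest` are made up for the illustration, and rounding the subset size up is an assumption.

```r
set.seed(1)
# Toy count matrix: 10 genes (rows) x 20 cells (columns)
counts <- matrix(rpois(10 * 20, lambda = 5), nrow = 10, ncol = 20)

prop_fit <- 0.25  # proportion of cells used for the fit
K <- 2            # number of latent dimensions to keep

# Step 1: draw a random subset of cells (rounding up is an assumption)
n_fit <- ceiling(prop_fit * ncol(counts))
fit_idx <- sample(ncol(counts), n_fit)

# Step 2: fit the low-dimensional model on the subset only.
# Stand-in technique: PCA on log1p counts, with cells as rows.
logc <- log1p(counts)
pca <- prcomp(t(logc[, fit_idx, drop = FALSE]))
W_fit <- pca$x[, seq_len(K), drop = FALSE]

# Step 3: project the held-out cells onto the same K-dimensional space
held_out <- t(logc[, -fit_idx, drop = FALSE])
W_rest <- predict(pca, newdata = held_out)[, seq_len(K), drop = FALSE]

dim(W_fit)   # cells used for the fit x K
dim(W_rest)  # projected cells x K
```

With 20 cells and `prop_fit = 0.25`, 5 cells are used for the fit and the other 15 are projected. The expensive model fit therefore only ever sees a quarter of the data in this toy case.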

By default, zinbsurf uses all genes to estimate W. However, we recommend using the top 1,000 most variable genes for this step. More generally, a user can specify any custom set of genes to estimate W, either as a vector of gene names or as a single character string naming a column of the rowData.
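
Selecting the most variable genes might look like the following base-R sketch. It assumes the variance of log-transformed counts as the variability measure, which is one common choice and not one prescribed by this page. The names `counts`, `n_top`, and `top_genes` are illustrative, and the toy matrix keeps 10 genes rather than 1,000.

```r
set.seed(1)
# Toy count matrix: 50 genes x 12 cells, with gene names as row names
counts <- matrix(rpois(50 * 12, lambda = 5), nrow = 50, ncol = 12,
                 dimnames = list(paste0("gene", 1:50), paste0("cell", 1:12)))

# Per-gene variance of log1p counts (assumed variability measure)
gene_var <- apply(log1p(counts), 1, var)

# Names of the n_top most variable genes (use 1000 on real data)
n_top <- 10
top_genes <- names(sort(gene_var, decreasing = TRUE))[seq_len(n_top)]
top_genes
```

The resulting character vector could then be supplied as `which_genes = top_genes`, presumably with names that match the row names of `Y`.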

Value

An object of class SingleCellExperiment; the dimensionality-reduced matrix is stored in the reducedDims slot.

Methods (by class)

zinbsurf(SummarizedExperiment): Y is a SummarizedExperiment.

Examples

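library(zinbwave)              # provides zinbsurf()
library(SingleCellExperiment)  # provides the SingleCellExperiment() constructor

# Toy data: 10 genes x 6 samples, with a two-level covariate `bio`
se <- SingleCellExperiment(assays = list(counts = matrix(rpois(60, lambda=5),
                                                         nrow=10, ncol=6)),
                           colData = data.frame(bio = gl(2, 3)))
colnames(se) <- paste0("sample", 1:6)
# Fit on half of the samples (prop_fit = .5) with one latent factor (K = 1)
m <- zinbsurf(se, X="~bio", K = 1, prop_fit = .5, which_assay = 1,
              BPPARAM=BiocParallel::SerialParam())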

drisso/zinbwave documentation built on March 18, 2024, 5:13 p.m.