applyMultiSCE: Apply function over multiple SingleCellExperiments

View source: R/applyMultiSCE.R

applyMultiSCER Documentation

Apply function over multiple SingleCellExperiments

Description

A generalization of applySCE to apply a function to corresponding parts of multiple SingleCellExperiments, each of which have one or more alternative Experiments.

Usage

applyMultiSCE(
  ...,
  FUN,
  WHICH = NULL,
  COMMON.ARGS = list(),
  MAIN.ARGS = list(),
  ALT.ARGS = list(),
  SIMPLIFY = TRUE
)

Arguments

...

One or more SingleCellExperiment objects containing counts and size factors. Each object should contain the same number of rows, corresponding to the same genes in the same order.

If multiple objects are supplied, each object is assumed to contain all and only cells from a single batch. If a single object is supplied, batch should also be specified.

Alternatively, one or more lists of SingleCellExperiments can be provided; this is flattened as if the objects inside were passed directly to ....

FUN

Any function that accepts multiple SummarizedExperiment or SingleCellExperiment objects.

WHICH

A character or integer vector containing the names or positions of alternative Experiments to loop over. Defaults to all alternative Experiments that are present in each SingleCellExperiment in ....

COMMON.ARGS

Further (named) arguments to pass to all calls to FUN.

MAIN.ARGS

A named list of arguments to pass to calls to FUN involving the main Experiment(s) only. Alternatively NULL, in which case the function is not applied to the main Experiment.

ALT.ARGS

A named list where each entry is named after an alternative Experiment and contains named arguments to use in FUN for that Experiment.

SIMPLIFY

Logical scalar indicating whether the output should be simplified to one or more SingleCellExperiments.

Details

This function is a generalization of applySCE whereby corresponding Experiments from ... are passed to FUN. To illustrate, if we passed objects x, y and z in ...:

  1. We first call FUN on the set of all main Experiments from ..., obtaining a result equivalent to FUN(x, y, z) (more on the other arguments later).

  2. Then we call FUN on the set of all first alternative Experiments. This is equivalent to FUN(altExp(x), altExp(y), altExp(z)).

  3. Then we call FUN on the set of all second alternative Experiments. This is equivalent to FUN(altExp(x, 2), altExp(y, 2), altExp(z, 2)).

  4. And so on.

In effect, much like applySCE is analogous to lapply, applyMultiSCE is analogous to mapply. This allows users to easily apply the same function to all the Experiments (main and alternative) in a list of SingleCellExperiment objects.

Arguments in COMMON.ARGS (plus some extra arguments, see below) are passed to all calls to FUN. Arguments in MAIN.ARGS are only used in the call to FUN on the main Experiments. Arguments in ALT.ARGS are passed to the call to FUN on the alternative Experiments of the same name. For the last two, any arguments therein will override arguments of the same name in COMMON.ARGS.

Arguments in ... that are not SingleCellExperiments are actually treated as additional arguments for COMMON.ARGS. This is purely intended as a user convenience, to avoid the need to write COMMON.ARGS=list() when specifying these arguments. However, explicitly using COMMON.ARGS is the safer approach and recommended for developers.

By default, looping is performed over alternative Experiments with names that are present across all entries of .... Values of WHICH should be unique if any simplification of the output is desired. If MAIN.ARGS=NULL, the main Experiment is ignored and the function is only applied to the alternative Experiments.

The default of SIMPLIFY=TRUE is aims to make the output easier to manipulate. If FUN returns a SingleCellExperiment, the outputs across main and alternative Experiments are simplified into a single SingleCellExperiment. If FUN returns a list of SingleCellExperiments of the same length, the outputs are simplified into one list of SingleCellExperiments. This assumes that WHICH contains no more than one reference to each alternative Experiment in x.

Value

In most cases or when SIMPLIFY=FALSE, a list is returned containing the output of FUN applied to each corresponding Experiment across all .... If MAIN.ARGS is not NULL, the first entry corresponds to the result generated from the main Experiments; all other results are generated according to the entries specified in WHICH and are named accordingly.

If SIMPLIFY=TRUE and certain conditions are fulfilled, we can either return:

  • A single SingleCellExperiment, if all calls to FUN return a SingleCellExperiment. Here, the results of FUN on the main/alternative Experiments in ... are mapped to the main or alternative Experiments of the same name in the output.

  • A list of SingleCellExperiments, if all calls to FUN return a list of SingleCellExperiments of the same length. The altExps of each output SingleCellExperiment contains the results from the corresponding call to FUN on the alternative Experiments of the same name in ....

In both cases, the aim is to mirror the organization of Experiments in each entry of ....

Author(s)

Aaron Lun

See Also

applySCE, for the simpler version that involves only one SingleCellExperiment object.

simplifyToSCE, for the conditions required for simplification.

Examples

# Setting up some objects with alternative Experiments.
d1 <- matrix(rnbinom(50000, mu=10, size=1), ncol=100)
sce1 <- SingleCellExperiment(list(counts=d1))
sizeFactors(sce1) <- runif(ncol(d1))
altExp(sce1, "Spike") <- sce1
altExp(sce1, "Protein") <- sce1

d2 <- matrix(rnbinom(20000, mu=50, size=1), ncol=40)
sce2 <- SingleCellExperiment(list(counts=d2))
sizeFactors(sce2) <- runif(ncol(d2))
altExp(sce2, "Spike") <- sce2
altExp(sce2, "Protein") <- sce2

# Applying a function over the main and alternative experiments.
normed <- applyMultiSCE(sce1, sce2, FUN=multiBatchNorm)
normed
altExp(normed[[1]]) # contains log-normalized values

regressed <- applyMultiSCE(normed, FUN=regressBatches)
regressed
altExp(regressed) # contains corrected expression values

rescaled <- applyMultiSCE(normed, FUN=rescaleBatches)
rescaled
altExp(rescaled) # contains corrected expression values

# We can also specify 'batch=' directly.
combined <- cbind(sce1, sce2)
batch <- rep(1:2, c(ncol(sce1), ncol(sce2)))

normed <- applyMultiSCE(combined, batch=batch, FUN=multiBatchNorm)
normed
altExp(normed) # contains log-normalized values

regressed <- applyMultiSCE(normed, batch=batch, FUN=regressBatches)
regressed
altExp(regressed) # contains corrected expression values

rescaled <- applyMultiSCE(normed, batch=batch, FUN=rescaleBatches)
rescaled
altExp(rescaled) # contains corrected expression values


LTLA/batchelor documentation built on July 10, 2024, 9:09 p.m.