simulate_replicates: Computes 'BEARscc' simulated technical replicates.


View source: R/generate_noise_matrices.R

Description

Computes BEARscc simulated technical replicates from the noise parameters previously estimated with the function estimate_noiseparameters().

Usage

simulate_replicates(SCEList, max_cumprob=0.9999, n = 3)

Arguments

SCEList

A SingleCellExperiment object that has been processed by estimate_noiseparameters(), which adds the parameters describing the noise model for drop-outs and variance in the single cell experiment.

max_cumprob

Because a cumulative distribution ranges from n=0 to a countable infinity, the event space must be truncated so that it covers a reasonable fraction of the probability density. This parameter sets the fraction of probability density covered by the event space, which in turn defines the highest count in the event space. We recommend the default value of 0.9999. However, if the default was changed in estimate_noiseparameters(), the value used in that function must also be used here.
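The effect of a cumulative probability cutoff on the event space can be sketched in base R. This is an illustration only: a Poisson distribution with an arbitrary mean (lambda, a hypothetical value) stands in for the count distribution, and the code is not taken from the BEARscc source.

```r
# Illustration only: a Poisson distribution stands in for the noise model.
max_cumprob <- 0.9999
lambda <- 5  # hypothetical mean count, chosen for the example

# Smallest count whose cumulative probability reaches max_cumprob
max_count <- qpois(max_cumprob, lambda)

# The truncated event space runs from 0 to that count
event_space <- 0:max_count

# Fraction of the total probability covered by the truncated event space
covered <- sum(dpois(event_space, lambda))
```

Raising max_cumprob lengthens event_space; lowering it discards more of the upper tail.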

n

The number of simulated technical replicates to generate.

Details

In the second step of BEARscc, the algorithm applies the model from the first step to produce simulated technical replicates. For every observed gene count in the range where drop-outs occurred among the spike-ins, BEARscc assesses whether to convert the count to zero (using the drop-out injection distribution). For observations where the count is zero, the drop-out recovery distribution is used to estimate a new value, based on the overall drop-out frequency for that gene. After this drop-out processing, all non-zero counts are substituted with a value generated by the model of expression variance created in the first step, parameterized to the observed counts for each gene. This second step is repeated n times (as set by the parameter n) to generate a collection of simulated technical replicates for downstream analysis.
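The three stages described above can be sketched with a toy example in base R. Everything here is an assumption made for illustration: the function toy_replicate(), its parameters (dropout_cutoff, p_inject, p_recover), the fixed probabilities, and the Poisson resampling are hypothetical stand-ins, not the distributions that BEARscc estimates from spike-ins.

```r
set.seed(1)

# Toy observed counts: genes in rows, cells in columns
observed <- matrix(c(0, 3, 12, 0, 1, 40, 2, 0, 25), nrow = 3,
                   dimnames = list(paste0("gene", 1:3), paste0("cell", 1:3)))

# Hypothetical helper; all parameters are illustrative assumptions
toy_replicate <- function(counts, dropout_cutoff = 5,
                          p_inject = 0.2, p_recover = 0.1) {
  sim <- counts
  # 1. Drop-out injection: low non-zero counts may be set to zero
  low <- counts > 0 & counts < dropout_cutoff
  sim[low] <- ifelse(runif(sum(low)) < p_inject, 0, counts[low])
  # 2. Drop-out recovery: observed zeros may receive a small positive count
  zero <- counts == 0
  recovered <- sample(seq_len(dropout_cutoff - 1), sum(zero), replace = TRUE)
  sim[zero] <- ifelse(runif(sum(zero)) < p_recover, recovered, 0)
  # 3. Variance model: resample every remaining non-zero count
  nonzero <- sim > 0
  sim[nonzero] <- rpois(sum(nonzero), lambda = sim[nonzero])
  sim
}

# Repeat the procedure n times to obtain n toy replicates
n <- 3
toy_replicates <- lapply(seq_len(n), function(i) toy_replicate(observed))
```

Each element of toy_replicates has the same dimensions as observed, mirroring the way each simulated replicate has the same shape as the observed counts.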

Value

The result is a list of counts data that is added to the metadata of the SingleCellExperiment object as a list named "simulated_replicates". Each element of the list is a data.frame of counts representing one BEARscc simulated technical replicate. For example, with n=10 the list would be:

[,1] Counts data.frame of simulated replicate 1.
[,2] Counts data.frame of simulated replicate 2.
[,3] Counts data.frame of simulated replicate 3.
[,4] Counts data.frame of simulated replicate 4.
[,5] Counts data.frame of simulated replicate 5.
[,6] Counts data.frame of simulated replicate 6.
[,7] Counts data.frame of simulated replicate 7.
[,8] Counts data.frame of simulated replicate 8.
[,9] Counts data.frame of simulated replicate 9.
[,10] Counts data.frame of simulated replicate 10.
[,11] Counts data.frame of observed data.
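The layout above (n simulated replicates followed by the observed data) can be mimicked with a plain list to show how the elements might be separated. This is a mock written for illustration: the data are invented, and on a real object the list would instead be retrieved with metadata(sce)$simulated_replicates.

```r
# Mock of the list layout; the values are made up for illustration
n <- 3
observed <- data.frame(cell1 = c(0, 3), cell2 = c(5, 0),
                       row.names = c("geneA", "geneB"))

# Stand-in for metadata(sce)$simulated_replicates:
# n fake replicates followed by the observed counts
simulated_replicates <- c(
  lapply(seq_len(n), function(i) observed + i),
  list(observed)
)

# The first n elements are the simulated replicates
replicates_only <- simulated_replicates[seq_len(n)]

# The final element holds the observed data
observed_again <- simulated_replicates[[n + 1]]

# Example downstream summary: mean count per gene and cell across replicates
mean_counts <- Reduce(`+`, replicates_only) / n
```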

A brief description of subfunctions

simulate_replicates relies on several subfunctions to generate simulated technical replicates. These functions share many options with the user-facing function. Options that are internal to the implementation are annotated to give an idea of the program flow. For further detail, please examine the source code in the R directory of this package.

Note

Frequently, the user will want to compute simulated technical replicates in a high performance computational environment. When estimate_noiseparameters() is run with the option write.noise.model=TRUE and the root file="noise_estimation", the user receives the files "noise_estimation_counts4clusterperturbation.xls", "noise_estimation_bayesianestimates.xls" and "noise_estimation_parameters4randomize.xls". These files may be input into the example code, HPC_generate_noise_matrices.R, in a high performance computational environment for faster processing.

Author(s)

David T. Severson <david_severson@hms.harvard.edu>

Maintainer: Benjamin Schuster-Boeckler <benjamin.schuster-boeckler@ludwig.ox.ac.uk>

See Also

The example code for running the simulation of technical replicates on a high performance computing cluster can be found in inst/example/.

The code for generating simulated technical replicates on a high powered compute node requires the function, HPC_simulate_replicates().

Examples

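library("BEARscc")
library("SingleCellExperiment")
# Load the example data, which provides BEAR_analyzed.sce
data(analysis_examples)

# Generate three simulated technical replicates from the estimated noise model
BEAR_simreplicates.sce <- simulate_replicates(BEAR_analyzed.sce, n = 3)
BEAR_simreplicates.sce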

BEARscc documentation built on Nov. 8, 2020, 7:56 p.m.