This function estimates the size factors using the "median ratio method" described by Equation 5 in Anders and Huber (2010).
The estimated size factors can be accessed using sizeFactors.
Alternative library size estimators can also be supplied using sizeFactors.
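As a reading aid, the median ratio estimator can be written as follows. The notation is an assumption made here for illustration: k_ij is the count for gene i in sample j, m is the number of samples, and the denominator is the geometric mean of gene i across all samples (the "pseudosample").

```latex
\hat{s}_j \;=\; \operatorname*{median}_{i}\;
  \frac{k_{ij}}{\left( \prod_{v=1}^{m} k_{iv} \right)^{1/m}}
```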
## S4 method for signature 'DESeqDataSet'
estimateSizeFactors(object, type = c("ratio", "iterate"),
  locfunc = stats::median, geoMeans, controlGenes, normMatrix)
object
    a DESeqDataSet
type
    either "ratio" or "iterate". "ratio" uses the standard median ratio method introduced in DESeq. The size factor is the median ratio of the sample over a pseudosample: for each gene, the geometric mean of all samples. "iterate" offers an alternative estimator, which can be used even when all genes contain a sample with a zero. This estimator iterates between estimating the dispersion with a design of ~1, and finding a size factor vector by numerically optimizing the likelihood of the ~1 model.
locfunc
    a function to compute a location for a sample. By default, the median is used. However, especially for low counts, the
geoMeans
    by default this is not provided and the geometric means of the counts are calculated within the function. A vector of geometric means from another count matrix can be provided for a "frozen" size factor calculation
controlGenes
    optional, numeric or logical index vector specifying those genes to use for size factor estimation (e.g. housekeeping or spike-in genes)
normMatrix
    optional, a matrix of normalization factors which do not control for library size (e.g. average transcript length of genes for each sample). Providing
Typically, the function is called with the idiom:
dds <- estimateSizeFactors(dds)
See DESeq for a description of the use of size factors in the GLM.
One should call this function after DESeqDataSet unless size factors are manually specified with sizeFactors.
Alternatively, gene-specific normalization factors for each sample can be provided using normalizationFactors, which will always preempt sizeFactors in calculations.
Internally, the function calls estimateSizeFactorsForMatrix, which provides more details on the calculation.
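The median ratio calculation can be sketched in base R as below. This is a simplified illustration rather than the package's implementation: the helper name medianRatioSizeFactors and the toy counts are made up for this example, and dropping zero counts before taking the location is an assumption of the sketch.

```r
# Toy count matrix (made-up values): rows are genes, columns are samples.
counts <- matrix(c( 10,  20,  40,
                   100, 200, 400,
                     5,  10,  20,
                    50, 100, 200),
                 nrow = 4, byrow = TRUE,
                 dimnames = list(paste0("gene", 1:4), paste0("sample", 1:3)))

# Hypothetical helper for illustration only; not a DESeq2 function.
medianRatioSizeFactors <- function(counts, locfunc = stats::median) {
  # Log geometric mean of each gene across samples (the "pseudosample").
  logGeoMeans <- rowMeans(log(counts))
  apply(counts, 2, function(cnts) {
    # Assumption of this sketch: use only genes with a finite geometric
    # mean and a positive count in the current sample.
    keep <- is.finite(logGeoMeans) & cnts > 0
    # Location (median by default) of the log ratios, back on the count scale.
    exp(locfunc(log(cnts[keep]) - logGeoMeans[keep]))
  })
}

sf <- medianRatioSizeFactors(counts)
sf
```

In this toy matrix every gene has counts in the proportion 1:2:4 across the three samples, so the resulting size factors come out at approximately 0.5, 1 and 2.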
The DESeqDataSet passed as parameters, with the size factors filled in.
Simon Anders
Reference for the median ratio method:
Simon Anders, Wolfgang Huber: Differential expression analysis for sequence count data. Genome Biology 2010, 11:R106. http://dx.doi.org/10.1186/gb-2010-11-10-r106
# simulate an example dataset with 1000 genes and 4 samples
dds <- makeExampleDESeqDataSet(n=1000, m=4)
dds <- estimateSizeFactors(dds)
sizeFactors(dds)

# use only the first 200 genes for size factor estimation
dds <- estimateSizeFactors(dds, controlGenes=1:200)

# supply a matrix of normalization factors
m <- matrix(runif(1000 * 4, .5, 1.5), ncol=4)
dds <- estimateSizeFactors(dds, normMatrix=m)
normalizationFactors(dds)[1:3,]

# supply precomputed geometric means
geoMeans <- exp(rowMeans(log(counts(dds))))
dds <- estimateSizeFactors(dds, geoMeans=geoMeans)
sizeFactors(dds)
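The idea behind supplying geoMeans and controlGenes can also be sketched in base R without the package. The values and variable names below (reference, newSample, frozenSizeFactor) are made up for illustration. The sketch shows only the ratio-to-reference step and does not attempt to reproduce any further adjustment the package makes.

```r
# Reference counts (made-up values) used to fix the per-gene geometric means.
reference <- matrix(c( 10,  20,  40,
                      100, 200, 400,
                        5,  10,  20,
                       50, 100, 200),
                    nrow = 4, byrow = TRUE)

# "Frozen" geometric means: approximately 20, 200, 10 and 100.
frozenGeoMeans <- exp(rowMeans(log(reference)))

# A new sample (made-up values) scored against the frozen reference.
newSample <- c(60, 600, 30, 300)

# Restrict the estimate to a set of control genes, here the first two rows.
controlGenes <- 1:2

# Per-gene ratio to the frozen pseudosample, then the median over controls.
ratios <- newSample / frozenGeoMeans
frozenSizeFactor <- stats::median(ratios[controlGenes])
frozenSizeFactor
```

Because the reference geometric means are fixed in advance, the new sample can be scored on its own, without recomputing anything from the reference matrix.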