filterByExpr: Filter Genes By Expression Level
In edgeR: Empirical Analysis of Digital Gene Expression Data in R

Description Usage Arguments Details Value Author(s) References Examples

Determine which genes have sufficiently large counts to be retained in a statistical analysis.

## S3 method for class 'DGEList'
filterByExpr(y, design = NULL, group = NULL, lib.size = NULL, ...)
## S3 method for class 'SummarizedExperiment'
filterByExpr(y, design = NULL, group = NULL, lib.size = NULL, ...)
## Default S3 method:
filterByExpr(y, design = NULL, group = NULL, lib.size = NULL,
             min.count = 10, min.total.count = 15, large.n = 10, min.prop = 0.7, ...)

`y`	matrix of counts, or a `DGEList` object, or a `SummarizedExperiment` object.
`design`	design matrix. Ignored if `group` is not `NULL`.
`group`	vector or factor giving group membership for a one-way layout, if appropriate.
`lib.size`	library size, defaults to `colSums(y)`.
`min.count`	numeric. Minimum count required for at least some samples.
`min.total.count`	numeric. Minimum total count required.
`large.n`	integer. Number of samples per group that is considered to be “large”.
`min.prop`	numeric. Minimum proportion of samples in the smallest group that express the gene.
`...`	any other arguments. For the `DGEList` and `SummarizedExperiment` methods, other arguments will be passed to the default method. For the default method, other arguments are not currently used.

This function implements the filtering strategy that was intuitively described by Chen et al. (2016). Roughly speaking, the strategy keeps genes that have at least min.count reads in a worthwhile number of samples. More precisely, the filtering keeps genes that have counts-per-million (CPM) above k in n samples, where k is determined by min.count and by the sample library sizes, and n is determined by the design matrix.

n is essentially the smallest group sample size or, more generally, the minimum inverse leverage of any fitted value. If all the group sizes are larger than large.n, then this is relaxed slightly, but with n always greater than min.prop of the smallest group size (70% by default).
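The choice of n described above can be illustrated with a small base R sketch. The helper `min_sample_size` is hypothetical and not part of edgeR, and the exact relaxation formula used for large groups is an assumption chosen to be consistent with the description (it always yields a value greater than min.prop times the smallest group size).

```r
# Hypothetical helper (not an edgeR function): sketch of how n could be chosen.
min_sample_size <- function(design = NULL, group = NULL,
                            large.n = 10, min.prop = 0.7) {
  if (!is.null(group)) {
    # One-way layout: size of the smallest non-empty group.
    n <- min(table(droplevels(as.factor(group))))
  } else if (!is.null(design)) {
    # General design: minimum inverse leverage of any fitted value.
    # intercept = FALSE because the design matrix already carries its own columns.
    n <- 1 / max(hat(design, intercept = FALSE))
  } else {
    stop("supply either 'group' or 'design'")
  }
  # Assumed relaxation for large groups: only a proportion min.prop of the
  # samples beyond large.n is required.
  if (n > large.n) n <- large.n + (n - large.n) * min.prop
  n
}

# For a one-way layout the two routes agree: groups of size 3 and 5 give n = 3.
small <- factor(rep(c("A", "B"), times = c(3, 5)))
n_group  <- min_sample_size(group = small)
n_design <- min_sample_size(design = model.matrix(~small))

# Groups of size 20 and 30 exceed large.n, so n is relaxed below 20.
big <- factor(rep(c("A", "B"), times = c(20, 30)))
n_big <- min_sample_size(group = big)
```

With the defaults, the relaxed value for the second layout is 10 + (20 - 10) * 0.7 = 17, which is above 70% of the smallest group size (14).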

In addition, each kept gene is required to have at least min.total.count reads across all the samples.
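The two conditions (CPM above k in at least n samples, plus a minimum total count) can be sketched in base R as follows. This is a minimal illustration, not the package source: the function name `sketch_filter` is hypothetical, n is taken as an input, and deriving k from the median library size is an assumption about how min.count is converted to a CPM cutoff.

```r
# Hypothetical sketch (not an edgeR function) of the filtering rule.
sketch_filter <- function(counts, n, lib.size = colSums(counts),
                          min.count = 10, min.total.count = 15) {
  counts <- as.matrix(counts)
  # Assumption: k is min.count expressed as a CPM at the median library size.
  k <- min.count / median(lib.size) * 1e6
  # Counts-per-million: divide each column (sample) by its library size.
  cpm <- t(t(counts) / lib.size) * 1e6
  # Small tolerance so that a non-integer n is not failed by rounding error.
  tol <- 1e-14
  keep_cpm <- rowSums(cpm >= k) >= n - tol
  keep_total <- rowSums(counts) >= min.total.count
  keep_cpm & keep_total
}

# Toy data: three genes, four samples, library sizes fixed at one million
# so that CPM equals the raw count and k equals min.count.
counts <- rbind(geneA = c(50, 60, 55, 70),
                geneB = c(0, 1, 0, 2),
                geneC = c(20, 0, 0, 0))
keep <- sketch_filter(counts, n = 2, lib.size = rep(1e6, 4))
```

Here geneA passes both conditions, geneB fails both, and geneC meets the total count requirement but reaches the CPM cutoff in only one sample, so only geneA is kept.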

Logical vector of length nrow(y) indicating which rows of y to keep in the analysis.

Gordon Smyth

Chen Y, Lun ATL, and Smyth GK (2016). From reads to genes to pathways: differential expression analysis of RNA-Seq experiments using Rsubread and the edgeR quasi-likelihood pipeline. F1000Research 5, 1438. http://f1000research.com/articles/5-1438

## Not run: 
keep <- filterByExpr(y, design)
y <- y[keep,]

## End(Not run)
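The example above assumes that `y` and `design` already exist. A self-contained variant on simulated data might look like the following; the simulation settings (gene and sample numbers, negative binomial parameters, group labels) are arbitrary choices for illustration only, and the edgeR package must be installed.

```r
library(edgeR)

set.seed(1)
# Simulated counts: 1000 genes by 6 samples (illustrative values only).
counts <- matrix(rnbinom(6000, mu = 20, size = 2), nrow = 1000, ncol = 6)
# Make the first 200 genes very lowly expressed so that some are filtered.
counts[1:200, ] <- rnbinom(1200, mu = 0.5, size = 2)

group <- factor(rep(c("ctrl", "trt"), each = 3))
y <- DGEList(counts = counts, group = group)
design <- model.matrix(~group)

# Logical vector with one entry per gene.
keep <- filterByExpr(y, design)
# Subset the DGEList; keep.lib.sizes = FALSE recomputes the library sizes
# from the retained genes.
y <- y[keep, , keep.lib.sizes = FALSE]
```

Whether to recompute library sizes after filtering is a matter of preference; dropping `keep.lib.sizes = FALSE` retains the original library sizes.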

edgeR documentation built on Jan. 16, 2021, 2:03 a.m.
