keep_abundant | R Documentation |
Filters the data to keep only transcripts/genes that are consistently expressed above a threshold across samples. This is a filtering version of identify_abundant() that removes low-abundance features instead of just marking them.
keep_abundant(
.data,
.sample = NULL,
.transcript = NULL,
.abundance = NULL,
factor_of_interest = NULL,
design = NULL,
minimum_counts = 10,
minimum_proportion = 0.7
)
## S4 method for signature 'spec_tbl_df'
keep_abundant(
.data,
.sample = NULL,
.transcript = NULL,
.abundance = NULL,
factor_of_interest = NULL,
design = NULL,
minimum_counts = 10,
minimum_proportion = 0.7
)
## S4 method for signature 'tbl_df'
keep_abundant(
.data,
.sample = NULL,
.transcript = NULL,
.abundance = NULL,
factor_of_interest = NULL,
design = NULL,
minimum_counts = 10,
minimum_proportion = 0.7
)
## S4 method for signature 'tidybulk'
keep_abundant(
.data,
.sample = NULL,
.transcript = NULL,
.abundance = NULL,
factor_of_interest = NULL,
design = NULL,
minimum_counts = 10,
minimum_proportion = 0.7
)
## S4 method for signature 'SummarizedExperiment'
keep_abundant(
.data,
.sample = NULL,
.transcript = NULL,
.abundance = NULL,
factor_of_interest = NULL,
design = NULL,
minimum_counts = 10,
minimum_proportion = 0.7
)
## S4 method for signature 'RangedSummarizedExperiment'
keep_abundant(
.data,
.sample = NULL,
.transcript = NULL,
.abundance = NULL,
factor_of_interest = NULL,
design = NULL,
minimum_counts = 10,
minimum_proportion = 0.7
)
.data | A 'tbl' or 'SummarizedExperiment' object containing transcript/gene abundance data |
.sample | The name of the sample column |
.transcript | The name of the transcript/gene column |
.abundance | The name of the transcript/gene abundance column |
factor_of_interest | The name of the column containing groups/conditions for filtering. Used by edgeR's filterByExpr to define sample groups. |
design | A design matrix for more complex experimental designs. If provided, this is passed to filterByExpr instead of factor_of_interest. |
minimum_counts | A positive number specifying the minimum count threshold for a transcript to be kept; filterByExpr converts it to a counts-per-million (CPM) cutoff using the library sizes (default = 10) |
minimum_proportion | A number between 0 and 1 specifying the minimum proportion of samples in the smallest group that must exceed the minimum_counts threshold (default = 0.7) |
Lifecycle: questioning
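This function uses edgeR's filterByExpr() function to identify and keep consistently expressed features. Roughly, a feature is kept if its CPM reaches a cutoff derived from minimum_counts and the library sizes in a sufficient number of samples. That number is set by the size of the smallest experimental group (defined by factor_of_interest or design); when groups are large, the requirement is relaxed according to minimum_proportion.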
This function is similar to identify_abundant() but instead of adding an .abundant column, it filters out the low-abundance features directly.
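The filtering rule can be illustrated with a minimal base-R sketch. This is not edgeR's implementation: it only mimics the core idea (a CPM cutoff derived from the count threshold and the median library size, required in as many samples as the smallest group contains), and it leaves out the relaxation for large groups and edgeR's additional total-count requirement. The function name keep_by_expr_sketch and the toy count matrix are hypothetical and exist only for illustration.

```r
# Simplified, hypothetical sketch of a filterByExpr-style rule in base R.
# It omits the large-group relaxation and the minimum total count check.
keep_by_expr_sketch <- function(counts, group, minimum_counts = 10) {
  lib_size <- colSums(counts)
  # Convert the count threshold to a CPM cutoff using the median library size
  cpm_cutoff <- minimum_counts / median(lib_size) * 1e6
  # Counts per million for every feature (rows) and sample (columns)
  cpm <- t(t(counts) / lib_size) * 1e6
  # Required number of samples: the size of the smallest group
  min_samples <- min(table(group))
  rowSums(cpm >= cpm_cutoff) >= min_samples
}

# Toy data (made up): 3 features, 4 samples in 2 groups
counts <- matrix(
  c(500, 480, 510, 495,   # expressed in every sample
    0,   1,   0,   2,     # low abundance everywhere
    300, 310, 0,   0),    # expressed in one group only
  nrow = 3, byrow = TRUE,
  dimnames = list(c("gene_a", "gene_b", "gene_c"), paste0("s", 1:4))
)
group <- c("ctrl", "ctrl", "treated", "treated")

keep <- keep_by_expr_sketch(counts, group)
filtered <- counts[keep, , drop = FALSE]
filtered
```

In this toy example gene_a and gene_c are retained and gene_b is dropped: gene_c passes because it clears the cutoff in two samples, which matches the size of the smallest group, even though it is absent from the other group.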
Returns a filtered version of the input object containing only the features that passed the abundance threshold criteria.
The output is consistent with the input: a 'tbl' input returns a 'tbl', and a 'SummarizedExperiment' input returns a 'SummarizedExperiment' object.
McCarthy, D. J., Chen, Y., & Smyth, G. K. (2012). Differential expression analysis of multifactor RNA-Seq experiments with respect to biological variation. Nucleic Acids Research, 40(10), 4288-4297. DOI: 10.1093/nar/gks042
# Basic usage
se_mini |> keep_abundant()
# With custom thresholds
se_mini |> keep_abundant(
minimum_counts = 5,
minimum_proportion = 0.5
)
# Using a factor of interest
se_mini |> keep_abundant(factor_of_interest = condition)