isNotContaminant: Identify non-contaminant sequences.
In decontam: Identify Contaminants in Marker-gene and Metagenomics Sequencing Data


View source: R/decontam.R

The prevalence of each sequence (or OTU) in the input feature table across samples and negative controls is used to identify non-contaminant sequences. Note that the null hypothesis here is that sequences **are** contaminants. This function is intended for use on low-biomass samples in which a large proportion of the sequences are likely to be contaminants.
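To make the notion of prevalence concrete, here is a minimal base R sketch that computes, for each sequence, the fraction of true samples and of negative controls in which it is present. The toy table, the `neg` assignment, and the names `present`, `prev_samples` and `prev_controls` are hypothetical, and the sketch only illustrates the input quantity, not the statistical test the package performs internally.

```r
# Toy feature table: rows are samples, columns are sequences (hypothetical data).
seqtab <- matrix(
  c(10, 0, 3,
     8, 0, 0,
     0, 5, 2,
     0, 7, 1),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("sample", 1:4), c("seqA", "seqB", "seqC"))
)

# TRUE marks a negative control (hypothetical assignment).
neg <- c(FALSE, FALSE, TRUE, TRUE)

# Presence/absence: a sequence counts as present when its abundance is above zero.
present <- seqtab > 0

# Prevalence = fraction of samples in each group where the sequence is present.
prev_samples  <- colMeans(present[!neg, , drop = FALSE])
prev_controls <- colMeans(present[neg, , drop = FALSE])

print(rbind(samples = prev_samples, controls = prev_controls))
```

In this toy table `seqA` appears only in true samples and `seqB` only in negative controls, which is the kind of contrast a prevalence-based test works from.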

isNotContaminant(seqtab, neg = NULL, method = "prevalence", threshold = 0.5, normalize = TRUE, detailed = FALSE)

`seqtab`	(Required). Integer matrix. A feature table recording the observed abundances of each sequence (or OTU) in each sample. Rows should correspond to samples, and columns to sequences (or OTUs).
`neg`	(Required). `logical` vector identifying the negative control samples: TRUE for a negative control, FALSE otherwise. Extraction controls give the best results.
`method`	(Optional). Default "prevalence". The method used to test for contaminants. Currently the only supported method is "prevalence", in which contaminants are identified by increased prevalence in negative controls.
`threshold`	(Optional). Default `0.5`. The probability threshold. The null hypothesis (a contaminant) is rejected in favor of the alternative hypothesis (not a contaminant) when the probability is strictly less than this threshold.
`normalize`	(Optional). Default TRUE. If TRUE, the input `seqtab` is normalized so that each row sums to 1 (converted to frequency). If FALSE, no normalization is performed (the data should already be frequencies or counts from equal-depth samples).
`detailed`	(Optional). Default FALSE. If TRUE, the return value is a `data.frame` containing diagnostic information on the non-contaminant decision. If FALSE, the return value is a `logical` vector containing the non-contaminant decisions.
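The `normalize` and `threshold` arguments can be illustrated with a short base R sketch. It assumes a hypothetical count table and hypothetical per-sequence probabilities `p`. It shows only the row normalization and the strict comparison described above, not how the package computes the probabilities themselves.

```r
# Hypothetical count table: rows are samples, columns are sequences.
counts <- matrix(
  c(10, 30, 60,
     5,  5,  0),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("sample1", "sample2"), c("seqA", "seqB", "seqC"))
)

# Row normalization: divide each row by its total so that every row sums to 1.
freq <- sweep(counts, 1, rowSums(counts), "/")
print(freq)

# Hypothetical probabilities, one per sequence (not produced by decontam here).
p <- c(seqA = 0.01, seqB = 0.5, seqC = 0.7)
threshold <- 0.5

# Strictly-less-than comparison: a probability equal to the threshold does not reject.
not_contaminant <- p < threshold
print(not_contaminant)
```

Note that `seqB`, whose probability equals the threshold exactly, is not classified as a non-contaminant because the comparison is strict.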

If `detailed=FALSE`, a `logical` vector is returned, with TRUE indicating non-contaminants. If `detailed=TRUE`, a `data.frame` is returned instead.
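One common use of the logical return value is to subset the feature table to the sequences judged to be non-contaminants. The sketch below assumes a hypothetical table and a hand-written `keep` vector standing in for the function's output, with one entry per column of the table.

```r
# Hypothetical feature table: rows are samples, columns are sequences.
seqtab <- matrix(
  1:6, nrow = 2,
  dimnames = list(c("sample1", "sample2"), c("seqA", "seqB", "seqC"))
)

# Stand-in for the logical vector returned with detailed=FALSE (hypothetical values).
keep <- c(TRUE, FALSE, TRUE)

# Keep only the columns flagged TRUE; drop = FALSE preserves the matrix shape
# even when a single sequence remains.
seqtab_clean <- seqtab[, keep, drop = FALSE]
print(seqtab_clean)
```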


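# Load the example feature table and sample metadata shipped with decontam
st <- readRDS(system.file("extdata", "st.rds", package="decontam"))
samdf <- readRDS(system.file("extdata", "samdf.rds", package="decontam"))
# Identify non-contaminants with a threshold of 0.05 instead of the default 0.5
isNotContaminant(st, samdf$quant_reading, threshold=0.05)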