isBimeraDenovo: Identify bimeras from collections of unique sequences.

View source: R/chimeras.R

isBimeraDenovo    R Documentation

Identify bimeras from collections of unique sequences.

Description

This function is a wrapper around isBimera for collections of unique sequences (i.e. sequences with associated abundances). Each sequence is evaluated against a set of "parents" drawn from the same collection: the sequences that are sufficiently more abundant than the one being evaluated. A logical vector is returned, with one entry per input sequence indicating whether that sequence was consistent with being a bimera of those more abundant "parents".
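The parent-selection rule described above can be sketched in base R. This is only an illustration, not dada2's implementation: the helper candidate_parents and the abundance values are hypothetical, and the comparisons (strictly greater than the fold threshold, at least the minimum abundance) follow the wording of the minFoldParentOverAbundance and minParentAbundance arguments documented below; the package's internal comparisons may differ.

```r
# Hypothetical uniques-vector: abundances named by sequence (values made up).
unqs <- c(AAAA = 100L, CCCC = 40L, GGGG = 12L, TTTT = 5L)

# Toy helper (not part of dada2): names of the sequences that would be
# eligible "parents" of `query` under the two abundance thresholds.
candidate_parents <- function(unqs, query,
                              minFoldParentOverAbundance = 2,
                              minParentAbundance = 8) {
  abund <- unqs[[query]]
  keep <- unqs > minFoldParentOverAbundance * abund &  # fold-abundance screen
    unqs >= minParentAbundance                         # absolute-abundance screen
  names(unqs)[keep]
}

candidate_parents(unqs, "GGGG")  # parents must exceed 2 * 12 = 24 reads
candidate_parents(unqs, "AAAA")  # the most abundant sequence has no parents
```

Under this reading, the most abundant sequences have few or no candidate parents, so only lower-abundance sequences can be flagged.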

Usage

isBimeraDenovo(
  unqs,
  minFoldParentOverAbundance = 2,
  minParentAbundance = 8,
  allowOneOff = FALSE,
  minOneOffParentDistance = 4,
  maxShift = 16,
  multithread = FALSE,
  verbose = FALSE
)

Arguments

unqs

(Required). A uniques-vector or any object that can be coerced into one with getUniques.

minFoldParentOverAbundance

(Optional). A numeric(1). Default is 2. Only sequences more than this many times as abundant as a given sequence can be its "parents".

minParentAbundance

(Optional). A numeric(1). Default is 8. Only sequences at least this abundant can be "parents".

allowOneOff

(Optional). A logical(1). Default is FALSE. If TRUE, sequences that have one mismatch or indel to an exact bimera are also flagged as bimeric.

minOneOffParentDistance

(Optional). A numeric(1). Default is 4. Only sequences with at least this many mismatches to the potential bimeric sequence are considered as possible "parents" when flagging one-off bimeras. There is no such screen when considering exact bimeras.

maxShift

(Optional). A numeric(1). Default is 16. Maximum shift allowed when aligning sequences to potential "parents".

multithread

(Optional). Default is FALSE. If TRUE, multithreading is enabled and the number of available threads is automatically determined. If an integer is provided, the number of threads to use is set by passing the argument on to mclapply.

verbose

(Optional). A logical(1). Default is FALSE. If TRUE, verbose text output is printed.

Value

A logical vector with one entry per input unique sequence. An entry is TRUE if the sequence is a bimera of more abundant "parent" sequences, and FALSE otherwise.
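As a rough sketch of how such a logical vector might be used, the base R snippet below subsets a uniques-vector with it. The sequences, abundances, and flags are made up for illustration, and the snippet assumes the flags are in the same order as the input sequences. For removing bimeras in practice, see removeBimeraDenovo under See Also.

```r
# Hypothetical inputs (values made up): abundances named by sequence, and a
# logical vector of the kind described above, in the same order.
unqs <- c(AAAA = 100L, CCCC = 40L, AACC = 6L)
is.bim <- c(FALSE, FALSE, TRUE)

# Keep only the sequences that were not flagged as bimeric.
nonbim <- unqs[!is.bim]

# Summaries: number of flagged sequences and the fraction of reads they hold.
n.flagged <- sum(is.bim)
frac.reads.flagged <- sum(unqs[is.bim]) / sum(unqs)
```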

See Also

isBimera, removeBimeraDenovo

Examples

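library(dada2)
# Dereplicate the example FASTQ file shipped with dada2
derep1 <- derepFastq(system.file("extdata", "sam1F.fastq.gz", package="dada2"))
# Denoise, starting from the tperr1 error rates
dada1 <- dada(derep1, err=tperr1, errorEstimationFunction=loessErrfun, selfConsist=TRUE)
# Flag bimeras with the default settings, then again with one-off bimeras allowed
is.bim <- isBimeraDenovo(dada1)
is.bim2 <- isBimeraDenovo(dada1$denoised, minFoldParentOverAbundance = 2, allowOneOff=TRUE)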


benjjneb/dada2 documentation built on Jan. 12, 2025, 10:03 a.m.