Description Usage Arguments Details Value Author(s) References
This function computes a complexity score for each sequence using the DUST algorithm and plots a histogram of the scores.
complexity.dust(object, xlab="Complexity score (0=high, 100=low)", ylab="Number of sequences",
    xlim=c(0, 100), col="firebrick1", breaks=100, ...)
object: An object of class DNAStringSet, ShortRead or SFFContainer.
xlab: The X axis label.
ylab: The Y axis label.
xlim: The limits of the X axis.
col: The plotting color.
breaks: The number of breaks in the histogram (see ‘hist’).
...: Arguments to be passed to methods, such as graphical parameters (see ‘par’).
The complexity score is based on how often different trinucleotides occur and is scaled between 0 and 100, with higher scores indicating lower complexity. A homopolymer (e.g. TTTTTTTTTT) has a score of 100, a dinucleotide repeat (e.g. TATATATATA) has a score around 49, and a trinucleotide repeat (e.g. TAGTAGTAG) has a score around 32. Sequences with scores above 7 can be considered low-complexity.
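To make the scoring idea concrete, here is a minimal base-R sketch of a DUST-style score for a single window. It is not the package's implementation: the helper name dust_window_score, the use of one 64-base window, and the normalisation (pairs of identical trinucleotides divided by the maximum possible number of pairs) are assumptions chosen so that the reference values quoted above are reproduced.

```r
# Hypothetical helper: DUST-style score for one window of sequence.
# Assumption: score = 100 * sum(c_t * (c_t - 1)) / (l * (l - 1)),
# where c_t is the count of trinucleotide t and l the number of
# overlapping trinucleotides in the window.
dust_window_score <- function(x) {
  n <- nchar(x)
  if (n < 4) return(NA_real_)  # need at least two trinucleotides

  # Extract all overlapping trinucleotides
  starts <- seq_len(n - 2)
  triplets <- substring(x, starts, starts + 2)

  counts <- as.numeric(table(triplets))
  l <- length(triplets)

  # A homopolymer has a single trinucleotide with count l, giving 100
  100 * sum(counts * (counts - 1)) / (l * (l - 1))
}

# 64-base example windows (window length is an assumption)
homo <- strrep("T", 64)
di   <- strrep("TA", 32)
tri  <- substr(strrep("TAG", 22), 1, 64)

round(dust_window_score(homo), 1)  # 100
round(dust_window_score(di), 1)    # 49.2
round(dust_window_score(tri), 1)   # 32.3
```

With this normalisation the score depends on window length, so very short inputs give somewhat different values than the 64-base windows used here.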
A numeric vector containing the complexity score for each sequence.
Christian Ruckert
Schmieder R. and Edwards R. (2011) Quality control and preprocessing of metagenomic datasets. Bioinformatics, 27(6):863-864.