complex_filter: Remove sequences with low complexity
In FastqCleaner: A Shiny Application for Quality Control, Filtering and Trimming of FASTQ Files

Description Usage Arguments Value Author(s) Examples

The program removes low complexity sequences, computing the entropy with the observed frequency of dinucleotides.

1	complex_filter(input, threshold = 0.5, referenceEntropy = 3.908135)

`input`	`ShortReadQ` object
`threshold`	A threshold value computed as the relation of the H of the sequences and the reference H. Default is 0.5
`referenceEntropy`	Reference entropy. By default, the program uses a value of 3.908, that corresponds to the entropy of the human genome in bits

Filtered ShortReadQ object

Leandro Roser learoser@gmail.com

require('Biostrings')
require('ShortRead')

# create  sequences of different width
set.seed(10)
input <- lapply(c(0, 6, 10, 16, 20, 26, 30, 36, 40), 
               function(x) random_seq(1, x))


# create repetitive 'CG' sequences with length adequante 
# for a total length:
# input +  CG = 40

set.seed(10)
CG <- lapply(c(20, 17, 15, 12, 10, 7, 5, 2, 0), 
            function(x) paste(rep('CG', x), collapse = ''))


# concatenate input and CG
input  <- mapply('paste', input, CG, sep = '')
input <- DNAStringSet(input)

# plot relative entropy (E, Shannon 1948)

freq <- dinucleotideFrequency(input)
freq  <- freq /rowSums(freq)
H <- -rowSums(freq  * log2(freq), na.rm = TRUE)
H_max <- 3.908135  # max entropy
plot(H/H_max, type='b', xlab = 'Sequence', ylab= 'E')


# create qualities of width 40

set.seed(10)
input_q <- random_qual(c(30,40), slength = 9, swidth = 40, 
                       encod = 'Sanger')

# create names
input_names <- seq_names(9)

# create ShortReadQ object
my_read <- ShortReadQ(sread = input, quality = input_q, id = input_names)

# apply the filter
filtered <- complex_filter(my_read)

# look at the filtered sequences
sread(filtered)

FastqCleaner documentation built on Nov. 8, 2020, 5:05 p.m.

FastqCleaner index

README.md An Introduction to FastqCleaner An Introduction to FastqCleaner

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

FastqCleaner
A Shiny Application for Quality Control, Filtering and Trimming of FASTQ Files

complex_filter: Remove sequences with low complexity
In FastqCleaner: A Shiny Application for Quality Control, Filtering and Trimming of FASTQ Files

Description

Usage

Arguments

Value

Author(s)

Examples

Related to complex_filter in FastqCleaner...

R Package Documentation

Browse R Packages

We want your feedback!

FastqCleaner A Shiny Application for Quality Control, Filtering and Trimming of FASTQ Files

complex_filter: Remove sequences with low complexity In FastqCleaner: A Shiny Application for Quality Control, Filtering and Trimming of FASTQ Files

Description

Usage

Arguments

Value

Author(s)

Examples

Related to complex_filter in FastqCleaner...

R Package Documentation

Browse R Packages

We want your feedback!

FastqCleaner
A Shiny Application for Quality Control, Filtering and Trimming of FASTQ Files

complex_filter: Remove sequences with low complexity
In FastqCleaner: A Shiny Application for Quality Control, Filtering and Trimming of FASTQ Files