Identify overrepresented kmers of length k from their observed-to-expected ratio at each position across all sequences in the data set. The expected proportion of a kmer of length k assumes site independence and is computed as the sum, over bases, of the count of each base in the kmer times the probability of observing that base in the data set, i.e. P(A)*count_in_kmer(A) + P(C)*count_in_kmer(C) + ... The observed-to-expected ratio is reported on a log2 scale, log2(obs/exp). Kmers with obsexp_ratio > 2 are considered overrepresented and appear in the returned data frame along with their position in the sequence.
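The per-position calculation can be illustrated with a small base R sketch. It is not the package's implementation: the reads are made-up toy data, and the helper `expected_prop` and all variable names are hypothetical. Note one deliberate substitution: the description writes the expected proportion as a sum, whereas this sketch uses the product of per-base probabilities, which is the usual consequence of site independence. Treat that choice as an assumption of the sketch, not a statement about how qckitfastq computes the value.

```r
# Toy reads of equal length (made-up data, for illustration only).
reads <- c("ACGTACGT", "ACGTTTTT", "ACGTGGCA", "ACGTCCAT",
           "ACGTAGAG", "ACGTTCTC", "ACGTGAGA", "ACGTCTCT")
k <- 2

# Probability of each base, estimated from every base in the data set.
bases <- unlist(strsplit(reads, ""))
base_counts <- table(bases)
p_base <- setNames(as.numeric(base_counts) / length(bases), names(base_counts))

# ASSUMPTION: expected proportion under site independence, taken here as
# the product of the probabilities of the kmer's bases.
expected_prop <- function(kmer) prod(p_base[strsplit(kmer, "")[[1]]])

# For each start position, compare observed and expected kmer proportions.
n_pos <- nchar(reads[1]) - k + 1
rows <- lapply(seq_len(n_pos), function(pos) {
  kmers <- substr(reads, pos, pos + k - 1)
  kmer_counts <- table(kmers)
  obs <- as.numeric(kmer_counts) / length(reads)
  exp <- vapply(names(kmer_counts), expected_prop, numeric(1))
  data.frame(Position = pos,
             Obsexp_ratio = log2(obs / exp),
             Kmer = names(kmer_counts),
             stringsAsFactors = FALSE)
})
result <- do.call(rbind, rows)

# Keep kmers whose log2(obs/exp) exceeds 2.
overrep <- result[result$Obsexp_ratio > 2, ]
print(overrep)
```

In this toy set every read starts with ACGT, so the kmers flagged are the ones inside that shared prefix; kmers at later positions vary from read to read and stay below the threshold.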
overrep_kmer(infile, k, output_file = NA)
infile: Path to gzipped FASTQ file.
k: The kmer length.
output_file: File to save plot to. Default NA.
A data frame with columns Position (position in the read), Obsexp_ratio, and Kmer.
infile <- system.file("extdata", "test.fq.gz",
                      package = "qckitfastq")
overrep_kmer(infile, k = 4)