calcKL: Calculate the Kullback-Leibler Divergence Between the k-mer...


Description

calcKL takes an object that inherits from SequenceSummary and has a kmers slot, and returns the individual terms of the K-L divergence sum. Each term corresponds to one item in the sample space, in this case a k-mer.

Usage

  calcKL(x)

Arguments

x

an S4 object of a class that inherits from SequenceSummary.

Value

calcKL returns a data.frame with columns:

kmer

the k-mer sequence.

position

the position in the read.

kl

the K-L term for this k-mer in the K-L sum, calculated as p(i)*log2(p(i)/q(i)).

p

the probability for this k-mer, at this position.

q

the probability for this k-mer across all positions.
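
As a rough illustration of how the columns above relate to one another, the following base R sketch builds the same quantities from a small made-up table of k-mer counts. The counts data frame and its count column are hypothetical, and the way p and q are estimated here (simple count proportions) is an assumption made for illustration, not necessarily how calcKL computes them internally.

```r
# Toy k-mer counts by position (made-up numbers, not produced by qrqc).
counts <- data.frame(
  kmer     = c("AAA", "CCC", "AAA", "CCC"),
  position = c(1, 1, 2, 2),
  count    = c(3, 1, 1, 3)
)

# p: share of each k-mer among all k-mers seen at its position
# (assumed estimator for this sketch).
counts$p <- counts$count / ave(counts$count, counts$position, FUN = sum)

# q: share of each k-mer among all k-mers at every position
# (assumed estimator for this sketch).
counts$q <- ave(counts$count, counts$kmer, FUN = sum) / sum(counts$count)

# One K-L term per (k-mer, position) pair, in bits.
counts$kl <- counts$p * log2(counts$p / counts$q)

# Summing the terms within a position gives that position's divergence.
kl_by_position <- tapply(counts$kl, counts$position, sum)
```

Individual terms can be negative (a k-mer that is rarer at a position than overall), but the per-position sum is never below zero.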

Note

The K-L divergence calculation in calcKL uses base 2 in the log; the units are in bits.
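
If values in nats are needed instead, a term in bits can be rescaled by multiplying by log(2). The numbers below are made up for illustration and do not come from calcKL output.

```r
# A single K-L term in bits, using illustrative probabilities.
p <- 0.75
q <- 0.5
kl_bits <- p * log2(p / q)

# Convert bits to nats by multiplying by the natural log of 2.
kl_nats <- kl_bits * log(2)

# The same term computed directly with the natural log, for comparison.
kl_nats_direct <- p * log(p / q)
```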

Author(s)

Vince Buffalo <vsbuffalo@ucdavis.edu>

See Also

kmerKLPlot, getKmer

Examples

  library(qrqc)
  library(ggplot2)  # provides ggplot(), geom_line(), aes()

  ## Load a somewhat contaminated FASTQ file
  s.fastq <- readSeqFile(system.file('extdata', 'test.fastq',
    package='qrqc'), hash.prop=1)

  ## As with getQual, this function is provided so custom graphics can
  ## be made easily. For example, K-L divergence by position
  ## (the sum of the K-L terms at each position):
  kld <- with(calcKL(s.fastq), aggregate(kl, list(position),
    sum))
  colnames(kld) <- c("position", "KL")
  p <- ggplot(kld) + geom_line(aes(x=position, y=KL), color="blue")
  p + scale_y_continuous("K-L divergence")

vsbuffalo/qrqc documentation built on May 3, 2019, 7:07 p.m.