nucleR-package: Nucleosome positioning package for R
In nucleR: Nucleosome positioning package for R

Nucleosome positioning from Tiling Arrays and High-Troughput Sequencing Experiments

Package:	nucleR
Type:	Package
License:	LGPL (>= 3)
LazyLoad:	yes

This package provides a convenient pipeline to process and analize nucleosome positioning experiments from High-Troughtput Sequencing or Tiling Arrays.

Despite it's use is intended to nucleosome experiments, it can be also useful for general ChIP experiments, such as ChIP-on-ChIP or ChIP-Seq.

See following example for a brief introduction to the available functions

Oscar Flores Ricard Illa

Maintainer: Ricard Illa ricard.illa@irbbarcelona.org

library(ggplot2)
# Load example dataset:
# some NGS paired-end reads, mapped with Bowtie and processed with R
# it is a GRanges object with the start/end coordinates for each read.
reads <- get(data(nucleosome_htseq))

# Process the paired end reads, but discard those with length > 200
preads_orig <- processReads(reads, type="paired", fragmentLen=200)

# Process the reads, but now trim each read to 40bp around the dyad
preads_trim <- processReads(reads, type="paired", fragmentLen=200, trim=40)

# Calculate the coverage, directly in reads per million (r.p.m.)
cover_orig <- coverage.rpm(preads_orig)
cover_trim <- coverage.rpm(preads_trim)

# Compare both coverages, the dyad is much more clear in trimmed version
t1 <- as.vector(cover_orig[[1]])[1:2000]
t2 <- as.vector(cover_trim[[1]])[1:2000]
t1 <- (t1-min(t1)) / max(t1-min(t1))  # Normalization
t2 <- (t2-min(t2)) / max(t2-min(t2))  # Normalization

plot_data <- rbind(
    data.frame(
        x=seq_along(t1),
        y=t1,
        coverage="original"
    ),
    data.frame(
        x=seq_along(t2),
        y=t2,
        coverage="trimmed"
    )
)
qplot(x=x, y=y, color=coverage, data=plot_data, geom="line",
  xlab="position", ylab="coverage")

# Let's try to call nucleosomes from the trimmed version
# First of all, let's remove some noise with FFT
# Power spectrum will be plotted, look how with a 2%
# of the components we capture almost all the signal
cover_clean <- filterFFT(cover_trim, pcKeepComp=0.02, showPowerSpec=TRUE)

# How clean is it now?
plot_data <- rbind(
    data.frame(
        x=1:4000,
        y=as.vector(cover_trim[[1]])[1:4000],
        coverage="noisy"
    ),
    data.frame(
        x=1:4000,
        y=as.vector(cover_clean[[1]])[1:4000],
        coverage="filtered"
    )
)
qplot(x=x, y=y, color=coverage, data=plot_data, geom="line",
  xlab="position", ylab="coverage")

# And how similar? Let's see the correlation
cor(cover_clean[[1]], as.vector(cover_trim[[1]]))

# Now it's time to call for peaks, first just as points
# See that the score is only a measure of the height of the peak
peaks <- peakDetection(cover_clean, threshold="25%", score=TRUE)
plotPeaks(peaks[[1]], cover_clean[[1]], threshold="25%")

# Do the same as previously, but now we will create the nucleosome calls:
peaks <- peakDetection(cover_clean, width=147, threshold="25%", score=TRUE)
plotPeaks(peaks, cover_clean[[1]], threshold="25%")

#This is all. From here, you can filter, merge or work with the nucleosome
#calls using standard IRanges functions and R/Bioconductor manipulation