Description Details Author(s) Examples
Nucleosome positioning from Tiling Arrays and High-Troughput Sequencing Experiments
Package: | nucleR |
Type: | Package |
License: | LGPL (>= 3) |
LazyLoad: | yes |
This package provides a convenient pipeline to process and analize nucleosome positioning experiments from High-Troughtput Sequencing or Tiling Arrays.
Despite it's use is intended to nucleosome experiments, it can be also useful for general ChIP experiments, such as ChIP-on-ChIP or ChIP-Seq.
See following example for a brief introduction to the available functions
Oscar Flores Ricard Illa
Maintainer: Ricard Illa ricard.illa@irbbarcelona.org
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 | library(ggplot2)
# Load example dataset:
# some NGS paired-end reads, mapped with Bowtie and processed with R
# it is a GRanges object with the start/end coordinates for each read.
reads <- get(data(nucleosome_htseq))
# Process the paired end reads, but discard those with length > 200
preads_orig <- processReads(reads, type="paired", fragmentLen=200)
# Process the reads, but now trim each read to 40bp around the dyad
preads_trim <- processReads(reads, type="paired", fragmentLen=200, trim=40)
# Calculate the coverage, directly in reads per million (r.p.m.)
cover_orig <- coverage.rpm(preads_orig)
cover_trim <- coverage.rpm(preads_trim)
# Compare both coverages, the dyad is much more clear in trimmed version
t1 <- as.vector(cover_orig[[1]])[1:2000]
t2 <- as.vector(cover_trim[[1]])[1:2000]
t1 <- (t1-min(t1)) / max(t1-min(t1)) # Normalization
t2 <- (t2-min(t2)) / max(t2-min(t2)) # Normalization
plot_data <- rbind(
data.frame(
x=seq_along(t1),
y=t1,
coverage="original"
),
data.frame(
x=seq_along(t2),
y=t2,
coverage="trimmed"
)
)
qplot(x=x, y=y, color=coverage, data=plot_data, geom="line",
xlab="position", ylab="coverage")
# Let's try to call nucleosomes from the trimmed version
# First of all, let's remove some noise with FFT
# Power spectrum will be plotted, look how with a 2%
# of the components we capture almost all the signal
cover_clean <- filterFFT(cover_trim, pcKeepComp=0.02, showPowerSpec=TRUE)
# How clean is it now?
plot_data <- rbind(
data.frame(
x=1:4000,
y=as.vector(cover_trim[[1]])[1:4000],
coverage="noisy"
),
data.frame(
x=1:4000,
y=as.vector(cover_clean[[1]])[1:4000],
coverage="filtered"
)
)
qplot(x=x, y=y, color=coverage, data=plot_data, geom="line",
xlab="position", ylab="coverage")
# And how similar? Let's see the correlation
cor(cover_clean[[1]], as.vector(cover_trim[[1]]))
# Now it's time to call for peaks, first just as points
# See that the score is only a measure of the height of the peak
peaks <- peakDetection(cover_clean, threshold="25%", score=TRUE)
plotPeaks(peaks[[1]], cover_clean[[1]], threshold="25%")
# Do the same as previously, but now we will create the nucleosome calls:
peaks <- peakDetection(cover_clean, width=147, threshold="25%", score=TRUE)
plotPeaks(peaks, cover_clean[[1]], threshold="25%")
#This is all. From here, you can filter, merge or work with the nucleosome
#calls using standard IRanges functions and R/Bioconductor manipulation
|
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.