When using single-ended sequencing, the resulting partial sequences map to
only one strand, causing a bias in the coverage profile if not corrected. The
only way to correct this is to know the average size of the real fragments.
nucleR uses this information when preprocessing single-ended sequences.
You can provide this information yourself (a length of 147bp is usually a
good approximation) or you can use this method to automatically estimate the
size of the inserts.
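
To make the strand bias concrete, here is a minimal base R sketch on invented toy data. It assumes that the correction consists of extending each read to the fragment length in its own direction, which is one common approach and not necessarily what nucleR does internally. All variable names and values are illustrative.

```r
# Toy fragments: each true fragment starts at frag_start and spans frag_len bp
frag_len   <- 120   # assumed fragment length for this toy example
read_width <- 40    # assumed read width
frag_start <- c(100, 350, 600)

# Single-end reads cover only one end of each fragment:
# "+" reads sit at the left end, "-" reads at the right end
plus  <- data.frame(start = frag_start,
                    end   = frag_start + read_width - 1)
minus <- data.frame(start = frag_start + frag_len - read_width,
                    end   = frag_start + frag_len - 1)

# Extend every read to the assumed fragment length:
# "+" reads grow to the right, "-" reads grow to the left
plus_ext  <- data.frame(start = plus$start,
                        end   = plus$start + frag_len - 1)
minus_ext <- data.frame(start = minus$end - frag_len + 1,
                        end   = minus$end)

# Once extended, reads from both strands describe the same intervals
same <- identical(plus_ext, minus_ext)
```

Without the extension, the two strands pile up on opposite sides of each fragment, which is the bias described above.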
fragmentLenDetect(
reads,
samples = 1000,
window = 1000,
min.shift = 1,
max.shift = 100,
mc.cores = 1,
as.shift = FALSE
)
## S4 method for signature 'AlignedRead'
fragmentLenDetect(
reads,
samples = 1000,
window = 1000,
min.shift = 1,
max.shift = 100,
mc.cores = 1,
as.shift = FALSE
)
## S4 method for signature 'GRanges'
fragmentLenDetect(
reads,
samples = 1000,
window = 1000,
min.shift = 1,
max.shift = 100,
mc.cores = 1,
as.shift = FALSE
)
reads: Raw single-end reads (ShortRead::AlignedRead or GenomicRanges::GRanges format).
samples: Number of samples used to perform the analysis (more = slower but more accurate).
window: Analysis window. Usually there's no need to touch this parameter.
min.shift, max.shift: Minimum and maximum shift to apply to the strands to detect the optimal fragment size. If the range is too big, performance decreases.
mc.cores: If multicore support is available, the maximum number of cores to use.
as.shift: If TRUE, returns the shift needed to align the middle of the reads in opposite strands. If FALSE, returns the mean inferred fragment length.
This function shifts one strand downstream one base at a time, from min.shift
to max.shift. At every step, the correlation between both strands is computed
on a randomly positioned region of length window. The shift giving the maximum
correlation is kept, and the result is averaged over samples repetitions.
The final returned length is the best shift detected plus the width of the
reads. You can speed up this function by reducing the samples value and/or
narrowing the shift range. The window size has almost no impact on
performance, although a value that is too small can give biased results.
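
The shift-and-correlate idea described above can be sketched in base R. This is a simplified illustration on invented toy data, not nucleR's implementation: it uses a single window covering the whole toy region instead of averaging over randomly placed windows, and `coverage_vec` is a hypothetical helper defined here.

```r
set.seed(42)

read_width <- 40    # assumed read width for the toy data
true_shift <- 80    # offset between the strand coverages, by construction
region_len <- 1000  # length of the toy region

# Hypothetical helper (not a nucleR function): per-base coverage on 1..len
coverage_vec <- function(starts, width, len) {
  cov <- numeric(len)
  for (s in starts) {
    idx <- s:(s + width - 1)
    cov[idx] <- cov[idx] + 1
  }
  cov
}

# "+" reads at random positions, "-" reads placed true_shift bp downstream
plus_starts  <- sample(1:800, 60)
minus_starts <- plus_starts + true_shift
cov_plus  <- coverage_vec(plus_starts,  read_width, region_len)
cov_minus <- coverage_vec(minus_starts, read_width, region_len)

# Shift the "+" coverage downstream one base at a time and correlate it
# with the "-" coverage over the overlapping positions
shifts <- 1:100
cors <- vapply(shifts, function(s) {
  cor(cov_plus[1:(region_len - s)], cov_minus[(1 + s):region_len])
}, numeric(1))

# The best shift is the one with the highest correlation; following the
# description above, the fragment length is that shift plus the read width
best_shift   <- shifts[which.max(cors)]
fragment_len <- best_shift + read_width
```

In this toy setup the best shift equals the offset used to build the data. On real reads the correlation peak is noisier, which is presumably why the function averages over many sampled windows.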
Inferred mean length of the inserts by default, or the shift needed to
align the strands if as.shift=TRUE.
Oscar Flores oflores@mmb.pcb.ub.es
library(nucleR)
library(GenomicRanges)
library(IRanges)
# Create a synthetic dataset, simulating single-end reads, for positive and
# negative strands
# Positive strand reads
pos <- syntheticNucMap(nuc.len=40, lin.len=130)$syn.reads
# Negative strand (shifted 147bp)
neg <- IRanges(end=start(pos)+147, width=40)
sim <- GRanges(
    seqnames="chr1",
    ranges=c(pos, neg),
    strand=c(rep("+", length(pos)), rep("-", length(neg)))
)
# Detect fragment length (we know by construction it is really 147)
# The sampling is restricted to 50 samples to speed up the example
fragmentLenDetect(sim, samples=50)