Chimeric reads may be caused by sequencing a chromosomal aberration or by technical issues during sample preparation. This method implements several filter steps to remove false chimeric reads.
## S4 method for signature 'list,IntegerRangesList,DNAString,numeric,numeric'
filterChimericReads(alnReads, targetRegion, linkerSeq, minDist, dupReadDist)
alnReads
A list storing the aligned reads as produced by the function scanBam.
targetRegion
An object of class IRangesList containing the target region, e.g. of the capture array that was used. The parameter may be omitted for a non-targeted sequencing approach.
linkerSeq
A linker sequence that was used during sample preparation. It may be omitted.
minDist
The minimum distance between the two local alignments of a read (see Details). Default: 1000.
dupReadDist
The maximum distance between the 5 prime start positions of two duplicated reads (see Details). Default: 1.
The following filter steps are performed:
1. All chimeric reads with exactly two local alignments are extracted. Reads with more than two local alignments are discarded.
2. If the targetRegion argument is given, at least one local alignment of a chimeric read must overlap the target region. If both local alignments are outside the target region, the read is discarded.
3. If the linkerSeq argument is given, all chimeric reads that have the linker sequence between their local alignments are removed. When searching for the linker sequence, up to 4 mismatches or indels are allowed, and the linker sequence must not start or end within the first or last ten bases of the read. The function searches for linkerSeq and for its reverse complement.
4. If both local alignments of a read are on the same chromosome, they must be at least minDist bases apart. Otherwise, the read appears to span a deletion rather than a chromosomal aberration and is discarded.
5. Duplicated reads are removed. Two reads are duplicates if they lie on the same strand and have the same 5 prime start position. Due to sequencing and alignment errors, the start position may vary by up to dupReadDist bases. Among duplicated reads, only the longest read is kept.
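The linker check in step 3 can be illustrated with a minimal base R sketch. This is not the package's implementation: it swaps in base R's approximate matcher aregexec, ignores where the local alignments lie, and only tests whether the linker (or its reverse complement) occurs away from the read ends. The helper names revComp and hasInternalLinker, the 20-base linker, and the toy reads are all hypothetical.

```r
# Reverse complement of a DNA string (base R, upper-case A/C/G/T only).
revComp <- function(s) {
  paste(rev(strsplit(chartr("ACGT", "TGCA", s), "")[[1]]), collapse = "")
}

# TRUE if the linker or its reverse complement is found with at most
# maxEdits edits and neither starts nor ends within `margin` bases of a read end.
# Hypothetical helper; only the match reported by aregexec is inspected.
hasInternalLinker <- function(read, linker, maxEdits = 4, margin = 10) {
  for (pat in c(linker, revComp(linker))) {
    m <- aregexec(pat, read, max.distance = list(all = maxEdits),
                  fixed = TRUE)[[1]]
    if (m[1] == -1) next                      # no approximate match
    start <- m[1]
    end <- start + attr(m, "match.length")[1] - 1
    if (start > margin && end <= nchar(read) - margin) return(TRUE)
  }
  FALSE
}

linker  <- "GTCACGTTCGGATCCAAGCT"             # made-up linker, not a real one
mutated <- "GTCACGTTCGCATCCAAGCT"             # same linker with one substitution

readWithLinker <- paste0(strrep("A", 30), mutated, strrep("T", 30))
readWithRevComp <- paste0(strrep("A", 30), revComp(linker), strrep("T", 30))
readWithout <- paste0(strrep("A", 30), strrep("T", 30))

res <- c(
  linker  = hasInternalLinker(readWithLinker, linker),
  revComp = hasInternalLinker(readWithRevComp, linker),
  none    = hasInternalLinker(readWithout, linker)
)
print(res)
```

Reads for which such a check returns TRUE would be the ones removed in step 3.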
Reads passing all filtering steps are returned in the list structure as given by the alnReads argument (as derived from the scanBam method). A data frame with information about the number of reads that passed each filter is added to the list.
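Steps 4 and 5 can be sketched on a toy table in base R. This is an illustration under stated assumptions, not the package's code: the data frame layout and every column name (chrA, endA, chrB, startB, strand, start5, width) are hypothetical, the gap between alignments is taken as the distance from the end of the first to the start of the second, and duplicates are grouped by chaining sorted start positions that differ by at most dupReadDist.

```r
# One row per chimeric read with two local alignments (A and B).
# All column names are hypothetical.
reads <- data.frame(
  id     = c("r1", "r2", "r3", "r4"),
  chrA   = c("chr1", "chr1", "chr1", "chr2"),
  endA   = c(1000, 5000, 5001, 300),
  chrB   = c("chr1", "chr9", "chr9", "chr2"),
  startB = c(1200, 800, 800, 90000),
  strand = c("+", "+", "+", "-"),
  start5 = c(900, 4800, 4801, 250),   # 5 prime start position of the read
  width  = c(250, 310, 280, 400),     # read length
  stringsAsFactors = FALSE
)
minDist <- 1000
dupReadDist <- 1

# Step 4: discard reads whose alignments are on the same chromosome
# and less than minDist bases apart.
sameChr  <- reads$chrA == reads$chrB
tooClose <- sameChr & abs(reads$startB - reads$endA) < minDist
kept <- reads[!tooClose, ]

# Step 5: group reads on the same chromosome and strand whose sorted
# 5 prime starts differ by at most dupReadDist, then keep the longest.
kept <- kept[order(kept$chrA, kept$strand, kept$start5), ]
n <- nrow(kept)
newGroup <- c(TRUE,
              kept$chrA[-1] != kept$chrA[-n] |
              kept$strand[-1] != kept$strand[-n] |
              diff(kept$start5) > dupReadDist)
kept$group <- cumsum(newGroup)
longest <- unname(vapply(split(seq_len(n), kept$group),
                         function(i) i[which.max(kept$width[i])],
                         integer(1)))
deduped <- kept[longest, ]
print(deduped$id)
```

In this toy data, r1 fails the minDist check, and r2 and r3 form one duplicate group from which the longer read is retained.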
A list containing only filtered chimeric reads. The list has the same structure as the given argument alnReads. Additionally, one element “log” with logging information about each filtering step is added.
Hans-Ulrich Klein
detectBreakpoints, mergeBreakpoints, Breakpoints-class, scanBam, sequenceCaptureLinkers
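library(Rsamtools)
library(R453Plus1Toolbox)

# Read the example BAM file shipped with the package
bamFile <- system.file("extdata", "SVDetection", "bam", "N01.bam", package="R453Plus1Toolbox")
bam <- scanBam(bamFile)

# Target region of the capture array and the linker used in sample preparation
data(captureArray)
linker <- sequenceCaptureLinkers("gSel3")[[1]]

# Apply all filter steps with the default minDist and dupReadDist
filterReads <- filterChimericReads(bam, targetRegion=captureArray, linkerSeq=linker)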