Description Usage Arguments Details Value Author(s) See Also Examples
Quantify methylation of cytosines from bisulfite sequencing data.
1 2 3 4 5 |
proj |
a |
query |
a |
reportLevel |
report results combined for C's
( |
, the default) or individually for single
alignments (reportLevel
=“alignment”). The latter imposes
further restrictions on some arguments (see ‘Details’).
mode |
cytosine quantification mode, one of:
|
collapseBySample |
if |
collapseByQueryRegion |
if |
asGRanges |
if |
mask |
an optional |
reference |
source of bam files; can be either “genome”
(then the alignments against the genome are used) or the name of an
auxiliary target sequence (then alignments against this target
sequence will be used). The auxiliary name must correspond to the
name contained in the auxiliary file refered by the
|
keepZero |
if |
mapqMin |
minimal mapping quality of alignments to be included when
counting (mapping quality must be greater than or equal to
|
mapqMax |
maximal mapping quality of alignments to be included when
counting (mapping quality must be less than or equal to |
clObj |
a cluster object to be used for parallel processing of multiple files (see ‘Details’). |
qMeth
can be used on a qProject
object from a bisulfite
sequencing experiment (sequencing of bisulfite-converted DNA), such as
the one returned by qAlign
when its parameter bisulfite
is
set to a different value than “no”.
qMeth
quantifies DNA methylation by counting total and
methylated events for individual cytosines, using the alignments that
have been generated in converted (three-letter) sequence space for
example by qAlign
. A methylated event corresponds to a C/C
match in the alignment, an unmethylated event to a T/C mismatch (or G/G
matches and A/G mismatches on the opposite strand). For paired-end
samples, the part of the left fragment alignment that overlaps
with the right fragment alignment is ignored, preventing the
use of redundant information coming from the same molecule.
Both directed (bisulfite
=“dir”) and undirected
(bisulfite
=“undir”) experimental protocols are supported
by qAlign
and qMeth
.
By default, results are returned per C nucleotide. If
reportLevel
=“alignment”, results are reported separately
for individual alignments. In that case, query
has to be a
GRanges
object with exactly one region, mode
has to be
either “CpG” or “allC”, the arguments
collapseByQueryRegion
, asGRanges
, mask
and
keepZero
have no effect and allele-specific projects are
treated in the same way as normal (non-allele specific) projects.
Using the parameter mode
, quantification can be limited to
cytosines in CpG context, and counts obtained for the two cytosines on
opposite strands within a single CpG can be combined (summed).
The quantification of methylation for all cytosines in the query region(s)
(mode
=“allC”) should be done with care, especially for
large query regions, as the return value may require a large amount of
memory.
If mode
is set to “var”, qMeth
only counts reads
from the strand opposite of the cytosine and reports total and
matching alignments. For a position identical to the reference
sequence, only matches (and very few sequencing errors) are
expected, independent on the methylation state of the cytosine. A
reduced fraction of alignments matching the reference are indicative
of sequence variations in the sequenced sample.
mapqMin
and mapqMax
allow to select alignments
based on their mapping qualities. mapqMin
and mapqMax
can
take integer values between 0 and 255 and equal to
-10
log10 Pr(mapping position is wrong), rounded to the nearest
integer. A value 255 indicates that the mapping quality is not available.
If an object that inherits from class cluster
is provided to
the clObj
argument, for example an object returned by
makeCluster
from package parallel,
the quantification task is split into multiple chunks and processed in
parallel using clusterApplyLB
from package
parallel. Not all tasks will be efficiently parallelized: For
example, a single query region and a single (group of) bam files will
not be split into multiple chunks.
For reportLevel
=“C”, a GRanges
object if
asGRanges
=TRUE
, otherwise a data.frame
.
Each row contains the coordinates of individual cytosines for
collapseByQueryRegion
=FALSE
or query regions
for collapseByQueryRegion
=TRUE
.
In addition to the coordinates columns (or seqnames
,
ranges
and strand
slots for GRanges
objects),
each row contains per bam file:
Two values (total and methylated events, with suffixes _T and _M), or
if the qProject
object was created including a SNP table,
six values (total and methylated events for Reference, Unknown and
Alternative genotypes, with suffixed _TR, _TU, _TA, _MR, _MU and _MA).
In the latter case, C's or CpG's that overlap with SNPs in the table
are removed.
If collapseBySample
=TRUE
, groups of bam files with
identical sample name are combined (summed) and will be represented by
a single set of total and methylated count columns.
If mode
=“var”, the _T and _M columns correspond to total
and matching alignments overlapping the guanine paired to the
cytosine.
For reportLevel
=“alignment”, a list
with one
element per bam file or sample (depending on
collapseBySample
). Each list element is another list with the
elements:
aid
: character vector with unique alignment identifiers
Cid
: integer vector with genomic coordinate of C base
strand
: character vector with the strand of the C base
meth
: integer vector with methylation state for
alignment and C defined by aid
and Cid
. The values are
1 for methylated or 0 for unmethylated states.
Anita Lerch, Dimos Gaidatzis and Michael Stadler
qAlign
,
makeCluster
from package parallel
1 2 3 4 5 6 7 8 9 10 11 12 | # copy example data to current working directory
file.copy(system.file(package="QuasR", "extdata"), ".", recursive=TRUE)
# create alignments
sampleFile <- "extdata/samples_bis_single.txt"
genomeFile <- "extdata/hg19sub.fa"
proj <- qAlign(sampleFile, genomeFile, bisulfite="dir")
proj
# calculate methylation states
meth <- qMeth(proj, mode="CpGcomb")
meth
|
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.