loadCoverage | R Documentation |
For a group of samples this function reads the coverage information for a specific chromosome directly from the BAM files. It then merges them into a DataFrame and removes the bases that do not pass the cutoff.
loadCoverage(
files,
chr,
cutoff = NULL,
filter = "one",
chrlen = NULL,
output = NULL,
bai = NULL,
...
)
files |
A character vector with the full path to the sample BAM files
(or BigWig files).
The names are used for the column names of the DataFrame. Check
rawFiles for constructing |
chr |
Chromosome to read. Should be in the format matching the one used in the raw data. |
cutoff |
This argument is passed to filterData. |
filter |
Has to be either |
chrlen |
The chromosome length in base pairs. If it's |
output |
If |
bai |
The full path to the BAM index files. If |
... |
Arguments passed to other methods and/or advanced arguments. Advanced arguments:
Passed to extendedMapSeqlevels, define_cluster,
scanBamFlag and filterData.
Note that filterData is used internally
by loadCoverage and has the important arguments |
The ...
argument can be used to control which alignments to consider
when reading from BAM files. See scanBamFlag.
Parallelization for loading the data in chunks is used only used when
tilewidth
is specified. You may use up to one core per tile.
If you set the advanced argument drop.D = TRUE
, bases with CIGAR
string "D" (deletion from reference) will be excluded from the base-level
coverage calculation.
If you are working with data from an organism different from 'Homo sapiens'
specify so by setting the global 'species' and 'chrsStyle' options. For
example:
options(species = 'arabidopsis_thaliana')
options(chrsStyle = 'NCBI')
A list with two components.
is a DataFrame object where each column represents a sample. The number of rows depends on the number of base pairs that passed the cutoff and the information stored is the coverage at that given base.
is a logical Rle with the positions of the chromosome that passed the cutoff.
Leonardo Collado-Torres, Andrew Jaffe
fullCoverage, getTotalMapped
datadir <- system.file("extdata", "genomeData", package = "derfinder")
files <- rawFiles(
datadir = datadir, samplepatt = "*accepted_hits.bam$",
fileterm = NULL
)
## Shorten the column names
names(files) <- gsub("_accepted_hits.bam", "", names(files))
## Read and filter the data, only for 2 files
dataSmall <- loadCoverage(files = files[1:2], chr = "21", cutoff = 0)
## Not run:
## Export to BigWig files
createBw(list("chr21" = dataSmall))
## Load data from BigWig files
dataSmall.bw <- loadCoverage(c(
ERR009101 = "ERR009101.bw", ERR009102 =
"ERR009102.bw"
), chr = "chr21")
## Compare them
mapply(function(x, y) {
x - y
}, dataSmall$coverage, dataSmall.bw$coverage)
## Note that the only difference is the type of Rle (integer vs numeric) used
## to store the data.
## End(Not run)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.