featuretypeCounts | R Documentation |
Counts how many reads in short read alignment files (BAM format) overlap with entire annotation categories. This utility is useful for analyzing the distribution of read mappings across feature types, e.g. coding versus non-coding genes. By default, read counts are reported separately for the sense and antisense strand of each feature type. To minimize memory consumption, the BAM files are processed in a stream using utilities from the Rsamtools and GenomicAlignments packages. The counts can be reported for each read length separately or as a single value for reads of any length. Subsequently, the counting results can be plotted with the associated plotfeaturetypeCounts function.
featuretypeCounts(bfl, grl, singleEnd = TRUE, readlength = NULL, type = "data.frame")
bfl |
|
grl |
|
singleEnd |
Specifies whether the BAM files contain alignments for single-end (SE) or paired-end (PE) read data. |
readlength |
Integer vector specifying the read length values for which to report counts
separately. If |
type |
Determines whether the results are returned as |
The results are returned as a data.frame or a list of data.frames. For details, see the type argument above. The result data.frames contain the following columns in the given order:
SampleName |
Sample names obtained from |
Strand |
Sense or antisense strand of read mappings. |
Featuretype |
Name of feature type provided by |
Featuretypelength |
Total genomic length of each reduced feature type in bases. This value is useful to normalize the read counts by genomic length units, e.g. in plots. |
Subsequent columns |
Counts for reads of any length or for individual read lengths. |
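The length normalization mentioned for the Featuretypelength column can be sketched in base R as follows. This is a minimal illustration on a made-up data.frame that only mimics the column layout listed above; the sample names, the strand labels, the counts and the count column name anyreadlength are assumptions for the example, not output of featuretypeCounts.

```r
## Toy stand-in for a featuretypeCounts() result; all values are made up.
## The count column name "anyreadlength" and the strand labels are
## assumptions chosen for illustration only.
fc_toy <- data.frame(
  SampleName = rep(c("M1A", "A1A"), each = 4),
  Strand = rep(c("Sense", "Antisense"), times = 4),
  Featuretype = rep(rep(c("exon", "intron"), each = 2), times = 2),
  Featuretypelength = rep(rep(c(2000, 5000), each = 2), times = 2),
  anyreadlength = c(400, 40, 100, 50, 800, 20, 250, 25),
  stringsAsFactors = FALSE
)

## Normalize the raw counts to reads per kilobase of feature type
fc_toy$perKb <- fc_toy$anyreadlength / (fc_toy$Featuretypelength / 1000)
fc_toy
```

Dividing by the reduced feature length makes feature types of very different genomic size comparable, which is the use case the Featuretypelength column is described for.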
Thomas Girke
plotfeaturetypeCounts, genFeatures
## Construct SYSargs2 object from param and targets files
targets <- system.file("extdata", "targets.txt", package="systemPipeR")
dir_path <- system.file("extdata/cwl", package="systemPipeR")
args <- loadWorkflow(targets=targets, wf_file="hisat2/hisat2-mapping-se.cwl",
input_file="hisat2/hisat2-mapping-se.yml", dir_path=dir_path)
args <- renderWF(args, inputvars=c(FileName="_FASTQ_PATH1_", SampleName="_SampleName_"))
args
## Not run:
## Run alignments
args <- runCommandline(args, dir = FALSE, make_bam = TRUE)
outpaths <- subsetWF(args, slot = "output", subset = 1, index = 1)
## Features from sample data of systemPipeRdata package
library(txdbmaker)
file <- system.file("extdata/annotation", "tair10.gff", package="systemPipeRdata")
txdb <- makeTxDbFromGFF(file=file, format="gff3", organism="Arabidopsis")
feat <- genFeatures(txdb, featuretype="all", reduce_ranges=TRUE, upstream=1000, downstream=0, verbose=TRUE)
## Generate and plot feature counts for specific read lengths
fc <- featuretypeCounts(bfl=BamFileList(outpaths, yieldSize=50000), grl=feat, singleEnd=TRUE, readlength=c(74:76,99:102), type="data.frame")
p <- plotfeaturetypeCounts(x=fc, graphicsfile="featureCounts.pdf", graphicsformat="pdf", scales="fixed", anyreadlength=FALSE)
## Generate and plot feature counts for any read length
fc2 <- featuretypeCounts(bfl=BamFileList(outpaths, yieldSize=50000), grl=feat, singleEnd=TRUE, readlength=NULL, type="data.frame")
p2 <- plotfeaturetypeCounts(x=fc2, graphicsfile="featureCounts2.pdf", graphicsformat="pdf", scales="fixed", anyreadlength=TRUE)
## End(Not run)