transcripts | R Documentation |
Generic functions to extract genomic features from a TxDb-like object. This page documents the methods for TxDb objects only.
transcripts(x, ...)
## S4 method for signature 'TxDb'
transcripts(x, columns=c("tx_id", "tx_name"), filter=NULL, use.names=FALSE)
exons(x, ...)
## S4 method for signature 'TxDb'
exons(x, columns="exon_id", filter=NULL, use.names=FALSE)
cds(x, ...)
## S4 method for signature 'TxDb'
cds(x, columns="cds_id", filter=NULL, use.names=FALSE)
genes(x, ...)
## S4 method for signature 'TxDb'
genes(x, columns="gene_id", filter=NULL, single.strand.genes.only=TRUE)
## S4 method for signature 'TxDb'
promoters(x, upstream=2000, downstream=200, use.names=TRUE, ...)
## S4 method for signature 'TxDb'
terminators(x, upstream=2000, downstream=200, use.names=TRUE, ...)
x |
A TxDb object. |
... |
For the For the |
columns |
Columns to include in the output.
Must be
If the vector is named, those names are used for the corresponding column in the element metadata of the returned object. |
filter |
Either |
use.names |
|
single.strand.genes.only |
If |
upstream , downstream |
For For For additional details see
|
These are the main functions for extracting features from a
TxDb-like object. Note that cds()
extracts the
bulk CDS parts stored in x
, that is, the CDS regions
associated with exons. It is often more useful to extract them grouped
by transcript with cdsBy()
.
To restrict the output based on interval information, use the
transcriptsByOverlaps()
, exonsByOverlaps()
,
or cdsByOverlaps()
function.
The promoters()
and terminators()
functions compute
user-defined promoter or terminator regions for the transcripts
in a TxDb-like object. The returned object is a
GRanges with one range per transcript
in the TxDb-like object.
Each range represents the promoter or terminator region associated
with a transcript, that is, the region around the TSS
(transcription start site) or TES (transcription end site) the
span of which is defined by upstream
and downstream
.
For additional details on how the promoter and terminator ranges
are computed and the handling of +
and -
strands see
?GenomicRanges::promoters
in the
GenomicRanges package.
A GRanges object. The only exception being
when genes()
is used with single.strand.genes.only=FALSE
,
in which case a GRangesList object is returned.
M. Carlson, P. Aboyoun and H. Pagès
transcriptsBy
and transcriptsByOverlaps
for more ways to extract genomic features
from a TxDb-like object.
transcriptLengths
for extracting the transcript
lengths (and other metrics) from a TxDb object.
exonicParts
and intronicParts
for
extracting non-overlapping exonic or intronic parts from a
TxDb-like object.
extendExonsIntoIntrons
for extending exons
into their adjacent introns.
extractTranscriptSeqs
for extracting transcript
(or CDS) sequences from reference sequences.
getPromoterSeq
for extracting gene promoter or
terminator sequences.
coverageByTranscript
for computing coverage by
transcript (or CDS) of a set of ranges.
select-methods for how to use the simple "select" interface to extract information from a TxDb object.
microRNAs
and tRNAs
for extracting
microRNA or tRNA genomic ranges from a TxDb object.
id2name
for mapping TxDb internal ids
to external names for a given feature type.
The TxDb class.
txdb_file <- system.file("extdata", "hg19_knownGene_sample.sqlite",
package="GenomicFeatures")
txdb <- loadDb(txdb_file)
## ---------------------------------------------------------------------
## transcripts()
## ---------------------------------------------------------------------
tx1 <- transcripts(txdb)
tx1
transcripts(txdb, use.names=TRUE)
transcripts(txdb, columns=NULL, use.names=TRUE)
filter <- list(tx_chrom = c("chr3", "chr5"), tx_strand = "+")
tx2 <- transcripts(txdb, filter=filter)
tx2
## Sanity checks:
stopifnot(
identical(mcols(tx1)$tx_id, seq_along(tx1)),
identical(tx2, tx1[seqnames(tx1) == "chr3" & strand(tx1) == "+"])
)
## ---------------------------------------------------------------------
## exons()
## ---------------------------------------------------------------------
exons(txdb, columns=c("EXONID", "TXNAME"),
filter=list(exon_id=1))
exons(txdb, columns=c("EXONID", "TXNAME"),
filter=list(tx_name="uc009vip.1"))
## ---------------------------------------------------------------------
## genes()
## ---------------------------------------------------------------------
genes(txdb) # a GRanges object
cols <- c("tx_id", "tx_chrom", "tx_strand",
"exon_id", "exon_chrom", "exon_strand")
## By default, genes are returned in a GRanges object and those that
## cannot be represented by a single genomic range (because they have
## exons located on both strands of the same reference sequence or on
## more than one reference sequence) are dropped with a message:
single_strand_genes <- genes(txdb, columns=cols)
## Because we've returned single strand genes only, the "tx_chrom"
## and "exon_chrom" metadata columns are guaranteed to match
## 'seqnames(single_strand_genes)':
stopifnot(identical(as.character(seqnames(single_strand_genes)),
as.character(mcols(single_strand_genes)$tx_chrom)))
stopifnot(identical(as.character(seqnames(single_strand_genes)),
as.character(mcols(single_strand_genes)$exon_chrom)))
## and also the "tx_strand" and "exon_strand" metadata columns are
## guaranteed to match 'strand(single_strand_genes)':
stopifnot(identical(as.character(strand(single_strand_genes)),
as.character(mcols(single_strand_genes)$tx_strand)))
stopifnot(identical(as.character(strand(single_strand_genes)),
as.character(mcols(single_strand_genes)$exon_strand)))
all_genes <- genes(txdb, columns=cols, single.strand.genes.only=FALSE)
all_genes # a GRangesList object
multiple_strand_genes <- all_genes[elementNROWS(all_genes) >= 2]
multiple_strand_genes
mcols(multiple_strand_genes)
## ---------------------------------------------------------------------
## promoters() and terminators()
## ---------------------------------------------------------------------
## This:
promoters(txdb, upstream=100, downstream=50)
## is equivalent to:
tx <- transcripts(txdb, use.names=TRUE)
promoters(tx, upstream=100, downstream=50)
## And this:
terminators(txdb, upstream=100, downstream=50)
## is equivalent to:
terminators(tx, upstream=100, downstream=50)
## Extra arguments are passed to transcripts(). So this:
columns <- c("tx_name", "gene_id")
promoters(txdb, upstream=100, downstream=50, columns=columns)
## is equivalent to:
promoters(transcripts(txdb, columns=columns, use.names=TRUE),
upstream=100, downstream=50)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.