SplicingGraphs-class: SplicingGraphs objects

SplicingGraphs-class R Documentation

SplicingGraphs objects

Description

The SplicingGraphs class is a container for storing splicing graphs together with the gene model that they are based on.

Usage

## Constructor:

SplicingGraphs(x, grouping=NULL, min.ntx=2, max.ntx=NA, check.introns=TRUE)

## SplicingGraphs basic API:

## S4 method for signature 'SplicingGraphs'
length(x)

## S4 method for signature 'SplicingGraphs'
names(x)

## S4 method for signature 'SplicingGraphs'
seqnames(x)

## S4 method for signature 'SplicingGraphs'
strand(x)

## S4 method for signature 'SplicingGraphs,ANY,ANY,ANY'
x[i, j, ... , drop=TRUE]

## S4 method for signature 'SplicingGraphs,ANY,ANY'
x[[i, j, ...]]

## S4 method for signature 'SplicingGraphs'
elementNROWS(x)

## S4 method for signature 'SplicingGraphs'
unlist(x, recursive=TRUE, use.names=TRUE)

## S4 method for signature 'SplicingGraphs'
seqinfo(x)

Arguments

x

For SplicingGraphs: A GRangesList object containing the exons of one or more genes grouped by transcript. Alternatively, x can be a TxDb object. See Details section below.

For the methods in the SplicingGraphs basic API: A SplicingGraphs object.

grouping

An optional object that represents the grouping by gene of the top-level elements (i.e. the transcripts) in x. See Details section below.

min.ntx, max.ntx

Single integers (or NA for max.ntx) specifying the minimum and maximum number of transcripts a gene must have to be considered for inclusion in the object returned by SplicingGraphs. A value of NA for max.ntx means no maximum.
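As a rough illustration of these two filters, the sketch below builds SplicingGraphs objects from the toy gene model used in the Examples section with different min.ntx and max.ntx settings. It assumes the SplicingGraphs and txdbmaker packages are installed; the variable names (sg_default, sg_all, sg_small) are ours, and the printed counts depend on the toy data.

```r
library(SplicingGraphs)  # SplicingGraphs(), toy_genes_gff()
library(txdbmaker)       # makeTxDbFromGFF()

toy_genes_txdb <- suppressWarnings(makeTxDbFromGFF(toy_genes_gff()))

## Default: only genes with at least 2 transcripts are kept.
sg_default <- SplicingGraphs(toy_genes_txdb)

## Also keep genes that have a single transcript.
sg_all <- SplicingGraphs(toy_genes_txdb, min.ntx=1)

## Keep only genes with 1 or 2 transcripts.
sg_small <- SplicingGraphs(toy_genes_txdb, min.ntx=1, max.ntx=2)

## Number of genes retained under each setting:
length(sg_default)
length(sg_all)
length(sg_small)

## Number of transcripts per retained gene:
elementNROWS(sg_small)
```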

check.introns

If TRUE, SplicingGraphs checks that, within each transcript, exons are ordered from 5' to 3' with gaps of at least 1 nucleotide between them.

i, j, ..., drop

A SplicingGraphs object is a list-like object and can therefore be subsetted like a list. When subsetting with [, the result is another SplicingGraphs object containing only the selected genes. When subsetting with [[, the result is an unnamed GRangesList object containing the exons of the selected gene grouped by transcript. As with an ordinary list, subsetting accepts only 1 subscript (i). The drop argument is ignored, and passing any additional argument (to j or in ...) raises an error.
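The difference between [ and [[ can be sketched as follows, using the toy gene model from the Examples section. This is a minimal sketch that assumes the SplicingGraphs and txdbmaker packages are installed; gene_id, one_gene and tx are illustrative names, and the gene is picked by position in names(sg) rather than hard-coded.

```r
library(SplicingGraphs)  # SplicingGraphs(), toy_genes_gff()
library(txdbmaker)       # makeTxDbFromGFF()

toy_genes_txdb <- suppressWarnings(makeTxDbFromGFF(toy_genes_gff()))
sg <- SplicingGraphs(toy_genes_txdb)

## Pick the first gene id present in the object.
gene_id <- names(sg)[1]

## Single bracket: still a SplicingGraphs object, restricted to 1 gene.
one_gene <- sg[gene_id]
class(one_gene)
length(one_gene)

## Double bracket: the exons of that gene grouped by transcript,
## returned as an unnamed GRangesList object.
tx <- sg[[gene_id]]
class(tx)
names(tx)
```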

recursive, use.names

A SplicingGraphs object is a list-like object and can therefore be unlisted with unlist. The result is a GRangesList object containing the exons grouped by transcript. By default this object has names on it, and the names are the gene ids. Note that because each element in this object represents a transcript (and not a gene), the names are not unique. If use.names=FALSE is used, the result has no names on it. The recursive argument is ignored.
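The effect of use.names can be sketched as follows, again with the toy gene model from the Examples section. The sketch assumes the SplicingGraphs and txdbmaker packages are installed; the variable names are ours.

```r
library(SplicingGraphs)  # SplicingGraphs(), toy_genes_gff()
library(txdbmaker)       # makeTxDbFromGFF()

toy_genes_txdb <- suppressWarnings(makeTxDbFromGFF(toy_genes_gff()))
sg <- SplicingGraphs(toy_genes_txdb)

## Default: 1 element per transcript, named with the id of its gene,
## so a gene id is repeated once per transcript of that gene.
ex_by_tx_named <- unlist(sg)
table(names(ex_by_tx_named))

## Without names: same transcripts, but no names on the result.
ex_by_tx_unnamed <- unlist(sg, use.names=FALSE)
names(ex_by_tx_unnamed)

## The number of transcripts matches the per-gene counts.
length(ex_by_tx_named)
sum(elementNROWS(sg))
```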

Details

Splicing graph theory only applies to genes that have all the exons of all their transcripts on the same chromosome and strand. In particular, in its current form, the theory cannot describe trans-splicing events. The SplicingGraphs constructor rejects genes that do not satisfy this condition.

The first argument of the SplicingGraphs constructor, x, can be either a GRangesList object or a TxDb object.

When x is a GRangesList object, it must contain the exons of one or more genes grouped by transcripts. More precisely, each top-level element in x must contain the genomic ranges of the exons for a particular transcript. Typically x will be obtained from a TxDb object txdb with exonsBy(txdb, by="tx", use.names=TRUE).

grouping is an optional argument that is only supported when x is a GRangesList object. It represents the grouping by gene of the top-level elements (i.e. the transcripts) in GRangesList object x. It can be either:

  • Missing (i.e. NULL). In that case, all the transcripts in x are considered to belong to the same gene and the SplicingGraphs object returned by SplicingGraphs will be unnamed.

  • A list of integer or character vectors, or an IntegerList or CharacterList object, whose length is the number of genes to process, and where grouping[[i]] is a vector of valid subscripts in x pointing to all the transcripts of the i-th gene.

  • A factor, character vector, or integer vector, of the same length as x and with 1 level (or distinct value) per gene.

  • A named GRangesList object containing transcripts grouped by genes i.e. each top-level element in grouping contains the genomic ranges of the transcripts for a particular gene. In that case, the grouping is inferred from the tx_id (or alternatively tx_name) metadata column of unlist(grouping) and all the values in that column must be in names(x). If x was obtained with exonsBy(txdb, by="tx", use.names=TRUE), then the GRangesList object used for grouping would typically be obtained with transcriptsBy(txdb, by="gene").

  • A data.frame or DataFrame with 2 columns: a gene_id column (factor, character vector, or integer vector), and a tx_id (or alternatively tx_name) column. In that case, x must be named and all the values in the tx_id (or tx_name) column must be in names(x).
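As a hedged sketch of two of the grouping forms listed above, the code below builds the same toy SplicingGraphs object twice: once with the GRangesList returned by transcriptsBy, and once with a plain list of character vectors derived from it. It assumes the SplicingGraphs and txdbmaker packages are installed, that every transcript name found in tx_by_gn is present in names(ex_by_tx), and that the names of the list are used as gene ids; tx, gene_of_tx and grouping_list are illustrative names, not part of the package.

```r
library(SplicingGraphs)  # SplicingGraphs(), toy_genes_gff()
library(txdbmaker)       # makeTxDbFromGFF()

toy_genes_txdb <- suppressWarnings(makeTxDbFromGFF(toy_genes_gff()))
ex_by_tx <- exonsBy(toy_genes_txdb, by="tx", use.names=TRUE)
tx_by_gn <- transcriptsBy(toy_genes_txdb, by="gene")

## Form 1: grouping supplied as a named GRangesList object.
sg_grl <- SplicingGraphs(ex_by_tx, tx_by_gn)

## Form 2: grouping supplied as a list of character vectors, where
## each element holds the names of the transcripts of 1 gene.
tx <- unlist(tx_by_gn, use.names=FALSE)
gene_of_tx <- rep(names(tx_by_gn), elementNROWS(tx_by_gn))
grouping_list <- split(mcols(tx)$tx_name, gene_of_tx)
sg_list <- SplicingGraphs(ex_by_tx, grouping_list)

## Both calls are expected to describe the same set of genes.
names(sg_grl)
names(sg_list)
```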

Value

For SplicingGraphs: a SplicingGraphs object with 1 element per gene.

For length: the number of genes in x, which is also the number of splicing graphs in x.

For names: the gene ids. Note that the names on a SplicingGraphs object are always unique and cannot be modified.

For seqnames: a named factor of the length of x containing the name of the chromosome for each gene.

For strand: a named factor of the length of x containing the strand for each gene.

For elementNROWS: the number of transcripts per gene.

For seqinfo: the seqinfo of the GRangesList or TxDb object that was used to construct the SplicingGraphs object.

Author(s)

H. Pagès

References

Heber, S., Alekseyev, M., Sze, S., Tang, H., and Pevzner, P. A. (2002) Splicing graphs and EST assembly problem. Bioinformatics 18: S181-S188

Sammeth, M. (2009) Complete Alternative Splicing Events Are Bubbles in Splicing Graphs. J. Comput. Biol. 16: 1117-1140

See Also

This man page is part of the SplicingGraphs package. Please see ?`SplicingGraphs-package` for an overview of the package and for an index of its man pages.

Other topics related to this man page and documented in other packages:

  • The exonsBy and transcriptsBy functions, and the TxDb class, defined in the GenomicFeatures package.

  • The GRangesList class defined in the GenomicRanges package.

  • The IntegerList and CharacterList classes defined in the IRanges package.

  • The DataFrame class defined in the S4Vectors package.

Examples

## ---------------------------------------------------------------------
## 1. Load a toy gene model as a TxDb object
## ---------------------------------------------------------------------

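library(SplicingGraphs)  # provides SplicingGraphs() and toy_genes_gff()
library(txdbmaker)       # provides makeTxDbFromGFF()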
suppressWarnings(
  toy_genes_txdb <- makeTxDbFromGFF(toy_genes_gff())
)

## ---------------------------------------------------------------------
## 2. Compute all the splicing graphs (1 graph per gene) and return them
##    in a SplicingGraphs object
## ---------------------------------------------------------------------

## Extract the exons grouped by transcript:
ex_by_tx <- exonsBy(toy_genes_txdb, by="tx", use.names=TRUE)

## Extract the transcripts grouped by gene:
tx_by_gn <- transcriptsBy(toy_genes_txdb, by="gene")

sg <- SplicingGraphs(ex_by_tx, tx_by_gn)
sg

## Alternatively 'sg' can be constructed directly from the TxDb
## object:
sg2 <- SplicingGraphs(toy_genes_txdb)  # same as 'sg'
sg2

## Note that because SplicingGraphs objects have a slot that is an
## environment (for caching the bubbles), they cannot be compared with
## 'identical()' (will always return FALSE). 'all.equal()' should be
## used instead:
stopifnot(isTRUE(all.equal(sg2, sg)))

## 'sg' has 1 element per gene and 'names(sg)' gives the gene ids:
length(sg)
names(sg)

## ---------------------------------------------------------------------
## 3. Basic manipulation of a SplicingGraphs object
## ---------------------------------------------------------------------

## Basic accessors:
seqnames(sg)
strand(sg)
seqinfo(sg)

## Number of transcripts per gene:
elementNROWS(sg)

## The transcripts of a given gene can be extracted with [[. The result
## is an *unnamed* GRangesList object containing the exons grouped by
## transcript:
sg[["geneD"]]

## See '?plotTranscripts' for how to plot those transcripts.

## The transcripts of all the genes can be extracted with unlist(). The
## result is a *named* GRangesList object containing the exons grouped
## by transcript. The names on the object are the gene ids:
ex_by_tx <- unlist(sg)
ex_by_tx

Bioconductor/SplicingGraphs documentation built on Nov. 3, 2024, 8:11 a.m.