TelescopeParam-class: Telescope parameter class

TelescopeParam-classR Documentation

Telescope parameter class

Description

This is a class for storing parameters provided to the Telescope algorithm.

Build an object of the class TelescopeParam.

Usage

TelescopeParam(
  bfl,
  teFeatures,
  aggregateby = character(0),
  ovMode = "ovUnion",
  geneFeatures = NULL,
  singleEnd = TRUE,
  strandMode = 1L,
  ignoreStrand = FALSE,
  fragments = FALSE,
  minOverlFract = 0.2,
  pi_prior = 0L,
  theta_prior = 0L,
  em_epsilon = 1e-07,
  maxIter = 100L,
  reassign_mode = "exclude",
  conf_prob = 0.9,
  verbose = TRUE
)

## S4 method for signature 'TelescopeParam'
show(object)

Arguments

bfl

A BamFile or BamFileList object, or a character string vector of BAM filenames.

teFeatures

A GRanges or GRangesList object. Elements in this object should have names, which are used as a grouping factor for genomic ranges forming a common locus (equivalent to "locus" column in Telescope). This grouping is performed previous to TE expression quantification, unlike the aggregation of quantifications performed when the aggregateby parameter is specified, which is performed after individual TE instances are quantified.

aggregateby

Character vector with column names from the annotation to be used to aggregate quantifications. By default, this is an empty vector, which means that the names of the input GRanges or GRangesList object given in the teFeatures parameter are used to aggregate quantifications.

ovMode

Character vector indicating the overlapping mode. Available options are: "ovUnion" (default) and "ovIntersectionStrict", which implement the corresponding methods from HTSeq (https://htseq.readthedocs.io/en/release_0.11.1/count.html). Ambiguous alignments (alignments overlapping > 1 feature) are addressed as in the original Telescope method: the overlap with the longest overlapping length is kept.

geneFeatures

(Default NULL) A GRanges or GRangesList object with the gene annotated features to be quantified. The TEtranscripts approach for gene expression quantification is used, in which overlaps with multi-mapping reads are preferentially assigned to TEs. Elements should have names indicating the gene name/id. In case that geneFeatures is a GRanges and contains a metadata column named type, only the elements with type = exon are considered for the analysis. Then, exon counts are summarized to the gene level. If NULL, gene expression is not quantified.

singleEnd

(Default TRUE) Logical value indicating if reads are single (TRUE) or paired-end (FALSE).

strandMode

(Default 1) Numeric vector which can take values 0, 1 or 2. The strand mode is a per-object switch on GAlignmentPairs objects that controls the behavior of the strand getter. See GAlignmentPairs class for further detail. If singleEnd = TRUE, then strandMode is ignored.

ignoreStrand

(Default FALSE) A logical which defines if the strand should be taken into consideration when computing the overlap between reads and annotated features. When ignoreStrand = FALSE, an aligned read is considered to be overlapping an annotated feature as long as they have a non-empty intersecting genomic range on the same strand, while when ignoreStrand = TRUE the strand is not considered.

fragments

(Default FALSE) A logical; applied to paired-end data only. When fragments=FALSE, the read-counting method only counts ‘mated pairs’ from opposite strands (non-ambiguous properly paired reads), while when fragments=TRUE same-strand pairs, singletons, reads with unmapped pairs and other ambiguous or not properly paired fragments are also counted (see "Pairing criteria" in readGAlignments()). fragments=TRUE is equivalent to the original Telescope algorithm. For further details see summarizeOverlaps().

minOverlFract

(Default 0.2) A numeric scalar. minOverlFract is multiplied by the read length and the resulting value is used to discard alignments for which the overlapping length (number of base pairs the alignment and the feature overlap) is lower. When no minimum overlap is required, set minOverlFract = 0.

pi_prior

(Default 0) A positive integer scalar indicating the prior on pi. This is equivalent to adding n unique reads.

theta_prior

(Default 0) A positive integer scalar storing the prior on Q. Equivalent to adding n non-unique reads.

em_epsilon

(Default 1e-7) A numeric scalar indicating the EM Algorithm Epsilon cutoff.

maxIter

A positive integer scalar storing the maximum number of iterations of the EM SQUAREM algorithm (Du and Varadhan, 2020). Default is 100 and this value is passed to the maxiter parameter of the squarem() function.

reassign_mode

(Default 'exclude') Character vector indicating reassignment mode after EM step. Available methods are 'exclude' (reads with more than one best assignment are excluded from the final counts), 'choose' (when reads have more than one best assignment, one of them is randomly chosen), 'average' (the read count is divided evenly among the best assignments) and 'conf' (only assignments that exceed a certain threshold -defined by conf_prob parameter- are accepted, then the read count is proportionally divided among the assignments above conf_prob).

conf_prob

(Default 0.9) Minimum probability for high confidence assignment.

verbose

(Default TRUE) Logical value indicating whether to report progress.

object

A TelescopeParam object.

Details

This is the constructor function for objects of the class TelescopeParam-class. This type of object is the input to the function qtex() for quantifying expression of transposable elements, which will call the Telescope algorithm Bendall et al. (2019) with this type of object.

Value

A TelescopeParam object.

Slots

singleEnd

(Default TRUE) Logical value indicating if reads are single (TRUE) or paired-end (FALSE).

strandMode

(Default 1) Numeric vector which can take values 0, 1 or 2. The strand mode is a per-object switch on GAlignmentPairs objects that controls the behavior of the strand getter. See GAlignmentPairs class for further detail. If singleEnd = TRUE, then strandMode is ignored.

ignoreStrand

(Default FALSE) A logical which defines if the strand should be taken into consideration when computing the overlap between reads and annotated features. When ignoreStrand = FALSE, an aligned read is considered to be overlapping an annotated feature as long as they have a non-empty intersecting genomic range on the same strand, while when ignoreStrand = TRUE the strand is not considered.

fragments

(Default FALSE) A logical; applied to paired-end data only. When fragments=FALSE, the read-counting method only counts ‘mated pairs’ from opposite strands (non-ambiguous properly paired reads), while when fragments=TRUE same-strand pairs, singletons, reads with unmapped pairs and other ambiguous or not properly paired fragments are also counted (see "Pairing criteria" in readGAlignments()). fragments=TRUE is equivalent to the original Telescope algorithm. For further details see summarizeOverlaps().

minOverlFract

(Default 0.2) A numeric scalar. minOverlFract is multiplied by the read length and the resulting value is used to discard alignments for which the overlapping length (number of base pairs the alignment and the feature overlap) is lower. When no minimum overlap is required, set minOverlFract = 0.

pi_prior

(Default 0) A positive integer scalar indicating the prior on pi. This is equivalent to adding n unique reads.

theta_prior

(Default 0) A positive integer scalar storing the prior on Q. Equivalent to adding n non-unique reads.

em_epsilon

(Default 1e-7) A numeric scalar indicating the EM Algorithm Epsilon cutoff.

maxIter

A positive integer scalar storing the maximum number of iterations of the EM SQUAREM algorithm (Du and Varadhan, 2020). Default is 100 and this value is passed to the maxiter parameter of the squarem() function.

reassign_mode

(Default 'exclude') Character vector indicating reassignment mode after EM step. Available methods are 'exclude' (reads with more than one best assignment are excluded from the final counts), 'choose' (when reads have more than one best assignment, one of them is randomly chosen), 'average' (the read count is divided evenly among the best assignments) and 'conf' (only assignments that exceed a certain threshold -defined by conf_prob parameter- are accepted, then the read count is proportionally divided among the assignments above conf_prob).

conf_prob

(Default 0.9) Minimum probability for high confidence assignment.

References

Bendall et al. Telescope: characterization of the retrotranscriptome by accurate estimation of transposable element expression. PLOS Comp. Biol. 2019;15(9):e1006453. DOI: https://doi.org/10.1371/journal.pcbi.1006453

Bendall et al. Telescope: characterization of the retrotranscriptome by accurate estimation of transposable element expression. PLOS Comp. Biol. 2019;15(9):e1006453. DOI: https://doi.org/10.1371/journal.pcbi.1006453

Examples

bamfiles <- list.files(system.file("extdata", package="atena"),
                       pattern="*.bam", full.names=TRUE)
## Not run: 
## use the following two instructions to fetch annotations, they are here
## commented out to enable running this example quickly when building and
## checking the package
rmskat <- annotaTEs(genome="dm6", parsefun=rmskatenaparser,
                    strict=FALSE, insert=500)
rmskLTR <- getLTRs(rmskat, relLength=0.8,
                   fullLength=TRUE,
                   partial=TRUE,
                   otherLTR=TRUE)

## End(Not run)

## DO NOT TYPE THIS INSTRUCTION, WHICH JUST LOADS A PRE-COMPUTED ANNOTATION
## YOU SHOULD USE THE INSTRUCTIONS ABOVE TO FETCH ANNOTATIONS
rmskLTR <- readRDS(system.file("extdata", "rmskatLTRrlen80flenpartoth.rds",
                               package="atena"))

## build a parameter object for Telescope
tspar <- TelescopeParam(bfl=bamfiles,
                        teFeatures=rmskLTR,
                        singleEnd=TRUE,
                        ignoreStrand=TRUE)
tspar


functionalgenomics/atena documentation built on Nov. 4, 2024, 7:33 p.m.