asTX: Map genomic to transcript coordinates by reference

View source: R/ranges_helpers.R

asTXR Documentation

Map genomic to transcript coordinates by reference

Description

Map range coordinates between features in the genome and transcriptome (reference) space.

Usage

asTX(
  grl,
  reference,
  ignore.strand = FALSE,
  x.is.sorted = TRUE,
  tx.is.sorted = TRUE
)

Arguments

grl

a GRangesList of ranges within the reference, grl must either have names matching, or a meta column called 'names' that gives grouping names. i.e. grl named uORF_1_ENST00001, must then have a names meta column with ENST00001.

reference

a GRangesList of ranges that include grl as a subset of ranges. Example: cds is grl and mrna can be reference

ignore.strand

When ignore.strand is TRUE, strand is ignored in overlaps operations (i.e., all strands are considered "+") and the strand in the output is '*'.
When ignore.strand is FALSE (default) strand in the output is taken from the transcripts argument. When transcripts is a GRangesList, all inner list elements of a common list element must have the same strand or an error is thrown.
Mapped position is computed by counting from the transcription start site (TSS) and is not affected by the value of ignore.strand.

x.is.sorted

if x is a GRangesList object, are "-" strand groups pre-sorted in decreasing order within group, default: TRUE

tx.is.sorted

if transcripts is a GRangesList object, are "-" strand groups pre-sorted in decreasing order within group, default: TRUE

Details

Similar to GenomicFeatures' pmapToTranscripts, but in this version the grl ranges are compared to reference ranges with same name, not by index. This gives a large speedup, but also requires all objects must be named.

Value

a GRangesList in transcript coordinates

See Also

Other ExtendGenomicRanges: coveragePerTiling(), extendLeaders(), extendTrailers(), reduceKeepAttr(), tile1(), txSeqsFromFa(), windowPerGroup()

Examples

seqname <- c("tx1", "tx2", "tx3")
seqs <- c("ATGGGTATTTATA", "AAAAA", "ATGGGTAATA")
grIn1 <- GRanges(seqnames = "1",
                 ranges = IRanges(start = c(21, 10), end = c(23, 19)),
                 strand = "-")
grIn2 <- GRanges(seqnames = "1",
                 ranges = IRanges(start = c(1), end = c(5)),
                 strand = "-")
grIn3 <- GRanges(seqnames = "1",
                 ranges = IRanges(start = c(1010), end = c(1019)),
                 strand = "-")
grl <- GRangesList(grIn1, grIn2, grIn3)
names(grl) <- seqname
# Find ORFs
test_ranges <- findMapORFs(grl, seqs,
                 "ATG|TGG|GGG",
                 "TAA|AAT|ATA",
                 longestORF = FALSE,
                 minimumLength = 0)
# Genomic coordinates ORFs
test_ranges
# Transcript coordinate ORFs
asTX(test_ranges, reference = grl)
# seqnames will here be index of transcript it came from


Roleren/ORFik documentation built on Oct. 19, 2024, 7:37 a.m.