orfScore: Get ORFscore for a GRangesList of ORFs


Description

ORFscore tests whether the first of the 3 possible reading frames in an ORF has more reads than the second and third frames. IMPORTANT: Only use p-shifted libraries (see detectRibosomeShifts); otherwise this score is meaningless.

Usage

orfScore(grl, RFP, is.sorted = FALSE, weight = "score", overlapGrl = NULL)

Arguments

grl

a GRangesList of 5' UTRs, CDS, transcripts, etc.

RFP

ribosomal footprints, given as a GAlignments or GRanges object. They must already be shifted and resized to the p-site. Requires a $size column with original read lengths.

is.sorted

logical (default: FALSE), whether grl is sorted. Sorted means + strand groups are in increasing ranges (1,2,3) and - strand groups are in decreasing ranges (3,2,1).

weight

(default: 'score'), if defined, a character name of a valid meta column in subject. For example, GRanges("chr1", 1, "+", score = 5) means the score column says this alignment region was found 5 times. ORFik .bedo files contain a score column like this, as do CAGEr CAGE files and many other package formats. You can also assign a score column manually.

overlapGrl

an integer vector (default: NULL). If defined, it must be countOverlaps(grl, RFP). Added for speed if you have already computed it.
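To illustrate what a score column used as weight means, here is a minimal base-R sketch. It does not call ORFik; the data and the variable names (pos, collapsed) are made up for illustration. Identical reads are collapsed to one row per position, and the score column records how many times each position was seen.

```r
# Hypothetical data: four width-1 reads, three of them at position 1.
pos <- c(1, 1, 1, 25)

# Collapse duplicates: one row per position, with the number of
# observations stored in a score column.
counts <- table(pos)
collapsed <- data.frame(pos = as.numeric(names(counts)),
                        score = as.vector(counts))

# Counting rows ignores the replicate information ...
n_unweighted <- nrow(collapsed)   # 2
# ... while summing the score column recovers the original read count.
n_weighted <- sum(collapsed$score)  # 4
```

If the score column of your reads holds something other than replicate counts, summing it like this gives a meaningless total, which is why the weight should then be disabled.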

Details

Pseudocode: assume rrf is the fraction of reads in a specific frame

ORFScore = log(rrf1 + rrf2 + rrf3)

If rrf2 or rrf3 is bigger than rrf1, negate the resulting value.

ORFScore[rrf1Smaller] <- ORFScore[rrf1Smaller] * -1
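The pseudocode is abbreviated, so the following base-R sketch makes some assumptions explicit. It assumes that each per-frame term is the squared deviation of that frame's read count from the mean count, divided by the mean, and that the logarithm is log2 of the sum plus 1, which is how the score is commonly described in the cited reference. The function orf_score_sketch is a hypothetical helper, not part of ORFik, and the package internals may differ.

```r
# Hypothetical helper (not ORFik API): ORFscore from per-frame read counts.
# f1, f2, f3 are numeric vectors with one element per ORF.
orf_score_sketch <- function(f1, f2, f3) {
  m <- (f1 + f2 + f3) / 3  # mean read count per frame
  # Assumed per-frame term: squared deviation from the mean, scaled by
  # the mean. ORFs without any reads get a deviation of 0.
  dev <- ifelse(m > 0,
                ((f1 - m)^2 + (f2 - m)^2 + (f3 - m)^2) / m,
                0)
  score <- log2(dev + 1)
  # Negate where the first frame does not have the most reads.
  rrf1Smaller <- f1 < f2 | f1 < f3
  score[rrf1Smaller] <- score[rrf1Smaller] * -1
  score
}

# Three made-up ORFs: reads in frame 1 only, in frame 2 only, and no reads.
scores <- orf_score_sketch(f1 = c(3, 0, 0), f2 = c(0, 1, 0), f3 = c(0, 0, 0))
```

Under these assumptions the first ORF scores log2(7), the second scores -log2(3), and the ORF without reads scores 0.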

As a result there is one value per ORF: positive values mean that the first frame has the most reads, negative values mean that it does not. NOTE: If reads are not of width 1, then a read spanning 1-4 on a range of 1-4 will give the scores frame1 = 2, frame2 = 1, frame3 = 1. Arguably only the 5' end is important, so that the read should only give frame1 = 1. To get this, first resize the reads to their 5' end.
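The effect of read width on the frame counts can be checked with a small base-R sketch. The helper frame_counts is hypothetical and assumes positions are 1-based and relative to the ORF start, so position 1 falls in frame 1.

```r
# Hypothetical helper: number of covered positions per frame for one read.
# start and end are 1-based positions relative to the ORF start.
frame_counts <- function(start, end) {
  pos <- seq(start, end)
  frame <- (pos - 1) %% 3 + 1   # positions 1, 4, 7, ... are frame 1
  tabulate(frame, nbins = 3)    # counts for frame 1, 2, 3
}

full_read <- frame_counts(1, 4)  # read spanning 1-4: 2 1 1
five_prime <- frame_counts(1, 1) # resized to the 5' end: 1 0 0
```

The full-width read spreads its signal over all three frames, while the read resized to its 5' end contributes to frame 1 only.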

NOTES: 1. P-shifting is not exact, so some functional ORFs will get a bad ORFscore.
2. If a score column is defined, it is used as weights. Set weight = 1L if your reads have no weights and the score column holds something else. See getWeights.

Value

a data.table with 4 columns: the ORFscore (ORFScores) and the score of each of the 3 tiles (frame_zero_RP, frame_one_RP, frame_two_RP)

References

doi: 10.1002/embj.201488411

See Also

Other features: computeFeaturesCage(), computeFeatures(), countOverlapsW(), disengagementScore(), distToCds(), distToTSS(), entropy(), floss(), fpkm_calc(), fpkm(), fractionLength(), initiationScore(), insideOutsideORF(), isInFrame(), isOverlapping(), kozakSequenceScore(), rankOrder(), ribosomeReleaseScore(), ribosomeStallingScore(), startRegionCoverage(), startRegion(), stopRegion(), subsetCoverage(), translationalEff()

Examples

library(ORFik)

ORF <- GRanges(seqnames = "1",
               ranges = IRanges(start = c(1, 10, 20), end = c(5, 15, 25)),
               strand = "+")
names(ORF) <- c("tx1", "tx1", "tx1")
grl <- GRangesList(tx1_1 = ORF)
RFP <- GRanges("1", IRanges(25, 25), "+") # 1 width position based
score(RFP) <- 28 # original width
orfScore(grl, RFP) # negative because more hits on frames 1,2 than 0.

# example with positive result, more hits on frame 0 (in frame of ORF)
RFP <- GRanges("1", IRanges(c(1, 1, 1, 25), width = 1), "+")
score(RFP) <- c(28, 29, 31, 28) # original width
orfScore(grl, RFP)

ORFik documentation built on March 27, 2021, 6 p.m.