GTuples-comparison: Comparing and ordering genomic tuples

GTuples-comparisonR Documentation

Comparing and ordering genomic tuples

Description

Methods for comparing and ordering the elements in one or more GTuples objects.

Usage

## duplicated()
## ------------

## S4 method for signature 'GTuples'
duplicated(x, incomparables = FALSE, fromLast = FALSE)

## match() and selfmatch()
## -------

## S4 method for signature 'GTuples,GTuples'
match(x, table, nomatch = NA_integer_, 
      incomparables = NULL, ignore.strand = FALSE)
## S4 method for signature 'GTuples'
selfmatch(x, ignore.strand = FALSE, ...)

## order() and related methods
## -----------------------------------------------------------------------------

## S4 method for signature 'GTuples'
order(..., na.last = TRUE, decreasing = FALSE, method = c("auto", "shell", "radix"))

## S4 method for signature 'GTuples'
sort(x, decreasing = FALSE, ignore.strand = FALSE, by)

## S4 method for signature 'GTuples'
rank(x, na.last = TRUE,
     ties.method = c("average", "first", "random", "max", "min"))
     
## S4 method for signature 'GTuples'
is.unsorted(x, na.rm=FALSE, strictly=FALSE, ignore.strand = FALSE)

## Generalized element-wise (aka "parallel") comparison of 2 GTuples
## objects
## -----------------------------------------------------------------------------

## S4 method for signature 'GTuples,GTuples'
pcompare(x, y)

Arguments

x, table, y

GTuples objects.

incomparables

Not supported.

fromLast, method, nomatch

See ?`GenomicRanges-comparison` in the GenomicRanges package for a description of these arguments.

ignore.strand

Whether or not the strand should be ignored when comparing 2 genomic tuples.

...

One or more GTuples objects. The GTuples objects after the first one are used to break ties

na.last

Ignored.

decreasing

TRUE or FALSE.

ties.method

A character string specifying how ties are treated. Only "first" is supported for now.

by

An optional formula that is resolved against as.env(x); the resulting variables are passed to order to generate the ordering permutation.

na.rm

logical. Should missing values be removed before checking? WARNING: This currently has no effect and is ignored.

strictly

logical indicating if the check should be for strictly increasing values.

Details

Two elements of a GTuples object (i.e. two genomic tuples) are considered equal if and only if they are on the same underlying sequence and strand, and have the same positions (tuples). duplicated() and unique() on a GTuples object are conforming to this.

The "natural order" for the elements of a GTuples object is to order them (a) first by sequence level, (b) then by strand, (c) then by pos_{1}, \ldots, pos_{m}. This way, the space of genomic tuples is totally ordered.

order(), sort(), is.unsorted(), and rank() on a GTuples object are using this "natural order".

Also ==, !=, <=, >=, < and > on GTuples objects are using this "natural order".

pcompare(x, y):

Performs "generalized range-wise comparison" of x and y, that is, returns an integer vector where the i-th element is a code describing how the i-th element in x is qualitatively positioned relatively to the i-th element in y.

A code that is < 0, = 0, or > 0, corresponds to x[i] < y[i], x[i] == y[i], or x[i] > y[i], respectively.

WARNING: These predefined codes are not as detailed as those for IPosRanges-comparison. Specifically, only the sign of the code matters, not the actual value.

match(x, table, nomatch = NA_integer_):

Returns an integer vector of the length of x, containing the index of the first matching range in table (or nomatch if there is no matching range) for each tuple in x.

duplicated(x, fromLast = FALSE, method = c("hash", "base")):

Determines which elements of x are equal to elements with smaller subscripts, and returns a logical vector indicating which elements are duplicates. See duplicated in the base package for more details.

unique(x, fromLast = FALSE, method = c("hash", "base")):

Removes duplicate tuples from x. See unique in the base package for more details.

x %in% table:

A shortcut for finding the ranges in x that match any of the tuples in table. Returns a logical vector of length equal to the number of tuples in x.

findMatches(x, table):

An enhanced version of match that returns all the matches in a Hits object.

countMatches(x, table):

Returns an integer vector of the length of x containing the number of matches in table for each element in x.

order(...):

Returns a permutation which rearranges its first argument (a GTuples object) into ascending order, breaking ties by further arguments. See order in the BiocGenerics package for more information.

sort(x):

Sorts x. See sort in the base package for more details.

rank(x, na.last = TRUE, ties.method = c("average", "first", "random", "max", "min")):

Returns the sample ranks of the tuples in x. See rank in the base package for more details.

Value

For pcompare: see Details section above.

For selfmatch: an integer vector of the same length as x.

For duplicated, unique, and %in%: see ?BiocGenerics::duplicated, ?BiocGenerics::unique, and ?`%in%`.

For findMatches: a Hits object by default (i.e. if select="all").

For countMatches: an integer vector of the length of x containing the number of matches in table for each element in x.

For sort: see ?BiocGenerics::sort.

Author(s)

Peter Hickey

See Also

  • The GTuples class.

  • GenomicRanges-comparison in the GRanges package for comparing and ordering genomic ranges.

  • intra-tuple-methods for intra-tuple transformations.

  • findOverlaps-methods for finding overlapping genomic ranges.

Examples

## GTuples object containing 3-tuples:
gt3 <- GTuples(seqnames = c('chr1', 'chr1', 'chr1', 'chr1', 'chr2'), 
               tuples = matrix(c(10L, 10L, 10L, 10L, 10L, 20L, 20L, 20L, 25L, 
                                 20L, 30L, 30L, 35L, 30L, 30L), ncol = 3), 
               strand = c('+', '-', '*', '+', '+'))
gt3 <- c(gt3, rev(gt3[3:5]))

## ---------------------------------------------------------------------
## A. ELEMENT-WISE (AKA "PARALLEL") COMPARISON OF 2 GTuples OBJECTS
## ---------------------------------------------------------------------
gt3[2] == gt3[2]  # TRUE
gt3[2] == gt3[5]  # FALSE
gt3 == gt3[4]
gt3 >= gt3[3]

## ---------------------------------------------------------------------
## B. duplicated(), unique()
## ---------------------------------------------------------------------
duplicated(gt3)
unique(gt3)

## ---------------------------------------------------------------------
## C. match(), %in%
## ---------------------------------------------------------------------
table <- gt3[2:5]
match(gt3, table)
match(gt3, table, ignore.strand = TRUE)

## ---------------------------------------------------------------------
## D. findMatches(), countMatches()
## ---------------------------------------------------------------------
findMatches(gt3, table)
countMatches(gt3, table)

findMatches(gt3, table, ignore.strand = TRUE)
countMatches(gt3, table, ignore.strand = TRUE)

gt3_levels <- unique(gt3)
countMatches(gt3_levels, gt3)

## ---------------------------------------------------------------------
## E. order() AND RELATED METHODS
## ---------------------------------------------------------------------
is.unsorted(gt3)
order(gt3)
sort(gt3)
is.unsorted(sort(gt3))

is.unsorted(gt3, ignore.strand=TRUE)
gt3_2 <- sort(gt3, ignore.strand=TRUE)
is.unsorted(gt3_2)  # TRUE
is.unsorted(gt3_2, ignore.strand=TRUE)  # FALSE

## TODO (TODO copied from GenomicRanges): Broken. Please fix!
#sort(gt3, by = ~ seqnames + start + end) # equivalent to (but slower than) above

score(gt3) <- rev(seq_len(length(gt3)))

## TODO (TODO copied from GenomicRanges): Broken. Please fix!
#sort(gt3, by = ~ score)

rank(gt3)

## ---------------------------------------------------------------------
## F. GENERALIZED ELEMENT-WISE COMPARISON OF 2 GTuples OBJECTS
## ---------------------------------------------------------------------
pcompare(gt3[3], gt3)


PeteHaitch/GenomicTuples documentation built on Nov. 5, 2024, 2 p.m.