The GenomicTuples R package defines general-purpose containers
for storing genomic tuples. It aims to provide functionality for
tuples of genomic co-ordinates that is analogous to that available for
genomic ranges in the GenomicRanges Bioconductor package.
As you will see, the functionality of the GenomicTuples package
is based almost entirely on the wonderful GenomicRanges
package. Therefore, I have tried to keep the user interface as similar
as possible. This vignette is also heavily based on the vignette "An
Introduction to Genomic Ranges Classes", which is included with the
GenomicRanges package[^GenomicRanges]. While not essential, familiarity
with the GenomicRanges package will be of benefit in understanding the
GenomicTuples package.
[^GenomicRanges]: The GenomicRanges vignette can be accessed
by typing vignette("GenomicRangesIntroduction", package = "GenomicRanges")
at the R console.
A genomic tuple is defined by a sequence name (seqnames), a
strand (strand) and a tuple (tuples). All positions
in a genomic tuple must be on the same strand and sorted in ascending order.
Each tuple has an associated size, which is a positive integer. For example,
chr1:+:{34, 39, 60} is a 3-tuple (size = 3) of the
positions chr1:34, chr1:39 and chr1:60 on the + strand.
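The constraints on the positions can be illustrated with a minimal base R sketch. The helper is_valid_tuples() is hypothetical and not part of GenomicTuples; it assumes one tuple per row of an integer matrix and checks only the ascending-order rule stated above.

```r
# Hypothetical helper (not part of GenomicTuples): check that every row of a
# matrix of positions is sorted in ascending order.
is_valid_tuples <- function(pos) {
  stopifnot(is.matrix(pos), is.numeric(pos))
  all(apply(pos, 1, function(row) !is.unsorted(row)))
}

# One 3-tuple per row; the size is the number of columns
pos <- matrix(c(34L, 39L, 60L,
                10L, 20L, 30L), ncol = 3, byrow = TRUE)
size <- ncol(pos)              # 3

is_valid_tuples(pos)           # TRUE
is_valid_tuples(pos[, 3:1])    # FALSE: positions are in descending order
```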
When referring to genomic tuples of a general (fixed) size, I
will abbreviate these to $m$-tuples, where $m$ = size. I
will refer to the first position as $pos_{1}$ (pos1), the second
as $pos_{2}$ (pos2), $\ldots{}$, and the final position as
$pos_{m}$ (posm).
The difference between a genomic tuple and a genomic range can be
thought of as the difference between a set and an interval. For example,
the genomic tuple chr10:-:{800, 900} only includes the
positions chr10:-:800 and chr10:-:900 whereas the
genomic range chr10:-:[800, 900] includes the positions
chr10:-:800, chr10:-:801, chr10:-:802,
$\ldots{}$, chr10:-:900.
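To make the distinction concrete, the following base R sketch enumerates the positions covered by each representation. It uses plain integer vectors rather than GenomicTuples or GenomicRanges objects, and the variable names are purely illustrative.

```r
# Positions covered by the genomic tuple chr10:-:{800, 900}: a set of two
tuple_positions <- c(800L, 900L)

# Positions covered by the genomic range chr10:-:[800, 900]: every integer
# in the closed interval
range_positions <- 800:900

length(tuple_positions)    # 2
length(range_positions)    # 101

# Every tuple position lies within the range, but not vice versa
all(tuple_positions %in% range_positions)    # TRUE
all(range_positions %in% tuple_positions)    # FALSE
```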
In short, genomic tuples are useful whenever the co-ordinates of your genomic data are better defined by a set than by an interval.
The original use case for the GTuples class was to
store the genomic co-ordinates of "methylation patterns". I am currently
developing these ideas in a separate R package,
PeteHaitch/MethylationTuples (available on GitHub), which makes heavy use
of the GTuples class. Other genomic data, such as long reads containing
multiple variants, may also be better conceptualised as genomic tuples rather
than as genomic ranges and therefore may benefit from the
GenomicTuples infrastructure.
The GTuples class represents a collection of genomic tuples,
where each tuple has the same size. These objects can be
created by using the GTuples constructor function. For example, the following
code creates a GTuples object with 10 genomic tuples:
library(GenomicTuples)
seqinfo <- Seqinfo(paste0("chr", 1:3), c(1000, 2000, 1500), NA, "mock1")
gt3 <- GTuples(seqnames = Rle(c("chr1", "chr2", "chr1", "chr3"),
                              c(1, 3, 2, 4)),
               tuples = matrix(c(1:10, 2:11, 3:12), ncol = 3),
               strand = Rle(strand(c("-", "+", "*", "+", "-")),
                            c(1, 2, 2, 3, 2)),
               score = 1:10, GC = seq(1, 0, length = 10), seqinfo = seqinfo)
names(gt3) <- letters[1:10]
gt3
The output of the GTuples show method is very similar to that of the show
method for GenomicRanges::GRanges objects. Namely, it separates
the information into left-hand and right-hand regions that are separated by
| symbols. The genomic coordinates (seqnames, tuples, and strand) are
located on the left-hand side and the metadata columns (annotation) are
located on the right. For this example, the metadata comprises score and
GC information, but almost anything can be stored in the
metadata portion of a GTuples object.
The main difference between a GTuples object and a GenomicRanges::GRanges object is that the former uses tuples while the latter uses ranges in the genomic coordinates.
For even more information on the GTuples class, be sure to consult the documentation:
?GTuples
Most methods defined for GenomicRanges::GRanges are also defined for GTuples. Those that are not yet defined, which are those that make sense for ranges but generally not for tuples, return error messages.
If you require a method that is not defined for GTuples but is defined for GenomicRanges::GRanges, you can first coerce the GTuples object to a GenomicRanges::GRanges object. Warning: coercing a GTuples object to a GenomicRanges::GRanges object is generally a destructive operation, because the resulting ranges retain only $pos_{1}$ and $pos_{m}$ of each tuple.
as(gt3, "GRanges")
The components of the genomic coordinates within a GTuples
object can be extracted using the seqnames, tuples, and strand
accessor functions.
Warning: The tuples accessor should be used in place of the ranges
accessor. While the ranges method is well-defined, namely it accesses $pos_{1}$ and $pos_{m}$ of the object, this is not generally what is desired or required.
seqnames(gt3)
tuples(gt3)
strand(gt3)
Stored annotations for these coordinates can be extracted as a
DataFrame object using the mcols accessor:
mcols(gt3)
The Seqinfo can be extracted using the seqinfo accessor:
seqinfo(gt3)
Methods for accessing the length and names are also defined:
length(gt3)
names(gt3)
GTuples objects can be divided into groups using the
split method. This produces a GTuplesList object, a
class that will be discussed in detail in the next section:
sp <- split(gt3, rep(1:2, each = 5))
sp
If you then extract the components of this GTuplesList, they can also be
combined by using the c and append methods:
c(sp[[1]], sp[[2]])
The expected subsetting operations are also available for GTuples objects:
gt3[2:3]
A second argument to the [ subset operator can be used to
specify which metadata columns to extract from the GTuples
object. For example:
gt3[2:3, "GC"]
You can also assign into elements of the GTuples object. Here
is an example where the 2nd row of a GTuples object is replaced
with the 1st row of gt3:
gt3_mod <- gt3
gt3_mod[2] <- gt3[1]
head(gt3_mod, n = 3)
There are also methods to repeat, reverse, or select specific portions of GTuples objects:
rep(gt3[2], times = 3)
rev(gt3)
head(gt3, n = 2)
tail(gt3, n = 2)
window(gt3, start = 2, end = 4)
Basic tuple characteristics of GTuples objects can be extracted
using the start, end, and tuples methods.
Warning: While the width method is well-defined, namely as $pos_{m} - pos_{1} + 1$, this may not be what is required. Instead, please see the IPD
method that will be discussed in the next section.
start(gt3)
end(gt3)
tuples(gt3)
Most of the intra-range methods defined for GenomicRanges::GRanges objects are not currently defined via extension for GTuples objects due to the differences between ranges and tuples. Those not currently defined, and which return an error message, are:
narrow
flank
promoters
resize
Ops
I am happy to add these methods if appropriate, so please contact me if you have suggestions for good definitions.
Both the trim and shift methods are well-defined,
although the former is somewhat limited since it will return an error if
the internal positions exceed the seqlengths:
shift(gt3, 500)
# Raises warning due to tuple being outside of seqlength
x <- shift(gt3[1], 999)
x
# Returns an error because internal position exceeds sequence length, resulting
# in a malformed tuple when trimmed.
trim(x)
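The failure of trim can be understood with plain integer arithmetic. The sketch below uses base R only and assumes, as the comments above suggest, that trimming clamps just the first and last positions to the sequence length. The variable names are illustrative and this is not the GenomicTuples implementation.

```r
seqlength <- 1000L                  # length of chr1 in the mock Seqinfo
pos <- c(1L, 2L, 3L)                # the positions stored in gt3[1]
shifted <- pos + 999L               # 1000, 1001, 1002

# Only pos1 is still within the sequence
shifted <= seqlength                # TRUE FALSE FALSE

# Range-style trimming (assumed here) clamps the two end points only
trimmed_start <- min(shifted[1], seqlength)   # 1000
trimmed_end <- min(shifted[3], seqlength)     # 1000

# The internal position (1001) now lies beyond the trimmed end, so the
# positions are no longer in ascending order
is.unsorted(c(trimmed_start, shifted[2], trimmed_end))   # TRUE
```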
None of the inter-range methods defined for GenomicRanges::GRanges objects are currently defined via extension for GTuples objects due to the differences between ranges and tuples. Those not currently defined, and which return an error message, are:
range
reduce
gaps
disjoin
isDisjoint
disjointBins
I am happy to add these methods if appropriate, so please contact me if you have suggestions for good definitions.
None of the interval set operations defined for GenomicRanges::GRanges objects are currently defined via extension for GTuples objects due to the differences between ranges and tuples. Those not currently defined, and which return an error message, are:
union
intersect
setdiff
punion
pintersect
psetdiff
I am happy to add these methods if appropriate, so please contact me if you have suggestions for good definitions.
GTuples objects have a few specifically defined methods that do not
exist for GenomicRanges::GRanges. These are tuples, size and IPD.
The tuples method, which we have already seen, is somewhat
analogous to the ranges method for
GenomicRanges::GRanges, although it returns an integer matrix
rather than an IRanges::IRanges object:
tuples(gt3)
The size method returns the size of the tuples stored in the
object:
size(gt3)
Every $m$-tuple with $m \geq 2$ has an associated vector of intra-pair
distances ($IPD$). This is defined as
$IPD = (pos_{2} - pos_{1}, \ldots, pos_{m} - pos_{m - 1})$. The
IPD method returns this as an integer matrix, where the $i^{th}$
row contains the $IPD$ for the $i^{th}$ tuple:
IPD(gt3)
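The definition of $IPD$ can be reproduced with base R on a plain integer matrix of positions. This is a sketch of the arithmetic only, not the package's IPD method; the matrix pos and the other variable names are made up for illustration.

```r
# Three 3-tuples, one per row (columns are pos1, pos2, pos3)
pos <- matrix(c( 1L,  2L,  3L,
                10L, 25L, 30L,
                 5L, 20L, 30L), ncol = 3, byrow = TRUE)

# IPD: differences between consecutive positions within each tuple.
# diff() takes differences between rows, so transpose before and after.
ipd <- t(diff(t(pos)))
ipd[2, ]    # 15 5, i.e. (25 - 10, 30 - 25)

# The range-style width, pos_m - pos_1 + 1, discards the internal spacing
width <- pos[, ncol(pos)] - pos[, 1] + 1L
width       # 3 21 26
```

Note that each width equals the sum of the corresponding row of ipd plus one, so the $IPD$ carries strictly more information than the width.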
While the GTuples class can be thought of as a matrix-like
object, with the number of columns equal to the size of the
tuples plus two (one for the seqname and one for the
strand), internally, it extends the GenomicRanges::GRanges
class. Specifically, the ranges slot stores an
IRanges::IRanges object containing $pos_{1}$ and $pos_{m}$ and,
if size $> 2$, a matrix is used to store the co-ordinates of the "internal
positions", $pos_{2}, \ldots, pos_{m - 1}$, in the internalPos slot. If
size $\leq 2$ then the internalPos slot is set to NULL. The
size is stored as an integer in the size slot.
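The storage layout described above can be mimicked with base R objects. The sketch below is an illustration under simplifying assumptions (plain vectors and a matrix instead of S4 slots); the names start, end and internal_pos are stand-ins and not the actual slot contents.

```r
# Two 4-tuples, one per row
pos <- matrix(c(10L, 20L, 30L, 40L,
                15L, 25L, 35L, 45L), ncol = 4, byrow = TRUE)
size <- ncol(pos)

# pos1 and posm play the role of the range's start and end
start <- pos[, 1]
end <- pos[, size]

# Internal positions pos2, ..., pos(m-1) exist only when size > 2
internal_pos <- if (size > 2L) {
  pos[, seq(2L, size - 1L), drop = FALSE]
} else {
  NULL
}

# The full matrix of positions can be rebuilt from the three pieces
rebuilt <- cbind(start, internal_pos, end, deparse.level = 0)
all(rebuilt == pos)    # TRUE
```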
While there are arguments for creating stand-alone GTuples and GTuplesList classes, by extending the GenomicRanges::GRanges and GenomicRanges::GRangesList classes we get a lot of very useful functionality "for free" via appropriately defined inheritance.
The GTuplesList class is a container to store a S4Vectors::List of GTuples objects. It extends the GenomicRanges::GRangesList class.
Currently, all GTuples in a GTuplesList must have the
same size[^size]. I expect that users will mostly use GTuples objects and
have little need to directly use GTuplesList objects.
[^size]: This may be changed in future versions of GenomicTuples.
seqinfo <- Seqinfo(paste0("chr", 1:3), c(1000, 2000, 1500), NA, "mock1")
gt3 <- GTuples(seqnames = Rle(c("chr1", "chr2", "chr1", "chr3"),
                              c(1, 3, 2, 4)),
               tuples = matrix(c(1:10, 2:11, 3:12), ncol = 3),
               strand = Rle(strand(c("-", "+", "*", "+", "-")),
                            c(1, 2, 2, 3, 2)),
               score = 1:10, GC = seq(1, 0, length = 10), seqinfo = seqinfo)
gtl3 <- GTuplesList(A = gt3[1:5], B = gt3[6:10])
gtl3
For even more information on the GTuplesList class, be sure to consult the documentation:
?GTuplesList
Most methods defined for GenomicRanges::GRangesList are also applicable to GTuplesList. Those that are not yet defined, which are those that make sense for ranges but generally not for tuples, return error messages.
If a method that is not defined for GTuplesList but is defined for GenomicRanges::GRangesList is truly required, then this can be achieved by first coercing the GTuplesList object to a GenomicRanges::GRangesList object, noting that this is generally a destructive operation:
as(gtl3, "GRangesList")
The basic accessors are very similar to those available for GTuples objects, except that they typically return a list since the input is now essentially a list of GTuples objects:
seqnames(gtl3)
# Returns a list of integer matrices
tuples(gtl3)
tuples(gtl3)[[1]]
strand(gtl3)
The length and names methods will return the length and
names of the list, respectively:
length(gtl3)
names(gtl3)
The Seqinfo can be extracted using the seqinfo accessor:
seqinfo(gtl3)
The elementNROWS method returns an integer vector
corresponding to the result of calling length on each individual
GTuples object contained by the GTuplesList. This is a
faster alternative to calling lapply on the GTuplesList:
elementNROWS(gtl3)
You can also use isEmpty to test if a GTuplesList object
contains anything:
isEmpty(gtl3)
isEmpty(GTuplesList())
Finally, in the context of a GTuplesList object, the
mcols method performs a similar operation to what it does on a
GTuples object. However, this metadata now refers to
information at the list level instead of the level of the individual
GTuples objects:
mcols(gtl3) <- c("Feature A", "Feature B")
mcols(gtl3)
GTuplesList objects can be unlisted to combine the separate GTuples objects that they contain as an expanded GTuples:
ul <- unlist(gtl3)
ul
You can also combine GTuplesList objects together using append or c.
Subsetting of GTuplesList objects is identical to subsetting of GenomicRanges::GRangesList objects:
gtl3[1]
gtl3[[1]]
gtl3["A"]
gtl3$B
When subsetting a GTuplesList, you can also pass in a second parameter (as with a GTuples object) to again specify which of the metadata columns you wish to select:
gtl3[1, "score"]
gtl3["B", "GC"]
The head, tail, rep, rev, and window methods all behave as you would
expect them to for a List object. For example, the elements referred to by
window are now list elements instead of GTuples elements:
rep(gtl3[[1]], times = 3)
rev(gtl3)
head(gtl3, n = 1)
tail(gtl3, n = 1)
window(gtl3, start = 1, end = 1)
Basic tuple characteristics of GTuplesList objects can be
extracted using the start, end, and tuples methods. These are very similar
to those available for GTuples objects, except that they typically return
a list since the input is now essentially a list of GTuples objects.
Warning: While the width method is well-defined, namely it returns an IntegerList of $pos_{m} - pos_{1} + 1$, this is not generally what is desired or required. Instead, please see the IPD
method that is discussed later.
start(gtl3)
end(gtl3)
tuples(gtl3)
Most of the intra-range methods defined for GenomicRanges::GRangesList objects are not currently defined via extension for GTuplesList objects due to the differences between ranges and tuples. Those not currently defined, and which return an error message, are:
flank
promoters
resize
restrict
I am happy to add these methods if appropriate, so please contact me if you have suggestions for good definitions.
The shift method is well-defined:
shift(gtl3, 500)
shift(gtl3, IntegerList(A = 300L, B = 500L))
None of the inter-range methods defined for GenomicRanges::GRangesList objects are currently defined via extension for GTuplesList objects due to the differences between ranges and tuples. Those not currently defined, and which return an error message, are:
range
reduce
disjoin
isDisjoint
I am happy to add these methods if appropriate, so please contact me if you have suggestions for good definitions.
None of the interval set operations defined for GenomicRanges::GRangesList objects are currently defined via extension for GTuplesList objects due to the differences between ranges and tuples. Those not currently defined, and which return an error message, are:
punion
pintersect
psetdiff
I am happy to add these methods if appropriate, so please contact me if you have suggestions for good definitions.
As for GenomicRanges::GRangesList objects, there is a family of apply
methods for GTuplesList objects. These include lapply, sapply, mapply,
endoapply, mendoapply, Map, and Reduce. The different looping methods
defined for GTuplesList objects are useful for returning different kinds of
results. The standard lapply and sapply behave according to convention,
with the lapply method returning a list and sapply returning a more
simplified output:
lapply(gtl3, length)
sapply(gtl3, length)
As with GenomicRanges::GRangesList objects, there is also a multivariate
version of sapply, called mapply, defined for GTuplesList objects. And,
if you don't want the results simplified, you can call the Map method, which
does the same thing as mapply but without simplifying the output:
gtl3_shift <- shift(gtl3, 10)
# Name the shifted copy so that it can be told apart from gtl3
names(gtl3_shift) <- c("shiftA", "shiftB")
mapply(c, gtl3, gtl3_shift)
Map(c, gtl3, gtl3_shift)
The endoapply method will return the results as a GTuplesList object rather
than as a list:
endoapply(gtl3, rev)
There is also a multivariate version of the endoapply method in the form of
the mendoapply method:
mendoapply(c, gtl3, gtl3_shift)
Finally, the Reduce method will allow the GTuples objects to be collapsed
across the whole of the GTuplesList object:
Reduce(c, gtl3)
Like GTuples objects, GTuplesList objects have a few specifically defined
methods that do not exist for GenomicRanges::GRangesList. These are tuples,
size and IPD. These are identical to the methods for GTuples, except that they
typically return a list since the input is now essentially a List of GTuples
objects.
tuples(gtl3)
tuples(gtl3)[[1]]
size(gtl3)
IPD(gtl3)
IPD(gtl3)[[1]]
The GTuplesList class extends the GenomicRanges::GRangesList class.
findOverlaps-based methods
The definition of what constitutes an "overlap" between genomic tuples, or
between genomic tuples and genomic ranges, lies at the heart of all
findOverlaps-based methods[^findOverlaps] for GTuples and GTuplesList
objects.
[^findOverlaps]: The findOverlaps-based methods are findOverlaps,
countOverlaps, overlapsAny and subsetByOverlaps.
I have chosen a definition that matches my intuition of what constitutes an
"overlap" between genomic tuples or between genomic tuples and genomic
ranges. However, I am open to suggestions on amending or extending this
behaviour in future versions of GenomicTuples.
I consider two genomic tuples to be equal (type = "equal") if they have
identical sequence names (seqnames), strands (strand) and tuples
(tuples). For 1-tuples and 2-tuples, this means we can simply
defer to the findOverlaps-based methods for
GenomicRanges::GRanges and GenomicRanges::GRangesList
objects via inheritance. However, we cannot do the same for $m$-tuples
with $m > 2$ since this would ignore the "internal positions".
Therefore, I have implemented a special case of the
findOverlaps method for when size $> 2$ and type = "equal", which ensures
that the "internal positions" are also checked for equality.
In all other cases, that is, when type = "any", type = "start", type = "end"
or type = "within", genomic tuples are treated as if they were genomic
ranges. Specifically, GTuples (resp. GTuplesList) are treated as though
they were GenomicRanges::GRanges (resp. GenomicRanges::GRangesList) with
pos1 = start and posm = end.
Genomic tuples are always treated as genomic ranges when searching for overlaps between genomic tuples and genomic ranges.
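The reason type = "equal" needs special handling for size $> 2$ can be seen with two plain integer vectors. This base R sketch compares positions only (it assumes identical seqnames and strand) and does not call findOverlaps; the variable names are illustrative.

```r
# Two 3-tuples on the same seqname and strand (positions only)
x <- c(10L, 20L, 30L)
y <- c(10L, 25L, 30L)
m <- length(x)

# Treated as ranges, only pos1 (start) and posm (end) are compared
equal_as_ranges <- x[1] == y[1] && x[m] == y[m]   # TRUE

# Treated as tuples, the internal position pos2 is compared as well
equal_as_tuples <- all(x == y)                     # FALSE
```

A range-based comparison would therefore report these two tuples as equal even though their internal positions differ.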
It is easiest to understand the above definitions by studying a few examples.
Firstly, for 1-tuples where the GTuples methods use the GenomicRanges::GRanges methods:
# Construct example 1-tuples
gt1 <- GTuples(seqnames = c('chr1', 'chr1', 'chr1', 'chr2'),
               tuples = matrix(c(10L, 10L, 10L, 10L), ncol = 1),
               strand = c('+', '-', '*', '+'))
# GRanges version of gt1
gr1 <- as(gt1, "GRanges")
findOverlaps(gt1, gt1, type = 'any')
# GTuples and GRanges methods identical
identical(findOverlaps(gt1, gt1, type = 'any'),
          findOverlaps(gr1, gr1, type = 'any'))
findOverlaps(gt1, gt1, type = 'start')
# GTuples and GRanges methods identical
identical(findOverlaps(gt1, gt1, type = 'start'),
          findOverlaps(gr1, gr1, type = 'start'))
findOverlaps(gt1, gt1, type = 'end')
# GTuples and GRanges methods identical
identical(findOverlaps(gt1, gt1, type = 'end'),
          findOverlaps(gr1, gr1, type = 'end'))
findOverlaps(gt1, gt1, type = 'within')
# GTuples and GRanges methods identical
identical(findOverlaps(gt1, gt1, type = 'within'),
          findOverlaps(gr1, gr1, type = 'within'))
findOverlaps(gt1, gt1, type = 'equal')
# GTuples and GRanges methods identical
identical(findOverlaps(gt1, gt1, type = 'equal'),
          findOverlaps(gr1, gr1, type = 'equal'))
# Can pass other arguments, such as select and ignore.strand
findOverlaps(gt1, gt1, type = 'equal', ignore.strand = TRUE, select = 'last')
Next, for 2-tuples where the GTuples methods use the GenomicRanges::GRanges methods:
# Construct example 2-tuples
gt2 <- GTuples(seqnames = c('chr1', 'chr1', 'chr1', 'chr1', 'chr2'),
               tuples = matrix(c(10L, 10L, 10L, 10L, 10L,
                                 20L, 20L, 20L, 25L, 20L), ncol = 2),
               strand = c('+', '-', '*', '+', '+'))
# GRanges version of gt2
gr2 <- as(gt2, "GRanges")
findOverlaps(gt2, gt2, type = 'any')
# GTuples and GRanges methods identical
identical(findOverlaps(gt2, gt2, type = 'any'),
          findOverlaps(gr2, gr2, type = 'any'))
findOverlaps(gt2, gt2, type = 'start')
# GTuples and GRanges methods identical
identical(findOverlaps(gt2, gt2, type = 'start'),
          findOverlaps(gr2, gr2, type = 'start'))
findOverlaps(gt2, gt2, type = 'end')
# GTuples and GRanges methods identical
identical(findOverlaps(gt2, gt2, type = 'end'),
          findOverlaps(gr2, gr2, type = 'end'))
findOverlaps(gt2, gt2, type = 'within')
# GTuples and GRanges methods identical
identical(findOverlaps(gt2, gt2, type = 'within'),
          findOverlaps(gr2, gr2, type = 'within'))
findOverlaps(gt2, gt2, type = 'equal')
# GTuples and GRanges methods identical
identical(findOverlaps(gt2, gt2, type = 'equal'),
          findOverlaps(gr2, gr2, type = 'equal'))
# Can pass other arguments, such as select and ignore.strand
findOverlaps(gt2, gt2, type = 'equal', ignore.strand = TRUE, select = 'last')
Finally, for $m$-tuples with $m > 2$ where GTuples methods use the
GenomicRanges::GRanges methods unless type = "equal":
# Construct example 3-tuples
gt3 <- GTuples(seqnames = c('chr1', 'chr1', 'chr1', 'chr1', 'chr2'),
               tuples = matrix(c(10L, 10L, 10L, 10L, 10L,
                                 20L, 20L, 20L, 25L, 20L,
                                 30L, 30L, 35L, 30L, 30L), ncol = 3),
               strand = c('+', '-', '*', '+', '+'))
# GRanges version of gt3
gr3 <- as(gt3, "GRanges")
findOverlaps(gt3, gt3, type = 'any')
# GTuples and GRanges methods identical
identical(findOverlaps(gt3, gt3, type = 'any'),
          findOverlaps(gr3, gr3, type = 'any')) # TRUE
findOverlaps(gt3, gt3, type = 'start')
# GTuples and GRanges methods identical
identical(findOverlaps(gt3, gt3, type = 'start'),
          findOverlaps(gr3, gr3, type = 'start')) # TRUE
findOverlaps(gt3, gt3, type = 'end')
# GTuples and GRanges methods identical
identical(findOverlaps(gt3, gt3, type = 'end'),
          findOverlaps(gr3, gr3, type = 'end')) # TRUE
findOverlaps(gt3, gt3, type = 'within')
# GTuples and GRanges methods identical
identical(findOverlaps(gt3, gt3, type = 'within'),
          findOverlaps(gr3, gr3, type = 'within')) # TRUE
findOverlaps(gt3, gt3, type = 'equal')
# GTuples and GRanges methods **not** identical because GRanges method ignores
# "internal positions".
identical(findOverlaps(gt3, gt3, type = 'equal'),
          findOverlaps(gr3, gr3, type = 'equal')) # FALSE
# Can pass other arguments, such as select and ignore.strand
findOverlaps(gt3, gt3, type = 'equal', ignore.strand = TRUE, select = 'last')
I have chosen a definition that matches my intuition of what constitutes a
comparison between genomic tuples. However, I am open to suggestions
on amending or extending this behaviour in future versions of
GenomicTuples.
The comparison of two genomic tuples, x and y, is done by
first comparing seqnames(x) to seqnames(y), then
strand(x) to strand(y) and finally tuples(x) to tuples(y).
Ordering of seqnames and strand is as implemented for GenomicRanges::GRanges.
Ordering of tuples is element-wise, i.e. $pos_{1}, \ldots, pos_{m}$ are
compared in turn. For example, chr1:+:{10, 20, 30} is considered less than
chr1:+:{10, 20, 40}. This defines what I will refer to as the "natural order"
of genomic tuples.
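For tuples that share a seqname and strand, the element-wise ordering amounts to sorting rows of a position matrix by pos1, then pos2, and so on. The base R sketch below illustrates only this positional part of the "natural order"; the matrix pos is made up, and the ordering of seqnames and strand is not modelled.

```r
# Four 3-tuples on the same seqname and strand, one per row
pos <- matrix(c(10L, 20L, 30L,
                10L, 20L, 40L,
                 5L, 20L, 30L,
                10L, 25L, 30L), ncol = 3, byrow = TRUE)

# Order by pos1, breaking ties by pos2, then by pos3
o <- order(pos[, 1], pos[, 2], pos[, 3])
o             # 3 1 2 4
pos[o, ]      # rows in ascending "natural order" of positions
```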
The above is implemented in the pcompare method for
GTuples, which performs "generalized range-wise comparison" of two
GTuples objects, x and y. That is, pcompare(x, y) returns an integer
vector where the $i^{th}$ element is a code describing how the $i^{th}$ element
in x is qualitatively positioned relative to the $i^{th}$ element in y.
A code that is < 0, = 0, or > 0, corresponds to x[i] < y[i],
x[i] == y[i], or x[i] > y[i], respectively.
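The sign convention can be sketched in base R for the positional part of the comparison. The function compare_positions() below is a hypothetical helper, not the pcompare implementation: it assumes both tuples have the same size, seqname and strand, and it returns only -1, 0 or 1, whereas the actual codes returned by pcompare may take other values of the same sign.

```r
# Hypothetical helper, not the GenomicTuples implementation: compare the
# positions of two tuples of the same size on the same seqname and strand.
compare_positions <- function(x, y) {
  stopifnot(length(x) == length(y))
  d <- x - y
  first_diff <- d[d != 0][1]        # NA if all positions are equal
  if (is.na(first_diff)) 0L else as.integer(sign(first_diff))
}

compare_positions(c(10L, 20L, 30L), c(10L, 20L, 40L))   # -1
compare_positions(c(10L, 20L, 30L), c(10L, 20L, 30L))   #  0
compare_positions(c(10L, 25L, 30L), c(10L, 20L, 30L))   #  1
```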
The 6 traditional binary comparison operators (==, !=, <=, >=, <, and >),
other comparison operators (match, order, sort, and rank) and
duplicate-based methods (duplicated and unique) all use this "natural
order".
It is easiest to understand the above definitions by studying a few examples, here using 3-tuples:
# Construct example 3-tuples
gt3 <- GTuples(seqnames = c('chr1', 'chr1', 'chr1', 'chr1', 'chr2', 'chr1',
                            'chr1'),
               tuples = matrix(c(10L, 10L, 10L, 10L, 10L, 5L, 10L,
                                 20L, 20L, 20L, 25L, 20L, 20L, 20L,
                                 30L, 30L, 35L, 30L, 30L, 30L, 35L), ncol = 3),
               strand = c('+', '-', '*', '+', '+', '+', '+'))
gt3
# pcompare each tuple to itself
pcompare(gt3, gt3)
gt3 < gt3
gt3 > gt3
gt3 == gt3
# pcompare the third tuple to all tuples
pcompare(gt3[3], gt3)
gt3[3] < gt3
gt3[3] > gt3
gt3[3] == gt3
## Some comparisons where tuples differ only in one coordinate
# Ordering of seqnames
# 'chr1' < 'chr2' for tuples with otherwise identical coordinates
gt3[1] < gt3[5] # TRUE
# Ordering of strands
# '+' < '-' < '*' for tuples with otherwise identical coordinates
gt3[1] < gt3[2] # TRUE
gt3[1] < unstrand(gt3[2]) # TRUE
gt3[2] < unstrand(gt3[2]) # TRUE
# Ordering of tuples
# Tuples checked sequentially from pos1, ..., posm for tuples with otherwise
# identical coordinates
gt3[6] < gt3[1] # TRUE due to pos1
gt3[1] < gt3[4] # TRUE due to pos2
gt3[1] < gt3[7] # TRUE due to pos3
# Sorting of tuples
# Sorted first by seqnames, then by strand, then by tuples
sort(gt3)
# Duplicate tuples
# Duplicate tuples must have identical seqnames, strand and positions (tuples)
duplicated(c(gt3, gt3[1:3]))
unique(c(gt3, gt3[1:3]))
I am very grateful to all the Bioconductor developers but particularly wish
to thank the developers of GenomicRanges
(Lawrence, M. et al. Software for computing and annotating genomic ranges. PLoS Comput. Biol. 9, e1003118 (2013)), which
GenomicTuples uses heavily and is based upon. A special thanks to Hervé Pagès for his assistance and fixes when making upstream changes to GenomicRanges.
Here is the output of sessionInfo on the system on which
this document was compiled:
sessionInfo()