These functions support the import and export of the UCSC 2bit
compressed sequence format. The main advantage is speed of subsequence
retrieval, as only the sequence in the requested intervals is loaded.
Compared to the FA format supported by Rsamtools, 2bit
offers the additional feature of masking and also has better support
in Java (and thus most genome browsers). The supporting TwoBitFile
class is a reference to a TwoBit file.
## S4 method for signature 'TwoBitFile,ANY,ANY'
import(con, format, text,
which = as(seqinfo(con), "GenomicRanges"), ...)
## S4 method for signature 'TwoBitFile'
getSeq(x, which = as(seqinfo(x), "GenomicRanges"))
import.2bit(con, ...)
## S4 method for signature 'ANY,TwoBitFile,ANY'
export(object, con, format, ...)
## S4 method for signature 'DNAStringSet,TwoBitFile,ANY'
export(object, con, format)
## S4 method for signature 'DNAStringSet,character,ANY'
export(object, con, format, ...)
export.2bit(object, con, ...)
con |
A path, URL or |
object,x |
The object to export, either a |
format |
If not missing, should be “twoBit” or “2bit” (case insensitive). |
text |
Not supported. |
which |
A range data structure coercible to |
... |
Arguments to pass down to other methods. For
import, the flow eventually reaches the |
For import, a DNAStringSet.
TwoBitFile objects
A TwoBitFile object, an extension of RTLFile, is a reference to a
TwoBit file. To cast a path, URL or connection to a TwoBitFile, pass
it to the TwoBitFile constructor.
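As a minimal sketch of the constructor in use, the following wraps the test file shipped with rtracklayer (the same file used in the Examples) in a TwoBitFile reference and imports from it. The variable names `tbf` and `seqs` are illustrative only.

```r
library(rtracklayer)

# Path to the small 2bit file shipped with rtracklayer
test_2bit <- system.file("tests", "test.2bit", package = "rtracklayer")

# Wrap the path in a TwoBitFile reference
tbf <- TwoBitFile(test_2bit)

# path() reports the location the reference points to
path(tbf)

# Methods dispatch on the class, so 'format' does not need to be supplied
seqs <- import(tbf)
seqs
```

Passing the TwoBitFile object rather than a bare path makes the format explicit, which may be useful when the file name does not end in ".2bit".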
A TwoBit file embeds the sequence information, which can be retrieved with the following:
seqinfo(x): Gets the Seqinfo object indicating the lengths of the
sequences for the intervals in the file. No circularity or genome
information is available.
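One possible use of the embedded sequence information is to build the `which` argument without reading any sequence first. The sketch below, again assuming the test file shipped with rtracklayer, requests the first 20 bases of every sequence; the cutoff of 20 and the names `si`, `first20` and `head_seqs` are arbitrary choices for illustration.

```r
library(rtracklayer)

tbf <- TwoBitFile(system.file("tests", "test.2bit", package = "rtracklayer"))

# Sequence names and lengths, read from the file's index
si <- seqinfo(tbf)
seqlengths(si)

# One range per sequence: bases 1..20, capped at the sequence length
ends <- unname(pmin(20L, seqlengths(si)))
first20 <- GRanges(seqnames(si), IRanges(start = 1L, end = ends))

# Only the requested intervals are loaded
head_seqs <- import(tbf, which = first20)
head_seqs
```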
The 2bit format only supports A, C, G, T and N (via an internal
mask). To export sequences with additional IUPAC ambiguity codes,
first pass the object through replaceAmbiguities from the Biostrings
package.
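The note above could be applied roughly as follows. This is a sketch under assumptions: the input `dna` is a made-up toy sequence containing the ambiguity codes R and Y, and the output file name is arbitrary. `replaceAmbiguities` replaces every IUPAC ambiguity code with N by default.

```r
library(rtracklayer)
library(Biostrings)

# Hypothetical toy input with IUPAC ambiguity codes (R, Y); the name "seq1" is made up
dna <- DNAStringSet(c(seq1 = "ACGTRYACGTNN"))

# Replace all ambiguity codes with N so that only A, C, G, T and N remain
clean <- replaceAmbiguities(dna)
as.character(clean)

# Write the cleaned sequences to a temporary 2bit file and read them back
out <- file.path(tempdir(), "clean.2bit")
export(clean, out)
import(out)
```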
Michael Lawrence
export-methods in the BSgenome package for exporting a BSgenome object as a twoBit file.
library(rtracklayer)
test_path <- system.file("tests", package = "rtracklayer")
test_2bit <- file.path(test_path, "test.2bit")
test <- import(test_2bit)
test
test_2bit_file <- TwoBitFile(test_2bit)
import(test_2bit_file) # the whole file
which_range <- IRanges(c(10, 40), c(30, 42))
which <- GRanges(names(test), which_range)
import(test_2bit, which = which)
seqinfo(test_2bit_file)
## Not run:
test_2bit_out <- file.path(tempdir(), "test_out.2bit")
export(test, test_2bit_out)
## just a character vector
test_char <- as.character(test)
export(test_char, test_2bit_out)
## End(Not run)
A DNAStringSet instance of length 1
width seq names
[1] 100 TGATGGAAGAATTATTTGAAAGC...ATAGTCCAGAGACTACAACTTCA gi|157704452|ref|...
A DNAStringSet instance of length 1
width seq names
[1] 100 TGATGGAAGAATTATTTGAAAGC...ATAGTCCAGAGACTACAACTTCA gi|157704452|ref|...
A DNAStringSet instance of length 2
width seq
[1] 21 AATTATTTGAAAGCCATATAG
[2] 3 ACT
Seqinfo object with 1 sequence from an unspecified genome:
seqnames seqlengths isCircular genome
gi|157704452|ref|AC_000143.1| 100 NA <NA>