TileDBRealizationSink: Write arrays to TileDB

Description

Write array data to a TileDB backend via DelayedArray's RealizationSink machinery.

Writing a TileDBArray

TileDBRealizationSink(
    dim, 
    dimnames=NULL, 
    type="double", 
    path=getTileDBPath(), 
    attr=getTileDBAttr(), 
    storagetype=NULL,
    dimtype=getTileDBDimType(),
    sparse=FALSE,
    extent=getTileDBExtent(), 
    offset=1L,
    cellorder=getTileDBCellOrder(),
    tileorder=getTileDBTileOrder(),
    capacity=getTileDBCapacity(),
    context=getTileDBContext()
)

returns a TileDBRealizationSink object that can be used to write content to a TileDB backend. It accepts the following arguments:

  • dim, an integer vector (usually of length 2) specifying the array dimensions.

  • dimnames, a list with one element per dimension (i.e., of the same length as dim), where each element is a character vector of names for that dimension. Defaults to NULL, i.e., no dimnames.

  • type, a string specifying the R data type for the newly written array. Currently only "double", "integer" and "logical" arrays are supported.

  • path, a string containing the location of the new TileDB backend.

  • attr, a string specifying the name of the attribute to store.

  • storagetype, a string specifying the TileDB data type for the attribute, e.g., "UINT8", "FLOAT32". If NULL, this is automatically determined from type using r_to_tiledb_type.

  • dimtype, a string specifying the TileDB data type for the dimension.

  • sparse, a logical scalar indicating whether the array should be stored in sparse form.

  • extent, an integer scalar (or vector of the same length as dim) specifying the tile extent for each dimension. Larger values improve compression, but each read may then extract more data than was requested.

  • offset, an integer scalar (or vector of the same length as dim) specifying the starting offset for each dimension's domain.

  • cellorder, a string specifying the ordering of cells within each tile.

  • tileorder, a string specifying the ordering of tiles across the array.

  • capacity, an integer scalar specifying the size of each data tile in the sparse case.

  • context, the TileDB context. Defaults to the value of getTileDBContext(), which is the output of tiledb_ctx() unless configured otherwise.

writeTileDBArray(x, sparse=is_sparse(x), ...) writes the matrix-like object x to a TileDB backend, returning a TileDBArray object referring to that backend. Appropriate values for dim, dimnames and type are determined automatically from x itself. All other arguments described for TileDBRealizationSink can be passed into ... to configure the representation.
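
As a minimal sketch of passing extra arguments through ..., the snippet below writes a dense matrix with an explicit tile extent for each dimension. The variable names and the extent values (50 rows by 20 columns) are arbitrary choices for illustration, not recommended settings.

```r
library(TileDBArray)

# A small dense double-precision matrix (500 rows, 200 columns).
X <- matrix(rnorm(100000), ncol=200)

# 'extent' is forwarded to TileDBRealizationSink() via '...'.
# The tile sizes here are illustrative only.
path <- tempfile()
out <- writeTileDBArray(X, path=path, extent=c(50L, 20L))

# The result is a TileDBArray with the same dimensions as X.
dim(out)
```

dim, dimnames and type are not supplied here because they are taken from X.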

Coercing to a TileDBArray

as(x, "TileDBArray") will coerce a matrix-like object x to a TileDBArray object.

as(x, "TileDBArraySeed") will coerce a matrix-like object x to a TileDBArraySeed object.

as(x, "TileDBMatrix") will coerce a matrix-like object x to a TileDBMatrix object.

as(x, "TileDBArray") will coerce a TileDBRealizationSink x to a TileDBArray object.

as(x, "TileDBArraySeed") will coerce a TileDBRealizationSink x to a TileDBArraySeed object.

as(x, "DelayedArray") will coerce a TileDBRealizationSink x to a TileDBArray object.
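
A short sketch of the coercion route for a matrix-like object follows. It assumes the default backend location (as given by getTileDBPath()) is acceptable; use writeTileDBArray() instead when the path needs to be chosen explicitly. The variable names are illustrative.

```r
library(TileDBArray)

# An ordinary in-memory matrix.
X <- matrix(rnorm(1000), ncol=20)

# Coercion writes X to a TileDB backend and returns an object referring to it.
out <- as(X, "TileDBArray")

is(out, "TileDBArray")
dim(out)
```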

Sink internals

write_block(sink, viewport, block) will write the subarray block to the TileDBRealizationSink sink at the specified viewport, returning sink upon completion. See write_block in DelayedArray for more details.

type(x) will return a string specifying the type of the TileDBRealizationSink x.
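
The sink can also be filled block by block, which is roughly what happens when DelayedArray realizes an array into it. The sketch below assumes that RegularArrayGrid() and read_block() are available once DelayedArray is attached (they are not part of this page), and the block size of 25 rows is an arbitrary choice.

```r
library(TileDBArray)
library(DelayedArray)

# Source data: 100 rows, 20 columns.
X <- matrix(rnorm(2000), ncol=20)

# Create an empty dense sink with matching dimensions and type.
path <- tempfile()
sink <- TileDBRealizationSink(dim(X), type="double", path=path)

# Split the array into row-wise blocks of 25 rows each (illustrative size).
grid <- RegularArrayGrid(dim(X), spacings=c(25L, 20L))

# Read each block from X and write it to the same viewport of the sink.
for (i in seq_along(grid)) {
    viewport <- grid[[i]]
    block <- read_block(X, viewport)
    sink <- write_block(sink, viewport, block)
}

# Coerce the completed sink to a TileDBArray for reading.
out <- as(sink, "TileDBArray")
type(sink)
```

Because write_block returns the sink, reassigning it on each iteration keeps the loop compatible with sinks that update their own state.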

Examples

# Works for double-precision matrices.
X <- matrix(rnorm(100000), ncol=200)
path <- tempfile()
out <- writeTileDBArray(X, path=path)

# Works for integer matrices.
Xi <- matrix(rpois(100000, 2), ncol=200)
pathi <- tempfile()
outi <- writeTileDBArray(Xi, path=pathi)

# Works for logical matrices.
Xl <- matrix(rpois(100000, 0.5) > 0, ncol=200)
pathl <- tempfile()
outl <- writeTileDBArray(Xl, path=pathl)

# Works for sparse numeric matrices.
Y <- Matrix::rsparsematrix(1000, 1000, density=0.01)
path2 <- tempfile()
out2 <- writeTileDBArray(Y, path=path2)

# And for sparse logical matrices.
path2l <- tempfile()
out2l <- writeTileDBArray(Y > 0, path=path2l)

# Works for dimnames.
rownames(X) <- sprintf("GENE_%i", seq_len(nrow(X)))
path3 <- tempfile()
out3 <- writeTileDBArray(X, path=path3)


LTLA/TileDBArray documentation built on Oct. 9, 2024, 7:52 a.m.