library(MultiAssayExperiment)
library(HDF5Array)
library(SummarizedExperiment)
The HDF5Array package provides an on-disk representation of large datasets without the need to load them into memory. Convenient lazy evaluation operations allow the user to manipulate such large data files based on metadata. The DelayedMatrix class in the DelayedArray package provides a way to connect to a large matrix that is stored on disk.
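As a minimal sketch of what lazy evaluation means here, the example below subsets a DelayedMatrix by its dimnames and only realizes the result at the end. The toy matrix and the object names (`toyMatrix`, `delayed`, `subsetView`) are illustrative assumptions, not part of the original vignette.

```r
library(DelayedArray)

# Toy 5 x 4 integer matrix with gene and sample names (illustrative only)
toyMatrix <- matrix(
    seq_len(20), ncol = 4,
    dimnames = list(paste0("GENE", 1:5), paste0("SampleID", 1:4))
)
delayed <- DelayedArray(toyMatrix)

# Subsetting by name is recorded as a delayed operation;
# drop = FALSE keeps the result two-dimensional
subsetView <- delayed[c("GENE2", "GENE4"), "SampleID3", drop = FALSE]
is(subsetView, "DelayedMatrix")

# The values are only materialized in memory on request
realized <- as.matrix(subsetView)
realized
```

The same subsetting syntax applies when the DelayedMatrix is backed by an HDF5 file, in which case only the requested block should need to be read from disk.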
First, we create a small matrix from which to construct a DelayedMatrix object.
smallMatrix <- matrix(rnorm(10e5), ncol = 20)
We add row names and column names to the matrix object for compatibility with the MultiAssayExperiment representation.
rownames(smallMatrix) <- paste0("GENE", seq_len(nrow(smallMatrix)))
colnames(smallMatrix) <- paste0("SampleID", seq_len(ncol(smallMatrix)))
Here we use the DelayedArray constructor function to create a DelayedMatrix object.
smallMatrix <- DelayedArray(smallMatrix)
class(smallMatrix)
# show method
smallMatrix
dim(smallMatrix)
Finally, the rhdf5 package stores dimnames in a standard location. To make use of this functionality, we call writeHDF5Array with the with.dimnames argument set to TRUE:
testh5 <- tempfile(fileext = ".h5")
writeHDF5Array(smallMatrix, filepath = testh5, name = "smallMatrix", with.dimnames = TRUE)
To see the file structure, we use h5ls:
h5ls(testh5)
Note that a large matrix from an HDF5 file can also be loaded using the HDF5ArraySeed and DelayedArray functions.
hdf5Data <- HDF5ArraySeed(file = testh5, name = "smallMatrix")
newDelayedMatrix <- DelayedArray(hdf5Data)
class(newDelayedMatrix)
newDelayedMatrix
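The write and reload steps can also be checked end to end. The sketch below writes a small named matrix to a temporary HDF5 file, reloads it with the HDF5Array() convenience constructor, and compares the values. The file, the dataset name "demo", and the object names are assumptions made for illustration; whether dimnames are restored automatically on reload may depend on the installed HDF5Array version, so the comparison here ignores them.

```r
library(HDF5Array)

# Small named matrix (illustrative only)
demoMatrix <- matrix(
    as.numeric(1:12), ncol = 3,
    dimnames = list(paste0("GENE", 1:4), paste0("SampleID", 1:3))
)

# Write to a temporary HDF5 file, storing the dimnames alongside the data
demoFile <- tempfile(fileext = ".h5")
writeHDF5Array(
    DelayedArray(demoMatrix),
    filepath = demoFile, name = "demo", with.dimnames = TRUE
)

# Reload as an on-disk matrix; the data stay in the file until realized
reloaded <- HDF5Array(demoFile, name = "demo")
dim(reloaded)

# Compare values after realizing the on-disk matrix in memory
all.equal(unname(as.matrix(reloaded)), unname(demoMatrix))
```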
DelayedMatrix with MultiAssayExperiment
A DelayedMatrix alone conforms to the MultiAssayExperiment API requirements. As shown below, the DelayedMatrix can be put into a named list and passed to the MultiAssayExperiment constructor function.
HDF5MAE <- MultiAssayExperiment(experiments = list(smallMatrix = smallMatrix))
sampleMap(HDF5MAE)
colData(HDF5MAE)
SummarizedExperiment with DelayedMatrix backend
A more information-rich object can be created when a DelayedMatrix is used in conjunction with the SummarizedExperiment class, and it can even include rowRanges.
The flexibility of the MultiAssayExperiment API supports classes with minimal requirements. Additionally, this SummarizedExperiment with the DelayedMatrix backend can be part of a bigger MultiAssayExperiment object.
Below is a minimal example of how this would work:
HDF5SE <- SummarizedExperiment(assays = smallMatrix)
assay(HDF5SE)
MultiAssayExperiment(list(HDF5SE = HDF5SE))
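Since rowRanges is mentioned above but not demonstrated, here is a hedged sketch of a SummarizedExperiment that combines a DelayedMatrix assay with genomic ranges. The coordinates, the assay name `counts`, and the object names are invented for illustration and do not come from the vignette.

```r
library(SummarizedExperiment)
library(DelayedArray)

# Toy count matrix wrapped as a DelayedMatrix (illustrative values)
toyCounts <- DelayedArray(matrix(
    rpois(12, lambda = 10), ncol = 3,
    dimnames = list(paste0("GENE", 1:4), paste0("SampleID", 1:3))
))

# Hypothetical genomic coordinates, one range per row of the assay
toyRanges <- GRanges(
    seqnames = "chr1",
    ranges = IRanges(start = c(1, 101, 201, 301), width = 50)
)
names(toyRanges) <- rownames(toyCounts)

# Supplying rowRanges yields a RangedSummarizedExperiment
rangedSE <- SummarizedExperiment(
    assays = list(counts = toyCounts),
    rowRanges = toyRanges
)
class(rangedSE)

# The assay is still a DelayedMatrix
is(assay(rangedSE), "DelayedMatrix")
```

An object built this way could then be placed in the named list given to the MultiAssayExperiment constructor, in the same manner as HDF5SE.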
Additional scenarios are currently in development where an HDF5Matrix is hosted remotely. Many opportunities exist when considering on-disk and off-disk representations of data with MultiAssayExperiment.
sessionInfo()