library(BiocStyle)
knitr::opts_chunk$set(error=FALSE, warning=FALSE, message=FALSE)
The chihaya package saves DelayedArray objects for efficient, portable and stable reproduction of delayed operations in a new R session or other programming frameworks. By converting DelayedArray operations into our specification, we provide a layer of protection against changes in the S4 class structure that would invalidate serialized RDS objects. Check out the specification for more details.
Make a DelayedArray object with some operations:
library(DelayedArray)
x <- DelayedArray(matrix(runif(1000), ncol=10))
x <- x[11:15,] / runif(5)
x <- log2(x + 1)
x
showtree(x)
Save it into an HDF5 file with saveDelayed():
library(chihaya)
tmp <- tempfile(fileext=".h5")
saveDelayed(x, tmp)
rhdf5::h5ls(tmp)
And then load it back in later:
y <- loadDelayed(tmp)
y
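To confirm that the round trip preserves the delayed operations, we can realize both the original and the reloaded object in memory and compare their values. This is only a minimal sketch: it rebuilds the same small example so that it runs on its own, and the tolerance-based comparison with all.equal() is an assumption on our part rather than something prescribed by the chihaya documentation.

```r
library(DelayedArray)
library(chihaya)

# Rebuild the small delayed object used in the example above.
x <- DelayedArray(matrix(runif(1000), ncol=10))
x <- x[11:15,] / runif(5)
x <- log2(x + 1)

# Save the delayed operations and load them back in.
tmp <- tempfile(fileext=".h5")
saveDelayed(x, tmp)
y <- loadDelayed(tmp)

# Compare dimensions, then realize both objects as ordinary
# matrices and compare their values within numerical tolerance.
same_dim <- all(dim(x) == dim(y))
same_values <- isTRUE(all.equal(as.matrix(x), as.matrix(y)))
same_dim
same_values
```

Realizing with as.matrix() is only sensible for small arrays like this one; for larger objects, comparing a subset of rows or columns would avoid loading everything into memory.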
Of course, this is not a particularly interesting case as we end up saving the original array inside our HDF5 file anyway. The real fun begins when you have some more interesting seeds.
We can use the delayed nature of the operations to avoid breaking sparsity. For example:
library(Matrix)
x <- rsparsematrix(1000, 1000, density=0.01)
x <- DelayedArray(x) + runif(1000)
tmp <- tempfile(fileext=".h5")
saveDelayed(x, tmp)
rhdf5::h5ls(tmp)
file.info(tmp)[["size"]]
# Compared to a dense array.
tmp2 <- tempfile(fileext=".h5")
out <- HDF5Array::writeHDF5Array(x, tmp2, "data")
file.info(tmp2)[["size"]]
# Loading it back in.
y <- loadDelayed(tmp)
showtree(y)
We can also store references to external files, thus avoiding data duplication:
library(HDF5Array)
test <- HDF5Array(tmp2, "data")
stuff <- log2(test + 1)
stuff
tmp <- tempfile(fileext=".h5")
saveDelayed(stuff, tmp)
rhdf5::h5ls(tmp)
file.info(tmp)[["size"]] # size of the delayed operations + pointer to the actual file
y <- loadDelayed(tmp)
y
sessionInfo()