matter-options: Options for "matter" Objects

matter-options    R Documentation

Options for “matter” Objects

Description

Set global parameters for matter.

Usage

## Set defaults for common arguments
matter_defaults(nchunks = 20L, chunksize = NA_real_,
    serialize = NA, verbose = FALSE)

Arguments

nchunks

The number of chunks to use for chunk processing (e.g., in chunkApply). This sets getOption("matter.default.nchunks"). For IO-bound operations, using fewer chunks will often be faster, but use more memory.

chunksize

The approximate chunk size in bytes for chunk processing (e.g., in chunkApply). This sets getOption("matter.default.chunksize"). For IO-bound operations, using larger chunks will often be faster, but use more memory. If set to NA_real_, then the chunk size is determined by the number of chunks.

serialize

Whether data in virtual memory should be realized on the manager and serialized to the workers (TRUE), passed to the workers in virtual memory as-is (FALSE), or whether matter should decide the behavior based on the cluster configuration (NA). This sets getOption("matter.default.serialize"). If all workers have access to the same virtual memory resources (whether file storage or shared memory), then it can be significantly faster to avoid serializing the data.

verbose

Whether progress messages should be printed. This sets getOption("matter.default.verbose").
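
As a minimal sketch of how these arguments map to global options, the call below sets two of them and reads them back with getOption(). It assumes the matter package is installed and is skipped otherwise; the specific values (10 chunks, verbose output) are arbitrary choices for illustration.

```r
# Sketch only: runs when the matter package is available, otherwise does nothing
if (requireNamespace("matter", quietly = TRUE)) {
    # Use fewer (and therefore larger) chunks, and print progress messages
    matter::matter_defaults(nchunks = 10L, verbose = TRUE)

    # Per the argument descriptions, the values are stored as global options
    print(getOption("matter.default.nchunks"))
    print(getOption("matter.default.verbose"))
}
```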

Details

The matter package provides the following options:

  • options(matter.compress.atoms=3): The compression ratio threshold to be used to determine when to compress atoms in a matter object. Setting to 0 or FALSE means that atoms are never compressed.

  • options(matter.default.nchunks=20L): The default number of chunks to use when iterating over matter objects. For IO-bound operations, using fewer chunks will often be faster, but use more memory.

  • options(matter.default.chunksize=NA_real_): The default chunk size in bytes to use when iterating over matter objects. For IO-bound operations, using larger chunks will often be faster, but use more memory. If set to NA_real_, then the chunk size is determined by the number of chunks.

  • options(matter.default.serialize=NA): Whether virtual memory chunks should be realized on the manager and serialized to the workers (TRUE), passed to the workers as-is (FALSE), or whether matter should decide based on the cluster configuration (NA). If all workers have access to the same virtual memory resources (whether file storage or shared memory), then it can be significantly faster to avoid serializing the data.

  • options(matter.default.verbose=FALSE): The default verbosity for printing progress messages.

  • options(matter.matmul.bpparam=NULL): An optional BiocParallelParam passed to bplapply when performing matrix multiplication with matter_mat and sparse_mat objects.

  • options(matter.show.head=TRUE): Should a preview of the beginning of the data be displayed when the object is printed?

  • options(matter.show.head.n=6): The number of elements, rows, and/or columns to be displayed by the object preview.

  • options(matter.coerce.altrep=FALSE): When coercing matter objects to native R objects (such as matrix), should a matter-backed ALTREP object be returned instead? The initial coercion will be cheap, and the result will look like a native R object. This does not guarantee that the full data is never read into memory. Not all functions are ALTREP-aware at the C-level, so some operations may still trigger the full data to be read into memory. However, this should only ever happen once, as long as the object is not duplicated.

  • options(matter.wrap.altrep=FALSE): When coercing to a matter-backed ALTREP object, should the object be wrapped in an ALTREP wrapper? (This is always done in cases where the coercion preserves existing attributes.) This allows setting of attributes without triggering a (potentially expensive) duplication of the object when safe to do so.

  • options(matter.temp.dir=tempdir()): Temporary directory where anonymous matter object files (i.e., those created with path=NULL) should be created.

  • options(matter.temp.gc=TRUE): If TRUE, then anonymous matter object files (i.e., those created with path=NULL) are automatically cleaned up when all R objects referencing them have been garbage collected. If FALSE, then they are only removed at the end of the R session (and only if they are in R's default temporary directory).
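
The options listed above can also be set directly with base R's options(). The following is a minimal sketch using only base R, so it runs whether or not matter is loaded; the chosen values and the variable names old and n_chunks are illustrative. Saving the return value of options() makes it possible to restore the previous settings afterwards.

```r
# options() invisibly returns the previous values of the options it changes
old <- options(
    matter.default.nchunks = 10L,   # fewer, larger chunks
    matter.default.verbose = TRUE,  # print progress messages
    matter.show.head.n = 10         # show more elements in object previews
)

# Read a setting back
n_chunks <- getOption("matter.default.nchunks")
print(n_chunks)

# Restore the previous settings when done
options(old)
```

Restoring with options(old) keeps the change local to one part of a script, which is useful when only a single computation needs different chunking.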


kuwisdelu/matter documentation built on Oct. 19, 2024, 10:31 a.m.