celdaGridSearch: Run Celda in parallel with multiple parameters

celdaGridSearch    R Documentation

Run Celda in parallel with multiple parameters

Description

Run Celda with different combinations of parameters and multiple chains in parallel. The variable availableModels lists the models that can be used. Parameters to be tested should be stored in a list and passed to the argument paramsTest. Fixed parameters to be used in all models, such as sampleLabel, can be passed as a list to the argument paramsFixed. When verbose = TRUE, output from each chain is written to a log file rather than displayed in stdout.

Usage

celdaGridSearch(
  x,
  useAssay = "counts",
  altExpName = "featureSubset",
  model,
  paramsTest,
  paramsFixed = NULL,
  maxIter = 200,
  nchains = 3,
  cores = 1,
  bestOnly = TRUE,
  seed = 12345,
  perplexity = TRUE,
  verbose = TRUE,
  logfilePrefix = "Celda"
)

## S4 method for signature 'SingleCellExperiment'
celdaGridSearch(
  x,
  useAssay = "counts",
  altExpName = "featureSubset",
  model,
  paramsTest,
  paramsFixed = NULL,
  maxIter = 200,
  nchains = 3,
  cores = 1,
  bestOnly = TRUE,
  seed = 12345,
  perplexity = TRUE,
  verbose = TRUE,
  logfilePrefix = "Celda"
)

## S4 method for signature 'matrix'
celdaGridSearch(
  x,
  useAssay = "counts",
  altExpName = "featureSubset",
  model,
  paramsTest,
  paramsFixed = NULL,
  maxIter = 200,
  nchains = 3,
  cores = 1,
  bestOnly = TRUE,
  seed = 12345,
  perplexity = TRUE,
  verbose = TRUE,
  logfilePrefix = "Celda"
)

Arguments

x

A numeric matrix of counts or a SingleCellExperiment with the matrix located in the assay slot under useAssay. Rows represent features and columns represent cells.

useAssay

A string specifying the name of the assay slot to use. Default "counts".

altExpName

The name for the altExp slot to use. Default "featureSubset".

model

Celda model. Options available in availableModels.

paramsTest

List. A list of parameter values to test in a celda model; every combination of the supplied values is run. For example, list(K = seq(5, 10), L = seq(15, 20)) will run all combinations of K from 5 to 10 and L from 15 to 20 in model celda_CG.
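
To get a feel for how large a grid a given paramsTest implies, the combinations can be enumerated in base R before starting a search. This is only an illustrative sketch: expand.grid is used here as a stand-in, and the names grid and totalRuns are hypothetical rather than part of celda. It assumes that each combination is run once per chain.

```r
# Illustration only: enumerate the K x L combinations implied by paramsTest.
# celdaGridSearch builds its own run table internally; this merely counts them.
paramsTest <- list(K = seq(5, 10), L = seq(15, 20))

# One row per combination of K and L
grid <- expand.grid(paramsTest)
print(nrow(grid))

# Assumption: every combination is run once per chain
nchains <- 3
totalRuns <- nrow(grid) * nchains
print(totalRuns)
```

Counting the runs up front can help when choosing cores and maxIter, since the number of chains to fit grows with the product of the lengths of the vectors in paramsTest.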

paramsFixed

List. A list denoting additional parameters to use in each celda model. Default NULL.

maxIter

Integer. Maximum number of iterations of sampling to perform. Default 200.

nchains

Integer. Number of random cluster initializations. Default 3.

cores

Integer. The number of cores to use for parallel estimation of chains. Default 1.

bestOnly

Logical. Whether to return only the chain with the highest log likelihood per combination of parameters or return all chains. Default TRUE.
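
The selection that bestOnly describes can be pictured with a small base R sketch. The table runs, its column names, and the log-likelihood values below are made up purely for illustration and do not reflect how celda stores its results; the sketch only shows the idea of keeping the highest log-likelihood chain per parameter combination.

```r
# Hypothetical run table: 3 chains for each of two values of K.
# The logLik values are invented for illustration.
runs <- expand.grid(chain = 1:3, K = 4:5)
runs$logLik <- c(-1200, -1150, -1180, -1100, -1120, -1090)

# Keep the chain with the highest log likelihood within each K
best <- do.call(
  rbind,
  lapply(split(runs, runs$K), function(d) d[which.max(d$logLik), ])
)
print(best)
```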

seed

Integer. Passed to with_seed. For reproducibility, a default value of 12345 is used. The seed values seq(seed, (seed + nchains - 1)) are assigned to the chains, one seed per chain. If NULL, no calls to with_seed are made.
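
As a quick check of the seed sequence described above, the expression can be evaluated directly in base R with the default values. The variable name chainSeeds is hypothetical and used only for this illustration.

```r
# Default values from the Usage section
seed <- 12345
nchains <- 3

# One consecutive seed per chain, starting at seed
chainSeeds <- seq(seed, (seed + nchains - 1))
print(chainSeeds)
```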

perplexity

Logical. Whether to calculate perplexity for each model. If FALSE, then perplexity can be calculated later with resamplePerplexity. Default TRUE.

verbose

Logical. Whether to print log messages during celda chain execution. Default TRUE.

logfilePrefix

Character. Prefix for log files from worker threads and main process. Default "Celda".

Value

A SingleCellExperiment object. Function parameter settings and celda model results are stored in the metadata "celda_grid_search" slot.

See Also

celda_G for feature clustering, celda_C for clustering of cells, and celda_CG for simultaneous clustering of features and cells. subsetCeldaList can subset the celdaList object. selectBestModel can get the best model for each combination of parameters.

Examples

## Not run: 
data(celdaCGSim)
## Run various combinations of parameters with 'celdaGridSearch'
celdaCGGridSearchRes <- celdaGridSearch(celdaCGSim$counts,
  model = "celda_CG",
  paramsTest = list(K = seq(4, 6), L = seq(9, 11)),
  paramsFixed = list(sampleLabel = celdaCGSim$sampleLabel),
  bestOnly = TRUE,
  nchains = 1,
  cores = 1)

## End(Not run)

campbio/celda documentation built on April 5, 2024, 11:47 a.m.