View source: R/seqArchR_auxiliary_functionsII.R
set_config | R Documentation |
This function sets the configuration for 'seqArchR'.
set_config(
  chunk_size = 500,
  k_min = 1,
  k_max = 50,
  mod_sel_type = "stability",
  bound = 10^-6,
  cv_folds = 5,
  parallelize = FALSE,
  n_cores = NA,
  n_runs = 100,
  alpha_base = 0,
  alpha_pow = 1,
  min_size = 25,
  result_aggl = "complete",
  result_dist = "euclid",
  checkpointing = TRUE,
  flags = list(debug = FALSE, time = FALSE, verbose = TRUE, plot = FALSE)
)
chunk_size |
Numeric. Specify the size of the inner chunks of sequences. Default is 500. |
k_min |
Numeric. Specify the minimum of the range of values to be tested for number of NMF basis vectors. Default is 1. |
k_max |
Numeric. Specify the maximum of the range of values to be tested for number of NMF basis vectors. Default is 50. |
mod_sel_type |
Character. Specify the model selection strategy to be used. Default is 'stability'. Another option is 'cv', short for cross-validation. Warning: The cross-validation approach can be more time-consuming and computationally expensive than the stability-based approach. |
bound |
Numeric. Specify the lower bound value used as the criterion for choosing the most appropriate number of NMF factors. Default is 10^-6 (1e-06). |
cv_folds |
Numeric. Specify the number of cross-validation folds used for model selection. Only used when mod_sel_type is set to 'cv'. Default value is 5. |
parallelize |
Logical. Specify whether to parallelize the procedure. Note that running seqArchR serially can be time consuming, especially when using cross-validation for model selection. See 'n_cores'. Consider parallelizing with at least 2 or 4 cores. |
n_cores |
The number of cores to be used when 'parallelize' is set to TRUE. If 'parallelize' is FALSE, 'n_cores' is ignored. |
n_runs |
Numeric. Specify the number of bootstrapped runs to be performed with NMF. Default value is 100. When using cross-validation, more than 100 iterations may be needed (up to 500). |
alpha_base, alpha_pow |
Specify the base and the power for computing 'alpha' when performing model selection for NMF: alpha = alpha_base^alpha_pow. Alpha specifies the regularization for NMF. Defaults are 0 and 1, respectively. _Warning_: Currently not used (reserved for future use). |
min_size |
Numeric. Specify the minimum number of sequences, such that any cluster/chunk of size less than or equal to it will not be further processed. Default is 25. |
result_aggl |
Character. Specify the agglomeration method to be used
for final result collation with hierarchical clustering. Default is
'complete' linkage. Possible values are those allowed with
|
result_dist |
Character. Specify the distance method to be used for
final result collation with hierarchical clustering. Default is 'euclid' for
Euclidean distance. Possible values are those allowed with
|
checkpointing |
Logical. Specify whether to write intermediate
checkpoints to disk as RDS files. Checkpoints and the final result are
saved to disk provided the 'o_dir' argument is set in |
flags |
List with four logical elements: 'debug', 'time', 'verbose', and 'plot'.
|
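As an illustration of the 'parallelize' and 'n_cores' arguments described above, the following minimal sketch picks a core count with the base R 'parallel' package. The cap of 4 cores and the choice to leave one core free are assumptions made for this example, not seqArchR recommendations beyond the "2 or 4 cores" hint.

```r
# detectCores() is part of base R's 'parallel' package; it may return NA
available <- parallel::detectCores()

# Assumption for this sketch: use at most 4 cores and leave one core free.
# Fall back to a single core when the core count cannot be determined.
n_cores <- if (is.na(available)) 1L else max(1L, min(4L, available - 1L))

# Only request parallel execution when more than one core is usable
parallelize <- n_cores > 1L

cat("parallelize =", parallelize, "; n_cores =", n_cores, "\n")
```

The resulting values could then be passed on as `set_config(parallelize = parallelize, n_cores = n_cores)`.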
Setting suitable values for the following parameters is dependent on the data: 'chunk_size', 'k_min', 'k_max', 'mod_sel_type', 'min_size', 'result_aggl', 'result_dist'.
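Because these parameters depend on the data, it can help to sanity-check candidate values against the number of input sequences before building a configuration. The helper below is a hypothetical sketch (`check_config_args` is not part of seqArchR), and the checks are illustrative consequences of the argument descriptions rather than rules enforced by the package.

```r
# Hypothetical helper, NOT part of seqArchR: returns a character vector of
# notes about candidate values (empty if nothing stands out).
check_config_args <- function(n_seqs, chunk_size, k_min, k_max, min_size) {
  stopifnot(is.numeric(n_seqs), n_seqs > 0, k_min >= 1, k_min <= k_max)
  notes <- character(0)
  if (chunk_size > n_seqs) {
    # All sequences would then fall into a single chunk
    notes <- c(notes, "chunk_size exceeds the number of sequences")
  }
  if (min_size >= chunk_size) {
    # Chunks of size <= min_size are not processed further
    notes <- c(notes, "min_size is not smaller than chunk_size")
  }
  notes
}

# Example: 2000 sequences with the default values from the usage above
check_config_args(n_seqs = 2000, chunk_size = 500,
                  k_min = 1, k_max = 50, min_size = 25)
```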
A list with all parameters for seqArchR set.
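A quick way to see what the returned list contains is to build a configuration with all defaults and inspect it. This sketch assumes the seqArchR package is installed; the names of the list elements are printed rather than assumed, since they may differ from the argument names.

```r
# Guarded so the snippet is a no-op when seqArchR is not installed
if (requireNamespace("seqArchR", quietly = TRUE)) {
  # All arguments have defaults, so a bare call returns the default config
  cfg <- seqArchR::set_config()
  # The documented return value is a list
  stopifnot(is.list(cfg))
  # Inspect the element names instead of relying on assumed ones
  print(names(cfg))
}
```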
# Set seqArchR configuration
seqArchRconfig <- seqArchR::set_config(
    chunk_size = 100,
    parallelize = TRUE,
    n_cores = 2,
    n_runs = 100,
    k_min = 1,
    k_max = 20,
    mod_sel_type = "stability",
    bound = 10^-8,
    flags = list(debug = FALSE, time = TRUE, verbose = TRUE,
        plot = FALSE)
)
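For comparison, a configuration using cross-validation for model selection might look like the sketch below. It assumes seqArchR is installed and only combines arguments documented on this page; the specific values (500 runs, 2 cores) are illustrative choices based on the argument descriptions, not tuned recommendations.

```r
# Guarded so the snippet is a no-op when seqArchR is not installed
if (requireNamespace("seqArchR", quietly = TRUE)) {
  cv_config <- seqArchR::set_config(
    chunk_size = 100,
    k_min = 1,
    k_max = 20,
    mod_sel_type = "cv",   # cross-validation instead of "stability"
    cv_folds = 5,          # only used when mod_sel_type is "cv"
    n_runs = 500,          # cross-validation may need more than 100 runs
    parallelize = TRUE,    # serial runs with cross-validation can be slow
    n_cores = 2,
    flags = list(debug = FALSE, time = TRUE, verbose = TRUE, plot = FALSE)
  )
}
```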