secom_linear | R Documentation |
Obtain the sparse correlation matrix for linear correlations
between taxa. The current version of the secom_linear
function supports any of three correlation coefficients: Pearson,
Spearman, and Kendall's τ.
secom_linear(
  data,
  taxa_are_rows = TRUE,
  assay.type = assay_name,
  assay_name = "counts",
  rank = tax_level,
  tax_level = NULL,
  aggregate_data = NULL,
  meta_data = NULL,
  pseudo = 0,
  prv_cut = 0.5,
  lib_cut = 1000,
  corr_cut = 0.5,
  wins_quant = c(0.05, 0.95),
  method = c("pearson", "spearman"),
  soft = FALSE,
  alpha_grid = 0,
  thresh_len = 100,
  n_cv = 10,
  thresh_hard = 0,
  max_p = 0.005,
  n_cl = 1,
  verbose = TRUE
)
data |
a |
taxa_are_rows |
logical. Whether taxa are positioned in the rows of the feature table. Default is TRUE. |
assay.type |
alias for |
assay_name |
character. Name of the feature table within the data object
(only applicable if the data object is a |
rank |
alias for |
tax_level |
character. The taxonomic level of interest. The input data
can be agglomerated at different taxonomic levels based on your research
interest. Default is NULL, i.e., do not perform agglomeration, and the
SECOM analysis will be performed at the lowest taxonomic level of the
input |
aggregate_data |
The abundance data that has been aggregated to the desired
taxonomic level. This parameter is required only when the input data is in
|
meta_data |
a |
pseudo |
numeric. Add pseudo-counts to the data. Default is 0 (no pseudo-counts). |
prv_cut |
a numerical fraction between 0 and 1. Taxa with prevalences
(the proportion of samples in which the taxon is present)
less than |
lib_cut |
a numerical threshold for filtering samples based on library
sizes. Samples with library sizes less than |
corr_cut |
numeric. To avoid false positives caused by taxa with small
variances, taxa with Pearson correlation coefficients greater than
|
wins_quant |
a numeric vector of probabilities with values between
0 and 1. Replace extreme values in the abundance data with less
extreme values. Default is |
method |
character. The correlation coefficient to compute: either "pearson" or "spearman". |
soft |
logical. |
alpha_grid |
a numeric vector of penalty parameters for the element-wise L1 norm to induce sparsity. Default is 0. |
thresh_len |
numeric. Grid-search is implemented to find the optimal
values over |
n_cv |
numeric. The number of folds in cross-validation. Default is 10 (10-fold cross-validation). |
thresh_hard |
numeric. Pairwise correlation coefficients
(in absolute value) that are less than or equal to |
max_p |
numeric. Obtain the sparse correlation matrix by
p-value filtering. Pairwise correlation coefficients with p-value greater
than |
n_cl |
numeric. The number of nodes to be forked. For details, see
|
verbose |
logical. Whether to display detailed progress messages. |
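As a rough illustration of what prv_cut, lib_cut, and wins_quant describe, the following base R sketch applies a prevalence filter, a library-size filter, and quantile winsorization to a toy count matrix. It is not the ANCOMBC implementation: the toy data, the lower lib_cut value, and the helper winsorize() are hypothetical, the filters are assumed to drop taxa and samples that fall below the cutoffs, and the order and details of these steps inside secom_linear may differ.

```r
# Toy feature table: 4 taxa (rows) x 6 samples (columns); values are made up.
counts <- matrix(
  c( 10,  0, 500, 300,  0, 900,
      0,  0,   0,  20,  0,   0,
    200, 50, 400, 100, 30, 800,
      5,  1, 300, 700,  2, 100),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("taxon", 1:4), paste0("s", 1:6))
)

prv_cut <- 0.5   # same default as in the usage above
lib_cut <- 100   # smaller than the default so the toy data is not emptied

# Prevalence: proportion of samples in which each taxon is present (nonzero).
prevalence <- rowMeans(counts > 0)
# Library size: total count per sample.
lib_size <- colSums(counts)

# Assumed behavior: drop taxa below prv_cut and samples below lib_cut.
filtered <- counts[prevalence >= prv_cut, lib_size >= lib_cut, drop = FALSE]

# Hypothetical helper: clamp values to the given lower and upper quantiles.
winsorize <- function(x, probs = c(0.05, 0.95)) {
  q <- quantile(x, probs = probs, na.rm = TRUE, names = FALSE)
  pmin(pmax(x, q[1]), q[2])
}

# Winsorize each taxon (row) separately; apply() returns samples x taxa,
# so transpose back to taxa x samples.
wins <- t(apply(filtered, 1, winsorize))
```

Winsorizing per taxon is a choice made for this sketch only; it keeps one taxon's extreme counts from changing the quantiles used for another.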
a list with components:
s_diff_hat, a numeric vector of estimated sample-specific biases.
y_hat, a matrix of bias-corrected abundances.
cv_error, a numeric vector of cross-validation error estimates, which are the Frobenius norm differences between correlation matrices using training set and validation set, respectively.
thresh_grid, a numeric vector of thresholds in the cross-validation.
thresh_opt, numeric. The optimal threshold through cross-validation.
mat_cooccur, a matrix of taxon-taxon co-occurrence pattern. The number in each cell represents the number of complete (nonzero) samples for the corresponding pair of taxa.
corr, the sample correlation matrix (using the measure specified in method) computed using the bias-corrected abundances y_hat.
corr_p, the p-value matrix corresponding to the sample correlation matrix corr.
corr_th, the sparse correlation matrix obtained by thresholding based on the method specified in soft.
corr_fl, the sparse correlation matrix obtained by p-value filtering based on the cutoff specified in max_p.
corr_reg, the correlation matrix obtained by winsorizing small eigenvalues.
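To make the relationship between corr, corr_p, corr_th, and corr_fl more concrete, here is a minimal base R sketch on simulated data. It is an illustration under stated assumptions, not the package's code: the simulated variables are hypothetical stand-ins for bias-corrected abundances, hard thresholding is assumed to set entries with absolute value at or below thresh_hard to zero, and p-value filtering is assumed to set entries with p-value above max_p to zero.

```r
set.seed(1)
n <- 50
x1 <- rnorm(n)
x2 <- x1 + rnorm(n, sd = 0.3)   # strongly correlated with x1
x3 <- rnorm(n)                  # independent noise
# Samples in rows, taxa in columns (toy stand-in for bias-corrected abundances).
y <- cbind(taxonA = x1, taxonB = x2, taxonC = x3)

# Sample correlation matrix.
corr <- cor(y, method = "pearson")

# Pairwise p-values from cor.test(); the diagonal is left at 0.
p <- ncol(y)
corr_p <- matrix(0, p, p, dimnames = dimnames(corr))
for (i in seq_len(p - 1)) {
  for (j in (i + 1):p) {
    pv <- cor.test(y[, i], y[, j], method = "pearson")$p.value
    corr_p[i, j] <- pv
    corr_p[j, i] <- pv
  }
}

thresh_hard <- 0.3
max_p <- 0.005

# Assumed hard thresholding: zero out weak correlations.
corr_th <- corr
corr_th[abs(corr_th) <= thresh_hard] <- 0

# Assumed p-value filtering: zero out correlations that are not significant.
corr_fl <- corr
corr_fl[corr_p > max_p] <- 0
```

The two sparse matrices need not agree: a correlation can exceed the hard threshold yet fail the p-value cutoff when the sample size is small, and the reverse can happen when it is large.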
Huang Lin
secom_dist
library(ANCOMBC)
if (requireNamespace("microbiome", quietly = TRUE)) {
  data(atlas1006, package = "microbiome")
  # subset to baseline
  pseq = phyloseq::subset_samples(atlas1006, time == 0)

  # run secom_linear function
  set.seed(123)
  res_linear = secom_linear(data = list(pseq), taxa_are_rows = TRUE,
                            tax_level = "Phylum",
                            aggregate_data = NULL, meta_data = NULL, pseudo = 0,
                            prv_cut = 0.5, lib_cut = 1000, corr_cut = 0.5,
                            wins_quant = c(0.05, 0.95), method = "pearson",
                            soft = FALSE, alpha_grid = 0,
                            thresh_len = 20, n_cv = 10,
                            thresh_hard = 0.3, max_p = 0.005, n_cl = 2)

  corr_th = res_linear$corr_th
  corr_fl = res_linear$corr_fl
} else {
  message("The 'microbiome' package is not installed. Please install it to use this example.")
}
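A sparse correlation matrix such as corr_th or corr_fl can be easier to read as a list of taxon pairs. The sketch below shows one way to do that in base R. The helper to_edge_list() is hypothetical and not part of ANCOMBC, and the small toy matrix stands in for a real result so the code runs without the packages above.

```r
# Hypothetical helper: list the nonzero upper-triangle entries of a
# symmetric matrix as one row per taxon pair.
to_edge_list <- function(mat) {
  idx <- which(upper.tri(mat) & mat != 0, arr.ind = TRUE)
  data.frame(
    taxon1 = rownames(mat)[idx[, "row"]],
    taxon2 = colnames(mat)[idx[, "col"]],
    corr = mat[idx],
    stringsAsFactors = FALSE
  )
}

# Toy sparse correlation matrix (made-up values).
toy <- matrix(
  c(1.0,  0.6,  0.0,
    0.6,  1.0, -0.4,
    0.0, -0.4,  1.0),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("A", "B", "C"), c("A", "B", "C"))
)

edges <- to_edge_list(toy)
print(edges)
```

Using only the upper triangle avoids listing each pair twice and skips the diagonal.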