Description
Determine taxa whose absolute abundances, per unit volume of the ecosystem (e.g. gut), differ significantly with changes in the covariate of interest (e.g. the group effect). The current version of the ancombc function implements Analysis of Compositions of Microbiomes with Bias Correction (ANCOM-BC) for cross-sectional data while allowing adjustment for covariates.
Arguments
phyloseq: a phyloseq-class object, which consists of a feature table (microbial observed abundance table), sample metadata, a taxonomy table (optional), and a phylogenetic tree (optional). The row names of the metadata must match the sample names of the feature table, and the row names of the taxonomy table must match the taxon (feature) names of the feature table.
formula: a character string expressing how the microbial absolute abundances for each taxon depend on the variables in the metadata.
p_adj_method: the method used to adjust p-values. Default is "holm". Options include "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none".
zero_cut: a numerical fraction between 0 and 1. Taxa with a proportion of zeroes greater than zero_cut are excluded from the analysis.
lib_cut: a numerical threshold for filtering samples based on library sizes. Samples with library sizes less than lib_cut are excluded from the analysis.
group: the name of the group variable in the metadata. Specifying group is required for detecting structural zeros and for performing the global test.
struc_zero: whether to detect structural zeros. Default is FALSE.
neg_lb: whether to classify a taxon as a structural zero in the corresponding study group using its asymptotic lower bound. Default is FALSE.
tol: the iteration convergence tolerance for the E-M algorithm. Default is 1e-05.
max_iter: the maximum number of iterations for the E-M algorithm. Default is 100.
conserve: whether to use a conservative variance estimate of the test statistic. It is recommended if the sample size is small and/or the number of differentially abundant taxa is believed to be large. Default is FALSE.
alpha: the level of significance. Default is 0.05.
global: whether to perform the global test. Default is FALSE.
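To make the zero_cut and lib_cut thresholds concrete, the following base-R sketch filters a small made-up count matrix. It assumes that taxa whose proportion of zeros exceeds zero_cut and samples whose library size falls below lib_cut are dropped; the toy data, the variable names, and the order of the two steps are illustrative assumptions, not the package's internal code.

```r
# Toy count matrix (hypothetical data): rows are taxa, columns are samples
counts <- matrix(c(10, 0, 5, 3,
                    0, 0, 0, 0,
                    0, 0, 0, 7,
                    2, 1, 0, 4),
                 nrow = 4, byrow = TRUE,
                 dimnames = list(paste0("taxon", 1:4), paste0("sample", 1:4)))

zero_cut <- 0.90  # maximum tolerated proportion of zeros per taxon
lib_cut <- 5      # minimum tolerated library size per sample

# Drop taxa whose proportion of zeros is greater than zero_cut
zero_prop <- rowMeans(counts == 0)
kept <- counts[zero_prop <= zero_cut, , drop = FALSE]

# Drop samples whose library size (column total) is less than lib_cut
lib_size <- colSums(kept)
filtered <- kept[, lib_size >= lib_cut, drop = FALSE]

print(filtered)
```

Here taxon2 (all zeros) and sample2 (library size 1) are removed, leaving a 3 x 3 table.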
Details
The definition of a structural zero can be found in ANCOM-II. Setting neg_lb = TRUE indicates that both criteria stated in section 3.2 of ANCOM-II are used to detect structural zeros; otherwise, the algorithm uses only equation 1 in section 3.2 to declare structural zeros. Generally, it is recommended to set neg_lb = TRUE when the sample size per group is relatively large (e.g. > 30).
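The effect of neg_lb can be illustrated with a rough base-R sketch. The formulas here are an assumption based on one reading of the ANCOM-II criteria and are not stated on this page: a taxon is flagged in a group when its observed presence proportion is exactly zero, and, when neg_lb = TRUE, also when a normal-approximation 95% lower bound on that proportion is at or below zero. The helper struct_zero and the toy data are hypothetical.

```r
# Hypothetical data: 2 taxa, 10 samples in each of groups A and B
group <- rep(c("A", "B"), each = 10)
counts <- rbind(
  taxon1 = c(rep(0, 10), rep(5, 10)),        # never observed in group A
  taxon2 = c(3, rep(0, 9), rep(4, 8), 0, 0)  # observed once in group A
)

# Hypothetical helper: flag structural zeros per taxon and group
struct_zero <- function(counts, group, neg_lb = FALSE) {
  sapply(split(seq_along(group), group), function(idx) {
    x <- counts[, idx, drop = FALSE]
    p_hat <- rowMeans(x != 0)           # proportion of samples with the taxon
    zero <- p_hat == 0                  # assumed first criterion
    if (neg_lb) {
      # assumed second criterion: asymptotic lower bound at or below zero
      lower <- p_hat - 1.96 * sqrt(p_hat * (1 - p_hat) / length(idx))
      zero <- zero | lower <= 0
    }
    zero
  })
}

z_default <- struct_zero(counts, group, neg_lb = FALSE)
z_neg_lb <- struct_zero(counts, group, neg_lb = TRUE)
print(z_default)
print(z_neg_lb)
```

Under these assumptions, taxon2 in group A is flagged only when neg_lb = TRUE. With few samples per group the lower bound is easily non-positive, which is consistent with the advice to reserve neg_lb = TRUE for larger groups.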
Value
a list with components:
feature_table, a data.frame of the pre-processed (based on zero_cut and lib_cut) microbial observed abundance table.
zero_ind, a logical matrix with TRUE indicating that the taxon is identified as a structural zero for the specified group variable.
samp_frac, a numeric vector of estimated sampling fractions in log scale (natural log). Note that if a sample contains missing values for any variable specified in the formula, its sampling fraction estimate is NA, since the sampling fraction is not estimable in the presence of missing values.
resid, a matrix of residuals from the ANCOM-BC log-linear (natural log) model. Rows are taxa and columns are samples.
delta_em, estimated bias terms through the E-M algorithm.
delta_wls, estimated bias terms through the weighted least squares (WLS) algorithm.
res, a list containing the ANCOM-BC primary result, which consists of:
  beta, a data.frame of coefficients obtained from the ANCOM-BC log-linear (natural log) model.
  se, a data.frame of standard errors (SEs) of beta.
  W, a data.frame of test statistics. W = beta/se.
  p_val, a data.frame of p-values. P-values are obtained from a two-sided Z-test using the test statistic W.
  q_val, a data.frame of adjusted p-values. Adjusted p-values are obtained by applying p_adj_method to p_val.
  diff_abn, a logical data.frame. TRUE if the taxon has q_val less than alpha.
res_global, a data.frame containing the ANCOM-BC global test result for the variable specified in group. Its columns are:
  W, test statistics.
  p_val, p-values, which are obtained from a two-sided Chi-square test using W.
  q_val, adjusted p-values. Adjusted p-values are obtained by applying p_adj_method to p_val.
  diff_abn, a logical vector. TRUE if the taxon has q_val less than alpha.
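The relationship between beta, se, W, p_val, q_val, and diff_abn described above can be sketched in base R for a single coefficient. The numbers are made up, and applying the adjustment across taxa for one coefficient at a time is an assumption of this sketch; it is not the package's code.

```r
# Hypothetical estimates for one coefficient across three taxa
beta <- c(taxon1 = 1.2, taxon2 = -0.1, taxon3 = 0.8)
se <- c(0.3, 0.2, 0.4)
alpha <- 0.05

W <- beta / se                                   # test statistic
p_val <- 2 * pnorm(abs(W), lower.tail = FALSE)   # two-sided Z-test
q_val <- p.adjust(p_val, method = "holm")        # p_adj_method = "holm"
diff_abn <- q_val < alpha                        # differentially abundant?

print(data.frame(beta, se, W, p_val, q_val, diff_abn))
```

With these values only taxon1 remains significant after the Holm adjustment; taxon3 has an unadjusted p-value just under 0.05 but an adjusted one above it.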
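Author(s)
Huang Lin

References
Kaul A, Mandal S, Davidov O, Peddada SD (2017). "Analysis of Microbiome Data in the Presence of Excess Zeros." Frontiers in Microbiology, 8, 2114.
Lin H, Peddada SD (2020). "Analysis of compositions of microbiomes with bias correction." Nature Communications, 11, 3514.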
Examples
#================Build a Phyloseq-Class Object from Scratch==================
library(phyloseq)

# Fix the seed so the simulated tables are reproducible
set.seed(123)

# Simulated feature table: 10 taxa (rows) by 10 samples (columns)
otu_mat = matrix(sample(1:100, 100, replace = TRUE), nrow = 10, ncol = 10)
rownames(otu_mat) = paste0("taxon", 1:nrow(otu_mat))
colnames(otu_mat) = paste0("sample", 1:ncol(otu_mat))

# Sample metadata; row names must match the sample names of the feature table
meta = data.frame(group = sample(LETTERS[1:4], size = 10, replace = TRUE),
                  row.names = paste0("sample", 1:ncol(otu_mat)),
                  stringsAsFactors = FALSE)

# Taxonomy table; row names must match the taxon names of the feature table
tax_mat = matrix(sample(letters, 70, replace = TRUE),
                 nrow = nrow(otu_mat), ncol = 7)
rownames(tax_mat) = rownames(otu_mat)
colnames(tax_mat) = c("Kingdom", "Phylum", "Class", "Order",
                      "Family", "Genus", "Species")

OTU = otu_table(otu_mat, taxa_are_rows = TRUE)
META = sample_data(meta)
TAX = tax_table(tax_mat)
physeq = phyloseq(OTU, META, TAX)

#========================Run ANCOMBC Using Real Data=========================
library(ANCOMBC)
library(phyloseq)
library(microbiome)
library(tidyverse)

data(GlobalPatterns)

# Aggregate to phylum level
phylum_data = aggregate_taxa(GlobalPatterns, "Phylum")
# The taxonomy table
tax_mat = as(tax_table(phylum_data), "matrix")

# Run ancombc function
out = ancombc(phyloseq = phylum_data, formula = "SampleType",
              p_adj_method = "holm", zero_cut = 0.90, lib_cut = 1000,
              group = "SampleType", struc_zero = TRUE, neg_lb = FALSE,
              tol = 1e-5, max_iter = 100, conserve = TRUE,
              alpha = 0.05, global = TRUE)

# Primary and global test results
res = out$res
res_global = out$res_global