test_de | R Documentation |
Conduct a quasi-likelihood ratio test for a Gamma-Poisson fit.
test_de(
fit,
contrast,
reduced_design = NULL,
full_design = fit$model_matrix,
subset_to = NULL,
pseudobulk_by = NULL,
pval_adjust_method = "BH",
sort_by = NULL,
decreasing = FALSE,
n_max = Inf,
max_lfc = 10,
compute_lfc_se = FALSE,
verbose = FALSE
)
fit |
object of class |
contrast |
The contrast to test. Can be a single column name (quoted or as a string)
that is removed from the full model matrix of |
reduced_design |
a specification of the reduced design used as a comparison to see what
how much better |
full_design |
option to specify an alternative |
subset_to |
a vector with the same length as |
pseudobulk_by |
DEPRECTATED, please use the pseudobulk function instead. |
pval_adjust_method |
one of the p-value adjustment method from
p.adjust.methods. Default: |
sort_by |
the name of the column or an expression used to sort the result. If |
decreasing |
boolean to decide if the result is sorted increasing or decreasing
order. Default: |
n_max |
the maximum number of rows to return. Default: |
max_lfc |
set the maximum absolute log fold change that is returned. Large log fold changes
occur for lowly expressed genes because the ratio of two small numbers can be impractically large. For example, limiting
the range of log fold changes can clarify the patterns in a volcano plot. Default: |
compute_lfc_se |
Compute standard errors for the log fold changes, and add a column |
verbose |
a boolean that indicates if information about the individual steps are printed
while fitting the GLM. Default: |
The cond()
helper function simplifies the specification of a contrast for complex experimental designs.
Instead of working out which combination of coefficients corresponds to a research question,
you can simply specify the two conditions that you want to compare.
You can only call the cond
function inside the contrast
argument. The arguments are the selected factor levels
for each covariate. To compare two conditions, simply subtract the two cond
calls. Internally, the package
calls model.matrix using the provided values and the original formula from the fit to produce a vector.
Subtracting two of these vectors produces a contrast vector. Missing covariates are filled with the first factor level
or zero for numerical covariates.
a data.frame
with the following columns
the rownames of the input data
the p-values of the quasi-likelihood ratio test
the adjusted p-values returned from p.adjust()
the F-statistic: F = (Dev_{full} - Dev_{red}) / (df_1 * disp_{ql-shrunken})
the degrees of freedom of the test: ncol(design) - ncol(reduced_design)
the degrees of freedom of the fit: ncol(data) - ncol(design) + df_0
the log2-fold change. If the alternative model is specified by reduced_design
will
be NA
.
the standard error of the log2-fold change. Only calculated if compute_lfc_se == TRUE
.
Lund, S. P., Nettleton, D., McCarthy, D. J., & Smyth, G. K. (2012). Detecting differential expression in RNA-sequence data using quasi-likelihood with shrunken dispersion estimates. Statistical Applications in Genetics and Molecular Biology, 11(5). https://doi.org/10.1515/1544-6115.1826.
glm_gp()
# Make Data
Y <- matrix(rnbinom(n = 30 * 100, mu = 4, size = 0.3), nrow = 30, ncol = 100)
annot <- data.frame(mouse = sample(LETTERS[1:6], size = 100, replace = TRUE),
celltype = sample(c("Tcell", "Bcell", "Macrophages"), size = 100, replace = TRUE),
cont1 = rnorm(100), cont2 = rnorm(100, mean = 30))
annot$condition <- ifelse(annot$mouse %in% c("A", "B", "C"), "ctrl", "treated")
head(annot)
se <- SummarizedExperiment::SummarizedExperiment(Y, colData = annot)
# Fit model
fit <- glm_gp(se, design = ~ condition + celltype + cont1 + cont2)
# Test with reduced design
res <- test_de(fit, reduced_design = ~ celltype + cont1 + cont2)
head(res)
# Test with contrast argument, the results are identical
res2 <- test_de(fit, contrast = conditiontreated)
head(res2)
# Test with explicit specification of the conditions
# The results are still identical
res3 <- test_de(fit, contrast = cond(condition = "treated", celltype = "Bcell") -
cond(condition = "ctrl", celltype = "Bcell"))
head(res3)
# The column names of fit$Beta are valid variables in the contrast argument
colnames(fit$Beta)
# You can also have more complex contrasts:
# the following compares cont1 vs cont2:
test_de(fit, cont1 - cont2, n_max = 4)
# You can also sort the output
test_de(fit, cont1 - cont2, n_max = 4,
sort_by = "pval")
test_de(fit, cont1 - cont2, n_max = 4,
sort_by = - abs(f_statistic))
# If the data has multiple samples, it is a good
# idea to aggregate the cell counts by samples.
# This is called forming a "pseudobulk".
se_reduced <- pseudobulk(se, group_by = vars(mouse, condition, celltype),
cont1 = mean(cont1), cont2 = min(cont2))
fit_reduced <- glm_gp(se_reduced, design = ~ condition + celltype)
test_de(fit_reduced, contrast = "conditiontreated", n_max = 4)
test_de(fit_reduced, contrast = cond(condition = "treated", celltype = "Macrophages") -
cond(condition = "ctrl", celltype = "Macrophages"),
n_max = 4)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.