calculate_markers | R Documentation |
Performs the Wilcoxon rank sum test to identify differentially expressed genes between two groups of cells.
calculate_markers(
expression_matrix,
cells1,
cells2,
logfc_threshold = 0,
min_pct_threshold = 0.1,
avg_expr_threshold_group1 = 0,
min_diff_pct_threshold = -Inf,
rank_matrix = NULL,
feature_names = NULL,
used_slot = "data",
norm_method = "SCT",
pseudocount_use = 1,
base = 2,
adjust_pvals = TRUE,
check_cells_set_diff = TRUE
)
expression_matrix |
A matrix of gene expression values having genes in rows and cells in columns. |
cells1 |
A vector of cell indices for the first group of cells. |
cells2 |
A vector of cell indices for the second group of cells. |
logfc_threshold |
The minimum absolute log fold change to consider a
gene as differentially expressed. Defaults to 0. |
min_pct_threshold |
The minimum fraction of cells expressing a gene
from each cell population to consider the gene as differentially expressed.
Increasing the value will speed up the function. Defaults to 0.1. |
avg_expr_threshold_group1 |
The minimum average expression that a gene
should have in the first group of cells to be considered as differentially
expressed. Defaults to 0. |
min_diff_pct_threshold |
The minimum difference in the fraction of cells
expressing a gene between the two cell populations to consider the gene as
differentially expressed. Defaults to -Inf. |
rank_matrix |
A matrix where the cells are ranked based on their
expression levels with respect to each gene. Defaults to NULL. |
feature_names |
A vector of gene names. Defaults to NULL. |
used_slot |
Parameter that provides additional information about the
expression matrix, whether it was scaled or not. The value of this parameter
impacts the calculation of the fold change. If |
norm_method |
The normalization method used to normalize the expression
matrix. The value of this parameter impacts the calculation of the average
expression of the genes when |
pseudocount_use |
The pseudocount to add to the expression values when
calculating the average expression of the genes, to avoid the 0 value for
the denominator. Defaults to 1. |
base |
The base of the logarithm. Defaults to 2. |
adjust_pvals |
A logical value indicating whether to adjust the p-values
for multiple testing using the Bonferroni method. Defaults to TRUE. |
check_cells_set_diff |
A logical value indicating whether to check if
the two cell groups are disjoint or not. Defaults to TRUE. |
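The roles of pseudocount_use, base and the percentage thresholds can be illustrated with a minimal base R sketch. It assumes that the fold change is the difference of the log mean expression of the two groups and that a cell "expresses" a gene when its value is above zero. The helper names sketch_fold_change and sketch_pct are hypothetical, and the package itself may compute these quantities differently depending on used_slot and norm_method.

```r
# Hypothetical helpers for illustration only; not part of the documented package.
sketch_fold_change <- function(expr, cells1, cells2, pseudocount = 1, base = 2) {
  # Mean expression per gene (row) within each cell group.
  mean1 <- rowMeans(expr[, cells1, drop = FALSE])
  mean2 <- rowMeans(expr[, cells2, drop = FALSE])
  # The pseudocount keeps both logarithms finite when a group mean is 0.
  log(mean1 + pseudocount, base = base) - log(mean2 + pseudocount, base = base)
}

# Fraction of cells in a group with expression above zero, per gene
# (assumed definition of "expressing a gene").
sketch_pct <- function(expr, cells) {
  rowMeans(expr[, cells, drop = FALSE] > 0)
}

# Toy matrix: 2 genes in rows, 4 cells in columns.
toy <- matrix(
  c(0, 0, 3, 3,
    1, 1, 1, 1),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("geneA", "geneB"), as.character(1:4))
)

fc <- sketch_fold_change(toy, cells1 = 3:4, cells2 = 1:2)
pct1 <- sketch_pct(toy, 3:4)
pct2 <- sketch_pct(toy, 1:2)
```

Under these assumptions, a filter such as min_diff_pct_threshold would compare pct1 - pct2 against the threshold, and logfc_threshold would compare abs(fc) against the threshold.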
A data frame containing the following columns:
gene
: The gene name.
avg_log2FC
: The average log fold change between the two cell groups.
p_val
: The p-value of the Wilcoxon rank sum test.
p_val_adj
: The adjusted p-value of the Wilcoxon rank sum test.
pct.1
: The fraction of cells expressing the gene in the first cell group.
pct.2
: The fraction of cells expressing the gene in the second cell group.
avg_expr_group1
: The average expression of the gene in the first cell group.
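The p_val and p_val_adj columns can be approximated for a small matrix with base R alone. The sketch below is an assumption about the general approach (one two-sided Wilcoxon rank sum test per gene, followed by a Bonferroni correction over all tested genes), not the package's actual implementation, which may rely on a precomputed rank_matrix for speed.

```r
set.seed(1)

# Toy matrix: 3 genes x 20 cells; gene1 is shifted upward in cells 11 to 20.
toy <- matrix(
  runif(60),
  nrow = 3,
  dimnames = list(paste0("gene", 1:3), as.character(1:20))
)
toy["gene1", 11:20] <- toy["gene1", 11:20] + 5

cells1 <- 11:20
cells2 <- 1:10

# One two-sided Wilcoxon rank sum test per gene (row).
p_val <- apply(toy, 1, function(x) {
  stats::wilcox.test(x[cells1], x[cells2])$p.value
})

# Bonferroni correction: multiply by the number of tests, capped at 1.
p_val_adj <- stats::p.adjust(p_val, method = "bonferroni")

result <- data.frame(
  gene = rownames(toy),
  p_val = p_val,
  p_val_adj = p_val_adj
)
```

Because gene1 is completely separated between the two groups, its p-value is very small, while the Bonferroni-adjusted values are never smaller than the raw ones.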
set.seed(2024)
# create an artificial expression matrix
expr_matrix <- matrix(
c(runif(100 * 50), runif(100 * 50, min = 3, max = 4)),
ncol = 200, byrow = FALSE
)
colnames(expr_matrix) <- as.character(1:200)
rownames(expr_matrix) <- paste("feature", 1:50)
calculate_markers(
expression_matrix = expr_matrix,
cells1 = 101:200,
cells2 = 1:100
)
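Before calling the function, it can be useful to confirm that the two groups do not overlap, which is the condition that check_cells_set_diff refers to. The following base R sketch shows one way to perform such a check by hand; how the function itself reacts to overlapping groups is not stated here, so this is only an illustration.

```r
cells1 <- 101:200
cells2 <- 1:100
n_cells <- 200  # number of columns in the example expression matrix

# Indices shared by both groups; an empty result means the groups are disjoint.
shared <- intersect(cells1, cells2)
groups_disjoint <- length(shared) == 0

# Both groups should also refer to existing columns of the matrix.
in_range <- all(c(cells1, cells2) %in% seq_len(n_cells))
```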