topTable: Show table of results for top clusters or cluster-marker...

View source: R/topTable.R

topTableR Documentation

Show table of results for top clusters or cluster-marker combinations

Description

Show table of results for top (most highly significant) clusters or cluster-marker combinations

Usage

topTable(
  res,
  d_counts = NULL,
  d_medians = NULL,
  order = TRUE,
  order_by = "p_adj",
  all = FALSE,
  top_n = 20,
  show_counts = FALSE,
  show_props = FALSE,
  show_meds = FALSE,
  show_logFC = FALSE,
  show_all_cols = FALSE,
  sort_cols = TRUE,
  format_vals = FALSE,
  digits = 3
)

Arguments

res

Output object from either the diffcyt wrapper function or one of the individual differential testing functions (testDA_edgeR, testDA_voom, testDA_GLMM, testDS_limma, or testDS_LMM). If the output object is from the wrapper function, the objects res and d_counts will be automatically extracted. Alternatively, these can be provided directly.

d_counts

(Optional) SummarizedExperiment object containing cluster cell counts, from calcCounts. (If the output object from the wrapper function is provided, this will be be automatically extracted.)

d_medians

(Optional) SummarizedExperiment object containing cluster medians (median marker expression for each cluster-sample combination), from calcMedians. Assumed to contain a logical vector id_state_markers in the meta-data (accessed with metadata(d_medians)$id_state_markers), which identifies the set of 'cell state' markers in the list of assays. (If the output object from the wrapper function is provided, this will be be automatically extracted.)

order

Whether to order results by values in column order_by (default: column p_adj containing adjusted p-values). Default = TRUE.

order_by

Name of column to use to order rows by values, if order = TRUE. Default = "p_adj" (adjusted p-values); other options include "p_val", "cluster_id", and "marker_id".

all

Whether to display all clusters or cluster-marker combinations (instead of top top_n). Default = FALSE.

top_n

Number of clusters or cluster-marker combinations to display (if all = FALSE). Default = 20.

show_counts

Whether to display cluster cell counts by sample (from d_counts). Default = FALSE.

show_props

Whether to display cluster cell count proportions by sample (calculated from d_counts). Default = FALSE.

show_meds

Whether to display median expression values for each cluster-marker combination (from d_medians). Default = FALSE.

show_logFC

Whether to display log fold change (logFC) values. Default = FALSE.

show_all_cols

Whether to display all columns from output object (e.g. logFC, logCPM, LR, etc.) Default = FALSE.

sort_cols

Whether to sort columns of counts, proportions, and medians; by levels of factor sample_id in colData of d_medians (requires object d_medians to be provided). Default = TRUE.

format_vals

Whether to display rounded values in numeric columns. This improves readability of the summary table, but should not be used when exact numeric values are required for subsequent steps (e.g. plotting). Default = FALSE.

digits

Number of significant digits to show, if format_vals = TRUE. Default = 3. (Note: for percentages shown if show_props = TRUE, digits = 1 is used.)

Details

Summary function to display table of results for top (most highly significant) detected clusters or cluster-marker combinations.

The differential testing functions return results in the form of p-values and adjusted p-values for each cluster (DA tests) or cluster-marker combination (DS tests), which can be used to rank the clusters or cluster-marker combinations by their evidence for differential abundance or differential states. The p-values and adjusted p-values are stored in the rowData of the output SummarizedExperiment object generated by the testing functions.

This function displays a summary table of results. By default, the top_n clusters or cluster-marker combinations are shown, ordered by adjusted p-values. Optionally, cluster counts, proportions, and median expression by cluster-marker combination can also be included. The format_vals and digits arguments can be used to display rounded values to improve readability of the summary table.

Value

Returns a DataFrame table of results for the top_n clusters or cluster-marker combinations, ordered by values in column order_by (default: adjusted p-values). Optionally, cluster counts, proportions, and median expression by cluster-marker combination are also included.

Examples

# For a complete workflow example demonstrating each step in the 'diffcyt' pipeline, 
# see the package vignette.

# Function to create random data (one sample)
d_random <- function(n = 20000, mean = 0, sd = 1, ncol = 20, cofactor = 5) {
  d <- sinh(matrix(rnorm(n, mean, sd), ncol = ncol)) * cofactor
  colnames(d) <- paste0("marker", sprintf("%02d", 1:ncol))
  d
}

# Create random data (without differential signal)
set.seed(123)
d_input <- list(
  sample1 = d_random(), 
  sample2 = d_random(), 
  sample3 = d_random(), 
  sample4 = d_random()
)

# Add differential abundance (DA) signal
ix_DA <- 801:900
ix_cols_type <- 1:10
d_input[[3]][ix_DA, ix_cols_type] <- d_random(n = 1000, mean = 2, ncol = 10)
d_input[[4]][ix_DA, ix_cols_type] <- d_random(n = 1000, mean = 2, ncol = 10)

# Add differential states (DS) signal
ix_DS <- 901:1000
ix_cols_DS <- 19:20
d_input[[1]][ix_DS, ix_cols_type] <- d_random(n = 1000, mean = 3, ncol = 10)
d_input[[2]][ix_DS, ix_cols_type] <- d_random(n = 1000, mean = 3, ncol = 10)
d_input[[3]][ix_DS, c(ix_cols_type, ix_cols_DS)] <- d_random(n = 1200, mean = 3, ncol = 12)
d_input[[4]][ix_DS, c(ix_cols_type, ix_cols_DS)] <- d_random(n = 1200, mean = 3, ncol = 12)

experiment_info <- data.frame(
  sample_id = factor(paste0("sample", 1:4)), 
  group_id = factor(c("group1", "group1", "group2", "group2")), 
  stringsAsFactors = FALSE
)

marker_info <- data.frame(
  channel_name = paste0("channel", sprintf("%03d", 1:20)), 
  marker_name = paste0("marker", sprintf("%02d", 1:20)), 
  marker_class = factor(c(rep("type", 10), rep("state", 10)), 
                        levels = c("type", "state", "none")), 
  stringsAsFactors = FALSE
)

# Create design matrix
design <- createDesignMatrix(experiment_info, cols_design = "group_id")

# Create contrast matrix
contrast <- createContrast(c(0, 1))

# Test for differential abundance (DA) of clusters (using default method 'diffcyt-DA-edgeR')
out_DA <- diffcyt(d_input, experiment_info, marker_info, 
                  design = design, contrast = contrast, 
                  analysis_type = "DA", method_DA = "diffcyt-DA-edgeR", 
                  seed_clustering = 123, verbose = FALSE)

# Test for differential states (DS) within clusters (using default method 'diffcyt-DS-limma')
out_DS <- diffcyt(d_input, experiment_info, marker_info, 
                  design = design, contrast = contrast, 
                  analysis_type = "DS", method_DS = "diffcyt-DS-limma", 
                  seed_clustering = 123, verbose = FALSE)

# Display results for top DA clusters
topTable(out_DA, format_vals = TRUE)

# Display results for top DS cluster-marker combinations
topTable(out_DS, format_vals = TRUE)


lmweber/diffcyt documentation built on March 19, 2024, 5:24 a.m.