knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
The primary function HDStIM() in the HDStIM package follows a heuristic approach to group cells into responding and non-responding. For a combination of cell population and stimulation type (e.g., CD127+ T-helper cells and interferon-alpha), HDStIM() starts by performing K-means clustering on the combined set of cells from stimulated and unstimulated samples, using the combined expression data of all the state (signaling/intracellular) markers. After clustering, the cells are cross-tabulated by cluster and stimulation status in a contingency table like the example shown below, and a Fisher's exact test on that table determines the effect size and the statistical significance of the partitioning. Cells from the combinations that pass the Fisher's exact test (p-value < 0.05) are considered responding. An optional UMAP can also be calculated to visually verify the partitioning of cells into responding and non-responding groups, using the auxiliary plotting scripts provided in the package.
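The clustering and testing steps described above can be sketched in base R on simulated data. This is a minimal illustration under stated assumptions, not the internals of HDStIM(): the marker names, cell counts, and effect sizes are made up, and picking the cluster with the higher share of stimulated cells as the "responding" cluster is an assumption made for the example.

```r
set.seed(123)
n <- 500        # cells per condition (made-up number)
n_resp <- 300   # simulated stimulated cells that actually respond

# Simulated state-marker expression; marker names are hypothetical.
unstim <- matrix(rnorm(n * 2, mean = 0), ncol = 2)
stim <- rbind(
  matrix(rnorm(n_resp * 2, mean = 3), ncol = 2),        # shifted by stimulation
  matrix(rnorm((n - n_resp) * 2, mean = 0), ncol = 2)   # unchanged
)
expr <- rbind(stim, unstim)
colnames(expr) <- c("pSTAT1", "pSTAT3")
condition <- factor(rep(c("Stim", "Unstim"), each = n),
                    levels = c("Stim", "Unstim"))

# Step 1: K-means with two clusters on the combined cells.
km <- kmeans(expr, centers = 2, nstart = 10)

# Step 2: cluster-by-condition contingency table and Fisher's exact test.
tab <- table(Cluster = km$cluster, Condition = condition)
ft <- fisher.test(tab)

# Step 3 (assumption): treat the cluster enriched for stimulated cells
# as the responding cluster and flag the stimulated cells inside it.
stim_frac <- tab[, "Stim"] / rowSums(tab)
resp_cluster <- as.integer(names(which.max(stim_frac)))
responding <- condition == "Stim" & km$cluster == resp_cluster

print(tab)
print(ft$p.value)
print(sum(responding))
```

With a shift this large, the two clusters separate cleanly and the test is strongly significant; with real data the separation is usually far less clear-cut, which is why the p-value threshold matters.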
In addition to the auxiliary script for plotting UMAPs, the package comes with two other plotting scripts: one for the K-means clustering and Fisher's exact test results, and one for state marker densities before and after mapping.
An example of the contingency table used for Fisher's exact test.
matrix(c(60, 40, 20, 80), nrow = 2, ncol = 2,
       dimnames = list(c("Cluster1", "Cluster2"), c("Stim", "Unstim")))
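As a small worked example, the table above can be passed directly to fisher.test() from base R. The sample odds ratio for these counts is (60 × 80) / (40 × 20) = 6; fisher.test() reports a conditional maximum likelihood estimate, which will be close to that value but not identical.

```r
# Same example table as above: rows are K-means clusters,
# columns are stimulation status.
tab <- matrix(c(60, 40, 20, 80), nrow = 2, ncol = 2,
              dimnames = list(c("Cluster1", "Cluster2"), c("Stim", "Unstim")))

ft <- fisher.test(tab)

ft$p.value    # statistical significance of the partitioning
ft$estimate   # odds ratio (conditional MLE), used as the effect size
ft$conf.int   # 95% confidence interval for the odds ratio
```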
As stated above, HDStIM() is the primary function of the HDStIM package. We will use the example data set chi11 (from mass cytometry) included in the package.
Note: chi11 is a minimal dataset included for unit testing only. Therefore, it does not represent a typical mass/flow cytometry assay.
library(HDStIM)

mapped_data <- HDStIM(chi11$expr_data, chi11$state_markers,
                      chi11$cluster_col, chi11$stim_label,
                      chi11$unstim_label, seed_val = 123,
                      umap = TRUE, umap_cells = 500,
                      verbose = FALSE)

class(mapped_data)
attributes(mapped_data)
HDStIM() returns a list with the mapped expression data, data for the stacked bar plots that visualize the K-means and Fisher's exact test results, and data for the optional UMAP plots. The list also includes tables containing statistical information from K-means and Fisher's exact test, along with other information that was passed to the function as arguments.
head(mapped_data$response_mapping_main)
head(mapped_data$stacked_bar_plot_data)
head(mapped_data$umap_plot_data)
head(mapped_data$all_fisher_p_val)
head(mapped_data$all_k_means_data)
Using the stacked_bar_plot_data, plot_K_Fisher() generates bar plots showing the percentage of cells from the stimulated and unstimulated samples assigned to each of the two K-means clusters for a given cell population and stimulation type.
plot_K_Fisher() returns a list of ggplot objects. If the path is specified, it can also render and save the plots in PNG format.
k_plots <- plot_K_Fisher(mapped_data, path = NULL, verbose = FALSE)
k_plots[[1]]
Note: You can only generate these plots if you asked for UMAPs to be calculated in the HDStIM() function (umap = TRUE).
UMAP plots can be helpful for visually inspecting how well HDStIM() has mapped responding vs. non-responding cells for a cell population and stimulation type. plot_umap() also returns a list of ggplot objects, and if the path is specified, it renders and saves the plots in PNG format.
u_plots <- plot_umap(mapped_data, path = NULL, verbose = FALSE)
u_plots[[1]]
For each state/signaling marker, the distribution plots show the kernel density estimate of the pre-HDStIM() data from both stimulated and unstimulated samples, along with the density of the cells from stimulated samples that were mapped as responding. plot_exprs() also returns a list of ggplot objects, and if the path is specified, it renders and saves the plots in PNG format.
e_plots <- plot_exprs(mapped_data, path = NULL, verbose = FALSE)

library(ggplot2)
e_plots[[1]] + theme(text = element_text(size = 11))
The marker_ranking_boruta() function runs Boruta on the stimulation and cell population combinations that passed the Fisher's exact test, ranking the markers according to their contribution to the response. The function returns a list with a tibble containing the attribute statistics calculated by Boruta, together with ggplot objects. If the path is not NULL, the plots are also rendered and saved in the specified folder in PNG format.
m_ranks <- marker_ranking_boruta(mapped_data, path = NULL, n_cells = NULL,
                                 max_runs = 100, seed_val = 123,
                                 verbose = FALSE)

head(m_ranks$attribute_stats)
m_ranks$plots[[1]]
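To give a feel for what Boruta's attribute statistics look like, the sketch below calls the Boruta package directly on simulated data. It is an illustration only and does not reproduce what marker_ranking_boruta() does internally: the marker names, the effect sizes, and the use of a responding/non-responding label as the outcome are all assumptions made for the example.

```r
library(Boruta)

set.seed(123)
n <- 200  # cells per group (made-up number)

# Hypothetical outcome: whether a cell was labeled as responding.
response <- factor(rep(c("responding", "non_responding"), each = n))

# Hypothetical state markers with different amounts of signal.
markers <- data.frame(
  pSTAT1 = c(rnorm(n, mean = 3), rnorm(n, mean = 0)),  # strong shift
  pSTAT3 = c(rnorm(n, mean = 1), rnorm(n, mean = 0)),  # moderate shift
  pERK   = rnorm(2 * n)                                # no shift
)

# Boruta compares each marker's random-forest importance against
# shuffled "shadow" copies of the markers.
fit <- Boruta(x = markers, y = response, maxRuns = 100)

# attStats() returns per-attribute importance summaries and the final
# decision (Confirmed, Tentative, or Rejected).
stats <- attStats(fit)
stats[order(-stats$meanImp), c("meanImp", "decision")]
```

Markers with a higher mean importance contribute more to separating the two groups, which is the sense in which the ranking reflects each marker's contribution to the response.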