knitr::opts_chunk$set( collapse=TRUE, comment="#>", message=FALSE, warning=FALSE )
This vignette describes the workflow for running K2Taxonomer recursive partitioning on single-cell gene expression data [@reed_2020]. Note that many of these steps are shared with bulk expression analyses. A vignette for running K2Taxonomer on bulk expression data can be found here.
The single-cell expression workflow performs partitioning at the level of annotated cell clusters and/or cell types. For recursive partitioning, it is recommended to use the same data matrix on which clustering was performed, such as an integrated data matrix generated prior to clustering. For tasks downstream of partitioning, users can specify an alternative data matrix.
## K2Taxonomer package
library(K2Taxonomer)
## Seurat package
library(Seurat)
## For drawing dendrograms
library(ggdendro)
data("ifnb_small")
data("cellMarker2_genesets")
DimPlot(ifnb_small, label = TRUE, raster = TRUE, pt.size = 3, alpha = 0.4) + NoLegend()
K2Taxonomer
## Integrated expression matrix used for clustering data
integrated_expression_matrix <- ifnb_small@assays$integrated$scale.data
## Normalized expression matrix to be used for downstream analyses
normalized_expression_matrix <- ifnb_small@assays$SCT$data
## Profile-level information
cell_data <- ifnb_small@meta.data
K2 object
The K2preproc() function initializes the K2 object and runs pre-processing steps.
Here, you can specify all arguments used throughout the analysis. Otherwise,
you can specify these arguments within the specific functions for which they
are implemented. See help pages for more information.
A description of arguments implemented in this vignette are
integrated
across experiments/samples because these are generally considered insufficient for statistical analyses of single-cell gene expression.
# Initialize `K2` object
K2res <- K2preproc(object = integrated_expression_matrix,
    eMatDS = normalized_expression_matrix,
    colData = cell_data,
    cohorts = "cell_type",
    nBoots = 200,
    clustFunc = "cKmeansDownsampleSqrt",
    genesets = cellMarker2_genesets)
The K2Taxonomer algorithm is run by K2tax(). At each recursion of the algorithm, the observations are partitioned into two sub-groups based on a compilation of repeated K=2 clustering on bootstrapped sampling of features.
K2res <- K2tax(K2res)
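To make the idea of compiling repeated K=2 clusterings over bootstrapped features more concrete, the following base R sketch runs a single partitioning step on simulated data. It is not the K2Taxonomer implementation: the simulated matrix, the use of hierarchical clustering for each K=2 split, and the co-clustering frequency used to form the consensus are all assumptions chosen for illustration.

```r
## Illustrative sketch only: this is not the K2Taxonomer implementation
set.seed(1)

## Simulated data: 40 features (rows) by 6 cohorts (columns),
## where cohorts 1-3 and cohorts 4-6 differ in mean expression
n_features <- 40
expr <- cbind(
    matrix(rnorm(n_features * 3, mean = 0), nrow = n_features),
    matrix(rnorm(n_features * 3, mean = 3), nrow = n_features)
)
colnames(expr) <- paste0("cohort", 1:6)

## Matrix counting how often each pair of cohorts is clustered together
n_boots <- 50
co_cluster <- matrix(0, ncol(expr), ncol(expr),
    dimnames = list(colnames(expr), colnames(expr)))

for (b in seq_len(n_boots)) {
    ## Bootstrap features (rows) with replacement
    boot_rows <- sample(n_features, replace = TRUE)
    ## K=2 clustering of cohorts (hierarchical clustering is an assumed choice)
    hc <- hclust(dist(t(expr[boot_rows, ])), method = "average")
    labels <- cutree(hc, k = 2)
    ## Record which pairs of cohorts landed in the same sub-group
    co_cluster <- co_cluster + outer(labels, labels, "==")
}

## Consensus two-group partition from the co-clustering frequencies
consensus <- cutree(hclust(as.dist(1 - co_cluster / n_boots)), k = 2)
consensus
```

In this sketch, cohorts that are grouped together in most bootstrap iterations end up in the same consensus sub-group. A recursive procedure would then repeat the same step within each sub-group.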
K2Taxonomer results
## Get dendrogram from K2Taxonomer
dendro <- K2dendro(K2res)
## Plot dendrogram
ggdendrogram(dendro)
K2visNetwork(K2res)
K2Taxonomer results
K2res <- runDGEmods(K2res)
### Perform Fisher Exact Test based over-representation analysis
K2res <- runFISHERmods(K2res)
### Perform single-sample gene set scoring
K2res <- runScoreGeneSets(K2res)
### Perform partition-level differential gene set score analysis
K2res <- runDSSEmods(K2res)
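As a rough illustration of what a Fisher exact test based over-representation analysis computes, the base R sketch below tests whether a gene set is over-represented in a gene signature. It is a generic example and not the code used by runFISHERmods(): the gene names, the set sizes, and the one-sided alternative are hypothetical choices made for this sketch.

```r
## Illustrative sketch only; all gene names and counts are hypothetical
universe <- paste0("gene", 1:1000)      # all genes tested
signature <- universe[1:50]             # e.g. genes up-regulated in one sub-group
gene_set <- universe[c(1:20, 501:530)]  # gene set of 50 genes, 20 in the signature

in_sig <- universe %in% signature
in_set <- universe %in% gene_set

## 2x2 contingency table of signature membership by gene set membership
tab <- table(
    signature = factor(in_sig, levels = c(TRUE, FALSE)),
    gene_set = factor(in_set, levels = c(TRUE, FALSE))
)

## One-sided test for over-representation of the gene set in the signature
res <- fisher.test(tab, alternative = "greater")
res$p.value
```

A small p-value here indicates that the overlap between the signature and the gene set is larger than expected by chance, given the size of the gene universe.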
DGEtable <- getDGETable(K2res)
head(DGEtable)
getDGEInter(K2res, minDiff = 1, node = c("A"), pagelength = 10)
plotGenePathway(K2res, feature = "FTL", node = "A")
plotGenePathway(K2res, feature = "FTL", node = "A", use_plotly = FALSE)
ENRtable <- getEnrichmentTable(K2res)
head(ENRtable)
getEnrichmentInter(K2res, nodes = c("A"), pagelength = 10)
plotGenePathway(K2res, feature = "Monocyte", node = "A", type = "gMat")
plotGenePathway(K2res, feature = "Monocyte", node = "A", type = "gMat", use_plotly = FALSE)
For more information about K2Taxonomer dashboards, read this vignette.
# Not run
K2dashboard(K2res, "K2results_ifnb_small")
Given the size of single-cell data sets, parts of the K2Taxonomer workflow can take a long time to run. Accordingly, it is generally recommended to run the workflow using parallel computing. This can be implemented easily by setting the useCors argument in K2preproc().
# Not run
K2res <- K2preproc(object = integrated_expression_matrix,
    eMatDS = normalized_expression_matrix,
    colData = cell_data,
    cohorts = "cell_type",
    nBoots = 200,
    useCors = 8, ## Runs K2Taxonomer in parallel with eight cores.
    clustFunc = "cKmeansDownsampleSqrt",
    genesets = cellMarker2_genesets)
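When choosing a value for useCors, it may help to check how many cores the machine reports. The sketch below uses detectCores() from the base parallel package. Leaving one core free and falling back to a single core are assumptions of this example, not recommendations from the package.

```r
## Number of logical cores reported by the system (may be NA on some platforms)
n_detected <- parallel::detectCores()

## Leave one core free, and fall back to a single core if detection fails
n_cores <- if (is.na(n_detected)) 1L else max(1L, n_detected - 1L)
n_cores
```

The resulting n_cores value could then be passed as useCors = n_cores in the K2preproc() call above.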
In addition to expression matrices, Seurat objects may be input directly with the object argument. In this case, colData is not specified, and this information is pulled from the meta data of the Seurat object. Two additional arguments, seuAssay and seuAssayDS, may be set, each specifying the assay of the Seurat object from which to pull expression data.
K2res_seurat <- K2preproc(object = ifnb_small,
    cohorts = "cell_type",
    seuAssay = "integrated",
    seuAssayDS = "SCT",
    nBoots = 200,
    clustFunc = "cKmeansDownsampleSqrt",
    genesets = cellMarker2_genesets)
## Run recursive partitioning algorithm
K2res_seurat <- K2tax(K2res_seurat)
## Get dendrogram from K2Taxonomer
dendro_seurat <- K2dendro(K2res_seurat)
## Plot dendrogram
ggdendrogram(dendro_seurat)