Dropout events make lowly expressed genes indistinguishable from genes with true zero expression, so their measured values differ from the low expression observed in other cells of the same type. This complicates downstream analysis. ccImpute is an imputation tool that uses cell similarity established by consensus clustering to impute the most probable dropout events in scRNA-seq datasets. ccImpute outperforms existing imputation approaches while introducing the least new noise, as measured by clustering performance on datasets with known cell identities.
ccImpute is an imputation tool and does not provide functions for pre-processing data. Users are expected to pre-process their data before using it, and the input must be log-normalized. This manual includes a minimal sample pre-processing of a dataset from the scRNAseq database using the scater tool.
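To make "log-normalized" concrete, here is a minimal base R sketch on a made-up count matrix. It assumes library-size factors scaled to a mean of 1, followed by log2 with a pseudocount of 1, which is my understanding of the scater defaults; check the scater documentation before relying on this equivalence. The matrix and all variable names are hypothetical.

```r
# Toy count matrix: 3 genes (rows) x 4 cells (columns); values are made up.
counts <- matrix(
  c(10, 0, 5,
    20, 2, 8,
     0, 0, 3,
    40, 4, 16),
  nrow = 3,
  dimnames = list(paste0("gene", 1:3), paste0("cell", 1:4))
)

# Library-size factors: total counts per cell, scaled to have mean 1
# (assumed to match the default behaviour of scater).
lib_sizes <- colSums(counts)
size_factors <- lib_sizes / mean(lib_sizes)

# Divide each cell (column) by its size factor, then apply log2(x + 1).
normalized <- sweep(counts, 2, size_factors, "/")
log_norm <- log2(normalized + 1)

# Zero counts remain exactly zero after the transformation.
print(round(log_norm, 2))
```

Because zeros stay zero under this transformation, dropout events are still visible as zeros in the log-normalized matrix that is passed to the imputation step.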
library(scRNAseq)
library(scater)
library(ccImpute)
library(SingleCellExperiment)
library(stats)
library(mclust)
The following code loads the Darmanis dataset (Darmanis et al., "A survey of human brain transcriptome diversity at the single cell level", 2015) and computes log-transformed normalized counts:
data <- DarmanisBrainData()
data <- logNormCounts(data)
# Compute PCA reduction of the dataset
reducedDims(data) <- list(PCA = prcomp(t(logcounts(data)))$x)

# Get the actual number of cell types
k <- length(unique(colData(data)$cell.type))

# Cluster the PCA-reduced dataset and store the assignments
assgmts <- kmeans(reducedDim(data, "PCA"), centers = k,
                  iter.max = 1e+09, nstart = 1000)$cluster

# Use ARI to compare the k-means assignments to the label assignments
adjustedRandIndex(assgmts, colData(data)$cell.type)
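For readers unfamiliar with the adjusted Rand index (ARI) used above, the following base R sketch computes it by hand from a contingency table. It is an illustration of the metric only, not the mclust implementation; the function name `adjusted_rand_index` and the toy label vectors are hypothetical, and the degenerate case where the denominator is zero (for example, a single cluster on both sides) is not handled.

```r
# Hand-rolled adjusted Rand index for two label vectors of equal length.
adjusted_rand_index <- function(x, y) {
  tab <- table(x, y)                       # contingency table of the two partitions
  n_pairs <- function(n) n * (n - 1) / 2   # number of unordered pairs among n items

  sum_cells <- sum(n_pairs(tab))           # pairs placed together in both partitions
  sum_rows  <- sum(n_pairs(rowSums(tab)))  # pairs placed together in x
  sum_cols  <- sum(n_pairs(colSums(tab)))  # pairs placed together in y

  expected  <- sum_rows * sum_cols / n_pairs(length(x))  # chance-level agreement
  max_index <- (sum_rows + sum_cols) / 2                 # best achievable agreement

  (sum_cells - expected) / (max_index - expected)
}

# Toy labels (made up): the cluster names differ, but the grouping is identical.
truth     <- c("a", "a", "b", "b", "c", "c")
relabeled <- c(2, 2, 3, 3, 1, 1)
# Same as above, except that one cell is moved to a different cluster.
one_moved <- c(2, 2, 3, 1, 1, 1)

ari_same  <- adjusted_rand_index(truth, relabeled)  # identical partitions give 1
ari_moved <- adjusted_rand_index(truth, one_moved)  # lower than ari_same
print(c(same = ari_same, moved = ari_moved))
```

The index depends only on which cells are grouped together, not on the cluster names, which is why k-means cluster numbers can be compared directly with the cell type labels.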
# Impute dropout events in the log-normalized counts and store the result
logcounts(data) <- impute(assays(data)$logcounts, k = k, nCores = 2)
# Recompute PCA reduction of the dataset
reducedDims(data) <- list(PCA = prcomp(t(logcounts(data)))$x)

# Cluster the PCA-reduced dataset and store the assignments
assgmts <- kmeans(reducedDim(data, "PCA"), centers = k,
                  iter.max = 1e+09, nstart = 1000)$cluster

# Use ARI to compare the k-means assignments to the label assignments
adjustedRandIndex(assgmts, colData(data)$cell.type)
R session information.
## Session info
library("sessioninfo")
options(width = 120)
session_info()