iSMNN | R Documentation |
The iSMNN function performs iterative supervised batch effect correction for scRNA-seq data by refining mutual nearest neighbors (MNNs) within corresponding clusters (or cell types) on top of the corrected data. It takes as input raw expression matrices from two or more batches and a list of unified cluster labels (the output of unifiedClusterLabelling from the SMNN package). It returns a Seurat object that contains the batch-corrected expression matrix for the batches.
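To make "mutual nearest neighbors within corresponding clusters" concrete, here is a minimal base R sketch on toy data. It is not the iSMNN implementation: the helper names `find_mnn_pairs` and `supervised_mnn`, the Euclidean distance, and the toy coordinates are all assumptions made for illustration. It only shows the idea of searching for MNN pairs separately inside each matched cluster.

```r
# Toy sketch of cluster-restricted mutual nearest neighbours (MNNs).
# Hypothetical helpers for illustration only; not iSMNN internals.

# Return (row, col) index pairs where a[row, ] and b[col, ] are within
# each other's k nearest neighbours (Euclidean distance).
find_mnn_pairs <- function(a, b, k = 1) {
  na <- nrow(a)
  nb <- nrow(b)
  k <- min(k, na, nb)
  # Cross-distance block: d[i, j] = distance between a[i, ] and b[j, ]
  d <- as.matrix(dist(rbind(a, b)))[seq_len(na), na + seq_len(nb), drop = FALSE]
  # a_to_b[i, j] is TRUE when b[j, ] is among the k nearest cells to a[i, ]
  a_to_b <- matrix(FALSE, na, nb)
  for (i in seq_len(na)) a_to_b[i, order(d[i, ])[seq_len(k)]] <- TRUE
  # b_to_a[i, j] is TRUE when a[i, ] is among the k nearest cells to b[j, ]
  b_to_a <- matrix(FALSE, na, nb)
  for (j in seq_len(nb)) b_to_a[order(d[, j])[seq_len(k)], j] <- TRUE
  which(a_to_b & b_to_a, arr.ind = TRUE)
}

# Search for MNN pairs separately within each matched cluster.
supervised_mnn <- function(a, b, labels_a, labels_b, matched, k = 1) {
  pairs <- lapply(matched, function(cl) {
    ia <- which(labels_a == cl)
    ib <- which(labels_b == cl)
    if (length(ia) == 0 || length(ib) == 0) return(NULL)
    p <- find_mnn_pairs(a[ia, , drop = FALSE], b[ib, , drop = FALSE], k)
    # Map within-cluster indices back to cell indices in each batch
    data.frame(cell_a = ia[p[, 1]], cell_b = ib[p[, 2]],
               cluster = rep(cl, nrow(p)))
  })
  do.call(rbind, Filter(Negate(is.null), pairs))
}

# Toy data: rows are cells, columns are two features; batch b is shifted.
a <- rbind(c(0, 0), c(0.2, 0.1), c(5, 5), c(5.1, 4.9))
b <- a + 1
labels_a <- c("fibroblast", "fibroblast", "macrophage", "macrophage")
labels_b <- labels_a
matched <- c("fibroblast", "macrophage")

pairs <- supervised_mnn(a, b, labels_a, labels_b, matched, k = 1)
print(pairs)
```

Because the search is restricted to one cluster at a time, every returned pair links cells that carry the same label, which is the property the supervised approach relies on.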
iSMNN(object.list = merge.list, batch.cluster.labels = batch.cluster.labels, matched.clusters = c("Endothelial cells", "Macrophage", "Fibroblast"), strategy = "Short.run", iterations = 5, dims = 1:20, npcs = 30)
object.list |
A list of Seurat objects, one per batch. |
assay |
A vector of assay names specifying which assay to use when constructing anchors. If NULL, the current default assay for each object is used. |
batch.cluster.labels |
A list of vectors specifying the cluster label of each cell in each batch. Cells that do not belong to any cluster should be set to 0. |
matched.clusters |
specifies the cell clusters matched between two or more batches. |
strategy |
specifies the iteration option chosen for batch effect correction that in the first option |
iterations |
defines the number of iterations to execute. |
reference |
A vector specifying the object/s to be used as a reference during integration. If NULL (default),
all pairwise anchors are found (no reference/s). If not NULL, the corresponding objects in |
anchor.features |
Can be either:
|
scale |
Whether or not to scale the features provided. Only set to FALSE if you have previously scaled the features you want to use for each object in the object.list |
reduction |
Dimensional reduction to perform when finding anchors. Can be one of:
|
l2.norm |
Perform L2 normalization on the CCA cell embeddings after dimensional reduction |
dims |
Which dimensions to use from the CCA to specify the neighbor search space |
k.anchor |
How many neighbors (k) to use when picking anchors |
k.filter |
How many neighbors (k) to use when filtering anchors |
k.score |
How many neighbors (k) to use when scoring anchors |
max.features |
The maximum number of features to use when specifying the neighborhood search space in the anchor filtering |
nn.method |
Method for nearest neighbor finding. Options include: rann, annoy |
eps |
Error bound on the neighbor finding algorithm (from RANN) |
k.weight |
Number of neighbors to consider when weighting. Default is |
verbose |
Print progress bars and output |
sd.weigth |
defines the bandwidth of the Gaussian smoothing kernel used to compute the correction vector for each cell. Default is |
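The shape expected for batch.cluster.labels and matched.clusters can be sketched in base R as follows. The cell names and labels are hypothetical, and the check assumes that cluster names are compared as exact, case-sensitive strings; this documentation does not confirm how iSMNN matches them.

```r
# Hypothetical labels for two batches; names are cell barcodes.
# Unassigned cells carry "0" (mixing text labels with a numeric 0 in c()
# would coerce it to the string "0" anyway).
batch.cluster.labels <- list(
  c(cellA1 = "fibroblast", cellA2 = "macrophage",
    cellA3 = "0", cellA4 = "endothelial cells"),
  c(cellB1 = "macrophage", cellB2 = "fibroblast",
    cellB3 = "endothelial cells", cellB4 = "0")
)

matched.clusters <- c("endothelial cells", "macrophage", "fibroblast")

# Sanity check (an assumption, not part of iSMNN): every matched cluster
# should occur in every batch, with identical spelling and case.
present.in.all <- vapply(matched.clusters, function(cl) {
  all(vapply(batch.cluster.labels, function(lab) cl %in% lab, logical(1)))
}, logical(1))
print(present.in.all)

# Number of cells per label in each batch
print(lapply(batch.cluster.labels, table))
```

A check like this catches mismatches such as "Macrophage" versus "macrophage" before the correction is run.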
iSMNN returns a Seurat object that contains the batch-corrected expression matrix for the batches.
Yuchen Yang <yyuchen@email.unc.edu>, Gang Li <franklee@live.unc.edu>, Li Qian <li_qian@med.unc.edu>, Yun Li <yunli@med.unc.edu>
Yuchen Yang, Gang Li, Li Qian, Yun Li. iSMNN 2020
# Load the example data data_iSMNN
data("data_iSMNN")

# Provide the marker genes for cluster matching
markers <- c("Col1a1", "Pdgfra", "Ptprc", "Pecam1")

# Specify the cluster labels for each marker gene
cluster.info <- c("fibroblast", "fibroblast", "macrophage", "endothelial cells")

# Harmonize cluster labels across batches
library(SMNN)
batch.cluster.labels <- unifiedClusterLabelling(
  batches = list(data_iSMNN$batch1.mat, data_iSMNN$batch2.mat),
  features.use = markers,
  cluster.labels = cluster.info,
  min.perc = 0.3
)
names(batch.cluster.labels[[1]]) <- colnames(data_iSMNN$batch1.mat)
names(batch.cluster.labels[[2]]) <- colnames(data_iSMNN$batch2.mat)

# Construct the input object for batches using Seurat
library(Seurat)
merge <- CreateSeuratObject(
  counts = cbind(data_iSMNN$batch1.mat, data_iSMNN$batch2.mat),
  min.cells = 0,
  min.features = 0
)
batch_id <- c(rep("batch1", ncol(data_iSMNN$batch1.mat)),
              rep("batch2", ncol(data_iSMNN$batch2.mat)))
names(batch_id) <- colnames(merge)
merge <- AddMetaData(object = merge, metadata = batch_id, col.name = "batch_id")

# Split by batch, then normalize and select variable features per batch
merge.list <- SplitObject(merge, split.by = "batch_id")
merge.list <- lapply(X = merge.list, FUN = function(x) {
  x <- NormalizeData(x)
  x <- FindVariableFeatures(x, selection.method = "vst", nfeatures = 2000)
})

# Correct batch effect
corrected.results <- iSMNN(
  object.list = merge.list,
  batch.cluster.labels = batch.cluster.labels,
  matched.clusters = c("endothelial cells", "macrophage", "fibroblast"),
  strategy = "Short.run",
  iterations = 5,
  dims = 1:20,
  npcs = 30,
  k.filter = 30
)