SingleR | R Documentation |
Returns the best annotation for each cell in a test dataset, given a labelled reference dataset in the same feature space.
SingleR(
test,
ref,
labels,
method = NULL,
clusters = NULL,
genes = "de",
sd.thresh = 1,
de.method = "classic",
de.n = NULL,
de.args = list(),
aggr.ref = FALSE,
aggr.args = list(),
recompute = TRUE,
restrict = NULL,
quantile = 0.8,
fine.tune = TRUE,
tune.thresh = 0.05,
prune = TRUE,
assay.type.test = "logcounts",
assay.type.ref = "logcounts",
check.missing = TRUE,
num.threads = bpnworkers(BPPARAM),
BNPARAM = NULL,
BPPARAM = SerialParam()
)
test |
A numeric matrix of single-cell expression values where rows are genes and columns are cells. Alternatively, a SummarizedExperiment object containing such a matrix. |
ref |
A numeric matrix of (usually log-transformed) expression values from a reference dataset,
or a SummarizedExperiment object containing such a matrix;
see Alternatively, a list or List of SummarizedExperiment objects or numeric matrices containing multiple references. Row names may be different across entries but only the intersection will be used, see Details. |
labels |
A character vector or factor of known labels for all samples in Alternatively, if |
method |
Deprecated. |
clusters |
A character vector or factor of cluster identities for each cell in |
genes , sd.thresh , de.method , de.n , de.args |
Arguments controlling the choice of marker genes used for annotation, see |
aggr.ref , aggr.args |
Arguments controlling the aggregation of the references prior to annotation, see |
recompute |
Deprecated and ignored. |
restrict |
A character vector of gene names to use for marker selection.
By default, all genes in |
quantile , fine.tune , tune.thresh , prune |
Further arguments to pass to |
assay.type.test |
An integer scalar or string specifying the assay of |
assay.type.ref |
An integer scalar or string specifying the assay of |
check.missing |
Logical scalar indicating whether rows should be checked for missing values (and if found, removed). |
num.threads |
Integer scalar specifying the number of threads to use for index building and classification. |
BNPARAM |
Deprecated and ignored. |
BPPARAM |
A BiocParallelParam object specifying how parallelization should be performed in other steps,
see |
This function is just a convenient wrapper around trainSingleR and classifySingleR.
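The wrapper relationship described above can be sketched as two explicit calls. This is a minimal illustration rather than the function's actual internals: it assumes the SingleR and scuttle packages are installed, reuses the mock data helpers from the examples on this page, and the object names trained, pred.two.step and pred.wrapper are made up for the sketch.

```r
library(SingleR)

# Mock data, as in the examples on this page.
ref <- .mockRefData()
test <- .mockTestData(ref)
ref <- scuttle::logNormCounts(ref)
test <- scuttle::logNormCounts(test)

# Step 1: build the classifier from the labelled reference.
trained <- trainSingleR(ref, labels = ref$label)

# Step 2: apply the trained classifier to the test cells.
pred.two.step <- classifySingleR(test, trained)

# The one-call form, for comparison.
pred.wrapper <- SingleR(test, ref, labels = ref$label)
```

Splitting the steps is mainly useful when the same reference is applied to several test datasets, since the training step then only needs to run once.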
The function will automatically restrict the analysis to the intersection of the genes in both ref and test.
If this intersection is empty (e.g., because the two datasets use different gene annotations), an error will be raised.
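The gene restriction can be illustrated with base R on toy matrices. This is only a sketch of the idea, not the package's code: the matrices, gene names and error message below are hypothetical.

```r
# Hypothetical toy matrices; rows are genes, columns are samples or cells.
set.seed(1)
ref.mat <- matrix(rnorm(12), nrow = 4,
    dimnames = list(c("GeneA", "GeneB", "GeneC", "GeneD"), paste0("ref", 1:3)))
test.mat <- matrix(rnorm(6), nrow = 3,
    dimnames = list(c("GeneD", "GeneB", "GeneX"), paste0("cell", 1:2)))

# Keep only the genes present in both datasets.
common <- intersect(rownames(ref.mat), rownames(test.mat))
if (length(common) == 0L) {
    stop("no genes are shared between the reference and the test data")
}

# Subset both matrices to the shared genes, in the same row order.
ref.sub <- ref.mat[common, , drop = FALSE]
test.sub <- test.mat[common, , drop = FALSE]
```

If the two datasets used different identifier types (for example, symbols in one and Ensembl IDs in the other), common would be empty and the stop() branch would be reached.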
If clusters is specified, per-cell profiles are summed to obtain per-cluster profiles.
Annotation is then performed by running classifySingleR on these profiles.
This yields a DataFrame with one row per level of clusters.
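The per-cluster summation can be sketched with base R. The matrix and cluster assignments below are hypothetical, and rowsum() is used here only to illustrate the aggregation; it is not necessarily what the package calls internally.

```r
# Hypothetical expression matrix: 4 genes (rows) by 6 cells (columns).
set.seed(42)
expr <- matrix(rpois(24, lambda = 5), nrow = 4,
    dimnames = list(paste0("Gene", 1:4), paste0("cell", 1:6)))

# Hypothetical cluster assignment, one entry per cell.
clusters <- c("a", "b", "a", "c", "b", "a")

# rowsum() sums rows by group, so transpose to put cells in rows,
# aggregate, then transpose back to genes-by-clusters.
per.cluster <- t(rowsum(t(expr), group = clusters))
```

The result has one column per cluster level, which matches the one-row-per-cluster shape of the annotation output described above.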
The default settings of this function are based on the assumption that ref contains bulk data.
If it contains single-cell data, a different choice of de.method is usually required.
Read the Note in ?trainSingleR for more details.
A DataFrame is returned containing the annotation statistics for each cell (one cell per row).
This is identical to the output of classifySingleR.
Aaron Lun, based on code by Dvir Aran.
Aran D, Looney AP, Liu L et al. (2019). Reference-based analysis of lung single-cell sequencing reveals a transitional profibrotic macrophage. Nat. Immunology 20, 163–172.
library(SingleR)

# Mocking up data with log-normalized expression values:
ref <- .mockRefData()
test <- .mockTestData(ref)
ref <- scuttle::logNormCounts(ref)
test <- scuttle::logNormCounts(test)

# Running the classification with default options, one prediction per cell:
pred <- SingleR(test, ref, labels=ref$label)
table(predicted=pred$labels, truth=test$label)

# Running the classification on clusters, one prediction per cluster:
k.out <- kmeans(t(assay(test, "logcounts")), centers=5) # mock up a clustering
pred2 <- SingleR(test, ref, labels=ref$label, clusters=k.out$cluster)
table(predicted=pred2$labels, cluster=rownames(pred2))