View source: R/scrublet_doubletDetection.R
runScrublet | R Documentation |
scrublet
A wrapper function that calls scrub_doublets from the Python module scrublet. Simulates doublets from the observed data and uses a k-nearest-neighbor classifier to calculate a continuous scrublet_score (between 0 and 1) for each transcriptome. The score is automatically thresholded to generate scrublet_call, a boolean array that is TRUE for predicted doublets and FALSE otherwise.
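The simulate-then-classify idea described above can be sketched in base R. This is a simplified illustration under stated assumptions, not the scrublet algorithm itself: the toy counts, the normalization, the number of principal components, the plain "fraction of simulated neighbors" score, and the fixed 0.8 cutoff are all choices made for this sketch (scrublet computes its own score and picks its threshold automatically).

```r
set.seed(12345)

# Toy counts (genes x cells): two cell types plus a few heterotypic doublets.
nGenes <- 50
nPerType <- 20
typeA <- matrix(rpois(nGenes * nPerType, lambda = rep(c(10, 1), each = nGenes / 2)),
                nrow = nGenes)
typeB <- matrix(rpois(nGenes * nPerType, lambda = rep(c(1, 10), each = nGenes / 2)),
                nrow = nGenes)
trueDoublets <- typeA[, 1:4] + typeB[, 1:4]
counts <- cbind(typeA, typeB, trueDoublets)
nCells <- ncol(counts)

# Step 1: simulate doublets by summing randomly chosen pairs of observed cells.
simDoubletRatio <- 2
nSim <- simDoubletRatio * nCells
simulated <- counts[, sample(nCells, nSim, replace = TRUE)] +
  counts[, sample(nCells, nSim, replace = TRUE)]

# Step 2: embed observed and simulated cells together (normalize, log, PCA).
combined <- cbind(counts, simulated)
normalized <- log1p(1e4 * t(t(combined) / colSums(combined)))
pcs <- prcomp(t(normalized))$x[, 1:5]
isSimulated <- rep(c(FALSE, TRUE), times = c(nCells, nSim))

# Step 3: score each observed cell by the fraction of simulated doublets
# among its k nearest neighbors.
k <- 10
distances <- as.matrix(dist(pcs))
diag(distances) <- Inf  # a cell is not its own neighbor
doubletScore <- vapply(seq_len(nCells), function(i) {
  mean(isSimulated[order(distances[i, ])[seq_len(k)]])
}, numeric(1))

# Step 4: threshold the score into a call (0.8 is an arbitrary cutoff here).
doubletCall <- doubletScore > 0.8
```

In this sketch doubletScore plays the role of scrublet_score and doubletCall the role of scrublet_call for the observed cells only; the simulated cells are used as reference points and then discarded.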
runScrublet(
  inSCE,
  sample = NULL,
  useAssay = "counts",
  simDoubletRatio = 2,
  nNeighbors = NULL,
  minDist = NULL,
  expectedDoubletRate = 0.1,
  stdevDoubletRate = 0.02,
  syntheticDoubletUmiSubsampling = 1,
  useApproxNeighbors = TRUE,
  distanceMetric = "euclidean",
  getDoubletNeighborParents = FALSE,
  minCounts = 3,
  minCells = 3L,
  minGeneVariabilityPctl = 85,
  logTransform = FALSE,
  meanCenter = TRUE,
  normalizeVariance = TRUE,
  nPrinComps = 30L,
  tsneAngle = NULL,
  tsnePerplexity = NULL,
  verbose = TRUE,
  seed = 12345
)
inSCE |
A SingleCellExperiment object. |
sample |
Character vector or colData variable name. Indicates which
sample each cell belongs to. Default NULL. |
useAssay |
A string specifying which assay in the SCE to use. Default
"counts". |
simDoubletRatio |
Numeric. Number of doublets to simulate relative to
the number of observed transcriptomes. Default 2. |
nNeighbors |
Integer. Number of neighbors used to construct the KNN
graph of observed transcriptomes and simulated doublets. If |
minDist |
Float. Determines how tightly UMAP packs points together. If
|
expectedDoubletRate |
The estimated doublet rate for the experiment.
Default 0.1. |
stdevDoubletRate |
Uncertainty in the expected doublet rate. Default
0.02. |
syntheticDoubletUmiSubsampling |
Numeric. Rate for sampling UMIs when
creating synthetic doublets. If |
useApproxNeighbors |
Boolean. Use approximate nearest neighbor method
(annoy) for the KNN classifier. Default TRUE. |
distanceMetric |
Character. Distance metric used when finding nearest
neighbors. See Details. Default "euclidean". |
getDoubletNeighborParents |
Boolean. If |
minCounts |
Numeric. Used for gene filtering prior to PCA. Genes
expressed at fewer than |
minCells |
Integer. Used for gene filtering prior to PCA. Genes
expressed at fewer than |
minGeneVariabilityPctl |
Numeric. Used for gene filtering prior to
PCA. Keep the most highly variable genes (in the top
|
logTransform |
Boolean. If |
meanCenter |
Boolean. If |
normalizeVariance |
Boolean. If |
nPrinComps |
Integer. Number of principal components used to embed
the transcriptomes prior to k-nearest-neighbor graph construction.
Default 30. |
tsneAngle |
Float. Determines angular size of a distant node as measured
from a point in the t-SNE plot. If |
tsnePerplexity |
Integer. The number of nearest neighbors that is used
in other manifold learning algorithms. If |
verbose |
Boolean. If |
seed |
Seed for the random number generator, can be set to |
For the list of valid values for distanceMetric, see the documentation for annoy (if useApproxNeighbors is TRUE) or sklearn.neighbors.NearestNeighbors (if useApproxNeighbors is FALSE).
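As a rough illustration of the rule above, the following sketch checks a metric name before any call is made. The helper checkDistanceMetric is hypothetical and not part of singleCellTK; the metric list is the one given in the annoy documentation and should be verified against the installed annoy version. The sklearn case is deliberately left unchecked because its valid metrics depend on the search algorithm.

```r
# Metric names accepted by annoy's AnnoyIndex (per the annoy documentation).
annoyMetrics <- c("angular", "euclidean", "manhattan", "hamming", "dot")

# Hypothetical helper, not part of singleCellTK: fail early on an unsupported
# metric when the approximate (annoy) neighbor search is requested.
checkDistanceMetric <- function(distanceMetric, useApproxNeighbors = TRUE) {
  if (useApproxNeighbors && !(distanceMetric %in% annoyMetrics)) {
    stop("'", distanceMetric, "' is not an annoy metric; choose one of: ",
         paste(annoyMetrics, collapse = ", "))
  }
  invisible(distanceMetric)
}

checkDistanceMetric("euclidean")                           # passes silently
checkDistanceMetric("cosine", useApproxNeighbors = FALSE)  # not checked here
```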
A SingleCellExperiment object with scrub_doublets output appended to the colData slot. The columns include scrublet_score and scrublet_call.
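A minimal sketch of how the two output columns might be used downstream. To keep it runnable without singleCellTK or Python, a plain data.frame with made-up values stands in for as.data.frame(colData(sce)); only the column names scrublet_score and scrublet_call come from this page, everything else is an assumption for illustration.

```r
# Hypothetical stand-in for the colData of the returned object;
# the values are invented for illustration only.
cellData <- data.frame(
  scrublet_score = c(0.02, 0.10, 0.85, 0.05, 0.60),
  scrublet_call = c(FALSE, FALSE, TRUE, FALSE, TRUE),
  row.names = paste0("cell", 1:5)
)

# Count predicted singlets (FALSE) and doublets (TRUE).
callCounts <- table(cellData$scrublet_call)

# Keep only the cells that were not called as doublets.
singlets <- cellData[!cellData$scrublet_call, , drop = FALSE]
```

With a real result, the same logical column could be used to subset the cells of the SingleCellExperiment object itself.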
plotScrubletResults, runCellQC
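library(singleCellTK)
# Load the example dataset, which provides the object `sce`.
data(scExample, package = "singleCellTK")
## Not run:
# Drop empty droplets before doublet detection.
sce <- subsetSCECols(sce, colData = "type != 'EmptyDroplet'")
# Requires a Python environment with the scrublet module available.
sce <- runScrublet(sce)
## End(Not run)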