detect_outlier | R Documentation |
This algorithm tries to find 'comp' components in the quality control metrics using a Gaussian mixture model. Outlier detection is performed on the component with the most genes detected. The remaining components are considered poor quality cells. More cells will be classified as low quality as you increase 'comp'.
detect_outlier(
sce,
comp = 1,
sel_col = NULL,
type = c("low", "both", "high"),
conf = c(0.9, 0.99),
batch = FALSE
)
sce |
a SingleCellExperiment object |
comp |
the number of components used in the Gaussian mixture model (GMM). The appropriate value depends on the quality of the experiment. |
sel_col |
a vector of column names which indicate the columns to use for QC. By default it will be the statistics generated by 'calculate_QC_metrics()' |
type |
which outliers to look for: low quality cells only ('low'), possible doublets only ('high'), or both ('both') |
conf |
confidence intervals for the linear regression at the lower and upper tails. Usually the value for the lower tail is smaller, because we hope to pick out more low quality cells than doublets. |
batch |
whether to perform quality control separately for each batch. Default is FALSE. If set to TRUE then you should have a column called 'batch' in the 'colData(sce)'. |
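To illustrate why per-batch QC can matter, here is a minimal base R sketch on simulated data. It does not call 'detect_outlier()': the data, the 'flag_low' helper and the median/MAD rule are all hypothetical stand-ins for the package's actual procedure, chosen only to show how a pooled threshold can miss a cell that is clearly unusual within its own batch.

```r
set.seed(2)

# Simulated QC metric for two batches with different typical values
# (hypothetical data, not produced by calculate_QC_metrics())
qc <- data.frame(
  batch     = rep(c("A", "B"), each = 50),
  log_genes = c(rnorm(50, mean = 8, sd = 0.2), rnorm(50, mean = 6, sd = 0.2))
)
# Cell 1 is low for batch A but looks typical for batch B
qc$log_genes[1] <- 6

# Hypothetical helper: flag values more than 3 MADs below the median.
# This simple rule is an assumption, not the method used by detect_outlier().
flag_low <- function(x) x < median(x) - 3 * mad(x)

# Pooled: one threshold for all cells
pooled <- flag_low(qc$log_genes)

# Per batch: one threshold per level of the 'batch' column
per_batch <- unsplit(lapply(split(qc$log_genes, qc$batch), flag_low), qc$batch)

# Compare the two calls for cell 1
c(pooled = pooled[1], per_batch = per_batch[1])
```

In this sketch the pooled threshold is dragged down by the lower batch, so cell 1 is only flagged when each batch is assessed on its own.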
Detects outliers using Mahalanobis distances.
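The idea of a Mahalanobis-distance outlier call can be sketched in base R as follows. This is a simplified illustration under stated assumptions, not the package's implementation: the QC data are simulated, the column names 'log_counts' and 'log_genes' are hypothetical, the centre and covariance are plain (non-robust) estimates, and a fixed chi-squared quantile is used as the cutoff in place of the regression-based confidence interval controlled by 'conf'.

```r
set.seed(1)

# Simulated QC metrics for 200 cells (hypothetical data and column names)
qc <- data.frame(
  log_counts = rnorm(200, mean = 10, sd = 0.5),
  log_genes  = rnorm(200, mean = 8,  sd = 0.3)
)
# Plant five obviously poor cells with very low counts and genes detected
qc[1:5, ] <- data.frame(log_counts = rep(5, 5), log_genes = rep(4, 5))

# Squared Mahalanobis distance of each cell from the centre of the data
center <- colMeans(qc)
covmat <- cov(qc)
d2 <- mahalanobis(qc, center, covmat)

# Under multivariate normality d2 is approximately chi-squared distributed
# with ncol(qc) degrees of freedom; 0.99 is an assumed cutoff level
cutoff <- qchisq(0.99, df = ncol(qc))
is_outlier <- d2 > cutoff

# Analogue of type = "low": keep only outliers with fewer genes than the centre
is_low <- is_outlier & qc$log_genes < center["log_genes"]
table(is_low)
```

Because the distance accounts for the correlation between metrics, a cell can be flagged even when no single metric looks extreme on its own. The direction check at the end is what separates low quality candidates from possible doublets on the high side.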
an updated SingleCellExperiment object with an 'outliers' column in colData
library(scPipe)
library(SingleCellExperiment)
data("sc_sample_data")
data("sc_sample_qc")
sce = SingleCellExperiment(assays = list(counts = as.matrix(sc_sample_data)))
organism(sce) = "mmusculus_gene_ensembl"
gene_id_type(sce) = "ensembl_gene_id"
QC_metrics(sce) = sc_sample_qc
demultiplex_info(sce) = cell_barcode_matching
UMI_dup_info(sce) = UMI_duplication
# the sample QC data has already been run through `calculate_QC_metrics`
# for a new sce, run `calculate_QC_metrics` before `detect_outlier`
sce = detect_outlier(sce)
table(QC_metrics(sce)$outliers)