library(sesame)
sesameDataCache()

Calculate Quality Metrics

The main function for calculating quality metrics is sesameQC_calcStats. It takes a SigDF, calculates the QC statistics, and returns a single S4 sesameQC object, which can be printed directly to the console. To calculate QC metrics on a list of samples or on all IDATs in a folder, use sesameQC_calcStats within the standard openSesame pipeline, which then returns a list of sesameQC objects. In the examples below, idat_dir and idat_prefixes stand for your own IDAT folder and sample prefixes. Note that preprocessing should be turned off with prep="":

## calculate metrics on all IDATs in a specific folder
sesameQCtoDF(openSesame(idat_dir, prep="", func=sesameQC_calcStats))
## or a list of prefixes, with parallel processing
sesameQCtoDF(openSesame(sprintf("%s/%s", idat_dir, idat_prefixes), prep="",
    func=sesameQC_calcStats, BPPARAM=BiocParallel::MulticoreParam(24)))

By default, the results display frac_dt_cg, RGratio, and RGdistort. For other QC metrics, SeSAMe divides sample quality metrics into multiple groups, listed below, each of which can be referred to by a short key. For example, "intensity" generates signal intensity-related quality metrics.

library(knitr)
kable(data.frame(
    "Short Key" = c(
        "detection",
        "numProbes",
        "intensity",
        "channel",
        "dyeBias",
        "betas"),
    "Description" = c(
        "Signal Detection",
        "Number of Probes",
        "Signal Intensity",
        "Color Channel",
        "Dye Bias",
        "Beta Value")))

By default, sesameQC_calcStats calculates all QC groups. To save time, one can compute a specific QC group by specifying one or multiple short keys in the funs= argument:

sdfs <- sesameDataGet("EPIC.5.SigDF.normal")[1:2] # get two examples
## only compute signal detection stats
qcs <- openSesame(sdfs, prep="", func=sesameQC_calcStats, funs="detection")
qcs[[1]]

We consider signal detection the most important QC metric.

One can retrieve the actual statistics from a sesameQC object using sesameQC_getStats (the following returns the fraction of probes with detection success):

sesameQC_getStats(qcs[[1]], "frac_dt")
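
To compare a single statistic across several samples, the same accessor can be applied over the list returned by openSesame. The following is a minimal sketch; it assumes that sesameQC_getStats returns a single number when given one stat name, as in the call above, and frac_dt_all is an illustrative variable name.

```r
library(sesame)
sesameDataCache()

## two example SigDFs, as used above
sdfs <- sesameDataGet("EPIC.5.SigDF.normal")[1:2]
qcs <- openSesame(sdfs, prep="", func=sesameQC_calcStats, funs="detection")

## one detection fraction per sample; sapply keeps the list names, if any
frac_dt_all <- sapply(qcs, sesameQC_getStats, "frac_dt")
frac_dt_all
```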

After computing the QCs, one can optionally combine the sesameQC objects into a data frame for easy comparison.

## combine a list of sesameQC into a data frame
head(do.call(rbind, lapply(qcs, as.data.frame)))
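
Once the metrics are in a data frame, samples can be screened with ordinary data-frame operations. The sketch below is plain base R and assumes the combined data frame has a frac_dt column (the stat retrieved above). The names flag_low_detection and min_frac are hypothetical, introduced only for illustration, and the cutoff value is left to the user rather than being a SeSAMe recommendation.

```r
## Return the rows of a combined QC data frame whose detection
## fraction falls below a user-chosen cutoff.
## `qc_df`    : data frame with one row per sample and a `frac_dt` column
## `min_frac` : hypothetical cutoff between 0 and 1, chosen by the user
flag_low_detection <- function(qc_df, min_frac) {
    stopifnot(is.data.frame(qc_df), "frac_dt" %in% colnames(qc_df))
    stopifnot(is.numeric(min_frac), length(min_frac) == 1,
        min_frac >= 0, min_frac <= 1)
    ## NA fractions are treated as failing so they are not silently dropped
    low <- is.na(qc_df$frac_dt) | qc_df$frac_dt < min_frac
    qc_df[low, , drop = FALSE]
}
```

For example, flag_low_detection(do.call(rbind, lapply(qcs, as.data.frame)), min_frac = 0.9) would return the samples in which fewer than 90% of probes were detected; the value 0.9 is only a placeholder.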

Note that when the input is a SigDF object, calling sesameQC_calcStats within openSesame is equivalent to calling it as a standalone function.

sdf <- sesameDataGet('EPIC.1.SigDF')
qc <- openSesame(sdf, prep="", func=sesameQC_calcStats, funs=c("detection"))
## equivalent direct call
qc <- sesameQC_calcStats(sdf, c("detection"))
qc

Rank Quality Metrics

options(rmarkdown.html_vignette.check_title = FALSE)

SeSAMe can compare your sample against public datasets. The sesameQC_rankStats() function ranks the input sesameQC object against sesameQC objects calculated from public datasets. It reports the rank percentage of the input sample as well as the number of datasets compared.

sdf <- sesameDataGet('EPIC.1.SigDF')
qc <- sesameQC_calcStats(sdf, "intensity")
qc
sesameQC_rankStats(qc, platform="EPIC")

Quality Control Plots

SeSAMe provides functions to create QC plots. Some functions take sesameQC objects as input, while others plot SigDF objects directly.
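
As a rough sketch of both kinds of input, the calls below use three plotting functions from the sesame package: sesameQC_plotBar, which takes a list of sesameQC objects, and sesameQC_plotRedGrnQQ and sesameQC_plotIntensVsBetas, which take a SigDF. The choice of example data and of the "detection" group is an assumption made for illustration, and default arguments are used throughout.

```r
library(sesame)
sesameDataCache()

## plots that take sesameQC objects: bar plot of each calculated metric
sdfs <- sesameDataGet("EPIC.5.SigDF.normal")
qcs <- lapply(sdfs, sesameQC_calcStats, "detection")
sesameQC_plotBar(qcs)

## plots that take a SigDF directly
sdf <- sesameDataGet("EPIC.1.SigDF")
sesameQC_plotRedGrnQQ(sdf)        # red vs. green channel QQ plot (dye bias)
sesameQC_plotIntensVsBetas(sdf)   # signal intensity against beta values
```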

More about quality control plots can be found in the Supplemental Vignette.

Session Info

sessionInfo()


zwdzwd/sesame documentation built on Jan. 8, 2025, 4:50 a.m.