library(sesame)
sesameDataCache()
The main function to calculate the quality metrics is sesameQC_calcStats. This function takes a SigDF, calculates the QC statistics, and returns a single S4 sesameQC object, which can be printed directly to the console. To calculate QC metrics on a given list of samples or all IDATs in a folder, one can use sesameQC_calcStats within the standard openSesame pipeline. When used with openSesame, a list of sesameQCs will be returned. Note that one should turn off preprocessing using prep="":
## calculate metrics on all IDATs in a specific folder
sesameQCtoDF(openSesame(idat_dir, prep="", func=sesameQC_calcStats))
## or a list of prefixes, with parallel processing
sesameQCtoDF(openSesame(sprintf("%s/%s", idat_dir, idat_prefixes), prep="",
    func=sesameQC_calcStats,
    BPPARAM=BiocParallel::MulticoreParam(24)))
The results display frac_dt_cg, RGratio, RGdistort by default. For other QC metrics, SeSAMe divides sample quality metrics into multiple groups. These groups are listed below and can be referred to by short keys. For example, "intensity" generates signal intensity-related quality metrics.
library(knitr)
kable(data.frame(
    "Short Key" = c(
        "detection", "numProbes", "intensity",
        "channel", "dyeBias", "betas"),
    "Description" = c(
        "Signal Detection", "Number of Probes", "Signal Intensity",
        "Color Channel", "Dye Bias", "Beta Value")))
By default, sesameQC_calcStats calculates all QC groups. To save time, one can compute a specific QC group by specifying one or multiple short keys in the funs= argument:
sdfs <- sesameDataGet("EPIC.5.SigDF.normal")[1:2] # get two examples
## only compute signal detection stats
qcs = openSesame(sdfs, prep="", func=sesameQC_calcStats, funs="detection")
qcs[[1]]
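Because funs= accepts more than one short key, several QC groups can be requested in one call. The following is a minimal sketch using the same example data; it assumes the sesame package is installed and sesameDataCache() has been run, and the variable name qcs_multi is only illustrative.

```r
library(sesame)

## two example SigDF objects from the cached example data
sdfs <- sesameDataGet("EPIC.5.SigDF.normal")[1:2]

## compute only the signal detection and signal intensity groups
qcs_multi <- openSesame(sdfs, prep="", func=sesameQC_calcStats,
    funs=c("detection", "intensity"))

## print the metrics of the first sample
qcs_multi[[1]]
```

Groups that are not listed in funs= are skipped, which is where the time saving comes from.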
We consider signal detection the most important QC metric. One can retrieve the actual stat numbers from a sesameQC object using sesameQC_getStats. The following returns the fraction of probes with detection success:
sesameQC_getStats(qcs[[1]], "frac_dt")
After computing the QCs, one can optionally combine the sesameQC objects into a data frame for easy comparison.
## combine a list of sesameQC into a data frame
head(do.call(rbind, lapply(qcs, as.data.frame)))
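One use of such a comparison is flagging samples with a low detection success rate. The sketch below extracts frac_dt from each sesameQC object with sesameQC_getStats and compares it against a cutoff. The cutoff of 0.9 and the names min_frac_dt and low_quality are assumptions made for illustration, not a SeSAMe recommendation.

```r
library(sesame)

## two example SigDF objects and their detection statistics
sdfs <- sesameDataGet("EPIC.5.SigDF.normal")[1:2]
qcs <- openSesame(sdfs, prep="", func=sesameQC_calcStats, funs="detection")

## pull the detection success fraction out of each sesameQC object
frac_dt <- unlist(lapply(qcs, function(qc) sesameQC_getStats(qc, "frac_dt")))

## hypothetical cutoff, for illustration only
min_frac_dt <- 0.9
low_quality <- frac_dt < min_frac_dt

## detection fractions of the flagged samples (possibly none)
frac_dt[low_quality]
```

A suitable cutoff depends on the platform and sample type, so it is best chosen after looking at the distribution across one's own samples.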
Note that when the input is a SigDF object, calling sesameQC_calcStats within openSesame and calling it as a standalone function are equivalent.
sdf <- sesameDataGet('EPIC.1.SigDF')
qc = openSesame(sdf, prep="", func=sesameQC_calcStats, funs=c("detection"))
## equivalent direct call
qc = sesameQC_calcStats(sdf, c("detection"))
qc
options(rmarkdown.html_vignette.check_title = FALSE)
SeSAMe can compare your sample against public datasets. The sesameQC_rankStats() function ranks the input sesameQC object against sesameQC objects calculated from public datasets. It shows the rank percentage of the input sample as well as the number of datasets compared.
sdf <- sesameDataGet('EPIC.1.SigDF')
qc <- sesameQC_calcStats(sdf, "intensity")
qc
sesameQC_rankStats(qc, platform="EPIC")
SeSAMe provides functions to create QC plots. Some functions take sesameQC objects as input, while others plot SigDF objects directly. Here are some examples:
- sesameQC_plotBar() takes a list of sesameQC objects and creates a bar plot for each metric calculated.
- sesameQC_plotRedGrnQQ() graphs the dye bias between the two color channels.
- sesameQC_plotIntensVsBetas() plots the relationship between $\beta$ values and signal intensity and can be used to diagnose artificial readout and the influence of signal background.
- sesameQC_plotHeatSNPs() plots SNP probes and can be used to detect sample swaps.
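The calls below sketch how some of these plotting functions might be used on the example data. They assume the sesame package and its cached data are available, and that sesameQC_plotRedGrnQQ() and sesameQC_plotIntensVsBetas() each accept a single SigDF as their first argument; check the function help pages for the exact arguments.

```r
library(sesame)

## two example SigDF objects; the first is used for single-sample plots
sdfs <- sesameDataGet("EPIC.5.SigDF.normal")[1:2]
sdf <- sdfs[[1]]

## bar plots of the detection metrics, from a list of sesameQC objects
qcs <- lapply(sdfs, sesameQC_calcStats, "detection")
sesameQC_plotBar(qcs)

## plots drawn directly from a single SigDF
sesameQC_plotRedGrnQQ(sdf)
sesameQC_plotIntensVsBetas(sdf)
```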
More about quality control plots can be found in the Supplemental Vignette.
sessionInfo()