r study
r author
r format(Sys.time(), '%d %B, %Y')
qc.summary$parameters
There are r nrow(qc.summary$sex.summary$tab)
samples analysed.
To separate females and males, we use the difference of total median intensity for Y chromosome probes and X chromosome probes. This will give two distinct clusters of intensities. Females will be clustered on the left and males on the right.
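As a rough illustration of this idea, the sketch below computes the XY difference for a few made-up samples and assigns sex by a cutoff. The intensity values, the column names, the log2 scale and the cutoff of -2 are all assumptions chosen for illustration; the package's own calculation may differ.

```r
# Hypothetical per-sample median intensities (assumed log2 scale)
# for X and Y chromosome probes.
samples <- data.frame(
  sample   = c("s1", "s2", "s3", "s4"),
  x.median = c(12.1, 12.0, 11.6, 11.5),
  y.median = c(8.2, 8.4, 11.3, 11.4)
)

# XY diff: median Y intensity minus median X intensity.
samples$xy.diff <- samples$y.median - samples$x.median

# Assumed cutoff: samples below it fall in the left cluster (female),
# samples above it in the right cluster (male).
sex.cutoff <- -2
samples$predicted.sex <- ifelse(samples$xy.diff < sex.cutoff, "F", "M")

print(samples)
```

Females have little Y signal, so their Y minus X difference is strongly negative, which is why they sit on the left of the plot.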
There are r sum(qc.summary$sex.summary$tab$outliers)
sex detection outliers, and r sum(qc.summary$sex.summary$tab$sex.mismatch == "TRUE")
sex detection mismatches.
tab <- qc.summary$sex.check
if (nrow(tab) > 0) kable(tab, row.names=FALSE)
This is a plot of the difference between median
chromosome Y and chromosome X probe intensities ("XY diff").
Cutoff for sex detection was
XY diff = r qc.summary$parameters$sex.cutoff
. Mismatched samples are shown in red. The dashed lines represent r qc.summary$parameters$sex.outlier.sd
SD from the mean XY difference. Samples that fall outside this interval are flagged as outliers.
(qc.summary$sex.summary$graph)
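The following is a minimal sketch of how outlier and mismatch flags of this kind could be derived, assuming that outliers are samples lying beyond the dashed lines and that a mismatch means the declared sex disagrees with the predicted sex. The data, the variable names and the threshold of 2 SD are hypothetical.

```r
# Hypothetical XY differences for ten samples predicted to be female.
xy.diff <- c(-3.8, -3.7, -3.9, -3.8, -3.7, -3.9, -3.8, -3.8, -3.7, -2.5)
outlier.sd <- 2  # assumed threshold, in standard deviations

# Interval of outlier.sd standard deviations around the mean XY difference.
lower <- mean(xy.diff) - outlier.sd * sd(xy.diff)
upper <- mean(xy.diff) + outlier.sd * sd(xy.diff)
outliers <- xy.diff < lower | xy.diff > upper

# Sex mismatch: declared sex disagrees with predicted sex.
declared.sex  <- c(rep("F", 9), "M")
predicted.sex <- rep("F", 10)
sex.mismatch  <- declared.sex != predicted.sex

data.frame(xy.diff, outliers, sex.mismatch)
```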
To explore the quality of the samples, it is useful to plot the median methylated signal against the median unmethylated signal, with the option to color outliers by group.
There are r sum(qc.summary$meth.unmeth.summary$tab$outliers)
outliers from the meth vs unmeth comparison.
Outliers are samples whose median methylated signal is
more than r qc.summary$parameters$meth.unmeth.outlier.sd
standard deviations
from the value expected from the regression line.
tab <- subset(qc.summary$meth.unmeth.summary$tab, outliers)
if (nrow(tab) > 0) kable(tab, row.names=FALSE)
This is a plot of the methylated signals vs unmethylated signals
(qc.summary$meth.unmeth.summary$graph)
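One way to picture the regression-based outlier rule is sketched below: fit a line of methylated on unmethylated signal and flag samples whose residual is large relative to the residual standard deviation. This is an illustration under stated assumptions, not the package's code; the simulated signals and the threshold of 3 are hypothetical.

```r
# Hypothetical median signals for 15 samples lying on a straight line,
# with sample 8 pushed well below it.
unmeth <- seq(1000, 2400, by = 100)
meth   <- 1.2 * unmeth + 200
meth[8] <- meth[8] - 1500

outlier.sd <- 3  # assumed threshold, in standard deviations

# Regress methylated on unmethylated signal and inspect the residuals.
fit <- lm(meth ~ unmeth)
res <- residuals(fit)

# Flag samples whose residual is more than outlier.sd SDs from the line.
outliers <- abs(res) > outlier.sd * sd(res)
which(outliers)
```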
There were r sum(qc.summary$controlmeans.summary$tab$outliers)
outliers detected based on deviations from mean values for control probes. The 450k array contains control probes that can be used to evaluate the quality of specific sample processing steps (staining, extension, target removal, hybridization, bisulfite conversion, etc.). Control probes are grouped into 42 control-type categories. For each category, a plot has been generated showing the control means for each sample. Outliers are samples whose control means deviate strongly from the mean across samples. Some of the control probe categories contain very few probes. See page 222 of this document: https://support.illumina.com/content/dam/illumina-support/documents/documentation/chemistry_documentation/infinium_assays/infinium_hd_methylation/infinium-hd-methylation-guide-15019519-01.pdf. The most important control probes are the bisulfite1 and bisulfite2 control probes.
tab <- subset(qc.summary$controlmeans.summary$tab, outliers)
if (nrow(tab) > 0) kable(tab, row.names=FALSE)
The distribution of sample control means is plotted here:
(qc.summary$controlmeans.summary$graph)
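To make the control-mean check concrete, the sketch below standardises each control category across samples and flags any sample that is extreme in at least one category. The category names, the values and the threshold of 3 SD are hypothetical, and the package may define the deviation differently.

```r
# Hypothetical control means: rows are samples, columns are two
# made-up control categories. Sample s4 is extreme in control.a.
ctl <- cbind(
  control.a = replace(rep(c(10, 10.2, 9.8), 5), 4, 14),
  control.b = rep(c(5, 5.1, 4.9), 5)
)
rownames(ctl) <- paste0("s", 1:15)

controlmeans.sd <- 3  # assumed threshold, in standard deviations

# Standardise each category (column) across samples.
z <- scale(ctl)

# A sample is an outlier if it is extreme in any category.
outliers <- apply(abs(z) > controlmeans.sd, 1, any)
names(which(outliers))
```

With very few samples a single extreme value inflates the standard deviation it is compared against, so a rule like this only becomes informative with a reasonable sample size.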
To further explore the quality of each sample, the proportion of probes that did not pass the detection p-value threshold has been calculated.
There were r sum(qc.summary$sample.detectionp.summary$tab$outliers)
samples
with a high proportion of undetected probes
(proportion of probes with
detection p-value > r qc.summary$parameters$detection.threshold
is > r qc.summary$parameters$detectionp.samples.threshold
).
tab <- subset(qc.summary$sample.detectionp.summary$tab, outliers)
if (nrow(tab) > 0) kable(tab, row.names=FALSE)
Distribution:
(qc.summary$sample.detectionp.summary$graph)
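The two thresholds above work together: one decides whether a probe is undetected, the other decides how many undetected probes a sample may have. A minimal sketch, assuming a probes-by-samples matrix of detection p-values and made-up threshold values:

```r
# Hypothetical detection p-values: rows are probes, columns are samples.
detp <- matrix(
  c(0.001, 0.002, 0.0005, 0.003, 0.001,   # s1: no undetected probes
    0.2,   0.001, 0.05,   0.002, 0.001,   # s2: two undetected probes
    0.001, 0.001, 0.001,  0.02,  0.001),  # s3: one undetected probe
  nrow = 5,
  dimnames = list(paste0("probe", 1:5), c("s1", "s2", "s3"))
)

detection.threshold <- 0.01           # assumed p-value threshold
detectionp.samples.threshold <- 0.3   # assumed maximum proportion per sample

# Proportion of undetected probes in each sample (column).
prop.undetected <- colMeans(detp > detection.threshold)

# Samples with too many undetected probes.
outliers <- prop.undetected > detectionp.samples.threshold
prop.undetected
```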
To further assess the quality of each sample, the proportion of probes that did not pass the bead number threshold has been calculated.
There were r sum(qc.summary$sample.beadnum.summary$tab$outliers)
samples
with a high proportion of probes with low bead number
(proportion of probes with
bead number < r qc.summary$parameters$bead.threshold
is > r qc.summary$parameters$beadnum.samples.threshold
).
tab <- subset(qc.summary$sample.beadnum.summary$tab, outliers)
if (nrow(tab) > 0) kable(tab, row.names=FALSE)
Distribution:
(qc.summary$sample.beadnum.summary$graph)
To explore the quality of the probes, the proportion of samples that did not pass the detection p-value threshold has been calculated.
There were r sum(qc.summary$cpg.detectionp.summary$tab$outliers)
probes with only background signal in a high proportion of samples
(proportion of samples with detection p-value > r qc.summary$parameters$detection.threshold
is > r qc.summary$parameters$detectionp.cpgs.threshold
).
The Manhattan plot shows, for each probe, the proportion of samples that did not pass the detection p-value threshold.
if (!is.null(qc.summary$cpg.detectionp.summary$graph)) (qc.summary$cpg.detectionp.summary$graph)
To further explore the quality of the probes, the proportion of samples that did not pass the bead number threshold has been calculated.
There were r sum(qc.summary$cpg.beadnum.summary$tab$outliers)
CpGs
with low bead numbers in a high proportion of samples
(proportion of samples with bead number < r qc.summary$parameters$bead.threshold
is > r qc.summary$parameters$beadnum.cpgs.threshold
).
The Manhattan plot shows, for each CpG, the proportion of samples with a low bead number.
if (!is.null(qc.summary$cpg.beadnum.summary$graph)) (qc.summary$cpg.beadnum.summary$graph)
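The probe-level checks use the same idea as the sample-level ones, but summarise across samples rather than across probes. The sketch below illustrates this for bead numbers; the matrix, the probe names, both thresholds and the decision to count missing bead numbers as failures are assumptions made for the example.

```r
# Hypothetical bead counts: rows are CpG probes, columns are samples.
# NA marks a probe with no bead count recorded for that sample.
beads <- matrix(
  c(12,  2, 15, 11,
    10,  9, NA, 14,
     8,  2,  1,  2),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("probe1", "probe2", "probe3"), paste0("s", 1:4))
)

bead.threshold <- 3            # assumed minimum bead count
beadnum.cpgs.threshold <- 0.5  # assumed maximum proportion of failing samples

# Count a missing bead number as a failure (an assumption).
low <- is.na(beads) | beads < bead.threshold

# Proportion of failing samples for each probe (row).
prop.low <- rowMeans(low)
outliers <- prop.low > beadnum.cpgs.threshold
prop.low
```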
child.filename <- file.path(report.path, "missing.rmd")
if (!is.null(qc.summary$cell.counts)) child.filename <- file.path(report.path, "cell-counts.rmd")
The array contains 65 SNP probes, which can be used to identify sample swaps by comparing their genotypes to genotype calls from a genotype array. Before using these SNP probes to assess sample quality, first check the quality of the probes themselves. The distribution of beta values for each SNP probe is used to judge its quality and should show three peaks, one for each genotype.
(qc.summary$genotype.summary$graphs$snp.beta)
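To see why three peaks are expected, the sketch below bins the beta values of a single made-up SNP probe into three genotype groups. The beta values, the break points at 0.2 and 0.8, and the genotype labels are hypothetical; a real analysis would choose the clusters from the data.

```r
# Hypothetical beta values for one SNP probe across eight samples.
snp.beta <- c(0.03, 0.05, 0.48, 0.52, 0.95, 0.97, 0.51, 0.04)

# Assumed break points separating the three expected peaks:
# near 0 (one homozygote), near 0.5 (heterozygote), near 1 (other homozygote).
genotype <- cut(
  snp.beta,
  breaks = c(0, 0.2, 0.8, 1),
  labels = c("AA", "AB", "BB"),
  include.lowest = TRUE
)

table(genotype)
```

A probe whose beta values do not separate into groups like these is a poor candidate for checking sample identity.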
child.filename <- file.path(report.path, "missing.rmd")
if (!is.null(qc.summary$genotype.summary$graphs$snp.concordance)) child.filename <- file.path(report.path, "genotype-concordance.rmd")
sessionInfo()