duplicateDiscordance | R Documentation |
A function to compute pair-wise genotype discordances between multiple genotyping instances of the same subject.
duplicateDiscordance(genoData, subjName.col,
one.pair.per.subj=TRUE, corr.by.snp=FALSE,
minor.allele.only=FALSE, allele.freq=NULL,
scan.exclude=NULL, snp.exclude=NULL,
snp.block.size=5000, verbose=TRUE)
genoData |
A GenotypeData object. |
subjName.col |
A character string indicating the name of the annotation variable that will be identical for duplicate scans. |
one.pair.per.subj |
A logical indicating whether a single pair of scans should be randomly selected for each subject with more than 2 scans. |
corr.by.snp |
A logical indicating whether correlation by SNP should be computed (may significantly increase run time). |
minor.allele.only |
A logical indicating whether discordance should be calculated only between pairs of scans in which at least one scan has a genotype with the minor allele (i.e., exclude major allele homozygotes). |
allele.freq |
A numeric vector with the frequency of the A allele for each SNP in genoData. |
scan.exclude |
An integer vector containing the ids of scans to be excluded. |
snp.exclude |
An integer vector containing the ids of SNPs to be excluded. |
snp.block.size |
Integer block size for SNPs if |
verbose |
Logical value specifying whether to show progress information. |
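The effect of minor.allele.only can be illustrated with a small base-R sketch. It does not call GWASTools: the helper minor_allele_discordance and the toy genotype vectors are hypothetical, genotypes are assumed to be coded as the number of A alleles (0, 1, 2), and the exact rule used inside duplicateDiscordance may differ in detail.

```r
# Hypothetical helper, not part of GWASTools.
# g1, g2: genotypes of two duplicate scans, coded as the number of
#         A alleles (0/1/2, NA = missing call).
# a.freq: frequency of the A allele for each SNP.
minor_allele_discordance <- function(g1, g2, a.freq) {
  # A-allele count of a major-allele homozygote:
  # 2 if A is the major allele, otherwise 0 (assumption: ties count B as major).
  major.hom <- ifelse(a.freq > 0.5, 2, 0)
  # Keep SNPs where both calls are present and at least one scan
  # carries the minor allele (i.e. is not a major-allele homozygote).
  nonmissing <- !is.na(g1) & !is.na(g2)
  keep <- nonmissing & (g1 != major.hom | g2 != major.hom)
  discordant <- sum(g1[keep] != g2[keep])
  npair <- sum(keep)
  c(discordant = discordant, npair = npair,
    discord.rate = discordant / npair)
}

# Toy data: six SNPs genotyped twice for one subject.
g1 <- c(2, 2, 1, 0, 2, NA)
g2 <- c(2, 1, 1, 0, 2, 2)
a.freq <- c(0.9, 0.9, 0.6, 0.2, 0.3, 0.8)
minor.res <- minor_allele_discordance(g1, g2, a.freq)
print(minor.res)
```

In this sketch, SNPs where both scans are major-allele homozygotes drop out of both the numerator and the denominator, so the rate is computed only over the SNPs that carry the minor allele in at least one scan.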
duplicateDiscordance calculates discordance metrics both by scan and by SNP. If one.pair.per.subj=TRUE (the default), each subject with more than two duplicate genotyping instances will have two scans randomly selected for computing discordance. If one.pair.per.subj=FALSE, discordances will be calculated pair-wise for all possible pairs for each subject.
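The pair-selection behaviour described above can be sketched in base R. This is an illustration only: select_pairs is a hypothetical helper, the scan table is made up, and the sampling inside duplicateDiscordance may be implemented differently.

```r
# Hypothetical helper, not part of GWASTools.
# scanID, subj: parallel vectors giving each scan's ID and its subject name.
select_pairs <- function(scanID, subj, one.pair.per.subj = TRUE) {
  by.subj <- split(scanID, subj)
  # Only subjects with at least two scans have duplicates.
  by.subj <- by.subj[lengths(by.subj) >= 2]
  lapply(by.subj, function(ids) {
    if (one.pair.per.subj && length(ids) > 2) {
      # Randomly keep two of the subject's scans.
      ids <- ids[sample.int(length(ids), 2)]
    }
    # 2 x k matrix with one column per pair of scans.
    combn(ids, 2)
  })
}

# Toy scan table: s1 has four scans, s2 has two, s3 has one.
scanID <- 101:107
subj <- c("s1", "s1", "s1", "s2", "s2", "s3", "s1")
set.seed(1)
one.pair <- select_pairs(scanID, subj, one.pair.per.subj = TRUE)
all.pairs <- select_pairs(scanID, subj, one.pair.per.subj = FALSE)
print(sapply(one.pair, ncol))   # a single pair per subject
print(sapply(all.pairs, ncol))  # choose(n, 2) pairs per subject
```

Subjects with a single scan contribute nothing in either mode, and the two modes only differ for subjects with three or more scans.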
A list with the following components:
discordance.by.snp |
data frame with 5 columns: 1. snpID, 2. discordant (number of discordant pairs), 3. npair (number of pairs examined), 4. n.disc.subj (number of subjects with at least one discordance), 5. discord.rate (discordance rate i.e. discordant/npair) |
discordance.by.subject |
a list of matrices (one for each subject) with the pair-wise discordance between the different genotyping instances of the subject |
correlation.by.subject |
a list of matrices (one for each subject) with the pair-wise correlation between the different genotyping instances of the subject |
If corr.by.snp=TRUE, discordance.by.snp will also have a column "correlation" with the correlation between duplicate subjects. For this calculation, the first two samples per subject are selected.
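To make the columns of discordance.by.snp concrete, here is a small base-R sketch that tallies them from a toy genotype matrix. The function discordance_by_snp, the matrix, and the pair list are all hypothetical; missing calls are assumed to be skipped and SNP IDs are taken from row names, which may not match every detail of duplicateDiscordance.

```r
# Hypothetical helper, not part of GWASTools.
# geno:  SNP x scan matrix of A-allele counts (0/1/2, NA = missing),
#        with SNP IDs as row names and scan IDs as column names.
# pairs: named list (one element per subject) of 2 x k matrices of scan IDs.
discordance_by_snp <- function(geno, pairs) {
  nsnp <- nrow(geno)
  discordant <- integer(nsnp)
  npair <- integer(nsnp)
  n.disc.subj <- integer(nsnp)
  for (p in pairs) {
    subj.disc <- logical(nsnp)  # any discordance for this subject, per SNP
    for (j in seq_len(ncol(p))) {
      g1 <- geno[, as.character(p[1, j])]
      g2 <- geno[, as.character(p[2, j])]
      ok <- !is.na(g1) & !is.na(g2)  # both calls present
      d <- ok & (g1 != g2)           # discordant calls
      npair <- npair + ok
      discordant <- discordant + d
      subj.disc <- subj.disc | d
    }
    n.disc.subj <- n.disc.subj + subj.disc
  }
  data.frame(snpID = rownames(geno), discordant = discordant,
             npair = npair, n.disc.subj = n.disc.subj,
             discord.rate = discordant / npair)
}

# Toy data: three SNPs, scans 101-103 from subject s1, 104-105 from s2.
geno <- matrix(c(0, 1, 2,   0, 2, 2,   1, 1, NA,   2, 0, 1,   2, 0, 1),
               nrow = 3,
               dimnames = list(c("rs1", "rs2", "rs3"),
                               c("101", "102", "103", "104", "105")))
pairs <- list(s1 = combn(c(101, 102, 103), 2),
              s2 = matrix(c(104, 105), nrow = 2))
by.snp <- discordance_by_snp(geno, pairs)
print(by.snp)
```

Note that with all pairs included, a subject with several scans can add more than one to discordant for a SNP but at most one to n.disc.subj.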
Tushar Bhangale, Cathy Laurie, Stephanie Gogarten, Sarah Nelson
GenotypeData, duplicateDiscordanceAcrossDatasets, duplicateDiscordanceProbability, alleleFrequency
library(GWASdata)
# open the example GDS genotype file shipped with GWASdata
file <- system.file("extdata", "illumina_geno.gds", package="GWASdata")
gds <- GdsGenotypeReader(file)
# combine the genotypes with the scan annotation that holds subjectID
data(illuminaScanADF)
genoData <- GenotypeData(gds, scanAnnot=illuminaScanADF)
# discordance between duplicate scans of the same subject
disc <- duplicateDiscordance(genoData, subjName.col="subjectID")
# minor allele discordance
afreq <- alleleFrequency(genoData)
minor.disc <- duplicateDiscordance(genoData, subjName.col="subjectID",
  minor.allele.only=TRUE, allele.freq=afreq[,"all"])
# close the underlying GDS file
close(genoData)