Impute missing genotypes in a genotype file (.lfmm) by using ancestry and genotype frequency estimates from an snmf run. The function generates a new lfmm file. See lfmm and lfmm2.
impute(object, input.file, method, K, run)
object | An snmfProject object.
input.file | A path (character string) to an input file in lfmm format with missing genotypes. This must be the same input data that was used to generate the snmf object.
method | A character string: "random" or "mode". With "random", each missing genotype is drawn at random according to the genotype probabilities. With "mode", the most likely genotype is used for matrix completion.
K | An integer value. The number of ancestral populations.
run | An integer value. The particular run used for imputation (usually the run that minimizes the cross-entropy criterion).
NULL | The function writes the imputed genotypes to an output file whose name is the input file name with the "_imputed.lfmm" suffix appended.
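The difference between the two values of method can be illustrated with a minimal base-R sketch. It is not the package's internal code: the matrix probs below is hypothetical and stands in for the genotype probabilities that an snmf run would provide for three missing entries.

```r
# Hypothetical genotype probabilities for three missing entries (assumed values).
# Columns give P(genotype = 0), P(genotype = 1), P(genotype = 2).
probs <- rbind(
  c(0.70, 0.25, 0.05),
  c(0.10, 0.30, 0.60),
  c(0.20, 0.50, 0.30)
)

# "mode"-style completion: take the most likely genotype for each entry.
impute_mode <- apply(probs, 1, which.max) - 1

# "random"-style completion: draw each genotype from its own probability vector.
set.seed(1)
impute_random <- apply(probs, 1, function(p) sample(0:2, size = 1, prob = p))

print(impute_mode)
print(impute_random)
```

The "mode" result is deterministic for a given set of probabilities, whereas the "random" result changes from call to call unless the random seed is fixed.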
Olivier Francois
### Example of analysis ###
library(LEA)
data("tutorial")
# Create a genotype file with missing genotypes.
# The data contain 400 SNPs for 50 individuals.
dat <- as.numeric(tutorial.R)
# 9 encodes a missing genotype in the lfmm format
dat[sample(seq_along(dat), 100)] <- 9
dat <- matrix(dat, nrow = 50, ncol = 400)
write.lfmm(dat, "genotypes.lfmm")
################
# running snmf #
################
project.snmf <- snmf("genotypes.lfmm", K = 4,
                     entropy = TRUE, repetitions = 10,
                     project = "new")
# Select the run with the lowest cross-entropy value
best <- which.min(cross.entropy(project.snmf, K = 4))
# Impute the missing genotypes
impute(project.snmf, "genotypes.lfmm", method = "mode", K = 4, run = best)
# Compare with truth
# Proportion of correct imputation results:
mean(tutorial.R[dat == 9] == read.lfmm("genotypes.lfmm_imputed.lfmm")[dat == 9])
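The mask-and-score logic used in the example can also be tried without running snmf. The sketch below is an illustration only, under stated assumptions: the small genotype matrix is simulated rather than taken from the tutorial data, and impute_col_mode is a hypothetical stand-in imputer (most frequent observed genotype per column), not the snmf-based method.

```r
set.seed(42)
# Hypothetical 5 x 8 genotype matrix with values 0, 1, 2 (not the tutorial data).
truth <- matrix(sample(0:2, 40, replace = TRUE), nrow = 5, ncol = 8)

# Mask 6 entries with the missing-value code 9 and remember their positions.
masked <- sample(length(truth), 6)
dat <- truth
dat[masked] <- 9

# Stand-in imputer (NOT the snmf-based method): fill each missing entry
# with the most frequent observed genotype in its column.
impute_col_mode <- function(x) {
  for (j in seq_len(ncol(x))) {
    miss <- x[, j] == 9
    if (any(miss) && !all(miss)) {
      observed <- x[!miss, j]
      x[miss, j] <- as.numeric(names(which.max(table(observed))))
    }
  }
  x
}
imputed <- impute_col_mode(dat)

# Score only the masked positions; observed entries must be left untouched.
accuracy <- mean(imputed[masked] == truth[masked])
unchanged <- all(imputed[-masked] == truth[-masked])
print(accuracy)
print(unchanged)
```

Storing the masked positions explicitly, rather than recomputing them from dat == 9, keeps the comparison valid even if the masked matrix is later overwritten.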