impute: Impute missing genotypes using an snmf object


Description

Imputes missing genotypes in a genotype file (.lfmm) using the ancestry and genotype frequency estimates from an snmf run. The function writes the result to a new lfmm file. See lfmm and lfmm2.

Usage

impute(object, input.file, method, K, run)

Arguments

object

An snmfProject object.

input.file

A path (character string) to an input file in lfmm format with missing genotypes. This must be the same data that was used to generate the snmf object.

method

A character string: "random" or "mode". With "random", each missing genotype is imputed by using the genotype probabilities. With "mode", the most likely genotype is used for matrix completion.
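To illustrate the difference between the two methods, here is a minimal base-R sketch. It assumes a hypothetical vector `p` of genotype probabilities (for genotypes 0, 1 and 2) at a single missing entry. The helper `impute_one` is not part of LEA and is not meant to reproduce its internal code.

```r
# Hypothetical helper, not part of LEA: impute one missing genotype
# from a vector of probabilities for genotypes 0, 1 and 2.
impute_one <- function(p, method = c("mode", "random")) {
  method <- match.arg(method)
  genotypes <- 0:2
  if (method == "mode") {
    # Take the most likely genotype
    genotypes[which.max(p)]
  } else {
    # Draw one genotype according to the probabilities
    sample(genotypes, size = 1, prob = p)
  }
}

p <- c(0.1, 0.7, 0.2)             # assumed probabilities for 0, 1, 2
impute_one(p, method = "mode")    # deterministic: the most probable genotype
impute_one(p, method = "random")  # may differ from call to call
```

With "mode" the result is reproducible for a given run, whereas "random" should be preceded by `set.seed()` if reproducibility matters.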

K

An integer value. The number of ancestral populations.

run

An integer value. The index of the run to use for imputation (usually the run that minimizes the cross-entropy criterion).

Value

NULL

The function writes the imputed genotypes to an output file whose name is the input file name followed by the "_imputed.lfmm" suffix.
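As a small sketch of the naming convention, assuming (as in the Examples section) that the suffix is appended to the full input file name, the output path can be built as follows. The variable names are illustrative only.

```r
# Build the expected output path from the input path.
# Assumption: the suffix is appended to the full input file name,
# as seen in the Examples section of this page.
input.file  <- "genotypes.lfmm"
output.file <- paste0(input.file, "_imputed.lfmm")
output.file

# After a completed impute() call, the file could then be read with LEA:
# imputed <- read.lfmm(output.file)
```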

Author(s)

Olivier Francois

See Also

snmf lfmm lfmm2

Examples

### Example of analysis ###

library(LEA)
data("tutorial")
# Create a genotype file with missing genotypes.
# The data contain 400 SNPs for 50 individuals.

dat <- as.numeric(tutorial.R)
# Mask 100 randomly chosen genotypes with the missing-value code 9
dat[sample(seq_along(dat), 100)] <- 9
dat <- matrix(dat, nrow = 50, ncol = 400)
write.lfmm(dat, "genotypes.lfmm")

################
# running snmf #
################

project.snmf = snmf("genotypes.lfmm", K = 4, 
        entropy = TRUE, repetitions = 10,
        project = "new")
        
# select the run with the lowest cross-entropy value
best = which.min(cross.entropy(project.snmf, K = 4))

# Impute the missing genotypes
impute(project.snmf, "genotypes.lfmm", method = 'mode', K = 4, run = best)

# Compare with truth
# Proportion of correct imputation results:
mean( tutorial.R[dat == 9] == read.lfmm("genotypes.lfmm_imputed.lfmm")[dat == 9] )
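
The comparison above can be tried on a toy example that does not require LEA. This is only a sketch: the matrices `truth`, `masked` and `imputed` are made up for illustration, and the "imputed" values are set by hand rather than produced by impute.

```r
# Toy data (made up): 2 individuals, 3 SNPs
truth  <- matrix(c(0, 1, 2, 1, 0, 2), nrow = 2)

# Mask two entries with the missing-value code 9
masked <- truth
masked[c(2, 5)] <- 9

# Pretend these values came from an imputation step
imputed <- masked
imputed[c(2, 5)] <- c(1, 2)

# Proportion of masked entries that were recovered correctly
is_missing <- masked == 9
accuracy <- mean(truth[is_missing] == imputed[is_missing])
accuracy

# No missing-value codes should remain after imputation
remaining <- sum(imputed == 9)
remaining
```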

LEA documentation built on Nov. 8, 2020, 8:19 p.m.