estGeno | R Documentation |
Clean up genotype data by error correction based on genotype estimation using a hidden Markov model.
estGeno(
object,
node = "raw",
recomb_rate = 0.04,
error_rate = 0.0025,
call_threshold = 0.9,
het_parent = FALSE,
optim = TRUE,
iter = 2,
n_threads = 1,
dummy_reads = 5,
...
)
## S4 method for signature 'GbsrGenotypeData'
estGeno(
object,
node,
recomb_rate,
error_rate,
call_threshold,
het_parent,
optim,
iter,
n_threads,
dummy_reads
)
object |
A GbsrGenotypeData object. |
node |
Either "raw" or "filt" to indicate whether raw or filtered read
counts are used for genotype estimation. See |
recomb_rate |
A numeric value to indicate the expected recombination frequency per chromosome per megabase pairs. |
error_rate |
A numeric value of the expected sequence error rate. |
call_threshold |
A numeric value of the probability threshold to accept estimated genotype calls. |
het_parent |
A logical value to indicate whether parental samples are outbred or inbred. If FALSE, this function assume all true genotype of markers in parents are homozygotes. |
optim |
A logical value to specify whether to conduct parameter optimization for error correction. |
iter |
An integer value to specify the number of iterative parameter updates. |
n_threads |
An integer value to specify the number of
threads used for the calculation. The default is 1 and if |
dummy_reads |
An integer to specify the number of dummy reads to assign to dummy parental samples for genotype estimation. See details. |
... |
Unused. |
If you have not set parental samples by setParents()
and
initialized the scheme object using initScheme()
, you have the scheme
object without explicit parental information that is assumed to be a
bi-parental population. In this case, estGeno()
will run in the parentless
mode. In the parentless mode, the algorithm assumes that the given population
is a bi-parental population. The number of reference allele reads and
the number of alternative allele reads of the dummy parents are set based on
dummy_reads
, respectively. Dummy parent 1 has dummy_reads
of the
reference allele reads and 0 alternative allele reads at all markers, while
dummy parent 2 has 0 and dummy_reads
of reference and alternative allele
reads at all markers. If the parents of your population were outbred lines or
you cannot assume one of the parents has completely reference homozygotes and
another has laternative homozygotes at all markers, Set dummy_reads = 0
to
leave uncertainty to estimate parental genotypes based on the offspring
genotypes. Nevertheless, the parentless mode is less accurate and has more
chance to get a genotype estimate randomly selected from the equally likely
genotype estimates.
A GbsrGenotypeData object in which the "estimated.haplotype", "corrected.genotype" and "parents.genotype" nodes were added.
# Load data in the GDS file and instantiate a [GbsrGenotypeData] object.
gds_fn <- system.file("extdata", "sample.gds", package = "GBScleanR")
gds <- loadGDS(gds_fn)
# Find the IDs of parental samples.
parents <- grep("Founder", getSamID(gds), value = TRUE)
# Set the parents and flip allele information
# if the reference sample (Founder1 in our case) has homozygous
# alternative genotype at some markers of which alleles will
# be swapped to make the reference sample have homozygous
# reference genotype.
gds <- setParents(gds, parents = parents)
# Initialize a scheme object stored in the slot of the GbsrGenotypeData.
# We chose `crosstype = "pair"` because two inbred founders were mated
# in our breeding scheme.
# We also need to specify the mating matrix which has two rows and
# one column with integers 1 and 2 indicating a sample (founder)
# with the memberID 1 and a sample (founder) with the memberID 2
# were mated.
gds <- initScheme(gds, mating = cbind(c(1:2)))
# Add information of the next cross conducted in our scheme.
# We chose 'crosstype = "selfing"', which do not require a
# mating matrix.
gds <- addScheme(gds, crosstype = "selfing")
# Execute error correction by estimating genotype and haplotype of
# founders and offspring.
gds <- estGeno(gds)
# Close the connection to the GDS file.
closeGDS(gds)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.