A fast, lightweight function that reads from a VCF file and returns the result as a GenotypeMatrix object.
## S4 method for signature 'TabixFile,GRanges'
readGenotypeMatrix(file, regions, subset,
                   noIndels=TRUE, onlyPass=TRUE,
                   na.limit=1, MAF.limit=1,
                   na.action=c("impute.major", "omit", "fail"),
                   MAF.action=c("invert", "omit", "ignore", "fail"),
                   sex=NULL)
## S4 method for signature 'TabixFile,missing'
readGenotypeMatrix(file, regions, ...)
## S4 method for signature 'character,GRanges'
readGenotypeMatrix(file, regions, ...)
## S4 method for signature 'character,missing'
readGenotypeMatrix(file, regions, ...)
file | a
regions | a
subset | a numeric vector with indices or a character vector with names of samples to restrict to; if specified, only these samples' genotypes are read from the VCF file and all other samples are ignored and omitted from the
noIndels | if
onlyPass | if
na.limit | all variants with a missing value ratio above this threshold will be omitted from the output object.
MAF.limit | all variants with an MAF above this threshold will be omitted from the output object.
na.action | if “impute.major”, all missing values will be imputed by major alleles in the output object. If “omit”, all variants containing missing values will be omitted in the output object. If “fail”, the function stops with an error if a variant contains any missing values.
MAF.action | if “invert”, all variants with an MAF exceeding 0.5 will be inverted in the sense that all minor alleles will be replaced by major alleles and vice versa. If “omit”, all variants with an MAF greater than 0.5 are omitted in the output object. If “ignore”, no action is taken and MAFs greater than 0.5 are kept as they are. If “fail”, the function stops with an error if any variant has an MAF greater than 0.5.
sex | if
... | for the three latter methods above, all other parameters are passed on to the method with signature
This method uses the tabix API provided by the Rsamtools package to read from a VCF file, parses the result into a sparse matrix along with positional information, and returns the result as a GenotypeMatrix object. Reading can be restricted to certain regions by specifying the regions object. Note that it might not be possible to read a very large VCF file as a whole.
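As a hedged illustration of restricting the read to a region, the sketch below passes a GRanges object as regions. The sequence name "chr1" and the coordinates are assumptions about the example file shipped with podkat and are not stated on this page; adjust them to the contigs of your own VCF file.

```r
library(podkat)
library(GenomicRanges)

vcfFile <- system.file("examples/example1.vcf.gz", package="podkat")

## hypothetical region of interest; the sequence name "chr1" and the
## coordinates are assumptions, not values taken from this help page
roi <- GRanges(seqnames="chr1", ranges=IRanges(start=1, end=100000))

## only variants overlapping 'roi' are read from the file
genoRegion <- readGenotypeMatrix(vcfFile, regions=roi)
genoRegion
```

For a VCF file that is too large to read as a whole, the same call can be repeated over several smaller regions, so that only one chunk is held in memory at a time.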
For all variants, filters in terms of missing values and MAFs can be applied. Moreover, variants with MAFs greater than 0.5 can be filtered out or inverted. For details, see descriptions of parameters na.limit, MAF.limit, na.action, and MAF.action above.
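The filtering and inversion rules described above can be sketched in base R on a small dense matrix. This is only an illustration of the documented semantics, not the package's implementation: the function name sketchFilter, the order of the steps, and the dense matrix representation are assumptions (readGenotypeMatrix itself works on VCF data and returns a sparse object), and sex-specific handling is ignored.

```r
## Toy genotype matrix: rows are samples, columns are variants; entries
## count copies of the alternate allele (0, 1, 2), NA means missing.
## matrix() fills column-wise, so each line below is one variant.
Z <- matrix(c(0,  1,  NA, 0,
              2,  2,  1,  2,
              NA, NA, NA, 0,
              0,  0,  1,  0),
            nrow=4,
            dimnames=list(paste0("S", 1:4), paste0("V", 1:4)))

## hypothetical helper mimicking the documented parameters
sketchFilter <- function(Z, na.limit=1, MAF.limit=1,
                         na.action=c("impute.major", "omit", "fail"),
                         MAF.action=c("invert", "omit", "ignore", "fail"))
{
    na.action <- match.arg(na.action)
    MAF.action <- match.arg(MAF.action)

    ## drop variants whose missing value ratio exceeds na.limit
    Z <- Z[, colMeans(is.na(Z)) <= na.limit, drop=FALSE]

    hasNA <- colSums(is.na(Z)) > 0

    if (na.action == "fail" && any(hasNA))
        stop("missing values found")
    else if (na.action == "omit")
        Z <- Z[, !hasNA, drop=FALSE]
    else if (na.action == "impute.major")
    {
        ## major allele is the alternate allele if its frequency is > 0.5
        fill <- ifelse(colMeans(Z, na.rm=TRUE) / 2 > 0.5, 2, 0)

        for (j in seq_len(ncol(Z)))
            Z[is.na(Z[, j]), j] <- fill[j]
    }

    ## allele frequencies for diploid genotypes (no NAs left here)
    maf <- colMeans(Z) / 2
    flip <- maf > 0.5

    if (MAF.action == "fail" && any(flip))
        stop("variants with MAF > 0.5 found")
    else if (MAF.action == "omit")
    {
        Z <- Z[, !flip, drop=FALSE]
        maf <- maf[!flip]
    }
    else if (MAF.action == "invert")
    {
        ## swap the roles of minor and major alleles
        Z[, flip] <- 2 - Z[, flip]
        maf[flip] <- 1 - maf[flip]
    }

    ## drop variants whose MAF exceeds MAF.limit
    Z[, maf <= MAF.limit, drop=FALSE]
}

res <- sketchFilter(Z, na.limit=0.5)
res
```

With these toy data, V3 is dropped because three of its four genotypes are missing, the missing genotype in V1 is imputed as 0, and V2 is inverted because its alternate allele is the major one.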
returns an object of class GenotypeMatrix
Ulrich Bodenhofer bodenhofer@bioinf.jku.at
http://www.bioinf.jku.at/software/podkat
http://www.1000genomes.org/wiki/analysis/variant-call-format/vcf-variant-call-format-version-42
Li, H., Handsaker, B., Wysoker, A., Fennell, T., Ruan, J., Homer, N., Marth, G., Abecasis, G., Durbin, R., and 1000 Genome Project Data Processing Subgroup (2009) The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078-2079.
vcfFile <- system.file("examples/example1.vcf.gz", package="podkat")
readGenotypeMatrix(vcfFile)
readGenotypeMatrix(vcfFile, onlyPass=FALSE, MAF.action="ignore")