To handle this type of data, we have extended the resources available in the `r CRANpkg("resourcer")` package to VCF files. NOTE: PLINK files can be converted to VCF files using different pipelines. In R, you can use `r Biocpkg("SeqArray")` to obtain VCF files.
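One possible way to do the PLINK-to-VCF conversion mentioned above is sketched below. It assumes a PLINK binary trio (BED/BIM/FAM) and goes through an intermediate SeqArray GDS file; the helper name `plink_to_vcf` and its arguments are hypothetical and not part of any package.

```r
# Hypothetical helper (not part of SeqArray or dsOmics): converts a PLINK
# BED/BIM/FAM trio to a VCF file via an intermediate SeqArray GDS file.
plink_to_vcf <- function(bed_file, fam_file, bim_file, vcf_file,
                         gds_file = tempfile(fileext = ".gds")) {
  if (!requireNamespace("SeqArray", quietly = TRUE)) {
    stop("Package 'SeqArray' is required (install it from Bioconductor).")
  }
  # Step 1: import the PLINK binary files into a SeqArray GDS file.
  SeqArray::seqBED2GDS(bed_file, fam_file, bim_file, gds_file)
  # Step 2: open the GDS file and export its contents as VCF.
  gds <- SeqArray::seqOpen(gds_file)
  on.exit(SeqArray::seqClose(gds), add = TRUE)
  SeqArray::seqGDS2VCF(gds, vcf_file)
  invisible(vcf_file)
}
```

A call such as `plink_to_vcf("study.bed", "study.fam", "study.bim", "study.vcf")` (placeholder file names) would then write the VCF file to disk.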
We use the Genomic Data Structure (GDS) format, which efficiently manages VCF files within the R environment. This extension requires a Client and a Resolver function, both located in the `r Githubpkg("isglobal-brge", "dsOmics")` package. The client function uses the `snpgdsVCF2GDS` function implemented in `r Biocpkg("SNPRelate")` to convert the VCF file to a GDS file. The GDS file is then loaded into R as an object of class `GdsGenotypeReader` from the `r Biocpkg("GWASTools")` package, which facilitates downstream analyses.
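The two steps described above (VCF to GDS, then GDS to `GdsGenotypeReader`) can be sketched outside of the resource machinery as follows. This is a minimal illustration under assumptions, not the actual dsOmics client: the wrapper name `vcf_to_genotype_reader` and its default arguments are hypothetical, and only `snpgdsVCF2GDS` and `GdsGenotypeReader` come from the packages named in the text.

```r
# Hypothetical wrapper (not the dsOmics client): converts a local VCF file
# to GDS with SNPRelate and opens the result with GWASTools.
vcf_to_genotype_reader <- function(vcf_file,
                                   gds_file = tempfile(fileext = ".gds"),
                                   method = "biallelic.only",
                                   snpfirstdim = TRUE) {
  for (pkg in c("SNPRelate", "GWASTools")) {
    if (!requireNamespace(pkg, quietly = TRUE)) {
      stop("Package '", pkg, "' is required (install it from Bioconductor).")
    }
  }
  # Step 1: write the VCF genotypes into a GDS file on disk.
  SNPRelate::snpgdsVCF2GDS(vcf_file, gds_file,
                           method = method, snpfirstdim = snpfirstdim)
  # Step 2: open the GDS file as a GdsGenotypeReader object.
  GWASTools::GdsGenotypeReader(gds_file)
}
```

With `r Biocpkg("GWASTools")` attached, the returned object can be inspected with `nsnp(reader)` and `nscan(reader)`, and should be released with `close(reader)` when no longer needed.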
The Opal API server allows this new type of resource to be incorporated, as illustrated in Figure \@ref(fig:resourceVCF).
knitr::include_graphics(tools::file_path_as_absolute("../fig/opal_resource_VCF.png"))
Note that the URL must include the query string `method=biallelic.only&snpfirstdim=TRUE`, since these are required parameters of the `snpgdsVCF2GDS` function. For example:
https://raw.githubusercontent.com/isglobal-brge/scoreInvHap/master/inst/extdata/example.vcf?method=biallelic.only&snpfirstdim=TRUE
In this case, we indicate that only biallelic SNPs are considered (`method=biallelic.only`) and that genotypes are stored in individual-major mode, i.e., all SNPs are listed for the first individual, then all SNPs for the second individual, and so on (`snpfirstdim=TRUE`).
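To make the structure of such a URL concrete, the base-R sketch below splits it into the file location and the two query parameters. It only illustrates how the query string decomposes; the function name `parse_vcf_resource_url` is hypothetical, and this is not necessarily how the dsOmics client parses the URL.

```r
# Hypothetical helper: splits a resource URL into the file location and a
# named list of query parameters (values are kept as character strings).
parse_vcf_resource_url <- function(url) {
  parts <- strsplit(url, "?", fixed = TRUE)[[1]]
  query <- if (length(parts) > 1) parts[2] else ""
  pairs <- strsplit(query, "&", fixed = TRUE)[[1]]
  kv <- strsplit(pairs, "=", fixed = TRUE)
  keys <- vapply(kv, function(x) x[1], character(1))
  values <- vapply(kv, function(x) if (length(x) > 1) x[2] else NA_character_,
                   character(1))
  list(file = parts[1], params = stats::setNames(as.list(values), keys))
}

res <- parse_vcf_resource_url(
  "https://raw.githubusercontent.com/isglobal-brge/scoreInvHap/master/inst/extdata/example.vcf?method=biallelic.only&snpfirstdim=TRUE"
)
res$params$method                   # "biallelic.only"
as.logical(res$params$snpfirstdim)  # TRUE
```

Note that `snpfirstdim` arrives as the string `"TRUE"` and has to be converted with `as.logical()` before it can be used as a logical argument.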