SNPediaR
In SNPediaR: Query data from SNPedia

About

SNPedia is a curated database containing information about thousands of SNPs. Related diseases, genotypes and references to relevant scientific publications are available through its website. The site is powered by MediaWiki, and the information about each SNP is written in the corresponding wiki page.

The SNPediaR library provides tools to search for and download such pages automatically. It also implements a few functions to scrape relevant information from the downloaded wiki text, and allows users to extend this parsing functionality.

Downloading pages

For a known set of pages, the function getPages downloads the corresponding wiki content using the MediaWiki web API.

For instance, we can download the page Rs53576, which corresponds to the SNP rs53576, with:

library (SNPediaR)
pg <- getPages (titles = "Rs53576")
pg
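To give an idea of what a request through the MediaWiki web API might look like, the sketch below builds (but does not send) a query URL using standard MediaWiki query parameters. The function name buildQueryUrl is hypothetical, and the endpoint address is an assumption rather than something stated in this vignette; the package source is the reference for what getPages really does.

```r
# Sketch only: builds a MediaWiki API query URL without sending it.
# buildQueryUrl is a hypothetical helper and the default endpoint is an assumption.
buildQueryUrl <- function (titles,
                           baseURL = "https://bots.snpedia.com/api.php") {
    # MediaWiki accepts several titles in one request, separated by "|"
    titles <- paste (titles, collapse = "|")
    # standard MediaWiki parameters asking for the current wiki text as JSON
    params <- c (action = "query",
                 prop   = "revisions",
                 rvprop = "content",
                 format = "json",
                 titles = utils::URLencode (titles, reserved = TRUE))
    # join "name=value" pairs with "&" after the "?"
    paste0 (baseURL, "?",
            paste (names (params), params, sep = "=", collapse = "&"))
}

url <- buildQueryUrl (c ("Rs53576(A;A)", "Rs53576(A;G)"))
url
```

Characters such as parentheses, semicolons and the "|" separator are percent-encoded so that genotype titles survive being placed in a URL.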

The same function can download several pages at a time. For instance, we can download the three genotype pages for the same SNP, Rs53576(A;A), Rs53576(A;G) and Rs53576(G;G), with:

pgs <- getPages (titles = c ("Rs53576(A;A)", "Rs53576(A;G)", "Rs53576(G;G)"))
pgs

Extracting relevant information requires parsing the wiki text. Some utility functions for this purpose are already implemented in the library, and users can implement any others they need.

The function extractSnpTags, for instance, extracts the "tabular" information from SNP pages:

extractSnpTags (pg$Rs53576)

The function extractGenotypeTags can be used to get the "tabular" information from genotype pages:

sapply (pgs, extractGenotypeTags)
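To illustrate what pulling "tabular" information out of wiki text can involve, here is a minimal base R sketch that reads "|name=value" lines from a template. It is not the implementation used by extractSnpTags or extractGenotypeTags; the function name extractTagsSketch and the sample text are made up for this example.

```r
# Made-up wiki text imitating a MediaWiki template; not copied from SNPedia
sampleText <- "{{Genotype\n|rsid=53576\n|allele1=A\n|allele2=G\n|magnitude=0\n}}\nSome free text."

# Hypothetical helper: returns the values of the requested tags
extractTagsSketch <- function (x, tags) {
    # one element per line of wiki text
    lines <- unlist (strsplit (x, split = "\n"))
    # keep only lines of the form "|name=value"
    lines <- grep ("^\\|[^=]+=", lines, value = TRUE)
    # the tag name sits between the leading "|" and the first "="
    keys   <- sub ("^\\|([^=]+)=.*$", "\\1", lines)
    values <- sub ("^\\|[^=]+=", "", lines)
    names (values) <- trimws (keys)
    # requested tags in the requested order; NA where a tag is absent
    res <- values[tags]
    names (res) <- tags
    res
}

extractTagsSketch (sampleText, tags = c ("allele1", "allele2", "magnitude"))
```

Returning NA for absent tags keeps the output the same length for every page, which makes the results easier to combine later.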

The same parsing can also be done while downloading the pages, by passing the wiki processing function as an argument in the getPages call.

If, for instance, we are only interested in the alleles and the magnitude associated with each of the genotypes, we can do:

getPages (titles = c ("Rs53576(A;A)", "Rs53576(A;G)", "Rs53576(G;G)"),
          wikiParseFunction = extractGenotypeTags,
          tags = c ("allele1", "allele2", "magnitude"))
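A result like this is often easier to work with as a table. The sketch below assumes the call returns a list with one named character vector per page; the object res is a stand-in built by hand, and its magnitude values are placeholders, not SNPedia data.

```r
# Stand-in for the list returned by the call above.
# The magnitude values are placeholders, not real SNPedia data.
res <- list ("Rs53576(A;A)" = c (allele1 = "A", allele2 = "A", magnitude = "1"),
             "Rs53576(A;G)" = c (allele1 = "A", allele2 = "G", magnitude = "2"),
             "Rs53576(G;G)" = c (allele1 = "G", allele2 = "G", magnitude = "3"))

# bind the per-page vectors into one row per genotype
tab <- as.data.frame (do.call (rbind, res), stringsAsFactors = FALSE)
# values scraped from wiki text are character strings; convert numeric ones explicitly
tab$magnitude <- as.numeric (tab$magnitude)
tab
```

The page titles become the row names, so each row can still be traced back to the genotype page it came from.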

Customized parsing functions

Any wiki processing function can be passed to getPages. If, for instance, a user wants to extract all PubMed IDs from the pages Rs53576 and Rs1815739, they can first define a parsing function like:

findPMID <- function (x) {
    # split the wiki text into individual lines
    x <- unlist (strsplit (x, split = "\n"))
    # keep only the lines that contain a "PMID=" entry
    x <- grep ("PMID=", x, value = TRUE)
    x
}

and then call getPages as:

getPages (titles = c ("Rs53576", "Rs1815739"),
          wikiParseFunction = findPMID)
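The function above returns whole lines of wiki text. If only the numeric identifiers are wanted, a variant along the following lines could be used instead. The name findPMIDnumbers is hypothetical, and the sample text and IDs are made up to show the behaviour without downloading anything.

```r
# Hypothetical variant: returns the bare IDs rather than whole lines
findPMIDnumbers <- function (x) {
    # every occurrence of "PMID=" followed by digits, anywhere in the text
    hits <- unlist (regmatches (x, gregexpr ("PMID=[0-9]+", x)))
    # drop the "PMID=" prefix and any repeated IDs
    unique (sub ("PMID=", "", hits, fixed = TRUE))
}

# made-up wiki text; the IDs are placeholders
pmidText <- "intro line\n|PMID=11111111\n|Title=First\nmore text PMID=22222222 and again PMID=11111111"
findPMIDnumbers (pmidText)
```

Like findPMID, it takes the wiki text as its first argument, so it could be supplied as the wikiParseFunction argument in the same way.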

Session info

sessionInfo ()

Created: 2015-09-27 | Revised: 2016-06-03 | Compiled r Sys.Date()

SNPediaR documentation built on Nov. 8, 2020, 5:08 p.m.
