injectSNPs | R Documentation |
Inject SNPs from a SNPlocs data package into a genome.
injectSNPs(x, snps)
SNPlocs_pkgname(x)
## S4 method for signature 'BSgenome'
snpcount(x)
## S4 method for signature 'BSgenome'
snplocs(x, seqname, ...)
## Related utilities
available.SNPs(type=getOption("pkgType"))
installed.SNPs()
x |
A BSgenome object. |
snps |
A SNPlocs object or the name of a SNPlocs data package.
This object or package must contain SNP information for the single
sequences contained in x. |
seqname |
The name of a single sequence in x. |
type |
Character string indicating the type of package ( |
... |
Further arguments to be passed to |
injectSNPs returns a copy of the original genome x where some
or all of the single sequences from x are altered by injecting the
SNPs stored in snps.
The SNPs in the altered genome are represented by an IUPAC ambiguity code
at each SNP location.
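To make the ambiguity-code representation concrete, here is a minimal base R sketch on a toy string. It is not how BSgenome performs the injection: inject_snps_toy and iupac_code are hypothetical names introduced for illustration, and the sketch assumes each allele set is given as a "/"-separated string that already lists every allele to be represented.

```r
# Toy illustration in base R; NOT the BSgenome implementation.
# Map from a sorted, collapsed allele set to its IUPAC ambiguity code.
iupac_code <- c(
  AC = "M", AG = "R", AT = "W", CG = "S", CT = "Y", GT = "K",
  ACG = "V", ACT = "H", AGT = "D", CGT = "B", ACGT = "N"
)

# Hypothetical helper: replace the letter at each SNP position with the
# ambiguity code that covers the alleles listed for that position.
inject_snps_toy <- function(seq, pos, alleles) {
  letters_vec <- strsplit(seq, "", fixed = TRUE)[[1]]
  # Turn "C/T" into the lookup key "CT" (sorted, duplicates removed).
  keys <- vapply(
    strsplit(alleles, "/", fixed = TRUE),
    function(a) paste(sort(unique(a)), collapse = ""),
    character(1)
  )
  codes <- iupac_code[keys]
  # Fail early on unknown allele sets or out-of-range positions.
  stopifnot(!anyNA(codes), all(pos >= 1L & pos <= length(letters_vec)))
  letters_vec[pos] <- codes
  paste(letters_vec, collapse = "")
}

inject_snps_toy("ACGTACGT", pos = c(2L, 7L), alleles = c("C/T", "A/G"))
# [1] "AYGTACRT"
```

Position 2 becomes Y (C or T) and position 7 becomes R (A or G); every other letter is left unchanged.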
SNPlocs_pkgname, snpcount and snplocs return NULL if no SNPs were
injected in x (i.e. if x is not a BSgenome object returned by a previous
call to injectSNPs).
Otherwise SNPlocs_pkgname returns the name of the package from
which the SNPs were injected, snpcount the number of SNPs for each
altered sequence in x, and snplocs their locations in the
sequence whose name is specified by seqname.
available.SNPs returns a character vector containing the names of the
SNPlocs and XtraSNPlocs data packages that are currently available on the
Bioconductor repositories for your version of R/Bioconductor.
A SNPlocs data package contains basic information (location and alleles)
about the known molecular variations of class snp for a given
organism.
An XtraSNPlocs data package contains information about the known molecular
variations of other classes (in-del, heterozygous,
microsatellite, named-locus, no-variation, mixed,
multinucleotide-polymorphism) for a given organism.
Only SNPlocs data packages can be used for SNP injection for now.
installed.SNPs returns a character vector containing the names of
the SNPlocs and XtraSNPlocs data packages that are already installed.
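Because available.SNPs and installed.SNPs list both kinds of package while only SNPlocs packages can be injected, it can help to filter their output by name prefix. The sketch below assumes the package names follow the "SNPlocs." and "XtraSNPlocs." prefixes described above; injectable_snp_pkgs is a hypothetical helper, and the input vector is a hand-written stand-in for the real return value.

```r
# Hypothetical helper: keep only names that start with "SNPlocs.".
# The anchored pattern deliberately excludes "XtraSNPlocs." names.
injectable_snp_pkgs <- function(pkgs) {
  grep("^SNPlocs\\.", pkgs, value = TRUE)
}

# Hand-written stand-in for the result of installed.SNPs().
pkgs <- c("SNPlocs.Hsapiens.dbSNP144.GRCh38",
          "XtraSNPlocs.Hsapiens.dbSNP144.GRCh38")
injectable_snp_pkgs(pkgs)
# [1] "SNPlocs.Hsapiens.dbSNP144.GRCh38"
```

In a real session the same helper could be applied as injectable_snp_pkgs(installed.SNPs()).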
injectSNPs, SNPlocs_pkgname, snpcount and snplocs
have the side effect of trying to load the SNPlocs data package that was
specified through the snps argument if it's not already loaded.
H. Pagès
BSgenome-class, IUPAC_CODE_MAP, injectHardMask,
letterFrequencyInSlidingView, .inplaceReplaceLetterAt
## What SNPlocs data packages are already installed:
installed.SNPs()
## What SNPlocs data packages are available:
available.SNPs()
if (interactive()) {
  ## Make your choice and install with:
  if (!require("BiocManager"))
    install.packages("BiocManager")
  BiocManager::install("SNPlocs.Hsapiens.dbSNP144.GRCh38")
}
## Inject SNPs from dbSNP into the Human genome:
library(BSgenome.Hsapiens.UCSC.hg38.masked)
genome <- BSgenome.Hsapiens.UCSC.hg38.masked
SNPlocs_pkgname(genome)
genome2 <- injectSNPs(genome, "SNPlocs.Hsapiens.dbSNP144.GRCh38")
genome2 # note the extra "with SNPs injected from ..." line
SNPlocs_pkgname(genome2)
snpcount(genome2)
head(snplocs(genome2, "chr1"))
alphabetFrequency(genome$chr1)
alphabetFrequency(genome2$chr1)
## Find runs of SNPs of length at least 25 in chr1. Might require
## more memory than some platforms can handle (e.g. 32-bit Windows
## and maybe some Mac OS X machines with little memory):
is_32bit_windows <- .Platform$OS.type == "windows" &&
                    .Platform$r_arch == "i386"
is_macosx <- substr(R.version$os, start=1, stop=6) == "darwin"
if (!is_32bit_windows && !is_macosx) {
  chr1 <- injectHardMask(genome2$chr1)
  ambiguous_letters <- paste(DNA_ALPHABET[5:15], collapse="")
  lf <- letterFrequencyInSlidingView(chr1, 25, ambiguous_letters)
  sl <- slice(as.integer(lf), lower=25)
  v1 <- Views(chr1, start(sl), end(sl)+24)
  v1
  max(width(v1))  # length of longest SNP run
}
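
The last example counts ambiguity letters in every 25-letter window and keeps the windows where all 25 letters are ambiguous; adding 24 to each end extends the range from the start of the last qualifying window to that window's last letter. The same idea can be sketched in base R on a toy string. This is not the Biostrings implementation: find_ambiguity_runs is a hypothetical helper that locates maximal runs directly with rle() instead of a sliding window.

```r
# Toy base R sketch; NOT the Biostrings implementation.
# Hypothetical helper: report maximal runs of IUPAC ambiguity letters
# that are at least min_len long.
find_ambiguity_runs <- function(seq, min_len) {
  ambiguity_letters <- c("M", "R", "W", "S", "Y", "K",
                         "V", "H", "D", "B", "N")
  letters_vec <- strsplit(seq, "", fixed = TRUE)[[1]]
  is_ambiguous <- letters_vec %in% ambiguity_letters
  # Run-length encode the TRUE/FALSE vector, then recover coordinates.
  runs <- rle(is_ambiguous)
  ends <- cumsum(runs$lengths)
  starts <- ends - runs$lengths + 1L
  keep <- runs$values & runs$lengths >= min_len
  data.frame(start = starts[keep], end = ends[keep],
             width = runs$lengths[keep])
}

# Runs of at least 3 ambiguity letters: positions 3-5 and 13-15.
find_ambiguity_runs("ACRYKTTGNNACSWM", min_len = 3)
```

The two-letter run at positions 9-10 is dropped because it is shorter than min_len.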