BiocStyle::markdown()
Package: Pbase
Authors: Laurent Gatto and Sebastian Gibb
Last compiled: r date()
Last modified: r file.info("Pbase-data.Rmd")$mtime
library("Pbase")
This vignette briefly introduces the central data object of the Pbase
package, namely Proteins instances, as depicted below. Each instance
contains a set of proteins (10 in the figure below), composed of the
protein sequences (grey boxes) and annotation data (table on the
left). Each protein links to a set of ranges of interest, such as
protein domains or experimentally observed peptides (also in grey),
which are decorated with their own annotation data. The figure also
shows the accessors for the different data slots, which are detailed
in ?Proteins.
Pbase:::pplot()
Proteins objects are populated with protein sequences read from a
fasta file; the peptides typically originate from an LC-MS/MS
experiment.
The original data used below is a 10 fmol Peptide Retention Time
Calibration Mixture spiked into 50 ng HeLa background, acquired on a
Thermo Orbitrap Q Exactive instrument. The data were searched against
a restricted set of high-scoring human proteins from UniProt release
2015_02 using the MSGF+ search engine.
library("Biostrings")
fafile <- system.file("extdata/HUMAN_2015_02_selected.fasta", package = "Pbase")
fa <- readAAStringSet(fafile)
fa

library("mzID")
idfile <- system.file("extdata/Thermo_Hela_PRTC_selected.mzid", package = "Pbase")
id <- flatten(mzID(idfile))
dim(id)
head(id)

library("Pbase")
p <- Proteins(fafile)
p <- addIdentificationData(p, idfile)
p
A Proteins object is composed of a set of protein sequences,
accessible with the aa accessor, as well as an optional set of peptide
features that are mapped as coordinates along the proteins, available
with pranges. The actual peptide sequences can be extracted with
pfeatures. The names of the protein sequences can be extracted with
seqnames.
aa(p)
seqnames(p)
pranges(p)
pfeatures(p)
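The relationship between peptide coordinates and peptide sequences described above can be illustrated without Pbase. The following is a minimal base R sketch, not the package's implementation: the toy sequences, the protein names P1 and P2, and the objects aa_toy, pranges_toy and pfeatures_toy are all hypothetical and only mimic the roles of aa, pranges and pfeatures.

```r
# Toy protein sequences (hypothetical, not taken from the Pbase example data)
aa_toy <- c(P1 = "MKWVTFISLLLLFSSAYSR",
            P2 = "MAGRSGDSDEELLK")

# Peptide features stored as start/end coordinates along each protein,
# playing the role that pranges plays for a Proteins object
pranges_toy <- data.frame(
    protein = c("P1", "P1", "P2"),
    start   = c(1, 3, 5),
    end     = c(2, 19, 14),
    stringsAsFactors = FALSE
)

# Recover the peptide sequences from the coordinates,
# analogous to what pfeatures returns
pfeatures_toy <- substring(aa_toy[pranges_toy$protein],
                           pranges_toy$start,
                           pranges_toy$end)
pfeatures_toy
```

Storing only coordinates keeps the features tied to their parent protein, and the sequences can always be recomputed from the ranges when needed.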
A Proteins instance is further described by a general metadata list.
Protein sequence and peptide feature annotations can be accessed with
acols and pcols respectively, which return DataFrame instances.
metadata(p)
acols(p)
pcols(p)
Specific proteins can be extracted by index or name using [, and
proteins and their peptide features can be plotted with the default
plot method.
seqnames(p)
plot(p[c(1, 9)])
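As a brief sketch of the two subsetting styles, the example below selects the same proteins once by position and once by name. It assumes that the [ method accepts the character names returned by seqnames, as the text above suggests, and that the example object is loaded with data(p); the variable names by_index and by_name are illustrative only.

```r
library("Pbase")

# Load the example Proteins object shipped with the package
data(p, package = "Pbase")

# Select the first two proteins by position
by_index <- p[1:2]

# Select the same proteins by name (assumes character subsetting
# matches against seqnames(p))
by_name <- p[seqnames(p)[1:2]]

# Both selections should refer to the same proteins
seqnames(by_index)
seqnames(by_name)
```

Subsetting by name is usually more robust in scripts, since it does not depend on the order of the sequences in the fasta file.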
More details can be found in ?Proteins. The object generated above is
also directly available as data(p).
sessionInfo()