BiocStyle::markdown()
Package: r Biocpkg("MsBackendMgf")
Authors: r packageDescription("MsBackendMgf")[["Author"]]
Last modified: r file.info("MsBackendMgf.Rmd")$mtime
Compiled: r date()
library(Spectra) library(BiocStyle)
The r Biocpkg("Spectra")
package provides a central infrastructure for the
handling of Mass Spectrometry (MS) data. The package supports
interchangeable use of different backends to import MS data from a
variety of sources (such as mzML files). The r Biocpkg("MsBackendMgf")
package allows the import of MS/MS data from mgf (Mascot Generic
Format)
files. This vignette illustrates the usage of the MsBackendMgf
package.
To install this package, start R
and enter:
if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager") BiocManager::install("MsBackendMgf")
This will install this package and all eventually missing dependencies.
Mgf files store one to multiple spectra, typically centroided and of MS level 2. In our short example below, we load 2 mgf files which are provided with this package. Below we first load all required packages and define the paths to the mgf files.
library(Spectra) library(MsBackendMgf) fls <- dir(system.file("extdata", package = "MsBackendMgf"), full.names = TRUE, pattern = "mgf$") fls
MS data can be accessed and analyzed through Spectra
objects. Below
we create a Spectra
with the data from these mgf files. To this end
we provide the file names and specify to use a MsBackendMgf()
backend as source to enable data import.
sps <- Spectra(fls, source = MsBackendMgf())
With that we have now full access to all imported spectra variables that we list below.
spectraVariables(sps)
Besides default spectra variables, such as msLevel
, rtime
,
precursorMz
, we also have additional spectra variables such as the
TITLE
of each spectrum in the mgf file.
sps$rtime sps$TITLE
By default, fields in the mgf file are mapped to spectra variable names using
the mapping returned by the spectraVariableMapping
function:
spectraVariableMapping(MsBackendMgf())
The names of this character
vector are the spectra variable names (such as
"rtime"
) and the field in the mgf file that contains that information are the
values (such as "RTINSECONDS"
). Note that it is also possible to overwrite
this mapping (e.g. for certain mgf dialects) or to add additional
mappings. Below we add the mapping of the mgf field "TITLE"
to a spectra
variable called "spectrumName"
.
map <- c(spectrumName = "TITLE", spectraVariableMapping(MsBackendMgf())) map
We can then pass this mapping to the backendInitialize
method, or the
Spectra
constructor.
sps <- Spectra(fls, source = MsBackendMgf(), mapping = map)
We can now access the spectrum's title with the newly created spectra variable
"spectrumName"
:
sps$spectrumName
In addition we can also access the m/z and intensity values of each spectrum.
mz(sps) intensity(sps)
The MsBackendMgf
backend allows also to export data in mgf format. Below we
export the data to a temporary file. We hence call the export
function on our
Spectra
object specifying backend = MsBackendMgf()
to use this backend for
the export of the data. Note that we use again our custom mapping of variables
such that the spectra variable "spectrumName"
will be exported as the
spectrums' title.
fl <- tempfile() export(sps, backend = MsBackendMgf(), file = fl, mapping = map)
We next read the first lines from the exported file to verify that the title was exported properly.
readLines(fl)[1:12]
Note that the MsBackendMgf
exports all spectra variables as fields in the mgf
file. To illustrate this we add below a new spectra variable to the object and
export the data.
sps$new_variable <- "A" export(sps, backend = MsBackendMgf(), file = fl) readLines(fl)[1:12]
We can see that also our newly defined variable was exported. Also, because we
did not provide our custom variable mapping this time, the variable
"spectrumName"
was not used as the spectrum's title.
Sometimes it might be required to not export all spectra variables since some
exported fields might not be recognized/supported by external tools. Using the
selectSpectraVariables
function we can reduce our Spectra
object to export
to contain only relevant spectra variables. Below we restrict the data to only
m/z, intensity, retention time, acquisition number, precursor m/z and precursor
charge and export these to an mgf file. Also, some external tools don't support
the "TITLE"
field in the MGF file. To disable export of the spectrum ID/title
exportTitle = FALSE
can be used.
sps_ex <- selectSpectraVariables(sps, c("mz", "intensity", "rtime", "acquisitionNum", "precursorMz", "precursorCharge")) export(sps_ex, backend = MsBackendMgf(), file = fl, exportTitle = FALSE) readLines(fl)[1:12]
The MsBackendMgf package supports parallel processing for data
import. Parallel processing can be enabled by providing the parallel processing
setup to the backendInitialize
function (or the readMgf
function) with the
BPPARAM
parameter. By default (with BPPARAM = SerialParam()
) parallel
processing is disabled. If enabled, and data import is performed on a single
file, the extraction of spectra information on the imported MGF file is
performed in parallel. If data is to be imported from multiple files, the import
is performed in parallel on a per-file basis (i.e. in parallel from the
different files). Generally, the performance gain through parallel processing is
only moderate and it is only suggested if a large number of files need to be
processed, or if the MGF file is very large (e.g. containing over 100,000
spectra).
sessionInfo()
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.