BiocStyle::markdown()
library(BiocStyle)
XXX (Description of MTBLS1586)
MetClassNetR
integrates functions from different libraries.
Install current version of MetClassNetR
and additional
MetNet
functions from developmental branches. After
Installation re-load R and load required libraries.
# installation of additional packages #devtools::install_github("tnaake/MetNet") # note: Re-load R studio after installation #devtools::install_github("MetClassNet/MetClassNetR", ref = "devel_liesa") #devtools::install_github("blosloos/nontarget") #devtools::install_github("blosloos/nontargetData") # load required libraries library(MetClassNetR) library(Spectra) library(MsCoreUtils) library(MsBackendMgf) library(easyGgplot2) library(QFeatures) library(tidyverse) library(nontarget) library(visNetwork) # for visNetwork() and friends library(networkD3) # for saveNetwork()
Parallelization can be done to speed up calculations.
register(bpstart(SnowParam(4, progressbar = TRUE)))
MS1 data is read from Metabolights MAF or mzTab-M files into a
QFeatures
object. Information about the mass-to-charge
(mz) ratio and retention times (RT) were extracted from the MAF/ mzTab-M
file and stored into the QFeatures
-object. The data from
positive and negative ionization mode are stored in two different
QFeatures
-objects.
# Metabolights MAF files pos_maf <- system.file("extdata/MTBLS1586/m_MTBLS1586_LC-MS_positive_reverse-phase_metabolite_profiling_v2_maf.tsv", package = "MetClassNetR") neg_maf <- system.file("extdata/MTBLS1586/m_MTBLS1586_LC-MS_positive_reverse-phase_metabolite_profiling_v2_maf.tsv", package = "MetClassNetR") # create QFeature objects mtbls1586_qf_pos <- readMaf(file(pos_maf), ecol = 23, name = "mtbls1586_pos") mtbls1586_qf_neg <- readMaf(file(neg_maf), ecol = 23, name = "mtbls1586_neg")
MS2 data is read from a .mgf or .msp file into a Spectra
object.
# get MS2 data in .mgf files pos_mgf <- system.file("extdata/MTBLS1586/ms2_MTBLS1586_LC-MS_positive_reverse-phase_metabolite_profiling.mgf", package = "MetClassNetR") neg_mgf <- system.file("extdata/MTBLS1586/ms2_MTBLS1586_LC-MS_negative_reverse-phase_metabolite_profiling.mgf", package = "MetClassNetR") # load MS2 data ms2_spectra_pos <- Spectra::Spectra(pos_mgf, source = MsBackendMgf(), backend = MsBackendDataFrame()) ms2_spectra_neg <- Spectra::Spectra(neg_mgf, source = MsBackendMgf(), backend = MsBackendDataFrame())
check data (namespace of ids etc...)
# check minimal requirements checkQFeatures(mtbls1586_qf_neg) checkQFeatures(mtbls1586_qf_pos) checkSpectra(ms2_spectra_neg) checkSpectra(ms2_spectra_pos)
checkIDNamespace
sanity check will fail if Ids between
QFeatures
and Spectra
are not matching.Therefore,
MS2 spectra can be filled with empty spectra to have a complete list of spectra
for all the features.
# # fill spectra # ms2_spectra_neg <- fillSpectra(mtbls1586_qf_neg, ms2_spectra_neg) # ms2_spectra_pos <- fillSpectra(mtbls1586_qf_pos, ms2_spectra_pos) # # # check namespace # checkIdNamespace(mtbls1586_qf_neg, ms2_spectra_neg) # checkIdNamespace(mtbls1586_qf_pos, ms2_spectra_pos)
The mass-difference network is calculated using the MetNet
package. Here, all calculated mass-differences between all pairs of features
are compared to a transformation-list, containing different biochemical
reactions and their corresponding mass difference. Required columns are
"group" and "mass". Additional columns as "formula" can be added.
The transformation-list needs to be provided by the user. An example
can be found in the system files.
transformations_file <- system.file("extdata/MTBLS1586/transformations_MTBLS1586.csv", package = "MetClassNetR") transformations <- read_csv(transformations_file, col_names = TRUE) transformations[1:5,]
ZZZ
The master
-list will contain all types of experimental
networks that have been created in this workflow.
master <- list()
The mass-difference network is calculated using the MetNet
package. All mass-differences between all pairs of features are calculated
and are compared to the transformation
-list. This
transformation
-list contains different biochemical reactions
and their corresponding mass difference. Only if the mass-difference
between two features matches to a mass-difference in the transformation
file, within a certain ppm
-range, it will be displayed.
Different to the original MetNet
-functions, not only the
name of the mass-difference but also the value itself will be displayed
as output adjacency matrix.
The adjacency matrix will be than transformed to a data.frame
.
# Create mass difference network massdiff <- qfeat_structural(x = mtbls1586_qf_pos, assay_name = "mtbls1586_pos", transformation = transformations, # var = c("Name", "Formula difference", "mass"), var = c("name", "formula", "mass"), ppm = 10) # Create data.frame of adjacency matrix massdiff_df <- as.data.frame(massdiff) |> filter(binary == 1) massdiff_df[5:10,]
A summary of the current data-set on the present mass-difference
distribution is provided using the summary_mz
-function.
This function stores the summary as data.frame, which is also plotted.
Depending on the size of the data, it might be useful to filter
the number of determined mass-differences, e.g. by 1000 counts.
# Summary & filter for counts e.g. higher 1000 #sum_mz_f <- mz_summary(massdiff, filter = 1000, var = "group") # some visualizations #mz_vis(sum_mz_f, var = "group") # append mass difference network master <- c(master, massdiff)
A QFeatures
-object stores various information on LC-MS/MS
data. Also, annotations and possible database identifiers are present
in a QFeatures
-object. In order to add the annotations to
the adjacency-list, qfeat_annotation
is applied. It extracts
the annotations, if any present, and adds them to the desired
adjacency-list, for example the list from the mass-difference network.
In order to check if the annotations match the mass-difference, a filter
is applied so that only features remains that have annotations in both
features.
# Add annotations to mass-difference adjacency matrix rowData(massdiff) <- as.data.frame(rowData(mtbls1586_qf_pos[["mtbls1586_pos"]])) # display annotation for the feature "Cluster_3201" rowData(massdiff)["Cluster_3201", ] massdiff_anno <- as.data.frame(rowData(massdiff)) # Filter for present annotations anno_filter <- filter(massdiff_anno, massdiff_anno$metabolite_identification != "") anno_filter[25:30,]
Correlation networks will be calculated using functionality from
the MetNet
package. qfeat_statistical
uses the QFeatures
-Object as input and extracts the
feature intensity in order to calculate correlations from the
statistical
-function from MetNet
.
MetNet
implemented various models that can be applied
to calculate correlations, e.g. pearson, spearman, partial-pearson,
and many more which might also be selected. After generating the
correlation adjacency-matrix, it will also be transformed to a
data.frame
# Correlation network using partial pearson and GGM correlation corr <- qfeat_statistical(x = mtbls1586_qf_pos, assay_name = "mtbls1586_pos", model = c("pearson_partial", "ggm"), na.omit = TRUE, p.adjust = "BH") # Create data.frame from correlation adjacency matrix corr_df <- as.data.frame(corr) corr_df[25:30,] master <- c(master, corr) ## plots to view the distribution of correlations corr_fl <- data.frame(coef = c(corr_df$pearson_partial_coef, corr_df$ggm_coef), group = c(rep("pearson_partial", nrow(corr_df)), rep("ggm", nrow(corr_df))) ) plot <- ggplot2.histogram(data=corr_fl, xName='coef', groupName='group', legendPosition="right", alpha=0.5, binwidth=0.01, brewerPalette="Paired", addMeanLine=TRUE, meanLineSize=1) plot_custom <-ggplot2.customize(plot, xtitle="Correlation coefficient", ytitle="Count", showLegend=TRUE, axisLine=c(0.5, "solid", "black"), addDensity=TRUE, removePanelBorder=TRUE, backgroundColor="white", mainTitle="Correlation coefficient") plot_custom
ABC
homol <- qfeat_homol(x = mtbls1586_qf_pos, assay_name = "mtbls1586_pos", elements=c("C","H","O"), use_C=TRUE, minmz=5, maxmz=120, minrt=-30, maxrt=30, ppm=TRUE, mztol= 5, rttol=30, minlength=5, mzfilter=FALSE, #spar=.45, R2=.98, plotit=FALSE)
Spectral similarity network is created using positive MS2 spectra.
tolerance = 0.005 ppm = 0 # filter ms2 spectra 20 % of intensity filter <- TRUE my_filter <- function(x) { x > max(x, na.rm = TRUE) / 10 } ms2_spectra_pos <- filterIntensity(ms2_spectra_pos, my_filter) ## to speed up calculations: only use 50 spectra for spectral similarity spect <- spec_molNetwork(ms2_spectra_pos[1:50], MAPFUN = joinPeaksGnps, methods = "gnps", tolerance = tolerance, ppm = ppm, type = "inner") spect_adj <- addSpectralSimilarity(am_structural = massdiff, ms2_similarity = spect) # Create data.frame of adjacency matrix spect_df <- as.data.frame(spect_adj) |> filter(!is.na(gnps)) master <- c(master, spect_adj)
Created networks are exported and saved as .gml file in the current
working-directory. Feature Ids will be stored as additional name in
node attributes. All the other information from the adjacency-lists
will be stores as edge-attributes. It might also be defined which
columns of the adjacency-list should be stored, by using the
select
option in the exportNet2gml
-function.
## Mass-difference network to gml # note: only first 10 features are exported exportNet2gml(x = massdiff_df[1:10,], file = "mzdiff") ## Correlation network to gml # note: only first 10 features are exported exportNet2gml(x = pears_df[1:10,], file = "pears") ## Spectral similarity network2gml # note: only first 10 features are exported exportNet2gml(x = spect_df[1:10,], file = "similarity", select = c("gnps"))
Additionally, attributes may be exported as .gml attribute-file using
exportAttributes2gml
. Edge attributes are prepared by
merging the adjacency-lists from mass-difference and correlation-networks.
Node attributes are generated by extraction additional information as
annotation, mz, and RT values from the QFeatures
-object.
# Create and export attribute file attributes <- merge(corr_df, massdiff_df, by = c("Row", "Col")) # Prepare annotation file feat_anno <- as.data.frame(rowData(mtbls1586_qf_pos[["mtbls1586_pos"]])) feat_anno <- feat_anno[,c("metabolite_identification", "mz", "rtime")] colnames(feat_anno) <- c("Annotation", "mz", "RT") # note: only first 10 features are exported exportAttributes2gml(attributes[1:10,], file = "attributes_mtbls1586_pos", anno = feat_anno)
Export of homologous series
# # el <- as_tibble(homol[["Peaks in homologue series"]]) |> # filter(`to ID` != "0") |> select (c("peak ID", "HS cluster", # "m/z increment", "RT increment")) # # g <- el |> select(c("peak ID", "to ID")) |> as.matrix() |> graph_from_edgelist() |> as_data_frame() # # # ## Write to File # write_graph(g, "/tmp/g.gml", "gml")
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.