Example R code showing the usage of the SWATH2stats package. The data processed is the publicly available dataset of S.pyogenes (Röst et al. 2014) (http://www.peptideatlas.org/PASS/PASS00289). The results file 'rawOpenSwathResults_1pcnt_only.tsv' can be found on PeptideAtlas (ftp://PASS00289@ftp.peptideatlas.org/../Spyogenes/results/). This is a R Markdown file, showing the result of processing this data. The lines shaded in grey represent the R code executed during this analysis.
The SWATH2stats package can be directly installed from Bioconductor using the commands below (http://bioconductor.org/packages/devel/bioc/html/SWATH2stats.html).
if (!require("BiocManager")) install.packages("BiocManager") BiocManager::install("SWATH2stats")
Load the SWATH-MS example data from the package, this is a reduced file in order to limit the file size of the package.
library(SWATH2stats) library(data.table) data('Spyogenes', package = 'SWATH2stats')
Alternatively the original file downloaded from the Peptide Atlas can be loaded from the working directory.
data <- data.frame(fread('rawOpenSwathResults_1pcnt_only.tsv', sep='\t', header=TRUE))
Extract the study design information from the file names. Alternatively, the study design table can be provided as an external table.
Study_design <- data.frame(Filename = unique(data$align_origfilename)) Study_design$Filename <- gsub('.*strep_align/(.*)_all_peakgroups.*', '\\1', Study_design$Filename) Study_design$Condition <- gsub('(Strep.*)_Repl.*', '\\1', Study_design$Filename) Study_design$BioReplicate <- gsub('.*Repl([[:digit:]])_.*', '\\1', Study_design$Filename) Study_design$Run <- seq_len(nrow(Study_design)) head(Study_design)
The SWATH-MS data is annotated using the study design table.
data.annotated <- sample_annotation(data, Study_design, column_file = "align_origfilename")
Remove the decoy peptides for a subsequent inspection of the data.
data.annotated.nodecoy <- subset(data.annotated, decoy==FALSE)
\newpage
Count the different analytes for the different injections.
count_analytes(data.annotated.nodecoy)
Plot the correlation of the signal intensity.
correlation <- plot_correlation_between_samples(data.annotated.nodecoy, column.values = 'Intensity')
Plot the correlation of the delta_rt, which is the deviation of the retention time from the expected retention time.
correlation <- plot_correlation_between_samples(data.annotated.nodecoy, column.values = 'delta_rt')
\newpage
Plot the variation of the signal across replicates.
variation <- plot_variation(data.annotated.nodecoy) variation[[2]]
Plot the total variation versus variation within replicates.
variation_total <- plot_variation_vs_total(data.annotated.nodecoy) variation_total[[2]]
Calculate the summed signal per peptide and protein across samples.
peptide_signal <- write_matrix_peptides(data.annotated.nodecoy) protein_signal <- write_matrix_proteins(data.annotated.nodecoy) head(protein_signal)
\newpage
Estimate the overall FDR across runs using a target decoy strategy.
par(mfrow = c(1, 3)) fdr_target_decoy <- assess_fdr_overall(data.annotated, n_range = 10, FFT = 0.25, output = 'Rconsole')
According to this FDR estimation one would need to filter the data with a lower mscore threshold to reach an overall protein FDR of 5%.
mscore4protfdr(data, FFT = 0.25, fdr_target = 0.05)
Filter data for values that pass the 0.001 mscore criteria in at least two replicates of one condition.
data.filtered <- filter_mscore_condition(data.annotated, 0.001, n_replica = 2)
Select only the 10 peptides showing strongest signal per protein.
data.filtered2 <- filter_on_max_peptides(data.filtered, n_peptides = 10)
\newpage
Filter for proteins that are supported by at least two peptides.
data.filtered3 <- filter_on_min_peptides(data.filtered2, n_peptides = 2)
Convert the data into a transition-level format (one row per transition measured).
data.transition <- disaggregate(data.filtered3)
Convert the data into the format required by MSstats.
MSstats.input <- convert4MSstats(data.transition) head(MSstats.input)
Convert the data into the format required by mapDIA.
mapDIA.input <- convert4mapDIA(data.transition) head(mapDIA.input)
Convert the data into the format required by aLFQ.
aLFQ.input <- convert4aLFQ(data.transition) head(aLFQ.input)
Session info on the R version and packages used.
sessionInfo()
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.