FragPipetoMSstatsPTMFormat: Convert output of TMT labeled Fragpipe data into MSstatsPTM...
In Vitek-Lab/MSstatsPTM: Statistical Characterization of Post-translational Modifications

FragPipetoMSstatsPTMFormat

R Documentation

Convert output of TMT labeled Fragpipe data into MSstatsPTM format.

Description

Takes as input TMT experiments which are the output of Fragpipe and converts into MSstatsPTM format. Requires msstats.csv file and an annotation file. Optionally an additional msstats.csv file can be uploaded if a corresponding global profiling run was performed. Site localization is performed and only high probability localizations are kept.

Usage

FragPipetoMSstatsPTMFormat(
  input,
  annotation = NULL,
  input_protein = NULL,
  annotation_protein = NULL,
  use_unmod_peptides = FALSE,
  label_type = "TMT",
  protein_id_col = "Protein",
  peptide_id_col = "Peptide.Sequence",
  mod_id_col = "STY",
  localization_cutoff = 0.75,
  remove_unlocalized_peptides = TRUE,
  Purity_cutoff = 0.6,
  PeptideProphet_prob_cutoff = 0.7,
  useUniquePeptide = TRUE,
  rmPSM_withfewMea_withinRun = FALSE,
  rmPeptide_OxidationM = TRUE,
  rmProtein_with1Feature = FALSE,
  summaryforMultipleRows = sum,
  use_log_file = TRUE,
  append = FALSE,
  verbose = TRUE,
  log_file_path = NULL
)

Arguments

`input`	data.frame of `msstats.csv` file produced by Philosopher
`annotation`	annotation with Run, Fraction, TechRepMixture, Mixture, Channel, BioReplicate, Condition columns or a path to file. Refer to the example 'annotation' for the meaning of each column. Channel column should be consistent with the channel columns (Ignore the prefix "Channel ") in msstats.csv file. Run column should be consistent with the Spectrum.File columns in msstats.csv file.
`input_protein`	same as `input` for global profiling run. Default is NULL.
`annotation_protein`	same as `annotation` for global profiling run. Default is NULL.
`use_unmod_peptides`	Boolean if the unmodified peptides in the input file should be used to construct the unmodified protein output. Only used if `input_protein` is not provided. Default is `FALSE`.
`label_type`	Type of labeling used for experiment. Must be one of "LF" or "TMT". Default is "TMT".
`protein_id_col`	Use 'Protein'(default) column for TMT. This needs to be changed to "ProteinName" for label free. For TMT, 'Master.Protein.Accessions' can be used instead to get the protein ID with single protein.
`peptide_id_col`	Use 'Peptide.Sequence'(default) column for TMT. Must be changed to "PeptideSequence" for label free. "Modified.Peptide.Sequence" can be used instead to get the modified peptide sequence.
`mod_id_col`	Column containing the modified Amino Acids. For example, a Phosphorylation experiment may pass `STY`. The corresponding column with `STY` combined with the mass (e.x. `STY.79.9663`) will be selected. Default is `STY`.
`localization_cutoff`	Minimum localization score required to keep modification. Default is .75.
`remove_unlocalized_peptides`	Boolean indicating if peptides without all sites localized should be kept. Default is TRUE (non-localized sites will be removed).
`Purity_cutoff`	Cutoff for purity. Default is 0.6. Purity refers to how much of the detected ion signal within a specific inclusion window belongs to the target molecule or its closely related forms, compared to any other unwanted signals or noise. Higher values indicate greater purity.
`PeptideProphet_prob_cutoff`	Cutoff for the peptide identification probability. Default is 0.7. The probability is confidence score determined by PeptideProphet and higher values indicate greater confidence.
`useUniquePeptide`	logical, if TRUE (default) removes peptides that are assigned for more than one proteins. We assume to use unique peptide for each protein.
`rmPSM_withfewMea_withinRun`	TRUE will remove the features that have 1 or 2 measurements within each Run. Default is FALSE.
`rmPeptide_OxidationM`	TRUE (default) will remove the peptides including oxidation (M) sequence.
`rmProtein_with1Feature`	TRUE will remove the proteins which have only 1 peptide and charge. Default is FALSE.
`summaryforMultipleRows`	sum (default) or max - when there are multiple measurements for certain feature in certain run, select the feature with the largest summation or maximal value.
`use_log_file`	logical. If TRUE, information about data processing will be saved to a file.
`append`	logical. If TRUE, information about data processing will be added to an existing log file.
`verbose`	logical. If TRUE, information about data processing wil be printed to the console.
`log_file_path`	character. Path to a file to which information about data processing will be saved. If not provided, such a file will be created automatically. If 'append = TRUE', has to be a valid path to a file.

Value

list of one or two data.frame of class MSstatsTMT, named PTM and PROTEIN

Examples

# TMT Example (with global profiling run)
head(fragpipe_input)
head(fragpipe_annotation)
head(fragpipe_input_protein)
head(fragpipe_annotation_protein)

msstats_data = FragPipetoMSstatsPTMFormat(fragpipe_input,
                                          fragpipe_annotation,
                                          fragpipe_input_protein, 
                                          fragpipe_annotation_protein,
                                          label_type="TMT",
                                          mod_id_col = "STY",
                                          localization_cutoff=.75,
                                          remove_unlocalized_peptides=TRUE)
head(msstats_data$PTM)
head(msstats_data$PROTEIN)

# LFQ Example (w/out global profiling run)
input = system.file("tinytest/raw_data/Fragpipe/MSstats.csv", 
                                        package = "MSstatsPTM")
input = data.table::fread(input)
annot = system.file("tinytest/raw_data/Fragpipe/experiment_annotation.tsv", 
                                        package = "MSstatsPTM")
annot = data.table::fread(annot)                                        

msstats_data = FragPipetoMSstatsPTMFormat(input,
                                          annot,
                                          label_type="LF",
                                          mod_id_col = "STY",
                                          localization_cutoff=.75,
                                          protein_id_col = "ProteinName",
                                          peptide_id_col = "PeptideSequence")
head(msstats_data$PTM)

# Note that this is NULL because we did not include a global profiling run.
# Ideally, you should include an independent global profiling run.
head(msstats_data$PROTEIN)

Vitek-Lab/MSstatsPTM documentation built on April 12, 2025, 11:54 p.m.

Vitek-Lab/MSstatsPTM index

README.md

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

Vitek-Lab/MSstatsPTM
Statistical Characterization of Post-translational Modifications

FragPipetoMSstatsPTMFormat: Convert output of TMT labeled Fragpipe data into MSstatsPTM...
In Vitek-Lab/MSstatsPTM: Statistical Characterization of Post-translational Modifications

Convert output of TMT labeled Fragpipe data into MSstatsPTM format.

Description

Usage

Arguments

Value

Examples

Related to FragPipetoMSstatsPTMFormat in Vitek-Lab/MSstatsPTM...

R Package Documentation

Browse R Packages

We want your feedback!

Vitek-Lab/MSstatsPTM Statistical Characterization of Post-translational Modifications

FragPipetoMSstatsPTMFormat: Convert output of TMT labeled Fragpipe data into MSstatsPTM... In Vitek-Lab/MSstatsPTM: Statistical Characterization of Post-translational Modifications

Convert output of TMT labeled Fragpipe data into MSstatsPTM format.

Description

Usage

Arguments

Value

Examples

Related to FragPipetoMSstatsPTMFormat in Vitek-Lab/MSstatsPTM...

R Package Documentation

Browse R Packages

We want your feedback!

Vitek-Lab/MSstatsPTM
Statistical Characterization of Post-translational Modifications

FragPipetoMSstatsPTMFormat: Convert output of TMT labeled Fragpipe data into MSstatsPTM...
In Vitek-Lab/MSstatsPTM: Statistical Characterization of Post-translational Modifications