FragPipetoMSstatsPTMFormat | R Documentation |
Takes as input TMT experiments which are the output of Fragpipe and converts
into MSstatsPTM format. Requires msstats.csv
file and an annotation file.
Optionally an additional msstats.csv
file can be uploaded if a
corresponding global profiling run was performed. Site localization is
performed and only high probability localizations are kept.
FragPipetoMSstatsPTMFormat(
input,
annotation = NULL,
input_protein = NULL,
annotation_protein = NULL,
use_unmod_peptides = FALSE,
label_type = "TMT",
protein_id_col = "Protein",
peptide_id_col = "Peptide.Sequence",
mod_id_col = "STY",
localization_cutoff = 0.75,
remove_unlocalized_peptides = TRUE,
Purity_cutoff = 0.6,
PeptideProphet_prob_cutoff = 0.7,
useUniquePeptide = TRUE,
rmPSM_withfewMea_withinRun = FALSE,
rmPeptide_OxidationM = TRUE,
rmProtein_with1Feature = FALSE,
summaryforMultipleRows = sum,
use_log_file = TRUE,
append = FALSE,
verbose = TRUE,
log_file_path = NULL
)
input |
data.frame of |
annotation |
annotation with Run, Fraction, TechRepMixture, Mixture, Channel, BioReplicate, Condition columns or a path to file. Refer to the example 'annotation' for the meaning of each column. Channel column should be consistent with the channel columns (Ignore the prefix "Channel ") in msstats.csv file. Run column should be consistent with the Spectrum.File columns in msstats.csv file. |
input_protein |
same as |
annotation_protein |
same as |
use_unmod_peptides |
Boolean if the unmodified peptides in the input
file should be used to construct the unmodified protein output. Only used if
|
label_type |
Type of labeling used for experiment. Must be one of "LF" or "TMT". Default is "TMT". |
protein_id_col |
Use 'Protein'(default) column for TMT. This needs to be changed to "ProteinName" for label free. For TMT, 'Master.Protein.Accessions' can be used instead to get the protein ID with single protein. |
peptide_id_col |
Use 'Peptide.Sequence'(default) column for TMT. Must be changed to "PeptideSequence" for label free. "Modified.Peptide.Sequence" can be used instead to get the modified peptide sequence. |
mod_id_col |
Column containing the modified Amino Acids. For example, a Phosphorylation experiment may pass |
localization_cutoff |
Minimum localization score required to keep modification. Default is .75. |
remove_unlocalized_peptides |
Boolean indicating if peptides without all sites localized should be kept. Default is TRUE (non-localized sites will be removed). |
Purity_cutoff |
Cutoff for purity. Default is 0.6. Purity refers to how much of the detected ion signal within a specific inclusion window belongs to the target molecule or its closely related forms, compared to any other unwanted signals or noise. Higher values indicate greater purity. |
PeptideProphet_prob_cutoff |
Cutoff for the peptide identification probability. Default is 0.7. The probability is confidence score determined by PeptideProphet and higher values indicate greater confidence. |
useUniquePeptide |
logical, if TRUE (default) removes peptides that are assigned for more than one proteins. We assume to use unique peptide for each protein. |
rmPSM_withfewMea_withinRun |
TRUE (default) will remove the features that have 1 or 2 measurements within each Run. |
rmPeptide_OxidationM |
TRUE (default) will remove the peptides including oxidation (M) sequence. |
rmProtein_with1Feature |
TRUE will remove the proteins which have only 1 peptide and charge. Defaut is FALSE. |
summaryforMultipleRows |
sum (default) or max - when there are multiple measurements for certain feature in certain run, select the feature with the largest summation or maximal value. |
use_log_file |
logical. If TRUE, information about data processing will be saved to a file. |
append |
logical. If TRUE, information about data processing will be added to an existing log file. |
verbose |
logical. If TRUE, information about data processing wil be printed to the console. |
log_file_path |
character. Path to a file to which information about data processing will be saved. If not provided, such a file will be created automatically. If 'append = TRUE', has to be a valid path to a file. |
list
of one or two data.frame
of class MSstatsTMT
, named PTM
and PROTEIN
# TMT Example (with global profiling run)
head(fragpipe_input)
head(fragpipe_annotation)
head(fragpipe_input_protein)
head(fragpipe_annotation_protein)
msstats_data = FragPipetoMSstatsPTMFormat(fragpipe_input,
fragpipe_annotation,
fragpipe_input_protein,
fragpipe_annotation_protein,
label_type="TMT",
mod_id_col = "STY",
localization_cutoff=.75,
remove_unlocalized_peptides=TRUE)
head(msstats_data$PTM)
head(msstats_data$PROTEIN)
# LFQ Example (w/out global profiling run)
input = system.file("tinytest/raw_data/Fragpipe/MSstats.csv",
package = "MSstatsPTM")
input = data.table::fread(input)
annot = system.file("tinytest/raw_data/Fragpipe/experiment_annotation.tsv",
package = "MSstatsPTM")
annot = data.table::fread(annot)
msstats_data = FragPipetoMSstatsPTMFormat(input,
annot,
label_type="LF",
mod_id_col = "STY",
localization_cutoff=.75,
protein_id_col = "ProteinName",
peptide_id_col = "PeptideSequence")
head(msstats_data$PTM)
# Note that this is NULL because we did not include a global profiling run.
# Ideally, you should include an independent global profiling run.
head(msstats_data$PROTEIN)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.