PEBBA is an enrichment pipeline based on over-representation analysis (ORA). It first ranks differentially expressed genes (DEGs) by their $\pi$ values and selects the top ten, which are submitted to ORA; the resulting pathway enrichment is recorded. At each subsequent run, more genes are added and the ORA scores are recomputed. The final run uses the whole list of DEGs, so users obtain a profile of how pathway enrichment changes as the number of selected DEGs increases. The output gives, for each number of ranked DEGs, the enriched pathways sorted by the p-value of the hypergeometric test, from which the user can choose a cutoff based on the number of significant pathways.
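The incremental procedure described above can be sketched in base R. This is only an illustration on toy data, not PEBBA's own implementation: the $\pi$ value is assumed here to be the log2 fold change multiplied by -log10 of the p-value, the gene names, the single pathway, and the step of 10 genes are made up, and the hypergeometric test is computed directly with phyper rather than through clusterProfiler.

```r
# Toy differential expression results (hypothetical values).
set.seed(1)
n_genes <- 200
deg <- data.frame(
  gene   = paste0("g", seq_len(n_genes)),
  log2fc = rnorm(n_genes),
  pval   = runif(n_genes)
)

# Assumed definition of the pi value: log2 fold change * -log10(p-value).
deg$pi <- deg$log2fc * -log10(deg$pval)

# Rank genes from the largest pi value downwards.
ranked <- deg$gene[order(deg$pi, decreasing = TRUE)]

# One hypothetical pathway, built so that it overlaps the top of the ranking.
pathway  <- c(ranked[1:8], sample(ranked[51:n_genes], 12))
universe <- deg$gene

# Hypergeometric (ORA) p-value for the top `k` ranked genes.
ora_pvalue <- function(k) {
  selected <- ranked[seq_len(k)]
  overlap  <- length(intersect(selected, pathway))
  phyper(overlap - 1,
         m = length(pathway),                     # pathway genes in the universe
         n = length(universe) - length(pathway),  # genes outside the pathway
         k = k,                                   # number of selected genes
         lower.tail = FALSE)                      # P(X >= overlap)
}

# Start with the top 10 genes and grow the selection until all genes are used.
cutoffs <- seq(10, n_genes, by = 10)
profile <- data.frame(
  cutoff      = cutoffs,
  minus_log10 = -log10(sapply(cutoffs, ora_pvalue))
)
head(profile)
```

In this toy profile the pathway is strongly enriched at the first cutoffs and the signal fades as the selection approaches the full gene list, which is the kind of trend the PEBBA output is meant to expose.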
PEBBA can be run with a single call to the pebba function:
```r
# Get an example data set to apply PEBBA:
library(PEBBA)
data(example_data)

# Use example pathway gmt file:
gmt_fname <- system.file("extdata", "pathways.gmt", package = "PEBBA")

pebba(example_data, gmt_fname, force = TRUE)
```
That's it! PEBBA will create a folder in the working directory containing tables with the enrichment results and interactive heatmaps which can be used to obtain different information about each pathway.
PEBBA produces 5 output tables, each with several columns. Let's go over each table.
The PathwayUp, PathwayDown, and PathwayAny tables show the -log10 adjusted p-value of the pathway enrichment analysis, as provided by the clusterProfiler R package. Each row represents a pathway, and each column is a cutoff, that is, the number of top-ranked genes with which the enrichment analysis was run. Rows are ordered so that pathways that reach significant enrichment at lower gene cutoffs come first.
The PathwayUp table shows the enrichment results when the DEG list is sorted in decreasing order of $\pi$ values, and the PathwayDown table shows the results for the opposite ordering. For the PathwayAny table, the DEG list is sorted in increasing order of the absolute $\pi$ values.
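To make the layout of these tables concrete, the following sketch builds a small table of the same shape and reorders it. The pathway names, the values, and the significance threshold of 0.05 are assumptions chosen for illustration; they are not taken from PEBBA's output.

```r
# Toy table: rows are pathways, columns are gene-number cutoffs,
# values are -log10 adjusted p-values (all numbers are made up).
minus_log_p <- matrix(
  c(0.2, 0.9, 1.6, 2.1,
    1.8, 2.5, 2.2, 1.0,
    0.1, 0.3, 0.4, 0.6),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("pathway_A", "pathway_B", "pathway_C"),
                  c("10", "20", "30", "40"))
)

# Assumed significance threshold: adjusted p-value < 0.05.
threshold <- -log10(0.05)

# First cutoff at which each pathway is significant (NA if it never is).
first_significant <- apply(minus_log_p, 1, function(row) {
  hits <- which(row > threshold)
  if (length(hits) == 0) NA_real_ else as.numeric(names(row)[hits[1]])
})

# Pathways that become significant at lower cutoffs come first;
# pathways that are never significant are placed last.
ordered_table <- minus_log_p[order(first_significant), , drop = FALSE]
ordered_table
```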
This table provides information about each cutoff.
minimum_Pi_\<direction>: The lowest pi-value in that direction for each cutoff.
minimum_log2fc_combined: The lowest log2 fold-change value in any direction for each cutoff.
minimum_Pi_combined: The lowest pi-value in any direction for each cutoff.
maximum_MinuslogP_\<direction>: The largest -log10 of the adjusted p-value in that direction for each cutoff.
times_significant_\<direction>: The proportion of significant pathways in that direction for each cutoff.
maximum_MinuslogP_meanUPandDOWN: The mean value of the largest -log10 of the adjusted p-value across both directions for each cutoff. This is the pairwise mean of the "maximum_MinuslogP_up" and the "maximum_MinuslogP_down" columns.
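As a small illustration of the pairwise mean described for maximum_MinuslogP_meanUPandDOWN, the sketch below recomputes that column on a toy cutoff table. The values are made up and the table holds only the columns needed for the example; the real table contains more.

```r
# Toy cutoff table (made-up values); column names follow the descriptions above.
cutoff_table <- data.frame(
  cutoff                 = c(10, 20, 30),
  maximum_MinuslogP_up   = c(1.2, 2.4, 3.0),
  maximum_MinuslogP_down = c(0.8, 1.6, 1.0)
)

# Pairwise (row-wise) mean of the up and down columns.
cutoff_table$maximum_MinuslogP_meanUPandDOWN <- rowMeans(
  cutoff_table[, c("maximum_MinuslogP_up", "maximum_MinuslogP_down")]
)
cutoff_table
```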
This table returns information about each pathway.
PEBBA_score_meanUPandDOWN: For each pathway, the mean across both directions of the PEBBA score.
TopCut_highestMinuslogP_minUPandDOWN: For each pathway, the minimum value across both directions of which gene number cutoff provided the largest -log10 adjusted p-value.
PEBBA_score_maxUPandDOWN: For each pathway, the maximum value across both directions of the PEBBA score.
TopCut_highestMinuslogP_\<direction>: For each pathway, which gene number cutoff provided the largest -log10 adjusted p-value in that direction.
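The TopCut columns can be illustrated with two toy tables of -log10 adjusted p-values, one per direction. This is a sketch under assumptions: the pathway names and values are invented, and the helper top_cut is hypothetical rather than a PEBBA function.

```r
cutoffs <- c(10, 20, 30, 40)
pathways <- c("pathway_A", "pathway_B")

# Toy -log10 adjusted p-values per pathway (rows) and cutoff (columns).
up <- matrix(c(0.5, 2.0, 1.5, 1.0,
               0.1, 0.2, 0.9, 3.1),
             nrow = 2, byrow = TRUE,
             dimnames = list(pathways, as.character(cutoffs)))
down <- matrix(c(1.1, 0.4, 0.3, 0.2,
                 0.2, 0.3, 0.6, 0.5),
               nrow = 2, byrow = TRUE,
               dimnames = list(pathways, as.character(cutoffs)))

# Hypothetical helper: cutoff with the largest -log10 adjusted p-value per pathway.
top_cut <- function(m) {
  best_column <- unname(apply(m, 1, which.max))
  setNames(cutoffs[best_column], rownames(m))
}

top_cut_up   <- top_cut(up)
top_cut_down <- top_cut(down)

# Minimum of the two directions, pathway by pathway.
top_cut_min <- pmin(top_cut_up, top_cut_down)

data.frame(
  TopCut_highestMinuslogP_up           = top_cut_up,
  TopCut_highestMinuslogP_down         = top_cut_down,
  TopCut_highestMinuslogP_minUPandDOWN = top_cut_min
)
```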