run_enrichment: Run feature set Enrichment Analysis
In bioFAM/MOFA2: Multi-Omics Factor Analysis v2

Description

Performs feature set enrichment analysis on the feature weights of a MOFA object, using a slightly modified version of the pcgse function.

Usage

run_enrichment(
  object,
  view,
  feature.sets,
  factors = "all",
  set.statistic = c("mean.diff", "rank.sum"),
  statistical.test = c("parametric", "cor.adj.parametric", "permutation"),
  sign = c("all", "positive", "negative"),
  min.size = 10,
  nperm = 1000,
  p.adj.method = "BH",
  alpha = 0.1,
  verbose = TRUE
)

Arguments

`object`	a `MOFA` object.
`view`	character string with the view name, or numeric index of the view to use.
`feature.sets`	data structure that holds feature set membership information. Must be a binary membership matrix (rows are feature sets and columns are features). See details below for some pre-built gene set matrices.
`factors`	character vector with the factor names, or numeric vector with the index of the factors for which to perform the enrichment. Default is "all".
`set.statistic`	the set statistic computed from the feature statistics. Must be one of the following: "mean.diff" (default) or "rank.sum".
`statistical.test`	the statistical test used to compute the significance of the feature set statistics under a competitive null hypothesis. Must be one of the following: "parametric" (default), "cor.adj.parametric", "permutation".
`sign`	which weights to use: "all" (default), only "positive" weights, or only "negative" weights.
`min.size`	Minimum size of a feature set (default is 10).
`nperm`	number of permutations. Only relevant if statistical.test is set to "permutation". Default is 1000.
`p.adj.method`	Method to adjust p-values factor-wise for multiple testing. Can be any method listed in p.adjust.methods. Default is "BH" (Benjamini-Hochberg procedure).
`alpha`	FDR threshold to generate lists of significant pathways. Default is 0.1.
`verbose`	boolean indicating whether to print progress messages. Default is TRUE.
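
The `feature.sets` argument expects a binary membership matrix with feature sets in rows and features in columns. The following base R sketch shows one way such a matrix could be built from a named list of sets and then filtered by a minimum set size. The gene and pathway names, and the `min_size` value of 2, are hypothetical and chosen only to keep the example small. It also assumes that column names are meant to match the feature names of the selected view.

```r
# Hypothetical gene sets: names are pathway labels, values are feature names.
gene_sets <- list(
  pathway_A = c("gene1", "gene2", "gene3"),
  pathway_B = c("gene2", "gene4"),
  pathway_C = c("gene5")
)

# All features that appear in at least one set, in a stable order.
features <- sort(unique(unlist(gene_sets)))

# Binary membership matrix: rows are feature sets, columns are features.
membership <- t(vapply(
  gene_sets,
  function(members) as.numeric(features %in% members),
  numeric(length(features))
))
colnames(membership) <- features

# Drop sets with fewer members than a minimum size
# (2 here for illustration; run_enrichment uses min.size = 10 by default).
min_size <- 2
membership_filtered <- membership[rowSums(membership) >= min_size, , drop = FALSE]
```

Here `membership_filtered` keeps only the sets with at least `min_size` members, which mirrors the role of the `min.size` argument.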

Details

The aim of this function is to relate each factor to pre-defined biological pathways by performing a gene set enrichment analysis on the feature weights.
This function is particularly useful when a factor is difficult to characterise based only on the genes with the highest weight.
We provide a few pre-built gene set matrices in the MOFAdata package. See https://github.com/bioFAM/MOFAdata for details.
The implementation is based on the pcgse function, with some modifications. See https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4543476 for the mathematical details.
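
To give an intuition for a competitive test on feature weights, the base R sketch below computes a mean-difference set statistic for one hypothetical feature set on one factor, then assesses it with a two-sample t-test and with a permutation of feature labels. This is an illustration under stated assumptions, not the MOFA2 or pcgse implementation: the weights are simulated, the feature statistic is taken to be the raw weight, and the exact form of the tests used by `run_enrichment` may differ.

```r
set.seed(1)

# Hypothetical weights of 200 features on a single factor.
weights <- rnorm(200)
names(weights) <- paste0("gene", seq_along(weights))

# Hypothetical feature set: the first 20 features, given shifted weights.
in_set <- seq_along(weights) <= 20
weights[in_set] <- weights[in_set] + 1

# Assumption: the feature-level statistic is simply the weight itself.
feature_stat <- weights

# Set statistic: mean of set members minus mean of all other features.
mean_diff <- mean(feature_stat[in_set]) - mean(feature_stat[!in_set])

# Parametric assessment: two-sample t-test with equal variances.
p_parametric <- t.test(
  feature_stat[in_set], feature_stat[!in_set],
  var.equal = TRUE
)$p.value

# Permutation assessment: shuffle set membership across features
# and recompute the set statistic each time.
nperm <- 1000
null_diffs <- replicate(nperm, {
  shuffled <- sample(in_set)
  mean(feature_stat[shuffled]) - mean(feature_stat[!shuffled])
})
p_permutation <- (1 + sum(abs(null_diffs) >= abs(mean_diff))) / (nperm + 1)
```

The null hypothesis is competitive in the sense that set members are compared against the remaining features, rather than against zero.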

Value

a list with five elements:

`pval`	matrices with nominal p-values.
`pval.adj`	matrices with FDR-adjusted p-values.
`feature.statistics`	matrices with the local (feature-wise) statistics.
`set.statistics`	matrices with the global (feature set-wise) statistics.
`sigPathways`	list with significant pathways per factor.
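
The relationship between nominal p-values, adjusted p-values and the per-factor lists of significant pathways can be sketched in base R as follows. The matrix layout (feature sets in rows, factors in columns), the pathway and factor names, the p-values, and the use of `<=` for the threshold comparison are assumptions made for illustration. The actual structure returned by `run_enrichment` should be checked on a real result.

```r
# Hypothetical nominal p-values: rows are feature sets, columns are factors.
pval <- matrix(
  c(0.001, 0.20, 0.04,
    0.50,  0.03, 0.90),
  nrow = 3,
  dimnames = list(
    c("pathway_A", "pathway_B", "pathway_C"),
    c("Factor1", "Factor2")
  )
)

# Adjust factor-wise: each column is corrected independently.
pval_adj <- apply(pval, 2, p.adjust, method = "BH")
dimnames(pval_adj) <- dimnames(pval)

# For each factor, list the sets whose adjusted p-value passes the threshold.
alpha <- 0.1
factor_names <- colnames(pval_adj)
sig_pathways <- lapply(
  setNames(factor_names, factor_names),
  function(f) rownames(pval_adj)[pval_adj[, f] <= alpha]
)
```

Adjusting each column separately means the correction accounts for the number of feature sets tested per factor, not for the number of factors.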