Vignette illustrating the use of MOFA on simulated data
In MOFA: Multi-Omics Factor Analysis (MOFA)

Simulate an example data set

To illustrate the MOFA workflow we simulate a small example data set with 3 different views. makeExampleData generates an untrained MOFAobject containing the simulated data. If you work on your own data use createMOFAobject to create the untrained MOFA object (see our vignettes for CLL and scMT data).

library(MOFA)

By default the function makeExampleData produces a small data set containing 3 views with 100 features each and 50 samples. These parameters can be varied using the arguments n_views, n_features and n_samples.

set.seed(1234)
data <- makeExampleData()
MOFAobject <- createMOFAobject(data)
MOFAobject

Prepare MOFA: Set the training and model options

Once the untrained MOFAobject was created, we can specify details on data processing, model specifications and training options such as the number of factors, the likelihood models etc. Default option can be obtained using the functions getDefaultTrainOptions, getDefaultModelOptions and getDefaultDataOptions. We describe details on these option in our vignettes for CLL and scMT data.

Using prepareMOFA the model is set up for training.

TrainOptions <- getDefaultTrainOptions()
ModelOptions <- getDefaultModelOptions(MOFAobject)
DataOptions <- getDefaultDataOptions()

TrainOptions$DropFactorThreshold <- 0.01

Run MOFA

Once the MOFAobject is set up we can use runMOFA to train the model. As depending on the random initilization the results might differ, we recommend to use runMOFA multiple times (e.g. ten times, here we use a smaller number for illustration as the model training can take some time). As a next step we will compare the different fits and select the best model for downstream analyses.

n_inits <- 3
MOFAlist <- lapply(seq_len(n_inits), function(it) {

  TrainOptions$seed <- 2018 + it

  MOFAobject <- prepareMOFA(
  MOFAobject, 
  DataOptions = DataOptions,
  ModelOptions = ModelOptions,
  TrainOptions = TrainOptions
)

  runMOFA(MOFAobject)
})

Compare different random inits and select the best model

Having a list of trained models we can use compareModels to get an overview of how many factors were inferred in each run and what the optimized ELBO value is (a model with larger ELBO is preferred).

compareModels(MOFAlist)

With compareFactors we can get an overview of how robust the factors are between different model instances.

compareFactors(MOFAlist)

For down-stream analyses we recommned to choose the model with the best ELBO value as is done by selectModel.

MOFAobject <- selectModel(MOFAlist, plotit = FALSE)
MOFAobject

Downstream analyses

On the trained MOFAobject we can now start looking into the inferred factors, its weights etc. Here the data was generated using five factors, whose activity patterns we can recover using plotVarianceExplained.