clusterSamples: K-means clustering on samples based on latent...
In bioFAM/MOFA: Multi-Omics Factor Analysis (MOFA)


MOFA factors are continuous in nature but they can be used to predict discrete clusters of samples, similar to the iCluster model (Shen, 2009).
Clustering can be performed on a single factor, which is equivalent to setting a manual threshold, or on multiple factors, which aggregates multiple sources of variation.
Importantly, this type of clustering is not weighted: it does not take into account the relative importance of the latent factors.
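
The following is a minimal sketch of these two modes in base R. It uses a simulated factor matrix rather than a trained model, so the matrix `Z`, its dimensions, and the factor names `LF1` and `LF2` are assumptions made for illustration. It shows that k-means with `k = 2` on a single factor amounts to a cut point on that factor, and that with several factors all selected columns enter the distance without weighting.

```r
set.seed(1)

# Simulated factor matrix: 40 samples x 2 factors (illustrative, not MOFA output)
Z <- cbind(
  LF1 = c(rnorm(20, mean = -2, sd = 0.3), rnorm(20, mean = 2, sd = 0.3)),
  LF2 = rnorm(40)
)
rownames(Z) <- paste0("sample", seq_len(nrow(Z)))

# Single factor: k-means with k = 2 splits samples at a cut point on LF1
km1 <- kmeans(Z[, "LF1", drop = FALSE], centers = 2, nstart = 10)
threshold <- mean(km1$centers[, "LF1"])  # midpoint between the two centroids
by_threshold <- Z[, "LF1"] > threshold

# Each k-means cluster falls entirely on one side of the threshold
table(cluster = km1$cluster, above_threshold = by_threshold)

# Multiple factors: all selected columns enter the distance unweighted
km2 <- kmeans(Z[, c("LF1", "LF2")], centers = 2, nstart = 10)
table(km2$cluster)
```

Because the factors are not weighted, a factor with a large spread can dominate the Euclidean distance even if it explains little variance in the data.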

clusterSamples(object, k, factors = "all", ...)

`object`	a trained `MOFAmodel` object.
`k`	number of clusters
`factors`	character vector with the factor name(s), or numeric vector with the index of the factor(s) to use. Default is 'all'
`...`	extra arguments passed to `kmeans`
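
Since `...` is forwarded to `kmeans`, standard `stats::kmeans` arguments such as `nstart` and `iter.max` should be usable. The sketch below calls `kmeans` directly on a simulated matrix (a stand-in for the factor matrix, assumed here for illustration) to show what those arguments do. With a trained model, the analogous call would presumably be `clusterSamples(object, k = 3, nstart = 25, iter.max = 50)`.

```r
set.seed(42)

# Simulated stand-in for a samples x factors matrix (not real MOFA output)
Z <- rbind(
  matrix(rnorm(60, mean = 0), ncol = 2),
  matrix(rnorm(60, mean = 4), ncol = 2),
  matrix(rnorm(60, mean = 8), ncol = 2)
)

# nstart: number of random initialisations; the best solution is kept
# iter.max: maximum number of iterations per initialisation
fit <- kmeans(Z, centers = 3, nstart = 25, iter.max = 50)

fit$size          # number of samples per cluster
fit$tot.withinss  # total within-cluster sum of squares
```

Using several random starts makes the result less dependent on the initial centroids, which matters because k-means only finds a local optimum.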

In some cases, samples can have missing values in the factor space. This occurs when a factor is active in a single view and some samples have no data for that view.
In that case, there are several options:

Use clustering approaches that can deal with NAs (not implemented in MOFA).
If the factor in question is not important, remove it with `subsetFactors`.
If the factor in question is important and only a few samples are affected, manually set their missing values to 0 using `object@Expectations$Z[is.na(object@Expectations$Z)] <- 0`.

By default, samples with missing factor values are ignored in the clustering procedure and NAs are returned for them.
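
The default behaviour described above can be sketched in base R as follows. This is an illustration under assumptions, not the package's implementation: the factor matrix `Z` is simulated, and `cluster_complete` is a hypothetical helper name.

```r
set.seed(7)

# Simulated factor matrix with missing values in factor 2 for three samples
Z <- cbind(LF1 = rnorm(30), LF2 = c(rnorm(15, -3), rnorm(15, 3)))
rownames(Z) <- paste0("sample", seq_len(nrow(Z)))
Z[c(2, 10, 25), "LF2"] <- NA

# Hypothetical helper: cluster complete samples only, return NA for the rest
cluster_complete <- function(Z, k, ...) {
  complete <- stats::complete.cases(Z)
  clusters <- rep(NA_integer_, nrow(Z))
  names(clusters) <- rownames(Z)
  clusters[complete] <- kmeans(Z[complete, , drop = FALSE], centers = k, ...)$cluster
  clusters
}

clusters <- cluster_complete(Z, k = 2, nstart = 10)
clusters[c("sample2", "sample10", "sample25")]  # NA for samples with missing factors

# Alternative from the list above: set missing factor values to 0 before clustering
Z0 <- Z
Z0[is.na(Z0)] <- 0
clusters0 <- kmeans(Z0, centers = 2, nstart = 10)$cluster
```

Setting missing values to 0 assigns every sample to a cluster, but it places the affected samples at the origin of that factor, so it is only reasonable when few samples are involved.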

Returns the output from the `kmeans` function.

# Example on the CLL data
filepath <- system.file("extdata", "CLL_model.hdf5", package = "MOFAdata")
MOFA_CLL <- loadModel(filepath)
# cluster samples into 3 groups based on all factors
clusterSamples(MOFA_CLL, k=3, factors="all")
# cluster samples into 2 groups based on factor 1
clusters <- clusterSamples(MOFA_CLL, k=2, factors=1)
# clusters can be visualized, for example, on the factor values:
plotFactorBeeswarm(MOFA_CLL, factor=1, color_by=clusters)

# Example on the scMT data
filepath <- system.file("extdata", "scMT_model.hdf5", package = "MOFAdata")
MOFA_scMT <- loadModel(filepath)
# cluster samples into 2 groups based on factors 1 and 2
clusters <- clusterSamples(MOFA_scMT, k=2, factors=1:2)
# clusters can be visualized, for example, on the factor values:
plotFactorScatter(MOFA_scMT, factors=1:2, color_by=clusters)