AICFunction: Model Selection Via Akaike Information Criterion

View source: R/mplnMCMCEMClustering.R

AICFunction R Documentation

Description

Performs model selection using the Akaike Information Criterion (AIC). For each cluster size, the criterion is computed as AIC = -2 * logLikelihood + 2 * nParameters.
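The formula can be illustrated with a minimal base R sketch. It does not call MPLNClust, the input values are made up for illustration, and it assumes the usual convention that the model with the smallest AIC is preferred.

```r
# Hypothetical inputs: final log-likelihoods and parameter counts
# for models with 1, 2 and 3 components (values are made up).
gmin <- 1
gmax <- 3
logLikelihood <- c(-1250.4, -1180.2, -1175.9)
nParameters <- c(27, 55, 83)

# AIC as given in the Description
aicValues <- -2 * logLikelihood + 2 * nParameters

# Assumption: the preferred model is the one with the smallest AIC
selectedG <- seq(gmin, gmax)[which.min(aicValues)]

print(aicValues)
print(selectedG)
```

With these made-up numbers the three-component model has the highest log-likelihood, but its larger parameter count is penalized, so the two-component model ends up with the smallest AIC.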

Usage

AICFunction(
  logLikelihood,
  nParameters,
  clusterRunOutput = NA,
  gmin,
  gmax,
  parallel = FALSE
)

Arguments

logLikelihood

A numeric vector of final log-likelihood values, one for each cluster size.

nParameters

A vector with the number of parameters for each cluster size.

clusterRunOutput

Output from mplnVariational, mplnMCMCParallel, or mplnMCMCNonParallel, if available. Default value is NA. If provided, the output includes the vector of cluster labels for the best model, obtained by mclust::map().

gmin

A positive integer specifying the minimum number of components to be considered in the clustering run.

gmax

A positive integer, greater than gmin, specifying the maximum number of components to be considered in the clustering run.

parallel

TRUE or FALSE, indicating whether MPLNClust::mplnMCMCParallel was used.
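The cluster labels mentioned under clusterRunOutput come from a maximum a posteriori (MAP) assignment: each observation is given the label of its most probable component. The sketch below shows that idea in base R rather than calling mclust::map() itself. The probability matrix and the name probaPost are hypothetical and are not taken from an actual clustering run.

```r
# Hypothetical posterior membership probabilities:
# 4 observations (rows) x 2 components (columns); each row sums to 1.
probaPost <- matrix(c(0.90, 0.10,
                      0.20, 0.80,
                      0.65, 0.35,
                      0.05, 0.95),
                    ncol = 2, byrow = TRUE)

# Assign each observation to its most probable component
labels <- apply(probaPost, 1, which.max)
print(labels)
```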

Value

Returns an S3 object of class MPLN with the following elements:

  • allAICvalues - A vector of AIC values for each cluster size.

  • AICmodelselected - An integer specifying the model selected by AIC.

  • AICmodelSelectedLabels - A vector of integers specifying cluster labels for the selected model. Only provided if the user supplied clusterRunOutput.

  • AICMessage - A character vector indicating if spurious clusters are detected. Otherwise, NA.
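As a rough illustration of how the elements listed above might be read, the sketch below builds a stand-in list by hand using the documented element names. The values are made up, and a real object would come from AICFunction() rather than being constructed this way.

```r
# Hand-built stand-in for the returned object (values are made up;
# a real object comes from AICFunction()).
aicResult <- list(allAICvalues = c(2554.8, 2470.4, 2517.8),
                  AICmodelselected = 2,
                  AICmodelSelectedLabels = c(1, 2, 1, 2),
                  AICMessage = NA)
class(aicResult) <- "MPLN"

# Number of components chosen by AIC
chosenG <- aicResult$AICmodelselected

# Cluster sizes under the selected model
clusterSizes <- table(aicResult$AICmodelSelectedLabels)

# AICMessage is NA unless spurious clusters were flagged
spuriousFlagged <- !all(is.na(aicResult$AICMessage))
```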

Author(s)

Anjali Silva, anjali@alumni.uoguelph.ca

References

Akaike, H. (1973). Information theory and an extension of the maximum likelihood principle. In Second International Symposium on Information Theory, New York, NY, USA, pp. 267–281. Springer Verlag.

Examples

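# True mean vectors for the two components (6 dimensions each)
trueMu1 <- c(6.5, 6, 6, 6, 6, 6)
trueMu2 <- c(2, 2.5, 2, 2, 2, 2)

# True covariance matrices for the two components
trueSigma1 <- diag(6) * 2
trueSigma2 <- diag(6)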

# Generating simulated data
sampleData <- MPLNClust::mplnDataGenerator(
  nObservations = 100,
  dimensionality = 6,
  mixingProportions = c(0.79, 0.21),
  mu = rbind(trueMu1, trueMu2),
  sigma = rbind(trueSigma1, trueSigma2),
  produceImage = "No"
)

# Clustering
mplnResults <- MPLNClust::mplnVariational(
  dataset = sampleData$dataset,
  membership = sampleData$trueMembership,
  gmin = 1,
  gmax = 2,
  initMethod = "kmeans",
  nInitIterations = 2,
  normalize = "Yes"
)

# Model selection
AICmodel <- MPLNClust::AICFunction(
  logLikelihood = mplnResults$logLikelihood,
  nParameters = mplnResults$numbParameters,
  clusterRunOutput = mplnResults$allResults,
  gmin = mplnResults$gmin,
  gmax = mplnResults$gmax,
  parallel = FALSE
)


anjalisilva/MPLNClust documentation built on Sept. 19, 2024, 7:34 a.m.