View source: R/mplnMCMCEMClustering.R
BICFunction | R Documentation |
Performs model selection using the Bayesian Information Criterion (BIC) of Schwarz (1978). Formula: BIC = -2 * logLikelihood + (nParameters * log(nObservations)).
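As a minimal sketch of the formula above, the following base R snippet computes BIC for a range of cluster sizes and picks one. The input values are purely illustrative and are not output from MPLNClust. The snippet assumes that the smallest BIC marks the preferred model, which is the usual convention for this form of the criterion; the package's own selection rule is not restated here.

```r
# Hypothetical inputs: one log-likelihood and one parameter count per cluster size
gmin <- 1
gmax <- 3
logLikelihood <- c(-1250.4, -1180.2, -1175.9)  # illustrative values only
nParameters <- c(27, 55, 83)                   # illustrative values only
nObservations <- 100

# BIC as written in the description
bicValues <- -2 * logLikelihood + (nParameters * log(nObservations))

# Assumption: the smallest BIC indicates the preferred number of components
selectedG <- seq(gmin, gmax)[which.min(bicValues)]
```

Here the penalty term nParameters * log(nObservations) grows with model size, so a larger model is chosen only if its log-likelihood improves enough to offset the penalty.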
BICFunction(
logLikelihood,
nParameters,
nObservations,
clusterRunOutput = NA,
gmin,
gmax,
parallel = FALSE
)
logLikelihood - A vector of final log-likelihood values, one for each cluster size.
nParameters - A vector with the number of parameters for each cluster size.
nObservations - A positive integer specifying the number of observations in the dataset analyzed.
clusterRunOutput - Output from mplnVariational, mplnMCMCParallel, or mplnMCMCNonParallel, if available. Default value is NA. If provided, the vector of cluster labels obtained by mclust::map() for the best model will be included in the output.
gmin - A positive integer specifying the minimum number of components to be considered in the clustering run.
gmax - A positive integer, > gmin, specifying the maximum number of components to be considered in the clustering run.
parallel - TRUE or FALSE, indicating whether MPLNClust::mplnMCMCParallel has been used.
Returns an S3 object of class MPLN with results.
allBICvalues - A vector of BIC values for each cluster size.
BICmodelselected - An integer specifying the model selected by BIC.
BICmodelSelectedLabels - A vector of integers specifying cluster labels for the model selected. Only provided if the user supplies clusterRunOutput.
BICMessage - A character vector indicating if spurious clusters are detected. Otherwise, NA.
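To illustrate how cluster labels and a spurious-cluster check might relate, here is a small base R sketch. It uses apply() with which.max() as a stand-in for a maximum a posteriori (MAP) assignment, and it assumes that a "spurious" cluster is a component that receives no observations. Both the matrix probaPost and that definition are assumptions made for illustration; they are not taken from the MPLNClust source.

```r
# Hypothetical posterior probabilities: rows = observations, columns = components
probaPost <- matrix(c(0.90, 0.10, 0.00,
                      0.20, 0.80, 0.00,
                      0.70, 0.30, 0.00,
                      0.40, 0.60, 0.00),
                    ncol = 3, byrow = TRUE)

# MAP assignment: each observation goes to its most probable component
labels <- apply(probaPost, 1, which.max)

# Assumption: a component with no assigned observations is treated as spurious
emptyComponents <- setdiff(seq_len(ncol(probaPost)), unique(labels))
```

In this toy case the third component is never the most probable one for any observation, so it ends up in emptyComponents even though the model nominally has three components.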
Anjali Silva, anjali@alumni.uoguelph.ca
Schwarz, G. (1978). Estimating the dimension of a model. The Annals of Statistics, 6(2), 461-464.
trueMu1 <- c(6.5, 6, 6, 6, 6, 6)
trueMu2 <- c(2, 2.5, 2, 2, 2, 2)
trueSigma1 <- diag(6) * 2
trueSigma2 <- diag(6)
# Generating simulated data
sampleData <- MPLNClust::mplnDataGenerator(nObservations = 100,
                                           dimensionality = 6,
                                           mixingProportions = c(0.79, 0.21),
                                           mu = rbind(trueMu1, trueMu2),
                                           sigma = rbind(trueSigma1, trueSigma2),
                                           produceImage = "No")
# Clustering
mplnResults <- MPLNClust::mplnVariational(dataset = sampleData$dataset,
                                          membership = sampleData$trueMembership,
                                          gmin = 1,
                                          gmax = 2,
                                          initMethod = "kmeans",
                                          nInitIterations = 2,
                                          normalize = "Yes")
# Model selection
BICmodel <- MPLNClust::BICFunction(logLikelihood = mplnResults$logLikelihood,
                                   nParameters = mplnResults$numbParameters,
                                   nObservations = nrow(mplnResults$dataset),
                                   clusterRunOutput = mplnResults$allResults,
                                   gmin = mplnResults$gmin,
                                   gmax = mplnResults$gmax,
                                   parallel = FALSE)