MTO Colloquium Lynette A. Hunt Monday, November 24 at 12:45 in WZ 204

MTO Colloquium
Monday, November 24 at 12:45 in WZ 204
Lynette A. Hunt
(University of Waikato, Hamilton, New Zealand)
‘Model selection for the Multimix class of Mixture Models’
The mixture approach to clustering is a model-based approach that requires
the specification of the form of the component distributions and the number of
groups that are to be fitted to the model. The Multimix class of mixture models (Hunt &
Jorgensen 1999) enables the clustering of data that have both categorical and continuous
attributes. However, with this approach to clustering data, the user also has to decide on the
correlation structure that is to be incorporated into the model.
We investigate the performance of some commonly used model selection criteria in
the selection of an appropriate model when using the finite mixture model to cluster data
containing mixed categorical and continuous attributes. The performance of these criteria in
selecting both the form of the correlation structure and the number of components to be
used in the model is assessed using simulated data and a medical data set.
We found that the Bayesian information criterion (BIC) and the integrated classification
likelihood (ICL) can detect the number of groups to be fitted to a mixture model when the correct
partitioning is included in the model, whilst the AIC and CLC perform less satisfactorily.
However, we found that caution is needed when using information criteria to
select a model.
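For readers unfamiliar with the four criteria named above, the following is a minimal sketch of how they are commonly computed from a fitted mixture model. The abstract does not state which exact forms were used in the study, so the formulas here are an assumption: the usual textbook versions, in which smaller values are better and the CLC and ICL add an entropy penalty based on the posterior membership probabilities. The function and argument names (`selection_criteria`, `loglik`, `n_params`, `tau`) are hypothetical and are not part of the MULTIMIX program.

```python
import math


def classification_entropy(tau):
    """Entropy of the fuzzy classification matrix tau (rows sum to 1)."""
    # Terms with p == 0 contribute nothing (limit of p*log(p) as p -> 0).
    return -sum(p * math.log(p) for row in tau for p in row if p > 0.0)


def selection_criteria(loglik, n_params, tau):
    """Return AIC, BIC, CLC and ICL (BIC approximation) for one fitted model.

    loglik   -- maximised log-likelihood of the mixture model
    n_params -- number of free parameters in the model
    tau      -- n x g list of posterior membership probabilities
    Assumed forms (not taken from the talk): all on the -2 log L scale.
    """
    n = len(tau)
    entropy = classification_entropy(tau)
    bic = -2.0 * loglik + n_params * math.log(n)
    return {
        "AIC": -2.0 * loglik + 2.0 * n_params,
        "BIC": bic,
        "CLC": -2.0 * loglik + 2.0 * entropy,
        "ICL": bic + 2.0 * entropy,
    }
```

Under these assumed forms, one would fit each candidate model (each number of components and each correlation structure), evaluate the criteria, and prefer the model with the smallest value of the chosen criterion. Note that the CLC carries no penalty on the number of parameters, and that the ICL equals the BIC when the classification is completely crisp.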
Hunt, L.A. & Jorgensen, M.A. (1999). ‘Mixture Model Clustering: a brief introduction to the
MULTIMIX program’. Australian and New Zealand Journal of Statistics, 41, No. 2, 153-171.