Discovery of Latent Patterns with Hierarchical Bayesian Mixed-Membership Models and the Issue of Model Choice
Cyrille J. Joutard (GREMAQ, University Toulouse, France), Edoardo M. Airoldi (Princeton University, USA), Stephen E. Fienberg (Carnegie Mellon University, USA) and Tanzy M. Love (Carnegie Mellon University, USA)
Copyright: © 2008
Statistical models involving a latent structure often support clustering, classification, and other data-mining tasks. Parameterizations, specifications, and constraints of alternative models can be very different, however, and may lead to contrasting conclusions. Thus, model choice becomes a fundamental issue in applications, both methodological and substantive. Here, we work from a general formulation of hierarchical Bayesian mixed-membership models that subsumes many popular models successfully applied to problems in the computing, social, and biological sciences. We present both parametric and nonparametric specifications for discovering latent patterns. Context for the discussion is provided by novel analyses of two data sets: (1) five years of scientific publications from the Proceedings of the National Academy of Sciences; and (2) an extract on the functional disability of Americans aged 65+ from the National Long Term Care Survey. For both, we elucidate strategies for model choice, and our analyses yield new insights relative to earlier published analyses.