AAAI 2014 Tutorial
Latent Tree Models
Part II: Definition and Properties

Nevin L. Zhang
Dept. of Computer Science & Engineering
The Hong Kong Univ. of Sci. & Tech.
http://www.cse.ust.hk/~lzhang

Part II: Concept and Properties
- Latent Tree Models
  - Definition
  - Relationship with finite mixture models
  - Relationship with phylogenetic trees
- Basic Properties

Basic Latent Tree Models (LTM)
- Bayesian network
- All variables are discrete
- Structure is a rooted tree
- Leaf nodes are observed (manifest variables)
- Internal nodes are not observed (latent variables)
- Parameters: P(Y1), P(Y2|Y1), P(X1|Y2), P(X2|Y2), …
- Semantics: the joint distribution over all variables is the product of these distributions.
- Also known as hierarchical latent class (HLC) models (Zhang, JMLR 2004)

Joint Distribution over Observed Variables
- Marginalizing out the latent variables in $P(X_1, \dots, X_n, Y_1, \dots, Y_m)$, we get a joint distribution over the observed variables, $P(X_1, \dots, X_n)$.
- In comparison with a Bayesian network without latent variables, an LTM:
  - Is computationally very simple to work with.
  - Can represent complex relationships among manifest variables.
- What does the structure look like without the latent variables?

Pouch Latent Tree Models (PLTM)
- An extension of the basic LTM (Poon et al. ICML 2010)
- Rooted tree
- Internal nodes represent discrete latent variables
- Each leaf node consists of one or more continuous observed variables, called a pouch.

More General Latent Variable Tree Models
- Some internal nodes can be observed
- Internal nodes can be continuous
- Forest
(Choi et al. JMLR 2011)
- Primary focus of this tutorial: the basic LTM
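The marginalization described under "Joint Distribution over Observed Variables" can be sketched by brute-force enumeration on a tiny model. This is only an illustration: the tree shape (Y1 with children Y2 and X3, Y2 with children X1 and X2), the binary state spaces, and every probability value below are assumptions made for the example and do not come from the slides.

```python
from itertools import product

# Hypothetical basic LTM (all variables binary):
#   Y1 -> Y2, Y1 -> X3, Y2 -> X1, Y2 -> X2
# All numbers are made up for illustration.
p_y1 = [0.6, 0.4]                            # P(Y1)
p_y2_given_y1 = [[0.8, 0.2], [0.3, 0.7]]     # P(Y2 | Y1), rows indexed by Y1
p_x1_given_y2 = [[0.9, 0.1], [0.2, 0.8]]     # P(X1 | Y2), rows indexed by Y2
p_x2_given_y2 = [[0.7, 0.3], [0.1, 0.9]]     # P(X2 | Y2), rows indexed by Y2
p_x3_given_y1 = [[0.6, 0.4], [0.25, 0.75]]   # P(X3 | Y1), rows indexed by Y1


def joint(y1, y2, x1, x2, x3):
    """Joint probability of one full assignment: product of all local distributions."""
    return (p_y1[y1]
            * p_y2_given_y1[y1][y2]
            * p_x1_given_y2[y2][x1]
            * p_x2_given_y2[y2][x2]
            * p_x3_given_y1[y1][x3])


def observed_marginal():
    """Sum the latent variables Y1, Y2 out of the joint for every (x1, x2, x3)."""
    table = {}
    for x in product(range(2), repeat=3):
        table[x] = sum(joint(y1, y2, *x)
                       for y1, y2 in product(range(2), repeat=2))
    return table


marginal = observed_marginal()
```

The resulting table has one entry per observed configuration and no longer mentions Y1 or Y2. Enumeration is exponential in the number of latent variables, so it is used here only because the example is tiny; the tree structure is what makes efficient inference possible in practice.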
Finite Mixture Models (FMM)
- Gaussian mixture models (GMM): continuous attributes
- Graphical model

Finite Mixture Models (FMM)
- GMM with independence assumption
- Block diagonal covariance matrix
- Graphical model

Finite Mixture Models
- Latent class models (LCM): discrete attributes
- Graphical model
- Distribution for cluster k: product multinomial distribution, $P(X_1, \dots, X_n \mid Y = k) = \prod_{j=1}^{n} P(X_j \mid Y = k)$
- All FMMs:
  - One latent variable
  - Yielding one partition of data

From FMMs to LTMs
- Start with several GMMs
  - Each based on a distinct subset of attributes
  - Each partitions data from a certain perspective.
  - Different partitions are independent of each other
- Link them up to form a tree model
  - Get a pouch LTM
  - Consider different perspectives in a single model
  - Multiple partitions of data that are correlated.

From FMMs to LTMs
- Start with several LCMs
  - Each based on a distinct subset of attributes
  - Each partitions data from a certain perspective.
  - Different partitions are independent of each other
- Link them up to form a tree model
  - Get an LTM
  - Consider different perspectives in a single model
  - Multiple partitions of data that are correlated.
- Summary: An LTM can be viewed as a collection of FMMs, with their latent variables linked up to form a tree structure.

Phylogenetic trees
- Taxa (sequences) identify species
- Edge lengths represent evolution time
- Usually, bifurcating tree topology
Durbin, et al. (1998). Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids. Cambridge University Press.
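To make the latent class model from the finite mixture slides concrete, here is a minimal sketch of its likelihood and of the partition it induces. The two classes, three binary attributes, and all probability values are hypothetical; only the product form of the per-cluster distribution and the single latent variable are taken from the slides.

```python
import math

# Hypothetical latent class model: one latent variable Y with two classes,
# three binary attributes. All numbers are made up for illustration.
class_prior = [0.5, 0.5]  # P(Y = k)

# attr_given_class[j][k][v] = P(X_j = v | Y = k)
attr_given_class = [
    [[0.9, 0.1], [0.2, 0.8]],
    [[0.8, 0.2], [0.3, 0.7]],
    [[0.7, 0.3], [0.1, 0.9]],
]


def class_joint(x, k):
    """P(Y = k, x): class prior times the product multinomial for cluster k."""
    return class_prior[k] * math.prod(
        attr_given_class[j][k][v] for j, v in enumerate(x)
    )


def likelihood(x):
    """P(x): the mixture over all classes."""
    return sum(class_joint(x, k) for k in range(len(class_prior)))


def posterior(x):
    """P(Y | x): soft assignment of a data case to the clusters."""
    total = likelihood(x)
    return [class_joint(x, k) / total for k in range(len(class_prior))]


example_posterior = posterior((0, 0, 0))
```

Assigning each data case to its most probable class gives the single partition mentioned on the slide. An LTM built from several such models would yield one partition per latent variable.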
Probabilistic Models of Evolution
- Two assumptions:
  - There are only substitutions, no insertions/deletions (aligned)
    - One-to-one correspondence between sites in different sequences
  - Each site evolves independently and identically
    - $P(x \mid y, t) = \prod_{i=1}^{m} P(x(i) \mid y(i), t)$
    - m is the sequence length
- $P(x(i) \mid y(i), t)$: Jukes-Cantor (character evolution) model [1969]
  - Rate of substitution $\alpha$

Phylogenetic Trees are Special LTMs
- When we focus on one site, phylogenetic trees are special latent tree models
  - The structure is a binary tree
  - The variables share the same state space.
  - Each conditional distribution is characterized by only one parameter, i.e., the length of the corresponding edge

Hidden Markov Models
- Hidden Markov models are also special latent tree models
  - All latent variables share the same state space.
  - All observed variables share the same state space.
  - $P(y_t \mid s_t)$ and $P(s_{t+1} \mid s_t)$ are the same for all t.

Two Concepts of Models
- So far, a model consists of:
  - Observed and latent variables
  - Connections among the variables
  - Probability values
- For the rest of Part II, a model consists of:
  - Observed and latent variables
  - Connections among the variables
  - Probability parameters

Model Inclusion

Model Equivalence
- If m includes m' and vice versa, then they are marginally equivalent.
- If they also have the same number of free parameters, then they are equivalent.
- It is not possible to distinguish between equivalent models based on data.

Root Walking
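The following sketch illustrates root walking numerically. It assumes that root walking means moving the root from the current latent root to a neighboring latent node by reversing the arc between them, which is how the operation is usually described; the slide text itself does not spell this out. The tree (Y1 with children X1, X2, Y2; Y2 with children X3, X4), the binary state spaces, and all probability values are hypothetical.

```python
from itertools import product

# Hypothetical model rooted at Y1 (all variables binary, numbers made up):
#   Y1 -> X1, Y1 -> X2, Y1 -> Y2, Y2 -> X3, Y2 -> X4
p_y1 = [0.6, 0.4]                            # P(Y1)
p_y2_given_y1 = [[0.8, 0.2], [0.3, 0.7]]     # P(Y2 | Y1), rows indexed by Y1
p_x1_given_y1 = [[0.9, 0.1], [0.3, 0.7]]
p_x2_given_y1 = [[0.6, 0.4], [0.2, 0.8]]
p_x3_given_y2 = [[0.7, 0.3], [0.1, 0.9]]
p_x4_given_y2 = [[0.5, 0.5], [0.05, 0.95]]


def reverse_arc(p_root, p_child_given_root):
    """Assumed root-walking step: turn P(root), P(child | root)
    into P(child), P(root | child) via Bayes' rule."""
    n_root, n_child = len(p_root), len(p_child_given_root[0])
    p_child = [sum(p_root[r] * p_child_given_root[r][c] for r in range(n_root))
               for c in range(n_child)]
    p_root_given_child = [
        [p_root[r] * p_child_given_root[r][c] / p_child[c] for r in range(n_root)]
        for c in range(n_child)
    ]
    return p_child, p_root_given_child


def leaves(y1, y2, x):
    """Product of the leaf distributions; these are untouched by root walking."""
    return (p_x1_given_y1[y1][x[0]] * p_x2_given_y1[y1][x[1]]
            * p_x3_given_y2[y2][x[2]] * p_x4_given_y2[y2][x[3]])


def marginal_rooted_at_y1(x):
    return sum(p_y1[a] * p_y2_given_y1[a][b] * leaves(a, b, x)
               for a, b in product(range(2), repeat=2))


# Walk the root from Y1 to Y2.
q_y2, q_y1_given_y2 = reverse_arc(p_y1, p_y2_given_y1)


def marginal_rooted_at_y2(x):
    return sum(q_y2[b] * q_y1_given_y2[b][a] * leaves(a, b, x)
               for a, b in product(range(2), repeat=2))


# Largest difference between the two observed-variable distributions.
max_gap = max(abs(marginal_rooted_at_y1(x) - marginal_rooted_at_y2(x))
              for x in product(range(2), repeat=4))
```

Up to floating-point error, `max_gap` is zero: both rootings define the same distribution over X1 to X4, so data on the observed variables cannot tell them apart. The latent nodes are named Y1 and Y2 here to match the earlier slides.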
Root Walking Example
- Root walks to X2; root walks to X3

Root Walking
- Theorem: Root walking leads to equivalent latent tree models (Zhang, JMLR 2004).
- Special case of covered arc reversal in general Bayesian networks: Chickering, D. M. (1995). A transformational characterization of equivalent Bayesian network structures. UAI.

Implication
- Edge orientations in latent tree models are not identifiable.
- Technically, it is better to start with an alternative definition of LTM: a latent tree model (LTM) is a Markov random field over an undirected tree (a tree-structured Markov network) in which variables at leaf nodes are observed and variables at internal nodes are hidden.

Implication
- For technical convenience, we often root an LTM at one of its latent nodes and regard it as a directed graphical model.
- Rooting the model at different latent nodes leads to equivalent directed models.
- This is why we introduced LTMs as directed models.

Regularity
- |X|: cardinality of variable X, i.e., the number of states.

Regularity
- Can focus on regular models only:
  - Irregular models can be made regular
  - Regularized models are better than irregular models (Zhang, JMLR 2004)
- Theorem: The set of all regular models for a given set of observed variables is finite.
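Equivalence as defined earlier requires the same number of free parameters, and that number depends only on the cardinalities |X| and on which node is each node's parent. The sketch below counts free parameters for a rooted tree using the standard count for discrete Bayesian networks (|root| - 1 for the root, |parent| * (|node| - 1) for every other node) and compares two rootings of the same undirected tree. The variable names, cardinalities, and edges are hypothetical.

```python
from collections import deque


def orient(edges, root):
    """Direct an undirected tree away from `root`; return a node -> parent map."""
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, []).append(v)
        neighbors.setdefault(v, []).append(u)
    parent = {root: None}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for nb in neighbors[node]:
            if nb not in parent:
                parent[nb] = node
                queue.append(nb)
    return parent


def free_parameters(cardinality, parent):
    """Free parameters of a rooted, tree-structured discrete Bayesian network."""
    total = 0
    for node, card in cardinality.items():
        par = parent[node]
        if par is None:
            total += card - 1                       # P(root)
        else:
            total += cardinality[par] * (card - 1)  # P(node | parent)
    return total


# Hypothetical LTM: latent Y1, Y2; observed X1..X4 at the leaves.
cardinality = {"Y1": 3, "Y2": 2, "X1": 2, "X2": 4, "X3": 2, "X4": 3}
edges = [("Y1", "Y2"), ("Y1", "X1"), ("Y1", "X2"), ("Y2", "X3"), ("Y2", "X4")]

count_rooted_at_y1 = free_parameters(cardinality, orient(edges, "Y1"))
count_rooted_at_y2 = free_parameters(cardinality, orient(edges, "Y2"))
```

Both counts come out equal, which is consistent with the statement that rooting the model at different latent nodes leads to equivalent directed models: moving the root across an edge between nodes of cardinality a and b changes (a - 1) + a(b - 1) into (b - 1) + b(a - 1), and both equal ab - 1.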