Graphical models and Global Markov properties

Graphs

The arguments in this paper depend on results on statistical chain graph models. For this reason we briefly review a few concepts, definitions and results from mathematical graph theory and from the theory of graphical models. In this paper, the most important concepts are separation in graphs and moralization of graphs. The review is therefore restricted to the concepts and notation needed to define these two notions. For a more comprehensive introduction to graph theoretical results of importance for graphical models we refer to Lauritzen (1996).

Mathematical graphs
A mathematical graph is a pair G = (V,E), where V is a set of nodes and E ⊆ V×V represents links between nodes. It is customary to display the graph as a visual diagram with dots representing nodes and edges or arrows representing links between nodes. Two nodes, a and b, are joined by an undirected edge in the visual graph if (a,b) ∈ E and (b,a) ∈ E, or by an arrow pointing from a to b if (a,b) ∈ E and (b,a) ∉ E. E therefore consists of two subsets, E = U ∪ D, where U is the set of undirected edges, U = {(a,b) ∈ E: (b,a) ∈ E}, and D is the set of directed arrows, D = {(a,b) ∈ E: (b,a) ∉ E}. The graph is undirected if D is empty.

Subgraphs
A subset W ⊆ V defines a subgraph G_W = (W,F), where F = E ∩ (W×W).

Paths
A path connecting two nodes a_0 and a_k in a graph is a sequence of distinct nodes, a_0, a_1, .., a_{k-1}, a_k, such that (a_{i-1},a_i) ∈ E for all i = 1,..,k.

Connections and connectivity components
Two nodes, a and b, are said to connect in the graph if there are paths from a to b and from b to a. The two paths connecting a and b may be one and the same undirected path or two distinct paths. Connection partitions the nodes into a number of disjoint subsets of connected nodes referred to as connectivity components, V = C_1 ∪ ... ∪ C_c.

Graph theory defines a number of different types of subsets of nodes related to a specific node. Some of these definitions are shown in Table 1.

Table 1.
Subsets related to a specific node, a ∈ V, of a graph.

Notation   Definition             Description
pa(a)      {b ∈ V: (b,a) ∈ D}     The parents of the node
ne(a)      {b ∈ V: (a,b) ∈ U}     The neighbors of the node
bd(a)      pa(a) ∪ ne(a)          The boundary of the node

The definitions in Table 1 generalize to definitions for subsets of nodes. If A is a set of nodes then pa(A) = ∪_{a∈A} pa(a), ne(A) = ∪_{a∈A} ne(a) \ A, and bd(A) = pa(A) ∪ ne(A).

Ancestral sets
A subset, A ⊆ V, is ancestral if bd(a) ⊆ A for all a ∈ A. For any A ⊆ V there will always be a smallest ancestral set, An(A), containing A.

Separation
Consider three disjoint subsets, A ⊆ V, B ⊆ V and S ⊆ V. We say that S separates A from B in the graph if all paths from nodes in A to nodes in B, and vice versa, contain at least one node in S. The subgraph G_{V\S} consequently has no paths linking nodes in A to nodes in B.

Chain graphs
Chain graphs are graphs where the set of nodes is partitioned into a number of disjoint subsets (blocks), V = V_1 ∪ ... ∪ V_t, such that links between nodes in the same block are undirected edges, while links between nodes from different blocks are arrows pointing from blocks with higher numbers to blocks with lower numbers.

Truncated chain graphs
The number of blocks defining the chain graph is referred to as the length of the chain. Connected with a chain graph, G, of length t, we define a number of truncated chain graphs, Trunc(G,s) = G_W, where W = V_s ∪ ... ∪ V_t. Note that the set of nodes of a truncated chain graph is ancestral in G.

Chain components
The connectivity components of chain graphs are called chain components. The nodes of a chain component always belong to the same block, but a block may contain more than one chain component. A subgraph defined by a subset of nodes from a chain component is always undirected.
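The sets in Table 1, the smallest ancestral set and the separation criterion can be illustrated by a minimal Python sketch. The representation of E as a Python set of ordered pairs follows the definition of a mathematical graph above; the function names and the four-node chain graph are hypothetical and only serve as an illustration.

```python
def split_edges(E):
    """Split E into undirected edges U and arrows D, so that E is the union of U and D."""
    U = {(a, b) for (a, b) in E if (b, a) in E}
    D = E - U
    return U, D

def pa(E, a):
    """Parents of a: nodes b with an arrow pointing from b to a."""
    _, D = split_edges(E)
    return {b for (b, c) in D if c == a}

def ne(E, a):
    """Neighbors of a: nodes joined to a by an undirected edge."""
    U, _ = split_edges(E)
    return {b for (c, b) in U if c == a}

def bd(E, a):
    """Boundary of a: parents and neighbors."""
    return pa(E, a) | ne(E, a)

def smallest_ancestral_set(E, A):
    """An(A): close A under the boundary operation."""
    An, stack = set(A), list(A)
    while stack:
        a = stack.pop()
        for b in bd(E, a) - An:
            An.add(b)
            stack.append(b)
    return An

def reachable(E, start, blocked):
    """Nodes that can be reached from start along links in E without entering blocked."""
    seen, stack = set(start), list(start)
    while stack:
        a = stack.pop()
        for (c, b) in E:
            if c == a and b not in blocked and b not in seen:
                seen.add(b)
                stack.append(b)
    return seen

def separates(E, A, B, S):
    """True if all paths from A to B and from B to A contain a node in S."""
    return not (reachable(E, A, S) & set(B)) and not (reachable(E, B, S) & set(A))

# Hypothetical chain graph with blocks V1 = {x, y} and V2 = {u, v}:
# undirected edges x - y and u - v, arrows u -> x and v -> y.
E = {("u", "x"), ("v", "y"),
     ("x", "y"), ("y", "x"),
     ("u", "v"), ("v", "u")}

print(bd(E, "x"))                              # parents and neighbors of x
print(smallest_ancestral_set(E, {"u"}))        # the block V2
print(separates(E, {"u"}, {"y"}, {"v", "x"}))  # every path from u to y is blocked
```

In this example the smallest ancestral set containing u is the block {u, v}, which is the node set of the truncated graph Trunc(G,2).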
Moralized graphs
The moral graph, G^M, of a chain graph, G = (V,E), is an undirected graph with the same nodes as G where edges are included according to the following moralizing rules. Two nodes, a and b, are joined by an undirected edge in G^M
1) if (a,b) ∈ E and/or (b,a) ∈ E,
2) if they have a common child, that is, a ∈ pa(c) and b ∈ pa(c) for some node c,
3) if they have children in the same chain component.

Separation in moralized graphs is very important for the theory of statistical chain graph models. It is illustrated in Figure 1, which presents moralized versions of a complete and a truncated chain graph. D and E are separated by {C,F,G} in the moralized graph, Figure 1b, and by {C,G} in the moralized truncated graph, Figure 1d.

Figure 1. Chain graphs: (a) chain graph, (b) moralized chain graph, (c) truncated chain graph, (d) moralized truncated chain graph. Dotted lines have been added during moralization.

Chain graph models
A chain graph model (CGM) is a multivariate statistical model defined by two sets of assumptions. The first partitions the variables into a set of recursive blocks, V_1,…,V_t, that is usually assumed to correspond to a temporal and/or causal structure. The recursive structure defines a statistical model by rewriting the joint distribution of all variables as the product of conditional probabilities or densities of the variables in each block given all variables in the preceding blocks,

P(V) = ∏_{i=1..t-1} P(V_i | V_{i+1},..,V_t) · P(V_t).

The second set of assumptions states that certain pairs of variables are conditionally independent given all legitimate variables, where the set of legitimate variables consists of all concurrent or prior variables.

The assumptions of the models are encoded in chain graphs where variables are represented by nodes partitioned into subsets corresponding to the recursive blocks. If the assumptions state that two variables are conditionally independent given all legitimate variables, there will be no edge or arrow linking the corresponding nodes of the graph.
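The moralizing rules stated earlier in this section can be sketched in Python as follows. This is a minimal sketch under stated assumptions: E is a set of ordered pairs as in the definition of a mathematical graph, the moral graph is returned as a set of two-element frozensets, and the function names and the example graph are hypothetical. Rules 2 and 3 are applied together by joining all parents of each chain component.

```python
from itertools import combinations

def chain_components(V, E):
    """Connectivity components of the undirected part of a chain graph."""
    U = {(a, b) for (a, b) in E if (b, a) in E}
    remaining, components = set(V), []
    while remaining:
        start = remaining.pop()
        comp, stack = {start}, [start]
        while stack:
            a = stack.pop()
            for (c, b) in U:
                if c == a and b not in comp:
                    comp.add(b)
                    stack.append(b)
        remaining -= comp
        components.append(comp)
    return components

def moralize(V, E):
    """Return the edges of the moral graph as a set of frozensets {a, b}."""
    D = {(a, b) for (a, b) in E if (b, a) not in E}
    # Rule 1: every edge or arrow of G becomes an undirected edge.
    moral = {frozenset(link) for link in E}
    # Rules 2 and 3: join all pairs of parents of the same chain component.
    for comp in chain_components(V, E):
        parents = {b for (b, c) in D if c in comp}
        for a, b in combinations(parents, 2):
            moral.add(frozenset((a, b)))
    return moral

# Hypothetical chain graph: block V2 = {a, b}, block V1 = {c, d},
# arrows a -> c and b -> d, undirected edge c - d.
V = {"a", "b", "c", "d"}
E = {("a", "c"), ("b", "d"), ("c", "d"), ("d", "c")}

moral = moralize(V, E)
print(sorted(sorted(edge) for edge in moral))
```

In the example, a and b have no common child, but their children c and d belong to the same chain component, so rule 3 adds the edge between a and b.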
The chain graphs encoding these assumptions are usually called independence graphs, since the independence assumptions of the model may be read directly off the graph, or Markov graphs, because a number of Markov properties of the statistical model may be uncovered by a graph theoretical analysis of the graph (Lauritzen, 1996).

Truncated chain graph models are defined in the same way as truncated chain graphs, that is, as the distributions

P(V_s,V_{s+1},..,V_t) = ∏_{i=s..t-1} P(V_i | V_{i+1},..,V_t) · P(V_t).

A truncated chain graph model is a chain graph model in its own right with a Markov graph equal to the truncated Markov graph, Trunc(G,s).

One particularly convenient feature of graphical models is that results concerning conditional independence in marginal models may be read directly off the graphs, due to the global Markov properties (Lauritzen, 1996, page 55).

The global Markov property of chain graph models (Lauritzen, 1996, page 55): Let A, B and S be three disjoint subsets of nodes of a Markov graph, G, of a chain graph model, let W = An(A ∪ B ∪ S) be the smallest ancestral set containing them, and let H = G_W be the corresponding subgraph. If S separates A from B in the moral graph, H^M, of H, then the variables in A are conditionally independent of the variables in B given the variables in S.

The following proposition is a direct consequence of the global Markov property.

Proposition 1: Let G be the Markov graph of a chain graph model and let A, B and S be three disjoint subsets of nodes of G. If S separates A from B in G^M, then A ⊥⊥ B | S.

Proposition 1 follows from the fact that all paths from nodes in A to nodes in B in the graph H^M appearing in the statement of the global Markov property will also appear as paths in G^M. If S separates A from B in G^M, it will therefore also separate A from B in H^M. Proposition 1 also applies to truncated models where A ∪ B ∪ S ⊆ ∪_{i=s..t} V_i, with Markov subgraphs given by Trunc(G,s).

The chain graph, Figure 1a, defines a chain graph model with three recursive blocks while Figure 1c defines a truncated model.
Both models have moralized graphs with separation properties corresponding to global Markov properties. According to the definition of the model, D and F are conditionally independent given C, E and G. The moral graph in Figure 1b shows that it cannot be assumed that D and F are conditionally independent given all remaining variables in the model, {A,B,C,E,G}, because {A,B,C,E,G} does not separate D and F in Figure 1b. The moralized truncated graph tells us, however, that D ⊥⊥ F | C,G because {C,G} separates D and F in Figure 1d.

GMP hypotheses
We refer to the set of conditional independences derived from the complete and truncated models by Proposition 1 as the set of GMP hypotheses. GMP hypotheses are useful during analysis by graphical models because all of them should be acceptable if the model is correctly specified. They will also be useful for the item screening procedures discussed in this paper.

Assume that S_1 separates A and B in the moral graph and that S_2 satisfies S_1 ⊆ S_2, A ∩ S_2 = ∅ and B ∩ S_2 = ∅. From this it follows that S_2 also separates A from B, such that A ⊥⊥ B | S_2. This suggests the following definitions:

Definition 1: A ⊥⊥ B | S is a minimal GMP hypothesis if there is no proper subset, T ⊂ S, that separates A and B in either the moralized chain graph or in one of the moralized truncated graphs.

Definition 2: A ⊥⊥ B | S is a smallest possible GMP hypothesis if there are no other minimal GMP hypotheses, A ⊥⊥ B | T, where the number of variables in T is smaller than the number of variables in S.

Testing all GMP hypotheses in order to check the adequacy of a high-dimensional graphical model is not practical, but tests of the smallest possible and minimal GMP hypotheses will in most cases be enough to disclose that the fit of the model is inadequate.
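Definitions 1 and 2 can be illustrated for a single pair of nodes by a brute-force Python sketch. It assumes that the moralized chain graph and the moralized truncated graphs are already available as pairs (nodes, edges), with edges stored as two-element frozensets. The function names and the example graphs are hypothetical, and the exhaustive search over subsets is only feasible for small graphs.

```python
from itertools import combinations

def separated(edges, a, b, S):
    """True if every path from a to b in the undirected graph contains a node in S."""
    seen, stack = {a}, [a]
    while stack:
        x = stack.pop()
        for e in edges:
            if x in e:
                (y,) = e - {x}
                if y not in S and y not in seen:
                    seen.add(y)
                    stack.append(y)
    return b not in seen

def minimal_gmp_sets(moral_graphs, a, b):
    """Conditioning sets S of the minimal GMP hypotheses for a and b (Definition 1)."""
    candidates = set()
    for nodes, edges in moral_graphs:
        if a not in nodes or b not in nodes:
            continue  # this truncated graph does not contain both nodes
        others = sorted(nodes - {a, b})
        for k in range(len(others) + 1):
            for S in combinations(others, k):
                if separated(edges, a, b, set(S)):
                    candidates.add(frozenset(S))
    # Keep the sets with no proper subset that separates a and b in any graph.
    return {S for S in candidates if not any(T < S for T in candidates)}

def smallest_gmp_sets(moral_graphs, a, b):
    """Conditioning sets of the smallest possible GMP hypotheses (Definition 2)."""
    minimal = minimal_gmp_sets(moral_graphs, a, b)
    if not minimal:
        return set()
    size = min(len(S) for S in minimal)
    return {S for S in minimal if len(S) == size}

def edge_set(*pairs):
    return {frozenset(p) for p in pairs}

# Hypothetical chain graph with blocks V1 = {e} and V2 = {a, b, c, d}:
# undirected edges a - c, c - b, a - d and arrows d -> e, b -> e.
# Moralization joins the parents d and b of e.
full = ({"a", "b", "c", "d", "e"},
        edge_set(("a", "c"), ("c", "b"), ("a", "d"),
                 ("d", "e"), ("b", "e"), ("d", "b")))
# Moral graph of the truncated chain graph Trunc(G,2).
truncated = ({"a", "b", "c", "d"},
             edge_set(("a", "c"), ("c", "b"), ("a", "d")))

print(minimal_gmp_sets([full], "a", "b"))
print(minimal_gmp_sets([full, truncated], "a", "b"))
```

In this hypothetical example the complete moral graph alone requires conditioning on both c and d, whereas the truncated graph shows that conditioning on c is enough, which mirrors the role of the truncated graph in Figure 1d.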