Graphical models and Global Markov properties

Graphs
The arguments in this paper depend on results on statistical chain graph models. For this reason we
briefly review a few concepts, definitions and results from mathematical graph theory and
from the theory of graphical models. In this paper, the most important concepts are the notions of
separation in graphs and moralization of graphs. The review is therefore restricted to the concepts and
notation needed to define these two notions. For a more comprehensive introduction to the
graph-theoretical results of importance for graphical models we refer to Lauritzen (1996).
Mathematical graphs
A mathematical graph is a pair G = (V,E), where V is a set of nodes and E ⊆ V×V represents links
between nodes. It is customary to display the graph as a visual diagram with dots representing
nodes and edges or arrows representing links between nodes. Two nodes, a and b, are joined by an
undirected edge in the visual graph if (a,b) ∈ E and (b,a) ∈ E, or by an arrow pointing from a to b if
(a,b) ∈ E and (b,a) ∉ E. E therefore consists of two subsets, E = U ∪ D, where U is the set of
undirected edges, U = {(a,b) ∈ E: (b,a) ∈ E}, and D is the set of directed arrows, D = {(a,b) ∈ E: (b,a) ∉ E}.
The graph is undirected if D is empty.
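The decomposition of E into undirected edges and arrows can be sketched in a few lines of Python. The representation of E as a set of ordered pairs follows the definition above; the function name `split_edges` and the three-node example are assumptions made for illustration only.

```python
def split_edges(E):
    """Split a link set E (ordered pairs) into undirected edges U and arrows D."""
    # (a, b) is undirected when the reverse pair is also a link.
    U = {(a, b) for (a, b) in E if (b, a) in E}
    # (a, b) is an arrow from a to b when the reverse pair is missing.
    D = {(a, b) for (a, b) in E if (b, a) not in E}
    return U, D


# Hypothetical example: an undirected edge a - b and an arrow b -> c.
E = {("a", "b"), ("b", "a"), ("b", "c")}
U, D = split_edges(E)
is_undirected = not D  # the graph is undirected only if D is empty
```

In this representation an undirected edge is stored twice, once in each direction, which is what makes the test for membership in U symmetric.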
Subgraphs
The subset W ⊆ V defines a subgraph G_W = (W,F), where F = E ∩ (W×W).
Paths
A path connecting two nodes a_0 and a_k in a graph is a sequence of distinct nodes, a_0, a_1, …, a_{k-1}, a_k, such
that (a_{i-1}, a_i) ∈ E for all i = 1, …, k.
Connections and connectivity components
Two nodes, a and b, are said to connect in the graph if there are paths from a to b and from b to a.
The two paths connecting a and b may be one and the same undirected path or two distinct paths.
Connection partitions the nodes into a number of disjoint subsets of connected nodes referred to as connectivity
components, V = C_1 ∪ … ∪ C_c.
Graph theory defines a number of different types of subsets of variables related to a specific
node. Some of these definitions are shown in Table 1.
Table 1. Subsets related to a specific node, a ∈ V, of a graph.
Notation   Definition             Description
pa(a)      {b ∈ V: (b,a) ∈ D}     The parents of the node
ne(a)      {b ∈ V: (a,b) ∈ U}     The neighbors of the node
bd(a)      pa(a) ∪ ne(a)          The boundary of the node
The definitions in Table 1 generalize to subsets of nodes. If A is a set of nodes, then
pa(A) = ∪_{a∈A} pa(a), ne(A) = ∪_{a∈A} ne(a) \ A, bd(A) = pa(A) ∪ ne(A).
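As a minimal sketch, the three set-valued definitions can be written directly against a link set E of ordered pairs. The function names `parents`, `neighbors` and `boundary` and the example graph are hypothetical; the set operations mirror the formulas for pa(A), ne(A) and bd(A) given above.

```python
def parents(E, A):
    """pa(A): nodes b with an arrow b -> a for some a in A."""
    A = set(A)
    return {b for (b, a) in E if a in A and (a, b) not in E}


def neighbors(E, A):
    """ne(A): nodes outside A joined to some a in A by an undirected edge."""
    A = set(A)
    return {b for (a, b) in E if a in A and (b, a) in E} - A


def boundary(E, A):
    """bd(A) = pa(A) united with ne(A)."""
    return parents(E, A) | neighbors(E, A)


# Hypothetical graph: arrow x -> a, undirected edges a - b and b - c.
E = {("x", "a"), ("a", "b"), ("b", "a"), ("b", "c"), ("c", "b")}
pa_a = parents(E, {"a"})
ne_ab = neighbors(E, {"a", "b"})
bd_ab = boundary(E, {"a", "b"})
```

Passing a one-element set recovers the single-node definitions of Table 1.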
Ancestral sets
A subset, A ⊆ V, is ancestral if bd(a) ⊆ A for all a ∈ A. For any A ⊆ V there will always be a smallest
ancestral set, An(A), containing A.
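One way to compute An(A) is to keep adding boundary nodes until nothing new appears. The sketch below assumes links are stored as ordered pairs, so that bd(a) is simply the set of nodes b with (b,a) in E (parents contribute one such pair, neighbors contribute a pair in each direction). The function name and the example graph are illustrative assumptions.

```python
def ancestral_set(E, A):
    """Smallest ancestral set containing A, found by closing A under bd."""
    An = set(A)
    stack = list(An)
    while stack:
        a = stack.pop()
        for (b, c) in E:
            # b is a parent or a neighbor of a exactly when (b, a) is a link.
            if c == a and b not in An:
                An.add(b)
                stack.append(b)
    return An


# Hypothetical graph: arrow c -> b, undirected edge a - b, arrow d -> e.
E = {("c", "b"), ("a", "b"), ("b", "a"), ("d", "e")}
An_b = ancestral_set(E, {"b"})
```

Here An({b}) collects the neighbor a and the parent c, while d and e are left out because no path leads from them into b.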
Separation
Consider three disjoint subsets, A ⊆ V, B ⊆ V and S ⊆ V. We say that S separates A from B in the
graph if all paths from nodes in A to nodes in B, and vice versa, contain at least one node in S. The
subgraph G_{V\S} consequently has no paths linking nodes in A to nodes in B.
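Separation can be checked by searching the subgraph on V\S for a path between A and B. The following sketch assumes links are stored as ordered pairs and searches in both directions, as the definition requires; the names `reachable` and `separates` and the example are hypothetical.

```python
from collections import deque


def reachable(E, sources, blocked):
    """Nodes reachable from `sources` along links of E without entering `blocked`."""
    seen = set(sources) - set(blocked)
    queue = deque(seen)
    while queue:
        a = queue.popleft()
        for (x, y) in E:
            if x == a and y not in seen and y not in blocked:
                seen.add(y)
                queue.append(y)
    return seen


def separates(E, A, B, S):
    """True if every path from A to B, and from B to A, passes through S."""
    A, B, S = set(A), set(B), set(S)
    return not (reachable(E, A, S) & B) and not (reachable(E, B, S) & A)


# Hypothetical undirected path a - s - b.
E = {("a", "s"), ("s", "a"), ("s", "b"), ("b", "s")}
with_s = separates(E, {"a"}, {"b"}, {"s"})
without_s = separates(E, {"a"}, {"b"}, set())
```

For an undirected graph the two searches give the same answer, so one of them could be dropped.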
Chain graphs
Chain graphs are graphs where the set of nodes is partitioned into a number of disjoint subsets (blocks), V =
V_1 ∪ … ∪ V_t, such that links between nodes in the same block are undirected edges, while links between nodes from
different blocks are arrows pointing from blocks with higher numbers to blocks with
lower numbers.
Truncated chain graphs
The number of subsets defining the chain graph is referred to as the length of the chain. For
a chain graph, G, of length t, we define a number of truncated chain graphs, Trunc(G,s) = G_W,
where W = V_s ∪ … ∪ V_t. Note that the node set, W, of a truncated chain graph is ancestral in G.
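Truncation is a subgraph operation on the blocks numbered s and above. The sketch below assumes the blocks are supplied as a Python list with V_1 first; the function names and the three-block example are hypothetical.

```python
def subgraph(E, W):
    """Links of E with both end points in W, that is E restricted to W x W."""
    W = set(W)
    return {(a, b) for (a, b) in E if a in W and b in W}


def truncate(blocks, E, s):
    """Trunc(G, s): keep blocks V_s, ..., V_t (blocks[0] is V_1)."""
    W = set().union(*blocks[s - 1:])
    return W, subgraph(E, W)


# Hypothetical chain graph with blocks V1 = {y}, V2 = {m1, m2}, V3 = {x}.
# Arrows point from higher to lower blocks: x -> m1 and m2 -> y; m1 - m2 is undirected.
blocks = [{"y"}, {"m1", "m2"}, {"x"}]
E = {("x", "m1"), ("m1", "m2"), ("m2", "m1"), ("m2", "y")}
W, F = truncate(blocks, E, 2)
```

Dropping block V_1 removes y together with the arrow pointing into it, and leaves the remaining links untouched.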
Chain components
The connectivity components of chain graphs are called chain components. The nodes of a chain
component always belong to the same block, but a block may contain more than one chain
component. A subgraph defined by a subset of nodes from a chain component is always
undirected.
Moralized graphs
The moral graph, G^M, of a chain graph, G = (V,E), is an undirected graph with the same nodes as G
where edges are included according to the following moralizing rules:
Two nodes, a and b, are joined by an undirected edge in G^M,
1) if (a,b) ∈ E and/or (b,a) ∈ E,
2) if they have a common child, c, that is, a ∈ pa(c) and b ∈ pa(c),
3) if they have children in the same chain component.
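The moralizing rules can be sketched as follows. The code reads rules 2 and 3 together as "join all nodes that are parents of nodes in the same chain component", which is an interpretation rather than a quotation of the rules; chain components are taken to be the connected components of the undirected part of the graph. The function name, the pair-based representation and the example are assumptions.

```python
from itertools import combinations


def moralize(V, E):
    """Moral graph of a chain graph, returned as a symmetric set of ordered pairs."""
    U = {(a, b) for (a, b) in E if (b, a) in E}
    # Rule 1: every existing link becomes an undirected edge.
    M = set(E) | {(b, a) for (a, b) in E}
    remaining = set(V)
    while remaining:
        # Grow one chain component along undirected edges.
        start = remaining.pop()
        comp, stack = {start}, [start]
        while stack:
            a = stack.pop()
            for (x, y) in U:
                if x == a and y not in comp:
                    comp.add(y)
                    stack.append(y)
        remaining -= comp
        # Rules 2 and 3: join all parents of nodes in this component.
        pa = {b for (b, a) in E if a in comp and (a, b) not in E}
        for a, b in combinations(pa, 2):
            M |= {(a, b), (b, a)}
    return M


# Hypothetical chain graph: arrows a -> c and b -> d, undirected edge c - d.
V = {"a", "b", "c", "d"}
E = {("a", "c"), ("b", "d"), ("c", "d"), ("d", "c")}
M = moralize(V, E)
```

In this example a and b have no common child, but their children c and d lie in the same chain component, so moralization adds the edge between a and b.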
Separation in moralized graphs is very important for the theory of statistical chain graph models. It
is illustrated in Figure 1 presenting moralized versions of a complete and a truncated chain graph. D
and E are separated by {C,F,G} in the moralized graph, Figure 1b, and by {C,G} in the moralized
truncated graph, Figure 1d.
Figure 1. Chain graphs: (a) chain graph, (b) moralized chain graph, (c) truncated chain graph, (d) moralized truncated chain graph. Dotted lines have been added during moralization.
Chain graph models
A chain graph model (CGM) is a multivariate statistical model defined by two sets of assumptions.
The first partitions the variables into a set of recursive blocks, V_1,…,V_t, that are usually assumed to
correspond to temporal and/or causal structure. The recursive structure defines a statistical model by
rewriting the joint distribution of all variables as the product of conditional probabilities or densities
of the variables in specific blocks given all variables in prior blocks, P(V) = ∏_{i=1,…,t-1} P(V_i | V_{i+1},…,V_t) P(V_t). The second
set of assumptions defines the model by assuming that pairs of variables are conditionally independent
given all legitimate variables, where the set of legitimate variables consists of all concurrent or
prior variables. The assumptions of the models are encoded in chain graphs where variables are
represented by nodes partitioned into subsets corresponding to the recursive blocks. If the
assumptions state that two variables are conditionally independent given all legitimate variables, there
will be no edge or arrow linking the corresponding nodes of the graph. The graphs are usually
called independence graphs, since the independence assumptions of the model may be read directly
off the graph, or Markov graphs, because a number of Markov properties of the statistical model may
be uncovered by a graph-theoretical analysis of the graph (Lauritzen, 1996).
Truncated chain graph models are defined in the same way as truncated chain graphs, that is,
as the distributions P(V_s,V_{s+1},…,V_t) = ∏_{i=s,…,t-1} P(V_i | V_{i+1},…,V_t) P(V_t). A truncated chain graph model is
a chain graph model in its own right with a Markov graph which is equal to the truncated Markov
graph, Trunc(G,s).
One particularly convenient feature of graphical models is that results concerning conditional
independence in marginal models may be read directly off the graphs, due to the global Markov
properties (Lauritzen, 1996, page 55).
The global Markov property of chain graph models (Lauritzen, 1996, page 55): Let A, B and S be
three disjoint subsets of nodes of the Markov graph, G, of a chain graph model, let W = An(A ∪ B ∪ S)
be the smallest ancestral set containing them, and let H = G_W be the corresponding subgraph. If S
separates A from B in the moral graph, H^M, of H, then the variables in A are conditionally
independent of the variables in B given the variables in S.
The following proposition is a direct consequence of the global Markov property.
Proposition 1: Let G be the Markov graph of a chain graph model and let A, B and S be three
disjoint subsets of nodes of G. If S separates A from B in G^M, then A ⊥ B | S.
Proposition 1 follows from the fact that all paths from nodes in A to nodes in B in H^M, the moral graph appearing in
the statement of the global Markov property, will also appear as paths in G^M. If S separates A from
B in G^M, it will therefore also separate A from B in H^M.
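The difference between the two separation checks can be illustrated with a small sketch. It assumes links are stored as ordered pairs and that moralization joins all parents of nodes in the same chain component; all function names and the three-node example are hypothetical. The example shows a case where the global Markov property yields an independence that Proposition 1 alone does not, which is consistent with Proposition 1 being a sufficient condition only.

```python
from itertools import combinations


def ancestral_set(E, A):
    """Smallest ancestral set An(A): close A under bd(a) = {b : (b, a) in E}."""
    An, stack = set(A), list(A)
    while stack:
        a = stack.pop()
        for (b, c) in E:
            if c == a and b not in An:
                An.add(b)
                stack.append(b)
    return An


def subgraph(E, W):
    return {(a, b) for (a, b) in E if a in W and b in W}


def moralize(V, E):
    """Moral graph as a symmetric set of ordered pairs."""
    U = {(a, b) for (a, b) in E if (b, a) in E}
    M = set(E) | {(b, a) for (a, b) in E}
    remaining = set(V)
    while remaining:
        start = remaining.pop()
        comp, stack = {start}, [start]
        while stack:  # grow one chain component along undirected edges
            a = stack.pop()
            for (x, y) in U:
                if x == a and y not in comp:
                    comp.add(y)
                    stack.append(y)
        remaining -= comp
        pa = {b for (b, a) in E if a in comp and (a, b) not in E}
        for a, b in combinations(pa, 2):  # join the parents of the component
            M |= {(a, b), (b, a)}
    return M


def separated(M, A, B, S):
    """True if S separates A from B in the undirected graph M."""
    seen = set(A) - set(S)
    stack = list(seen)
    while stack:
        a = stack.pop()
        for (x, y) in M:
            if x == a and y not in seen and y not in S:
                seen.add(y)
                stack.append(y)
    return not (seen & set(B))


def global_markov(E, A, B, S):
    """Separation of A and B by S in the moral graph of H = G_W, W = An(A, B, S)."""
    W = ancestral_set(E, set(A) | set(B) | set(S))
    return separated(moralize(W, subgraph(E, W)), A, B, S)


# Hypothetical chain graph with blocks V1 = {c}, V2 = {a, b}: arrows a -> c and b -> c.
V = {"a", "b", "c"}
E = {("a", "c"), ("b", "c")}
in_GM = separated(moralize(V, E), {"a"}, {"b"}, set())  # Proposition 1 check
in_HM = global_markov(E, {"a"}, {"b"}, set())            # global Markov check
```

With S empty, a and b are joined in G^M because they share the child c, so Proposition 1 gives no conclusion, whereas the ancestral set An({a,b}) excludes c and H^M has no edge between a and b.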
Proposition 1 also applies to truncated models where A ∪ B ∪ S ⊆ ∪_{i=s,…,t} V_i, with Markov
subgraphs given by Trunc(G,s). The chain graph, Figure 1a, defines a chain graph model with three
recursive blocks while Figure 1c defines a truncated model. Both models have moralized graphs
with separation properties corresponding to global Markov properties. According to the definition
of the model, D and F are conditionally independent given C, E and G. The moral graph in Figure
1b shows that it cannot be assumed that D and F are conditionally independent given all remaining
variables in the model {A,B,C,E,G} because {A,B,C,E,G} does not separate D and F in Figure 1b.
The moralized truncated graph tells us, however, that D ⊥ F | C,G because {C,G} separates D and F
in Figure 1d.
GMP hypotheses
We refer to the set of conditional independences derived from the complete and truncated models
by Proposition 1 as the set of GMP hypotheses. GMP hypotheses are useful during analysis by
graphical models, because they all should be acceptable if the model is correctly specified and they
will also be useful for the item screening procedures discussed in this paper.
Assume that S_1 separates A and B in the moral graph and that S_2 satisfies S_1 ⊆ S_2, A ∩ S_2 = ∅
and B ∩ S_2 = ∅. From this it follows that S_2 also separates A from B such that A ⊥ B | S_2. This
suggests the following definitions:
Definition 1: A ⊥ B | S is a minimal GMP hypothesis if there are no smaller subsets, T ⊂ S, that
separate A and B in either the moralized chain graph or in one of the moralized truncated graphs.
Definition 2: A ⊥ B | S is a smallest possible GMP hypothesis if there are no other minimal GMP
hypotheses, A ⊥ B | T, where the number of variables in T is smaller than the number of variables in
S.
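For a single pair of nodes and a single moral graph, the two definitions can be illustrated by brute-force enumeration. The sketch below is an assumption-laden illustration, not the screening procedure of the paper: it takes one moral graph as a symmetric set of ordered pairs, ignores the truncated graphs, and tries every candidate separating set, which is only feasible for small graphs. All names and the example graph are hypothetical.

```python
from itertools import combinations


def separated(M, a, b, S):
    """True if S separates node a from node b in the undirected graph M."""
    seen, stack = {a}, [a]
    while stack:
        x = stack.pop()
        for (u, v) in M:
            if u == x and v not in seen and v not in S:
                seen.add(v)
                stack.append(v)
    return b not in seen


def minimal_separators(V, M, a, b):
    """All separating sets S for a and b that have no separating proper subset."""
    others = sorted(set(V) - {a, b})
    found = []
    # Candidates are visited by increasing size, so any separating proper
    # subset of S would already have contributed a minimal set to `found`.
    for k in range(len(others) + 1):
        for cand in combinations(others, k):
            S = frozenset(cand)
            if any(T < S for T in found):
                continue
            if separated(M, a, b, S):
                found.append(S)
    return found


def smallest_separators(minimal):
    """Minimal separating sets with the fewest nodes."""
    if not minimal:
        return []
    size = min(len(S) for S in minimal)
    return [S for S in minimal if len(S) == size]


# Hypothetical moral graph with edges a - c, a - d, c - f, d - f, f - b.
V = {"a", "b", "c", "d", "f"}
edges = [("a", "c"), ("a", "d"), ("c", "f"), ("d", "f"), ("f", "b")]
M = {(x, y) for (x, y) in edges} | {(y, x) for (x, y) in edges}
minimal = minimal_separators(V, M, "a", "b")
smallest = smallest_separators(minimal)
```

In this example both {f} and {c,d} are minimal separating sets for a and b, but only {f} corresponds to a smallest possible hypothesis.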
Testing all GMP hypotheses in order to check the adequacy of a high-dimensional graphical model
is not practical, but tests of the smallest possible and minimal GMP hypotheses will in most cases
be enough to disclose that the fit of the model is inadequate.