Graphical models for symmetrical relationships

advertisement
Graphical models
Svend Kreiner
Dept. of Biostatistics, University of Copenhagen
1
Graphical models for symmetrical relationships
Definition 1.
A graphical model is defined by a set of assumptions concerning
conditional independence by pairs of variables given the remaining – the rest of the – variables of the model.
The properties of graphical models are encapsulated in a
mathematical graph, a set of nodes and edges between nodes, as
shown in Figure 1.
Independence graph defined by the following five assumptions:
A╨C |B,D,E,F
A╨F |B,C,D,E
B╨D |A,C,E,F
D╨F |A,B,C,E
E╨F |A,B,C,D
2
Independence graphs are second
order mathematical models.
A model is an intentionally simplified
representation of some kind of system.
Certain properties of models correspond
to properties of the system.
Certain properties of the system is not represented in the model.
Certain properties of the model do not represent anything
A statistical model is a mathematical (probabilistic)
model of reality.
An independence graph is a mathematical
model of a probabilistic model.
Certain graph theoretical properties of independence graphs
correspond to properties of the statistical model.
Certain properties of statistical models have no
counterpart in independence graphs
Certain properties of independence graphs can not
be interpreted as properties of statistical models.
3
Graphical models for discrete data are loglinear.
Generators are defined by the cliques of the independence graph.
Cliques/generators: ABE, ADE, CDE, BCEF.
ln( P( A, B ,C , D, E , F ))   abe   ade   cde   bcef
4
Models collapse onto graphical marginal models.
Collaps over D: P(A,B,C,E,F)
The conditional distribution of a subset, of the variables given the
remaining variables will be a graphical model.
P(A,B,C,E,F|D)
5
The separation theorem: Separation implies conditional
independence.
Paths without detours from D to F:
D-E-F
D-C-F
D-A-B-F
D and F are separated by
A, E, and C

D ╨ F | A,E,C
B, C, and E

D ╨ F | B,E,C
or
6
Separation implies parametric collapsibility
in loglinear models.
All indirect paths from A to D goes through (B,E) and (C,E).
Parameters relating to A and D is the same in P(A,B,C,D,E,F) and
in the marginal models P(A,D,B,E) and P(A,D,C,E)
Marginal models sometimes have a simpler parametric
structure than implied by the marginal graphical model.
ln( P( A, B ,C , D, E , F ))   abe   ade   cde   bcef
ln( P( A, B, D, E ))   abe   ade   cde  *bce
7
Decomposition by separation of complete subsets leads to
decompositions of statistical models implying collapsibility in
terms of likelihood inference for certain types of models.
The BCE clique separate A and D from F

(A,D) ╨ F | (B,C,E)

P( A, B ,C , D , E , F )
 P( A, D| E , B ,C ) P( F|E , B ,C ) P( B ,C , E )
P( A, B ,C , D , E ) P( B ,C , E , F )

P( B ,C , E )
8
The topography of marginal models.
Marginal model: P(A,B,C,E)
exterior – all variables and edges not in the marginal model: The
exterior of P(A,B,C,E) is equal to variables D and F and edges
AD, ED, CD, BF, EF, and CF
boundary - all variables connected to a disconnected subset of
variables in the exterior. Two boundaries: AEC and BEC.
Boundaries are always complete and fixed in a marginal model.
border - the set of all boundaries
interior – variables and edges of a marginal model that are not
included in any boundary. The interior of P(A,B,C,E) is the ABedge.
problem core – the smallest irreducible component onto which
the model may be collapsed for an analysis of a given problem.
P(A,B,C,E) is not the core of the problem of estimating the AB
association.
9
Graphical regression models
Dependent variables:
Independent variables:
Y = (Y1,...,Yr),
X = (X1,...,Xs).
Definition 2.
A graphical regression model is a graphical model of P(Y|X).
Conditional independence assumptions are restricted to
Yi ╨ Yj | Y1,..,Yi-1,Yi+1,..,Yj-1,Yj+1, .,Yr,X1,..,Xs
Yi ╨ Xj | Y1,..,Yi-1,Yi+1,..,Yr,X1,..,Xj-1,Xj+1,..,Xs
Independence graph for a graphical regression model of
P(a,c,h|b,d,e,f,g).
10
Chain graph models
Recursive models:
Variables V = (V1,…,Vk)
Disjoint subsets of variables :(U1,…,Ur) such that V  U i
i
r 1
P(V )   P(U i | U i 1 ,..,U r )  P (U r )
1
Definition 2
A chain graph model is a block recursive model where P(Ur) is a
graphical model and each of the components P(Ui | Ui+1,..,Ur) are
graphical regression models.
Directed independence graph of a chain graph model
11
Regression graphs implied by a chain graph model
12
Download