Discrete Multivariate Analysis (Part 1)

Discrete Multivariate Analysis
Analysis of Multivariate
Categorical Data
Example 1
In this study we examine n = 1237 individuals, measuring X, Systolic Blood Pressure, and Y, Serum Cholesterol.

Data Set #1 - A two-way frequency table

Serum                   Systolic Blood Pressure
Cholesterol     <127   127-146   147-166   167+   Total
<200             117       121        47     22     307
200-219           85        98        43     20     246
220-259          119       209        68     43     439
260+              67        99        46     33     245
Total            388       527       204    118    1237
Example 2
The following data were taken from a study of parole success involving 5587 parolees in Ohio between 1965 and 1972 (a ten percent sample of all parolees during this period).
The study involved a dichotomous response Y, based on a one-year follow-up:
– Success (no major parole violation) or
– Failure (returned to prison either as a technical violator or with a new conviction).
The predictors of parole success included are:
1. Type of committed offense (Person offense or Other offense),
2. Age (25 or older or Under 25),
3. Prior record (No prior sentence or Prior sentence), and
4. Drug or alcohol dependency (No drug or alcohol dependency or Drug and/or alcohol dependency).
• The data were randomly split into two parts. The counts for each part are displayed in the table, with those for the second part in parentheses.
• The second part of the data was set aside for a validation study of the model to be fitted to the first part.
Table (counts for the second part of the data in parentheses)

                  No drug or alcohol dependency                  Drug and/or alcohol dependency
                  25 or older            Under 25                25 or older            Under 25
                  Person     Other       Person     Other        Person     Other       Person     Other
                  offense    offense     offense    offense      offense    offense     offense    offense
No prior Sentence of Any Kind
  Success         48 (44)    34 (34)     37 (29)    49 (58)      48 (47)    28 (38)     35 (37)    57 (53)
  Failure          1 (1)      5 (7)       7 (7)     11 (5)        3 (1)      8 (2)       5 (4)     18 (24)
Prior Sentence
  Success        117 (111)  259 (253)   131 (131)  319 (320)    197 (202)  435 (392)   107 (103)  291 (294)
  Failure         23 (27)    61 (55)     20 (25)    89 (93)      38 (46)   194 (215)    27 (34)   101 (102)
Analysis of a Two-way
Frequency Table:
Frequency Distribution
(Serum Cholesterol and Systolic Blood Pressure)

Serum                   Systolic Blood Pressure
Cholesterol     <127   127-146   147-166   167+   Total
<200             117       121        47     22     307
200-219           85        98        43     20     246
220-259          119       209        68     43     439
260+              67        99        46     33     245
Total            388       527       204    118    1237
Joint and Marginal Distributions
(Serum Cholesterol and Systolic Blood Pressure), in percent of the grand total

Serum                     Systolic Blood Pressure              Marginal distn
Cholesterol       <127   127-146   147-166   167+              (Serum Chol.)
<200              9.46      9.78      3.80   1.78                  24.82
200-219           6.87      7.92      3.48   1.62                  19.89
220-259           9.62     16.90      5.50   3.48                  35.49
260+              5.42      8.00      3.72   2.67                  19.81
Marginal
distn (BP)       31.37     42.60     16.49   9.54                 100.00

The marginal distributions allow you to look at the effect of one variable, ignoring the other.
The joint distribution allows you to look at the two variables simultaneously.
Conditional Distributions
(Systolic Blood Pressure given Serum Cholesterol), in percent of each row total

Serum                     Systolic Blood Pressure
Cholesterol       <127   127-146   147-166   167+     Total
<200             38.11     39.41     15.31    7.17   100.00
200-219          34.55     39.84     17.48    8.13   100.00
220-259          27.11     47.61     15.49    9.79   100.00
260+             27.35     40.41     18.78   13.47   100.00
Marginal
distn (BP)       31.37     42.60     16.49    9.54   100.00

The conditional distribution allows you to look at the effect of one variable when the other variable is held fixed or known.
Conditional Distributions
(Serum Cholesterol given Systolic Blood Pressure), in percent of each column total

Serum                     Systolic Blood Pressure              Marginal distn
Cholesterol       <127   127-146   147-166    167+             (Serum Chol.)
<200             30.15     22.96     23.04   18.64                 24.82
200-219          21.91     18.60     21.08   16.95                 19.89
220-259          30.67     39.66     33.33   36.44                 35.49
260+             17.27     18.79     22.55   27.97                 19.81
Total           100.00    100.00    100.00  100.00                100.00
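The percentages in the joint, marginal and conditional tables can be reproduced directly from the counts of Data Set #1. The following is a minimal sketch in plain Python; the variable names are illustrative and not part of the original slides.

```python
# Data Set #1: rows = serum cholesterol group, columns = systolic blood pressure group.
counts = [
    [117, 121, 47, 22],   # <200
    [85, 98, 43, 20],     # 200-219
    [119, 209, 68, 43],   # 220-259
    [67, 99, 46, 33],     # 260+
]

n = sum(sum(row) for row in counts)
row_totals = [sum(row) for row in counts]          # cholesterol margins
col_totals = [sum(col) for col in zip(*counts)]    # blood pressure margins

# Joint distribution: each cell as a percentage of the grand total.
joint = [[100 * x / n for x in row] for row in counts]

# Marginal distributions: each margin as a percentage of the grand total.
marginal_chol = [100 * t / n for t in row_totals]
marginal_bp = [100 * t / n for t in col_totals]

# Conditional distribution of BP given cholesterol: each row sums to 100.
bp_given_chol = [[100 * x / t for x in row] for row, t in zip(counts, row_totals)]

# Conditional distribution of cholesterol given BP: each column sums to 100.
chol_given_bp = [[100 * x / t for x, t in zip(row, col_totals)] for row in counts]

print(round(joint[0][0], 2), round(bp_given_chol[0][0], 2), round(chol_given_bp[0][0], 2))
```

For the cell (cholesterol <200, blood pressure <127) this gives 9.46, 38.11 and 30.15, matching the three tables.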
[GRAPH: Conditional distributions of Systolic Blood Pressure given Serum Cholesterol. Horizontal axis: systolic blood pressure (<127, 127-146, 147-166, 167+); vertical axis: percent (10% to 50%); one series per serum cholesterol group (<200, 200-219, 220-259, 260+) plus the marginal distribution.]
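Notation:
Let $x_{ij}$ denote the frequency (no. of cases) where X (row variable) is i and Y (column variable) is j.
\[ x_{i\cdot} = R_i = \sum_{j=1}^{c} x_{ij} \]
\[ x_{\cdot j} = C_j = \sum_{i=1}^{r} x_{ij} \]
\[ x_{\cdot\cdot} = N = \sum_{i=1}^{r}\sum_{j=1}^{c} x_{ij} = \sum_{i=1}^{r} x_{i\cdot} = \sum_{j=1}^{c} x_{\cdot j} \]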
Different Models
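The Multinomial Model:
Here the total number of cases N is fixed and the $x_{ij}$ follow a multinomial distribution with parameters $\pi_{ij}$:
\[ \pi_{ij} = P[X = i, Y = j] \]
\[ f(x_{11}, x_{12}, \ldots, x_{rc}) = \binom{N}{x_{11}\; x_{12} \cdots x_{rc}} \pi_{11}^{x_{11}} \pi_{12}^{x_{12}} \cdots \pi_{rc}^{x_{rc}} = \frac{N!}{x_{11}!\, x_{12}! \cdots x_{rc}!}\, \pi_{11}^{x_{11}} \pi_{12}^{x_{12}} \cdots \pi_{rc}^{x_{rc}} \]
\[ \mu_{ij} = E(x_{ij}) = N\pi_{ij} \]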
The Product Multinomial Model:
Here the row (or column) totals $R_i$ are fixed and, for a given row i, the $x_{ij}$ follow a multinomial distribution with parameters $\pi_{j|i}$:
\[ f(x_{11}, x_{12}, \ldots, x_{rc}) = \prod_{i=1}^{r} \binom{R_i}{x_{i1}\; x_{i2} \cdots x_{ic}} \pi_{1|i}^{x_{i1}} \pi_{2|i}^{x_{i2}} \cdots \pi_{c|i}^{x_{ic}} \]
\[ \mu_{ij} = E(x_{ij}) = R_i\,\pi_{j|i} \]
The Poisson Model:
In this case we observe over a fixed period of time and all counts in the table (including row, column and overall totals) follow a Poisson distribution. Let $\mu_{ij}$ denote the mean of $x_{ij}$.
\[ \mu_{ij} = E(x_{ij}) \]
\[ f_{ij}(x_{ij}) = \frac{\mu_{ij}^{x_{ij}}}{x_{ij}!}\, e^{-\mu_{ij}} \]
\[ f(x_{11}, x_{12}, \ldots, x_{rc}) = \prod_{i=1}^{r}\prod_{j=1}^{c} \frac{\mu_{ij}^{x_{ij}}}{x_{ij}!}\, e^{-\mu_{ij}} \]
Independence
Multinomial Model
\[ \pi_{ij} = P[X = i, Y = j] = P[X = i]\,P[Y = j] = \pi_{i\cdot}\,\pi_{\cdot j} \quad \text{if independent} \]
and
\[ \mu_{ij} = N\pi_{ij} = N\,\pi_{i\cdot}\,\pi_{\cdot j} \]
The estimated expected frequency in cell (i,j) in the case of independence is:
\[ m_{ij} = \hat{\mu}_{ij} = N\,\hat{\pi}_{i\cdot}\,\hat{\pi}_{\cdot j} = N \left(\frac{x_{i\cdot}}{N}\right)\left(\frac{x_{\cdot j}}{N}\right) = \frac{x_{i\cdot}\,x_{\cdot j}}{N} = \frac{R_i C_j}{N} \]
The same can be shown for the other two models, the Product Multinomial model and the Poisson model; namely, the estimated expected frequency in cell (i,j) in the case of independence is:
\[ m_{ij} = \hat{\mu}_{ij} = \frac{R_i C_j}{N} = \frac{x_{i\cdot}\,x_{\cdot j}}{x_{\cdot\cdot}} \]
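As an illustration of the formula $m_{ij} = R_i C_j / N$, the sketch below computes the estimated expected frequencies under independence for Data Set #1 (rows are serum cholesterol groups, columns are blood pressure groups). The variable names are illustrative only.

```python
# Observed counts of Data Set #1 (rows: serum cholesterol, columns: systolic BP).
counts = [
    [117, 121, 47, 22],
    [85, 98, 43, 20],
    [119, 209, 68, 43],
    [67, 99, 46, 33],
]

row_totals = [sum(row) for row in counts]         # R_i
col_totals = [sum(col) for col in zip(*counts)]   # C_j
n = sum(row_totals)                               # N

# m_ij = R_i * C_j / N
expected = [[r_i * c_j / n for c_j in col_totals] for r_i in row_totals]

for row in expected:
    print([round(m, 2) for m in row])
```

The first cell evaluates to 307 × 388 / 1237 ≈ 96.29, and each row and column of the expected table reproduces the observed margins.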
Standardized residuals are defined for each cell:
\[ r_{ij} = \frac{x_{ij} - m_{ij}}{\sqrt{m_{ij}}} \]
The Chi-Square Statistic
\[ \chi^2 = \sum_{i=1}^{r}\sum_{j=1}^{c} r_{ij}^2 = \sum_{i=1}^{r}\sum_{j=1}^{c} \frac{(x_{ij} - m_{ij})^2}{m_{ij}} \]
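The Chi-Square test for independence
Reject $H_0$: independence if
\[ \chi^2 = \sum_{i=1}^{r}\sum_{j=1}^{c} \frac{(x_{ij} - m_{ij})^2}{m_{ij}} \ge \chi^2_{\alpha} \]
where $\chi^2_{\alpha}$ is the upper-$\alpha$ critical value of the chi-square distribution with
\[ df = (r - 1)(c - 1) \]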
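The standardized residuals and the chi-square statistic can be computed in a few lines. This sketch uses Data Set #1 and assumes, for illustration only, a 5% significance level; the critical value 16.919 is the tabulated upper 5% point of the chi-square distribution with 9 degrees of freedom.

```python
import math

# Observed counts of Data Set #1 (rows: serum cholesterol, columns: systolic BP).
counts = [
    [117, 121, 47, 22],
    [85, 98, 43, 20],
    [119, 209, 68, 43],
    [67, 99, 46, 33],
]

row_totals = [sum(row) for row in counts]
col_totals = [sum(col) for col in zip(*counts)]
n = sum(row_totals)

# Expected frequencies under independence: m_ij = R_i * C_j / N.
expected = [[r_i * c_j / n for c_j in col_totals] for r_i in row_totals]

# Standardized residuals: r_ij = (x_ij - m_ij) / sqrt(m_ij).
residuals = [
    [(x - m) / math.sqrt(m) for x, m in zip(obs_row, exp_row)]
    for obs_row, exp_row in zip(counts, expected)
]

# Chi-square statistic: the sum of squared standardized residuals.
chi_square = sum(r * r for row in residuals for r in row)
df = (len(counts) - 1) * (len(counts[0]) - 1)

CRITICAL_05_DF9 = 16.919  # assumed level alpha = 0.05, taken from standard tables
reject_independence = chi_square > CRITICAL_05_DF9

print(round(chi_square, 2), df, reject_independence)
```

For these data the statistic is about 20.85 on 9 degrees of freedom, so independence is rejected at the assumed 5% level.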
Table
Expected frequencies, (Observed frequencies), Standardized Residuals

Serum                       Systolic Blood Pressure
Cholesterol      <127      127-146    147-166    167+       Total
<200             96.29     130.79     50.63      29.29      307
                 (117)     (121)      (47)       (22)
                 2.11      -0.86      -0.51      -1.35
200-219          77.16     104.80     40.57      23.47      246
                 (85)      (98)       (43)       (20)
                 0.86      -0.66      0.38       -0.72
220-259          137.70    187.03     72.40      41.88      439
                 (119)     (209)      (68)       (43)
                 -1.59     1.61       -0.52      0.17
260+             76.85     104.38     40.40      23.37      245
                 (67)      (99)       (46)       (33)
                 -1.12     -0.53      0.88       1.99
Total            388       527        204        118        1237

χ² = 20.85 (p = 0.0133)
Example
In this example, N = 57,407 cases in which individuals were victimized twice by crimes were studied.
The crime of the first victimization (X) and the crime of the second victimization (Y) were noted.
The data are tabulated below.
Table 1: Frequencies

First                          Second Victimization in Pair
Victimization
in pair         Ra      A     Ro  PP/PS     PL      B     HL     MV   Total
Ra              26     50     11      6     82     39     48     11     273
A               65   2997    238     85   2553   1083   1349    216    8586
Ro              12    279    197     36    459    197    221     47    1448
PP/PS            3    102     40     61    243    115    101     38     703
PL              75   2628    413    229  12137   2658   3689    687   22516
B               52   1117    191    102   2649   3210   1973    301    9595
HL              42   1251    206    117   3757   1962   4646    391   12372
MV               3    221     51     24    678    301    367    269    1914
Total          278   8645   1347    660  22558   9565  12394   1960   57407
Table 2: Standardized residuals

First                          Second Victimization in Pair
Victimization
in pair         Ra      A     Ro  PP/PS     PL      B     HL     MV
Ra            21.5    1.4    1.8    1.6   -2.4   -1.0   -1.9    0.6
A              3.6   47.4    2.6   -1.4  -14.1   -9.2  -11.7   -4.5
Ro             1.9    4.1   28.0    4.7   -4.6   -2.8   -5.2   -0.3
PP/PS         -0.2   -0.4    5.8   18.6   -2.0   -0.2   -4.1    2.9
PL            -3.3  -13.1   -5.0   -1.9   35.0  -17.9  -16.8   -2.9
B              0.8   -8.6   -2.3   -0.8  -18.3   40.3   -2.2   -1.5
HL            -2.3  -14.2   -4.9   -2.1  -15.8   -2.2   38.2   -1.5
MV            -2.1   -4.0    0.9    0.4   -2.7   -1.0   -2.3   25.2

χ² = 11,430 (highly significant)
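Any entry of the residual table can be checked from a single observed count and its row and column totals. The helper below is a hypothetical illustration (the function name is not from the slides); it applies $r_{ij} = (x_{ij} - m_{ij})/\sqrt{m_{ij}}$ with $m_{ij} = R_i C_j / N$ to two diagonal cells of the victimization data.

```python
import math

def standardized_residual(observed, row_total, col_total, n):
    """Residual of one cell under independence: (x - m) / sqrt(m), m = R * C / N."""
    expected = row_total * col_total / n
    return (observed - expected) / math.sqrt(expected)

N = 57407  # total number of victimization pairs

# Cell (first = Ra, second = Ra): observed 26, row total 273, column total 278.
r_ra_ra = standardized_residual(26, 273, 278, N)

# Cell (first = A, second = A): observed 2997, row total 8586, column total 8645.
r_a_a = standardized_residual(2997, 8586, 8645, N)

print(round(r_ra_ra, 1), round(r_a_a, 1))
```

Both values agree with the diagonal entries 21.5 and 47.4: repeat victimization by the same type of crime occurs far more often than independence would predict.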
Table 3: Conditional distribution of second victimization given the first victimization (%)

First                          Second Victimization in Pair
Victimization
in pair         Ra      A     Ro  PP/PS     PL      B     HL     MV   Total
Ra             9.5   18.3    4.0    2.2   30.0   14.3   17.6    4.0   100.0
A              0.8   34.9    2.8    1.0   29.7   12.6   15.7    2.5   100.0
Ro             0.8   19.3   13.6    2.5   31.7   13.6   15.3    3.2   100.0
PP/PS          0.4   14.5    5.7    8.7   34.6   16.4   14.4    5.4   100.0
PL             0.3   11.7    1.8    1.0   53.9   11.8   16.4    3.1   100.0
B              0.5   11.6    2.0    1.1   27.6   33.5   20.6    3.1   100.0
HL             0.3   10.1    1.7    0.9   30.4   15.9   37.6    3.2   100.0
MV             0.2   11.5    2.7    1.3   35.4   15.7   19.2   14.1   100.0
Marginal       0.5   15.1    2.3    1.1   39.3   16.7   21.6    3.4   100.0
Log Linear Model
Recall, if the two variables, rows (X) and columns (Y), are independent then
\[ \mu_{ij} = N\pi_{ij} = N\,\pi_{i\cdot}\,\pi_{\cdot j} \]
and
\[ \ln \mu_{ij} = \ln N + \ln \pi_{i\cdot} + \ln \pi_{\cdot j} \]
In general let
\[ u = \frac{1}{rc}\sum_{i}\sum_{j} \ln \mu_{ij} \]
\[ u_{1(i)} = \frac{1}{c}\sum_{j} \ln \mu_{ij} - u \]
\[ u_{2(j)} = \frac{1}{r}\sum_{i} \ln \mu_{ij} - u \]
\[ u_{12(i,j)} = \ln \mu_{ij} - u - u_{1(i)} - u_{2(j)} \]
then
\[ \ln \mu_{ij} = u + u_{1(i)} + u_{2(j)} + u_{12(i,j)} \qquad (1) \]
where
\[ \sum_{i} u_{1(i)} = \sum_{j} u_{2(j)} = \sum_{i} u_{12(i,j)} = \sum_{j} u_{12(i,j)} = 0 \]
Equation (1) is called the log-linear model for the frequencies $x_{ij}$.
Note: X and Y are independent if
\[ u_{12(i,j)} = 0 \quad \text{for all } i, j \]
In this case the log-linear model becomes
\[ \ln \mu_{ij} = u + u_{1(i)} + u_{2(j)} \]
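The definitions of the u-terms translate directly into code. The sketch below (the function name `u_terms` is illustrative) decomposes the logarithms of a table of means and checks, using means of the form $R_i C_j / N$ built from the margins of Data Set #1, that the interaction terms vanish under independence.

```python
import math

def u_terms(mu):
    """Decompose ln(mu_ij) into u, u1(i), u2(j), u12(i,j) as in equation (1)."""
    r, c = len(mu), len(mu[0])
    log_mu = [[math.log(m) for m in row] for row in mu]
    u = sum(sum(row) for row in log_mu) / (r * c)
    u1 = [sum(row) / c - u for row in log_mu]
    u2 = [sum(log_mu[i][j] for i in range(r)) / r - u for j in range(c)]
    u12 = [[log_mu[i][j] - u - u1[i] - u2[j] for j in range(c)] for i in range(r)]
    return u, u1, u2, u12

# Means satisfying independence, built from the margins of Data Set #1.
row_totals = [307, 246, 439, 245]
col_totals = [388, 527, 204, 118]
n = 1237
mu_indep = [[r_i * c_j / n for c_j in col_totals] for r_i in row_totals]

u, u1, u2, u12 = u_terms(mu_indep)
max_interaction = max(abs(v) for row in u12 for v in row)

print(round(u, 4), [round(v, 4) for v in u1], max_interaction)
```

Up to floating-point rounding, `max_interaction` is zero and the main-effect terms sum to zero, as the constraints require.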
Another formulation
\[ \ln \mu_{ij} = u^{*} + u^{*}_{1(i)} + u^{*}_{2(j)} + u^{*}_{12(i,j)} \]
where
\[ u^{*}_{1(I)} = u^{*}_{2(J)} = u^{*}_{12(i,J)} = u^{*}_{12(I,j)} = 0 \]
Three-way Frequency Tables
With two variables the dependence structure is
simple: the variables are either dependent or
independent.
When there are three or more variables the
dependence structure is much more complicated.
Marginal distributions
Distributions of two variables ignoring the third.
1. X1, X2 ignoring X3
2. X1, X3 ignoring X2
3. X2, X3 ignoring X1
Distributions of one variable ignoring the other
two.
1. X1 ignoring X2, X3
2. X2 ignoring X1, X3
3. X3 ignoring X1, X2
Conditional distributions
Distributions of two variables given the third.
1. X1, X2 given X3
2. X1, X3 given X2
3. X2, X3 given X1
Distributions of one variable given the other two.
1. X1 given X2, X3
2. X2 given X1, X3
3. X3 given X1, X2
Distributions of one variable given either of the
other two.
1. X1 given X2
2. X1 given X3
3. X2 given X1
4. X2 given X3
5. X3 given X1
6. X3 given X2
Example
Data from the Framingham Longitudinal Study of Coronary Heart Disease (Cornfield [1962])
Variables
1. Systolic Blood Pressure (X)
   – <127, 127-146, 147-166, 167+
2. Serum Cholesterol
   – <200, 200-219, 220-259, 260+
3. Heart Disease
   – Present, Absent
The data are tabulated below.
Three-way Frequency Table

Coronary    Serum
Heart       Cholesterol        Systolic Blood Pressure (mm Hg)
Disease     (mg/100 cc)      <127   127-146   147-166   167+
Present     <200                2         3         3      4
            200-219             3         2         0      3
            220-259             8        11         6      6
            260+                7        12        11     11
Absent      <200              117       121        47     22
            200-219            85        98        43     20
            220-259           119       209        68     43
            260+               67        99        46     33
Log-Linear model for three-way tables
Let $\mu_{ijk}$ denote the expected frequency in cell (i,j,k) of the table; then in general
\[ \ln \mu_{ijk} = u + u_{1(i)} + u_{2(j)} + u_{3(k)} + u_{12(i,j)} + u_{13(i,k)} + u_{23(j,k)} + u_{123(i,j,k)} \]
where
\[ 0 = \sum_{i} u_{1(i)} = \sum_{j} u_{2(j)} = \sum_{k} u_{3(k)} = \sum_{i} u_{12(i,j)} = \sum_{j} u_{12(i,j)} \]
\[ = \sum_{i} u_{13(i,k)} = \sum_{k} u_{13(i,k)} = \sum_{j} u_{23(j,k)} = \sum_{k} u_{23(j,k)} \]
\[ = \sum_{i} u_{123(i,j,k)} = \sum_{j} u_{123(i,j,k)} = \sum_{k} u_{123(i,j,k)} \]
Hierarchical Log-linear models
for categorical Data
For three way tables
The hierarchical principle:
If an interaction is in the model, also keep
lower order interactions and main effects
associated with that interaction
1. Model: (All Main effects model)
ln μijk = u + u1(i) + u2(j) + u3(k)
i.e. u12(i,j) = u13(i,k) = u23(j,k) = u123(i,j,k) = 0.
Notation:
[1][2][3]
Description:
Mutual independence between all three
variables.
2. Model:
ln μijk = u + u1(i) + u2(j) + u3(k) + u12(i,j)
i.e. u13(i,k) = u23(j,k) = u123(i,j,k) = 0.
Notation:
[12][3]
Description:
Independence of Variable 3 with variables 1
and 2.
3. Model:
ln μijk = u + u1(i) + u2(j) + u3(k) + u13(i,k)
i.e. u12(i,j) = u23(j,k) = u123(i,j,k) = 0.
Notation:
[13][2]
Description:
Independence of Variable 2 with variables 1
and 3.
4. Model:
ln μijk = u + u1(i) + u2(j) + u3(k) + u23(j,k)
i.e. u12(i,j) = u13(i,k) = u123(i,j,k) = 0.
Notation:
[23][1]
Description:
Independence of Variable 1 with variables 2
and 3.
5. Model:
ln μijk = u + u1(i) + u2(j) + u3(k) + u12(i,j) + u13(i,k)
i.e. u23(j,k) = u123(i,j,k) = 0.
Notation:
[12][13]
Description:
Conditional independence between variables 2
and 3 given variable 1.
6. Model:
ln μijk = u + u1(i) + u2(j) + u3(k) + u12(i,j) + u23(j,k)
i.e. u13(i,k) = u123(i,j,k) = 0.
Notation:
[12][23]
Description:
Conditional independence between variables 1
and 3 given variable 2.
7. Model:
ln μijk = u + u1(i) + u2(j) + u3(k) + u13(i,k) + u23(j,k)
i.e. u12(i,j) = u123(i,j,k) = 0.
Notation:
[13][23]
Description:
Conditional independence between variables 1
and 2 given variable 3.
8. Model:
ln μijk = u + u1(i) + u2(j) + u3(k) + u12(i,j) + u13(i,k) + u23(j,k)
i.e. u123(i,j,k) = 0.
Notation:
[12][13][23]
Description:
Pairwise relations among all three variables,
with each two variable interaction unaffected
by the value of the third variable.
9. Model: (the saturated model)
ln μijk = u + u1(i) + u2(j) + u3(k) + u12(i,j) + u13(i,k) + u23(j,k) + u123(i,j,k)
Notation:
[123]
Description:
No simplifying dependence structure.
Hierarchical Log-linear models for 3 way table

Model            Description
[1][2][3]        Mutual independence between all three variables.
[1][23]          Independence of Variable 1 with variables 2 and 3.
[2][13]          Independence of Variable 2 with variables 1 and 3.
[3][12]          Independence of Variable 3 with variables 1 and 2.
[12][13]         Conditional independence between variables 2 and 3 given variable 1.
[12][23]         Conditional independence between variables 1 and 3 given variable 2.
[13][23]         Conditional independence between variables 1 and 2 given variable 3.
[12][13][23]     Pairwise relations among all three variables, with each two variable
                 interaction unaffected by the value of the third variable.
[123]            The saturated model
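The hierarchical principle can be made concrete by expanding each bracket notation into the full set of u-terms it implies and counting the free parameters. The sketch below is illustrative only: the function names are hypothetical, and the table dimensions (4 × 4 × 2, as in the Framingham example) are an assumption. Each u-term contributes the product of (number of levels − 1) over its variables, and the residual degrees of freedom are the number of cells minus the number of parameters.

```python
from itertools import combinations

# Assumed numbers of categories for variables 1, 2 and 3 (a 4 x 4 x 2 table).
levels = {"1": 4, "2": 4, "3": 2}

def model_terms(generators):
    """All u-terms implied by the hierarchical principle (the empty set is u)."""
    terms = set()
    for generator in generators:
        for size in range(len(generator) + 1):
            terms.update(frozenset(s) for s in combinations(generator, size))
    return terms

def degrees_of_freedom(generators):
    """Number of cells minus number of independent parameters fitted."""
    n_cells = 1
    for k in levels.values():
        n_cells *= k
    n_params = 0
    for term in model_terms(generators):
        size = 1
        for variable in term:
            size *= levels[variable] - 1
        n_params += size
    return n_cells - n_params

models = [
    ["1", "2", "3"], ["1", "23"], ["2", "13"], ["3", "12"],
    ["12", "13"], ["12", "23"], ["13", "23"], ["12", "13", "23"], ["123"],
]
for generators in models:
    label = "".join(f"[{g}]" for g in generators)
    print(label, degrees_of_freedom(generators))
```

With these dimensions the mutual independence model [1][2][3] leaves 24 degrees of freedom and the saturated model [123] leaves none.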
Maximum Likelihood Estimation
Log-Linear Model
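For any model it is possible to determine the maximum likelihood estimators of the parameters.
Example
Two-way table – independence – multinomial model
\[ \mu_{ij} = E(x_{ij}) = N\pi_{ij} \quad \text{or} \quad \pi_{ij} = \frac{\mu_{ij}}{N} \]
\[ f(x_{11}, x_{12}, \ldots, x_{rc}) = \binom{N}{x_{11}\; x_{12} \cdots x_{rc}} \pi_{11}^{x_{11}} \pi_{12}^{x_{12}} \cdots \pi_{rc}^{x_{rc}} = \frac{N!}{x_{11}!\, x_{12}! \cdots x_{rc}!} \left(\frac{\mu_{11}}{N}\right)^{x_{11}} \left(\frac{\mu_{12}}{N}\right)^{x_{12}} \cdots \left(\frac{\mu_{rc}}{N}\right)^{x_{rc}} \]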
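Log-likelihood
\[ l(\mu_{11}, \mu_{12}, \ldots, \mu_{rc}) = \ln N! - \sum_{i}\sum_{j} \ln x_{ij}! - \ln N \sum_{i}\sum_{j} x_{ij} + \sum_{i}\sum_{j} x_{ij} \ln \mu_{ij} = K + \sum_{i}\sum_{j} x_{ij} \ln \mu_{ij} \]
where
\[ K = \ln N! - \sum_{i}\sum_{j} \ln x_{ij}! - N \ln N \]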
With the model of independence
\[ \ln \mu_{ij} = u + u_{1(i)} + u_{2(j)} \]
and
\[ l(u, u_{1(1)}, \ldots, u_{1(r)}, u_{2(1)}, \ldots, u_{2(c)}) = K + \sum_{i}\sum_{j} x_{ij}\left(u + u_{1(i)} + u_{2(j)}\right) = K + Nu + \sum_{i} x_{i\cdot}\,u_{1(i)} + \sum_{j} x_{\cdot j}\,u_{2(j)} \]
with
\[ \sum_{i} u_{1(i)} = \sum_{j} u_{2(j)} = 0 \]
also
\[ \sum_{i}\sum_{j} \mu_{ij} = \sum_{i}\sum_{j} e^{u + u_{1(i)} + u_{2(j)}} = e^{u} \sum_{i} e^{u_{1(i)}} \sum_{j} e^{u_{2(j)}} = N \]
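Let
\[ g(u, u_{1(1)}, \ldots, u_{1(r)}, u_{2(1)}, \ldots, u_{2(c)}, \lambda_1, \lambda_2, \lambda) = K + Nu + \sum_{i} x_{i\cdot}\,u_{1(i)} + \sum_{j} x_{\cdot j}\,u_{2(j)} + \lambda_1 \sum_{i} u_{1(i)} + \lambda_2 \sum_{j} u_{2(j)} + \lambda \left( e^{u} \sum_{i} e^{u_{1(i)}} \sum_{j} e^{u_{2(j)}} - N \right) \]
Now
\[ \frac{\partial g}{\partial u} = N + \lambda\, e^{u} \sum_{i} e^{u_{1(i)}} \sum_{j} e^{u_{2(j)}} = N(1 + \lambda) = 0 \]
so that
\[ \lambda = -1 \]
Using $e^{u} \sum_{j} e^{u_{2(j)}} = N / \sum_{i'} e^{u_{1(i')}}$,
\[ \frac{\partial g}{\partial u_{1(i)}} = x_{i\cdot} + \lambda_1 + \lambda\, e^{u} e^{u_{1(i)}} \sum_{j} e^{u_{2(j)}} = x_{i\cdot} + \lambda_1 - \frac{e^{u_{1(i)}}}{\sum_{i'} e^{u_{1(i')}}}\, N = 0 \]
\[ \frac{e^{u_{1(i)}}}{\sum_{i'} e^{u_{1(i')}}} = \frac{x_{i\cdot} + \lambda_1}{N} = \frac{x_{i\cdot}}{N} + \frac{\lambda_1}{N} \]
Since
\[ 1 = \sum_{i} \frac{x_{i\cdot} + \lambda_1}{N} = \frac{\sum_{i} x_{i\cdot}}{N} + \frac{r\lambda_1}{N} = 1 + \frac{r\lambda_1}{N} \]
we have $\lambda_1 = 0$.
Now
\[ e^{u_{1(i)}} = x_{i\cdot}\, K_1, \qquad K_1 = \frac{1}{N}\sum_{i'} e^{u_{1(i')}} \]
or
\[ u_{1(i)} = \ln x_{i\cdot} + \ln K_1 \]
\[ \sum_{i} u_{1(i)} = \sum_{i} \ln x_{i\cdot} + r \ln K_1 = 0 \]
\[ \ln K_1 = -\frac{1}{r} \sum_{i} \ln x_{i\cdot} \]
Hence
\[ u_{1(i)} = \ln x_{i\cdot} - \frac{1}{r} \sum_{i'} \ln x_{i'\cdot} \]
Similarly
\[ u_{2(j)} = \ln x_{\cdot j} - \frac{1}{c} \sum_{j'} \ln x_{\cdot j'} \]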
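Finally
\[ \sum_{i}\sum_{j} \mu_{ij} = \sum_{i}\sum_{j} e^{u + u_{1(i)} + u_{2(j)}} = e^{u} \sum_{i} e^{u_{1(i)}} \sum_{j} e^{u_{2(j)}} = N \]
Hence
\[ e^{u} = \frac{N}{\sum_{i} e^{u_{1(i)}} \sum_{j} e^{u_{2(j)}}} \]
Now
\[ e^{u_{1(i)}} = \frac{x_{i\cdot}}{\left( \prod_{i'=1}^{r} x_{i'\cdot} \right)^{1/r}} \quad \text{and} \quad e^{u_{2(j)}} = \frac{x_{\cdot j}}{\left( \prod_{j'=1}^{c} x_{\cdot j'} \right)^{1/c}} \]
so
\[ e^{u} = \frac{N \left( \prod_{i=1}^{r} x_{i\cdot} \right)^{1/r} \left( \prod_{j=1}^{c} x_{\cdot j} \right)^{1/c}}{\sum_{i} x_{i\cdot} \sum_{j} x_{\cdot j}} = \frac{1}{N} \left( \prod_{i=1}^{r} x_{i\cdot} \right)^{1/r} \left( \prod_{j=1}^{c} x_{\cdot j} \right)^{1/c} \]
Hence
\[ u = \frac{1}{r} \sum_{i} \ln x_{i\cdot} + \frac{1}{c} \sum_{j} \ln x_{\cdot j} - \ln N \]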
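Note
\[ \ln \hat{\mu}_{ij} = u + u_{1(i)} + u_{2(j)} = \left[ \frac{1}{r} \sum_{i'} \ln x_{i'\cdot} + \frac{1}{c} \sum_{j'} \ln x_{\cdot j'} - \ln N \right] + \left[ \ln x_{i\cdot} - \frac{1}{r} \sum_{i'} \ln x_{i'\cdot} \right] + \left[ \ln x_{\cdot j} - \frac{1}{c} \sum_{j'} \ln x_{\cdot j'} \right] = -\ln N + \ln x_{i\cdot} + \ln x_{\cdot j} \]
or
\[ \hat{\mu}_{ij} = \frac{x_{i\cdot}\, x_{\cdot j}}{N} \]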
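The closed-form estimates derived above can be checked numerically. This sketch evaluates $u$, $u_{1(i)}$ and $u_{2(j)}$ from the margins of Data Set #1 and confirms that $e^{u + u_{1(i)} + u_{2(j)}}$ reproduces $x_{i\cdot}\,x_{\cdot j}/N$; the variable names are illustrative.

```python
import math

# Margins of Data Set #1 (rows: serum cholesterol, columns: systolic BP).
row_totals = [307, 246, 439, 245]   # x_i.
col_totals = [388, 527, 204, 118]   # x_.j
n = sum(row_totals)                 # N
r, c = len(row_totals), len(col_totals)

mean_log_rows = sum(math.log(x) for x in row_totals) / r
mean_log_cols = sum(math.log(x) for x in col_totals) / c

# Maximum likelihood estimates under independence.
u = mean_log_rows + mean_log_cols - math.log(n)
u1 = [math.log(x) - mean_log_rows for x in row_totals]
u2 = [math.log(x) - mean_log_cols for x in col_totals]

# Fitted means: exp(u + u1(i) + u2(j)).
mu_hat = [[math.exp(u + a + b) for b in u2] for a in u1]

print(round(mu_hat[0][0], 2), round(row_totals[0] * col_totals[0] / n, 2))
```

Both printed numbers are 96.29, the expected frequency already obtained from $R_i C_j / N$.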
Comments
• Maximum likelihood estimates can be computed for any hierarchical log-linear model (i.e. also with more than 2 variables).
• In certain situations the equations need to be solved numerically.
• For the saturated model (all interactions and main effects), the estimate of μijk… is xijk… .
Goodness of Fit Statistics
These statistics can be used to check
if a log-linear model will fit the
observed frequency table
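The Chi-squared statistic
\[ \chi^2 = \sum \frac{(\text{Observed} - \text{Expected})^2}{\text{Expected}} = \sum_{i,j,k} \frac{(x_{ijk} - \hat{\mu}_{ijk})^2}{\hat{\mu}_{ijk}} \]
The Likelihood Ratio statistic:
\[ G^2 = 2 \sum \text{Observed} \cdot \ln\left( \frac{\text{Observed}}{\text{Expected}} \right) = 2 \sum_{i,j,k} x_{ijk} \ln\left( \frac{x_{ijk}}{\hat{\mu}_{ijk}} \right) \]
d.f. = # cells - # parameters fitted
We reject the model if $\chi^2$ or $G^2$ is greater than $\chi^2_{\alpha}$, the upper-$\alpha$ critical value of the chi-square distribution with the above degrees of freedom.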
Example:
Variables
1. Systolic Blood Pressure (B)
2. Serum Cholesterol (C)
3. Coronary Heart Disease (H)
Goodness of fit testing of Models

MODEL         DF    LIKELIHOOD-RATIO    PROB.     PEARSON    PROB.
                    CHISQ                         CHISQ
B,C,H.        24         83.15          0.0000     102.00    0.0000
B,CH.         21         51.23          0.0002      56.89    0.0000
C,BH.         21         59.59          0.0000      60.43    0.0000
H,BC.         15         58.73          0.0000      64.78    0.0000
BC,BH.        12         35.16          0.0004      33.76    0.0007
BH,CH.        18         27.67          0.0673      26.58    0.0872    n.s.
CH,BC.        12         26.80          0.0082      33.18    0.0009
BC,BH,CH.      9          8.08          0.5265       6.56    0.6824    n.s.

Possible Models:
1. [BH][CH] – B and C independent given H.
2. [BC][BH][CH] – all two-factor interaction model
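For the conditional independence model [BH][CH] the fitted values have a closed form: within each level of H, the expected count is (B total) × (C total) / (H total). The sketch below fits this model to the Framingham three-way table and evaluates both goodness-of-fit statistics. The array layout and variable names are illustrative assumptions, and a zero observed count is taken to contribute nothing to $G^2$.

```python
import math

# counts[h][c][b]: h = heart disease (0 = Present, 1 = Absent),
# c = serum cholesterol group, b = systolic blood pressure group.
counts = [
    [[2, 3, 3, 4], [3, 2, 0, 3], [8, 11, 6, 6], [7, 12, 11, 11]],
    [[117, 121, 47, 22], [85, 98, 43, 20], [119, 209, 68, 43], [67, 99, 46, 33]],
]

pearson = 0.0
g2 = 0.0
for table in counts:                              # one B x C table per level of H
    n_h = sum(sum(row) for row in table)
    chol_totals = [sum(row) for row in table]     # C margin within this level of H
    bp_totals = [sum(col) for col in zip(*table)] # B margin within this level of H
    for c, row in enumerate(table):
        for b, x in enumerate(row):
            m = chol_totals[c] * bp_totals[b] / n_h   # fitted value under [BH][CH]
            pearson += (x - m) ** 2 / m
            if x > 0:                                 # 0 * ln(0) is taken as 0
                g2 += 2 * x * math.log(x / m)

n_cells = 2 * 4 * 4
n_params = 1 + 1 + 3 + 3 + 3 + 3   # u, H, B, C, HB, HC
df = n_cells - n_params

print(round(g2, 2), round(pearson, 2), df)
```

The result agrees with the BH,CH row of the table: a likelihood-ratio statistic of about 27.67 and a Pearson statistic of about 26.58 on 18 degrees of freedom.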
Model 1: [BH][CH] Log-linear parameters
Heart Disease - Blood Pressure Interaction

Estimates $u_{HB(i,j)}$
                        Bp
Hd         <127     127-146   147-166   167+
Pres      -0.256    -0.241     0.066     0.431
Abs        0.256     0.241    -0.066    -0.431

Standardized estimates $z = u_{HB(i,j)} / \sigma_{u_{HB(i,j)}}$
                        Bp
Hd         <127     127-146   147-166   167+
Pres      -2.607    -2.733     0.660     4.461
Abs        2.607     2.733    -0.660    -4.461

Multiplicative effect $\theta_{HB(i,j)} = \exp(u_{HB(i,j)}) = e^{u_{HB(i,j)}}$
                        Bp
Hd         <127     127-146   147-166   167+
Pres       0.774     0.786     1.068     1.538
Abs        1.291     1.272     0.936     0.650
Log-Linear Model
\[ \ln \mu_{ijk} = u + u_{H(i)} + u_{B(j)} + u_{C(k)} + u_{HB(i,j)} + u_{HC(i,k)} \]
\[ \mu_{ijk} = e^{u}\, e^{u_{H(i)}}\, e^{u_{B(j)}}\, e^{u_{C(k)}}\, e^{u_{HB(i,j)}}\, e^{u_{HC(i,k)}} = \theta\, \theta_{H(i)}\, \theta_{B(j)}\, \theta_{C(k)}\, \theta_{HB(i,j)}\, \theta_{HC(i,k)} \]
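The multiplicative effects are simply the exponentials of the log-linear parameters. As a small check, the sketch below exponentiates the Model 1 heart disease by blood pressure estimates for "Present"; because the interaction terms sum to zero over the two levels of H, the "Absent" effects are the reciprocals. The dictionary names are illustrative.

```python
import math

# Model 1 estimates u_HB(i,j) for heart disease = Present, by blood pressure group.
u_hb_present = {"<127": -0.256, "127-146": -0.241, "147-166": 0.066, "167+": 0.431}

# theta = exp(u); with two levels of H, u(Absent) = -u(Present).
theta_present = {bp: math.exp(u) for bp, u in u_hb_present.items()}
theta_absent = {bp: math.exp(-u) for bp, u in u_hb_present.items()}

for bp in u_hb_present:
    print(bp, round(theta_present[bp], 3), round(theta_absent[bp], 3))
```

An effect above 1 (for example at 167+ with heart disease present) means the cell count is inflated relative to what the main effects alone would give; an effect below 1 means it is deflated.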
Heart Disease - Cholesterol Interaction

Estimates $u_{HC(i,k)}$
                        Chol
Hd         <200     200-219   220-259   260+
Pres      -0.233    -0.325     0.063     0.494
Abs        0.233     0.325    -0.063    -0.494

Standardized estimates $z = u_{HC(i,k)} / \sigma_{u_{HC(i,k)}}$
                        Chol
Hd         <200     200-219   220-259   260+
Pres      -1.889    -2.268     0.677     5.558
Abs        1.889     2.268    -0.677    -5.558

Multiplicative effect $\theta_{HC(i,k)} = \exp(u_{HC(i,k)}) = e^{u_{HC(i,k)}}$
                        Chol
Hd         <200     200-219   220-259   260+
Pres       0.792     0.723     1.065     1.640
Abs        1.262     1.384     0.939     0.610
Model 2: [BC][BH][CH] Log-linear parameters
Blood pressure-Cholesterol interaction:

Estimates $u_{BC(j,k)}$
                        Chol
Bp         <200     200-219   220-259   260+
<127       0.222     0.114    -0.114    -0.221
127-146   -0.019    -0.041     0.154    -0.094
147-166   -0.034     0.013    -0.058     0.079
167+      -0.169    -0.086     0.018     0.237

Standardized estimates $z = u_{BC(j,k)} / \sigma_{u_{BC(j,k)}}$
                        Chol
Bp         <200     200-219   220-259   260+
<127       2.68      1.27     -1.502    -2.487
127-146   -0.236    -0.472     2.253    -1.175
147-166   -0.326     0.117    -0.636     0.785
167+      -1.291    -0.626     0.167     2.051

Multiplicative effect $\theta_{BC(j,k)} = \exp(u_{BC(j,k)}) = e^{u_{BC(j,k)}}$
                        Chol
Bp         <200     200-219   220-259   260+
<127       1.248     1.120     0.892     0.802
127-146    0.981     0.960     1.166     0.910
147-166    0.967     1.013     0.944     1.082
167+       0.844     0.918     1.018     1.267
Heart Disease - Blood Pressure Interaction

Estimates $u_{HB(i,j)}$
                        Bp
Hd         <127     127-146   147-166   167+
Pres      -0.211    -0.232     0.055     0.389
Abs        0.211     0.232    -0.055    -0.389

Standardized estimates $z = u_{HB(i,j)} / \sigma_{u_{HB(i,j)}}$
                        Bp
Hd         <127     127-146   147-166   167+
Pres      -2.125    -2.604     0.542     3.938
Abs        2.125     2.604    -0.542    -3.938

Multiplicative effect $\theta_{HB(i,j)} = \exp(u_{HB(i,j)}) = e^{u_{HB(i,j)}}$
                        Bp
Hd         <127     127-146   147-166   167+
Pres       0.809     0.793     1.056     1.475
Abs        1.235     1.261     0.947     0.678
Heart Disease - Cholesterol Interaction

Estimates $u_{HC(i,k)}$
                        Chol
Hd         <200     200-219   220-259   260+
Pres      -0.212    -0.316     0.069     0.460
Abs        0.212     0.316    -0.069    -0.460

Standardized estimates $z = u_{HC(i,k)} / \sigma_{u_{HC(i,k)}}$
                        Chol
Hd         <200     200-219   220-259   260+
Pres      -1.712    -2.199     0.732     5.095
Abs        1.712     2.199    -0.732    -5.095

Multiplicative effect $\theta_{HC(i,k)} = \exp(u_{HC(i,k)}) = e^{u_{HC(i,k)}}$
                        Chol
Hd         <200     200-219   220-259   260+
Pres       0.809     0.729     1.071     1.584
Abs        1.237     1.372     0.933     0.631
Next topic: Discrete Multivariate
Analysis II