Using Stata Graphics as a Method of Understanding and Presenting Interaction Effects

Joanne M. Garrett, PhD
University of North Carolina at Chapel Hill
July 11, 2005
Problems with Understanding Interaction
• Interaction difficult concept to explain
– not a single answer (point estimate)
– linear combination of betas
• Often ignored because of difficulty
• Graph of interaction may be more intuitive for:
– students learning the concept
– presentations at professional meetings
Low Birth Weight Study*
Variable   Description                Coding
low        Low birth weight           1 = ≤ 2500 gms; 0 = > 2500 gms
bwt        Birth weight               grams
smk        Smoked during pregnancy    1 = smoked; 0 = did not smoke
race       Mother’s race              1 = black; 0 = white
age        Mother’s age               years
* “Loosely” adapted from data from Applied Logistic Regression,
David W. Hosmer, Jr. and Stanley Lemeshow
Example 1: Low birth weight and interaction
between smoking and race (1=black, 0=white)
Logistic regression model: Add smk by race interaction
$$P(D = \text{low} \mid \text{smk}, \text{race}) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 \text{smk} + \beta_2 \text{race} + \beta_3 \text{smkxrace})}}$$
Create interaction term and run logistic regression:
. gen smkxrace = smk * race
. logistic low smk race smkxrace
--------------------------------------------------
     low |     OR       z    P>|z|      [95% CI]
---------+----------------------------------------
     smk |   5.76    4.14    0.000    2.51   13.19
    race |   5.43    4.13    0.000    2.43   12.15
smkxrace |   .320   -2.08    0.038    .109    .936
--------------------------------------------------
• Significant interaction: the relationship between smoking and
low birth wt differs by mother’s race
• Question: How to interpret the interaction OR?
• Convert OR’s to beta coefficients and solve by
categories of race:
. logit
--------------------------------------------------
     low |   Coef.       z    P>|z|      [95% CI]
---------+----------------------------------------
     smk |   1.751    4.14    0.000    0.921   2.580
    race |   1.693    4.13    0.000    0.889   2.497
smkxrace |  -1.141   -2.08    0.038   -2.216  -0.066
   _cons |  -2.303
--------------------------------------------------
White: OR = exp[β1 + β3(race)] = exp[1.751 - 1.141(0)] = 5.76
Black: OR = exp[β1 + β3(race)] = exp[1.751 - 1.141(1)] = 1.84
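The arithmetic above can be checked outside Stata. The following is a minimal Python sketch, not part of the original slides. It uses the rounded coefficients from the logit output, and the function name `smoking_or` is hypothetical.

```python
import math

# Rounded coefficients copied from the logit output above
B_SMK = 1.751        # beta1: smoking
B_SMKXRACE = -1.141  # beta3: smoking-by-race interaction


def smoking_or(race: int) -> float:
    """Odds ratio for smoking at a given race code (0 = white, 1 = black)."""
    return math.exp(B_SMK + B_SMKXRACE * race)


if __name__ == "__main__":
    for race, label in [(0, "White"), (1, "Black")]:
        print(f"{label}: OR = {smoking_or(race):.2f}")
```

Because exp(β3) is the ratio of the two odds ratios, the interaction OR of .320 in the logistic output is the black OR divided by the white OR (1.84 / 5.76).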
White: . lincom smk + 0*smkxrace
----------------------------------------------
 low |     OR       z    P>|z|      [95% CI]
-----+----------------------------------------
 (1) |   5.76    4.14    0.000    2.51    13.2
----------------------------------------------
Black: . lincom smk + 1*smkxrace
----------------------------------------------
 low |     OR       z    P>|z|      [95% CI]
-----+----------------------------------------
 (1) |   1.84    1.75    0.081    .928    3.65
----------------------------------------------
Interpretation:
• Among whites, women who smoke have 5.8 times the odds of
having a low birth wt baby compared with women who do not smoke
• Among blacks, there is no statistically significant relationship
between smoking and having a low birth wt baby (OR=1.8,
95% CI .928 to 3.65, p=0.081)
Misinterpretations:
• Black mothers have less “risk” for low birth wt
babies compared to white mothers
• It’s okay for black mothers to smoke
Alternative:
• Solve the equation for values of smk and race
• Graph the individual probabilities (“predxcat”)
. predxcat low, xvar(race smk) graph bar
+------------------------------------------------------+
| race      smk     numobs    prob     lower    upper  |
|------------------------------------------------------|
| 0:White   0:No        88    0.091    0.046    0.171  |
| 0:White   1:Yes      104    0.365    0.279    0.462  |
| 1:NonWh   0:No       142    0.352    0.278    0.434  |
| 1:NonWh   1:Yes       44    0.500    0.356    0.644  |
+------------------------------------------------------+
Likelihood ratio test of interaction for race * smk:
LR Chi2(1)  = 4.56
Prob > Chi2 = 0.0328
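The predicted probabilities in the table can be reproduced by solving the logistic equation for each combination of smk and race. This is a minimal Python sketch rather than the predxcat implementation. It assumes the rounded coefficients from the logit output earlier in this example, and `predicted_prob` is a hypothetical helper name.

```python
import math

# Rounded coefficients copied from the logit output earlier in this example
B0, B_SMK, B_RACE, B_SMKXRACE = -2.303, 1.751, 1.693, -1.141


def predicted_prob(smk: int, race: int) -> float:
    """P(low = 1 | smk, race) from the logistic model with interaction."""
    xb = B0 + B_SMK * smk + B_RACE * race + B_SMKXRACE * smk * race
    return 1.0 / (1.0 + math.exp(-xb))


if __name__ == "__main__":
    for race in (0, 1):
        for smk in (0, 1):
            print(f"race={race} smk={smk} prob={predicted_prob(smk, race):.3f}")
```

The confidence limits in the table need the estimated variances and covariances of the coefficients, which the slides do not show, so this sketch covers the point estimates only.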
[Figure: bar graph of predicted probabilities for low birth weight (y-axis 0 to .5) by race (0:White, 1:Black) and smoked during pregnancy (0:No, 1:Yes)]
Example 2: Low birth weight and interaction
between smoking and age (years)
Logistic regression model: Add smk by age interaction
$$P(D = \text{low} \mid \text{smk}, \text{age}) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 \text{smk} + \beta_2 \text{age} + \beta_3 \text{smkxage})}}$$
Create interaction term and run logistic regression:
. gen smkxage = smk * age
. logistic low smk age smkxage
-------------------------------------------------
    low |     OR       z    P>|z|      [95% CI]
--------+----------------------------------------
    smk |  0.382   -0.90    0.367    .047   3.110
    age |  0.920   -2.61    0.009    .865   0.980
smkxage |  1.076    1.97    0.049   1.001   1.157
-------------------------------------------------
• Significant interaction: the relationship between smoking and
low birth wt differs by mother’s age
• Question: How to interpret the interaction OR for a
continuous interaction variable?
• Convert OR’s to beta coefficients and solve for
selected values of age:
. logit
-------------------------------------------------
    low |   Coef.      z    P>|z|      [95% CI]
--------+----------------------------------------
    smk |   -.963   -0.90    0.367   -3.058   1.131
    age |   -.083   -2.61    0.009   -0.145  -0.021
smkxage |    .073    1.97    0.049    0.001   0.146
  _cons |    .805
-------------------------------------------------
Age=15: OR = exp[β1 + β3(age)] = exp[-0.963 + 0.073(15)] = 1.14
Age=35: OR = exp[β1 + β3(age)] = exp[-0.963 + 0.073(35)] = 4.93
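With a continuous modifier, the same calculation can be repeated at any age of interest. The sketch below is an illustration in Python, not part of the original slides; `smoking_or_at_age` is a hypothetical name and the coefficients are the rounded values from the logit output. With these rounded inputs the age-35 result comes out near 4.91 rather than the reported 4.93, presumably because Stata works from unrounded coefficients.

```python
import math

# Rounded coefficients copied from the logit output above
B_SMK = -0.963     # beta1: smoking
B_SMKXAGE = 0.073  # beta3: smoking-by-age interaction


def smoking_or_at_age(age: float) -> float:
    """Odds ratio for smoking (yes vs. no) at a given mother's age in years."""
    return math.exp(B_SMK + B_SMKXAGE * age)


if __name__ == "__main__":
    for age in range(15, 36, 5):
        print(f"Age={age}: OR = {smoking_or_at_age(age):.2f}")
```

Each additional year of age multiplies the smoking OR by exp(β3), which is the interaction OR of 1.076 in the logistic output.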
Age=15: . lincom smk + 15*smkxage
-----------------------------------------------
 low |     OR       z    P>|z|      [95% CI]
-----+-----------------------------------------
 (1) |   1.14    0.32    0.751    .503    2.59
-----------------------------------------------
Age=35: . lincom smk + 35*smkxage
-----------------------------------------------
 low |     OR       z    P>|z|      [95% CI]
-----+-----------------------------------------
 (1) |   4.93    2.59    0.010    1.47    16.5
-----------------------------------------------
Interpretation:
• Among 15 year olds, women who smoke have 1.1 times the odds
of having a low birth wt baby (not statistically significant, p=0.751)
• Among 35 year olds, women who smoke have 4.9 times the odds
of having a low birth wt baby (p=0.010)
Misinterpretations:
• 15 year olds are not at risk for low birth wt babies
• It’s okay for 15 year olds to smoke
Alternative:
• Solve the equation and graph the probabilities for
different levels of smoking and age (“predxcon”)
. predxcon low, xvar(age) from(15) to(35) inc(2) class(smk) graph
-> smk = 0
+-------------------------------------+
|  age    pred_y     lower     upper  |
|-------------------------------------|
|   15      .392      .272      .527  |
|   17      .353      .259      .461  |
|   19      .317      .243      .400  |
| (etc)      ...       ...       ...  |
+-------------------------------------+
-> smk = 1
+-------------------------------------+
|  age    pred_y     lower     upper  |
|-------------------------------------|
|   15      .424      .285      .577  |
|   17      .419      .303      .546  |
| (etc)      ...       ...       ...  |
+-------------------------------------+
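The pred_y columns can be approximated by evaluating the logistic equation over the same age grid, from 15 to 35 in steps of 2, separately for each smoking group. This Python sketch is an illustration only, not the predxcon code. It assumes the rounded coefficients from the logit output, so the third decimal can differ slightly from the table, and `predicted_prob` is a hypothetical name.

```python
import math

# Rounded coefficients copied from the logit output for Example 2
B0, B_SMK, B_AGE, B_SMKXAGE = 0.805, -0.963, -0.083, 0.073


def predicted_prob(smk: int, age: float) -> float:
    """P(low = 1 | smk, age) from the logistic model with interaction."""
    xb = B0 + B_SMK * smk + B_AGE * age + B_SMKXAGE * smk * age
    return 1.0 / (1.0 + math.exp(-xb))


if __name__ == "__main__":
    for smk in (0, 1):
        print(f"-> smk = {smk}")
        for age in range(15, 36, 2):  # same grid as from(15) to(35) inc(2)
            print(f"  age={age}  pred_y={predicted_prob(smk, age):.3f}")
```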
Likelihood ratio test for interaction of age * smk:
LR Chi2(1)  = 3.88
Prob > Chi2 = 0.05
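The reported p-values follow from the chi-square statistics with 1 degree of freedom. As a rough check, and assuming only the rounded statistics shown on the slides (3.88 here, 4.56 for race by smoking in Example 1), the upper-tail probability can be computed with the Python standard library. The function name `chi2_1df_pvalue` is hypothetical.

```python
import math


def chi2_1df_pvalue(stat: float) -> float:
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom.

    A chi-square variable with 1 df is the square of a standard normal Z, so
    P(X > stat) = P(|Z| > sqrt(stat)) = erfc(sqrt(stat / 2)).
    """
    return math.erfc(math.sqrt(stat / 2.0))


if __name__ == "__main__":
    for label, stat in [("race * smk", 4.56), ("age * smk", 3.88)]:
        print(f"{label}: LR Chi2(1) = {stat}, p = {chi2_1df_pvalue(stat):.4f}")
```

This shortcut applies only to tests with 1 degree of freedom, which is the case for a single interaction term.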
[Figure: predicted probabilities for low birth weight (y-axis .1 to .5) by mother's age (15 to 35), plotted separately by smoked during pregnancy (0:No, 1:Yes); p=0.05]
Example 3: Birth weight (grams) and
interaction between smoking and race
Linear regression model:
$$\text{bwt} = \beta_0 + \beta_1 \text{smk} + \beta_2 \text{race} + \beta_3 \text{smkxrace}$$
Create interaction term and run linear regression:
. gen smkxrace = smk * race
. regress bwt smk race smkxrace
Or:
. predxcat bwt, xvar(race smk) graph bar
[Figure: bar graph of unadjusted mean birth weight (y-axis 0 to 4,000) by race (0:White, 1:Black) and smoked during pregnancy (0:No, 1:Yes); p=0.006]
Example 4: Birth weight (grams) and
interaction between smoking and age
Linear regression model:
$$\text{bwt} = \beta_0 + \beta_1 \text{smk} + \beta_2 \text{age} + \beta_3 \text{smkxage}$$
Create interaction term and run linear regression:
. gen smkxage = smk * age
. regress bwt smk age smkxage
Or:
. predxcon bwt, xvar(age) from(15) to(35) inc(2) class(smk) graph
[Figure: predicted values of birth weight (y-axis 2500 to 3500) by mother's age (15 to 35), plotted separately by smoked during pregnancy (0:No, 1:Yes); p=0.001]
Example 5: Birth weight (grams) and interaction
between smoking and age, age², age³
Linear regression model: add quadratic & cubic terms
Create interaction terms and run linear regression:
. gen age2 = age^2
. gen age3 = age^3
. gen smkxage = smk * age
. gen smkxage2 = smk * age2
. gen smkxage3 = smk * age3
. regress bwt smk age age2 age3 smkxage smkxage2 smkxage3
Or:
. predxcon bwt, xvar(age) from(15) to(35) inc(2) class(smk) graph poly(3)
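The extracted text names the model but shows no equation for it. Following the pattern of the earlier linear models and the order of terms in the regress command, the cubic model can be sketched as below. The coefficient numbering is an assumption and does not come from the slides.

```latex
\[
\text{bwt} = \beta_0 + \beta_1\,\text{smk}
  + \beta_2\,\text{age} + \beta_3\,\text{age}^2 + \beta_4\,\text{age}^3
  + \beta_5\,(\text{smk}\times\text{age})
  + \beta_6\,(\text{smk}\times\text{age}^2)
  + \beta_7\,(\text{smk}\times\text{age}^3)
\]
```

Under this numbering, the difference in expected birth weight between smokers and nonsmokers at a given age is \(\beta_1 + \beta_5\,\text{age} + \beta_6\,\text{age}^2 + \beta_7\,\text{age}^3\), so the interaction involves three terms rather than one.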
[Figure: predicted values of birth weight (y-axis 2500 to 3500) by mother's age (cubic; 15 to 35), plotted separately by smoked during pregnancy (0:No, 1:Yes); p=0.045]
Conclusions
• Interaction can be a difficult concept for people
unfamiliar with the methodology
• Examining a graph of an interaction is an easier
way to get an intuitive feel for the effect
• A useful technique for explaining interaction to
students hearing it for the first time, before
introducing mathematical models
• A simple way to present study results at
meetings, even to a statistically savvy audience
Calculating and Graphing Predicted Values:
(when “X” is categorical)
. predxcat yvar, xvar(xvar1 xvar2)
yvar – dependent variable
    continuous – defaults to linear regression
    binary (0,1) – defaults to logistic regression
xvar(xvar) – nominal variable for categories of estimated means or proportions
xvar(xvar1 xvar2) – categories of all combinations of xvar1 and xvar2; tests interaction
adjust(cov_list) – adjusts for any covariates
graph – display graph (otherwise shows list of predicted values only)
bar – bar graph (instead of symbols, the default)
model – for display purposes only; displays regression model
Some other options: level(#), cluster(cluster_id), savepred(ds_name)
Calculating and Graphing Predicted Values:
(when “X” is continuous)
. predxcon yvar, xvar(xvar) from(#) to(#) inc(#) graph
yvar – dependent variable
    continuous – defaults to linear regression
    binary (0,1) – defaults to logistic regression
xvar(xvar) – continuous independent variable; probabilities calculated for each value of X
from(#) – bottom value for xvar
to(#) – top value for xvar
inc(#) – increment desired between bottom and top values
adjust(cov_list) – adjusts for any covariates
graph – display graph (otherwise shows list of predicted values only)
class(class_var) – adds an xvar by class_var interaction term
poly(2 or 3) – polynomial terms added: 2=squared; 3=squared and cubic
model – for display purposes only; displays regression model
Some other options: level(#), cluster(cluster_id), nolist, savepred(ds_name)