Presentation slides

Analysis of Interaction Effects
James Jaccard
New York University
Overview
Will cover the basics of interaction analysis, highlighting
multiple regression based strategies
Will discuss advanced issues and complications in
interaction analysis. This treatment will be somewhat
superficial but hopefully informative
Conceptual Foundations of Interaction Analysis
Causal Theories
Most (but not all) theories rely heavily on the concept of
causality, i.e., we seek to identify the determinants of a behavior
or mental state and/or the consequences of a behavior or
environmental/mental state
I am going to ground interaction analysis in a causal framework
Causal Theories
Causal theories can be complicated, but at their core, there are
five types of causal relationships in causal theories
Direct Causal Relationships
A direct causal relationship is when a variable, X, has a direct
causal influence on another variable, Y:
X → Y
Direct Causal Relationships
Frustration → Aggression (+)
Quality of Relationship with Mother → Adolescent Drug Use (-)
Indirect Causal Relationships
An indirect causal relationship is when a variable, X, has a
causal influence on another variable, Y, through an intermediary
variable, M:
X → M → Y
Indirect Causal Relationships
Quality of Relationship with Mother → Adolescent School Work Ethic → Adolescent Drug Use
Spurious Relationship
A spurious relationship is one where two variables that are not
causally related share a common cause:
C → X and C → Y
Bidirectional Causal Relationships
A bidirectional causal relationship is when a variable, X, has a
causal influence on another variable, Y, and that effect, Y, has a
“simultaneous” impact on X:
X ⇄ Y
Bidirectional Causal Relationships
Quality of Relationship with Mother ⇄ Adolescent Drug Use
Moderated Causal Relationships
A moderated causal relationship is when the impact of a
variable, X, on another variable, Y, differs depending on the
value of a third variable, Z:
X → Y, moderated by Z
Moderated Causal Relationships
Treatment vs. No Treatment → Depression, moderated by Gender
Exp Negative Peers → Drug Use, moderated by Quality of Parent-Adolescent Relationship
Moderated Causal Relationships
X → Y, moderated by Z
The variable that “moderates” the relationship is called a
moderator variable.
Causal Theories
We put all these ideas together to build complex theories of
phenomena. Here is one example:
[Path diagram: Quality of Relationship with Mother, Gender, Time Mother Spends with Child, Adolescent School Work Ethic, Adolescent Drug Use]
Interaction Analysis
Interactions, when translated into causal analysis, focus on
moderated relationships
When I encounter an interaction effect, I think:
X → Y, moderated by Z
Interaction Analysis
Key step in interaction analysis is to identify the focal
independent variable and the moderator variable.
Sometimes it is obvious – such as with the analysis of a
treatment for depression on depression as moderated by
gender
Treat vs Control → Depression, moderated by Gender
Interaction Analysis
Sometimes it is not obvious – such as an analysis of the
effects of gender and ethnicity on the amount of time an
adolescent spends with his or her mother
[Diagram: Gender and Ethnicity as predictors of Time Spent]
Statistically, it matters not which variables take on which
role. Conceptually, it does.
The Statistical Analysis of Interactions
Some Common Practices
Omnibus tests – I do not use these
Hierarchical regression – I use sparingly
Focus on unstandardized coefficients - we tend to stay
away from standardized coefficients in interaction
analysis because they can be misleading and they do not
have “clean” mathematical properties
A “Trick” We Will Use: Linear
Transformations
Y = a + b1 X + e
Satisfaction = a + b1 Grade + e
Satisfaction = 12 + -.50 Grade + e
Satisfaction = 9 + -.50 (Grade – 6) + e
“Mean centering” is when the constant we subtract is the mean of X
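The transformation above can be checked with a few lines of arithmetic. This is a minimal sketch using the slide's coefficients (intercept 12, slope -.50); the helper name `recenter` is hypothetical. It illustrates that subtracting a constant from the predictor leaves the slope alone and turns the intercept into the predicted score at that constant.

```python
def recenter(intercept, slope, constant):
    """Return (intercept, slope) after replacing X with (X - constant).

    Y = a + b*X is the same line as Y = (a + b*constant) + b*(X - constant).
    """
    return intercept + slope * constant, slope


# Coefficients from the slide: Satisfaction = 12 + -.50 Grade
a, b = 12.0, -0.50

# Subtract 6 from Grade so the intercept is the predicted score for 6th graders
a6, b6 = recenter(a, b, 6)
print(a6, b6)

# Both forms give the same prediction for any grade
for grade in (6, 7, 8):
    assert abs((a + b * grade) - (a6 + b6 * (grade - 6))) < 1e-12
```

Passing the sample mean of X as `constant` gives the mean-centered version of the same equation.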
Interaction Analysis
Will focus on four cases:
Categorical IV and Categorical MV
Continuous IV and Categorical MV
Categorical IV and Continuous MV
Continuous IV and Continuous MV
Assume you know the basics of multiple regression and
dummy variables in multiple regression
Categorical IV and Categorical MV
Y = Relationship satisfaction (0 to 10)
X = Gender (female = 1, male = 0)
Z = Grade (6th = 1, 7th = 0)
          6th    7th
Female    8.0    7.0
Male      7.0    4.0
Three questions:
Is there a gender difference for 6th graders?
Is there a gender difference for 7th graders?
Are these gender effects different?
Gender effect for 6th grade: 8 – 7 = 1
Gender effect for 7th grade: 7 – 4 = 3
Interaction contrast: (8-7) – (7– 4) = -2
Y = a + b1 Gender + b2 Grade + b3 (Gender)(Grade)
Y = 4.0 + 3.0 Gender + b2 Grade + -2.0 (Gender)(Grade)
Flipped (Grade recoded as 7th = 1, 6th = 0): Y = 7.0 + 1.0 Gender + b2 Grade + 2.0 (Gender)(Grade)
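The link between the cell means and the coefficients can be traced by hand. The sketch below is illustrative only: it relies on the fact that a dummy-coded model with the product term reproduces the four cell means exactly, so each coefficient is a difference among means. The function name `coefficients` and the `grade_ref` argument are hypothetical, and the b2 values it prints are derived from the means rather than taken from the slides.

```python
# Cell means from the slide, keyed by (gender, grade)
means = {
    ("female", "6th"): 8.0, ("female", "7th"): 7.0,
    ("male", "6th"): 7.0, ("male", "7th"): 4.0,
}


def coefficients(means, grade_ref):
    """Coefficients of Y = a + b1 Gender + b2 Grade + b3 (Gender)(Grade).

    Gender is coded female = 1, male = 0; `grade_ref` is the grade coded 0.
    """
    other = "6th" if grade_ref == "7th" else "7th"
    a = means[("male", grade_ref)]            # mean of the reference cell
    b1 = means[("female", grade_ref)] - a     # gender effect in the reference grade
    b2 = means[("male", other)] - a           # grade effect for males
    # Interaction contrast: gender effect in the other grade minus b1
    b3 = (means[("female", other)] - means[("male", other)]) - b1
    return a, b1, b2, b3


print(coefficients(means, grade_ref="7th"))  # coding 6th = 1, 7th = 0
print(coefficients(means, grade_ref="6th"))  # flipped coding 7th = 1, 6th = 0
```

Under either coding, b1 is the gender effect for whichever grade is scored 0, which is why flipping the reference group recovers the other simple effect.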
Extend to groups > 2 (add 8th grade)
Inclusion of covariates
How to generate means and tables
Continuous IV and Categorical MV
Y = Relationship satisfaction (0 to 10)
X = Time spent together (in hours)
Z = Gender (female = 1, male = 0)
Three questions:
What is the effect of time spent together for females? b = 0.33
What is the effect of time spent together for males? b = 0.20
Are the effects different? 0.33 – 0.20 = 0.13
Y = a + b1 Gender + 0.20 Time + 0.13 (Gender)(Time)
Flipped (Gender recoded as male = 1, female = 0): Y = a + b1 Gender + 0.33 Time + -0.13 (Gender)(Time)
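The relationship between the two codings can be written out algebraically. The sketch below is a minimal illustration; the function name `flip_reference` is hypothetical, and the intercept (2.0) and Gender coefficient (0.5) are made-up values, since the slides leave a and b1 unspecified. Only the Time and product-term coefficients (0.20 and 0.13) come from the slides.

```python
def flip_reference(a, b_gender, b_time, b_inter):
    """Re-express Y = a + b1 G + b2 Time + b3 (G)(Time) after recoding G as 1 - G."""
    return (
        a + b_gender,       # intercept for the new reference group
        -b_gender,          # group difference at Time = 0 changes sign
        b_time + b_inter,   # Time slope for the new reference group
        -b_inter,           # difference between the slopes changes sign
    )


# Males as reference group (female = 1, male = 0); a and b1 are hypothetical
original = (2.0, 0.5, 0.20, 0.13)
flipped = flip_reference(*original)
print([round(value, 2) for value in flipped])
```

The Time coefficient is always the slope for the group coded 0, so refitting the model under each coding yields each group's slope together with its own standard error and test.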
Continuous IV and Categorical MV
Do not estimate slopes separately; use flipped reference
group strategy
Extend to groups > 2 (use grade as example)
Categorical IV and Continuous MV
Study conducted in Miami with bilingual Latinos
Ad language: Half shown ad in Spanish (0) and half in
English (1)
Latino identity: 1 = not at all, 7 = strong identity
Outcome = Attitude toward product (1 = unfavorable, 7 =
favorable)
Hypothesized moderated relationship
Common Analysis Form: Median Split
Many researchers are not sure how to analyze this design, so they
apply a median split to the continuous moderator variable and conduct
an ANOVA
Why this is bad practice…
Categorical IV and Continuous MV
Identity    Mean English – Mean Spanish
1            1.50
2            1.00
3            0.50
4            0.00
5           -0.50
6           -1.00
7           -1.50
Y = a + b1 Ad language + b2 Identity + b3 Ad X Identity
Categorical IV and Continuous MV
To make the intercept meaningful, 1 was subtracted from the
Latino Identity measure, so it ranged from 0 to 6
Y = a + b1 Ad language + b2 Identity + b3 Ad X Identity
Categorical IV and Continuous MV
Mean attitude for Spanish ad for Latino ID = 1 is 3.215 (the intercept, a)
Mean difference (English – Spanish) for Latino ID = 1 is 1.707 (b1; p < 0.05)
Mean attitude for English ad for Latino ID = 1 is 4.922 (a + b1)
Categorical IV and Continuous MV
Identity    Mean English    Mean Spanish    Difference
1           4.922           3.215            1.707*
2           4.915           3.662            1.253*
3           4.908           4.108            0.800*
4           4.901           4.555            0.346*
5           4.895           5.002           -0.107
6           4.888           5.449           -0.561*
7           4.882           5.896           -1.014*
* p < 0.05
(Common practice, Mean = 3, SD = 1.2; Show R program)
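The differences in the table above follow a straight line in Identity, which is what the product term implies. The sketch below is a rough check, not the R program mentioned on the slide: b1 = 1.707 is taken from the slides, while b3 is an assumed value backed out of the first and last table rows because the slides do not report it. The function name `ad_effect` is hypothetical.

```python
# Differences (Mean English - Mean Spanish) from the table, keyed by identity score
table_diff = {1: 1.707, 2: 1.253, 3: 0.800, 4: 0.346, 5: -0.107, 6: -0.561, 7: -1.014}

b1 = 1.707                                  # ad-language effect when (Identity - 1) = 0
b3 = (table_diff[7] - table_diff[1]) / 6    # assumed: change in that effect per identity point


def ad_effect(identity):
    """Effect of ad language (English vs. Spanish) at a given identity score."""
    return b1 + b3 * (identity - 1)


for identity in range(1, 8):
    print(identity, round(ad_effect(identity), 3), table_diff[identity])
```

Subtracting a different constant k from Identity before forming the product term makes the Ad language coefficient itself equal to the effect at Identity = k, so each row of the table can be obtained, with its significance test, from one re-run of the same regression.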
Continuous IV and Continuous MV
Y: Child anxiety (0 to 20)
X: Parent anxiety (0 to 20)
Z: Parenting behavior: Control (0 to 20)
Control    b for Y onto X
7          .10
8          .20
9          .30
10         .40
11         .50
12         .60
13         .70
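Y = a + b1 Control + 0.10 PA + 0.10 (Control)(PA), where PA = parent anxiety and Control is rescaled as Control – 7, so that 0.10 is the PA slope at Control = 7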
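The slopes in the table can be regenerated from the two coefficients in the equation. The sketch below assumes that Control entered the model as (Control – 7), which is the coding under which a coefficient of 0.10 on PA matches the table; the function name `slope_of_pa` and its `zero_point` argument are hypothetical.

```python
def slope_of_pa(control, b_pa=0.10, b_inter=0.10, zero_point=7):
    """Slope of child anxiety on parent anxiety (PA) at a given Control score.

    Assumes Control was entered in the model as (Control - zero_point).
    """
    return b_pa + b_inter * (control - zero_point)


for control in range(7, 14):
    print(control, round(slope_of_pa(control), 2))
```

Each one-unit increase in Control raises the PA slope by the product-term coefficient, which is the sense in which Control moderates the effect of parent anxiety.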
(Common practice versus regions of significance)
(Why we include component parts)
Advanced Topics
Three Way Interactions
Identify focal independent variable
Identify first order moderator variable
Identify second order moderator variable
[Diagram: Gender, Grade, and Ethnicity as predictors of Satisfaction]
Three Way Interactions
European American (1)
              G7 (1)    G8 (0)
Female (1)    6.0       6.0
Male (0)      5.0       4.0
IC = (6-5) – (6-4) = -1
Latinos (0)
              G7 (1)    G8 (0)
Female (1)    6.0       6.0
Male (0)      6.0       6.0
IC = (6-6) – (6-6) = 0
TW = [(6-5) – (6-4)] - [(6-6) – (6-6)] = -1
Y = 6.0 + 0 Gender + b2 Grade + b3 Ethnic + 0 (Gender)(Grade)
+ b5 (Gender)(Ethnic) + b6 (Grade)(Ethnic) + -1 (Gender)(Grade)(Ethnic)
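Every coefficient in the three-way equation can be recovered from the eight cell means. The sketch below is illustrative: it uses the slide's 0/1 codes and the fact that a fully dummy-coded model reproduces the cell means, so each coefficient is an alternating-sign sum of means. The function name `coefficient` is hypothetical, and the values it prints for b2, b3, b5, and b6 are derived here, not reported on the slide.

```python
from itertools import product

# Cell means keyed by (gender, grade, ethnic) using the slide's codes:
# gender: female = 1, male = 0; grade: G7 = 1, G8 = 0;
# ethnic: European American = 1, Latino = 0
means = {
    (1, 1, 1): 6.0, (1, 0, 1): 6.0, (0, 1, 1): 5.0, (0, 0, 1): 4.0,
    (1, 1, 0): 6.0, (1, 0, 0): 6.0, (0, 1, 0): 6.0, (0, 0, 0): 6.0,
}


def coefficient(term):
    """Coefficient for a term, e.g. (1, 1, 0) is (Gender)(Grade).

    Sums the cell means, with alternating signs, over every cell that sets
    a subset of the term's variables to 1 and all other variables to 0.
    """
    total = 0.0
    for cell in product((0, 1), repeat=3):
        if any(c and not t for c, t in zip(cell, term)):
            continue  # cell switches on a variable that is not in the term
        sign = (-1) ** (sum(term) - sum(cell))
        total += sign * means[cell]
    return total


for term in product((0, 1), repeat=3):
    print(term, coefficient(term))
```

The term (0, 0, 0) is the intercept, the mean for the cell coded 0 on all three variables, and (1, 1, 1) is the three-way contrast TW.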
Modeling Non-Linear Interactions
Y = α + β1 X + β2 Z + ε
β1 = α’ + β3 Z + β4 Z²
Substitute right hand side for β1:
Y = α + (α’ + β3 Z + β4 Z²) X + β2 Z + ε
Expand:
Y = α + α’X + β3 XZ + β4 XZ² + β2 Z + ε
Re-arrange terms:
Y = α + α’X + β2 Z + β3 XZ + β4 XZ² + ε
Re-label and you have your model:
Y = α + β1 X + β2 Z + β3 XZ + β4 XZ² + ε
Use centering strategy to isolate effect of X on Y (β1) at
any given value of Z; also consider modeling intercept
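The centering strategy for this model can be sketched as follows. The coefficient values are hypothetical, chosen only so the arithmetic is easy to follow, and the function names `slope_of_x` and `recenter_at` are illustrative. The second function shows what the X, XZ, and XZ² coefficients become when Z is replaced by Z – z0.

```python
def slope_of_x(z, b1, b3, b4):
    """Effect of X on Y at moderator value z in
    Y = a + b1 X + b2 Z + b3 XZ + b4 XZ^2 + e."""
    return b1 + b3 * z + b4 * z ** 2


def recenter_at(z0, b1, b3, b4):
    """Coefficients on X, XZ', and XZ'^2 after replacing Z with Z' = Z - z0."""
    return (b1 + b3 * z0 + b4 * z0 ** 2, b3 + 2 * b4 * z0, b4)


# Hypothetical coefficients for illustration only
b1, b3, b4 = 0.5, 0.25, -0.125

# After centering Z at 2, the coefficient on X is the effect of X at Z = 2
new_b1, new_b3, new_b4 = recenter_at(2, b1, b3, b4)
print(new_b1, slope_of_x(2, b1, b3, b4))
```

In practice one would refit the regression with the re-centered Z, so that the coefficient on X comes with a standard error for the effect at that value of Z.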
Exploratory Interaction Analysis
Use program in R
Y = Tenured or not (using MLPM)
X = Number of articles published
Z = Number of years since hired
Y = α + β1 X + ε
X COEFFICIENT AND M VALUES
N      M Value    X Slope
478    1.000      .000
475    2.000      .002
457    3.000      .007
408    4.000      .007
330    5.000      .009
246    6.000      .008
166    7.000      .005
115    8.000      .009
74     9.000      .011
48     10.000     .001
Regression Mixture Modeling
Mixture Regression
BI = α + β1 Aact + β2 PN + β3 PBC + ε
When we regress Y onto a set of predictors, we assume that
people are drawn from a single population with common linear
coefficients
But, in reality, we probably are mixing heterogeneous
population segments with different coefficients characterizing
the segments
Mixture Regression
With “mixed” populations, the overall regression analysis may
characterize neither segment very well, leading to sub-optimal
inferences and intervention strategies
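A toy calculation shows how pooling can misdescribe every segment. The data below are entirely hypothetical and noise-free; the sketch demonstrates the aggregation problem only and is not a mixture model, which would estimate the segments and their coefficients from the data.

```python
def ols_slope(xs, ys):
    """Least-squares slope of y on x."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


xs = [1.0, 2.0, 3.0, 4.0, 5.0]
segment_a = [0.5 * x for x in xs]   # hypothetical segment where X matters (slope 0.5)
segment_b = [2.0 for _ in xs]       # hypothetical segment where X is irrelevant (slope 0)

# Pool the two segments as if they were one population
pooled = ols_slope(xs + xs, segment_a + segment_b)
print(ols_slope(xs, segment_a), ols_slope(xs, segment_b), pooled)
```

The pooled slope falls between the two segment slopes and matches neither, so an intervention planned around it would be too weak for one segment and misdirected for the other.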
Another Example of Aggregation Bias
Mixture Regression
[Path diagram: Latent Class X, Aact, SN, PBC, Intention]
Mixture Model for Heavy Episodic Drinking
A four class model fits data best (entries are linear coefficients)
                    Aact    SN     DN     PBC
Segment 1 (42%):    .33     .02    .01    -.01
Segment 2 (17%):    .10     .29    .30    .01
Segment 3 (21%):    .30     .29    .05    .04
Segment 4 (20%):    .48     .09    .25    -.03
Interaction Analysis and Establishing
Generalizability
Generalizability
It is common for people to conclude that an effect
“generalizes” in the absence of a statistically significant
interaction effect
Example with RCT of obesity treatment and gender
Problem is that we can never accept the null hypothesis of a
zero interaction contrast
Solution: Adopt the framework of equivalence testing
Generalizability
Step 1: Specify a threshold value that will be used to define
functional equivalence
Step 2: Specify the range of functional equivalence
Step 3: Calculate the 95% CI for the interaction contrast
Step 4: Determine if the CI is completely within the range of
functional equivalence
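Steps 2 through 4 reduce to a simple interval check. The sketch below is a minimal illustration with made-up numbers; it assumes the range of functional equivalence is symmetric around zero (from -threshold to +threshold), which the slides do not state, and the function name is hypothetical.

```python
def is_functionally_equivalent(ci_lower, ci_upper, threshold):
    """True when the whole CI lies inside (-threshold, +threshold)."""
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    return -threshold < ci_lower and ci_upper < threshold


# Hypothetical numbers for illustration only
threshold = 2.0  # Step 1: smallest interaction contrast regarded as meaningful

# Step 3 would supply the 95% CI for the interaction contrast
print(is_functionally_equivalent(-0.8, 1.1, threshold))  # CI inside the range
print(is_functionally_equivalent(-0.8, 2.6, threshold))  # CI extends past the range
```

Only the first interval supports a claim that the effect generalizes; in the second, the data cannot rule out an interaction contrast larger than the threshold.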
Measurement Error
It is well known that measurement error can bias parameter
estimates in multiple regression. This holds with vigor for
interaction analysis
One approach to dealing with measurement error in general
is to use latent variable modeling
Measurement Error
[Path diagrams: Depression measured by a single indicator (D1, error e1) versus latent Depression measured by three indicators (D1 to D3, errors e1 to e3)]
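Latent Variable Regression
[Path diagram: latent X (indicators X1 to X3, errors e4 to e6), latent Z (indicators Z1 to Z3, errors e1 to e3), and product indicators X1Z1, X2Z2, X3Z3 (errors e7 to e9) predicting latent Y (indicators Y1, Y2, errors e10, e11; disturbance d3); the diagram also carries the label Support]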
Latent Variable Regression
There are about half a dozen approaches to modeling
latent variable interactions (e.g., quasi-maximum
likelihood; Bayesian). I recommend the approach developed
by Herbert Marsh as a good balance between utility and
complexity, coupled with Huber-White sandwich estimators
for robustness
Latent variable regression using multiple group analysis
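Multi-Group Modeling in SEM
[Path diagram: latent X (indicators X1 to X3, errors e4 to e6) and latent Z (indicators Z1 to Z3, errors e1 to e3) predicting latent Y (indicators Y1, Y2, errors e7, e8; disturbance d3)]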
Assumption Violations
If assumptions of normality or variance homogeneity are
suspect
Use approaches with robust standard errors
Bootstrapping
Huber-White sandwich estimators
Be careful with outlier-resistant robust methods
Rand Wilcox's work with smoothers
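To make the idea of a Huber-White sandwich estimator concrete, here is a minimal sketch for the simplest case, a single predictor with an intercept, using the basic (HC0) form of the estimator. The data and the function name are hypothetical. An interaction model has several predictors and would normally be fitted with a statistics package that offers robust standard errors.

```python
import math


def slope_with_robust_se(xs, ys):
    """OLS slope of y on x with its Huber-White (HC0) standard error."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    dx = [x - mx for x in xs]
    sxx = sum(d * d for d in dx)
    slope = sum(d * (y - my) for d, y in zip(dx, ys)) / sxx
    intercept = my - slope * mx
    resid = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    # Sandwich formula (X'X)^-1 X' diag(e^2) X (X'X)^-1, reduced to the slope element
    robust_var = sum((d * e) ** 2 for d, e in zip(dx, resid)) / sxx ** 2
    return slope, math.sqrt(robust_var)


# Hypothetical data whose scatter grows with x (unequal variances)
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.1, 1.8, 3.4, 3.5, 5.9, 4.9]
slope, robust_se = slope_with_robust_se(xs, ys)
print(round(slope, 3), round(robust_se, 3))
```

The slope estimate is the ordinary least-squares one; only the standard error changes, because each squared residual is weighted by its own observation instead of being pooled into a single error variance.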
Thank God It Has Ended!