SS16.16B Part_II

Overview of categorical by
continuous interactions:
Part II: Variables, specifications, and
calculations
Jane E. Miller, PhD
The Chicago Guide to Writing about Multivariate Analysis, 2nd Edition.
Continued from Part I
• Part I covers
– Definitions and concepts for interactions
– Possible shapes of patterns for interactions
between one categorical and one continuous
independent variable
Creating variables and specifying
models to test for interactions
involving continuous independent
variables
Interaction between a continuous and
a categorical independent variable (IV)
• Example: Race and income-to-poverty ratio.
– Race is a 2-category IV classified as
• non-Hispanic black (NHB)
• non-Hispanic white (NHW)
– IPR is a continuous variable calculated as annual
family income (in $) divided by the Federal Poverty
Level for a family of that size and age composition.
• IPR ranges from 0 to more than 10 in this sample.
• Federal Poverty Level for a family of 2 adults and 2
children in 2010 was about $22,000
Independent variables:
continuous by continuous interaction
• Mother’s age at time of child’s birth, years
– One continuous variable for the main effect: age
• Family income to poverty ratio, in multiples of
the Federal Poverty Level
– One continuous variable for the main effect: IPR
• Interaction: Mother’s age and IPR
– Age_IPR = age × IPR
– Resulting interaction term variable will also be
continuous
Model specification to test an interaction
between one continuous and one
categorical independent variable
• For a model with an interaction between two
independent variables, include ALL of the
main effects and interaction term variables
related to those two independent variables.
• E.g., for a model of birth weight (BW) by race and IPR,
include the main effect and interaction terms
related to race and the family income-to-poverty ratio:
– BW = f (NHB, IPR, NHB_IPR)
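As a rough illustration of this specification, the sketch below fits BW = f(NHB, IPR, NHB_IPR) by ordinary least squares using only the Python standard library. The data are synthetic and noise-free, generated from the example coefficients reported later in this lecture (3,106, –177, 23, –5), so the fit should simply recover them. The function names are hypothetical, and a real analysis would normally use a statistics package.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_ols(regressors, outcome):
    """Return OLS coefficients (constant first) from the normal equations."""
    x = [[1.0] + list(row) for row in regressors]
    k = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(x, outcome)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Synthetic cases (an assumption, not real data): both groups at several IPRs.
cases = [(nhb, ipr) for nhb in (0, 1) for ipr in (0.5, 1.0, 2.0, 5.0)]

# Each row holds both main effects AND their product, the interaction term.
design = [(nhb, ipr, nhb * ipr) for nhb, ipr in cases]

# Noise-free outcome built from the lecture's example coefficients.
bw = [3106 - 177 * nhb + 23 * ipr - 5 * nhb_ipr for nhb, ipr, nhb_ipr in design]

b0, b_nhb, b_ipr, b_nhb_ipr = fit_ols(design, bw)
print(round(b0, 3), round(b_nhb, 3), round(b_ipr, 3), round(b_nhb_ipr, 3))
```

Dropping any one of the three columns from `design` would change what the remaining coefficients mean, which is why the specification calls for all of them.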
Coding of variables
• The NHB main effect variable is defined as in the
previous example (of categorical by categorical
interaction).
• 1 = non-Hispanic black.
• 0 = all others, the reference category; in this example, non-Hispanic white.
• However, for a continuous variable like income that
takes on many possible numeric values, it doesn’t
make sense to create a lot of dummy variables.
• Instead, use income-poverty ratio in its continuous form.
Calculating an interaction term from a dummy
and a continuous main effects term
• The value of the interaction term variable is
defined as the product of the two component
main effects variables:
X1_X2 = X1 × X2
– Result will be one continuous interaction term
variable.
• Thus NHB_IPR is the product of NHB and IPR.
– If NHB = 1 and IPR = 2.3 then the interaction term
NHB_IPR = 2.3
– If NHB = 0 and IPR = 2.3, then NHB_IPR = 0
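The product rule above can be sketched in a few lines of Python. The record layout (one dictionary per case) and the function name `add_interaction` are assumptions made for illustration; only the variable names NHB, IPR, and NHB_IPR come from the lecture.

```python
def add_interaction(records):
    """Return copies of the records with an added NHB_IPR column."""
    return [dict(r, NHB_IPR=r["NHB"] * r["IPR"]) for r in records]


# The two cases worked through on the slide.
sample = [{"NHB": 1, "IPR": 2.3}, {"NHB": 0, "IPR": 2.3}]
coded = add_interaction(sample)

for row in coded:
    print(row)
```

Because NHB is a 0/1 dummy, the interaction term equals IPR for non-Hispanic black cases and 0 for everyone in the reference category.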
Coding of main effects and interaction
term variables: race and IPR
Case characteristics        Main effects terms    Interaction term
(selected values)           NHB       IPR         NHB_IPR
Non-H white & IPR = 0.5     0         0.5         0
Non-H white & IPR = 1.0     0         1.0         0
Non-H white & IPR = 2.0     0         2.0         0
Non-H white & IPR = 5.0     0         5.0         0
Non-H black & IPR = 0.5     1         0.5         0.5
Non-H black & IPR = 1.0     1         1.0         1.0
Non-H black & IPR = 2.0     1         2.0         2.0
Non-H black & IPR = 5.0     1         5.0         5.0
E.g., IPR = 0.5 means income is half the Federal Poverty Level (FPL); IPR =
2.0 means income is twice the FPL.
For a two-category race variable (non-Hispanic white = reference category).
Coding of race and IPR variables:
Non-Hispanic white infants
Case characteristics        Main effects terms    Interaction term
                            NHB       IPR         NHB_IPR
Non-H white & IPR = 0.5     0         0.5         0
Non-H white & IPR = 1.0     0         1.0         0
Non-H white & IPR = 2.0     0         2.0         0
Non-H white & IPR = 5.0     0         5.0         0
E.g., IPR = 0.5 means income is half the Federal Poverty Level (FPL); IPR =
2.0 means income is twice the FPL.
For a two-category race variable (non-Hispanic white = reference category).
Coding of race and IPR variables:
Non-Hispanic black infants
Case characteristics        Main effects terms    Interaction term
                            NHB       IPR         NHB_IPR
Non-H black & IPR = 0.5     1         0.5         0.5
Non-H black & IPR = 1.0     1         1.0         1.0
Non-H black & IPR = 2.0     1         2.0         2.0
Non-H black & IPR = 5.0     1         5.0         5.0
E.g., IPR = 0.5 means income is half the Federal Poverty Level (FPL); IPR =
2.0 means income is twice the FPL.
For a two-category race variable (non-Hispanic white = reference category).
General equation for predicted value
of DV based on an interaction model
• The general equation to calculate the predicted value
of the dependent variable includes
– main effects coefficients
– interaction term coefficients
– values of the independent variables
Predicted BW = β0 + (βNHB × NHB) + (βIPR × IPR) + (βNHB_IPR × NHB_IPR)
Calculating overall effect of interaction
for specific case characteristics
Predicted BW = β0 + (βNHB × NHB) + (βIPR × IPR) + (βNHB_IPR × NHB_IPR)
• Each coefficient is multiplied by the value of the
associated variable for cases with the characteristics
of interest.
• To see which coefficients pertain to which cases, fill
in values of variables for different combinations of
race and the income-to-poverty ratio (IPR).
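One way to see which coefficients pertain to which cases is to compute each term separately. The sketch below does that with the example coefficients reported later in this lecture; the dictionary layout and the function names are hypothetical choices for illustration.

```python
# Example coefficients from this lecture's birth weight model (grams).
COEFS = {"intercept": 3106, "NHB": -177, "IPR": 23, "NHB_IPR": -5}


def term_contributions(nhb, ipr, coefs=COEFS):
    """Return each coefficient times the value of its variable for one case."""
    return {
        "intercept": coefs["intercept"],
        "NHB": coefs["NHB"] * nhb,
        "IPR": coefs["IPR"] * ipr,
        "NHB_IPR": coefs["NHB_IPR"] * (nhb * ipr),
    }


def predicted_bw(nhb, ipr, coefs=COEFS):
    """Predicted birth weight: the sum of all term contributions."""
    return sum(term_contributions(nhb, ipr, coefs).values())


for label, nhb in [("Non-Hispanic white", 0), ("Non-Hispanic black", 1)]:
    terms = term_contributions(nhb, ipr=2.0)
    # Keep only the terms that survive multiplication by the case's values.
    nonzero = {name: value for name, value in terms.items() if value != 0}
    print(label, nonzero, "->", predicted_bw(nhb, ipr=2.0))
```

For the reference category only the intercept and IPR terms remain; for non-Hispanic black cases all four terms contribute.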
Example: Estimated coefficients
                                   β
Intercept                          3,106
Main effect terms
  Non-Hispanic black (NHB)         –177
  Income-to-poverty ratio (IPR)    23
Interaction term
  NHB_IPR                          –5
IPR = family income ($) / Federal Poverty Level for a family of
that size and age composition.
Reference category: Non-Hispanic whites.
Interpreting the intercept
• The intercept β0 from an OLS model is an estimate of the
level of the dependent variable when continuous
variables take the value 0, for infants in the reference
category for all categorical variables.
• In a model where
– The dependent variable is birth weight in grams.
– The reference category is specified to be non-Hispanic white
infants.
• β0 is an estimate of birth weight when IPR = 0, for non-Hispanic white infants.
Review: Coding of main effect and
interaction term variables: race and income
Reference category
Case characteristics        Main effects terms    Interaction term
(selected values)           NHB       IPR         NHB_IPR
Non-H white & IPR = 0.0     0         0.0         0
Non-H white & IPR = 0.5     0         0.5         0
Non-H white & IPR = 1.0     0         1.0         0
E.g., IPR = 0.5 means family income is half the Federal Poverty Level (FPL); IPR =
2.0 means family income is twice the FPL.
For a two-category race variable (non-Hispanic white = reference category).
Calculating the value of the intercept
for one group
Predicted BW = β0 + (βNHB × NHB) + (βIPR × IPR) + (βNHB_IPR × NHB_IPR)
Case                        NHB    IPR    NHB_IPR
Non-H white & IPR = 0.0     0      0.0    0.0
The intercept for non-Hispanic whites is calculated:
Predicted BW = β0 + (βNHB × 0) + (βIPR × 0.0) + (βNHB_IPR × 0.0) = β0
Thus, the intercept for non-Hispanic white infants (when
IPR = 0) collapses to include only β0 because all of the
other coefficients in the formula are multiplied by a
value of 0.
Interpreting the
IPR/birth weight pattern
• IPR is a continuous variable
– The coefficient is an estimate of the effect on the dependent
variable of a 1-unit increase in the continuous IV, with categorical
variables set to their reference category values.
• So βIPR estimates the increment in birth weight for every
one-unit increase in IPR (e.g., from family income at the
poverty line to twice the poverty line)
– It is the slope of the IPR/birth weight curve for infants in the
reference category, in this case, non-Hispanic white infants.
Calculating values for the IPR/birth
weight curve for white infants
Predicted BW = β0 + (βNHB × NHB) + (βIPR × IPR) + (βNHB_IPR × NHB_IPR)
Case                        NHB    IPR    NHB_IPR
Non-H white & IPR = 1.5     0      1.5    0.0
Predicted BW = β0 + (βNHB × 0) + (βIPR × 1.5) + (βNHB_IPR × 0)
= β0 + (βIPR × 1.5)
Because non-Hispanic whites are the reference category for race, the
equation collapses to include only the intercept (β0) and the IPR main
effect (βIPR); the other coefficients are multiplied by 0.
In general, for non-Hispanic white infants:
Predicted BW = β0 + (βIPR × IPR)
Interpreting the race main effect
• The main effect βNHB estimates the difference
in birth weight between non-Hispanic black
infants and those in the reference category
(non-Hispanic whites), when continuous
variables are set at the value 0.
• It is an estimate of the difference in intercept
between black and white infants when IPR is
0.
Calculating the intercept for different
values of the categorical variable
Case                        NHB    IPR    NHB_IPR
Non-H white & IPR = 0.0     0      0.0    0.0
As we saw a moment ago, the intercept for non-Hispanic
whites is calculated:
Predicted BW = β0 + (βNHB × 0) + (βIPR × 0.0) + (βNHB_IPR × 0.0) = β0
Case                        NHB    IPR    NHB_IPR
Non-H black & IPR = 0.0     1      0.0    0.0
For non-Hispanic blacks, the intercept is calculated:
Predicted BW = β0 + (βNHB × 1) + (βIPR × 0.0) + (βNHB_IPR × 0.0) = β0 + βNHB
More on the race main effect
• βNHB is an estimate of the difference in intercept
between black and white infants when IPR is
0.
Intercept for black infants = β0 + βNHB = 3,106 + (–177) = 2,929
• In other words, black infants born to families
with an IPR of zero have a predicted birth
weight of 2,929 grams,
– or 177 grams LOWER than that of their white
counterparts.
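The intercept arithmetic can be checked with a short sketch that evaluates the full equation at IPR = 0 for each group. The coefficient values are the lecture's; the constant and function names are assumptions for illustration.

```python
B0, B_NHB, B_IPR, B_NHB_IPR = 3106, -177, 23, -5


def predicted_bw(nhb, ipr):
    """Full interaction equation for predicted birth weight in grams."""
    return B0 + B_NHB * nhb + B_IPR * ipr + B_NHB_IPR * nhb * ipr


# At IPR = 0 the IPR and NHB_IPR terms vanish, leaving only the intercepts.
white_intercept = predicted_bw(nhb=0, ipr=0)
black_intercept = predicted_bw(nhb=1, ipr=0)
difference = black_intercept - white_intercept

print(white_intercept, black_intercept, difference)
```

The difference between the two intercepts is exactly the NHB main effect coefficient.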
Calculating values for the IPR/birth
weight curve for white infants
Predicted BW = β0 + (βNHB × NHB) + (βIPR × IPR) + (βNHB_IPR × NHB_IPR)
= β0 + (βNHB × 0) + (βIPR × IPR) + (βNHB_IPR × 0)
= β0 + (βIPR × IPR)
Because non-Hispanic whites are the reference category for race, the
equation collapses to include only the intercept (β0) and the IPR main
effect (βIPR); the other coefficients are multiplied by 0.
Calculating values for the IPR/birth
weight curve for black infants
Case                        NHB    IPR    NHB_IPR
Non-H black & IPR = 1.5     1      1.5    1.5
Predicted BW = β0 + (βNHB × 1) + (βIPR × 1.5) + (βNHB_IPR × 1.5)
For non-Hispanic blacks, the equation includes all
three terms (βNHB, βIPR, and βNHB_IPR) because each
of those coefficients is multiplied by a non-zero
value.
Interpreting the coefficient on the
interaction between race and IPR
• The slope
– for blacks = βIPR + βNHB_IPR = 23 + (–5) = 18
– for whites = βIPR = 23
• The NHB_IPR coefficient tests whether the slope
of the IPR/birth weight pattern is different for
non-Hispanic black infants than for their non-Hispanic white counterparts.
– βNHB_IPR is thus the estimated difference in slope for
blacks compared to whites.
More on the race/IPR interaction
• The estimated coefficients mean that each 1-unit increase in IPR is associated with
– 23 grams more birth weight among non-Hispanic
white infants.
– 18 grams more birth weight among non-Hispanic
black infants.
• Those values are the slopes of the respective
IPR/BW curves for the two racial/ethnic
groups.
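A brief sketch of these group-specific slopes follows, with a cross-check that a 1-unit step in IPR in the full equation gives the same numbers. The coefficient values are the lecture's; the function names are hypothetical.

```python
B0, B_NHB, B_IPR, B_NHB_IPR = 3106, -177, 23, -5


def predicted_bw(nhb, ipr):
    """Full interaction equation for predicted birth weight in grams."""
    return B0 + B_NHB * nhb + B_IPR * ipr + B_NHB_IPR * nhb * ipr


def ipr_slope(nhb):
    """IPR main effect plus the interaction coefficient times the dummy."""
    return B_IPR + B_NHB_IPR * nhb


for label, nhb in (("white", 0), ("black", 1)):
    # Cross-check: change in predicted BW when IPR rises from 1 to 2.
    step = predicted_bw(nhb, 2) - predicted_bw(nhb, 1)
    print(label, ipr_slope(nhb), step)
```

Because the model is linear in IPR within each group, the same step size holds between any two IPR values one unit apart.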
Preparing to graph the slope of
IPR/birth weight by race
• For infants in the reference category (non-Hispanic white),
– Multiply selected values of IPR by βIPR and add to β0
to obtain predicted birth weight at interesting values
of IPR.
• For non-Hispanic black infants,
– Multiply selected values of IPR by βIPR + βNHB_IPR then
add to β0 + βNHB .
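The two recipes above amount to giving each group its own intercept and slope, then evaluating a straight line at the chosen IPR values. A minimal sketch, using the lecture's coefficients and IPR values of 0, 1, 2, 4, and 6 (chosen here for illustration; the function names are hypothetical):

```python
B0, B_NHB, B_IPR, B_NHB_IPR = 3106, -177, 23, -5


def group_line(nhb):
    """Return (intercept, slope) of the IPR/birth weight line for one group."""
    return B0 + B_NHB * nhb, B_IPR + B_NHB_IPR * nhb


def predicted_points(nhb, ipr_values):
    """Return (IPR, predicted birth weight) pairs ready for graphing."""
    intercept, slope = group_line(nhb)
    return [(ipr, intercept + slope * ipr) for ipr in ipr_values]


ipr_values = [0, 1, 2, 4, 6]
white_points = predicted_points(0, ipr_values)
black_points = predicted_points(1, ipr_values)

print("white:", white_points)
print("black:", black_points)
```

Each list can be passed directly to a charting tool as one series of the graph.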
Calculated birth weight by race
for selected values of IPR
IPR = family income in multiples of the FPL.

Non-Hispanic white
IPR    Formula                                                   Result
0      β0 + 0 × βIPR = 3,106 + 0 × 23                            3,106
1      β0 + 1 × βIPR = 3,106 + 1 × 23 = 3,106 + 23               3,129
…
6      β0 + 6 × βIPR = 3,106 + 6 × 23 = 3,106 + 138              3,244

Non-Hispanic black
IPR    Formula                                                   Result
0      β0 + βNHB + 0 × (βIPR + βNHB_IPR)
       = 3,106 – 177 + 0 × (23 – 5)                              2,929
1      β0 + βNHB + 1 × (βIPR + βNHB_IPR)
       = 3,106 – 177 + 1 × (23 – 5) = 2,929 + 1 × 18 = 2,929 + 18    2,947
…
6      β0 + βNHB + 6 × (βIPR + βNHB_IPR)
       = 3,106 – 177 + 6 × (23 – 5) = 2,929 + 6 × 18 = 2,929 + 108   3,037

β0 = 3,106; βIPR = 23; βNHB = –177; βNHB_IPR = –5
Use a spreadsheet to calculate and
graph the interaction
• Spreadsheets can
– Store
• The estimated coefficients
• The input values of the independent variables
• The correct generalized formula to calculate the predicted
values for many combinations of the IVs involved in the
interaction
– Graph the overall pattern
• See spreadsheet template and voice-over
explanation
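For readers who prefer to generate the spreadsheet input programmatically, the sketch below writes predicted values for both groups as CSV text that a spreadsheet program can open and chart. It is not the author's template: the column names, the dictionary keys, and the function name are assumptions, and only the coefficient values come from the lecture.

```python
import csv
import io


def interaction_csv(coefs, ipr_values):
    """Return CSV text with predicted birth weight by group for each IPR."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["IPR", "NHW_predicted_BW", "NHB_predicted_BW"])
    for ipr in ipr_values:
        # Reference category: intercept plus IPR main effect.
        white = coefs["b0"] + coefs["ipr"] * ipr
        # Non-Hispanic black: shifted intercept and shifted slope.
        black = (coefs["b0"] + coefs["nhb"]) + (coefs["ipr"] + coefs["nhb_ipr"]) * ipr
        writer.writerow([ipr, white, black])
    return buffer.getvalue()


coefs = {"b0": 3106, "nhb": -177, "ipr": 23, "nhb_ipr": -5}
csv_text = interaction_csv(coefs, range(0, 7))
print(csv_text)
```

Storing the coefficients in one place and the formula in another mirrors the spreadsheet layout described above: changing a coefficient updates every predicted value.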
Predicted birth weight by
race/ethnicity and IPR
[Figure: line chart of predicted birth weight (grams; y-axis from 2,800 to 3,300) by IPR (x-axis from 0 to 6), with one line per racial/ethnic group.]
Chart annotations:
• β0 = intercept = 3,106 = predicted BW for ref cat *
• βIPR = 23 = slope of IPR/BW curve for ref cat *
• β0 + βNHB = 3,106 + (–177) = 2,929 = intercept for black infants
• βIPR + βNHB_IPR = 23 – 5 = 18 = slope of IPR/BW curve for non-Hispanic black infants
* Ref cat = Reference category = non-Hispanic white infants.
Overall shape of the race/IPR/
birth weight pattern
• Based on this set of βs, black infants have
– a lower birth weight than whites at all IPR levels.
• Negative coefficient on the NHB main effect yields a
lower intercept for blacks than for whites.
– a slower rate of birth weight increase as IPR rises.
• Negative coefficient on NHB_IPR, which yields a
shallower slope of the IPR/birth weight curve for blacks
than for whites.
• Thus the deficit in birth weight for blacks
widens with increasing IPR.
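The widening deficit follows from subtracting the white equation from the black equation: the β0 and βIPR terms cancel, leaving βNHB + (βNHB_IPR × IPR). A small sketch with the lecture's coefficients (the function name and the selected IPR values are illustrative assumptions):

```python
B_NHB, B_NHB_IPR = -177, -5


def black_white_gap(ipr):
    """Predicted birth weight for black minus white infants at a given IPR."""
    return B_NHB + B_NHB_IPR * ipr


gaps = {ipr: black_white_gap(ipr) for ipr in (0, 1, 2, 4, 6)}
print(gaps)
```

The gap starts at –177 grams when IPR = 0 and grows by 5 grams for each additional unit of IPR.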
Summary
• An interaction between a continuous and a categorical
independent variable will yield differences in the
intercept and/or slope of the association between the
continuous IV and the DV.
• Calculating the overall shape of an interaction requires
adding together the pertinent main effects and
interaction term βs for combinations of the categorical
IV and selected values of the continuous IV in the
interaction.
– A spreadsheet can be helpful for storing and organizing the
βs, input values, and formulas.
Be parsimonious in deciding which
interactions to test
• The number of variables in the regression model
proliferates rapidly with each additional
interaction.
• Specify interactions only between key
independent variables.
• Communicating results becomes unwieldy:
– Considerable behind-the-scenes calculations.
– Extra tables or charts to convey the shape of the
interaction.
Suggested resources
• Chapter 16, Miller, J. E. 2013. The Chicago
Guide to Writing about Multivariate Analysis,
2nd Edition.
• Chapters 8 and 9 of Cohen et al. 2003. Applied
Multiple Regression/Correlation Analysis for
the Behavioral Sciences, 3rd Edition. Florence,
KY: Routledge.
Suggested online resources
• Podcasts on
– Creating charts to present interactions
– Writing prose to present results of interactions
– Introduction to testing statistical significance of
interactions
– Approaches to testing statistical significance of
interactions
– Using simple slopes for compound coefficients
– Using alternative reference categories to test
contrasts within interactions
Suggested exercises
• Study guide to The Chicago Guide to Writing
about Multivariate Analysis, 2nd Edition.
– Problem set for chapter 16
– Suggested course extensions for chapter 16
Contact information
Jane E. Miller, PhD
jmiller@ifh.rutgers.edu
Online materials available at
http://press.uchicago.edu/books/miller/multivariate/index.html