Research Hypotheses and
Multiple Regression: 2
• Comparing model performance across populations
• Comparing model performance across criteria
Comparing model performance across groups
This involves the same basic idea as comparing a bivariate
correlation across groups
• only now we’re working with multiple predictors in a
multivariate model
This sort of analysis has multiple important uses …
• theoretical – different behavioral models for different groups?
• psychometric – important part of evaluating if “measures”
are equivalent for different groups (such as gender, race,
across cultures or within cultures over time) is unraveling
the multivariate relationships among measures & behaviors
• applied – prediction models must not be “biased”
Comparing model performance across groups
There are three different questions involved in this comparison …
Does the predictor set “work better” for one group than another?
• asked by comparing the R² of the predictor set from the 2 groups
• we will build a separate model for each group (allowing
different regression weights for each group)
• then use Fisher’s Z-test to compare the resulting R2s
Are the models “substitutable”?
• use a cross-validation technique to compare the models
• use Hotelling’s t-test or Steiger’s Z-test to compare the R²s of
the “direct” & “crossed” models
Are the regression weights of the 2 groups “different” ?
• use Z-tests to compare the weights predictor-by-predictor
• or use interaction terms to test for group differences
Things to remember when doing these tests!!!
• the more collinear the variables being substituted, the more
correlated the two models will be -- for this reason there can be
strong collinearity between two models that share no predictors
• the weaker the two models (lower R²), the less likely they are to
be differentially correlated with the criterion
• non-nil H0 tests are possible -- and might be more informative!!
• these are not very powerful tests !!!
• compared to avoiding a Type II error when looking for a given
r, you need nearly twice the sample size to avoid a Type II
error when looking for an r-difference of the same magnitude
• these tests are also less powerful than tests comparing
nested models
So, be sure to consider sample size, power and the magnitude of
the r-difference between the non-nested models you compare !
Comparing multiple regression models across groups
Group #1 (larger n) -- “direct model”: y’1 = b1x + b1z + a1, giving R²D1
Group #2 (smaller n) -- “direct model”: y’2 = b2x + b2z + a2, giving R²D2
Does the predictor set “work better” for one group than another?
Compare R²D1 & R²D2
using Fisher’s Z-test
• Retain H0: predictor set “works equally” for 2 groups
• Reject H0: predictor set “works better” for higher R2 group
Remember!!
We are comparing the R2 “fit” of the models…
But, be sure to use R in the calculator!!!!
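That comparison can be sketched in code (a minimal illustration with made-up values; `fisher_z_compare` is a hypothetical helper name). The key point from the slide is that the Fisher transform is applied to R -- the square root of R² -- not to R² itself:

```python
import math

def fisher_z_compare(R1, n1, R2, n2):
    """Compare two independent multiple correlations (R, not R-squared)
    via the Fisher r-to-Z transform."""
    z1 = math.atanh(R1)   # Fisher transform of each R
    z2 = math.atanh(R2)
    se = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    return (z1 - z2) / se  # refer to the standard normal distribution

# e.g. R² = .36 (R = .6) with n = 100 vs R² = .16 (R = .4) with n = 80
Z = fisher_z_compare(math.sqrt(0.36), 100, math.sqrt(0.16), 80)
```

The resulting Z is referred to the standard normal distribution (e.g., |Z| > 1.96 for a two-tailed test at α = .05).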
Are the multiple regression models “substitutable” across groups?
Group #1 (larger n)
• “G1 direct model”: y’1 = b1x + b1z + a1, giving R²D1
• Apply the model (bs & a) from Group 2 to the data from Group 1:
“G1 crossed model”: y’1 = b2x + b2z + a2, giving R²X1
• Compare R²D1 & R²X1
Group #2 (smaller n)
• “G2 direct model”: y’2 = b2x + b2z + a2, giving R²D2
• Apply the model (bs & a) from Group 1 to the data from Group 2:
“G2 crossed model”: y’2 = b1x + b1z + a1, giving R²X2
• Compare R²D2 & R²X2
Both comparisons use Hotelling’s t-test or Steiger’s Z-test; each will
need rDX -- the correlation between the direct & crossed predictions --
from each group
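A sketch of how the direct & crossed R²s (and the rDX the tests require) could be computed. The data here are simulated purely for illustration -- the group sizes, weights, and names are made up:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative data: two groups measured on predictors x, z and criterion y
def make_group(n, bx, bz, a):
    x, z = rng.normal(size=n), rng.normal(size=n)
    y = bx * x + bz * z + a + rng.normal(size=n)
    return x, z, y

x1, z1, y1 = make_group(120, 0.8, 0.5, 1.0)   # Group 1 (larger n)
x2, z2, y2 = make_group(80, 0.3, 0.6, 0.5)    # Group 2 (smaller n)

def fit(x, z, y):
    """OLS fit of y' = bx*x + bz*z + a; returns [bx, bz, a]."""
    X = np.column_stack([x, z, np.ones_like(x)])
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    return b

b1, b2 = fit(x1, z1, y1), fit(x2, z2, y2)

def r2(pred, y):
    """R² as the squared correlation between predictions and criterion."""
    return np.corrcoef(pred, y)[0, 1] ** 2

pred_direct = x1 * b1[0] + z1 * b1[1] + b1[2]    # G1 direct model
pred_crossed = x1 * b2[0] + z1 * b2[1] + b2[2]   # G1 crossed model (G2's weights)
R2_D1, R2_X1 = r2(pred_direct, y1), r2(pred_crossed, y1)
r_DX = np.corrcoef(pred_direct, pred_crossed)[0, 1]  # needed for the dependent-r tests
```

Because the direct model's weights are chosen to maximize fit in its own group, R²X1 can never exceed R²D1 in-sample; the question the tests address is whether the drop is significant.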
Are the regression weights of the 2 groups “different” ?
• test of an interaction of predictor and grouping variable
• Z-tests using pooled standard error terms
Asking if a single predictor has a different regression weight for
two different groups is equivalent to asking if there is an
interaction between that predictor and group membership.
(Please note that asking about a regression slope difference and
about a correlation difference are two different things – you know
how to use Fisher’s Test to compare correlations across groups)
This approach uses a single model, applied to the full sample…
Criterion’ = b1predictor + b2group + b3predictor*group + a
If b3 is significant, then there is a difference between the
predictor’s regression weights for the two groups.
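A minimal sketch of this single-model interaction test on simulated data (the true slopes and sample size are made up; standard errors come from the usual OLS formula rather than a library, to keep it self-contained):

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated pooled sample: group coded 0/1; the predictor's slope truly differs by group
n = 300
group = rng.integers(0, 2, size=n).astype(float)
pred = rng.normal(size=n)
y = 0.4 * pred + 0.3 * group + 0.9 * pred * group + rng.normal(size=n)

# Single model on the full sample: y' = b1*pred + b2*group + b3*pred*group + a
X = np.column_stack([pred, group, pred * group, np.ones(n)])
b, *_ = np.linalg.lstsq(X, y, rcond=None)

# OLS standard errors: sqrt of the diagonal of sigma^2 * (X'X)^-1
resid = y - X @ b
sigma2 = resid @ resid / (n - X.shape[1])
se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))

t_interaction = b[2] / se[2]   # large |t| -> the slope differs across groups
```

With df = n - 4, a |t| beyond roughly 1.97 would reject H0 of equal slopes here.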
However, this approach gets cumbersome when applied to
models with multiple predictors. With 3 predictors we would look
at the model …
y’ = b1G + b2P1 + b3G*P1 + b4P2 + b5G*P2 + b6P3 + b7G*P3 +a
Each interaction term is designed to tell us if a particular
predictor has a regression slope difference across the groups.
Because the collinearity among the interaction terms, and between
each predictor’s term and the other predictors’ interaction terms,
all influence the interaction b weights, there has been
dissatisfaction with how well this approach works for multiple
predictors.
Also, because this approach does not involve constructing
different models for each group, it does not allow…
• the comparison of the “fit” of the two models
• an examination of the “substitutability” of the two models
Another approach is to apply a significance test to each
predictor’s b weights from the two models – to directly test for a
significant difference. (Again, this is different from comparing the
same correlation from 2 groups).
The test statistic is …

Z = (bG1 - bG2) / SEb-difference

However, there are competing formulas for “SEb-difference” …
The most common formula (e.g., Cohen, 1983) is…

SEb-difference = √[ ((dfbG1 * SEbG1²) + (dfbG2 * SEbG2²)) / (dfbG1 + dfbG2) ]
However, work by two research groups has demonstrated
that, for large-sample studies (both Ns > 30), this standard
error estimator is negatively biased (it produces error
estimates that are too small), so that the resulting Z-values
are too large, promoting Type I & Type III errors.
• Brame, Paternoster, Mazerolle & Piquero (1998)
• Clogg, Petkova & Haritou (1995)
Leading to the formulas …

SEb-difference = √( SEbG1² + SEbG2² )

and…

Z = (bG1 - bG2) / √( SEbG1² + SEbG2² )
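The Clogg et al. version of the test reduces to a few lines (the b and SE values below are made-up illustrations):

```python
import math

def z_b_difference(b_g1, se_g1, b_g2, se_g2):
    """Z-test for the difference between the same predictor's b weights
    estimated separately in two groups (Clogg, Petkova & Haritou, 1995)."""
    return (b_g1 - b_g2) / math.sqrt(se_g1**2 + se_g2**2)

# e.g. b = .50 (SE = .10) in Group 1 vs b = .20 (SE = .12) in Group 2
Z = z_b_difference(0.50, 0.10, 0.20, 0.12)
```

Here Z ≈ 1.92, just short of the 1.96 two-tailed criterion -- an illustration of how modest b-differences demand substantial precision.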
Match the question with the most direct test…

• Practice is better correlated to performance for novices
than for experts.
→ Comparing r across groups
• The structure of a model involving practice, motivation
& recent experience is different for novices than experts.
→ Comparing R² of direct & crossed models
• Practice has a larger regression weight in the model for
novices than for experts.
→ Comparing b across groups
• Practice contributes to the regression model for
novices, but not for experts.
→ Testing b for each group
• A model involving practice, motivation & recent
experience better predicts performance of novices than
experts.
→ Comparing R² across groups
• Practice is correlated with performance for novices, but
not for experts.
→ Testing r for each group
Comparing model performance across criteria
• same basic idea as comparing correlated correlations, but now
the difference between the models is the criterion, not
the predictor. There are two important uses of this type of
comparison
• theoretical/applied -- do we need separate models to predict
related behaviors?
• psychometric -- do different measures of the same construct
have equivalent models (i.e., measure the same thing) ?
• the process is similar to testing for group differences, but what
changes is the criterion that is used, rather than the group
that is used
• we’ll apply the Hotelling’s t-test and/or Steiger’s Z-test to
compare the structure of the two models
Are multiple regression models “substitutable” across criteria?
Criterion #1 “A”
• “A direct model”: A’ = bx + bz + a, giving R²DA
• Apply the model (bs & a) from the B model to criterion A:
“A crossed model”: A’ = bx + bz + a (using B’s weights), giving R²XA
• Compare R²DA & R²XA
Criterion #2 “B”
• “B direct model”: B’ = bx + bz + a, giving R²DB
• Apply the model (bs & a) from the A model to criterion B:
“B crossed model”: B’ = bx + bz + a (using A’s weights), giving R²XB
• Compare R²DB & R²XB
Both comparisons use Hotelling’s t-test or Steiger’s Z-test (will need
rDX -- the r between the direct & crossed predictions)
Retaining the H0 for each comparison suggests comparability of the
“structure” of a single model for the two criterion variables -- note there is no
direct test of the differential “fit” of the two models to the two criteria.
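One common form of Hotelling's t for this situation -- two correlations sharing the same criterion, r(criterion, direct prediction) vs r(criterion, crossed prediction), which is why rDX is needed -- can be sketched as follows. The correlation values are made up for illustration, and Steiger's modifications are often preferred for better behavior in smaller samples:

```python
import math

def hotelling_t(r_yD, r_yX, r_DX, n):
    """Hotelling's t for two dependent correlations that share the criterion:
    r(y, direct prediction) vs r(y, crossed prediction), given r_DX between
    the two predictions and sample size n.  df = n - 3."""
    detR = 1 - r_yD**2 - r_yX**2 - r_DX**2 + 2 * r_yD * r_yX * r_DX
    return (r_yD - r_yX) * math.sqrt((n - 3) * (1 + r_DX) / (2 * detR))

# e.g. direct r = .60, crossed r = .50, r between the two predictions = .80, n = 100
t = hotelling_t(0.60, 0.50, 0.80, 100)
```

The statistic is referred to a t distribution with n - 3 degrees of freedom.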