Correcting for reliability 1
Running head: CORRECTIONS IN META-ANALYSIS
Correcting for Reliability and Range-Restriction in Meta-Analysis
Frederick L. Oswald
and
Patrick D. Converse
Michigan State University
Frederick L. Oswald and Patrick D. Converse, Department of Psychology, Michigan
State University. Symposium presented at the 20th Annual Conference of the Society for
Industrial and Organizational Psychology, Los Angeles, CA. Please direct all
correspondence to Fred Oswald, Psychology Building, Michigan State University, East
Lansing, MI 48824-1116; e-mail: foswald@msu.edu
Correcting for Reliability and Range-Restriction in Meta-Analysis
For more than 25 years, meta-analysis has been used as a quantitative tool in
organizational research for summarizing empirical findings across studies,
complementing the narrative review by demonstrating whether validities across research
conducted in a given domain show consistent levels (a lack of situational specificity) or at
least are non-zero (validity generalization). Part of the process in reaching such
conclusions is to correct observed effect sizes for statistical artifacts that systematically
attenuate observed predictor-criterion correlations (see Hunter & Schmidt, 2004).
Measurement unreliability is pervasive in psychological measures. Given a reasonable
estimate of true-score variance for a measure, that value can be substituted for the
observed variance, which should provide a more accurate estimate of the effect in the
population of interest. Range restriction is another artifact that attenuates observed
correlations in personnel selection settings, such that the effect in the unrestricted
population will be larger than in the restricted sample. Range restriction may be direct,
where individuals in the restricted sample were selected top-down on the predictor
measure, or it may be incidental, where selection occurs on a third variable that is
correlated with the predictor and/or the criterion. Having a substantive understanding of
the nature of error in one’s measures and the selection process that led to the observed
effect size helps lead one toward making appropriate statistical corrections. Of course,
these corrections can be informative when estimating population correlations in one’s
own study – no study needs to wait for a meta-analysis to make such corrections.
Several articles have outlined the computational details for different methods of
statistically correcting correlations and computing their standard errors (Bobko & Riecke,
1980; Hunter, Schmidt, & Le, 2004; Raju & Brand, 2003; Sackett & Yang, 2000). The
present paper outlines another method whose rationale is brief yet offers a
compelling alternative to previous approaches. Several correction approaches are
then applied to examples that vary the values typically found in psychological
research for measurement reliability, direct range restriction, and criterion-related validity.
Rationale
The proposed method for correcting correlations takes into account the fact that
personnel selection studies usually have incumbent data on an organizational criterion of
interest; they do not have criterion data for applicants who did not get hired.
Additionally, because incumbents were selected on a predictor often correlated with these
criteria, the criterion reliability in the incumbent sample typically underestimates the
criterion reliability in the applicant population due to this incidental range restriction. In
other words, selection researchers often have the predictor reliability in the applicant
sample, but only have the restricted criterion reliability in the incumbent sample. An
appropriate simultaneous correction for (a) direct range restriction on the predictor and
(b) measurement unreliability requires the estimate for the unrestricted reliability
coefficient for the criterion (Stauffer & Mendoza, 2001).
Stauffer and Mendoza (2001) address a related problem concerning predictor
reliability, in which only the restricted reliability for the predictor is known. More
importantly, they make the general argument that simultaneous correction requires the
range-restriction correction based on the observed (restricted) correlation as well as the
unrestricted reliability coefficients. The following method explains how to estimate the
unrestricted criterion reliability and then applies the simultaneous correction formula.
Method
An estimate of the unrestricted criterion reliability coefficient can be obtained in
the following steps. First, Gulliksen provides an equation relating criterion reliability to
validity under direct range restriction on the predictor (Gulliksen, 1950, Eq. 22, p. 140).
He shows that, under the Classical Test Theory assumptions of (a) equal standard errors of
measurement and (b) equal standard errors of the estimate regardless of the amount of
range restriction on an applicant sample, the following ratio should equal a constant
$C$:
$$C = \frac{1 - r_{yy}}{1 - r_{xy}^{2}} \tag{1}$$
where $r_{yy}$ and $r_{xy}$ are the range-restricted criterion reliability and validity coefficient,
respectively. This constant must also hold when there is no range restriction, such that
$$C = \frac{1 - R_{yy}}{1 - R_{xy}^{2}} \tag{2}$$
where $R_{yy}$ and $R_{xy}$ are the unrestricted criterion reliability and validity coefficient,
respectively. Then, setting Equations 1 and 2 equal to one another and rearranging, the
formula for the unrestricted criterion reliability becomes
$$R_{yy} = 1 - \frac{(1 - r_{yy})(1 - R_{xy}^{2})}{1 - r_{xy}^{2}}. \tag{3}$$
This equation has two known values, $r_{yy}$ and $r_{xy}$, and two unknown values, $R_{yy}$ and $R_{xy}$.
$R_{xy}$ can be estimated, however, using the traditional formula for direct range restriction
on the predictor $x$ (Gulliksen, 1950, Eq. 18, p. 137):
$$R_{xy} = \frac{k r_{xy}}{\sqrt{k^{2} r_{xy}^{2} - r_{xy}^{2} + 1}} \tag{4}$$
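Equations 3 and 4 can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical (they do not come from the original paper), and the inputs in the demonstration (an observed validity of .3, a restricted criterion reliability of .5, and k = 2) are arbitrary values chosen for the example.

```python
import math


def correct_validity_for_direct_rr(r_xy: float, k: float) -> float:
    """Equation 4: validity corrected for direct range restriction on x.

    k is the ratio of unrestricted to restricted standard deviations of x.
    """
    return k * r_xy / math.sqrt(k**2 * r_xy**2 - r_xy**2 + 1.0)


def unrestricted_criterion_reliability(r_yy: float, r_xy: float, k: float) -> float:
    """Equation 3: criterion reliability corrected for incidental range restriction."""
    R_xy = correct_validity_for_direct_rr(r_xy, k)
    return 1.0 - (1.0 - r_yy) * (1.0 - R_xy**2) / (1.0 - r_xy**2)


# Illustrative (assumed) inputs, not values reported in the paper.
r_xy, r_yy, k = 0.3, 0.5, 2.0
print("R_xy:", correct_validity_for_direct_rr(r_xy, k))
print("R_yy:", unrestricted_criterion_reliability(r_yy, r_xy, k))
```

With these inputs the restricted criterion reliability of .5 corresponds to an estimated unrestricted reliability of roughly .61, which illustrates the underestimation described in the Rationale. When k = 1 (no range restriction), both functions return their inputs unchanged.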
where all terms are defined as before, and $k$ is the ratio of unrestricted to restricted
standard deviations for $x$. Before we substitute results from this equation into Equation 3,
first let us recall the formula for the correlation corrected for measurement unreliability:
$$R_{xy}^{*} = \frac{R_{xy}}{\sqrt{R_{xx} R_{yy}}} \tag{5}$$
where $R_{xy}$ is the validity coefficient corrected for range restriction, $R_{xx}$ is the estimate of
predictor reliability, and $R_{yy}$ is the estimate of criterion reliability corrected for incidental
range restriction. Again, Stauffer and Mendoza (2001) note that range restriction
corrections are based on the nature of the range restriction, not on the nature of the type
of reliability coefficient available. Therefore, $R_{xy}$ corrects the observed validity
coefficient, $r_{xy}$, for range restriction, where the observed validity coefficient is not
corrected for measurement unreliability.
Therefore, Equation 4 substitutes for $R_{xy}$ in the numerator of Equation 5. For the
denominator of Equation 5, $R_{xx}$ is usually available from applicant sample data, and $R_{yy}$
has been estimated in Equation 3, which together yield the following:
$$R_{xy}^{*} = \frac{k r_{xy}}{\sqrt{R_{xx}\left[1 - \frac{(1 - r_{yy})(1 - R_{xy}^{2})}{1 - r_{xy}^{2}}\right]\left(k^{2} r_{xy}^{2} - r_{xy}^{2} + 1\right)}} \tag{6}$$
Equation 6, in turn, requires substituting Equation 4 for $R_{xy}$ in the denominator. Then,
with this substitution and a fair amount of algebraic rearrangement, Equation 6 becomes
$$R_{xy}^{*} = \frac{k r_{xy}}{\sqrt{R_{xx}\left(k^{2} r_{xy}^{2} - r_{xy}^{2} + r_{yy}\right)}} \tag{7}$$
This equation allows one to estimate $R_{xy}^{*}$, the correlation corrected for range restriction
and measurement error variance, based on the available estimates: (a) the unrestricted
predictor reliability, $R_{xx}$; (b) the incidentally range-restricted criterion reliability, $r_{yy}$; and
(c) the restricted validity coefficient, $r_{xy}$.
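The rearrangement leading from Equation 6 to Equation 7 can be sketched as follows. The symbol $D$ is shorthand introduced here for convenience (it is not part of the original notation) for the quantity under the square root in Equation 4:

$$D = k^{2} r_{xy}^{2} - r_{xy}^{2} + 1, \qquad R_{xy}^{2} = \frac{k^{2} r_{xy}^{2}}{D}, \qquad 1 - R_{xy}^{2} = \frac{1 - r_{xy}^{2}}{D}.$$

Substituting the last expression into Equation 3 cancels the factor $1 - r_{xy}^{2}$:

$$R_{yy} = 1 - \frac{1 - r_{yy}}{D} = \frac{k^{2} r_{xy}^{2} - r_{xy}^{2} + r_{yy}}{D}.$$

The product $R_{yy} D$ in the denominator of Equation 6 therefore reduces to $k^{2} r_{xy}^{2} - r_{xy}^{2} + r_{yy}$, so that

$$R_{xy}^{*} = \frac{k r_{xy}}{\sqrt{R_{xx} R_{yy} D}} = \frac{k r_{xy}}{\sqrt{R_{xx}\left(k^{2} r_{xy}^{2} - r_{xy}^{2} + r_{yy}\right)}},$$

which is Equation 7.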
Empirical Comparisons
Table 2 empirically compares methods applied to examples that combine different
levels of the observed validity (rxy = .1, .2, .3), selection ratio (SR = 20%, 50%), observed
(unrestricted) applicant predictor reliability (Rxx = .7, .9), and observed (incidentally
restricted) incumbent criterion reliability (ryy = .5, .7). All factors are completely crossed
yielding 24 combinations.
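The 24 crossed conditions can be sketched in Python as shown below. One step is an assumption on our part: the text does not state how the selection ratio is converted into $k$, so the sketch assumes a standard normal predictor with strict top-down selection and takes $k$ as the reciprocal of the standard deviation of the resulting truncated normal distribution. The function names are hypothetical. Under this assumption, Equation 7 appears to reproduce the corrected value reported for case 1 in Table 2 (0.3492271).

```python
import math
from itertools import product
from statistics import NormalDist


def sd_ratio_from_selection_ratio(sr: float) -> float:
    """Assumed conversion of a selection ratio to k (unrestricted SD / restricted SD).

    Assumes x is standard normal and the top `sr` proportion is selected.
    """
    std_normal = NormalDist()
    cut = std_normal.inv_cdf(1.0 - sr)       # cut score on x
    lam = std_normal.pdf(cut) / sr           # mean of x in the selected group
    restricted_var = 1.0 + cut * lam - lam**2  # variance of the truncated normal
    return 1.0 / math.sqrt(restricted_var)


def corrected_validity(r_xy: float, k: float, R_xx: float, r_yy: float) -> float:
    """Equation 7: validity corrected for range restriction and unreliability."""
    return k * r_xy / math.sqrt(R_xx * (k**2 * r_xy**2 - r_xy**2 + r_yy))


# Fully crossed design: r_xy varies slowest, r_yy fastest.
grid = list(product((0.1, 0.2, 0.3), (0.2, 0.5), (0.7, 0.9), (0.5, 0.7)))

for case, (r_xy, sr, R_xx, r_yy) in enumerate(grid, start=1):
    k = sd_ratio_from_selection_ratio(sr)
    print(case, r_xy, sr, R_xx, r_yy, round(corrected_validity(r_xy, k, R_xx, r_yy), 7))
```

If a different link between selection ratio and $k$ were intended (for example, an empirically observed ratio of standard deviations), only `sd_ratio_from_selection_ratio` would need to change.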
Two correction methods were compared (see Table 1 for a summary of them).
The Proposed Method reflects the approach just described: the correction first applies the
direct range restriction formula to the observed correlation, which is then corrected using
the unrestricted predictor reliability and the unrestricted criterion reliability, the latter of
which is estimated from the incidentally restricted criterion reliability. The Rule-of-Thumb Method suggests first correcting the range-restricted observed validity by the
range-restricted criterion reliability, then correcting that value for range restriction and
the unrestricted predictor reliability (e.g., see Raju, Burke, Normand, & Langlois, 1991).
Table 1
Summary of Two Methods for Simultaneous Corrections to the Observed Correlation

                                          Method 1: Proposed Method        Method 2: Rule-of-Thumb Method
Range restriction is based on…            Observed correlation             Corrected correlation (corrected by
                                                                           the restricted criterion reliability)
Predictor reliability coefficient is…     Observed and unrestricted        Observed and unrestricted
Criterion reliability coefficient is…     Estimated and unrestricted       Observed and incidentally range
                                          (corrected for incidental        restricted (used in the range
                                          range restriction)               restriction correction)
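The two procedures summarized in Table 1 can be sketched side by side in Python. This is one reading of the verbal descriptions rather than the authors' own code: the function names are hypothetical, the Rule-of-Thumb steps follow the order given in the text (criterion unreliability first, then range restriction, then predictor unreliability), and the value of k in the demonstration is an arbitrary illustrative choice.

```python
import math


def range_restriction_correction(r: float, k: float) -> float:
    """Equation 4 applied to any correlation r, with k = unrestricted SD / restricted SD."""
    return k * r / math.sqrt(k**2 * r**2 - r**2 + 1.0)


def proposed_method(r_xy: float, k: float, R_xx: float, r_yy: float) -> float:
    """Method 1: range restriction on the observed validity, then unrestricted reliabilities."""
    R_xy = range_restriction_correction(r_xy, k)                      # Equation 4
    R_yy = 1.0 - (1.0 - r_yy) * (1.0 - R_xy**2) / (1.0 - r_xy**2)      # Equation 3
    return R_xy / math.sqrt(R_xx * R_yy)                              # Equation 5


def rule_of_thumb_method(r_xy: float, k: float, R_xx: float, r_yy: float) -> float:
    """Method 2: restricted criterion reliability first, then range restriction, then R_xx."""
    r_corrected = r_xy / math.sqrt(r_yy)            # correct for criterion unreliability
    R_corrected = range_restriction_correction(r_corrected, k)
    return R_corrected / math.sqrt(R_xx)            # correct for predictor unreliability


# Illustrative (assumed) inputs: k = 2.0 is not a value taken from the paper.
for r_xy in (0.1, 0.2, 0.3):
    a = proposed_method(r_xy, 2.0, 0.7, 0.5)
    b = rule_of_thumb_method(r_xy, 2.0, 0.7, 0.5)
    print(r_xy, a, b)
```

Printing both values for each condition makes it easy to compare the methods directly across a range of inputs.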
Results and Discussion
Given the call by Stauffer and Mendoza (2001) to base range-restriction
corrections on observed validities, and then to correct those by unrestricted predictor and
criterion reliabilities (the latter being estimated), we fully expected to find empirical
differences between the Proposed Method, which is based on their recommendation, and the
Rule-of-Thumb Method. However, the findings we obtained were exactly the same (see
Table 2), which led us to believe, and then confirm, that these two methods are completely
equivalent algebraically. What this means is that the spirit of the message by Stauffer
and Mendoza (2001) is correct and perhaps has gone largely unrecognized, but in the end
we found out (the hard way) that following their call led to results that are no different
from the rule-of-thumb approach that meta-analysts already adopt when correcting
correlations individually for statistical artifacts.
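The algebraic equivalence can be seen in a few lines. The sketch assumes that the Rule-of-Thumb Method is read as three steps: dividing the observed validity by the square root of the restricted criterion reliability, applying Equation 4 to the result, and dividing by the square root of the unrestricted predictor reliability. The symbol $r_{c}$ is shorthand introduced here for the first intermediate value.

$$r_{c} = \frac{r_{xy}}{\sqrt{r_{yy}}}$$

$$\frac{k r_{c}}{\sqrt{k^{2} r_{c}^{2} - r_{c}^{2} + 1}} = \frac{k r_{xy} / \sqrt{r_{yy}}}{\sqrt{\left(k^{2} r_{xy}^{2} - r_{xy}^{2} + r_{yy}\right) / r_{yy}}} = \frac{k r_{xy}}{\sqrt{k^{2} r_{xy}^{2} - r_{xy}^{2} + r_{yy}}}$$

$$\frac{1}{\sqrt{R_{xx}}} \cdot \frac{k r_{xy}}{\sqrt{k^{2} r_{xy}^{2} - r_{xy}^{2} + r_{yy}}} = \frac{k r_{xy}}{\sqrt{R_{xx}\left(k^{2} r_{xy}^{2} - r_{xy}^{2} + r_{yy}\right)}}$$

The final expression is Equation 7, so under this reading the two methods must give identical values for any combination of inputs.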
Table 2
A Comparison of the Proposed Method with the Rule-of-Thumb Method

                                corrected-rxy
case   rxy   SR    Rxx   ryy    Proposed Method   Rule-of-Thumb
  1    0.1   0.2   0.7   0.5    0.3492271         0.3492271
  2    0.1   0.2   0.7   0.7    0.2980042         0.2980042
  3    0.1   0.2   0.9   0.5    0.3079893         0.3079893
  4    0.1   0.2   0.9   0.7    0.2628150         0.2628150
  5    0.1   0.5   0.7   0.5    0.2756176         0.2756176
  6    0.1   0.5   0.7   0.7    0.2340742         0.2340742
  7    0.1   0.5   0.9   0.5    0.2430719         0.2430719
  8    0.1   0.5   0.9   0.7    0.2064340         0.2064340
  9    0.2   0.2   0.7   0.5    0.6375673         0.6375673
 10    0.2   0.2   0.7   0.7    0.5568183         0.5568183
 11    0.2   0.2   0.9   0.5    0.5622815         0.5622815
 12    0.2   0.2   0.9   0.7    0.4910676         0.4910676
 13    0.2   0.5   0.7   0.5    0.5252105         0.5252105
 14    0.2   0.5   0.7   0.7    0.4518904         0.4518904
 15    0.2   0.5   0.9   0.5    0.4631921         0.4631921
 16    0.2   0.5   0.9   0.7    0.3985299         0.3985299
 17    0.3   0.2   0.7   0.5    0.8459926         0.8459926
 18    0.3   0.2   0.7   0.7    0.7586787         0.7586787
 19    0.3   0.2   0.9   0.5    0.7460953         0.7460953
 20    0.3   0.2   0.9   0.7    0.6690917         0.6690917
 21    0.3   0.5   0.7   0.5    0.7334763         0.7334763
 22    0.3   0.5   0.7   0.7    0.6422888         0.6422888
 23    0.3   0.5   0.9   0.5    0.6468653         0.6468653
 24    0.3   0.5   0.9   0.7    0.5664455         0.5664455

Note. rxy = observed validity coefficient, SR = selection ratio, Rxx = unrestricted predictor
reliability, ryy = incidentally restricted criterion reliability.
This study’s focus was on correcting individual studies in a meta-analysis,
considering the incidental range-restriction effect on criterion reliability. We did not
consider cases where predictor reliability was only available for the sample of
incumbents who were directly selected on the predictor; we assumed that predictor
information would be available for all applicants. Our continuing work will investigate
this, however, as this may in fact lead to results differing from the rule-of-thumb method
(see Stauffer & Mendoza, 2001, p. 65, Eqs. 4 and 5).
It is important to note some boundary conditions of this work
confirming the rule-of-thumb approach. First, we did not investigate corrections to
correlations that use artifact distributions, an approach adopted when the set of studies in
a meta-analysis does not report complete psychometric information (Hunter & Schmidt,
2004). Second, correcting for incidental range restriction on the predictor, rather than
direct range restriction, is an important concern in cases where incumbents were selected
explicitly on a variable correlated with the predictor of interest. We did not model that
effect in this study because it has been thoroughly investigated in detailed simulations
elsewhere (see Le, 2003).
In conclusion, we want to make the general point that researchers, before
conducting a meta-analysis, should develop a deep familiarity with the measures and
settings used in the research domain of interest. Additionally, researchers should be well
aware that meta-analysis is subject to many of the threats to validity that individual
studies face (Shadish, Cook, & Campbell, 2002). Given that, statistical corrections
should not be applied mechanically (see Oswald & McCloy, 2003), but where they can be
applied appropriately, they pay off by increasing the accuracy of the point estimates, a gain
that offsets the increase in the associated standard error (Bobko & Riecke, 1980).
The present study showed that meta-analysts employing the traditional rule-of-thumb
method, which uses an incidentally range-restricted criterion reliability coefficient (and an
unrestricted predictor reliability coefficient), achieve the same results as the proposed
method, which follows Stauffer and Mendoza’s (2001) recommendation by correcting for
this incidental range restriction before applying a simultaneous correction to the observed
validity.
References
Bobko, P., & Riecke, A. (1980). Large sample estimates for standard errors of functions
of correlation coefficients. Applied Psychological Measurement, 4, 385-398.
Gulliksen, H. (1950). Theory of mental tests. New York: Wiley.
Hunter, J. E., Schmidt, F. L., & Le, H. (2004). Implications of direct and indirect range
restriction for meta-analysis methods and findings. Unpublished manuscript.
Hunter, J. E., & Schmidt, F. L. (2004). Methods of meta-analysis: Correcting error and
bias in research findings. Thousand Oaks, CA: Sage.
Le, H. A. (2003). Correcting for indirect range restriction in meta-analysis: Testing a
new meta-analytic method. Unpublished doctoral dissertation, University of Iowa,
Iowa City, IA.
Oswald, F. L., & McCloy, R. A. (2003). Meta-analysis and the art of the average. In K.
R. Murphy (Ed.), Validity generalization: A critical review (pp. 311-338).
Mahwah, NJ: Erlbaum.
Raju, N. S., Burke, M., Normand, J., & Langlois, G. M. (1991). A new meta-analytic
approach. Journal of Applied Psychology, 76, 432-446.
Raju, N. S., & Brand, P. A. (2003). Determining the significance of correlations corrected
for unreliability and range restriction. Applied Psychological Measurement, 27,
52-71.
Sackett, P. R., & Yang, H. (2000). Correction for range restriction: An expanded
typology. Journal of Applied Psychology, 85, 112-118.
Shadish, W. R., Cook, T. D., & Campbell, D. T. (2002). Experimental and quasi-experimental
designs for generalized causal inference. Boston: Houghton Mifflin.
Stauffer, J. M., & Mendoza, J. L. (2001). The proper sequence for correcting correlation
coefficients for range restriction and reliability. Psychometrika, 66, 63-68.