Correcting for reliability 1

Running head: CORRECTIONS IN META-ANALYSIS

Correcting for Reliability and Range-Restriction in Meta-Analysis

Frederick L. Oswald and Patrick D. Converse
Michigan State University

Frederick L. Oswald and Patrick D. Converse, Department of Psychology, Michigan State University. Symposium presented at the 20th Annual Conference of the Society for Industrial and Organizational Psychology, Los Angeles, CA. Please direct all correspondence to Fred Oswald, Psychology Building, Michigan State University, East Lansing, MI 48824-1116; e-mail: foswald@msu.edu

For more than 25 years, meta-analysis has been used as a quantitative tool in organizational research for summarizing empirical findings across studies. It complements the narrative review by showing whether validities in a given research domain are consistent across studies (a lack of situational specificity) or are at least non-zero (validity generalization). Part of the process of reaching such conclusions is correcting observed effect sizes for statistical artifacts that systematically attenuate observed predictor-criterion correlations (see Hunter & Schmidt, 2004). Measurement unreliability is pervasive in psychological measures, so given a reasonable estimate of true-score variance for a measure, that value can be substituted for the observed variance, ideally providing a more accurate estimate of the effect in the population of interest. Range restriction is another artifact that attenuates observed correlations in personnel selection settings, so that an estimate of the effect in the unrestricted population will be larger than the estimate in the restricted sample.
Range restriction may be direct, where individuals in the restricted sample were selected top-down on the predictor measure, or it may be incidental, where selection occurs on a third variable that is correlated with the predictor and/or the criterion. A substantive understanding of the nature of error in one’s measures, and of the selection process that led to the observed effect size, helps one make appropriate statistical corrections. Of course, these corrections can be informative when estimating population correlations in one’s own study; no study needs to wait for a meta-analysis to make such corrections.

Several articles have outlined the computational details for different methods of statistically correcting correlations and computing their standard errors (Bobko & Riecke, 1980; Hunter, Schmidt, & Le, 2004; Raju & Brand, 2003; Sackett & Yang, 2000). The present paper outlines still another method; its rationale is brief but motivates a compelling alternative to previous approaches. Several correction approaches are then applied to examples that vary measurement reliability, direct range restriction, and criterion-related validity across values typical of psychological research.

Rationale

The proposed method for correcting correlations takes into account the fact that personnel selection studies usually have criterion data only for incumbents; they do not have criterion data for applicants who were not hired. Additionally, because incumbents were selected on a predictor that is often correlated with the criterion, the criterion reliability in the incumbent sample typically underestimates the criterion reliability in the applicant population because of this incidental range restriction. In other words, selection researchers often have the predictor reliability in the applicant sample but have only the restricted criterion reliability in the incumbent sample.
An appropriate simultaneous correction for (a) direct range restriction on the predictor and (b) measurement unreliability requires an estimate of the unrestricted reliability coefficient for the criterion (Stauffer & Mendoza, 2001). Stauffer and Mendoza (2001) address a related problem in terms of predictor reliability, when only the restricted reliability for the predictor is known. More importantly, they make the general argument that simultaneous correction requires a range-restriction correction based on the observed (restricted) correlation, together with the unrestricted reliability coefficients. The following method explains how to estimate the unrestricted criterion reliability and then applies the simultaneous correction formula.

Method

An estimate of the unrestricted criterion reliability coefficient can be obtained in the following steps. First, Gulliksen (1950, Eq. 22, p. 140) provides an equation relating criterion reliability to validity under direct range restriction on the predictor. He shows that, under the Classical Test Theory assumptions of (a) equal standard errors of measurement and (b) equal standard errors of the estimate regardless of the amount of range restriction on an applicant sample, the following ratio should equal a constant $C$:

$$C = \frac{1 - r_{yy}}{1 - r_{xy}^2} \tag{1}$$

where $r_{yy}$ and $r_{xy}$ are the range-restricted criterion reliability and validity coefficient, respectively. This constant must also hold when there is no range restriction, such that

$$C = \frac{1 - R_{yy}}{1 - R_{xy}^2} \tag{2}$$

where $R_{yy}$ and $R_{xy}$ are the unrestricted criterion reliability and validity coefficient, respectively. Setting Equations 1 and 2 equal to one another and rearranging, the formula for the unrestricted criterion reliability becomes

$$R_{yy} = 1 - \frac{(1 - r_{yy})(1 - R_{xy}^2)}{1 - r_{xy}^2}. \tag{3}$$

This equation has two known values, $r_{yy}$ and $r_{xy}$, and two unknown values, $R_{yy}$ and $R_{xy}$.
$R_{xy}$ can be estimated, however, using the traditional formula for direct range restriction on the predictor $x$ (Gulliksen, 1950, Eq. 18, p. 137):

$$R_{xy} = \frac{k\,r_{xy}}{\sqrt{k^2 r_{xy}^2 - r_{xy}^2 + 1}} \tag{4}$$

where all terms are defined as before, and $k$ is the ratio of the unrestricted to the restricted standard deviation of $x$. Before substituting this result into Equation 3, recall the formula for the correlation corrected for measurement unreliability:

$$R_{xy}^{*} = \frac{R_{xy}}{\sqrt{R_{xx} R_{yy}}} \tag{5}$$

where $R_{xy}$ is the validity coefficient corrected for range restriction, $R_{xx}$ is the estimate of predictor reliability, and $R_{yy}$ is the estimate of criterion reliability corrected for incidental range restriction. Again, Stauffer and Mendoza (2001) note that range restriction corrections are based on the nature of the range restriction, not on the type of reliability coefficient available. Therefore, $R_{xy}$ corrects the observed validity coefficient, $r_{xy}$, for range restriction, where the observed validity coefficient has not been corrected for measurement unreliability. Accordingly, Equation 4 substitutes for $R_{xy}$ in the numerator of Equation 5. For the denominator of Equation 5, $R_{xx}$ is usually available from applicant sample data, and $R_{yy}$ has been estimated in Equation 3, which together yield the following:

$$R_{xy}^{*} = \frac{\dfrac{k\,r_{xy}}{\sqrt{k^2 r_{xy}^2 - r_{xy}^2 + 1}}}{\sqrt{R_{xx}\left[1 - \dfrac{(1 - r_{yy})(1 - R_{xy}^2)}{1 - r_{xy}^2}\right]}} \tag{6}$$

Equation 6, in turn, requires substituting Equation 4 for $R_{xy}$ in the denominator. With this substitution and a fair amount of algebraic rearrangement, Equation 6 becomes

$$R_{xy}^{*} = \frac{k\,r_{xy}}{\sqrt{R_{xx}\left(k^2 r_{xy}^2 - r_{xy}^2 + r_{yy}\right)}} \tag{7}$$

This equation allows one to estimate $R_{xy}^{*}$, the correlation corrected for range restriction and measurement error variance, from the available estimates: (a) the unrestricted predictor reliability, $R_{xx}$; (b) the incidentally range-restricted criterion reliability, $r_{yy}$; and (c) the restricted validity coefficient, $r_{xy}$, together with the standard deviation ratio $k$.
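As a hedged illustration of the algebra above, the following Python sketch applies Equations 4, 3, and 5 in sequence and compares the result with the closed form in Equation 7. The function names and the input values (r_xy = .30, k = 1.5, R_xx = .80, r_yy = .60) are hypothetical choices made for illustration; they are not taken from the paper.

```python
import math


def unrestricted_validity(r_xy: float, k: float) -> float:
    """Equation 4: correct r_xy for direct range restriction on x."""
    return k * r_xy / math.sqrt(k**2 * r_xy**2 - r_xy**2 + 1)


def unrestricted_criterion_reliability(r_yy: float, r_xy: float, R_xy: float) -> float:
    """Equation 3: estimate the unrestricted R_yy from the restricted r_yy."""
    return 1 - (1 - r_yy) * (1 - R_xy**2) / (1 - r_xy**2)


def stepwise_correction(r_xy: float, k: float, R_xx: float, r_yy: float) -> float:
    """Equations 4, 3, and 5 applied one after another."""
    R_xy = unrestricted_validity(r_xy, k)
    R_yy = unrestricted_criterion_reliability(r_yy, r_xy, R_xy)
    return R_xy / math.sqrt(R_xx * R_yy)


def closed_form_correction(r_xy: float, k: float, R_xx: float, r_yy: float) -> float:
    """Equation 7: the same correction written as a single expression."""
    return k * r_xy / math.sqrt(R_xx * (k**2 * r_xy**2 - r_xy**2 + r_yy))


# Hypothetical inputs, chosen only for illustration.
r_xy, k, R_xx, r_yy = 0.30, 1.5, 0.80, 0.60
step = stepwise_correction(r_xy, k, R_xx, r_yy)
closed = closed_form_correction(r_xy, k, R_xx, r_yy)
print(round(step, 6), round(closed, 6))
```

The two printed values should agree up to floating-point error, which is one way to check the rearrangement from Equation 6 to Equation 7 numerically.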
Empirical Comparisons

Table 2 empirically compares methods applied to examples that combine different levels of the observed validity ($r_{xy}$ = .1, .2, .3), selection ratio (SR = 20%, 50%), observed (unrestricted) applicant predictor reliability ($R_{xx}$ = .7, .9), and observed (incidentally restricted) incumbent criterion reliability ($r_{yy}$ = .5, .7). All factors are completely crossed, yielding 24 combinations.

Two correction methods were compared (see Table 1 for a summary). The Proposed Method reflects the approach just described: the direct range restriction formula is first applied to the observed correlation, and the result is then corrected for unreliability using the unrestricted predictor reliability and the unrestricted criterion reliability, the latter of which is estimated from the incidentally restricted criterion reliability. The Rule-of-Thumb Method first corrects the range-restricted observed validity using the range-restricted criterion reliability, then corrects that value for range restriction and for the unrestricted predictor reliability (e.g., see Raju, Burke, Normand, & Langlois, 1991).
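The comparison described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' code. In particular, the paper does not say how the standard deviation ratio k was obtained from the selection ratio; the hypothetical helper `sd_ratio_from_selection_ratio` assumes strict top-down selection on a standard normal predictor and uses the variance of the resulting truncated normal distribution. All function names are illustrative.

```python
import math
from itertools import product
from statistics import NormalDist


def sd_ratio_from_selection_ratio(sr: float) -> float:
    """k = unrestricted SD / restricted SD.

    ASSUMPTION: the top `sr` proportion is selected on a standard
    normal predictor (truncated-normal variance formula).
    """
    std_normal = NormalDist()
    cut = std_normal.inv_cdf(1 - sr)        # cut score in z units
    mills = std_normal.pdf(cut) / sr        # mean of the selected group
    restricted_var = 1 + cut * mills - mills**2
    return 1 / math.sqrt(restricted_var)


def correct_range_restriction(r: float, k: float) -> float:
    """Direct range-restriction correction (Equation 4)."""
    return k * r / math.sqrt(k**2 * r**2 - r**2 + 1)


def proposed_method(r_xy: float, k: float, R_xx: float, r_yy: float) -> float:
    """Closed-form simultaneous correction (Equation 7)."""
    return k * r_xy / math.sqrt(R_xx * (k**2 * r_xy**2 - r_xy**2 + r_yy))


def rule_of_thumb_method(r_xy: float, k: float, R_xx: float, r_yy: float) -> float:
    """Disattenuate by the restricted r_yy, correct for range restriction,
    then disattenuate by the unrestricted R_xx."""
    r_c = r_xy / math.sqrt(r_yy)
    return correct_range_restriction(r_c, k) / math.sqrt(R_xx)


# The 24 fully crossed conditions: r_xy x SR x R_xx x r_yy.
conditions = list(product([0.1, 0.2, 0.3], [0.2, 0.5], [0.7, 0.9], [0.5, 0.7]))
results = []
for r_xy, sr, R_xx, r_yy in conditions:
    k = sd_ratio_from_selection_ratio(sr)
    results.append((proposed_method(r_xy, k, R_xx, r_yy),
                    rule_of_thumb_method(r_xy, k, R_xx, r_yy)))

max_gap = max(abs(p - t) for p, t in results)
print(f"{len(results)} conditions, largest difference = {max_gap:.2e}")
```

Under the normality assumption, the first condition ($r_{xy}$ = .1, SR = .2, $R_{xx}$ = .7, $r_{yy}$ = .5) comes out near .349 for both methods. A different rule for deriving k from the selection ratio would change the corrected values but not the agreement between the two methods.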
Table 1
Summary of Two Methods for Simultaneous Corrections to the Observed Correlation

                                         Method 1: Proposed Method       Method 2: Rule-of-Thumb Method
Range restriction is based on…           Observed correlation            Corrected correlation (corrected by
                                                                         the restricted criterion reliability)
Predictor reliability coefficient is…    Observed and unrestricted       Observed and unrestricted
Criterion reliability coefficient is…    Estimated and unrestricted      Observed and incidentally range
                                         (corrected for incidental       restricted (used in the range
                                         range restriction)              restriction correction)

Results and Discussion

Stauffer and Mendoza (2001) call for basing range-restriction corrections on observed validities and then correcting those values using the unrestricted predictor and criterion reliabilities (the latter being estimated). Given that call, we fully expected to find empirical differences between the Proposed Method, which is based on their recommendation, and the Rule-of-Thumb Method. However, the results from the two methods were exactly the same (see Table 2), which led us to suspect, and then confirm, that the two methods are algebraically equivalent. This means that the spirit of the message by Stauffer and Mendoza (2001) is correct and perhaps has gone largely unrecognized, but in the end we found out (the hard way) that following their call leads to results that are no different from the rule-of-thumb approach that meta-analysts already adopt when correcting correlations individually for statistical artifacts.
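The equivalence can be checked directly. The derivation sketched below applies the rule-of-thumb steps in order and arrives at the closed form in Equation 7; it uses only the direct range restriction formula (Equation 4) and the disattenuation formula. The symbol $r_c$ for the intermediate rule-of-thumb value is introduced here for convenience and does not appear in the paper.

```latex
\begin{align*}
r_c &= \frac{r_{xy}}{\sqrt{r_{yy}}}
  && \text{(disattenuate by the restricted } r_{yy}\text{)} \\[4pt]
\frac{k\,r_c}{\sqrt{k^2 r_c^2 - r_c^2 + 1}}
  &= \frac{k\,r_{xy}/\sqrt{r_{yy}}}
          {\sqrt{\left(k^2 r_{xy}^2 - r_{xy}^2 + r_{yy}\right)/r_{yy}}}
   = \frac{k\,r_{xy}}{\sqrt{k^2 r_{xy}^2 - r_{xy}^2 + r_{yy}}}
  && \text{(correct for direct range restriction)} \\[4pt]
\frac{1}{\sqrt{R_{xx}}} \cdot
\frac{k\,r_{xy}}{\sqrt{k^2 r_{xy}^2 - r_{xy}^2 + r_{yy}}}
  &= \frac{k\,r_{xy}}{\sqrt{R_{xx}\left(k^2 r_{xy}^2 - r_{xy}^2 + r_{yy}\right)}}
   = R_{xy}^{*}
  && \text{(disattenuate by the unrestricted } R_{xx}\text{)}
\end{align*}
```

The factor $\sqrt{r_{yy}}$ cancels between numerator and denominator in the second step, which is why the order of the two corrections does not matter in this case.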
Table 2
A Comparison of the Proposed Method with the Rule-of-Thumb Method

                               Corrected rxy
Case  rxy  SR   Rxx  ryy   Proposed Method  Rule-of-Thumb
   1  0.1  0.2  0.7  0.5   0.3492271        0.3492271
   2  0.1  0.2  0.7  0.7   0.2980042        0.2980042
   3  0.1  0.2  0.9  0.5   0.3079893        0.3079893
   4  0.1  0.2  0.9  0.7   0.2628150        0.2628150
   5  0.1  0.5  0.7  0.5   0.2756176        0.2756176
   6  0.1  0.5  0.7  0.7   0.2340742        0.2340742
   7  0.1  0.5  0.9  0.5   0.2430719        0.2430719
   8  0.1  0.5  0.9  0.7   0.2064340        0.2064340
   9  0.2  0.2  0.7  0.5   0.6375673        0.6375673
  10  0.2  0.2  0.7  0.7   0.5568183        0.5568183
  11  0.2  0.2  0.9  0.5   0.5622815        0.5622815
  12  0.2  0.2  0.9  0.7   0.4910676        0.4910676
  13  0.2  0.5  0.7  0.5   0.5252105        0.5252105
  14  0.2  0.5  0.7  0.7   0.4518904        0.4518904
  15  0.2  0.5  0.9  0.5   0.4631921        0.4631921
  16  0.2  0.5  0.9  0.7   0.3985299        0.3985299
  17  0.3  0.2  0.7  0.5   0.8459926        0.8459926
  18  0.3  0.2  0.7  0.7   0.7586787        0.7586787
  19  0.3  0.2  0.9  0.5   0.7460953        0.7460953
  20  0.3  0.2  0.9  0.7   0.6690917        0.6690917
  21  0.3  0.5  0.7  0.5   0.7334763        0.7334763
  22  0.3  0.5  0.7  0.7   0.6422888        0.6422888
  23  0.3  0.5  0.9  0.5   0.6468653        0.6468653
  24  0.3  0.5  0.9  0.7   0.5664455        0.5664455

Note. rxy = observed validity coefficient; SR = selection ratio; Rxx = unrestricted predictor reliability; ryy = incidentally restricted criterion reliability.

This study’s focus was on correcting individual studies in a meta-analysis, considering the incidental range-restriction effect on criterion reliability. We did not consider cases where predictor reliability was available only for the sample of incumbents who were directly selected on the predictor; we assumed that predictor information would be available for all applicants. Our continuing work will investigate this case, however, because it may in fact lead to results that differ from the rule-of-thumb method (see Stauffer & Mendoza, 2001, p. 65, Eqs. 4 and 5).

It is important to note some boundary conditions in this work confirming the rule-of-thumb approach.
First, we did not investigate corrections to correlations that use artifact distributions, an approach adopted when the set of studies in a meta-analysis does not report complete psychometric information (Hunter & Schmidt, 2004). Second, correcting for incidental range restriction on the predictor, rather than direct range restriction, is an important concern in cases where incumbents were selected explicitly on a variable correlated with the predictor of interest. We did not model that effect in this study because it has been thoroughly investigated in detailed simulations elsewhere (see Le, 2003).

In conclusion, we want to make the general point that researchers, before conducting a meta-analysis, should develop a deep familiarity with the measures and settings used in the research domain of interest. Additionally, researchers should be well aware that meta-analysis is subject to many of the threats to validity that individual studies face (Shadish, Cook, & Campbell, 2002). Given that, statistical corrections should not be applied mechanically (see Oswald & McCloy, 2003), but where they can be applied appropriately, they pay off by increasing the accuracy of the point estimates, a gain that also happens to offset the increase in the associated standard error (Bobko & Riecke, 1980). The present study showed that meta-analysts employing the traditional rule-of-thumb method, which uses an incidentally range-restricted criterion reliability coefficient (and an unrestricted predictor reliability coefficient), achieve the same results as the proposed method, which follows Stauffer and Mendoza’s (2001) recommendation by correcting for this incidental range restriction before applying a simultaneous correction to the observed validity.

References

Bobko, P., & Riecke, A. (1980). Large sample estimates for standard errors of functions of correlation coefficients. Applied Psychological Measurement, 4, 385-398.
Gulliksen, H. (1950). Theory of mental tests.
New York: Wiley.
Hunter, J. E., Schmidt, F. L., & Le, H. (2004). Implications of direct and indirect range restriction for meta-analysis methods and findings. Unpublished manuscript.
Hunter, J. E., & Schmidt, F. L. (2004). Methods of meta-analysis: Correcting error and bias in research findings. Thousand Oaks, CA: Sage.
Le, H. A. (2003). Correcting for indirect range restriction in meta-analysis: Testing a new meta-analytic method. Unpublished doctoral dissertation, University of Iowa, Iowa City, IA.
Oswald, F. L., & McCloy, R. A. (2003). Meta-analysis and the art of the average. In K. R. Murphy (Ed.), Validity generalization: A critical review (pp. 311-338). Mahwah, NJ: Erlbaum.
Raju, N. S., Burke, M., Normand, J., & Langlois, G. M. (1991). A new meta-analytic approach. Journal of Applied Psychology, 76, 432-446.
Raju, N. S., & Brand, P. A. (2003). Determining the significance of correlations corrected for unreliability and range restriction. Applied Psychological Measurement, 27, 52-71.
Sackett, P. R., & Yang, H. (2000). Correction for range restriction: An expanded typology. Journal of Applied Psychology, 85, 112-118.
Shadish, W. R., Cook, T. D., & Campbell, D. T. (2002). Experimental and quasi-experimental designs for generalized causal inference. Boston: Houghton Mifflin.
Stauffer, J. M., & Mendoza, J. L. (2001). The proper sequence for correcting correlation coefficients for range restriction and reliability. Psychometrika, 66, 63-68.