The Validity and Utility of Selection Methods in Personnel Psychology: Practical and Theoretical Implications of 95 Years of Research Findings

Frank Schmidt

Abstract

This presentation is an update of an article Jack Hunter and I published in Psychological Bulletin that summarized research findings on the validity of job selection methods up to 1998. The two co-authors on this update are In-Sue Oh and Jon Shaffer, who are former PhD students of mine. The ensuing 12 years have seen additional research findings that improve the accuracy of the validity estimates presented in the 1998 article. During this time, a new and more accurate procedure for correcting for the downward bias caused by range restriction has become available (Hunter et al., 2006). This more accurate procedure has revealed that the older, less accurate procedure had underestimated the validity of general mental ability (GMA) and specific cognitive aptitudes (e.g., verbal ability, quantitative ability) by 25% or more. Also, the increased availability of primary validity studies has allowed new and expanded meta-analyses of some selection methods, which have refined and changed some of the validity estimates for the prediction of job performance. For some personnel measures, these new data have produced important changes in estimated validity and incremental validity over GMA. For example, an updated meta-analysis shows that job sample or work sample tests are somewhat less valid than had been indicated by the older data. Finally, meta-analytic results are now available for some newer predictors not included in the 1998 article. These include Situational Judgment Tests (SJTs), measures of so-called “Emotional Intelligence”, person-job fit measures, person-organization fit measures, and the personality trait of Emotional Stability. No new meta-analyses of validity for the prediction of performance in training programs have been reported.
But application of the improved method of correcting for range restriction has changed the earlier estimates of validity for performance in training programs. For each of 25 personnel selection procedures used to predict job performance, I will present the mean operational validity as revealed by meta-analyses. I will also present the incremental validity (if any) produced by each procedure over that produced by GMA. I will present this same information for 11 procedures used to predict performance in job training programs. Results show that many procedures that are valid predictors of job performance nevertheless have little or no incremental validity over that of GMA. This reduction in apparent incremental validity results from the relatively large increase in the estimated validity of GMA resulting from use of the more accurate correction for range restriction. At the time of the earlier 1998 article, it was apparent that GMA plays a central role in the determination of both job and training performance. However, the updated findings indicate that the dominance of GMA is even greater than previously believed.

References [Includes recent relevant articles not cited in the Abstract.]

Hunter, J. E., Schmidt, F. L., & Le, H. (2006). Implications of direct and indirect range restriction for meta-analysis methods and findings. Journal of Applied Psychology, 91, 594-612.
Schmidt, F. L. (2010). How to detect and correct the lies that data tell. Perspectives on Psychological Science, 5, 233-242.
Schmidt, F. L. (2011). A theory of sex differences in technical aptitude and some supporting evidence. Perspectives on Psychological Science, 6, 560-573.
Schmidt, F. L. (2012). Cognitive tests used in selection can have content validity as well as criterion validity: A broader research review and implications for practice. International Journal of Selection and Assessment, 20, 1-13.
Schmidt, F. L. (in press).
A general theoretical integrative model of individual differences in interests, abilities, personality traits, and academic and occupational performance. Perspectives on Psychological Science.
Schmidt, F. L., & Hunter, J. E. (1998). The validity and utility of selection methods in personnel psychology: Practical and theoretical implications of 85 years of research findings. Psychological Bulletin, 124, 262-274.
Schmidt, F. L., Le, H., & Oh, I-S. (2013). Are true scores and constructs the same? A critical examination of their substitutability and the implications for research results. International Journal of Selection and Assessment, 21, 341-354.
Schmidt, F. L., & Oh, I-S. (2013). Methods for second order meta-analysis and illustrative applications. Organizational Behavior and Human Decision Processes, 121, 204-218.
Schmidt, F. L., Oh, I-S., & Le, H. (2006). Increasing the accuracy of corrections for range restriction: Implications for selection procedure validities and other research results. Personnel Psychology, 59, 281-305.
Schmidt, F. L., Shaffer, J. A., & Oh, I-S. (2008). Increased accuracy for range restriction corrections: Implications for the role of personality and general mental ability in job and training performance. Personnel Psychology, 61, 827-868.
Oh, I-S., Postlethwaite, B. E., & Schmidt, F. L. (2013). Re-thinking the validity of unstructured interviews: Implications of recent developments in meta-analysis. In D. J. Svyantek & K. Mahoney (Eds.), Received wisdom, kernels of truth, and boundary conditions in organizational studies. Charlotte, NC: Information Age Publishing. Pp. 297-329.

Table 1.
Selection methods for job performance

                                                Operational  Multiple  Gain in   % gain in  Standardized regression weights
Selection procedures/predictors       Footnote  validity (r) R         validity  validity   GMA     Supplement
GMA tests                             a         .65
Integrity tests                       b         .46          .78       .130      20%        .63     .43
Employment interviews (structured)    c         .58          .76       .117      18%        .52     .43
Employment interviews (unstructured)  d         .60          .75       .099      15%        .48     .41
Conscientiousness                     e         .22          .70       .053      8%         .67     .27
Reference checks                      f         .26          .70       .050      8%         .65     .26
Biographical data measures            g         .35          .68       .036      6%         .91     -.34
Job experience                        h         .13          .67       .023      4%         .66     .17
Person-job fit measures               i         .18          .67       .020      3%         .64     .16
SJT (knowledge)                       j         .26          .67       .018      3%         .76     -.19
Assessment centers                    k         .37          .66       .014      2%         .78     -.19
Peer ratings                          l         .49          .66       .013      2%         .55     .16
T & E point method                    m         .11          .66       .009      1%         .65     .11
Years of education                    n         .10          .66       .008      1%         .65     .10
Interests                             o         .10          .66       .008      1%         .65     .10
Emotional Intelligence (ability)      p         .24          .65       .007      1%         .70     -.11
Emotional Intelligence (mixed)        q         .24          .65       .005      1%         .63     .09
GPA                                   r         .34          .65       .004      1%         .71     -.10
Person-organization fit measures      s         .13          .65       .004      1%         .64     .07
Work sample tests                     t         .33          .65       .003      0%         .69     -.07
SJT (behavioral tendency)             u         .26          .65       .000      0%         .64     .03
Emotional Stability                   v         .12          .65       .000      0%         .64     .02
Job tryout procedure                  w         .44          .65       .000      0%         .63     .02
Behavioral consistency method         x         .45          .65       .000      0%         .64     .02
Job knowledge                         y         .48          .65       .000      0%         .65     -.01

Note. SJT = situational judgment tests; T & E = training and experience. Selection procedures whose operational validity is equal to or greater than .10 are listed in the order of gain in operational validity. Unless otherwise noted, all operational validity estimates are corrected for measurement error in the criterion measure and indirect range restriction (IRR) on the predictor measure to estimate operational validity for applicant populations.
The correlations between GMA and supplementary predictors (used to compute multiple Rs, gain in validity, and standardized regression weights) are corrected for IRR on GMA but not for measurement error in either measure; these correlations indicate observed correlations between predictors in applicant populations (unrestricted observed correlation).
a From Schmidt, Shaffer, and Oh (2008, Table 3). Individual meta-analytic estimates are reported in Table 1 on p. 838. The average of these estimates across eight meta-analytic estimates (.647) is presented in Table 3 on p. 843. We used this average in the current analyses.
b From Ones, Viswesvaran, and Schmidt (1993, Table 8). This operational validity is from predictive studies conducted on job applicants as in Schmidt and Hunter (1998); the same source was used in Schmidt and Hunter (1998), but the operational validity reported in this table was corrected for IRR. The unrestricted observed correlation with GMA is .046 (Ones, 1993, Table 3).
c, d From McDaniel, Whetzel, Schmidt, and Maurer (1994, Table 4). This operational validity is from primary studies where overall job performance was measured using research-purpose measures, and thus represents the most unbiased estimates available. The same source was used in Schmidt and Hunter (1998), but the operational validity estimates used in this table were corrected for IRR with the meta-analytic reliability estimates for the interview measure from Conway, Jako, and Goodman (1995). When the predictor reliability estimate from McDaniel et al. (1994) was used in correcting for IRR, the operational validities for structured and unstructured interviews were .53 and .46 and the gains in validity over GMA tests were .088 and .33, respectively. The unrestricted observed correlations with GMA are .305 and .402 for structured and unstructured interviews, respectively (Salgado & Moscoso, 2002, Tables 4 and 3, respectively).
e, v From Schmidt et al. (2008, Table 1).
The unrestricted observed correlations with GMA are -.069 for Conscientiousness and .159 for Emotional Stability (Judge, Jackson, Shaw, Scott, & Rich, 2007, Table 3). Only true score correlations were reported in Judge et al. (2007). We attenuated the true score correlations for unreliability in both variables using the psychometric information provided by Timothy A. Judge.
f From Hunter and Hunter (1984, Table 9). The same source/validity was used in Schmidt and Hunter (1998). The correlation with GMA is assumed to be zero as in Schmidt and Hunter (1998).
g From Rothstein, Schmidt, Erwin, Owens, and Sparks (1990, Table 5). The same source/validity was used in Schmidt and Hunter (1998). The unrestricted observed correlation with GMA is .761 (Schmidt & Hunter, 1988, p. 283).
h From Sturman (2003, Table 1). The unrestricted observed correlation with GMA is -.069 (Judge et al., 2007, Table 3).
i From Kristof-Brown, Zimmerman, and Johnson (2005, Table 1). The correlation with GMA (in fact, college GPA) is .023 (Cable & Judge, 1996); note that the value is based on one primary study.
j, u From McDaniel, Hartman, Whetzel, and Grubb III (2007, Table 3). The unrestricted observed correlations with GMA are .589 and .364 for SJT (knowledge) and SJT (behavioral tendency), respectively (McDaniel et al., 2007, Table 3).
k From Arthur, Day, McNelly, and Edens (2003, Table 3). The correlation with GMA is .710 (Collins, Schmidt, Sanchez-Ku, Thomas, McDaniel, & Le, 2003).
l From Hunter and Hunter (1984, Table 10). The same source/validity was used in Schmidt and Hunter (1998). Based on Schmidt and Hunter (1998), the unrestricted observed correlation with GMA used here is .594 (.50 without correcting for RR).
m, x From McDaniel, Schmidt, and Hunter (1988). The correlations with GMA are .000 and .682 for T & E point and behavioral consistency methods, respectively (Schmidt & Hunter, 1998); note that these are assumed values.
n From Hunter and Hunter (1984, Table 9).
The same source/validity was used in Schmidt and Hunter (1998). The correlation with GMA is zero (Schmidt & Hunter, 1998); note that this is an assumed value.
o From Hunter and Hunter (1984, Table 9). The same source/validity was used in Schmidt and Hunter (1998). The correlation with GMA is zero (Schmidt & Hunter, 1998); note that this is an assumed value.
p, q From Van Rooy and Viswesvaran (2004, Table 1 for ability-based measures and Table 2 for mixed traits-based measures). The unrestricted observed correlations with GMA are .497 and .245 for ability-based measures and mixed traits-based measures, respectively (Van Rooy, Viswesvaran, & Pluta, 2005, Tables 3 and 4, respectively).
r From Roth, BeVier, Switzer, and Schippmann (1996). The operational validity estimates for GPA (combination of college, graduate, and PhD/MD GPAs) and college GPA are the same. The unrestricted observed correlation with GMA is .619 (Robbins, Lauver, Le, Davis, Langley, & Carlstrom, 2004, Table 5).
s From Arthur, Bell, Villado, and Doverspike (2006, Table 1). The correlation with GMA (in fact, GPA) is .092 (Cable & Judge, 1996; 1997); we performed a small meta-analysis of these two articles in order to derive the estimate used in this study.
t From Roth, Bobko, and McFarland (2005, Table 1). The unrestricted observed correlation with GMA is .585 (Roth et al., 2005, Table 4).
w From Hunter and Hunter (1984, Table 9). The same source/validity was used in Schmidt and Hunter (1998). Based on Schmidt and Hunter (1998), we used the correlation of .663 (.38 without correcting for RR) for this analysis.
x From Hunter and Hunter (1984, Table 9). The same source/validity was used in Schmidt and Hunter (1998). The correlation with GMA is zero (Schmidt & Hunter, 1998); note that this is an assumed value.
y From Hunter and Hunter (1984, Table 11). The same source/validity was used in Schmidt and Hunter (1998).
Based on Schmidt and Hunter (1998), we used the correlation of .747 (.48 without correcting for RR) for this analysis.

Table 2. Selection methods for training performance

                                                Operational  Multiple  Gain in   % gain in  Standardized regression weights
Selection procedures/predictors       Footnote  validity (r) R         validity  validity   GMA     Supplement
GMA tests                             a         .67
Integrity tests                       b         .43          .78       .109      16%        .65     .40
Biographical data measures            c         .30          .74       .073      11%        1.04    -.50
Conscientiousness                     d         .25          .73       .061      9%         .69     .29
Employment interviews                 e         .48          .72       .051      8%         .57     .28
Reference checks                      f         .23          .71       .038      6%         .67     .23
Years of education                    g         .20          .70       .029      4%         .67     .20
Interests                             h         .18          .69       .024      4%         .67     .18
Peer ratings                          i         .36          .67       .002      0%         .70     -.06
Emotional Stability                   j         .14          .67       .001      0%         .66     .03
Job experience (years)                k         .01          .67       .000      0%         .67     .01

Note. Operational validity estimates in parentheses are those reported in Schmidt and Hunter (1998, Table 2). Selection procedures whose operational validity is equal to or greater than .10 are listed in the order of gain in operational validity. Unless otherwise noted, all operational validity estimates are corrected for measurement error in the criterion measure and indirect range restriction (IRR) on the predictor measure to estimate operational validity for applicant populations. The correlations between GMA and supplementary predictors (used to compute multiple Rs, gain in validity, and standardized regression weights) are corrected for IRR on GMA but not for measurement error in either measure; these correlations indicate observed correlations between predictors in applicant populations (unrestricted observed correlation). Details on these correlations are reported in the footnote for Table 1; the same correlations were used in Tables 1 and 2.
a From Schmidt, Shaffer, and Oh (2008, Table 3). Individual meta-analytic estimates are reported in Table 2 on p. 840. The average of these estimates across eight meta-analytic estimates (.668) is presented in Table 3 on p. 843.
We used this average in the current analyses.
b From Schmidt, Ones, and Viswesvaran (1994). The same source was used in Schmidt and Hunter (1998), but the operational validity reported in this table was corrected for IRR.
c From Hunter and Hunter (1984, Table 8). The same source/validity was used in Schmidt and Hunter (1998).
d, j From Schmidt et al. (2008, Table 2).
e From McDaniel et al. (1994, Table 5). The validities for structured and unstructured interviews are very similar (the difference is .03), so we averaged them as in Schmidt and Hunter (1998, Table 2). The same source was used in Schmidt and Hunter (1998), but the operational validity reported in this table was corrected for IRR with the meta-analytic reliability estimates for the interview measure from Conway et al. (1995). When the predictor reliability estimate from McDaniel et al. (1994) was used in correcting for IRR, the operational validity for structured and unstructured interviews was .43 and the gain in validity over GMA tests was .032. The unrestricted observed correlation between employment interviews and GMA was assumed to be the average (.354) of the correlations of GMA with structured and unstructured interviews (Salgado & Moscoso, 2002).
f, g, i From Hunter and Hunter (1984, Table 8). The same source/validity was used in Schmidt and Hunter (1998).
h From Hunter and Hunter (1984, Table 6). The same source/validity was used in Schmidt and Hunter (1998).
k From Hunter and Hunter (1984, Table 6). The same source/validity was used in Schmidt and Hunter (1998).

References

Arthur, W. Jr., Bell, S. T., Villado, A. J., & Doverspike, D. (2006). The use of person-organization fit in employment decision making: An assessment of its criterion-related validity. Journal of Applied Psychology, 91, 786-801.
Arthur, W. Jr., Day, E. A., McNelly, T. L., & Edens, P. S. (2003). Meta-analysis of the criterion-related validity of assessment center dimensions.
Personnel Psychology, 56, 125-154.
Cable, D. M., & Judge, T. A. (1996). Person-organization fit, job choice decisions, and organizational entry. Organizational Behavior and Human Decision Processes, 67, 294-311.
Cable, D. M., & Judge, T. A. (1997). Interviewers' perceptions of person-organization fit and organizational selection decisions. Journal of Applied Psychology, 82, 546-561.
Collins, J. M., Schmidt, F. L., Sanchez-Ku, M., Thomas, L., McDaniel, M. A., & Le, H. (2003). Can basic individual differences shed light on the construct meaning of assessment center evaluations? International Journal of Selection and Assessment, 11, 17-29.
Conway, J. M., Jako, R. A., & Goodman, D. F. (1995). A meta-analysis of interrater and internal consistency reliability of selection interviews. Journal of Applied Psychology, 80, 565-579.
Hunter, J. E., & Hunter, R. F. (1984). Validity and utility of alternative predictors of job performance. Psychological Bulletin, 96, 72-98.
Judge, T. A., Jackson, C., Shaw, J. C., Scott, B. A., & Rich, B. L. (2007). Self-efficacy and work-related performance: The integral role of individual differences. Journal of Applied Psychology, 92, 107-127.
Kristof-Brown, A. L., Zimmerman, R. D., & Johnson, E. C. (2005). Consequences of individuals' fit at work: A meta-analysis of person-job, person-organization, person-group, and person-supervisor fit. Personnel Psychology, 58, 281-342.
McDaniel, M. A., Whetzel, D. L., Schmidt, F. L., & Maurer, S. D. (1994). The validity of employment interviews: A comprehensive review and meta-analysis. Journal of Applied Psychology, 79, 599-616.
McDaniel, M. A., Hartman, N. S., Whetzel, D. L., & Grubb, W. L., III (2007). Situational judgment tests, response instructions and validity: A meta-analysis. Personnel Psychology, 60, 63-91.
McDaniel, M. A., Schmidt, F. L., & Hunter, J. E. (1988). A meta-analysis of the validity of methods for rating training and experience in personnel selection. Personnel Psychology, 41, 283-314.
Ones, D.
S. (1993). The construct validity of integrity tests. Unpublished doctoral dissertation, University of Iowa, Iowa City.
Ones, D. S., Viswesvaran, C., & Schmidt, F. L. (1993). Comprehensive meta-analysis of integrity test validities: Findings and implications for personnel selection and theories of job performance. Journal of Applied Psychology Monograph, 78, 679-703.
Robbins, S. B., Lauver, K., Le, H., Davis, D., Langley, R., & Carlstrom, A. (2004). Do psychosocial and study skill factors predict college outcomes? A meta-analysis. Psychological Bulletin, 130, 261-288.
Roth, P. L., BeVier, C. A., Switzer, F. S., & Schippmann, J. (1996). Meta-analyzing the relationship between grades and job performance. Journal of Applied Psychology, 81, 548-556.
Roth, P. L., Bobko, P., & McFarland, L. A. (2005). A meta-analysis of work sample test validity: Updating and integrating some classic literature. Personnel Psychology, 58, 1009-1037.
Rothstein, H. R., Schmidt, F. L., Erwin, F. W., Owens, W. A., & Sparks, C. P. (1990). Biographical data in employment selection: Can validities be made generalizable? Journal of Applied Psychology, 75, 175-184.
Salgado, J. F., & Moscoso, S. (2002). Comprehensive meta-analysis of the construct validity of the selection interview. European Journal of Work and Organizational Psychology, 11, 299-324.
Schmidt, F. L., & Hunter, J. E. (1998). The validity and utility of selection methods in personnel psychology: Practical and theoretical implications of 85 years of research findings. Psychological Bulletin, 124, 262-274.
Schmidt, F. L., Ones, D. S., & Viswesvaran, C. (1994). The personality characteristic of integrity predicts job training success. Presented at the 6th Annual Convention of the American Psychological Society, Washington, DC.
Schmidt, F. L., Shaffer, J. A., & Oh, I.-S. (2008).
Increased accuracy for range restriction corrections: Implications for the role of personality and general mental ability in job and training performance. Personnel Psychology, 61, 827-868.
Sturman, M. C. (2003). Searching for the inverted U-shaped relationship between time and performance: Meta-analyses of the experience/performance, tenure/performance, and age/performance relationships. Journal of Management, 29, 609-640.
Van Rooy, D. L., & Viswesvaran, C. (2004). Emotional intelligence: A meta-analytic investigation of predictive validity and nomological net. Journal of Vocational Behavior, 65, 71-95.
Van Rooy, D. L., Viswesvaran, C., & Pluta, P. (2005). An evaluation of construct validity: What is this thing called emotional intelligence? Human Performance, 18, 445-462.
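
Worked illustration. The table notes state that the multiple Rs, gains in validity, and standardized regression weights in Tables 1 and 2 were computed from the two operational validities and the unrestricted observed correlation between GMA and the supplementary predictor, but they do not spell out the formulas. The sketch below assumes the standard two-predictor regression formulas; the function name and argument names are hypothetical and introduced only for illustration. The inputs are the rounded values given in the tables and footnotes (.647 and .668 for GMA, from footnote a of each table), so results should land close to the tabled rows, with small differences expected from rounding.

```python
import math


def two_predictor_regression(r_gma, r_sup, r_between):
    """Combine GMA with one supplementary predictor (hypothetical helper).

    r_gma     -- operational validity of GMA
    r_sup     -- operational validity of the supplementary predictor
    r_between -- unrestricted observed correlation between the two predictors

    Returns (multiple_r, gain, beta_gma, beta_sup), assuming the standard
    two-predictor standardized regression formulas.
    """
    denom = 1.0 - r_between ** 2
    # Standardized regression weights for each predictor.
    beta_gma = (r_gma - r_sup * r_between) / denom
    beta_sup = (r_sup - r_gma * r_between) / denom
    # Squared multiple correlation is the weighted sum of the validities.
    multiple_r = math.sqrt(beta_gma * r_gma + beta_sup * r_sup)
    # Gain in validity is the increase over GMA used alone.
    gain = multiple_r - r_gma
    return multiple_r, gain, beta_gma, beta_sup


# Table 1, Conscientiousness: validity .22, correlation with GMA -.069.
job = two_predictor_regression(0.647, 0.22, -0.069)

# Table 2, biographical data: validity .30, correlation with GMA .761.
training = two_predictor_regression(0.668, 0.30, 0.761)

for label, (big_r, gain, b_gma, b_sup) in [
    ("Conscientiousness (job)", job),
    ("Biographical data (training)", training),
]:
    print(f"{label}: R={big_r:.2f} gain={gain:.3f} "
          f"GMA weight={b_gma:.2f} supplement weight={b_sup:.2f}")
```

The biographical data case shows why a predictor with a positive validity can carry a negative regression weight: its correlation with GMA (.761) is high enough that, once GMA is in the equation, the supplement mainly corrects for overlap rather than adding independent prediction.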