Part 4 Staffing Activities: Selection
Chapter 7: Measurement
Chapter 8: External Selection I
Chapter 9: External Selection II
Chapter 10: Internal Selection

McGraw-Hill/Irwin Copyright © 2012 by The McGraw-Hill Companies, Inc. All rights reserved.

Chapter 7: Measurement

Staffing Organizations Model
• Organization: Mission; Goals and Objectives
• Organization Strategy; HR and Staffing Strategy
• Staffing Policies and Programs
  - Support Activities: Legal compliance; Planning; Job analysis
  - Core Staffing Activities
    Recruitment: External, internal
    Selection: Measurement, external, internal
    Employment: Decision making, final match
• Staffing System and Retention Management

Chapter Outline
• Importance and Use of Measures
• Key Concepts
  - Measurement
  - Scores
  - Correlation Between Scores
• Quality of Measures
  - Reliability of Measures
  - Validity of Measures
  - Validation of Measures in Staffing
  - Validity Generalization
  - Staffing Metrics and Benchmarks
• Collection of Assessment Data
  - Testing Procedures
  - Acquisition of Tests and Test Manuals
  - Professional Standards
• Legal Issues
  - Disparate Impact Statistics
  - Standardization and Validation

Learning Objectives for This Chapter
• Define measurement and understand its use and importance in staffing decisions
• Understand the concept of reliability and review the different ways reliability of measures can be assessed
• Define validity and consider the relationship between reliability and validity
• Compare and contrast the two types of validation studies typically conducted
• Consider how validity generalization affects and informs validation of measures in staffing
• Review the primary ways assessment data can be collected

Discussion Questions for This Chapter
• Imagine and describe a staffing system for a job in which there are no measures used.
• Describe how you might go about determining scores for applicants’ responses to (a) interview questions, (b) letters of recommendation, and (c) questions about previous work experience.
• Give examples of when you would want the following for a written job knowledge test:
  - a low coefficient alpha (e.g., α = .35)
  - a low test–retest reliability
• Assume you gave a general ability test, measuring both verbal and computational skills, to a group of applicants for a specific job. Also assume that because of severe hiring pressures, you hired all of the applicants, regardless of their test scores. How would you investigate the criterion-related validity of the test? How would you go about investigating the content validity of the test?
• What information does a selection decision maker need to collect in making staffing decisions? What are the ways in which this information can be collected?

Key Concepts
• Measurement: the process of assigning numbers to objects to represent quantities of an attribute of the objects
• Scores: the amount of the attribute being assessed
• Correlation between scores: a statistical measure of the relation between the two sets of scores

Importance and Use of Measures
• Measures: methods or techniques for describing and assessing attributes of objects
• Examples
  - Tests of applicant KSAOs
  - Job performance ratings of employees
  - Applicants’ ratings of their preferences for various types of job rewards

Importance and Use of Measures (continued)
• Summary of measurement process
  (a) Choose an attribute of interest
  (b) Develop operational definition of attribute
  (c) Construct a measure of attribute as operationally defined
  (d) Use measure to actually gauge attribute
• Results of measurement process
  - Scores become indicators of attribute
  - Initial attribute and its operational definition are transformed into a numerical expression of attribute

Measurement: Definition
• Process of assigning numbers to objects to represent quantities of an attribute of the objects
  - Attribute/Construct: Knowledge of mechanical principles
  - Objects: Job applicants

Exh. 7.1: Use of Measures in Staffing

Measurement: Standardization
• Involves
  - Controlling influence of extraneous factors on scores generated by a measure
  - Ensuring scores obtained reflect the attribute measured
• Properties of a standardized measure
  - Content is identical for all objects measured
  - Administration of measure is identical for all objects
  - Rules for assigning numbers are clearly specified and agreed on in advance

Measurement: Levels
• Nominal
  - A given attribute is categorized and numbers are assigned to categories
  - No order or level implied among categories
• Ordinal
  - Objects are rank-ordered according to how much of attribute they possess
  - Represents relative differences among objects
• Interval
  - Objects are rank-ordered
  - Differences between adjacent points on measurement scale are equal in terms of attribute
• Ratio
  - Similar to interval scales: equal differences between scale points for attribute being measured
  - Have a logical or absolute zero point

Measurement: Differences in Objective and Subjective Measures
• Objective measures: Rules used to assign numbers to attribute are predetermined, communicated, and applied through a system
• Subjective measures: Scoring system is more elusive, often involving a rater who assigns the numbers
• Research results

Scores
• Definition
  - Measures provide scores to represent amount of attribute being assessed
  - Scores are the numerical indicator of attribute
• Central tendency and variability
  - Exh. 7.2: Central Tendency and Variability: Summary Statistics
• Percentiles
  - Percentage of people scoring below an individual in a distribution of scores
• Standard scores

Discussion questions
• Imagine and describe a staffing system for a job in which there are no measures used.
• Describe how you might go about determining scores for applicants’ responses to (a) interview questions, (b) letters of recommendation, and (c) questions about previous work experience.
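The Scores slide names percentiles and standard scores without showing how they are computed. The following is a minimal sketch, not the textbook's own worked exhibit. The applicant scores are hypothetical, the function names are illustrative, and it assumes the standard score is a z score computed with the population standard deviation (a sample standard deviation would give slightly different values).

```python
from statistics import mean, pstdev


def percentile_rank(scores, score):
    """Percentage of scores in the distribution that fall below `score`."""
    below = sum(1 for s in scores if s < score)
    return 100.0 * below / len(scores)


def standard_score(scores, score):
    """Z score: distance of `score` from the mean, in standard deviation units.

    Assumption: uses the population standard deviation (pstdev).
    """
    return (score - mean(scores)) / pstdev(scores)


# Hypothetical test scores for ten applicants (illustrative data only).
applicant_scores = [10, 12, 14, 14, 15, 16, 18, 19, 20, 22]

if __name__ == "__main__":
    print("mean:", mean(applicant_scores))
    print("standard deviation:", round(pstdev(applicant_scores), 2))
    print("percentile for a score of 20:", percentile_rank(applicant_scores, 20))
    print("z score for a score of 20:", round(standard_score(applicant_scores, 20), 2))
```

In this made-up distribution, an applicant who scores 20 outscores 8 of the 10 applicants and sits a little more than one standard deviation above the mean. Both numbers locate the applicant relative to the group, which a raw score of 20 does not.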
Correlation Between Scores
• Scatter diagrams
  - Used to plot the joint distribution of the two sets of scores
  - Exh. 7.3: Scatter Diagrams and Corresponding Correlations
• Correlation coefficient
  - Value of r summarizes both the strength and the direction of the relationship between two sets of scores
  - Values can range from r = -1.0 to r = 1.0
  - Interpretation: Correlation between two variables does not imply causation between them
  - Exh. 7.4: Calculation of Product-Moment Correlation Coefficient

Exh. 7.3: Scatter Diagrams and Corresponding Correlations

Significance of the Correlation Coefficient
• Practical significance
  - Refers to size of correlation coefficient
  - The greater the degree of common variation between two variables, the more one variable can be used to understand another variable
• Statistical significance
  - Refers to likelihood a correlation exists in a population, based on knowledge of the actual value of r in a sample from that population
  - Significance level is expressed as p < value
  - Interpretation: If p < .05, there are fewer than 5 chances in 100 of concluding there is a relationship in the population when, in fact, there is not

Quality of Measures
• Reliability of measures
• Validity of measures
• Validity of measures in staffing
• Validity generalization

Quality of Measures: Reliability
• Definition: Consistency of measurement of an attribute
  - A measure is reliable to the extent it provides a consistent set of scores to represent an attribute
• Reliability of measurement is of concern
  - Both within a single time period and between time periods
  - For both objective and subjective measures
• Exh. 7.6: Summary of Types of Reliability

Exh. 7.6: Summary of Types of Reliability

Quality of Measures: Reliability
• Measurement error
  - Actual score = true score + error
  - Deficiency error: Occurs when there is failure to measure some aspect of attribute assessed
  - Contamination error: Represents occurrence of unwanted or undesirable influence on the measure and on individuals being measured

Exh. 7.7: Sources of Contamination Error and Suggestions for Control

Quality of Measures: Reliability
• Procedures to calculate reliability estimates
  - Coefficient alpha: Should be at least .80 for a measure to have an acceptable degree of reliability
  - Interrater agreement: Minimum level of interrater agreement is 75% or higher
  - Test–retest reliability: Concerned with stability of measurement; level of r should range from r = .50 to r = .90
  - Intrarater agreement: For short time intervals between measures, a fairly high relationship is expected (r = .80 or 90%)

Quality of Measures: Reliability
• Implications of reliability
  - Standard error of measurement: Since only one score is obtained from an applicant, the critical issue is how accurate the score is as an indicator of an applicant’s true level of knowledge
  - Relationship to validity: Reliability of a measure places an upper limit on the possible validity of a measure
  - A highly reliable measure is not necessarily valid
  - Reliability does not guarantee validity; it only makes it possible

Quality of Measures: Validity
• Definition: Degree to which a measure truly measures the attribute it is intended to measure
• Accuracy of measurement
  - Exh. 7.9: Accuracy of Measurement
• Accuracy of prediction
  - Exh. 7.10: Accuracy of Prediction

Exh. 7.9: Accuracy of Measurement

Discussion questions
• Give examples of when you would want the following for a written job knowledge test:
  - a low coefficient alpha (e.g., α = .35)
  - a low test–retest reliability

Exh. 7.10: Accuracy of Prediction

Validity of Measures in Staffing
• Importance of validity to staffing process
  - Predictors must be accurate representations of KSAOs to be measured
  - Predictors must be accurate in predicting job success
• Validity of predictors explored through validation studies
• Two types of validation studies
  - Criterion-related validation
  - Content validation

Exh. 7.11: Criterion-Related Validation
• Criterion Measures: measures of performance on tasks and task dimensions
• Predictor Measure: it taps into one or more of the KSAOs identified in job analysis
• Predictor–Criterion Scores: must be gathered from a sample of current employees or job applicants
• Predictor–Criterion Relationship: the correlation must be calculated

Exh. 7.12: Concurrent and Predictive Validation Designs

Content Validation
• Content validation involves
  - Demonstrating the questions/problems (predictor scores) are a representative sample of the kinds of situations occurring on the job
  - Criterion measures are not used
  - A judgment is made about the probable correlation between predictors and criterion measures
• Used in two situations
  - When there are too few people to form a sample for criterion-related validation
  - When criterion measures are not available
• Exh. 7.14: Content Validation

Validity Generalization
• Degree to which validity can be extended to other contexts
  - Contexts include different situations, samples of people, and time periods
• Situation-specific validity vs. validity generalization
  - Exh. 7.16: Hypothetical Validity Generalization Example
• Distinction is important because
  - Validity generalization allows greater latitude than situation specificity
  - More convenient and less costly not to have to conduct a separate validation study for every situation

Exh. 7.16: Hypothetical Validity Generalization Example

Discussion questions
• Assume you gave a general ability test, measuring both verbal and computational skills, to a group of applicants for a specific job. Also assume that because of severe hiring pressures, you hired all of the applicants, regardless of their test scores. How would you investigate the criterion-related validity of the test? How would you go about investigating the content validity of the test?
• What information does a selection decision maker need to collect in making staffing decisions? What are the ways in which this information can be collected?

Staffing Metrics and Benchmarks
• Metrics: quantifiable measures that demonstrate the effectiveness (or ineffectiveness) of a particular practice or procedure
• Staffing metrics
  - Job analysis
  - Validation
  - Measurement
• Benchmarking as a means of developing metrics

Collection of Assessment Data
• Testing procedures
  - Paper and pencil measures
  - PC- and Web-based approaches
  - Applicant reactions
• Acquisition of tests and test manuals
  - Paper and pencil measures
  - PC- and Web-based approaches
• Professional standards

Legal Issues
• Disparate impact statistics
  - Applicant flow statistics
  - Applicant stock statistics
• Standardization
  - Lack of consistency in treatment of applicants is a major factor contributing to discrimination
  - Example: Gathering different types of background information from protected vs. non-protected groups
  - Example: Different evaluations of information for protected vs. non-protected groups
• Validation
  - If adverse impact exists, a company must either eliminate it or justify that it exists for job-related reasons (validity evidence)

Ethical Issues
• Issue 1: Do individuals making staffing decisions have an ethical responsibility to know measurement issues? Why or why not?
• Issue 2: Is it unethical for an employer to use a selection measure that has high empirical validity but lacks content validity? Explain.
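
The applicant flow statistics listed under Legal Issues compare selection rates across groups. The sketch below is a rough illustration only: the group labels and counts are hypothetical, and the function names are not from the slides. The 0.80 threshold reflects the commonly cited four-fifths guideline, which the slides do not mention, so it is left as an adjustable parameter.

```python
def selection_rate(hired, applicants):
    """Proportion of applicants in a group who were hired."""
    if applicants <= 0:
        raise ValueError("applicants must be a positive count")
    return hired / applicants


def impact_ratio(group_rate, reference_rate):
    """Selection rate of one group relative to the reference group's rate."""
    if reference_rate <= 0:
        raise ValueError("reference_rate must be positive")
    return group_rate / reference_rate


def flags_possible_impact(ratio, threshold=0.80):
    """True when the ratio falls below the threshold.

    Assumption: 0.80 is the four-fifths guideline, not a figure from the slides.
    """
    return ratio < threshold


# Hypothetical applicant flow counts (illustrative data only).
flow = {
    "group_a": {"applicants": 100, "hired": 50},
    "group_b": {"applicants": 80, "hired": 24},
}

if __name__ == "__main__":
    rate_a = selection_rate(flow["group_a"]["hired"], flow["group_a"]["applicants"])
    rate_b = selection_rate(flow["group_b"]["hired"], flow["group_b"]["applicants"])
    ratio = impact_ratio(rate_b, rate_a)
    print("selection rates:", rate_a, rate_b)
    print("impact ratio:", round(ratio, 2))
    print("below threshold:", flags_possible_impact(ratio))
```

A ratio below the threshold does not settle the question. As the Validation bullet states, the employer would then need either to eliminate the disparity or to justify the measure with validity evidence.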