Criterion-Related Validity

Are Employers on Safe Grounds
Using Validity Generalization (VG) in
Making a Title VII Defense?
2006 SWARM Regional Conference
Little Rock, Arkansas
Biddle Consulting Group, Inc.
193 Blue Ravine, Ste 270
Folsom, CA 95630
1-800-999-0438
www.biddle.com
Copyright © 2006
Contact Information
Dan A. Biddle, Ph.D.
CEO, Biddle Consulting Group, Inc.
193 Blue Ravine, Ste 270
Folsom, CA 95630
1-800-999-0438
www.biddle.com
Email: Dan@Biddle.com
Overview of Biddle Consulting
Group, Inc. (BCG)
• Since 1974
• More than 200 cases in the EEO/AA area (both
plaintiff and defense cases)
• Pioneers in the EEO/AA field
• Administrative Skills Testing (OPAC)
• 911 Dispatcher Testing (CritiCall)
• AAP Software and Services
• EEO Litigation Assistance (expert
consulting and witness services)
Agenda
• Criterion-Related Validity
• Validity Generalization (VG)
• Title VII Requirements for Tests that
Exhibit Adverse Impact
• VG, Title VII, and the Courts
• Recommendations
• Q&A
The Building Blocks for VG:
Criterion-Related Validation
Studies
Criterion-Related Validity
• Demonstrated by empirical data
showing that the selection procedure
is predictive of, or significantly
correlated with, important elements
of work behavior
• Relies on “correlations” between tests
and job criteria
Criterion Validity
[Diagram: Test ↔ Job Performance]
The strength of this relationship
is reported as a “Validity
Coefficient”
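As a rough illustration of how a validity coefficient is obtained, the sketch below computes a Pearson correlation between test scores and a job criterion. The function name and the data are hypothetical; only the first and last pairs (22, 31 and 85, 55) echo values shown elsewhere in this deck, and the remaining pairs are made up for the example.

```python
import math

def validity_coefficient(test_scores, performance):
    """Pearson correlation between test scores and a job criterion."""
    n = len(test_scores)
    if n != len(performance) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 cases")
    mean_x = sum(test_scores) / n
    mean_y = sum(performance) / n
    # Sum of cross-products and sums of squares around the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(test_scores, performance))
    sxx = sum((x - mean_x) ** 2 for x in test_scores)
    syy = sum((y - mean_y) ** 2 for y in performance)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable has no variance")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical sample: six employees' test scores and performance measures
test_scores = [22, 35, 48, 60, 71, 85]
performance = [31, 28, 40, 37, 52, 55]

r = validity_coefficient(test_scores, performance)
print(f"validity coefficient r = {r:.2f}")
```

A real study would use far more than six cases; the small sample here only keeps the arithmetic easy to follow.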
Criterion-Related Study
[Scatterplot: Test Score (x-axis, 0 to 100), the score on a “Test,” plotted against Performance Measure (y-axis, 0 to 70), the score on some “Criteria” (e.g., job performance, days missed work, etc.)]
Criterion-Related Study: Correlation Demo
[Scatterplot: Test Score (x-axis, 0 to 100) vs. Performance Measure (y-axis, 0 to 70), with two example points: Test Score = 22, Performance = 31; and Test Score = 85, Performance = 55]
Interpreting Correlation Coefficients
[Scale: +1.00, +0.50, 0.00, -0.50, -1.00]
The closer to +1.00 or -1.00, the
stronger the relationship between
the variables
The stronger the relationship
between two variables, the better
the ability to predict one if given
the other
Guidelines for Interpreting Validity Coefficients
Validity Coefficient    Interpretation
> .35                   very beneficial
.21 - .35               likely to be useful
.11 - .20               depends on circumstances
< .11                   unlikely to be useful
Source: Testing and Assessment: An Employer's
Guide to Good Practices (U.S. DOL, 1999).
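The guideline bands above can be expressed as a small lookup. This is a minimal sketch: the function name is hypothetical, and two choices are assumptions not stated on the slide, namely that the bands apply to the magnitude of the coefficient and that a coefficient is rounded to two decimals before being placed in a band.

```python
def interpret_validity(r):
    """Return the DOL (1999) guideline label for a validity coefficient."""
    # Assumption: use the magnitude, rounded to the two decimals the bands use
    size = round(abs(r), 2)
    if size > 0.35:
        return "very beneficial"
    if size >= 0.21:
        return "likely to be useful"
    if size >= 0.11:
        return "depends on circumstances"
    return "unlikely to be useful"

for coefficient in (0.41, 0.25, 0.16, 0.05):
    print(coefficient, "->", interpret_validity(coefficient))
```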
CRV and Statistical Power
• Power = the ability of a statistical
study to find “statistical significance”
if it exists
• Power is determined by:
– Sample size (N)
– Effect size (r)
– “1-tail” or “2-tail” tests, and
– Statistical significance level (p)
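The four determinants of power listed above can be combined in a short calculation. The sketch below uses the Fisher z normal approximation, which is one common way to estimate power for a correlation; the slides do not say which method the presenter used, and the function name and defaults are assumptions.

```python
import math
from statistics import NormalDist

def correlation_power(r, n, alpha=0.05, tails=1):
    """Approximate power to detect a true correlation r with n cases."""
    if n <= 3:
        raise ValueError("the Fisher z approximation needs n > 3")
    if tails not in (1, 2):
        raise ValueError("tails must be 1 or 2")
    normal = NormalDist()
    # Expected value of the test statistic under the true correlation
    z_effect = math.atanh(abs(r)) * math.sqrt(n - 3)
    # Critical value for the chosen significance level and number of tails
    z_crit = normal.inv_cdf(1 - alpha / tails)
    # For 2-tail tests the tiny chance of rejecting in the wrong direction is ignored
    return normal.cdf(z_effect - z_crit)

for n in (50, 100, 134, 200):
    print(n, f"{correlation_power(0.25, n):.0%}")
```

Under this approximation, a one-tailed test at the .05 level with 134 cases has about 90% power to detect a true correlation of .25; a two-tailed test, a smaller sample, or a smaller correlation lowers the figure.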
Statistical Power for Criterion-Related Validity Studies
[Line chart: Statistical Power (y-axis, 20% to 100%) by Sample Size (x-axis, 30 to 250), with separate curves for r = .20, r = .25, and r = .30]
Validity Generalization (VG): A
Brief Overview
VG = Meta-Analysis Applied to Test
Validation Research
• VG applies meta-analysis techniques to
combine the results of several
validation studies to form general
theories about relationships between
variables across different situations
• Schmidt & Hunter (1977) opened the
gate to VG techniques in the personnel
testing field
VG Uses and Applications
• VG is typically used to answer
questions about how:
– Specific Tests and/or
– Constructs (traits or abilities)
• Predict across:
– Criteria
– Occupations
– Settings
• Meta-analysis Example: Results for Cognitive Ability for
Police Officer Occupation (Aamodt, 2004)
Criterion                                    K     N        r       ρ
Academy                                      61    14,437   0.41    0.62
Supervisor Ratings                           61    16,231   0.16    0.27
Commendations                                7     2,015    -0.01   -0.02
Activity                                     6     656      0.19    0.33
Absenteeism                                  5     1,402    -0.03   -0.05
Injuries                                     3     1,891    -0.06   -0.08
Discipline Problems                          13    4,850    -0.06   -0.11
Discipline Problems: Fired or Suspended      7     3,019    -0.12   -0.21
Discipline Problems: Complaints/Reprimands   6     1,831    -0.03   0.06
K = number of studies, N = sample size, r = mean correlation, ρ
= mean correlation corrected for range restriction.
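To show what a range restriction correction does to an observed correlation, here is a minimal sketch of one widely used formula, Thorndike's Case II correction for direct range restriction. The slide does not say which correction Aamodt applied or what standard deviation ratio was involved, so the function name and the ratio `u` below are purely illustrative.

```python
import math

def correct_direct_range_restriction(r_restricted, u):
    """Thorndike Case II correction.

    u is the ratio of the unrestricted (applicant) standard deviation of test
    scores to the restricted (incumbent) standard deviation; u = 1 means no
    restriction.
    """
    if u <= 0:
        raise ValueError("u must be positive")
    return (r_restricted * u) / math.sqrt(1 + r_restricted ** 2 * (u ** 2 - 1))

# Hypothetical: observed r = .30 among incumbents, applicant SD twice as large
print(round(correct_direct_range_restriction(0.30, 2.0), 2))
```

The correction leaves the coefficient unchanged when `u` is 1 and raises its magnitude as the restriction becomes more severe, which is why corrected values (ρ) are larger in absolute terms than the observed means (r).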
Study #   Validity Coefficient   Sample Size   Power (1-tail)   p-value   Valid?
1         0.030                  120           87%              0.37      No
2         0.135                  130           89%              0.06      No
3         0.180                  140           91%              0.02      Yes
4         0.290                  150           93%              0.00      Yes
5         0.340                  120           87%              0.00      Yes
6         0.180                  130           89%              0.02      Yes
7         0.150                  140           91%              0.04      Yes
8         0.110                  150           93%              0.09      No
9         0.090                  120           87%              0.16      No
10        0.126                  130           89%              0.08      No
11        0.210                  140           91%              0.01      Yes
12        0.390                  150           93%              0.00      Yes
13        0.198                  120           87%              0.02      Yes
14        0.164                  130           89%              0.03      Yes
15        0.109                  140           91%              0.10      No
16        0.094                  150           93%              0.13      No
17        0.020                  120           87%              0.41      No
18        0.114                  130           89%              0.10      No
19        0.164                  140           91%              0.03      Yes
20        0.070                  150           93%              0.20      No
21        0.010                  120           87%              0.46      No
22        0.010                  130           89%              0.46      No
• 90% power to detect r = .25 using a sample of 134
• 12 studies (over half) showed no validity in
local settings
• 8 studies had low correlations (< .11)
• VG output corrected for unreliability and
range restriction (RR):
– Direct RR: .24
– Indirect RR: .48
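The counts on this slide can be checked from the 22 coefficients and sample sizes it lists. The sketch below assumes a one-tailed test at the .05 level and approximates each p-value with Fisher's z; the slide does not say how its p-values were computed, so treat that as an assumption. The sample-size-weighted mean it prints is the uncorrected ("bare-bones") average, not the corrected VG outputs of .24 and .48.

```python
import math
from statistics import NormalDist

# (validity coefficient, sample size) for the 22 studies listed on this slide
STUDIES = [
    (0.030, 120), (0.135, 130), (0.180, 140), (0.290, 150),
    (0.340, 120), (0.180, 130), (0.150, 140), (0.110, 150),
    (0.090, 120), (0.126, 130), (0.210, 140), (0.390, 150),
    (0.198, 120), (0.164, 130), (0.109, 140), (0.094, 150),
    (0.020, 120), (0.114, 130), (0.164, 140), (0.070, 150),
    (0.010, 120), (0.010, 130),
]

def one_tailed_p(r, n):
    """Approximate one-tailed p-value for a positive correlation (Fisher z)."""
    z = math.atanh(r) * math.sqrt(n - 3)
    return 1 - NormalDist().cdf(z)

total_n = sum(n for _, n in STUDIES)
mean_n = total_n / len(STUDIES)
# Bare-bones meta-analytic estimate: weight each study's r by its sample size
weighted_mean_r = sum(r * n for r, n in STUDIES) / total_n
not_significant = sum(1 for r, n in STUDIES if one_tailed_p(r, n) >= 0.05)
low_correlations = sum(1 for r, _ in STUDIES if r < 0.11)

print(f"mean sample size: {mean_n:.0f}")
print(f"weighted mean r (uncorrected): {weighted_mean_r:.3f}")
print(f"studies not significant at .05 (1-tail): {not_significant}")
print(f"studies with r < .11: {low_correlations}")
```

The mean sample size works out to about 134, and the counts of non-significant and low-correlation studies come out to 12 and 8, in line with the bullets on the slide.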
Factors That Can Influence Validity From “moving” Between Situations
Factors Before/At Testing Situation
• Sample Size
• Base Rate (% of applicants who “show up qualified”)
• Competitive Environment
• Other Selection Procedures Used Before/After the Test
• Test Content
• Test Administration Conditions (proctoring, time limits, etc.)
• Test Administration Modality (e.g., written vs. online)
• Test Use (ranked, banded, cutoffs used)
• Test Reliability (e.g., internal consistency)
• Test Bias (e.g., culturally-loaded content)
Factors Occurring After Testing
• Job Content Comparability
• Job Performance Criteria
• Reliability of Job Performance Criteria
• Level of Supervision/Autonomy
• Level/Quality of Training Provided
• Org./Unit Demands & Constraints
• Job Satisfaction
• Management Styles and Role Clarity
• Reward Structures and Processes
• Organizational Citizenship, Morale, and Commitment of the Workforce
• Organizational Culture, Norms, Beliefs, Values, Expectations Surrounding Loyalty and Conformity
• Organizational Socialization Strategies for New Employees
• Formal and Informal Communication (Style, Levels, and Networks)
• Centralization and Formalization of Decision-Making
• Organization Size
• Physical Environment
Title VII Requirements for
Tests that Exhibit Adverse
Impact
How Can Testing Practices be Challenged?
Title VII Disparate Impact Discrimination Flowchart
[Flowchart]
TEST → Adverse Impact?
  NO → END
  YES → Is the PPT Valid?
    NO → Plaintiff Prevails
    YES → Alternative Employment Practice?
      YES → Plaintiff Prevails
      NO → Defendant Prevails
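The three decision points of the flowchart can be written as a short function. This is only a sketch of the flowchart's logic as shown on the slide, not of the statute's full burden-shifting rules; the function and parameter names are hypothetical, and "PPT" is kept as the slide's own abbreviation.

```python
def disparate_impact_outcome(adverse_impact, ppt_valid, alternative_practice):
    """Follow the flowchart's three questions in order and return its outcome."""
    if not adverse_impact:
        return "END"                    # no adverse impact: the inquiry stops
    if not ppt_valid:
        return "Plaintiff Prevails"     # adverse impact and the PPT is not valid
    if alternative_practice:
        return "Plaintiff Prevails"     # valid, but an alternative practice exists
    return "Defendant Prevails"         # valid and no alternative practice

print(disparate_impact_outcome(True, True, False))
```

The ordering matters: validity is only examined once adverse impact is shown, and the alternative-practice question is only reached once the PPT is found valid.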
Test Validation & Adverse Impact
Civil Rights Act of 1991
Amends Section 703 of the 1964 Civil Rights Act (Title VII)
(k)(1)(A). An unlawful employment practice based on
disparate impact is established under this title only if:
• A(i) a complaining party demonstrates that a respondent uses
a particular employment practice that causes a disparate
impact on the basis of race, color, religion, sex, or national
origin, and the respondent fails to demonstrate that the
challenged practice is job-related for the position in question
and consistent with business necessity; OR,
• A(ii) the complaining party makes the demonstration
described in subparagraph (C) with respect to an alternate
employment practice, and the respondent refuses to adopt
such alternative employment practice.
Uniform Guidelines
Transportability (7B)
[Diagram: Job Duties Performed By Incumbents In Original Validation Study ↔ Job Duties Performed By Incumbents In New Local Situation]
Validity Can be “Transported”
EEOC v. Atlas Paper (1989, 6th Circuit)
• “. . . the expert failed to visit and inspect the Atlas office and
never studied the nature and content of the Atlas clerical and
office jobs involved. The VG theory utilized by Atlas with
respect to this expert testimony under these circumstances is
not appropriate. Linkage or similarity of jobs in dispute in this
case must be shown by such on site investigation to justify
application of such a theory.”
• “The premise of the VG theory . . . is that intelligence tests are
always valid. The first major problem with a VG approach is
that it is radically at odds with Albemarle Paper v. Moody,
Griggs v. Duke Power, relevant case law within this circuit,
and the EEOC Guidelines, all of which require a showing that a
test is actually predictive of performance at a specific job.
The VG approach simply dispenses with that similarity or
manifest relationship requirement . . .” (emphasis added)
(EEOC v. Atlas Paper, 868 F.2d at 1499).
VG, Title VII, and the Courts
• When the courts evaluate criterion-related
validity evidence, four basic elements are
typically inspected:
– Statistical significance
– Practical significance
– Type and relevance of the job criteria
– Evidence to support the specific use of the
test
• VG has a difficult time answering these
questions…
Recommendations for Applying VG in
Personnel Testing Research
• Recommendation #1: Address the evaluation
criteria provided by the Uniform Guidelines,
Joint Standards, and SIOP Principles regarding
the evaluation of the internal quality of the VG
study. This will help ensure that the VG study
itself can be relied upon for drawing
inferences.
• Key Factors:
– Publication Bias
– Corrections Made and Underlying
Assumptions/Justifications
– Similarities of Tests and Criteria
• Recommendation #2: Address the criteria provided by the
Uniform Guidelines, Joint Standards, and SIOP Principles
regarding the similarity between the VG study and the local
situation.
– Helps to ensure that the VG study can be relied upon and
the research is relevant to the local situation (similarities
between tests, jobs, job criteria, etc.).
– The most critical factor evaluated by courts when
considering VG evidence is the similarity between jobs
(see also 7B of the Uniform Guidelines).
– VG evidence is the strongest where there is clear
evidence that the job duties between the target position
and those in the positions in the VG study are highly
similar as shown by a job analysis in both situations.
• Recommendation #3: Only use VG evidence to
supplement other sources of validity evidence
(e.g., content validity or local criterion-related
validation studies) rather than being the sole
source.
– Supplementing a local criterion-related validity
study with evidence from a VG study may be
useful if an employer has evidence that
statistical artifacts (not situational moderators)
suppressed the actual validity of the test in the
local situation (provided that the job
comparability criteria of Section 7B of the UGESP have been
met).
• Recommendation #4: Evaluate the test
fairness evidence from the VG study using the
methods outlined by the Uniform Guidelines,
Joint Standards, and SIOP Principles.
• Recommendation #5: Evaluate and consider
using “alternate employment practices” that
are “substantially equally valid” (as required
by the 1991 Civil Rights Act Section 2000e-2[k][1][A][ii] and Section 3B of the Uniform
Guidelines).
Thank you!