Chapter 6: Validity

Chapter 6: Validity Concepts
Definition
• Accuracy: validate the interpretation of test performance
Face Validity
• Degree to which a test superficially appears to measure domain
• Math test
Establishing Construct Validity
• Content Validity
• Criterion-related Validity
• Comprehensive evaluation of the theoretical framework for a test
Content-related Validity
• Systematic examination
• Free from irrelevant variable influence
Content-related validity: Process
• Complete an examination of the literature
• Generate an adequate sampling of the “item universe”
• Domain must be proportionately represented in test
Content-related Validity: Procedure
• Domain in consideration must be fully described
• Description of procedures for item appropriateness & representativeness
• Cover subject matter and objectives of testing
Content Validity Ratio (CVR)
• Content validity can be quantified with the content validity ratio: $\mathrm{CVR} = \dfrac{n_e - N/2}{N/2}$
• $n_e$ = number of panelists who agree an item is essential
• $N$ = total number of panelists
CVR Example
• Gonzalez Anxiety Scale has 50 items
• 20 experts rate each item
• Ratings: not essential, somewhat essential, or essential
• What is the CVR if 9 panelists rate item 1 essential?
• Table provided on p. 179
• Should we keep item 1?
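The CVR question above can be worked through with a short sketch. It assumes the standard ratio CVR = (n_e - N/2) / (N/2); the function name `content_validity_ratio` is hypothetical and chosen only for illustration. The minimum acceptable value comes from the table on p. 179, which is not reproduced here.

```python
def content_validity_ratio(n_essential: int, n_panelists: int) -> float:
    """Return CVR = (n_e - N/2) / (N/2) for one item."""
    if n_panelists <= 0:
        raise ValueError("n_panelists must be positive")
    if not 0 <= n_essential <= n_panelists:
        raise ValueError("n_essential must be between 0 and n_panelists")
    half = n_panelists / 2
    return (n_essential - half) / half


# Item 1 of the example: 9 of 20 experts rate it essential.
cvr_item_1 = content_validity_ratio(9, 20)
print(round(cvr_item_1, 2))  # -0.1
```

A negative CVR means fewer than half of the panelists rated the item essential, so item 1 would fall below any positive minimum value and is a candidate for revision or removal.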
Content Validity: Limitations
• Biases
• Cultural relativism
• Level of expertise of the panelists
Criterion-related validity
• Index of relationship between test and criterion
• A criterion should be similar to the test, reliable, and valid
• SAT predicts college performance (GPA)
Two kinds of Criterion Validity
• Concurrent
• Predictive
– Based on temporal (time) estimates
Concurrent Criterion-related Validity
• Test and criterion are measured at roughly the same time
• Impractical to wait for a secondary evaluation
– e.g., a diagnostic measure to generate diagnostic impression
Predictive Criterion-related Validity
• Test and criterion are compared over a period of time
• Used in Decision Theory
– e.g., A job abilities test is used to predict job performance
CRV: Limitations
• Possible problems from "criterion contamination"
• Coefficient affected by range of the sample
• Homogeneous vs. heterogeneous sample
Construct-related validity
• Extent to which a test measures a theoretical construct
• Construct: psychological trait
Construct-related validity: Process
• Theoretical relationships specified
• Empirical relationships examined
• Empirical evidence interpreted
Construct Validity: Techniques
• Convergent validation
• Discriminant validation
• Factor Analysis
• Multitrait-Multimethod Matrix
• Reliability
Convergent Validation
• A test should correlate highly with another test that is theoretically related
– e.g., a math test and numerical reasoning test
Discriminant Validation
• A test ought not to correlate with a theoretically unrelated test
– e.g., a self-esteem test and a comprehension test
Factor Analysis
• Descriptive statistical technique
• Analyzing the factors/dimensions of the test
• Factorial validity
Internal Consistency
• Consider homogeneity of a test
• Subtests (or items) correlate with test total score
• Provides evidence that the test measures a single concept
Predicted Change Over Time
• Examining pretest and posttest scores
• Assessing predicted change after an experimental intervention
– e.g., a depression intervention should improve (change) scores on a depression scale
Predicted Differences Between Distinct Groups
• Analyzing scores of contrasted groups
• Depressed sample scores should differ from the non-depressed sample
Multitrait Multimethod Matrix (MTMM Matrix)
• Campbell & Fiske (1959)
• Correlation of 2 or more traits by 2 or more methods
• Methods: self-report vs. spousal ratings vs. peer observations
• Traits: job satisfaction vs. marital satisfaction vs. self-satisfaction
Reliability Coefficients
• Monotrait monomethod
– Same trait, same method
Validity Coefficients
• Squaring the validity coefficient gives the proportion of variance in the criterion that is accounted for by the test (predictor)
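As a small illustration of the squaring rule, the sketch below converts a validity coefficient into a proportion of variance accounted for. The function name `variance_explained` and the value .50 are hypothetical, picked only to make the arithmetic easy to follow.

```python
def variance_explained(validity_coefficient: float) -> float:
    """Return the squared validity coefficient (proportion of criterion variance)."""
    if not -1.0 <= validity_coefficient <= 1.0:
        raise ValueError("a correlation must lie between -1 and 1")
    return validity_coefficient ** 2


# Hypothetical validity coefficient of .50
print(variance_explained(0.50))  # 0.25
```

So a test that correlates .50 with its criterion accounts for 25% of the variance in criterion scores, leaving 75% unexplained.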
Monotrait Heteromethod
• Same trait, different method
Heterotrait monomethod
• Different trait, same method
Heterotrait heteromethod
• Different trait, different method
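The four kinds of MTMM cells can be sketched as a small classifier. This is only an illustration: each measure is represented as a hypothetical `(trait, method)` tuple, and the function name `classify_mtmm_cell` is an assumption rather than part of any library. The traits and methods reuse the examples listed on the MTMM slide.

```python
def classify_mtmm_cell(measure_a: tuple, measure_b: tuple) -> str:
    """Name the MTMM cell for the correlation between two (trait, method) measures."""
    trait_a, method_a = measure_a
    trait_b, method_b = measure_b
    same_trait = trait_a == trait_b
    same_method = method_a == method_b
    if same_trait and same_method:
        return "monotrait monomethod"      # reliability coefficient
    if same_trait:
        return "monotrait heteromethod"    # same trait, different method
    if same_method:
        return "heterotrait monomethod"    # different trait, same method
    return "heterotrait heteromethod"      # different trait, different method


job_self = ("job satisfaction", "self-report")
job_spouse = ("job satisfaction", "spousal ratings")
marital_self = ("marital satisfaction", "self-report")
marital_spouse = ("marital satisfaction", "spousal ratings")

print(classify_mtmm_cell(job_self, job_self))        # monotrait monomethod
print(classify_mtmm_cell(job_self, job_spouse))      # monotrait heteromethod
print(classify_mtmm_cell(job_self, marital_self))    # heterotrait monomethod
print(classify_mtmm_cell(job_self, marital_spouse))  # heterotrait heteromethod
```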
Validity Coefficient Magnitude
• Nature of the group
• Variability in gender, age, education, race
• Validity coefficient tends to decrease across groups
Sample range
• Homogeneity vs. heterogeneity of the sample
• The wider the range of scores (variability), the higher the correlation
• Comparison of extremely different (contrasted) groups
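The effect of sample range on a correlation can be seen in a small simulation. Everything here is an assumption made for illustration: the scores are randomly generated (not real data), the underlying correlation is set to .60, and the "restricted" sample keeps only the top quarter of test scorers.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation computed from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


rng = random.Random(0)   # fixed seed so the run is repeatable
true_r = 0.6             # assumed population correlation
test_scores = [rng.gauss(0, 1) for _ in range(5000)]
criterion = [true_r * x + math.sqrt(1 - true_r ** 2) * rng.gauss(0, 1)
             for x in test_scores]

# Heterogeneous sample: everyone
r_full = pearson_r(test_scores, criterion)

# Homogeneous sample: only the top 25% of test scorers
cutoff = sorted(test_scores)[int(0.75 * len(test_scores))]
kept = [(x, y) for x, y in zip(test_scores, criterion) if x >= cutoff]
r_restricted = pearson_r([x for x, _ in kept], [y for _, y in kept])

print(f"full range r = {r_full:.2f}, restricted range r = {r_restricted:.2f}")
```

The restricted sample yields a noticeably smaller coefficient even though the underlying test-criterion relationship is unchanged, which is why a validity coefficient computed on a homogeneous group can understate a test's usefulness.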
Test Reliability
• A validity coefficient is limited by the reliability of the test and the reliability of the criterion: $r_{xy} \le \sqrt{r_{xx}\, r_{yy}}$
• An unreliable test is an invalid test
• $r_{xy}$ = validity coefficient
• $r_{xx}$ = test reliability
• $r_{yy}$ = criterion reliability
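The ceiling that reliability places on validity can be sketched numerically, assuming the classical upper bound in which the validity coefficient cannot exceed the square root of the product of the two reliabilities. The function name `max_validity` and the reliability values .81 and .64 are hypothetical.

```python
import math


def max_validity(test_reliability: float, criterion_reliability: float) -> float:
    """Upper bound on r_xy: sqrt(r_xx * r_yy)."""
    for value in (test_reliability, criterion_reliability):
        if not 0.0 <= value <= 1.0:
            raise ValueError("reliabilities must lie between 0 and 1")
    return math.sqrt(test_reliability * criterion_reliability)


# Hypothetical reliabilities: test = .81, criterion = .64
print(round(max_validity(0.81, 0.64), 2))  # 0.72
```

With these values, no validity coefficient above .72 should be expected, however strong the true relationship between the test and the criterion.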
Test-criterion Relationship
• The relationship is assumed to be linear, with equal variances
• Homoscedasticity means equal variances
• A curvilinear relationship or unequal variances violates these assumptions
Test Bias
• Constant or systematic error in a test
• A consideration when looking at cross-cultural issues
• Is the test fair to all groups?
Differential Validity
• Evaluate differences between the validity coefficients using cross-validation
• Analysis could reveal shrinkage
Predictive Validity Coefficient Error
• Margin of error to be expected in an individual's predicted criterion score
• Is there error in test validity?
• Compute the standard error of estimate (SEE)
Standard Error of Estimate (SEE)
• $SEE = s_y \sqrt{1 - r_{xy}^2}$
• $s_y$ = standard deviation of the criterion scores
• $r_{xy}^2$ = square of the validity coefficient
• Example: Y = 70, $s_y$ = 10, $r_{xy}$ = .80
• What is SEE?
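The SEE example can be worked with a short sketch, assuming the standard formula SEE = s_y * sqrt(1 - r_xy^2) and taking Y = 70 to be the predicted criterion score (an assumption about what the slide intends). The function name `standard_error_of_estimate` is hypothetical.

```python
import math


def standard_error_of_estimate(sd_criterion: float, validity: float) -> float:
    """SEE = s_y * sqrt(1 - r_xy^2)."""
    if sd_criterion < 0:
        raise ValueError("standard deviation cannot be negative")
    if not -1.0 <= validity <= 1.0:
        raise ValueError("a correlation must lie between -1 and 1")
    return sd_criterion * math.sqrt(1 - validity ** 2)


see = standard_error_of_estimate(sd_criterion=10, validity=0.80)
print(round(see, 2))  # 6.0

# Band of one SEE around the predicted score of 70
predicted = 70
print(round(predicted - see, 2), round(predicted + see, 2))  # 64.0 76.0
```

If prediction errors are roughly normally distributed, about two thirds of examinees with a predicted score of 70 would be expected to obtain actual criterion scores between 64 and 76.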
Decision Theory: Cronbach & Glaser (1965)
• Criterion-related (predictive) validity
• Expectancy data used in job selection testing
• How well does a test predict job performance?
Possible Outcomes
• 1) Valid acceptance: True positive
• 2) Valid rejection: True negative
• 3) False negative
• 4) False positive
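The four outcomes can be expressed as a small decision rule. This is a sketch under stated assumptions: an applicant is "accepted" when the test score is at or above a cut-off, and `succeeded` records whether that person later performed successfully on the criterion. The function name, the scores, and the cut-off of 60 are all hypothetical.

```python
def classify_outcome(test_score: float, cutoff: float, succeeded: bool) -> str:
    """Label one selection decision by comparing the prediction with the outcome."""
    accepted = test_score >= cutoff
    if accepted and succeeded:
        return "valid acceptance (true positive)"
    if not accepted and not succeeded:
        return "valid rejection (true negative)"
    if accepted and not succeeded:
        return "false positive"   # accepted, but failed on the criterion
    return "false negative"       # rejected, but would have succeeded


# Hypothetical applicants evaluated against a cut-off of 60
print(classify_outcome(72, 60, succeeded=True))   # valid acceptance (true positive)
print(classify_outcome(45, 60, succeeded=False))  # valid rejection (true negative)
print(classify_outcome(50, 60, succeeded=True))   # false negative
print(classify_outcome(65, 60, succeeded=False))  # false positive
```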
Incremental Validity
• Base rate
• Cut-off score
• Incremental validity is the increase in predictive validity, over the base rate, because of a test
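One way to make the idea concrete is to compare the success rate among applicants selected with the test against the base rate of success without it. The sketch below uses made-up scores and outcomes for ten applicants and a hypothetical cut-off of 60; the helper names are assumptions for illustration.

```python
def base_rate(outcomes):
    """Proportion of all applicants who succeed, regardless of test score."""
    return sum(outcomes) / len(outcomes)


def success_rate_among_selected(scores, outcomes, cutoff):
    """Proportion of successes among applicants scoring at or above the cut-off."""
    selected = [ok for score, ok in zip(scores, outcomes) if score >= cutoff]
    if not selected:
        raise ValueError("no applicant meets the cut-off score")
    return sum(selected) / len(selected)


# Made-up data: test scores and whether each applicant later succeeded
scores = [35, 42, 48, 51, 55, 58, 63, 67, 72, 80]
succeeded = [False, False, True, False, False, True, True, False, True, True]

rate_without_test = base_rate(succeeded)
rate_with_test = success_rate_among_selected(scores, succeeded, cutoff=60)
gain = rate_with_test - rate_without_test

print(rate_without_test, rate_with_test, gain)  # 0.5 0.75 0.25
```

In this made-up sample, using the test raises the proportion of successful hires from 50% to 75%, so the gain attributable to the test is 25 percentage points. Raising or lowering the cut-off score changes both the number of applicants selected and the size of that gain.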
Validity Summary
• Examiner is interested in obtaining information about:
– Examinee's knowledge of a particular domain
– Amount of a construct possessed by the examinee in a specified domain
– Examinee's likely performance