NOTES ON THE VALIDITY OF EDUCATIONAL ACHIEVEMENT TESTS

1. An educational achievement test is one designed to measure the extent to which the immediate goals of instruction in a particular area have been attained. Ideally, most immediate goals of instruction are derived from consideration of ultimate goals in such a manner that attainment of the immediate goals is a good indication that the ultimate goals will be attained.

2. Any educational achievement test in a given area is a sample of items from a hypothetical population of items which constitute one operational definition of the measurable, immediate goals of instruction in that area.

3. The value of an educational achievement test may, in principle, be judged on the basis of evidence concerning its content validity, concurrent validity, construct validity, predictive validity, or some combination of these. However, for most tests of educational achievement, claims of validity must rest largely on evidence for content validity. Evidence for the other types of validity is seldom both available and relevant.

4. The most fundamental and direct evidence of content validity is derived from detailed examination of the test itself. The more closely the achievements a test requires are related to the immediate objectives of instruction in the area, the greater its content validity. Therefore, it is appropriate for a teacher, or other school personnel, to select an educational achievement test for a particular purpose partly on the evidence provided by this direct examination.

5. Judging the content validity of a test is easier when the test is accompanied by an outline of the achievements it covers, an indication of which items are intended to measure each achievement, and a summary of the other criteria, principles, and standards which guided the test author(s) in selecting questions and writing items.

6. Since reliability is a necessary condition for validity, indices of score reliability and of item discrimination are useful in judging the content validity of an educational achievement test. Therefore, a test manual should summarize and interpret relevant statistical data resulting from internal analysis of test scores and responses to test items.

7. While it is theoretically possible to establish the validity of an educational achievement test by comparing scores on it with subsequent measures of attainment of the ultimate objectives of instruction, this process is ordinarily so difficult and time-consuming that it is seldom if ever attempted. Instead, the less precise, but possibly more comprehensive, process of logical inference and subjective judgment usually must be used.

8. The value of a subjective judgment on the validity of an educational achievement test depends largely on the educational competence and expertise in testing of the person making the judgment.

9. Coefficients of correlation between scores on a given test and those on alternative measurement procedures may be useful in judging the concurrent, predictive, or construct validity of a test, but are completely irrelevant to its content validity.

10. The degree of concurrent validity is a direct function of the size of the coefficient of correlation between the test scores and the other measures the test is intended to replace.

11. The degree of construct validity is a complex function of the agreement between correlations predicted on the basis of theoretical definitions of the construct, and correlations obtained experimentally.

12. It is not necessary for a test author to present evidence concerning all four types of validity for each educational achievement test he/she develops.

13. A test manual should summarize and interpret relevant statistical data on the relation between scores from the test and from alternative measures of achievement in the area, or between distributions of scores from groups known to differ with respect to the achievement measured.
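
The statistics mentioned in notes 6, 10, and 13 can be illustrated with a minimal Python sketch. It is only an example under stated assumptions: the notes do not name a particular reliability index, so KR-20 (for items scored 0 or 1) is used here as one common choice, and item discrimination is taken as the correlation between an item and the total of the remaining items. The response matrix, the criterion scores, and all function names are hypothetical.

```python
from statistics import mean, pvariance


def pearson(x, y):
    """Pearson correlation of two equal-length score lists.

    Assumes both lists have nonzero variance.
    """
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def kr20(responses):
    """KR-20 reliability for 0/1 item scores (rows are examinees)."""
    k = len(responses[0])
    totals = [sum(row) for row in responses]
    # Sum of p * (1 - p) over items, where p is the proportion correct.
    pq = 0.0
    for j in range(k):
        p = mean(row[j] for row in responses)
        pq += p * (1 - p)
    return (k / (k - 1)) * (1 - pq / pvariance(totals))


def item_discrimination(responses, j):
    """Correlation of item j with the total score on the other items."""
    item = [row[j] for row in responses]
    rest = [sum(row) - row[j] for row in responses]
    return pearson(item, rest)


# Hypothetical data: 6 examinees, 4 items scored right (1) or wrong (0).
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
# Hypothetical scores on the alternative measure the test might replace.
criterion = [38, 35, 27, 30, 22, 15]

totals = [sum(row) for row in responses]
reliability = kr20(responses)                          # note 6
discrimination = [item_discrimination(responses, j)    # note 6
                  for j in range(len(responses[0]))]
concurrent_r = pearson(totals, criterion)              # notes 10 and 13

print(f"KR-20 reliability: {reliability:.3f}")
print("Item discrimination:", [f"{d:.2f}" for d in discrimination])
print(f"Correlation with criterion: {concurrent_r:.3f}")
```

With so few examinees these figures are illustrative only. In line with note 9, the correlation with the criterion bears on concurrent validity, not on content validity.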