Classroom Assessment: A Practical Guide for Educators
by Craig A. Mertler

Chapter 11: Interpreting Standardized Tests

Introduction
A standardized test is one that is administered, scored, and interpreted in identical fashion for all examinees. Standardized tests allow educators to gain a sense of the average level of performance for a well-defined group of students. Classroom teachers have no control over these types of tests but must understand their nature and interpretation. Achievement tests measure academic skills; aptitude tests measure potential or future achievement.

Nationally known standardized tests include:
• MAT
• CAT
• CTBS
• ITBS
• PRAXIS I and II
Many states also use state-mandated tests, which are authorized by state legislatures or boards of education and are used as high school graduation requirements.
Two types of standardized tests are norm-referenced (no predetermined passing score; performance is based on comparisons to others) and criterion-referenced (performance is compared to preestablished criteria).

Methods of Reporting Scores on Standardized Tests

Criterion-Referenced Tests
• Permit teachers to draw inferences about what students can do relative to a large domain.
• Answer the following questions: What does this student know? What can this student do? What content and skills has the student mastered?
• Report raw scores, usually in the form of the number or percentage of items answered correctly. Other, less common results include speed of performance, quality of performance, and precision of performance.

Norm-Referenced Tests
• Permit comparisons to a well-defined norm group (intended to represent the current level of achievement for a specific group of students at a specific grade level).
• Answer the following questions: What is the relative standing of this student across this broad domain of content? How does the student compare to other similar students?
• Scores are often transformed to a common distribution: the normal distribution, or bell-shaped curve.
• Normal distribution
  Three main characteristics:
  Distribution is symmetrical.
  Mean, median, and mode are the same score and are located at the center of the distribution.
  Percentage of cases within each standard deviation is known precisely.
• Raw score:
  Number of items answered correctly.
  Not very useful for norm-referenced tests.
  Score must be transformed in order to be useful for comparisons.
• Percentile rank:
  Single number that indicates the percentage of the norm group that scored below a given raw score.
  Ranges from 1 to 99; much more compact in the middle of the distribution (doesn't represent equal units).
  Often misinterpreted as the percentage of items answered correctly.
• Grade-equivalent score:
  The grade in the norm group for which a certain raw score was the median performance.
  Consists of two numerical components: the first number indicates grade level and the second indicates the month during that school year (ranges from 0 to 9); for example, a grade-equivalent score of 4.2 refers to the second month of fourth grade.
  Often misinterpreted as a standard to be achieved.
  Although scores represent months, they do not represent equal units.
• Standardized score:
  Score that results from a transformation to fit the normal distribution.
  Overcomes the previous limitation of unequal units.
  Allows for comparison of performance across two different measures.
  Reports performance on various scales to determine how many standard deviations the score is away from the mean.
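As a rough illustration of the percentile rank and standardized score ideas above, the sketch below computes both for one raw score. The norm group data and the function names (`percentile_rank`, `z_score`, `rescale`) are hypothetical, and the use of the population standard deviation is an assumption; published tests derive these values from large norm samples and their own norming tables.

```python
from statistics import mean, pstdev

def percentile_rank(raw_score, norm_scores):
    """Percentage of the norm group scoring below raw_score, kept within 1..99."""
    below = sum(1 for s in norm_scores if s < raw_score)
    pct = round(100 * below / len(norm_scores))
    return max(1, min(99, pct))

def z_score(raw_score, norm_scores):
    """Number of standard deviations raw_score lies above (+) or below (-) the mean.

    Assumption: the norm group is treated as the whole population (pstdev).
    """
    return (raw_score - mean(norm_scores)) / pstdev(norm_scores)

def rescale(z, scale_mean, scale_sd):
    """Express a z-score on another scale with the given mean and standard deviation."""
    return scale_mean + scale_sd * z

# Hypothetical norm group of ten raw scores (illustration only).
norm_group = [10, 12, 14, 16, 18, 20, 22, 24, 26, 28]

raw = 26
print(percentile_rank(raw, norm_group))      # 80
z = z_score(raw, norm_group)
print(round(z, 2))                           # 1.22
print(round(rescale(z, 50, 10), 1))          # 62.2 on a scale with mean 50, SD 10
```

Note that the percentile rank (80) and the rescaled score (62.2) describe the same performance; neither is a percentage of items answered correctly.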
• Standardized scores (continued)
  z-score
  More than 99% of scores fall in the range of –3.00 to +3.00.
  Sign indicates whether the score is above or below the mean; number indicates how many standard deviations it is away from the mean.
  Half the students will be above the mean; half will be below.
  Negative scores can be difficult to interpret.
  T-score
  Provides location of score in a distribution with a mean of 50 and standard deviation of 10 (over 99% of scores range from 20 to 80).
  Can be misinterpreted as percentages.
  SAT/GRE score
  Provides location of score in a distribution with a mean of 500 and standard deviation of 100 (over 99% of scores range from 200 to 800).
  Stanine score
  Provides the location of a raw score in a specific segment or band of the normal distribution.
  Mean of 5 and standard deviation of 2; ranges from 1 to 9.
  Represents coarse groupings; does not provide very specific information.
  Normal curve equivalent (NCE) score
  Mean of 50 and standard deviation of 21.06; matches percentile ranks at three specific points (1, 50, and 99).
  Unlike percentile ranks, represents equal units.
  Deviation IQ score
  Provides location of score in a distribution with a mean of 100 and standard deviation of 15 or 16, depending on the test.
  Primarily used with measures of mental ability.
  All standardized scores provide the same information, simply reported on different scales.

Interpreting Student Performance

Norm-Referenced Tests
• Error exists in all educational measures. It can affect scores both negatively and positively.
• Standard error of measurement (standard error or SEM):
  The average amount of measurement error across students in the norm group.
  Provides a range (known as a confidence interval) of performance when both added to and subtracted from the test score.
  Confidence interval = score ± standard error
  Purpose of the confidence interval is to determine the range of scores that we are reasonably confident represents a student's true ability.
  About 68% confidence interval (observed score ± one standard error).
  About 95% confidence interval (observed score ± two standard errors).
  Over 99% confidence interval (observed score ± three standard errors).
  On norm-referenced tests, confidence intervals are presented around the student's obtained percentile rank score.
  Known as national percentile bands.
  Can be used to compare subtests by examining the bands for overlap.
  When bands overlap, there is no real difference between the estimates of true achievement on the subtests.

Uses of Test Results for Teachers
Two main ways that test results can be used by teachers:
• For revising instruction for the entire class.
• For developing intervention strategies for individual students.
Standardized test results have not typically been used to aid teachers in making instructional decisions.
Data-driven decision making takes some practice and experience for classroom teachers.

• For revising instruction for the entire class:
Starting point: Standardized Test Scores
1. Identify any content area or subtest where a high percentage of students performed below average.
2. Based on these percentages, rank order the 6–8 content areas or subtests with the poorest performance.
3. From this list, select 1–2 content areas to examine further by addressing the following:
• Where is this content addressed in our district's curriculum?
• At what point in the school year are these concepts/skills taught?
• How are the students taught these concepts/skills?
• How are students required to demonstrate that they have mastered the concepts/skills? In other words, how are they assessed in the classroom?
4. Identify new/different methods of instruction, reinforcement, assessment, etc.
Outcome: Revise Instruction

• For developing intervention strategies for individual students:
Starting point: Standardized Test Scores
1. Identify any content area or subtest where the student performed below average.
2. Rank order the 6–8 content areas or subtests with the poorest performance.
3. From this list, select 1–2 content areas to serve as the focus of the intervention.
4. Identify new/different methods of instruction, reinforcement, assessment, etc., in order to meet the needs of the individual student.
Outcome: Revise Instruction

Analyzing Student Performance: An Example
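A minimal sketch of how the individual-student steps and the band-overlap check described earlier might be carried out in code. The subtest names, the scores, the scale with an average of 50, the SEM of 3, and the function names are all hypothetical; they are not taken from the author's example or from any real score report.

```python
def confidence_band(score, sem, n_sems=1):
    """Return (low, high) = score ± n_sems standard errors of measurement."""
    return (score - n_sems * sem, score + n_sems * sem)

def bands_overlap(band_a, band_b):
    """True when two (low, high) bands share at least one point."""
    return band_a[0] <= band_b[1] and band_b[0] <= band_a[1]

def focus_areas(subtest_scores, average=50, max_ranked=8, max_focus=2):
    """Steps 1-3: find below-average subtests, rank poorest first, pick the focus."""
    below = {name: s for name, s in subtest_scores.items() if s < average}
    ranked = sorted(below, key=below.get)[:max_ranked]   # poorest performance first
    return ranked[:max_focus]

# Hypothetical student report on an assumed scale where 50 is average.
student = {"Reading": 42, "Vocabulary": 55, "Math Computation": 38, "Science": 47}
SEM = 3  # assumed standard error of measurement, same for every subtest

print(focus_areas(student))   # ['Math Computation', 'Reading']

reading = confidence_band(student["Reading"], SEM)            # (39, 45)
math = confidence_band(student["Math Computation"], SEM)      # (35, 41)
vocab = confidence_band(student["Vocabulary"], SEM)           # (52, 58)

print(bands_overlap(reading, math))    # True: no real difference between these two
print(bands_overlap(reading, vocab))   # False: the estimates differ
```

Step 4, choosing new methods of instruction, reinforcement, or assessment, remains a professional judgment that the code does not attempt to automate.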