Ch 11: Standardized Tests I: Achievement Tests



What did you learn in school?

The public believes the answer to this question is best found in standardized testing . . . in seeing how “my” child or community compares to others.

Educators disagree.

Thus arises one of the hottest topics in education today -


Every teacher is involved in the administration and interpretation of standardized achievement tests in one form or another.

All educators need to be aware of the various purposes, strengths and shortcomings of these tests.

Potential Negative Effects of High Stakes Testing

on the way you teach and on the likely results.

Because of testing, the school may value some subjects at the expense of others; thus the curriculum contains some subjects “judged” more important than other subjects.

As a teacher, you may be asked to focus more on some aspects of your subject at the expense of other aspects.

You may decide to place emphasis on teaching some students at the expense of other students.

The consequences of these decisions could lead to learning that is

Shallow or trivial

Short-lived or narrow

By the way, these educational effects could be “nurtured” by schools and teachers without the insertion of high stakes testing.

Yet, there are potential positive effects

. . . for some teachers and schools in high stakes testing.

The status of having high-achieving schools attracts many teacher applications and new residents.

Teachers may be rewarded in terms of salary and benefits.

Teachers may have more flexibility in terms of how they design instruction and run their classes.

Topics: Standardized Achievement Tests

Review of the various meanings of the term “standardized”


Contrasting standardized achievement tests with teacher-made assessments

Six classes of standardized achievement tests

Special procedures to follow when administering standardized achievement tests

Remember from earlier discussions, the meaning of the term . . .

“Standardized” - A Cloze Review

Standardized Testing usually involves:

Uniform, clearly specified methods and procedures for __________ the test.


Attention to, and written reports on, three technical characteristics of testing to include consistency or __________, item __________, and test __________.

reliability, analysis, bias

Based on many previous cases, the test has large-group scoring __________.


Often, but not always, a standardized test is group __________, machine __________, and composed of largely __________ items.

administered, scored, multiple-choice

Achievement tests are designed to measure what one already knows. To ensure that the standardized tests are based on real school curricula, the makers pay attention to __________.

content validity

Most Standardized Tests are Timed

Howard Gardner says . . .

“Nothing of consequence would be lost by getting rid of timed tests by the College Board or, indeed, by (schools) in general.

Few tasks in life — and very few tasks in scholarship — actually depend on being able to read passages or solve math problems rapidly. As a teacher, I want my students to read, write and think well; I don't care how much time they spend on their assignments. For those few jobs where speed is important, timed tests may be useful.”

“Testing for Aptitude, Not for Speed,” New York Times, July 18, 2002

Contrasts with Teacher-Made Tests

Level of detail covered – standardized tests are more general; they’re mostly a sampling of what was studied.

Research base – teachers rarely have the time to prepare items as extensively as a standardized test company.

Availability of norms – teachers have only their previous students for comparison, and this is mostly informal; nationally standardized tests have norms allowing wider comparisons of achievement.

Frequency of occurrence – standardized tests are infrequent although their variety and their “high stakes” nature may make them “feel” dominant.

Does it make sense that we need both? The two serve different purposes. By the way, it is interesting to note that students who do well on teacher-made tests also tend to do well on standardized tests. Students who don’t “like” tests tend not to like either type of testing.

Classification of Standardized Achievement Tests

. . . we will discuss the following six groupings.


Achievement batteries


Single area tests


Licensing and certification exams


State testing programs


National and international studies


Individually administered achievement tests

1) Achievement Batteries

A system of interrelated K-12 tests . . .

Basic Idea:

To determine each student’s general achievement standing with respect to regional or national group performance over time and across subject areas. Typically used K-12.

The test battery is a group or system of interrelated tests that contain a fairly limited sample of questions covering many subject areas and many grade levels.

The direct comparability of normed scores across content areas and grade levels is one of the greatest values of these achievement batteries. These tests are based on high quality sources of information for their content (e.g., National Learned Societies, Professional Organizations, State Curricular Guides from large states).
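
To see why normed scores can be compared across content areas, consider a minimal sketch that converts raw scores to percentile ranks. The norm-group scores and the function name `percentile_rank` are hypothetical, made up for illustration; published batteries use far larger norm samples and their own scoring procedures.

```python
from bisect import bisect_left, bisect_right


def percentile_rank(score, norm_scores):
    """Percent of the norm group scoring below `score`, counting ties as half."""
    ordered = sorted(norm_scores)
    below = bisect_left(ordered, score)           # how many scored lower
    ties = bisect_right(ordered, score) - below   # how many scored the same
    return 100.0 * (below + 0.5 * ties) / len(ordered)


# Hypothetical norm-group raw scores (invented for illustration only).
reading_norms = [12, 15, 18, 20, 22, 25, 27, 30, 33, 36]
math_norms = [5, 8, 10, 11, 13, 14, 16, 18, 19, 21]

# One student's raw scores on two subtests with different raw-score ranges.
reading_pr = percentile_rank(27, reading_norms)
math_pr = percentile_rank(14, math_norms)

print(f"Reading percentile rank: {reading_pr}")
print(f"Math percentile rank: {math_pr}")
```

The raw scores 27 and 14 cannot be compared directly because the two subtests differ in length and difficulty. Once each is expressed relative to its own norm group, the two results sit on a common footing.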

The original intent was to monitor individual progress in the major areas of the school curriculum, with the school and the teacher being the intended score recipients. This reporting has been expanded to parents.

Methods of assessment found in a test battery include more than multiple-choice items (e.g., writing exercises, open-ended questions, performance measures).

Major batteries are more alike than they are dissimilar (e.g., Stanford Achievement Test Series; Metropolitan Achievement Tests; Iowa Tests of Basic Skills).

2) Single Area Achievement Tests

Typically high school and beyond . . .

Basic Idea:

To determine each student’s specific achievement standing with respect to regional or national group performance in a single subject area. May include criterion-referenced interpretations. Typically used in high school or diagnostically K-12.

There are single area achievement tests related to nearly all high school subjects. Check out the Educational Testing Service website: ETS Test Link Overview

Notice that there appears to be a wide range of quality. Check dates.

They even ask if you would like to submit your tests to the Test Collection at ETS (What does this suggest?).

Two Example Areas:

Diagnostic Tests are highly detailed (and therefore time-consuming) achievement tests with extensive subscores. These tests are administered individually and are used for formative evaluation. Reports may include criterion-referenced interpretations to aid intervention.

Advanced Placement (AP) Exams and SAT Subject Tests fall into this category and the outcome emphasis is on the final total score. These are most often used for summative assessment.

3) Licensing & Certification Exams

Using the Praxis exam series as an example . . .

Basic Idea:

To determine each student’s specific achievement in a single subject area with a cut-off score defining acceptable performance (a government-set minimum level intended to protect the public).

The Praxis series is a descendant of the National Teacher Exams.

Really a series of separate exams (e.g., Praxis I - Academic Skills; Praxis II - PLT and Subject Areas; Praxis III - First-Year Observation).

The scores are characteristically reported as scaled scores (raw scores converted to a scale with a set mean and standard deviation). The recipient often thinks the score is criterion-based while it is really norm-based.

Norms are based on whatever individuals took the exams in the most recent three-year period.

Each state sets its own cut-off scores. These scores typically fall between the 10th and 20th percentiles.
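
The arithmetic behind a scaled score and a percentile-based cut-off can be sketched as follows. This is only an illustration under stated assumptions: the scale mean of 150, the scale standard deviation of 10, the norm-group raw scores, and both function names are hypothetical and do not describe how any actual licensing exam is scored.

```python
import math
from statistics import mean, pstdev


def scaled_score(raw, norm_raw_scores, scale_mean=150.0, scale_sd=10.0):
    """Linearly rescale a raw score so the norm group has the chosen mean and SD.

    The defaults (150 and 10) are hypothetical, not the scale of any real exam.
    """
    mu = mean(norm_raw_scores)
    sigma = pstdev(norm_raw_scores)  # population SD of the norm group
    z = (raw - mu) / sigma           # distance from the norm mean in SD units
    return scale_mean + scale_sd * z


def score_at_percentile(scores, pct):
    """Nearest-rank percentile: lowest score with at least pct% at or below it."""
    ordered = sorted(scores)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical norm group: raw scores of everyone tested in the norming window.
norm_raw = [40, 45, 50, 55, 60, 65, 70, 75, 80, 85]
norm_scaled = [scaled_score(r, norm_raw) for r in norm_raw]

# A hypothetical state places its cut-off at the 20th percentile of the norm group.
cutoff = score_at_percentile(norm_scaled, 20)
passing = [s for s in norm_scaled if s >= cutoff]
pass_rate = 100.0 * len(passing) / len(norm_scaled)

print(f"Cut-off scaled score: {cutoff:.1f}")
print(f"Percent of norm group at or above the cut-off: {pass_rate}")
```

Note that the cut-off in this sketch is defined by where examinees fall relative to one another, not by a fixed list of skills. That is the sense in which such a score is norm-based even when it looks like a criterion.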

The Ohio Department of Education (ODE) selects the tests required and sets the qualifying scores (i.e., cut-off scores). Both the selected tests and the qualifying scores are subject to change by ODE.

The State of Ohio uses the examinees’ scores as a measure of the adequacy of a university’s teacher education program (using pass/fail rates, not the scores themselves).

4) State Assessment Programs

The Ohio Report Card System . . .

Basic Idea: To maintain receipt of federal funds. While some states have a long history of testing programs, the No Child Left Behind (NCLB) Act created a mandate. Now all states have them. Typically, state systems:

Concentrate on basic skills

Examine grades 3-8 . . . plus one high school grade

Employ existing achievement batteries

Are aimed at state content standards

Utilize a combination of multiple-choice & performance items

Issue reports to the public

Draw on a proficiency basis for reporting

Use high school graduation tests

Include public school students only

We will be discussing Ohio’s Report Card System in some detail.

5) National and International Assessment

Narrowly defined waves of comparison studies . . .

Basic Idea: To create benchmarks regarding the educational achievement of American students across the nation and across the world. Each scheduled wave of testing may address only one or a few areas of interest and the cohorts may be small.

Take a look at the following websites:

National Assessment of Educational Progress (NAEP)

Content areas covered

Ages/grades covered

Nature of reports

Trends in International Mathematics and Science Study (TIMSS) and Progress in International Reading Literacy Study (PIRLS)

Content areas covered

Ages/grades covered

Nature of reports

6) Individually Administered Achievement Tests

. . . may be coupled with aptitude testing

Basic Idea: To diagnose discrepancies among various achievement levels or between achievement and mental ability. Sometimes these tests are called psychoeducational batteries. Administered by school psychologists.

As a teacher, know that these tests do exist. You would never administer these or use them to provide formative information about your curriculum. You may find yourself discussing elements of these as part of an IEP process.

Administering Standardized Tests

Same as discussed vis-à-vis teacher-made tests, plus . . .

Before Testing

Attend to students’ test taking skills and test taking motivation

Read the directions in the test manual to yourself in advance (both what to say and what to do)

Ensure availability of materials (e.g., know what you need – pencils, watch, etc.)

During the Test

Follow the test manual directions exactly

Time the test session accurately (need clock/watch with second hand)

Monitor the situation and make notes on unusual situations (e.g., distractions)

After the Test

Retrieve all materials

File your notes

Clean up answer documents if requested in test manual (smudges? correctly coded?)

Clearly Unethical Practices

Changing students’ answers (e.g., fill in blanks)

Deliberately not following directions (e.g., allow more time)

Giving the actual test items to students in advance

Practical Advice






Keep in mind the strengths and weaknesses of both standardized and teacher-made tests. Both contribute to a successful assessment program.

Know the basis for norms and performance categories.

Understand the content outline of Ohio Praxis I and II exams.

Be familiar with sources like NAEP and TIMSS.

Know principles of good practice for administering standardized tests.

Terms and Concepts to Review and Study on Your Own





psychoeducational battery

standardized test