NCLB: Changing It, Fixing It, Living With It

NATIONAL ASSOCIATION OF TEST
DIRECTORS 2007 SYMPOSIUM
Organized by:
Bonnie Wilkerson, Ed.D.
Northbrook (IL) School District 27
Edited by:
Joseph O'Reilly, Ph.D.
Mesa (AZ) Public Schools
National Association of Test Directors 2007 Proceedings
This is the twenty-third volume of the published symposia, papers and
surveys of the National Association of Test Directors (NATD). This
publication serves an essential mission of NATD - to promote discussion
and debate on testing matters from both a theoretical and practical
perspective. In the spirit of that mission, the views expressed in this
volume are those of the authors and not NATD. The paper and
discussant comments presented in this volume were presented at the
April, 2007 meeting of the National Council on Measurement in
Education (NCME) in Chicago, Illinois.
The authors, organizer and editor of this volume are:
Robert Linn
University of Colorado at Boulder
PO Box 1815
100 Fifth St
Ouray, CO 81427
robert.linn@colorado.edu
Judy Feil
Ohio Department of Education
Office of Assessment, Mail Stop #507
25 South Front Street
Columbus, OH 43215
judy.feil@ode.state.oh.us
David Kroeze
Northbrook School District 27
1250 Sanders Road
Northbrook, IL
kroeze.d@northbrook27.k12.il.us
Barbara Boyd
Education Partnerships Officer,
Nationwide Mutual Insurance Company
One Nationwide Plaza
Columbus, OH 43215
boydb@nationwide.com
Glynn Ligon
ESP Solutions Group
8627 N. Mopac, Suite 400
Austin, TX 78759
gligon@espsg.com
Bonnie Wilkerson, Organizer and Moderator
Northbrook School District 27
1250 Sanders Road
Northbrook, IL
wilkerson.b@nb27.org
Joe O'Reilly, Editor
Mesa Public Schools
63 East Main Street #101
Mesa, AZ 85201
(480) 472-0241
joreilly@mpsaz.org
Table of Contents
A Nationwide Overview of NCLB
Robert Linn
The State of NCLB at a State Department of Education
Judy Feil
Is A High Performing District’s Performance High Enough for NCLB?
David Kroeze
Business Looks at NCLB’s Bottom Line
Barbara Boyd
Discussant Comments
Glynn Ligon
Needed Modifications of NCLB
Bob Linn
CRESST, University of Colorado at Boulder
NCLB has much that is worthwhile. It is particularly praiseworthy for its
emphasis on all children, for the special attention it gives to improving learning
for children who have lagged behind in the past, and for the attention given to
closing persistent gaps in achievement. Although there are definition and
identification problems, NCLB has also called attention to the need to have
qualified teachers. It is important that the positive aspects of NCLB be preserved
in the reauthorization process that is now under consideration. It is also
important, however, that some fundamental changes be made to make the law
more functional.
Among the major issues that need to be addressed are funding, providing
increased flexibility to states, districts, and schools in implementing the law, and
defining teacher quality. My focus, however, is much narrower and more
specific. My focus is on fixing the NCLB accountability system. In my judgment,
there are several fundamental problems with the NCLB accountability system
and those problems are serious enough that they threaten to undermine the more
laudable aspects of the law.
There are many problems with the NCLB accountability system, but I will focus
on just four of them that I believe are especially serious. These are: (1) unrealistic
expectations, (2) the meaning of proficient achievement, (3) the reliance on
current-status targets, and (4) the use of multiple hurdles. A fifth problem is
really caused by states trying to do something reasonable in light of the four
problems just identified: keeping an overabundance of schools from failing to make
adequate yearly progress (AYP). States have found ways to game the system
that have been approved by the U.S. Department of Education to avoid having
many more schools being identified as needs improvement than can possibly be
provided with effective assistance. I include as approved game playing such
activities as watering down the definition of proficient achievement, back-loading
state trajectories to the 2014 goal of 100% proficiency, increasing the
minimum number of students required to hold schools responsible for subgroup
results, and the use of confidence intervals - both for initial AYP calculations and
for safe-harbor calculations (see, for example, Porter, Linn, & Trimble, 2005 for a
discussion of some of these issues).
Unrealistic Expectations
The title of the law, No Child Left Behind, is rhetorically brilliant. No reasonable
person would argue that the education of some children should be ignored or
that some identified fraction of students should be left behind. Our society has a
moral obligation to attempt to provide a high quality education for all children.
Society also has a vested interest in enhancing the educational level of all
students. This does not mean, however, that it is reasonable to expect that all
students will, in fact, achieve a high level of proficiency in reading and
mathematics.
As will be discussed in the next section, NCLB does not provide a detailed
definition of proficient achievement, but it does specify that states must set
“challenging achievement standards” and that the proficient level be one of two
high levels of achievement “that determine how well children are mastering the
material in the State academic content standards” (NCLB, 2001, Part A, Subpart
1, Sec. 1111 (b) (D) (ii)). The NCLB mandate that all children achieve at least the
proficient level is totally unrealistic when coupled with the mandate that states
set a “challenging” proficient achievement standard. As Rothstein, Jacobsen, and
Wilder (2006) so aptly put it, “’proficiency for all’ is an oxymoron” (p. 16).
I have previously used trends observed in reading and mathematics on the
National Assessment of Educational Progress (NAEP) and the distributions of
performance of students from other countries on the Third International
Mathematics and Science Study (TIMSS) to show that 100% proficient is a goal
that is completely out of reach even with extraordinary effort on the part of
teachers and students (Linn, 2000; 2003; in press). Although there have been
fairly sizeable increases in the percentage of students who score at the proficient
level or above on the NAEP mathematics assessments, particularly at the fourth
grade, the rate of improvement would have to be several times as fast in the next twenty
years as it was in the last decade to come close to 100% by the year 2020, much
less to achieve that level in just seven more years. In reading, where the trend
lines on NAEP are best described as essentially flat, the prospect of obtaining
universal proficiency is even more remote.
As Rothstein and his colleagues point out, the issue is more than just the time
line. “There is no date by which all (or even nearly all) students in any
subgroup, even middle-class white students, can achieve proficiency” (Rothstein,
Jacobsen, & Wilder, 2006, p. 1). The results of international assessments (see, for
example, Mullis, Martin, & Foy, 2005; Mullis, Martin, Gonzalez, & Kennedy,
2003; Organization for Economic Co-operation and Development, 2004) clearly
demonstrate that there is substantial variation in student achievement in every
participating country. As a result of the large within-country variation, no
country has nearly all its students performing above some high standard of
achievement that would correspond to proficient achievement as is demanded
by NCLB. Based on a linking of the 1999 TIMSS to the 2000 NAEP, Phillips
(2007), for example, has shown that while countries such as Singapore and Korea
come reasonably close to having all their students perform at or above the NAEP
basic level on grade 8 mathematics, no country has even three quarters of its
students at or above the proficient level.
The only way to have all, or nearly all, students exceed a standard of
performance is to set that standard at a very low level. A standard that is
obtainable by the lowest achieving students will not be a challenge for
moderately high achieving students. On the other hand, a standard that is
challenging to moderately high achieving students will be unobtainable for the
lowest achieving students (Rothstein, Jacobsen, & Wilder, 2006).
If proficient remains a challenging standard of achievement and the universal
proficiency goal is maintained for the year 2014, then nearly all schools will fail
to meet the goal. Although it may be reasonable to believe that improvement is
desirable for everyone and every school, it makes no sense to impose sanctions
on schools because they fall short of reaching the unrealistic goal of all students
achieving at the proficient level or above.
Goals should be ambitious, but they should also be obtainable given sufficient
effort and adequate resources. How can a goal be set that is ambitious, but
realistically obtainable? One way is to rely on past experience. Schools might be
rank ordered in terms of the rate of improvement in student achievement on the
state’s assessments in reading and in mathematics over the past 4 or 5 years.
The highest ranking, say 20%, of schools in terms of gains made on each
assessment could then be used to set the goal for all schools. If the top 20%
increased the percentage of students performing at the proficient level or above
by an average of, say, 2% per year in reading and, say, 3% per year in
mathematics then increases of 2% and 3% per year could be set as the goals for
reading and mathematics respectively for all schools. Those would certainly be
ambitious goals for schools that had shown little if any improvement or
possibly even declined during the past 4 or 5 years, but they would also be based on
the knowledge that continued improvement at the identified rate is possible as
demonstrated by the performance of the set of schools that were used to set the
goals.
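The goal-setting procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `improvement_goal`, the rounding rule for how many schools count as the top 20%, and the sample gains are all hypothetical and are not taken from any state's data.

```python
from statistics import mean


def improvement_goal(annual_gains, top_fraction=0.20):
    """Return the mean annual gain of the top-ranked fraction of schools.

    annual_gains: each school's average yearly change in percent proficient
    (percentage points) over the past 4 or 5 years.
    """
    ranked = sorted(annual_gains, reverse=True)         # largest gains first
    n_top = max(1, round(len(ranked) * top_fraction))   # keep at least one school
    return mean(ranked[:n_top])


# Hypothetical reading gains for ten schools (illustrative values only).
reading_gains = [2.5, 1.5, 0.4, -0.3, 0.9, 0.0, 1.1, 0.2, -0.8, 0.6]
reading_goal = improvement_goal(reading_gains)
print(reading_goal)  # 2.0
```

With these made-up numbers the two highest-gaining schools average 2 percentage points per year, so 2 points per year would become the reading goal for every school.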
Proficient Achievement
As was previously noted, NCLB specifies that the proficient achievement
standard should be challenging and represent a high level of achievement, but it
does not give a detailed definition of proficient achievement. Rather, it is left to
each state to define the proficient standard. States typically set their achievement
standards for each assessment by convening a panel of judges who use a
provisional definition of proficient achievement when reviewing items on a state
assessment. Judges translate the provisional definition of proficient achievement
into a cut score on the test using the standard setting method selected by the
state.
The stringency of the proficient standard varies widely from state to state. This is
evident from even a cursory consideration of the percentage of students who
scored at the proficient level for different states. Olson (2005) reported the
percentage of students who scored at the proficient level or above in reading and
in mathematics at grades 4 and 8 for 47 states.1 The percentage proficient or
above in reading ranged from 35% to 89% at grade 4 and from 30% to 88% at
grade 8. In mathematics the range was even larger, from 29% to 92% at grade 4
and from 16% to 87% at grade 8. These ranges are much larger than the corresponding
ranges found on the 2005 NAEP state-by-state assessments in reading and
mathematics at those grades. Furthermore, the states with extremely high or low
percentage proficient or above make no sense in terms of other things that are
known about education in those states. Only 16% of the students were proficient
or above on the 2005 grade 8 mathematics assessment in Missouri whereas 87%
of the 8th graders in Tennessee were reported to be at that level. Even without
resorting to a comparison of achievement on NAEP it simply is not credible that
more than 5 times as many 8th grade students are proficient in mathematics in
Tennessee as in Missouri. It becomes even less plausible when the 71
percentage point difference on the state assessments is contrasted with the
finding that the percentage of grade 8 students who scored at the proficient level
or above on the NAEP mathematics assessment in 2005 was slightly higher for
Missouri (26%) than for Tennessee (21%) (Perie, Grigg, & Dion, 2005).
There are several preferable approaches to reporting results in terms of percent
proficient. For example, the standard could be defined to be equal to the state
median score in a base year. The percentage of students scoring above that
constant cut score would then be used to monitor improvement in achievement.
Target increases could be set based on past experience with the gains made by schools
showing high rates of improvement in each of the last several years. This might
lead to a figure something like a 3% increase per year in the percentage of
students above the state median. For a school with 50% of its students above
the state median in the 2006 base year, the goal would be 53% in 2007 and
74% in 2014. That would represent a gigantic improvement in the
achievement of the state’s students, but might not be totally unrealistic, and
surely is not as implausible as 100% proficient or above.
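The arithmetic behind the median-based targets can be checked with a short Python sketch. It assumes a simple linear trajectory (the same number of percentage points added every year) and caps targets at 100%; the function name `yearly_targets` is hypothetical and used only for illustration.

```python
def yearly_targets(base_percent, annual_increase, base_year, final_year):
    """Linear targets for the percentage of students above the base-year median."""
    targets = {}
    for year in range(base_year, final_year + 1):
        projected = base_percent + annual_increase * (year - base_year)
        targets[year] = min(100, projected)  # a percentage cannot exceed 100
    return targets


# A school starting at 50% in the 2006 base year, gaining 3 points per year.
targets = yearly_targets(50, 3, 2006, 2014)
print(targets[2007], targets[2014])  # 53 74
```

Eight years of 3-point increases add 24 points to the 50% starting value, which gives the 74% figure for 2014.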
Another alternative would be to use what Jim Popham (2004) has called grade-level
descriptions. At grade level might correspond more closely to the “basic”
than the “proficient” level in most states. Using past experience, targets could be
set that would bring the achievement of an ever increasing percentage of
students up to the “at-grade-level” standard.
1 The closest grade was used for states that did not have assessments in one of the subjects at grade 4 or grade 8.
Current-Status vs. Growth or Improvement
Although the NCLB accountability system might appear to focus on
improvement as suggested by the word progress in AYP, it actually focuses on
current status. Schools where students are already achieving at relatively
high levels, for example, can actually have a decline in achievement from one
year to the next, and still make AYP. Schools with very low achievement
initially, on the other hand, will routinely fail to meet AYP even if they show
rather sizeable year-to-year gains in student achievement.
With the exception of the rarely applicable safe harbor provision, AYP focuses on
current achievement in a given year in comparison to an Annual Measurable
Objective (AMO) for that year rather than changes in achievement from one year
to the next. Consequently, schools that have a high achieving level to begin with
have a relatively easy time meeting AYP without any gains in achievement, at
least in the first few years. On the other hand, schools with initially low
achieving students would have to have extraordinary improvement in
achievement to meet the AMO. Consequently, many schools that are actually
showing considerable progress, and deserve recognition for the gains they are
making, fail to meet AYP because of their initial low performance.
Basing evaluations of schools almost exclusively on current performance of
students in relationship to fixed targets ignores the fact that schools differ
substantially in the achievement of their students when they enter school. It
privileges schools serving students who are already high achieving and puts
schools serving initially low achieving students at a substantial disadvantage.
The inference that School A is of low quality or that the teachers in School A are
less effective than those in School B based solely on the fact that the percentage of
students who are at the proficient level or above in a given year is smaller in
School A than the corresponding percentage at School B is simply not justified
because there are so many other possible explanations of the difference, most
notably that the students in the two schools differed in their levels of
achievement at the start of the year or when they entered first grade.
Many state-devised school accountability systems base their evaluations of
schools on a combination of current status measures and improvement in
student achievement from one year to the next. Therefore it is not surprising that
a number of states have expressed interest in the possibility of changing the way
in which AYP is determined for NCLB to allow greater emphasis on
improvement.
A change in the NCLB accountability system that would allow schools to meet
AYP either because their current achievement met a target or because the
improvement in achievement met an improvement target seems desirable. This
might be accomplished with a less stringent safe harbor criterion. Consistent
with proposals above, both the current year achievement target and the
improvement target should be set in light of what has been shown to be possible
by schools that have shown substantial gains over a period of 4 or 5 years.
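A rule that lets a school make AYP through either route might look like the following Python sketch. The function name `makes_ayp`, the target values, and the three example schools are hypothetical; the sketch only illustrates the "status or improvement" logic and is not a statement of how any approved plan works.

```python
def makes_ayp(current, previous, status_target, improvement_target):
    """True if the school meets the status target OR the improvement target.

    current, previous: percent proficient or above this year and last year.
    improvement_target: required year-to-year gain in percentage points.
    """
    meets_status = current >= status_target
    meets_improvement = (current - previous) >= improvement_target
    return meets_status or meets_improvement


# Hypothetical targets: 50% proficient, or a gain of at least 3 points.
print(makes_ayp(current=70, previous=74, status_target=50, improvement_target=3))
print(makes_ayp(current=30, previous=25, status_target=50, improvement_target=3))
print(makes_ayp(current=30, previous=29, status_target=50, improvement_target=3))
```

The first school passes on status despite a decline, the second passes on improvement despite low status, and the third passes on neither.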
An alternative way of evaluating change in achievement that is attractive to
several states is the use of longitudinal student records to track the growth in
achievement for individual students. Analytical procedures, commonly referred
to as value-added models, are used to estimate the school effects on student
growth. Consideration should be given to the possibility of allowing states to
use results of value-added analyses to provide evidence of improved
achievement. The value-added results could be used, possibly in combination
with status measures, to satisfy AYP requirements.
In response to widespread interest in approaches that focus on growth for
purposes of determining AYP, the U.S. Department of Education authorized a
pilot program that allowed states to submit proposals to use a growth model to
make AYP determinations. The pilot program was announced by Secretary
Spellings on November 21, 2005. Several “core principles” that must be met for a
proposal to be approved were identified in a letter from Secretary Spellings to
the Chief State School Officers regarding the pilot program. The first, and
perhaps the most constraining principle, specifies that the growth model “must
ensure that all students are proficient by 2013-2014 and set annual goals to
ensure that the achievement gap is closing for all subgroups of students”
(Spellings, 2005). Thus, despite the argument that the expectation is unrealistic,
the fixed achievement target of 100% proficient or above in 2013-2014 is
maintained.
Eight states submitted proposals to participate in the growth model pilot
program and two of those proposals (North Carolina and Tennessee) were
approved for implementation of growth model pilots in 2005-2006 (Spellings,
2006). Three more states (Arkansas, Delaware, and Florida) have been approved
for 2006-2007 and nine additional states have submitted proposals that are
currently under review (Olson, 2006).
The pilot program takes one step toward a system that would use information
about improvement as well as current achievement in determining whether or
not schools are performing adequately. This is an important step, but so far will
be applicable only for a small fraction of the states. It is also limited by the
continuing requirement that the amount of growth will lead to all students
reaching at least the proficient level by 2014. The option of using improvement
as well as current status to determine AYP needs to be available to more states
and, as was argued above, more realistic achievement goals need to be set.
Many states lack a longitudinal data system that would allow them to implement
a value-added model. Improvement in performance of students in those states
could still be used in the determination of AYP by comparing the performance of
student cohorts from one year to the next. Comparisons of successive cohorts of
students (e.g., 4th grade students in 2006 compared to 4th grade students in 2005)
lack some of the advantages of longitudinal tracking of student achievement,
but can still provide information on changes in student achievement that would
complement the comparisons of current performance to fixed targets each year.
Multiple Hurdles
There are many ways that a school can fail to meet the AYP requirements in a
given year, but only one way that it can meet them. It must meet or exceed the
participation rate requirements (95% of eligible students) for mathematics and
reading/English language arts for the student body as a whole and for each
subgroup of students where disaggregated reporting is required, and must meet
or exceed the percent proficient or above targets for all students and for all
subgroups. Thus, at a minimum, schools must clear 5 hurdles to make AYP.
Because of disaggregated reporting requirements for subgroups, schools with
diverse student bodies are frequently confronted with many more than the 5
hurdles based on all students in the school. As the number of subgroups for
which disaggregated reporting is required increases, the number of hurdles that
a school must clear rapidly increases (Marion, White, Carlson, Erpenbach,
Rabinowitz, & Sheinker, 2002).
Thus a school with more than the minimum number of students in each of
several subgroups identified for disaggregated reporting has substantially more
than 5 hurdles to clear. For example, a school with 6 subgroups (African
American students, Hispanic students, white students, students with limited
English proficiency, economically disadvantaged students, and students with
disabilities) meeting the minimum size requirement would have not 5, but 29,
hurdles to clear (the 5 when all students in the school are considered as a whole,
plus 24 for the 4 hurdles (participation rates in reading and mathematics, and
achievement in reading and mathematics) for each of the 6 subgroups). Thus, the
latter school could fail to make AYP in 29 different ways but could make AYP in
only one way – by clearing all 29 hurdles.
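The hurdle count follows directly from the numbers given above: 5 hurdles for the school as a whole plus 4 for each subgroup that meets the minimum size. A minimal Python sketch (the function name `count_hurdles` is hypothetical):

```python
def count_hurdles(n_subgroups, whole_school_hurdles=5, hurdles_per_subgroup=4):
    """Number of separate AYP requirements a school must satisfy.

    Each reportable subgroup adds participation and achievement hurdles in
    both reading and mathematics (4 in total).
    """
    return whole_school_hurdles + hurdles_per_subgroup * n_subgroups


print(count_hurdles(0))  # 5
print(count_hurdles(6))  # 29
```

Because a school must clear every hurdle, each additional reportable subgroup adds four more ways to miss AYP without adding any new way to make it.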
Requiring schools to meet AYP requirements for separate subgroups of students
is consistent with the NCLB goal of closing gaps in achievement for the identified
subgroups. Nevertheless, it is clear that NCLB’s multiple-hurdle approach
makes it considerably more difficult for large schools with diverse student bodies
to meet AYP requirements than it is for small schools or schools with
homogeneous student bodies (Kim & Sunderman, 2005; Linn, 2005).
There are alternatives to the conjunctive system of multiple hurdles used in the
NCLB school accountability system. The most obvious alternative is some form
of a compensatory system. With a compensatory approach, high achievement
that is above the goal in one content area can be used to compensate for
achievement that falls below the goal in another area. If the AMO for a given
year was 50% proficient or above in reading and 40% proficient or above in
mathematics, for example, then a school where, say, 55% of its students were
proficient or above in reading but only 38% of its students were proficient or
above in mathematics could make AYP under a compensatory system while it
would fail to do so under the current multiple-hurdle system. A number of state
accountability systems that were in place prior to the enactment of NCLB used a
compensatory approach.
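The contrast between the two decision rules can be made concrete with the reading and mathematics example above. The sketch below assumes one simple form of compensation, an unweighted sum of the differences between results and targets; actual compensatory systems may weight content areas differently, and the function names are hypothetical.

```python
def conjunctive(results, targets):
    """Multiple-hurdle rule: every content area must meet its own target."""
    return all(results[area] >= targets[area] for area in targets)


def compensatory(results, targets):
    """Compensatory rule (assumed form): surpluses may offset shortfalls."""
    net_surplus = sum(results[area] - targets[area] for area in targets)
    return net_surplus >= 0


targets = {"reading": 50, "mathematics": 40}   # AMOs from the example
results = {"reading": 55, "mathematics": 38}   # school's percent proficient

print(conjunctive(results, targets))   # False
print(compensatory(results, targets))  # True
```

The 5-point surplus in reading more than offsets the 2-point shortfall in mathematics, so the school makes AYP under the compensatory rule but not under the multiple-hurdle rule.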
Conclusion
NCLB has the potential to make substantial positive contributions to education.
It can contribute to the improvement of student achievement and, through its
focus on students who have lagged behind and too often been ignored in the
past, to the closing of achievement gaps among racial/ethnic groups, between
economically disadvantaged students and their more affluent counterparts,
between limited English proficient students and native English speakers, and
between students with and without disabilities. Some features of the NCLB
accountability system, however, need to be modified if the praiseworthy goals of
NCLB are going to be achieved.
The most important modification is to set performance targets for judging
adequate yearly progress that are more reasonable and for which there is a
realistic hope that they might be achieved given sufficient effort. The need for
more realistic goals applies to both the safe harbor provision of the law and to
the annual performance targets. The current definitions of proficient
achievement established by states lack any semblance of a common meaning.
Alternatives to defining proficiency should be considered that would provide
more meaningful and comparable achievement targets. Past data on what
schools showing exemplary gains in achievement have accomplished should be used to set goals that
are ambitious, but obtainable with hard work and those goals should be
expressed in ways other than the currently poorly defined proficient academic
achievement standards that vary wildly from state to state.
Changes to AYP requirements should be made that would allow schools to get
credit for gains in achievement as well as absolute performance in a given year.
The recently introduced pilot program that allows the use of longitudinal growth
models by a small number of states is a step in that direction, but the fact that the
unrealistic 100% proficiency requirement in 2013-2014 is maintained undercuts
the value of the program. It is limited to just 2 states in 2005-2006 and 5 states in
2006-2007. Furthermore, gains made by schools in states without longitudinal
tracking systems do not count toward making AYP if the school fails to meet the
AMO for either reading/English language arts or mathematics.
Finally, the multiple-hurdle approach used to determine AYP should be replaced
by a compensatory or hybrid approach. This would make the system fairer for
schools that serve heterogeneous student bodies. It would also enhance the
reliability of school classification.
References
Kim, J. S. & Sunderman, G. L. (2005). Measuring academic proficiency under the
No Child Left Behind Act: Implications for educational equity. Educational
Researcher, 34(8), 3-13.
Linn, R. L. (2000). Assessments and Accountability. Educational Researcher,
29(2), 4-14.
Linn, R. L. (2003). Accountability: Responsibility and reasonable expectations.
Educational Researcher, 32(7), 3-13.
Linn, R. L. (2005, June 28). Conflicting demands of No Child Left Behind and
state systems: Mixed messages about school performance. Educational Policy
Analysis Archives, 13(33).
Linn, R. L. (in press). Toward a more effective definition of adequate yearly
progress. In G. Sunderman (Ed.), Holding NCLB accountable: Achieving
accountability, equity, and school reform. Thousand Oaks, CA: Corwin Press.
Marion, S. T., White, C., Carlson, D., Erpenbach, W. J., Rabinowitz, A., &
Sheinker, J. (2002). Making valid and reliable decisions in determining adequate yearly
progress. A paper series: Implementing the state accountability requirements
under the No Child Left Behind Act of 2001. Washington, DC: Council of Chief
State School Officers.
Mullis, I. V. S., Martin, M. O., & Foy, P. (2005). IEA’s TIMSS 2003 international
report on achievement in mathematics cognitive domains: Findings from a developmental
project. Chestnut Hill, MA: International Study Center, Lynch School of
Education, Boston College.
Mullis, I. V. S., Martin, M. O., Gonzalez, E. J., & Kennedy, A. M. (2003). PIRLS
2001 international report: IEA’s study of reading literacy achievement in primary school
in 35 countries. Chestnut Hill, MA: International Study Center, Lynch School of
Education, Boston College.
No Child Left Behind Act of 2001, Public Law No. 107-110.
Olson, L. (2005). Room to Maneuver. Education Week, Special pull out section:
A progress report on the No Child Left Behind Act, pp. S1-S6, December 14.
Olson, L. (2006). 3 states get OK to use “growth model” to gauge AYP. Education
Week, 26(12), Nov. 15, p. 24.
Organization for Economic Co-operation and Development. (2004). Learning for
tomorrow’s world: First results from PISA 2003. Paris, France: Author.
Perie, M., Grigg, W., & Dion, G. (2005). The nation’s report card: Mathematics 2005
(NCES 2006-453). U.S. Department of Education, National Center for Education
Statistics. Washington, DC: U.S. Government Printing Office.
Phillips, G. W. (2007). Expressing international educational achievement in terms of
U.S. performance standards: Linking NAEP achievement levels to TIMSS. Technical
Report. Washington, DC: American Institutes for Research.
Popham, W. J. (2004). Ruminations regarding NCLB’s most malignant
provisions: Adequate yearly progress. Available at
//www.ctredpol.org/pubs/Forum28July2004.
Porter, A. C., Linn, R. L., & Trimble, S. (2005). The effects of state decisions
about NCLB Adequate Yearly Progress targets. Educational Measurement: Issues
and Practice, 24(4), 32-39.
Rothstein, R., Jacobsen, R., & Wilder, T. (2006). ‘Proficiency for all’ – An
oxymoron. Paper presented at a Symposium, “Examining America’s
commitment to closing achievement gaps: NCLB and its alternatives,” sponsored
by the Campaign for Educational Equity, Teachers College, Columbia University,
November 13-14.
Spellings, M. (November 21, 2005). Letter to Chief State School Officers,
announcing growth model pilot program, with enclosures. Available at:
http://www.ed.gov/nclb/landing.jhtml.
Spellings, M. (May 17, 2006). Press release. Secretary Spellings approves
Tennessee and North Carolina growth model pilots for 2005-2006. Available at
http://www.ed.gov/news/pressreleases/2006/05/05172006a.html.
The State of NCLB at a State Department of Education
Judy Feil
Ohio State Department of Education
We all know that NCLB is about more than annually measuring students’ academic
achievement in English/Language Arts and mathematics in grades 3 – 8 and once in the
grade span 9 – 12, and in science three times, once in each of three different grade spans. NCLB is
about the accountability of schools, districts, and states for the learning of all students and
for improving the educational systems that serve those students. It is about providing
resources, both fiscal and human, to struggling schools; giving technical assistance to
schools that need help; training and hiring highly qualified teachers at all grade levels;
and allowing school choice for parents in an effort to improve the educational
opportunities of all students, close the achievement gap, and provide a fairer and more
equitable education to all groups. Are these lofty goals? Yes. And who can argue with
those goals – who is opposed to any of them? The underlying premise is that NCLB will
improve the educational system in America helping to raise the standards and
expectations for all: states, districts, schools, and students. I would like to share some of
the speed bumps along this road to improving opportunities for students – maybe even
share some potholes that states have fallen into and are struggling to overcome.
Let me remind everyone, although that is probably not necessary, that AYP has three
parts – two of which are dependent upon a state assessment system. That assessment
system is the foundation upon which all the other tenets rest. It is the topic of my
presentation today.
There are three topics I wish to discuss today: some demands of NCLB, the costs of
NCLB, and the balancing act required to make NCLB work. These topics are in the
realm of my expertise as a state testing director. Remember that Ohio is not unique but is
one of 50 states struggling to comply with NCLB and, I might add, making it work.
Most of you are aware of the many demands on the state assessment system under
NCLB. Only a very few are listed here:
• Statewide System of
– Challenging Academic Content Standards
– Challenging Academic Achievement Standards
– Annual High-Quality Assessments
• Assessments with High Technical Quality
• Inclusion of All Students in the Assessment System
• Comparability of test results
Some key words are highlighted: challenging, annual, high technical quality, all students,
and comparability. Even though the balancing act required under NCLB is listed as the
third topic of discussion today, it is interwoven into all the remarks, subtly underlying all
the assessment work.
Consider the word “challenging” alongside “all students.” What does the word
challenging mean when asked of the entire student population? Challenging for the
students who routinely perform at a low level of academic achievement means something
entirely different than it does for the students who perform at a high level of academic
achievement. What is challenging to one group of students may represent the impossible
for another group of students.
Also consider the “comparability” of test results and “all students.” All students in the
state must be given the same or a comparable assessment. Some students should be
provided accommodations and/or special versions in order to have access to the
assessment. Are their results truly comparable to those of the student who just barely misses
qualifying for the accommodations? How can the results from the portfolio used as the
alternate assessment for the significantly cognitively disabled student be comparable to
the score results from a paper-and-pencil, on-demand assessment? States must do a remarkable
balancing act to demonstrate comparability of test results to the NCLB peer reviewers.
And lastly, consider the “annual” assessments coupled with “high technical quality.” Most people
would agree that the large-scale standardized assessments developed by the testing
industry for decades were of high technical quality. But those products have been
shelved. The timeline to develop them and collect evidence of high technical quality was
between 5 and 7 years for the testing companies. Gone also is that timeline. States are
required to produce annual assessments of high technical quality and in most cases with
unique assessments every year. Sadly, quality must be balanced against expediency.
Despite these seemingly overwhelming challenges, states, to date, have been relatively
successful.
The Costs of NCLB
There are three primary costs associated with the state assessment system linked to the
high demands of NCLB: fiscal, human and public perception or credibility costs.
Fiscal Costs
As you can see from Chart 1, the costs of the state assessment system required by federal
law have risen steadily over the approximately 5 ½ years since NCLB was signed into law.
The red line graph is the federal appropriation for the mandated state assessments.
The blue line graph represents the costs of the assessment contracts with the
testing companies working on the Ohio project. As new tests came on line to complete
the required battery of mandated tests, the costs increased; as the state worked to test all
students, the costs increased; and as the state worked to establish the high quality and
comparability of the assessments, special versions, alternate assessments, and
accommodations, the costs increased.
Chart 1
State Assessment Costs and Federal Appropriation
[Line graph: dollar costs (0 to 90,000,000) by fiscal year, 2002 to 2007; series: Federal Appropriation and Ohio Assessment Costs.]
As can be seen in Chart 2, 28% was the highest proportion of the assessment costs paid
for by the federal appropriation but as the state costs increased, the appropriation
remained relatively flat and has now dropped to just 15.5% of the assessment costs.
Need it be said – unfunded mandate.
Chart 2
Federal Appropriation as Percentage of Total Costs
[Chart: percentage (0.0% to 35.0%) by fiscal year, 2002 to 2007; data labels recovered from the chart include 28.0%, 23.9%, 18.5%, 17.7%, and 15.5%.]
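The percentage reported in Chart 2 is a simple ratio: the federal appropriation divided by the total assessment cost. The following is a minimal sketch in Python of that arithmetic. The function names and the dollar amounts are hypothetical and chosen only for illustration; they are not read from Chart 1 and do not represent Ohio's actual budget figures.

```python
def federal_share(appropriation: float, total_cost: float) -> float:
    """Return the federal appropriation as a percentage of total assessment cost."""
    if total_cost <= 0:
        raise ValueError("total_cost must be positive")
    return 100.0 * appropriation / total_cost


def state_share(appropriation: float, total_cost: float) -> float:
    """Return the percentage of the total cost left for the state to cover."""
    return 100.0 - federal_share(appropriation, total_cost)


# Hypothetical dollar amounts, for illustration only (not taken from Chart 1).
example_total_cost = 80_000_000
example_appropriation = 12_400_000

print(federal_share(example_appropriation, example_total_cost))  # prints 15.5
print(state_share(example_appropriation, example_total_cost))    # prints 84.5
```

When the appropriation stays roughly flat while total costs rise, the first number falls and the second rises, which is the pattern the text describes.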
Human Costs
We have all seen the headlines about testing companies struggling to meet the
customized testing needs of the 50 states and the resulting problems when the highly
taxed testing industry stumbles: tests are not delivered on time, tests are scored
incorrectly, test results are reported incorrectly, or web-based online testing systems
fail. The days of off-the-shelf products that were the mainstay
of the testing industry for decades are over and those tests are for the most part being put
back on the shelf. Not only is customization the new requirement, but customization to
different state standards is the norm.
As a result of the strain on both the testing industry and the state departments, new
partnership relationships are essential for survival in the current environment. States are
assuming more of the quality assurance roles, are more closely monitoring the testing
contractors, and are more involved in the test development process than ever before. In
order to meet the ever-increasing challenges of NCLB, states and their respective testing
companies must work as teams to resolve problems and find solutions
to the challenges if both entities are to survive. But that too has a price in human costs.
Over the past six years, Ohio has seen the number of test forms increase yearly until
leveling off in 2007 (Chart 3). The green line graph represents the number of general
education assessments. The red and blue line graphs represent, respectively, the number of
alternate assessments and special versions: Braille, large print, English audio on CDs,
Spanish bilingual, and 5 foreign language translations on CDs. Please note that the
number of test forms for the students needing special versions or the alternate
assessments (which represent the “all” in NCLB) far exceeds the number of test forms for the
general education population. The numbers displayed here are Ohio’s response to the
NCLB requirements. Other states would have similar numbers.
Chart 3
Number of Tests
[Line graph: number of forms (0 to 300) by fiscal year, 2002 to 2007; series: General Education, Alternate Assessment, Special Versions, Total Forms.]
Herein lies one of the potholes into which some states have fallen on their road to NCLB
peer review approval. Because the alternate assessments and special versions are much
more costly on a per student basis and because of manpower shortages, some states have
not dedicated the necessary resources to those groups. They have failed to find the
correct balance between the various needs.
These graphs represent the yearly forms for the Ohio statewide assessment system which
is monitored by the assessment office. The assessment office in Ohio has formed a
successful partnership with its testing contractor to produce high quality assessments
annually.
While the number of test forms has increased over the past several years, the number of
assessment staff members in Ohio increased only slightly until 2007, when the number of
employees dropped (Chart 4). Because the number of employees has not increased at the
same rate as the number of tests, the work load for each individual employee has
increased.
Chart 4
Staffing for Assessment
[Chart: number of people (0 to 30) by fiscal year, 2002 to 2007.]
Other data give a different view of the assessment office staffing. Of the original 17
employees in the office in 2002, only 4 are still in the assessment office. Of the current
staff, only 4 have previous experience in large-scale test development and/or test
scoring; 4 have technical experience in data management and statistics; and there have
been 3 testing directors in the six years since NCLB was signed into law.
Please do not get me wrong – I am not complaining. Ohio has a relatively large
assessment staff for which I am very thankful. Other states are not so fortunate.
The very nature of the work of the assessment office is sometimes a source of high
human costs. The work is very specialized and requires extensive attention to detail. It
can be tedious and time consuming with little recognition for a job well done. There is a
steep learning curve for the work involved. It takes between 3 and 6 months for a new
employee to be trained to work efficiently, effectively and accurately in assessment work.
That is why the turnover rate is so costly. The work is highly stressful both because of
the work load and because of the recognition of the high-stakes nature of the results. Yes,
the stakes are high for schools, districts and the state.
There is an ever increasing need for high quality work and continuous oversight of all
phases of the assessment system. There is very little margin or tolerance for error.
Because of condensed timelines of annual high quality tests, there is no forgiveness for
missed deadlines. The work is ongoing with little down time.
Because the department is a public organization, the staff must be responsive to the
public’s requests for help and information. Not only must the staff be available to help
with customer service but they must strive to make sure the assessment administration
logistics run as smoothly as possible. All aspects of the assessment administration and
reporting must be customized to the various stakeholder groups so that every group has
the help and information it needs to administer and understand the test results.
You must be thinking that what I am describing is an impossible task. That is not the
case. Most states have been successful in meeting the NCLB demands. Because of
the dedicated, hard-working assessment staff and the team work model employed, the
yearly output from the Ohio office is over 7,500 new test questions a year with oversight
of an average of 90 committee meetings annually. Including embedded field test forms,
there are between 500 and 600 unique forms developed each year. Ohio administers and
scores tests and reports test results for approximately 2,380,000 students on an annual
basis.
Credibility Costs
The last cost I want to discuss is the cost to the credibility of the assessment systems and
the accountability system.
We have all seen the sensational headlines about state testing programs. These are a few
of the headlines Ohio has seen in the past two years – “Kids Suffer When Test Fails,” “16
Local School Districts Have Students Affected by Test Error,” “Tests wrongly marked
fail,” “Company’s error sent more than 900 students word they didn’t pass,” “Ohio
Student Claim Test Confusing,” and “Test gaffe sent wrong message to student.”
One article began with the line “Errors on the state’s test do more than jeopardize the
credibility of Ohio’s accountability system. They risk significant harm to individual
students.” As this opening line states, the harm from testing errors falls on the student. But I
contend that real harm in the NCLB environment can also happen to the school, district
and state.
Gone are the days of administering off-the-shelf products and of relatively low stakes for the test
results. Under NCLB’s requirement for customized assessments, the states have
ownership of the tests and are consequently responsible for both the success and failure of
those tests. Sadly, however, when any aspect of the testing system fails, the credibility of
the state department is at stake.
Not only is the state ultimately responsible for the assessment system but it must walk a
fine line to keep external pressures in relative equilibrium.
There are many different sources of pressure on a state’s assessment system (Figure 1):
the federal mandates, state laws, state policy makers, the state board of education, the
general public, and schools/districts. Each creates a different pressure for a different
reason.
Figure 1
Sources of Pressures
[Diagram: the Assessment System at the center, surrounded by six sources of pressure: Federal Requirements; State Policy Makers; State Laws/Legislature; General Public/Press; State Board of Education; Schools/Districts.]
Let me give a few examples.
In the initial stages of setting up the assessments, Ohio elected to use three types
of items on the assessments: multiple choice, short answer, and extended response. Up
until this year, Ohio has administered the grades 3 – 8 assessments in March with the
results due in May. There was a general feeling in some schools and districts that students
would do better if they had more instruction time before they were administered the
statewide tests. These groups put pressure on the state legislature to move the test
administration date to May.
After several years of debate, a state law was passed, effective this year, changing the test
administration date to May with the results due back in districts in time for the
accountability local report card. Because of the federal requirement of reporting the AYP
results before school began, the deadline for test results remained the same. The pressure
upon the state department was to make this change happen smoothly and error free. The
change shortened the time for returning, scoring and reporting the test results to about half the time
previously allowed. Even though Ohio has worked closely with our testing contractor to
work out the logistics of this compressed timeline, it was necessary to form a closer
partnership with school districts to find resolutions and solutions for unforeseen
consequences of this change. Have these new partnerships worked? We hope so, but we
will not know for several months.
Another example of a balancing act is the 2% modified assessment. This assessment is
voluntary so there is no federal funding for this program. The regulations for this
assessment were just finalized and released and pressure is beginning to build within the
state to create a new series of tests for this special population. Is that wrong? Of course
not. Is it possible? Yes. So what is the problem? The balancing act is meeting the needs
of the students for whom the 2% modified assessment is intended when the program is not
funded by the federal government because it is voluntary and the state budget language
funds only mandatory tests. This is complicated by the increased interest in formative
assessments and end-of-course exams that is sweeping the country and has been picked
up by the state board of education and policy makers within the state. With limited fiscal
and human resources, the assessment office will be doing a fine balancing act to make it all
happen.
While the states contemplate and discuss, policy makers haggle, and politicians form
alliances only to reconfigure into new support groups for the reauthorization of NCLB
scheduled for this year, quietly across America in every school and classroom, teachers
teach and students learn. Isn’t that what NCLB is all about?
NCLB: Changing It, Fixing It, Living With It:
Is a High Performing District’s Performance High Enough for NCLB?
David J. Kroeze
Northbrook [IL] School District 27
The No Child Left Behind Act of 2001 (NCLB) is noteworthy for its emphasis on
addressing the needs of all children, in particular, those students who typically
underachieve and lag behind others. It also focuses the attention of all school
districts in America on the need to use assessment data as a means to understand
student growth and to make curricular and instructional decisions that
specifically support the needs of low and underperforming students.
The Act, though well-intentioned, has inherent implementation flaws that need
to be addressed in its reauthorization. These flaws spotlight bureaucratic and
political aspects of national proficiency and achievement at the expense of
addressing the real life issues that practitioners face in their day-to-day work
with the teaching-learning process. As a result there are some unintended
outcomes that pose challenges for all school districts and cause them to devote
time and effort restructuring and retrofitting components of their curriculum,
instructional practice, and staffing. The adverse impact of NCLB has been
widely documented (AASA, 2007; CCSSO, 2007; NGA, 2007; Linn, 2007; Nichols
& Berliner, 2007; Popham, 2004). Nevertheless, the potential for this law to have
lasting positive impact is very real.
This paper articulates the impact of NCLB on a high-performing school district,
Northbrook School District 27 (NSD27). At the outset, I provide a brief context of
NSD27 to give the reader an understanding of the District and the broader
picture of our focus on organizational improvement and student performance
within that context. Three areas, then, will be addressed. First, this paper focuses
on the positive aspects that NCLB has brought to enhance our ability to meet
students’ needs. The paper also addresses the adverse impact that the law has had on our
efforts to address performance, community, and staff issues because of its
unrealistic expectations. The paper concludes with suggestions for improvement
as NCLB moves toward reauthorization.
The Context of Northbrook School District 27
Located in the suburbs outside Chicago, NSD27 is a small elementary school
district of approximately five square miles in a middle-to-upper middle class
socio-economic community with 15% of the population being non-White. It has a
community population of 10,000 and a student enrollment of approximately
1,300. The District is one of four school districts in the town serving the
northwest segment, which accounts for approximately 30% of the village
population. The village founders believed that small school districts with their
own governance structure would be able to better serve the needs of the
community and provide closer access to school leaders and Board of Education
members.
NSD27 started out small in population but grew to 2,000 students in the early
1970s due to baby boomer births. Since that time it has experienced a steady
enrollment decline, with short periods of modest increases. For the past five
years enrollment has remained relatively stable.
NSD27 is a 154-year-old district having a history of high student academic
performance. It is recognized for its performance by outside groups who judge
the state’s more than 850 school districts. The District offers comprehensive
support programs for students along with many opportunities for students to be
involved in extra-curricular activities.
NSD27 offers academic programs and professional student services based on
curriculum guidelines and regulations established by the State Board of
Education. Regular academic programs are included in two elementary schools
(K-3), one intermediate school (4-5), and one middle school (6-8). The District
delivers regular educational programs via classroom and technology-based
instruction, educational learning labs, and school-related activities. Educational
program delivery occurs during the traditional school calendar with a four-week
summer school program.
NSD27 provides other programs that include special education, instruction for
English Language Learners (ELL), pre-kindergarten at-risk support, and gifted
education. Numerous clubs, organizations, and sports programs provide before-
and after-school extra-curricular activities.
Based on national, state, and local comparison data, NSD27 is considered an
academically and financially high performing system in Illinois. For an
academically high-performing district, improving test scores is not the
overarching goal. Rather, the challenge is implementing an effective continuous
improvement model to affect systemic change. Continuous improvement
demands relentless pursuit of doing things better, more efficiently, at lower
costs, and with greater sensitivity to the customers’ needs. Ultimately, it
distinguishes the leading educational systems of the 21st century (Kimmelman &
Kroeze, 2002).
Over the past fifteen years NSD27 made a significant shift from what we call an
“Event-based” approach of operation to a “Systems Approach.” The event-based
approach led to what we called “Random Acts of Improvement” but did not
provide a profound or even noticeable impact on the District’s various
performance indicators. Therefore, the District developed a systemic approach of
continuous improvement to raise its level of organizational effectiveness. Taking
a more systemic view of the District provided a more interactive context where
alignment of processes and services could be studied for their impact on the
overall organization. After researching various continuous improvement
models, the Board of Education adopted the Malcolm Baldrige National Quality
Program Framework because it more appropriately reflected the type of
approaches and thinking that we needed to help our organization move to the
next level of maturity. NSD27 began its continuous improvement process at the
district administrators’ annual summer retreat in June 2003, implemented it fully
in 2004, and progressed to the point of submitting a Baldrige application for
review in 2006.
As the primary instructional leader of a high performing school district, it is my
responsibility to ensure that all students receive a quality educational program
that enables students to perform at highly competitive levels and to prepare
them to be part of what I call a “Worldwide Community of Excellence.” A
high-performing district has some inherent advantages that enable its staff to
maximize its resources and provide students with the support they need to
perform well.
The community expectations for high performance are an ever-present reminder
that there is no substitute for excellence. It is within this context that I must
implement the provisions of the NCLB, manage the conventional reporting
mechanisms of the state report card to our community, and offer a
comprehensive well-rounded program of instruction to meet the needs of the
whole child. This broader program offering supersedes the limited, narrower
scope of the state standards pertaining to NCLB. In our District NCLB state
learning standards are merely part of the curriculum, not the total curriculum.
Finally, I must provide a staff that is “highly qualified,” meeting the
requirements not only of NCLB but also of IDEA and of other content expertise areas
such as gifted education and English Language Learners.
It is within this context that NSD27 implements the No Child Left Behind Act of
2001. NCLB is part of the regulatory environment within which the District
operates and exists as part of the broader operation of the District.
Positive Aspects of NCLB
NCLB has actually called attention to a number of factors on which we have
capitalized to improve our ability to meet student needs and to improve our
practices. I focus on three distinct positive aspects: (a) greater attention to low
performing students; (b) heightened awareness of our diversity; and (c)
enhanced use of assessment data for instructional decision-making.
Greater Attention to Low Performing Students
In a high-performing school district, aggregate performance scores on NCLB
provide an overall picture of excellence and we are able to demonstrate that our
schools are more than sufficiently meeting the AYP requirements. Many
high-performing districts could rest on their laurels and tout the aggregate
performance. However, the story needs to go deeper with particular attention to
individual students. We embraced the intent of NCLB and operationalized it for
our students and parents. In NSD27 if a student does not meet the state learning
standards, we address that student’s needs individually with support. We take
the student’s performance information and triangulate it with other formative
and norm-referenced data. The teacher and other key personnel in the school
develop an individual learning plan for every student who does not meet our
targets. These plans are monitored throughout the year.
One key measure that we use is the Northwest Evaluation Association (NWEA)
Measures of Academic Progress (MAP). This measure identifies the student’s
performance, which is aligned with the Illinois State Learning Standards. Using a
companion NWEA tool called the Dynamic Reporting Suite, we are able to identify
a projection of student performance on the Illinois Standards Achievement Test
(ISAT). In the fall of each year we use our NWEA data and an additional NWEA
tool called the Learning Continuum. This tool enables us to identify key skills and
concepts that are deficient and then provide skill development support through
regular and extra instruction. This approach increases the probability that the
student will meet the AYP target the following year. Our internal analysis of
student growth shows that most students who initially do not meet the target
make substantial growth in the year, and will, within three years, meet the state
learning targets.
Another aspect of this notion of meeting individual needs of students
acknowledges that students can increase in their state standing even if they meet
state learning standards. A common point of conversation with our Board of
Education and parents is to set internal targets for ourselves to raise the
percentage of students who “exceed” state learning standards. Consistent with
the high performance expectations of our community, we identify internal
targets of performance that far exceed the current AYP expectations. We make it
a practice to have all of our students who do not score in the “exceeds” level
strive to increase their performance level over time. We consistently set new
standards of performance for all students and identify those skills that can be
enhanced that would improve their standing. By the time students graduate
from eighth grade, most of them have improved their standing on the ISAT.
Heightened Awareness of Our Diversity
NCLB also heightened our awareness of the diversity we have even in this very
homogeneous community. While NSD27 is socio-economically homogeneous,
we have a few areas of diversity. Specifically, we have two subgroups under
NCLB: ethnic (13% Asian, 86% white) and special education (18%). The
District implements well-developed processes to meet the needs of students with
specific needs. What NCLB offers is data that support our effort to determine if
our approaches are having the impact we believe they should on these students.
In other words, are these approaches really working? Analyzing the results of
NCLB for our subgroups enables us to make better decisions about the
approaches that we use to support these students. These data inform us about
the areas in which we can make the adjustments needed to address student needs.
Enhanced Use of Assessment Data for Instructional Decision-Making
NCLB encouraged and enabled districts’ use of assessment data to make better
instructional decisions for students. I have long believed that the use of
assessment data has been one of the most underutilized and misunderstood
components of the teaching-learning process (Curriculum-Instruction-Assessment).
Although testing students has become commonplace in school
districts, educators struggle with the challenges of assessing student learning. By
this I mean that districts are collecting student achievement data, but the data are
primarily being used to describe characteristics of student performance.
Understanding why students achieve the way they do is a more challenging task.
Teachers need to know how to use the data to inform their practice and
understand student progress. In essence, they have an abundance of data but
lack the proper understanding of how to make use of this information. Michael
Fullan (1998) in his book, What’s Worth Fighting For Out There? encourages
teachers and administrators to become more “assessment literate.” This can be
done in several ways including expanding their assessment repertoires, showing
parents and students how they arrive at their assessment decisions, collecting
assessment data as an ongoing part of classroom learning, monitoring how well
their students are achieving over time and communicating the results clearly to
parents and the public.
NCLB provided us with information that enabled us to revisit how we use
assessment data. We incorporated these data into our School Improvement
Planning, grade level planning, and, as mentioned, individual student learning
plans. NCLB drives quality information to the teacher level, allowing them to
examine whether the practices we are implementing have an impact on student
performance. Teachers in NSD27 take very seriously the NCLB data and work
hard to ensure that their instructional approaches are focused on meeting
students’ needs and are consistent with the District’s curriculum and the Illinois
State Learning Standards. We are now tracking information over time using
NCLB, NWEA and other performance measures that help us to evaluate not only
student performance but our approaches.
Negative Aspects of NCLB Implementation
NCLB is not without its critics or criticisms. From the perspective of a
high-performing school district, there are three negative aspects that result from its
implementation: (a) statistical implications, (b) definition of proficiency, and (c)
requirements of highly qualified staff. The following section elaborates on each
of these aspects.
Statistical Implications
Like other high-performing school districts across the nation, NSD27’s statistical
mean on national assessments is significantly higher than the national mean
(50%), typically 20-25% higher. This fact alone would seem to place
high-performing school districts in an enviable position to meet AYP targets.
Currently, NSD27 meets AYP targets and does so each year, with performances
on the state assessments indicating that 90+% of NSD27 students meet or exceed
levels of proficiency in reading and mathematics.
While our overall means are high, the story does not end there. The impact of
individual student performances is problematic and gives misleading
perceptions of “growth” and “decline” within a grade level. Where large school
districts have the challenge of demonstrating substantial change, school districts
with small enrollments have the opposite problem. NSD27’s small enrollment
leaves it highly vulnerable to individual student performances that can result in
substantial fluctuations of overall percentages within the State’s performance
level designations (“exceeds”, “meets”, “does not meet”, and “academic
warning”). A single student’s performance can influence the performance level
percentages by multiple percentage points. In a community that expects
consistent, high performance, a perceived decline in performance is a red flag.
NSD27 works hard to educate our community regarding the variation we can
expect to see. For example, when reporting that a grade level increased or
decreased by five percentage points, we are quick to mention that this variation
reflects only 2-3 students and is not a significant change. As a result we continue
to advocate the use of trend data and triangulating other data to give a more
complete view of student performance at any grade level. Despite this work, we
find it to be an ongoing educational challenge.
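The arithmetic behind this point can be sketched in a few lines of Python. This is only an illustration: the enrollment figures (50 and 500 students per grade) are assumptions chosen for the example, not NSD27 data, and the function names are hypothetical.

```python
def points_per_student(enrollment: int) -> float:
    """Percentage points that one student's result moves a performance-level share."""
    if enrollment <= 0:
        raise ValueError("enrollment must be positive")
    return 100.0 / enrollment


def students_behind_change(enrollment: int, change_in_points: float) -> float:
    """Number of students whose results account for a percentage-point change."""
    if enrollment <= 0:
        raise ValueError("enrollment must be positive")
    return abs(change_in_points) * enrollment / 100.0


# Hypothetical grade-level enrollments: a small district versus a large one.
for enrollment in (50, 500):
    per_student = points_per_student(enrollment)
    students = students_behind_change(enrollment, 5)
    print(f"{enrollment} students: one student = {per_student:.1f} points; "
          f"a 5-point change = {students:.1f} students")
```

Under these assumed sizes, a five-point change in a grade of 50 corresponds to only two or three students, while the same change in a grade of 500 corresponds to 25 students.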
Despite our pattern of strong performance on the ISAT, it is unlikely that NSD27
will meet 100% proficiency by 2014. We have a number of new students that
enter the district on a yearly basis and we implement what we call a New
Student Assimilation Process designed to diagnose student performance levels
and needs. This process enables us to begin immediately addressing student
learning and developmental needs. We also have low performing students who
are making progress but will not meet the proficiency levels by 2014, simply
because they have unique learning needs. The greater issue for NSD27 is to
ensure that we have implemented instructional and support programs for these
students that will result in their ongoing growth and development.
Linn points out from his study of proficiency and his work on the National
Assessment of Educational Progress (NAEP) and the Third International
Mathematics and Science Study (TIMSS) that “100% proficient is a goal that is
completely out of reach even with extraordinary effort on the part of teachers and
students” (Linn, 2000; 2007). One question results: If districts with consistent
performances as strong as NSD27 cannot make 100% proficiency, then who can?
The target is not only unrealistic but statistically impossible (Linn, 2004, 2007).
Definition of Proficiency
The proficiency standard is another negative aspect of NCLB. Whose definition of
proficiency are we using? As Linn (2004, 2007) has appropriately addressed,
there is a fundamental flaw with using proficiency as the assessment benchmark.
Currently, NCLB allows every state the right and responsibility to define a
proficiency standard. The lack of a universally accepted proficiency standard
fosters inevitable state-to-state assessment comparisons.
More importantly, the lack of clarity regarding proficiency standards does not
stop at the state or national level but raises the question of preparedness within a
global context. Today’s students will have to learn to function and be productive
as adults in a highly competitive global environment. Currently, the various
definitions of proficiency do not give us an indication that national proficiency
improves our standing on an international basis.
I worked with two sets of TIMSS data (TIMSS 1995 and TIMSS-R 1999) as a
member of the First in the World Consortium (FITW) in partnership with Boston
College, Michigan State University, the North Central Regional Education
Laboratory (NCREL), and the US Department of Education. We sought to
understand our student performance by looking at what we defined as the A+
countries, those that achieved at the highest levels internationally. We also
studied the US performance which was close to the mean of its international
counterparts. The FITW performed competitively with the highest achieving
students in the world but we still had a number of areas where we could
improve our performance.
The proficiency standard issue of NCLB suggests that if we attain national
proficiency, we might improve our global standing. In other words, we would be
competitive internationally. This implication raises the following question in my
mind: Would American students be competitive with our international
counterparts? And, more significantly: Would we be competitive with the
highest achieving students in the world if all students in the country were to
achieve proficiency (by 2014 or later)? The data suggest that we would not
(Linn, 2000; 2007). This suggests that our current
variable definitions of proficiency are still removed from the reality of our future
global workforce. As a result, NSD27 relies on its development of rigorous and
coherent programs and performance indicators to define a high level of
performance.
Highly Qualified Staffing
The third negative aspect of NCLB exists within the expectations of highly
qualified staff. No one argues that all children have the right to be taught by a
highly qualified teacher. Fortunately for high-performing school districts,
teachers desire to work in these resource-rich settings. However, even high-performing districts have significant challenges in meeting the criteria for
“highly qualified” staff.
The first challenge comes in the significant investment of time and resources in
examining every certificated staffing position. This action is necessary to ensure
that every teacher holds the appropriate certification, endorsements, and testing
necessary to meet the state’s “highly qualified” criteria. In addition to reviewing
current staff’s credentials, changes in the hiring process result. Beyond typical
personnel screening procedures, all applicants now are screened through the
“highly qualified” lens as well. There is no need to interview an applicant that
does not meet the criteria for “highly qualified.”
The second challenge of NCLB’s “highly qualified” criteria has a major impact on
staffing for subgroup populations, specifically special education (SpEd) and
English language learners (ELL). Currently, SpEd teachers must be “highly
qualified” in every instructional content area if they are the teacher of record. In
other words, they determine the student’s grade. In the middle grades, this
means SpEd teachers must have a minimum of 18 semester hours in every
content area plus three hours in middle school pedagogy and three hours in
adolescent psychology. These criteria are in addition to their SpEd certification
and endorsement requirements. As a result we (a) restructure our instructional
delivery approaches for SpEd students, and/or (b) send SpEd staff back to
college for coursework.
The restrictive nature of certification contributes to the third challenge of NCLB:
teachers of special populations are choosing to go back into the regular
classroom to get out from under the burdensome requirements to become
adequately certified. Many excellent teachers who have a heart for students with
special needs and have the gifts and talent to work with these students are
discouraged and are seeking a more reasonable approach to certification. Similar
procedures have been followed for teachers working with the ELL population.
As a result we have a greater challenge to find “highly qualified” teachers to fill
these critical special area positions. Meeting NCLB’s “highly qualified” criteria
has had a major impact on our school system.
Suggestions for Improving NCLB
As It Moves to Reauthorization
As NCLB moves into reauthorization, a number of recommendations have been
proposed to improve it. I would offer three improvements as they pertain to high
performing districts: (a) Provide alternative models to demonstrate target
acquisition and student growth, (b) set more realistic goals for meeting AYP, and
(c) provide more realistic expectations for “highly qualified” criteria in special
areas.
Provide Alternative Models to Demonstrate Target Acquisition and Student
Growth
One of the fundamental flaws of NCLB is its sole reliance on current status to
determine AYP. While this is one performance measure, it lacks the ability to
identify student learning over time. The use of cohort data will enable educators
to view growth over time and trends in performance. The basic AYP
measurement system should be expanded to include growth-model and value-added
approaches. This expansion would permit districts to demonstrate student
growth over time, and growth toward meeting the AYP targets. States should be
given great flexibility to design their accountability systems while continuing to
support the broader goals of NCLB.
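The difference between a current-status measure and a cohort growth measure can be illustrated with a minimal Python sketch. Everything in it is assumed for the example: the cut score of 200, the four students, and their scores are hypothetical, and the mean gain shown is only one simple way to express growth, not a method prescribed by NCLB or by any state.

```python
from statistics import mean

CUT_SCORE = 200  # hypothetical proficiency cut score


def percent_proficient(scores: dict, cut: float = CUT_SCORE) -> float:
    """Status measure: share of students at or above the cut score."""
    return 100.0 * sum(score >= cut for score in scores.values()) / len(scores)


def mean_gain(previous: dict, current: dict) -> float:
    """Cohort growth measure: average score change for students tested in both years."""
    matched = previous.keys() & current.keys()
    return mean(current[student] - previous[student] for student in matched)


# Hypothetical scale scores for the same four students in two consecutive years.
year1 = {"A": 180, "B": 195, "C": 210, "D": 230}
year2 = {"A": 192, "B": 199, "C": 215, "D": 238}

print(percent_proficient(year1), percent_proficient(year2))  # status in each year
print(mean_gain(year1, year2))                               # growth for the cohort
```

In this made-up cohort the percent proficient is identical in both years even though every student gained, which is the kind of learning over time that a status-only measure cannot show.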
Set More Realistic Goals for Meeting AYP
Second, the terminal goal of 100% of students meeting the proficiency standards
by 2014 needs to be re-examined. Statistically this goal is unattainable and will
only result in every school district failing to meet the expectation. As mentioned
earlier, we do not have a clear and consistent definition of proficiency
nationwide. Linn has concluded that it is impossible for us to attain 100%
proficiency despite the work of teachers and students (Linn, 2000; 2007).
In high-performing districts, our expectation is to educate students who will
succeed in a global environment that is rapidly changing. The Third International
Mathematics and Science Study (TIMSS) data indicate that there are clear
distributions of performance of students from the various countries. The real
question for districts like NSD27 is: Do we compete favorably with the highest
achieving students in the world? It is in this competitive environment that our
students will be working. My concern over the proficiency goal stems from the
lack of definition of proficiency and the lack of a connection with a universal
view of competence. Even if all students in America were to achieve the
proficiency standard, we do not know if this performance would make us
competitive with students in other countries, much less with the highest-achieving
students.
We must continue to set academically challenging benchmarks and strive to meet
them. Then we need to continue to compare our performance with our
international counterparts to strive for a competitive standing.
More Realistic Expectations for Highly Qualified Criteria
All students and in particular students with special needs deserve exceptional
teachers. In order to attract and retain quality educators, I would encourage
legislators to revisit the current criteria for “highly qualified” during the
reauthorization phase of NCLB. Providing appropriate service for special
education students and English language learners should remain a priority;
however, determining credentials and endorsements for these teachers demands
attention. NCLB needs to encourage quality teachers to enter and remain in the
field of working with special needs students, not discourage them. Establishing
less restrictive and more realistic criteria for “highly qualified” staff would go a
long way to lessen the burden of NCLB implementation.
Conclusion
NCLB has had an impact on all school districts including the high performing
ones. This paper delineates some of the most important positive and negative
aspects of NCLB on a high-performing school district. Among the many positive
aspects of the law, it has drawn greater attention to students who do not meet
state standards and who are underachieving or lagging behind. Moreover, it has
heightened our awareness of our diverse groups of students and the practices
that we implement to meet their needs. NSD27 has embraced the guidance to
help us better meet these students’ needs. Overall NCLB has put the spotlight on
the need to use assessment data to meet students’ needs and improve our
programs and services.
The law has also had some adverse impacts on our districts. The fundamental
problems identified involved statistical anomalies, confusion surrounding
definitions of proficiency, and the restrictive nature of “highly qualified” criteria
for teachers of special needs students.
I offered a few suggestions for consideration that might improve NCLB as it
moves into reauthorization. Specifically, Congress could improve the law by
providing alternative models to demonstrate target acquisition and student
growth. This approach would enable districts to demonstrate progress to AYP
and provide trend data on growth. Second, NCLB needs to set more realistic goals
for meeting AYP. Few, if any, school districts in America will meet 100%
proficiency. Moreover, reaching proficiency may not ensure our competitiveness
on a more global scale. Finally, the law should provide more realistic
expectations for “highly qualified” criteria in special areas. By doing so we can
attract and retain teachers who have training and talents to meet the needs of
these special populations without experiencing the burdensome certification
requirements that are now in place.
References
American Association of School Administrators. (March 2007). NCLB Update:
Presentation at Annual Meeting of the American Association of School
Administrators in New Orleans, LA.
Aumiller, B.E. (2007). Case study of a K-8 school district's administrative
leadership's implementation of the Baldrige education criteria for performance
excellence. (Doctoral dissertation, University of Illinois, 2007). 12, 69-72.
Council of Chief State School Officers. (2007). Recommendations to reauthorize the
elementary and secondary education act. Washington, DC: CCSSO.
Dougherty, C. (2006). Identifying and studying high-performing schools. Austin, TX:
National Center for Education Accountability.
Fullan, M., & Hargreaves, A. (1998). What’s worth fighting for out there. NY:
Teachers College Press.
Kimmelman, P. L., & Kroeze, D. J. (2002). Achieving world-class schools: Mastering
school improvement using a genetic model. Norwood, MA: Christopher-Gordon.
Linn, R.L. (2000). Assessments and accountability. Educational Researcher, 29(2),
4-14.
Linn, R.L. (2004). Accountability: Responsibility and reasonable expectations.
Educational Researcher, 32(7), 3-13.
Linn, R.L. (2007). Need modifications of NCLB. Paper presented at a symposium
sponsored by the National Association of Test Directors entitled “NCLB:
Changing It: Fixing It: Living with It” at the Annual Meeting of the National
Council of Measurement in Education, Chicago, IL.
National Governors Association. (2007). Reauthorization of NCLB. Washington,
DC: National Governors Association.
National School Boards Association. (2007). NCLB reauthorization: Guiding
principles. Alexandria, VA: NSBA.
Nichols, S.L., & Berliner, D.C. (2007). Collateral damage: How high-stakes testing
corrupts America’s schools. Cambridge: Harvard Education Press.
Popham, W.J. (2004). America’s failing schools: How parents and teachers can cope
with no child left behind. New York: RoutledgeFalmer.
NCLB: A Business Perspective
Barb Boyd
Nationwide Insurance
Why does the business community care about education and the No Child
Left Behind legislation? What can the business community do to help? From
a business perspective, what should be considered for the NCLB
reauthorization? Answers to these and many other questions are addressed in
this brief.
Business Community and Education
Many businesses are actively involved and concerned about public K-12
education. There are numerous reasons for this involvement and concern:
• Genuine concern for the community - education is the path to economic
stability along with the accumulation of assets.
o Increased poverty in inner cities. The number of Franklin
County residents (Columbus is within Franklin County) living in
poverty has increased. In 1999, 11.6 percent of Franklin County
residents were below the federal poverty level threshold. By
2005, the percentage of residents living below the federal
poverty level threshold had risen to 14.5 percent.1 Just 10 years
ago 55 percent of the students in Columbus Public Schools
received free or reduced price lunch; now 74 percent of the
students receive free or reduced price lunch.2
o Earnings increase in relation to education level. U.S. census data
show that those with less than a high school degree earn about
$19,000 per year; high school graduates earn an average of
$27,000 a year. Those with bachelor’s degrees earn $51,000 a
year – almost twice that of those with a high school degree.
o While education can be the key to higher earnings, it is also
linked to the accumulation of assets. Research shows that, on
average, households headed by a high school graduate
accumulate ten times more wealth than households headed by a
high school dropout. Wealth is critical to the economic well-being of individuals and families as it is the gauge of a
household’s financial security and prospects. The potential
additional wealth, if all households were headed by high school
graduates, would be $2.7 billion for Ohio alone! 3
• In general, the business community has an enlightened self interest.
o There is more demand for workers with post-secondary
education. Large companies often have job openings for
applicants with post-secondary education. For example:
Nationwide has about 600 jobs open all the time; only 15 percent
of those positions are for applicants with only a high school
diploma.
o 13 of the 15 fastest growing occupations in Ohio require post-secondary education.
o Businesses need a quality workforce. We need workers who
have the technical as well as communication, creativity and
innovation skills to be contributing members of the company,
right away. It is too expensive for businesses to provide
remedial training while on the job.
o Businesses need knowledge workers. There will always be idea-generating jobs within the United States. The following graphic
from Tough Choices or Tough Times illustrates that point.4
• The business community also needs consumers. The company I work
for is interested in protecting things that are important to you – your
home, car, wealth, and health. Those who are in poverty cannot afford
this protection, which reduces the number of consumers of our
products.
There is a strong case for being involved in public education and Nationwide
has taken the initiative to become involved in the Columbus Public School
district. Nationwide provides volunteers, corporate giving, and philanthropic
investments.
• Volunteers:
• Math tutoring. Associates from all levels of the organization volunteer
to tutor students in math. We tutor students one day a week for five
weeks to prepare them to take the math portion of the 10th grade Ohio
Graduation Test. We bring the students to Nationwide’s downtown
offices to give them a sense of the business world.
• Partner in Education. Nationwide has partnered with a high need
elementary school for over 10 years. We now have 160 tutors that go to
the elementary school one day a week for the entire school year!
• Executives volunteer to be on sub-committees of the Columbus School
board such as the audit and accountability and budget development
committees. These are areas where business people have expertise and
can make a difference.
• Corporate giving:
• Principal for a Day is a program co-sponsored by the Nationwide CEO
and the Superintendent of the Columbus district. This program
encourages business leaders and those leaders who are up and coming
to spend a day with a principal in the Columbus district to get a real
view of what is happening in the school buildings and of the barriers
facing our students.
• Nationwide has committed three people for two years to develop a
leading indicator model using compliance rules for state and federal
AYP indicators for Columbus Public Schools. From a business
perspective, this seems critical to making dramatic improvements. It is
common for the private sector to look at results at least on a monthly
basis, make adjustments and monitor again. By the end of the year
there are few surprises, because of the continuous monitoring.
• Philanthropic investments:
• Nationwide supports many non-profit organizations that support the
students of the Columbus School district.
As you can see, Nationwide cares about public education and we are willing
to put our time and efforts towards improving our students’ chances at
success. I believe Nationwide is setting the standard on corporate
involvement in education in Columbus.
Business Community and NCLB
Outcomes Attained
The establishment of standards, holding schools accountable to meet them
and focusing on groups that have lagged behind the mainstream are the primary
elements of NCLB that are of interest to the business community.5 The
Business Roundtable has stated the need for a well trained and productive
workforce and the need to increase the competitiveness of our workforce.
They believe that measurable results are needed from the education system to
ensure there is a strong workforce.6 Business needs a labor supply to fuel
innovation and economic growth. We also need stronger science, technology,
engineering and math education (STEM).
In Ohio, we found a gap in the required curriculum. Today only a
quarter of Ohio graduates take higher level math and science, which are
necessary to get good, meaningful jobs. We believe we need more rigorous
math and science course work beyond the 10th grade graduation test. To
eliminate this gap, the legislature just added STEM curriculum to
graduation requirements for all Ohio students. I believe this gap was
found, in part, because of the heightened awareness brought by NCLB accountability.
As a person in the private sector, I also believe we need increased
accountability and increased transparency in order to improve the product –
the graduate of the system. I have seen evidence that increased accountability
has intensified the attention given to ensuring all students meet the standards.
The transparency provided by accountability has given us insights into issues
such as teacher effectiveness, early childhood readiness, and mobility. I
believe this caused us to discuss issues we hadn’t faced before. We have also
learned that we can successfully assess student performance, that is, conduct
assessment of learning.
Improvements to Consider
There are a few items that could be improved with the No Child Left Behind
Act:
• Invest in an infrastructure for local districts to develop leading
indicators, such as we are providing for the Columbus Public Schools.
This will enable the educators to ensure that students are learning the
state standards prior to being tested.
• In Ohio, we count the same tests three different ways, making the
reporting of the accountability confusing to the general public. Ohio is
considering adding a fourth way, the value-added model, one which
could be more informative. The value-added model realizes that
students come to schools at different levels of readiness. The value-added model will measure the progress of individual students from
year to year instead of comparing different cohorts each year. I believe
this model will provide even more transparency into methods that are
most effective. While the value-added approach is informative, we will
need to find a way to report the accountability that is less confusing to
the public.
• NCLB tests critical topics such as math and reading; however, it doesn’t
test technology skills, creativity and innovation, or communication
skills, which we desperately need in a flat world.7
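The value-added idea in the second item above, that students arrive at different levels of readiness and that progress should be tracked for the same students from year to year, can be sketched as follows. This is a deliberately simplified illustration: the two schools and their scores are hypothetical, and a plain average gain stands in for the statistical models that actual value-added systems use.

```python
def average(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def summarize(pairs):
    """Return (current-year mean score, mean year-to-year gain) for one school.

    Each pair is (last year's score, this year's score) for the same student.
    """
    status = average([current for _, current in pairs])
    gain = average([current - previous for previous, current in pairs])
    return status, gain


# Hypothetical data: School X enrolls students who start high,
# School Y enrolls students who start lower but progress faster.
schools = {
    "School X": [(230, 233), (240, 242), (225, 229)],
    "School Y": [(170, 185), (180, 192), (175, 190)],
}

for name, pairs in schools.items():
    status, gain = summarize(pairs)
    print(f"{name}: current mean {status:.1f}, mean gain {gain:.1f}")
```

With these made-up numbers, School X looks stronger on current scores while School Y shows the larger gains, which is the additional transparency a year-to-year measure is meant to provide.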
The Bottom Line
The business community needs school systems to be successful – we need a
viable workforce with higher level skills; rote skills can be done elsewhere at a
much cheaper rate. However, some of the higher level skills are not easily
measured.
The accountability and transparency of NCLB has the potential to show us
where districts and businesses can work together to improve the outcome for
our children.
Businesses are willing to invest the time, money and resources to improve the
effectiveness of the system. I have given you examples of what Nationwide is
doing in Columbus, Ohio. I encourage you to find companies within your
community with which to partner. The bottom line is that it takes all of us to
educate a child – it is our workforce!
Notes
1 What Matters, Community Assessment 2004, United Way of Central Ohio, 2004
2 Columbus Public Schools, 2007
3 Elena Gouskova and Frank Stafford, University of Michigan Institute for
Social Research, 2005
4 Tough Choices or Tough Times, National Center on Education and the
Economy, 2007
5 The Columbus Dispatch, editorial, March 12, 2007
6 Education and the Workforce, www.businessroundtable.org, 2006
7 The World is Flat, Thomas L. Friedman, 2005
Discussant’s Comments
Glynn Ligon
ESP Solutions Group
Bob Linn took a short amount of time for what I think is one of the most important
points in his presentation -- the compensatory model, which looks across different
areas rather than having all of these different points of failure. If you have his paper, it
is well worth looking at that and considering the compensatory model even more.
We have to thank Judy for the mini-poster session by proxy. [Ed. note: due to illness
Judy could not fly and her paper was read by the moderator]. That was excellent. So
you can see all of those lines in the graphs. One thing that comes down to me as I go
around and visit the states and then listen to this paper is that states are paying too
much for their testing programs. They just flat overpay. I don’t know if we have any
test publishers sitting here, but if there are, you’re overpaid. The fact of the matter is
that as states we are spending way too much money on security. We’re paranoid
about security. How many forms do they have? Way too many. Why do they have
that many forms? Security. So, that’s one thing we have to change.
David, I gotta tell you, I have two degrees in psychology, and your presentation was
fascinating because every time you talked about No Child Left Behind, you said things
like “to some degree,” “nevertheless,” “on the other hand.” But when you talked about
continuous improvement you were right on target. You were committed and you
were excited about what you were saying. I don’t know if you noticed on the back of
the paper that I received it says that Jack Grayson is going to write a foreword to this
paper. Jack Grayson created Baldrige and if he had been here he would have
given you very high marks. Because once you started talking about continuous
improvement you were right on with all that Baldrige involves. That was an
excellent model for others to follow.
Barbara, I love when you get into talking about leading indicators. Again in this
paper, it talks about what in the world are leading indicators. In education we talk
more about trailing indicators and that has evolved a bit. And quite frankly, do you
know the difference between trailing indicators and leading indicators? When you
look at them and what you do with them. They are all just numbers. If you look at
your numbers soon enough and take action on them they become leading indicators.
That’s what we need to do. So, Nationwide, we’re glad you’re on our side!
So general comments.
The compensatory model is great. It’s what needs to be done, it is the correct fix
except for one thing. It should be applied to students; we should look at merging
student performance across areas. It should be applied to subgroups. But I don’t think
it should be applied to schools. I’m worried because it’s No Child Left Behind, and
that translates to the subgroups. If we start averaging, then we go back and start
covering up low performing subgroups with high performing subgroups. So that’s
what bothers me a little, but I think that compensatory model is the right approach.
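The masking concern can be made concrete with a short Python sketch. The subgroup counts and the 75 percent target below are hypothetical, chosen only to show how a pooled school-level figure can clear a target while one subgroup falls far short of it.

```python
TARGET = 75.0  # hypothetical percent-proficient target

# Hypothetical counts per subgroup: (students proficient, students tested).
subgroups = {
    "Subgroup A": (76, 80),
    "Subgroup B": (8, 20),
}


def percent(proficient: int, tested: int) -> float:
    """Percent of tested students who are proficient."""
    return 100.0 * proficient / tested


def pooled_meets_target(groups: dict, target: float = TARGET) -> bool:
    """School-level check: pool all subgroups, then compare with the target."""
    proficient = sum(p for p, _ in groups.values())
    tested = sum(n for _, n in groups.values())
    return percent(proficient, tested) >= target


def subgroups_below_target(groups: dict, target: float = TARGET) -> list:
    """Subgroup-level check: list every subgroup that misses the target."""
    return [name for name, (p, n) in groups.items() if percent(p, n) < target]


print(pooled_meets_target(subgroups))     # pooled result for the whole school
print(subgroups_below_target(subgroups))  # subgroups hidden by the pooled result
```

Here the pooled figure is 84 percent, so the school as a whole meets the assumed target, yet Subgroup B sits at 40 percent. Keeping the subgroup-level check is what prevents averaging from covering that up.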
Multiple indicators. I was at a meeting in Colorado where the groups were talking
about the multiple indicators used and the three accountability systems in Colorado.
And, quite frankly the folks were confused by it. So, there was this whole session and
it filled up a wall of posters about what indicators they should have, what multiple
indicators they should have. In the end the guy got up and summarized it all by saying,
well what we need to do is take all of these multiple indicators and we need to bring
them together in one measure. And then I realized that that conundrum really makes
sense. And what we do want is multiple indicators that influence the very small
number of indicators that we actually pay attention to. Margaret, this kind of gets to
your point too, that you don’t want all of these indicators, you want to get them
together into one thing you can look at. And again, in the paper I handed out it talks
about the difference between an indicator and an index that synthesizes indicators,
which is basically what No Child Left Behind is (see Appendix for the paper).
So yesterday I did something I’ve never done. I’ve been attending AERA meetings
since 1976. I’ve never gone to a presidential address and I wanted to see what Eva
Baker would say about testing. And she had a slide that talked about
accountability fixes. Accountability fixes -- more indicators, opportunities to learn
measures, performance assessment, formative assessments and prioritize our
standards. She couldn’t have been more wrong. Absolutely wrong. She understands,
but she did exactly what is happening in the United States with No Child Left Behind.
People have lost sight of the fact that accountability and getting diagnostics and helping
the teachers, giving people information back that they can actually use to improve our
students’ performance, are two different devices. And as people who have
psychometric backgrounds, we’ve got to step up, and we’ve got to be the ones to say,
look, states, you need two testing programs. You need a testing program that does
accountability, and here are the characteristics of an accountability test. It’s not
testing every standard with 12 items to give individual students diagnostic scores
that teachers get back at the beginning of the next year, and doing that every year.
It’s a very tight survey measure that has the right psychometric properties for
accountability.
And then we do all of those other great things that Eva wants us to do – the diagnostic
assessments of all kinds. I wrote all sorts of notes on her program yesterday and half
of them are underlined and have exclamation marks next to them. Do we do formative
assessment separately? There are many types of formative assessments -- process
assessments, diagnostic, interim, pre-test, teacher-made test, curriculum tests,
performance-based test, unit tests, pop quizzes, grades for credits, benchmarks,
adaptive assessments, all of these things have to do with learning, but not with
accountability. And that’s what’s causing the confusion with No Child Left Behind.
Because we try to do all of those things with the same state assessment test and as we
saw in Judy’s paper it costs millions of dollars in Ohio to try to please everybody with
a single test. So we need two tests, we need an index that handles this multiple
measures situation and we need a compensatory model, but we need to be careful not
to lose sight about what No Child Left Behind is all about, and not go back to masking
the performance of the individual sub-groups.
So, I think my message to you is, it’s time for this group – the nation’s test directors,
to step up and say we need our assessments to be true to their focuses. And that
probably means we need two test programs in each state. And quite frankly states
ought to be doing formative assessment. Now to pay for that, many districts are doing
it themselves. They can talk about what you’re doing and then the states can focus on
accountability assessments. So I think to go back to the title of our session, ‘Change
It, Fix It Or Live With It.’ I think we can fix it.