A Quandary for School Leaders: Equity, High-stakes Testing and Accountability
Submitted for
AERA Handbook of Research on Educational Leadership for Diversity and Equity
Julian Vasquez Heilig, Ph.D.
Department of Educational Administration
The University of Texas at Austin
George I. Sanchez Building (SZB) 374D
1 University Station D5400
Austin, TX 78712-0374
Tel: (512) 897-6768
Fax: (512) 471-5975
jvh@austin.utexas.edu
Sharon L. Nichols, Ph.D.
Department of Educational Psychology
University of Texas at San Antonio
501 W. Durango Blvd., Rm. 4.342
San Antonio, TX 78207-4415
Tel: (210) 458-2035
Fax: (210) 458-2019
Sharon.Nichols@utsa.edu
The goal of this chapter is to consider the interrelation among equity, high-stakes
testing and accountability as they relate to evolving roles of today’s school leaders. As
educational policy has developed over the past eighty years, rapidly growing fear and
uncertainty have emerged around the “core technology” of education (Young & Brewer,
2008). As a result, many school leaders feel as if their work has changed dramatically,
from a focus on curriculum and instruction to one on assessment and intervention
(McNeil, 2005). The intense focus on test results and how those results are used and
shared with the public has left many school leaders feeling disillusioned, anxious and
uncertain about the decisions they make (Vasquez Heilig & Darling-Hammond, 2008).
Consequently, school leaders face a quandary over how best to manage their schools
when policy-driven accountability mandates conflict with curriculum-based, student-centered
instructional practice—an issue particularly salient among leaders serving in
historically low-performing schools, where leaders are often rewarded for discarding
the neediest underperforming students (Vasquez Heilig, Williams, & Young, 2011). As
we discuss in this chapter, for many school leaders in low-performing schools, the
exclusion of at-risk students from school appears to be a rational response to the
quandary fomented by the current educational policy environment.
We begin with a short history to locate the roots of high-stakes testing and
accountability in the debates between Deweyan pedagogical principles and administrative
reformers and briefly trace the emergence of testing out of the progressive era into the
educational discourse. We then describe the role that the Lone Star State has played in the
push for national high-stakes testing and accountability followed by a précis of Texas
data considering long-term African American and Latina/o student achievement trends
and graduation/dropout rates in the midst of accountability. Next, we examine the
literature to understand how educators in low-performing schools across the nation have
responded to the press of standards and accountability. We conclude with
recommendations for future research and some suggestions for school leaders based on
proposed alternative views of high-stakes testing and accountability.
Foundations of the High-Stakes Testing and Accountability Movement
At the turn of the 20th century, the United States experienced dramatic population and
demographic growth. Driven by immigration, from 1890 to 1930, the population nearly
doubled, from 62 million to 122 million. Concomitant with population growth,
compulsory schooling laws doubled the cost of schooling in many localities as attendance
quadrupled (Tyack, 1976). School enrollment ballooned from 15.5 to 50 million, and the
number of schoolteachers increased from 400,000 to 5.6 million (Nichols & Good, 2004).
In spite of (or because of) these shifts, in conjunction with a highly racialized society, the
struggle for educational equity and access was readily apparent (Blanton, 2004). In the
early 1900s, questions about quality and access were particularly relevant as African
American and Latina/o students attended woefully under-resourced schools at
disproportionately higher rates (Blanton, 2004).
The progressive movement came into prominence in this era of rapidly changing
demographics and evolving social context. An awakening of social conscience among the
muckrakers, prohibitionists and corporate reformers spurred the movement dubbed the
progressive era. Progressive reformers, so named by historians, fomented the public
policy discussions of the day. On one side were the administrative progressives who
argued that the primary goal of schooling was a uniform structure in the mold of
Taylorism that efficiently prepared individuals for a place in the workforce (Tyack,
1974). In today’s language, the tenets of administrative progressivism could be
considered neoliberal. On the other side were the pedagogical progressives who
proffered that schools should recognize and adapt to the individual capacity and interests
of students rather than rote standardization (Tyack, 1974)—a position that aligns more
closely with the socio-constructivist conception of teaching and learning (Vygotsky,
1978). These dichotomous positions contrast with today’s view of the term “progressive”
which is commonly associated with a particular political point of view (i.e., liberal).
The administrative progressives sought to apply a corporate model where expert
bureaucrats ran schools seeking social efficiency. They supported multiple ability tracks,
extensive revisions of curriculum, detailed records of students and upgraded training for
education professionals (Cuban, 1988). Administrative progressives argued that the
governance of city schools was immersed in local politics and inefficiency and should be
turned over to educational experts. They also instituted inter-locking directorates that
created a “symbiosis” between state departments of education, education professors and
state educational associations to encourage local administrators to publicize and enforce
new education codes, regulations and standards (Tyack, James, & Benavot, 1987). The
administrative progressives were concerned with organizational performance and
aggressive “uniform” goals rather than individualized development.
The pedagogical progressives’ counterpoint to administrative progressives was
that the key to student success was not centered in management efficiency, but rather a
focus on meeting the needs of individual students. John Dewey (as cited in Tyack, 1974)
argued,
It is easy to fall into the habit of regarding the mechanics of school organization
and administration as something comparatively external and indifferent to
educational purposes and ideas…We forget that it is such matters as the
classifying of pupils, the way decisions are made, the manner in which the
machinery of instruction bears upon the child that really controls the whole
system. (Tyack, 1974, p. 176)
Pedagogical progressives concentrated on inspiring teachers to change philosophy,
curriculum, and methods to subvert the “hegemony” of school, and on giving them the
independence to do so in order to increase student achievement and success (Tyack,
1974). Dewey argued that democratic education required “substantial autonomy” for
teachers and children. He theorized that children needed education that was authentic—
allowing them to grow mentally, physically and socially by providing student-centered
opportunity to be creative, critical thinkers (Dewey, 1938).
Labaree (2005) proffers that the modern educational policy environment is
heavily influenced by this long-standing debate between administrative and pedagogical
views. He argued that pedagogical progressives failed “miserably” due to the structure
formed by administrative progressives that “reconstructed the organization and
curriculum of American schools in a form that has lasted until this day” (p. 276). The
administrative progressives won the struggle to focus school reform on the management
of schools and the measurement of uniform curriculum structures. In their study of the
progressive era, Lynd and Lynd (as cited in Tyack, 1974) acknowledged that in the clash
between quantitative administrative efficiency and qualitative education goals, “the big
guns are all on the side of the heavily concentrated controls behind the former” (p. 198).
These positivistic values cultivated by the administrative progressives undoubtedly
underlie the emerging power of standardized testing in education from the progressive era
onward.
The Transition from Low-Stakes to High-Stakes Testing: Toward Greater School
Efficiency
For centuries, philosophers and scientists have sought ways to measure human
potential and intelligence (Gould, 1996). As the science of measurement evolved, so too
did the discourse surrounding how tests and measures might be used to facilitate the
educational experience. The use of standardized tests in school settings emerged in the
late 19th century at the same time psychologists were becoming increasingly interested in
developing scientific methods for measuring human qualities such as intelligence
(Gamson, 2007; Mazzeo, 2001; Sacks, 1999). The widespread use of such tests was
popularized with the construction of the standardized “National Intelligence Test”
spearheaded by Edward Thorndike, Lewis Terman, and Robert Yerkes (Giordano, 2005).
Standardized intelligence testing gained momentum between 1880 and World
War I as a variety of instruments were utilized to gauge mental ability (Boake, 2002;
Gamson, 2007). Soon after the turn of the 20th century, the Binet-Simon scale was used
to test children aged 3 to 15 years, comparing their intellectual capacity with that of
“normal” children and adolescents of various ages to derive an IQ, or intelligence
quotient (Binet & Simon, 1916). Ability testing of immigrants also emerged as federal
immigration acts as early as 1917 required basic literacy tests for new arrivals (Timmer &
Williamson, 1998). During WWI, the Army utilized mental testing developed by Edward
Thorndike to determine which soldiers were eligible for the officer corps (Samelson,
1977). The conception that aptitude could be quantitatively measured and used for
educational purposes also gained momentum as intelligence tests were revised for use in
schools and promoted “tracking” systems used to segregate students according to ability
(Chapman, 1981).
Growing familiarity with these standardized tests ignited debates early in our
educational history about how best to assess students’ academic potential.
Proponents, skeptical of subjective teacher grading systems (Starch & Elliott, 1912, as
referenced in Giordano, 2005), believed standardized tests were the perfect way to elicit
meaningful and reliable data about students. Opponents worried about test bias and the
tests’ limited capacity to adequately account for student differences (e.g., race, income). These
are familiar worries about the appropriate use of standardized tests that have persisted
since their inception (Sacks, 1999).
In spite of ongoing debates regarding the fundamental purposes of schools (and
therefore, use of tests, e.g., Cuban, 1988; Tyack & Cuban, 1995), proponents of
standardized tests and purveyors of the administrative progressive viewpoint of schooling
were convinced of their necessity and role. E. L. Thorndike put it this way,
Educational science and educational practice alike need more objective, more
accurate and more convenient measures…Any progress toward measuring how
well a child can read with something of the objectivity, precision,
commensurability, and convenience which characterize our measurement of how
tall he is, how much he can lift with his back or squeeze with his hand, or how
acute his vision is, would be of great help in grading, promoting, testing the value
of methods of teaching and in every other case where we need to know ourselves
and to inform others how well an individual or a class or a school population can
read (Thorndike, 1923, pp. 1-2).
Since Thorndike’s time, the form and function of standardized tests have expanded
swiftly and have been used for many purposes over the years (Sacks, 1999). Norm-referenced
tests (e.g., IQ) gave us a way to rank students according to aptitude. Criterion-referenced
tests provided measures of students’ progress against externally defined
standards. Other standardized tests gave us a way to make predictions about students’
academic potential (e.g., SAT, GRE) (Giordano, 2005; Herman & Haertel, 2005; Moss,
2007; Sacks, 1999).
When it comes to consequences, until the 1980s, performance on any one of these
types of tests had relatively low stakes for students and essentially no stakes for teachers.
Although some states and districts have a history of experimenting with high-stakes
testing as a way to reform and improve schools, this practice was inconsistent and
relatively inconsequential for large groups of students (Allington & McGill-Franzen,
1992; Mazzeo, 2001; Tyack, 1974). This has changed radically over the past few decades
as the rhetoric of public schools in “crisis” expanded (Berliner & Biddle, 1995; Glass,
2008; National Commission for Excellence in Education, 1983) culminating with the No
Child Left Behind act (NCLB, 2002) that mandated the use of high-stakes testing in all
schools for all students. Consequential standardized testing systems pervade modern
American classrooms (Herman & Haertel, 2005; Orfield & Kornhaber, 2001).
Using Stakes-Based Testing to Address Underperformance and Inequities in
Education
The current emphasis on tests for making important decisions about schools, students,
teachers, and administrators can be traced back to two important events. The first rests
with the Cold War Era launch of Sputnik—an event that ignited deep concern about the
adequacy of American public schools to prepare students to compete internationally.
Since then, public school bashing has become a tool justifying a series of federal reforms
that over time have become increasingly intrusive (Gamson, 2007; Giordano, 2005). The
second event came with the 1965 authorization of the Elementary and Secondary
Education Act (ESEA). ESEA was the first large-scale federal legislation aimed at
equalizing educational opportunities for all of America’s students. A significant goal of
ESEA was to address resource allocation inequities among schools serving wealthy and
poor students. Of its many provisions, ESEA provided federal dollars to schools serving
large populations of students of poverty. At the time, it was argued that equalizing school
inputs (resources, library books, access to materials, supplies and teachers) would help to
minimize achievement gaps among students.
Over time, the focus shifted from school inputs (e.g., what do schools provide to
students?) to their outputs (e.g., what skills do students leave with?). As a result, states
began implementing minimum competency tests as a way to comply with federal
recommendations to ensure all students left school with at least the ability to read and do
basic math (Herman & Haertel, 2005; Moss, 2007). In some states, students could be
denied a diploma if they did not pass these tests, but there were few if any consequences
for teachers or schools.
Eventually the minimum competency tests were criticized for being relatively
easy to pass since they were concerned with minimums to be learned: the achievement
floor and not the achievement ceiling. Many politicians and citizens alike believed that
students were not being stretched enough, that overall US school performance was not
improving, and that the terrible achievement gap between white middle-class students
and black, Hispanic, or poor students was not being reduced. As discontent grew, the
language of minimum competency changed to the language of standards-based
achievement (Glass, 2008; McDonnell, 2005).
In the years following ESEA, concern for the emerging American educational
“crisis” grew (Aronowitz & Giroux, 1985). This was fueled, in part, by international data
that purported to show that US schools were not as good as those in other nations. More
fuel came from the growth of the international economy at a time when the
national economy was performing poorly. A scapegoat for that state of affairs needed to
be found (Berliner & Biddle, 1995). The concerns of the 1970s culminated in the release
of A Nation at Risk (1983)—a report that predicted that unless public education
received a major overhaul, and unless expectations for student achievement were raised,
America’s economic security would be severely compromised (National Commission for
Excellence In Education, 1983).
Indeed, A Nation at Risk sparked a renewed interest in and urgency about how
we educate America’s students. That sense of urgency set in motion a series of policy
initiatives aimed at improving the “failing” American educational system. The past 20
years have also seen a renewal of the rapid demographic changes experienced during
the progressive era, spurring a renewed focus on educational policy to solve America’s
educational problems—among these was a call for “closing the achievement gaps,”
accountability, and more consequential testing (Anderson, 2004; Lipman, 2004; Herman
& Haertel, 2005).
Closing the Achievement Gap: The Modern Rise of
High-Stakes Testing and Accountability
While immigrants mostly of European origin led the immigration wave during the
progressive era, people of color are leading the current demographic shift in US schools.i
According to the U.S. Census Bureau, the number of Latina/os living in American
households rose roughly 500% between 1970 and 2010 (U.S. Census, 2006). By the 2000
census, Latina/os had become the largest ethnic minority group in the
United States at almost 13% of the populace (U.S. Census, 2001).1 By 2008, nearly one
in six U.S. residents were Latina/o (U.S. Census, 2009). Turning to African Americans,
they increased by 6.6 million between 1990 and 2000—the largest increase since the first
Census in 1790. African Americans were 12.9% of the U.S. population in 2000 and
increased 14% by 2008 (U.S. Census, 2008). Nationally, students of color constituted a
majority in the public schools in 11 states by 2008 (Southern Education Foundation,
2010).ii
One of the most pressing problems in the United States is improving student
academic performance within the nation’s burgeoning African American and Latina/o
student population (Rumberger & Arellano Anguiano, 2004). African American and
Latina/o students comprise a large sector of students vulnerable to poor school
performance, as many of these youth arrive in high school having received uneven or
irregular instruction (Jencks & Phillips, 1998). Although there have been a multitude of
efforts (both successful and unsuccessful) over time targeting achievement-related gaps
among poorer students of color and their white counterparts (Anderson, 2004), we focus
1 People who identify as Latina/o may be of any race.
on the latest efforts subsumed within the No Child Left Behind Act of 2001 that
reauthorized the Elementary and Secondary Education Act of 1965.
No Child Left Behind: Lassoed From Texas
No Child Left Behind was rooted in Texas long before its passage in 2001. Soon
after the Nation at Risk (1983) report, there was a new push by Texas policymakers and
business leaders to reform the state’s schools. The Perot Commission and, later, the Texas
Business-Education Coalition coalesced corporate leaders to represent the business
perspective in education reform (Grissmer & Flanagan, 1998). Ross Perot and his allies
were “influential actors” and proponents of accountability and testing in Texas (Carnoy
& Loeb, 2003). Codified in this reform effort was a determination to inculcate reform
measures to increase efficiency, quality and accountability in a push for schools to
perform more like businesses (Grubb, 1985).
As a result, Texas was one of the earlier states to develop statewide testing
systems during the 1980s, adopting minimum competency tests for school graduation in
1987.iii In the early 1990s, the Texas Legislature passed Senate Bill 7 (1993), which
mandated the creation of the first-generation of Texas public school accountability to rate
school districts and evaluate campuses. The Texas accountability system was initially
supported by the Public Education Information Management System (PEIMS) data
collection system, a state-mandated curriculum and the Texas Assessment of Academic
Skills (TAAS) statewide testing program.
The prevailing theory of action underlying Texas-style high-stakes testing and
accountability ratings was that schools and students held accountable to these measures
would automatically increase educational output as educators tried harder, schools
adopted more effective methods, and students learned more (Vasquez Heilig &
Darling-Hammond, 2008). Pressure to improve test scores would produce genuine gains in
student achievement (Scheurich, Skrla, & Johnson, 2003). As test-based accountability
commenced in Texas, achievement gains across grade levels conjoined with increases in
high school graduation rates and decreases in dropout rates brought nationwide acclaim
to the Texas accountability ‘miracle’ (Haney, 2000).
Citing the success of the first generation of Texas-style high-stakes testing and
accountability, President George W. Bush and former Secretary of Education Rod Paige,
two primary arbiters of NCLB, lassoed their ideas for federal education policy from
Texas. NCLB replicated the Texas model of accountability by injecting public rewards
and sanctions into national education policy and ushered in an era where states and
localities are required to build state accountability systems on high-stakes assessments.
Early on, some of the literature echoed the administrative progressive ideal that the long-term
implications of accountability pointed to increased efficiency and achievement
(Cohen, 1996; Smith & O’Day, 1991), while others posited the Deweyan view that testing
would dramatically narrow the curriculum and could negatively impact classroom
pedagogy (McNeil & Valenzuela, 2001; Valencia & Bernal, 2000). Nevertheless, at the
point of NCLB’s national implementation, the Texas Miracle was the primary source of
evidence fueling the notion that accountability created more equitable schools and
districts by positively impacting the long-term success of low-performing students
(Nichols, Glass, & Berliner, 2006).
The Texas Case: The Long-Term Impact of Accountability
The successes of the Lone Star State’s accountability policy in the midst of the
Texas Miracle have been debated vociferously in the literature (Carnoy, Loeb, & Smith,
2001; Haney, 2000; Klein, Hamilton, McCaffrey, & Stecher, 2000; McNeil, Coppola,
Radigan & Vasquez Heilig, 2008; Linton & Kester, 2003; Toenjes & Dworkin, 2002;
Vasquez Heilig & Darling-Hammond, 2008). In theory, accountability spurs high schools
to increase education output for all students, especially for African American and
Latina/o students, who have been historically underserved by U.S. schools. Yet the
question remains: Do policies that reward and sanction schools and students based on
high-stakes tests improve African American and Latina/o student outcomes over the long
term?
The Texas Comptroller of Public Accounts (2001) indicated that the PEIMS was
created in 1983 for the Texas Education Agency (TEA) to provide a uniform accounting
system to collect all information about public education, including student demographics,
academic performance, personnel, and school finances. The PEIMS lies at the heart of the
Texas student accountability system, and the wealth of information gathered from school
districts offers the opportunity to gauge the success of Texas-style accountability for
students over time. At the time of writing, there were no publicly available TEA reports
or published research considering cross-sectional African American and Latina/o
achievement and progress through school in Texas. As a result, this chapter gathers data
from multiple state reports to descriptively consider TAAS and TAKS exit testing
achievement, grade retention, dropout rates, and graduation rates for more than 15 years
of Texas-style accountability.iv
To address the question of whether student outcomes for African American and
Latina/o students have improved over time in Texas, we examined cross-sectional high
school exit exam data from the inception of accountability in 1994 through 2010—the
most recent data available. During this time, Texas utilized two generations of
accountability assessment systems. The first generation relied on the Texas
Assessment of Academic Skills (TAAS) and lasted from 1994 to 2002. The second generation
used the Texas Assessment of Knowledge and Skills (TAKS) and includes data from
2003-2010. Our descriptive statistical analyses focus on African American and Latina/o
high-stakes high school exit test score trends for 10th graders (TAAS, 1994-2002) and
11th graders (TAKS, 2003-2010).
TAAS Exit Exam
Figure 1 shows that African Americans dramatically increased their achievement
on the TAAS Exit Math, from only 32% meeting minimum standards in 1994 to 85% by
2002. Concurrently, the percent of Latina/os meeting minimum standards increased from
40% to 88%. Although an achievement gap between minorities and whites remained, the
gap for Latina/os and African Americans narrowed to 8% and 11%, respectively, between
1994 and 2002.
Figure 2 also shows large gains in the percent of African American and Latina/o
students meeting minimum standards on the TAAS Exit Reading. By 2002, TEA reported
that 92% of African Americans and 90% of Latina/os in the state had met minimum
standards on the TAAS Exit Reading. African Americans showed a 32% increase in
students meeting minimum standards, while Latina/os showed an overall increase
of 29%. The achievement gap closed to 8% for Latina/os and 6% for African Americans.
[INSERT FIGURES 1 AND 2 HERE]
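The gap figures above are simple percentage-point arithmetic. A minimal sketch illustrates it; the 2002 White TAAS Exit Math pass rate below is an assumption back-solved from the gaps reported in the text, not a TEA figure:

```python
# Illustrative arithmetic behind the TAAS Exit Math trends described above.
# Pass rates are the percentages cited in the chapter; WHITE_2002_MATH is
# an assumption inferred from the reported gaps, not a TEA figure.
WHITE_2002_MATH = 96  # assumed White pass rate in 2002

math_1994 = {"African American": 32, "Latina/o": 40}
math_2002 = {"African American": 85, "Latina/o": 88}

for group in math_1994:
    gain = math_2002[group] - math_1994[group]  # percentage-point gain
    gap = WHITE_2002_MATH - math_2002[group]    # remaining 2002 gap
    print(f"{group}: +{gain} points, remaining gap {gap} points")
```

Under this assumed White rate, the computed remaining gaps (11 points for African Americans, 8 points for Latina/os) reproduce the gaps reported above.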
TAKS Exit Exam
In 2003, the TAKS replaced the TAAS as the exit exam in Texas. As shown in
Figure 3, between 2003 and 2010 the percentage of African Americans passing the TAKS
Exit Math increased from 25% to 81%, a gain of 56%. Latina/os showed a similar gain of
55% more students meeting minimum standards on the TAKS Exit Math (from 30% to
85%). Similar to the closing of the achievement gap on the TAAS Exit Math, the TAKS
Exit Math gap for African Americans and Latina/os decreased to 4% and 9% by 2010
(see Figure 3).
[INSERT FIGURE 3 HERE]
During the past 8 years of TAKS Exit testing, the percentage of African
Americans passing the TAKS Exit English Language Arts increased 43%, while the
proportion of Latina/os meeting minimum standards increased 38% (see Figure 4).
Similar to the closing of the achievement gap noted on the TAAS Exit Reading, the gap
between African American and White students decreased to 6%. By 2010, the gap
between the percent of Whites and Latina/os passing the TAKS Exit English Language
Arts had declined to 7%.
[INSERT FIGURE 4 HERE]
Dropout
Cross-sectional snapshot cohort rates show that publicly reported yearly dropout
rates more than halved in the first decade of accountability—arriving at about 1% for
African Americans and 2% for Latina/os by 2004 (see Figure 5). However, after 2005,
when the state began to use the National Center for Education Statistics (NCES) dropout
definition for leaver reporting, the yearly count tripled for Latina/os and quadrupled for
African Americans. Clearly, as evidenced by Figure 5, Latina/os and African Americans
were over-represented in the underreporting of yearly dropouts.
In the 1998–1999 school year, TEA introduced tracking of individual students in
cohorts between Grades 9 and 12 (TEA, 2001). As a result, the longitudinal cohort
dropout analysis begins in 1998 instead of 1994.v Figure 6 shows that TEA-reported
African American and Latina/o cohort dropout rates halved between 1999 and 2005.
However, after 2005, using the NCES dropout standard for leaver reporting, a 100%
increase in the number of publicly reported dropouts occurred in Texas.
[INSERT FIGURES 5 AND 6 HERE]
Notably, the cohort dropout rates more than doubled for African Americans and
Latina/os, after the adoption of the NCES standard. These numbers align with empirical
research critical of TEA’s publicly reported dropout numbers (Losen, Orfield, & Balfanz,
2006; Vasquez Heilig & Darling-Hammond, 2008) and suggest that student leavers were
underreported for quite some time by the state, especially when it came to African
American and Latina/o populations. Texas has vastly undercounted and underreported
dropout data over time.
Graduation Rates
If significantly larger numbers of African Americans and Latina/os were dropping
out of school in Texas, then cohort graduation rates should be correspondingly low.
Figure 7 shows that TEA reported African American and Latina/o graduation rates from
1996-2004 gradually rose to about 80% then dipped by almost 10% when NCES
standards were instituted in 2005. Notably, the large decline did not occur for Whites in
Texas, as their cohort graduation rates only dipped about 1% after the NCES readjustment.
[INSERT FIGURE 7 HERE]
In a study of Texas dropout data, Losen et al. (2006) argued that Texas graduation
rates historically have been overstated. They examined PEIMS data for individual
students and proffered that between 1994 and 2003, the state’s graduation rate increased
from 56% to 67%. In contrast, TEA’s publicly released statistics locate the graduation
rates at 72% and 84% for the same period—a difference of 17% by 2003, the equivalent
of approximately 46,000 students. Losen et al. noted that the overstatement of graduation
rates in Texas occurred partly because PEIMS has included many ways that students
could be excluded from enrollment data used to calculate graduation rates. Instead of
utilizing PEIMS to define away the dropout and graduation numbers in Texas, the NCES
definition has created more transparency in the state while calling into question whether
gains have actually occurred in Texas since the inception of accountability in 1994.
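The scale of the Losen et al. discrepancy can be reconstructed with back-of-envelope arithmetic; the cohort size below is a hypothetical round number chosen only to show the order of magnitude, not a TEA figure:

```python
# Rough check of the ~46,000-student discrepancy discussed above: TEA
# reported an 84% graduation rate for 2003, while the PEIMS-based
# estimate was 67% -- a 17-percentage-point difference.
COHORT_SIZE = 270_000  # hypothetical statewide cohort (assumption)

tea_rate, peims_rate = 0.84, 0.67
students = round((tea_rate - peims_rate) * COHORT_SIZE)
print(students)  # on the order of the ~46,000 students cited above
```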
The Intercultural Development Research Association (IDRA) has argued that
adopting the NCES national dropout definition for Texas has provided a more accurate,
yet still understated representation of the magnitude of the overall dropout problem in
Texas (Johnson, 2008). More than two decades of IDRA’s yearly high school attrition
studies of PEIMS data have suggested that TEA has consistently and severely
undercounted student leaving in publicly reported dropout and graduation rates. IDRA
found the overall student attrition rate of 33% was the same in 2007–2008 as it was more
than two decades ago (Johnson, 2008). In contrast, TEA had reported annual dropout
rates that declined from 5% to 1% and longitudinal cohort dropout rates that declined
from about 35% to around 5% over the same time frame (Figure 8). IDRA also posited
that the high school attrition rates for Latina/o and African American students accounted
for more than two thirds of the estimated 2.8 million students lost from Texas public high
school enrollment since the 1980s (Johnson, 2008).
[INSERT FIGURE 8 HERE]
In summary, TEA’s TAAS and TAKS exit exam data show that African
American and Latina/o students apparently made dramatic achievement gains and
narrowed the achievement gaps during the TAAS and TAKS eras. However, the cross-sectional
student progress analysis showed that dropout rates and graduation rates for
African Americans in Texas do not appear to have improved (even with the apparently
inflated rates released by TEA) after about 15 years of high-stakes testing and
accountability policy; in fact, if data from empirical sources are to be believed, the
situation has worsened.
As a cautionary note, we acknowledge that this review of data is limited because
of the ongoing debate about the validity of leaver data collected by the state. Data
reported by the state of Texas have long been accused of inaccuracy in the accounting of
student leavers (Haney, 2000; Orfield et al., 2004; Vasquez Heilig & Darling-Hammond,
2008). The data used in these analyses are the same data that has drawn criticism from
IDRA and other researchers that have argued that the leaver problem is underreported
(Johnson, 2008). We believe the actual dropout rates to be much higher and graduation
rates lower than the publicly reported data (see also McNeil, Coppola, Radigan, & Vasquez Heilig, 2008). Furthermore, critics have questioned the validity of TAKS and
TAAS score growth over time due to TEA's lowering of cut scores in successive state-mandated testing regimes (Mellon, 2010; Stutz, 2011). We will return to these issues in
forthcoming sections.
Are High-stakes Testing and Accountability As Good as Advertised?
What explains the “dubious” nature of these data? Why does Texas consistently
overstate the educational trajectories of its African American and Latina/o youth
population? Campbell's law cogently describes the inevitable corruption that emerges when a
single indicator is used to evaluate complex social systems: “The more any quantitative
social indicator is used for social decision-making, the more subject it will be to
corruption pressures and the more apt it will be to distort and corrupt the social processes
it is intended to monitor” (Campbell, 1975, p. 35). Campbell's law thus warns of the corruption and distortion associated with high-stakes testing systems that rely solely on test scores to evaluate complex educational processes.
Rapidly emerging research seems to bear out Campbell's early warning; increasingly, data suggest that educators engage in all types of activities aimed at making test
scores as favorable as possible. These “gaming” activities (Ryan, 2004; Figlio & Getzler,
2002) have a particularly negative effect on African American and Latina/o students’
educational outcomes since poor students of color and those with learning disabilities or
for whom English is a second language are more likely to be low scorers on these tests.
For example, poor, African American and Latina/o students are more likely to suffer
from low-quality pedagogy due to excessive teaching to the test and rote and drill
activities as well as a more restricted curriculum (McNeil, 2000; Vasquez Heilig, 2011;
Vasquez Heilig, Cole, & Aguilar, 2010). Other types of gaming activities include blatant
cheating such as correcting test answers (Amrein-Beardsley et al., 2010; Jacob &
Levitt, 2002a; Pedulla et al., 2003) and excluding special populations from testing
through exemptions and other means in order to show overall increased educational
achievement (Cullen & Reback, 2006; Jacob, 2005; Jennings & Beveridge, 2009).
Gaming and Cheating Because of High-Stakes Testing
Before we review some of the research underscoring the prescience of Campbell’s
law, it is important to note that not all acts of cheating or gaming are the same (Nichols &
Berliner, 2007a, b). We believe, as Amrein-Beardsley, Berliner, and Rideau (2010)
describe, that there are different levels or “degrees” of cheating that ensue from high-stakes testing pressures. First-degree actions constitute willful, premeditated acts
designed to intentionally change, alter, or fudge test score data. Second-degree actions
include more nuanced types of test fudging that include subtle modifications of test
administration protocols such as when teachers provide test taking cues, encouragements
and reminders, as well as gaming associated with how proficiency cut scores are
determined and manipulating testing populations. Finally, third degree “cheating” is
characterized by involuntary actions or cheating without awareness such as when
educators use readily available test items to prepare students but “focus only on the
testable standards, concept areas, and performance objectives” (Amrein-Beardsley,
Berliner, & Rideau, 2010, p. 7).
Prevalence of Cheating
It is difficult to gauge the prevalence of test-related teacher/administrator
cheating/fudging because of the reliance on self-report data drawn from non-generalizable samples. Still, some clues have emerged. A survey conducted with teachers
and administrators in Tennessee found that 9% of teachers witnessed some type of test
impropriety on that state’s test (Edmonson, 2003). A national survey of teachers
conducted in the early years of NCLB suggested that approximately 10% admitted to
providing hints about answers during the test (second degree), with approximately 1.5% admitting to actually changing answers on the test (first degree) (Pedulla et al., 2003).
Jacob and Levitt (2002a, b) studied classroom test results in the Chicago Public School district, which at the time used the ITBS to make decisions about the quality of teachers. They
concluded that approximately 4-5% of classroom teachers had engaged in blatant
cheating (e.g. changing answers from wrong to right).
Another difficulty in estimating prevalence has to do with varying definitions of
what constitutes cheating. Amrein-Beardsley, Berliner, and Rideau’s (2010) self-report
study of teacher cheating provides some additional insight on prevalence according to
their first, second, and third degree characterizations. In their study, Amrein-Beardsley et
al. (2010) surveyed a non-representative sample of 3,085 teachers across the state of
Arizona, asking them to provide yes/no responses to a series of questions pertaining to
their views of other teachers’ cheating behaviors as well as their own. Not surprisingly,
teachers reported lower instances of personal cheating and greater instances of others’
cheating. Reports on what colleagues do in the classrooms were based on what they
overheard or what they were told (i.e., not what they witnessed). From these data, they
found that 39% of teachers said colleagues encouraged students to redo test problems,
34% gave students extra time on the test, and 34% wrote down questions for students to
use to prepare for future tests. Few knew of anyone who cheated outright (10%). Only 1% of respondents admitted to blatantly cheating themselves, but they were more likely to admit to second- and third-degree types of cheating activities such as encouraging students to
redo problems (16%) and giving students extra time on the test (14%) or encouraging
them during the test (8%).
Although we must be cautious about generalizing from these data, it seems safe to estimate that anywhere from 25% to 40% of teachers at some point engage in second-degree forms of cheating (consisting primarily of test administration protocol violations). When it comes to more blatant, willful acts (e.g., changing answers), the rate, consistent with what others have found, is likely in the 1-5% range. Interview and focus group
data provide more information on the type of blatant acts witnessed. For example, one
teacher reported the blatant cheating activities of a once-valued mentor:
She told me to sit down. She said that she needed my help because of some of the
bubbles were wrong and they needed to be fixed. I did not even think to question
her. I was just, like, ‘okay.’ So, I sat down next to her and I saw her. She was
going down looking at all the bubbles. She told me that some of them were
wrong. So, she gave me a stack of them and she just told me this one is supposed
to be B or this one is supposed to be C and whatever (Amrein-Beardsley, Berliner,
& Rideau, 2010, p. 16).
Unfortunately, this was not an isolated incident. Another teacher reported the following:
I observed one time during the AIMS a fellow teacher in the same grade level
doing the writing, which is supposed to be done by the children—the prewriting,
first draft, final draft. I walked by her classroom and the door was opened. She
had the kids lined up at her desk…She told me they were just doing the AIMS test
and she was correcting their writing and sending them back to their seat[s] to
correct their writing. Of course that year she got kudos because her kids reached
the 85th percentile. So, she got great results. She was doing the writing for them,
but she got all the pats on the back and everything else. I even brought it to the
principal’s attention. He did not really want to hear about it because of course if
our scores look better they [administrators] don’t care. They turn their heads
(Amrein-Beardsley et al., 2010, p. 20).
When learning is viewed as a product (i.e., test score) instead of a process (i.e., an
experience), and when that product is the only thing that stands between educators and
financial resources that would significantly improve their own and their students' academic
livelihoods, then it stands to reason that educators will engage in behaviors aimed at
making that indicator as favorable as possible.
No Child Behind Left
The most prominent and empirically documented form of second-degree cheating involves manipulating the test-taking pool. An overreliance on test scores as the sole criterion for significant funding decisions has the unintended consequence of incentivizing exclusionary practices whereby educators—either blatantly or indirectly—
remove low-test scorers from taking the test. Studies have shown that greater numbers of
marginalized populations of students tend to be excluded from the test (Figlio & Getzler,
2002; Jacob, 2002). These exclusionary practices come in many forms. Figlio and
Getzler's (2002) study provides evidence that the introduction of high-stakes tests in Florida increased the likelihood that a student would be classified with a learning disability by almost 50%, thereby making it more likely that their test scores would be excluded
from AYP calculations. Elsewhere, Figlio (2006) found that lower test scorers were more
likely to receive longer suspensions when caught in altercations with other, higher test
scoring students.
Sometimes exclusionary practices are more abrupt, aggressive, and obvious. In
Birmingham, Alabama, for example, just before the state test, 522 students were
administratively “withdrawn” from the rolls (Orel, 2003). One of these students,
attempting to re-enroll in school, noted:
He [the principal] said that he would try to get…the students out of the school
who he thought would bring the test scores down. He also gave us this same
message over the intercom a couple of times after that. On the last day that I went
to school, I was told to report to the principal’s office because my name was not
on the roster. I was given a withdrawal slip that said “Lack of interest.” I did miss
a lot of school days. I had family problems. I had allergies. (Orel, 2003)
And there are many examples. In North Carolina, some district figures showed a spike in suspensions around test-taking time, and in Florida in 2004, over 100 low-scoring students were "purged" from attendance rolls just before the state test (Nichols &
Berliner, 2007b).
In another more subtle effort at gaming the test taking pool, Figlio and Winicki
(2004) looked at school lunches before, during, and after high-stakes tests in Virginia
school districts. Their analysis of nutrition content and caloric amount throughout
Virginia’s high-stakes testing administration period revealed that districts faced with the
threat of imminent test-based sanctions were significantly more likely to provide higher
(empty) caloric lunches than other districts not facing immediate threats of sanctions.
This was done presumably in the quest to give lower scoring students a cognitive boost
on test days (Figlio & Winicki, 2004), thereby helping districts avoid federal sanctions.
In contrast to the intent of NCLB to expand educational opportunities for historically marginalized populations, what we see over and over is the exact opposite—high-stakes
testing incentivizing ways to further deny opportunity and high quality educational
experiences to all students.
Vasquez Heilig and Darling-Hammond (2008) examined longitudinal student progress and learning over time in a large, urban district in Texas to gauge the effects of high-stakes testing on the educational opportunities provided to African American and
Latina/o students. Their analyses show that there were increases in student retention,
student dropout/disappearance, and, ultimately, failure to advance to graduation,
disproportionately affecting African Americans and Latina/os. Additionally, student
retention and disappearance rates increased high schools’ average Reading and Math Exit
TAAS (Texas Assessment of Academic Skills) scores and Texas Education Agency
(TEA) accountability ratings, enabling the district's high schools to
maintain a public appearance of success.
Playing With Numbers
Another form of second degree cheating has to do with politicization/gaming of
cut scores. When consequences are tied to tests, it becomes extremely important that
there are clear definitions of what level of performance constitutes passing and failing. If
a test includes 30 items, for example, how many must a test taker get right to deem
him/her proficient in that area? Must they get 29 correct? 28? As the stakes of doing well
(or poorly) increase, we see explicit and implicit forms of gaming associated with how
these arbitrary cut scores are determined (Glass, 2003). For example, in the spring of
2005, the Arizona State Board of Education and state schools chief Tom Horne publicly debated
the merits of two different cut scores: one which would have resulted in 71% of Arizona
students passing, and the other resulting in 60% of them passing. In short, the state board
wanted “easier” standards while Tom Horne was arguing for “tougher” standards
(Kossan, 2005).
In Texas, TEA has consistently lowered the standards in successive testing
regimes. Stutz (2001) reported that TEA lowered testing standards as “1.9 million
students tested in math, reading and other subjects… were required to correctly answer
significantly fewer questions to pass the high-stakes Texas Assessment of Academic
Skills.” For example, in math, students had to get only about half the math questions
right. Two years earlier, they had to get about 70 percent of the TAAS questions correct.
TEA also made similar reductions to the TAKS testing standards. Mellon (2010)
revealed, “The biggest change involved the social studies test for students in grades 8 and
10. This year, for example, eighth-graders had to answer correctly 21 of 48 questions —
or 44 percent. Last year, the passing standard was 25 questions, or 52 percent.” The lower
passing standards that TEA has consistently implemented over time calls into question
the much- touted improvements on the state-mandated TAAS and TAKS testing regimes.
In 2005, Achieve, Inc. compared state high-stakes test proficiency levels with those set by the National Assessment of Educational Progress (NAEP), a federally funded achievement test commonly used as a benchmark for comparison with state tests. When it came to
fourth grade math performance in 2005, states varied widely in how they defined
“proficient” (Figure 9). Compared to NAEP's standard of proficiency, Mississippi's tests were the “easiest,” whereas Massachusetts' assessments were much “harder.”
[INSERT FIGURE 9 HERE]
The nature of cut scores brings into question the meaningfulness (or content-related validity) of resultant high-stakes testing performance. Data we review from Texas (Figures 1-4) suggest that over time greater proportions of African American and Latina/o
students have attained minimum levels of competency on the state’s TAAS/TAKS tests.
Although this pattern may represent the “truth” in the public policy sphere, the empirical
research on the consequential nature and limited validity of testing in Texas makes this
interpretation somewhat suspect. As others have pointed out, favorable patterns of
student performance on Texas's high-stakes tests are more likely the result of the lowering of cut-score standards and suspicious exclusionary practices (removing low scorers from
test taking) or other forms of data manipulation (e.g., misrepresentation of
dropout/graduation rates), particularly with respect to African American and Latina/o
populations (Haney, 2000; Linn, Graue, & Sanders, 1990; Shepard, 1990; Stutz, 2001;
Mellon, 2010; Vasquez Heilig & Darling-Hammond, 2008). Thus, it is difficult to
ascertain how much students actually learn under high-stakes testing environments
(Amrein & Berliner, 2002; Nichols, Glass, & Berliner, 2006).
Cheating Unknowingly
Cheating in the third degree is characterized by involuntary actions or cheating
without awareness such as when educators use readily available test items to prepare
students but only focus on narrow testable standards, concept areas, and performance
objectives (Amrein-Beardsley, Berliner, & Rideau, 2010). Studies show that as test-related pressure goes up, distortions in sound instructional decision-making become more
likely. Teachers report that the tests come to drive instruction, forcing them to teach to
the test, narrow the curriculum, restrict their instructional creativity, and compromise
their professional standards (Jones & Egley, 2004; Jones, Jones, & Hargrove, 2003;
Pedulla et al., 2003; Perlstein, 2007).
Vasquez Heilig (2011) examined how high schools in Texas have narrowed
curriculum and pedagogy in response to Texas high-stakes exit testing. Teachers (11 of
33) and principals (6 of 7) from each of the four study high schools detailed aspects of
“teaching to the test” and the impact of exit testing on the narrowing of the curriculum.
A staff member at a suburban high school in Texas was concerned that the attention being
paid to the TAKS testing was distracting from the core mission of educating students.
The respondent had been on staff at the high school for over a decade, which provided a
long view of how the policy was changing the school environment:
I think personally that we're so caught up in this game of making the numbers
look good you know, so that your AYP and all this other stuff, the report card,
that we've forgotten why we're here as educators. . . . I think it's very transparent,
to them [students] too, that the emphasis is on teaching the tests, and
manipulating, in a sense, the figures, rather than on focusing on really teaching.
The students in the suburban high school also seemed concerned with how exit
testing was impacting instruction and the quality of the curriculum. As the staff member
mentioned, the tensions associated with the TAKS testing were also on the minds of the
students. When asked whether the TAKS appeared in the daily curriculum, students
related that they had noticed that many of their courses had a heavy TAKS preparation
focus. One student stated,
In my pre-AP [Advanced Placement] we were working in class on one problem
by day or sometimes, some days with practice like, many problems during the
class. But in regular classes they just give you the [TAKS] book and the class was
about that, was according with that book.
The students who faced the heaviest test-prep focused courses were those who had not
passed TAKS during prior testing opportunities. Students relayed that they were tracked
into courses where the amount of TAKS curriculum was increased. A Texas student who
had failed previous administrations of the TAKS Exit related,
Every class I have, it’s based on the TAKS, you know? Like we do exercises that
are in the TAKS. . . . Right now I’m taking the TAKS classes for the tests I need
and basically he lectures the whole period [on TAKS]. . . . We don’t have
homework.
Student informants at a rural Texas high school related that the curriculum was
saturated with teaching to the exit tests. They reported that their chemistry class entailed
100% TAKS test preparation—no textbooks, labs, experiments, or other traditional
means of science curriculum. The entire chemistry course was solely designed to drill
students for science exit testing by utilizing multiple-choice worksheets. The idea seemed
somewhat implausible, until the rural high school chemistry teacher was randomly
chosen to participate in a focus group. She characterized the worksheets in the course as
being entirely geared for the TAKS: “Mine is not going through a 15-minute bell-ringer,
then going on to teaching chemistry. No, no me it’s everything, so mine are actual
[TAKS] lessons. . . . I don’t just teach my course, now I teach towards the TAKS.”
It is also an open question whether tested subjects receive more attention in an
environment of accountability. While high-stakes testing did not halt nationwide arts
education in America, it is readily apparent that the focus of NCLB is elsewhere
(Chapman, 2007; Zastrow & Janc, 2004). The only subjects for which the federal government holds states accountable (through Adequate Yearly Progress, the accountability mechanism of NCLB) are reading and mathematics. Vasquez Heilig, Cole
and Aguilar (2010) examined another class of third-degree cheating—the decline of arts
education curriculum due to testing in language arts and mathematics. They tracked the
evolution and devolution of visual arts education from the progressive era through the
modern accountability movement. Their analysis of archival material, state curricular
documents, and conversations with policymakers show an increasing focus in the
accountability era on core subject areas of reading, writing, and mathematics at the
expense of arts education. They found that high-stakes testing had narrowed the
curriculum and pushed arts education and other non-core subjects to the margin. Berliner
(2009) argued that the accountability era has also culminated in a movement away from curriculum driven solely by local pedagogical and curricular discourse to an environment where educational standards defined at the state and federal levels have determined the declining prominence and presence of the arts and other non-tested subjects in the school curriculum.
Diane Ravitch, a widely respected conservative and initial supporter of No Child
Left Behind, has recently reversed her view, concluding that high-stakes testing has
eroded educational quality, restricted curriculum access and lowered expectations of our
students (Ravitch, 2010). She concluded,
At the present time, public education is in peril. Efforts to reform public education
are, ironically, diminishing its quality and endangering its very survival. We must
turn our attention to improving the schools, infusing them with the substance of
genuine learning and reviving the conditions that make learning possible (p. 242).
If we didn’t know any better, we might have guessed John Dewey made this statement.
High-stakes testing pressure has resulted in a view of learning as a product, a simple test score, rather than as a process. It has also narrowed our perspective of learners, reducing them to test scores rather than seeing them as social beings who are learning to negotiate complex social and academic environments (McCaslin, 1996, 2006). This commodification of
learning and learners as test scores or products fosters a specific kind of educational
context that has the unintended effect of rewarding cheating—a type of corruption
Campbell’s law predicted decades ago (1975).
Although student cheating is a familiar worry among educators and researchers
(Murdock & Anderman, 2006; Cizek, 1999; Hollinger & Lanza-Kaduce, 1996; Jensen et
al., 2002; McCabe & Trevino, 1997), only since the institution of high-stakes testing
systems have incidents of gaming become structural, visible and pernicious. Teachers and
administrators, faced with the potential deleterious consequences of not making AYP
(i.e., of students not scoring well on the test), behave in ways that will better ensure that
resultant test scores are as favorable as possible. Although gaming the system harms
everyone, marginalized populations are most at risk of enduring unintended negative
consequences of rational acts of cheating enacted by teachers and administrators.
School leaders must be more knowledgeable about how much they can truly trust
their students’ assessment data. This is especially true because of inherent validity
uncertainties associated with these data. The quandary for school leaders rests with what to do with these suspect data from which they are asked to make decisions. Do you
use the data to assess success or to push out students? School leaders are in a pivotal
position of having to negotiate tensions between policy mandates (top down pressure)
and the need to support hard working teachers and students (bottom up pressures).
Educational leaders are the fulcrum of high-stakes testing and accountability policy as they manage pressure from many directions. The question arises: What should school leaders do with the avalanche of data? Some school leaders approach the schooling of children with Deweyan ideals, while others are blinded by the drive to game their data to produce Enron-esque results. We argue that the former, rather than the latter, serves the purpose of education.
Future Research: Equity, Standards and Accountability
There are several areas of research that would provide important information on
the potential for standards and accountability to create more equity for districts and
schools. One important area for future research is how standards and accountability have
impacted how school districts and leaders go about the process of hiring and distributing
teachers. There is paucity in the research literature on whether No Child Left Behind has
remedied the historical inequality in the distribution of great teachers and leaders for lowperforming schools and students (Lipman, 2004).
It seems equally important to better understand the relationship between school
leaders’ experiences with high-stakes testing (i.e., as teachers) and their views of testing
later when they enter the schools as leaders. For example, Achinstein, Ogawa, and Speiglman (2004) found that the accountability movement interacts with the local conditions of specific schools in ways that end up socializing teachers into two distinct tracks. Are there two tracks of school leaders? In general, there seems to be one track of leaders who seek out and are relatively amenable to local conditions where instructional decision-making is out of one's hands (i.e., district-controlled) and another track of school leaders who prefer a greater degree of autonomy and independence. There are many potential explanations for this seemingly two-track system, which calls for more research to uncover the role that previous experiences with tests may play in influencing later professional orientations and decision-making.
Finally, the ultimate NCLB sanction levied after multiple years of failing to make
Adequate Yearly Progress (AYP) is total school restructuring/closure. Restructuring
policies under NCLB currently provide a broad framework for school districts to make
strategic changes in order to improve achievement, including school reconstitution or
turnaround. While there is a sizable amount of research on piecemeal school reform and restructuring efforts, there is very little coherent data or comprehensive research on the potential of the wholesale firing of school leaders and teachers for improving student achievement.
Conclusion
Despite a lack of empirical research demonstrating that standards and
accountability have accomplished equity for schools and districts, the current educational
policy discourse suggests that the solution to the persisting minority-majority
achievement gap, high African American and Latina/o dropout rates, and low graduation
rates is higher passing standards and even more high-stakes tests. Clearly, the structural
aims of the administrative progressives have overwhelmed the pedagogical progressives’
focus on practice in the classroom (Labaree, 2005). No Child Left Behind, the current
educational policy construct, has bypassed pedagogical progressivism by entrenching and codifying administrative progressive principles of management efficiency rooted in quantitatively measurable outcomes. Thus, the ideas derived from the administrative progressive era persist as the student-centered ideals of the pedagogical progressives wane. The
consequences are potentially dire, especially for poor African American and Latina/o
populations whose educational experiences are significantly compromised as the data and
literature we review suggest. The historical roots of the design and implementation of
high-stakes testing in modern educational policy are instructive for school leaders in the midst of accountability.
When it comes to positing new directions for new policy, it is important to
understand that the ways in which our educational problems are defined are of central
importance. In the fall of 2009, Secretary of Education Arne Duncan spoke to a
Washington-based education interest group, arguing for what was to become the Race to
the Top initiative. In the quest for more top-down policies, he posited, “It’s not enough to
define the problem. We’ve had that for 50 years. We need to find solutions—based on the
very best evidence and the very best ideas.” Mr. Duncan’s assumption that we somehow
know (and agree upon) the problems we face in improving America’s schools implies that
there is a single, identifiable solution. The problems with education are quite varied (too
many behavioral problems, underfunded schools, large class sizes), necessitating a more
nuanced set of solutions (more school counselors, more teachers, multilingual
environments). Thus, our educational problems are vast, varied, and necessitate a wide
spectrum of solutions that acknowledge complex interrelationships and educational
discourses (Weaver-Hightower, 2008).
Mr. Duncan’s supposed reliance on “evidence” as the pathway to our educational
solutions must also be viewed with skepticism. For administrators, it is an era of data or
evidence-based decision making; however, evidence born out of high-stakes testing contexts is inherently unreliable and invalid, as we review throughout this chapter. But it is equally important that administrators be suspicious of their own inherently biased interpretations of what that “evidence” represents. Spillane and Miele (2007, p. 48) put it
this way,
Contrary to many people’s intuitions, a single piece of information (e.g., “test
scores are down”) does not entail a single conclusion (e.g., “classroom instruction
is poor”). This is in part because information is always interpreted with respect to
a person or organization’s existing beliefs, values, and norms. These constructs
serve as the lens through which new information is understood and thus influence
how information gets shaped into evidence.
By considering the inadequacy of high-stakes testing and accountability for fomenting equity, this chapter seeks to push the field toward a new paradigm of standards and assessment as an ecology and to move beyond the uneasy dichotomy that currently pits assessment as a technical exercise involving the quantification of cognitive abilities against assessment as the humanistic endeavor of portraying learners' qualitative development (Falsgraf, 2008). Ultimately, if standards and high-stakes tests
do not provide a quality assessment of knowledge or cognitive ability as measured by college or workforce readiness—higher education and career success—then a more ecological approach is suggested: the development of a multiple-measures approach that entails broader subjective and objective assessments that can better predict long-term student success.
Returning to the initial framing of the genesis of high-stakes testing and
accountability, administrative progressives and pedagogical/student-focused progressives
represent an inherent tension in American education. The persistence of high-stakes
accountability flies in the face of empirical research: there is no evidence that such a
system improves children's education in low-performing schools over the long term
(especially as measured by graduation and dropout rates). Proponents of accountability
are adept at using the language of equity and other seemingly "progressive" concepts to
promote the current NCLB-inspired system. The ascendant administrative/positivistic
paradigm has co-opted the equity and student-centered language of pedagogical
progressivism and has, in a sense, silenced that debate by creating the appearance of
striving for an educational ideal even as that ideal is being undermined.
School leaders who are pressed by high-stakes testing and accountability may
welcome this chapter. However, those who support the rhetoric and hegemony of these
structures are unlikely to embrace our conclusion. In response, we close with a thought
proffered by Aronowitz and Giroux (1985, pp. 199-200):
…the debate about the reality and promise of U.S. education should be analyzed
not only on the strength of its stated assumptions but also on the nature of its
structured silences, that is, those issues which it has chosen to ignore or
deemphasize. Such an analysis is valuable because it provides the opportunity to
examine the basis of the public philosophy that has strongly influenced the
language of the debate and the issues it has chosen to legitimate.
Clearly, equity offers a reasoned critique of high-stakes testing and accountability.
References
Achinstein, B., Ogawa, R. T., & Speiglman, A. (2004). Are we creating separate and
unequal tracks of teachers? The effects of state policy, local conditions, and
teacher characteristics on new teacher socialization. American Educational
Research Journal, 41(3), 557-603.
Allington, R., & McGill-Franzen, A. (1992). Unintended effects of educational reform in
New York. Educational Policy, 6(4), 397–414.
Amrein, A. L., & Berliner, D. C. (2002). High-stakes testing, uncertainty, and student
learning. Education Policy Analysis Archives, 10(18). Retrieved June 5, 2007,
from http://epaa.asu.edu/epaa/v10n18/
Amrein-Beardsley, A., Berliner, D. C., & Rideau, S. (2010). Cheating in the first, second,
and third degree: Educators' responses to high-stakes testing. Education Policy
Analysis Archives, 18(14). Retrieved July 15, 2010, from
http://epaa.asu.edu/ojs/article/view/714
Anderson, J. D. (2004). The historical context for understanding the test score gap.
National Journal of Urban Education and Practice, 1(1), 1-21.
Aronowitz, S. & Giroux, H. A. (1985). Education under siege: The conservative, liberal
and radical debate over schooling. London: Bergin & Garvey Publishers Inc.
Berliner, D. C. (2009). Rational response to high-stakes testing and the special case of
narrowing the curriculum. Paper presented at the International Conference on
Redesigning Pedagogy, National Institute of Education, Nanyang Technological
University, Singapore.
Berliner, D. C., & Biddle, B. J. (1995). The manufactured crisis: Myths, fraud, and the
attack on America’s public schools. Reading, MA: Addison-Wesley Publishing.
Binet, A., & Simon, T. (1916). The development of intelligence in children. Baltimore:
Williams & Wilkins.
Blanton, C. K. (2004). The strange career of bilingual education in Texas, 1836-1981.
College Station: Texas A&M University Press.
Boake, C. (2002). From the Binet-Simon to the Wechsler-Bellevue: Tracing the history
of intelligence testing. Journal of Clinical and Experimental Neuropsychology, 24,
383-405.
Campbell, D. (1975). Assessing the impact of planned social change. In G. Lyons (Ed.),
Social research and public policies: The Dartmouth/OECD conference. Hanover,
NH: Public Affairs Center, Dartmouth College.
Carnoy, M., & Loeb, S. (2003). Does external accountability affect student outcomes? A
cross-state analysis. Educational Evaluation and Policy Analysis, 24, 305-331.
Carnoy, M., Loeb, S., & Smith, T. (2001). Do higher state test scores in Texas make for
better high school outcomes? (CPRE Research Report No. RR-047). Philadelphia,
PA: Consortium for Policy Research in Education. (ERIC ED478984)
Chapman, L. (2007). An update on No Child Left Behind and national trends in
education. Arts Education Policy Review, 109(1), 25-36.
Chapman, P. (1981). Schools as sorters: Testing and tracking in California, 1910-1925.
Journal of Social History, 14, 701-717.
Cizek, G. J. (1999). Cheating on tests: How to do it, detect it, and prevent it. Mahwah,
NJ: Erlbaum.
Cohen, D. (1996). Standards-based school reform: Policy, practice, and performance. In H. F.
Ladd (Ed.), Holding Schools Accountable. Washington, D.C: The Brookings Institution.
Cuban, L. (1988). Constancy and change in schools, 1880 to present. In P. Jackson (Ed.),
Contributing to educational change. Berkeley, CA: McCutchan.
Cullen, J., & Reback, R. (2006). Tinkering toward accolades: School gaming under a
performance accountability system. NBER Working Paper No. 12286.
Dewey, J. (1938). Experience and education. New York, NY: Macmillan.
Edmonson, A. (2003, September 21). Exams test educator integrity—emphasis on scores
can lead to cheating, teacher survey finds. The Commercial Appeal.
Falsgraf, C. (2008). The ecology of assessment. Language and Teaching, 42(4), 491-503.
Figlio, D. N. (2006). Testing, crime, and punishment. Journal of Public Economics,
90(4-5), 837-851.
Figlio, D. & Getzler, L. (2002). Accountability, ability and disability: Gaming the
system? National Bureau of Economic Research working paper 9307.
Figlio, D. & Winicki, J. (2004). Food for thought: The effects of school accountability
plans on school nutrition. Journal of Public Economics, 89, 381-394.
Gamson, D. (2007). Historical perspectives on democratic decision making in education:
Paradigms, paradoxes, and promises. In P. A. Moss (Ed.), Evidence and decision
making: The 106th Yearbook of the National Society for the Study of Education,
Part 1 (pp. 15-45). Malden, MA: Blackwell Publishing
Giordano, G. (2005). How testing came to dominate American schools: The history of
educational assessment. NY: Peter Lang.
Glass, G. V. (2003 revision). Standards and criteria redux. Retrieved June 5, 2007, from
http://glass.ed.asu.edu/gene/papers/standards/
Glass, G. V (2008). Fertilizers, pills, and magnetic strips: The fate of public education in
America. Charlotte, NC: Information Age Publishing.
Gould, S. J. (1996). The mismeasure of man. NY: WW Norton & Co.
Grissmer, D., & Flanagan, A. (1998). Exploring rapid achievement gains in North
Carolina and Texas: Lessons from the states. Washington, DC: National
Education Goals Panel.
Grubb, N. (1985). The initial effects of House Bill 72 on Texas public schools: The
challenges of equity and effectiveness. Austin: Lyndon B. Johnson School of
Public Affairs.
Haney, W. (2000). The myth of the Texas miracle in education. Education Policy
Analysis Archives, 8(41). Retrieved from http://epaa.asu.edu/epaa/v8n41/
Herman, J. L. and Haertel, E. H. (Eds.) (2005). Uses and Misuses of Data for
Educational Accountability and Improvement: The 104th Yearbook of the National
Society for the Study of Education, Part II. Malden, MA: Blackwell.
Hollinger, R. C., & Lanza-Kaduce, L. (1996). Academic dishonesty and the perceived
effectiveness of countermeasures: An empirical survey of cheating at a major
public university. NASPA Journal, 33, 292-306.
Jacob, B. (2002). Accountability, incentives and behavior: The impact of high-stakes
testing in the Chicago public schools. National Bureau of Economic Research
working paper 8968.
Jacob, B. (2005). Accountability, incentives and behavior: The impact of high-stakes
testing in the Chicago Public Schools. Journal of Public Economics, 89(5-6),
761-796.
Jacob, B. & Levitt, S. (2002a). Rotten apples: An investigation of the prevalence and
predictors of teacher cheating (Working Paper #9413). Cambridge, MA: National
Bureau of Economic Research. Retrieved June 5, 2007 from
http://www.nber.org/papers/w9413/
Jacob, B. & Levitt, S. (2002b). Catching cheating teachers: The results of an unusual
experiment in implementing theory (Working Paper #9414). Cambridge, MA:
National Bureau of Economic Research. Retrieved June 5, 2006, from
http://www.nber.org/papers/w9414/.
Jencks, C., & Phillips, M. (Eds.). (1998). The Black-White test score gap. Washington,
DC: Brookings Institution Press.
Jennings, J. & Beveridge, A. (2009). How does test exemption affect schools’ and
students’ academic performance? Educational Evaluation and Policy Analysis,
31(2), 153-175.
Jensen, L. A., Arnett, J. J., Feldman, S., & Cauffman, E. (2002). It’s wrong, but
everybody does it: Academic dishonesty among high school and college students.
Contemporary Educational Psychology, 27, 209-228.
Johnson, R. (2008, October). Texas public school attrition study, 2007-08: At current
pace, schools will lose many more generations. IDRA Newsletter. Retrieved from
http://www.idra.org/newsletterplus/October_2008/
Jones, B., & Egley, R. (2004). Voices from the frontlines: Teachers’ perceptions of
high-stakes testing. Education Policy Analysis Archives, 12(39). Retrieved June 5,
2007, from http://epaa.asu.edu/epaa/v12n39/
Jones, M. G., Jones, B., & Hargrove, T. (2003). The unintended consequences of
high-stakes testing. Lanham, MD: Rowman & Littlefield.
Klein, S. P., Hamilton, L. S., McCaffrey, D. F., & Stecher, M. B. (2000). What do test
scores in Texas tell us? (Issue Paper). Santa Monica, CA: RAND.
Kossan, P. (2005, May 13). State deems failing grades good enough to pass AIMS.
Arizona Republic. Retrieved June 5, 2007, from
http://www.azcentral.com/arizonarepublic/news/articles/0513scores13.html.
Labaree, D. F. (2005). Progressivism, schools and schools of education: An American
romance. Paedagogica Historica, 41, 275-288.
Linn, R. L., Graue, M. E., & Sanders, N. M. (1990). Comparing state and district test
results to national norms: The validity of claims that “everyone is above average.”
Educational Measurement: Issues and Practice, 9(3), 5-14.
Linton T. H. & Kester, D. (2003, March 14). Exploring the achievement gap between
white and minority students in Texas: A comparison of the 1996 and 2000 NAEP
and TAAS eighth grade mathematics test results, Education Policy Analysis
Archives, 11(10). Retrieved March 5, 2009 from
http://epaa.asu.edu/epaa/v11n10/
Lipman, P. (2004). High stakes education: Inequality, globalization and urban school
reform. NY: RoutledgeFalmer.
Losen, D., Orfield, G., & Balfanz, R. (2006). Confronting the graduation rate crisis in
Texas. Cambridge, MA: The Civil Rights Project at Harvard University.
Lynd, R. S., & Lynd, H. M. (1929). Middletown: A study in modern American culture.
New York: Harcourt, Brace & World.
Mazzeo, C. (2001). Frameworks of state: Assessment policy in historical perspective.
Teachers College Record, 103(3), 367-397.
McCabe, D. L., & Trevino, L. K. (1997). Individual and contextual influences on
academic dishonesty: A multicampus investigation. Research in Higher
Education, 38, 379-396.
McCaslin, M. (1996). The problem of problem representation: The Summit’s conception
of student. Educational Researcher, 25, 13-15.
McCaslin, M. (2006). Student motivational dynamics in the era of school reform.
Elementary School Journal, 106, 479-490.
McDonnell, L. (2005). Assessment and accountability from the policymaker’s
perspective. In J. L. Herman & E. H. Haertel (Eds.), Uses and Misuses of Data for
Educational Accountability and Improvement: The 104th Yearbook of the National
Society for the Study of Education, Part II, Malden, MA: Blackwell.
McNeil, L. (2000). Contradictions of school reform: Educational costs of standardized
testing. New York, NY: Routledge.
McNeil, L. (2005). Faking equity: High-stakes testing and the education of Latino
youth. In A. Valenzuela (Ed.), Leaving children behind: How “Texas-style”
accountability fails Latino youth. Albany, NY: State University of New York
Press.
McNeil, L. M., Coppola, E., Radigan, J., & Vasquez Heilig, J. (2008). Avoidable losses:
High-stakes accountability and the dropout crisis. Education Policy Analysis
Archives, 16(3). Retrieved June 20, 2009, from http://epaa.asu.edu/epaa/v16n3/
McNeil, L., & Valenzuela, A. (2001). The harmful impact of the TAAS system of testing
in Texas: Beneath the accountability rhetoric. In M. Kornhaber & G. Orfield
(Eds.), Raising standards or raising barriers? Inequality and high-stakes testing
in public education (pp. 127-150). NY: Century Foundation Press.
Mellon, E. (2010, June 7). Qualms arise over TAKS standards. The Houston Chronicle.
Moss, P. A. (2007) (Ed.). Evidence and decision making: The 106th Yearbook of the
National Society for the Study of Education, Part 1. Malden, MA: Blackwell
Publishing.
Murdock, T. B., & Anderman, E. M. (2006). Motivational perspective on student
cheating: Toward an integrated model of academic dishonesty. Educational
Psychologist, 41(3), 129-145.
National Commission on Excellence in Education. (1983). A nation at risk: The
imperative for educational reform. Washington, DC: U.S. Department of
Education.
Nichols, S., & Berliner, D. C. (2007a). Collateral damage: How high-stakes testing
corrupts America’s schools. Cambridge, MA: Harvard Educational Press.
Nichols, S., & Berliner, D. C. (2007b). The pressure to cheat in a high-stakes testing
environment. In E. M. Anderman & T. B. Murdock (Eds.), Psychology of
academic cheating (pp. 289-312). NY: Elsevier.
Nichols, S. L., Glass, G. V, & Berliner, D. C. (2006). High-stakes testing and student
achievement: Does accountability pressure increase student learning? Education
Policy Analysis Archives, 14(1). Retrieved June 30, 2009, from
http://epaa.asu.edu/epaa/v14n1/.
Nichols, S. L., & Good, T. (2004). America’s teenagers—Myths and realities: Media
images, schooling, and the social costs of careless indifference. Mahwah, NJ:
Erlbaum.
Orel, S. (2003). Left behind in Birmingham: 522 pushed-out students. In R. Cossett Lent
& G. Pipkin (Eds.), Silent no more: Voices of courage in American schools.
Portsmouth, NH: Heinemann.
Orfield, G. & Kornhaber, M. L. (Eds.). (2001). Raising standards or raising barriers?
Inequality and high stakes testing in public education. New York: The Century
Foundation Press.
Orfield, G., Losen, D., Wald, J., & Swanson, C. B. (2004). Losing our future: How
minority youth are being left behind by the graduation rate crisis. Cambridge,
MA: The Civil Rights Project at Harvard University.
Pedulla, J., Abrams, L., Madaus, G., Russell, M., Ramos, M., & Miao, J. (2003).
Perceived effects of state-mandated testing programs on teaching and learning:
Findings
from a national survey of teachers. Boston: National Board on Educational
Testing and Public Policy.
Perlstein, L. (2007). Tested: One American school struggles to make the grade. NY:
Henry Holt & Co.
Ravitch, D. (2010). The death and life of the great American school system: How testing
and choice are undermining education. NY: Basic Books.
Ryan, J. (2004). The perverse incentives of the No Child Left Behind Act. New York
University Law Review, 79, 932-989.
Rumberger, R., & Arellano Anguiano, B. (2004). Understanding and addressing the
California Latino achievement gap in early elementary school. UC Linguistic
Minority Research Institute Working Paper. Accessed at http://lmri.ucsb.edu.
Sacks, P. (1999). Standardized minds: The high price of America’s testing culture and
what we can do to change it. Cambridge, MA: Perseus Publishing.
Samelson, F. (1977). World War I intelligence testing and the development of
psychology. Journal of the History of the Behavioral Sciences, 13(3), 274-282.
Scheurich, J. J., Skrla, L., & Johnson, J. F. (2003). Thinking carefully about equity and
accountability. In L. Skrla, & J. J. Scheurich (Eds.), Educational equity and
accountability: Paradigms, policies and politics. NY: Routledge.
Shepard, L. A. (1990). Inflated test score gains: Is the problem old norms or teaching
the test? Educational Measurement: Issues and Practice, 9(3), 15-22.
Smith, M., & O’Day, J. (1991). Systemic school reform. In S. Fuhrman & B. Malen (Eds.), The
Politics of Curriculum and Testing. NY: Falmer.
Southern Education Foundation. (2010). A new diverse majority. Atlanta, GA: Author.
Spillane, J. & Miele, D. (2007) Evidence in Practice: A Framing of the Terrain. In P.
Moss (Ed.), Evidence and decision making: The 106th Yearbook of the National
Society for the Study of Education, Part 1 (pp. 46-73). Malden, MA: Blackwell
Publishing.
Starch, D., & Elliott, E. C. (1912). Reliability of the grading of high-school work in
English. School Review, 20, 442-457.
Stutz, E. (2001, June 9). Bar for passing TAAS lowered; cutoff scores for math test
debated. The Dallas Morning News.
Texas Comptroller of Public Accounts. (2001). Special report: Undocumented
immigrants in Texas, a financial analysis of the impact to the state budget and
economy (Publication No. 96-1224). Retrieved June 1, 2009, from
http://www.window.state.tx.us/specialrpt/
undocumented/undocumented.pdf
Texas Education Agency. (1998). Grade-level retention in Texas public schools, 1996–97
(Document No. GE08 601 07). Austin, TX: Author.
Texas Education Agency. (1999). 1996–97 report on high school completion rates.
Retrieved June 16, 2009, from
http://ritter.tea.state.tx.us/research/pdfs/9697comp.pdf
Texas Education Agency. (2000). Academic Excellence Indicator System reports, SY
2000–01. Retrieved June 1, 2009, from http://ritter.tea.state.tx.us/perfreport/aeis/
2000/state.html
Texas Education Agency. (2001). Secondary school completion and dropouts in Texas
public schools 1999–00. Retrieved June 15, 2009, from
http://ritter.tea.state.tx.us/research/
pdfs/9900drpt.pdf
Texas Education Agency. (2002). Three-year follow-up of a Texas public high school
cohort. Retrieved June 11, 2009, from
http://ritter.tea.state.tx.us/research/pdfs/wp06.pdf
Texas Education Agency. (2003a). Secondary school completion and dropouts in Texas
public schools 2000–01. Retrieved June 15, 2009, from
http://ritter.tea.state.tx.us/research/pdfs/0001drpt_reprint.pdf
Texas Education Agency. (2003b). Statewide TAAS results—Percent passing tables
Spring 1994–Spring 2002, Grade 10, Reading, Mathematics, Writing. Retrieved
June 11, 2009, from http://ritter.tea.state.tx.us/student.assessment/reporting/
results/swresults/august/
g10all_au.pdf
Texas Education Agency. (2004a). Academic Excellence Indicator System: 2003–2004.
Austin, TX: Author.
Texas Education Agency. (2004b). Secondary school completion and dropouts in Texas
public schools 2001–02. Retrieved June 15, 2009, from
http://ritter.tea.state.tx.us/research/
pdfs/0102drpt.pdf.
Texas Education Agency. (2005). Secondary school completion and dropouts in Texas
public schools 2002–03. Retrieved June 15, 2009, from
http://ritter.tea.state.tx.us/research/
pdfs/dropcomp_2002-03.pdf
Texas Education Agency. (2006). Secondary school completion and dropouts in Texas
public schools 2003–04. Retrieved June 15, 2009, from
http://ritter.tea.state.tx.us/research/
pdfs/dropcomp_2003-04.pdf.
Texas Education Agency. (2007). Secondary school completion and dropouts in Texas
public schools 2005–06. Retrieved June 15, 2009, from
http://ritter.tea.state.tx.us/research/pdfs/dropcomp_2005-06.pdf
Texas Education Agency. (2008a). Grade-level retention in Texas public schools, 2006–
07 (Document No. GE09 601 01). Austin, TX: Author.
Texas Education Agency. (2008b). Secondary school completion and dropouts in Texas
public schools 2006–07. Retrieved June 15, 2009, from
http://ritter.tea.state.tx.us/research/
pdfs/dropcomp_2006-07.pdf
Texas Education Agency. (2009). Statewide TAKS performance results. Retrieved
January 27, 2010, from
http://www.tea.state.tx.us/index3.aspx?id=3220&menu_id3=793
Texas Senate Bill 7, 73rd Texas Legislature, Education Code § 16.007 (1993).
Thorndike, E. L. (1923). Education: A first book. NY: Macmillan.
Timmer, A., & Williamson, J. (1998). Immigration policy prior to the 1930s: Labor
markets, policy interactions, and global backlash. Population and Development
Review, 24(2), 739-771.
Toenjes, L. A., & Dworkin, A. G. (2002). Are increasing test scores in Texas really a
myth, or is Haney's myth a myth? Education Policy Analysis Archives, 10(17).
Available from http://epaa.asu.edu/epaa/v10n17/
Tyack, D. (1974). The one best system: A history of American urban education.
Cambridge, MA: Harvard University Press.
Tyack, D. (1976). Ways of seeing: An essay on the history of compulsory schooling.
Harvard Educational Review, 46, 355-389.
Tyack, D., & Cuban, L. (1995). Tinkering toward utopia: A century of public school
reform. Cambridge, MA: Harvard University Press.
Tyack, D., James, T., & Benavot, A. (1987). Law and the shaping of public education,
1785-1954. Madison: University of Wisconsin Press.
U.S. Census Bureau. (2001). The Hispanic population reports: Census 2000 brief.
Washington, DC: U.S. Department of Commerce, Economics and Statistics
Administration.
U.S. Census Bureau. (2006). U.S. population estimates by age, sex, race, and Hispanic
origin: July 1, 2005. Washington, DC: U.S. Department of Commerce,
Economics and Statistics Administration.
U.S. Census Bureau. (2008). An Older and More Diverse Nation by Midcentury. U.S.
Census Bureau News, Press Release CB08-123. Washington, DC: Author.
U.S. Census Bureau. (2009). School enrollment--Social and economic characteristics
of students: October 2008. Retrieved November 9, 2009, from
http://www.census.gov/population/www/socdemo/school/cps2008.html
Valencia, R. R. & Bernal, E. M. (2000). An overview of conflicting opinions in the
TAAS case. Hispanic Journal of Behavioral Sciences, 22(4), 423-443.
Vasquez Heilig, J. (2011). Understanding the interaction between high-stakes graduation
tests and English language learners. Teachers College Record, 113(12).
Vasquez Heilig, J., Cole, H., & Aguilar, A. (2010). From Dewey to No Child Left
Behind: The evolution and devolution of public arts education. Arts Education
Policy Review, 111(4), 136-145.
Vasquez Heilig, J., & Darling-Hammond, L. (2008). Accountability Texas-style: The
progress and learning of urban minority students in a high-stakes testing context.
Educational Evaluation and Policy Analysis, 30(2), 75-110.
Vasquez Heilig, J., Dietz, L. & Volonnino, M. (2011). From Jim Crow to the Top 10%
Plan: A historical analysis of Latino access to a selective flagship university.
Enrollment Management Journal: Student Access, Finance, and Success in
Higher Education, in press.
Vasquez Heilig, J., Williams, A., & Young, M. (2011). At-risk student averse: Risk
management and accountability. Working paper, University of Texas at Austin.
Vygotsky, L. S. (1978). Mind in society. Cambridge, MA: Harvard University Press.
(Edited by M. Cole, V. John-Steiner, S. Scribner, & E. Souberman)
Weaver-Hightower, M. B. (2008). An ecology metaphor for educational policy analysis:
A call to complexity. Educational Researcher, 37(3), 153-167.
Young, M. D., & Brewer, C. (2008). Fear and the preparation of school leaders: The
role of ambiguity, anxiety and power in meaning making. Educational Policy,
22(1), 106-129.
Zastrow, C. & Janc, H. (2004). Academic atrophy: The condition of the liberal arts in
America’s public schools. Washington DC: Council for Basic Education.
Figure 1. Texas Assessment of Academic Skills (TAAS) Exit Math: Percent meeting
minimum standards (1994–2002). Source: Statewide TAAS Results, by the Texas
Education Agency, 2003b.
Figure 2. Texas Assessment of Academic Skills (TAAS) Exit Reading: Percent meeting
minimum standards (1994–2002). Source: Statewide TAAS Results, by the Texas
Education Agency, 2003b.
Figure 3. Texas Assessment of Knowledge and Skills (TAKS) Exit Math: Percent
meeting minimum standards (2003–2009). Source: Statewide TAKS Performance Results,
by the Texas Education Agency, 2009.
Figure 4. Texas Assessment of Knowledge and Skills (TAKS) Exit English Language
Arts: Percent meeting minimum standards (2003–2009). Source: Statewide TAKS
Performance Results, by the Texas Education Agency, 2009.
Figure 5. Dropout rates (1995–2008). Source: Secondary school completion and dropout
data from the Texas Education Agency.vi
Figure 6. Cohort dropout rates (1998–2008). Source: Secondary school completion and
dropout data from the Texas Education Agency.vii
Figure 7. Cohort graduation rates (1996–2008). Source: Secondary school completion
and dropout data from the Texas Education Agency.viii
Figure 8. Texas public school attrition study, 2008–09. Source: Intercultural
Development Research Association (IDRA),
http://www.idra.org/IDRA_Newsletter/October_2009_School_Holding_Power/Texa
s_Public_School_Attrition_Study_2008_09/
Figure 9. Achieve, Inc. (2005). NAEP vs. state proficiency 2005. Retrieved June 5,
2007, from http://www.achieve.org/node/482
i
Although Asian and Latin American immigration increased steadily through much of
the 19th century and the start of the 20th century, these regions still contributed
substantially fewer newcomers than Europe during this time period.
ii
Southern Education Foundation (2010) relates “Six of these states were in the South and
five, including Hawaii, were in the West. Nine of the ten states in the continental US
were at or near the nation’s southern border. Latina/os represented almost nine out of
every 10 non-White students in the West, where there was also a higher percentage of
Asian-Pacific students (9 percent) than African American students (6 percent). African
Americans were not the largest non-White student group in any of the Western states.”
iii
As Mazzeo (2001) points out, other states and regions adopted various forms of
high-stakes testing (e.g., New York). We focus on Texas because of its history of
experimenting with accountability-based testing and because of our familiarity with
resultant data.
iv
The most recent publicly available data at the time of writing were utilized in the
research.
v
To understand student leavers, a cohort method is more desirable than the yearly
snapshot as it considers what happens to a group of students over time and is based on
repeated measures of each cohort to reveal how students progress in school. The cohort
method is more accurate than the yearly snapshot dropout rate that TEA historically has
highlighted in the public sphere.
vi
Data from TEA, 2001, 2003a, 2004b, 2005, 2006, 2007, 2008b.
vii
In 1999, TEA first reported longitudinal EL dropout rates for students who left the
system between Grades 9 and 12 for reasons other than obtaining a GED certificate or
graduation (TEA, 2002). Data from TEA, 2001, 2003a, 2004b, 2005, 2006, 2007, 2008b.
viii
Data from TEA, 2001, 2003a, 2004b, 2005, 2006, 2007, 2008b.