Standardized Test Bias and ELL
Standardized Test Bias Against
English Language Learners: What it is and how to reduce it
Lories Slockbower
TBED 542: Multiculturalism and Acculturation
Dr. Gladys Scott
August 13, 2009
Researchers have studied evidence of bias in standardized tests only to conclude that tests which are well designed and appropriately normed show no bias (Sattler, 1992; Valdes & Figueroa, 1996). After extensive examination of factors such as item content, sequence, structure, difficulty, factor solutions, and predictions, researchers found no grounds to deem such tests unreliable (Neisser et al., 1996; Sattler, 1992). However, this paper reviews literature that questions the validity of standardized tests when used to assess students who are non-native speakers of English and have been raised in a culture different from the American norm. The research asserts that bias will occur in any test of intelligence, ability, or achievement that was developed and normed in the United States and given to students who are culturally and linguistically diverse (Rhodes, Ochoa, & Ortiz, 2005). Considering that immigrant families represent 20 percent of the student population in the United States, and that English Language Learners (ELLs) score well below their native English-speaking peers, researchers must look for ways to improve standardized test outcomes for culturally and linguistically diverse students (Dorner, Orellana, & Li-Grining, 2007).
The literature reviewed in this paper examined the achievement gap in standardized test scores between linguistically and culturally diverse students and their native English-speaking peers, determined what contributes to the gap, and suggested ways that schools can narrow it. Studies found two major shortcomings of standardized tests in relation to students who are culturally or linguistically diverse: their cultures (every test projects the culture of its creator) and their norms. The studies also recommend ways in which standardized tests can be modified to account for the needs of students who are not native English speakers, as well as factors that boost the scores of ELLs on these tests.
Studies Reflecting Cultural Differences of Test Takers
All assessments of intelligence and cognitive ability reflect the culture (values, beliefs, ideals) of their creators; therefore, performance depends on having learned the rules of a society (Rhodes, Ochoa, & Ortiz, 2005). Standardized tests rest directly on the principle of the "assumption of comparability," which means students are compared to a set of norms to determine their standing, and it is assumed those students are similar to those on whom the test was standardized. To be valid, the comparison should involve students who are at the same level of acculturation (Salvia & Ysseldyke, 1991). Because the norms may be inappropriate, cultural bias occurs. In a book that synthesizes established research and theory on the topic for practitioners, Rhodes, Ochoa, and Ortiz contend that students who do not have the opportunity to become acculturated at the same pace as their peers are likely to score lower because they lack the knowledge and content, not because they are less able (Rhodes, Ochoa, & Ortiz, 2005).
Studies on ELLs’ Performance on New York Regents Exams
Three studies looked at different aspects of the achievement gap on the New York Regents high school exit exams, which have become part of New York's means of satisfying the requirements of the No Child Left Behind Act (2002). These high-stakes standardized tests play a central role in assessing student achievement, instructional methods, and school quality. New York is one of 21 states now using high school exit exams for all students, including special needs, culturally diverse, and ELL students, to meet NCLB's requirements. Because ELLs do not perform as well as their native English-speaking peers, the tests pose a significant challenge in education (Dong, 2004).
Only 50% of ELL students in New York City schools passed the Regents in 2003 (Dong, 2004). Supporting the findings of Rhodes, Ochoa, and Ortiz in regard to standardized tests, Dong attributed the low scores not only to an obvious lack of English proficiency but also to the fact that these students do not share the same cultural experiences as mainstream American students. For example, Dong found that Asian and European students who come to the U.S. have a different approach to test-taking. They may not be used to multiple-choice tests, unlike native English speakers who have been trained in that format. In addition to the language and reading challenge posed by the questions, the tests frequently demand inferences that ELLs have difficulty making. Her findings called for reforming essay tests to build cross-cultural understanding into test design and grading (Dong, 2004). Dong referenced the supporting research of Mohan (1986), which compared and contrasted the levels of inference demanded of the reader on the New York Regents. Such semantic inferences, Mohan concluded, were testing cultural rather than content knowledge, thereby putting ELLs at a distinct disadvantage.
Dong's recommendations called for greater awareness of student diversity when formatting the test, development of language skills outside the classroom, use of assessments that go beyond the traditional measures of intelligence, and use of a more precise measure of acculturation and English proficiency in interpreting the results. She supported several accommodations for ELLs taking standardized tests like the Regents, which were also proposed by Butler and Stevens (2001). These include modifying the text, assessing students' content knowledge in their native language, rephrasing questions to reduce linguistic complexity, providing cultural notes and glossaries, simplifying directions, and reducing cultural bias.
According to Dong, multilingual versions of the Regents were developed for ELL students, though not in every language. She also recommended standardized test reforms on three levels: involve ESL/bilingual professionals in the test design to review for cultural bias and language, build language awareness into daily lesson plans and classroom assessment for teachers who work with ELL students daily, and use performance assessment techniques to better evaluate ELL students (Dong, 2004). Including the input of bilingual teachers would seem an obvious step, and such input has a strong presence in Menken's study.
While the above-mentioned studies look at the format and structure of the Regents, the research of Kate Menken of the City University of New York analyzed how a test-driven curriculum for the Regents affected scores on those tests and denied ELLs a sufficient bilingual education. Considering that ELLs across the United States are now being included in statewide assessments, her findings have broad implications. In New York City, ELLs make up 13.8% of the public school population. In 2005, only 33.2% of ELLs passed the English Regents, compared to 80.7% of all students. The ELL passage rate was 58.1% for the Math Regents, compared to 81.5% of the general population (Menken, 2006). Since 30% of ELLs drop out of New York City schools, the highest rate of any student group, Menken considered her research important for understanding how to properly educate ELLs.
In her study of ten New York City high schools in 2005, she found that the Regents were really language proficiency exams, not measurements of content knowledge. Her research determined that the schools tried to raise ELLs' scores by changing their language policies and "teaching to the test." She argued that such an approach promoted monolingual instruction and deprived ELLs of the true language arts curriculum their peers received.
While Dong's research looked at the content of the test, Menken's study sought to determine how high-stakes tests have changed the learning experience for ELLs and to understand the language policy implications of the assessment's focus. Dong conducted her research through data sampling, while Menken's information was acquired through interviews; observations; state, district, and school policy documents; standardized test scores; and graduation, promotion/retention, and dropout data. Researchers interviewed 128 participants in ten schools, including New York City high school teachers, administrators, and ELL students. Like Dong and Rhodes, Ochoa, and Ortiz, Menken concluded that the Regents relied heavily on language proficiency, including the math portion. Therefore, to raise the scores of ELLs, schools increased the amount of English instruction instead of providing a strong bilingual program (Menken, 2006).
Among the ten schools, administrators took different approaches to deal with the pressure to produce positive adequate yearly progress reports. For example, School 4 required its 606 ELLs to take a daily double-period English Regents preparation course and a Saturday program in addition to an extended school schedule, often 12 periods, compared to their English-proficient peers, who attended eight periods a day. All students, including the ELLs, were required to receive a score of at least 65 on the English Regents, 10 points more than the actual statewide passing score (Menken, 2006). In essence, this approach deviated from the strong bilingual program which New York City schools had in use. Additionally, School 4 placed ESL students who had just arrived in the U.S. into advanced English Regents preparation courses before they learned English language fundamentals (Menken, 2006).
While School 4 increased English language instruction, School 1 preserved native language instruction. The majority of its ELLs were Spanish speakers, all of whom received ESL instruction. Spanish-speaking students also received bilingual classes in math, science, and social studies. When the teachers realized that the skills required on the Advanced Placement Spanish exam and in the national curriculum for the AP course were similar to those required by the English Regents, they required Latino ELLs to take Spanish as a Native Language at the lower levels and Advanced Placement Spanish at the more advanced levels. In addition, the school offered an English Regents preparation course in Spanish. This approach was so successful, increasing pass rates by 50 percent, that it was implemented at other schools as well. Menken reasons that School 1's approach supports bilingual education research showing that developing literacy in students' first language helps them develop literacy in their second language, as content knowledge transfers from the first language (Menken, 2006).
The study's shortcoming is that it did not report the results of the other nine New York City schools, an obvious omission given that those schools took the increased-English approach. The research described the approaches schools took but lacked a connection between each approach and the students' performance on the Regents. If the bilingual approach boosted scores by 50% at School 1, what was the outcome of the monolingual approach? To make a viable comparison, one needs more data. Also, what were the scores of all the foreign language students? While Spanish speakers in the one high school could take AP Spanish, what could be offered to native speakers of other languages, whether European, Middle Eastern, or Asian?
One strength of this study, however, derives from its interviews of teachers and students, which gave a clear understanding of the test-driven curriculum's impact on learning. Pages of interviews relate their frustrations. The ELLs' responses were the most disturbing: they told researchers that the push to raise scores had narrowed the curriculum so much that they did not think they were being prepared for college. Because they were learning only the content on the test, there was no room for projects and in-depth study of certain topics. The study concluded that language policies in school must be carefully planned and decided upon by teachers, administrators, and the community to meet the needs of the students, not determined by high-stakes testing (Menken, 2006). This concurred with Rhodes, Ochoa, Ortiz, and Dong, who expressed the need for bilingual educators to help create the tests and correlating curriculum.
Socio-Economic Factors Affect Standardized Testing Results
In contrast, another study focusing on Latino students, specifically Mexican American students, examined standardized math test scores using an integrated model that viewed standardized test performance as the result of situational and cultural factors at the individual, family, peer, and institutional levels (Morales & Saenz, 2007). Rather than survey students and conduct interviews of teachers and students as Menken's study did, the researchers examined 12 hypotheses using data from a series of math comprehension tests administered by the National Center for Education Statistics (NCES). The probability sample provided a nationally representative sample of schools and high school seniors. For this study, the final sample included 490 Mexican-origin students, which seemed rather small considering the comparison to 7,690 White students. The study concluded that Mexican-origin students scored significantly lower (by 10 points) on the math test than did Whites, a notable difference given that the test has only 81 possible points (Morales & Saenz, 2007). Because the sample considered seniors only, the study noted that the gap could have been much larger had it sampled students in earlier grades.
To explain the gap, the researchers introduced the different variables and found that 67% of the gap in scores on cognitive math tests was due to differences in SES between Mexican-origin and White students (Morales & Saenz, 2007). Also of note was the effect of generational status, since the study questioned the claim that Mexican immigrants hinder the success of the Mexican-origin population. When looking at the ethnic gap in cognitive test scores, generational status slightly widens the gap: first-generation students scored about four points above second-generation students.
This study considered factors such as students' study habits, gender, family background and home language, peer pressure, and positive school experiences, which were not included in Menken's research. Such factors would help explain the performance of ELLs on the Regents. Neither study considered the impact of the students' communities. Had the studies factored in variables regarding the students' communities and neighborhoods, explanations for the achievement gap might have been better defined. For example, the studies did not correlate the community's crime rate, cultural segregation, and poverty level with academic achievement. Unlike the Menken study, the research of Morales and Saenz did not acquire student feedback regarding test preparation classes for the math test, nor did it establish whether such courses were even part of the curriculum. If the students were getting test preparation classes, were they bilingual or monolingual, as in the New York City schools? Such factors might have helped determine how the school could offset the negative impact of Mexican-origin students' lower SES compared to their White peers.
Language Brokering May Increase Standardized Test Performance for ELLs
Sociologists have conducted many studies on the translating and interpreting work of immigrant children, most of which analyzed how that work related to the children's development (Orellana, Reynolds, Dorner, & Meza, 2003). However, the study in this literature review examined the practice more closely to see how language brokering cultivated linguistic, mathematical, and social-cultural aptitude, which in turn increased these students' performance on standardized tests in math and reading comprehension (Dorner, Orellana, & Li-Grining, 2007). It tested the hypothesis that language brokering is related to academic outcomes and called for further mixed-method studies on the topic. The study sought to answer two questions: What is the scope of children's experiences with language brokering in a particular Chicago immigrant community, and is there a connection between this practice of immigrant households and students' performance at school?
Because the study looked at students' scores over a period of five years, it was possible to better control for their early academic achievement (Dorner, Orellana, & Li-Grining, 2007). Since this longitudinal study controlled for children's gender, exposure to bilingual education, and generational status, it could be more confident in its findings. Earlier studies lacked these controls, and the other studies reviewed in this paper also lacked this perspective.
This 2001 research study was based at the Regan Elementary School in Chicago, where 90% of the students were low-income, 40% were limited English proficient, and 75% were Hispanic, most of whom were Mexican. Researchers surveyed the ten fifth- and sixth-grade mainstream and bilingual Spanish classes, posing questions about the children's preferred language, their lives, and their experiences with translating, interpreting, reading, writing, and technology. Of 313 children, 280 responded, or about 89% of those surveyed. About half were girls. As expected, 90% of the first- and second-generation children said they translated for other people in everyday ways. Researchers created three categories of student activity: active, partial, and non-language brokers. To measure academic outcomes, they used scores on the standardized math and reading tests administered as the Iowa Tests of Basic Skills (Dorner, Orellana, & Li-Grining, 2007).
The results showed that 35 percent of the students were active language brokers, most of whom spoke Spanish at home and had some bilingual education. Thirty-four percent were partial language brokers, and 53 percent were not language brokers. Significantly, by grade five, the active brokers scored an average of eight points higher than both other groups in reading and math (Dorner, Orellana, & Li-Grining, 2007). The researchers concluded that language brokering was positively related to the students' standardized test scores in reading comprehension. The study noted that not all students who broker would reap higher scores, because some students find the activity stressful (Dorner, Orellana, & Li-Grining, 2007). The positive results appear to develop among students who are active language brokers, so educators and parents must determine how to enhance those skills for application in the classroom.
The study recommended that educators and curriculum developers continue research to
determine if training bilingual students who are not active translators would result in higher
academic performance.
Conclusion
It can be concluded from the five studies in this literature review that standardized testing poses significant biases against English Language Learners in both format and content, which calls into question the validity of such assessments. These tests are normed for American students whose primary language is English. They continue to measure content knowledge inadequately in an ELL's native language and to phrase questions and directions in a complex manner. When test preparation classes are conducted in English only, districts compromise their bilingual programs. To resolve these inequities, researchers urge that ESL/bilingual professionals be involved in test design to review for cultural bias and language, and that performance assessment techniques be developed to evaluate ELL students rather than relying on tests alone. Longitudinal research is lacking to determine the most effective classroom approach for standardized test preparation and to learn the factors, aside from socio-economic status, that most affect the success of ELLs on standardized tests. More studies must be conducted to learn how to assess cultural proficiency more finely, just as practitioners now measure language proficiency. While many studies look at Spanish-speaking students in regard to testing, there is a gap in research regarding students of other native languages; studies must examine any test bias toward them as well.
References
Butler, F., & Stevens, R. (2001). Standardized assessment of the content knowledge of English language learners K-12: Current trends and old dilemmas. Language Testing, 18(4), 409-427.
Dong, Y. R. (2004). Assessing and evaluating ELL students in mainstream classes. In T. A. Osborn (Ed.), Teaching Language and Content to Diverse Students (pp. 39-65). Greenwich, CT: Information Age Publishing.
Dorner, L., Orellana, M., & Li-Grining, C. (2007). “I helped my mom,” and it helped me:
Translating the skills of language brokers into improved standardized test scores.
American Journal of Education, 113, 451-478.
Menken, K. (2006). Teaching to the test: How No Child Left Behind impacts language policy, curriculum, and instruction for English Language Learners. Bilingual Research Journal, 30(2), 521-546.
Mohan, B. (1986). What are we really testing? In B. Mohan, Language and Content (pp. 122-135). Reading, MA: Addison-Wesley.
Morales, M., & Saenz, R. (2007). Correlates of Mexican American students' standardized test scores: An integrated model approach. Hispanic Journal of Behavioral Sciences, 2, 349-365.
Neisser, U., Boodoo, G., Bouchard, T., Boykin, A., Brody, N., Ceci, S.J., et al. (1996).
Intelligence: Knowns and unknowns. American Psychologist, 51, 77-101.
No Child Left Behind Act. (2002). Pub. L. No. 107-110.
Orellana, M., Reynolds, J., Dorner, L., & Meza, M. (2003). In other words: Translating or "paraphrasing" as a family literacy practice in immigrant households. Reading Research Quarterly, 38(1), 12-34.
Rhodes, R., Ochoa, S., & Ortiz, S. (2005). Acculturational factors in psychoeducational assessment. In K. W. Merrell (Ed.), Assessing Culturally and Linguistically Diverse Students (pp. 124-135). New York, NY: The Guilford Press.
Sattler, J. (1992). Assessment of children (rev. 3rd ed.). San Diego, CA: Jerome M. Sattler, Publisher.
Salvia, J., & Ysseldyke, J. E. (1991). Assessment (5th ed.). New York: Houghton Mifflin.
Valdes, G., & Figueroa, R. A. (1996). Bilingualism and testing: A special case of bias. Norwood, NJ: Ablex.