Oral Reading Fluency Benchmark Procedures and Considerations:
Final Report for Save the Children
By: Dr. Mónika Lauren Mattos
Teachers College, Columbia University
TABLE OF CONTENTS
List of Figures .................................................................................................................... iii
List of Tables ..................................................................................................................... iii
List of Acronyms ............................................................................................................... iv
Glossary ............................................................................................................................. vi
Introductory Summary ........................................................................................................ 1
Literature Review................................................................................................................ 1
Sample Benchmarking Procedures ................................................................................. 1
Tools Used to Measure Fluency ................................................................................... 12
Tools Used to Assess Fluency and Comprehension in Alphasyllabic Scripts .............. 17
Bangla Fluency Data Trends......................................................................................... 33
The Relationship between Fluency and Comprehension .................................................. 45
Ethiopia: A Sample Oral Reading Fluency Benchmark Study ......................................... 56
Considerations for Future Fluency Benchmark Studies in Bangladesh............................ 61
Language Learning Context.......................................................................................... 61
Competencies ................................................................................................................ 63
Recommendations ............................................................................................................. 64
Guidelines in the Benchmark Making Process ............................................................. 64
Align the Benchmark Tool to the External Criterion Measure ..................................... 65
Select Regions............................................................................................................... 66
Develop the Benchmark Tool ....................................................................................... 66
Train Assessors and Pilot the Benchmark Tool ............................................................ 68
Sampling and Data Collection ...................................................................................... 68
Conduct Workshops to Interpret and Discuss the Findings .......................................... 70
Conduct Workshops to Develop Proposed Benchmarks .............................................. 70
The Way Forward: Advocacy, Mobilization, and Collaboration ................................. 71
Conclusion ........................................................................................................................ 72
References ......................................................................................................................... 75
Appendix A: Sample Decodable Reader in Bangla .......................................................... 79
List of Figures
Figure 1. A Network of Processing Systems for Reading .................................................. 51
List of Tables
Table 1. Six Dimensions of Fluency ................................................................................. 48
Table 2. Sample Timeline for Benchmark Making Process .............................................. 65
List of Acronyms
ASER: Annual Status of Education Report
AUC: Area Under the Curve
BRAC: Bangladesh Rural Advancement Committee
CAMPE: Campaign for Popular Education
CIES: Comparative International Education Society
CLS: Correct Letter Sounds
DCS: DIBELS Composite Score
DIBELS: Dynamic Indicators of Basic Early Literacy Skills
DMG: Dynamic Measurement Group
DORF: DIBELS Oral Reading Fluency
DPE: Directorate of Primary Education
EGRA: Early Grade Reading Assessment
FSF: First Sound Fluency
GRADE: Group Reading Assessment and Diagnostic Evaluation
IAT: Instructional Adjustment Tools
IER: Institute for Education Research
LAB: Language Acquisition Battery
LB: Literacy Boost
LNF: Letter Name Fluency
MOE: Ministry of Education
MoPME: Ministry of Primary and Mass Education
NCTB: National Curriculum and Textbook Board
NSA: National Student Assessment
NWF: Nonsense Word Fluency
PAL: Programs for Assisted Living
PROTEEVA: Promoting Talent through Early Education
PSF: Phoneme Segmentation Fluency
RAN: Rapid Automatic Naming
READ: Reading Enhancement for Advancing Development
RTI: Research Triangle Institute International
RtR: Room to Read
RWI: Reading and Writing Instruction
TPF: The Promise Foundation
UOCTL: University of Oregon Center on Teaching and Learning
USAID: United States Agency for International Development
WCPM: Words Correct Per Minute
WPM: Words Per Minute
WWR: Whole Words Read
Glossary
Accuracy: one of the components of reading fluency; reading with accuracy involves
high word recognition and strong decoding skills to sound out unfamiliar words
Aksharas: the symbol units of the Bangla writing system
Alphabetic script: a writing system whose symbols represent the basic sounds of the
language
Alphasyllabary language: an abugida; a language that uses symbols to represent
consonant sounds and shows vowel sounds with diacritics
Alphasyllabic script: a writing system whose symbols may represent syllable sounds;
consonants with inherent vowels
Cut-off points: points that indicate skill levels at which student performance can be
predicted
Decodable text: A type of text used to help children decode words using the phonics
skills taught when they are learning to read
Decoding syllables: the process of matching a letter or combination of letters to their
sounds and recognizing the syllable patterns in words
Decoding skills: the ability to read words quickly and automatically; some of the
subskills needed to decode include knowledge of sound-symbol correspondence,
segmenting words into individual sounds, blending syllables and sounds, and building a
large repertoire of sight words
External criterion measure of reading: an external assessment designed to measure
student performance against fixed, predetermined criteria
Fluency benchmarks: points of reference against which oral reading fluency can be
compared at the beginning, middle, or end of the academic year
Leveled texts: a range of texts written at different reading ability levels in order to match
these to children’s actual ability levels, monitor their progress and provide the necessary
instructional support at each ability level
Nonwords: In the context of reading assessments, pseudowords that are pronounceable
according to phonics rules but have no meaning
Orthography: the representation of the sounds of a given language by written symbols
Phoneme segmentation: the ability to segment words into their individual sounds
Phonemic awareness: the ability to hear, identify, and manipulate individual sounds in
spoken words
Phonological segmentation: the ability to segment the sounds of a language at the word,
syllable, and phoneme level
Phonological skills: the ability to identify and manipulate units of oral language such as
initial, middle, and ending sounds in words
Prosody: one of the components of reading fluency that includes pitch, stress, and
timing; reading with prosody involves reading with expression, in phrases or chunks, and
using intonation or pauses to signal punctuation or grammatical features of a language
Reading acquisition: the process of acquiring the skills needed in order to learn to read
Screening measures: measures used to identify or predict students who may be at risk
for poor reading outcomes
Semantic complexity: In texts, how meaning in a given language is conveyed through
words, phrases and sentences at increasing levels of complexity
Sensitivity criterion: a statistical measure used to evaluate a benchmark goal or cut point
for risk
Speed: one of the components of reading fluency that is measured in words per minute;
there is usually an appropriate reading rate for a given age or grade level
Syllabic awareness: A component of phonological awareness; involves the
understanding that words are divided into syllables
Syntactic complexity: In texts, the logical and grammatical arrangement of words at
increasing levels of complexity
Threshold point: the value at which decoding skills optimally support the ability to read
Word recognition: the ability to recognize written words correctly and effortlessly
Word accuracy: In the context of oral reading fluency, the ability to read words without
errors
Introductory Summary
The first section of this report reviews the literature on how benchmarks are
created. The procedural guidelines serve as a reference for how benchmark tools and
procedures can be developed and adapted to other contexts. Tools used to measure
fluency are surveyed across technical reports and clinical case studies.
The second section discusses the relationship between fluency and comprehension. It
describes universal and language specific features that influence the reading acquisition
process in Bangla. The next section provides a sample oral reading fluency benchmark
study that included languages that use alphasyllabic and alphabetic scripts. The last
section addresses salient themes that emerged from fact-finding meetings with relevant
stakeholders. Issues around the process of developing fluency benchmarks in Bangladesh
are thoroughly discussed. The report concludes with recommendations on the steps and
approaches that can guide the benchmark making process.
Literature Review
Sample Benchmarking Procedures
In the U.S., research, development, and implementation of early literacy
assessments as well as oral reading fluency benchmark making procedures have informed
similar efforts in other country contexts. The development of EGRA tools was informed
by research conducted on the implementation of the Dynamic Indicators of Basic Early
Literacy Skills (DIBELS) in the U.S. In turn, the development of DIBELS was informed
by research conducted on Curriculum-based Measurement, which consists of
standardized procedures used to assess and monitor literacy skills as well as skills in
other subject areas. In 1992, Hasbrouck and Tindal (2006) used this alternative measure
to assess oral reading fluency and develop benchmarks in grades 2-5 in the U.S. The
researchers did this by collecting oral reading fluency data from 8 regions in the U.S. at
three points in the academic year. Over the years, ongoing research in this area has
helped to establish new oral reading fluency benchmarks in response to changes in testing
procedures, standards, and demographics in the American education sector.
In 2012, the University of Oregon Center on Teaching and Learning (UOCTL)
published a technical paper on the important changes made to the development of
benchmark goals on the Dynamic Indicators of Basic Early Literacy Skills (DIBELS).
The changes were made because educators and administrators noted that many students
who met the benchmark goals did not pass external criterion measures as evidenced in
comprehensive standardized tests (UOCTL, 2012). This was because
composite scores were solely aligned to the internal performance screening measures for
the benchmark goals. The benchmark goals were not aligned to an external, standardized
comprehensive assessment administered at the end of the academic year. Consequently,
the fluency benchmark goals could not serve as predictive measures for student
performance on an external comprehensive assessment. In U.S. classrooms, this meant
that teachers who relied on the DIBELS Data System could neither reliably identify
struggling readers nor plan targeted interventions through literacy instruction.
The UOCTL therefore proposed the development of new fluency benchmark
goals and cut points for risk that aligned with an external criterion and could thus serve
as predictive measures. The organization delineates a different approach
to the development of benchmark goals based on “(a) an external technical review of
DIBELS Next materials, (b) an analysis of the procedures used to establish the Dynamic
Measurement Group’s (DMG) former goals, (c) consistent feedback from users, and (d)
best practices in education research on sample selection and study replication” (UOCTL,
2012, p. 1). The technical paper thus addresses key elements and procedural guidelines in
the development of benchmark goals based on lessons learned.
According to the authors, two initial key elements to consider are the student
population and the external criterion measure. Benchmark goals are based on a large
sample representative of the student population across schools in the country, thus
mirroring national demographic data. It is important to identify an existing, nationally
recognized standardized test that serves as a strong “external criterion measure of
reading.” Before the start of the benchmark process, all stakeholders involved in the
development of benchmarks goals need to agree on the standards set by the external
criterion measure that will be linked to fluency measures.
The development of benchmark goals also involves careful consideration of the
statistical procedures used. The authors point out that the analytic lens selected influences
the understandings culled from the benchmark making process (UOCTL, 2012). They
state that, in order to develop the new benchmark goals and cut-off points for DIBELS
Next, the main statistical procedures implemented were “(a) the Area Under the Curve
(AUC) and (b) sensitivity” (p. 5). The Area Under the Curve is a statistical procedure that
evaluates how well a screening measure actually groups students between cut-off points,
thereby providing a predictive value of students’ basic early literacy skills across grade
levels at the end of the academic year. A sensitivity indicator is a statistical procedure
used to determine how well selected points on a scale score actually help single out the
students who do not meet a criterion goal.
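The two statistics named above can be illustrated with a short sketch. This is not the UOCTL's code: the scores, labels, and cut point below are hypothetical, and the AUC is computed with the simple pairwise (Mann-Whitney) formulation.

```python
def auc(at_risk_scores, not_at_risk_scores):
    """Area Under the Curve via the Mann-Whitney formulation: the probability
    that a randomly chosen not-at-risk student outscores a randomly chosen
    at-risk student (ties count as half a win)."""
    pairs = [(a, b) for a in not_at_risk_scores for b in at_risk_scores]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a, b in pairs)
    return wins / len(pairs)

def sensitivity(scores, at_risk_flags, cut_point):
    """Share of truly at-risk students who are flagged by scoring below the cut point."""
    flagged = sum(1 for s, r in zip(scores, at_risk_flags) if r and s < cut_point)
    return flagged / sum(at_risk_flags)

# Hypothetical end-of-year screening scores (words correct per minute)
at_risk = [18, 22, 25, 38]          # students below criterion on the external measure
not_at_risk = [30, 40, 48, 52, 60]  # students at or above the 40th percentile
auc_value = auc(at_risk, not_at_risk)                            # 0.95, above the .75 threshold
sens = sensitivity(at_risk + not_at_risk, [1, 1, 1, 1, 0, 0, 0, 0, 0], 35)  # 0.75
```

In the procedure described above, a screening measure would only be retained if its AUC exceeded .75, and a cut point would be adjusted until sensitivity reached the chosen criterion.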
Another key element involves a consideration of “decision rules.” The authors
note that, “For each measure at each time point that is recommended, we calculated (a)
the benchmark goal, and (b) the cut point for risk” (UOCTL, 2012, p. 6). Students who
meet or exceed the benchmark goal are more likely to “score at or above the 40th
percentile” on an external criterion measure. In the process of developing benchmark
goals, a cut point for risk is set in order to identify the students who do not meet a
benchmark goal. To maintain alignment with the external criterion measure, the
researchers set the cut point for risk so that students scoring below it would be likely to
score below the 20th percentile (UOCTL, 2012).
The last key element addressed is the particular analytic approach used in the
process of developing benchmarks. This entailed the linking of each fluency measure
administered at three points in the academic year to an external criterion measure
administered at the end of the year. In order to determine the accuracy of the benchmark
goals, only measures with an Area Under the Curve greater than .75 were selected
(UOCTL, 2012, p. 7). Next, they carefully analyzed each measure selected. For
each of the three points in time, an initial analysis was conducted to set the benchmark
goal and a second analysis was conducted to set the cut point for risk. The analyses
conducted to create the benchmark goal and the cut point for risk at each point in time
then underwent the sensitivity and specificity statistical procedures to ensure that
struggling readers were readily identified and provided with targeted literacy instruction.
In order to provide strong confidence in the ability to predict how well students would do
on the external criterion measure administered at the end of the year, the authors selected
a sensitivity criterion of 90% for the benchmark goals and a sensitivity criterion of 80%
for the cut points for risk on the DIBELS.
Thus, after the student population is selected and the external criterion measure is
aligned in the process of creating benchmark goals, the screening measures are then
administered at the beginning, middle, and end of the school year. At the end of the year,
students take the standardized test that serves as the external criterion measure. Student
performance on the screening measures at the beginning, middle, and end of the year is
then compared to student performance on the external criterion measure. Student
performance is identified as proficient or below. Benchmarks can then be developed by
noting where the proficient and lower scoring students were at the beginning, middle, and
end of the academic year—the critical benchmark time periods. The alignment between
the screening measures and the external criterion measure minimizes prediction errors in
the development of benchmarks. This ensures that the benchmarks that are set can predict
how well students will do on the criteria described in the external measure (UOCTL,
2012).
Powell-Smith, Good, Latimer, Dewey, Wallin, and Kaminski (2012) further
describe the process of developing DIBELS Next benchmark goals and cut points for risk
through a study conducted from 2009 to 2010. Their technical paper also provides an
evaluation of the DIBELS Next measures and addresses four aims. In the benchmark
goals study, the authors aimed to identify the performance levels on the DIBELS Next
assessment that would serve as good predictors of students’ performance on end of year
reading goals. A second aim was to evaluate the reliability of the DIBELS Next
assessments and the DIBELS Composite Score. A third aim for the study was to explore
the correlations between different elements of the DIBELS Next assessment and the
Group Reading Assessment and Diagnostic Evaluation (GRADE), an external criterion
measure. A fourth aim was to evaluate teacher and assessor satisfaction with the DIBELS
assessments. This literature review discusses relevant details of the first three aims of the
study.
Powell-Smith and colleagues selected students from kindergarten through
sixth grade in three geographical areas in the United States. The participants attended
English-medium general education classrooms and included students for whom English is
a second language as well as students with disabilities who were able to take part in the
DIBELS assessment. The researchers explain that they selected a subset of the total
sample to take part in both the DIBELS assessment and the additional GRADE
assessment in order to examine the reliability and validity of the DIBELS measures. In
order to check for validity, fifty students representative of the three geographical areas
were selected to take the GRADE, an external criterion measure. In order to check for
reliability, three of the five school districts, spread across the three geographical areas,
were chosen to take an alternate-form reliability test, a test-retest reliability test, or an
inter-rater reliability test. The parents of the students who took both the DIBELS and
GRADE assessments completed a demographics survey.
The measures for the benchmark goals study were the DIBELS measures, the
external criterion measure Group Reading Assessment and Diagnostic Evaluation
(GRADE), and a questionnaire filled out by teachers and administrators to gauge the
usability of the DIBELS Next assessment. The individual measures of the DIBELS
included First Sound Fluency (FSF) to test kindergarten students’ ability to isolate and
identify initial sounds in words in the beginning and middle of the year. Given that many
kindergarteners may exhibit partial and emergent fluency in the FSF, the assessors used
differential scoring for this measure. Letter Naming Fluency (LNF) tests kindergarten and
early first grade students’ letter automaticity, or ability to identify and say the name of
lower case and uppercase letters. In the Phoneme Segmentation Fluency (PSF) measure,
students demonstrate their phonemic awareness by listening carefully and sounding out
word parts. The assessors used differential scoring for the students that demonstrated
emergent phonemic awareness. The Nonsense Word Fluency (NWF) measure tests
students’ ability to identify the correspondence between letters and sounds and blend
sounds to form complete nonsense words that follow a vowel-consonant or
consonant-vowel-consonant pattern. The assessors start to administer the NWF measure in the
middle of kindergarten. Students receive two separate scores on this fluency measure. One
score stands for the number of correct letter sound correspondences marked in the first
minute, Correct Letter Sounds (CLS). The second score stands for the number of
nonsense words that are correctly read aloud without using phoneme segmentation,
Whole Words Read (WWR). The DIBELS Oral Reading Fluency (DORF) and Retell is a
two-part measure that tests students’ skills in phonics, making sense of unknown words
in context, reading connected text with fluency and accuracy, and reading with
understanding. In the DORF section, students read a different one-minute grade-level
passage at the beginning, middle, and end of the year. For the benchmark assessment, the
assessor finds the DORF score by calculating “the median number of words read
correctly and the median number of errors across the three passages” (Powell-Smith et
al., 2012, p. 31). The assessor finds the accuracy rate by dividing the median number of
words read correctly by the sum of the median words correct and the median errors. In
the retell section, the assessor makes a quality response rating that measures
comprehension. The DIBELS-Maze measure assesses students’ reasoning and
comprehension skills. The assessor uses a formula that calculates an adjusted score that
takes into consideration instances where students may have guessed on the test.
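The median-based DORF arithmetic described above can be made concrete. This sketch illustrates the formula reported by Powell-Smith et al. (2012); the passage results are hypothetical.

```python
from statistics import median

def dorf_benchmark_score(words_correct, errors):
    """DORF benchmark score: median words read correctly and median errors
    across the three benchmark passages, plus the accuracy rate
    (median correct divided by the sum of median correct and median errors)."""
    mc = median(words_correct)
    me = median(errors)
    accuracy = mc / (mc + me)
    return mc, me, accuracy

# Hypothetical results for one student's three grade-level passages
mc, me, accuracy = dorf_benchmark_score([52, 60, 57], [4, 2, 3])
# mc = 57 words correct, me = 3 errors, accuracy = 57 / 60 = 0.95
```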
In preparation for the assessment phase, school coordinators and teachers
conducted the testing for DIBELS and GRADE. School personnel involved reviewed the
testing materials, steps for administration of the test, and rules for scoring the students’
responses during one day of training. They also practiced scoring mock responses to
ensure fidelity in the procedure. In order to maximize reliability and accuracy in the
scoring process, the school coordinators and teachers participated in calibration activities.
The principal investigators addressed any discrepancies in the administration of the test
and in the scoring procedures during the training (Powell-Smith et al., 2012).
In order to ensure inter-rater reliability, five participants from each grade were
randomly selected so that an assessor and a shadow-scorer could administer the test.
School coordinators and teachers took turns as assessors and shadow-scorers. The
shadow scored protocols omitted the names of the students and used other identifiers
such as student identification, grade level and school district (Powell-Smith et al., 2012).
The alternate form reliability test was performed in the school district where
students’ reading skills were most varied. Powell-Smith et al. (2012) add that in grades 2,
3, 4, and 6, “stratified sampling by benchmark status was utilized to obtain a sample
comprised of 50% students at benchmark and 50% from combined strategic and intensive
instructional recommendation categories” (p. 15). In first and fifth grade there were low or
disproportionate numbers of students in the strategic and intensive categories, so
the researchers oversampled to meet the sampling goal.
In order to check for test-retest reliability, the students took the DIBELS Next
assessment in the middle of the year and were then retested two weeks later. The
principal investigators checked for test-retest reliability in the same school district where
students’ reading skills were most varied. The researchers used data from the DIBELS
assessment administered in the beginning of the year to select a student sample for the
retest two weeks after the middle of the year. As in the alternate form reliability test, they
used stratified sampling to gather 50% of students who met the benchmark and 50% of
students who fell into either the strategic or intensive category. In instances where the
sampling goal of thirty students from the respective categories could not be met, they
sorted grade lists again, this time by the lowest non-word fluency scores for first grade
and the lowest DIBELS Oral Reading Fluency Scores (DORF) for grades two to six to
meet the required percentage of students from the strategic and intensive categories.
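The 50/50 stratified sampling rule described above can be sketched as follows. The category labels, roster, and sample size are hypothetical illustrations of the rule, not the researchers' actual procedure.

```python
import random

def stratified_sample(students, n, seed=0):
    """Draw n students: half at benchmark, half from the combined strategic
    and intensive categories. students is a list of (student_id, category) pairs."""
    rng = random.Random(seed)
    at_benchmark = [sid for sid, cat in students if cat == "benchmark"]
    below = [sid for sid, cat in students if cat in ("strategic", "intensive")]
    half = n // 2
    return rng.sample(at_benchmark, half) + rng.sample(below, half)

# Hypothetical grade roster
roster = [("s1", "benchmark"), ("s2", "benchmark"), ("s3", "benchmark"),
          ("s4", "strategic"), ("s5", "intensive"), ("s6", "strategic")]
sample = stratified_sample(roster, 4)  # two at benchmark, two below
```

When a stratum falls short of its target, as in the first- and fifth-grade case above, the shortfall would be covered by oversampling, for example by re-sorting the grade list on the lowest fluency scores.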
Once scores from the individual measures are calculated for each grade, “the
measures that correlate highly with later outcomes” are first weighted and then
“combined into a DIBELS Composite Score (DCS)” (Powell-Smith et al., 2012, p. 35). In
first grade, composite scores are determined for the middle and end of the year. In grades
two to six, composite scores are determined for the beginning, middle, and end
benchmark assessments. In order to find out the strength of the DIBELS Next measures
as predictive measures, they are compared against an external criterion measure. In this
case, the external criterion measure was the Group Reading Assessment and Diagnostic
Evaluation (GRADE) designed for students in preschool to 12th grade. The GRADE
consists of five sections, 16 subtests, and 11 grade-specific testing levels. Only the
relevant subtests were used in each grade. As in the DIBELS Next measures, subtest
scores were combined to determine the composite scores. The principal investigators
reviewed the data for each grade and benchmark assessment in search of invalid scores
and made decision rules to remove outliers.
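The weighting step above can be illustrated with a simplified sketch. The measure names and weights below are assumptions chosen for illustration, not the actual DIBELS Next weighting scheme.

```python
def composite_score(scores, weights):
    """Combine individual measure scores into a single composite as a weighted sum."""
    return sum(weights[m] * s for m, s in scores.items())

# Hypothetical end-of-year measures for one second grader
scores = {"NWF_CLS": 40, "DORF_WCPM": 55, "DORF_accuracy": 0.96}
weights = {"NWF_CLS": 1.0, "DORF_WCPM": 2.0, "DORF_accuracy": 100.0}
dcs = composite_score(scores, weights)  # 40*1.0 + 55*2.0 + 0.96*100.0
```

In the actual procedure, the weights reflect how strongly each measure correlates with later reading outcomes.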
The data set was then ready for analysis. The GRADE was administered around
the same time as the benchmark assessment at the end of the academic year. In the effort
to develop benchmark goals and cut points that could be generalized, the researchers
decided that students whose lowest raw scores showed performance at or above the 40th
percentile on the GRADE would serve as the external criterion for adequate reading skills
while the lowest raw scores at or above the 20th percentile would serve as an external
criterion for the cut points (Powell-Smith et al., 2012). The principal investigators used
the end-of-year benchmark and cut-point approximations from the GRADE external
criterion to calculate end-of-year benchmark goals and cut points for risk based on the
DIBELS Composite Score. These end-of-year values were then used to derive the
corresponding goals and cut points for the middle of the year, which in turn were used to
derive those for the beginning of the year, each based on the DIBELS Composite Score
for that period. Goals and cut points for risk were also developed using the individual
DIBELS Next measures.
Analyses and correlations were then drawn from the DIBELS Next measures, the
DIBELS Next composite scores, and the external criterion measure (GRADE), for each of
the benchmark periods.
The findings show that correlations increased with each subsequent grade level,
except for the retell scores, which decreased steadily across grade levels. The
findings also show that reliability coefficients were high for test-retest, alternate-form,
and inter-rater reliability. Given these findings, students who meet or exceed a
current benchmark goal have an 80% to 90% likelihood of meeting later reading goals on
the DIBELS Next measures and therefore a high probability of doing well on the GRADE. Another
important finding was that the DIBELS Next Composite Score was generally a better
reading proficiency measure than DIBELS Next individual measures. The authors add
that the utility of the composite score as an internal criterion measure was strong and
therefore validated the DIBELS Next benchmark goals and cut points for risk.
Powell-Smith et al. (2012) conclude that more research is necessary to replicate these
findings with other external criterion measures.
The section above focused on the procedures applied in the creation of benchmark
goals and cut points for risk for fluency in English reading. Although there are universal
features in reading acquisition (Frost, 2012), there are also language-specific features
(Nag, 2007; Perfetti, 2003) that pose several implications for the creation of fluency
benchmarks. Thus, the procedures described above may serve as a reference in countries
where fluency benchmarks are yet to be established and should be adapted to meet
context-specific needs. In light of these considerations, theoretical models that explain
how children learn to read in languages not based on the alphabetic script are a relevant
part of the discussion. The next section of the literature review addresses the tools used to
measure fluency with a lens on reports and clinical studies.
Tools Used to Measure Fluency
Jukes, Vagh, and Kim (2006) state that the tools used to measure fluency require
consideration of the varied skills, subskills, challenges, and underlying processes
involved in learning to read in the writing system of a particular language. The authors
add that there are also cultural and country-specific factors such as reading standards and
school curricula that influence target measures for fluency.
Among these is the assessment of letter knowledge or, in the case of
alphasyllabary languages, syllable knowledge, an important fluency measure.
Children’s ability to correctly and fluently name the letters or symbols of the script and
their corresponding sounds serves as a strong indicator of their ability to read
(Jukes et al., 2006). An assessment of phonological awareness entails an activity that asks
children to identify rhyming words, isolate initial sounds and word endings,
and tap out the number of sounds in each word. An assessment of phonological recoding
tests children’s “ability to apply phoneme correspondence rules” through a nonword
reading activity (p. 9.) The authors further suggest that the syllable complexity of a
particular language must be taken into account in the development of items that assess
phonological processing skills as fluency measures.
Oral reading fluency measures pay attention to children’s ability to read
connected text with “accuracy, speed, and prosody” (Jukes et al., 2006, p. 10). The
emphasis on connected text creates an assessment window into children’s comprehension
given that if they can read clearly with proper pacing and expression, they are more likely
to read with understanding. Working memory is extremely important because children
need to remember the sounds that correspond to the written script in order to manipulate
them. They also need to remember the meaning of the string of words read and how these
connect to the other parts of the text at the level of sentences and ideas throughout the
text.
Jukes and his colleagues suggest that oral reading fluency can also be assessed by
“asking children to read aloud from curriculum-relevant texts and counting the number of
words accurately read within a span of 60 seconds” (p. 11). They add that the syllable can
be the “unit of analysis,” thus the number of correctly decoded syllables in one minute
can serve as a valid fluency measure. Moreover, the authors point out that normative data
from country-specific, linguistically and culturally relevant curricular materials can be
used to shed light on grade level expectations for reading.
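The timed procedure Jukes and his colleagues describe reduces to a simple rate
calculation. The sketch below is illustrative only (the function name and sample numbers
are not drawn from the source); it also shows that the same formula applies when the
syllable, rather than the word, is the unit of analysis.

```python
def correct_per_minute(units_attempted: int, errors: int, seconds: float = 60.0) -> float:
    """Fluency rate: units (words or syllables) read correctly, scaled to one minute."""
    correct = units_attempted - errors
    return correct * 60.0 / seconds

# A child who attempts 52 words in 60 seconds and makes 7 errors
# reads at 45 words correct per minute:
rate = correct_per_minute(52, 7)
```

The `seconds` parameter allows the same calculation when a child is given extra time to
finish a passage, as some protocols permit.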
The authors recommend that fluency outcome measures should be aligned to the
number of words, syllables, phonemes, sentences, syllables/word, and
phonemes/syllables found in grade level texts. This may also entail the use of readability
formulas, which focus on the syntactic and semantic complexity of texts to estimate the
cognitive demands placed on the reader. To gauge the syntactic complexity of a text, the
number of words in each sentence is counted and an average score is then calculated. To
gauge the semantic complexity of a text, researchers usually note the proportion of low-
and high-frequency words or calculate the average number of syllables in each word.
Lexical diversity is a measure of the variety of
words found in the text. The authors note that the lexical diversity of a text can be
calculated by finding the ratio of the “total number of unique words in the text (‘types’)
to the total number of words in the text (‘tokens’).” This is called the “type-token” ratio.
Jukes et al. (2006) state that “a consideration of lexical diversity is important in the
measurement of fluency as children who read passages with many repetitive words are
bound to have an easier time and go through them faster, than children who read passages
that are more lexically diverse as they will decode a greater number of different words
through the passage” (p. 15).
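The text-complexity measures described above, average sentence length for syntactic
complexity and the type-token ratio for lexical diversity, can be computed directly. The
helper functions below are an illustrative sketch, not part of any published tool, and use
naive whitespace tokenization.

```python
def avg_words_per_sentence(sentences: list[str]) -> float:
    """Syntactic complexity proxy: mean sentence length in words."""
    counts = [len(s.split()) for s in sentences]
    return sum(counts) / len(counts)

def type_token_ratio(text: str) -> float:
    """Lexical diversity: unique words ("types") over total words ("tokens")."""
    tokens = text.lower().split()
    return len(set(tokens)) / len(tokens)

# A repetitive passage yields a low ratio: 6 types over 9 tokens.
ttr = type_token_ratio("the cat sat on the mat the cat ran")
```

A lower ratio signals a more repetitive, and therefore easier and faster, passage, which is
exactly the property the authors want controlled when passages are compared.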
While the elements noted above are relevant considerations that should be taken
into account in an exploration of the tools used to measure fluency, the authors contend
there exists a dearth of research on reading fluency norms in certain countries. Jukes et al.
(2006) therefore recommend the following: “(a) take into account characteristics of the
orthography that children are learning to read, (b) evaluate the results in keeping with the
demands of the curricula, and (c) capitalize on any opportunities to collect information on
general trends for a given language in a given country” (p. 16). Once fluency benchmarks
and basic oral reading fluency tools are developed, a link to reading comprehension can
be embedded in the tools.
The authors propose that links between oral reading fluency and reading with
understanding can be made through question and answer comprehension tests. They
suggest that careful consideration should be given to the topic of the text and the quality
of the questions. This involves selecting a text that requires a limited amount of prior
knowledge so that children can draw responses directly from the text itself. The questions
should be designed in a way that children cannot guess what the response may be just by
reading the questions.
Although the tool described above is better suited for children who have basic
decoding skills, another tool to measure fluency entails written graded reading
assessments that can be used inexpensively on a large scale. The letter reading and word
reading portion asks that children tell the difference between letters and non-letters as
well as words and non-words. The sentence portion is timed and requires children to read
simple sentences and identify them as true or false.
Both early literacy skills and comprehension can also be informally assessed
through Maze tests. Jukes et al. (2006) prefer the Maze test to the Cloze test because it is
more suitable for beginning readers. Beginning readers read the passage and select, from
the multiple-choice options available, the one word that fits the sentence according to the
intended contextual meaning. To design a Maze test, key words are omitted either
according to their part of speech or at a fixed interval, such as every fifth or seventh
word. The authors recommend omitting particular parts of speech since that approach can
better serve as a measure of reading comprehension. Other characteristics of the Maze
test are that it is timed, some of the words in the multiple-choice options are distractors,
and readers must read beyond single sentences to read the passage with understanding.
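A fixed-interval Maze test of the kind described can be generated mechanically. The
sketch below uses the every-fifth-word rule with randomly drawn distractors; the
part-of-speech approach the authors prefer would additionally require a part-of-speech
tagger. All function names and the distractor pool are illustrative.

```python
import random

def make_maze(words, every=5, distractor_pool=None, seed=0):
    """Blank every `every`-th word and build a multiple-choice item for each
    deletion: the correct word plus two distractors drawn from a pool."""
    rng = random.Random(seed)
    blanked, items = [], []
    for i, word in enumerate(words, start=1):
        if i % every == 0:
            options = [word] + rng.sample(distractor_pool, 2)
            rng.shuffle(options)
            items.append({"position": i, "answer": word, "options": options})
            blanked.append("____")
        else:
            blanked.append(word)
    return blanked, items

text = "the quick brown fox jumps over the lazy dog today".split()
blanked, items = make_maze(text, distractor_pool=["runs", "eats", "sings", "red"])
# Words 5 and 10 ("jumps", "today") become blanks with three options each.
```

In practice the distractors would be chosen deliberately, as the authors note, rather than
sampled at random.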
Jukes and his colleagues share the findings of a pilot study conducted to explore
what measures of reading fluency are relevant and applicable across languages and
orthographies. Based on the pilot findings, the authors highlight that an ideal approach to
assessing reading skills in development contexts should include a composite score that
encompasses both letter/graphic unit reading and passage reading ability that is linked
with comprehension. Thus, the authors recommend the inclusion of these elements in a
reading assessment:
(1) An oral letter reading fluency test—children read as many letters/graphic units
as possible in 60 seconds, (2) An oral passage reading fluency test—children read
as many words of a connected text as possible in 60 seconds. Two different
passages should be used to improve reliability, and (3) Comprehension should be
assessed by 5 questions asked at the end of each passage. Students who do not
finish reading the passage in 60 seconds should be allowed to finish. (p. 22)
For languages similar to Hindi, the authors suggest selecting the syllable units that are
taught first through curricular materials and/or selecting syllable units from the reading
passages. As previously mentioned, the length and complexity of the passages as well as
their lexical diversity should be assessed.
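The three-part recommendation can be summarized per child as a simple record. The
structure below is a sketch: Jukes et al. specify the components but not a combination
formula, so the field names and the averaging of the two passages are illustrative
assumptions.

```python
def assessment_record(letters_per_min: float, passage_wcpm: list[float],
                      comprehension_correct: int, questions_asked: int) -> dict:
    """Summarize the three recommended subtests for one child."""
    return {
        "letter_fluency": letters_per_min,  # letters/graphic units read in 60 seconds
        # Averaging the two passages reflects the two-passage reliability rationale.
        "passage_fluency": sum(passage_wcpm) / len(passage_wcpm),
        "comprehension_pct": 100.0 * comprehension_correct / questions_asked,
    }

record = assessment_record(38.0, [24.0, 28.0], 7, 10)
```

For alphasyllabic scripts, `letter_fluency` would be computed over the syllable units
selected from curricular materials, as suggested for Hindi-like languages above.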
Jukes and his colleagues (2006) make a distinction between commercially
developed assessments and curriculum-based assessments of oral reading fluency.
Teachers use curriculum-based assessments on a weekly basis to assess and monitor
students’ academic growth. Teachers then use the data from these assessments to adjust
instruction. Commercially developed assessments are designed and scored by an external
organization under contract. The instructions for these types of assessments require
students to respond to the same questions in the same way. The DIBELS discussed in this
document is an example of a commercial assessment. The DIBELS in its original version
included a retell fluency measure to gauge reading comprehension. The current version
named DIBELS Next runs an updated version in English and Spanish that provides the
retell fluency measure as an option rather than a requirement because the measure did not
demonstrate consistent performance nor did it render reliable and predictive outcomes on
the external measure (OU CTL, 2012). Although the authors mention samples of
commercially developed reading assessments that mainly focus on the English language,
it is useful to examine these test designs and explore what kinds of test items may be
applicable to other languages and orthographies.
For example, the Reading Fluency Indicator includes four passages at varying
levels of difficulty for children between the ages of five and eighteen. The Reading
Fluency Progress Monitor is standardized and normed, with up to 30 reading passages for
children in grades one to eight. This tool allows the person who administers the test to
find the passage that most closely matches each reader’s actual level of mastery and
difficulty. The Reading Fluency Benchmark Assessor can be used by the classroom
teacher to monitor oral reading fluency once a week, twice a week, or on a monthly basis.
It includes eight levels of assessment with 30 reading passages at each level. Once the
teacher administers the test, she can assess each child’s specific reading level and use
these assessments to guide literacy instruction. The Gray Oral Reading Test contains
developmentally sequenced passages, each followed by five questions that assess
comprehension.
Tools Used to Assess Fluency and Comprehension in Alphasyllabic Scripts
Jukes and his colleagues also note the efforts of Pratham, an Indian
nongovernmental organization that conducts large-scale fluency assessments in rural and
urban areas. The authors’ critique is that while the design of Pratham’s assessment tools
resembles some of the tools discussed thus far, the fluency measures concentrate mainly
on words decoded correctly, less on reading rate, and no attention is given to reading
comprehension. The tool includes a four-sentence text that allows the test administrator
to gauge whether readers can move on to a more challenging text or read a wordlist.
While Wijayathilake and Parrila (2014) state there is limited research on basic
reading skills in alphasyllabic languages, it is important to explore Vagh’s (2009; 2010)
work on the reliability and validity of the Annual Status of Education Report (ASER)
assessment tools given that these were designed for alphasyllabic languages. The ASER
reading tool is available in Hindi, Bengali, Gujarati, Kannada, Tamil, among others. In
this study, the ASER reading tool was implemented in the Hindi language and was
aligned to standards 1 and 2 textbooks from India. For the purposes of this literature
review, the focus will be on Vagh’s discussion of the tools for the basic reading
assessment, not the portion that discusses the ASER math assessment tools.
Vagh (2009) reports that the ASER reading assessment tool “classifies children
at the ‘nothing’, ‘letter’, ‘word’, ‘paragraph’ (grade 1 level text), and ‘story’ (grade 2
level text) level based on defined performance criteria or cut-off scores that allow
examiners to classify children as masters or non-masters of any given level” (p. 2). The
tool therefore emphasizes mastery of basic reading skills.
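The mastery classification Vagh describes amounts to assigning each child the highest
level whose criterion is met. The sketch below is illustrative only: the cut-off of four
correct items is a hypothetical stand-in for the defined performance criteria, which the
published ASER procedure specifies.

```python
def aser_level(letters_correct: int, words_correct: int,
               read_paragraph: bool, read_story: bool) -> str:
    """Classify a child at the highest ASER level mastered (hypothetical cut-offs)."""
    if read_story:              # fluently read a grade 2 level text
        return "story"
    if read_paragraph:          # fluently read a grade 1 level text
        return "paragraph"
    if words_correct >= 4:      # stand-in criterion, not the official cut-off
        return "word"
    if letters_correct >= 4:    # stand-in criterion, not the official cut-off
        return "letter"
    return "nothing"
```

Because the levels are checked from hardest to easiest, each child falls into exactly one
of the mutually exclusive categories Vagh (2010) later discusses.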
Vagh (2009) evaluated the concurrent validity of the ASER reading
tool through a comparison with the Fluency Battery—a tool adapted from EGRA and
DIBELS. The Fluency Battery is made up of the following subtests: Akshar Reading
Fluency, Barakhado Reading Fluency, Word Reading Fluency, Nonword Reading
Fluency, two first grade level passages with each linked to two comprehension questions,
and two second grade level passages with each linked to two comprehension questions.
The author adds that, “the content of the Fluency Battery was drawn from prior ASER
reading tests as the material has been extensively evaluated and piloted to ensure their
grade and content appropriateness for the population of interest” (p. 19). The findings
indicate strong concurrent validity between the ASER reading tool and the
Fluency Battery. The children who participated in the study and performed above the
standard 1 story level on the Fluency Battery also performed well on the ascending ASER
reading levels. However, the author points out there were children who performed at the
“nothing level” on the ASER reading test who were able to read four aksharas or more on
the Fluency Battery. Due to this minor inconsistency, he suggests that other studies
should further explore the suitability of the cut-off criteria for fluency rates.
In similar research, Vagh (2010) reconsiders the findings on children’s
performance on the different ASER-reading levels as well as their performance on the
Fluency Battery. He reports that the strong validity coefficients between these basic
reading tests indicate “increasing fluency rates with higher ASER-reading levels” and
that the inconsistencies noted earlier may be a result of misclassification given that the
levels are “mutually exclusive categories” (p.14). While both the Fluency Battery and the
ASER reading levels are strongly aligned, Vagh further explains that the reason for
testing and the type of data needed should determine which test to administer. He also
suggests that both tests could be administered in tandem to evaluate children’s reading
development “within and across reading levels” in programs such as the Read India
Program.
More studies are needed in order to develop tools to measure fluency in contexts
where alphasyllabic languages are spoken. The Promise Foundation (TPF) in India brings
together educators, psychologists, and social workers to meet the needs of children in
underserved communities, with a focus on how children acquire literacy in the mother
tongue. Tiwari (2011) reports that among the many tasks of the LAB tool designed by
the Promise Foundation, there is “a range of Kannada literacy and cognitive tasks like
akshara knowledge, reading fluency, reading comprehension” and several others (p. 9).
He adds that the tool was adapted to languages such as Malayalam and Bengali. Given that
in her research, Nag (2007) posits that the pace in learning to read the complex aksharas
is gradual and can extend past the third grade, the LAB tool is a useful example of a local
language-based assessment tool that measures reading fluency and comprehension across
grades.
Nag and Sircar (2008) adapted elements of the LAB tool in their study on learning
to read in Bengali in Kolkata primary schools. Since the focus of the study was on the
early stages of reading development in Bengali, the researchers carried out an initial
screening assessment to identify readers and nonreaders. They then conducted an in-depth assessment that collected data on children’s vocabulary knowledge, word
recognition, reading comprehension, and phonological skills among others. According to
the authors, the schools the children attended were also studied based on “their
daily routine, work culture, and teaching-learning processes” (Nag & Sircar, 2008, p. 4).
In terms of word recognition, the authors found that children learn the Bengali
aksharas in the order in which they are taught—teachers move from the simple or single
akshara to the more complex aksharas. Nag and Sircar (2008) add that the children had
difficulty decoding the complex akshara regardless of whether they appeared
independently or as part of a word. In the word and nonword reading assessment, the
findings showed that seven year olds could read the majority of words (96%) and
nonwords (84%) accurately. The most challenging aspect of decoding occurred when
children tried to read the akshara located in the middle of the word. The researchers
explain that although sound-symbol relationships are mainly consistent in Bangla, the
middle aksharas in words illustrate the inconsistencies that challenge the decoding
process. The researchers’ findings reveal that, “children found it easier to work with
syllables than phonemes and that their understanding that words are made up of
phonemes came later. Children who were struggling with phonological processing were
also struggling with simple word recognition and spelling” (Nag & Sircar, 2008, p. 7).
Moreover, the areas of reading comprehension that were assessed were inferential
thinking and understanding details in a nonfiction text. The researchers report that, in
general, the children performed better at understanding explicit details in a nonfiction
passage than on making inferences. Stronger readers were more easily able to make
inferences from passages that contained more sentences with challenging syntax and
fewer familiar words (Nag & Sircar, 2008). They point out that they identified struggling
readers as those who had difficulty decoding words as well as those who had trouble
making inferences.
Another aspect of reading comprehension that was assessed was the vocabulary
task. They assessed word knowledge by giving the children the option to use the
vocabulary word in a sentence, give a synonym, or provide a definition. Nag and Sircar
(2008) report that the children preferred to communicate their understanding of word
knowledge through sentence construction. However, they point out that the children in
the earlier grades did not express their word knowledge through sentence construction
as accurately as the older children did. According to the researchers, most of the
children in the earlier grades produced “dead sentences” with a weak connection between
the meaning of a word and the way it was used in a sentence. A trend the researchers
noticed was that instruction played an important role in children’s metalinguistic
awareness. The children’s decision to convey word knowledge mainly through sentence
construction was largely based on teachers’ curricular practice.
Nag and Sircar (2008) hold that sentence construction alone falls short as a valid
assessment of word knowledge. The authors recommend the following alternatives:
•	Test concepts from a cluster of words. Give a set of linked words and ask for a
name or concept relation between these words.
Sample word list: chair, table, desk, bed
Response: furniture, household things, etc.
•	Test word-context matches. Give a situation and ask multiple-choice
questions to tease out the child’s understanding.
Sample situation: The teacher was happy that Radha had stopped spending
time with her friends.
Multiple-choice question: What did the teacher think of her friends? They
were (a) irresponsible (b) responsible.
•	Test word-personal experience match. Give a word and ask for a
recounting of personal experience that can capture the meaning of the
word. The quality and accuracy of the connections the child makes gives
an indication of the depth of the child’s understanding of the word.
Example: Describe a situation when you are worried about something. (p. 15)
Thus, the researchers recommend that tests weave activities that foster meaning making
such as conceptual categorization, personal relevance, and inferential thinking into
activities that assess sound-symbol relationships at the word level.
Sircar and Nag (2014) more recently conducted a study on sound-symbol
relationships between the akshara and the phonological syllables in Bangla. The
researchers looked closely at challenges that arise during reading: instances
where the characteristics of the spoken and written language signal a lack of congruency
between the akshara symbols and the phonological syllables, and how this influences the
reading process. The fluency measures on the screening battery involved word-list
reading and syllable- and phoneme-processing tasks. The participants in grades 2, 3, and 4 were
categorized as either readers on grade level or struggling readers. The researchers also
included a phonological processing assessment that comprised the manipulation of
“target syllables and phonemes in nonwords, in either initial or final positions” (p. 205).
On akshara knowledge and word recognition tasks, Sircar and Nag (2014) noted
that learning akshara symbols takes a great deal of time based on the finding that
participants in the fourth grade were still in the midst of the akshara acquisition process.
An example of consonant clusters on the word list that posed a challenge was
/CCV/ clusters. Consonant clusters were the most difficult to decode, especially for the
second graders who were on grade level and for the struggling readers. These participants
sounded each phoneme rather than blending them. The findings also showed that
participants accurately decoded the basic aksharas, in line with the pattern of akshara
instruction (/Ca/, /CV/, /CCV/) in the primary schools the children attended. Sircar
and Nag observe that, “while the less familiar symbols appeared to elicit segmental
analysis of the markers within the syllable block (the ‘spelling’ of the akshara), the
familiarity of the common akshara appears to allow the symbols to be reliably processed
as undifferentiated tasks” (p. 205).
In the nonword reading task, ninety-four percent of participants’ errors were
phonological as demonstrated by the reversal of consonant sounds in /CCV/ akshara
while the remaining errors were lexical (Sircar & Nag, 2014). According to the
researchers, the data suggests that proficient readers use phonological analogies to decode
nonwords and less proficient readers rely heavily on orthographic information as
evidenced by their sounding out of inherent vowels. They also note that a repertoire of
effective decoding strategies such as the one implemented by proficient readers can be
integrated in akshara instruction so that children learn to apply rules and successfully
manage the exceptions to these rules when they process text.
In the phonological processing task, Sircar and Nag (2014) found that participants
who decoded words correctly were better able to process phonemes. In the phoneme
tasks, all the participants had greater difficulty in manipulating initial phonemes in more
complex akshara syllables than in less complex akshara syllables. In a segmentation
task, participants were presented with short and long nonwords. The researchers report
that the participants preferred to segment the nonwords as CVC-CV, which indicates a
phonological segmentation rather than one mediated by orthography. They posit that this
can be explained by the language-specific constraints found in initial or medial consonant
clusters. In akshara substitution tasks, the findings were similar in that the majority of
participants manipulated akshara phonologically rather than by “akshara-by-akshara
manipulation.” The authors conclude that readers generally relied more on their
phonological awareness in order to read akshara syllables while struggling readers had
difficulty using phonological and orthographic characteristics to decode akshara. The
factors that supported the decoding process were children’s ability to distinguish how
specific phonemes can and cannot be arranged to form syllables and words, to use
analogies to recognize new words, and to accurately recognize words they often see in
print texts. The findings from this study highlight
some of the cognitive and linguistic demands that arise as children learn to read with
fluency in the extensive orthography of the Bangla language. While Nag and Snowling
(2012) acknowledge that across languages there are similar features in the process of
learning to read, the way children understand and apply the nuances of alphasyllabic
knowledge warrants further study in order to provide insights into language-specific
factors that influence reading development. The following study sheds some light on how
children process texts in an alphasyllabic script.
In a clinical study designed to assess reading difficulties in Kannada, an Indian
alphasyllabary, Nag and Snowling (2008) referred to previous longitudinal data that
focused on children’s word and nonword reading skills in the participant selection
process. The measures for the current study included “tests of basic reading, spelling,
phonological, visual and oral processing skills” (p. 2). A cognitive measure in the
assessment battery that is relevant to reading fluency is Rapid Automatic Naming (RAN).
In RAN tasks, participants are expected to name akshara-based syllables or words as
quickly as possible in order to measure processing speed and automaticity. Rapid
Automatic Naming tasks are used in clinical studies to identify children who may be at
risk for reading difficulties. The researchers’ preliminary findings show that proficient
readers possess an implicit orthographic knowledge of the “rules that govern the
ligaturing of the vowel and additional consonants to the base consonant” without
receiving formal instruction from their teachers on these rules. They add that readers who
struggle have a difficult time discerning these rules and therefore have limited akshara
knowledge. An implication that can be drawn from this study is that students need
explicit instruction on orthographic knowledge and also require additional opportunities
to read and listen to stories in and out of school in order to build their metalinguistic
awareness.
The articles discussed thus far delineate a variety of tools that can be used to
measure oral reading fluency and comprehension. The development of grade level
passages and corresponding questions, decoding and phonological awareness tasks, Maze
tests, and vocabulary tasks, among others are reliable measures of oral reading fluency
and comprehension. The following study considers similar issues in the context of
Kannada, an alphasyllabic language spoken in South India.
Nag and Snowling (2011) conducted a study that explored the differences in
comprehension among students in fourth, fifth, and sixth grade. The students’ first
language is Kannada, an akshara-based script like Bengali and Tamil. Nag and Snowling
add that given the large number of akshara in Kannada, the reading acquisition process
extends well into fourth and fifth grade. The researchers considered two aspects of
reading comprehension. The first was the relationship between decoding, phonological,
and comprehension skills. The second considered the link between inflection and
vocabulary knowledge and comprehension.
In the study, there were ninety-five participants from twelve schools. In order to
explore the relationship between comprehension, decoding and phonological skills, the
students read six passages and answered two questions. The passages consisted of fiction,
informational texts, biographies, and riddles from fourteen to thirty-seven words in length
(Nag & Snowling, 2011). The reading comprehension test was also used to assess reading
accuracy. Other tests of reading accuracy were word and nonsense word lists. Tests to
measure phonological processing were also included.
In order to explore the link between inflection, vocabulary knowledge, and
comprehension, the participants took a test that consisted of defining the meaning of
words on word lists developed from grade level texts. They were also given ten sentences
of varied length to repeat. The researchers made sure that the longer sentences contained
less complex syntax so that cognitive space could be freed to focus on inflection. They
counted the number of omissions and substitutions to determine the participants’
inflection knowledge.
Nag and Snowling (2011) found that varied levels of reading comprehension were
attributed to phonological processing skills and word accuracy. Phonological skills
facilitated the ability to decode. Accuracy at the word level facilitated reading
comprehension. Low performance on word and nonsense word lists as well as syllable
and phoneme manipulation tasks were indicative of poor reading comprehension. The
authors state that while these findings support similar findings in other languages, they do
not consider that low performance on decoding skills alone accounts for poor
comprehension. The findings from the second part of the study showed that “once
reading accuracy and phonological skills had been controlled for,” vocabulary and
inflection knowledge were strong predictors of reading comprehension. The participants
who had less vocabulary knowledge also had limited comprehension. In the context of a
language rich in inflections, the participants who had knowledge of morphological
segments were better able to read with understanding. Inflection errors occurred in noun-verb and noun-adjective agreements. The researchers add that inflection knowledge was
an independent predictor that explained the varied levels of comprehension found in
participants’ processing of text. Nag and Snowling conclude that more studies are needed
to further investigate the relationship between children’s oral language development,
knowledge and processing of low and high frequency inflections, and reading
comprehension in akshara-based languages.
In a more recent study conducted by Nakamura (2014) in two language
communities of South India, the researcher investigated how children acquire reading
skills in Kannada or Telugu and English and how children’s knowledge of the reading
process transfers across scripts. An important contribution of this study is the exploration
of the predictors of the reading acquisition process within and across alphasyllabic and
alphabetic languages as well as an analysis of the transfer of literacy knowledge.
According to the researcher, an overarching goal of the study was to explore the
possibility of an identifiable threshold in multilingual settings at which children’s reading
outcomes in the alphasyllabic language would make it more likely that they could
transfer their literacy knowledge to English.
Nakamura (2014) and her colleagues selected 556 students in standards 1 to 5
from 13 low-income schools representative of the urban, rural, government, and private
schools in South India. Out of the 322 Kannada speakers, there were 168 boys and 154
girls that participated in the study. Out of the 234 Telugu speakers, there were 116 boys
and 118 girls that participated in the study. The students participated in three rounds of
assessments that comprised reading skill subtasks. In the first round, the students
completed nine tests that comprised “concepts of print, blending, deletion, letter naming,
decoding, slasher, oral vocabulary knowledge, listening comprehension, and reading
comprehension” (p. 14). While the majority of the subtasks are similar to others
mentioned in the literature review, the “slasher” test assessed fluent word recognition and
required participants to read sentences typed without spaces between the words. The
objective was to mark with a slash each place where a space belonged between the
words in each sentence. In the second round, the subtasks consisted of “akshara
knowledge, spelling, and oral reading fluency” (p. 14). In the last round, the tests
consisted of “deletion, letter naming, decoding, oral vocabulary knowledge, listening
comprehension, and reading comprehension” (p. 14). The researcher notes that
older participants were also required to complete the easier subtasks because the varied
range of reading scores in previous studies indicated that attainment of grade level
reading proficiency may not be reached by all students. Thus, in order to minimize the
probability of yielding zero scores, participants had to score at least 30% on
graduated eligibility subtasks such as print concepts, letter naming, and oral reading
fluency before moving on to the next level of subtasks. Out of the entire sample
size, 91% were eligible for decoding, 76% were eligible for oral reading fluency in
Kannada or Telugu, and 62% were eligible for oral reading fluency in English.
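The graduated eligibility rule described above can be sketched in a few lines. The function name, the score format, and the assumption that every gateway subtask must clear the 30% cutoff are illustrative rather than drawn from Nakamura's instruments:

```python
ELIGIBILITY_CUTOFF = 0.30  # minimum score on each gateway subtask

def eligible_for_next_level(gateway_scores):
    """Return True if a participant advances to the next level of subtasks.

    Assumes (as one reading of the study's procedure) that every gateway
    subtask score must be at least 30% before harder subtasks are given.
    """
    return all(score >= ELIGIBILITY_CUTOFF for score in gateway_scores.values())

# Hypothetical participant: passes print concepts and letter naming,
# but falls below the cutoff on oral reading fluency.
scores = {"print_concepts": 0.80, "letter_naming": 0.45, "oral_reading_fluency": 0.20}
advances = eligible_for_next_level(scores)
```

Applied to a whole sample, a rule of this kind yields the eligibility percentages reported above (91% for decoding, 76% and 62% for oral reading fluency).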
With regard to the main findings on participants’ literacy knowledge in Kannada or
Telugu, Nakamura’s (2014) study further confirms that the ability to decode in an
alphasyllabic language is contingent upon both syllabic and phonemic awareness. She
adds that children’s oral language development in the mother tongue, as evidenced by a
repertoire of vocabulary words and the ability to understand the language of sentences
and stories, also plays a crucial role in the ability to identify the relationship between
akshara symbols and sounds. In fact, oral language development continues to support
reading development across the grades. In addition to the finding that syllabic awareness,
phonemic awareness, and oral vocabulary knowledge were strong predictors of reading
outcomes in Kannada and Telugu, the author reports that the participants were able to
decode with accuracy and speed in these respective languages by fifth grade. The
findings also showed that boys and girls scored similarly across the subtasks. However,
mastery of decoding skills in the three languages does not necessarily correlate with comprehension.
Nakamura draws this conclusion based on the finding that student performance was
higher on subtasks that measured accuracy in oral reading fluency than on reading
comprehension subtasks.
With regard to the main findings on participants’ literacy knowledge in English,
Nakamura (2014) states that phonemic and syllabic awareness as well as oral vocabulary
knowledge correlated with participants’ scores on the decoding subtasks, much as they
did in Kannada and Telugu. However, she adds that “syllable awareness was not a significant
predictor of English coding ability” (p. 22). The author notes that knowledge of oral
vocabulary had a consistent effect on decoding skills across the grades. In contrast to her
findings on Kannada and Telugu oral reading fluency subtasks, there was a significant
correlation between participants’ ability to read with accuracy and comprehension in
English. Nakamura also found that participants’ decoding skills in Kannada or Telugu
influenced their ability to decode efficiently in English.
This major finding supports the theoretical underpinnings of the study, which hold
that “cognitively demanding tasks that underlie reading in multilingual children are
sharable, transferrable, and facilitative across languages” (Nakamura, 2014, p. 8). The
author contends that as children learn to read in two languages with different
orthographies, they first acquire language-specific subskills and shared subskills that
maximize reading outcomes in the mother tongue. Once children acquire foundational
skills in the first language, they engage in a “cognitive resource sharing” process in
which they build on these shared subskills while they learn the specific subskills needed
to maximize reading outcomes in the second language. Moreover, Nakamura (2014)
identifies 60% as the approximate “threshold point” at which children’s literacy
knowledge in Kannada or Telugu optimally supports the ability to decode in English. The
participants in the study reached this threshold at the end of Standard 4.
The above predictors for decoding ability in Kannada, Telugu, and English hold
implications for test design, pedagogical approaches, curriculum frameworks, program
development in multilingual settings, and language in education policy. For Kannada
and Telugu, these implications rest on a nuanced understanding of reading development
in an alphasyllabic script. Thus, Nakamura’s (2014) study emphasizes the ways in which
learning to read in an alphasyllabic script differs from learning to read
in an alphabetic script. For instance, reading development in Kannada or Telugu requires
that teachers simultaneously build students’ syllabic and phonemic awareness so that
students can read with fluency. Students also need ample time to learn the visually
complex script and symbol-sound correspondence at syllabic and phonemic levels before
they develop strong decoding skills. This is in line with Nag’s (2007) research, which
compared the pace of acquisition of akshara knowledge and phonemic awareness in low
and high performing schools and found that students in both types of schools required a
prolonged period of time to hone their decoding skills.
Nakamura (2014) also points out that students learning to read in a transparent
alphasyllabic script need explicit instruction when it comes to learning the rules and
patterns that govern symbol-sound correspondence. She argues that phonics-based and
sight word instructional approaches are not conducive to learning to read in an
alphasyllabic script. Instead, Nakamura recommends the provision of akshara charts that
are readily visible, formal and informal activities that build oral language skills, the
creation of games that foster opportunities to practice the spatial relationships and
phonemic markers present in aksharas, among other strategies.
Nakamura (2014) concludes that children’s oral language development is distinct
from their reading development within and across languages. She stresses the need to
address this distinction in the identification of appropriate pedagogical approaches in
context-specific languages of instruction as well as a thorough consideration of the timing
of instruction. Nakamura highlights the need for more research studies that investigate
reading development in alphasyllabic scripts so that impact studies on reading
intervention programs conducted by local and international organizations can begin to
demonstrate marked improvement in children’s reading outcomes and inform language in
education policies.
The studies discussed above contribute to research centered on foundational
skills, fluency, and comprehension in alphasyllabic orthographies. Along with factors
such as age, grade, gender, and socioeconomic status, linguistic and cognitive
perspectives also inform fluency data trends from impact studies conducted by
international organizations. The next section explores fluency data trends specific to the
Bangla language and contributes to the conversation on the relevance of language-specific
oral reading fluency benchmarks.
Bangla Fluency Data Trends
According to Education Watch (2000), only 4 of the 53 terminal competencies in
Bangladesh’s primary education system pertain to the Bangla language. In a study
conducted by the Bangladesh Rural Advancement Committee (BRAC) Research and
Evaluation Division, the terminal competencies of Bangla reading, writing, and listening
of fifth grade students were assessed. The broader goal of the report was to inform
stakeholders about the state of primary education in Bangladesh in two ways. The first
was through an assessment of fifth grade students’ ability to meet the terminal
competencies. The second was through an evaluation of teacher education for the primary
grades. In order to investigate students’ attainment of the competencies, the team
randomly selected 2,509 students from 186 schools. The students attended government
schools, private schools, or informal schools.
The Bangla assessment tool consisted of ten test items. In order to assess reading
competence, students first read aloud a printed paragraph and a handwritten paragraph
then responded to four questions. Two questions were designed for each paragraph.
Students who correctly responded to a minimum of one question for each paragraph were
assessed as having adequate reading skills. The findings indicate that 65% of students
met the minimum criteria for reading competency. Nationwide, 62.2% of students
responded correctly to both questions on the handwritten paragraph while 33.6% of
students responded correctly to both questions on the printed paragraph (Education
Watch, 2000).
For the listening comprehension assessment, students first listened to a recording
of a paragraph then responded to two questions. If students responded correctly to at least
one of the two questions, they were assessed as having met the requirement for listening
competency. Collectively, 80.7% of students met the minimum criteria while 43.2%
responded correctly to both questions. There was no statistically significant difference
between the performance of boys and girls. Students in the urban areas performed better
than students in the rural areas, at 87.2% and 78.5% respectively (Education Watch,
2000).
In the writing assessment, students were asked to correctly complete three out of
four prompts. The research team defined correct sentences as those that made sense and
contained at least half of the words spelled correctly. The first prompt was to correctly
write about an object they could see. The second prompt was to correctly write about an
object they could not see. The third prompt was to fill out a form. The fourth prompt was
to write a message. The findings show that 55% of students met the minimum criteria
(Education Watch, 2000).
In another level of analysis, the BRAC research team combined students’ scores
for each of the competencies, reading, listening, and writing. The findings indicate that
on average, only 36.5% of students achieved minimum competence in reading, listening,
and writing. By gender, only 33.2% of girls and 39.8% of boys achieved minimum
competence. By area, only 34% of rural students and 46.3% of urban students achieved
minimum competence. The report further states that by the end of primary school, less
than 2% of students meet the standards set by all 53 terminal competencies.
The researchers posit that these alarming percentages bear directly on issues
of equity and quality in education, given that the small number of students who attained
all 53 competencies were those who attended the “best” schools in the capital (Education
Watch, 2000). Based on their general low achievement findings, the authors recommend
that teacher education, accountability systems, materials for teaching and learning,
competencies, and language objectives should be reexamined for the purpose of building
students’ basic literacy skills. They emphasize the importance of revisiting the
competencies in order to scaffold children’s learning experiences in a way that
authentically starts where students actually are in their literacy development. An
additional contribution of the report is that it was the first study to address the terminal
competencies in full (Education Watch, 2000). Findings from other studies on
emergent and early reading skills and their link to terminal competencies continue to
inform primary education policy and practice in Bangladesh.
For example, Dowd and Friedlander (2009) published a report on the emergent
and early reading assessment validation study results for Save the Children’s branch in
Bangladesh. A main purpose of the study was to explore whether the assessment tools reflected
variations and relationships within and across children’s reading skills in rural and urban
areas as well as in different reading programs. While the sample size of readers was too
small to yield reliable fluency and comprehension estimates for each grade level, the
study did provide preliminary estimates.
In first grade, participants who read the first story read an average of 14.64 words
correct per minute. First graders who read the second story read an average of 20.10
words correct per minute. In second grade, participants who read the first story read an
average of 33.53 words correct per minute. Second graders who read the second story read an
average of 41.34 words correct per minute.
In the comprehension portion, first grade participants answered 22% of the questions
correctly on the grade level passages. Second grade participants scored significantly
higher. Dowd and Friedlander (2009) state that the much higher scores indicate that more
difficult questions need to be developed for both second grade passages. Although the
authors conclude that the Bangla reading assessment aptly captures grade level
differences, they acknowledge that adjustments to the assessment tool are necessary
given that scores indicate an “increase in reading fluency and comprehension not across
the same passages, but on grade-level texts” (p. 3). They add that fluency scores show
differences across rural and urban areas. In order to improve comparability across areas,
Dowd and Friedlander recommend two options. The first option is to use one passage to
assess all children. The second option is to assess students in one particular grade.
Finally, the authors recommend that the study’s preliminary estimates be used solely as
benchmarks for future grade level reading assessments.
Nath, Guajardo, and Hossain’s (2013) impact study on the Literacy Boost (LB)
intervention reports the changes in Bangla reading skills between the 2011 baseline
findings of grade 3 participants and 2013 endline findings of grade 4 participants. All of
the 465 participants were from the Meherpur District in Bangladesh. Out of this number,
255 children participated in the LB intervention and 210 children participated in the
comparison group. The participants attended schools that received either the Literacy
Boost intervention or Save the Children’s Basic Education Sponsorship intervention,
which for the purposes of the study served as the comparison schools. The reading
assessment data collected consisted of letter knowledge, single word reading, decoding,
fluency, accuracy, listening comprehension, and reading comprehension. Participants
who were able to read five words correctly in 30 seconds were referred to as readers. The
assessors multiplied the number of words read correctly in 30 seconds by two to calculate
the reading rate. Participants who could not read five words correctly in 30 seconds were
referred to as nonreaders. The protocol for nonreaders involved listening to the passage
and responding to the same set of comprehension questions as the readers. The
researchers point out that the benchmark for each assessment measure was set at the 75th
percentile of the baseline distribution. For fluency, they identified 42 words correct per
minute as the benchmark.
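As a minimal sketch, the scoring protocol just described (reader classification, rate doubling, and a 75th-percentile benchmark) might be expressed as follows. The names and the sample counts are hypothetical, and `statistics.quantiles` is used only to illustrate the percentile step, not to reproduce the study's exact method:

```python
import statistics

READER_THRESHOLD = 5  # words read correctly in 30 seconds

def is_reader(words_correct_30s):
    """Participants reading at least five words correctly in 30 seconds count as readers."""
    return words_correct_30s >= READER_THRESHOLD

def reading_rate(words_correct_30s):
    """Double the 30-second count to obtain words correct per minute."""
    return words_correct_30s * 2

def percentile_benchmark(rates, pct=75):
    """Set the benchmark at the given percentile of baseline reading rates."""
    return statistics.quantiles(rates, n=100)[pct - 1]

# Hypothetical baseline counts of words read correctly in 30 seconds
counts = [3, 8, 12, 15, 21, 24, 10, 7]
rates = [reading_rate(c) for c in counts if is_reader(c)]
benchmark = percentile_benchmark(rates)
```

With these illustrative counts, the participant reading 3 words in 30 seconds would be classified as a nonreader and excluded from the rate distribution.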
Nath and his colleagues state that the difference between the fluency baseline and
endline scores of the participants in the Literacy Boost intervention group and the
comparison group was not statistically significant. At baseline, 83% of the LB
intervention group could read five words in 30 seconds. At endline, 88% of the LB
intervention group could read five words in 30 seconds. At baseline, 85% of the
comparison group could read five words in 30 seconds. At endline, 83% could read five
words in 30 seconds. Both groups started at approximately 29 words correct per minute
and nearly doubled their reading rate in the endline assessment. Both groups therefore
exceeded the established benchmark of 42 words correct per minute.
In the reading accuracy measure, 70% of the LB intervention group could
accurately read a grade level passage in the baseline. At endline, 81% of the LB
intervention group could accurately read a grade level passage. At baseline, 73% of the
comparison group could accurately read a grade level passage. At endline, 78% of the
comparison group could accurately read a grade level passage. Neither group met the
benchmark goal of 92% in the reading accuracy measure (Nath et al., 2013).
In reading comprehension, the participants from both groups were able to
correctly answer 1.5 out of 5 comprehension questions based on a grade level passage at
the baseline. In the endline, participants in both groups were able to correctly answer
approximately three out of five comprehension questions. Both groups exceeded the
benchmark goal of responding correctly to two out of the five comprehension questions
(Nath et al., 2013).
The researchers developed another way to analyze the comprehension data based
on a composite measure. They explain that the composite measure entailed reading grade
level passages, correctly answering 75% to 80% of the comprehension questions, and
scoring no more than one standard deviation below the corresponding average
fluency or average accuracy. While few of the participants from the LB intervention and
comparison groups could read with comprehension in the baseline, approximately 37% of
the participants from both groups were able to do so in the endline assessment.
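Under one plausible reading of that composite measure, in which “a minimum of one standard deviation below” is taken to mean scoring no more than one standard deviation below the group mean, the check could be sketched as follows. All names, the 75% cutoff choice, and the either/or condition on fluency and accuracy are assumptions:

```python
def reads_with_comprehension(score, group):
    """Composite check under an assumed interpretation: at least 75% of
    comprehension questions correct on a grade level passage, with fluency
    or accuracy no more than one standard deviation below the group mean.
    """
    comp_ok = score["comprehension"] >= 0.75
    fluency_ok = score["fluency"] >= group["mean_fluency"] - group["sd_fluency"]
    accuracy_ok = score["accuracy"] >= group["mean_accuracy"] - group["sd_accuracy"]
    return comp_ok and (fluency_ok or accuracy_ok)

# Hypothetical group statistics and one participant's scores
group = {"mean_fluency": 55, "sd_fluency": 15, "mean_accuracy": 0.90, "sd_accuracy": 0.06}
participant = {"comprehension": 0.80, "fluency": 50, "accuracy": 0.85}
```

The share of participants for whom such a check returns true corresponds to the roughly 37% reported at endline.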
Nath and his colleagues conclude that in general the impact of Literacy Boost
shows modest gains for fluency and comprehension. While struggling students
demonstrated marked improvement from the baseline to the endline assessment, overall
reading scores did not improve significantly in the LB schools. In order to contextualize
the findings and identify correlations, the authors considered other contributing factors
such as school attendance, reading buddy participation, lack of a home literacy
environment, reading camp attendance, and students’ workloads. The establishment of
correlations to a confluence of factors reveals the complexity of the challenge in
promoting children’s Bangla reading fluency. The situation is further compounded by a
lack of national level fluency benchmarks for reading in the children’s first language.
Basher, Jukes, Cooper, and Rigole (2014) confirm that while small-scale studies
are conducted by local and international organizations, no standard for
oral reading fluency exists in Bangla. Room to Read’s (RtR) study in the report
Bangla Reading Fluency in Early Grades had a twofold purpose. An initial purpose was
to collect impact data on the organization’s Reading and Writing Instruction (RWI)
program in government primary schools. A second purpose was to collect data on
children’s early grade reading in schools around the country in order to compare these to
the average fluency rates and comprehension skills of schools where Room to Read
implements its RWI program.
The researchers randomly selected 30 first grade students and 30 second grade
students from 30 RtR supported government primary schools in Sirajganj District. The
actual sample size consisted of 809 first graders and 791 second graders, for a total of
1,600 participants from the RtR supported schools. For the nationally representative
sample, the researchers randomly selected 1,630 first graders and 1,464 second graders
from 84 schools representative of the twelve primary streams, for a total of 3,094. For
both sample sizes, the researchers attempted to select an equal number of boys and girls.
The total sample size for the study was therefore 4,694 students from 114 schools. The
Early Grade Reading Assessment (EGRA) tool was customized to fit the linguistic
features of the Bangla language and standard EGRA testing procedures were
implemented. The study did not include baseline data (Basher et al., 2014).
The findings show that first graders in RtR supported schools can correctly read
18.99 words per minute and provide correct responses to 1.38 out of five comprehension
questions at the end of the year. First graders in comparison schools from around the
country can correctly read 15.51 words per minute and provide correct responses to .98
questions out of five comprehension questions at the end of the year. The findings reveal
that second graders in RtR supported schools can correctly read 41.35 words per minute
and provide correct responses to 2.16 out of five comprehension questions. Second
graders in comparison schools from around the country can correctly read 33.44 words
per minute and provide correct responses to 1.79 out of five comprehension questions.
While Basher and his colleagues report that the effect sizes for the RWI
intervention ranged from 0.34 to 0.53 and were greater for reading comprehension, the
above findings clearly indicate that students in Room to Read supported schools also
performed better than students from comparison schools around the country in oral
reading fluency. However, even with the RWI intervention, 20% of first
graders and 6% of second graders could not read at the end of the year. Further
cause for concern is that across the country, approximately 32% of students in
first grade and 16% of students in second grade could not read. Comprehension
scores for Room to Read supported schools and countrywide schools were also very low.
Based on these findings, the authors suggest that students in primary school
would benefit from a supplementary Bangla literacy intervention. They state that oral
reading fluency should serve as an “indicator of the quality of education in the early
grades” (p. 40). The authors conclude that further large-scale, countrywide research
should collect data on oral reading fluency rates across streams and divisions in primary
schools. This would yield a more robust nationally representative sample and support the
process of setting benchmarks for oral reading fluency within and across the primary
grades.
Sayed, Guajardo, Hossain and Gertsch (2014) conducted a baseline survey report
on primary schools in Bangladesh to identify children’s performance on the READ
project’s intervention areas. A second objective of the baseline assessment was to
benchmark the basic reading skills of students in grades 1, 2, and 3. These benchmarks
would then be compared to the results on another assessment of children’s basic reading
skills administered later in the academic year, soon after the implementation of the
READ intervention. The researchers also collected background data to highlight the
factors that may affect students’ reading acquisition process in the primary grades.
The data collection phase lasted from March to June 2014. In the study, 30
schools were selected for the control group; 39 schools that received support from
Promoting Talent through Early Education (PROTEEVA), a pre-primary intervention,
were selected for the second group; and 32 schools that received both the PROTEEVA
and READ interventions were selected for the third group. In total, 101 schools
from 21 districts participated in the study. The total number of participants
was 3,008, which consisted of 1,004 first graders, 1,003 second graders, and 1,001 third
graders (Sayed et al., 2014).
Specific grade level competencies were assessed, but for the purposes of this report,
only the main findings relevant to the oral reading fluency, accuracy, and comprehension
subtests are discussed. For the oral reading fluency and reading comprehension
assessments, the first grade participants read a grade level short story of 59
words. First grade participants who were able to read a minimum of five words in the
first 30 seconds were referred to as readers. After reading the story, these students then
responded to ten comprehension questions. The types of comprehension questions
consisted of literal, inferential, summary, and evaluative questions. The assessors read the
story to the first grade participants who were unable to read five words in the first minute.
These participants then responded to the ten comprehension questions posed.
The main findings reveal that only one out of ten participants in the first grade
was a reader. On average, first grade readers could accurately read approximately 46 out
of the 59 words in the story. First grade readers could therefore read with 79% accuracy.
In the oral reading fluency measure, first graders could read on average 16 words per
minute. In the comprehension measure, first graders on average correctly responded to 4
out of 10 questions. Overall, evaluative, inferential, and summary questions posed the
most difficulty. In the comparative analysis, READ supported schools and READ plus
PROTEEVA supported schools show similar results in fluency, accuracy, and
comprehension. While there is no statistical difference in fluency and reading accuracy
across the three groups, first graders in the control schools did not perform as well in
reading comprehension as the READ supported schools and the READ plus PROTEEVA
supported schools (Sayed et al., 2014).
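The scoring rules applied across the READ baseline can be sketched as follows. The story lengths come from the report, while the function names and the uniform 30-second window (the report's wording varies between 30 seconds and one minute across grades) are assumptions for illustration:

```python
# Story lengths by grade, as reported in the READ baseline survey
STORY_WORDS = {1: 59, 2: 83, 3: 122}
READER_MINIMUM = 5  # words read in the opening window

def is_reader(words_in_window):
    """A participant who reads at least five words in the opening window
    of the assessment is classified as a reader."""
    return words_in_window >= READER_MINIMUM

def accuracy(words_correct, grade):
    """Share of the grade level story read correctly."""
    return words_correct / STORY_WORDS[grade]
```

For example, a first grade reader who correctly reads 46 of the 59 story words scores just under 80% accuracy, consistent with the figure reported above.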
The second grade fluency assessment consisted of students reading a grade level
passage made up of 83 words. Students who could correctly read a minimum of five
words in the first minute were referred to as readers. After reading the story, these
students then responded to ten comprehension questions. The findings show that second
graders across the three types of schools correctly read an average of 23 words per
minute. One third of second graders could read five words correctly in the first minute
with an average 83% accuracy. In reading comprehension, second graders scored an
average of approximately 25%. Similar to the first graders, the second graders had
difficulty responding to inferential, summary, and evaluative questions. In the
comparative analysis, the statistical difference for fluency, accuracy, and comprehension
between second grade students in READ schools, READ plus PROTEEVA schools, and
control schools was not significant (Sayed et al., 2014).
The third grade fluency assessment consisted of students reading a grade level
passage made up of 122 words. Students who correctly read a minimum of five words in
the first 30 seconds were referred to as readers. After reading the story, these students
then responded to ten comprehension questions. The findings show that third graders
across the three types of schools correctly read an average of 28 words per minute. Two
thirds of third graders could read five words correctly in the first 30 seconds with an
average 83% accuracy. The average reading comprehension for third graders was
approximately 16%. Forty-three percent of third graders from the three types of schools
could not respond to any of the comprehension questions. As in the findings for first and
second grade, third graders found the inferential, evaluative, and summary questions the
most difficult. In the comparative analysis, the statistical difference for fluency, accuracy,
and comprehension between third grade students in READ schools, READ plus
PROTEEVA schools, and control schools was not significant (Sayed et al., 2014).
The researchers acknowledge the stark contrast between rising oral reading
fluency rates and decreasing comprehension across the grades. The authors recommend
adjustments in the intervention such as intense instructional support targeted to the areas
of letter knowledge, phonemic awareness, and comprehension. They also recommend
professional development on the use of formative assessments and additional reading
materials for children. The authors conclude that adjustments made based on the findings
from the baseline survey and data collected from the scheduled endline assessment will
further inform the impact of the intervention. An important conclusion that Sayed and his
colleagues draw from the findings in the baseline survey report is that there are children
in grades 1, 2, and 3 that are nearly “on pace” for their respective grade level while a
large number of children lag behind and simply have not learned to read.
The in-country fluency trends described in the studies above indicate there is an
urgent need to establish fluency and comprehension as critical indicators of both quality and
equity in education. National norms for oral reading fluency performance have proven to
serve as powerful indicators of reading competence in the early grades (Hasbrouck &
Tindal, 2006). The development of national fluency benchmarks in Bangla for the
primary grades is a step toward this goal. Common grade level standards of attainment
for oral reading fluency can help stakeholders assess and determine where students in
every primary school are in relation to the standard and provide them with the necessary
scaffolds to meet the benchmark goal by the end of the academic year (Hasbrouck
& Tindal, 2005). While local institutions and international organizations conduct research
and provide varied intervention programs that aim to identify and address the gaps in
children’s basic literacy skills, the creation of national standards for oral reading fluency
in Bangla would further support the efforts of education sector stakeholders toward the
goal of teaching children to read with fluency and comprehension.
In order to move closer toward this goal, it is important to think deeply about the
language-specific and universal features of the reading acquisition process that lend
themselves to a better understanding of the relationship between fluency and
comprehension in the context of the Bangla language. The next section discusses some of the
universal features of the reading acquisition process and explores the intersection
between fluency and comprehension.
The Relationship between Fluency and Comprehension
Fountas and Pinnell (2006) describe the universal characteristics of fluency as
multidimensional, requiring four levels of processing: the letter or symbol, the word,
the phrase and sentence, and the entire text. As children learn to read, there are key processing
mechanisms that occur at each point. At the level of the letter or symbol, children use
visual information to note the differences in each such as size and shape. Children who
attain letter fluency are able to name a letter and its sound(s). They notice that these
letter symbols connect to form words. At the word level, children notice that words come
in varied lengths and that words hold meaning. They also use picture cues in the text to
figure out words. Children also begin to recognize they have seen some words before.
Familiar words and word parts help children figure out how to read new words or word
parts in a sentence. As they practice new and familiar words, children learn that a
continuous string of words is connected and holds layers of meaning. At the phrase and
sentence levels, children notice how punctuation and the implicit grammatical rules of a
given language help them to visually segment parts of the sentence in a way that makes
sense. They begin to use everything they know about sentence structure and vocabulary
from their formal and informal experiences with speaking and listening as well as stories
read to them in order to practice the flow of reading written language. In their reading,
children begin to practice such elements of fluency as tone and inflection in a way that
supports understanding of the text. At the level of the text, children’s growing awareness
of the author’s tone and how written language works in terms of the structures of fiction
and nonfiction texts helps them to process the text more fluently. By looking at fluency
from the perspective of these four levels of processing, it is evident that children engage
in different forms of meaning making at each level.
On the link between fluency and comprehension at the level of the text,
Fountas & Pinnell (2006a) aptly point out:
“Ultimately, the reader must use comprehension itself to support
fluency…comprehension and fluency are intricately and intercausally connected.
Each benefits from and influences the other. They are, in fact, parts of the whole
act of reading—the complex processing that readers do—and they are extremely
hard to separate. Readers use the structure, or organization, of the text, as well as
their background knowledge, to support both comprehension and fluency.”
(p. 67).
It is clear that more than fluency is required to read with understanding (Nag, Chiat,
Torgerson, & Snowling, 2014). In the classroom, the more children are engaged in
explicit, scaffolded instruction and are provided with many opportunities to practice
decoding at each level of fluency processing, the better chances they will read with
fluency and increased comprehension. In this process, oral language development plays a
critical role in learning to read with fluency and understanding. Children’s oral language
development is the building block for the development of print concepts, phonological
awareness, phonemic awareness, letter or akshara symbol knowledge, symbol-sound
relationships, as well as vocabulary and writing (National Institute of Child Health and
Human Development, 2000, 2005; Nag & Snowling, 2011).
Fountas and Pinnell (2006) explain how oral language development plays a crucial
role in oral reading fluency and comprehension. They elaborate that children use what
they know about how spoken language sounds and the nuanced meanings of words
they acquire through their experiences with talk in school, at home, and in the wider
community. Part of oral language development therefore includes the acquisition of
vocabulary through talk. This is in line with Nation and Snowling’s (2004) research
which found that the more oral vocabulary words children know, the easier it is to draw
on phonological awareness and comprehension to recognize words in print, thereby
improving reading fluency. A teacher's challenge is to bridge children's implicit syntactic
and semantic knowledge with the structural, visual, and meaning cues within the
text so that children learn to read with fluency and understanding. The authors highlight the six
dimensions of fluency: pausing, phrasing, stress, intonation, rate, and integration. Table 1
below shows Fountas and Pinnell’s (2006) definitions of each dimension of fluency.
Table 1.
Six Dimensions of Fluency
Dimensions
Pausing refers to the way the reader’s voice is guided by punctuation.
Phrasing refers to the way readers put words together in groups to represent the
meaningful units of language. Phrased reading should sound like oral language,
although more formal. Phrasing involves pausing at punctuation as well as at places in
the text that do not have punctuation.
Stress refers to the emphasis readers place on particular words (louder tone) to reflect
the meaning as speakers would do in oral language.
Intonation refers to the way the reader varies the voice in tone, pitch, and volume to
reflect the meaning of the text—sometimes called “expression.”
Rate refers to the pace at which the reader moves through the text. An appropriate rate
moves along rapidly, with few slowdowns, stops, or long pauses to solve words. If a
reader has only a few short pauses for word solving and picks up the pace again, look at
the overall rate. The pace is also appropriate to the text and purpose of the reading—not
too fast and not too slow.
Integration involves the way the reader consistently and evenly orchestrates pausing,
stress, intonation, and rate. The reader moves smoothly from one word to another, from
one phrase to another, and from one sentence to another, incorporating pauses that are
just long enough to perform their function. There is no space between words except as
part of meaningful interpretation. When all dimensions of fluency—pausing, phrasing,
stress, intonation, and rate—are working together, the reader will be using expression in
a way that clearly demonstrates that he understands the text and is even thinking beyond
the text.
Source: Fountas & Pinnell (2006a)
When children increasingly attain fluency at the levels of the letter or akshara, word,
phrase, and sentence, they further benefit from instructional support and ample time to
practice reading and integrating the six dimensions of fluency with grade level texts. It is
clear from this perspective there is more to oral reading fluency and how children process
texts than measures of accuracy and speed. At each of the levels of fluent processing, it is
important to tap into children’s repertoire of resources such as background knowledge
gained from personal experiences (Calkins, 2000) for comprehension and support their
ability to build an inner control of the reading process over time (Clay, 1991).
In regard to the process of building this inner control, Moore and Lyon (2005)
state that “children who read slowly, word-by-word, with little expression, have
difficulty comprehending and remembering what they read. It is this connection to
comprehension that makes fluency most critical” (p. 53). When children struggle to read
and require additional support in the process of building an inner control of the reading
process, an analysis of children’s reading behaviors can provide valuable data about the
sources of information that they use to decode and make meaning. In the case of Bangla,
syllable processing is a strong predictor of fluency and comprehension (Nag et al., 2014a;
Nag et al., 2014b).
According to Fountas and Pinnell (2009), sources of information on children’s
reading behaviors are usually classified into three categories: meaning information, which
may include reading and understanding pictures as well as words; structural information,
which involves the arrangement of words and sentences in a given language; and visual
information, which includes text structures, symbol-sound relationships at the level of
phonemes and syllables, as well as spaces and diacritical marks. Teachers may use running
records as a tool to monitor, assess, and adjust instruction based on these three categories
(Moore & Lyon, 2005). By conducting error analyses within and across these categories,
it is easier to understand the individual and collective profiles of readers and describe
what is happening in the intersection between fluency and comprehension.
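To illustrate only the tallying step of such an error analysis (a sketch, not a procedure prescribed by the sources cited here), each miscue from a running record can be coded by the cue source the reader appeared to use and then counted. The miscue codes below are hypothetical.

```python
from collections import Counter

# Hypothetical miscue codes from one child's running record. Each miscue is
# coded by the cue source(s) the reader appeared to draw on:
# M = meaning, S = structural, V = visual.
miscues = ["V", "MS", "V", "M", "SV", "V", "MSV"]

# Count how often each cue source figured in the child's errors.
tally = Counter(code for miscue in miscues for code in miscue)

print(dict(tally))  # → {'V': 5, 'M': 3, 'S': 3}
```

A profile dominated by visual miscues, as in this invented example, would suggest a reader leaning on symbol-sound information at the expense of meaning and structure.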
An important point to consider in this intersection is that even when children are
still in the process of learning to read, they can develop comprehension skills. While
fluency and comprehension overlap in some ways and are interdependent, children have
the capacity to tell and listen to stories, as well as read pictures in order to ask questions,
make predictions, inferences, and connections, identify important ideas, and synthesize as
evidence of critical thinking and comprehension (Moore & Lyon, 2005). Children may
therefore comprehend a text even if they cannot process it through the written language.
Alternatively, they may also use their decoding skills to read fluently yet not understand a
word of what they have read because what they have just read does not stay long enough
in working memory for it to actually make sense. Anyone who listens to children who
read in this way and then discusses the text with them can attest that what was read
aloud with fluency was a string of words unattached to meaning. This is how expression,
another important characteristic of oral reading fluency, can offer clues to the link
between fluency and comprehension.
In order to process a written text with understanding, children must attend to the
“visible” sources of information described above as well as “invisible” sources of
information (Fountas & Pinnell, 2006). These include various types of knowledge such as
how language works, knowledge of concepts and facts, knowledge derived from
personal, social, and cultural experiences, and knowledge of the characteristics of written
texts such as genres and story elements (Calkins, 2000; Fountas & Pinnell, 2006;
Gonzalez, 2005). Fountas and Pinnell (2006) posit that even when children develop the
decoding skills needed to read with fluency, they are continually challenged by texts
across grade levels to maintain oral reading fluency as part of a network of processing
systems that occur at the same time during reading. This network of strategic actions
portrays reading as thinking, a network in which readers think within the text, about the
text, and beyond the text in order to make meaning. Figure 1 illustrates how the
maintenance of fluency and other strategic actions support comprehension.
Figure 1.
A Network of Processing Systems for Reading
Source: Fountas & Pinnell (2006a)
Moore and Lyon (2005) further support this expanded description of oral reading
fluency. They argue that past perspectives on fluency narrowly conceived of it as
consisting of two components, namely an appropriate rate and accuracy. While reading
rate can serve as an indicator of comprehension, the authors recommend lessening the
emphasis on speed in the early grades and assessing it after first grade for English. Some
reasons the authors give are that emergent readers are still practicing directionality and
that automaticity usually sets in later because children are slowly acquiring a bank of
sight words. This should be considered in the case of Bangla, given that the research
studies previously cited in this report note that emergent readers need more time to attain
automaticity due to the visually complex features of the alphasyllabic script (Nag, 2007,
2011; Nag & Sircar, 2008; Nakamura, 2014). In terms of accuracy, researchers hold that
in order to maximize comprehension children should read 9 out of every 10 words
correctly (Moore & Lyon, 2005; Clay, 2000). However, there are instances in which
children may even read many words correctly per minute and not understand what was
read due to issues with working memory (Abadzi, 2012). Another point to keep in mind
regarding the relationship between fluency and comprehension is that texts used to
practice learning to read must be “just right,” neither too easy nor too hard; otherwise,
fluency and comprehension are compromised (Fountas & Pinnell, 2001).
The creation of oral reading fluency benchmarks that are aligned with
comprehension benchmarks can greatly inform the publication of children’s books that
scaffold the cognitive demands of texts through a balance of challenge and support along
a continuum within and across grades (Fountas & Pinnell, 2006b). Fountas and Pinnell
explain that children’s books have certain text characteristics that make them accessible
with just the right amount of sentence complexity to support children’s learning to read
and reading to learn. Some of the text characteristics noted by the authors are: genre, text
structure, content, themes and ideas, language and literary features, syntactic and
semantic word and sentence complexity, word length and frequency, genre-specific
illustrations, print layout, sentence length, font size, and number of pages. These
characteristics play a critical role in the way children learn to process texts and ultimately
make meaning.
However, in contexts where textbooks alone are used to teach children how to
read, it is challenging to meet children where they are in their reading acquisition process
because textbooks are often written in one grade-specific reading level rather than a range
of reading levels for a given grade. Therefore, there is often a mismatch among where
individual children are in the reading acquisition process, teachers’ knowledge and
understanding of the cognitive demands of different types of texts, the availability of a
range of successive text level gradients to scaffold reading instruction, and the
language-specific differentiated teaching approaches required to move children along
each stage of the reading continuum.
Fortunately, the fact that low-income countries have little to no access to
children’s books is slowly changing. Local and international organizations are in the
process of working with ministries of education to develop decodable and leveled texts
across many languages. Refer to Appendix A for a sample decodable text in the Bangla
language created by Room to Read. The United States Agency for International
Development (USAID) delineated a simple framework that is universally applicable in
the creation and leveling of books based on criteria that move beyond readability
formulas (Davidson, 2013). In effect, children’s books can then be matched to readers at
different stages of the reading acquisition process. Teachers could assess children’s
reading behaviors in terms of types of decoding errors made, oral reading fluency rate, as
well as literal and inferential comprehension. Based on a triangulation of this data,
teachers could then select texts that reflect the appropriate balance of challenge and
support and adjust explicit instruction accordingly.
While Rasinski (2010) agrees that the more children are able to read fluently, the
more cognitive space is left open to focus on comprehension, he warns that assessment
tools like DIBELS must be careful not to “craft a de facto and reductionist definition of
fluency—a rate attenuated by accuracy” (p. 8). Rasinski highlights the intercausal
relationship between reading rate and comprehension when he argues that although
reading rate may provide a window into children’s decoding skills, vocabulary and
comprehension also impact reading rate. The author uses the term “meta-fluency” to
describe the need to create assessments and instructional methods that help children build
the inner control “of the elements of fluency—accuracy, rate, and expression—to the end
of comprehending what they read to become fluent readers” (p. 8). In regard to oral
reading rate, Abadzi (2012) explains how the visual complexity of alphasyllabic scripts in
addition to akshara combinations take up working memory and that children therefore
take longer to decode longer words, thereby influencing reading rate. This is why it is
important to expose children to print in and out of school. This will sharpen their ability
to recognize familiar words, decode new words, strengthen automaticity, and read with
expression. When Rasinski refers to reading with expression, he means children’s
ability to read smoothly in phrases while also communicating the intended meaning
through nuances in the tone of voice.
Moreover, Rasinski (2010) challenges the notion of fixed oral reading fluency
norms used to benchmark at each grade level. He holds that children can potentially read
a fluid range of words in one minute within and across grades and that this range can be
influenced by many factors. For instance, he wonders how text genres, quality children’s
books, levels of text difficulty, and reading for longer stretches of time impact the
number of words read correctly per minute and ultimately comprehension.
In an alphasyllabic language, inflections are a form of expression that can signify
grammatical categories. Nag and Snowling’s (2011) research on reading comprehension,
decoding skills, and oral language further supports the notion that expression plays a
critical role in fluency and comprehension. In the study, the researchers collected data on
the predictors of reading comprehension in a group of 95 Kannada-speaking children
from 12 schools in India. In addition to a reading comprehension test, the participants
were assessed on vocabulary and inflectional knowledge. In the portion that assessed
inflection, the participants “were asked to repeat a set of ten sentences differing in length,
with longer sentences comprising more substantive words and inflections but simple
syntax, to reduce demands on syntactic knowledge. Knowledge of inflection was
estimated based on the number of omissions or substitutions of inflections made” (p. 93).
In that portion of the study, the findings revealed that knowledge of inflection was an
independent factor that influenced reading comprehension. According to the authors, this
indicates that the more inflections there are in a language, the more children have to learn
about the morphological parts of words so they can make meaning as they read. They
recommend explicit instruction on low and high frequency inflections to support reading
comprehension.
The section above explored the intercausal relationship between fluency and
comprehension. It described how fluency plays a pivotal role when children learn to read
as well as when they read to learn. The section also discussed the complexity of the
reading process in terms of the importance of oral language development and its
connection to phonological awareness, vocabulary acquisition, accuracy, reading rate,
expression, and comprehension. The next section discusses oral reading fluency
benchmark making procedures in the context of a study conducted in Ethiopia.
Ethiopia: A Sample Oral Reading Fluency Benchmark Study
Among the many languages spoken in Ethiopia, Amharic, Tigrigna, and
Hararigna use an alphasyllabic script called a fidel (Piper, 2010; Nakamura, 2014). In
2010, the Ethiopian Ministry of Education (MOE) partnered with Research Triangle
Institute International (RTI) in order to conduct an Early Grade Reading Assessment
(EGRA). The portion of the EGRA that measures oral reading fluency was used to
develop oral reading fluency benchmarks in several mother tongues. While there were
languages that use the alphabetic script included in the study, for the purpose of this
report the benchmark making procedures will be discussed through the lens of Amharic,
Tigrigna, and Hararigna since these are written in an alphasyllabic script.
The RTI research team reviewed the analysis and findings from the MOE’s
country-based learning assessment reports to inform the adaptation of the EGRA tool for
each regional language. The team also analyzed the Ethiopian MOE’s minimum learning
competencies for grades 1 to 4 to ensure that the EGRA tasks aligned with the basic skills
needed to meet the respective learning goals (Piper, 2010).
In order to adapt the EGRA instrument, RTI researchers worked with local
language experts to develop certain subtasks with consideration given to textbooks in
grades 2 and 3. They also held in-country workshops and invited local and international
experts from various entities to support the adaptation process. The RTI team trained
assessors, piloted the adapted EGRA tool, and analyzed the results from this data to make
any needed changes to the subtasks.
Then, the RTI researchers and local experts from the MOE used specific sampling
methods to ensure “regional representativeness.” It took approximately 6 weeks to collect
data. During the analysis stage, RTI researchers only compared zero scores across
languages since the languages are quite different from each other. Piper (2010) points out
that while oral reading fluency benchmarks are language-specific, “the U.S. and
international benchmarks do shed some illustrative light on where Ethiopia is in the area
of reading” (p.21). RTI therefore used 60 Words Per Minute (WPM), the “absolute
lowest benchmark for reading difficulties in the U.S. as well as the number of children
who were reading zero words” to gauge the percentage of children not meeting the
benchmark in each regional language (p. 21).
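The report does not reproduce RTI's computation, but the arithmetic behind figures like these — prorating correct words to a per-minute rate and tallying zero scores and the share of children below the 60 WPM reference point — can be sketched as follows. The records are invented for illustration; the time-proration convention is an assumption.

```python
# Hypothetical oral reading records: (correct words read, seconds elapsed).
# Seconds are 60 unless the child finished the passage early.
records = [(18, 60), (0, 60), (45, 60), (62, 55), (9, 60)]

def correct_wpm(correct, seconds):
    """Correct words per minute, prorated if the child stopped before 60 s."""
    return correct * 60.0 / seconds

scores = [correct_wpm(c, s) for c, s in records]
BENCHMARK = 60  # the U.S. floor used as a frame of reference in Piper (2010)

# Share of children reading zero words, and share below the benchmark.
zero_share = sum(s == 0 for s in scores) / len(scores)
below_share = sum(s < BENCHMARK for s in scores) / len(scores)

print(zero_share, below_share)  # → 0.2 0.8
```

With these made-up numbers, one child in five is a zero score and four in five fall below the 60 WPM reference, the kind of summary that framed the Ethiopian findings.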
The next step was to identify the percentage of children that did not meet the
benchmark by grade level and region. The researchers compared word reading fluency
scores across grades, rural and urban regions, and languages and
identified language-specific gaps. The findings show that less than 10% of the
participants met the benchmark of 60 WPM in any of the regional languages. The
researchers identified variations in the zero scores across regional languages from grade 2
to grade 3. In the analysis of accuracy, Piper (2010) explains that due to the alphasyllabic
script of the Amharic, Tigrigna, and Hararigna languages, “the ability to read words
accurately is not likely to differ from the ability to read the fidel accurately”
(p. 33). This supports findings from other studies that children learning to read in an
alphasyllabic orthography need more time to learn the greater number of symbol-sound
relationships (Nag, 2007; Nag et al., 2014). Accuracy was compared at the fidel and word
levels. The relationship between word naming fluency, decoding fluency, oral reading
fluency, and predictive factors such as student-, school-, and family-level factors was also
analyzed.
All of the data were carefully analyzed in order to create basic oral reading
fluency benchmarks. According to Piper (2010), the following statistical methods were
applied to arrive at an initial set of benchmarks:
“First, quantile regression methods are used to show potential markers for oral
reading fluency scores. Second, analysis of the average reading scores for schools
in the lowest 25th percentile of wealth variables is used to show that schools in
poor areas can do quite well in oral reading fluency. Third, scatter plots matching
oral reading fluency and reading comprehension scores are presented to
investigate the fluency levels necessary to ensure high levels of reading
comprehension. Fourth, multiple regression results are used to determine the
levels of fluency for the expected levels of reading comprehension” (p. 40).
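As a rough illustration of the third step — inspecting where comprehension becomes consistently high as fluency rises — one can band children by fluency and look for the slowest band meeting a comprehension target. This banding shortcut merely stands in for the scatter plots and regression methods RTI actually used, and the paired scores below are invented.

```python
from collections import defaultdict

# Hypothetical paired scores: (correct words per minute, comprehension %).
pairs = [(10, 20), (25, 40), (40, 60), (50, 70), (55, 75),
         (60, 82), (65, 85), (75, 90), (80, 92), (90, 95)]

def lowest_band_meeting(pairs, target=80, width=10):
    """Group children into fluency bands (0-9, 10-19, ... WPM) and return the
    lower edge of the slowest band whose mean comprehension meets the target."""
    bands = defaultdict(list)
    for wpm, comp in pairs:
        bands[wpm // width * width].append(comp)
    for lower in sorted(bands):
        if sum(bands[lower]) / len(bands[lower]) >= target:
            return lower
    return None

print(lowest_band_meeting(pairs))  # → 60
```

With these invented pairs, mean comprehension first reaches 80% in the 60-69 WPM band, which mirrors how a fluency target such as 60 WPM might be read off the data before stakeholders deliberate on a benchmark.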
As part of the analysis of the statistical findings, each regional language group
participated in workshop meetings to mutually decide on the draft fluency and
comprehension benchmarks based on the current oral reading fluency and comprehension
scores. Piper adds that while all language groups had a minimum comprehension rate of
80%, the differences in reading comprehension benchmarks in each language depended on
oral reading fluency targets required to read with understanding. The proposed oral
reading fluency benchmark was 60 WPM for Tigrigna and Hararigna. The two regions
where Amharic is spoken proposed 60 WPM and 90 WPM respectively.
Piper’s (2010) study presents the initial steps taken to determine language-specific
oral reading fluency benchmarks in the Ethiopian context. It used 60 WPM, the lowest
benchmark used in the U.S., as a frame of reference, with the consideration that reading
benchmarks from a different language spoken in another country normally do not apply
in other contexts. Interestingly, for Ethiopian languages such as Afan Oromo and Sidaamu
Afoo, which are written in an alphabetic script, the proposed oral reading
fluency benchmarks were 70 WPM and 75 WPM respectively. The fact that two regions
where Amharic is spoken set the benchmark lower (60 WPM) and higher (90 WPM)
brings to the fore the importance of ongoing discussion during the decision-making
process about oral reading fluency and comprehension data trends, what language-specific
levels are needed for high comprehension, rationales surrounding the
approximations of fluency and comprehension rates, and the pedagogical implications
once benchmarks are set. These varied approximations also illustrate how stakeholders
took into account context-specific data and engaged in deliberation on the levels of
fluency needed for high comprehension. At the time of the study in 2010, extremely low
percentages ranging from 0.1% to 4.3% indicated that children from all regions of
Ethiopia were far from meeting the proposed benchmarks. Therefore, part of the fluency
benchmark making process included the creation of different target percentages for each
region in Ethiopia to be reached by 2015.
While the case of Ethiopia illustrates how the development of oral reading fluency
benchmarks relied on the lowest benchmark used in the U.S. as a reference point and was
open to opposing views on the appropriate benchmark for Amharic, Abadzi (2012)
extends the debate over reading rate comparisons across languages. She explains that due
to the differences in scripts, some believe that reading rates are bound to be
language-specific. Others hold that oral reading fluency rates are in fact comparable across
countries regardless of language or script and that similar and effective intervention
approaches could therefore be implemented. Abadzi argues that, “Across several
languages and scripts 45-60 words per minute amount to 80% comprehension when
vocabulary is known and point to automaticity” (p. 13). While frames of reference are
helpful, it is ultimately up to stakeholders to decide how oral reading fluency benchmarks
are to be established.
In a report on reading fluency measurements in Education for All and Fast Track
Initiative partner countries, Abadzi (2011) informs:
“At least 50 reading fluency studies had been [conducted] worldwide by February 2010. Many
studies involved from 800 to about 3,000 students, but few have collected
nationally representative data. Most focused on specific regions or excluded
remote areas, and a few involved small convenience samples. Of the studies,
many involved EGRA or similarly detailed instruments, while others involved
just passage reading and comprehension questions…Some studies focused on just
one grade, and different single-grade studies may exist in one country with
samples that are not comparable” (p. 11).
In the case of Bangladesh, it will need to be decided if adaptations of EGRA tools and
procedures will be used to develop an initial set of benchmarks or if the Save the
Children Monitoring and Evaluation team and other stakeholders will opt for an
alternative battery of assessments and benchmark making procedures.
The section above highlighted oral reading fluency benchmark making procedures
that can be adapted to the context of Bangladesh. The next section addresses two salient
themes that surfaced during in-country meetings. These themes bring together
stakeholders’ perspectives and immediate concerns about the current language in
education context and pragmatic considerations for the teaching and learning of reading
in Bangla in a competency-based education system. In turn, these themes underscore the
critical relevance of Bangla fluency benchmarks and how these can frame further
dialogue on how children best learn to read in an alphasyllabic script. The next section
contains a synthesis of the notes from in-country meetings.
Considerations for Future Fluency Benchmark Studies in Bangladesh
Language Learning Context
In teacher education programs in Bangladesh, reading as a process is not
formally taught. Teachers teach a Bangla language-based curriculum that
incorporates the four modalities of literacy, namely reading, writing, listening, and
speaking. While children across the country learn in Bangla, English, or Arabic and may
speak a Bangla dialect at home, a large portion of instructional time is devoted to preparing
students for examinations. As a result, children do not experience all the facets of literacy
learning. Further, there may be discrepancies between the National Curriculum and
Textbook Board (NCTB) curriculum and the way it is implemented.
In READ supported schools, students learn to read in Bangla via a whole
language approach. The students practice listening, speaking, and picture reading.
Teachers expose students to the whole sentence first, then the words that make up the
sentence, and finally the letters that make up each word. A challenge for children is to
write words without vowels even though vowels are pronounced. The inconsistency in
the way that some sounds are pronounced and written, which can mean that the order is
inverted, poses difficulty for students.
Students therefore need time to learn the aksharas of the Bangla language.
Around the 8th month of school, first graders are expected to have learned all the single
letters and then begin to learn about the conjunct letters. It is at this critical point that
fluency is normally stunted. Students’ cognitive space is used to build the visual memory
required to identify the shapes of single letters and the sounds they make. The cognitive
demand deepens when students use their visual memory to identify the changes in shape
when single letters are strung together to form conjunct letters and the syllable sounds
these make. In order for students to learn the alphasyllabary principle, ample time and
instructional support are needed to practice the implicit rules of the language.
An area of contention around the notion of implicit rules for Bangla is the topic of
a standard sound system. While a standard Bangla pronunciation may exist, it is not
formally taught in a way that is systematic. Students in different geographical areas may
learn a nonstandard variety of Bangla in school that more closely reflects the dialect of
Bangla spoken at home. This situation can easily give the impression that no standard Bangla pronunciation exists. Consequently, students in some regions are not prepared to pronounce certain words because they have not been formally and systematically exposed to the specific sounds required to build knowledge of letter-sound relationships.
Competencies
In Bangladesh’s competency-based education system, six out of 52 terminal
competencies address Bangla language learning (Chabbott, 2008). This is surprising
given that reading in Bangla is required for children to learn in all other subject areas.
Moreover, some consider that competencies are not set at the right level. In a
competency-based instructional approach, pre-determined outcomes are used to assess
student learning (Education Watch, 2000). However, learning and teaching continua are
not connected to beginning-, middle-, and end-of-year benchmarks. Such benchmarks would help teachers use their ongoing assessments to gauge where students are at a given point in the academic year and to develop short- and long-term instructional plans that help students meet the expected end-of-year competencies. This is further compounded by the fact that
students receive approximately three hours of instruction each school day in multilevel
classrooms where the student to teacher ratio poses an additional challenge.
While the Ministry of Primary and Mass Education (MoPME) recently provided
new textbooks, the annual competency measures are still relatively traditional. It
therefore needs to be confirmed whether the competency measures in the new
instructional materials are linked to a clear set of measures for reading. Furthermore, the competencies may use the word “fluent,” but past measures have never specified what fluency actually entails. For example, one end-of-year competency states that second graders should be able to read simple stories, yet findings from in-country studies indicate the contrary (Basher et al., 2014; Sayed et al., 2014). The
development of fluency benchmark goals would help develop clearer expectations at each
stage of the Bangla reading acquisition process and inform how standards and
competencies are set and measured.
Recommendations
Guidelines in the Benchmark Making Process
In any context, the development of benchmark goals for fluency is an iterative
process that requires time, money, collaboration, and effort. It is therefore pertinent to consider what the READ project’s particular contribution will be and how this contribution will be framed in both policy and practice. While it would seem ideal to
develop from the start specific benchmarks for the beginning, middle, and end of the
year, it is impractical to start there; given the fluency trends previously discussed in this paper, it could even be counterproductive. The massive amount of data that would be generated, along with budgetary constraints, is a further deterrent. One approach may be to
implement the benchmark assessment near the end of the academic year right before third
grade students are administered the National Student Assessment (NSA) to measure end
of the year competencies. This way the READ Team can assess whether the benchmark
tool is predictive of the findings in the third grade National Student Assessment. In
October 2014, the READ project’s Monitoring and Evaluation Team produced a baseline survey report. These baseline findings can also inform the decision-making
process on the endline benchmark tool. During the process, it is important to keep in
mind what levels of reading fluency are required so that students score highly on reading
comprehension.
Table 2 below illustrates a sample timeline for the benchmark making process. After the table, there is a more detailed discussion of the considerations and
recommendations based on in-country meetings and procedures drawn from the Ethiopia
benchmark study (Piper, 2010) cited previously in this report.
Table 2.
Sample Timeline for Benchmark Making Process

Procedure                                                      Approximate Duration
------------------------------------------------------------  -------------------------
READ Team initial internal and external workshop sessions      2-3 weeks
   to discuss the way forward; advocacy, mobilization, and
   collaboration with other stakeholders
Align Bangla reading competencies and the NSA tool             2-4 days
Workshop sessions to select areas within regions, examine      one month
   grade level textbooks, decide on the sampling framework,
   develop a language-sensitive tool
Train assessors and conduct interrater reliability tests       7+ days depending on
                                                                  regions and other factors
Pilot the benchmark tool; test the reliability and validity    2 weeks
   of subtasks
Workshop sessions to interpret and discuss the findings        4 days
   from the pilot, make adjustments to the benchmark tool
Data collection phase in all regions                           1-2 months
Workshop sessions to discuss oral reading fluency rates        1-2 days
   and comprehension scores from each region
Workshop sessions to develop proposed draft benchmarks         1-2 days
Align the Benchmark Tool to the External Criterion Measure
An initial consideration is to align the benchmarking tool to the external criterion
measure on which the Bangla reading competencies are based. During in-country
meetings, individuals commented that the NSA is linked to learning outcomes and does
not measure fluency per se. The READ project’s Monitoring and Evaluation team
received a hard copy of the 2013 NSA Bangla assessment. One of the members of the
team expressed that neither terminal competence tests (grade-wise) nor the NSA measure
fluency or comprehension with EGRA instruments or similar tools and that this
challenges the NSA’s ability to serve as an external criterion measure. The member
added that for this reason it may be a good idea to see national reading measurements in
the ASPR. Further meetings will therefore need to be held to decide if the NSA Bangla
assessment can be used as an external criterion measure or if another assessment can
better serve the purpose.
Select Regions
It is important to consider the geographic and regional spread to ensure a
nationally representative sample that encompasses the linguistic and cultural diversity of
the country (Piper, 2010). Once the regions are selected, a plausible approach is to
administer a large-scale pilot endline benchmark assessment and conduct smaller mixed
method studies in each region to further contextualize and interpret the findings. It is also
important to include in the sample places that were not covered in previous studies such
as Hill Tracts areas, where dialects are spoken, as well as Qawmi madrasas.
Develop the Benchmark Tool
Workshops to develop the tool and assess pilot findings should be held. The
involvement of many stakeholders is critical in the development of the benchmarking
tool. Stakeholders from universities, language institutes, the Directorate of Primary
Education (DPE), funding agencies, other international organizations, among others
should participate. Language experts from Dhaka University and the International Mother
Language Institute can help ensure that the linguistic complexity of the benchmarking tool is aligned with the high-frequency single letters, conjunct letters, and high-frequency words, as well as with the reading passages from the grade-level textbook and the end-of-year competencies.
During the workshops, stakeholders should decide what range of responses will be considered correct, that is, responses in which meaning is not compromised, in cases
where the children’s first language is a Bangla dialect. The grade level expectations of
the national curriculum and the NSA Bangla assessment tool can be used to design the
degree of difficulty of the test items. In order to minimize the time needed to administer
the test and to ensure it is not tedious for the children, the benchmark tool should be short
and simple with a multidimensional design that efficiently measures a number of
elements. The READ team should also look at how data from the implementation of
Instructional Adjustment Tools (IAT) inform this effort. Although the IAT does not yet measure oral reading fluency rate, it does measure other critical areas at each stage and can potentially inform the development of the benchmark tool.
Another point to consider is the children’s use of Bangla and whether standard
Bangla or a Bangla dialect is the medium of instruction. This has implications in the
selection of high frequency words. When the subtasks of the benchmarking tool are
created, they should be sensitive to the particular region where the assessment is
administered so that the dissonance between students’ phonological awareness and their
formal instruction in standard Bangla does not compromise the results. While children
acquire language from informal, social contexts, they also acquire language from formal,
academic contexts. Classroom discourse is a combination of formal and informal
language (Garcia, 2009). In terms of the selection of high frequency words, vocabulary
words, and words used in the reading passages, it makes sense to adhere to the variety
and range of words that appear in the grade level textbook. Reading passages may
contain the same text and the same number of words, yet maintain sensitivity towards the
specific language-learning context.
Train Assessors and Pilot the Benchmark Tool
Once assessors are trained, interrater reliability tests should be administered.
Participants from several selected schools in each region should be assessed. The
reliability and validity of the subtasks in the benchmark tool can then be tested.
Additional workshops should be held to discuss pilot findings and develop ways to
improve the benchmark tool. Among stakeholders present, local language experts and
others can tweak the subtasks as needed. Stakeholders involved in the development of the
benchmark tool should be involved in the piloting of the tool (Piper, 2010). Once the
benchmark tool is field tested in urban and rural areas, the level of challenge in the
reading passages and comprehension questions can be balanced out.
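Interrater agreement from these reliability tests can be summarized with a chance-corrected statistic such as Cohen’s kappa. The sketch below is a minimal, hypothetical illustration (invented word-level scores, where 1 means a word was marked correct), not part of any READ procedure:

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters scoring the same items."""
    assert len(rater_a) == len(rater_b)
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected chance agreement, from each rater's marginal distribution
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    categories = set(counts_a) | set(counts_b)
    expected = sum((counts_a[c] / n) * (counts_b[c] / n) for c in categories)
    return (observed - expected) / (1 - expected)

# Hypothetical scores from two assessors listening to the same child
a = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]
b = [1, 1, 0, 1, 1, 1, 1, 0, 1, 0]
print(round(cohens_kappa(a, b), 3))  # 0.524
```

Raw percent agreement here is 8/10, but kappa is lower because some agreement occurs by chance; a chance-corrected statistic gives a more honest picture of assessor consistency.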
Sampling and Data Collection
During in-country meetings, individuals emphasized the value of considering all the language learning contexts: standard Bangla, dialects spoken in rural areas, and Arabic- and English-medium schools. The inclusion of as many streams as possible from
each of the language communities will ensure that the sample is representative. The issue
of enrollment should be kept in mind since there are instances in which a student is
enrolled in separate classes or there are “ghost” students listed on the student register.
It is also important to keep in mind the dynamics involved in the selection of
government schools, nongovernment schools, urban schools, rural schools, and
socioeconomic status. Piper (2010) noted that care must be taken not to develop benchmarks solely from findings from wealthy schools, since this can be “problematic.” He adds that it is important to include findings that make the point that
children who attend poor schools yet receive good instruction may also achieve fluency
and comprehension.
Piper (2010) and his colleagues in Ethiopia developed basic oral reading fluency
benchmarks for Amharic, an alphasyllabic script. The EGRA tool was adapted and
representativeness was ensured. In a description of the procedure used, Piper states the
following:
Similar to other national assessments such as NLA, ours did not draw a
simple random sample of the population of students in each group of
interest, for cost and efficiency reasons. But to enable us to make
inferences about the performance of the entire population and not just
those sampled, we weighted our results. Our data needed to be weighted
because the sample design did not give each individual an equal chance of
selection. If we did a random sample of students in Ethiopia, we would
have to send the assessment teams to thousands of schools throughout the
country. Instead, we grouped students within schools, schools within
woredas, and woredas within regions, and corrected for this grouping
using weights. (The weights increase the power of the individuals who
were sampled, making them represent the estimated population within
each group.) (p. 15)
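The weighting Piper describes gives each sampled student a design weight proportional to the number of students in the population he or she represents. The sketch below illustrates the idea with purely hypothetical region sizes and WCPM scores; it is not the actual Ethiopia weighting scheme:

```python
# Toy illustration of design weights in a clustered sample: each sampled
# student represents population[region] / n_sampled students of that region.
# All numbers are hypothetical.
population = {"Region A": 50_000, "Region B": 20_000}  # enrolled students
sample = {
    "Region A": [45, 50, 55, 60],  # sampled WCPM scores
    "Region B": [20, 25, 30],
}

def weighted_mean_wcpm(population, sample):
    total_weighted = 0.0
    total_weight = 0.0
    for region, scores in sample.items():
        weight = population[region] / len(scores)  # one weight per student
        for score in scores:
            total_weighted += weight * score
            total_weight += weight
    return total_weighted / total_weight

print(weighted_mean_wcpm(population, sample))
```

Here the larger region is underrepresented in the sample, so the unweighted mean would understate its contribution; the weights restore each region’s population share before the mean is taken.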
The READ project team and stakeholders need to decide on a sampling
framework that best suits the particular purposes and priorities that arise out of workshop
meetings. The timing of the assessment should also be considered. The data collection
phase may last approximately one to two months.
Conduct Workshops to Interpret and Discuss the Findings
The READ Team, local language experts and other stakeholders should meet to
discuss the oral reading fluency rates and comprehension scores from each geographical
area. An end goal of the workshop sessions should be to develop draft benchmarks based
on findings representative of each regional sample. As raised during in-country meetings, whether or not to aggregate results across regions is a decision that will need to be made.
Once the endline assessment is completed and third grade students’ NSA results are released, it will become evident whether the benchmark tool was predictive of the NSA findings.
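The simplest check of that predictive relationship is a correlation between matched student scores on the two instruments. The sketch below uses invented numbers purely for illustration and relies only on the standard library:

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two paired lists of scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical matched scores: endline benchmark WCPM vs. NSA Bangla score
wcpm = [12, 25, 38, 44, 60, 72]
nsa = [30, 42, 55, 50, 70, 85]
print(round(pearson_r(wcpm, nsa), 2))
```

A high positive correlation would support using the benchmark as an early signal of NSA outcomes; in practice, a regression that controls for school and regional factors would be more informative than a raw correlation.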
Conduct Workshops to Develop Proposed Benchmarks
As mentioned earlier, the creation of fluency benchmarks is an iterative process.
During one of the in-country meetings, a stakeholder mentioned that 60 WCPM (Words
Correct Per Minute) would be a reasonable end of year oral reading fluency rate for
second grade Bangla readers. Piper (2010) and his colleagues likewise used 60 WCPM as a minimum benchmark for second grade Amharic readers, which provides a general point of comparison.
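The WCPM rate itself is straightforward to compute: words attempted minus errors, divided by the reading time in minutes. A minimal sketch with hypothetical numbers:

```python
def wcpm(words_attempted: int, errors: int, seconds: float) -> float:
    """Words correct per minute = (attempted - errors) / minutes elapsed."""
    return (words_attempted - errors) / (seconds / 60.0)

# Hypothetical: a second grader reads 55 words in 60 seconds with 7 errors
rate = wcpm(55, 7, 60)
print(rate)        # 48.0
print(rate >= 60)  # False: below the proposed 60 WCPM benchmark
```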
Piper (2010) implemented several statistical procedures to develop oral reading
fluency benchmarks in Amharic. For example, he used quantile regression approaches to
identify critical points for oral reading fluency scores. In order to demonstrate that
students in poor regions can achieve reasonable oral reading scores, he analyzed the
“average reading scores for schools in the lowest 25th percentile of wealth” (Piper, 2010,
p. 40). Piper also implemented multiple regression analyses to gauge a reasonable match
between fluency levels and expected comprehension levels. The author informs that once
Final Report: ORF Benchmark Procedures
Dr. Mónika Lauren Mattos
Save the Children Bangladesh
70
the data was collected, quantile regression methods were used to show critical points for
oral reading fluency scores. During the workshop sessions, the READ Team and other
stakeholders can reach an agreement to determine draft fluency and comprehension
benchmarks for Bangla. An important part of the discussion should include what
definition of reading comprehension rate will be used. For example, the rate may be defined as the number of questions answered correctly out of the number of questions attempted, or out of the total number of questions asked.
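The two candidate definitions differ only in the denominator, which matters most for slow readers who never reach the later questions. A small sketch with hypothetical counts:

```python
def comprehension_rates(correct: int, attempted: int, total: int):
    """Return the comprehension rate under both candidate definitions."""
    of_attempted = correct / attempted  # correct out of questions attempted
    of_total = correct / total          # correct out of all questions asked
    return of_attempted, of_total

# Hypothetical: 3 correct answers, 4 questions attempted, 5 questions asked
print(comprehension_rates(3, 4, 5))  # (0.75, 0.6)
```

The first definition does not penalize a child for questions tied to text never reached within the time limit; the second does. Whichever is chosen should be made explicit before benchmarks are set, since the two can rank the same children differently.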
The Way Forward: Advocacy, Mobilization, and Collaboration
While learning how other entities proceed to develop oral reading fluency
benchmarks serves as a frame of reference, advocacy work entails consensus and
ownership of the benchmark making process. The READ Team can begin to think of
alliances in the form of long-term local and international collaboration that can provide
the specialized and technical assistance that will be needed. In the process of selecting
stakeholders, it is important to keep in mind which entities will not seek payment because
there is interest in the endeavor and which ones will have to be paid via contracts.
Local partners from the Campaign for Popular Education (CAMPE), the Institute
for Education and Research (IER), the Directorate of Primary Education (DPE), the
National Curriculum and Textbook Board (NCTB), BRAC University, University of
Dhaka, Room to Read, among others can support the advocacy and mobilization process.
Language experts from local universities as well as the Bangla Academy, the Language
Institute, and the International Mother Language Institute should also collaborate on the
development of the benchmark tool. International partners may include USAID and the
Global Reading Network among others. Participation at the upcoming Comparative
International Education Society (CIES) conference can inform the READ Team about
what other USAID-funded agencies are doing around the topic of benchmarks.
Collaboration and discussion should also center on the implications of designing a
benchmark tool. It will likely raise accountability among districts, administrators,
teachers, and students. Strategic systemic support will need to be provided to move
students along a developmental continuum that leads to reading with understanding.
Conclusion
Setting language-specific oral reading fluency benchmarks is an important step
that needs to be taken if children are to read with fluency and comprehension. A
consensus on clear guidelines at the levels of the akshara, word, and sentence in the early
grades will help teachers identify where students actually are in the reading acquisition
process and where they need to be. In this way, teachers will be able to provide targeted
and differentiated instruction that will move students successfully through the stages of
reading acquisition. The creation of oral reading fluency benchmarks that align with
comprehension benchmarks can potentially improve the quality of reading instruction in
primary classrooms in areas such as decoding skills and explicit comprehension strategies
through ongoing teacher professional development workshops. It may spark the creation
and use of a wide range of grade level reading materials that supplement the use of
textbooks. It may also inform the reading curriculum in pre-service and in-service teacher education programs and encourage cooperation with parental and other community literacy initiatives. Local and international organizations will be in a better
position to collaborate with the Ministry of Education in order to develop and expand
feasible literacy interventions based on oral reading fluency and comprehension
benchmarks. Thus, collective resources and policy efforts towards the creation of oral
reading fluency benchmarks that align with comprehension benchmarks can anchor the
long term research, planning, and implementation required to see marked changes in the
quality of reading outcomes in Bangladesh.
References
Abadzi, H. (2011). Reading fluency measurements in EFA FTI partner countries:
outcomes and improvement prospects. Working Paper Series. Global Partnership
for Education: Washington D.C.
Abadzi, H. (2012). Developing cross-language metrics for reading fluency measurement:
Some issues and options. Working Paper Series on Learning No. 6. Global
Partnership for Education: Washington D.C.
Baker, D. L., Cummings, K.D., Good, R.H., & Smolkowski, K. (2007). Indicadores
dinámicos del éxito en la lectura (IDEL): Summary of decision rules for intensive,
strategic, and benchmark instructional recommendations in kindergarten through
third grade. Technical Report No. 1. Dynamic Measurement Group: Eugene, OR.
Basher, M.S., Jukes, M., Cooper, P. & Rigole, A. (2014). Bangla reading fluency in early
grades: A comparative study between Room to Read supported government
primary schools and other primary schools of Bangladesh. Room to Read:
Bangladesh.
Center on Teaching and Learning, University of Oregon DIBELS Data System. (2012).
2012-2013 DIBELS Data System Update Part II: DIBELS Next Benchmark
Goals. Oregon: USA.
https://dibels.uoregon.edu/docs/techreports/DDS2012TechnicalBriefPart2.pdf
(retrieved December 26, 2014).
Chabbott, C. (2008). Developing a practical assessment of early language learning in
Bangladesh. BRAC University: Bangladesh. (retrieved January 13, 2015).
Calkins, L. (2001). The art of teaching reading. New York: Longman.
Clay, M. M. (1991). Becoming literate: The construction of inner control. Portsmouth,
NH: Heinemann.
Davidson, M. (2013). Books that children can read: decodable books and book leveling.
USAID: Washington D.C.
Dewey, E.N., Powell-Smith, K.A., Good, R.H., Kaminski, R.A. (2014). Technical
adequacy supplement for DIBELS next oral reading fluency. Dynamic
Measurement Group: Eugene, OR.
Dowd, A.J., Friedlander E. (2009). Bangladesh program: emergent and early grades
reading assessment validation results. Save the Children: Washington D.C.
Fountas, I.C. & Pinnell, G.S. (2001). Guiding readers and writers: grades 3-6 teaching
comprehension, genre, and content literacy. Heinemann: New Hampshire.
Fountas, I.C. & Pinnell, G.S. (2006a). Teaching for comprehending and fluency:
Thinking, talking, and writing about reading, K-8. Heinemann: New Hampshire.
Fountas, I.C. & Pinnell, G.S. (2006b). Leveled books: matching texts to readers for
effective teaching, K-8. Heinemann: New Hampshire.
Fountas I.C.& Pinnell, G.S. (2009). When readers struggle: teaching that works.
Heinemann: New Hampshire.
Frost, R. (2012). Towards a universal model of reading. Behavioral and Brain Sciences,
35, (5), 263-279.
Garcia, O. (2009). Bilingual education in the 21st century: A global perspective. UK:
Wiley-Blackwell.
González, N. (2009). Beyond culture: the hybridity of funds of knowledge. In N.
González, L.C. Moll, C. Amanti (Eds.). Funds of knowledge: theorizing practices
in households, communities, and classrooms. New York: Routledge.
Guajardo, J., Hossain, M., Nath, B.K.D., & Dowd, A.J. (2013). Literacy Boost
Bangladesh endline report. Save the Children: Bangladesh.
Hasbrouck, J. & Tindal, G. (2005). Oral reading fluency: 90 Years of Measurement.
Technical Report Number 33. Behavioral Research & Teaching: University of
Oregon.
Hasbrouck, J. & Tindal, G. (2006). Oral reading fluency norms: A valuable assessment
tool for reading teachers. The Reading Teacher, 59 (7), 636-644.
Jukes, M., Vagh, S., & Kim, Y. (2006). Development of assessments of reading ability
and classroom behavior: a report prepared for the World Bank. Harvard Graduate
School of Education. Cambridge: MA.
Moore, P. & Lyon, A. (2005). New essentials for teaching reading in pre-k-2. Scholastic:
New York.
Nag, S. (2007). Early reading in Kannada: the pace of acquisition of orthographic
knowledge and phonemic awareness. Journal of Research in Reading, 30(1), 7-22.
Nag, S. & Sircar, S. (2008). Learning to read in Bengali: A report of a survey in five
Kolkata primary schools. The Promise Foundation: Bangalore, India.
Nag, S. (2011). The akshara languages: what do they tell us about children’s literacy
learning?, in R. Mishra and N. Srininivasan, (Eds.), Language-Cognition: state of
the art (pp. 272-290). Lincom Publishers, Germany.
Nag, S. & Snowling, M. (2011a). Reading difficulties in Kannada, an Indian
alphasyllabary. The Promise Foundation: Bangalore, India.
http://www.thepromisefoundation.org/TPFRdK.pdf. (retrieved January 7, 2014).
Nag, S. & Snowling, M. (2011b). Reading comprehension, decoding skills, and oral
language. The EFL Journal, 2 (2), 85-105.
Nag, S. & Snowling, M. (2012). Reading in an alphasyllabary: implications for a
language-universal theory of learning to read. Scientific Studies of Reading, 16
(5), 404-423.
Nag, S., Snowling, M., Quinlan, P., & Hulme, C. (2014a). Child and symbol factors in
learning to read a visually complex writing system. Scientific Studies of Reading,
18 (5), 309-324.
Nag, S., Chiat, S., Torgerson, C., & Snowling, M. (2014b). Literacy, foundation learning
and assessment in developing countries: final report. Department for International
Development: UK.
Nakamura, P. (2014). Facilitating reading acquisition in multilingual environments in
India (FRAME-India): final report. American Institutes for Research: Washington
D.C.
Nation, K. & Snowling, M.J. (2004). Beyond phonological skills: broader language skills
contribute to the development of reading. Journal of Research in Reading, 27 (4),
342-356.
National Institute of Child Health and Human Development. (2000). Report of the
national reading panel: teaching children to read: an evidenced-based
assessment of the scientific research literature on reading and its implications for
reading instruction: reports of the subgroups. Washington, D.C.: U.S.
Department of Health and Human Services. NIH Publication Number 00-4754.
NICHD Early Childcare Research Network. (2005). Pathways to reading: The role of oral
language in the transition to reading. Developmental Psychology, 41 (2), 428-442.
Perfetti, C.A. (2003). The universal grammar of reading. Scientific Studies of Reading, 7
(1), 3-24. Lawrence Erlbaum Associates.
Piper, B. (2010). Ethiopia early grade reading assessment data analytic report: language
and early learning. RTI International: Ethiopia.
Powell-Smith, K.A., Good, R.H., Latimer, R.J., Dewey, E.N., Wallin, J., & Kaminski,
R.A. (2012). DIBELS next: findings from the benchmark goals study. Technical
Report Number 11. Dynamic Measurement Group: Eugene, OR.
Rasinski, T.V. (2010). The fluent reader: oral and silent reading strategies for building
fluency, word recognition, and comprehension. Scholastic: New York.
Sayed, M.A., Guajardo, J., Hossain, M.A., & Gertsch, L. (2014). READ baseline survey
report. Save the Children: Bangladesh.
Sircar, S. & Nag, S. (2014). Akshara-syllable mappings in Bengali: A language specific skill for reading. In H. Winskel & P. Padakannaya (Eds.), South and Southeast Asian Psycholinguistics (pp. 202-211). Cambridge University Press.
Tiwari, S. (2011). Literacy development in the alphasyllabaries: implications for clinical
practice. Rapporteur’s Theme Summary on Language Literacy and Cognitive
Development (LLCd) Symposium, 16th-17th December: Bangalore, India.
Vagh, S.B. (2009). Evaluating the reliability and validity of the ASER testing tools.
ASER Centre: New Delhi, India.
Vagh, S.B. (2010). Validating the ASER testing tools: comparisons with reading fluency
measures and the Read India measures. ASER Centre: New Delhi, India.
Appendix A: Sample Decodable Reader in Bangla