TERA-3 Review: Early Reading Ability Test Analysis

Test of Early Reading Ability, Third Edition.

Review of the Test of Early Reading Ability, Third Edition by SHARON deFUR, Associate Professor of Special Education, College of William and Mary, Williamsburg, VA:

DESCRIPTION. The Test of Early Reading Ability, Third Edition (TERA-3) is a norm-referenced, individually administered test that assesses the mastery of emergent literacy skills in young children ages 3 years 6 months to 8 years 6 months. There are five identified purposes of the TERA-3: (a) to identify children who are below peers in reading development; (b) to identify strengths and weaknesses of individual children; (c) to document progress as a result of early reading intervention; (d) to serve as a measure in reading research; and (e) to serve as one component of a comprehensive assessment. To their credit, the authors clearly state that the TERA-3 is not to be used as a sole basis for instructional planning. The TERA-3 has two alternate forms, and the test kit includes an examiner's manual, Form A and Form B picture books, and examiner record booklets for Forms A and B. Three subtests comprise the TERA-3: (a) Alphabet (mastering the alphabet and its functions); (b) Conventions (understanding the arbitrary conventions of reading and writing in English); and (c) Meaning (understanding that print conveys thought and meaning). The full test can typically be administered in 30 minutes or less, but the items are not timed. Nonclinical staff can administer the TERA-3, but the authors strongly recommend that the examiner have formal training in assessment with a basic understanding of testing statistics and of general procedures regarding test administration, scoring, and interpretation. Regardless of training, the authors recommend careful study of the examiner's manual and a minimum of three practice opportunities before using the TERA-3 in an actual testing situation.
The TERA-3 results are reported in raw scores, age and grade equivalents, percentile scores, standard scores, and confidence scores for each of the three subtests. A Reading Quotient is calculated by summing scores from the three subtests and using a table to convert the sum to a Reading Quotient. The Reading Quotient is also reported as a percentile. Standard error of measurement, confidence interval, and standard score ranges are calculated for the combined subtests and for the Reading Quotient. Answers receive a score of 1 for correct or 0 for incorrect, and expected answers are clearly indicated in the examiner's record booklet. Chronological age determines the start point for testing, but a basal is established when three items are correct in a row, and a ceiling is established when three items are failed in a row. The examiner record booklet includes a profile sheet that offers a graphic comparison across the three TERA-3 subtests as well as a graph to compare the TERA-3 Reading Quotient with other comparable measures that might have been administered to the child. In addition, the examiner record booklet has space for interpretation and comments, where, in addition to diagnostic implications, the examiner can note the conditions of testing and the degree of validity obtained given the testing conditions. The authors take care to help the user interpret the scores obtained on the TERA-3. Throughout the manual, the authors emphasize that 'tests do not diagnose, people do,' urging the user to consider why a child responded in a certain way and not just that they did respond in a particular way.

DEVELOPMENT. The TERA-3 represents a revision of the TERA-2, with the authors taking the following actions based on the recommendations of reviewers of the TERA-2.
The test authors collected new normative data, addressing the need for appropriate demographic representation; conducted extensive reliability and validity studies; added items as recommended; made the test pictures in color; and re-introduced grade and age equivalents (with reluctance) because they are required by many state and local agencies. No discussion is directed at describing the specific field tests for the TERA-3; however, extensive discussion is provided in the description of the technical adequacy of the TERA-3. The authors provide a readable review of reading literature that documents the importance of emergent literacy skills in alphabet, conventions, and meaning along with the importance of assessing reading in young children. The authors substantiate the appropriateness of addressing these three areas simultaneously, rather than sequentially, as this progression mirrors the reading development process. The theoretical framework underlying the TERA-3 is well supported in current reading research (National Reading Panel, 2000).

TECHNICAL. The TERA-3 used a relatively small norming sample (N = 875), but one that was well matched to the general school-age population (gender, race, ethnicity, SES, disability, and urban/rural) and representative of regions across the United States. All data were collected between February 1999 and April 2000. In response to criticisms of the TERA-2, the test authors carefully examined the possibility of bias for gender, race, ethnicity, and culture and found none evident, or made accommodations as needed. The test developers assessed content sample reliability using two measures. The first measure was the use of coefficient alpha across age intervals for each subtest and for the Reading Quotient. They found all subtests had acceptable levels of alpha with all values exceeding .80. The Reading Quotient had alphas of .91 or higher across all ages.
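For readers less familiar with the statistics named above, the following is a minimal sketch of how coefficient alpha is conventionally computed from 0/1 item scores, and of the classical standard error of measurement that a given reliability implies. It is not the test authors' procedure: the function names and the toy response matrix are hypothetical, the formulas are the standard classical test theory ones, and the standard deviation of 15 is the Reading Quotient metric described in the second review below. The manual's own tables may be derived differently.

```python
import math
from statistics import pvariance
from typing import Sequence


def coefficient_alpha(item_scores: Sequence[Sequence[int]]) -> float:
    """Coefficient alpha for a children-by-items matrix of 0/1 scores."""
    n_items = len(item_scores[0])
    # Variance of each item (column) across children.
    item_variances = [
        pvariance([row[j] for row in item_scores]) for j in range(n_items)
    ]
    # Variance of each child's total score across children.
    total_variance = pvariance([sum(row) for row in item_scores])
    return (n_items / (n_items - 1)) * (1 - sum(item_variances) / total_variance)


def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """Classical SEM: SD * sqrt(1 - reliability)."""
    return sd * math.sqrt(1.0 - reliability)


# Hypothetical responses: 5 children (rows) by 4 items (columns).
toy_scores = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]
alpha = coefficient_alpha(toy_scores)

# SEM on a scale with SD 15, using the lowest Reading Quotient alpha
# reported in the review (.91), and a 95% band around an observed 100.
sem = standard_error_of_measurement(15, 0.91)
band = (100 - 1.96 * sem, 100 + 1.96 * sem)
print(f"alpha = {alpha:.2f}, SEM = {sem:.2f}, 95% band = {band[0]:.1f} to {band[1]:.1f}")
```

Under these assumptions, an alpha of .91 on a scale with a standard deviation of 15 corresponds to a standard error of measurement of about 4.5 points, which gives a sense of how much an individual Reading Quotient could vary by measurement error alone.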
They also evaluated the internal consistency reliability for a variety of examinee subgroups, found alphas for all subgroups to be high as well (above .91), and concluded that the TERA-3 has no clear bias for any subgroup. In addition, the alternate forms correlations exceeded .80. Test-retest reliability (an interval of 2 weeks) resulted in correlation coefficients near .88, with most comparisons near .92. The test developers assessed interscorer reliability using one TERA author and two advanced graduate students who observed a total of 40 protocols; the interscorer reliability was at .99, clearly supporting the consistency that could be expected between test examiners who had well-developed skills in administering the TERA-3. The test developers describe a convincing and systematic process used to determine content validity for the TERA-3. The process included reviewing research, comparing lists of emerging reading behaviors, submitting items for expert examination, and conducting a conventional item analysis and a differential item functioning analysis. Each of these measures strongly supported the assertion that items on the TERA-3 represent the behaviors consistent with those expected for emerging readers, and do so without bias. The test developers estimated the concurrent validity of the TERA-3 scores by comparing them to scores on other norm-referenced measures, including the TERA-2, the Stanford Achievement Test-9, and the Woodcock Reading Mastery Test-Revised (NU). Not surprisingly, the concurrent validity coefficients with the TERA-2 were extremely high. The TERA-3 compared well to the SAT-9 reading comprehension and the WRMT-R(NU) reading quotient. Moderate predictive validity was found for other subtests of the SAT-9 and the WRMT-R(NU).
The test developers also determined that the TERA-3 differentiated appropriately for chronological age and for children who were experiencing reading, language, or learning disabilities where lower scores on the TERA-3 would be expected. In conclusion, the authors provide convincing evidence that the TERA-3 is a psychometrically sound measure of early reading ability.

COMMENTARY. The TERA-3 represents the culmination of 20 years of revision on a Test of Early Reading Ability where the test developers have carefully attended to the criticisms of earlier versions. To their credit, the technical development and analysis of the TERA-3 instills confidence that the test scores can be considered highly reliable and valid and that the authors have taken great care to address any inherent bias due to race, gender, ethnicity, SES, or disability. In spite of the TERA-3's technical adequacy, commendably, the authors urge the user not to rely solely on this test for curricular or other diagnostic decisions and point the user to other sources of data that can substantiate or refute the findings of the TERA-3 for any one child. The examiner's manual is well written and readable, which can serve to educate the user who studies it attentively. The TERA-3 is easy to use and score, but I support the test developers' recommendation that the user have training in assessment and interpretation and that the prospective user engage in systematic practice prior to using the TERA-3 for diagnostic purposes. The authors indicate that they reluctantly reinstated the calculation of age and grade equivalent scores and I share their reluctance. In spite of all of the psycho-educational specialists' warnings that age and grade equivalents are meaningless scores regarding instruction or diagnostic comparisons across instruments, educators tend to gravitate toward these scores because they seem the most understandable.
Yet, an age equivalent of 4 years 4 months on the TERA-3 Alphabet subtest or a grade equivalent of 1.3 on the Meaning subtest has no instructional implications, nor does it accurately measure progress. Children's grade and age equivalents could rise over time while their standard scores and percentiles decline, if their rate of improvement does not keep pace with the passage of chronological time. I would urge the authors to make this precaution more prominent than is found in their current manual.

SUMMARY. The TERA-3 represents a reliable and valid measure of early reading ability and reflects those skills that have been identified in reading research as critical to the development of reading. It provides data that suggest strengths or weaknesses in understanding the alphabet and its functions, understanding the conventions of print, and in deriving meaning from print. The TERA-3 offers a quick tool, with easy one-to-one nonclinical administration, to supplement other formal and informal assessments of developmental reading and can screen for specific areas of strength and weakness in individual children. Although the TERA-3 is not a restricted assessment tool, interpretation can be best done by examiners who have training in assessment and an understanding of developmental reading skills. Given that alphabetic knowledge and phonological awareness have a high rate of prediction for future reading skills (National Reading Panel, 2000), the TERA-3 provides some data that could assist educators in identifying those students who would benefit from early intervention in their reading instruction.

REVIEWER'S REFERENCE National Reading Panel. (2000, April). Report of the National Reading Panel: Teaching children to read: An evidence-based assessment of the scientific research literature on reading and its implications for reading instruction: Reports of the subgroups. Washington, DC: National Institute of Child Health and Human Development, National Institutes of Health.

Review of the Test of Early Reading Ability, Third Edition by LISA F. SMITH, Associate Professor, Psychology Department, Kean University, Union, NJ:

DESCRIPTION. The Test of Early Reading Ability, Third Edition (TERA-3) is an individually administered assessment of emerging reading skills for children between the ages 3 years, 6 months and 8 years, 6 months. The authors define five purposes for the TERA-3: (a) to identify those children who are significantly below their peers in reading development and thus may be candidates for early intervention, (b) to identify strengths and weaknesses of individual children, (c) to document children's progress as a consequence of early reading intervention programs, (d) to serve as a measure in research studying reading development in young children, and (e) to accompany other assessment techniques (examiner's manual, p. 8). The TERA-3 is attractively packaged. It contains three sturdy and colorful spiral bound booklets, one each for the examiner's manual, Test Form A, and Test Form B, and a profile/examiner record booklet for each form of the test. Each form of the TERA-3 is made up of three subtests. Subtest I, Alphabet, has 29 items designed to assess skills such as knowledge of the alphabet, counting syllables, and initial and final letter sounds. Subtest II, Conventions, has 21 items designed to assess principles such as page orientation, punctuation, spelling, and capitalization. Subtest III, Meaning, has 30 items designed to assess skills such as reading comprehension, sentence construction, and paraphrasing. The administration is not timed but takes an average of 30 minutes. Age-appropriate entry points are given; ceiling and basal points are clear. The authors state that children under the age of 5 may require a break after each 10 minutes of testing.
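The basal and ceiling rule that both reviews refer to (a basal at three consecutive correct items, a ceiling at three consecutive failed items, as the first review states it) can be illustrated with a short sketch. The function name and the response sequence are hypothetical, and the sketch only locates the two points in a sequence of 0/1 item scores; how the manual credits items below the basal, or handles a child who does not reach a basal at the entry point, is not described in these reviews and is not modeled here.

```python
from typing import Optional, Sequence


def first_run_end(scores: Sequence[int], value: int, run_length: int = 3) -> Optional[int]:
    """Return the 0-based position of the item that completes the first run of
    `run_length` consecutive scores equal to `value`, or None if no run exists."""
    streak = 0
    for position, score in enumerate(scores):
        streak = streak + 1 if score == value else 0
        if streak == run_length:
            return position
    return None


# Hypothetical 0/1 scores, in order of administration from the entry point.
scores = [1, 1, 1, 1, 0, 1, 0, 0, 0, 1]

basal_at = first_run_end(scores, value=1)    # third consecutive correct item
ceiling_at = first_run_end(scores, value=0)  # third consecutive failed item

# Testing would stop at the ceiling, so only items up to it are counted here.
administered = scores if ceiling_at is None else scores[: ceiling_at + 1]
correct_administered = sum(administered)
print(basal_at, ceiling_at, correct_administered)
```

In this made-up sequence the basal is reached at the third item and the ceiling at the ninth, so the tenth response would never have been collected in practice.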
Scores, demographics, and other data are entered on the appropriate form's profile/examiner record booklet page. In scoring the items, correct responses are scored as 1, incorrect as 0. Although the authors claim that the scoring is straightforward, there is some potential for ambiguity or subjectivity for some items. Each subtest has a mean of 10 and a standard deviation of 3. The three subtest scores are combined to form a composite Reading Quotient score with a mean of 100 and a standard deviation of 15. The name given to this score and the fact that the metric is identical to an IQ can lead to serious misinterpretation of the nature of what is being measured here. The authors appear to be making the argument that reading ability is somehow analogous to intelligence, a contention with a host of philosophical and empirical problems. Percentiles, age equivalents, and grade equivalents are also given, the last two with cautionary remarks.

DEVELOPMENT. The TERA-3 has been under development since 1981. The examiner's manual gives a detailed historical overview and a listing of current improvements. It offers several definitions of reading and describes components of early reading to establish the rationale for the TERA-3 subtests. Individual items were developed from consultation of research results, the literature base, other tests, and curriculum materials. In forming the subtests, the authors 'asked seven professionals with expertise in reading to review our item placement and suggest any changes they deemed necessary' (examiner's manual, p. 62). Although the professionals are listed by name, no credentials are given. The authors argue that the TERA-3 is based on a philosophy of emergent reading. They cite Valencia (1997) as providing a guiding structure for the development of the measure, with the notable exception that the TERA-3 does not include any measure of phonemic awareness.
There are some questions that arise with respect to item development and placement. These relate to the numbers of item types present within each subtest across forms and the ordering of items. However, these issues are relatively minor and probably do not affect the subscores generated.

TECHNICAL.

Standardization. The norming sample included 875 children aged 3-6 to 8-6 from 22 states. This sample appears to be representative of nationwide statistics as reported in the 1999 U.S. Census, with regard to geographic region, gender, race, urban/rural residence, ethnicity, family income, educational level of parents, and disability status. The sample was also stratified by age. Participants took both forms of the TERA-3 during one testing session; no counterbalancing procedures are described.

Reliability. Evidence of reliability is presented for content sampling, time sampling, and interrater reliability. For content sampling, acceptable coefficient alpha data are given by age, form, subtest, and reading quotient score. Acceptable coefficient alpha data are also given by selected subgroups: gender, ethnicity, learning disabled, language impaired, and reading disabled. However, it would have been helpful to see the data for the selected subgroups by age, as well. Correlations for the alternate forms (immediate administration) are also acceptable. Overall, for content sampling, Subtest II, Conventions, demonstrates lower reliability (.83) on both forms as compared to the other subtests (roughly .90) and the Reading Composite (.95). Test-retest reliability at a 2-week interval was investigated using n = 30 children aged 4-6 years from Michigan and n = 34 children aged 7-9 years from Texas. Though the correlations shown are acceptable, it would be difficult to generalize about stability over time given the sample characteristics. Interrater reliability on 40 randomly drawn protocols was .99.

Validity. Evidence of content validity is provided by lists of research, curriculum materials, and other tests consulted; favorable evaluations by the seven professionals (mentioned previously); and a parallel comparison of the item content on the TERA-3 to Valencia's (1997) categories of early reading behaviors. The authors state that they selected items for the TERA-3 based on the item-total score Pearson correlation rather than the point-biserial correlation, although these are mathematically identical. A number of differential item functioning (DIF) bias studies were conducted. Although 13 items across the two forms exhibited some DIF, the amount observed was negligible. There is some concern that 7 of the 13 items demonstrating DIF were on Subtest I, Alphabet, Form B. Criterion-prediction studies were conducted using the TERA-2, the Stanford Achievement Test Series-Ninth Edition (n = 70), the Woodcock Reading Mastery Test-Revised-Normative Update (n = 64), and teacher ratings (n = 411). The correlations tend to be moderate to high. Evidence of construct validity was determined by correlating performance on the TERA-3 to age. Favorable correlations here are hardly surprising given the developmental nature of reading and the additional instruction received as age increases. Similarly, group differentiations comparing disability subgroups to nonclassified subgroups are what would be expected. Details of a confirmatory factor analysis need clarification.

COMMENTARY/SUMMARY. Generally, the TERA-3 accomplishes its stated purposes, especially if used in conjunction with other assessments. Its strengths lie in the ease of administration, easy-to-use tables for scoring, and a clearly written examiner's manual. However, claims by the authors that the TERA-3 is 'a valid measure of reading' (examiner's manual, p. 76) should be viewed with caution. Tests themselves are not valid.
The TERA-3 will be used with diverse types of children in a variety of settings for an assortment of reasons. As such, the validity of the TERA-3 will depend on the specific use of the test in a given situation. The authors should be commended for offering an assessment based on modern reading theory that incorporates examples from everyday life that should appeal to children.

REVIEWER'S REFERENCE Valencia, S. W. (1997). Authentic classroom assessment of early reading: Alternatives to standardized tests. Preventing School Failure, 41(2), 63-70.
