Test Review: Comprehensive Assessment of Spoken Language (CASL)
Hayward, Stewart, Phillips, Norris, & Lovell
Name of Test: Comprehensive Assessment of Spoken Language (CASL)
Author(s): Elizabeth Carrow-Woolfolk
Publisher/Year: AGS 1999
Forms: two, Form 1 ages 3-6 and Form 2 ages 7-21
Age Range: 3 to 21 years
Norming Sample
Based on her theory of language, Carrow-Woolfolk explains in the acknowledgments that she began developing this test by dividing
“language” into component parts of structure and processing, and then further into item types that she felt would elicit
language behaviours reflecting competence and use. With draft test items, Carrow-Woolfolk conducted an initial tryout in one school
in Houston, TX. She states that this tryout offered information and insight into how useful (or not) the items were. She then
expanded the pool to two school districts in a second tryout. The third tryout aimed to identify “special problem areas”.
Formal test development was initiated in 1992, when three pilot studies were conducted (1992-94). The pilot studies addressed item
selection and discrimination. These studies demonstrated that the Rhyming Words test did not show adequate increases in skill with
age. Additionally, two tests, Second-Order Comprehension and Non-Literal Language, were combined into one test, Non-Literal Language.
The process continued with three national tryouts (1995-1996) aimed at scoring criteria, administration time, item selection, and bias.
As a result, changes were made that included deletion of items that did not “meet the criteria for difficulty, discrimination, internal
consistency, model fit, or item bias” (Carrow-Woolfolk, 1999, p. 106).
Total Number: 2,750
Number and Age: 1,700 of the 2,750 students were chosen by random sampling to create a representative sample for the final analysis (p. 108)
Location: 166 sites by geographic region
Demographics: guided by a sampling plan stratified by gender, geographic region, SES, and race/ethnicity, based on the 1994 U.S. Census
Geographic: four major regions defined by US Census Bureau: Northeast, North Central, South, and West
Rural/Urban: not described in this way
SES: by mother’s education
Other: Though not designated as a sample criterion, special education categories were represented in the standardization sample “in
approximately the same proportions that occur in the U.S. school population” (Carrow-Woolfolk, 1999, p. 111). Categories were:
specific learning disabilities, speech or language impairments, mental retardation, emotional disturbance, and other impairments.
Results indicated that totals were within 0.9 percentage points of the targets. As outlined in Table 7.15, percentages were compared to
U.S. population data from the U.S. Department of Education.
The standardization process was extensive, clearly reflecting a commitment on the part of AGS to Carrow’s work. Tables in the
manual show representation by gender/age, geographic region/age, mother’s education level/age, race/age, race/mother’s education,
and race/region. The resulting standardization sample was “within 1 percentage point of the total target U.S. population percentage for
each region” (p. 108).
Coding and scoring procedures for the open-ended items were examined. Open-ended responses were analyzed when they appeared
to be idiosyncratic (p. 112).
Summary Prepared By (Name and Date): Eleanor Stewart, 9 July 2007; additions on 16 July
Test Description/Overview
The test kit consists of three easel test books, a norms book, an examiner’s manual, two record forms (Form 1 for ages 3 to 6 and
Form 2 for ages 7-21), and a nylon carrying tote. The drawings are in black and white.
The CASL is an oral language test that consists of 15 individual tests covering lexical/semantic, syntactic, supralinguistic, and
pragmatic aspects of language. Carrow-Woolfolk states that these tests examine oral language skills “that children and adolescents
need to become literate as well as to succeed in school and in the work environment” (p. 1).
The test manual is well organized and written in an easy-to-read style that makes Carrow-Woolfolk’s theory and rationale easy to
follow. A chapter is dedicated to Carrow’s Integrative Language Theory. Carrow-Woolfolk explains her theory of language
development, which focuses on the four components around which the CASL is framed: lexical/semantic, syntactic, supralinguistic, and
pragmatic. Carrow further details each by classifying categories as follows: lexical semantics, syntactic semantics, pragmatic
semantics, and supralinguistic semantics. She provides a rationale for the division into categories based on research (p. 11). Carrow
argues that assessment should target knowledge, processing (i.e., listening comprehension and oral expression), and performance. In a
section addressing language disorder, Carrow explains that a good theory must also account for a breakdown in language. She points
to the difficulties in defining the borders of language disorder, as well as the difficulties of attending to language difference (using an
interesting example from French Canada, p. 17).
Comment: I think that what Carrow is offering is more of a model than a theory. She is pulling together much of what we know about
language development and theorizing about a way to bring it all together in a working model with individual skill sets that can be
assessed, though she argues for integration. Carrow’s use of the terms “content” and “form” alerts us to the origins of her
theoretical perspective on language. In this, she is referring to models such as Bloom and Lahey’s, and to Chomsky’s theoretical work
(the latter of which she cites).
Purpose of Test:
Carrow-Woolfolk states that “The CASL provides an in-depth evaluation of:
1) the oral language processing systems of auditory comprehension, oral expression, and word retrieval,
2) the knowledge and use of words and grammatical structures of language,
3) the ability to use language for special tasks requiring higher level cognitive functions,
4) the knowledge and use of language in communicative contexts” (p. 1).
Further, the author states that the CASL can be used to identify language disorders, to guide diagnosis and intervention in spoken language, to monitor
growth, and to conduct research. For the last two purposes, the author notes that the wide age range on which the test was standardized allows
for tracking children’s performance longitudinally. Of particular note to TELL, Carrow states, “Researchers can use the CASL to
explore the relationship between oral language processing skills and reading abilities” (p. 7).
Areas Tested: The manual provides a table (Table 1.2 on page 2) of the four language areas and 15 tests.
The four components and corresponding subtests are:
1. lexical/semantic: comprehension of basic concepts, antonyms, synonyms, sentence completion, and idiomatic language
2. syntactic: syntax construction, paragraph comprehension of syntax, grammatical morphemes, sentence comprehension of
syntax, grammaticality judgment
These first two components comprise the first level of analysis, followed by:
3. supralinguistic: nonliteral language (figurative speech, indirect requests, sarcasm), meaning from context, inference, and
ambiguous sentences
4. pragmatic: pragmatic judgment
Comments from the Buros reviewer: Meaning from Context is not clear, as “it is claimed that all information needed to solve these
problems is provided in the sentences, which is difficult to evaluate” (Snyder & Stutman, 2003, p. 217). Also, regarding Pragmatic
Judgment, “it is difficult to evaluate these assertions. There is no discussion of what factors might lead an examinee to do well on the
Pragmatic component, yet still exhibit difficulties with {communicative intent} outside of the testing situation” (p. 218). “This
reviewer is not convinced that the Basic Concepts, Paragraph Comprehension, Synonyms, and Sentence Comprehension rely on
receptive aspects more than other tests in the battery” (p. 218).
Areas Tested:
 Oral Language
Vocabulary
 Listening
Lexical
Syntactic
Grammar
Narratives
Supralinguistic
Other (Please Specify)
Who can Administer: Those with background in language, education, or psychological testing who have graduate level training in
testing and assessment may administer this test. The range of potential examiners includes: neuropsychologists, school psychologists,
educational diagnosticians, speech-language pathologists, learning disability specialists, reading specialists, and resource room
teachers.
Administration Time:
Testing time will depend on which Core Battery is administered and the number of supplementary tests undertaken. Table 1.6 (p. 8)
provides administration time estimates for each test based on the standardization sample. Based on these estimates,
administration time could be as short as 17 minutes for three Core tests at age 3 years to 3 years, 5 months, or as long as 59 minutes
for five Core tests at 8 years to 8 years, 11 months.
Comment: A useful table summarizing all the tests with ages, basal and ceiling rules, processing skill, structure category,
description, and examples is found on the AGS website.
Test Administration (General and Subtests):
Test administration is straightforward, although the examiner needs to be very familiar with all the test materials and the record form.
The instructions, scoring, and correct/incorrect responses are found both in the easels and on the record forms. The basal and ceiling rules
are stated at the beginning of each test. Examinee responses include pointing, single-word, and open-ended verbal responses. Not all tests
are administered to all age groups. The intent is to administer those tests that constitute the “Core” for a given age range; the same test
may be Supplementary for a different age group. For example, at age 5 years to 6 years, 11 months, core tests include antonyms,
syntax construction, paragraph comprehension, and pragmatic judgment, but basic concepts and sentence completion are
supplementary at this age range.
Comment: I think that a clinician needs to be very familiar with this test prior to administration, as the print layout on the record
form is busy and the instructions are detailed. Clinicians need to be careful to have the appropriate booklet and to observe the somewhat
varying ceiling rules.
Comment: The manual states, “The basal and ceiling rules were constructed with the goal of maintaining high reliability while
minimizing testing time” (p. 119). This statement indicates to me that the author acknowledges how difficult it is to administer a test
that takes a long time to reach ceiling.
Comment: The Buros reviewer points to scoring difficulties that may be related to the elevated level of linguistic complexity required
for “responses approach[ing] naturalistic communication that creates increasing scoring difficulty. This, however, may be
unavoidable in an instrument that seeks to assess the full range of linguistic functioning.” Also, the reviewer notes scoring difficulties
with the core vs. supplementary structure of the test, in which it is sometimes onerous to look up standard scores to determine
whether to continue with supplementary testing (Snyder & Stutman, 2003, p. 220). Finally, the reviewer states that “A computer
program for scoring that could be used in tandem with the testing would also ease this difficulty” (p. 220).
Test Interpretation:
Chapter 4, “Description of the CASL Tests”, provides detailed information about each test in terms of format, what it assesses,
background about the underlying skill (including research evidence), and interpretation. Interpretation hinges on each component and its
tests, which, taken together, address knowledge, processes, and performance.
Chapter 6, “Determination and Interpretation of Normative Score,” describes the standardized scores and the steps taken to complete
comparisons using standard scores, and provides two examples of students’ completed CASL results.
Standardization:
Age equivalent scores
Grade equivalent scores
Percentiles
Standard scores
Stanines
Other (Please Specify) Normal curve equivalent, test-age equivalent (age at which raw score is average; not an age equivalent,
see p. 90). Core Composites, Category Indexes, and Processing Index Standard Scores.
Comment: Each of these obtained scores is defined and explained in the manual. I find this useful both as a review and as a way of
communicating to others what the scores mean and how to use them.
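Comment: To make the relationships among these score types concrete, the sketch below converts a standard score into a percentile, normal curve equivalent (NCE), and stanine. It assumes the conventional mean-100, SD-15 standard score metric and a normal distribution; for actual CASL results, the norm tables in the manual, not a formula, are authoritative.

```python
import math

def describe_standard_score(ss, mean=100.0, sd=15.0):
    """Relate a standard score to other normal-curve scores.

    Assumes a mean-100, SD-15 metric and normality; real CASL
    conversions come from the published norm tables.
    """
    z = (ss - mean) / sd                                        # z-score
    percentile = 100 * 0.5 * (1 + math.erf(z / math.sqrt(2)))   # normal CDF
    nce = 50 + 21.06 * z                                        # normal curve equivalent
    stanine = min(9, max(1, round(2 * z + 5)))                  # stanines are 0.5-SD bands
    return {"z": z, "percentile": percentile, "NCE": nce, "stanine": stanine}

# A standard score of 85 (one SD below the mean) sits near the 16th percentile.
print(describe_standard_score(85))
```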
Reliability:
The manual describes in detail the extensive reliability testing that was undertaken. For each, data are provided based on sample
characteristics.
Internal consistency of items: How: Using the Rasch split-half method, a Rasch ability estimate was calculated for each half of the items; the two estimates were then
correlated, and the results were reported by age group for all 15 tests. Overall, the reliabilities were high.
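Comment: The Rasch-based split-half procedure used in the manual differs from classical split-half reliability, but the classical version with the Spearman-Brown correction illustrates the same underlying logic. The sketch below is a generic illustration under that classical assumption, not the CASL’s actual computation.

```python
from statistics import correlation  # Python 3.10+

def split_half_reliability(item_scores):
    """Classical split-half reliability with the Spearman-Brown step-up.

    item_scores: one list of item scores per examinee (e.g., 0/1).
    Odd- and even-numbered items form the two halves; the half-test
    correlation is stepped up to estimate full-test reliability.
    """
    odd_totals = [sum(items[0::2]) for items in item_scores]
    even_totals = [sum(items[1::2]) for items in item_scores]
    r_half = correlation(odd_totals, even_totals)
    return (2 * r_half) / (1 + r_half)   # Spearman-Brown correction

# Toy data: four examinees, six items each.
scores = [[1, 1, 0, 1, 1, 0], [1, 0, 0, 1, 0, 0], [1, 1, 1, 1, 1, 1], [0, 0, 0, 1, 0, 0]]
print(round(split_half_reliability(scores), 2))
```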
Also, reliabilities were reported for the Core Composite for each age band and for the five Index scores, calculated using Guilford’s
formula. The Core reliabilities were in the .90s, and the Indexes ranged from .85 to .96.
Standard Error of Measurement (SEM)/confidence intervals: computed for 12 age groups using split-half reliabilities. Overall, SEMs
ranged from a low of 3.6 to a high of 9.0.
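Comment: These values are consistent with the textbook relationship SEM = SD × √(1 − r): assuming a standard-score SD of 15, reliabilities of about .94 and .64 yield SEMs near 3.7 and 9.0. The sketch below shows that relationship and a simple confidence band; it illustrates the general formula, not the manual’s exact procedure.

```python
import math

def sem(sd, reliability):
    """Standard error of measurement: SEM = SD * sqrt(1 - r)."""
    return sd * math.sqrt(1 - reliability)

def confidence_interval(observed_score, sd, reliability, z=1.96):
    """Symmetric 95% band around an observed score (illustrative only)."""
    margin = z * sem(sd, reliability)
    return observed_score - margin, observed_score + margin

# With SD = 15, reliabilities of .94 and .64 give SEMs of about 3.7 and 9.0,
# roughly the span reported in the manual.
print(round(sem(15, 0.94), 1), round(sem(15, 0.64), 1))
print(confidence_interval(100, 15, 0.94))
```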
Test-retest: 148 randomly selected students in three age groups (5 years to 6 years, 11 months; 8 years to 10 years, 11 months; and
14 years to 16 years, 11 months) were retested. The interval was 7 to 109 days, with a “median interval” of 6 weeks. “Almost all
retesting was done by the same examiner who had administered the CASL the first time” (Carrow-Woolfolk, 1999, p. 123).
Characteristics of the sample were provided. Coefficients (corrected and uncorrected) for Core tests, Supplementary tests, and
Indexes were presented in table form by age range with SDs. Ranges: .65 (Synonyms) to .95 for tests, .92 to .93 for Core Composites, and
.88 to .96 for Indexes.
Inter-rater: no information
Other: none
Validity
Content: Carrow’s detailed description of the constructs and the tests provides the basis for content validity. She based the tasks on
previous research and the theoretical design she developed. Item analyses included two steps. First, classical item analysis was used to ensure
“a good range of item difficulty within each test and to examine and eliminate, if necessary, poorly discriminating items”
(p. 113). Second, Rasch item analysis was conducted with the BIGSTEPS program.
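Comment: For readers unfamiliar with Rasch analysis, the model (estimated in the manual’s analyses with BIGSTEPS) expresses the probability of a correct response as a function of the difference between person ability and item difficulty, both on a logit scale. The sketch below shows only that response function, not the BIGSTEPS estimation procedure.

```python
import math

def rasch_probability(ability, difficulty):
    """Rasch model: P(correct) = exp(theta - b) / (1 + exp(theta - b))."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

# An examinee whose ability equals the item difficulty has a 50% chance of
# success; an item one logit easier raises that to about 73%.
print(rasch_probability(0.0, 0.0), round(rasch_probability(0.0, -1.0), 2))
```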
Criterion Prediction Validity: Studies compared CASL scores with scores from four language tests and one cognitive test: the TACL-R, OWLS, PPVT-III/EVT, and K-BIT. Each study used a counterbalanced sample of slightly different size and age range.
Results indicate:
- high correlations with the PPVT-III on lexical/semantic tests and the Receptive Index score, and the same for the EVT with lexical/semantic tests and the Expressive Index score,
- the highest correlation with the OWLS Oral Composite, with a range of correlations that reflect different content and format,
- correlations that distinguish between language and cognitive measures on the K-BIT, and
- high correlations with receptive tasks on the TACL-R.
Construct Identification Validity: The author provides information on the developmental progression of scores, intercorrelations of the
CASL tests, and the “factor structure of the category and processing indexes” (Carrow-Woolfolk, 1999, p. 126).
- Progression of scores across age ranges.
- Intercorrelation coefficients range from .30 to .79 for 6 age groups. There are “moderate correlations among tests, low enough
to support the interpretation that each test is measuring something unique but high enough to support their combination to
produce the Core Composite and Index scores” (p. 125).
- Factor analyses are detailed for ages 3-4, 5-6, and 7-21 years. A one-factor model was fit for ages 3-4 and 5-6, with model
fit statistics reported. A three-factor model was used for groups in the 7-21 year range. The author summarizes the results: “There
appears to be a trend that as the examinees get older, the fit of the data to the model gets better (chi-square decreases steadily
and p increases). There also appears to be a trend that the factor correlations decrease as the examinees’ age increases. Both
of these trends are consistent with the language theory on which the CASL is based, which predicts that as children grow
older, the linguistic structure changes from general to specific. Thus a simpler and more general one-factor model fits the data
better for younger children, while a more complex and specific three-factor model fits the data better for adolescents and
young adults. For the intermediate groups (7-10 and 11-12), the hypothesized models fits the data only moderately well,
indicating that these age groups are in the transitional stage of language development” (p. 130).
Differential Item Functioning: Comment: This information has been covered by the developmental progression of scores, but I do not
see any other reference to, or statistical analysis using, the term DIF.
Other (Please Specify): Beginning on page 137, clinical validity is detailed for eight clinical samples matched to the standardization
sample for gender, race/ethnicity, SES, and region: speech impairment (n = 50), language delay (n = 50), language impairment (n = 42),
mental retardation (n = 44), learning disability in reading (two separate age samples, n = 50 and n = 30), emotional disturbance (n = 31),
and hearing impairment (n = 7).
Comment: To their credit, AGS acknowledges that the very small sample of children with hearing impairment limits interpretation
for that group. They note the difficulty in obtaining a homogeneous sample of hearing-impaired children. However, they point out that the
finding of performance one standard deviation below the mean is consistent with other studies of hearing-impaired
children and published tests (Carrow-Woolfolk, 1999, p. 146). For clinicians working with children with hearing impairments, it is
disappointing to find that, once again, they are left searching for a means of objectively testing these children’s skills.
Comment: The Buros reviewer states “Additional information would also be useful in supporting the internal validity of the CASL”
(Snyder & Stutman, 2003, p. 218).
Summary/Conclusions/Observations:
Although Carrow is aiming for an integrative test, I still feel that with the CASL we are breaking language down into smaller parts for
examination and then attempting to draw it all together at the end using composite profiling. This may be a function of the author’s
orientation to cognitive information-processing models. In this sense, the CASL still feels like an old-style test to me (in contrast to
the Test of Narrative Language (TNL) or the Preschool Language Assessment Instrument (PLAI)).
While I think it is commendable to address the pragmatic aspect of language, I am not sure that the particular tasks are adequate. It
is difficult to assess this aspect of communication in contrived interactions.
I found the discussion of memory elucidating, as I have wondered what role memory plays in some of the tests reviewed so far. See
also the CELF-4 review for RAN and a discussion of memory in relation to language.
Clinical/Diagnostic Usefulness:
The CASL is an extensive diagnostic test. For that reason, it is unlikely that clinicians would use the full battery unless there are specific
diagnostic questions. More likely, they would select subtests that target specific skills. If only certain subtests were selected, then, of
course, we would need to be careful about the use of the psychometrics.
There have been few tests that address adolescent language, so this test is a bonus for those who follow children from elementary to
junior high and beyond, particularly as young adults prepare for work.
Elizabeth Carrow-Woolfolk is a familiar author in speech-language pathology, having developed the widely used Test of Auditory
Comprehension of Language (TACL-3). That test is often referred to as “the Carrow” and is most often used in test batteries to
quickly assess morphosyntactic aspects of language. Carrow-Woolfolk is also the author of the Oral and Written Language Scales (OWLS), the CAVAT, and the Carrow Elicited Language Inventory (CELI).
References
Bloom, L., & Lahey, M. (1978). Language development and language disorders. New York: Wiley.
Carrow-Woolfolk, E. (1999). Comprehensive assessment of spoken language (CASL). Circle Pines, MN: American Guidance Service.
Chomsky, N. (1980). Rules and representations. New York: Columbia University Press.
Snyder, K. A., & Stutman, G. (2003). Review of the Comprehensive Assessment of Spoken Language. In B. S. Plake, J. C. Impara, &
R. A. Spies (Eds.), The fifteenth mental measurements yearbook (pp. 216-220). Lincoln, NE: Buros Institute of Mental
Measurements.
U.S. Department of Education. (1996). Eighteenth annual report to Congress on the implementation of the Individuals with Disabilities
Education Act. Washington, DC: Author.
To cite this document:
Hayward, D. V., Stewart, G. E., Phillips, L. M., Norris, S. P., & Lovell, M. A. (2008). Test review: Comprehensive assessment of
spoken language (CASL). Language, Phonological Awareness, and Reading Test Directory (pp. 1-9). Edmonton, AB:
Canadian Centre for Research on Literacy. Retrieved [insert date] from http://www.uofaweb.ualberta.ca/elementaryed/ccrl.cfm