Hayward, Stewart, Phillips, Norris, & Lovell
Test Review: Structured Photographic Expressive Language Test-3 (SPELT-3)
Name of Test: Structured Photographic Expressive Language Test-3
Author(s): Dawson, Janet, Stout, Connie, and Eyer, Julia
Publisher/Year: Janelle Publications 2003 (previous editions 1983, 1991, and 1995)
Forms: only one
Age Range: 4 years, 0 months to 9 years, 11 months
Norming Sample: Two chapters by Laura Barnes, statistical consultant, describe the statistical properties of the test. Standardization
was conducted between March 2002 and March 2003. Children included in the sample were recruited by speech-language
pathologists working in various clinical settings.
Total Number: 1580
Number and Age: n=200 at 4 yrs., n=221 at 5 yrs., n=353 at 6 yrs., n=335 at 7 yrs., n=250 at 8 yrs., and n=221 at 9 yrs.
Location: 20 states representing four major U.S. geographic regions
Demographics: Data were reported for gender, ethnicity, and age. The authors noted that, compared with U.S. Census data (U.S. Bureau of Census, 2000), both African American and Caucasian children were closely representative of the population, while Hispanic children were slightly underrepresented and other ethnicities slightly overrepresented.
Rural/Urban: not specified.
SES: Mothers’ education level was used as the indicator and recorded as percentages of the sample: 27% less than high school, 30% some postsecondary, 37% college graduates, and 6% no indication.
Other (Please Specify): The proportion of language-impaired children compared closely with prevalence estimates of 7%, though exact numbers were not given. Tables present sample characteristics (numbers and percentages) by region and age, ethnicity and age, and gender and age for each region and the country as a whole.
Summary Prepared By (Name and Date): Eleanor Stewart 18 July 2007
Test Description/Overview:
The test kit consists of a manual, a photographic stimulus book, and test forms. The test form contains identifying information and short instructions on administration, response recording, and scoring.
This new edition includes new test items for assessment of more complex language structures. In addition, more information on
prompting is outlined. Stimulus photographs are updated and said to reflect U.S. population diversity. A chapter describes African
American English and includes scoring for AAE. Standardization was expanded to include children 4 years, 0 months to 9 years, 11
months of age.
The authors claim that because “the contextual framework is provided along with photographs depicting the activities or events, there
is a reduction of emphasis on the pragmatic and semantic aspects of language. Familiar vocabulary and situations are illustrated”
(Dawson, Stout, & Eyer, 2003, p. 3).
In the introductory chapter, the authors provide a brief overview of specific language impairment (SLI) as a diagnostic category of
language learning disability. They discuss the impact of SLI on oral language, social interaction, and written language, particularly
reading. The persistence of these difficulties throughout the child’s academic career is highlighted. Specific challenges in the area of
morphosyntactic abilities are associated with SLI. A brief summary of studies is provided that appears to be current to 1999. Based
on the evidence outlined, the authors propose that assessment of morphology and syntax is key to identifying language impairments
in children. They cite several studies that used versions of the SPELT in research on SLI which they say support its use as an
assessment tool with “good discriminant accuracy” (Dawson et al., 2003, p. 2).
Comment: Perona, Plante, and Vance addressed discriminant accuracy in their 2005 article.
Comment: Other than this single statement, no further discussion of the relationship with reading is offered. Too bad.
The SPELT-3 uses an elicited format. The authors state that ideally targets could be assessed from spontaneous language samples.
However, for practical purposes, language sampling is generally too time-consuming and faces other constraints such as how to
generate the conditions to elicit a range of structures spontaneously. Further, they note that spontaneous conversation rarely generates
all of the complex structures of interest.
Purpose of Test: “The first purpose of the SPELT-3 is to identify children who are performing significantly below their age
equivalent peers in the production of morphosyntactic structures” (Dawson et al., 2003, p. 3), to identify their individual relative
strengths and needs, and to identify areas of further investigation.
Areas Tested:
1. Morphological structures: preposition, plural, possessive noun and pronoun, reflexive pronoun, subject pronoun, direct/indirect
object, present progressive aspect, regular and irregular past tense, modal auxiliaries, contractible/uncontractible copula, and
contractible/uncontractible auxiliary are assessed.
2. Syntactic structures: negative, conjoined sentences, wh question, interrogative reversal, negative infinitive phrase, propositional
complement, relative clause, and front/center embedded clause are assessed. Of these structures, new additions to SPELT-3 were: wh
clause, propositional complement, front/center embedded subordinate clause and the relative clause.
Areas Tested:
 Oral Language
Grammar
Who can Administer: No specific professionals are identified. Like other tests reviewed, the manual specifies only that individuals who administer, score, and interpret the SPELT-3 have a “thorough understanding of child language development, particularly morphology and syntax” (Dawson et al., 2003, p. 5). Comment: Such knowledge limits the range of individuals who would be ideally qualified to use the SPELT-3. For example, a speech pathology faculty member once despaired that her SLP graduate students could not identify these structures. That is disheartening when the research points to these structures as indicators of SLI.
Administration Time: Administration time is 15 to 25 minutes “depending on examiner familiarity and child’s ability and
personality” (Dawson et al., 2003, p. 7).
Test Administration (General and Subtests):
The authors urge examiners to be well acquainted with all aspects of the test prior to assessing children. In particular, they note, “In-depth study of the Scoring Guide (Chapter 4) is necessary for the clinician to be able to accurately score responses” (Dawson et al., 2003, p. 5). Test conditions include a quiet room free from distractions and an appropriate table and chair, so that the child has comfortable access to the test materials and the examiner can record responses on the form and turn pages in the photo booklet.
The test items are numbered from 1 to 53. There is one practice item with a prompt (“Tell me the whole thing”). Abbreviations of target structures are listed alongside each test item. The examiner’s eliciting statement is clearly marked in bold print, with the child’s target response alongside in underlined type. For example, item 7: pl. n. (ez). Tell me about this picture. [Prompt if response is “cups”: Here is one glass, here are two _____.] (Two glasses on the table.)
Appendix A provides detailed instructions for eliciting prompts. Comment: I think that these prompts don’t add much in the situations that clinicians described to me. They felt that they had often exhausted the possibilities and then were not able to cue. Perhaps the test could be further developed to include a dynamic assessment component?
Responses are marked as correct or incorrect on the record form. The correct items are added to obtain a raw score, which is converted to a standard score with confidence intervals, percentile ranks, and age equivalents.
Test Interpretation:
Chapter 5, “Analysis and Interpretation”, discusses the standardized scores and their use in interpretation, and presents the evidence
for interpreting linguistic performance relative to the research on the linguistic structures tested. The authors caution against the use
of age equivalents though they are included in the available scores. At the end of the chapter, the authors present a section on general
cautions regarding interpreting test results. Though these are familiar cautions, I think that the section is particularly well written.
Comment: I found the section on interpreting linguistic performance very useful as a brief review of the emergence of structures.
Standardization:
Age equivalent scores
Percentiles and percentile band
Other: confidence intervals at the 90% and 95% levels
Standard scores
In the chapter on standardization, Laura Barnes outlines the decisions made with regard to the scores derived from the
standardization sample. Here, for example, she discusses how age equivalent scores that fell between the median ages for two age
groups were assigned to the lower age group. Further she states, “It cannot be over emphasized that raw scores at the upper limits of
an age equivalency range represent above average performance for that age range and typically only somewhat below average
performance for the next higher age range” (Barnes, 2003a, p. 44). She cautions the examiner to consider measurement error when
children’s raw scores approximate the cut-off for an age group. Barnes states, “These age equivalency ranges are an extremely rough
estimate of test-age performance and should not ever be used for placement” (p. 44). Comment: This last statement should be written in bold type, as it is most unfortunate that age equivalents are even presented given the temptation they carry. I never, never, never report age equivalents in reports or when communicating a child’s performance. So, I am not sure why test developers continue to calculate AEs and include them in the test package. Is there a reason?
In addition to her comments on age equivalency, Barnes also notes the decisions made with respect to standard score development
where the decision was made to combine genders given that “no clear pattern of gender differences” was found. By doing so, the
developers were able to “maintain a reasonably large number of cases for each normative group” (Barnes, 2003a, p. 43). Also, since
the distributions (means and standard deviations) of children at ages 8 years to 8 years, 6 months and 9 years to 9 years, 6 months overlapped, a decision was made to report 12-month intervals for 8 and 9 year olds. Using z scores, raw scores were converted to
standard scores with a mean of 100 and SD of 15. Barnes comments, “This was not a normalizing transformation because there was
no expectation that morphosyntactic abilities would be normally distributed in the population. It was, in fact, expected and observed
that the raw scores would be negatively skewed (that is, children would make relatively few errors) and that the skewness would
increase with age” (p. 43). Barnes points out that caution should be used in interpreting percentile scores. Comment: I found Barnes’s discussion illuminating for its explanation of the decisions that must be made in order to develop standardized scores. I haven’t found many other test authors describing these decisions, though I imagine all test developers must make them.
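The conversion Barnes describes can be sketched as a simple linear (non-normalizing) transformation of z scores onto a scale with mean 100 and SD 15. The age-group mean and SD below are illustrative values only, not the SPELT-3 norms:

```python
def standard_score(raw, group_mean, group_sd, mean=100.0, sd=15.0):
    """Convert a raw score to a standard score via a z-score transformation.

    This is a linear rescaling only; as Barnes notes, no normalizing
    transformation is applied, so skew in the raw scores is preserved.
    """
    z = (raw - group_mean) / group_sd  # z score within the child's age group
    return mean + sd * z

# Example with invented age-group norms (mean 40, SD 5):
print(standard_score(45, 40, 5))  # 115.0 (one SD above the group mean)
print(standard_score(40, 40, 5))  # 100.0 (exactly at the group mean)
```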
Reliability:
Internal consistency of items: Coefficients “ranged from .76 to .92 with median reliability of .86 … somewhat higher estimates for scores of younger children are likely due to greater variability among their scores” (Barnes, 2003b, p. 46).
Test-retest: 56 children were retested with a median interval of 11 days. The correlation is reported as .94.
Inter-rater: Two trained raters independently scored 188 completed protocols (85 females and 101 males) representing 8 states.
Correlations of .97 to .99 were reported. The authors also present the absolute differences stating, “Raters were within one point of
each other for 90% of the sample with higher rates of agreement for protocols of older children” (p. 45).
Other: SEMs. To obtain standard errors of measurement for raw scores, separate internal consistency estimates by age group were calculated. These estimates were “also used to obtain the standard errors for computing confidence intervals for standard scores and percentiles” (p. 46).
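In classical test theory, the standard error of measurement is derived from the score SD and the reliability coefficient, and a confidence interval is the score plus or minus a multiple of the SEM. The sketch below uses the standard-score SD of 15 and the median reliability of .86 reported above; the exact SEMs in the manual were computed per age group from raw-score reliabilities, so these figures are illustrative:

```python
import math

def sem(sd, reliability):
    """Standard error of measurement: SD * sqrt(1 - reliability)."""
    return sd * math.sqrt(1.0 - reliability)

def confidence_interval(score, sd, reliability, z=1.96):
    """Score +/- z * SEM; z = 1.645 for 90%, 1.96 for 95%."""
    error = z * sem(sd, reliability)
    return (score - error, score + error)

s = sem(15, 0.86)                            # ~5.61 standard-score points
lo, hi = confidence_interval(100, 15, 0.86)  # ~ (89.0, 111.0) at 95%
```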
Validity:
Content: A research review was conducted examining evidence for the development of morphology and syntax relative to test items, revisions of items, and the addition of proposed items. Items were compared to the Index of Productive Syntax (IPSyn; Scarborough, 1990), a measure “widely used for analysis of spontaneous language in clinical and research settings” (Barnes, 2003b, p. 46). This comparison demonstrated “that the majority of SPELT-3 items tapped the verb phrase.” The author goes on to say that the evidence supports that children with SLI have difficulty with verb phrase structures. Accordingly, the SPELT-3 contains fewer targets for the noun phrase, as children have less difficulty with nouns.
Criterion Prediction Validity: Concurrent validity with the Syntax Construction Test of the CASL was demonstrated in a study of 34 children (22 girls, 12 boys, ages 4 through 9, in one state). A correlation coefficient of .78 indicates “substantial overlap,” as both
tests measure the expressive aspect of morphosyntactic structures.
Construct Identification Validity: Evidence is presented in Table 8 and in text that “test score means increased with age, while standard deviations decreased with age, and the rate of score increase was more pronounced among younger children” (Barnes, 2003b, p. 47), providing evidence that development is captured.
Differential Item Functioning: See Construct Identification Validity above.
Other: The Perona et al. (2005) study of 4 and 5 year olds reported 90% sensitivity and 100% specificity.
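Sensitivity and specificity figures like those from Perona et al. (2005) come from a simple confusion-matrix calculation; the counts below are invented purely to illustrate how 90% and 100% could arise, and are not the actual study data:

```python
def sensitivity(true_pos, false_neg):
    """Proportion of children with language impairment correctly identified."""
    return true_pos / (true_pos + false_neg)

def specificity(true_neg, false_pos):
    """Proportion of typically developing children correctly classified."""
    return true_neg / (true_neg + false_pos)

# Hypothetical counts: 18 of 20 impaired children flagged by the test,
# and all 20 typically developing children passing it.
print(sensitivity(18, 2))  # 0.9
print(specificity(20, 0))  # 1.0
```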
Summary/Conclusions/Observations:
My overall impression is that of a well-developed test that contributes important information to the identification of children with
language difficulties. In fact, it is one of the few tests I have reviewed so far that has strong discriminative ability. The reference list
is concise at 50 entries. Perhaps that is because the field of inquiry is rather well-defined for morphosyntactic abilities.
Clinical/Diagnostic Usefulness:
While the test properties are impressive and there is very good evidence to support using the SPELT-3 for identifying children with
SLI, the word on the street is that this is a hard test to administer. Clinicians told me that while prompts are outlined, they still have
difficulty eliciting the target responses. They are left without a strategy that preserves the integrity of the test for diagnostic purposes.
Still, in relation to tests reviewed, the SPELT-3 is outstanding in its construction and psychometric foundation. I would encourage
clinicians to persist and incorporate this test in their test batteries.
References
Barnes, L. (2003a). Development and standardization. In J. Dawson, C. Stout, & J. Eyer (Eds.), Structured Photographic Expressive Language Test-3 (pp. 41-44). DeKalb, IL: Janelle Publications.
Barnes, L. (2003b). Technical qualities of SPELT-3 standardization sample scores. In J. Dawson, C. Stout, & J. Eyer (Eds.), Structured Photographic Expressive Language Test-3 (pp. 45-47). DeKalb, IL: Janelle Publications.
Dawson, J., Stout, C., & Eyer, J. (2003). Structured Photographic Expressive Language Test-3. DeKalb, IL: Janelle Publications.
Perona, K., Plante, E., & Vance, R. (2005). Diagnostic accuracy of the Structured Photographic Expressive Language Test-3 (SPELT-3). Language, Speech, and Hearing Services in Schools, 36, 103-115.
Scarborough, H. S. (1990). Index of Productive Syntax. Applied Psycholinguistics, 11, 1-12.
U.S. Bureau of Census (2000). Statistical Abstract of the United States. Washington, DC: Author.
To cite this document:
Hayward, D. V., Stewart, G. E., Phillips, L. M., Norris, S. P., & Lovell, M. A. (2008). Test review: Structured photographic
expressive language test-3 (SPELT-3). Language, Phonological Awareness, and Reading Test Directory (pp. 1-7). Edmonton,
AB: Canadian Centre for Research on Literacy. Retrieved [insert date] from
http://www.uofaweb.ualberta.ca/elementaryed/ccrl.cfm.