Clinical Decision-Making: Using Tests to Determine Child Language Impairment Severity
TAMMIE SPAULDING, PHD, CCC-SLP
UNIVERSITY OF CONNECTICUT
NOVEMBER 19, 2011
Student Collaborators
Graduate Students
Undergraduate Students
• Cecilia Figueroa
• Ashley Burgess
• Sabrina Jara
• Caitlin Geary
• Margaret Swartwout
• Kacie Wittke
• Calli Schechtman
• Shannon Tobin
Clinical Decision-Making
[Diagram labels: Not Impaired; Impaired; Severity of Impairment; Prognosis; Frequency and Duration of Treatment Session; Treatment Approach & Priorities; LTGs & STOs]
Articles of Reference
1. Spaulding, T. (in press). Inconsistency in severity ratings for children with specific language impairment. Journal of Communication Disorders.
2. Spaulding, T., Swartwout, M., & Figueroa, C. (in press). Using norm-referenced tests to determine severity of language impairment in children: Disconnect between U.S. policy-makers and test developers. Language, Speech, and Hearing Services in Schools.
Discuss the Disconnect between Educational Agency Guidelines for School-Based SLPs and Empirical Evidence
• Determine what educational agencies say to do (re: the use of norm-referenced tests) for determining severity of impairment in children
• Determine if norm-referenced tests
  - Indicate that they should be used to determine severity of language impairment
  - Provide evidence in their examiner’s manuals for this use
• Determine if empirical evidence supports the use of norm-referenced tests for informing severity decisions
Importance of Determining Severity of Impairment
• Influences Service Delivery (Spaulding et al., pilot data)
  - Frequency and duration of intervention
  - Treatment approach
  - Priorities for intervention
  - Prognostic expectations
  - Eligibility for service
Characterizing Severity of Impairment
• Categorization is typical
  - Clinical practice (Caesar and Kohler, 2009)
  - Research studies (e.g., Ballantyne et al., 2007; Cohen et al., 2005; van Daal et al., 2004)
Norm-referenced test use for severity determinations
• Researchers rely on norm-referenced tests as the indicator of impairment severity (e.g., Ballantyne et al., 2007; Bishop & Edmundson, 1987; Evans et al., 2009; Lahey et al., 2001; Hart et al., 2004)
• Are SLPs encouraged to do the same?
  - School-based SLPs
  - U.S. Department of Education guidelines
Research Question: Clinical Guidelines Study
Spaulding, Swartwout, & Figueroa (in press)
• Do U.S. state Departments of Education recommend the use of norm-referenced test scores to inform severity of language impairment decisions?
Method: Clinical Guidelines Study
Spaulding, Swartwout, & Figueroa (in press)
Results: Clinical Guidelines Study
State Departments of Education
STATE | CATEGORY | STANDARD SCORE | STANDARD DEVIATION
North Dakota, Maine, Arkansas, Colorado | Mild | 78-85 | -1.0 to -1.5
North Dakota, Maine, Arkansas, Colorado | Moderate | 70-77 | -1.5 to -2.0
North Dakota, Maine, Arkansas, Colorado | Severe | <70 | < -2.0
Illinois | Mild | 78-84 | -1.0 to -1.5
Illinois | Moderate | 63-77 | -1.5 to -2.5
Illinois | Severe | 62 or below | -2.5 or below
Illinois | Profound | No criteria specified | No criteria specified
Kentucky | Mild | 75-79 | -1.33 to -1.66
Kentucky | Moderate | 70-74 | -1.66 to -2.0
Kentucky | Severe | <70 | -2.0 or below
Tennessee | Mild | 70-77 | -1.5 to -2.0
Tennessee | Moderate | 62-69 | -2.0 to -2.5
Tennessee | Severe | 62 or below | -2.5 or below
Virginia* | Mild | 78-84 | -1.0 to -1.5
Virginia* | Moderate | 70-75 | -1.5 to -2.0
Virginia* | Severe | <70 | -2.0 or below
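The standard-score and standard-deviation criteria listed above can be related with a simple conversion. The sketch below assumes the usual normative scale for these tests (mean 100, standard deviation 15); that scale is an assumption here, not something the slides state, and the function names are illustrative only.

```python
# Assumed normative scale: mean 100, SD 15 (not stated in the slides).
MEAN = 100.0
SD = 15.0


def to_sd_units(standard_score: float) -> float:
    """Express a standard score as a distance from the mean in SD units."""
    return (standard_score - MEAN) / SD


def to_standard_score(sd_units: float) -> float:
    """Express a distance from the mean (in SD units) as a standard score."""
    return MEAN + sd_units * SD


if __name__ == "__main__":
    # SD boundaries that appear in the state criteria
    for z in (-1.0, -1.33, -1.5, -1.66, -2.0, -2.5):
        print(f"{z:+.2f} SD -> standard score {to_standard_score(z):.1f}")
```

Under this assumption, -1.5 SD corresponds to a standard score of 77.5 and -2.5 SD to 62.5, which is roughly where the standard-score columns place their boundaries.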
Results: Clinical Guidelines Study
State Departments of Education
STATE DEPT OF ED | OTHER SOURCES OF DATA
Arkansas | Alternate assessments when validity is a concern
Colorado | Classroom observations, curriculum-based assessments, oral/written language samples, informal probes
Illinois | Two or more diagnostic procedures/standardized tests
Kentucky | Language samples/narratives, classroom observations, teacher/parent interviews, criterion-referenced activities, writing samples
Maine | Informal assessments can be used
North Dakota | Checklists, language samples, classroom observations
Tennessee | Language samples, checklists, observations
Virginia | Criterion-referenced measures, curriculum-based assessments, dynamic assessment, language samples, contextual probes, structured observations, interviews, reports, checklists
Results: Clinical Guidelines Study
State Departments of Education
STATE DEPT OF ED | WEIGHTING OF MEASURES FOR SEVERITY DECISIONS
Arkansas, Illinois, Maine | ?
Colorado | Equal weight: norm-referenced test score, informal assessment, comprehension of curricular information. More weight: impact of linguistic deficits on educational performance
Kentucky | More weight to functional assessment and impact on educational performance than to norm-referenced test score
North Dakota | Equal weight to educational impact and norm-referenced test score; less weight to informal assessment results
Tennessee | Equal weight to norm-referenced test score, informal assessments, and functional/academic language skills
Virginia | Equal weight to norm-referenced test score, nonstandardized assessment/functional analyses, language functioning in low comprehension/low verbal demand environments, and language functioning in high comprehension/high verbal demand environments
Discussion: Clinical Guidelines Study
State Departments of Education:
• 8 state Depts of Ed. say to use norm-referenced tests for severity decisions: Implication:
• Severity of impairment criteria varied across states: Implication:
• Indicate specific criteria for severity determinations based on test scores
Assumption 1: ……….. I can apply these boundary criteria on whatever test I select to administer
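To make the variation across states concrete, the sketch below applies each state's standard-score ranges from the Results table to the same score. The ranges are transcribed as listed; treating "<70" as "69 or below" assumes whole-number standard scores, and the names `STATE_CRITERIA`, `label_for`, and `labels_by_state` are hypothetical helpers, not part of any state guideline.

```python
from typing import Optional

# (label, lowest score, highest score), inclusive; None means no lower bound.
# "<70" is entered as "69 or below" (assumes whole-number standard scores).
# Tennessee's ranges overlap at 62 as transcribed; the first match wins here,
# which is an arbitrary choice made for this sketch.
STATE_CRITERIA = {
    "North Dakota, Maine, Arkansas, Colorado": [
        ("Mild", 78, 85), ("Moderate", 70, 77), ("Severe", None, 69)],
    "Illinois": [
        ("Mild", 78, 84), ("Moderate", 63, 77), ("Severe", None, 62)],
    "Kentucky": [
        ("Mild", 75, 79), ("Moderate", 70, 74), ("Severe", None, 69)],
    "Tennessee": [
        ("Mild", 70, 77), ("Moderate", 62, 69), ("Severe", None, 62)],
    "Virginia": [
        ("Mild", 78, 84), ("Moderate", 70, 75), ("Severe", None, 69)],
}


def label_for(score: int, ranges) -> Optional[str]:
    """Return the first severity label whose range contains the score."""
    for label, low, high in ranges:
        if (low is None or score >= low) and score <= high:
            return label
    return None  # score falls outside every range the state lists


def labels_by_state(score: int) -> dict:
    """Apply every state's criteria to one standard score."""
    return {state: label_for(score, ranges)
            for state, ranges in STATE_CRITERIA.items()}


if __name__ == "__main__":
    for state, label in labels_by_state(68).items():
        print(f"{state}: {label}")
```

With a standard score of 68, the same child would be labeled Severe under three sets of criteria and Moderate under the Illinois and Tennessee criteria.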
Research Question: Applying State Ed Criteria to Real Children
• Assumption 1: Tested
  - I can apply the State Dept of Education criteria on whatever test I select to use and severity of impairment ratings will be consistent.
Participants: Demographic Characteristics
CHARACTERISTIC | TD (n=31) | SLI (n=31)
AGE | 4.66 (4.0-5.5) | 4.66 (4.0-5.7)
SEX | 23 males, 8 females | 23 males, 8 females
MOTHER’S EDUCATION LEVEL | 14.24 (11-17) | 14.13 (11-17)
RACE: African American | 2 | 4
RACE: American Indian | 0 | 2
RACE: Asian | 1 | 0
RACE: White | 22 | 15
RACE: Unspecified | 6 | 10
ETHNICITY
HISPANIC
NON-HISPANIC
UNSPECIFIED
5
4
22
14
5
12
Method: Assumption 1
Applying State Ed Criteria to Real Children
• Preschool children with SLI and typically developing (TD) children were administered:
  - Test for Examining Expressive Morphology (TEEM; Shipley, Stone, & Sue, 1983)
  - Structured Photographic Expressive Language Test-Preschool, Second Edition (SPELT-P2; Dawson, Stout, Eyer et al., 2005)
• The most common boundary criteria that state Departments of Education recommended for use were applied to their scores on these assessments
• Consistency between severity rankings was determined
Method: Assumption 1
Applying State Ed Criteria to Real Children
• Most common State Dept of Ed boundaries
SEVERITY CATEGORY | STANDARD SCORE BOUNDARIES
Typically Developing | >85
Mild | 78-85
Moderate | 70-77
Severe | <70
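The procedure described in this method (apply the most common boundaries to each child's score on two tests, then check whether the two severity labels agree) can be sketched as follows. The scores in the example are made up purely for illustration and are not the study's data; `classify` and `count_consistency` are hypothetical helper names, and whole-number standard scores are assumed.

```python
def classify(standard_score: int) -> str:
    """Apply the most common State Dept of Ed boundaries to one score."""
    if standard_score > 85:
        return "Typically Developing"
    if standard_score >= 78:
        return "Mild"
    if standard_score >= 70:
        return "Moderate"
    return "Severe"


def count_consistency(scores_test_a, scores_test_b):
    """Count children whose severity label matches (or differs) across two tests.

    The two lists hold the same children in the same order.
    Returns (consistent, inconsistent).
    """
    pairs = list(zip(scores_test_a, scores_test_b))
    consistent = sum(classify(a) == classify(b) for a, b in pairs)
    return consistent, len(pairs) - consistent


if __name__ == "__main__":
    # Hypothetical standard scores for four children on two different tests.
    test_a = [90, 80, 72, 65]
    test_b = [88, 75, 71, 79]
    print(count_consistency(test_a, test_b))
```

In this made-up example, two of the four children receive the same label on both tests and two do not, even though every score was judged against identical boundaries.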
Participants: Norm-referenced Test Scores
Results: Consistency in Severity Classifications Using State Ed Criteria (SPELT-P2 vs TEEM)
[Bar chart: # of participants (0-30) classified as CONSISTENT vs. INCONSISTENT, shown for the SLI and TD groups]
Results: Consistency in Severity Classifications: Specifics of Severity Rankings
CATEGORY | TEEM | SPELT-P2
TYPICALLY DEVELOPING | 26 TD | 31 TD, 3 SLI
MILD | 5 TD | 6 SLI
MODERATE | 0 | 7 SLI
SEVERE | 31 SLI | 15 SLI
Assumption 1: Discussion
• Can I apply the State Dept of Education criteria on whatever test I select to use and get a consistent severity determination? ANSWER: NO
Study 1: Assumption 2
Spaulding, Swartwout, & Figueroa (in press)
• 8 (16%) of State Departments of Education say to use norm-referenced test scores to determine severity of impairment
  - ASSUMPTION 2: Norm-referenced tests must be designed for the purpose of determining severity of impairment
Research Questions: Assumption 2: Tested
Spaulding, Swartwout, & Figueroa (in press)
• Do norm-referenced tests indicate that they should be used to determine severity of language impairment in children?
• If so, do they provide empirical evidence supporting this use?
Method: Assumption 2
Spaulding, Swartwout, & Figueroa (in press)
• Ordered the latest edition of 45 norm-referenced tests of child language
  - Met Criteria:
    - Assessed oral language skills of English-speaking children between the ages of 5-17
    - Were normed on children who spoke American English
    - Were commercially available
    - Required elicited responses from children
    - Were not screening or criterion-referenced measures
• Looked in the examiner’s manuals
Method: Assumption 2
Spaulding, Swartwout, & Figueroa (in press)
• Descriptive data obtained regarding which tests:
  - Indicated they should be used for the purpose of determining severity
  - Provided information to convert test performance to a severity rating
    - What those ratings were, the boundaries between the severity ratings, and how these boundaries were derived
Results: Assumption 2: Norm-Referenced Test Review
(Spaulding, Swartwout, & Figueroa, in press)
• The Test of Word Knowledge (TOWK; Wiig & Secord, 1992) stated that it could be used to determine severity of language impairment in children
• Eleven tests provided information on how to convert test performance to a severity rating
TEST | SEVERITY BOUNDARIES | PUBLISHER
BBCS:E, BBCS-3:R | 55-70: Very Delayed; 75-85: Delayed | The Psychological Corporation
CELF-4, CELF-P2, DELV-NR | ≤70: Very Low/Severe; 71-77: Low/Moderate; 78-85: Marginal/Borderline/Mild | The Psychological Corporation
CREVT-2, TELD-3, TOPL-2, UTLD-4 | <70: Very Poor; 70-79: Poor; 80-89: Below Average | Pro-Ed
MAVA | ≤70: Very Low; 71-77: Low; 78-85: Borderline/Marginal | Super Duper Publications
ROWPVT-2000 | ≤72: Low; 73-88: Below Average | Academic Therapy Publications
Take Home Points: Assumption 2
Spaulding, Swartwout, & Figueroa (in press)
• Only eleven test manuals provide criteria for converting a child’s test score to a severity label
• No empirical data to support the cut-off boundaries they provided
• They don’t appear to be based on how children with language impairment perform on the test
Take Home Points:
• You can’t apply the State Dept of Ed criteria on any test selected for use and expect it to be accurate
• Tests don’t appear to be designed to determine severity of impairment even if they tell you they are
Research Question: Study 2
Spaulding (in press)
• Can I apply the State Dept of Education criteria on whatever test I select to use and get a consistent severity determination? ANSWER: NO
• If test manuals provide the same cut-off boundaries for severity determinations, will children be consistently classified with the same severity of impairment on these tests?
Participants: Demographics
CHARACTERISTIC | SLI (n=16) | TD (n=16)
Age: M | 50.81 months | 51.38 months
Age: SD | 8.89 | 8.79
Age: Range | 38-64 months | 38-65 months
Sex | 9 boys, 7 girls | 9 boys, 7 girls
Mother’s education level: M | 14.28 years | 14.61 years
Mother’s education level: SD | 1.32 | 1.50
Mother’s education level: Range | 12-16 years | 13-17 years
Ethnicity (n): Not Hispanic | 7 | 10
Ethnicity (n): Hispanic | 5 | 4
Ethnicity (n): Not reported | 4 | 2
Race (n): White | 9 | 13
Race (n): Black/African American | 1 | 1
Race (n): Multiracial | 2 | 1
Race (n): Not reported | 4 | 1
Participants: Norm-referenced Test Performance
TEST | SLI Mean | SLI SD | SLI Range | TD Mean | TD SD | TD Range
*CELF-P2 | 77.69 | 7.19 | 65-84 | 113.06 | 9.55 | 98-129
*PPVT-IV | 91.06 | 9.15 | 77-111 | 113.75 | 10.70 | 92-126
KABC-II | 102.88 | 8.64 | 89-115 | 106.09 | 7.51 | 98-119
*significantly different at p<.05
Descriptive Ratings for Standardized Test Scores
CLASSIFICATION | TELD-3 SCORE RANGE | UTLD-4 SCORE RANGE
Very Superior | >130 | 131-165
Superior | 121-130 | 121-130
Above Average | 111-120 | 111-120
Average | 90-110 | 90-110
Below Average | 80-89 | 80-89
Poor | 70-79 | 70-79
Very Poor | <70 | 35-69
Consistency in Language Proficiency Designations Using Procedures in Tests Themselves
UTLD-4 vs. TELD-3: Severity Consistency
[Bar chart: # of participants (0-14) classified as CONSISTENT vs. INCONSISTENT, shown for the SLI and TD groups]
Specifics of Proficiency Rankings
CLASSIFICATION | TELD-3 | UTLD-4
VERY SUPERIOR | | 1 TD
SUPERIOR | 3 TD | 2 TD, 2 TD
ABOVE AVERAGE | 7 TD | 4 TD
AVERAGE | 8 SLI, 6 TD | 1 SLI, 1 SLI, 6 TD, 1 TD
BELOW AVERAGE | 3 SLI | 4 SLI, 2 SLI, 2 SLI
POOR | 2 SLI | 4 SLI, 1 SLI
VERY POOR | 3 SLI | 1 SLI
Discussion: Study 2
Spaulding (in press)
• Can I apply the State Dept of Education criteria on whatever test I select to use and get a consistent severity determination? ANSWER: NO
• Can I apply the boundary criteria for determining severity of language impairment in the test manuals themselves and find consistent severity determinations? ANSWER: NO
Conclusions
• Be cautious in using norm-referenced test performance to inform severity decisions given
  - Inconsistency in educational agency guidelines
  - Lack of consistency in severity designations based on how children with language impairment perform on these tests
  - Lack of data within test manuals to support this use
Thank you!
Item Selection: Maximize Severity Determination Accuracy
• Test developers carefully (hopefully) consider the items on the test relative to their purpose
• Considers
  - Item difficulty
  - Ability of the people tested
[Figure labels: Item difficulty; Person ability; Easy item; Moderate item; Difficult item]
Item Selection: Maximize Diagnostic Utility
[Figure labels: Impaired; Unimpaired; Person’s Ability]
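The interplay of item difficulty and person ability on these two slides can be illustrated with a one-parameter logistic (Rasch) model. This is an assumption made for illustration; the slides do not name a specific measurement model, and the item difficulties below are hypothetical values on an arbitrary logit scale.

```python
import math


def p_correct(ability: float, difficulty: float) -> float:
    """Rasch model: probability that a person answers an item correctly."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def item_information(ability: float, difficulty: float) -> float:
    """Information an item gives about ability; largest when ability == difficulty."""
    p = p_correct(ability, difficulty)
    return p * (1.0 - p)


if __name__ == "__main__":
    # Hypothetical item difficulties, for illustration only.
    items = {"easy": -2.0, "moderate": 0.0, "difficult": 2.0}
    for ability in (-2.0, 0.0, 2.0):
        row = ", ".join(
            f"{name}: {item_information(ability, b):.3f}"
            for name, b in items.items()
        )
        print(f"ability {ability:+.1f} -> {row}")
```

Under this model an item is most informative for people whose ability is close to its difficulty. A test whose items cluster near the impaired/unimpaired boundary would therefore separate those two groups well while giving comparatively little information for ranking children who all fall well below that boundary.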
Comparison group should be children with different degrees of impairment
• Do norm-referenced tests provide a sample of children with different degrees of impairment in their examiner’s manuals?
• If so, do they provide a means for comparing a child’s score to children with different degrees of impairment to determine how impaired they are?
Design characteristics to maximize severity accuracy
• Characteristic 1:
  - Diagnosing vs. determining severity of impairment (item selection process differs) - Lost 4 tests, down to 7
• Characteristic 2:
  - Comparison group should be children with different degrees of impairment - No tests do
• Characteristic 3:
  - If boundary cut-offs are provided, need to include an analysis showing these cut-off boundaries accurately distinguish amongst children with these different degrees of impairment - No tests do; boundaries based primarily on publisher