What Have We Learned From Research?

Assessment Accommodations:
What Have We Learned
From Research?
Stephen G. Sireci
Center for Educational Assessment
University of Massachusetts Amherst
Mary J. Pitoniak
Educational Testing Service
© copyright 2006 Stephen G. Sireci
In this presentation we will
• Discuss validity issues in test
accommodations
• List the most common test accommodations
used to promote valid score interpretation
• Discuss research conducted on test
accommodations
• Suggest areas for future research on test
accommodations
Defining “Accommodation”
• The Standards for Educational and
Psychological Testing
– use the terms “modification” and
“accommodation” almost interchangeably,
– use accommodation “as the general term
for any action taken in response to a
determination that an individual’s disability
requires a departure from standard testing
protocol” (p. 101).
Current State Testing Programs
• “Accommodation” is used to refer to test or
test administration changes that are not
considered to alter the construct measured.
• “Modification” is used to refer to changes
that are thought to alter the construct.
Validity Issues in Accommodations
• To support valid test score interpretations for
students with disabilities, it is important to remove
construct-irrelevant barriers to these students’ test
performance, but it is also important to maintain
“construct representation.”
• In situations where individuals who take
accommodated versions of tests are compared
with those who take the standard version, an
additional validity issue is the comparability of
scores across the different test formats.
The Psychometric Oxymoron
• Accommodated Standardized Test
– Promotes fairness in testing?
Or
– Provides an unfair advantage to some
examinees?
What do the Standards for Educational and
Psychological Testing say on this issue?
Standards for Educational and
Psychological Testing
• Standard 10.1: “In testing individuals with
disabilities, test developers, test administrators,
and test users should take steps to ensure that
the test score inferences accurately reflect the
intended construct rather than any disabilities and
their associated characteristics extraneous to the
intent of the measurement” (AERA et al., p. 106).
Standards for Educational and
Psychological Testing
• Standard 10.4: “If modifications are made or
recommended by test developers. . .
(unless) evidence of validity for a given inference
has been established for individuals with the
specific disabilities, test developers should
issue cautionary statements in manuals or
supplementary materials regarding confidence
in interpretations based on such test scores”
(AERA et al., p. 106).
“Cautionary statements”
• Flagging of test scores: Controversial—
most research in this area focused on
postsecondary and postgraduate admissions
tests (Sireci, 2005).
• How do states handle score reporting issues
for accommodated and alternate
assessments?
Accommodated Tests and Accommodated Test
Administrations have the Potential to
Undermine Validity in at Least 2 Ways:
1. Construct underrepresentation
2. Construct-irrelevant variance
As stated by Messick (1989):
“Tests are imperfect measures of constructs
because they either leave out something that
should be included…or else include something
that should be left out, or both” (p. 34)
• When standardized tests are NOT
accommodated for SWD
– Construct-irrelevant variance can interfere
with test performance
• e.g., the ability to see, hear, or focus interferes with
measurement of math or reading proficiency
• When standardized tests ARE
accommodated
– Construct underrepresentation may occur
• e.g., read-aloud for a reading assessment
What methods do states use to minimize
construct-irrelevant variance, while
maintaining construct representation?
Categories of Accommodations
• Presentation
• Timing
• Response
• Setting
Thompson, Blount, and Thurlow (2002)
Presentation Accommodations
• Oral (read-aloud, audiocassette)
• Paraphrasing
• Technological
• Braille/large print
• Sign language interpreter
• Encouragement (redirecting)
• Cueing
• Spelling assistance
• Use of manipulatives
Timing Accommodations
• Extended time
• Multiple days/sessions
• Separate sessions
Timing accommodations are less of an issue on state
standards-based assessments because most have
generous time limits.
Response Accommodations
• Scribe
• Booklet versus answer sheet
• Marking booklet to maintain place
• Transcription
Setting Accommodations
• Individual administration
• Administration in a separate room
Other Accommodations
• Alternate assessment
• Others?
Psychometric Research on Test
Accommodations Has Focused On
• Has the accommodation changed the
construct measured?
– Speed
– Different skill
• Do accommodations help only those who
need them?
– Interaction hypothesis
• Do test scores from accommodated and
non-accommodated administrations have the
same meaning?
Research on test accommodations for
individuals with disabilities:
• Little empirical study
• Some literature reviews
– Willingham et al. (1988)
– Chiu & Pearson (1999)
– Tindal & Fuchs (2000)
– Pitoniak & Royer (2001)
– Thompson et al. (2002)
– Bolt & Thurlow (2004)
– Sireci, Scarpati, & Li (2005)
• Psychometric issues (Geisinger, 1994)
• Legal issues (Phillips, 1994)
• Also: Keeping Score for All (Koenig & Bachman, 2004)
Sireci, Scarpati, & Li (2005)
Research Questions
• Do test accommodations improve the scores of
students with disabilities (SWD)?
• If so, do such score gains reflect increased
validity or unfair advantage?
– Interaction hypothesis
• What specific types of accommodations are best
for specific types of students?
Interaction Hypothesis
[Figure 1. Illustration of Interaction Hypothesis: mean score (y-axis, 10 to 60) by accommodation condition (No ACC, ACC), plotted separately for the GEN and SWD/ELL groups]
MacArthur & Cavalier (2004)
“Differential impact on students with and
without disabilities provides evidence that
the accommodation removes a barrier based
on disability” (p. 55).
Fletcher et al. (2006)
“Because the source of variance is
fundamentally irrelevant to the measurement
of the construct, a valid accommodation will
improve performance only for students with
a disability” (p. 138).
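The interaction hypothesis can be read as a statement about a 2 x 2 table of mean scores (group by accommodation condition). The following is a minimal sketch of that reading, not the analysis used in any of the studies cited: the class name, the function names, the `tolerance` threshold, and the example means are all hypothetical. A real study would test the group-by-condition interaction statistically rather than compare raw means.

```python
from dataclasses import dataclass


@dataclass
class CellMeans:
    """Mean scores for one group under each accommodation condition."""
    standard: float      # mean score without the accommodation (No ACC)
    accommodated: float  # mean score with the accommodation (ACC)

    @property
    def gain(self) -> float:
        return self.accommodated - self.standard


def interaction_contrast(swd: CellMeans, gen: CellMeans) -> float:
    """Difference in gains: (SWD gain) minus (general-population gain)."""
    return swd.gain - gen.gain


def classify(swd: CellMeans, gen: CellMeans, tolerance: float = 1.0) -> str:
    """Describe the pattern of means.

    `tolerance` is a hypothetical cutoff (in score points) below which
    a gain is treated as negligible.
    """
    if swd.gain <= tolerance:
        return "no benefit for SWD"
    if abs(gen.gain) <= tolerance:
        return "strict interaction (only SWD gain)"
    if interaction_contrast(swd, gen) > tolerance:
        return "differential boost (both gain, SWD gain more)"
    return "no differential effect"


# Made-up means, for illustration only.
print(classify(CellMeans(30, 45), CellMeans(50, 51)))
print(classify(CellMeans(30, 45), CellMeans(50, 55)))
```

The first call describes the pattern in the two quotations above (only students with a disability gain); the second describes a weaker pattern in which both groups gain but SWD gain more.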
Are there any general conclusions
regarding effects?
• Extended time seems to help and it helps SWD
more than non-SWD.
• Oral accommodations show promise (for math), but
results are less uniform across studies. Effects are
considered unclear.
Review Process
• ERIC and PsycINFO searches
• E-mails to researchers in this area
Structure of review
• Dimension 1: SWD or ELL
• Dimension 2: Type of accommodation
• Dimension 3: Experimental or non-experimental
study
Note that the review was primarily conducted in 2003 and so
the results are somewhat dated. We have, however,
reviewed additional research since then.
Characteristics of Studies
Research Design    | Focused on SWD | Focused on ELL | Total
Experimental       | 13             | 8              | 21
Quasi-experimental | 2              | 4              | 6
Non-experimental   | 10             | 1              | 11
Total              | 25             | 13             | 38
Studies pertaining exclusively to ELL will not be
discussed in this presentation.
Types of Accommodations
Type(s) of Accommodation | # of Studies
Presentation:
  Oral*                  | 23
  Paraphrase             | 2
  Technological          | 2
  Braille/Large Print    | 1
  Sign Language          | 1
  Encouragement          | 1
  Cueing                 | 1
  Spelling assistance    | 1
  Manipulatives          | 1
*Includes read-aloud, audiotape, videotape, and screen-reading software.
Note: Literature reviews and issues papers are not included in this table.
Types of Accommodations
Type(s) of Accommodation           | # of Studies
Timing:
  Extended time                    | 14
  Multi day/sessions               | 1
  Separate sessions                | 1
Response:
  Scribes                          | 2
  In booklet vs. answer sheet      | 1
  Mark task book to maintain place | 1
  Transcription                    | 1
Setting (separate room)            | 1
Note: Literature reviews and issues papers are not included in this table.
Characteristics of Studies
• Most of the studies focused on elementary school
(2/3 between grades 3 and 8).
• Only 41% were published in peer-reviewed
journals.
Results: Extended Time
• Most common findings were gains for both SWD
and non-SWD.
– Contrast Camara et al. (1998) with Bridgeman
et al. (in press)
• Most studies of extended time (6 of 8) looked at
students with learning disabilities (SWLD)
Summary of Studies on Extended Time (1)
Study | Subject(s) | Design | Results | H1?
Elliott & Marquart (2004) | Math | Experimental | All student groups gained | No
Runyan (1991) | Reading | Experimental | Greater gains for SWD | Yes
Zurcher & Bryant (2001) | Analogy test | Quasi-experimental | No gains for either group | No
Huesman & Frisbie (2000) | Reading | Quasi-experimental | Gains for LD but not for non-LD groups | Yes
Alster (1997) | Math | Quasi-experimental | Greater gains for SWD | Yes
Summary of Studies on Extended Time (2)
Study | Subject(s) | Design | Results | H1?
Camara, Copeland, & Rothchild (1998) | SAT | Ex post facto | Gains for LD retesters 3x greater than gains of standard retesters | Yes
Ziomek & Andrews (1998) | ACT | Ex post facto | Gains for LD retesters 4x greater than gains of standard retesters | Yes
Zuriff (2000) | Reading, ACT, GRE | 5 experimental | Gains for both SWD and non-SWD | No
Results: Oral
• Results depend on subject
– Gains for SWD only in Math
– No differential gain in other subject areas
– Tends to support oral accommodation for math
tests
Study | Subject | Design | Results | H1?
Weston (2002) | Math | Experimental (b/w and w/in groups) | Greater gains for SWD | Yes
Tindal, Heath, et al. (1998) | Math | Experimental (b/w and w/in groups) | Sig. gain for SWD only | Yes
Calhoon, Fuchs, & Hamlett (2000) | Math | Experimental (w/in group) | Sig. gains for oral accom., no differences b/w teacher & computer | Yes
Johnson (2000) | Math | Experimental (b/w group) | Greater gains for SWD | Yes
Huynh, Meyer, & Gallant (2004) | Math | Ex post facto | Accommodated SWD > matched non-accom. SWD | Yes
Helwig & Tindal (2003) | Math | Quasi-experimental | Teachers not accurate in predicting benefit; no gains for either group | No
Meloy, Deville, & Frisbie (2000) | Science, Math, Reading | Experimental (b/w and w/in groups) | Similar gains for SWD and non-SWD | No
Oral (continued)
Study | Subject | Design | Results | H1?
Brown & Augustine (2001) | Science, Social Studies | Experimental (b/w and w/in groups) | No gain | No
Kosciolek & Ysseldyke (2000) | Reading | Quasi-experimental | SWD had greater gains, but not statistically significant | No
McKevitt & Elliot (2003) | Reading | Experimental (b/w and w/in groups) | No sig. effect size differences b/w accom. & standard conditions for either group | No
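The "H1?" entries in the extended-time and oral summaries can be tallied directly. The sketch below transcribes those entries and counts how many studies are marked "Yes" for each accommodation type. It assumes "H1?" records whether a study supported the interaction hypothesis. The function name `h1_support` is hypothetical, and the tally is a simple vote count that ignores sample size and study quality.

```python
# Rows transcribed from the extended-time and oral summary tables:
# (study, accommodation, subject, H1 column is "Yes")
STUDIES = [
    ("Elliott & Marquart (2004)", "extended time", "Math", False),
    ("Runyan (1991)", "extended time", "Reading", True),
    ("Zurcher & Bryant (2001)", "extended time", "Analogy test", False),
    ("Huesman & Frisbie (2000)", "extended time", "Reading", True),
    ("Alster (1997)", "extended time", "Math", True),
    ("Camara, Copeland, & Rothchild (1998)", "extended time", "SAT", True),
    ("Ziomek & Andrews (1998)", "extended time", "ACT", True),
    ("Zuriff (2000)", "extended time", "Reading, ACT, GRE", False),
    ("Weston (2002)", "oral", "Math", True),
    ("Tindal, Heath, et al. (1998)", "oral", "Math", True),
    ("Calhoon, Fuchs, & Hamlett (2000)", "oral", "Math", True),
    ("Johnson (2000)", "oral", "Math", True),
    ("Huynh, Meyer, & Gallant (2004)", "oral", "Math", True),
    ("Helwig & Tindal (2003)", "oral", "Math", False),
    ("Meloy, Deville, & Frisbie (2000)", "oral", "Science, Math, Reading", False),
    ("Brown & Augustine (2001)", "oral", "Science, Social Studies", False),
    ("Kosciolek & Ysseldyke (2000)", "oral", "Reading", False),
    ("McKevitt & Elliot (2003)", "oral", "Reading", False),
]


def h1_support(accommodation, subject=None):
    """Return (studies marked Yes, studies counted) for one accommodation
    type, optionally restricted to an exact subject label."""
    rows = [s for s in STUDIES
            if s[1] == accommodation and (subject is None or s[2] == subject)]
    return sum(1 for s in rows if s[3]), len(rows)


print(h1_support("extended time"))  # (5, 8)
print(h1_support("oral"))           # (5, 10)
print(h1_support("oral", "Math"))   # (5, 6)
```

The counts match the summary slides: oral accommodations are marked "Yes" mainly in math-only studies, while the extended-time results are mixed.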
More Recent Research
• Extended time
– Cohen, Gregg, & Deng (2005)
– Wainer, Bridgeman, Najarian, & Trapani (2004)
• Oral
– Fletcher, Francis, Boudousquie, Copeland,
Young, Kalinowski, & Vaughn (2006)
• Dictation software
– MacArthur & Cavalier (2004)
Cohen, Gregg, & Deng (2005)
• Looked at groups of students with and without
accommodations and their performance on specific
types of math items using differential item
functioning methods
– Accommodation status “only marginally related to the
pattern of accommodation-related DIF”
– Different types of students benefited from the extra time
– DIF not due to accommodations, but to differences in
students’ performance across different types of math items
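For readers unfamiliar with differential item functioning (DIF), the sketch below computes one widely used DIF statistic, the Mantel-Haenszel common odds ratio, for a single item. This is an illustration only: the slide does not say which DIF procedure Cohen, Gregg, & Deng used, and the Mantel-Haenszel statistic is substituted here as a familiar example. Treating non-accommodated examinees as the reference group and accommodated examinees as the focal group is an assumption, and all counts are hypothetical.

```python
import math
from typing import Iterable, Tuple

# One 2x2 table per matched total-score level:
# (reference correct, reference incorrect, focal correct, focal incorrect)
Stratum = Tuple[int, int, int, int]


def mantel_haenszel_odds_ratio(strata: Iterable[Stratum]) -> float:
    """Common odds ratio for one item, pooled across score levels.

    A value near 1.0 means examinees of equal overall proficiency have
    similar odds of answering the item correctly in both groups.
    """
    num = 0.0
    den = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        if n == 0:
            continue  # skip empty score levels
        num += a * d / n
        den += b * c / n
    if den == 0:
        raise ValueError("odds ratio undefined for these counts")
    return num / den


def mh_d_dif(strata: Iterable[Stratum]) -> float:
    """Delta-scale index, -2.35 * ln(odds ratio).

    Negative values indicate the item favors the reference group.
    """
    return -2.35 * math.log(mantel_haenszel_odds_ratio(strata))


# Hypothetical counts at two score levels, for illustration only.
item_counts = [(40, 10, 30, 20), (30, 20, 20, 30)]
print(mantel_haenszel_odds_ratio(item_counts))
print(mh_d_dif(item_counts))
```

Because examinees are matched on total score before the groups are compared, a large index points to something specific about the item rather than to an overall difference in proficiency.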
Cohen, Gregg, & Deng (2005)
“Accommodations are more appropriately
viewed as leveling the playing field; they do
not supply the knowledge necessary to pass
tests” (p. 231).
Wainer et al. (2004)
• Reanalysis of Bridgeman, Trapani, & Curley (2004)
data
• Evaluated extended time by shortening
experimental sections of SAT
• Little difference for verbal (about 5-point gain)
• Big difference for quantitative
– about 10-30 points, with larger gain associated
with larger time extension
– Largest gains for highest-scoring students
Wainer et al. (2004)
• Looked at correlations b/w scores from
standard and extended time with students’
HS math grades
– Claimed no relationship, but results
(correlations and sample sizes) were not
reported!
– Important idea to look at external validity
criterion
Wainer et al. (2004)
• Claim that results support not flagging
verbal scores, but flagging quantitative scores
– Don’t acknowledge presence of undesired
speededness
– SWD not included in study
• Hard to agree with conclusions
• Supports increasing time limit on SAT-Q
Fletcher et al. (2006)
• Experimental study involving
Grade 3 students with (n=91) and
without (n=91) decoding difficulties
associated with dyslexia
• Oral accommodation vs. standard administration
of a reading test (Texas)
Fletcher et al. (2006)
• Accommodation targeted for
specific disability
– Oral reading of proper nouns,
comprehension stems, & answer choices
– Designed to reduce the impact of word
recognition difficulties
Fletcher et al. (2006)
• Results
– Significant group/accommodation
interaction
– Only SWD benefited from the
accommodation
– Seven times greater likelihood of passing
the test with the accommodation
MacArthur & Cavalier (2004)
• Looked at accommodations for writing
assessments
– Experimental study: SWD (n=21),
students w/o documented disability (n=10)
– Three accommodation conditions:
• hand-written
• dictation to scribe
• dictation to speech recognition software
– 48 states allow dictation accommodation
(17 exclude scores)
MacArthur & Cavalier (2004)
• Results:
– Dictation improved writing scores for
SWD, with Scribe > speech recognition
software > hand-written
– Dictation did not improve scores for
students w/o disability
– No difference between student groups
with respect to preference (hand vs.
dictation)
MacArthur & Cavalier (2004)
• Caveat
– Small n (21, 10)
• Construct issue
– Dictation okay if construct =
“composing”
– Not okay if construct=“writing”
Research on Equivalence of Test
Structure
• One aspect of “construct equivalence”
– Rock, Bennett, Kaplan, & Jirele (1988)
– Tippets & Michaels (1997)
– Huynh, Meyer, & Gallant (2004)
– Huynh & Barton (2006)
– Cook, Eignor, Sawaki, Steinberg, & Cline (2006)
Research on Equivalence of Test
Structure
Results tend to support similarity of test
structure across accommodated and
standard test administrations (oral,
extended time, various).
Discussion (1)
• Do accommodations hurt or promote valid
score interpretations for students with
disabilities?
– Accommodations are designed to promote
validity by removing barriers (irrelevant
variance)
– In general, the research suggests the
accommodations being used are sensible and
defensible.
Discussion (2)
• Extended time seems to be a valid
accommodation.
– Unintended test speededness could
explain results for students w/o
disabilities
– Results support revised interaction
hypothesis or “differential boost.”
Interaction Hypothesis: Typical
[Figure: Illustration of Interaction Hypothesis; mean score (y-axis, 10 to 60) by accommodation condition (No ACC, ACC), plotted separately for the GEN and SWD/ELL groups]
Interaction Hypothesis: Revised
“Differential Boost”(Fuchs, Fuchs, Eaton, Hamlett, & Karns, 2000)
[Figure: Illustration of Revised Interaction Hypothesis; mean score (y-axis, 10 to 60) by accommodation condition (No ACC, ACC), plotted separately for the GEN and SWD/ELL groups]
Discussion (3)
• Other accommodations have less consistent and
convincing results, but no evidence of “harm” or
“unfairness.”
• It should be noted that lots of solid and ingenious
experimental research has been done in this
area.
– Small n, but intense with respect to data
collection
Discussion (4)
• Oral accommodation for math seems valid.
• Oral accommodation for reading involves
consideration of specific construct changes
– Fletcher et al. (2006) results indicate matching
disability and accommodation to one aspect of
construct promotes validity
Discussion (5)
• Looking across various studies and
accommodation conditions
– Lots of variability across studies with respect to
• accommodation conditions and how they were
implemented
• Student groups (within and between)
• Results
Future Directions for Test Design
• Test Development: Universal test design
– Build tests that are “accessible to all”
(i.e., that do not need to be accommodated).
– CBT could be particularly helpful in this regard.
– 19th & 20th century: Standardization
– 21st century?—Adaptivity?
(can’t be oxymoronic)
Future Directions for Research (1)
• Meta-analysis based on practice
– Non-published test accommodations being
conducted in states
– Establish a data warehouse for teachers and
test administrators to record results and make
comments?
– Would address the small-n issue
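To illustrate why pooling results across many small studies would address the small-n issue, the sketch below applies standard inverse-variance (fixed-effect) pooling to a set of effect sizes. It is not a method proposed in the presentation: the function name and the example effect sizes and variances are hypothetical. Given the variability across studies noted in the discussion, a random-effects model might be the more appropriate choice in practice.

```python
import math
from typing import Sequence, Tuple


def pool_fixed_effect(studies: Sequence[Tuple[float, float]]) -> Tuple[float, float]:
    """Inverse-variance (fixed-effect) pooling.

    Each study is (effect_size, variance_of_effect_size).
    Returns (pooled_effect, standard_error_of_pooled_effect).
    """
    if not studies:
        raise ValueError("need at least one study")
    # More precise studies (smaller variance) receive larger weights.
    weights = [1.0 / var for _, var in studies]
    total = sum(weights)
    pooled = sum(w * es for w, (es, _) in zip(weights, studies)) / total
    return pooled, math.sqrt(1.0 / total)


# Hypothetical effect sizes and variances, for illustration only.
small_studies = [(0.50, 0.04), (0.30, 0.01)]
pooled, se = pool_fixed_effect(small_studies)
print(round(pooled, 2), round(se, 3))  # 0.34 0.089
```

The pooled standard error (about 0.089) is smaller than that of either study alone (0.20 and 0.10), which is the sense in which combining many small studies compensates for small samples.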
Future Directions for Research (2)
• Larger sample sizes due to inclusion, coupled
with improved school data management systems
should promote more research on
– Differential item functioning
– Structural equivalence
– Analysis of educational gains
Future Directions for Research (3)
• More needs to be done on potential changes to
the construct
– Most often decided by logical analysis
– Structural equivalence research is limited
– Structural equivalence ≠ construct equivalence
Let’s go do it!
Thank you for your attention!
Sireci@acad.umass.edu
Mpitoniak@ets.org