Assessing Listening

ITBE Workshop, April 20, 2013
Assessing Listening Comprehension: Test Format Decisions
Brian Hampson, Purdue University Calumet - brian.hampson@purduecal.edu
Heather Torrie, Purdue University Calumet - torrieh@purduecal.edu
Test Development Stages (adapted from Hughes, 2003)
1. Stating the problem
a. What is the purpose?
b. What abilities are to be tested?
2. Content
a. What tasks should students perform?
b. What texts should be used?
c. What is the overall format (number of passages, number of listenings, etc.)?
d. What items should be included?
3. Pilot items – using both native and non-native speakers
Validity: Does the test measure what it is supposed to measure? (Hughes, 2003)
-Sometimes format can affect validity
Overview of Various Item Formats
Buck (1991) “…successful listening comprehension
involves an interaction between linguistic skills,
knowledge of the context, background knowledge and
inferencing skills. Thus, listening test items, even those
written to test one particular skill, turn out on
examination to be testing a number of different skills.”
Format: Multiple-choice / True/False / Matching / Selection
Research Notes:
-In’nami & Koizumi, 2009 - Meta-analysis shows multiple-choice (MC) is easier than short answer (SA)
-Yi’an 1998 - People often choose the right answer for the wrong reasons
Pros:
-Reliability
-Less stressful for test takers
-Takes less time for test takers
Cons:
-Encourages bottom-up processing
-Cognitive load: difficult for listeners to hold four options in their mind while listening
-Guessing
-Test takers may choose the right answer for the wrong reason
-Difficult to write good distractors
-Cheating
Other skills being tested:
-Reading ability
-Only word recognition (rather than true comprehension)

Format: Short Answer
Research Notes:
-In’nami & Koizumi, 2009 - Meta-analysis shows SA is more difficult than MC
-Buck 1991 - Supports validity, but addresses concerns with reliability
Pros:
-Less guessing
-Easier to write items
-More authentic
-More top-down, especially for main idea questions
Cons:
-Reliability issues (scoring difficult, especially for inference items)
-Takes time for test takers to answer
Other skills being tested:
-Writing ability
-Reading (understanding the question and determining which information to write down)
Format: Table/Outline/Chart Completion
Research Notes:
-Song 2011 - Filling out a table is easier than blank notes
-Brindley & Slatyer, 2002 - Found that a chart/table structure was easier than SA/blank notes, and easier than cloze
Pros:
-Constrains the contents of test takers’ notes to a given framework
-Authentic
-Emphasizes top-down processing
Cons:
-Scoring difficult; reliability issues
Other skills being tested:
-Writing ability

Format: Recall
Research Notes:
-Used in research (e.g., Jung 2003; Sherman 1997); not so much in the classroom
Pros:
-Authentic
-Good measure of intake
Cons:
-More difficult to score (compile a list of key information units)
Other skills being tested:
-Writing and speaking ability
-Memory
-Note-taking ability

Format: Cloze / Dictation / Partial Dictation
Pros:
-Reliability
Cons:
-Emphasizes bottom-up listening
-Less authentic
Other skills being tested:
-Writing ability
Test Delivery Format Options:

# of Listenings (with rationale)
One time
-Authentic
Two times through
-Affective value for students
-Reflects the way listening is taught in the classroom
More than twice / student-controlled
-Also can be authentic (listening to recorded lectures, online material, conversation)

Question Preview* (with rationale)
Before
-Helps students focus on particular information
Sandwiched
-Promotes top-down processing during the first listening and bottom-up processing during the second listening
Afterwards
-Further promotes top-down processing

*Sherman (1997) and Buck (1991) suggest that question preview has more of an affective benefit than an actual
performance benefit: examinees thought it helped them more than it actually did.
References
Brindley, G., & Slatyer, H. (2002). Exploring task difficulty in ESL listening assessment. Language Testing, 19(4).
Uses charts and sentence completion. One listening only.
Buck, G. (1991). The testing of listening comprehension: an introspective study. Language Testing, 8(1).
Using verbal reports, this study looked at the test-taking process of answering short-answer questions based on
listening to segments of a narrative. While it suggests that short-answer items are strong in validity, the study
raises concerns about reliability in scoring the varied answers.
Cross, J. (2009). Effects of listening strategy instruction on news videotext comprehension. Language Teaching
Research, 13(2), 151-176.
One of the ways listening comprehension was measured in this study was through written recalls, in which learners
wrote down everything they remembered from the videotext.
Hughes, A. (2003). Testing for Language Teachers. Cambridge University Press.
This book is a great overview of test development for all language skills.
In’nami, Y., & Koizumi, R. (2009). A meta-analysis of test format effects on reading and listening test performance: Focus
on multiple-choice and open-ended formats. Language Testing, 26(2).
After reviewing 56 or so studies, the authors found that multiple-choice items were indeed easier than open-ended formats.
Jung, E.H. (2003). The role of discourse signaling cues in second language listening comprehension. The Modern
Language Journal, 87(4), 562-577.
Learners performed a written recall task, after listening to a lecture. Assessment was based on how many key
“information units” learners included in the recall.
Sherman, J. (1997). The effect of question preview in listening comprehension tests. Language Testing, 14, 185-213.
Very interesting study design! Lots of great citations too. Seems that question preview often makes students feel
better, but doesn’t necessarily help them. Could interfere with processing.
Song, M. (2011). Note-taking quality and performance on an L2 academic listening test. Language Testing, 29(1).
Studies the effectiveness of note-taking with a partially filled outline. “it would seem that notes taken in the outline
format in particular, because it constrains the contents of test takers’ notes to a given framework, might have more
potential as a listening measure than notes in the blank format.”
Yi’an, W. (1998). What do tests of listening comprehension test? A retrospection study of EFL test-takers
performing a multiple-choice task. Language Testing.
A qualitative study of 6 learners and why they chose the answers they did (multiple-choice). It showed that many
of them chose the correct answer, but for the wrong reason.