Carnegie
Mellon
Computational Linguistics in a Reading Tutor that Listens
Jack Mostow
Project LISTEN (www.cs.cmu.edu/~listen)
Carnegie Mellon University
“To a man with a hammer, everything looks like a nail.” – Mark Twain
Funding: National Science Foundation
Keynote at the 42nd Annual Meeting of the Association for Computational Linguistics, Barcelona, Spain
7/22/2004
If I had a hammer…
[Hays & Seeger]
If I had a hammer,
I’d hammer in the morning
I’d hammer in the evening,
All over this land
I’d hammer out danger,
I’d hammer out a warning,
I’d hammer out love between my brothers and my sisters,
All over this land.
1. Project LISTEN’s Reading Tutor
2. Roles of computational linguistics in the tutor
3. So… Conclusions
Project LISTEN’s Reading Tutor (video)
John Rubin (2002). The Sounds of Speech (Show 3). On Reading Rockets (Public Television series commissioned by U.S. Department of Education). Washington, DC: WETA.
Available at www.cs.cmu.edu/~listen.
Tutoring:
Dr. Joseph Beck, mining tutorial data
Prof. Albert Corbett, cognitive tutors
Prof. Rollanda O’Connor, reading
Prof. Kathy Ayres, stories for children
Joe Valeri, activities and interventions
Becky Kennedy, linguist
Listening:
Dr. Mosur Ravishankar, recognizer
Dr. Evandro Gouvea, acoustic training
John Helman, transcriber
Programmers:
Andrew Cuneo, application
Karen Wong, Teacher Tool
Field staff:
Dr. Roy Taylor
Kristin Bagwell
Julie Sleasman
Grad students:
Hao Cen, HCI
Cecily Heiner, MCALL
Peter Kant, Education
Shanna Tellerman, ETC
Plus:
Advisory board
Research partners: DePaul, UBC, U. Toronto
Schools
Language models predict word sequences for a task.
E.g. expect ‘once upon a time…’
Domain models describe skills to learn.
E.g. pronounce ‘c’ as /k/.
Production models describe student behavior.
E.g. which mistakes do students make?
Student models estimate a student’s skills.
E.g. which words will a student need help on?
Pedagogical models guide tutorial decisions.
E.g. which types of help work best?
Theme: use data to train models automatically.
[Mostow, Roth, Hauptmann, & Kane AAAI94]
Problem: which word sequences to expect?
Language model specifies word transition probabilities
Given sentence text (e.g. ‘Once upon a time…’)
Expect correct reading
But allow for deviations
With heuristic probabilities
Result:
Accepted 96% of correctly read words.
Detected about half the serious mistakes.
[Diagram: language model transitions among ‘once’, ‘upon’, ‘up’, ‘a’, …, labeled PrRepeat, PrCorrect, PrTruncate, PrJump]
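A minimal sketch of how such a language model might assign transition probabilities within one sentence. The probability values, the function name `transition_probs`, and the choice to spread jump probability evenly over the remaining words are illustrative assumptions, not the Reading Tutor's actual settings; truncated words (PrTruncate) are left out for brevity.

```python
def transition_probs(words, i, pr_correct=0.8, pr_repeat=0.1, pr_jump=0.1):
    """Return {next_index: probability} after the reader has said words[i].

    Hypothetical heuristic: most mass goes to the next word (correct reading),
    some to repeating the current word, and the rest is spread evenly over
    jumps to every other position in the sentence.
    """
    n = len(words)
    probs = {}
    if i + 1 < n:
        probs[i + 1] = pr_correct   # read the next word correctly
    probs[i] = pr_repeat            # repeat the current word
    others = [j for j in range(n) if j not in probs]
    for j in others:
        probs[j] = pr_jump / len(others)  # jump elsewhere in the sentence
    total = sum(probs.values())
    return {j: p / total for j, p in probs.items()}  # renormalize to sum to 1


sentence = "once upon a time".split()
probs = transition_probs(sentence, 0)
```

With these assumed values, the transition from ‘once’ to ‘upon’ gets the largest share, repeating ‘once’ comes next, and jumps to ‘a’ or ‘time’ share what is left.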
[Banerjee, Mostow, Beck, & Tam ICAAI03]
Training data: 3,421 oral reading utterances
Spoken by 50 children aged 6-10
Recognized (imperfectly) by speech recognizer
Transcribed by hand
Method: learn to classify language model transitions
Reward good transitions that match transcript
Penalize bad transitions that cause recognizer errors
Generalize from features (kid age, text length, word type, …)
Result: reduced tracking error by 24% relative to baseline
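A rough sketch of the labeling step, assuming a transition counts as good when the same adjacent word pair also occurs in the hand transcript. That pairwise test and the function name `label_transitions` are simplifying assumptions for illustration; the slide does not say how transitions were matched against transcripts.

```python
def label_transitions(recognized, transcript):
    """Label each word-to-word transition in the recognizer output.

    A transition is 'good' if the same adjacent pair occurs in the human
    transcript, otherwise 'bad'. Labels like these could serve as training
    targets for a classifier over transition features.
    """
    truth = set(zip(transcript, transcript[1:]))
    return [((a, b), "good" if (a, b) in truth else "bad")
            for a, b in zip(recognized, recognized[1:])]


transcript = "once upon a time".split()
recognized = "once up a time".split()   # hypothetical recognizer output
labels = label_transitions(recognized, transcript)
```

Here the transitions into and out of the misrecognized ‘up’ are labeled bad, while ‘a’ to ‘time’ is labeled good.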
Problem: what should students learn?
Data: pronunciation dictionary for children’s text
‘teach’ → /T IY CH/
Method: align spelling against pronunciation
‘t’ → /T/, ‘ea’ → /IY/, ‘ch’ → /CH/
How frequent is each grapheme-phoneme mapping?
‘t’ → /T/ occurred 622 times in 9776 mappings
‘z’ → /S/ occurred once (in ‘quartz’)
How consistently is each grapheme pronounced?
‘v’ → /V/ always
‘e’ → /EH/ (‘bed’), /AH/ (‘the’), /IY/ (‘be’), /IH/ (‘destroy’)
+ ‘ea’, ‘eau’, ‘ed’, ‘ee’, ‘ei’, ‘eigh’, ‘eo’, ‘er’, ‘ere’, ‘eu’, …
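A small sketch of the counting step, assuming the alignment has already been done. The four-word `ALIGNED` list is a toy stand-in for the pronunciation dictionary, and the function names are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical pre-aligned entries: (word, [(grapheme, phoneme), ...]).
ALIGNED = [
    ("teach", [("t", "T"), ("ea", "IY"), ("ch", "CH")]),
    ("bed", [("b", "B"), ("e", "EH"), ("d", "D")]),
    ("be", [("b", "B"), ("e", "IY")]),
    ("the", [("th", "DH"), ("e", "AH")]),
]


def mapping_counts(aligned):
    """Count how often each (grapheme, phoneme) mapping occurs."""
    counts = Counter()
    for _word, pairs in aligned:
        counts.update(pairs)
    return counts


def pronunciations(counts):
    """Group counts by grapheme to show how consistently it is pronounced."""
    by_grapheme = defaultdict(Counter)
    for (grapheme, phoneme), n in counts.items():
        by_grapheme[grapheme][phoneme] += n
    return by_grapheme


counts = mapping_counts(ALIGNED)
by_grapheme = pronunciations(counts)
```

In this toy data ‘b’ always maps to /B/, while ‘e’ has three different pronunciations. On the slide's figures, ‘t’ → /T/ accounts for 622 of 9776 mappings, or about 6.4%.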
[Fogarty, Dabbish, Steck, & Mostow AIED2001]
Problem: Which mistakes to expect?
Data: U. Colorado database of oral reading mistakes
‘bed’ → /B IY D/
Method: train G → P → P’ malrules for decoding
‘e’ → /EH/ → /IY/
G     P     P’    Example
‘s’   /S/   //    ‘plants’   Drop ‘s’.
‘s’   /Z/   //    ‘arms’     Drop ‘s’.
‘’    //    /N/   ‘ha_d’     Add ‘n’.
‘’    //    /Z/   ‘car_’     Add ‘s’.
‘n’   /N/   //    ‘land’     Drop ‘n’.
Result: predicted mistakes in unseen test data
Context-sensitive rules improved accuracy.
Later work: predict real-word mistakes
[Mostow, Beck, Winter, Wang, & Tobin ICSLP2002]
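A sketch of applying one G → P → P’ malrule to an aligned word to predict a mispronunciation. The function name, the first-match policy, and the alignment given for ‘land’ are assumptions for illustration; insertion rules with an empty grapheme (such as Add ‘n’) are not handled here.

```python
def apply_malrule(aligned, rule):
    """Return the phonemes predicted by one G -> P -> P' malrule, or None.

    aligned: list of (grapheme, phoneme) pairs for the word.
    rule: (grapheme, correct_phoneme, produced_phoneme); "" means no sound.
    Only the first matching pair is changed (an assumed simplification).
    """
    grapheme, correct, produced = rule
    out, applied = [], False
    for g, p in aligned:
        if not applied and (g, p) == (grapheme, correct):
            applied = True
            if produced:            # empty produced phoneme means a deletion
                out.append(produced)
        else:
            out.append(p)
    return out if applied else None


bed = [("b", "B"), ("e", "EH"), ("d", "D")]
land = [("l", "L"), ("a", "AE"), ("n", "N"), ("d", "D")]  # assumed alignment
misread_bed = apply_malrule(bed, ("e", "EH", "IY"))
misread_land = apply_malrule(land, ("n", "N", ""))
```

The first call reproduces the ‘bed’ → /B IY D/ example; the second drops the /N/ from ‘land’.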
[Beck, Jia, Sison, & Mostow UM2003]
Problem: when will a student request help on a word?
Data: 7 months of Reading Tutor use by 87 students
Average ~20 hours per student
Transactions logged in detail
Help request rate excluding common words: 0.5%–54%
Method: train classifier using word, student, history
Result: predict words that unseen students click on
Learning curves for students’ help requests
[Figure: curves by reading level (Grades 1-4); horizontal axis: number of previous encounters (0-15); vertical axis ticks 0.0-0.4]
Selected data
53 students
175,961 words
29,278 help requests
Try to predict subset
Grade 1-2 level
1-6 prior encounters
Train predictive model
Count help requests 5x
Predict other kids’ data
71% accuracy
Information about the student
Help request rate, overall reading proficiency, etc.
Information about the word
Word length, position in sentence, etc.
Student’s history of reading the word
Percent of times accepted by Reading Tutor, time to read, etc.
Student’s prior help on this word
Was the word helped previously? Earlier today?
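A sketch of turning a student's logged encounters with a word into features like those listed above. The `Encounter` record and its field names are hypothetical, not the Reading Tutor's actual log format.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Encounter:
    """One prior encounter of a student with a word (hypothetical record)."""
    day: date
    accepted: bool   # did the Reading Tutor accept the word as read?
    helped: bool     # did the student get help on the word?


def word_features(word, position, history, today):
    """Build a feature dict for predicting a help request on this word."""
    n = len(history)
    return {
        "word_length": len(word),
        "position_in_sentence": position,
        "prior_encounters": n,
        # None when the student has never seen the word before
        "pct_accepted": sum(e.accepted for e in history) / n if n else None,
        "helped_before": any(e.helped for e in history),
        "helped_today": any(e.helped and e.day == today for e in history),
    }


history = [
    Encounter(date(2004, 7, 21), accepted=True, helped=True),
    Encounter(date(2004, 7, 22), accepted=False, helped=False),
]
features = word_features("read", 4, history, today=date(2004, 7, 22))
```

Student-level features such as overall help request rate or reading proficiency would be computed separately and merged in.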
How to get all this data?
[Figure: word features]
                     Reading Tutor                 Student
Session              List readers                  Login
Story Encounter      List stories                  Pick stories
Sentence Encounter   Show one sentence at a time   Read sentence
Word Encounter       Listens and helps             Read each word
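A sketch of the nesting implied by these levels, with each session containing story encounters, each story encounter containing sentence encounters, and so on. The class and field names are hypothetical and do not reflect the actual database schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class WordEncounter:
    word: str
    accepted: bool = False   # did the tutor accept the word as read?


@dataclass
class SentenceEncounter:
    text: str
    words: List[WordEncounter] = field(default_factory=list)


@dataclass
class StoryEncounter:
    title: str
    sentences: List[SentenceEncounter] = field(default_factory=list)


@dataclass
class Session:
    student: str
    stories: List[StoryEncounter] = field(default_factory=list)


def count_words(session):
    """Total word encounters logged in one session."""
    return sum(len(s.words) for story in session.stories for s in story.sentences)


sentence = SentenceEncounter(
    "Once upon a time",
    [WordEncounter(w, accepted=True) for w in "once upon a time".split()],
)
session = Session("student-1", [StoryEncounter("A story", [sentence])])
```

Logging at every level is what makes it possible to look up each word encounter together with its sentence, story, and session context.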
2003-2004 database:
9 schools
> 200 computers
> 50,000 sessions
> 1.5M tutor responses
> 10M words recognized
Embedded experiments
Randomized trials
The Reading Tutor beats independent practice…
Effect sizes up to 1.3
[Mostow SSSR02, Poulsen 04]
…but how? Use embedded experiments to investigate!
[Mostow, Beck, & Heiner SSSR2004]
Problem: Which types of help work best?
Data: 270 students’ assisted reading in the Reading Tutor
Method: randomize choice of help and analyze its effects
Result: detected significant differences in effectiveness
270 students, 180,909 randomized trials
Student is reading a story: ‘People sit down and …’
Student needs help on a word: student clicks ‘read.’
Tutor chooses what help to give: randomized choice among feasible types
Student continues reading: ‘… read a book.’
Time passes…
Student sees word in a later sentence: ‘I love to read stories.’
Outcome: success = ASR accepts word as read fluently
(How) does the type of help affect the next encounter?
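A sketch of the two halves of such an embedded experiment: choosing help at random when the student clicks, and later tallying success at the next encounter by help type. Uniform random choice, the function names, and the example list of feasible types are assumptions; the slide says only that the choice is randomized among feasible types.

```python
import random


def choose_help(feasible_types, rng):
    """Pick one help type at random from those feasible for the word."""
    if not feasible_types:
        raise ValueError("no feasible help type for this word")
    return rng.choice(feasible_types)


def success_rates(trials):
    """trials: iterable of (help_type, success) pairs -> {help_type: rate}.

    success means the word was accepted as read fluently at the student's
    next encounter with it.
    """
    totals, wins = {}, {}
    for help_type, success in trials:
        totals[help_type] = totals.get(help_type, 0) + 1
        wins[help_type] = wins.get(help_type, 0) + int(success)
    return {h: wins[h] / totals[h] for h in totals}


rng = random.Random(0)  # fixed seed so the sketch is reproducible
feasible = ["Say Word", "Rhymes With", "Sound Out"]  # hypothetical for one word
chosen = choose_help(feasible, rng)
rates = success_rates([("Say Word", True), ("Say Word", False), ("Rhymes With", True)])
```

Because the help type is assigned at random, differences in later success rates can be attributed to the help itself rather than to which students or words received it.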
Example: ‘People sit down and read a book.’
Whole word:
24,841 Say In Context
56,791 Say Word
Analogy:
13,165 Rhymes With
13,671 Starts Like
Decomposition:
6,280 Syllabify
14,223 Onset Rime
19,677 Sound Out
22,933 One Grapheme
Semantic:
14,685 Recue
2,285 Show Picture
488 Sound Effect
Which types stood out?
Best: Rhymes With 69.2% ± 0.4%
Worst: Recue 55.6% ± 0.4%
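A quick check of the ± figures, assuming they are one binomial standard error and that the trial counts listed for each help type (13,165 for Rhymes With, 14,685 for Recue) are the sample sizes behind the success rates. The slide states neither assumption.

```python
import math


def binomial_se(p, n):
    """Standard error of a proportion p estimated from n independent trials."""
    return math.sqrt(p * (1.0 - p) / n)


se_rhymes = binomial_se(0.692, 13165)   # Rhymes With
se_recue = binomial_se(0.556, 14685)    # Recue
```

Under these assumptions both values come out near 0.004, or about 0.4 percentage points, which is consistent with the ± 0.4% shown.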
Compare within level to control for word difficulty.
                 Same day:                     Later day:
Grade 1 words:   Say In Context, Onset Rime    Onset Rime
Grade 2 words:   Say In Context, Rhymes With   Rhymes With
Grade 3 words:   Say In Context                Rhymes With, One Grapheme
Supplying the word helped best in the short term…
But rhyming hints had longer lasting benefits.
What problem is important to solve?
Language models predict word sequences for a task.
Domain models describe skills to learn.
Production models describe student behavior.
Student models estimate a student’s skills.
Pedagogical models guide tutorial decisions.
…
What data is available to train on?
What method is suitable to apply?
What result is appropriate to evaluate?
Well I got a hammer,
And I got a bell,
And I got a song to sing, all over this land.
It’s the hammer of Justice,
It’s the bell of Freedom,
It’s the song about Love between my brothers and my sisters,
All over this land.
Muchas gracias
Molto grazie
Obrigado
Merci beaucoup
Danke schön
Dank u wel
Spaseeba
Blagodaria
Tak
Todah rabah
Shukra
Efcharisto
Xeh-xeh
Arigato gozaymas
Kop-kun krap
Thank you! Questions?
See papers & videos at www.cs.cmu.edu/~listen.
Thanks