If I Have a Hammer:

advertisement

Carnegie

Mellon

If I Have a Hammer:

Computational Linguistics in a Reading Tutor that Listens

Jack Mostow

Project LISTEN ( www.cs.cmu.edu/~listen )

Carnegie Mellon University

“To a man with a hammer, everything looks like a nail.” – Mark Twain

Funding: National Science Foundation

Keynote at 42nd Annual Meeting of the Association for

Project LISTEN

Computational Linguistics, Barcelona, Spain

1 7/22/2004

Carnegie

Mellon

If I had a hammer…

[Hays & Seeger]

If I had a hammer,

I’d hammer in the morning

I’d hammer in the evening,

All over this land

I’d hammer out danger,

I’d hammer out a warning,

I’d hammer out love between my brothers and my sisters,

All over this land.

Project LISTEN 2 7/22/2004

Carnegie

Mellon

1.

2.

3.

Outline

Project LISTEN’s Reading Tutor

Roles of computational linguistics in the tutor

So… Conclusions

Project LISTEN 3 7/22/2004

Carnegie

Mellon

Project LISTEN’s Reading Tutor (video)

Project LISTEN 4 7/22/2004

Carnegie

Mellon

Project LISTEN’s Reading Tutor (video)

John Rubin (2002). The Sounds of Speech (Show 3).

On Reading Rockets (Public Television series commissioned by U.S. Department of Education) .

Washington, DC: WETA.

Available at www.cs.cmu.edu/~listen .

Project LISTEN 5 7/22/2004

Carnegie

Mellon

Thanks to fellow LISTENers

Tutoring:

 Dr. Joseph Beck, mining tutorial data

Prof. Albert Corbett, cognitive tutors

 Prof. Rollanda O’Connor, reading

Prof. Kathy Ayres, stories for children

 Joe Valeri, activities and interventions

 Becky Kennedy, linguist

Listening:

 Dr. Mosur Ravishankar, recognizer

Dr. Evandro Gouvea, acoustic training

John Helman, transcriber

Programmers:

Andrew Cuneo, application

 Karen Wong, Teacher Tool

Project LISTEN 6

Field staff:

 Dr. Roy Taylor

 Kristin Bagwell

Julie Sleasman

Grad students:

Hao Cen, HCI

 Cecily Heiner, MCALL

Peter Kant, Education

Shanna Tellerman, ETC

Plus:

Advisory board

 Research partners

DePaul

 UBC

U. Toronto

Schools

7/22/2004

Carnegie

Mellon

Computational linguistics models in an intelligent tutor

Language models predict word sequences for a task.

 E.g. expect ‘once upon a time…’

Domain models describe skills to learn.

 E.g. pronounce ‘c’ as /k/.

Production models describe student behavior.

E.g. which mistakes do students make?

Student models estimate a student’s skills.

E.g. which words will a student need help on?

Pedagogical models guide tutorial decisions.

E.g. which types of help work best?

Theme: use data to train models automatically.

Project LISTEN 7 7/22/2004

Carnegie

Mellon

Language model of oral reading

[Mostow, Roth, Hauptmann, & Kane AAAI94]

Problem: which word sequences to expect?

Language model specifies word transition probabilities

 Given sentence text (e.g. ‘ Once upon a time… ’)

Expect correct reading PrRepeat once

But allow for deviations

 With heuristic probabilities once

PrCorrect upon

Result:

 Accepted 96% of correctly read words.

Detected about half the serious mistakes.

PrTruncate up

.

.

.

PrJump a

Project LISTEN 8 7/22/2004

Carnegie

Mellon

Using ASR errors to tune a language model

[Banerjee, Mostow, Beck, & Tam ICAAI03]

Training data: 3,421 oral reading utterances

 Spoken by 50 children aged 6-10

Recognized (imperfectly) by speech recognizer

 Transcribed by hand

Method: learn to classify language model transitions

 Reward good  transitions that match transcript

Penalize bad

 transitions that cause recognizer errors

 Generalize from features (kid age, text length, word type, …)

Result: reduced tracking error by 24% relative to baseline

Project LISTEN 9 7/22/2004

Carnegie

Mellon

Domain model of pronunciation

Problem: what should students learn?

Data: pronunciation dictionary for children’s text

 ‘teach’ 

/T IY CH/

Method: align spelling against pronunciation

 ‘t’  /T/, ‘ea’  /IY/, ‘ch’ 

/CH/

How frequent is each grapheme-phoneme mapping?

 ‘t’ 

/T/ occurred 622 times in 9776 mappings

 ‘z’  /S/ occurred once (in ‘quartz’)

How consistently is each grapheme pronounced?

 ‘v’ 

/V/ always

 ‘e’  /EH/ (‘ b e d

’), /AH/ (‘ th e’), /IY/ (‘ b e’), /IH/ (‘ d e stroy

’)

 + ‘ea’, ‘eau’, ‘ed’, ‘ee’, ‘ei’, ‘eigh’, ‘eo’, ‘er’, ‘ere’, ‘eu’, …

Project LISTEN 10 7/22/2004

Carnegie

Mellon

Production model of pronunciation

[Fogarty, Dabbish, Steck, & Mostow AIED2001]

Problem: Which mistakes to expect?

Data: U. Colorado database of oral reading mistakes

 ‘bed’ 

/B IY D/

Method: train G

P

 P’ malrules for decoding

 ‘e’ 

/EH/

/IY/

Project LISTEN 11 7/22/2004

Carnegie

Mellon

Top five G

P

P’ decoding errors

G

‘s’

‘s’

‘’

‘’

‘n’

P

/S/

/Z/

//

//

/N/

P’

//

//

/N/

/Z/

//

Example

‘plants’

‘arms’

‘ha_d’

‘car_’

‘land’

Result: predicted mistakes in unseen test data

 Context-sensitive rules improved accuracy.

Later work: predict real-word mistakes

 [Mostow, Beck, Winter, Wang, & Tobin ICSLP2002]

Project LISTEN 12

Drop ‘s’.

Drop ‘s’.

Add ‘n’.

Add ‘s’.

Drop ‘n’.

7/22/2004

Carnegie

Mellon

Student model of help requests

[Beck, Jia, Sison, & Mostow UM2003]

Problem: when will a student request help on a word?

Data: 7 months of Reading Tutor use by 87 students

Average ~20 hours per student

Transactions logged in detail

Help request rate excluding common words: 0.5%–54%

Method: train classifier using word, student, history

Result: predict words that unseen students click on

Project LISTEN 13 7/22/2004

Carnegie

Mellon

Learning curves for students’ help requests

.4

.3

Try to predict subset

Grade 1-2 level

1-6 prior encounters

Selected data

53 students

175,961 words

 29,278 help requests

.2

.1

0.0

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15

Reading level

Grade 1

Train predictive model

Count help requests 5x

Grade 2

 Predict other kids’ data

Grade 3

71% accuracy

Grade 4

Number of previous encounters

Project LISTEN 14 7/22/2004

Carnegie

Mellon

Features used

Information about the student

 Help request rate, overall reading proficiency, etc.

Information about the word

Word length, position in sentence, etc.

Student’s history with reading word

Percent of times accepted by Reading Tutor, time to read, etc.

Student’s prior help on this word

Was the word helped previously? Earlier today?

How to get all this data??

Project LISTEN 15 7/22/2004

Carnegie

Mellon

Data collection and translation

Project LISTEN 16 word features

7/22/2004

Carnegie

Mellon

Structure of Reading Tutor database

Reading Tutor Student

List readers

Login

Session

Project LISTEN

List stories

Show one sentence at a time

Listens and helps

Story Encounter

Sentence Encounter

Word Encounter

17

Pick stories

Read sentence

Read each word

7/22/2004

Carnegie

Mellon

Project LISTEN’s Reading Tutor:

A rich source of experimental data

2003-2004 database:

9 schools

> 200 computers

 > 50,000 sessions

> 1.5M tutor responses

> 10M words recognized

Embedded experiments

Randomized trials

The Reading Tutor beats independent practice…

Effect sizes up to 1.3

[ Mostow SSSR02 , Poulsen 04 ]

…but how? Use embedded experiments to investigate!

Project LISTEN 18 7/22/2004

Carnegie

Mellon

Pedagogical model of help on decoding

[Mostow, Beck, & Heiner SSSR2004]

Problem: Which types of help work best?

Data: 270 students’ assisted reading in the Reading Tutor

Method: randomize choice of help and analyze its effects

Result: detected significant differences in effectiveness

Project LISTEN 19 7/22/2004

Carnegie

Mellon

Within-subject experiment design:

270 students, 180,909 randomized trials

Student is reading a story

‘People sit down and …’

Student needs help on a word Student clicks

‘read.’

Tutor chooses what help to give

Randomized choice among feasible types

‘… read a book.’

Student continues reading

Time passes…

Student sees word in a later sentence

‘I love to read stories.’

Outcome: success = ASR accepts word as read fluently

(How) does the type of help affect the next encounter?

Project LISTEN 20 7/22/2004

Carnegie

Mellon

180,909 word hints

(average success rate 66.1%)

Example: ‘People sit down and read a book.’

Whole word:

24,841 Say In Context

 56,791 Say Word

Analogy:

13,165 Rhymes With

 13,671 Starts Like

Decomposition:

6,280 Syllabify

14,223 Onset Rime

19,677 Sound Out

 22,933 One Grapheme

Semantic:

14,685 Recue

2,285 Show Picture

488 Sound Effect

Project LISTEN

Which types stood out?

Best: Rhymes With 69.2%

±

0.4%

Worst: Recue 55.6%

±

0.4%

21 7/22/2004

Carnegie

Mellon

What helped which words best?

Compare within level to control for word difficulty.

Same day: Later day:

Grade 1 words:

Grade 2 words:

Say In Context ,

Onset Rime

Say In Context ,

Rhymes With

Say In Context

Onset Rime

Rhymes With

Grade 3 words: Rhymes With ,

One Grapheme

Supplying the word helped best in the short term…

But rhyming hints had longer lasting benefits.

Project LISTEN 22 7/22/2004

Carnegie

Mellon

So…. what can your computational linguistics model in an intelligent tutor?

What problem is important to solve?

 Language models predict word sequences for a task.

Domain models describe skills to learn.

 Production models describe student behavior.

Student models estimate a student’s skills.

 Pedagogical models guide tutorial decisions.

 …

What data is available to train on?

What method is suitable to apply?

What result is appropriate to evaluate?

Project LISTEN 23 7/22/2004

Carnegie

Mellon

…Well I got a hammer

Well I got a hammer,

And I got a bell,

And I got a song to sing, all over this land.

It’s the hammer of Justice,

It’s the bell of Freedom,

It’s the song about Love between my brothers and my sisters,

All over this land.

Project LISTEN 24 7/22/2004

Carnegie

Mellon

Conclusions…

 Muchas gracias

 Molto grazie

 Obrigado

 Merci beaucoup

 Danke schön

 Dank U well

 Spaseeba

 Blagodaria

 Tak

 Todah rabah

 Shukra

 Efcharisto

 Xeh-xeh

 Arigato gozaymas

 Kop-kun krap

 Thank you! Questions?

Project LISTEN

See papers & videos at www.cs.cmu.edu/~listen .

Thanks

25 7/22/2004

Download