Studying Intonation Julia Hirschberg CS 4706 7/15/2016

Today
• Approaches to studying contour meaning
– Questions people ask
• Does contour X convey a different meaning from
contour Y?
• Is contour X used more often in context Z than
contour Y?
• Despite what people say/think, not all phenomena
X are uttered with contour Y
– What kind of evidence could we get?
• Found data
• Laboratory experiments: production, perception
• Corpus collection
– What features can we look at and how do we
obtain them?
• Intonation labeling by hand
• Acoustic/prosodic analysis by automatic methods
– Pitch tracking, pause detection, intensity, duration,
speaking rate extraction
• Computational linguistic techniques to extract
transcript-based (text) features
– Part-of-speech
– Sentence length, …
– What techniques do we use for analysis?
• Statistical methods (S-PLUS, MATLAB)
• Machine learning techniques
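As a rough illustration of the "pitch tracking" item above, here is a minimal sketch of autocorrelation-based F0 estimation for a single frame. It is not the method of any particular toolkit: the function name `estimate_f0`, the 75-500 Hz search range, and the synthetic 200 Hz test tone are all assumptions made for the example.

```python
import math

def estimate_f0(frame, sample_rate, f0_min=75.0, f0_max=500.0):
    """Estimate F0 in Hz for one frame by picking the autocorrelation peak.

    Hypothetical helper: returns None if no lag in the search range
    has a positive autocorrelation score.
    """
    n = len(frame)
    mean = sum(frame) / n
    x = [s - mean for s in frame]                     # remove DC offset
    lag_min = max(1, int(sample_rate / f0_max))       # shortest period searched
    lag_max = min(int(sample_rate / f0_min), n - 1)   # longest period searched
    best_lag, best_score = None, 0.0
    for lag in range(lag_min, lag_max + 1):
        score = sum(x[i] * x[i + lag] for i in range(n - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag if best_lag else None

# Synthetic test frame (an assumption, not real speech): 50 ms of a 200 Hz sine.
SAMPLE_RATE = 8000
tone = [math.sin(2 * math.pi * 200 * t / SAMPLE_RATE) for t in range(400)]
f0 = estimate_f0(tone, SAMPLE_RATE)
print(f0)
```

On the synthetic tone the estimate should come out close to 200 Hz. Real speech would also need voicing detection and octave-error handling, which this sketch leaves out.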
Some Sample Approaches
• Natural Corpus: Hedberg & Sosa 2002
• Introspective, observational: Wilson 1993,
Pierrehumbert & Hirschberg 1990/2
• Laboratory (production/perception): Syrdal &
Jilka 2004
• Laboratory (brain imaging, e.g. fMRI): Doherty
et al. 2004
A Prescriptive Approach: Wilson 1993
• Declarative statements fall and yes-no-questions
rise?
• Wh-questions fall?
• Small final rise signals ‘more to come’?
Corpus Studies of Questions: Hedberg & Sosa 2002
• How are yes-no and wh-questions uttered and how
might we explain differences?
– Where is the nuclear stress?
– Where is the semantic ‘focus’? What is the ‘topic’?
– Are the ‘wh words’ accented or not?
• Corpus: 73 questions
Who saw John?/Who didn’t see John?
Did John leave?/Didn’t John leave?
– 35 whq’s and 38 ynq’s from the McLaughlin Group
and Washington Week
• Analysis
– Intonational labeling (ToBI) from pitch tracks
– Topic/focus coding
– Frequency distributions of features with
question categories
– Prosody of ‘locus of interrogation’
• Wh word in wh-questions
• Fronted auxiliary in yes-no questions
• Results
– Ynq’s generally uttered w/ falling or level
intonation, not rising (69%)
– Wh-q’s most often uttered with falling (80%)
– Wh-words (60%) in all wh-questions and neg
aux in negative ynq’s (89%) most often
uttered with L+H* accent (‘contrastive’ accent)
-- why?
– Aux in positive ynq’s often deaccented (41%)
or realized with L* (17%) accent – why?
• Conclusions/open questions:
– Why do ynq’s and wh-q’s sometimes rise and
sometimes fall?
– Locus of interrogation is accented in wh-q’s
and in negative ynq’s to “signal interrogative
status of sentence” – but not in positive ynq’s
“due to need to highlight a following element”
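The "frequency distributions of features with question categories" step can be sketched as a simple cross-tabulation. The code below is an illustrative assumption, not Hedberg & Sosa's procedure: the function names are hypothetical and the labels are toy data, not their corpus counts. It also computes a Pearson chi-square statistic for the question-type by contour table.

```python
from collections import Counter

def contour_distribution(labels):
    """labels: list of (question_type, contour) pairs.

    Returns {question_type: {contour: proportion within that type}}.
    """
    counts = Counter(labels)
    totals = Counter(qtype for qtype, _ in labels)
    dist = {}
    for (qtype, contour), c in counts.items():
        dist.setdefault(qtype, {})[contour] = c / totals[qtype]
    return dist

def chi_square(labels):
    """Pearson chi-square statistic for the question-type x contour table."""
    counts = Counter(labels)
    rows = Counter(qtype for qtype, _ in labels)
    cols = Counter(contour for _, contour in labels)
    n = len(labels)
    stat = 0.0
    for qtype in rows:
        for contour in cols:
            expected = rows[qtype] * cols[contour] / n
            stat += (counts[(qtype, contour)] - expected) ** 2 / expected
    return stat

# Toy labels (hypothetical, NOT the study's data).
toy = ([("ynq", "rise")] * 6 + [("ynq", "fall")] * 4 +
       [("whq", "rise")] * 2 + [("whq", "fall")] * 8)
dist = contour_distribution(toy)
stat = chi_square(toy)
print(dist, stat)
```

With cell counts as small as those in a 73-question corpus, an exact test may be a safer choice than the chi-square approximation.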
Critique
• Is this a good corpus for this investigation?
– Size
– Genre
– What about the speakers?
Syrdal & Jilka 2004
• How are whq’s and ynq’s produced most
naturally (for TTS)?
• Same initial hypothesis: whq’s fall and ynq’s rise
in American English
• Different approach: production and perception
studies
• Production:
– 8 (professional) speakers (5F, 3M)
– Read transcripts of actual dialogues
• Analysis:
– Intonational (ToBI) labeling from pitch tracks
of extracted questions
• Results:
– Ynq’s rose in 83% of cases for females and
53% for males
– Wh-q’s always fell for females and fell 79% of
time for male speakers; wh-q’s and
statements generally fell
– Nuclear accents in ynq’s: majority L*
• Perception studies: acceptability judgments
– Forced choice, 12 listeners
– Stimuli: Pairs of ynq’s and whq’s with same
voice/different intonation
• 17 natural (9 ynq’s, 8 whq’s)
• 12 synthesized
– 12 subjects (6 and 6)
– Judgments:
• Ynq:
– Natural speech: people preferred standard rise (L* H- H%)
– Synthetic speech: no results
• Whq:
– Natural speech: people preferred falling contours (L- L%) to
rising (H-H%) and slightly to ‘continuation rise’ (L- H%)
– Synthetic: no preference
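The contour categories used in the judgments above can be written down as a small lookup from ToBI phrase accent and boundary tone to a descriptive label. This is a minimal sketch: the mapping covers only the three combinations named on this slide, and the function name and the "other" fallback are assumptions for illustration.

```python
import re

# Phrase accent + boundary tone at the end of a ToBI label string,
# e.g. "L* H- H%", "L- L%", or "H-H%".
_EDGE_TONES = re.compile(r"([LH])-\s*([LH])%\s*$")

# Only the combinations named in the slide; anything else is "other".
_CONTOUR_NAMES = {
    ("L", "L"): "fall",
    ("H", "H"): "rise",
    ("L", "H"): "continuation rise",
}

def classify_final_contour(tobi_label):
    """Return a descriptive name for the final contour of a ToBI label.

    Hypothetical helper: returns None when the string has no
    phrase accent / boundary tone pair at its end.
    """
    match = _EDGE_TONES.search(tobi_label)
    if match is None:
        return None
    return _CONTOUR_NAMES.get(match.groups(), "other")

for label in ["L* H- H%", "L- L%", "H-H%", "L- H%"]:
    print(label, "->", classify_final_contour(label))
```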
Critique
• How many questions were produced?
• Are professional speakers a good choice?
• Read vs. spontaneous speech? For TTS?
• Why no results for synthetic speech?
• Comparison to Hedberg and Sosa
Doherty et al 2004
• How do people process intonation, e.g., in rising
questions vs. falling statements vs. falling questions?
She was talking to her father?
She was talking to her father.
Was she talking to her father.
• Research questions:
– Where is the ‘prosody’ portion of the brain?
– What other sectors is it ‘close’ to and what is their
function?
– Do particular contours have particular locations?
• Method: functional Magnetic Resonance
Imaging (fMRI) of subjects presented with
digitized recordings
– 11 subjects (4M, 7F)
– Note experimental condition!
– 150 triples, of which each subject heard only
one version
• She was talking to her father?
• Was she talking to her father.
• She was talking to her father.
– Monitoring task: Is this a question or a
statement?
• Press one key for question, another for statement
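One way to make sure each subject hears only one version of each triple is to rotate versions across subjects and items. The slides do not say how the versions were actually assigned, so the rotation below is purely an assumption, as are the function name and the version labels.

```python
from collections import Counter

# Version labels are descriptive assumptions based on the three example sentences.
VERSIONS = ("rising question", "falling question", "falling statement")

def assign_versions(n_subjects, n_triples, versions=VERSIONS):
    """Assign one version of every triple to every subject by rotation.

    Returns {subject_index: [version for triple 0, version for triple 1, ...]}.
    """
    k = len(versions)
    return {
        s: [versions[(s + t) % k] for t in range(n_triples)]
        for s in range(n_subjects)
    }

# Numbers from the slide: 11 subjects, 150 triples.
plan = assign_versions(n_subjects=11, n_triples=150)
per_subject = Counter(plan[0])
print(per_subject)
```

With 150 triples and three versions, this rotation gives every subject 50 tokens of each version. Eleven subjects cannot be split evenly across three rotation offsets, so the versions of any given triple are heard by unequal numbers of subjects.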
– Results: Increase in activation when subjects
made judgments about tokens w/ rising
intonation -- but not falling, whether syntactic
question or syntactic statement
• Why?
– Semantic processing? No – illocutionary force is same in
rising and falling questions
– Acoustic processing? Maybe…
– Interpreting the rising contour as a question?
• Check lesion studies to see if people with damage
in these areas can interpret rising contours…
Critique
• No rising inverted questions? “Was she talking
to her father?”
Next Class
• How do we represent intonational variation?