Studying Intonation
Julia Hirschberg
CS 4706
7/15/2016

Today
• Approaches to studying contour meaning
  – Questions people ask
    • Does contour X convey a different meaning from contour Y?
    • Is contour X used more often in context Z than contour Y?
    • Despite what people say/think, not all phenomena X are uttered with contour Y
  – What kind of evidence could we get?
    • Found data
    • Laboratory experiments: production, perception
    • Corpus collection
  – What features can we look at and how do we obtain them?
    • Intonation labeling by hand
    • Acoustic/prosodic analysis by automatic methods
      – Pitch tracking, pause detection, intensity, duration, speaking rate extraction
    • Computational linguistic techniques to extract transcript-based (text) features
      – Part-of-speech
      – Sentence length, …
  – What techniques do we use for analysis?
    • Statistical methods (Splus, Matlab)
    • Machine learning techniques

Some Sample Approaches
• Natural Corpus: Hedberg & Sosa 2002
• Introspective, observational: Wilson 1993, Pierrehumbert & Hirschberg 1990/2
• Laboratory -- Production/Perception: Syrdal & Jilka 2004
• Laboratory -- Brain Imaging (e.g. fMRI): Doherty et al

A Prescriptive Approach
• Statements fall and questions rise (Wilson 1993)

Hedberg & Sosa 2002
Who saw John?/Who didn’t see John?
Did John leave?/Didn’t John leave?
• How are yes-no and wh-questions uttered and how might we explain differences?
  – Where is the nuclear stress?
  – Where is the semantic ‘focus’?
  – Are the ‘question words’ accented or not?
• Corpus
  – 35 whq’s and 38 ynq’s from the McLaughlin Group and Washington Week
• Analysis
  – Intonational labeling (ToBI) from pitch tracks
  – Topic/focus coding
  – Frequency distributions of features with question categories
• Results
  – Wh-words (60%) and neg aux in negative ynq’s (89%) most often uttered with L+H* accent (‘contrastive’ accent) -- why?
  – Aux in positive ynq’s often deaccented (41%) or realized with L* (17%) accent -- why?
  – Ynq’s generally uttered w/ falling or level intonation, not rising (69%)
  – Wh-q’s most often uttered with falling intonation (80%)
• Conclusions:
  – Locus of interrogation is accented in wh-q’s and in negative ynq’s to “signal interrogative status of sentence”, but not in positive ynq’s “due to need to highlight a following element”
  – Why do ynq’s and wh-q’s sometimes rise and sometimes fall?

Critique?
• Is this a good corpus for this investigation?
  – What about the speakers?

Syrdal & Jilka 2004
• How are whq’s and ynq’s produced most naturally (for TTS)?
• Same hypothesis: whq’s fall and ynq’s rise in American English
• Different approach: production and perception studies
• Production:
  – 8 (professional) speakers (5F, 3M)
  – Read transcripts of actual dialogues
• Analysis:
  – Intonational (ToBI) labeling from pitch tracks of extracted questions
• Results:
  – Ynq’s rose in 83% of cases for females and 53% for males
  – Wh-q’s always fell for females and fell 79% of the time for male speakers
• Perception: acceptability judgments
  – Forced choice, 12 listeners
  – Stimuli: pairs of ynq’s and whq’s with same voice/different intonation
    • 17 natural (9 ynq’s, 8 whq’s)
    • 12 synthesized
  – 12 subjects (6 and 6)
  – Judgments:
    • Ynq:
      – Natural speech: people preferred standard rise (L* H-H%)
      – Synthetic speech: no results
    • Whq:
      – Natural speech: people preferred falling contours (L-L%) to rising (H-H%) and slightly to ‘continuation rise’ (L-H%)
      – Synthetic: no preference

Critique
• How many questions were produced?
• Are professional speakers a good choice?
• Read vs. spontaneous speech? For TTS?
• Why no results for synthetic speech?

Doherty et al 2004
• How do people process intonation, e.g., in rising questions vs. falling statements vs. falling questions?
• Method: brain imaging (fMRI)
  – Where is the ‘prosody’ portion of the brain?
  – What other sectors is it ‘close’ to and what is their function?
  – Do particular contours have particular locations?
• Procedure:
  – 11 subjects (4M, 7F)
  – 150 triples, of which each subject heard only 1 version
    • She was talking to her father?
    • Was she talking to her father.
    • She was talking to her father.
  – Task: Monitoring: Is this a question or a statement?
  – Results: Increase in activation when subjects made judgments about tokens w/ rising intonation but not falling
  – Why?
  – Semantic processing? No
  – Acoustic processing? Maybe
  – Interpreting the rising contour as a question?
    • Check lesion studies to see if people with damage in these areas can interpret rising contours…

Critique
• No rising inverted questions? “Was she talking to her father?”

Pierrehumbert & Hirschberg ’90/’92
• A compositional account of intonational meaning
• Method: intuition and observation
• Hypothesis:
  – Contours convey relationships
    • Between current, prior, and following utterances
    • Between propositional content and mutual beliefs
  – Contour meanings are composites of the meanings of their pitch accents, phrase accents and boundary tones

Pitch Accent/Prominence in Pierrehumbert 1980
• Which items are made intonationally prominent and how?
• Accent type:
  – H*: simple high (declarative)
  – L*: simple low (ynq)
  – L*+H: scooped, late rise (uncertainty/incredulity)
  – L+H*: early rise to stress (contrastive focus)
  – H+L*: fall onto stress (implied familiarity)
  – H*+L: fall from a high stress (common downstepped contour)
• Downstepped accents:
  – H*, L+H*, L*+H
• Degree of prominence: within a phrase: HiF0 across phrases

Prosodic Phrasing in Pierrehumbert 1980
• ‘Levels’ of phrasing:
  – intermediate phrase: one or more pitch accents plus a phrase accent (H- or L-)
  – intonational phrase: 1 or more intermediate phrases + boundary tone (H% or L%)

[Figure: contours for pitch accents H*, L*, L*+H, each combined with L-L%, L-H%, H-L%, H-H%]

[Figure: contours for pitch accents L+H*, H+L*, H*+L, each combined with L-L%, L-H%, H-L%, H-H%]

Goal
• Explain how contours that share prosodic phenomena convey similar meanings, and how those that differ in phenomena differ in meaning, based on their intonational description
  – H* L- L% vs. H* H- L% vs. H* H- H%
    I’m from Muskogee…
  – L* H- H% vs. H* H- H%

Pitch Accents
• Convey information status about discourse references, modifiers, predicates and their relationship to the speaker’s (S) and hearer’s (H) mutual beliefs
  – H*: X is new and predicated
    My name is H* Mark H* Liberman H-H%
  – L*: X is salient but not part of the speaker’s predication
    …L* Stalin was L* right H-H%
  – H*+L: X is inferable from S and H’s mutual beliefs and part of the predication
    H*+L Don’t H*+L forget to H*+L take your H* lunch L-L%
  – H+L* (H+!H*): X is inferable from S and H’s mutual beliefs but not part of the predication
    She’s H+L* teething L-L%
  – L*+H: X is part of a scale but not part of the predication
    …I fed the L*+H goldfish L-H%
  – L+H*: X is part of a scale and in S and H’s mutual beliefs (narrow focus)
    I don’t L+H* want L+H* shrimp L-H% I want L+H* lobster L-L%

Phrase Accents
• Convey relationships among intermediate phrases, such as which form part of larger interpretive units
  – L-: X L- Y means X and Y are interpreted separately from one another
    Do you want a sandwich L- or would you like a soda
  – H-: X H- Y means X and Y should be interpreted together
    Do you want apple juice H- or orange juice

Boundary Tones
• Signal the directionality of interpretation of intonational phrases
  – H%: X H% Y means interpret X with respect to Y
    You made seven errors L-H% What a shame L-L% We don’t have time to continue today.
  – L%: X L% Y means no directionality of interpretation suggested
    You made seven errors L-L% What a shame L-H% We don’t have time to continue today.

Unresolved Questions
• How do the meanings of pitch accents in a single phrase combine?
    The L* blackboard’s painted H* orange L-L%
• How do we distinguish the meaning of a phrase accent from that of a boundary tone, especially in intonational phrases with a single intermediate phrase?
  – E.g. H* H-L% (plateau) vs. H* H-H% (high-rise question) vs. H* L-L% (declarative)
• Is this framework useful for investigating contour meaning? E.g. downstepped contours, H+L*

Critique
• What is the evidence?
• Where might we get it?

Next Class
• Read about Text-to-Speech systems
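
One way to make the compositional account from the Pierrehumbert & Hirschberg slides concrete is a small Python sketch that splits a contour label into its pitch accents, phrase accent and boundary tone, looks up a gloss for each part, and reports which parts two contours share. This is only an illustration under stated assumptions: the glosses are paraphrased from the slides, the names `Contour`, `parse_contour`, `gloss` and `differing_slots` are hypothetical (not from any ToBI tool), and the parser assumes an intonational phrase with a single intermediate phrase and no downstep marking.

```python
from dataclasses import dataclass

# Glosses paraphrased from the slides; "X" is the accented item or phrase.
PITCH_ACCENTS = {
    "H*": "X is new and predicated",
    "L*": "X is salient but not part of the predication",
    "H*+L": "X is inferable from mutual beliefs and part of the predication",
    "H+L*": "X is inferable from mutual beliefs but not part of the predication",
    "L*+H": "X is part of a scale but not part of the predication",
    "L+H*": "X is part of a scale and in mutual beliefs",
}
PHRASE_ACCENTS = {
    "L-": "interpret this phrase separately from the next",
    "H-": "interpret this phrase together with the next",
}
BOUNDARY_TONES = {
    "L%": "no directionality of interpretation suggested",
    "H%": "interpret this phrase with respect to what follows",
}


@dataclass(frozen=True)
class Contour:
    """Hypothetical container: one intermediate phrase plus a boundary tone."""
    pitch_accents: tuple
    phrase_accent: str
    boundary_tone: str


def parse_contour(label: str) -> Contour:
    """Parse labels such as 'H* L- L%' or 'H* L-L%' into a Contour."""
    tokens = []
    for tok in label.split():
        # Split fused phrase accent + boundary tone, e.g. 'L-H%' -> 'L-', 'H%'.
        if tok.endswith("%") and "-" in tok:
            cut = tok.index("-") + 1
            tokens.extend([tok[:cut], tok[cut:]])
        else:
            tokens.append(tok)
    if len(tokens) < 3:
        raise ValueError(f"need pitch accent(s), phrase accent, boundary tone: {label!r}")
    *accents, phrase, boundary = tokens
    unknown = [a for a in accents if a not in PITCH_ACCENTS]
    if unknown or phrase not in PHRASE_ACCENTS or boundary not in BOUNDARY_TONES:
        raise ValueError(f"unrecognized tone label in {label!r}")
    return Contour(tuple(accents), phrase, boundary)


def gloss(contour: Contour) -> list:
    """List one gloss per tone, in order: pitch accents, phrase accent, boundary tone."""
    lines = [f"{a}: {PITCH_ACCENTS[a]}" for a in contour.pitch_accents]
    lines.append(f"{contour.phrase_accent}: {PHRASE_ACCENTS[contour.phrase_accent]}")
    lines.append(f"{contour.boundary_tone}: {BOUNDARY_TONES[contour.boundary_tone]}")
    return lines


def differing_slots(a: Contour, b: Contour) -> list:
    """Name the parts of the description in which two contours differ."""
    slots = ("pitch_accents", "phrase_accent", "boundary_tone")
    return [s for s in slots if getattr(a, s) != getattr(b, s)]


if __name__ == "__main__":
    declarative = parse_contour("H* L- L%")
    high_rise = parse_contour("H* H-H%")
    print("\n".join(gloss(declarative)))
    print(differing_slots(declarative, high_rise))
```

Listing the glosses side by side does not answer the unresolved question of how the meanings combine; the sketch only shows where two descriptions overlap, which is the starting point the Goal slide describes.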