Advanced NLP: Speech Research and Technologies Julia Hirschberg

CS 6998
7/15/2016
Spoken Natural Language
Processing
NLP/Computational Linguistics historically text-oriented
Speech research domain of EE and
Linguistics
1980s: DARPA efforts to bring the two together
Today: applications motivate collaboration
Automatic Speech Recognition (ASR)
Text/Concept-to-Speech (TTS/CTS)
Spoken Dialogue Systems (SDS), Speech-to-Speech Translation, Speech Search/Data Mining
Studying Speech is Different
Understanding input and generating output are
more complicated
ASR errors and lack of formatting cues
TTS/CTS naturalness issues
But there is also more information to take
advantage of
Pitch variation, loudness, rate, voice quality
Filled pauses, self-repairs
Labeled Waveform and
F0 Contour
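
An F0 contour like the one on this slide comes from running a pitch tracker over the waveform. As a rough sketch only (not the algorithm of any tool named in these slides), the F0 of a single frame can be estimated from the peak of its autocorrelation. The function names, the 75-400 Hz search range, and the synthetic test tone are assumptions made for illustration.

```python
import math


def synth_tone(freq_hz, sample_rate=8000, duration_s=0.05):
    """Generate a pure sine tone as a list of float samples (test input only)."""
    n = int(sample_rate * duration_s)
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n)]


def estimate_f0(frame, sample_rate, fmin=75.0, fmax=400.0):
    """Estimate F0 (Hz) for one frame from its autocorrelation peak.

    Only lags whose period falls inside [fmin, fmax] are searched.
    Returns None when no lag has positive correlation (e.g. silence).
    """
    min_lag = int(sample_rate / fmax)
    max_lag = min(int(sample_rate / fmin), len(frame) - 1)
    best_lag, best_score = None, 0.0
    for lag in range(min_lag, max_lag + 1):
        # Unnormalized autocorrelation at this lag.
        score = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag if best_lag else None


if __name__ == "__main__":
    tone = synth_tone(200.0)
    print(estimate_f0(tone, 8000))
```

Applying this to successive overlapping frames of a recording would give a crude contour. Real pitch trackers add voicing decisions and smoothing, which this sketch omits.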
Current Approaches
Corpus-based studies
Hand-labeled data (ToBI etc.)
Tools:
Analysis (pitch tracks, spectrograms, …)
ASR toolkits
TTS systems
Machine learning
Laboratory studies
Evaluation
Prosodic Generation for TTS
Corpus-based approaches
Train prosodic variation on large labeled
corpora using machine learning
techniques
Accent and phrasing decisions
Associate prosodic labels with simple
features of transcripts
To do:
Contour variation
Timing and backchanneling
Disfluencies?
Emotion and ‘personality’
Personalized voices
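
The accent and phrasing decisions listed above associate prosodic labels with simple features of the transcript. The sketch below is a hand-written rule baseline, not the trained corpus-based models the slides describe: it accents content words, leaves function words unaccented, and places a phrase boundary after punctuation. The word list and the function name `assign_prosody` are hypothetical.

```python
# Small illustrative function-word list (an assumption, far from complete).
FUNCTION_WORDS = {
    "a", "an", "the", "to", "of", "in", "on", "and", "or", "but",
    "is", "are", "it", "i", "you", "he", "she", "we", "they", "do",
}

PUNCTUATION = ".,?!;:"


def assign_prosody(text):
    """Return (word, accented, boundary_after) triples for a transcript.

    accented:       True for words not in FUNCTION_WORDS.
    boundary_after: True when the token ends in punctuation.
    """
    labels = []
    for raw in text.split():
        word = raw.strip(PUNCTUATION).lower()
        if not word:
            continue  # token was punctuation only
        accented = word not in FUNCTION_WORDS
        boundary_after = raw[-1] in PUNCTUATION
        labels.append((word, accented, boundary_after))
    return labels


if __name__ == "__main__":
    for triple in assign_prosody("I want to go to Baltimore."):
        print(triple)
```

A corpus-based system would learn these decisions from hand-labeled data such as ToBI annotations, possibly using features like these as inputs rather than as fixed rules.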
Concept to Speech
Decisions in TTS depend on text analysis
Concept-to-Speech (CTS) systems should be
able to do better
System knows what it wants to say and can
specify how
But….
Still need labeled corpora to train on
CTS features may be hard to label (focus,
given/new,…)
How to decide how to realize these?
Prosody in ASRU
Little success in improving ASR transcription
More promise in other areas:
Improving rejection
Shrinking search space
Automatic topic segmentation for
browsing/retrieval
Identifying ‘salient’ words in turns
Disambiguating speech/dialogue acts: okay
Recognizing communicative ‘problems’
ASR errors
User corrections
‘Aware’ turns
‘Problematic’ dialogues
Disfluencies and self-repairs
Recognizing speaker emotion
My Research
Meaning of intonational contours:
Rise/fall/rise (L*+H L-H%)
A: Did you take out the garbage?
B: Sort of.
A: Sort of!
High rise questions (H* H-H%)
This is the chicken Chermula?
I’m from Skokie?
Compositional theory of intonational meaning
(w/Pierrehumbert)
Intonational disambiguation across languages:
Spanish, Italian and English (w/Avesani &
Prieto)
William isn’t drinking because he’s unhappy
• Disfluencies: self-repairs (w/Nakatani)
I want to go to Ba- Baltimore.
• Cue phrases (w/Litman)
• Now let’s go to work.
Accent and strict/sloppy interpretations of
ellipsis (w/Ward)
People who live in Los Angeles adore its
beaches and so do people who live in New
York
• Accent and given/new (w/Terken)
The ball touches the circle.
The ball touches the triangle.
The ball touches the cone.
The square touches the ball.
Intonation and discourse structure (w/Grosz &
Nakatani)
Boston Directions Corpus
Automatic assignment of accent and phrasing
for TTS (w/Wang, Sproat, Koehn, Abney, Collins,
Rambow)
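
For the accent and given/new example sentences above ("The ball touches the circle." and so on), the given/new status of each word can be tracked mechanically. The following is a minimal sketch that treats a word as "given" if it appeared in an earlier sentence. It looks only at lexical repetition and ignores grammatical role or position, which a fuller account might need. The function name and the small article list are hypothetical.

```python
ARTICLES = {"the", "a", "an"}  # skipped when marking status (an assumption)


def mark_given_new(sentences):
    """Label each non-article word as 'given' or 'new'.

    A word is 'given' if it occurred in any earlier sentence,
    otherwise 'new'. Returns one list of (word, status) per sentence.
    """
    seen = set()
    result = []
    for sentence in sentences:
        marked = []
        for raw in sentence.split():
            word = raw.strip(".,?!").lower()
            if not word or word in ARTICLES:
                continue
            marked.append((word, "given" if word in seen else "new"))
        # Update only after the sentence ends, so status reflects prior context.
        seen.update(word for word, _ in marked)
        result.append(marked)
    return result


if __name__ == "__main__":
    discourse = [
        "The ball touches the circle.",
        "The ball touches the triangle.",
        "The ball touches the cone.",
        "The square touches the ball.",
    ]
    for labels in mark_given_new(discourse):
        print(labels)
```

A status label like this could serve as one input feature to an accent-assignment component, with "given" items as candidates for deaccenting.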
ToBI prosodic labeling conventions (w/many)
Prosody in dialogue systems (w/Litman &
Swerts): generation and understanding (TOOT)
Audio browsing and retrieval: SCAN and
SCANMail (w/many)
CS 6998
Requirements:
Class Participation:
Questions for class discussion
Helping lead a class
Lab exercises
Project
• Literature review
• Data collection and/or analysis from a
corpus
• Building a system or system component
(e.g. a preprocessor to assign intonation in
a generation system)
Next Week
Read Hirschberg 2003 and ToBI conventions
Make sure you have access to supplementary
readings if you need them
Bring 3 discussion questions to class
Check access on cs servers to corpora and
/proj/nlp/tools/mathTools/
Xwaves (Solaris and Linux): esps531.sol,
esps531.linux (also downloadable from KTH)
WaveSurfer (Windows, Linux, Mac): available from KTH
Projects:
Start thinking about what area you want to
work in for your project and what type of
project you’d like to do