"You made me do it":Classification of Blame in Married Couples

“You made me do it”:
Classification of Blame in Married Couples’
Interactions by Fusing Automatically Derived
Speech and Language Information
Matthew P. Black, Panayiotis G. Georgiou,
Athanasios Katsamanis, Brian R. Baucom,
and Shrikanth S. Narayanan
http://sail.usc.edu
Examples of Blame
Husband blaming Wife
Wife blaming Husband
[Two video clips of a husband and wife, shown with on-screen transcripts. Recoverable lines:]
– “I wasn’t expecting this when we were going to get married.”
– “You know, you’re a stranger to me now. I- I don’t ...”
– “I’m living at home with a roommate.”
– “Oh, Bob suggested that this might help.”
– “You’re the one who suggested it.”
– “Why? Why did you even want to get therapy?”
– “I just don’t think there is any point in talking about this.”
– “Are you done dumping this on me yet?”
* Note that the above clips are acted.
* The real couples analyzed in this study cannot be shown for privacy reasons.
Aug. 28, 2011
“You made me do it”: Classification of Blame in Married Couples’ Interactions
2 / 18
Overview
• Blame conveyed through various communicative channels
– Language (e.g., “you made me do it”)
– Speech (e.g., prosody)
– Gestures (e.g., pointing)
• Goal: classify extreme cases of blame behavior using audio
– “Low” vs. “High”
• Methodology: combine 2 automatically-derived info sources
– Acoustic: models how the spouses spoke
– Language: models what the spouses said
Why Detect Blame?
• Blame is oftentimes targeted in couple therapy
– Can lead to escalation of negative affect and resentment [Dimidjian et al. 2008]
• Blame is one important behavior researched in psychology
– Rely on established manual coding methods
– Challenges: coders must be trained, time-consuming
• How can technology help?
– Automatic detection of blame is a scalable alternative to manual coding
• Behavioral Signal Processing (BSP)
– Quantify and recognize abstract human behaviors in natural interaction
settings relevant to psychology research
S. Dimidjian, C. R. Martell, and A. Christensen, Clinical Handbook of Couple Therapy, 4th ed. The Guilford Press,
2008, ch. Integrative behavioral couple therapy, pp. 73–106.
General Technical Challenges
• Blame is a complex human behavior
– High-level, heterogeneous
– Need to extract generalizable features
• Real data in real-life scenarios
– Challenging for robust feature extraction and automatic speech recognition
• Information across multiple modalities/cues
– Distributed information across various signals: acoustics, language, gestures
– How to merge/combine/fuse them
Previous Work [Black et al. 2010]
• This paper is an extension of our earlier work
– Classified extreme instances (low/high) for 6 behavioral codes
– Only used acoustic cues
• “Blame” was one of the more challenging codes to predict
– Ignoring important lexical cues regarding blame
– Coding manual: “explicit blaming statements (e.g., ‘you made me do it’)
warrant a high blame score”
• This paper addresses this weakness
M. P. Black, A. Katsamanis, C.-C. Lee, A. C. Lammert, B. R. Baucom, A. Christensen, P. G. Georgiou, and S. S. Narayanan,
“Automatic classification of married couples’ behavior using audio features,” in Proc. Interspeech, 2010.
Corpus
• Real couples in 10-minute problem-solving dyadic interactions
– Longitudinal study at UCLA and U. of Washington [Christensen et al. 2004]
– 117 distressed couples received couples therapy for 1 year
– 569 sessions (96 hours)
• Audio
– Single channel
– Far-field
– Variable noise conditions
• Transcription (word-level)
– Chronological
– Speaker labels (wife/husband)
– No timing information
Wife: then why did you ask
Husband: to get us out of debt
Wife: mm hmmm ...
A. Christensen, D.C. Atkins, S. Berns, J. Wheeler, D. H. Baucom, and L.E. Simpson. “Traditional versus integrative behavioral couple
therapy for significantly and chronically distressed married couples.” J. of Consulting and Clinical Psychology, 72:176-191, 2004.
Data Pre-processing
• Signal-to-noise ratio (SNR) estimation
– Voice activity detection (VAD) [Ghosh et al. 2011]
– SNR ranged from -1 dB to 26 dB
• Speaker diarization
– Used transcriptions and SailAlign’s speech-text alignment (available on web)
– Each session’s audio split into wife/husband/unknown turns
• Session selection
– (SNR > 5 dB) && (Speaker alignment > 55%)
– 372 of 569 sessions (65%) met both thresholds (62.8 hours)
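The session-selection rule above can be sketched as a simple filter. This is only an illustration: the `Session` type, its field names, and the sample values are hypothetical, and the sketch assumes the 55% threshold applies to a per-session alignment fraction.

```python
from dataclasses import dataclass

@dataclass
class Session:
    session_id: str       # hypothetical identifier
    snr_db: float         # estimated signal-to-noise ratio, in dB
    aligned_frac: float   # assumed: fraction of the session aligned to a speaker (0..1)

SNR_MIN_DB = 5.0          # threshold from the slide: SNR > 5 dB
ALIGN_MIN = 0.55          # threshold from the slide: speaker alignment > 55%

def keep(session: Session) -> bool:
    """A session is kept only if it passes both thresholds."""
    return session.snr_db > SNR_MIN_DB and session.aligned_frac > ALIGN_MIN

# Made-up sessions, for illustration only.
sessions = [
    Session("a", snr_db=12.0, aligned_frac=0.80),  # passes both thresholds
    Session("b", snr_db=3.5, aligned_frac=0.90),   # fails the SNR threshold
    Session("c", snr_db=20.0, aligned_frac=0.40),  # fails the alignment threshold
]
selected = [s.session_id for s in sessions if keep(s)]
```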
P. K. Ghosh, A. Tsiartas, and S. S. Narayanan, “Robust voice activity detection using long-term signal variability,” IEEE Trans. Audio,
Speech, and Language Processing, vol. 19, no. 3, pp. 600–613, 2011.
Classification Set-up
• Each spouse’s overall level of blame manually coded
– Standardized coding manual [Heavey et al. 2002]
– 9-point scale (1 = no blame, 9 = repeated blame)
– Multiple trained evaluators (mean pairwise Pearson’s correlation = 0.788)
[Figure: Wife and Husband blame ratings on the 1–9 scale, with the Low Blame and High Blame regions marked]
• Binary classification set-up (top/bottom 20%)
– Low blame (70 wife + 70 husband), High blame (70 wife + 70 husband)
– Leave-one-couple-out cross-validation
– Gender-independent models of blame
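The set-up above can be sketched in two steps: keep only the bottom and top 20% of ratings as "low" and "high" samples, then hold out one couple at a time. This is a minimal sketch with made-up data; the function names are hypothetical, ties are broken arbitrarily by sort order, and the slide's per-gender selection (70 wife + 70 husband per class) would correspond to calling `extreme_labels` once per gender.

```python
from collections import defaultdict

def extreme_labels(ratings, frac=0.2):
    """Label the bottom `frac` of ratings 0 (low) and the top `frac` 1 (high)."""
    k = int(len(ratings) * frac)
    if k == 0:
        return {}
    order = sorted(range(len(ratings)), key=lambda i: ratings[i])
    labels = {i: 0 for i in order[:k]}
    labels.update({i: 1 for i in order[-k:]})
    return labels

def leave_one_couple_out(couple_ids):
    """Yield (held_out_couple, train_indices, test_indices), one fold per couple."""
    by_couple = defaultdict(list)
    for idx, couple in enumerate(couple_ids):
        by_couple[couple].append(idx)
    for couple, test_idx in by_couple.items():
        train_idx = [i for i, c in enumerate(couple_ids) if c != couple]
        yield couple, train_idx, test_idx

# Hypothetical toy data: one blame rating per sample and the couple it came from.
ratings = [1.0, 8.5, 2.0, 7.0, 5.0, 4.5, 9.0, 1.5, 6.0, 3.0]
couples = ["c1", "c1", "c2", "c2", "c3", "c3", "c4", "c4", "c5", "c5"]
labels = extreme_labels(ratings)
folds = list(leave_one_couple_out(couples))
```

Holding out whole couples keeps both spouses, and every session of the same couple, out of the training data for that fold.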
C. Heavey, D. Gill, and A. Christensen. Couples interaction rating system 2 (CIRS2). University of California, Los Angeles, 2002.
Methodology Overview
• Trained 3 classifiers
– Acoustic
– Lexical
– Fusion
Acoustic Classifier
• 53,000+ features extracted
1) Extracted frame-level low-level descriptors (LLDs)
– Prosodic: f0, intensity [Praat – Boersma 2001]
– Spectral: 15 MFCCs, 8 MFBs [openSMILE – Eyben et al. 2010]
– Voice quality: jitter, shimmer [openSMILE – Eyben et al. 2010]
2) Separate features for each spouse (wife, husband)
3) 6 temporal granularities
– Global: entire session [Black et al. 2010]
– Hierarchical: 0.1s, 0.5s, 1s, 5s, 10s disjoint windows [Schuller et al. 2008]
4) 14 static functionals (e.g., mean, std. dev.)
• Apply binary classifier
– Classifier: Support Vector Machine (SVM) with linear kernel
– Confidence score: class probability estimates [LIBSVM – Chang et al. 2001]
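Steps 3 and 4 above can be illustrated for a single low-level descriptor (for example, f0) of one spouse. The sketch below is one plausible reading, not the authors' implementation: it assumes a 100 Hz frame rate, summarizes each disjoint window by its mean, and then applies functionals across windows. Only four functionals are shown, whereas the slide lists 14, and all names are hypothetical.

```python
from statistics import mean, pstdev

# Four illustrative functionals; the slide mentions 14 in total.
FUNCTIONALS = {"mean": mean, "std": pstdev, "min": min, "max": max}

def windows(values, frame_rate_hz, window_s):
    """Split frame-level values into disjoint windows of `window_s` seconds."""
    size = max(1, round(frame_rate_hz * window_s))
    return [values[i:i + size] for i in range(0, len(values), size)]

def session_features(lld, frame_rate_hz=100, window_sizes_s=(0.1, 0.5, 1, 5, 10)):
    """Global functionals plus functionals of per-window means at each granularity."""
    feats = {f"global_{name}": fn(lld) for name, fn in FUNCTIONALS.items()}
    for w in window_sizes_s:
        # Assumption: each window is summarized by its mean before the
        # functionals are applied across windows.
        per_window = [mean(chunk) for chunk in windows(lld, frame_rate_hz, w)]
        for name, fn in FUNCTIONALS.items():
            feats[f"win{w}s_{name}"] = fn(per_window)
    return feats

# Made-up 10-second contour at the assumed 100 Hz frame rate.
lld = [float(i % 7) for i in range(1000)]
feats = session_features(lld)
```

Repeating this for every descriptor, both spouses, and all functionals is what makes the feature count grow into the tens of thousands; the resulting vector would then go to the linear SVM.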
B. Schuller, M. Wimmer, L. Mösenlechner, C. Kern, D. Arsic, and G. Rigoll, “Brute-forcing hierarchical functionals for paralinguistics:
A waste of feature space?” in Proc. ICASSP, 2008.
Lexical Classifier (1/2)
• Derived from automatic speech recognition (ASR)
[Diagram: the most likely blame class (Low/High) and the most likely word sequence are decoded from the acoustic observations of the rated spouse’s speech. Labels in the diagram: turn of rated spouse; “blame class”-specific acoustic model; “blame class”-specific language model; generic acoustic model; unigram language model.]
• Lexical classifier based on “competitive” language models
– Low (High) blame language models trained on text of low (high) blame spouses
– Classifier: choose the blame class that is most likely
– Confidence score: absolute difference in probabilities of low/high blame case
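The "competitive" language-model idea above can be sketched with two small unigram models. This is a toy illustration under stated assumptions: the training sentences are invented, add-one smoothing is a choice made here, and the confidence is computed as a difference of log-likelihoods rather than of raw probabilities.

```python
import math
from collections import Counter

def train_unigram(texts):
    """Return (word counts, total token count) for a list of sentences."""
    counts = Counter(word for text in texts for word in text.split())
    return counts, sum(counts.values())

def log_likelihood(text, model, vocab_size):
    """Log-probability of `text` under a unigram model with add-one smoothing."""
    counts, total = model
    return sum(math.log((counts[w] + 1) / (total + vocab_size)) for w in text.split())

def classify(text, low_model, high_model, vocab_size):
    """Pick the more likely blame class; confidence is the log-likelihood gap."""
    ll_low = log_likelihood(text, low_model, vocab_size)
    ll_high = log_likelihood(text, high_model, vocab_size)
    label = "high" if ll_high > ll_low else "low"
    return label, abs(ll_high - ll_low)

# Invented training text standing in for low- and high-blame spouses.
low_texts = ["we can work on this together", "i think that helps us"]
high_texts = ["you made me do it", "you never listen to me"]
low_model, high_model = train_unigram(low_texts), train_unigram(high_texts)
# Shared vocabulary size, plus one slot for unseen words.
vocab_size = len(set(low_model[0]) | set(high_model[0])) + 1

label, confidence = classify("you made me do it", low_model, high_model, vocab_size)
```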
Lexical Classifier (2/2)
• Single most likely path through word lattice may not be robust
– Incorporated probabilities of 100 most likely (“N-best”) paths
– Assumed N-best hypotheses independent for this paper
nth-best path (n = 1, … , 100)
• Oracle lexical classifier
– Upper bound on performance of proposed ASR-derived lexical classifier
– Assume perfect word recognition rate
Manual transcription of rated spouse
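The slide does not say how the N-best hypotheses were combined. Under the stated independence assumption, one plausible reading is to multiply the per-hypothesis likelihoods for each blame class, which means summing log-likelihoods. The sketch below uses hypothetical keyword-based scorers in place of real language models and three hypotheses instead of 100.

```python
import math

def toy_scorer(favored_words):
    """Hypothetical stand-in for a class-specific language model."""
    def score(text):
        return sum(math.log(0.2 if w in favored_words else 0.05) for w in text.split())
    return score

def nbest_decision(hypotheses, score_low, score_high):
    """Sum log-likelihoods over hypotheses (independence assumption), then compare."""
    total_low = sum(score_low(h) for h in hypotheses)
    total_high = sum(score_high(h) for h in hypotheses)
    label = "high" if total_high > total_low else "low"
    return label, abs(total_high - total_low)

score_low = toy_scorer({"we", "us"})
score_high = toy_scorer({"you", "made", "me"})

# Three made-up ASR hypotheses for the same stretch of speech.
hypotheses = ["you made me do it", "you made we do it", "who made me do it"]
label, confidence = nbest_decision(hypotheses, score_low, score_high)
```

For the oracle classifier, the same decision would be applied to the single manual transcription rather than to a list of ASR hypotheses.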
Fusion Classifier
• Complementary info from acoustic and lexical classifiers
– Score-level fusion of classifiers using confidence scores
• Fusion classifier: another binary SVM
– Inputs: “confidence score”-weighted class hypotheses
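Score-level fusion as described above can be sketched by turning each classifier's output into a signed confidence (positive for a "high" hypothesis, negative for "low") and training a second linear classifier on those two numbers. The SVM below is a hand-rolled stand-in trained by subgradient descent on the hinge loss, not LIBSVM, and the training pairs are invented.

```python
def signed_score(label, confidence):
    """Confidence-weighted class hypothesis: +confidence for high, -confidence for low."""
    return confidence if label == "high" else -confidence

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=500):
    """Full-batch subgradient descent on the L2-regularized hinge loss; y in {-1, +1}."""
    n, dim = len(X), len(X[0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:  # sample violates the margin
                for j in range(dim):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def predict(w, b, x):
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1

# Invented (acoustic, lexical) outputs; +1 = high blame, -1 = low blame.
outputs = [
    (("high", 0.9), ("high", 0.6), +1),
    (("high", 0.7), ("low", 0.2), +1),
    (("low", 0.3), ("high", 0.8), +1),
    (("low", 0.8), ("low", 0.5), -1),
    (("low", 0.6), ("high", 0.1), -1),
    (("high", 0.2), ("low", 0.9), -1),
]
X = [[signed_score(*acoustic), signed_score(*lexical)] for acoustic, lexical, _ in outputs]
y = [target for _, _, target in outputs]
w, b = train_linear_svm(X, y)
```

With this representation, a confident classifier can outvote an unsure one when the two disagree.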
Classification Results
System     Classifier                   Accuracy
Baseline   Chance                       140/280 = 50.0%
Unimodal   Acoustic                     223/280 = 79.6%
Unimodal   Lexical/ASR                  211/280 = 75.4%
Unimodal   Lexical/Oracle               255/280 = 91.1%
Fusion     Acoustic + Lexical/ASR       230/280 = 82.1%
Fusion     Acoustic + Lexical/Oracle    257/280 = 91.8%
ASR word error rate (WER): 40%-90%
• Findings
– High performance of oracle lexical classifier means lexical cues are important
– Lower performance of ASR lexical classifier due to ASR being challenging
– Fusion classifiers able to advantageously combine partially orthogonal language and acoustic information sources
• Significant differences
– All classifiers significantly higher accuracy than chance (all p < 0.01)
– Oracle classifiers significantly higher accuracy than non-oracle (all p < 0.01)
– Fusion (Acoustic + Lexical/ASR) significantly higher than Lexical/ASR only (p < 0.05)
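The accuracies above follow directly from the correct/total counts, and the comparison against chance can be illustrated with an exact one-sided binomial test. The slide does not state which significance test was used, so the test below is an assumption chosen for illustration; it does not cover the pairwise comparisons between classifiers.

```python
from math import comb

TOTAL = 280  # 140 low-blame + 140 high-blame samples

# Correct classifications per classifier, taken from the results table.
correct = {
    "Acoustic": 223,
    "Lexical/ASR": 211,
    "Lexical/Oracle": 255,
    "Acoustic + Lexical/ASR": 230,
    "Acoustic + Lexical/Oracle": 257,
}

def binomial_tail(successes, trials, p=0.5):
    """Exact one-sided tail P(X >= successes) for X ~ Binomial(trials, p)."""
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )

accuracy_pct = {name: round(100 * n / TOTAL, 1) for name, n in correct.items()}
p_vs_chance = {name: binomial_tail(n, TOTAL) for name, n in correct.items()}
```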
Conclusions & Future Work
• Modeled high-level blame behaviors from real couples by fusing
automatically-derived speech and language information
– Proposed acoustic classifier more robust than lexical classifier
– Even with noisy ASR, lexical classifier attained 75% accuracy
– Successfully separated 82% of extreme instances with fusion classifier
• Blaming behaviors are an important cue to detect because of their
significance in the context of couple therapy
– Detection of blame could facilitate clinician-guided drill-down therapy
• Future Work
– Improve individual classifiers (acoustic/lexical) and fusion classifier
– Extend fusion experiments to other behavioral codes
– Apply behavioral signal processing methodologies to other domains
References & Software
REFERENCES
M. P. Black, A. Katsamanis, C.-C. Lee, A. C. Lammert, B. R. Baucom, A. Christensen, P. G. Georgiou, and S. S. Narayanan,
“Automatic classification of married couples’ behavior using audio features,” in Proc. Interspeech, 2010.
A. Christensen, D.C. Atkins, S. Berns, J. Wheeler, D. H. Baucom, and L.E. Simpson. “Traditional versus integrative behavioral
couple therapy for significantly and chronically distressed married couples.” J. of Consulting and Clinical Psychology, vol.
72, pp. 176-191, 2004.
S. Dimidjian, C. R. Martell, and A. Christensen, Clinical Handbook of Couple Therapy, 4th ed. The Guilford Press, 2008, ch.
Integrative behavioral couple therapy, pp. 73–106.
C. Heavey, D. Gill, and A. Christensen. Couples interaction rating system 2 (CIRS2). University of California, Los Angeles, 2002.
B. Schuller, M. Wimmer, L. Mösenlechner, C. Kern, D. Arsic, and G. Rigoll, “Brute-forcing hierarchical functionals for
paralinguistics: A waste of feature space?” in Proc. ICASSP, 2008.
SOFTWARE
P. Boersma, “Praat, a system for doing phonetics by computer,” Glot International, vol. 5, no. 9/10, pp. 341–345, 2001.
C. C. Chang and C. J. Lin, LIBSVM: A library for support vector machines, 2001, software available at
http://www.csie.ntu.edu.tw/~cjlin/libsvm.
F. Eyben, M. Wöllmer, and B. Schuller, “OpenSMILE - The Munich versatile and fast open-source audio feature extractor,” in
ACM Multimedia, 2010, pp. 1459–1462.
P. K. Ghosh, A. Tsiartas, and S. S. Narayanan, “Robust voice activity detection using long-term signal variability,” IEEE Trans.
Audio, Speech, and Language Processing, vol. 19, no. 3, pp. 600–613, 2011.
A. Katsamanis, M. P. Black, P. G. Georgiou, L. Goldstein, and S. S. Narayanan, “SailAlign: Robust long speech-text alignment,”
in Proc. of Workshop on New Tools and Methods for Very-Large Scale Phonetics Research, 2011.
Related Work at Interspeech 2011
• Papers at Interspeech 2011 on same Couple Therapy corpus:
• Saliency detection
– Mon at 10:00 – Ses1-P1
– James Gibson, Athanasios Katsamanis, Matthew Black, and Shrikanth
Narayanan, Automatic identification of salient acoustic instances in couples'
behavioral interactions using Diverse Density Support Vector Machines
• Interaction modeling
– Tues at 14:30 – Ses2-S1-P
– Chi-Chun Lee, Athanasios Katsamanis, Matthew Black, Brian Baucom,
Panayiotis Georgiou, and Shrikanth Narayanan, An analysis of PCA-based
vocal entrainment measures in married couples' affective spoken interactions
Thank you!
Questions?