Monkey See, Monkey Do, Monkey... Talk?

by Helen Zou
July 23, 2010
Page 1
Using Vocalization Features to Identify Ethanol Intoxication in Rhesus Macaques
Page 2
Overview
• Introduction
• Background
– Rhesus Macaques
– Speech processing
• Literature review
– Previous findings in humans
– Macaque vocalizations
• Experiment procedure
• Data analysis
– Segmentation and clustering
– Extracting features
• Results
• Acknowledgments
Page 3
Introduction
• Duke University – Class of 2013
• Biomedical Engineering major and Neuroscience minor
• Emailed Dr. Grant because of her work with primates and neuroscience
• Vocalization project
• Worked at both ONPRC and OGI
• Not under any specific program, except…
• Had to give a presentation anyway
Page 4
Background – Rhesus Macaques
• Alcohol drug discrimination and self-administration
• Predictors of heavy drinking (dominance-related?)
• BEC (Blood Ethanol Concentration)
• Need a simpler way to measure intoxication in social settings
• Why not look at speech?
Page 5
Background – Speech Processing
• Voiced, unvoiced, and noise
• For monkeys, we focused on voiced (coos and screams)
• Potential features
– Frequency and pitch
– Shimmer (amplitude) and jitter (pitch)
– Spectral entropy
– Root mean square (energy)
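The features listed above can be sketched in a few lines of NumPy. This is a minimal illustration using common textbook definitions, not the code used in the study: the function names, the normalization of spectral entropy to the range 0 to 1, and the "relative mean absolute difference" form of jitter and shimmer are all assumptions.

```python
import numpy as np

def rms_energy(frame):
    """Root mean square of the samples in one frame (a proxy for energy)."""
    frame = np.asarray(frame, dtype=float)
    return float(np.sqrt(np.mean(frame ** 2)))

def spectral_entropy(frame):
    """Shannon entropy of the normalized power spectrum, scaled to [0, 1].

    Values near 0 mean the energy sits in a few frequency bins (tonal);
    values near 1 mean it is spread evenly (noise-like).
    """
    frame = np.asarray(frame, dtype=float)
    power = np.abs(np.fft.rfft(frame)) ** 2
    total = power.sum()
    if total == 0.0:
        return 0.0  # silent frame: entropy is undefined, report 0 by convention
    p = power / total
    nonzero = p[p > 0]
    entropy = -np.sum(nonzero * np.log2(nonzero))
    return float(entropy / np.log2(len(p)))

def jitter(periods):
    """Cycle-to-cycle variation of the pitch period, relative to the mean period."""
    periods = np.asarray(periods, dtype=float)
    return float(np.mean(np.abs(np.diff(periods))) / np.mean(periods))

def shimmer(amplitudes):
    """Cycle-to-cycle variation of the peak amplitude, relative to the mean amplitude."""
    amplitudes = np.asarray(amplitudes, dtype=float)
    return float(np.mean(np.abs(np.diff(amplitudes))) / np.mean(amplitudes))
```

Jitter and shimmer take per-cycle pitch periods and peak amplitudes as input, so they presuppose a separate pitch-tracking step.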
Page 6
Sample Wave Form
Page 7
Sample Voiced Region
Page 8
Sample Noise
Page 9
Sample Background
Page 10
Overview
• Introduction
• Background
– Rhesus Macaques
– Speech processing
• Literature review
– Previous findings in humans
– Exxon Valdez case
– Macaque vocalizations
Page 11
Prior Studies – Klingholz
Recognition of low-level alcohol intoxication from speech signal (1988)
• Approach recognition of intoxication as speaker identification task
• Measure laryngeal and articulatory features
– Laryngeal – fundamental frequency (F0) and signal-to-noise ratio (SNR)
– Articulatory – formants (F1/F2 ratio)
• Major findings
– Increased F0 variation
– Decreased SNR
– Did not change F1/F2
• Limitation: small sample size
• Much more accurate than human recognition
Page 12
Prior Studies – Hollien
Effects of ethanol intoxication on speech suprasegmentals (2001)
• Measured several different features
– Nonfluency increase is best measure
– F0 increases and utterance duration increases (moderate measure)
– F0 variability slightly increases (poor measure)
– Vocal intensity had no change
• 20% of subjects exhibited no consistent changes
• Unfortunately, disagrees with the previous findings
Page 13
Exxon Valdez Court Case
Acoustic Analysis of Voice Recordings from the Exxon Valdez by J. Tanford et al. (1992)
• Oil tanker crashed in Alaska in 1989
• Captain of ship denied intoxication
• Analysis of speech found:
– Misspoken words
– Slurred pronunciations
– Slower speaking rate
– Lower pitch
– Increased F0 variability
• Characteristics were consistent with intoxication
Page 14
Previous Study – Weerts
Primate vocalizations during social separation and aggression: effects of alcohol and benzodiazepines (1996)
• Focused on testing the effect of different social situations
– Social separation: EtOH reduced isolation peeps
– Aggression: EtOH increased aggression peeps
• Social context determines effect of drugs (potential confounding variable?)
Page 15
Summary of Previous Work
• Experiments done on the effect of intoxication on human speech have inconsistent findings
• Very few studies actually done on macaque vocalizations
• Many uncontrolled variables (long-term voice effort, social context, etc.)
• Definitely some effect of ethanol intoxication on speech features
Page 16
The Question
• Will the vocalizations of monkeys change when intoxicated versus when sober?
Page 17
Methods
• Put recorders on the monkeys
• Gavage with water or alcohol (alternating)
• Measure BECs in one hour
• Take off recorders
• Analyze data for various features
• Identify differences in vocalization
• Draw conclusions from data and voila!
• But in reality…
Page 18
Problems
1. Exceeding recorder threshold
2. Not enough vocalizations
Solutions
1. Attenuate with rubber and foam
2. Switch to more vocal monkey
Page 19
Clementine Example Waveform
Page 20
Data Analysis?
• Recordings had vocalizations, noise, silence, other monkeys, etc.
• How would we isolate the monkey of interest?
Page 21
Sample Spectrum (figure panels: Vocalizations, Noise)
Page 22
Clementine Example Spectrum
Page 23
Data Analysis
1. Cut the wave file into smaller segments
2. Isolate vocalization parts of speech
3. Extract features for vocalization regions
4. Compare features for intoxicated versus sober speech
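Steps 1 and 2 above can be illustrated with a simple energy threshold. This is only a sketch under stated assumptions, not the segmentation method used in the study: it assumes the recording is mostly silence or background noise, so the median frame energy approximates the noise floor. The names `frame_signal` and `vocal_frame_mask` and the default frame length, hop, and ratio are hypothetical.

```python
import numpy as np

def frame_signal(signal, frame_len, hop):
    """Cut a 1-D signal into overlapping frames of frame_len samples."""
    signal = np.asarray(signal, dtype=float)
    if len(signal) < frame_len:
        raise ValueError("signal is shorter than one frame")
    n_frames = 1 + (len(signal) - frame_len) // hop
    starts = np.arange(n_frames) * hop
    return np.stack([signal[s:s + frame_len] for s in starts])

def vocal_frame_mask(signal, frame_len=400, hop=200, ratio=4.0):
    """Flag frames whose RMS energy exceeds `ratio` times the median frame energy.

    Assumption: most frames are background, so the median is a noise-floor estimate.
    """
    frames = frame_signal(signal, frame_len, hop)
    energy = np.sqrt(np.mean(frames ** 2, axis=1))
    noise_floor = np.median(energy)
    return energy > ratio * noise_floor
```

A threshold like this cannot tell the monkey of interest apart from other loud sources, which is why a speaker-style segmentation and clustering step is still needed.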
Page 24
Segmentation/Clustering
Robust Speaker Change Detection by J. Ajmera et al. (2003)
• Originally created for separating speakers in news broadcasts
• Find likely change points
• Segment data with overlapping frames
• Cluster similar segments (by speaker)
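To make "find likely change points" concrete, here is a generic BIC-style test on a sequence of per-frame feature vectors. It is a swapped-in textbook technique and may differ from the criterion in the cited paper: it models each side of a candidate split with a diagonal-covariance Gaussian and asks whether two models explain the data better than one. The function names, the `margin` parameter, and the `penalty_weight` parameter are assumptions for illustration.

```python
import numpy as np

def _log_det_diag(x):
    """Log-determinant of a diagonal covariance estimated from rows of x."""
    return float(np.sum(np.log(np.var(x, axis=0) + 1e-8)))

def delta_bic(features, t, penalty_weight=1.0):
    """BIC gain from splitting the sequence at frame t (positive favors a change)."""
    x = np.asarray(features, dtype=float)
    n, d = x.shape
    gain = 0.5 * (n * _log_det_diag(x)
                  - t * _log_det_diag(x[:t])
                  - (n - t) * _log_det_diag(x[t:]))
    # The split model has one extra mean and one extra variance per dimension.
    penalty = 0.5 * penalty_weight * (2 * d) * np.log(n)
    return gain - penalty

def best_change_point(features, margin=10, penalty_weight=1.0):
    """Return (frame index, score) of the most likely single change point."""
    x = np.asarray(features, dtype=float)
    candidates = range(margin, len(x) - margin)
    scores = [delta_bic(x, t, penalty_weight) for t in candidates]
    k = int(np.argmax(scores))
    return candidates[k], float(scores[k])
```

In a full pipeline this test would be run over a sliding window, and the resulting segments would then be clustered so that segments from the same source end up together.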
Page 25
Segmentation and Clustering
Page 26
Data Analysis
1. Cut the wave file into smaller segments
2. Isolate vocalization parts of speech
3. Extract features for vocalization regions
4. Compare features for intoxicated versus sober speech
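For step 3, fundamental frequency is one feature that can be extracted from a voiced frame. The sketch below uses the autocorrelation method, a standard approach that is assumed here rather than taken from the study. The function name `estimate_f0` and the search range `fmin` to `fmax` are hypothetical and would need tuning for macaque coos and screams.

```python
import numpy as np

def estimate_f0(frame, sample_rate, fmin=100.0, fmax=2000.0):
    """Estimate F0 (Hz) of a voiced frame from the peak of its autocorrelation."""
    frame = np.asarray(frame, dtype=float)
    frame = frame - frame.mean()
    # Keep non-negative lags only; index 0 is lag 0.
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lag_min = max(1, int(sample_rate / fmax))
    lag_max = min(int(sample_rate / fmin), len(ac) - 1)
    # The strongest peak inside the allowed lag range gives the pitch period.
    lag = lag_min + int(np.argmax(ac[lag_min:lag_max + 1]))
    return sample_rate / lag
```

The estimate is quantized to whole-sample lags, so resolution drops at high frequencies; interpolating around the peak is a common refinement.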
Page 27
Spectrum – Human vs. Monkey
Page 28
Results – Human vs. Monkey
• Bandwidth of formants in monkey vocalizations is larger than for humans
• Humans have more formants (5+), monkeys have far fewer (2–4)
• Distance between the formants for monkeys is much larger than between human formants
• Shape of formants is curved for screams and straight for coos
Page 29
Spectrum – Human vs. Monkey (figure panels: Human, Noise, Coo, Scream)
Page 30
Results - F0 graphs
Page 31
Results – Alcohol vs. Water
• Graphed all of the features
• F0 as x-variable produced most significant results
• F0 tends to be higher during intoxication
Page 32
Results – Rms vs. f0
Page 33
Results
• Root mean square (energy) vs. fundamental frequency
• Control vocalizations have larger variation in energy
• Intoxication has higher F0
Results – Rms vs. Spec entropy
Page 35
Results
• Spectral entropy vs. F0
• Control vocalizations have larger variation in spectral entropy
• Intoxication has higher F0
Page 36
Results
• Alcohol increases fundamental frequency (agrees with Hollien study)
• Alcohol decreases variation in energy and spectral entropy
• Consistent with alcohol impairing muscle control of vocal cords
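The two kinds of comparison summarized above, a shift in the mean of a feature and a change in its spread, can be expressed as follows. This is a minimal sketch with hypothetical function names, and it makes no attempt at significance testing.

```python
import numpy as np

def summarize(values):
    """Mean and sample standard deviation of one feature in one condition."""
    values = np.asarray(values, dtype=float)
    return {"mean": float(values.mean()), "std": float(values.std(ddof=1))}

def compare_conditions(control, alcohol):
    """Compare a feature between control (water) and alcohol sessions.

    mean_shift > 0 means the feature is higher under alcohol;
    std_ratio < 1 means it varies less under alcohol.
    """
    c, a = summarize(control), summarize(alcohol)
    return {"mean_shift": a["mean"] - c["mean"],
            "std_ratio": a["std"] / c["std"]}
```

Applied to per-vocalization F0 values, a positive `mean_shift` would correspond to the reported F0 increase; applied to energy or spectral entropy, a `std_ratio` below 1 would correspond to the reported drop in variation.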
Page 37
Limitations
• Very small sample size
• Limited number of vocalizations
• Lots of silence and noise in recordings
• BEC was low (between .017 and .044)
• Monkeys were separated – may have different results in social setting
• Only paired comparisons
Page 38
In the Future
• Further study correlations between different vocalization features and intoxication
• Use recordings to correlate with other factors (such as stress, dominance, etc.)
• Find ways to increase vocalizations
• Pair vocal recordings with visual tracking
• Measure ethanol intake using vocalizations in social settings
• Expand studies to other species
Page 39
Conclusion
• Added to the studies done on macaque vocalizations
• Used computer algorithms to separate and analyze data
• Found that formants are a good way to separate human and monkey vocalizations
• Alcohol increases F0 and decreases variability of energy and spectral entropy
• Eventually use vocalizations to measure intoxication in macaques in social settings
Page 40
Acknowledgments
• Dr. Kathy Grant
• Dr. Izhak Shafran
• The Grant Lab (Kevin Nusser, Andrew Rau, Jessica Shaw, and Cara Candell)
• Meysam Asgari
• OGI and ONPRC staff and coworkers
Page 41
Questions?
Page 42
Prior Studies - Klingholz
• Approach recognition of intoxication as speaker identification task
• 11 human test subjects and 5 controls
• Read a text segment in German
• Measure laryngeal and articulatory features
– Laryngeal – fundamental frequency (F0) and signal-to-noise ratio (SNR)
– Articulatory – formants (F1/F2 ratio)
• Intoxication results
– Increased F0 variation
– Decreased SNR
– Did not change F1/F2
• Correlation between BAL and F0
• Long-term voice effort has similar effect
• Much more accurate than human recognition
Page 43
Prior Studies - Hollien
• Speech samples at four levels of intoxication
• 35 human subjects
• Results
– Nonfluency increase is best measure
– F0 increases and utterance duration increases (moderate measure)
– F0 variability increases (poor measure)
– Vocal intensity had no change
• 20% of subjects exhibited no consistent changes
Page 44
Prior Studies - Weerts
• 33 squirrel monkeys in two different social situations
• Social separation: EtOH reduced isolation peeps
• Aggression: EtOH increased aggression peeps
• Social context determines effect of drugs
Page 45