The Role of Subcortical Encoding in Accounting for Speech Perception in Steady-state and Amplitude-modulated Noise

Tim Schoof, Stuart Rosen
UCL Speech Hearing and Phonetic Sciences, 2 Wakefield Street, London WC1N 1PF.
t.schoof@ucl.ac.uk
Background
Speech perception in noise improves when the masker fluctuates in amplitude over time [1]. This can be attributed to listeners' ability to 'dip listen' (or 'glimpse'). Neural phase-locking may be important for exploiting dips in the masker [2].

Phase-locked neural activity to speech, as measured by the frequency following response (FFR), has indeed been linked to speech perception in noise [3]. However, there is some controversy about its exact role, especially in dip listening [2,4].

Furthermore, cognitive processes may be at least as important as auditory processes for speech perception in noise [5].
Research questions
• What are the relative contributions of auditory and cognitive factors to speech perception in steady-state and amplitude-modulated noise?
• What is the role of subcortical encoding in the perception of speech in noise and dip listening?
• How do amplitude fluctuations in the masker affect subcortical encoding of speech?
Fig. 3 Grand average FFRs to the vowel /ɑ/ in quiet (red), steady-state noise (black), and amplitude-modulated noise (blue).
Fig. 2 Spectrograms of grand average FFRs to the vowel /ɑ/ in quiet (top), amplitude-modulated noise (middle), and steady-state noise (bottom).

Methods
Participants
19 normal-hearing young adults (19–29 yrs).
Pure-tone thresholds ≤ 25 dB HL at octave frequencies between 125 Hz and 6 kHz; normal click ABR.
Fig. 4 Power spectra of the FFRs in quiet (red), steady-state noise (black), and amplitude-modulated noise (blue).
Speech perception in noise
IEEE sentences low-pass filtered at 6 kHz.
Maskers
• Steady state speech-shaped noise (SSN)
• Speech-shaped noise sinusoidally amplitude-modulated at 10 Hz (AMN)
Speech Reception Thresholds (SRTs) were determined adaptively, tracking the 50% correct point.
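The adaptive rule is not given on the poster; as an illustration only, here is a minimal Python sketch of a 1-up/1-down staircase on SNR, which converges on the 50% correct point (the `present_trial` callback, starting SNR, step size, and number of reversals are all hypothetical):

```python
def measure_srt(present_trial, start_snr=0.0, step_db=2.0, n_reversals=8):
    """1-up/1-down staircase: present_trial(snr_db) plays a sentence at that SNR
    and returns True if the listener repeats it correctly."""
    snr, direction, reversals = start_snr, None, []
    while len(reversals) < n_reversals:
        correct = present_trial(snr)
        new_direction = -1 if correct else +1       # harder after correct, easier after wrong
        if direction is not None and new_direction != direction:
            reversals.append(snr)                   # record the SNR at each reversal
        direction = new_direction
        snr += new_direction * step_db
    return sum(reversals[2:]) / len(reversals[2:])  # SRT = mean SNR over the later reversals
```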
Fig. 5 Boxplots of response magnitude (left) and stimulus-to-response lag (right) in quiet, steady-state noise, and amplitude-modulated noise.
Steady-state and amplitude-modulated noise degrade speech encoding
Mixed effects models with participant as random factor:
• Longer stimulus-to-response lag in noise (t(29) = 8.03, p < .001).
• Reduced spectral magnitude at the second harmonic (t(29) = -7.47, p < .001) and third harmonic (t(29) = -10.07, p < .001) in noise.
• Reduced response magnitude in noise (t(29) = -3.44, p = .001).
• Delayed response in steady-state noise compared to quiet (2.3 ms; paired t-test: t(13) = -3.2, p = .007).
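The poster does not state how these measures were computed; below is a minimal NumPy sketch of one conventional approach, assuming equal-length stimulus and averaged-response waveforms at a common sampling rate, with response magnitude taken as the RMS of the response, the lag as the delay maximising the stimulus-to-response cross-correlation, and spectral magnitudes read off an FFT at F0 (160 Hz) and its harmonics:

```python
import numpy as np

def ffr_measures(stim, ffr, fs, f0=160.0, max_lag_ms=15.0):
    """Response magnitude (dB RMS), stimulus-to-response lag (ms), harmonic magnitudes (dB)."""
    rms_db = 20 * np.log10(np.sqrt(np.mean(ffr ** 2)))

    # Lag: delay (0..max_lag_ms) that maximises the normalised cross-correlation.
    max_lag = int(max_lag_ms * fs / 1000)
    lags = np.arange(max_lag + 1)
    xcorr = [np.corrcoef(stim[:len(stim) - l], ffr[l:len(stim)])[0, 1] for l in lags]
    lag_ms = lags[int(np.argmax(xcorr))] * 1000 / fs

    # Spectral magnitude at F0 and its 2nd and 3rd harmonics.
    spec = np.abs(np.fft.rfft(ffr)) / len(ffr)
    freqs = np.fft.rfftfreq(len(ffr), 1 / fs)
    harm_db = {h: 20 * np.log10(spec[np.argmin(np.abs(freqs - h * f0))]) for h in (1, 2, 3)}
    return rms_db, lag_ms, harm_db
```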
Frequency Following Responses
Procedure
Recorded across Cz – C7
6000 sweeps
Stimuli presented binaurally at 80 dB SPL
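How the 6000 sweeps were averaged is not described here; a minimal NumPy sketch, assuming a continuous single-channel recording, stimulus-onset sample indices, and a simple amplitude-based artifact-rejection criterion (epoch length and threshold are illustrative, not taken from the poster):

```python
import numpy as np

def average_ffr(eeg, triggers, fs, epoch_ms=130.0, reject=35.0):
    """Epoch the continuous recording at each trigger and average the clean sweeps."""
    n = int(epoch_ms * fs / 1000)                               # samples per sweep
    sweeps = np.array([eeg[t:t + n] for t in triggers if t + n <= len(eeg)])
    clean = sweeps[np.max(np.abs(sweeps), axis=1) < reject]     # drop artifactual sweeps
    return clean.mean(axis=0)                                   # averaged FFR for this listener
```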
Stimuli
Synthesized 100-ms vowel /ɑ/, F0 = 160 Hz:
• Quiet
• SSN at 7 dB SNR
• AMN (Fig. 1): 7 dB SNR at the peak of the masker
Fig. 1 Stimulus timing in continuously presented amplitude-modulated speech-shaped noise.
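The masker synthesis is not detailed on the poster; the following is a minimal NumPy sketch in which the speech shaping is only a crude stand-in (flat to 500 Hz, then roughly -6 dB/octave) and the vowel is scaled to sit 7 dB above the masker level at the modulation peak:

```python
import numpy as np

def make_amn(n_samples, fs, am_rate=10.0):
    """Speech-shaped noise, sinusoidally amplitude-modulated at am_rate Hz."""
    spec = np.fft.rfft(np.random.randn(n_samples))
    freqs = np.fft.rfftfreq(n_samples, 1 / fs)
    shaping = np.where(freqs <= 500, 1.0, 500.0 / np.maximum(freqs, 1.0))  # crude speech-spectrum stand-in
    ssn = np.fft.irfft(spec * shaping, n_samples)
    t = np.arange(n_samples) / fs
    envelope = 0.5 * (1 + np.sin(2 * np.pi * am_rate * t))                 # sinusoidal envelope, 0..1
    return ssn * envelope

def scale_to_snr_at_peak(vowel, masker_peak_rms, snr_db=7.0):
    """Scale the vowel so its RMS is snr_db above the masker RMS at the modulation peak."""
    target_rms = masker_peak_rms * 10 ** (snr_db / 20)
    return vowel * target_rms / np.sqrt(np.mean(vowel ** 2))
```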
Temporal processing
• Gap detection: 3000-Hz wide bands of noise (1–4 kHz); see the sketch below.
• AM detection: 3000-Hz wide bands of noise (1–4 kHz); five (sinusoidal) AM rates: 10, 20, 40, 80, 160 Hz.
• FM detection: 1 kHz sinusoidal carrier modulated at 2 Hz.
Cognitive processing
• Working memory: Reading Span Test (Rudner et al., 2011).
• Attention: Visual Elevator task, part of the Test of Everyday Attention (Robertson et al., 1996).
• Processing speed: Letter Digit Substitution Test (Van der Elst et al., 2006).
• Text Reception Threshold: visual analogue of the speech perception in noise task (Zekveld et al., 2007).

Fig. 6 Boxplots of response magnitude at the peak and trough of the amplitude-modulated masker.
Fig. 7 Power spectra of the FFR at the peak (blue) and trough (red) of the amplitude-modulated masker.

Speech is differently encoded at peak and trough of amplitude-modulated masker
Mixed effects models with participant as random factor:
• Reduced spectral magnitude at the third harmonic at the peak of the masker (t(15) = -5.11, p < .001).
• Reduced response magnitude at the peak of the masker (t(15) = -4.5, p < .001).
• However, increased spectral magnitude at F0 at the peak of the masker (t(15) = 4.56, p < .001).
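The poster reports mixed-effects models with participant as a random factor; below is a minimal sketch of one way such a model could be fitted in Python with statsmodels, assuming a long-format pandas DataFrame `df` with hypothetical columns `magnitude`, `condition` (peak vs. trough), and `participant`:

```python
import statsmodels.formula.api as smf

def fit_condition_model(df):
    """Random-intercept model: response magnitude as a function of masker condition."""
    model = smf.mixedlm("magnitude ~ condition", df, groups=df["participant"])
    return model.fit()

# result = fit_condition_model(df); print(result.summary())
```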
Auditory and cognitive skills do not predict speech perception in noise abilities
Pearson’s correlations
• No relationship between SRTs (averaged across SSN and AMN) and any of the electrophysiological or behavioural measures.
• No relationship between dip listening ability (SRT in SSN minus SRT in AMN) and any of the electrophysiological or behavioural measures.
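A minimal sketch of the correlational analysis with scipy, assuming one score per listener for the SRT measure and for each candidate predictor (variable names are hypothetical):

```python
from scipy.stats import pearsonr

def correlate_with_srt(srt, predictors):
    """Pearson correlation (r, p) between SRTs and each candidate predictor.

    srt: one value per listener (e.g. SRT averaged across SSN and AMN, or the
    dip-listening benefit, SRT in SSN minus SRT in AMN).
    predictors: dict mapping measure names to per-listener score arrays.
    """
    return {name: pearsonr(srt, scores) for name, scores in predictors.items()}
```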
Conclusions
• Amplitude fluctuations in the masker become apparent at the level of the brainstem: the speech signal is more robustly encoded in the FFR at the trough than at the peak of the masker.
• Subcortical encoding cannot explain the variability in SRTs. This casts doubt on the exact role of phase-locking in dip listening and in speech perception in noise more generally.
• However, neither auditory nor cognitive processes predicted speech perception in noise or dip listening skills, which suggests that the variability in this group of young normal-hearing listeners may simply be too small to reveal such relationships.
References
1. Miller, G. A. and Licklider, J. C. R. (1950). The Intelligibility of Interrupted Speech. The Journal of the Acoustical Society of America, 22(2),167-173.
2. Lorenzi, C., Gilbert, G., Carn, H., Garnier, S., and Moore, B. C. J. (2006). Speech perception problems of the hearing impaired reflect inability to use temporal fine structure. PNAS, 103(49), 18866-18869.
3. Song, J. H., Skoe, E., Banai, K., and Kraus, N. (2011). Perception of speech in noise: neural correlates. Journal of cognitive neuroscience, 23(9), 2268-2279.
4. Moore, B. C. J. (2012). The importance of temporal fine structure for the intelligibility of speech in complex backgrounds. In Dau, T., Dalsgaard, J., Jepsen, M., and Poulsen, T., editors, Speech Perception and Auditory Disorders,
pages 21-32. Centertryk A/S, Denmark.
5. Davis, M. H. and Johnsrude, I. S. (2007). Hearing speech sounds: top-down influences on the interface between audition and speech perception. Hearing research, 229(1-2),132-47.
6. Rudner, M., Rönnberg, J., and Lunner, T. (2011). Working memory supports listening in noise for persons with hearing impairment. Journal of the American Academy of Audiology, 22(3), 156-167.
7. Robertson, I.H., Ward, T., Ridgeway, V., and Nimmo-Smith, I. (1996). The structure of normal human attention: The Test of Everyday Attention. Journal of the International Neuropsychological Society, 2, 525 – 534.
8. Van der Elst, W., van Boxtel, M.P.J, van Breukelen, G.J.P., and Jolles, J. (2006). The Letter Digit Substitution Test: Normative data for 1,858 healthy participants aged 24 – 81 from the Maastricht Aging Study (MAAS): Influence of
Age, Education, and Sex. Journal of Clinical and Experimental Neuropsychology, 28(6), 998 – 1009.
9. Zekveld, A.A., George, E.L.J., Kramer, S.E., Goverts, S.T., and Houtgast, T. (2007). The development of a text reception threshold test: A visual analogue of the speech reception threshold test. Journal of Speech, Language, and
Hearing Research, 50, 576 – 584.