Latency of evoked neuromagnetic M100 reflects perceptual and acoustic stimulus attributes

Timothy P. L. Roberts (Corresponding Author), Paul Ferrari and David Poeppel
Biomagnetic Imaging Laboratory, Box 0628, Department of Radiology, UCSF, 513 Parnassus Avenue, San Francisco, CA 94143, USA

NeuroReport 9, 3265–3269 (1998); website publication 16 October 1998
The latency of components of the auditory evoked neuromagnetic field has been shown to reflect, or encode, stimulus attributes. In particular, the M100 component, occurring ~100 ms post stimulus onset, has a latency that depends on stimulus pitch, spectral complexity and presentation level. This study used magnetoencephalography to record neuromagnetic fields evoked by presentation of two-tone complexes consisting of various proportions of 100 Hz and 1 kHz energy. These are perceived categorically, as evidenced by classification and reaction time measurements. It is found that the M100 latency also varies categorically, that is, it is characterized by two plateau regions with a sharp interface. Thus, we find that the M100 latency reflects not only acoustic attributes of a stimulus but also such perceptual characteristics.
Key words: Categorical perception; Complex tones; M100; Magnetoencephalography; Timing; Temporal encoding
Introduction
For sinusoidal tone stimuli it has been observed that the latency of the auditory evoked M100 peak, detected by magnetoencephalography (MEG), encodes the frequency of the tone stimulus. Low
frequency tones (~100 Hz) are associated with latencies ~30 ms later than corresponding high frequency
tones (~1 kHz) across a wide range of signal amplitudes.1,2
Subsequent studies have demonstrated that the
shift in the M100 latency can be attributed to the
spectral energy distribution of the stimulus sound.
While the spectral center of gravity or mean frequency component plays a dominant role in determining the ultimate M100 latency, substructural
features of the spectrum, e.g. modulation frequency or periodicity, have also been shown to have an influence.3–5
In an example comparing sinusoids with spectrally
more complex tones of the same fundamental component, it has been shown that while the latencies for
tones with 1 kHz fundamental component are rather
independent of spectral complexity, the latency
elicited by tones of low (100 Hz) fundamental
component depends strongly on the waveform of
the sound, with triangle waves and square waves
showing less M100 prolongation relative to sinusoids.
Interestingly, while the spectral centers of gravity of triangle and square waves of the same fundamental component are rather similar (~3F0), the M100
latency prolongation of tones with 100 Hz fundamental component (relative to tones with 1000 Hz
fundamental component) was more marked for the
triangle waves than for the square waves. One
hypothesized explanation for this lies in the overtone separation for the two waveforms: for triangles all harmonics are present (thus the overtone separation is equal to F0); for square waves only odd harmonics are present (thus the overtone separation is equal to 2F0). Thus the structure of the spectral energy distribution can be seen to be significant.3
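To make the spectral measures used in this argument concrete, the following sketch (Python with NumPy/SciPy; an illustration of ours, not part of the original analysis, which is described in ref. 3) computes an amplitude-weighted spectral center of gravity for sine, triangle and square waves of the same fundamental component. The value obtained depends on the weighting and bandwidth chosen, so the ~3F0 figure quoted above should be read as approximate.

    # Illustrative sketch: amplitude-weighted spectral center of gravity of naively
    # sampled waveforms (aliasing ignored); sampling rate and duration are assumptions.
    import numpy as np
    from scipy.signal import sawtooth, square

    fs = 16000                      # assumed sampling rate (Hz)
    f0 = 100                        # fundamental component (Hz)
    t = np.arange(0, 1.0, 1.0 / fs)

    def spectral_centroid(x, fs):
        spectrum = np.abs(np.fft.rfft(x))              # amplitude spectrum
        freqs = np.fft.rfftfreq(len(x), 1.0 / fs)
        return np.sum(freqs * spectrum) / np.sum(spectrum)

    for name, wave in [("sine", np.sin(2 * np.pi * f0 * t)),
                       ("triangle", sawtooth(2 * np.pi * f0 * t, width=0.5)),
                       ("square", square(2 * np.pi * f0 * t))]:
        print(name, round(spectral_centroid(wave, fs), 1), "Hz")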
Further, amplitude modulated (AM) tones have
been used to probe the phenomenon of stimulus
attribute encoding in the M100 latency.4,6 A carrier
frequency of 1000 Hz and a modulation frequency
of 100 Hz at 200% modulation depth (suppressed
carrier modulation tone) gives rise to a waveform
which can be described in terms of equal amplitude
spectral lines at 900 Hz, 1000 Hz and 1100 Hz.
These frequencies independently all fall within the
high frequency plateau of the initial M100 observations and would be expected to give rise to relatively
short M100 latencies, at ~100 ms. However, these
stimuli give rise to the perception of the modulation
frequency, analogous to the missing fundamental
effect.7 That is, despite the absence of low (~100 Hz)
spectral energy in the waveform, there is nonetheless a clear perception of a low frequency
element. As a result, M100 latencies were found to be prolonged relative to those arising from the corresponding pure sinusoidal signal (presented at the
carrier frequency). While this raises the possibility
of a perceptual relevance in the M100 latency
encoding, it can be easily explained by arguing that
the spectral substructure, particularly periodicity,
further conditions the response latency subsequent
to its initial determination by the spectral center of
gravity (analogous to the above arguments for
triangle vs square waveforms of low fundamental
frequency).
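The decomposition invoked above can be verified numerically. The sketch below (Python/NumPy; an illustration of ours with an assumed sampling rate, not the original stimulus synthesis) builds a 1 kHz carrier modulated at 100 Hz with 200% modulation depth and recovers the three equal-amplitude spectral lines at 900, 1000 and 1100 Hz.

    # Illustrative sketch: 200% amplitude modulation of a 1 kHz carrier at 100 Hz
    # yields equal-amplitude spectral lines at 900, 1000 and 1100 Hz.
    import numpy as np

    fs = 16000                                   # assumed sampling rate (Hz)
    t = np.arange(0, 1.0, 1.0 / fs)              # 1 s of signal, an integer number of cycles
    carrier, fm, depth = 1000.0, 100.0, 2.0      # 200% modulation depth

    am = (1.0 + depth * np.cos(2 * np.pi * fm * t)) * np.cos(2 * np.pi * carrier * t)

    spectrum = np.abs(np.fft.rfft(am)) / len(t)
    freqs = np.fft.rfftfreq(len(t), 1.0 / fs)
    print(freqs[spectrum > 0.2])                 # the three dominant lines: 900, 1000, 1100 Hz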
Similar experiments with speech stimuli, specifically vowels, have demonstrated a main effect of
M100 latency variation with phoneme identity (i.e.
/a/ versus /u/), with only a sub-effect of speaker
pitch.8,9 This too has an acoustic explanation: the
phonetic identity of the vowel is largely determined
by the position of first formant energy (F1 = 310 Hz
for /u/, F1 = 710 Hz for /a/) and this energy band
dominates the spectrum in intensity terms.
Thus, while the M100 latency can be seen to be
sensitive to a variety of acoustic stimulus attributes,
with musical and linguistic relevance, it is still
unclear whether the M100 latency encoding mechanism relates to or mediates perceptual representations
independent of signal properties, or whether it simply
reflects the accumulation of acoustic stimulus parameters, combined with a simple spectral analysis –
elements that are integral to, but precede, further perceptual elaboration.
To what extent does the M100 latency effect
reflect perceptual classification that departs from
acoustic properties? In this study we investigated
the effect on M100 latency of additive mixes of
100 Hz and 1000 Hz tone components of various
relative amplitudes. A series of two-tone complex stimuli was generated, each with a different amplitude contribution of high and low frequency elements. Psychophysical data show a manifestation
of categorical perception.10 That is, presented with
a stimulus continuum varying along one axis
(relative amplitude of 100 vs 1000 Hz), a two-state
representation emerges, with only a narrow step
that correlates with perceptual ambiguity. A pure
spectral acoustics argument predicts a smoothly
varying M100 latency derivable from the relative
contributions of low frequency (long latency) and
high frequency (short latency) elements. We thus
addressed the question: as the relative amplitude contribution of each component varies, does the resultant M100 latency vary (a) linearly with dB bias or (b) categorically with perception? If a linear superposition model is invoked, one might expect a two-tone complex to yield an effective M100 latency that reflects its two sub-elements in proportion to their relative amplitude (or dB bias). If, on the other hand, perceptual classification becomes increasingly independent of stimulus features, one might anticipate a
more categorical result corresponding to two M100
latency plateaus and only a sharp interface region.
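The linear superposition alternative can be made explicit. The following illustration (Python; our formalization, not the authors' model, with illustrative plateau latencies of ~100 and ~118 ms) weights each component's plateau latency by its relative amplitude and yields a smoothly graded prediction, in contrast to the two-plateau pattern reported below.

    # Illustrative sketch: latency predicted by a simple linear superposition model.
    # bias_db > 0 means the 100 Hz component is the more intense; plateau values are
    # illustrative, chosen only to mirror the ~18 ms dynamic range reported here.
    import numpy as np

    def linear_prediction(bias_db, lat_high_ms=100.0, lat_low_ms=118.0):
        w_low = 1.0 / (1.0 + 10 ** (-bias_db / 20.0))   # amplitude weight of the 100 Hz tone
        return w_low * lat_low_ms + (1.0 - w_low) * lat_high_ms

    print([round(linear_prediction(b), 1) for b in (-40, -6, 0, 6, 40)])
    # smoothly graded latencies, rather than two plateaus separated by a sharp interface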
Materials and Methods
Nine healthy volunteers were recruited for this study.
Stimulus presentation and magnetoencephalographic recording were performed with the approval of the institutional committee on human research. Informed
written consent was obtained from each subject.
Auditory stimuli consisting of two-tone complexes
were synthesized using LabView software to generate
mixes of various dB proportions (from pure 100 Hz
to pure 1000 Hz, with intermediate amplitude differences of ±40, ±20, ±10, ±6, ±4, ±2 and 0 dB). In all, a total of 15 stimuli were generated. Final presentation
levels were individually calibrated for each subject,
and stimuli were presented at 40 dB SL. Prior data
has shown that slight intensity variations around
this presentation level have little influence on M100
latency.2,11,12
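For illustration, the stimulus set can be reproduced as follows (Python/NumPy sketch; the study itself used LabView, and the stimulus duration and sampling rate here are assumptions not stated in the text).

    # Illustrative sketch: two-tone complexes with a given dB bias of the 100 Hz
    # component relative to the 1 kHz component (13 mixes plus the two pure tones).
    import numpy as np

    fs = 16000                                   # assumed sampling rate (Hz)
    t = np.arange(0, 0.4, 1.0 / fs)              # assumed 400 ms stimulus duration

    def two_tone(bias_db):
        # bias_db > 0: the 100 Hz component is bias_db dB more intense than the 1 kHz component
        low = 10 ** (bias_db / 20.0) * np.sin(2 * np.pi * 100.0 * t)
        high = np.sin(2 * np.pi * 1000.0 * t)
        mix = low + high
        return mix / np.max(np.abs(mix))         # normalized prior to calibration to 40 dB SL

    biases = [-40, -20, -10, -6, -4, -2, 0, 2, 4, 6, 10, 20, 40]
    stimuli = {b: two_tone(b) for b in biases}   # pure 100 Hz and 1 kHz tones complete the set of 15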
All stimuli were presented from a Mac Quadra 800 computer with a Digidesign II soundcard (Palo Alto, CA) and interleaved randomly using
PsyScope stimulus presentation software.13 Presentation to subjects was made monaurally (to the right
ear) via Eartone ER3A transducers and non-magnetic
‘air-tube’ delivery (Etymotic, Oak Brook, IL). The
other ear was plugged to attenuate ambient noise.
Subjects initially underwent a psychophysical
study. Presented with the tone complexes randomly interleaved, subjects were asked to perform a two-alternative forced choice task and detect greater 100 Hz or 1 kHz presence (i.e. predominance of low or high pitch) in the complex tone. Reaction time and stimulus classification were recorded. This study was performed under experimental conditions identical to those of the MEG recording component.
MEG recording was made using a 37-channel
biomagnetometer (MAGNES, Biomagnetic Technologies Inc., San Diego, CA) positioned over the
temporal lobe, contralateral to the stimulus presentation. Optimized positioning over auditory cortex
was established using a 1000 Hz test-tone and iterative sensor array repositioning until an anti-symmetrical evoked response pattern was observed, centered
on an axis passing through the central sensor of
the concentric array. Each of the 15 stimuli was presented 100 times at pseudo-random interstimulus intervals in the range 1–2 s, and 600 ms epochs of MEG data were collected around each event at a sampling rate of 520.8 Hz. Response epochs of a given trial type were averaged (time-locked to stimulus onset) to improve signal to noise ratio. The M100 latency was defined as the peak in root mean square (r.m.s.) detected field across sensor channels
occurring in the latency window from 80–150 ms.1,14
Source localization was performed on the M100 peak,
using a single equivalent dipole model, to confirm its
origin in auditory cortex.15
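For concreteness, the latency measure just defined can be sketched as follows (Python/NumPy; the array layout and the pre-stimulus portion of the 600 ms epoch are assumptions, since the original analysis was performed with the acquisition system's software).

    # Illustrative sketch: M100 latency as the peak of the r.m.s. field across the
    # 37 channels, restricted to the 80-150 ms post-stimulus window.
    import numpy as np

    fs = 520.8                                        # sampling rate of the recording (Hz)

    def m100_latency_ms(evoked, t0_ms=-100.0):
        # evoked: array of shape (n_channels, n_samples), the averaged epoch;
        # t0_ms: assumed epoch start time relative to stimulus onset.
        times_ms = t0_ms + 1000.0 * np.arange(evoked.shape[1]) / fs
        rms = np.sqrt(np.mean(evoked ** 2, axis=0))   # r.m.s. across channels
        in_window = np.where((times_ms >= 80.0) & (times_ms <= 150.0))[0]
        return times_ms[in_window[np.argmax(rms[in_window])]]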
Results
The behavioral data confirmed that the stimulus
continuum created by mixing signals of 100 Hz and
1 kHz generated categorical auditory percepts. The
psychophysical data (Fig. 1) reveal the two hallmarks
of categorical perception: first, subjects classified the
stimuli drawn from the unidimensional stimulus
continuum in a categorical manner (Fig. 1a), with
only a narrow range of dB differences in which determination was equivocal (between 25% false and 25%
correct); second, reaction time in the forced-choice
paradigm was sharply increased at the same point of
perceptual ambiguity (Fig. 1b). As shown below, this
perceptually equivocal region centered on a point of
ambiguity analogous to that determined from the
M100 latency and had a width corresponding to that of the inter-plateau M100 latency gradation.
All stimuli gave rise to auditory evoked responses
that could be characterized in terms of an M100 peak,
with an underlying modeled source in auditory
cortex. For the pure sinusoidal stimuli of 100 Hz and 1000 Hz, the low frequency M100 latency prolongation effect was reproducibly demonstrated, with a dynamic range (100 Hz to 1 kHz) of
18.1 ± 1.6 ms, generally consistent with previous
MEG data of responses to sinusoidal tone stimuli.1,2
FIG. 1. Psychophysical data from a typical subject. (a) Categorization, (b) reaction time. The horizontal axis plots the relative intensity (in
dB) of the 100 Hz component compared to the 1 kHz component. Two plateaus are seen in (a) with a sharp interface, reflecting categorical
perception of the tone complex. Correspondingly, reaction time peaks at a mixture with ambiguous perception (b).
FIG. 2. M100 latency determined from MEG recordings exhibits two
plateaus corresponding to high and low frequency dominance
respectively. The horizontal axis plots the relative intensity (in dB)
of the 100 Hz component compared with the 1 kHz component. Error
bars indicate s.d. across subjects. The ‘point of ambiguity’, that is, the point of intermediate latency, can be seen to occur at a bias of ~6 dB toward the low frequency contribution. This is in accordance with the relatively poorer signal transmission characteristics and subject hearing
sensitivity at 100 Hz compared with 1 kHz.
No significant difference in r.m.s. amplitude of the
M100 component of the detected field was observed
between stimuli. Evoked field amplitudes were 112.1
± 30.1 fT for 1 kHz stimuli compared with 93.2 ±
23.8 fT for 100 Hz tones. All stimuli gave rise to
r.m.s. detected fields between 89.5 ± 11.7 fT (+40 dB)
and 122.5 ± 29.8 fT (+10 dB). No trend was seen as
a function of varying relative amplitude of high and
low frequency content.
In accordance with the psychophysical data, the
M100 latency revealed a categorical step from the
1 kHz plateau to the 100 Hz plateau, modeled well
with a hyperbolic tangent function (r > 0.97). The M100 latency data showed a sigmoidal dependence on the dB mixture of the two-tone elements. The coefficient of the sigmoidal (hyperbolic tangent) shift was 0.07 ± 0.02, indicating a steep gradation from the 1000 Hz plateau to the 100 Hz plateau. The ‘point of ambiguity’, i.e. the point with median M100 latency, occurred at a mix in which 100 Hz was present at a
level 6.5 dB above the 1 kHz tone, in good agreement with perceptual threshold differences between
100 Hz and 1 kHz tones.2
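A hyperbolic tangent fit of the kind described above can be sketched as follows (Python/SciPy; the parameterization and the demonstration data are ours, with plateau latencies chosen only to mirror the reported ~18 ms dynamic range, slope coefficient and point of ambiguity).

    # Illustrative sketch: sigmoidal (hyperbolic tangent) fit of M100 latency vs dB bias.
    import numpy as np
    from scipy.optimize import curve_fit

    def latency_model(bias_db, lat_high, lat_low, k, bias0):
        # Two plateaus (1 kHz-dominant at lat_high, 100 Hz-dominant at lat_low) joined by
        # a tanh transition with slope coefficient k, centred on the 'point of ambiguity' bias0.
        return lat_high + 0.5 * (lat_low - lat_high) * (1.0 + np.tanh(k * (bias_db - bias0)))

    # Hypothetical demonstration data (NOT the measured values), generated from the model.
    rng = np.random.default_rng(0)
    bias_db = np.array([-40, -20, -10, -6, -4, -2, 0, 2, 4, 6, 10, 20, 40], dtype=float)
    latencies = latency_model(bias_db, 100.0, 118.0, 0.07, 6.5) + rng.normal(0, 0.5, bias_db.size)

    params, _ = curve_fit(latency_model, bias_db, latencies, p0=[100.0, 120.0, 0.1, 5.0])
    print("slope coefficient: %.2f per dB, point of ambiguity: %.1f dB" % (params[2], params[3]))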
Discussion
Analogous to previous questions raised about
the stimulus attribute dependence of the M100
latency,1,3–5,9 this study explored the sensitivity of
M100 latency encoding to stimulus attributes and
specifically addressed the issue of spectral versus
perceptual representations. By mixing tone contributions of 100 Hz and 1 kHz we generated complexes
which have a categorical subjective percept. Correspondingly, M100 latency was found to vary stepwise, or categorically, between plateaus associated
with high and low frequency, respectively. Thus, we conclude that, despite its early post-stimulus occurrence, the M100 latency is a sensitive reflector of the subjects’ categorical perception.
Although some prior data suggest that M100
latency is dominated by the spectral energy distribution of complex tones,4 it has been shown that
the M100 latency is further modulated according to
a secondary spectral analysis.3 This explanation
accounts for differing latency prolongations observed
with triangle and square wave stimuli of the same
fundamental component and similar spectral center
of gravity. Also, this explanation allows for the
prolongation of M100 latency in response to amplitude modulated tones relative to sinusoids of the
carrier frequency. Interestingly, this spectral substructure analysis corresponds to a subjective perception of the low modulation frequency. So, based
on data from triangle, square and AM tones, it is
unclear whether the M100 latency is determined by
an acoustic analysis (allowing contributions of both
center of gravity as well as substructure or periodicity) or by a perceptual processing stage which
recognizes a sound’s timbre or the presence of the
‘missing fundamental’, respectively.
The present data support the involvement of further perceptual processes in M100 latency determination, since the psychophysical
phenomenon of categorical perception is paralleled
by a categorical shift of M100 latency between
two extreme plateaus. The nature of the perceptual
specification of the electrophysiologic response
latency remains unclear, and the early appearance
(~100 ms) of such involvement may appear somewhat
surprising. Further investigation along these lines
must evaluate the relative importance of perceptual
versus purely acoustic contributions to the M100
component of the evoked response. Analogous investigation of earlier and later evoked response components (e.g. the M50/M60 and M200 peaks) may also
offer insight into the timing of the feedback and/or
spectral analysis processes. Magnetoencephalography offers the appropriate temporal resolution for such studies, while also providing an opportunity to resolve differences in the spatial origin of evoked response contributions.
Conclusion
The precise latency of the M100 component of the
auditory evoked neuromagnetic field is influenced
not only by intensity and acoustic properties of the
stimulus, such as fundamental component, spectral
center of gravity and periodicity, but also by perceptual qualities of the stimulus.
In this example, where a unidimensional variation in a
stimulus property has a behavioral consequence
consistent with categorical perception, the M100
latency variation also appears to be categorical. The
implication of this is that some perceptual influence
may modulate sensory evoked responses early
(~100 ms) post-stimulus.
References
1. Roberts TPL and Poeppel D. NeuroReport 7, 1138–1140 (1996).
2. Stufflebeam SM, Poeppel D, Rowley HA and Roberts TPL. NeuroReport 9, 91–94 (1998).
3. Roberts TPL. Presented at SPIE Medical Imaging, San Diego, CA. Proceedings of the SPIE 3337, 210–219 (1998).
4. Roberts TPL. J Biochem Bioenerg, in press (1998).
5. Greenberg S, Poeppel D and Roberts TPL. A space-time theory of pitch and timbre based on cortical expansion of the cochlear travelling wave delay. In: 11th International Symposium on Hearing, Belton Woods, Grantham, UK, 1997.
6. Ferrari P, Poeppel D and Roberts TPL. Presented at Cognitive Neuroscience, San Francisco, CA, 1998.
7. Schouten JF, Ritsma RJ and Cardozo BL. J Acoust Soc Am 34, 1418–1424 (1962).
8. Poeppel D, Phillips C, Yellin E et al. Neurosci Lett 221, 145–148 (1997).
9. Roberts TPL, Poeppel D, Phillips C et al. Presented at Society for Neuroscience, New Orleans, LA, 1997.
10. Harnad S, ed. Categorical Perception. Cambridge: Cambridge University Press, 1987.
11. Reite M, Zimmerman JT, Edrich J and Zimmerman JE. Electroencephalogr Clin Neurophysiol 54, 147–152 (1982).
12. Bak CK, Lebech J and Saermark K. Electroencephalogr Clin Neurophysiol 61, 141–149 (1985).
13. Cohen J, MacWhinney B, Flatt M and Provost J. Behav Res Methods 25, 257–271 (1993).
14. Hari R, Hämäläinen M, Ilmoniemi R et al. Neurosci Lett 50, 127–132 (1984).
15. Hämäläinen M, Hari R, Ilmoniemi R et al. Rev Mod Phys 65, 413–497 (1993).
ACKNOWLEDGEMENTS: This work was supported in part by grants from the
NSF LIS initiative and the James S. McDonnell Foundation/Pew Program in
Cognitive Neuroscience. The authors would like to thank Susanne Honma, RT, for excellent technical assistance, and Dr S. Greenberg, Dr H.A. Rowley and
Dr S.M. Stufflebeam for helpful discussions.
Received 8 July 1998;
accepted 23 July 1998