Physiol Rev 84: 541–577, 2004; 10.1152/physrev.00029.2003.

Neural Processing of Amplitude-Modulated Sounds

P. X. JORIS, C. E. SCHREINER, AND A. REES

Laboratory of Auditory Neurophysiology, Division of Neurophysiology, K.U. Leuven, Leuven, Belgium; Coleman Laboratory, Department of Otolaryngology, Keck Center for Integrative Neuroscience, University of California at San Francisco, San Francisco, California; and School of Neurology, Neurobiology, and Psychiatry, The Medical School, University of Newcastle upon Tyne, Newcastle upon Tyne, United Kingdom

www.prv.org 0031-9333/04 $15.00 Copyright © 2004 the American Physiological Society

I. Temporal Dimensions of Sound
II. Human Sensitivity to Amplitude Modulation
III. Neural Response Measures
IV. Auditory Nerve: Bottleneck to the Central Nervous System
   A. Basic auditory nerve properties
   B. Average response rate and magnitude of synchronization
   C. Phase of synchronization
V. Cochlear Nucleus: Parallel Channels
   A. Basic organization of the CN
   B. AM responses of neuronal types in the CN
VI. Superior Olivary Complex: An Example of Time-to-Rate Conversion
VII. The Nuclei of the Lateral Lemniscus
VIII. Amplitude Modulation Encoding in the Inferior Colliculus: A Center for Convergence
   A. Basic organization of the IC
   B. Modulation transfer functions for IC units: synchronization
   C. Modulation transfer functions for IC units: average rate
   D. What determines the MTF upper limit in the IC?
   E. Is AM encoded in the IC by rate or synchronization?
   F. Relationship between AM responses and other neuronal properties
   G. Is modulation frequency represented topographically in the IC?
   H. Responses to interaural time disparities in modulation envelopes
   I. Contribution of nonlinearities
IX. Amplitude Modulation Encoding in Auditory Thalamus and Cerebral Cortex
   A. Basic layout of the thalamocortical system
   B. Temporal responses in the MGB
   C. Responses to AM in primary auditory cortex: synchronization
   D. Responses to AM in primary auditory cortex: average rate
   E. Responses to AM in primary auditory cortex: influence of modulation parameters
   F. Differences of temporal coding between cortical fields
   G. Cortical mechanisms
   H. Temporal coding of complex sounds
   I. Plasticity of temporal coding properties in auditory cortex
X. Neurophysiological and Psychological Studies in Humans
XI. Conclusion

Joris, P. X., C. E. Schreiner, and A. Rees. Neural Processing of Amplitude-Modulated Sounds. Physiol Rev 84: 541–577, 2004; 10.1152/physrev.00029.2003. Amplitude modulation (AM) is a temporal feature of most natural acoustic signals. A long psychophysical tradition has shown that AM is important in a variety of perceptual tasks, over a range of time scales. Technical possibilities in stimulus synthesis have reinvigorated this field and brought the modulation dimension back into focus. We address the question of whether specialized neural mechanisms exist to extract AM information, and thus whether consideration of the modulation domain is essential in understanding the neural architecture of the auditory system. The available evidence suggests that this is the case. Peripheral neural structures not only transmit envelope information in the form of neural activity synchronized to the modulation waveform but are often tuned so that they only respond over a limited range of modulation frequencies. Ascending the auditory neuraxis, AM tuning persists but increasingly takes the form of tuning in average firing rate, rather than synchronization, to modulation frequency. There is a decrease in the highest modulation frequencies that influence the neural response, either in average rate or synchronization, as one records at higher and higher levels along the neuraxis.
In parallel, there is an increasing tolerance of modulation tuning for other stimulus parameters such as sound pressure level, modulation depth, and type of carrier. At several anatomical levels, consideration of modulation response properties assists the prediction of neural responses to complex natural stimuli. Finally, some evidence exists for a topographic ordering of neurons according to modulation tuning. The picture that emerges is that temporal modulations are a critical stimulus attribute that assists us in the detection, discrimination, identification, parsing, and localization of acoustic sources and that this wide-ranging role is reflected in dedicated physiological properties at different anatomical levels.

I. TEMPORAL DIMENSIONS OF SOUND

Among the sensory systems, audition excels in its speed of operation. This is perhaps not too surprising, since our entire sense of hearing depends on the analysis of rapid changes in acoustic pressure at the two ears. The importance of the temporal dimension is manifest in many structural and functional specializations, starting at the peripheral sense organ and carried through the subsequent stages in the central nervous system. The striking sensitivity of auditory structures to temporal features of the acoustic stimulus has been observed since the earliest electrophysiological recordings, and this sensitivity is equally prominent in behavioral observations of humans and experimental animals.

Importantly, there are multiple temporal dimensions in acoustic stimuli (238). It is useful to distinguish “fine-structure” and “envelope” as two components of a time waveform. The fast pressure variations that determine the spectral content constitute the fine-structure. This fine-structure waxes and wanes in amplitude, and the contour of this amplitude modulation (AM) is the envelope. For example, the waveform of a speech utterance shows bursts of energy that correspond to phonemes.
The temporal characteristics of these bursts carry much information (44, 108, 214, 265, 272, 281), but their dominant modulation frequency is rather slow (typically 3–4 Hz, extending up to ~20 Hz) vis-à-vis the temporal capabilities of the peripheral auditory system. Faster modulations of several hundred hertz are also very common, e.g., in segments of voiced speech where they are perceptually associated with voice pitch. These envelope components arise from interactions between fine-structure components and are not present as such, i.e., as acoustic energy, in the waveform. This is illustrated by the superposition of two sine waves, equal in amplitude but separated by a small difference frequency (fd): constructive and destructive interference of the two components generate AM in the form of “beating” at frequency fd. The same principle extends to environmental sound sources, which commonly produce quasi-periodic signals consisting of a range of frequency components (harmonics) that are multiples of a fundamental frequency: the combination of even a limited number of components, e.g., within a cochlear filter, reconstitutes the fundamental frequency in the form of a temporal envelope modulation. (For examples of spectrograms, waveforms, and treatment of AM, see Refs. 99, 100, 177, 180, 302.)

The laboratory stimulus most often used in physiological studies of modulation is a pure tone (sinusoid) modulated by another tone.
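The beating of two sine waves described above can be checked numerically. The following is a minimal sketch in Python and is not part of the original review: the sampling rate, the component frequencies (1,000 and 1,100 Hz), and the helper name `component_amplitude` are illustrative assumptions. It confirms that the sum of two equal-amplitude sine waves equals a tone at the mean frequency multiplied by a slow cosine term, and that the spectrum holds energy at the two components but none at the difference frequency fd.

```python
import math

def component_amplitude(signal, freq_hz, fs_hz):
    """Amplitude of the sinusoidal component of `signal` at `freq_hz`.

    Single-bin discrete Fourier transform; exact when the record holds a
    whole number of cycles of `freq_hz` (true for every frequency used here).
    """
    n = len(signal)
    re = sum(x * math.cos(2 * math.pi * freq_hz * k / fs_hz) for k, x in enumerate(signal))
    im = sum(x * math.sin(2 * math.pi * freq_hz * k / fs_hz) for k, x in enumerate(signal))
    return 2 * math.hypot(re, im) / n

FS = 8000            # sampling rate in Hz (hypothetical)
F1, F2 = 1000, 1100  # two equal-amplitude components (hypothetical values)
FD = F2 - F1         # difference frequency fd = 100 Hz
N = 800              # 0.1 s of signal: whole cycles of F1, F2, and FD

t = [k / FS for k in range(N)]
beat = [math.sin(2 * math.pi * F1 * x) + math.sin(2 * math.pi * F2 * x) for x in t]

# Same waveform written as a tone at the mean frequency times a slow cosine;
# the magnitude of that cosine, |2 cos(pi * fd * t)|, repeats every 1/fd.
product = [2 * math.cos(math.pi * FD * x) * math.sin(math.pi * (F1 + F2) * x) for x in t]
identity_error = max(abs(a - b) for a, b in zip(beat, product))

amp_f1 = component_amplitude(beat, F1, FS)
amp_f2 = component_amplitude(beat, F2, FS)
amp_fd = component_amplitude(beat, FD, FS)  # no acoustic energy at fd

print(f"identity error: {identity_error:.2e}")
print(f"amplitude at f1, f2, fd: {amp_f1:.3f}, {amp_f2:.3f}, {amp_fd:.2e}")
```

The single-bin transform is used instead of a full FFT only to keep the sketch dependency-free; any spectrum estimate over a whole number of cycles should show the same pattern.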
Figure 1A and Equation 1 represent the waveform [s(t)] of a tone with frequency fc (the carrier), whose amplitude is modulated by a lower frequency fm (the modulator) at a modulation depth m (0 ≤ m ≤ 1)

$$s(t) = [1 + m \sin(2\pi f_m t)] \sin(2\pi f_c t) \tag{1}$$

For fc >> fm the first term [1 + m sin(2πfmt)] is the time-varying amplitude or envelope (see footnote 1). Using trigonometric identities, s(t) can be rewritten as the sum of three components at fc and at fc ± fm (the upper and lower sidebands)

$$s(t) = \sin(2\pi f_c t) + \frac{m}{2}\left[\cos 2\pi(f_c - f_m)t - \cos 2\pi(f_c + f_m)t\right] \tag{2}$$

This signal does not contain energy at fm (Fig. 1, A and B); the modulation in the time waveform is due to the interaction of the components in the signal, which are separated by a difference frequency fm. The sinusoidal AM stimulus is special because its envelope consists of a single sinusoidal component. In real-world stimuli, a range of modulations is usually present, which can be summarized by the modulation spectrum: the distribution of modulation energy for the whole waveform or for a selected band of carrier frequencies in the waveform. The subjectively experienced quality of a modulated signal depends on modulation frequency so that the modulation spectrum also defines different perceptual ranges (see sect. II).

Footnote 1: The relationship of m to the waveform is the same as that of the Rayleigh or Michelson contrast ratio used in vision research: m equals the difference between the maximum and minimum luminance divided by their sum.

FIG. 1. A: superimposed waveforms of an unmodulated 1,000-Hz tone (thin line) and the same tone sinusoidally amplitude modulated (AM) (thick line) at 100% with a modulation frequency of 100 Hz, according to Equation 1. Dashed lines indicate the envelope. The amplitude is referenced to the peak amplitude of the unmodulated tone. B: idealized spectrum of the AM tone in A. At 100% modulation, the amplitude of the sidebands is half that of the carrier, i.e., a difference of 6 dB. C: average response in the form of a poststimulus time (PST) histogram of a nerve fiber to the signal shown in A (stimulus duration, 50 ms). D: spectrum of the PST histogram in C. The components at carrier frequency (fc) and fc ± modulation frequency (fm) indicate that there is phase-locking to the fine-structure of the stimulus waveform. The component at fm is prominently present in the response but is absent in the stimulus (B). The small circle on the ordinate indicates the average firing rate.

The impetus in early physiological studies to use modulated stimuli (57, 62, 78, 183, 196) was a desire to go beyond the arsenal of simple stimuli (pure tones, clicks, noise) that dominated much of the research at that time. Somewhat similar to gratings in the visual domain, AM and frequency modulation (FM) were regarded as elementary features of natural stimuli, which could reveal dynamic properties of the auditory system not addressed with simpler stimuli. Interest in responses to AM was rekindled in the 1980s and 1990s through a convergence of different lines of research concerned with the “dynamic range problem,” speech coding, pitch, and spatial localization of high-frequency sounds, among others.

However, AM signals are more than just a convenient laboratory tool to study a diversity of psychophysical and physiological phenomena. The question that we are concerned with here is whether envelope processing is embedded in the auditory system, as may be expected from the ecological prominence of envelopes. Given the theory of natural selection, one can assume that animals are well adapted to their specific acoustic environment and that the statistical structure of the natural auditory environment or the “acoustic ecology” (5) is reflected in the structure and function of the auditory system.
Acoustic ecology can be defined as the total ensemble of sounds present in an animal’s environment, from both inanimate and biological sources. Indeed, the auditory systems of acoustically specialized animals have revealed the existence of highly developed adaptations. Prominent examples include the echolocation system of bats (e.g., Ref. 61), the mating call detection system in frogs (245), and the alarm call differentiation in vervet monkeys (275). Common to these examples is that particular behaviors are elicited by a small set of signals with specific, fairly invariant acoustic properties. Characterization of these lower order physical sound attributes led to the discovery of special neuronal mechanisms.

Relatively little work has been done on the quantitative analysis of amplitude modulation statistics in acoustic ecologies and their consequences for neuronal processing. Not only overtly specialized animals but all animals are likely to exploit consistencies in statistical properties of the acoustical environment. Nelken et al. (194) found that low-frequency amplitude modulations are prominent in natural environments, are often coherent over different frequency regions, and may be exploited by the auditory system in signal detection. Voss and Clarke (288) computed temporal correlations of music passages and discovered a 1/f scaling relation over a few decades. More recently, Attias and Schreiner (6) decomposed music, speech, and animal vocalizations into narrow-band frequency channels and studied the statistics of the amplitude and phase distributions for each channel. They also found a distribution of modulation frequencies following a power law, indicating that the amplitude modulation statistics of natural sound are non-Gaussian, cover a wide range of modulation frequencies, and scale universally, i.e., the frequency dependence is similar over different frequency ranges.
Using a mutual information metric between stimulus and spike trains, it was also found (7) that neurons in the cat inferior colliculus are more efficient at coding naturalistic stimuli than nonnaturalistic stimuli: the information rate per spike for naturalistic stimuli was more than 60% higher than for nonnaturalistic signals. Similar results have been seen in the frog (232). This implies that neural processing is adapted and perhaps optimized for the encoding of naturally occurring modulation information.

Our purpose is to review physiological mechanisms that may be important for the processing of temporal envelope information. We first briefly highlight findings from human psychophysics to illustrate some of the perceptual consequences of AM, but we refrain from a more substantial discussion of the relationship between physiological mechanisms and perception. Rather, our focus is on a simpler and more basic question: within what limits is AM encoded by single auditory neurons, and does the form of encoding suggest that the temporal envelope dimension is a fundamental organizing principle in the auditory system, in the manner that tuning to orientation, direction, or spatial frequency is considered fundamental in vision? For reasons of space, only occasional reference will be made to the extensive research in bats or nonmammalian vertebrates, even though AM is often an important feature in echolocation signals (156, 198, 258) and their study often preceded the research reviewed here.

II. HUMAN SENSITIVITY TO AMPLITUDE MODULATION

The ability of human listeners to detect and discriminate AM has been a topic of study since the 18th century. The earliest means of producing a sound with a fluctuating amplitude envelope was to mix two pure tones differing slightly in frequency to generate beats.
Thomas Young and Helmholtz (287) both described the sensation of fluctuating amplitude experienced when listening to beats, and Helmholtz described the changing quality of the sound as the beat frequency was increased. He noted that “the ear easily follows slow beats of not more than 4 to 6 in a second,” whereas at 30 beats/s the pulses of the tone can still be heard, but no longer as distinct events, and they have a “jarring and rough” quality.

With improvements in technology, subsequent studies (see Ref. 131 for historical review) extended and quantified these findings. Zwicker (324) showed that the threshold for detecting AM is very small at low modulation frequencies (threshold m ~2% for fm of 1–4 Hz and fc of 1 kHz) and increases to a maximum with increasing fm (m ~5% for fm of 32 Hz and fc of 250 Hz; and for fm of 125 Hz and fc of 4 kHz). Above this maximum, threshold decreases and falls below the values obtained at low modulation frequencies, but in this range subjects perceive the carrier and the modulation frequency as distinct tones. Zwicker (324) also determined that, for a given carrier, thresholds for the detection of AM and FM measured in terms of their modulation depths coincide on the upper side of the maximum at a modulation frequency he termed the Phasengrenzfrequenz. This led Zwicker to postulate that above the Phasengrenzfrequenz [now termed the critical modulation frequency (CMF) (250, 263)] the carrier and sideband components are analyzed in different critical bands (auditory filters), and thus subjects are not sensitive to differences in the relative phase of the modulation components that enable them to distinguish AM from FM below the CMF.
More recent evidence suggests that the situation is more complex than this (180, 263), but nevertheless, it appears that when listening to AM imposed on pure tone carriers detection may rely on spectral rather than temporal cues over some ranges of modulation frequency.

One means of eliminating spectral cues, and therefore estimating the temporal resolving power of the auditory system, is to measure the detection of sinusoidal modulation imposed on noise rather than a tonal carrier. The broadband spectrum of the noise precludes the listener detecting the individual spectral components of the stimulus spectrum. The use of such stimuli (9, 285) demonstrated that the relationship between threshold and modulation frequency (the psychophysical temporal modulation transfer function) is essentially a low-pass function with a 3-dB cut-off around 50 Hz and a slope of −4 dB/octave. The minimum threshold modulation depth is ~5% at low modulation frequencies (<10 Hz) where subjects detect the individual amplitude changes in the stimulus. The upper limit of modulation detection extends to ~2.2 kHz (68, 285, 286). As will become apparent later, this coincides with the very highest limits of neural phase-locking to envelopes obtained for some neurons in the auditory periphery in cats (Fig. 2, Refs. 127, 229) and exceeds the limit for phase-locking to envelopes in more central neurons. This raises questions as to the nature of modulation encoding in the central auditory system, even when one takes into account the encoding of modulations by changes in average rate that become apparent at more central sites.

FIG. 2. Amplitude modulation (AM) stimuli generate different percepts that encompass several regions of modulation and carrier frequencies. At very low fm, most strongly near 4 Hz and disappearing around 20 Hz, a sensation of fluctuation or rhythm is produced (hatched). The rate at which the temporal envelope of fluent speech varies is also typically 4 Hz (syllables/s). Fluctuation makes a smooth transition to a percept of roughness, which starts at ~15 Hz (bottom curved line), is strongest near 70 Hz, and disappears below 300 Hz (top curved line). Harmonic complex tones produce a pitch that corresponds to a frequency close to the fundamental frequency. However, the lower harmonics can be removed without affecting the pitch, resulting in “residue pitch” if fc and fm are chosen within the shaded region. Finally, small interaural time differences (ITD) can be detected between modulated stimuli to the two ears for a region of combinations of fm and fc that overlaps with the region for residue pitch (thick line). Note that these are regions in stimulus space where modulation is perceptually relevant, but the precise relationship of these percepts to physiological response modulation is usually unclear. For reference, the small dots indicate −10 dB cutoff values for modulation transfer functions (MTFs) of auditory nerve fibers (cf. Fig. 3C) [based on further analysis of data reported by Joris and Yin (127)]. Delineation of psychophysical regions is based on References 16, 104, 233, 278, 325. The ordinate is truncated at 4 Hz.

Although, as Zwicker noted, a distinct pitch at the frequency of modulation is perceived when components of the stimulus spectrum can be resolved, weaker but nevertheless clear pitches are also perceived with modulations containing no resolved components (179, 233). Even modulations imposed on noise carriers can generate pitches which, though weaker than those generated with tonal stimuli, are able to support melody recognition (21, 22). Taken together, these findings demonstrate that the periodicity or residue pitches of some modulations must result solely from temporal analysis, but when resolved components are present, pitch salience is increased.
Figure 2 schematically indicates the combinations of carrier and modulation frequencies resulting in the percepts of fluctuation, roughness, and residue pitch. (Sensitivity to binaural envelope disparities is discussed in section VI.)

Two competing models have been proposed to explain the detection of AM. The first consists of a bandpass filter and half-wave rectifier representing processing by the cochlea, followed by a low-pass filter (285). Some measure of the output of this filter provides the basis for the subject’s response (see Ref. 181 for discussion). In essence, therefore, this model is an envelope detector. The second scheme models the detection of modulation by a bank of bandpass filters that are sensitive to different ranges of modulation frequency. A channel or filterbank model of modulation analysis was first proposed by Kay and colleagues (84, 132) on the basis of adaptation studies with FM and AM. Subsequently, the adaptation paradigm was questioned (178, 289), but the concept of a modulation filterbank persists because studies using different psychophysical paradigms have since reported findings which support the concept of modulation frequency tuning. Evidence for such selectivity comes from modulation masking experiments (8, 107), and modulation detection interference (MDI), a phenomenon in which the detection of AM is influenced by modulation at the same frequency but on a very different carrier (318). Dau et al. (36) invoked a model consisting of a modulation filterbank associated with each auditory filter to account for the detection and masking of sinusoidally amplitude-modulated narrowband noise. The latter model was extended (283) to account for comodulation masking release, another phenomenon, like MDI, that indicates some element of modulation waveform analysis across different carrier frequencies (96) (see Ref. 180 for review).
Such across-frequency interactions between similar modulation envelopes are likely to contribute to grouping and the construction of auditory images (90). Despite different lines of evidence favoring some form of modulation filterbank, the concept remains controversial, and the experimental findings discussed above do not concur in their estimates of the bandwidths for these putative channels.

III. NEURAL RESPONSE MEASURES

In neurophysiology, one can generally think of a variety of ways in which stimulus features may be “encoded” and processed (208), and it is not immediately obvious which aspects of neuronal behavior are the most relevant for the perceptual task at hand. With few exceptions, the response measures used in studies of AM are average discharge rate (i.e., the number of spikes evoked over several modulation cycles), or some measure of synchronization of the timing of action potentials to the envelope waveform.

The earliest single-unit studies of peripheral auditory neurons already reported synchronization to the fine-structure of tones, in the sense that discharges occur at a particular phase of the cyclical waveform. For example, auditory nerve fibers have the striking capability to “phase-lock” to low-frequency tones up to several kilohertz [4–5 kHz in the cat (121), but the upper limit is species dependent (298)]. Phase-locking also occurs to stimulus envelope; both forms of phase-locking are immediately apparent in the poststimulus time (PST) histogram (Fig. 1C) to the AM stimulus of Figure 1A. The fine spacing of peaks at intervals of 1 ms indicates phase-locking to the 1-kHz fine-structure; the grouping into broader peaks spaced by 10 ms indicates phase-locking to the 100-Hz envelope. In contrast to the stimulus spectrum (Fig. 1B), the response spectrum (Fig. 1D) shows energy at fm, i.e., the AM signal is demodulated.
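How a component at fm can appear in a response when the stimulus contains none can be illustrated with a rectifying nonlinearity. The sketch below is a hedged illustration in Python, not the authors' analysis: the AM tone follows Equation 1 with the Figure 1A values (fc = 1 kHz, fm = 100 Hz, m = 1), while the sampling rate, the helper name `component_amplitude`, and the bare `max(0, x)` rectifier are assumptions that stand in for the far more complex cochlear transduction.

```python
import math

def component_amplitude(signal, freq_hz, fs_hz):
    """Amplitude of the sinusoidal component of `signal` at `freq_hz`
    (single-bin DFT; exact for a whole number of cycles in the record)."""
    n = len(signal)
    re = sum(x * math.cos(2 * math.pi * freq_hz * k / fs_hz) for k, x in enumerate(signal))
    im = sum(x * math.sin(2 * math.pi * freq_hz * k / fs_hz) for k, x in enumerate(signal))
    return 2 * math.hypot(re, im) / n

FS = 16000  # sampling rate in Hz (hypothetical)
FC = 1000   # carrier frequency in Hz, as in Figure 1A
FM = 100    # modulation frequency in Hz, as in Figure 1A
M = 1.0     # modulation depth (100%)
N = 1600    # 0.1 s: whole cycles of FM, FC, and FC +/- FM

# Equation 1: envelope [1 + m sin(2 pi fm t)] times carrier sin(2 pi fc t)
am_tone = [(1 + M * math.sin(2 * math.pi * FM * k / FS)) * math.sin(2 * math.pi * FC * k / FS)
           for k in range(N)]

# Idealized half-wave rectifier: negative half-cycles are discarded.
rectified = [max(0.0, x) for x in am_tone]

stim_fc = component_amplitude(am_tone, FC, FS)
stim_upper = component_amplitude(am_tone, FC + FM, FS)
stim_fm = component_amplitude(am_tone, FM, FS)    # absent in the stimulus
rect_fm = component_amplitude(rectified, FM, FS)  # present after rectification

sideband_db = 20 * math.log10(stim_upper / stim_fc)  # sideband re carrier, in dB

print(f"stimulus:  fm {stim_fm:.2e}, sideband re carrier {sideband_db:.2f} dB")
print(f"rectified: fm {rect_fm:.3f}")
```

The stimulus part of the sketch reproduces the spectrum described for Figure 1B (sidebands at half the carrier amplitude, about 6 dB down, and nothing at fm); the rectified waveform is only a caricature of a response, but it shows that an asymmetric nonlinearity is sufficient to create energy at fm.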
Several cochlear nonlinearities with asymmetry between the positive and negative part of the transfer function can contribute to this demodulation, the most important being half-wave rectification in the relationship between displacement of hair cell stereocilia and receptor potential, and in the absence of negative firing rates (135). The response spectrum also shows a value at 0 Hz (Fig. 1D: small circle on ordinate) which equals the average firing rate. In this review, we will use the terms envelope synchronization and envelope phase-locking synonymously to refer to synchronization of the response to the stimulus envelope waveform, and use the term rate coding for changes in average firing rate during manipulation of the stimulus modulation parameters.

Different synchronization measures have been used, sometimes leading to seemingly contradictory statements. The most popular metric is “vector strength” R, also called synchronization index (81). Each spike is treated as a vector of unit length and with phase θ_i between 0 and 2π measured as the spike time modulo the stimulus period of interest. The x- and y-components of the vector are x_i = cos θ_i and y_i = sin θ_i. The n spikes in a response are combined by vector addition, and the resultant vector is normalized to n

$$R = \frac{\sqrt{\left(\sum_{i=1}^{n} x_i\right)^2 + \left(\sum_{i=1}^{n} y_i\right)^2}}{n} \tag{3}$$

which takes values between 0 and 1. R can also be obtained from the Fourier spectrum of the PST or period histogram, in which case it equals the magnitude of the first harmonic, normalized by the DC component (average firing rate). Phase is also retrieved with either technique. Statistical significance of synchronization is usually quantified with the Rayleigh test (23, 168).

As will become clear in this review, envelope coding at peripheral stages is predominantly temporal rather than rate-based, but these two aspects of the response progressively reverse in prominence at successive stages along the neuraxis.
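The vector strength calculation of Equation 3 can be sketched in a few lines. This is a minimal Python illustration under stated assumptions: the function name `vector_strength`, the synthetic spike trains, and the 100-Hz reference frequency are hypothetical and do not come from any data set in the review.

```python
import math

def vector_strength(spike_times_s, freq_hz):
    """Return (R, mean phase in radians) of spike times relative to `freq_hz`.

    Each spike is a unit vector whose phase is the spike time modulo the
    stimulus period, scaled to the range 0..2*pi (Equation 3).
    """
    n = len(spike_times_s)
    if n == 0:
        raise ValueError("at least one spike is required")
    phases = [2 * math.pi * ((t * freq_hz) % 1.0) for t in spike_times_s]
    x = sum(math.cos(p) for p in phases)
    y = sum(math.sin(p) for p in phases)
    r = math.hypot(x, y) / n                   # resultant length normalized to n
    mean_phase = math.atan2(y, x) % (2 * math.pi)
    return r, mean_phase

FM = 100.0  # reference (modulation) frequency in Hz, hypothetical

# One spike per 10-ms cycle, always a quarter of a cycle after cycle onset.
locked = [0.0025 + 0.01 * k for k in range(50)]

# The same train plus an equal number of spikes half a cycle later.
split = locked + [0.0075 + 0.01 * k for k in range(50)]

r_locked, phase_locked = vector_strength(locked, FM)
r_split, _ = vector_strength(split, FM)

print(f"locked: R = {r_locked:.3f}, phase = {phase_locked:.3f} rad")
print(f"split:  R = {r_split:.3f}")
```

The second train is included because it has obvious temporal structure yet yields a resultant of essentially zero length: spikes at two opposite phases cancel in the vector sum.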
Because both average firing rate and synchronization may contribute to the impact that a neuron has on its postsynaptic targets, many experimenters have combined the two metrics by multiplication (nR, with n = total number of spikes, variously called “modulated rate,” “phase-locked rate,” “synchronized rate”), or, equivalently, by reporting the unnormalized Fourier component, expressed in spikes per second (33, 141, 224, 314). Recently, some authors have used 2nR², which is also the statistic used in the Rayleigh test of significance (157, 266). Finally, envelope synchronization is often reported as a gain value (in dB), defined as 20 log10 (2R/m), which relates output directly to input and facilitates comparison across studies which use different modulation depth m.

The vector strength metric, often under different names (e.g., selectivity index), has found general use in the quantification of periodic neural signals in sensory and even motor physiology (43). Despite its pervasive use, it is important to be aware of its limitations. First, the metric gives only the degree to which the response is modulated to the frequency at which R is calculated (we use the subscripts m and c to indicate modulation frequency and carrier frequency, respectively). It does not capture the full harmonic content of the cycle histogram at fm so that histograms with a rather different shape can result in the same Rm value (see Ref. 127 for an example). An Rm value of one only results from perfect alignment of all spikes at one phase, but a value of zero does not necessarily indicate a random distribution of spike times. For example, if spike times are equally divided between phase θ and θ + π, the average vector has zero magnitude. Thus a low vector strength should not necessarily be equated to absence of temporal structure in the spike train, but rather is an indication of lack of energy at the frequency for which R was calculated.
Second, high R values indicate that spikes are distributed over a narrow time window relative to the period of interest, but such values do not imply a faithful replica of the stimulus modulation waveform in the probability of discharge. As a reference, a PST histogram that closely resembles a half-wave rectified sinusoidal AM signal with m = 1 gives R = 0.5. Higher R values are obtained when the period histograms are more “peaked” than the original sinusoidal modulation signal. Third, R is a compressive metric and is therefore sometimes graphed on an expansive scale (120). Finally, a problem at a more general level is that calculation of Rm requires knowledge of fm, a strategy that the brain cannot use. It may be argued that a “clock” signal is available in the form of the highly synchronized discharge of some types of cochlear nucleus neurons, which could be used to perform a vector strength type calculation in which degree of synchronization is translated into average firing rate, e.g., as suggested in the periodicity extraction scheme by Langner (150). Some authors have used interspike interval or autocorrelation analysis to bring out the time structure of responses that may be more relevant to the operations performed by the central processor (27, 85, 123, 141, 226, 301).

In this context it is important to remember that the envelope of most natural sounds is not strictly periodic in the first place and that the raw acoustic waveform is not available as such to the auditory nervous system. Rather, this waveform is decomposed into a multitude of waveforms by virtue of cochlear narrowband filtering (reviewed in Refs. 206, 234). This process profoundly affects the modulation spectrum present in each frequency channel, which is thus determined jointly by the spectrotemporal properties of the acoustic stimulus and of those of the peripheral filtering process (for illustrations, see Ref. 286).
In summary, while most studies discussed here have used deterministic stimuli with periodic envelopes and have applied the R metric, it is important to keep in mind that, for natural stimuli, the relationship between neural response modulation and stimulus modulation is more complex and that the neural operations by which the central processor extracts envelope information likely differ fundamentally from the analytical ways of the experimenter.

The bulk of studies on AM coding have used the same stimulus strategy, which is to tailor the stimulus to the cell under study. Early work (78, 183) established that peripheral neurons display envelope phase-locking only if the stimulus energy falls within a cell’s tuning curve. For example, Javel (114) shows the lack of response of an auditory-nerve fiber tuned to 800 Hz to a high-frequency AM complex (fc = 5 kHz) modulated at 800 Hz. Most studies using AM stimuli with tonal carriers match fc to the neuron’s characteristic frequency (CF, frequency of lowest rate threshold), and usually also optimize other stimulus parameters for the cell under study. The complementary approach, in which the population response of cells at many different CFs is studied to a limited set of stimuli, has been little used (27, 293).

A description employed in acoustics, psychophysics, and physiology alike is the modulation transfer function or MTF, which is response modulation relative to input modulation as a function of modulation frequency. Schroeder (257) predicted more than 20 years ago that the concept of MTF would increase in importance because the modulation rather than the carrier usually contains the important information and because highly nonlinear transmission systems often exhibit a quasi-linear response to modulation.
Physiologically, MTFs are usually measured as the phase-locking to AM tones of fixed m and fc presented at consecutive modulation frequencies, but other methods have been employed (see sect. IXB). Marked effects on average rate occur so that a distinction between temporal MTF (tMTF) and rate MTF (rMTF) is usually drawn.

IV. AUDITORY NERVE: BOTTLENECK TO THE CENTRAL NERVOUS SYSTEM

A. Basic Auditory Nerve Properties

Activity in the auditory nerve represents both the output of the cochlea and the input to the central nervous system, and studies of envelope phase-locking have been conducted both to gain more insight into cochlear processing and to define the limits within which the central processor has to operate. Compared with optic and peripheral somatic nerves, the auditory nerve is highly uniform both morphologically (in caliber and branching pattern) and physiologically. We only discuss type I auditory nerve fibers, which form the bulk of the nerve, since next to nothing is known about the physiology of the unmyelinated type II fibers. Because each type I nerve fiber contacts only a single inner hair cell, its activity can, to a first approximation, be understood from basilar membrane motion at a single point in the cochlea followed by further signal modifications by the inner hair cell and hair cell/nerve synapse (76, 136, 137, 243). The most salient properties are 1) sharp V-shaped tuning to a narrow range of frequencies; 2) a limited dynamic range of ~20–30 dB, reflected in a sigmoidal rate-level function; 3) adaptation of firing rate to sustained stimuli, rather modest compared with adaptation of peripheral nerve fibers in other systems; and 4) phase-locking to low-frequency pure tones (<4–5 kHz in the cat). Auditory nerve fibers show a bimodal distribution of spontaneous rate (SR), on the basis of which several classes of fibers are defined that differ in a number of properties (158, 246, 305).
Fibers with high SR (>18 spikes/s), which in cat form ~60% of the total population, have low thresholds and limited dynamic range. Fibers with medium and low SR have higher thresholds and tend to have “sloping” saturation, i.e., their rate-level functions show a decrease in slope at ~30 dB above threshold but do not fully saturate. Also, low-SR fibers show less adaptation than high-SR fibers (230). Differences between the SR classes have been documented mostly with pure tone and spectrally complex stimuli, but AM stimuli have revealed response differences in the time domain as well. We first discuss how the basic AM parameters m, sound pressure level (SPL), fm, and fc (Fig. 3) influence synchronization and average rate, then describe the response phase.

FIG. 3. Basic dimensions and manipulations in an AM signal and their effect on auditory nerve activity. The relationship of an auditory filter (curve) and AM spectrum are shown schematically for variations in modulation depth m (A), sound pressure level (SPL) (B), modulation frequency (fm) (C), and carrier frequency (fc) (D). For each manipulation, three measures of the responses of an auditory nerve fiber are shown: average rate (rate, dashed line), synchronization magnitude (R, solid line), and synchronization phase (thin line).

B. Average Response Rate and Magnitude of Synchronization

When a tone is presented at a fiber’s CF at a fixed suprathreshold level and is modulated with increasing depth, the nerve fiber shows a monotonic, saturating increase in synchronization Rm (Fig. 3A). Although Rm increases with m in absolute terms, synchronization magnitude decreases in relative terms, i.e., the gain (response modulation relative to stimulus modulation) decreases (127). The gain can be as large as 10 dB for m of 10% and decreases to values near 0 dB for m of 100%.
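To make the gain figures above concrete, the sketch below converts between Rm and gain in decibels. It assumes the commonly used definition gain = 20 log10(2 Rm / m), in which a period histogram that exactly follows the stimulus envelope (Rm = m/2) has a gain of 0 dB. This formula is not spelled out in this section, and the function name is illustrative.

```python
import math

def modulation_gain_db(r_m, m):
    """Modulation gain in dB, assuming gain = 20*log10(2*R_m/m)."""
    return 20.0 * math.log10(2.0 * r_m / m)

# A response that exactly follows a fully modulated envelope (m = 1, R_m = 0.5)
gain_full_depth = modulation_gain_db(0.5, 1.0)

# R_m that corresponds to a 10 dB gain at a modulation depth of 10%
m_shallow = 0.1
r_for_10_db = (m_shallow / 2.0) * 10.0 ** (10.0 / 20.0)

print(f"gain at m = 1, R_m = 0.5: {gain_full_depth:.1f} dB")
print(f"R_m giving 10 dB at m = 0.1: {r_for_10_db:.3f}")
```

Under this assumed definition, a 10 dB gain at m = 10% corresponds to an Rm of only about 0.16, which illustrates how Rm can rise with m in absolute terms while the gain falls.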
Responses to AM as a function of stimulus intensity have been studied extensively in a variety of animals (guinea pig, Ref. 33; chinchilla, Ref. 114; cat, Refs. 127, 135, 294; gerbil, Ref. 270). The rate-level function with AM shows only small differences relative to the function obtained with an unmodulated carrier wave (127, 270). The synchronization-level (Rm vs. SPL) function shows a stereotypic nonmonotonic shape; a maximum is reached at low suprathreshold levels, with a decrease in Rm for further increases in SPL (Fig. 3B). It is easy to see how this relationship is expected from the compressive relationship between firing rate and SPL, especially when the modulation depth m is small; maximal modulation of firing rate should occur for amplitude changes centered on the steepest part of the rate-level function, between firing threshold and saturation. At high SPLs, amplitude fluctuations should not translate into fluctuations in firing rate because firing rate is saturated. Qualitatively the synchronization-level function does indeed show the expected nonmonotonic shape. However, compared with quantitative predictions based on the rate-level function, the observed synchronization shows 1) larger maximal R values, 2) a maximum that is displaced towards a higher SPL, and 3) higher synchronization values at high SPLs and a shallow downward slope. These deviations are predicted when adaptation over a short time scale is taken into account (33, 270, 311). Basically, adaptation boosts the coding of stimulus changes so that the operating range over which changes in SPL result in changes in firing rate is larger for responses to AM than for steady-state responses to pure tones.

There are systematic differences in AM responses of the different SR classes of auditory nerve fibers.
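The static prediction described above, in which the synchronization-level function is derived from the rate-level function alone, can be sketched as follows. The sigmoidal rate-level function and all of its parameter values are hypothetical and chosen only for illustration. The sketch deliberately omits adaptation, so it reproduces the qualitative nonmonotonic shape but not the deviations attributed to adaptation.

```python
import numpy as np

def rate_level(level_db, spont=50.0, driven=150.0, l50_db=20.0, slope_db=3.0):
    """Hypothetical sigmoidal rate-level function (spikes/s) with spontaneous activity."""
    return spont + driven / (1.0 + np.exp(-(level_db - l50_db) / slope_db))

def predicted_sync(level_db, m=0.2, n_phase=720):
    """Synchronization to fm predicted from the static rate-level function only."""
    theta = np.linspace(0.0, 2.0 * np.pi, n_phase, endpoint=False)
    # Instantaneous level of the envelope over one modulation cycle, in dB
    inst_level_db = level_db + 20.0 * np.log10(1.0 + m * np.sin(theta))
    rate = rate_level(inst_level_db)
    # Vector strength of the resulting rate profile over the cycle
    return np.abs(np.sum(rate * np.exp(1j * theta))) / np.sum(rate)

levels_db = np.arange(0.0, 61.0, 1.0)
sync = np.array([predicted_sync(level) for level in levels_db])
best_level_db = levels_db[np.argmax(sync)]

print(f"predicted synchronization peaks at {best_level_db:.0f} dB")
```

With these assumed parameters the predicted synchronization is maximal on the steep part of the rate-level function and falls toward zero both near threshold, where spontaneous activity dominates, and in saturation.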
One descriptor commonly used to compare envelope phase-locking across cell populations is the maximal R value of the synchronization-level function (Rmax). Cells with low and medium SR tend to have higher Rmax values than cells with high SR, and this difference is particularly marked at low CFs (<5 kHz) (127, 294). However, the difference in synchronization between these different auditory nerve classes strongly depends on the synchronization metric used (33, 127, 183, 295). In contrast to earlier reports, Cooper et al. (33) concluded that fibers with high SR showed larger envelope synchronization values than low SR fibers. Their result is less of a conflict than it appears if it is taken into account that the metric used by these authors was (unnormalized) modulated rate rather than Rm, that the average discharge rate of fibers with low SR is generally lower than that of fibers with high SR (158), and that the sample of Cooper et al. is biased to high CFs (>8 kHz). Synchronization is robust in high SR cells at low SPLs and in low and medium SR cells at mid and high SPLs (294). However, the different fiber populations reach maximal synchronization at the same level relative to rate threshold (33, 294). Low SR fibers have a larger dynamic range over which significant modulation is present (33), lending further support to the general hypothesis that these fibers are particularly important for hearing at high SPLs.

The narrow bandpass filtering by the cochlea limits the range of modulation frequencies transmitted by nerve fibers. As schematized in Figure 3C, increase of fm causes the sidebands in the stimulus spectrum to move away from fc. If fc is centered at the CF of the fiber studied, the energy in the sidebands is increasingly attenuated, resulting in a loss of modulation at the output of the peripheral filter.
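The sideband argument above can be illustrated with a small calculation. This is a sketch under strong simplifying assumptions: the peripheral filter is modeled as a symmetric, zero-phase Gaussian magnitude response centered on CF (a hypothetical shape, not a cochlear model), and fc is set equal to CF. In that case the modulation depth at the filter output is the input depth scaled by the ratio of sideband gain to carrier gain.

```python
import numpy as np

def filter_gain(freq_hz, cf_hz, bw_hz):
    """Hypothetical symmetric Gaussian magnitude response centered on CF."""
    return np.exp(-0.5 * ((freq_hz - cf_hz) / bw_hz) ** 2)

def output_modulation_depth(fm_hz, m, cf_hz, bw_hz):
    """Modulation depth after filtering an AM tone with fc = CF.

    The AM spectrum has a carrier at fc and sidebands at fc - fm and fc + fm.
    For a symmetric zero-phase filter both sidebands are attenuated equally,
    so the output depth is m * gain(sideband) / gain(carrier).
    """
    carrier_gain = filter_gain(cf_hz, cf_hz, bw_hz)
    sideband_gain = filter_gain(cf_hz + fm_hz, cf_hz, bw_hz)
    return m * sideband_gain / carrier_gain

fm_hz = np.array([50.0, 100.0, 200.0, 400.0, 800.0, 1600.0])
depth_narrow = output_modulation_depth(fm_hz, 1.0, 10000.0, 300.0)
depth_wide = output_modulation_depth(fm_hz, 1.0, 10000.0, 900.0)

for f, dn, dw in zip(fm_hz, depth_narrow, depth_wide):
    print(f"fm = {f:6.0f} Hz  narrow filter: {dn:.3f}  wide filter: {dw:.3f}")
```

The output depth falls monotonically with fm, and the wider filter preserves modulation to higher fm. The sketch models only this spectral filtering step.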
The response as a function of fm is usually referred to as the modulation transfer function (MTF) and again one should clearly distinguish effects on average rate (rMTF) from effects on synchronization to fm (tMTF). The rMTF is usually flat but may show some decrease in rate with increasing fm, particularly in low-SR fibers (127). In contrast, tMTFs all have a low-pass shape (guinea pig, Ref. 203; cat, Ref. 127; rat, Ref. 186; Fig. 3C). These functions are smooth and do not show any structure related to harmonic ratios, i.e., whether or not the AM components (fc and the two sidebands) are integer multiples of fm is inconsequential. The absolute bandwidth of frequency tuning curves, e.g., at 10 dB above threshold, increases with CF (59, 86, 230), and the cut-off frequency of tMTFs shows a concomitant increase with CF (Fig. 2). At very low CFs (a few hundred Hz), a tMTF cut-off frequency can often not be determined because of the broad frequency tuning. Interestingly, for CFs above ~10 kHz, the increase in cut-off frequency is not commensurate with the increase in bandwidth of frequency tuning at these high CFs. This presumably reflects temporal filtering at the hair cell/synaptic level rather than spatial filtering at the mechanical level (86, 127). The highest modulation frequency at which significant envelope phase-locking is observed, in high-CF nerve fibers, is ~2 kHz (127, 229). A less marked feature of many tMTFs is a shallow positive slope in the low-frequency skirt (94, 127). According to Cooper et al. (33), this slope tends to become steeper at high SPLs, consistent with models that include effects of response adaptation (311).

Clearly, the extent of envelope phase-locking in the auditory nerve is sufficiently wide to encompass psychophysical existence regions (Fig. 2). Javel and Mott (115) attributed the disappearance of residue pitch at fc >5 kHz to increased sharpness of tuning of high-CF fibers (59, 230).
However, while bandwidth limitations may contribute to the upper fm limit of ~800 Hz, they do not explain the disappearance of residue pitch altogether.

The dependence of envelope phase-locking on carrier frequency, relative to CF, has not been explored in great detail (114, 127, 295). It merits further study because the available data suggest an important effect. If fc is moved away from CF, the synchronization-level function shifts to higher SPLs. Consequently, for moderate to loud stimuli, strongest phase-locking is present in fibers with CFs that differ from fc, provided that the stimulus is able to excite these fibers (Fig. 3D). Thus, for all but the weakest signals, the representation of stimulus envelope may be carried mainly by fibers tuned to frequencies that differ from fc.

C. Phase of Synchronization

Few studies reported phase or latency data for AM stimuli. For a given fiber, the phase of response to the envelope shows a slight lead with increasing SPL (127) and, at fixed suprathreshold levels, varies little with changes in carrier frequency (122). In contrast, response envelope phase increases nearly linearly with fm. The slope of this relationship has been used as an estimate of the total delay accrued between the acoustic stimulus and the site of recording, similar to earlier such measurements on responses to pure tones in low-CF fibers (4). The linearity of the phase-fm relationship indicates that it is mostly determined by fixed mechanical and neural transmission delays. Consistent with other delay or onset latency measures, the values obtained vary systematically and inversely with CF (127, 294), as expected from the travelling wave on the basilar membrane which starts at the base of the cochlea and reaches its more apically located maximum after some delay. However, many processes contribute to the total delay (242, 244).
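The delay estimate from the slope of the phase-fm relationship can be sketched as follows. The data here are synthetic: a hypothetical fixed delay of 3 ms and an arbitrary phase offset are assumed, and the phases are wrapped to one cycle as they would be when read from period histograms. With phase expressed in cycles and fm in Hz, the slope of the fitted line is the delay in seconds.

```python
import numpy as np

# Synthetic example: response phase produced by a pure delay (hypothetical values)
true_delay_s = 0.003          # assumed total delay of 3 ms
phase_offset_cycles = 0.1     # arbitrary constant phase offset
fm_hz = np.array([100.0, 200.0, 300.0, 400.0, 500.0])

# Measured phases are only known modulo one cycle
wrapped_cycles = (fm_hz * true_delay_s + phase_offset_cycles) % 1.0

# Unwrap (np.unwrap works in radians), then fit a straight line to phase vs. fm
unwrapped_cycles = np.unwrap(2.0 * np.pi * wrapped_cycles) / (2.0 * np.pi)
slope, intercept = np.polyfit(fm_hz, unwrapped_cycles, 1)
estimated_delay_s = slope

print(f"estimated delay: {estimated_delay_s * 1e3:.2f} ms")
```

Unwrapping is only reliable when fm is sampled finely enough that the phase advances by less than half a cycle between successive modulation frequencies. For a delay of a few milliseconds this requires steps well below a few hundred hertz.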
Gummer and Johnstone (93) scanned envelope delay of nerve fibers near their tuning curve threshold, using AM complexes of fixed fm and low modulation depth over a large range of carrier frequencies. They found a delay component that was large for carrier frequencies near CF and smaller in the tuning curve tail, and the authors provide several arguments to suggest that this component reflects a delay associated with cochlear bandpass filtering.

The preceding descriptions are based on synchronization of the response to the envelope frequency. Again, it is important to bear in mind that such descriptions are incomplete. The shape of cycle histograms can depart severely from the shape (usually sinusoidal) of the stimulus envelope, particularly at high SPLs and at large modulation depths. Therefore, the spectrum of the cycle histogram typically consists of a number of spectral peaks, of which the peak at fm is only one, and not necessarily the largest, component (135, 294). Also, the most salient temporal information present in the discharge patterns is not necessarily revealed by calculation of synchronization to stimulus components. For example, robust phase-locking to fm does not imply that the most common interspike intervals are at the period of fm: for envelope periods of several tens of milliseconds multiple spikes occur per envelope cycle, while periods shorter than a few milliseconds succeed each other too fast to allow a spike in every envelope cycle. An interesting discrepancy between envelope phase-locking and dominant interspike intervals is in “pitch-shift” effects of changes in fc (27, 114): phase-locking to fm stays roughly constant, while the most dominant interspike interval shifts in a direction which parallels the subjective pitch of the AM stimulus.

In summary, envelope information is abundantly available in auditory nerve discharges in temporal form.
Each nerve fiber transmits envelope information over a stereotypical range of modulation frequencies, carrier frequencies, and intensities. These ranges are consistent, at least at a qualitative level, with known auditory nerve properties of frequency tuning, compression, adaptation, and spontaneous activity, and computer models incorporating these properties reproduce the main features of AM responses (105, 117, 271). The main way in which the auditory nerve is a bottleneck to the central nervous system for AM signals is in the extent of modulation frequencies over which synchronization occurs. This range cannot be enlarged centrally, except possibly for frequencies at which fine-structure information is available (<4–5 kHz), because AM arises from a time-domain interaction of stimulus components.

V. COCHLEAR NUCLEUS: PARALLEL CHANNELS

The key dynamic properties of cells in the cochlear nucleus (CN) and the differences with the auditory nerve were described in the pioneering studies of Møller (183, 184, 187): enhanced gain over a large dynamic range, low levels of distortion to sinusoidal modulation, i.e., a rather faithful tracking of the sinusoidal envelope, presence of bandpass tMTFs particularly at high SPLs, and similar tMTF shape for different forms of modulation (sinusoidal AM of pure tone or noise carriers, noise-modulated tones, noise-modulated noise). However, the marked diversity of CN cells supports a variety of AM response patterns, evident in the earliest CN studies (78), and necessitates a discussion of AM responses per cell type rather than global statements about the CN or its subdivisions. Limited attempts have been made (not reviewed here) to uncover the mechanisms underlying the auditory nerve to CN transformations, for gain enhancement in particular (72, 228, 296, 323).

A.
Basic Organization of the CN

An important insight that emerged from study of the CN with simple stimuli was that a limited number of response patterns or “classes” could be discerned and that these patterns are related to morphological cell classes (18, 202). Especially through the technique of intracellular labeling, many of the structure-function relationships that were surmised earlier on the basis of indirect evidence were solidified. The physiological diversity of these different cell types, combined with the diversity of their central projections (297), led to the concept of functionally specialized, parallel pathways (for review, see Refs. 26, 69, 112, 227, 319).

Briefly, three subnuclei are defined on the basis of the bifurcation pattern of the auditory nerve. The anteroventral cochlear nucleus (AVCN) has three principal cell types. Stellate cells project to the inferior colliculus (IC) and respond to tones with a burst of regularly spaced action potentials called a “chopper” pattern. Bushy cells, which derive their name from their small and confined dendritic tree and which are remarkable for their strong inputs from the auditory nerve, occur in two types. Spherical bushy cells receive large calyceal auditory nerve terminals (end bulbs of Held) and show responses similar to auditory nerve fibers and are therefore called “primary-like” (PL). Their main projection is to binaural nuclei in the superior olivary complex. Globular bushy cells also receive large nerve terminals in the form of modified end bulbs of Held, and show a characteristic “primary-like-with-notch” (PLN) pattern in response to tones. Their main projection is contralaterally in the superior olivary complex where they give rise to giant calyceal endings on cells in the medial nucleus of the trapezoid body, which are inhibitory on binaural cells in the lateral superior olive (LSO).
The posteroventral cochlear nucleus (PVCN) contains octopus cells that project to the ventral nucleus of the lateral lemniscus (VNLL) and show pure onset (Oi) responses to tones. It also contains inhibitory multipolar cells that project to the dorsal cochlear nucleus (DCN) and the contralateral CN and which show onset-chopper (Oc) responses. The principal neurons of the DCN are the fusiform cells, which project to the IC and display remarkably nonlinear spectral properties. These properties arise through local inhibitory interactions with interneurons in DCN (type II cells) and presumably with the Oc cells (195). The classification of CN cells is mostly based on subjective criteria, which contributes to discrepancies in conclusions of different studies.

Although there is by no means an agreed upon “task” for each of these circuits, it is clear that each cell type performs a different analysis of the auditory nerve input and conveys its output to a different part of the auditory brain stem. The bushy cells are clearly involved in binaural analysis important for spatial localization of sounds. Stellate cells are able to represent vowel spectrum over a wide range of intensities. Fusiform cells integrate somatosensory and spectral information and may signal important auditory events. Responses to AM offer another illustration of how CN cell types differ in their processing of auditory nerve input.

B. AM Responses of Neuronal Types in the CN

The relationship between AM coding and physiological cell class, as defined by the response to pure tones, was first examined by Frisina and co-workers in the gerbil (70, 71). These authors found that envelope phase-locking in ventral cochlear nucleus (VCN) was generally enhanced relative to the auditory nerve, and they described a hierarchy of enhancement that correlated with the precision of timing of response onset to pure tones.
Of the four physiological VCN cell types studied, cells with well-timed onset responses showed the highest gains, followed by choppers, PLN, and PL. The decrease in synchronization with increasing intensity is less than in the auditory nerve and in some cell types depends on fm, resulting in a peaked or tuned tMTF at high SPLs. Particularly these latter two response features, extended dynamic range and selectivity to fm, received much attention in later studies (Fig. 4). The general behavior of synchronization as a function of SPL and fm described by Frisina et al. (71) was confirmed and extended to other cell types in many subsequent studies, even though not all studies agree on the exact hierarchical ordering and the discreteness of the ordering.

FIG. 4. Two important transformations between the auditory nerve (dashed lines) and cochlear nucleus (solid lines). A: enhancement of envelope synchronization and extended dynamic range is present in many cell types. B: some cell types show bandpass tMTFs.

Some of the most interesting responses were observed in cells with chopper responses. Choppers are temporally tuned for fm, as reflected in bandpass tMTFs particularly at higher SPLs (gerbil, Ref. 71; cat, Ref. 229). A small percentage of choppers also shows bandpass tuning in their rMTFs (228). The fm causing the strongest synchronization is called the temporal best modulation frequency (tBMF). The occurrence of bandpass tuning is of obvious importance to the concept of a “modulation frequency filter bank” or “modulation channels” (131). This concept has some popularity, particularly in the psychophysical literature (see sect. II), and will be taken up again in our discussion of IC and auditory cortex. As mentioned, “chopping” reflects the intrinsic tendency to fire a regular burst of spikes at the beginning or sometimes entire duration of the stimulus, and these cells have therefore been viewed as resonators or intrinsic oscillators (150).
SPL-dependent bandpass tuning and oscillatory responses were also described earlier by Møller (187) in the rat. In a subclass of cells in the guinea pig, the intrinsic behavior is invariant with SPL and affects the temporal characteristics of the response to nondeterministic stimuli (301). There is a possibility that the intrinsic properties make these cells function as envelope filters that decompose the envelope spectrum, much in the way that inner hair cells in the turtle cochlea decompose stimulus frequency by virtue of an intrinsic electrical resonance mechanism (63). Several authors have therefore looked for correlations between AM and intrinsic oscillation behavior. Frisina et al. (71) compared the frequency of chopping with the tBMF for a sample of sustained choppers in VCN. The tBMFs spanned a range (170–700 Hz) roughly similar to the range of chopping frequencies (80–520 Hz), but the correlation between the two response properties was poor. There was a suggestion of interaction between chopping frequency and fm in that the tBMF only rarely exceeded the chopping frequency, which therefore seemed to set an upper bound. In a subpopulation of choppers (sustained choppers with a well-defined tBMF between 150 and 450 Hz), Rhode and Greenberg (229) noted a tendency for maximal envelope synchronization when fm matched the discharge rate to a tone at the same intensity. A strong and more general relationship, not restricted to choppers, was found by Kim et al. (141) in DCN/PVCN neurons of the unanesthetized decerebrate cat. In this study, the “intrinsic oscillation” frequency of a neuron was measured from the autocorrelation of its responses to pure or AM tones. Frequency of intrinsic oscillation and BMF were well correlated (r = 0.86) with regression close to the diagonal of equality, and the frequency ranges were roughly similar (50–500 Hz) to those reported for VCN choppers (71, 229).
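An autocorrelation-based estimate of an intrinsic oscillation frequency can be sketched as follows. This is only an illustration of the general idea, not the analysis procedure of the studies cited: the spike train is synthetic (regular intervals of about 4 ms with small Gaussian jitter, loosely mimicking a chopper), and the bin width, maximum lag, and number of interval orders are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical regular spike train: intervals of about 4 ms with small jitter
intervals_s = rng.normal(loc=0.004, scale=0.0003, size=2000)
spike_times_s = np.cumsum(intervals_s)

def autocorrelogram(spike_times, max_order=5, bin_s=0.0002, max_lag_s=0.020):
    """Histogram of intervals between spikes up to max_order spikes apart."""
    lags = np.concatenate(
        [spike_times[k:] - spike_times[:-k] for k in range(1, max_order + 1)]
    )
    # Start at 1 ms so that the search for the first peak skips the shortest lags
    edges = np.arange(0.001, max_lag_s + bin_s, bin_s)
    counts, edges = np.histogram(lags, bins=edges)
    centers = 0.5 * (edges[:-1] + edges[1:])
    return centers, counts

centers_s, counts = autocorrelogram(spike_times_s)

# The lag of the largest peak is taken as the period of the intrinsic oscillation
oscillation_hz = 1.0 / centers_s[np.argmax(counts)]
print(f"estimated intrinsic oscillation frequency: {oscillation_hz:.0f} Hz")
```

The estimate obtained this way could then be compared with a cell's BMF, with the resolution limited by the chosen bin width.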
Importantly, the remarkably good correlation arose from the pooling of different cell groups, rather than from a within-population trend, complicating any AM-coding scheme based on intrinsic oscillators. At least five cell types contributed to the data, surprisingly also including auditory nerve fibers.

Besides choppers, the other main constituent cell types of the AVCN are the two types of bushy cells with PL and PLN responses. As expected from their powerful auditory nerve inputs, PL and PLN cells resemble auditory nerve fibers in many regards, and indeed, their Rmax and tMTF cut-off frequency distributions at different CFs largely overlap that of the auditory nerve (129, 229). For PL cells this overlap is virtually complete, but for CFs below ~7 kHz, PLN cells synchronize much better to envelopes than auditory nerve fibers. At very low CFs some bushy cells have enhanced synchronization to both fine-structure and envelopes (124).

Comparisons of cell types across studies illustrate that one has to be careful with simple characterizations to multi-dimensional stimuli like AM. As remarked by Rhode and Greenberg (229), a single response parameter is not sufficient to characterize envelope synchronization. The highest gains found in choppers exceed those of PL cells but are mostly at fm values below 500 Hz (129, 229) so that at higher modulation frequencies PL cells are superior to choppers in transmitting envelope information. Consequently, the hierarchy of modulation enhancement strongly depends on the range of modulation frequencies of interest and also, as pointed out earlier (see sect. IVB), on the chosen metric (266). Rather than providing an exhaustive listing of response parameters for all cell types, we emphasize here the properties by which different CN cells stand out most from the auditory nerve and from each other.
For chopper cells this is the bandpass tuning of tMTFs; for bushy cells it is the extent of the tMTF (high cut-off frequencies).

The two main response types found in PVCN are onset (Oi and Oc), associated with the octopus and multipolar morphology, respectively. Both cell types show remarkable envelope phase-locking, in line with the precision of their onset response to pure tones. Oc cells have been particularly well-studied (cat, Refs. 125, 140, 228, 229). These cells show some of the highest gains, over the widest fm and SPL range, which is why Kim et al. (140) proposed that these cells have a special role in the extraction of the fundamental frequency of voiced speech sounds. Moreover, large changes in fc and even use of a wideband carrier have little effect on magnitude of synchronization (228). Oi cells have been studied very little, but the few existing data reveal interesting properties, in line with their biophysical specializations (199). These cells show the highest gains of all CN cells, reaching Rm values near 1 (228). Moreover, their tMTFs are high in gain and invariant for SPL, but all-pass. The rMTFs of these two classes of onset cells also appear unique among CN cell classes because they can be sharply bandpass. It is unclear whether these bandpass rMTFs can sustain a rate code for modulation frequency: among the handful of Oi cells reported, the range of rBMFs was only 350–450 Hz.

Onset units have wider frequency tuning than auditory nerve fibers (80, 118, 231). They therefore provide a test case of the suggestion that is sometimes made that tMTF bandwidths may broaden centrally by virtue of convergence of cells tuned to different CFs (180, 286). However, this would require phase information on the individual spectral components of the AM stimulus, and for frequencies above the pure-tone phase-locking range (>4–5 kHz in cat), such information is not available to the central processor.
Indeed, despite their wider frequency tuning, tMTF cut-off frequencies of onset cells do not exceed the limits imposed by the auditory nerve (125, 228, 229).

The DCN has traditionally been regarded as a part of the CN which has poor timing properties (79, 82, 154), and initial studies with AM seemed consistent with that view (horseshoe bat, Ref. 282; kangaroo rat, Ref. 29). However, more recent studies emphasized good AM coding in DCN (cat, Refs. 125, 229, 254; guinea pig, Refs. 322, 323) and specific roles for DCN in temporal processing have been proposed [pitch (150); extraction of envelopes in background noise (73) or at high SPLs (229)]. The tMTFs are typically low-pass or bandpass and differ from other CN cell types in their upper fm limit of phase-locking which never exceeds 800 Hz. To some extent, differences between studies reflect the complexity of this nucleus, both in diversity of response types and in nonlinearity of behavior (319). Oc cells can be found in deep DCN and may explain some of the high-gain responses to AM reported for DCN. Second, simple measures like maximum synchronization or cut-off frequency do not reveal the full complexity of DCN responses and give DCN a misleading “AVCN-like” appearance. Even though DCN interneurons and principal neurons can display high gain responses to AM stimuli, their response often shows strong nonmonotonicities, not only in average rate but also in magnitude and phase of envelope synchronization (125, 254, 322). These nonmonotonicities are likely a manifestation in the temporal domain of the intricate inhibitory and excitatory interactions that have been invoked to explain similar complexities in the frequency domain. A preliminary study by Frisina et al.
(73) in the chinchilla suggests that envelope synchronization of DCN neurons can be enhanced by background noise, but more systematic data and comparisons with auditory nerve and VCN are needed to evaluate whether DCN neurons are special in this regard. Rhode and Greenberg (229) studied envelope synchronization in the presence of wide-band noise in different CN cell types of the cat and found that in general there is remarkable preservation of envelope synchronization even at high noise levels.

As in the auditory nerve, few authors have systematically reported envelope phase data. Cells in the CN also show a linear increase in envelope phase with increasing fm, but the slopes are systematically steeper than in the auditory nerve, consistent with additional time delays required for conduction and synaptic transmission (125, 129). Delays calculated from response envelope phase are more tightly distributed and shorter than traditional measures of latency based on response onset (94, 185), as is the case for delay estimates based on fine-structure (65). Most CN studies of AM coding considered only tMTF magnitude and not phase when trying to infer functional consequences of AM tuning for the perception of natural stimuli. Delgutte et al. (40) used both tMTF magnitude and phase of responses in auditory nerve, CN, and IC to predict responses of the same neurons to speech utterances (see below) and stressed the importance of incorporating phase, particularly at very low modulation frequencies, to make successful predictions.

To summarize, the CN shows marked differences in AM coding relative to its auditory nerve input: wider dynamic ranges, higher gains, appearance of bandpass tMTFs, and less sensitivity to the presence of background noise. Furthermore, different cell types show marked diversity in their synchronization and average rate behavior to AM signals.
A simple hierarchical ranking does not do justice to the differences among cell types and depends on whether one emphasizes Rmax values (71, 295), breadth of the tMTF (129), or statistical reliability of phase-locking (266). As in the nerve, AM coding is almost entirely temporal: bandpass rMTFs occur rarely, in a few cell classes.

Our knowledge of CN responses to AM is still lacking in many ways and basically does not go far beyond phenomenology. Perhaps the most pressing question is the robustness and relevance of bandpass tMTFs, which many investigators regard as genuine envelope filters. More studies are needed to determine how invariant tMTF tuning is with stimulus parameters, what range of tBMFs is spanned at different CFs, and whether tMTF tuning indeed supports filtering of envelope energy in natural stimuli. Such information would be particularly valuable for carrier frequencies in the range of phase-locking to fine-structure (<4–5 kHz), which is poorly sampled in most studies in small animal species with higher-frequency hearing than humans. There are other lacunae. Data are sparse for certain cell types, most notably pure onset units in PVCN. In most studies, the stimulus is optimized for the cell under study; there is a need for population studies in which the response to a limited set of stimuli is examined for an entire population. Finally, there is currently no evidence for any kind of within-class topographic organization (e.g., within an isofrequency strip) of AM response properties in the CN.

VI. SUPERIOR OLIVARY COMPLEX: AN EXAMPLE OF TIME-TO-RATE CONVERSION

Part of the CN output is directed toward nuclei in the superior olivary complex (SOC). This is an amalgam of large and small nuclei some of which take part in well-studied circuits whose function is in feedback to the periphery (middle ear reflex and the olivocochlear efferent systems) or in the extraction of binaural differences important in spatial hearing.
The preceding and following sections illustrate that, with some notable exceptions, envelope coding in the CN is largely temporally based while at the level of the IC partial conversion to a rate code is apparent. In our discussion of SOC physiology we highlight one aspect of these circuits: the conversion of an envelope time code to an average rate code.

The duplex theory of sound localization holds that the azimuthal spatial position of low-frequency signals is determined primarily on the basis of the minute differences in time at which the acoustic waveform reaches the two ears, interaural time differences (ITDs), while high-frequency signals are localized on the basis of interaural SPL or level differences (ILDs). This classical psychophysical theory seems to be embodied anatomically and physiologically in two binaural circuits in the SOC of most mammals. The circuit centered on the medial superior olive (MSO) detects ITDs and contains primarily low-frequency cells. Another circuit, centered on the lateral superior olive (LSO), detects ILDs and has a bias towards high CFs. The detailed physiology of these circuits and their afferents is beyond the scope of this review (see Refs. 279, 312, 316).

Starting in the mid-1970s, a number of investigators reported that humans can reliably discriminate ITDs of high-frequency signals at thresholds approaching those for low-frequency signals, i.e., <20 µs, provided that the signals are not pure tones but have a time-varying envelope, as in AM sounds with the parameters illustrated in Figure 2. Clearly, subjects can detect with high precision the on-going envelope differences that occur when complex stimuli are delayed between the two ears. Physiological studies in the IC of cat (317) and rabbit (12) provided evidence for ITD sensitivity to AM signals but indicated that this sensitivity was probably generated at a lower level.
Subsequent recordings in the SOC indeed revealed cells that were sensitive to interaural delays of AM signals, and this ITD sensitivity could be understood from the binaural interactions known to occur in these nuclei and the AM coding properties of their afferents.

In the MSO, ITD sensitivity to AM signals is generated by a multiplicative, cross-correlation type operation. These cells behave as coincidence detectors, which has been particularly well-documented for low-frequency signals (81, 126, 313) but holds for modulated signals as well. The average firing rate of high-CF MSO cells to AM signals varies with ITD (Fig. 5A). Moreover, the optimal ITD is predicted from the phases measured from the monaural response to an ipsi- or contralaterally presented AM signal: the firing rate is high when the envelope signals from the two ears arrive in-phase at the site of convergence (10, 122, 313).

In the LSO, ITD sensitivity to AM signals is generated by a subtractive rather than a multiplicative process (Fig. 5B). These cells have ILD sensitivity by virtue of excitatory signals from the ipsilateral ear and inhibitory ones from the contralateral ear. Again bushy cells constitute both contra- and ipsilateral pathways. For ITDs at which the inhibitory and excitatory phase-locked signals reach the LSO cell coincidently, the signals cancel each other and the cell remains silent. At other ITDs cancellation is not perfect and the excitatory ear is now able to drive the cell. Thus the ILD sensitivity of the LSO cell combined with the envelope phase-locking in its afferents generates overall changes in discharge rate with ITD (10, 11, 122, 128, 129). Interestingly, in anesthetized cats LSO neurons show a “chopper” pattern to ipsilateral tone bursts, but unlike choppers in the CN, they lack tuning in the tMTFs (or rMTFs) to ipsilateral stimulation (129).

FIG. 5. Example of sensitivity to envelope interaural time differences (ITDs) in medial and lateral superior olive (MSO, row A; and LSO, row B). Sensitivity to ITDs of binaural AM stimuli (left column) shows a complementary pattern in MSO vs. LSO and is consistent with the response phase to monaural modulation (middle column): ITDs which bring the monaural responses in-phase cause a high firing rate. The complementarity arises from opposite signs of binaural interaction (right column): in MSO a process of coincidence detection operates on excitatory inputs from both sides, while in LSO a subtractive process operates on excitatory ipsi- and inhibitory contralateral inputs. The LSO response to contralateral modulation (middle column, bottom) would be as obtained by presenting an unmodulated stimulus to the ipsilateral, excitatory ear and a modulated response to the contralateral, inhibitory ear.

The simple time-to-rate conversion that occurs in binaural SOC nuclei may have analogs in monaural processing, e.g., rMTFs in the SOC of the mustache bat appear to be shaped by monaural excitatory and inhibitory interactions and delays similar to the binaural interactions described in cat and rabbit (91). The envelope ITD sensitivity in MSO and LSO also illustrates the general point that it is probably beneficial for a time-to-rate conversion (or more generally a recoding of a stimulus-locked temporal code into another form) to occur at a peripheral neural level. Indeed, the upper frequency limit (though not necessarily the gain) of phase-locking tends to decrease with subsequent integrative stages so that a de novo comparison of monaural phases by neurons at a higher level in the neuraxis would yield a more restricted ITD sensitivity. The frequency and modulation frequency range over which ITD sensitivity occurs in the IC and higher levels is comparable to that in the SOC but is rate-based (66, 219).
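The complementary ITD functions produced by the multiplicative (MSO) and subtractive (LSO) interactions described in this section can be illustrated with a deliberately simple rate model. The Python sketch below is only a toy under stated assumptions: the envelope-locked inputs are idealized as raised sinusoids, the MSO output is taken as the time average of their product, and the LSO output as the half-wave rectified difference between ipsilateral excitation and contralateral inhibition. The function name and all parameter values are hypothetical and are not taken from the cited studies.

```python
import numpy as np

def envelope_itd_curves(fm=300.0, m=1.0, n_itd=41, n_t=2048):
    """Toy ITD functions for envelope-locked inputs (assumed model, arbitrary units).

    Returns (itds, mso_rate, lso_rate), with ITDs spanning one modulation period
    centered on zero.
    """
    t = np.arange(n_t) / (n_t * fm)                      # one modulation period (s)
    itds = np.linspace(-0.5 / fm, 0.5 / fm, n_itd)       # contralateral delay (s)
    ipsi = 1.0 + m * np.sin(2 * np.pi * fm * t)          # envelope-locked ipsilateral drive
    mso = np.empty(n_itd)
    lso = np.empty(n_itd)
    for i, itd in enumerate(itds):
        contra = 1.0 + m * np.sin(2 * np.pi * fm * (t - itd))
        # MSO: coincidence detection idealized as a product of two excitatory inputs.
        mso[i] = np.mean(ipsi * contra)
        # LSO: ipsilateral excitation minus contralateral inhibition, rectified.
        lso[i] = np.mean(np.maximum(ipsi - contra, 0.0))
    return itds, mso, lso

itds, mso, lso = envelope_itd_curves()
best_itd_mso = itds[np.argmax(mso)]    # envelopes in phase: maximal coincidence
worst_itd_lso = itds[np.argmin(lso)]   # envelopes in phase: inhibition cancels excitation
```

In this sketch the MSO rate peaks and the LSO rate is minimal at the ITD that brings the two envelope-locked inputs into coincidence, which is the complementary pattern the text attributes to the opposite signs of binaural interaction.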
For example, the ITD sensitivity of high-frequency cells in the IC extends to modulation frequencies to which the cells no longer phase-lock when tested monaurally [on average 600 Hz binaurally vs. 250 Hz monaurally (12), see also sect. VIII, B and H]. Also, envelope phase-locking in the monaural inputs to the LSO extends to modulation frequencies more than an octave higher than the highest fm at which LSO neurons show ITD sensitivity (≤800 Hz) (129). The use of temporal information may thus be one evolutionary reason for the extensive subcortical processing in the auditory system relative to the other sensory systems.

Little is known about envelope sensitivity in other nuclei of the SOC. Olivocochlear efferent neurons in the guinea pig are surprisingly well phase-locked to AM signals below ~400 Hz (94), with bandpass tMTFs peaking at ~100 Hz. Maximal gains were ~8 dB higher than for auditory nerve fibers recorded in the same experiments. It is not known whether modulation differentially affects the targets of the medial olivocochlear neurons (the cochlear outer hair cells), although AM signals have been reported to be effective signals to suppress evoked otoacoustic emissions in humans (162).

Remarkable AM responses were described in monaural cells in the SOC of awake rabbits (145). Cells with sustained responses showed responses to AM similar in several respects to CN choppers, but an unusual class of “off” cells was inhibited during the presentation of pure tones and responded vigorously after stimulus termination. These cells were strongly driven by AM stimuli and showed high gains over a wide range of modulation frequencies, resulting in low-pass tMTFs and rMTFs. Several properties suggested that the responses were in effect a rebound from inhibition phase-locked to the stimulus envelope, a mechanism also observed in the SOC of the bat (91).

VII. THE NUCLEI OF THE LATERAL LEMNISCUS

The nuclei of the lateral lemniscus (NLL) are embedded in the lemniscal fibers that connect the lower brain stem nuclei with the IC (166, 262). As described in these reviews, ventral (VNLL), intermediate (INLL), and dorsal nuclei (DNLL) have been identified, although in some accounts the intermediate nucleus is treated as part of the ventral nucleus with all NLL neurons located ventral to the DNLL referred to as the ventral complex of the NLL (165, 166). Despite considerable progress over the last decade in understanding the physiology of these nuclei, only two accounts (both in echolocating bats) have described responses to amplitude modulation (Ref. 109, big brown bat, Eptesicus fuscus; Ref. 310, mustache bat, Pteronotus parnellii parnellii).

In the big brown bat synchronization to the modulation envelope occurred in nearly all unit types in VNLL, INLL, and DNLL (109). Both low- and band-pass tMTFs were reported. Neurons in VNLL and INLL responded to the highest frequencies of modulation with BMFs between 100 and 1,000 Hz and a preponderance of low-pass tMTFs. In contrast, a narrower range of tBMFs (100–500 Hz) was observed in DNLL units with a high proportion of bandpass tMTFs in sustained units. Responses to a similar range of modulation frequencies were recorded in the DNLL of mustache bat but with differences in the responses of onset and sustained units (310). Most onset units synchronized equally well to modulation frequencies between 100 and 300 Hz but showed markedly bandpass rMTFs. Sustained units responded up to 800 Hz with low-pass tMTFs and flat rMTFs. Inhibition contributes very differently to these responses. Blockade of GABAA receptors led to a reduction in synchronization at all modulation frequencies in sustained neurons while onset units either increased their modulation frequency cut-off to that of sustained neurons or revealed synchronization where none existed before.
The shapes of the rMTFs were not changed by blocking inhibition (310).

VIII. AMPLITUDE MODULATION ENCODING IN THE INFERIOR COLLICULUS: A CENTER FOR CONVERGENCE

A. Basic Organization of the IC

The several parallel pathways that diverge in the cochlear nucleus from the common input of the cochlear nerve converge again in the IC, the principal midbrain nucleus in the auditory pathway. The IC is an obligatory processing center for most information ascending via the medial geniculate body to the auditory cortex. Anatomical investigations of the IC in several species have identified a broadly consistent arrangement of subdivisions: a central nucleus (CNIC) receiving most of the main ascending afferent input from many brain stem nuclei is surrounded dorsally, laterally, and rostrally by dorsal (DCIC) and external cortices (ECIC) (166, 200, 201). The CNIC is distinguished from the other subdivisions by its laminar organization. It is composed of two main cell types termed disc-shaped or flat cells interspersed with stellate or less-flat cells (164, 182). This cytoarchitecture gives rise in three dimensions to twisted laminae of cells and fibers (167) that constitute the substrate for the highly tonotopic frequency organization in the IC (173, 237, 252, 264). The frequency-band laminae are oriented so that neuronal best frequency increases along the dorsolateral to ventromedial axis of the nucleus.

A defining feature of CNIC is the convergence of temporal, spectral, and spatial information extracted in parallel earlier in the pathway onto this laminar structure. However, the full details of how these converging inputs map onto individual neurons have yet to be elucidated, and it is not known to what extent the different strands of information are processed independently in the IC. The DCIC and ECIC, as well as differing from the CNIC in their cytoarchitecture, have different inputs and outputs.
Descending projections from the cortex terminate predominantly (304), although not exclusively, in the cortical divisions (248). The IC is an important source of both ascending fibers to the thalamus and descending connections to lower brain stem structures (110).

The monaural and binaural response properties of single neurons in the IC have been extensively documented (see Refs. 24, 112, 113). Despite the limitations in our knowledge about its cellular organization, it is clear that the output of the IC is considerably modified relative to its input. This is exemplified by the response patterns of IC neurons to complex sounds including AM. For the most part, such knowledge is derived from studies in anesthetized animals that have focused on neurons recorded in response to monaural stimulation of the ear contralateral to the side of recording, and in what follows monaural stimulation should be assumed unless specified. Most of the studies discussed here describe recordings attributed to the central nucleus, but depending on the age of the study and the parcellation adopted, in many cases this will have included at least part of the DCIC and ECIC as well as the CNIC. Therefore, in this review the term IC is used to indicate all subdivisions.

B. Modulation Transfer Functions for IC Units: Synchronization

IC neurons show strongly modulated responses that for many modulation frequencies greatly exceed the modulation in the stimulus (144, 222–224). Modulation gains calculated from synchronized responses in the IC are often 15–20 dB (144, 222, 224) and so are larger than equivalent measurements obtained in the auditory nerve and for most neuron types in the CN. The shape of the tMTF depends on the parameters of the stimulus (see below) but is invariably either bandpass or low pass (144, 152, 191, 222–224).
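To make the size of such gains concrete, the following Python sketch computes a synchronization-based modulation gain from spike times. It assumes the commonly used definition gain = 20 log10(2R/m), where R is the vector strength of the response at fm and m is the stimulus modulation depth. The function names, the synthetic spike train, and the modulation depth of 0.3 are illustrative assumptions and are not taken from any of the cited studies.

```python
import numpy as np

def vector_strength(spike_times, fm):
    """Vector strength R (0 to 1) of spike times (s) at modulation frequency fm (Hz)."""
    phases = 2.0 * np.pi * fm * np.asarray(spike_times, dtype=float)
    return float(np.hypot(np.cos(phases).mean(), np.sin(phases).mean()))

def modulation_gain_db(r, m):
    """Gain in dB of the response modulation (2R) relative to the stimulus depth m."""
    return 20.0 * np.log10(2.0 * r / m)

# Hypothetical spike train: one spike per cycle of a 100-Hz modulation,
# jittered uniformly by +/-1 ms around a fixed envelope phase.
fm = 100.0
rng = np.random.default_rng(0)
spikes = np.arange(200) / fm + rng.uniform(-1e-3, 1e-3, size=200)

r = vector_strength(spikes, fm)
gain = modulation_gain_db(r, m=0.3)   # assumed modulation depth of 30%
```

Under this definition a gain of 0 dB means that the response is modulated exactly as deeply as the stimulus (2R = m). Because R cannot exceed 1, the gain is bounded by 20 log10(2/m), so large gains can only be measured at small modulation depths.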
Modulation gain may be enhanced in the IC, but modulation frequencies that elicit a synchronized response are restricted to a lower range than in the periphery. This is manifest in both the tBMFs of neurons in the IC and the range of frequencies over which there is significant modulation of the response (Fig. 9). In the rat, Rees and Møller (223) obtained a modal tBMF in the range of 100–120 Hz. The tBMF never exceeded 200 Hz, and the high-frequency cut-off of the tMTF (measured 10 dB down from the BMF) did not exceed 320 Hz. In guinea pig, tBMFs fall below 150 Hz with most peaking between 50 and 100 Hz (224). Broadly similar values have been obtained in gerbil (144) and squirrel monkey (191). In the latter, 73% of neurons showed a bandpass tMTF for AM with tBMFs between 32 and 64 Hz. In rabbit, single units and multiunit clusters had a mean tBMF of 87 Hz (12). However, it is worth noting that one unit synchronized to a modulation frequency of 925 Hz. For samples of phasic neurons in both young and old mice, tBMFs were all below 200 Hz (291). Similarly in mustache bat, the majority of units (~70%) only synchronized their firing to modulation frequencies below 300 Hz, but a small proportion (4.5%) synchronized up to 500 Hz (20). While these values are broadly similar, the differences that exist more likely reflect species differences than the presence or absence of anesthetic, since there is no segregation of the values consistent with anesthetic status.

Rees and Møller (223) demonstrated that the shape of the tMTF is highly dependent on stimulus level as in some cochlear nucleus neurons. When stimulus intensity is close to threshold, tMTFs are usually low-pass functions but become more bandpass as the mean intensity of the stimulus is increased. This change may be accompanied by an upward shift in the tBMF.
For neurons with nonmonotonic rate-level functions, however, the tMTF becomes low pass at sound levels falling on the negatively sloping limb of the rate-level function (224). So the relationship between tMTF shape and sound level is indirect, with firing rate, perhaps reflecting the net excitatory drive to the neuron, being the better predictor of the low-frequency slope of the tMTF. Why the effect of stimulus level is only apparent at low modulation frequencies is not clear and may depend on a number of factors including adaptation. Another possibility is that the neuron’s probability of firing at low stimulus intensities is only high near the peak of the modulation cycle resulting in highly synchronized firing. As intensity is increased, threshold is exceeded for a larger fraction of the modulation cycle leading to a reduction in synchronization. This effect might not be apparent at high modulation frequencies because the frequency of modulation approaches the neuron’s maximum firing rate, so ultimately only a single spike occurs in each cycle giving a high degree of synchronization whose upper limit is determined by temporal resolution. Such effects become more apparent in the cochlear nucleus and IC than the auditory nerve because of the enhancing effects of time-dependent inhibition, membrane properties, and other nonlinearities in more central neurons, evidenced by their lower spike rates.

Further evidence for a relationship between tMTF shape and firing rate is provided by the effect of background noise. Bandpass tMTFs become low pass with the addition of progressively higher levels of background noise (223). Rees and Palmer (224) showed that this change correlated with the noise-induced shift in the neuron’s input/output function along the level axis and its consequent effect on the firing rate elicited by a stimulus.

C. Modulation Transfer Functions for IC Units: Average Rate

The most striking change in AM responses between the IC and its peripheral inputs is in the tuning of rMTFs; the dependence of average firing rate in the IC on modulation frequency is stronger, more common, and has a much wider diversity of patterns than is the case in the CN or the SOC (Fig. 6). (But it is important to note that we have only limited information about rate responses to modulation in the nuclei of the SOC and lateral lemniscus.)

rMTFs show a wider range of patterns than is usually observed for tMTFs. In the cat, Langner and Schreiner (152) identified specific patterns of rMTF in a population of single- and multi-unit clusters. These included bandpass, low-pass, high-pass, band-reject, or complex types. The majority were bandpass (70% of single units, 58% of multiunits). Similar response patterns are also found in bat (32) and mouse (291). In guinea pig, 45% of rMTFs were bandpass; the remainder included a variety of different shapes, with some units showing little effect of modulation frequency on firing rate (224). Units whose average firing rate did not change with modulation frequency were the most common type encountered in squirrel monkey, making up almost half of the total (191).

The most detailed study of rMTFs in the IC is that of Krishna and Semple (144) in gerbil. In addition to confirming the rMTF shapes described previously, Krishna and Semple (144) noted that many rMTFs were characterized by distinct ranges of modulation frequency over which firing rate was enhanced or suppressed. In some, regions of enhancement were separated by a marked region of suppression that defined a worst modulation frequency separating the two maxima.

Like synchronized responses, rate responses to modulation depend on the mean level of the stimulus (144, 224).
Where units have bandpass rMTFs and monotonic rate-level functions, the heights of the peaks in the rMTFs increase and then decrease with the average level. They are highest when measured at sound levels on the sloping portion of the rate-level function and decline as the stimulus level rises into the saturating region of the rate-level function (224). Across a population of neurons with peaked rMTFs, increases and decreases in BMF with level were observed (144). In units with rMTFs containing regions of suppression, the suppression often becomes more prominent as stimulus level or modulation depth is increased. In some instances, regions of firing rate enhancement changed to suppression at high stimulus levels. Krishna and Semple (144) postulate that inhibition is an important contributor to these effects.

FIG. 6. Transformations between the cochlear nucleus (CN) and inferior colliculus (IC). Both nuclei show a wide variety of AM responses; each column highlights only one of the types of responses observed and how these are affected by parametric stimulus variations (in SPL, m, fc) in single cells. The most striking difference between CN and IC is in the rate modulation transfer functions (rMTFs), which are only rarely sharply peaked in the CN (C) but frequently so in the IC (D), where they can also show a degree of invariance with AM parameters. This is also the case for temporal modulation transfer functions (tMTFs) in the IC (B), which reach higher maximal synchronization values than in the CN (A) and often show a degree of bandpass selectivity, but their maxima occur at lower fm than in the CN.

There is general agreement across species in the modal value of the rBMF distribution in the IC. In the cat, the modal value for rBMF lies between 30 and 100 Hz (152). These values are in keeping with those reported in rat (222, 223), guinea pig (224), gerbil (144), and bat (32).
In the primate, the peak of the distribution of rBMFs of multi-units was 128 Hz (191). There is less agreement over the upper frequency limit for rBMFs. In the cat, almost 20% of multiunit clusters had rBMFs greater than 200 Hz as did ~5% of single units (152). A few units had rBMFs as high as 1,000 Hz. rBMFs of up to 800 Hz were also reported for some units in bat (32) and mouse (291). In contrast, the maximum rBMFs recorded for single units in gerbil did not exceed 300 Hz (103, 144), and in squirrel monkey, the maximum rBMF value reported was 256 Hz (191).

It is quite likely that the differences between these studies reflect true species differences, with there being no such creature as the average mammal. However, other factors might be contributory. The cat data show that rBMFs >300 Hz were more prevalent in multi-unit recordings. As Langner and Schreiner (152) comment, multi-unit recordings may contain responses from the fiber inputs to the IC as well as its neurons. Given that some of these inputs originate from nuclei in which neurons synchronize to higher modulation frequencies than in the IC, their contribution could be misleading. On the other hand, units with high rBMFs may be more difficult to record as single units, and a small number of single units with high BMFs were reported. Krishna and Semple (144) suggest that misclassifying the secondary peak of enhancement as the BMF in those units with more than one rMTF peak might explain the high rBMFs reported in cat.

Apart from species differences, the presence or absence of anesthesia is another factor that could account for the observed differences in the ranges of rBMFs. However, it seems unlikely that anesthesia is the only factor, since some of the largest differences are seen when comparing data from different species where no anesthetic was used [compare values above for squirrel monkey (191), bat (32), and mouse (291)].
On the other hand, similar values were obtained in some anesthetized and unanesthetized preparations, e.g., cat (152) and mouse (291). Unfortunately, definitive experiments comparing the presence and absence of anesthetic have yet to be performed.

D. What Determines the MTF Upper Limit in the IC?

Across species, tBMFs and tMTF cut-off frequencies are generally lower in the IC than at more peripheral stages of the pathway. The reasons for this are not clear. In the auditory nerve, filter bandwidth is one limiting factor as evidenced by the correlation between the upper limit of the response to modulation and a fiber’s CF (see sect. IVB and Fig. 2). However, evidence for a similar relationship between the response to AM and CF in the IC is weak. In the cat, the upper boundary of the rBMF distribution (and presumably the tMTF distribution since rBMFs and tBMFs are reported to be similar) for multiunits increases with CF (152). But evidence of such a correlation was not apparent in single-unit data recorded in other species [rat tBMF (223), squirrel monkey (rate or synchronization not specified) (191), bat rBMFs and tBMFs (32), or gerbil (144)]. Krishna and Semple (144) examined a large data set and failed to find any correlation between CF and rBMF or between CF and the cut-off frequency of either rMTFs or tMTFs. Furthermore, the frequency bandwidths of most IC neurons are sufficiently wide to accommodate the stimulus spectrum. Thus it seems something other than frequency bandwidth is primarily responsible for setting the upper frequency limit of the response to AM in the IC.

An alternative possibility is that the shift in the response to lower modulation frequencies in the IC reflects a reduction in temporal resolution. Such a reduction is suggested by an upper frequency limit of 600 Hz for phase-locking to pure tones in the IC, a substantially lower value than pertains in auditory nerve fibers (147).
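How a loss of temporal resolution could lower the upper limit of synchronization can be illustrated with a simple jitter calculation. The Python sketch below assumes, purely for illustration, that each processing stage adds independent Gaussian spike-time jitter; for jitter with standard deviation sigma the expected vector strength at frequency f is then exp(-2 pi^2 f^2 sigma^2). The jitter values and the 0.1 synchronization criterion are hypothetical, and the sketch is not a model proposed in the cited studies.

```python
import numpy as np

def vs_gaussian_jitter(f, sigma):
    """Expected vector strength at frequency f (Hz) for Gaussian jitter of SD sigma (s)."""
    return np.exp(-2.0 * (np.pi * f * sigma) ** 2)

def cutoff_frequency(sigma, criterion=0.1):
    """Frequency (Hz) at which the expected vector strength falls to `criterion`."""
    return np.sqrt(-np.log(criterion) / 2.0) / (np.pi * sigma)

# Hypothetical jitter (SD, in s) added at three successive stages of the pathway.
stage_sigmas = np.array([0.1e-3, 0.2e-3, 0.3e-3])
cumulative = np.sqrt(np.cumsum(stage_sigmas ** 2))   # independent jitters add in variance
cutoffs = cutoff_frequency(cumulative)               # upper limit after each stage

# Monte Carlo check of the closed-form expression at a single frequency.
rng = np.random.default_rng(1)
f = 500.0
spikes = np.arange(100_000) / f + rng.normal(0.0, cumulative[-1], size=100_000)
phases = 2.0 * np.pi * f * spikes
vs_simulated = np.hypot(np.cos(phases).mean(), np.sin(phases).mean())
vs_predicted = vs_gaussian_jitter(f, cumulative[-1])
```

Because the variances add, the cut-off frequency in this sketch can only fall from one stage to the next, which is one way of picturing a progressive decline in the upper limit of phase-locking along the pathway.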
The mechanisms responsible have not been identified, but intrinsic membrane properties and synaptic mechanisms are possible candidates, as is the accumulated loss of temporal resolution en route from the periphery. The contribution of synaptic processing is now being investigated, but thus far blockade of inhibitory or excitatory mechanisms has failed to show any significant influence on the upper limit of synchronization. Neurons in the IC of the mustache bat seldom responded to a wider range of modulation frequencies following the blockade of GABAA, GABAB, or glycinergic inhibition (20). This finding is in contrast to the marked increase in the upper limit of synchronization in DNLL neurons in the same species with GABAergic blockade (310). Similarly, neither blockade of N-methyl-D-aspartate (NMDA) (20, 321) nor DL-α-amino-3-hydroxy-5-methylisoxazole-propionic acid (AMPA) excitatory receptors (321) resulted in changes in the upper limit of synchronization. Likewise, in chinchilla, Caspary et al. (28) found no change in the temporal response to AM with blockade of GABAA receptors, but they did report changes selectively affecting the low-frequency limb of rMTFs in some units.

E. Is AM Encoded in the IC by Rate or Synchronization?

Whether AM is encoded in the IC by synchronization or by average firing rate remains an open question. Of course, both measures may be important either independently or combined as synchronized rate. tMTFs and rMTFs and BMFs match in many units, but in a significant percentage of neurons they are different, with, in some cases, no obvious dependence of rate on modulation frequency despite a clearly tuned tMTF (144, 152, 191, 222, 224). Population data on the MTF types obtained using synchronized or average rate measurements were reported in the cat (152). Seventy percent of units had bandpass rMTFs, and only 7% were low pass.
In contrast, a much larger proportion of tMTFs showed low-pass functions (48%) compared with bandpass functions (33%), such that 60% of units with low-pass tMTFs had bandpass rMTFs. Nevertheless, the relationship between firing rate and modulation frequency that emerges in IC might signal a transformation in the encoding of AM from a temporal to a rate-based representation, and models have been proposed explaining how this might be achieved (105, 149, 160). A common approach invokes coincidence detection in IC neurons operating on synchronized responses to modulation from stellate cells in the cochlear nucleus. Although elegantly simulating many modulation responses of neurons in the IC, current implementations match the BMFs of the IC neuron and its inputs from the cochlear nucleus despite experimental data (cf. sects. V and VII and Fig. 9) which suggest that the BMF ranges are not the same.

As this discussion has shown, synchronized responses to the modulation envelope are well maintained in the colliculus, and rMTFs are not simple reflections of tMTFs. It is premature, therefore, to conclude that temporally based encoding of the modulation envelope has no significance in the IC. Both rate and synchronized coding might be retained with different functional consequences. A rate code could allow the encoding of modulation frequencies that exceed the synchronization limit in the IC, and the data of Schreiner and Langner (251) support this conjecture as does the finding in squirrel monkey that the distribution of rMTFs peaks at a higher frequency than the distribution of tMTFs (191). On the other hand, some studies show that synchronization and rate measures extend over broadly similar ranges of modulation frequency (see sect. VIIIC).

F. Relationship Between AM Responses and Other Neuronal Properties

Possible functional relationships between response to AM and other physiological properties have not been well explored in the IC (at least partly because there is no generally accepted physiological classification scheme, as is the case for the CN). A variety of firing patterns to tones are recorded in the IC, and most authors have distinguished onset and sustained responses (see Refs. 112, 113 for review), which can be further subdivided into distinct classes (e.g., Refs. 221, 290). Such patterns depend on the state of intrinsic membrane conductances that in turn are modulated by inhibition (155, 209, 268). Both sustained and onset units can respond to continuous AM stimuli that last several seconds (144, 222, 224). Although some onset units fail to respond to AM, those that do respond do so at modulation depths well below 100%, negating the argument that the response is effectively to a series of tone bursts. It does seem that onset units are the least likely to respond to AM. In both bat and rat, most of the units failing to respond to modulation were onset types (32, 204).

Other differences in the response to AM between different unit types are also beginning to emerge. In bat, average rBMFs increased progressively when comparing the responses of tonic, chopper, and onset neurons (32). Sinex et al. (267) report differences between unit types and their responses to sinusoidal and trapezoidal AM. Krishna and Semple (144) describe rMTFs with two peaks separated by a region of suppression. These were predominantly seen in units with sustained or pauser PST histograms. Onset or onset-sustained neurons showed only a single peak of enhancement.

Another property of IC neurons correlating with the response to modulation is regularity of firing. Regular firing, as measured by calculating the coefficient of variation (320), is apparent in a number of different neuronal types (221).
A preliminary report (225) shows that units with highly regular intrinsic oscillations show a strong correlation between tBMF and the oscillation frequency. On the other hand, cells with peaked rMTFs are mainly limited to neurons that fire irregularly to tones.

G. Is Modulation Frequency Represented Topographically in the IC?

Some of the responses discussed so far, in CN and IC, provide suggestive evidence for a physiological implementation of a modulation filter bank. This view would be strengthened if neurons were found to be spatially organized according to their AM tuning properties, since the creation of spatial maps is a common strategy in nervous systems.

Evidence for a topographic representation of modulation frequency in the IC of cat was reported by Schreiner and Langner (251). rBMFs and tBMFs were determined for units encountered in multiple penetrations through the IC at recording sites reconstructed from the coordinates of the electrode penetration and the recording depth. The measured values, together with interpolated points, were assembled to create a map of BMF. Two patterns of rBMF organization emerged. First, a gradient of rBMF extended along the dorsoventral axis of the colliculus with CF. Measurements of rBMF along such electrode penetrations revealed a progressive increase in rBMF with depth, although the overall trend was accompanied by discontinuities and reversals of rBMF. In addition, a map of BMF extended across the plane of the frequency-band laminae. The highest BMFs were found caudally in the lateral half of the lamina. Regions representing the highest BMFs were surrounded by “quasiconcentric” iso-BMF contours representing progressively lower BMFs. The diameter of the contour representing each BMF and the upper limit of BMF increased with CF.
Thus, considered in three dimensions, each modulation frequency is represented on the surface of a cone having its base located in the high-frequency region of the IC and its long axis aligned with the dorsoventrally orientated tonotopic axis of the IC (Fig. 7). Schreiner and Langner (251) propose that this map demonstrates the importance of the IC in the perception of periodicity pitch and that such a representation could facilitate the integration of periodicity information across carrier frequency. In support of the map, they cite the corroborative evidence that response latency is spatially mapped across the frequency band laminae in the IC (153) and that BMF is negatively correlated with response latency. This implies that there should be a mapping of BMF along the same axis as the latency map. Evidence for a mapping of modulation frequency has also been reported in a developmental study in the gerbil with responses to the highest modulation frequencies found most laterally as in the cat (103). The publication of such a mapping of BMF has been influential in the development of theories and models of temporal processing in the auditory pathway (35–37, 105, 149). However, a correlation of BMF with location or with CF has not been confirmed in other studies; indeed, as discussed above, there is still debate about the range of modulation frequencies represented in the IC. Given the concentric organization of the modulation map described in the cat, it is unlikely that a pattern of such complexity would be found unless it were the primary objective of the study. But, as discussed in section VIIIC, the determination of BMFs from multiunit data, on which most of the mapping is based, must proceed with caution. On the other hand, it is difficult in single-unit studies to achieve the necessary sampling density that such mapping ideally requires. 
An additional complicating factor in this discussion is the lack of invariance of both tBMFs and rBMFs with stimulus level (144, 223). Resolution of this issue may depend on the development of techniques that enable the modulation response properties of large populations of neurons to be determined with high spatial and temporal resolution. Finally, it should be emphasized that the absence of a map would not invalidate the existence of a modulation filter bank. As an analogy, there is some evidence for a map of ITD tuning in the MSO (14, 269, 313), but a spatial organization in the IC has not been convincingly demonstrated (315). Nevertheless, the relevance of ITD tuning for binaural hearing is not in question.

FIG. 7. Bandpass temporal (A) and rate modulation transfer functions (B) in the inferior colliculus (IC), with indication of best modulation frequencies (tBMF and rBMF) and cutoff frequencies. Various definitions have been used for cutoff frequency, usually based on a decrease in gain (e.g., the frequency at which the synchronization value is 3 dB down from the maximal gain at the BMF) or statistical significance (e.g., the highest frequency at which significant synchronization is observed). C: schematic illustration of the proposed map of BMFs in the IC. The concentric circles indicate iso-BMF contours within an iso-frequency plane. Dashed lines connect contours of the same BMF.

H. Responses to Interaural Time Disparities in Modulation Envelopes

Human subjects can localize sounds using ongoing ITDs generated by the amplitude envelope, even when the carrier frequency of the sound is above 1.5 kHz and subjects can no longer localize using interaural time differences in the carrier (see sect. VI). Physiological responses to such binaurally disparate amplitude modulations were first investigated systematically by Yin et al. (317).
Firing varied cyclically as a function of ITD, at a period equal to that of fm, indicating that the neurons were responding to the interaural delay of the modulation waveform, not of the carrier. In many respects, ITD sensitivity in the IC strongly resembles that in the SOC, e.g., it reflects the same two basic forms of interaction (see sect. VI and Fig. 5). There are also differences, indicating an elaboration of response properties between SOC and IC, but these are outside the scope of this review (13, 66, 172). The width of ITD tuning to sinusoidal signals is basically determined by the period of the stimulus. Low frequencies are weighted more heavily in responses based on envelope than in those based on fine structure, because envelope MTFs of IC cells typically extend further to low frequencies than their tuning to fine structure. ITD tuning therefore is typically broader to AM signals than to tones. However, even at high CFs, where phase-locking to fine structure is completely lacking, the ITD tuning to broadband noise can be surprisingly sharp (123). The presence of such tuning, in the absence of any ITD sensitivity to pure tones, indicates that envelope fluctuations generated by the interaction of the cochlear bandpass filters with the broadband stimulus can effectively be used in the computation of ITDs.

I. Contribution of Nonlinearities

For all but the lowest modulation depths, the response to a sinusoidal AM in the IC is not sinusoidal but more peaked with firing restricted to only part of the modulation cycle (144, 196, 222). As modulation depth is increased, changes also occur in the phase of the response histograms relative to the stimulus (144, 196, 222). Such changes are consistent with the response following the amplitude envelope at low modulation depths but changing to one which is sensitive to the rate of amplitude change at high depths.
Sometimes this is associated with the appearance of a smaller second peak in the histogram indicative of a response to the downward amplitude change in the modulation cycle (222). Direct evidence for such responses comes from experiments using modulations with exponential envelopes (215). Similarly, asymmetries have been reported in both the rate and temporal responses of IC neurons in guinea pig to exponentially ramped and damped sinusoids (197). When such ramped and damped stimuli have the same half-life, their long-term spectra are identical, but their different temporal structures generate quite distinct percepts (205). The percentage of units showing asymmetry in the magnitude of their temporal or rate responses to these stimuli is greater than that obtained using similar analyses in the VCN (216), and the proportion of neurons showing response asymmetry at each stimulus half-life closely matched human psychophysical performance (205). A few studies have investigated nonlinearities in the responses of IC neurons to AM using more complex modulation waveforms. Møller and Rees (189) recorded spike histograms synchronized to the period of a pseudorandom noise used to modulate a tone carrier. Cross-correlation of the pseudorandom noise with the histogram to obtain the impulse response, followed by Fourier transformation, generates the tMTF. This estimate of the linear component of the response correlates well with responses obtained using sinusoidal modulation. An estimate of the nonlinear component can be obtained by using the impulse response to model the neuron, with the difference between the neuronal and model outputs providing a measure of the nonlinearities present in the neuronal response. The nonlinearities were predominantly even order, perhaps representing asymmetry in the response to increasing and decreasing sound intensity. Application of this technique to the owl IC similarly demonstrated the presence of significant nonlinearity (133).
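The cross-correlation approach described above can be illustrated with a small simulation. This is a hedged sketch only, not the analysis code of Møller and Rees: the "neuron" is a hypothetical purely linear filter (a damped 50-Hz oscillation), the modulator is white Gaussian noise, and the sampling rate, record length, and baseline rate are all assumed values chosen for illustration.

```python
import numpy as np

fs = 1000.0   # histogram sampling rate in Hz (assumed)
n = 4096      # bins in one period of the pseudorandom modulator (assumed)
rng = np.random.default_rng(1)

# One period of a white, zero-mean pseudorandom modulation waveform.
x = rng.standard_normal(n)
x -= x.mean()

# Hypothetical linear "neuron": damped 50-Hz oscillation as impulse response.
k = np.arange(200)
h_true = np.exp(-k / 20.0) * np.sin(2 * np.pi * 50.0 * k / fs)

# Period histogram: baseline rate plus the modulator filtered by h_true.
# Circular convolution is used because the stimulus repeats periodically.
hist = 30.0 + np.fft.ifft(np.fft.fft(x) * np.fft.fft(h_true, n)).real

# Step 1: circular cross-correlation of modulator and histogram.
# For a white input this approximates the impulse response.
y = hist - hist.mean()
r_xy = np.fft.ifft(np.conj(np.fft.fft(x)) * np.fft.fft(y)).real / n
h_est = r_xy[:256] / x.var()

# Step 2: Fourier transform of the impulse response gives the tMTF (gain).
nfft = 4096
freqs = np.fft.rfftfreq(nfft, d=1.0 / fs)
tmtf = np.abs(np.fft.rfft(h_est, nfft))
tbmf = freqs[np.argmax(tmtf)]
print(f"estimated tBMF: {tbmf:.1f} Hz")
```

Because the simulated unit is linear, the recovered impulse response closely matches the true one; in real recordings, the residual between the measured histogram and the output predicted from `h_est` would be the quantity of interest as a measure of nonlinearity.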
Such nonlinearities are more prominent in the response of IC neurons than those in the cochlear nucleus (184, 188). The AM stimulus that ultimately holds the greatest interest for auditory neuroscience is human speech. Delgutte et al. (40) compared the encoding of modulated noise and a speech utterance at the levels of the auditory nerve, CN, and IC. Step responses derived from the responses to modulation indicate that responses to amplitude changes in the IC are more phasic than those in the auditory nerve and, to a lesser extent, the CN. This was borne out by the responses to speech sounds that were characterized by bursts of activity at the onsets of syllables. When the responses to the speech waveform were estimated with the linear component of the modulation, the model accurately predicted the neural response for neurons in the auditory nerve and cochlear nucleus, but the match for the IC was poor. Although much less abundant than reports using sinusoidal modulation, these studies indicate that the emergence of nonlinear responses to modulated stimuli is a defining characteristic of processing in the IC, and the greater application of such nonsinusoidal AM stimuli is likely to add substantially to our knowledge of nonlinear mechanisms in the IC.

IX. AMPLITUDE MODULATION ENCODING IN AUDITORY THALAMUS AND CEREBRAL CORTEX

A. Basic Layout of the Thalamocortical System

The medial geniculate body (MGB) of the thalamus is an obligatory station for auditory information from the midbrain to the cerebral cortex. Based on cytoarchitecture, connectivities, and physiological response properties, three main thalamic regions can be defined (304). Similarly, auditory cortex consists of several distinct fields that can be grouped into core, belt, and parabelt regions according to connectivity and physiology (130, 218).
We discuss the projection systems set up in the thalamus and their relationship with the parcellation of auditory cortex. The ventral division of the MGB (MGBv) is considered the principal part and is functionally distinguished by a clear tonotopy that is related to its laminar dendritic organization. The MGBv is functionally homogeneous with sharp frequency selectivity, short latencies, and low response thresholds. Several properties, such as the density of inhibitory interneurons, sharpness of tuning, onset latency, and strength of pure-tone phase-locking, vary systematically along the anterior-posterior axis, i.e., orthogonal to the frequency gradient (236). The axons from the ventral division terminate predominantly in tonotopically organized “core” areas of auditory cortex, specifically the primary auditory cortex (AI) as well as the anterior and posterior auditory fields (AAF and PAF, respectively) in the cat and field R in the macaque monkey. The projections from MGBv also reflect the anterior-posterior gradients so that, for example, AAF in the cat receives stronger input from the anterior pole, whereas PAF and the ventroposterior auditory field (VPAF) are chiefly connected with the posterior pole. The same holds for the numerous corticothalamic feedback projections from the cortical core regions to the MGBv. Two further projection systems parallel to the tonotopic system have been identified. One “diffuse” or nontonotopic system is routed through the dorsal division of the MGB (MGBd). MGBd and its subdivisions are characterized by broad tuning, weak responses to tones, and some preference for more complex sounds. The dominant neurons are stellate cells, and the cortical projection is predominantly to nontonotopical fields in the belt and parabelt regions of auditory cortex such as the second auditory field (AII) in cat and CM in the macaque monkey. The third projection system is associated with the medial division of the MGB (MGBm).
This “magnocellular” area is characterized by fairly large multipolar cells and receives polysensory inputs. No clear tonotopic organization is evident, and the neurons are usually broadly tuned or have multiple response areas. MGBm projects to a wide range of cortical fields including areas in the core, belt, and parabelt regions, and it also receives widespread corticothalamic feedback. In addition, the dorsal and medial projection systems are distinguished by their termination predominantly in layers I and VI, while inputs from the main tonotopic system end in layers IV and III. Functional differences between the three projection systems and their associated regions have been mainly explored using spectral properties, such as frequency and intensity. Again, the importance of temporal dimensions in the perception of complex sounds suggests that much can be gained from the study of temporal response features in the different parts of auditory thalamus and cortex (31, 101, 210).

B. Temporal Responses in the MGB

Relatively few studies have addressed the capability of thalamic neurons to encode temporal information. A study of thalamic neurons in the awake guinea pig (34) revealed that some neurons phase-lock to AM tones with modulation frequencies up to 200 Hz. A more systematic study in the awake squirrel monkey (217) showed that most tMTFs were bandpass with tBMFs between 2 and 128 Hz. The most commonly encountered tBMF was at 32 Hz. MGBm had a higher median tBMF (16 Hz) than MGBv (8 Hz). Over the range of modulation frequencies tested, no significant difference was observed between rBMFs and tBMFs. This suggests that AM coding in the thalamus, at least below ~100 Hz, is mostly conveyed by a temporal code accompanied by rate changes due to the phasic nature of the responses. To date, there is little information available that directly contributes to the question of the increasing prominence of rate-coding in the more central auditory stations.
Changes in modulation depth affect rate and synchronization differently; synchronization increased with increase in m, while the firing rate showed a nonmonotonic dependence. Changes in overall intensity of the AM signal resulted in either monotonic or nonmonotonic changes in firing rate and synchronization, with a higher percentage of nonmonotonic changes in synchronization. Recently, a number of studies in a variety of structures have utilized complex auditory spectra to estimate the spectrotemporal receptive field (STRF) of neurons using reverse correlation methods (e.g., Refs. 2, 38, 56, 58, 142, 143, 148). The STRF can be interpreted as the average signal preceding an action potential, corresponding to the spectrotemporal impulse response of the neuron. STRF estimates of temporal resolution can be directly related to estimates using isolated AM sounds and would yield the same result in a linear system. Additionally, the use of complex spectra can reveal nonlinearities such as the dependence of the estimated filter shape on spectral and temporal depth of modulation and overall intensity. A recent analysis of temporal filter properties derived from STRFs in MGBv of ketamine-anesthetized cats (175) (Fig. 8) revealed a similar range of tBMFs (35 ± 30 Hz) to that observed in the awake guinea pig and squirrel monkey (34, 217). As seen with isolated AM signals, individual neurons could follow modulation frequencies above 100 Hz. Compared with AM responses in the IC, it appears that the overall range of temporal following capacity in the auditory thalamus is considerably reduced (Fig. 9). A number of studies that have explored the coding of click trains in the auditory thalamus contribute significantly to our knowledge of temporal coding in the MGB.
Changes in fm of an AM stimulus result in the systematic change of two potentially confounding aspects of the stimulus, namely, a change in the period between events and a change in the rise time of each event. To avoid the effects of rise-time changes with repetition rate, click trains have been widely used to explore temporal coding properties. While these two methods are not totally equivalent, they do capture closely related aspects of repetition rate coding. One of the first studies of temporal coding in the thalamus was carried out using click trains (284) in the awake, paralyzed cat. As in AM studies, maximum limiting rates (i.e., the highest click rate that showed any evidence of phase-locking) varied widely between 6 and 200 Hz. These findings were confirmed and expanded in a series of studies by Rouiller and colleagues (240, 241) in nitrous oxide-anesthetized cats. These investigators distinguished neurons by differences in the temporal precision of the responses. The largest group of neurons (“lockers,” 71%) showed tight temporal locking to the clicks. “Groupers” (8%) responded with weak temporal synchrony, and “special responders” (21%) showed no clear phase-locked responses although changes in firing rate did occur, occasionally resulting in strongest responses for click rates between 200 and 400 Hz. Overall, limiting rates between 10 and 800 Hz were observed, and ~50% of lockers had a limiting rate greater than 100 Hz. Keeping in mind that these limiting rates were not extracted at the 50% value of the transfer functions (the traditional measure of limiting rate), and the inherent differences between click-train analysis and AM analysis, the actual range of temporal resolution estimated by this method appears to be compatible with that observed in AM studies. Rouiller and De Ribaupierre (240) reported some differences between thalamic subdivisions regarding the percentage of lockers.
More lockers were located in the anterior region of MGBv than in the posterior portion, and the highest limiting rates were also encountered in the anterior part. They observed no clear CF dependency for the distribution of lockers but noticed that the lockers had shorter latencies than groupers and special responders. Furthermore, lockers with limiting rates above 100 Hz had response latencies ~2–3 ms shorter than lockers with limiting rates below 100 Hz, similar to the latency-BMF correlation found in the IC (153). No obvious differences in the distribution and range of limiting rates were found between recordings made in the nitrous oxide-anesthetized and awake preparations. In summary, AM phase-locking in thalamic neurons varies over a wide range from a few Hertz to several hundred Hertz. Some neurons can follow high rates, but the majority of neurons appear to peak at rates below 100 Hz. A subgroup of neurons may respond to temporal information with changes in firing rate rather than in phase-locking; however, the proportion of such a group and its properties are still unexplored. It appears that the majority of neurons show limiting rates below that of the IC, but a detailed comparative study of the transformation of temporal coding from the IC to the MGB is still lacking.

FIG. 8. tMTFs in the medial geniculate body (MGB) and primary auditory cortex (AI). Typical example tMTFs (synchronized firing rate) from neurons in the ventral division of the MGB (A) and in AI (anesthetized cat) (B). C: composite tMTFs for thalamus (dashed line) and cortex. By averaging all tMTFs for thalamic and cortical units separately, the temporal modulation filters of these two stations are approximated. The dotted lines indicate the 6-dB upper cut-off frequency. [Adapted from Miller et al. (176).]

FIG. 9. An overview of rMTF (left panel) and tMTF (right panel) properties at different anatomical levels. Each entry shows means or medians (circles) ± SD (lines) and lowest and highest values (bar). Dark bars, thick lines, and solid circles are for rBMFs (left) and tBMFs (right); light bars, lines, and empty circles are for upper tMTF cutoff frequencies (right). For convenient comparison, the left panel is arranged mirror-symmetric with respect to the right. The population measures are taken from published data for one anatomical level, sublevel, or cell class; the numbered reference to the publication is shown next to the data, followed by a letter indicating the species (b, bat; c, cat; g, gerbil; gp, guinea pig; m, marmoset; r, rabbit; s, squirrel monkey), and the letter “U” if unanesthetized. Note that part of the differences between studies reflects differences in the metrics used (in particular upper cutoff, which is often defined as a corner frequency or alternatively as the upper limit of significant phase-locking). Approximate ranges of perceptual and sound classes are indicated below the abscissa.

C. Responses to AM in Primary Auditory Cortex: Synchronization

A number of studies provided initial evidence that temporal coding in auditory cortical neurons may be substantially reduced compared with subcortical levels (Fig. 9). Studies with FM and AM in the awake cat (300) and guinea pig (34) showed neurons had maximum following rates of <30 Hz. In later studies, the range of synchronization of AI neurons to AM was systematically explored in a variety of species. A high percentage of neurons showed band-pass tMTFs (53, 75, 157, 256). The tBMF values in AI were found to be independent of the CF of the neurons (53, 157, 256). Accordingly, temporal information in different frequency channels can be processed independently from each other; within each spectral band, AM information can be decomposed by different neurons into different AM ranges.
Much attention has therefore been given to the distribution of optimal modulation frequencies. Preferred modulation frequencies commonly vary between 1 and 40 Hz with the vast majority of tBMFs below 20 Hz. Across all studies, tBMFs above 50 Hz were encountered in only a very small percentage of neurons but could occasionally be as high as 100 Hz (17, 157, 255, 256). The composite tMTF in cat AI (ketamine anesthesia), constructed as the weighted sum of all tMTFs measured, shows a tBMF of 12.8 Hz and a 50% cut-off frequency of 37.4 Hz (Fig. 8) (176). It is tempting to regard the presence of modulation tuning and the range of BMFs as a physiological implementation of a modulation filter bank (e.g., Ref. 35), but the functional consequences of these cortical (and subcortical) observations are at present unclear and should not be overstated. When “spatial-frequency channels” were first described in visual psychophysics and spatial-frequency tuning was later found physiologically, it was suggested that these channels formed the basis for a visual Fourier analysis of the retinal image, but this notion has been discredited (303). There is currently no unequivocal evidence that modulation tuning underlies an analysis of the modulation spectrum in the sense that the cochlea performs an analysis of stimulus spectrum. For example, will an envelope with a low fundamental (e.g., to speech syllables) but fast components (i.e., broad envelope spectrum) recruit neurons tuned to high modulation frequencies? Is the relative phase of different envelope components somehow reflected in neural synchronization or average rate? Even if modulation-tuned neurons do not perform a full envelope decomposition in the Fourier sense, it is easy to see that such envelope tuning could be useful in other ways.
For example, modulation-tuned channels could parse spectral stimulus components according to their dominant modulation frequency so that the spectral components with a common modulation frequency can be grouped in a further step. Differences in temporal processing between cortical neurons and their thalamic inputs are not only evident from population comparisons but were directly observed in functionally connected thalamocortical neuron pairs (34, 175) and were also evident in current source density analysis of the thalamic input and cortical output layers of AI (274). While these correlation studies reveal a reduction of temporal following capacities from MGBv to AI, the temporal modulation preferences in thalamus and cortex are not correlated by rank (175), i.e., thalamic cells with high (low) BMFs do not preferentially project to cortical cells with high (low) BMFs. These findings strongly suggest that a transformation of temporal response properties takes place at the thalamocortical interface. The width of the transfer function provides a measure of response selectivity. For individual neurons, the bandwidth of tMTFs, estimated at 50% of the maximum, is in the range of the BMF values but can vary by a factor of >5 (53, 176, 256) in the anesthetized cat. Bandwidth variations of tMTFs in the awake marmoset monkey (157) are of similar magnitude. This means that AM selectivity varies considerably among cortical neurons but that overall the selectivity is relatively poor. Variations in species, anesthetic state, and estimation method between the different studies do not permit an easy comparison to sort out these different influences on envelope processing. However, it appears that neither anesthesia nor species-specific effects provide strong influences on the tBMF distribution of cortical neurons.
This is not to say that there are no anesthetic effects; however, given the fairly large range of variability in the conditions of these studies, a simple group evaluation is unlikely to provide such evidence. The range for time-locked AM coding appears to be limited to the envelope frequencies underlying the perception of rhythm, roughness, and the following rate of syllables in communication sounds. The cortical coding of higher modulation frequencies, important for voicing or periodicity pitch information, does not seem to fully utilize the same temporal code.

D. Responses to AM in Primary Auditory Cortex: Average Rate

In view of the successive reduction in envelope synchronization already discussed for the different synaptic stages leading up to cortex, it is not too surprising to find the reduction in tBMF. Adverse effects on synchronization should however not necessarily affect rate tuning. For example, exquisite frequency and ITD selectivity in average rate is found at the cortical level and can be sharper than in the brain stem. Therefore, we expect to find envelope tuning in rMTFs, as it is already prominently present in the IC. Bandpass rMTFs are indeed found but appear less common than bandpass tMTFs. In the rat, >90% of the tMTFs showed bandpass characteristics while only 30% of the rMTFs were bandpass (75). In AI of the awake squirrel monkey (17), this difference was less pronounced, with bandpass behavior for 49% of the tMTFs compared with 39% of rMTFs. The remaining neurons were either low pass, high pass, all pass, or had complex filter shapes. Similar results were reported for the cat (48). In awake marmosets, 73% of AI units had bandpass rMTFs, and many neurons were only driven when temporal modulations were present (157). An important difference with tMTF tuning is the consistent observation that the tuning for rMTFs extends to higher modulation frequencies, although it is still quite limited compared with the brain stem.
There is also a fairly large variance, possibly related to the use of anesthesia, in the reported range of rBMFs and upper cut-off frequencies (e.g., as defined by a 50% reduction in rate) obtained across the various studies in AI (Fig. 9). The majority of rBMFs in anesthetized studies are below 50 Hz (46, 49, 53, 75, 256). Studies in awake animals (17, 34, 157, 190, 247, 260) yielded rBMFs that were either not substantially different from those in anesthetized animals or differed by less than a factor of two. Anesthesia seems to affect the strength of the response (sustained in unanesthetized animals, onset under anesthesia) more than the range of BMFs. The reduction of the upper cut-off frequencies by anesthesia may be more substantial in tMTFs than in rMTFs (52, 83, 163) and may affect the temporal coding capacity for the highest temporally coded AM frequencies including the range of AM frequencies associated with the perceptual attributes of roughness and periodicity pitch (64). The general finding that BMFs and upper cut-off frequencies are higher in the rMTF than in the tMTF led Bieser and Müller-Preuss (17) to suggest that “low modulation rates were mostly encoded by phase-locked neural responses and the higher AM sounds by non-phase-locked spike rate variations.” While the experimental evidence for this claim was suggestive but not conclusive, Lu et al. (161) demonstrated more forcefully that this notion might indeed be true and proposed a two-stage model in which temporal modulations are combined over an integration window of ~30 ms; temporal patterns separated by intervals longer than 30 ms are coded explicitly in temporal form, while more rapid patterns are coded implicitly by average rate.
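As a rough worked conversion (our arithmetic, not a value reported by Lu et al.), the ~30-ms integration window can be restated as a boundary repetition rate. The symbols below are introduced here for illustration only.

```latex
f_{\mathrm{boundary}} = \frac{1}{T_{\mathrm{window}}} \approx \frac{1}{0.030\ \mathrm{s}} \approx 33\ \mathrm{Hz}
```

On this reading, patterns repeating more slowly than roughly 33 Hz would be candidates for explicit temporal coding, which is broadly in line with the 1–40 Hz range of preferred modulation frequencies noted in section IXC.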
It is not entirely clear whether this scheme can fully account for the coding of modulations since, even in awake animals and for only a small fraction of the cells, rBMFs reach maximal values of only a few hundred Hertz. This is only an octave above the highest tBMFs (even when measured on the same cells, e.g., Ref. 157) and lower than the upper limit for periodicity pitch (~800 Hz) and modulation detection (~2.2 kHz). The markedly reduced cortical upper limit, particularly compared with the brain stem, is in stark contrast to the upper limit for ITD sensitivity to AM signals, which appears not to differ between cortex and brain stem and extends to modulation frequencies up to 1,000 Hz (awake rabbit, Ref. 67). Thus envelope-based ITD tuning created in the brain stem is relayed without degradation or recoding to AI, whereas this does not appear to be the case for AM bandpass tuning. Schulze and Langner (259, 261) suggested an alternative coding strategy; in AI of the awake as well as the anesthetized gerbil, these investigators showed rate tuning of cortical neurons to AM between 50 and 3,000 Hz, clearly outside the range of cortical phase-locking, but only when the carrier frequency was placed far above the cell’s CF. A preliminary study (171) reported similar sensitivity in the IC but attributed the mechanism to difference tones generated in the cochlea, i.e., interpreted it as a spectral rather than a temporal effect. Since psychophysical studies indicate that the perception of periodicity pitch does not depend on difference tones, it is unclear whether the mechanism proposed by Schulze and Langner provides its neural basis, although the authors raise several indirect counterarguments against the role of difference tones as the explanation for their observations.
Overall, then, the timing of cortical discharge encodes low modulation frequencies corresponding to the perceptual ranges characterized by rhythm and fluctuation strength (48, 53, 60) and, potentially, roughness (64, 255). A code based on the mean firing rate may represent fast AMs such as those associated with periodicity pitch, but it remains unclear whether these two coding strategies adequately explain AM coding over the entire perceptual range.

E. Responses to AM in Primary Auditory Cortex: Influence of Modulation Parameters

The results discussed above were mostly derived with a modulation depth (m) of 100%. Decrease in m results in monotonically reduced synchronization (60), especially for m < 0.5 (49). In the awake squirrel monkey, 86% of the neurons had maximum synchronization for 80–100% modulation and showed a monotonic decrease with reduction of m. Average firing rate was essentially constant as a function of modulation depth (17). Values of rBMF and tBMF were little affected by m in the awake marmoset (157). Changes in the overall intensity resulted in minor influences on BMF, cut-off frequency, and shape of the MTF (46, 157, 255). However, the firing rate showed a strong effect with intensity revealing a limited range of best levels (49). Phillips and colleagues (211, 212) noticed intensity-specific differences between the responses to low and high modulation frequencies. Better responses were observed for higher modulation frequencies at low intensities and for low modulation frequencies at higher intensities; that is, the shape of MTFs can be level dependent. The rMTF appears to be more resistant to changes in SPL than the tMTF (157). In a few studies, the effect of the modulation waveform was investigated. Rectangular AM resulted in stronger response synchrony than sinusoidal AM, but the tBMFs were similar (255, 256). Modulation with an exponential sine-wave envelope increased the sharpness of modulation tuning with decreasing duty cycle but showed no dramatic effects on BMF or cut-off frequency (49). Temporal synchronization to binaural beats (generated by binaural interaction in the brain stem, see sect. VI) also revealed cut-off frequencies of <40 Hz (219). Moreover, results from the awake primate (157) indicate that BMFs for AM and FM are often closely matched for single neurons. These observations suggest a common temporal window within which afferent signals are integrated. The use of dynamic ripple spectra, i.e., spectral envelopes that are periodic along the frequency axis, to determine the temporal impulse response properties in AI by reverse correlation in anesthetized ferrets (142) and cats (175) revealed tBMFs that essentially overlapped with the value range seen in several other species estimated with AM tones. Direct comparison between two carrier types showed either no significant difference in the tBMFs for tonal and noise carriers (53, 217) or an average tBMF that is slightly lower for tonal carriers (49). This suggests that the carrier bandwidth may have little influence on temporal coding properties.

F. Differences of Temporal Coding Between Cortical Fields

In view of the differences between thalamic subdivisions in terms of thalamocortical connectivity (see sect. IXA) and temporal responses (see sect. IXB), it is of interest whether neurons in different cortical fields also differ in their ability to code temporal information (Fig. 9). Field AAF in the cat, a component of the core area like AI, shows evidence of higher BMFs and limiting rates than AI (111, 255, 256). There is some evidence of spatial clustering in AAF with faster following neurons more abundant for CFs above 10 kHz (53, 111, 255). Further evidence of faster following rates in AAF over AI has been obtained from STRF measurements in mice (159). The duration of STRFs from AAF was found to be shorter than in AI.
Because STRF duration is inversely related to the BMF of tMTFs, it follows that AAF neurons have higher BMFs compared with AI. Another predictor for repetition following capacity is the onset latency of isolated CF tones or clicks. Schreiner and Raggio (253) reported a weak but significant negative correlation in cat AI for click latency and BMF, similar to results in the IC (153) and MGB (240). Onset latencies in AAF of cats (111) and mice (159) are shorter than in AI, further supporting the notion that AAF has a higher following capacity than AI.

Cortical fields outside the core areas seem to perform at even lower temporal fidelity than that found in AI. In the cat, tBMFs and rBMFs of cortical fields AII, PAF, and VPAF were 20–80% of those seen for AI (53, 256). Similar results were found in the awake squirrel monkey (17). In the latter study, three groups of cortical fields could be distinguished based on their temporal properties. A group containing AI had average BMFs of ~8 Hz; a group that included the rostral field and the insula had BMFs of 4 Hz and below, and a group containing the anterior-lateral field had a predominance of BMFs around 2 Hz. Combined, these findings suggest that hierarchically “higher” auditory cortical fields primarily receiving input from thalamic projections other than the ventral nucleus appear to show slightly but consistently slower following capacity when tested with AM stimuli than primary cortical fields.

G. Cortical Mechanisms

The cause for the reduced temporal following capacity of cortical neurons compared with subcortical stations is still not entirely clear. A diversity of cellular and network properties are likely to affect cortical temporal behavior. These include mechanisms of adaptation and postexcitation suppression (19, 25, 116), postsuppression rebound (42, 47, 75), intrinsic oscillation (42, 75, 106, 134, 249), and synaptic depression (1, 169, 170).
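As a rough illustration of how one of the mechanisms listed above, short-term synaptic depression, could limit the following of fast repetitions, the Python sketch below iterates a simple resource-depletion model for a regular input train. The model form (a fixed fraction of available synaptic resources is used per input event, with exponential recovery between events) is a generic textbook formulation adopted here as an assumption. The function names, the parameter names (`use_fraction`, `tau_rec`), and their values are hypothetical and are not taken from the studies cited.

```python
import math

def steady_state_response(rate_hz, use_fraction=0.5, tau_rec=0.3, n_events=200):
    """Response to the last event of a regular train, found by direct iteration.

    use_fraction: assumed fraction of available resources consumed per event.
    tau_rec: assumed recovery time constant in seconds.
    """
    interval = 1.0 / rate_hz
    resources = 1.0   # fully recovered synapse before the train starts
    response = 0.0
    for _ in range(n_events):
        response = use_fraction * resources   # response scales with what is available
        resources -= response                 # depletion caused by the event
        # exponential recovery toward 1 during the interval before the next event
        resources = 1.0 - (1.0 - resources) * math.exp(-interval / tau_rec)
    return response

def closed_form_response(rate_hz, use_fraction=0.5, tau_rec=0.3):
    """Fixed point of the same iteration, for comparison with the simulation."""
    decay = math.exp(-1.0 / (rate_hz * tau_rec))
    return use_fraction * (1.0 - decay) / (1.0 - (1.0 - use_fraction) * decay)

if __name__ == "__main__":
    for rate in (2, 4, 8, 16, 32, 64):
        print(rate, "Hz:", round(steady_state_response(rate), 4))
```

Under these assumptions the steady-state response per event falls monotonically as the repetition rate increases, that is, the synapse behaves as a temporal low-pass element. The sketch says nothing about which of the listed mechanisms dominates in cortex.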
It has been suggested that tBMFs are largely determined by processes intrinsic to the cortical-thalamic network while cut-off frequency seems to be influenced by intrinsic pyramidal cell mechanisms (51). Models that include dynamic synaptic processes have been proposed that can account for many aspects of cortical responses to various repetitive signal envelopes, including sinusoidal AM stimuli (41, 54, 55). Eggermont (55) demonstrated that the envelope synchronization of cortical activity can be modeled based on two main components: the degree of input or presynaptic synchrony and the shape of a temporal filter that is determined by properties of synaptic dynamics. The input synchrony is highly dependent on the shape of the envelope waveform and reflects peripheral integrative mechanisms that determine response latency and spiking jitter (102). The properties of the synaptic dynamics are less stimulus dependent and reflect cortical synaptic activity changes after repeated stimulation that cause short-term synaptic depression or facilitation (1, 169, 170). The synaptic dynamics act as a temporal low-pass filter on the synchronized input and are dominated by synaptic depression. This two-stage model of cortical modulation transformation holds great promise in unifying many aspects of temporal envelope processing (55) and other temporal behaviors of cortical neurons (41). It is likely, however, that other, conceivably nonlinear, influences also contribute to the shaping of MTFs. This is indicated by the observed relationships of onset latency and the period of intrinsic oscillations with BMF as well as the effects of spectral and temporal stimulus composition on cortical adaptation behavior (19, 280).

H. Temporal Coding of Complex Sounds

Most studies of complex multisyllable or multi-“phrase” communication sounds in auditory cortex noted that neuronal responses were predominantly located at the beginning of each phrase provided that the phrases did not follow each other at rates of more than 20–30 Hz. This effect was not dependent on the species-specific nature of the calls and was seen for speech sounds as well (50). For example, responses to bird songs in cat auditory cortex (273) showed preferred response intervals corresponding to ~10 Hz. Responses to species-specific calls in awake squirrel monkey (74), anesthetized squirrel monkey (192), and anesthetized marmoset (292) all showed “phrase”-locking in the response to repetitive call phrases around 8–12 Hz. Similar values were obtained in the awake guinea pig to various bird and guinea pig vocalizations (34). Wang et al. (292) tested whether the temporal response to complex sounds was tuned like the response to more elemental sounds by using stretched and compressed natural vocalizations of marmosets, without changes in the spectral content of the calls. The responsiveness to the calls was maximal at the natural repetition rate of the phrases near 8 Hz. In other words, the tMTF of most neurons was tuned to the repetition rate of the natural call. Similarly, Nagarajan et al. (192) reported that the response modulation rates of cortical neurons activated by vocalizations in the marmoset monkey were highly correlated with the BMFs found for AM tones.

The pulse repetitions in echolocation calls of bats are another example of temporal structures that require detailed processing by the auditory system. Phase-locked responses of cortical neurons in the bat occur over ranges similar to those found for AM and click trains in other mammalian species. Sixty percent of BMFs in AI of Eptesicus fuscus were at or below 10 Hz but could be as high as 83 Hz (116).
Pulse repetition coding in the awake FM bat Myotis lucifugus and the mustached bat Pteronotus parnellii had limiting rates of ~100 Hz (308) and up to 300 Hz (276), respectively, commensurate with the behaviorally relevant range of timing used in echolocation.

A likely strategy for encoding of complex sounds in auditory cortex is by the temporal-spatial discharge pattern of distributed neuronal populations across the cortical fields (34, 207; see also Refs. 30, 39). Initial studies of the response of cortical neurons to vocalizations (34, 306, 307) combined with more recent studies of the detailed representation of species-specific vocalizations (192, 292) and speech sounds (309) in the primary auditory cortex of New World monkeys and cats provide evidence that behaviorally relevant vocalizations are well represented by spatially distributed but temporally highly coherent neuronal discharges. At major transitions during the course of the signal, a temporally coherent activation of specific neuronal subpopulations across the cortical fields is created. The synchronous timing of responses across many sites in primary auditory cortex (and in parallel in other cortical fields) may provide the necessary means for appropriate grouping or segregation of sequential elements in ongoing foreground and background sounds. The range of modulation frequencies spanned by cortical tMTFs of generally moderate selectivity may be sufficient to provide representational and, perhaps, perceptual invariances of complex sound sequences despite potentially large variations in phoneme rate or in the sequence rate of musical tones. The distributed representation of temporal envelope information in each carrier frequency band allows a segregated processing of different temporal phenomena within a given frequency “channel” as well as processing of similar temporal aspects across frequency channels (194).

I. Plasticity of Temporal Coding Properties in Auditory Cortex

Studies of representational plasticity in auditory cortex of adult animals have largely focused on spectral properties, but several studies have recently examined temporal properties and reported use-dependent changes in the tMTF. Beitel et al. (15) trained owl monkeys to discriminate between two different, sequentially presented, AM rates and rewarded the animals when they correctly indicated that the second stimulus had a higher AM rate. The modulation frequencies were chosen to be in a range (4–40 Hz) where they could induce phase-locked cortical responses. Over the course of the training, AM discrimination thresholds gradually improved. Analysis of the tMTFs of the trained animals revealed that the shape of the transfer function changed dramatically. As a consequence, average limiting rate more than doubled from 12 Hz to >30 Hz, and BMF increased from 8 to 15 Hz. This result indicates that temporal coding properties of cortical neurons can be modified by learning.

Studies in rat AI investigated the influence of the statistics of the input signal on the reorganization of auditory cortex (138, 139). Stimulation of the nucleus basalis in the basal forebrain has been shown to increase the potential for cortical plasticity without explicit behavioral training of the animals (45, 92, 98, 174). Pairing of nucleus basalis stimulation with acoustic stimulation (139) caused pronounced changes in the tMTFs which depended on the temporal properties of the stimuli paired with the electrical stimulation (Fig. 10). A 20–40% increase of the BMF and cut-off frequency was observed when the modulation frequency of the acoustic stimulus was slightly higher than the normally observed values of the tMTFs. Pairing of electrical stimulation with modulation frequencies below the normal tBMF values caused a decrease in the neuronal cut-off frequencies.
These results indicate that important aspects of temporal properties of the cortex undergo plastic reorganization, reflect aspects of the temporal statistics in the input stimuli, and can be modified by mechanisms involved in learning to match specific auditory tasks even in fully mature animals.

FIG. 10. Temporal plasticity in primary auditory cortex. Prolonged pairing of an AM stimulus (15-Hz trains of tones, random carrier frequencies) with electrical stimulation of the nucleus basalis resulted in a shift of the population tMTF to higher AM frequencies (dashed lines) compared with unstimulated control animals (solid line). [Adapted from Kilgard et al. (139).]

X. NEUROPHYSIOLOGICAL AND PSYCHOLOGICAL STUDIES IN HUMANS

A number of techniques are beginning to provide information about the analysis of AM in the human brain. MTFs generated from steady-state evoked potentials and magnetic responses to the envelope of modulated sounds (e.g., Refs. 146, 213, 220, 235, 239) are at least qualitatively similar to modulation sensitivity demonstrated psychophysically. It is noteworthy, however, that neither psychophysical nor event-related potential measures show much evidence of the bandpass tuning to modulation that is a feature of many single-unit responses. Estimates of group delay in evoked potentials (119, 146) suggest that responses to low fm are predominantly generated at the cortical level and those to high fm in the brain stem. Magnetic responses in auditory cortex suggest a mapping of modulation frequency that lies orthogonal to the tonotopic axis (151). These magnetic responses lock to the temporal envelope of speech signals, and the degree of locking correlates with speech comprehension (3).
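The group-delay estimates mentioned above rest on a simple relationship: if a response lags the stimulus envelope by a fixed delay, its phase decreases linearly with modulation frequency, and the slope of that line gives the delay. The Python sketch below shows one way such an estimate could be computed, by unwrapping the measured phases and fitting a straight line. The sign convention (a lag appears as negative phase), the function names `unwrap` and `group_delay`, and the example values are assumptions made for illustration and do not reproduce the procedure of the cited studies.

```python
import math

def unwrap(phases_rad):
    """Remove 2*pi jumps so that consecutive phases differ by less than pi."""
    unwrapped = [phases_rad[0]]
    for phase in phases_rad[1:]:
        step = phase - unwrapped[-1]
        step -= 2 * math.pi * round(step / (2 * math.pi))
        unwrapped.append(unwrapped[-1] + step)
    return unwrapped

def group_delay(mod_freqs_hz, phases_rad):
    """Delay in seconds from the least-squares slope of phase (in cycles) vs fm."""
    cycles = [p / (2 * math.pi) for p in unwrap(phases_rad)]
    n = len(mod_freqs_hz)
    mean_f = sum(mod_freqs_hz) / n
    mean_c = sum(cycles) / n
    slope = (sum((f - mean_f) * (c - mean_c) for f, c in zip(mod_freqs_hz, cycles))
             / sum((f - mean_f) ** 2 for f in mod_freqs_hz))
    return -slope  # assumed convention: a lag produces a negative phase slope

if __name__ == "__main__":
    # Hypothetical data: a pure 30-ms delay sampled at 2-Hz steps, phases wrapped.
    true_delay = 0.030
    freqs = [2.0 * k for k in range(1, 11)]
    phases = [math.atan2(math.sin(-2 * math.pi * f * true_delay),
                         math.cos(-2 * math.pi * f * true_delay)) for f in freqs]
    print("estimated delay (s):", group_delay(freqs, phases))
```

The unwrapping step assumes that the modulation frequencies are sampled densely enough for the phase to change by less than half a cycle between neighboring points; with coarser sampling the slope, and hence the inferred generator site, becomes ambiguous.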
Functional imaging of the brain with functional magnetic resonance imaging (fMRI) has also been applied to the study of modulation: the repetition rate at which a tone burst best elicits a BOLD (blood oxygen level dependent) response decreases progressively from midbrain to thalamus to cortex, with values not dissimilar to those found in single-unit recordings from these structures in other mammals (97). A progressive shift in favor of low modulation frequencies at more central locations was also reported in an fMRI study using sinusoidally amplitude modulated white noise (77). In addition, this study also reported some evidence for restricted cortical regions responding better to low or high modulation frequencies but no systematic topographic representation of modulation frequency. At the cortical level, other nonsensory factors are likely to play a role in the processing of modulation. Hall et al. (95) have demonstrated that activation of the planum temporale caudal to primary auditory cortex is influenced by attention to modulation.

Ablations and lesions of auditory cortex have been shown to interfere with the processing of temporal tasks, such as the order of events (193), discrimination between 10- and 300-Hz trains of noise bursts (277), the detection of AM frequencies below but not above ~30 Hz (89), and the perception of periodicity pitch (299), to name a few examples. Studies in patients with primary cortical lesions resulting in “word deafness” also show evidence for deteriorated temporal processing capacities (88). In addition, it has been argued (87) that the pathway up to and including primary auditory cortex is not sufficient for the detection of continuous AM in humans.
The range of these perceptual deficits encompasses the cortical range of temporal as well as the rate-encoded AM frequencies, corroborating the importance of the coding of envelope phenomena in auditory cortex and in some of the cortical regions to which it connects.

XI. CONCLUSION

Our examination of modulation processing at different anatomical levels reveals a patchy picture with many unsolved issues. As is generally the case in sensory systems, the representation evolves from isomorphic in the periphery to abstracted at the cortical level. Two general trends are clearly discernible with ascending levels: 1) a recoding of modulation selectivity from temporal form to average rate and 2) a decrease in the highest modulation frequencies encoded (either temporally or in average rate). While the first trend seems sensible, the second trend is puzzling, in particular the low upper frequency limit at the cortical level. This observation, as well as others, may lead to the skeptical view that modulation encoding and selectivity at the different anatomical levels is epiphenomenal, in the sense that it is a necessary outcome of other properties (e.g., frequency tuning, adaptation, sensitivity to rise time, connectivity, membrane properties, synaptic dynamics) and that the gradual changes with anatomical level merely reflect change in these properties but do not indicate processing (e.g., the assembly of higher-order selectivities or the recoding of envelope synchronization into a spatially distributed rate code). However, if we ignore differences along the auditory neuraxis for a moment and take stock of the variety of responses reviewed, a rather optimistic view emerges of neural mechanisms dedicated to AM processing. Indeed, these responses show some of the key properties that are generally considered indicative of the coding of stimulus parameters.
Tuning to modulation frequency is prominently present temporally and in average rate, and the range of optimal modulation frequencies so represented spans perceptually relevant ranges. The tuning can show invariance with SPL, modulation depth, and type of carrier and be predictive of the response to complex modulation waveforms in natural stimuli. There is even suggestive evidence for topographic mapping of modulation frequency. Selectivity to modulation waveforms or modulation paradigms more complex than the basic sinusoidally modulated tone is beginning to be reported.

There are several neurobiological avenues to further explore and strengthen the case for dedicated modulation mechanisms and their link to perception. Review of the available data suggests that the most immediate gain, with existing tools, can be expected from inventive stimulus paradigms. Although sinusoidal AM may be considered a complex stimulus in the frequency domain, it is an elementary, simple stimulus in the modulation domain. The vast majority of studies of modulation processing have used single sinusoidal AM tones and have focused on modulation tuning. This is a necessary starting point, but to make a convincing case for the relevance of the tuning observed, the stimulus arsenal should be expanded. Current technology enables synthesis of more complex stimuli that are amenable to parametric exploration yet a step closer to natural stimuli. There are still basic unanswered questions to be addressed with sinusoidal AM, but it is equally clear that important properties and selectivities are only manifest with the use of nonsinusoidal envelopes or stimulus paradigms that involve modulation in ways that are closer to real-world tasks faced by the auditory system. Clever use of such paradigms is likely to make either the skeptical or optimistic view prevail.

We are very grateful to A. Palmer and R. Batra for critical reading of the manuscript. During the preparation of this review, P. X.
Joris was supported by the Fund for Scientific Research-Flanders Grants G.0297.98 and G.0083.02 and Research Fund K.U. Leuven Grant OT/01/42; C. E. Schreiner was supported by National Institutes of Health Grants DC-02260 and NS-34835; and A. Rees was supported by the Wellcome Trust.

Address for reprint requests and other correspondence: P. X. Joris, Laboratory of Auditory Neurophysiology, K.U. Leuven, Campus Gasthuisberg, B-3000 Leuven, Belgium (E-mail: Philip.Joris@med.kuleuven.ac.be).

REFERENCES

1. Abbott LF, Varela JA, Sen K, and Nelson SB. Synaptic depression and cortical gain control. Science 275: 220–224, 1997.
2. Aertsen AM and Johannesma PIM. The spectro-temporal receptive field. A functional characteristic of auditory neurons. Biol Cybern 42: 133–143, 1981.
3. Ahissar E, Nagarajan S, Ahissar M, Protopapas A, Mahncke H, and Merzenich MM. Speech comprehension is correlated with temporal response patterns recorded from auditory cortex. Proc Natl Acad Sci USA 98: 13367–13372, 2001.
4. Anderson DJ, Rose JE, Hind JE, and Brugge JF. Temporal position of discharges in single auditory nerve fibers within the cycle of a sine-wave stimulus: frequency and intensity effects. J Acoust Soc Am 49: 1131–1139, 1971.
5. Atick J. Could information theory provide an ecological theory of sensory processing? Network 3: 213–251, 1992.
6. Attias H and Schreiner CE. Low-order temporal statistics of natural sounds. In: Advances in Neural Information Processing Systems 9, edited by M. C. Mozer, M. I. Jordan, and T. Petsche. Cambridge, MA: MIT Press, 1997, p. 27–33.
7. Attias H and Schreiner CE. Coding of naturalistic stimuli by auditory midbrain neurons. In: Advances in Neural Information Processing Systems 10, edited by M. I. Jordan, M. Kearns, and S. Solla. Cambridge, MA: MIT Press, 1998, p. 103–109.
8. Bacon SP and Grantham DM. Modulation masking: effects of modulation frequency, depth, and phase. J Acoust Soc Am 85: 2575–2588, 1989.
9.
Bacon SP and Viemeister NF. Temporal modulation transfer functions in normal hearing and hearing-impaired listeners. Audiology 24: 117–134, 1985.
10. Batra R, Kuwada S, and Fitzpatrick DC. Sensitivity to interaural temporal disparities of low- and high-frequency neurons in the superior olivary complex. I. Heterogeneity of responses. J Neurophysiol 78: 1222–1236, 1997.
11. Batra R, Kuwada S, and Fitzpatrick DC. Sensitivity to interaural temporal disparities of low- and high-frequency neurons in the superior olivary complex. II. Coincidence detection. J Neurophysiol 78: 1237–1247, 1997.
12. Batra R, Kuwada S, and Stanford TR. Temporal coding of envelopes and their interaural delays in the inferior colliculus of the unanesthetized rabbit. J Neurophysiol 61: 257–268, 1989.
13. Batra R, Kuwada S, and Stanford TR. High-frequency neurons in the inferior colliculus that are sensitive to interaural delays of amplitude-modulated tones: evidence for dual binaural influences. J Neurophysiol 70: 64–80, 1993.
14. Beckius GE, Batra R, and Oliver DL. Axons from anteroventral cochlear nucleus that terminate in medial superior olive of cat: observations related to delay lines. J Neurosci 19: 3146–3161, 1999.
15. Beitel R, Schreiner CE, Wang X, Cheung S, Jenkins W, and Merzenich MM. Effects of psychophysical training on the entrainment of primary auditory cortical neurons to amplitude modulated tones. Soc Neurosci Abstr 21: 1180, 1995.
16. Bernstein LR and Trahiotis C. Detection of interaural delay in high-frequency sinusoidally amplitude-modulated tones, two-tone complexes, and bands of noise. J Acoust Soc Am 95: 3561–3567, 1994.
17. Bieser A and Müller-Preuss P. Auditory responsive cortex in the squirrel monkey: neural responses to amplitude-modulated sounds. Exp Brain Res 108: 273–284, 1996.
18. Brawer JR, Morest DK, and Kane EC. The neuronal architecture of the cochlear nucleus of the cat. J Comp Neurol 155: 251–282, 1974.
19. Brosch M and Schreiner CE.
Sequence selectivity of neurons in cat primary auditory cortex. Cereb Cortex 10: 1155–1167, 2000.
20. Burger RM and Pollak GD. Analysis of the role of inhibition in shaping responses to sinusoidally amplitude-modulated signals in the inferior colliculus. J Neurophysiol 80: 1686–1701, 1998.
21. Burns EM and Viemeister NF. Nonspectral pitch. J Acoust Soc Am 60: 863–869, 1976.
22. Burns EM and Viemeister NF. Played-again SAM: further observations on the pitch of amplitude-modulated noise. J Acoust Soc Am 70: 1655–1660, 1981.
23. Buunen TJF and Rhode WS. Responses of fibers in the cat’s auditory nerve to the cubic difference tone. J Acoust Soc Am 64: 772–781, 1978.
24. Caird D. Processing in the colliculus. In: The Neurobiology of Hearing: The Central Auditory System, edited by R. A. Altschuler, R. P. Bobbin, B. M. Clopton, and D. W. Hoffman. New York: Raven, 1991, p. 253–292.
25. Calford MB and Semple MN. Monaural inhibition in cat auditory cortex. J Neurophysiol 73: 1876–1891, 1995.
26. Cant NB and Benson CG. Parallel auditory pathways: projection patterns of the different neuronal populations in the dorsal and ventral cochlear nuclei. Brain Res Bull 60: 457–474, 2003.
27. Cariani PA and Delgutte B. Neural correlates of the pitch of complex tones. II. Pitch shift, pitch ambiguity, phase invariance, pitch circularity, rate pitch, and the dominance region for pitch. J Neurophysiol 76: 1717–1734, 1996.
28. Caspary DM, Palombi PS, and Hughes LF. GABAergic inputs shape responses to amplitude modulated stimuli in the inferior colliculus. Hear Res 168: 163–173, 2002.
29. Caspary DM, Rupert AL, and Moushegian G. Neuronal coding of vowel sounds in the cochlear nuclei. Exp Neurol 54: 414–431, 1977.
30. Chistovich LA, Lublinskaja VV, Malinnikova EA, Ogorodnikova EA, Stoljarova EI, and Zhukov SJ. Temporal processing of peripheral auditory patterns of speech.
In: The Representation of Speech in the Peripheral Auditory System, edited by R. Carlson and B. Granstrom. Amsterdam: Elsevier, 1982, p. 165–180.
31. Clarey JC, Barone P, and Imig TJ. Physiology of thalamus and cortex. In: The Mammalian Auditory Pathway: Neurophysiology, edited by A. N. Popper and R. R. Fay. New York: Springer, 1992, p. 232–334.
32. Condon CJ, White KR, and Feng AS. Neurons with different temporal firing patterns in the inferior colliculus of the little brown bat differentially process sinusoidal amplitude-modulated signals. J Comp Physiol A Sens Neural Behav Physiol 178: 147–157, 1996.
33. Cooper NP, Robertson D, and Yates GK. Cochlear nerve fiber responses to amplitude-modulated stimuli: variations with spontaneous rate and other response characteristics. J Neurophysiol 70: 370–386, 1993.
34. Creutzfeldt OD, Hellweg FC, and Schreiner CE. Thalamocortical transformation of responses to complex auditory stimuli. Exp Brain Res 39: 87–104, 1980.
35. Dau T, Kollmeier B, and Kohlrausch A. Modeling auditory processing of amplitude modulation. I. Detection and masking with narrow-band carriers. J Acoust Soc Am 102: 2892–2905, 1997.
36. Dau T, Kollmeier B, and Kohlrausch A. Modeling auditory processing of amplitude modulation. II. Spectral and temporal integration. J Acoust Soc Am 102: 2906–2919, 1997.
37. Dau T, Verhey J, and Kohlrausch A. Intrinsic envelope fluctuations and modulation-detection thresholds for narrow-band noise carriers. J Acoust Soc Am 106: 2752–2760, 1999.
38. Decharms RC, Blake DT, and Merzenich MM. Optimizing sound features for cortical neurons. Science 280: 1439–1443, 1998.
39. Delgutte B. Auditory neural processing of speech. In: The Handbook of Phonetic Sciences, edited by W. J. Hardcastle and J. Laver. Oxford, UK: Blackwell, 1997, p. 507–538.
40. Delgutte B, Hammond BM, and Cariani PA. Neural coding of the temporal envelope of speech: relation to modulation transfer functions.
In: Psychophysical and Physiological Advances in Hearing, edited by A. R. Palmer, A. Rees, A. Q. Summerfield, and R. Meddis. London: Whurr, 1997, p. 595–603.
41. Denham SL and Denham MJ. An investigation into the role of cortical synaptic depression in auditory processing. In: Emergent Neural Computational Architectures Based on Neuroscience: Towards Neuroscience-Inspired Computing, edited by S. Wermter, D. J. Willshaw, and J. Austin. Berlin: Springer, 2001, p. 494–506.
42. Dinse HR, Krueger K, Akhavan AC, Spengler F, Schoener G, and Schreiner CE. Low-frequency oscillations of visual, auditory and somatosensory cortical neurons evoked by sensory stimulation. Int J Psychophysiol 26: 205–227, 1997.
43. Drew T and Doucet S. Application of circular statistics to the study of neuronal discharge during locomotion. J Neurosci Methods 38: 171–181, 1991.
44. Drullman R, Festen JM, and Houtgast T. Effect of temporal modulation reduction on spectral contrasts in speech. J Acoust Soc Am 99: 2358–2364, 1996.
45. Edeline JM, Hars B, Maho C, and Hennevin E. Transient and prolonged facilitation of tone-evoked responses induced by basal forebrain stimulations in the rat auditory cortex. Exp Brain Res 97: 373–386, 1994.
46. Eggermont JJ. Rate and synchronization measures of periodicity coding in cat primary auditory cortex. Hear Res 56: 153–167, 1991.
47. Eggermont JJ. Stimulus induced and spontaneous rhythmic firing of single units in cat primary auditory cortex. Hear Res 61: 1–11, 1992.
48. Eggermont JJ. Differential effects of age on click-rate and amplitude modulation-specific coding in primary auditory cortex of the cat. Hear Res 65: 175–192, 1993.
49. Eggermont JJ. Temporal modulation transfer function for AM and FM stimuli in cat auditory cortex. Effects of carrier type, modulating waveform and intensity. Hear Res 74: 51–66, 1994.
50. Eggermont JJ. Representation of a voice onset time continuum in primary auditory cortex of the cat.
J Acoust Soc Am 98: 911–920, 1995.
51. Eggermont JJ. How homogeneous is cat primary auditory cortex? Evidence from simultaneous single-unit recordings. Audit Neurosci 2: 79–96, 1996.
52. Eggermont JJ. Firing rate and firing synchrony distinguish dynamic from steady state sound. Neuroreport 8: 2709–2713, 1997.
53. Eggermont JJ. Representation of spectral and temporal sound features in three cortical fields of the cat. Similarities outweigh differences. J Neurophysiol 80: 2743–2764, 1998.
54. Eggermont JJ. The magnitude and phase of temporal modulation transfer functions in cat auditory cortex. J Neurosci 19: 2780–2788, 1999.
55. Eggermont JJ. Temporal modulation transfer functions in cat primary auditory cortex: separating stimulus effects from neural mechanisms. J Neurophysiol 87: 305–321, 2002.
56. Eggermont JJ, Johannesma PIM, and Aertsen AMHJ. Reverse-correlation methods in auditory research. Q Rev Biophys 16: 341–414, 1983.
57. Erulkar SD, Butler RA, and Gerstein GL. Excitation and inhibition in cochlear nucleus. II. Frequency-modulated tones. J Neurophysiol 31: 537–548, 1968.
58. Escabi MA, Schreiner CE, and Miller LM. Dynamic time-frequency processing in the cat midbrain, thalamus, and auditory cortex: spectrotemporal receptive fields obtained using dynamic ripple spectra. Soc Neurosci Abstr 24: 1879, 1998.
59. Evans EF. Cochlear nerve and cochlear nucleus. In: Handbook of Sensory Physiology, edited by W. D. Keidel and W. D. Neff. Berlin: Springer, 1975, p. 1–108.
60. Fastl H, Hesse A, Schorer E, Urbas J, and Müller-Preuss P. Searching for neural correlates of the hearing sensation fluctuation strength in the auditory cortex of squirrel monkeys. Hear Res 23: 199–203, 1986.
61. Fenton MB. Natural history and biosonar signals. In: Hearing by Bats, edited by R. R. Fay and A. N. Popper. New York: Springer, 1995, p. 37–86.
62. Fernald RD and Gerstein GL. Response of cat cochlear nucleus neurons to frequency and amplitude modulated tones.
Brain Res 45: 417–435, 1972.
63. Fettiplace R and Fuchs PA. Mechanisms of hair cell tuning. Annu Rev Physiol 61: 809–834, 1999.
64. Fishman YI, Reser DH, Arezzo JC, and Steinschneider M. Complex tone processing in primary auditory cortex of the awake monkey. I. Neural ensemble correlates of roughness. J Acoust Soc Am 108: 235–246, 2000.
65. Fitzgerald JV, Burkitt AN, Clark GM, and Paolini AG. Delay analysis in the auditory brainstem of the rat: comparison with click latency. Hear Res 159: 85–100, 2001.
66. Fitzpatrick DC, Batra R, Stanford TR, and Kuwada S. A neuronal population code for sound localization. Nature 388: 871–874, 1997.
67. Fitzpatrick DC, Kuwada S, and Batra R. Neural sensitivity to interaural time differences: beyond the Jeffress model. J Neurosci 20: 1605–1615, 2000.
68. Forrest TG and Green DM. Detection of partially filled gaps in noise and the temporal modulation transfer function. J Acoust Soc Am 82: 1933–1943, 1987.
69. Frisina RD. Subcortical neural coding mechanisms for auditory temporal processing. Hear Res 158: 1–27, 2001.
70. Frisina RD, Smith RL, and Chamberlain SC. Differential encoding of rapid changes in sound amplitude by second-order auditory neurons. Exp Brain Res 60: 417–422, 1985.
71. Frisina RD, Smith RL, and Chamberlain SC. Encoding of amplitude modulation in the gerbil cochlear nucleus. I. A hierarchy of enhancement. Hear Res 44: 99–122, 1990.
72. Frisina RD, Smith RL, and Chamberlain SC. Encoding of amplitude modulation in the gerbil cochlear nucleus. II. Possible neural mechanisms. Hear Res 44: 123–142, 1990.
73. Frisina RD, Walton JP, and Karcich KJ. Dorsal cochlear nucleus single neurons can enhance temporal processing capabilities in background noise. Exp Brain Res 102: 160–164, 1994.
74. Funkenstein HH and Winter P. Responses to acoustic stimuli of units in the auditory cortex of awake squirrel monkeys. Exp Brain Res 18: 464–488, 1973.
75.
Gaese BH and Ostwald J. Temporal coding of amplitude and frequency modulation in the rat auditory cortex. Eur J Neurosci 7: 438–450, 1995. 76. Geisler CD. From Sound to Synapse. Oxford, UK: Oxford Univ. Press, 1998. 77. Giraud A, Lorenzi C, Ashburner J, Wable J, Johnsrude I, Frackowiak RSJ, and Kleinschmidt A. Representation of the temporal envelope of sounds in the human brain. J Neurophysiol 84: 1588–1598, 2000. 78. Glattke TJ. Unit responses of the cat cochlear nucleus to amplitude-modulated stimuli. J Acoust Soc Am 45: 419–425, 1968. 79. Godfrey DA, Kiang NYS, and Norris BE. Single unit activity in the dorsal cochlear nucleus of the cat. J Comp Neurol 162: 269–284, 1975. 80. Godfrey DA, Kiang NYS, and Norris BE. Single unit activity in the posteroventral cochlear nucleus of the cat. J Comp Neurol 162: 247–268, 1975. 81. Goldberg JM and Brown PB. Response of binaural neurons of dog superior olivary complex to dichotic tonal stimuli: some physiological mechanisms of sound localization. J Neurophysiol 22: 613–636, 1969. 82. Goldberg JM and Brownell WE. Discharge characteristics of neurons in anteroventral and dorsal cochlear nuclei of cat. Brain Res 64: 35–54, 1973. 83. Goldstein MH Jr, De Ribaupierre F, and Brown RM. Responses of the auditory cortex to repetitive acoustic stimuli. J Acoust Soc Am 31: 356–364, 1959. 84. Green GG and Kay RH. Channels in the human auditory system concerned with the waveform of modulation present in amplitude and frequency-modulated tones. J Physiol 241: 50–52, 1974. 85. Greenberg S. Possible role of low and medium spontaneous rate cochlear nerve fibers in the encoding of waveform periodicity. In: Auditory Frequency Selectivity, edited by B. C. J. Moore and R. D. Patterson. New York: Plenum, 1986, p. 241–248. 86. Greenwood DD and Joris PX. Mechanical and “temporal” filtering as codeterminants of the response by cat primary fibers to amplitude-modulated signals. J Acoust Soc Am 99: 1029–1039, 1996. 87.
Griffiths TD, Penhune V, Peretz I, Dean JL, Patterson RD, and Green GG. Frontal processing and auditory perception. Neuroreport 11: 919–922, 2000. 88. Griffiths TD, Rees A, and Green GG. Disorders of human complex sound processing. Neurocase 5: 365–378, 1999. 89. Grigoreva TI, Figurina II, and Vasilev AG. Role of the medial geniculate body in the production of conditioned reflexes to amplitude-modulated stimuli in rats. Zh Vyssh Nervn Deyat 37: 265–271, 1988. 90. Grimault N, Bacon SP, and Micheyl C. Auditory stream segregation on the basis of amplitude-modulation rate. J Acoust Soc Am 111: 1340–1348, 2002. 91. Grothe B. Interaction of excitation and inhibition in processing of pure tone and amplitude-modulated stimuli in the medial superior olive of the mustached bat. J Neurophysiol 71: 706–721, 1994. 92. Gu Q and Singer W. Effects of intracortical infusion of anticholinergic drugs on neuronal plasticity in kitten striate cortex. Eur J Neurosci 5: 475–485, 1993. 93. Gummer AW and Johnstone BM. Group delay measurement from spiral ganglion cells in the basal turn of the guinea pig cochlea. J Acoust Soc Am 76: 1388–1400, 1984. 94. Gummer M, Yates GK, and Johnstone BM. Modulation transfer function of efferent neurones in the guinea pig cochlea. Hear Res 36: 41–52, 1988. 95. Hall DA, Haggard MP, Akeroyd MA, Summerfield AQ, Palmer AR, Elliott MR, and Bowtell RW. Modulation and task effects in auditory processing measured using fMRI. Hum Brain Mapp 10: 107–119, 2000. 96. Hall JW, Haggard MP, and Fernandes MA. Detection in noise by spectro-temporal pattern analysis. J Acoust Soc Am 76: 50–56, 1984. 97. Harms MP and Melcher JR. Sound repetition rate in the human auditory pathway: representations in the waveshape and amplitude of FMRI activation. J Neurophysiol 88: 1433–1450, 2002. 98. Hars B, Maho C, Edeline JM, and Hennevin E. Basal forebrain stimulation facilitates tone-evoked responses in the auditory cortex of awake rat.
Neuroscience 56: 61–74, 1993. 99. Hartmann WM. The physical description of signals. In: Hearing, edited by B. C. J. Moore. San Diego, CA: Academic, 1995, p. 1–40. 100. Hartmann WM. Signals, Sound, and Sensation. New York: Springer, 1997. 101. Heil P. Representation of sound onsets in the auditory system. Audiol Neuro-otolaryngol 6: 167–172, 2001. 102. Heil P and Neubauer H. Temporal integration of sound pressure determines thresholds of auditory-nerve fibers. J Neurosci 21: 7404–7415, 2001. 103. Heil P, Schulze H, and Langner G. Ontogenetic development of periodicity in the inferior colliculus of the Mongolian gerbil. Audit Neurosci 1: 363–383, 1995. 104. Henning GB and Ashton J. The effect of carrier and modulation frequency on lateralization based on interaural phase and interaural group delay. Hear Res 4: 185–194, 1981. 105. Hewitt MJ and Meddis R. A computer model of amplitude-modulation sensitivity of single units in the inferior colliculus. J Acoust Soc Am 95: 2145–2159, 1994. 106. Horikawa J, Tanahashi A, and Suga N. After-discharges in the auditory cortex of the mustached bat: no oscillatory discharges for binding auditory information. Hear Res 76: 45–52, 1994. 107. Houtgast T. Frequency selectivity in amplitude-modulation detection. J Acoust Soc Am 85: 1676–1680, 1989. 108. Houtgast T and Steeneken HJM. The modulation transfer function in room acoustics as a predictor of speech intelligibility. Acustica 28: 66–73, 1973. 109. Huffman RF, Argeles PC, and Covey E. Processing of sinusoidally amplitude modulated signals in the nuclei of the lateral lemniscus of the big brown bat, Eptesicus fuscus. Hear Res 126: 181–200, 1998. 110. Huffman RF and Henson OW Jr. The descending auditory pathway and acousticomotor systems: connections with the inferior colliculus. Brain Res Rev 15: 295–323, 1990. 111. Imaizumi K, Priebe NJ, Crum PAC, Bedenbaugh PH, Cheung SW, and Schreiner CE. Modular functional organization in cat anterior auditory field (Abstract). Program No.
488.6. 2003 Abstract Viewer/Itinerary Planner. Washington, DC: Soc. Neurosci, 2003, Online. 112. Irvine DRF. The Auditory Brainstem: A Review of the Structure and Function of Auditory Brainstem Processing Mechanisms. Berlin: Springer-Verlag, 1986. 113. Irvine DRF. Physiology of the auditory brainstem. In: The Mammalian Auditory Pathway: Neurophysiology, edited by A. N. Popper and R. R. Fay. New York: Springer-Verlag, 1992, p. 153–231. 114. Javel E. Coding of AM tones in the chinchilla auditory nerve: implications for the pitch of complex tones. J Acoust Soc Am 68: 133–146, 1980. 115. Javel E and Mott JB. Physiological and psychophysical correlates of temporal processes in hearing. Hear Res 34: 275–294, 1988. 116. Jen PHS, Hou T, and Wu M. Neurons in the inferior colliculus, auditory cortex and pontine nuclei of the FM bat, Eptesicus fuscus, respond to pulse repetition rates differently. Brain Res 613: 152–155, 1993. 117. Jenison RL. A Dynamic Model of the Auditory Periphery Based on the Responses of Single Auditory-Nerve Fibers (PhD thesis). Madison: Univ. of Wisconsin, 1991. 118. Jiang D, Palmer AR, and Winter IM. Frequency extent of twotone facilitation in onset units in the ventral cochlear nucleus. J Neurophysiol 75: 380–395, 1996. 119. John MS and Picton TW. Human auditory steady-state responses to amplitude-modulated tones: phase and latency measurements. Hear Res 141: 57–79, 2000. 120. Johnson DH. The Response of Single Auditory-Nerve Fibers in the Cat to Single Tones: Synchrony and Average Discharge Rate (PhD thesis). Cambridge, MA: MIT, 1974. 121. Johnson DH. The relationship between spike rate and synchrony in responses of auditory-nerve fibers to single tones. J Acoust Soc Am 68: 1115–1122, 1980. 122. Joris PX. Envelope coding in the lateral superior olive. II.
Characteristic delays and comparison with responses in the medial superior olive. J Neurophysiol 76: 2137–2156, 1996. 123. Joris PX. Interaural time sensitivity dominated by cochlea-induced envelope patterns. J Neurosci 23: 6345–6350, 2003. 124. Joris PX, Carney LHC, Smith PH, and Yin TCT. Enhancement of synchronization in the anteroventral cochlear nucleus. I. Responses to tonebursts at the characteristic frequency. J Neurophysiol 71: 1022–1036, 1994. 125. Joris PX and Smith PH. Temporal and binaural properties in dorsal cochlear nucleus and its output tract. J Neurosci 18: 10157–10170, 1998. 126. Joris PX, Smith PH, and Yin TCT. Coincidence detection in the auditory system: 50 years after Jeffress. Neuron 21: 1235–1238, 1998. 127. Joris PX and Yin TCT. Responses to amplitude-modulated tones in the auditory nerve of the cat. J Acoust Soc Am 91: 215–232, 1992. 128. Joris PX and Yin TCT. Envelope coding in the lateral superior olive. I. Sensitivity to interaural time differences. J Neurophysiol 73: 1043–1062, 1995. 129. Joris PX and Yin TCT. Envelope coding in the lateral superior olive. III. Comparison with afferent pathways. J Neurophysiol 79: 253–269, 1998. 130. Kaas JH and Hackett TA. Subdivisions of auditory cortex and processing streams in primates. Proc Natl Acad Sci USA 97: 11793–11799, 2000. 131. Kay RH. Hearing of modulation in sounds. Physiol Rev 62: 894–975, 1982. 132. Kay RH and Matthews DR. On the existence in human auditory pathways of channels selectively tuned to the modulation present in frequency-modulated tones. J Physiol 225: 657–677, 1972. 133. Keller CH and Takahashi TT. Representation of temporal features of complex sounds by the discharge patterns of neurons in the owl’s inferior colliculus. J Neurophysiol 84: 2638–2650, 2000. 134. Kenmochi M and Eggermont JJ. Autonomous cortical rhythms affect temporal modulation transfer functions. Neuroreport 8: 1589–1593, 1997. 135. Khanna SM and Teich MC.
Spectral characteristics of the responses of primary auditory-nerve fibers to amplitude-modulated signals. Hear Res 39: 143–158, 1989. 136. Kiang NYS. Peripheral neural processing of auditory information. In: Handbook of Physiology. The Nervous System. Sensory Processes. Bethesda, MD: Am Physiol Soc, 1984, sect. 1, vol. III, pt. 2, chapt. 15, p. 639–674. 137. Kiang NYS. Curious oddments of auditory-nerve studies. Hear Res 49: 1–16, 1990. 138. Kilgard MP and Merzenich MM. Plasticity of temporal information processing in the primary auditory cortex. Nature Neurosci 1: 727–731, 1998. 139. Kilgard MP, Pandya PK, Vazquez J, Gehi A, Schreiner CE, and Merzenich MM. Sensory input directs spatial and temporal plasticity in primary auditory cortex. J Neurophysiol 86: 326–338, 2001. 140. Kim DO, Rhode WS, and Greenberg S. Responses of cochlear nucleus neurons to speech signals: neural encoding of pitch, intensity and other parameters. In: Auditory Frequency Selectivity, edited by B. C. J. Moore and R. D. Patterson. New York: Plenum, 1986, p. 281–288. 141. Kim DO, Sirianni JG, and Chang SO. Responses of DCN-PVCN neurons and auditory nerve fibers in unanesthetized cats to AM and pure tones: analysis with autocorrelation/power-spectrum. Hear Res 45: 95–113, 1990. 142. Kowalski N, Depireux DA, and Shamma SA. Analysis of dynamic spectra in ferret primary auditory cortex. I. Characteristics of single-unit responses to moving ripple spectra. J Neurophysiol 76: 3505–3523, 1996. 143. Kowalski N, Depireux DA, and Shamma SA. Analysis of dynamic spectra in ferret primary auditory cortex. II. Prediction of unit responses to arbitrary dynamic spectra. J Neurophysiol 76: 3524–3534, 1996. 144. Krishna SB and Semple MN. Auditory temporal processing: responses to sinusoidally amplitude-modulated tones in the inferior colliculus. J Neurophysiol 84: 255–273, 2000. 145. Kuwada S and Batra R.
Coding of sound envelopes by inhibitory rebound in neurons of the superior olivary complex in the unanesthetized rabbit. J Neurosci 19: 2273–2287, 1999. 146. Kuwada S, Batra R, and Maher VL. Scalp potentials of normal and hearing-impaired subjects in response to sinusoidally amplitude-modulated tones. Hear Res 21: 179–192, 1986. 147. Kuwada S, Yin TCT, Syka J, Buunen TJF, and Wickesberg RE. Binaural interaction in low-frequency neurons in inferior colliculus of the cat. IV. Comparison of monaural and binaural response properties. J Neurophysiol 51: 1306–1325, 1984. 148. Kvale M and Schreiner CE. Perturbative M-sequences for auditory systems identification. Acustica 83: 653–658, 1997. 149. Langner G. Periodicity coding in the auditory system. Hear Res 6: 115–142, 1992. 150. Langner G. Neural processing and representation of periodicity pitch. Acta Otolaryngol Suppl 532: 68–76, 1997. 151. Langner G, Sams M, Heil P, and Schulze H. Frequency and periodicity are represented in orthogonal maps in the human auditory cortex: evidence from magnetoencephalography. J Comp Physiol A Sens Neural Behav Physiol 181: 665–676, 1997. 152. Langner G and Schreiner CE. Periodicity coding in the inferior colliculus of the cat. I. Neuronal mechanisms. J Neurophysiol 60: 1799–1822, 1988. 153. Langner G, Schreiner CE, and Merzenich MM. Co-variation of latency and temporal resolution in the inferior colliculus of the cat. Hear Res 31: 197–202, 1987. 154. Lavine RA. Phase-locking in response of single neurons in cochlear nuclear complex of the cat to low-frequency tonal stimuli. J Neurophysiol 24: 467–483, 1971. 155. Le Beau FEN, Rees A, and Malmierca MS. Contribution of GABA- and glycine-mediated inhibition to the monaural temporal response properties of neurons in the inferior colliculus. J Neurophysiol 75: 902–919, 1996. 156. Lesser HD, O’Neill WE, Frisina RD, and Emerson RC. On-off units in the moustached bat inferior colliculus are selective for transients resembling “acoustic glint” from fluttering insect targets.
Exp Brain Res 82: 137–148, 1990. 157. Liang L, Lu T, and Wang X. Neural representations of sinusoidal amplitude and frequency modulations in the primary auditory cortex of awake primates. J Neurophysiol 87: 2237–2261, 2002. 158. Liberman MC. Auditory-nerve response from cats raised in a low-noise chamber. J Acoust Soc Am 63: 442–455, 1978. 159. Linden JF, Liu RC, Sahani M, Schreiner CE, and Merzenich MM. Spectrotemporal structure of receptive fields in areas AI and AAF of mouse auditory cortex. J Neurophysiol 90: 2660–2675, 2003. 160. Lorenzi C, Micheyl C, and Berthommier F. Neuronal correlates of perceptual amplitude-modulation detection. Hear Res 90: 219–227, 1995. 161. Lu T, Liang L, and Wang X. Neural representations of temporally asymmetric stimuli in the auditory cortex of awake primates. J Neurophysiol 85: 2364–2380, 2001. 162. Maison S, Micheyl C, and Collet L. Medial olivocochlear efferent system in humans studied with amplitude-modulated tones. J Neurophysiol 77: 1759–1768, 1997. 163. Makela JP, Karmos G, Molnar M, Csepe V, and Winkler I. Steady-state responses from the cat auditory cortex. Hear Res 45: 41–50, 1990. 164. Malmierca MS, Blackstad TW, Osen KK, and Molowny RL. The central nucleus of the inferior colliculus in rat: a Golgi and computer reconstruction study of the neuronal and laminar structure. J Comp Neurol 333: 1–27, 1993. 165. Malmierca MS, Leergaard TB, Bajo VM, Bjaalie JG, and Merchan MA. Anatomic evidence of a three-dimensional mosaic pattern of tonotopic organization in the ventral complex of the lateral lemniscus in cat. J Neurosci 18: 10603–10618, 1998. 166. Malmierca MS and Merchán MA. The auditory system. In: The Rat Nervous System, edited by G. Paxinos. San Diego, CA: Academic, 2004, p. 995–1080. 167. Malmierca MS, Rees A, Le Beau FEN, and Bjaalie JG. Laminar organization of frequency-defined axons within and between the inferior colliculi of the guinea pig. J Comp Neurol 357: 124–144, 1995.
168. Mardia KV and Jupp PE. Directional Statistics. New York: Wiley, 1999. 169. Markram H, Lübke J, Frotscher M, and Sakmann B. Regulation of synaptic efficacy by coincidence of postsynaptic APs and EPSPs. Science 275: 213–215, 1997. 170. Markram H and Tsodyks M. Redistribution of synaptic efficacy between neocortical pyramidal neurons. Nature 382: 807–810, 1996. 171. McAlpine D. Are pitch neurones the result of difference tones on the basilar membrane? Ass Res Otolaryngol Abstr 25: 40, 2002. 172. McAlpine D, Jiang D, Shackleton TM, and Palmer AR. Convergent input from brainstem coincidence detectors onto delay-sensitive neurons in the inferior colliculus. J Neurosci 18: 6026–6039, 1998. 173. Merzenich MM and Reid MD. Representation of the cochlea within the inferior colliculus of the cat. Brain Res 77: 397–415, 1974. 174. Metherate R and Weinberger NM. Cholinergic modulation of responses to single tones produces tone-specific receptive field alterations in cat auditory cortex. Synapse 6: 133–145, 1990. 175. Miller LM, Escabi MA, Read HL, and Schreiner CE. Functional convergence of response properties in the auditory thalamocortical system. Neuron 32: 151–160, 2001. 176. Miller LM, Escabi MA, Read HL, and Schreiner CE. Spectrotemporal receptive fields in the lemniscal auditory thalamus and cortex. J Neurophysiol 87: 516–527, 2002. 177. Miller MI and Sachs MB. Representation of voice pitch in discharge patterns of auditory-nerve fibers. Hear Res 14: 257–279, 1984. 178. Moody DB, Cole D, Davidson LM, and Stebbins WC. Evidence for a reappraisal of the psychophysical selective adaption paradigm. J Acoust Soc Am 76: 1076–1079, 1984. 179. Moore BCJ. Effects of relative phase of the components on the pitch of three component complex tones. In: Psychophysics and Physiology of Hearing, edited by E. F. Evans and J. P. Wilson. London: Academic, 1977, p. 349–358. 180. Moore BCJ. An Introduction to the Psychology of Hearing. San Diego, CA: Academic, 2003. 181.
Moore BCJ and Sek A. Effects of relative phase and frequency spacing on the detection of three-component amplitude modulation. J Acoust Soc Am 108: 2337–2344, 2001. 182. Morest DK and Oliver DL. The neuronal architecture of the inferior colliculus in the cat: defining the functional anatomy of the auditory midbrain. J Comp Neurol 222: 209–236, 1984. 183. Møller AR. Coding of amplitude and frequency modulated sounds in the cochlear nucleus of the rat. Acta Physiol Scand 86: 223–238, 1972. 184. Møller AR. Responses of units in the cochlear nucleus to sinusoidally amplitude-modulated tones. Exp Neurol 45: 104–117, 1974. 185. Møller AR. Latency of unit responses in cochlear nucleus determined in two different ways. J Neurophysiol 38: 812–821, 1975. 186. Møller AR. Dynamic properties of primary auditory fibers compared with cells in the cochlear nucleus. Acta Physiol Scand 98: 157–167, 1976. 187. Møller AR. Dynamic properties of the responses of single neurones in the cochlear nucleus of the rat. J Physiol 259: 63–82, 1976. 188. Møller AR. Coding of increments and decrements in stimulus intensity in single units in the cochlear nucleus of the rat. J Neurosci Res 4: 1–8, 1979. 189. Møller AR and Rees A. Dynamic properties of the responses of single neurons in the inferior colliculus of the rat. Hear Res 24: 203–215, 1986. 190. Müller-Preuss P. On the mechanisms of call coding through auditory neurons in the squirrel monkey. Eur Arch Psychiatry Neurol Sci 236: 50–55, 1986. 191. Müller-Preuss P, Flachskamm C, and Bieser A. Neural encoding of amplitude modulation within the auditory midbrain of squirrel monkeys. Hear Res 80: 197–208, 1994. 192. Nagarajan S, Cheung S, Bedenbaugh P, Beitel R, Schreiner CE, and Merzenich MM. Representation of spectral and temporal envelope of twitter vocalizations in common marmoset primary auditory cortex. J Neurophysiol 87: 1723–1737, 2002. 193. Neff WD, Diamond DM, and Casseday JH.
Behavioral studies of auditory discrimination. In: Handbook of Sensory Physiology, edited by W. D. Keidel and W. D. Neff. New York: Springer, 1975, p. 307–400. 194. Nelken I, Rotman Y, and Bar Yosef O. Responses of auditory-cortex neurons to structural features of natural sounds. Nature 397: 154–157, 1999. 195. Nelken I and Young ED. Two separate inhibitory mechanisms shape the responses of dorsal cochlear nucleus type IV units to narrowband and wideband stimuli. J Neurophysiol 71: 2446–2462, 1994. 196. Nelson PG, Erulkar SD, and Bryan JS. Responses of units of the inferior colliculus to time-varying acoustic stimuli. J Neurophysiol 29: 834–860, 1966. 197. Neuert V, Pressnitzer D, Patterson RD, and Winter IM. The responses of single units in the inferior colliculus of the guinea pig to damped and ramped sinusoids. Hear Res 159: 36–52, 2001. 198. Neuweiler G. Auditory adaptations for prey capture in echolocating bats. Physiol Rev 70: 615–641, 1990. 199. Oertel D, Bal R, Gardner SM, Smith PH, and Joris PX. Detection of synchrony in the activity of auditory nerve fibers by octopus cells of the mammalian cochlear nucleus. Proc Natl Acad Sci USA 97: 11773–11779, 2000. 200. Oliver DL and Huerta MF. Inferior and superior colliculi. In: The Mammalian Auditory Pathway: Neuroanatomy, edited by D. B. Webster, A. N. Popper, and R. R. Fay. New York: Springer-Verlag, 1992, p. 168–221. 201. Oliver DL and Shneiderman A. The anatomy of the inferior colliculus: a cellular basis for integration of monaural and binaural information. In: The Neurobiology of Hearing: The Central Auditory System, edited by R. A. Altschuler, R. P. Bobbin, B. M. Clopton, and D. W. Hoffman. New York: Raven, 1991, p. 195–222. 202. Osen KK. Cytoarchitecture of the cochlear nuclei in the cat. J Comp Neurol 136: 453–483, 1969. 203. Palmer AR. Encoding of rapid amplitude fluctuations by cochlear-nerve fibres in the guinea-pig. Arch Oto-Rhino-Laryngol 236: 197–202, 1982. 204.
Palombi PS, Backoff PM, and Caspary DM. Responses of young and aged rat inferior colliculus neurons to sinusoidally amplitude modulated stimuli. Hear Res 153: 174–180, 2001. 205. Patterson RD. The sound of a sinusoid: spectral models. J Acoust Soc Am 96: 1409–1418, 1994. 206. Patuzzi RB and Robertson D. Tuning in the mammalian cochlea. Physiol Rev 68: 1009–1082, 1988. 207. Pelleg-Toiba R and Wollberg Z. Discrimination of communication calls in the squirrel monkey: “call detectors” or “cell assemblies.” J Basic Clin Physiol Pharmacol 2: 257–271, 1991. 208. Perkel DH and Bullock TH. Neural coding. Neurosci Res Program 6: 221–348, 1968. 209. Peruzzi D, Sivaramakrishnan S, and Oliver DL. Identification of cell types in brain slices of the inferior colliculus. Neuroscience 101: 403–416, 2000. 210. Phillips DP. Neural representation of stimulus time in the primary auditory cortex. Ann NY Acad Sci 682: 104–118, 1993. 211. Phillips DP and Hall SE. Responses of single neurons in cat auditory cortex to time-varying stimuli: linear amplitude modulations. Exp Brain Res 67: 479–492, 1987. 212. Phillips DP, Hall SE, and Hollett JL. Repetition rate and signal level effects on neuronal responses to brief tone pulses in cat auditory cortex. J Acoust Soc Am 85: 2537–2549, 1989. 213. Picton TW, Dauman R, and Aran JM. Steady-state responses produced in humans using sinusoidal frequency-modulation. J Otolaryngol 16: 140–145, 1987. 214. Plomp R. The role of modulation in hearing. In: Hearing: Physiological Bases and Psychophysics, edited by R. Klinke and R. Hartmann. Berlin: Springer-Verlag, 1983, p. 270–276. 215. Poon PW and Chiu TW. Single cell responses to AM tones of different envelopes at the auditory midbrain. In: Acoustic Signal Processing in the Central Auditory System, edited by J. Syka.
New York: Plenum, 1997, p. 253–261. 216. Pressnitzer D, Winter IM, and Patterson RD. The responses of single units in the ventral cochlear nucleus of the guinea pig to damped and ramped sinusoids. Hear Res 149: 155–166, 2000. 217. Preuss A and Müller-Preuss P. Processing of amplitude modulated sounds in the medial geniculate body of the squirrel monkey. Exp Brain Res 79: 201–211, 1990. 218. Read HL, Winer JA, and Schreiner CE. Functional architecture of auditory cortex. Curr Opin Neurobiol 12: 433–440, 2002. 219. Reale RA and Brugge JF. Auditory cortical neurons are sensitive to static and continuously changing interaural phase cues. J Neurophysiol 64: 1247–1260, 1990. 220. Rees A, Green GGR, and Kay RH. Steady-state evoked responses to sinusoidally amplitude-modulated sounds recorded in man. Hear Res 23: 123–133, 1986. 221. Rees A, Malmierca MS, and Le Beau EN. Regularity of firing of neurons in the inferior colliculus. J Neurophysiol 77: 2945–2965, 1997. 222. Rees A and Møller AR. Responses of neurons in the inferior colliculus of the rat to AM and FM tones. Hear Res 10: 301–330, 1983. 223. Rees A and Møller AR. Stimulus properties influencing the responses of inferior colliculus neurons to amplitude-modulated sounds. Hear Res 27: 129–143, 1987. 224. Rees A and Palmer AR. Neuronal responses to amplitude-modulated and pure-tone stimuli in the guinea pig inferior colliculus, and their modification by broadband noise. J Acoust Soc Am 85: 1978–1994, 1989. 225. Rees A and Sarbaz A. The influence of intrinsic oscillations on the encoding of amplitude modulation by neurons in the inferior colliculus. In: Acoustical Signal Processing in the Central Auditory System, edited by J. Syka. New York: Plenum, 1997, p. 239–252. 226. Rhode WS. Interspike intervals as a correlate of periodicity pitch in cat cochlear nucleus. J Acoust Soc Am 97: 2414–2429, 1995. 227. Rhode WS. Physiological-morphological properties of the cochlear nucleus. In: Neurobiology of Hearing: the Central Auditory System, edited by R. A. Altschuler, R. P.
Bobbin, B. M. Clopton, and D. W. Hoffman. New York: Raven, 1991, p. 47–77. 228. Rhode WS. Temporal coding of 200% amplitude modulated signals in the ventral cochlear nucleus of the cat. Hear Res 77: 43–68, 1994. 229. Rhode WS and Greenberg S. Encoding of amplitude modulation in the cochlear nucleus of the cat. J Neurophysiol 71: 1797–1825, 1994. 230. Rhode WS and Smith PH. Characteristics of tone-pip response patterns in relationship to spontaneous rate in cat auditory nerve fibers. Hear Res 18: 159–168, 1985. 231. Rhode WS and Smith PH. Encoding timing and intensity in the ventral cochlear nucleus of the cat. J Neurophysiol 56: 261–286, 1986. 232. Rieke F, Bodnar D, and Bialek W. Naturalistic stimuli increase the rate and efficiency of information transmission by primary auditory neurons. Proc R Soc Lond B Biol Sci 262: 259–265, 1995. 233. Ritsma RJ. Existence region of the tonal residue. J Acoust Soc Am 34: 1224–1229, 1962. 234. Robles L and Ruggero MA. Mechanics of the mammalian cochlea. Physiol Rev 81: 1305–1352, 2001. 235. Rodenburg M, Verveij C, and Van Den Brink G. Analysis of evoked responses in man elicited by sinusoidally amplitude modulated noise. Audiology 11: 283–293, 1972. 236. Rodrigues-Dagaeff C, Simm G, De Ribaupierre Y, Villa A, De Ribaupierre F, and Rouiller EM. Functional organization of the ventral division of the medial geniculate body of the cat: evidence for a rostro-caudal gradient of response properties and cortical projections. Hear Res 39: 103–126, 1989. 237. Rose JE, Greenwood DD, Goldberg JM, and Hind JE. Some discharge characteristics of single neurons in the inferior colliculus of the cat. I. Tonotopical organization, relation of spike-counts to tone intensity, and firing patterns of single elements. J Neurophysiol 26: 294–320, 1963. 238. Rosen S. Temporal information in speech: auditory and linguistic aspects. Philos Trans R Soc Lond B Biol Sci 336: 367–373, 1992. 239. Ross B, Borgmann C, Draganova R, Roberts LE, and Pantev C.
A high-precision magnetoencephalographic study of human auditory steady-state responses to amplitude-modulated tones. J Acoust Soc Am 108: 679–691, 2000. 240. Rouiller E and De Ribaupierre F. Neurons sensitive to narrow ranges of repetitive acoustic transients in the medial geniculate body of the cat. Brain Res 48: 323–326, 1982. 241. Rouiller E, De Ribaupierre Y, Toros-Morel A, and De Ribaupierre F. Neural coding of clicks in the medial geniculate body of cat. Hear Res 5: 81–100, 1981. 242. Ruggero MA. Systematic errors in indirect estimates of basilar membrane travel times. J Acoust Soc Am 67: 707–710, 1980. 243. Ruggero MA. Physiology and coding of sound in the auditory nerve. In: The Mammalian Auditory Pathway: Neurophysiology, edited by R. R. Fay and A. N. Popper. New York: Springer-Verlag, 1992, p. 34–93. 244. Ruggero MA and Rich NC. Timing of spike initiation in cochlear afferents: dependence on site of innervation. J Neurophysiol 58: 379–403, 1987. 245. Ryan MJ and Rand AS. Phylogenetic inference and the evolution of communication in tungara frogs. In: The Design of Animal Communication, edited by M. D. Hauser and M. Konishi. Cambridge, MA: MIT, 1999, p. 535–557. 246. Sachs MB and Abbas PJ. Rate versus level functions for auditory-nerve fibers in cats: tone-burst stimuli. J Acoust Soc Am 56: 1835–1847, 1974. 247. Saitoh K, Maruyama N, and Kudoh M. Sustained response of auditory cortex units in the cat. In: Brain Mechanisms of Sensation, edited by Y. Katsuki, R. Norgren, and M. Sato. New York: Wiley, 1981, p. 31–43. 248. Saldana E, Feliciano M, and Mugnaini E. Distribution of descending projections from primary auditory neocortex to inferior colliculus mimics the topography of intracollicular projections. J Comp Neurol 371: 15–40, 1996. 249. Sally SL and Kelly JB. Organization of auditory cortex in the albino rat: sound frequency. J Neurophysiol 59: 1627–1638, 1988. 250. Schorer E.
Critical modulation frequency based on detection of AM versus FM tones. J Acoust Soc Am 79: 1054–1057, 1986. 251. Schreiner CE and Langner G. Periodicity coding in the inferior colliculus of the cat. II. Topographical organization. J Neurophysiol 60: 1823–1840, 1988. 252. Schreiner CE and Langner G. Laminar fine structure of frequency organization in auditory midbrain. Nature 388: 383–386, 1997. 253. Schreiner CE and Raggio MW. Neuronal responses in cat primary auditory cortex to electrical cochlear stimulation. II. Repetition rate coding. J Neurophysiol 75: 1283–1300, 1996. 254. Schreiner CE and Snyder RL. Modulation transfer characteristics of neurons in the dorsal cochlear nucleus of the cat. Soc Neurosci Abstr 13: 1258, 1987. 255. Schreiner CE and Urbas JV. Representation of amplitude modulation in the auditory cortex of the cat. I. The anterior auditory field (AAF). Hear Res 21: 227–241, 1986. 256. Schreiner CE and Urbas JV. Representation of amplitude modulation in the auditory cortex of the cat. II. Comparison between cortical fields. Hear Res 32: 49–64, 1988. 257. Schroeder MR. Modulation transfer functions: definition and measurement. Acustica 49: 179–182, 1981. 258. Schuller G. Natural ultrasonic echoes from wing beating insects are encoded by collicular neurons in the CF-FM bat, Rhinolophus ferrumequinum. J Comp Physiol A Sens Neural Behav Physiol 155: 121–128, 1984. 259. Schulze H and Langner G. Periodicity coding in the primary auditory cortex of the Mongolian gerbil (Meriones unguiculatus): two different coding strategies for pitch and rhythm? J Comp Physiol A Sens Neural Behav Physiol 181: 651–664, 1997. 260. Schulze H and Langner G. Representation of periodicity pitch in the primary auditory cortex of the Mongolian gerbil. Acta Otolaryngol Suppl 532: 89–95, 1997. 261. Schulze H and Langner G.
Auditory cortical responses to amplitude modulations with spectra above frequency receptive fields: evidence for wide band spectral integration. J Comp Physiol A Sens Neural Behav Physiol 185: 493–508, 1999. 262. Schwartz IR. Superior olivary complex and lateral lemniscal nuclei. In: The Mammalian Auditory Pathway: Neuroanatomy, edited by D. B. Webster, A. N. Popper, and R. R. Fay. New York: Springer-Verlag, 1992, p. 117–167. 263. Sek A and Moore BCJ. The critical modulation frequency and its relationship to auditory filtering at low frequencies. J Acoust Soc Am 95: 2606–2486, 1994. 264. Semple MN and Aitkin LM. Representation of sound frequency and laterality by units in central nucleus of cat inferior colliculus. J Neurophysiol 42: 1626–1639, 1979. 265. Shannon RV, Zeng FG, Kamath V, Wygonski J, and Ekelid M. Speech recognition with primarily temporal cues. Science 270: 303–304, 1995. 266. Shofner WP, Sheft S, and Guzman SJ. Responses of ventral cochlear nucleus units in the chinchilla to amplitude modulation by low-frequency, two-tone complexes. J Acoust Soc Am 99: 3592–3605, 1996. 267. Sinex DG, Henderson J, Li HZ, and Chen GD. Responses of chinchilla inferior colliculus neurons to amplitude-modulated tones with different envelopes. J Assoc Res Otolaryngol 3: 390–402, 2002. 268. Sivaramakrishnan S and Oliver DL. Distinct K currents result in physiologically distinct cell types in the inferior colliculus of the rat. J Neurosci 21: 2861–2877, 2001. 269. Smith PH, Joris PX, and Yin TCT. Projections of physiologically characterized spherical bushy cell axons from the cochlear nucleus of the cat: evidence for delay lines to the medial superior olive. J Comp Neurol 331: 245–260, 1993. 270. Smith RL and Brachman ML. Response modulation of auditory-nerve fibers by AM stimuli: effects of average intensity. Hear Res 2: 123–133, 1980. 271. Smith RL and Brachman ML. Adaptation in auditory-nerve fibers: a revised model. Biol Cybern 44: 107–120, 1982. 272. Smith ZM, Delgutte B, and Oxenham AJ.
Chimaeric sounds reveal dichotomies in auditory perception. Nature 416: 87–90, 2002. 273. Sovijarvi ARA. Detection of natural complex sounds by cells in the primary auditory cortex of the cat. Acta Physiol Scand 93: 318–335, 1975. 274. Steinschneider M, Reser DH, Fishman YI, Schroeder CE, and Arezzo JC. Click train encoding in primary auditory cortex of the awake monkey: evidence for two mechanisms subserving pitch perception. J Acoust Soc Am 104: 2935–2955, 1998. 275. Struhsaker CT. Auditory communication among vervet monkeys (Cercopithecus aethiops). In: Social Communication Among Primates, edited by S. A. Altmann. Chicago, IL: Univ. of Chicago Press, 1967, p. 281–324. 276. Suga N, O’Neill WD, Kujirai K, and Manabe T. Specificity of combination sensitive neurons for processing of complex biosonar signals in auditory cortex of the mustached bat. J Neurophysiol 49: 1573–1626, 1983. 277. Symmes D. Discrimination of intermittent noise by macaques following lesions of the temporal lobe. Exp Neurol 16: 201–214, 1966. 278. Terhardt E. Über die durch amplitudenmodulierte Sinustöne hervorgerufene Hörenempfindung. Acustica 20: 210–214, 1968. 279. Tsuchitani C and Johnson DH. Binaural cues and signal processing in the superior olivary complex. In: Neurobiology of Hearing: The Central Auditory System, edited by R. A. Altschuler, R. P. Bobbin, B. M. Clopton, and D. W. Hoffman. New York: Raven, 1991, p. 163–193. 280. Ulanovsky N, Las L, and Nelken I. Processing of low-probability sounds by cortical neurons. Nature Neurosci 6: 391–398, 2003. 281. Van Tassell DJ, Soli SD, Kirby VM, and Widin GP. Speech waveform envelope cues for consonant recognition. J Acoust Soc Am 82: 1152–1161, 1987. 282. Vater M. Single unit responses in cochlear nucleus of horseshoe bats to sinusoidal frequency and amplitude modulated signals? J Comp Physiol A Sens Neural Behav Physiol 149: 369–388, 1982. 283. Verhey JL, Dau T, and Kollmeier B.
Within-channel cues in comodulation masking release (CMR): experiments and model predictions using a modulation-filterbank model. J Acoust Soc Am 106: 2733–2745, 1999. 284. Vernier VG and Galambos R. Response of single medial geniculate units to repetitive click stimuli. Am J Physiol 188: 233–237, 1957. 285. Viemeister NF. Temporal modulation transfer functions based upon modulation thresholds. J Acoust Soc Am 66: 1364 –1380, 1979. 286. Viemeister NF and Plack CJ. Time analysis. In: Human Psychophysics, edited by W. A. Yost, A. N. Popper, and R. R. Fay. New York: Springer, 1993, p. 116 –154. 287. Von Helmholtz HLF. Die Lehre von den Tonempfindungen als physiologiche Grundlage für die Theorie der Musik. Trans. Ellis AJ 1954. On the Sensations of Tone as a Physiological Basis for the Theory of Music. New York: Dover, 1863. 288. Voss RF and Clarke J. 1/f noise in music and speech. Nature 258: 317–318, 1975. 289. Wakefield GH and Viemeister NF. Selective adaption to linear frequency-modulated sweeps: evidence for direction-specific FM channels. J Acoust Soc Am 75: 1588 –1592, 1984. 290. Walton JP, Frisina RD, and O’Neill WE. Age-related alteration in processing of temporal sound features in the auditory midbrain of the cba mouse. J Neurosci 18: 2764 –2776, 1998. 291. Walton JP, Simon H, and Frisina RD. Age-related alterations in the neural coding of envelope periodicities. J Neurophysiol 88: 565–578, 2002. 292. Wang X, Merzenich MM, Beitel R, and Schreiner CE. Representation of a species-specific vocalization in the primary auditory cortex of the common marmoset: temporal and spectral characteristics. J Neurophysiol 74: 2685–2706, 1995. 293. Wang X, Merzenich MM, Beitel R, and Schreiner CE. Representation of a species-specific vocalization in the primary auditory cortex of the common marmoset: temporal and spectral characteristics. J Neurophysiol 74: 2685–2706, 1995. 294. Wang X and Sachs MB. Neural encoding of single-formant stimuli in the cat. I. 
Responses of auditory nerve fibers. J Neurophysiol 70: 1054 –1075, 1993. 295. Wang X and Sachs MB. Neural encoding of single-formant stimuli in the cat. II. Responses of anteroventral cochlear nucleus units. J Neurophysiol 71: 59 –78, 1994. 296. Wang X and Sachs MB. Transformation of temporal discharge patterns in a ventral cochlear nucleus stellate cell model: implications for physiological mechanisms. J Neurophysiol 73: 1600 –1616, 1995. 297. Warr WB. Parallel ascending pathways from the cochlear nucleus: neuroanatomical evidence of functional specialization. In: Contributions to Sensory Physiology, edited by W. D. Neff. New York: Academic, 1982, p. 1–38. 298. Weiss TF and Rose C. A comparison of synchronization filters in different auditory receptor organs. Hear Res 33: 175–180, 1988. 299. Whitfield IC. Auditory cortex and the pitch of complex tones. J Acoust Soc Am 67: 644 – 647, 1980. 300. Whitfield IC and Evans EF. Responses of auditory cortical neurons to stimuli of changing frequency. J Neurophysiol 28: 655– 672, 1965. 301. Wiegrebe L and Winter IM. Temporal representation of iterated rippled noise as a function of delay and sound level in the ventral cochlear nucleus. J Neurophysiol 85: 1206 –1219, 2001. 302. Wightman FL and Green DM. The perception of pitch. Am Sci 62: 208 –215, 1974. 303. Wilson HR and Wilkinson F. Evolving concepts of spatial channels in vision: from independence to nonlinear interactions. Perception 26: 939 –960, 1997. 304. Winer JA. The functional architecture of the medial geniculate 84 • APRIL 2004 • www.prv.org AUDITORY MODULATION PROCESSING 305. 306. 307. 308. 309. 310. 311. 312. 313. 314. body and primary auditory cortex. In: The Mammalian Auditory Pathway: Neuroanatomy, edited by D. B. Webster, A. N. Popper, and R. R. Fay. New York: Springer-Verlag, 1992, p. 222– 409. Winter IM, Robertson D, and Yates GK. Diversity of characteristic frequency rate-intensity functions in guinea pig auditory nerve fibers. 
Hear Res 45: 191–202, 1990. Winter P and Funkenstein HH. The effects of species-specific vocalization on the discharge of auditory cortical cells in the awake squirrel monkey (Saimiri sciureus). Exp Brain Res 18: 489 –504, 1973. Wollberg Z and Newman JD. Auditory cortex of squirrel monkey: response patterns of single cells to species-specific vocalisations. Science 175: 212–214, 1972. Wong D, Maekawa M, and Tanaka H. The effects of pulse repetition rate on the delay sensitivity of neurons in the auditory cortex of the FM bat, Myotis lucifugus. J Comp Physiol A Sens Neural Behav Physiol 170: 393– 402, 1992. Wong SW and Schreiner CE. Representation of CV-sounds in cat primary auditory cortex: intensity dependence. Speech Communication 41: 93–106, 2003. Yang L and Pollak GD. Differential response properties to amplitude modulated signals in the dorsal nucleus of the lateral lemniscus of the mustache bat and the roles of GABAergic inhibition. J Neurophysiol 77: 324 –340, 1997. Yates GK. Dynamic effects in the input/output relationship of auditory nerve. Hear Res 27: 221–230, 1987. Yin TCT. Neural mechanisms of encoding binaural localization cues in the auditory brainstem. In: Integrative Functions in the Mammalian Auditory Pathway, edited by D. Oertel, A. N. Popper, and R. R. Fay. New York: Springer, 2002, p. 99 –159. Yin TCT and Chan JCK. Interaural time sensitivity in medial superior olive of cat. J Neurophysiol 64: 465– 488, 1990. Yin TCT, Chan JCK, and Irvine DRF. Effects of interaural time delays of noise stimuli on low-frequency cells in the cat’s inferior colliculus. I. Responses to wideband noise. J Neurophysiol 55: 280 –300, 1986. Physiol Rev • VOL 577 315. Yin TCT, Chan JK, and Kuwada S. Characteristic delays and their topographic distribution in the inferior colliculus of the cat. In: Mechanisms of Hearing, edited by W. R. Webster and L. M. Aitkin. Clayton, Victoria, Australia: Monash Univ. Press, 1983, p. 94 –99. 316. 
Yin TCT, Joris PX, Smith PH, and Chan JCK. Neuronal processing for coding interaural time disparities. In: Binaural and Spatial Hearing in Real and Virtual Environments, edited by R. Gilkey and T. Anderson. New York: Lawrence Erlbaum, 1997, p. 427– 445. 317. Yin TCT, Kuwada S, and Sujaku Y. Interaural time sensitivity of high-frequency neurons in the inferior colliculus. J Acoust Soc Am 76: 1401–1410, 1984. 318. Yost WA, Sheft S, and Opie J. Modulation interference in detection and discrimination of amplitude modulation. J Acoust Soc Am 86: 2138 –2147, 1989. 319. Young ED. The cochlear nucleus. In: The Synaptic Organization of the Brain, edited by G. M. Shepherd. New York: Oxford Univ. Press, 1998, p. 121–158. 320. Young ED, Robert JM, and Shofner WP. Regularity and latency of units in ventral cochlear nucleus: implications for unit classification and generation of response properties. J Neurophysiol 60: 1–29, 1988. 321. Zhang HM and Kelly JB. AMPA and NMDA receptors regulate responses of neurons in the rat’s inferior colliculus. J Neurophysiol 86: 871– 880, 1902. 322. Zhao HB and Liang ZA. Processing of modulation frequency in the dorsal cochlear nucleus of the guinea pig: amplitude modulated tones. Hear Res 82: 244 –256, 1995. 323. Zhao HB and Liang ZA. Temporal encoding and transmitting of amplitude and frequency modulations in dorsal cochlear nucleus. Hear Res 106: 83–94, 1997. 324. Zwicker E. Die Grenzen der Hörbarkeit der Amplitudenmodulation und der Frequenzmodulation eines Tones. Acustica 2: 125–133, 1952. 325. Zwicker E and Fastl H. Psychoacoustics. Berlin: Springer, 1999. 84 • APRIL 2004 • www.prv.org