Physiol Rev 93: 1563–1619, 2013
doi:10.1152/physrev.00029.2012

AUDITORY DISTORTIONS: ORIGINS AND FUNCTIONS

Paul Avan, Béla Büki, and Christine Petit

Laboratory of Neurosensory Biophysics, University of Auvergne, School of Medicine, Clermont-Ferrand, France; Institut National de la Santé et de la Recherche Médicale (INSERM), UMR 1107, Clermont-Ferrand, France; Centre Jean Perrin, Clermont-Ferrand, France; Department of Otolaryngology, County Hospital, Krems an der Donau, Austria; Laboratory of Genetics and Physiology of Hearing, Department of Neuroscience, Institut Pasteur, Paris, France; Collège de France, Genetics and Cell Physiology, Paris, France; INSERM UMRS 1120, Paris, France; and Université Pierre et Marie Curie, Paris, France

Avan P, Büki B, Petit C. Auditory Distortions: Origins and Functions. Physiol Rev 93: 1563–1619, 2013; doi:10.1152/physrev.00029.2012.—To enhance weak sounds while compressing the dynamic intensity range, auditory sensory cells amplify sound-induced vibrations in a nonlinear, intensity-dependent manner. In the course of this process, instantaneous waveform distortion is produced, with two conspicuous kinds of interwoven consequences: the introduction of new sound frequencies absent from the original stimuli, which are audible and detectable in the ear canal as otoacoustic emissions, and the possibility for an interfering sound to suppress the response to a probe tone, thereby enhancing contrast among frequency components. We review how the diverse manifestations of auditory nonlinearity originate in the gating principle of their mechanoelectrical transduction channels; how they depend on the coordinated opening of these ion channels ensured by connecting elements; and their links to the dynamic behavior of auditory sensory cells. This paper also reviews how the complex properties of waves traveling through the cochlea shape the manifestations of auditory nonlinearity. Examination methods based on the detection of distortions open noninvasive windows on the modes of activity of mechanosensitive structures in auditory sensory cells and on the distribution of sites of nonlinearity along the cochlear tonotopic axis, helpful for deciphering cochlear molecular physiology in hearing-impaired animal models. Otoacoustic emissions enable fast tests of peripheral sound processing in patients. The study of auditory distortions also contributes to the understanding of the perception of complex sounds.

I. INTRODUCTION
II. HISTORICAL PERSPECTIVE: AUDITORY...
III. MECHANOELECTRICAL TRANSDUCTION...
IV. NONLINEAR RESPONSE OF THE...
V. INTERACTION OF TWO TONES IN THE...
VI. FROM COMPLEX STIMULI TO NO...
VII. THE PARTICULAR STATUS OF THE...
VIII. MATHEMATICAL MODELS OF...
IX. AUDITORY MASKING AND THE COCHLEA
X. CONCLUSION AND PERSPECTIVES:...
XI. HISTORICAL INSERT: TARTINI’S TONES
XII. APPENDIX I: OTOACOUSTIC...
XIII. APPENDIX II: BACKWARD VERSUS...
XIV. APPENDIX III: COCHLEAR DISTORTION...
XV. APPENDIX IV: CHARACTERIZATION...

I. INTRODUCTION

A. Scope of the Review

Hearing has evolved to cover a huge range of sound intensities. The mammalian ear is sensitive enough to detect weak sounds [20 µPa, or 0 dB sound pressure level (SPL)] when the vibrations of molecules in the transmitting medium barely exceed displacements due to thermal, Brownian motion (19). This impressive sensitivity is achieved by a subtle sound-amplifying mechanism operated by the sensory cells themselves.
This amplifier in the mammalian auditory organ, the cochlea, has been reviewed previously (228). The acoustic powers processed by the auditory system extend over a thousand billion times above detection threshold (20 Pa, or 120 dB SPL). Furthermore, in the presence of noise, the cochlear ability to extract the characteristics of sound remains robust, thereby allowing efficient central processing (“cocktail-party effect”; Ref. 39). If amplification were linear, the displacements of sound-detecting cochlear receptors, of the order of a nanometer around threshold, would reach the potentially damaging micrometer limit around 60 dB SPL, the normal level of conversational speech. Moreover, the resulting responses could not fit in the restricted dynamic range of auditory neurons innervating the sensory cells, as their action potential rates can only vary little, between a few per second, the spontaneous rate of their activity, and a few hundred per second at saturation. Actually, amplification gradually diminishes at increasing stimulus intensities, thereby producing the required compression. So, the displacements of living auditory detectors are a highly nonlinear version of the vibrations of the external transmitting medium.

Nonlinear systems share a generic definition: the response to a combination of stimuli is not a simple algebraic sum of individual responses. However, depending on the design of a nonlinear system, the mathematical rules describing its behavior can be very different. For example, compressive nonlinearity may be achieved by instantaneous clipping of the stimulus when it exceeds some limit. This strategy has a drawback: it deforms the stimulus strongly and causes waveform distortion (WD). An alternative method of compression, the sluggish automatic gain control (AGC) acting over several stimulus periods, popular in the engineering and hearing-aid worlds, generates, in contrast, minimal WD. Finally, critical oscillators poised on the verge of self-oscillation (34, 60) could also combine reasonably fast-reacting compressive amplification and limited WD.

The scope of this review is to unravel the origin and design of auditory nonlinearity. It will lead us to examine how its manifestations, compression and WD, relate to the mechanoelectrical transduction (MET) process. Its operation inescapably produces WD, and the mechanical coupling of auditory sensory cells to their surroundings transforms nonlinear products into cochlear vibrations. However, evolution seems to have managed to reach the best balance between compressive (nonlinear) amplification, faithful WD-free processing of sound, and the potential beneficial effects of WD. Indeed, although nonlinearity results in conspicuous distortions in the frequency spectrum of complex sounds, with the suppression of existing components and the creation of novel ones not present in the stimulus, perception puts up with these distortions. As we will discover, WDs also have a practical implication. Being easy to detect noninvasively, they may be used to probe the peripheral auditory function.
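For readers less used to decibel arithmetic, the orders of magnitude quoted above can be checked in a few lines; this is a purely illustrative sketch (the 1-nm threshold displacement and the 0.33 dB/dB slope are the round figures used in this review, not measurements), showing why linear amplification would reach the micrometer range at 60 dB SPL whereas compressive growth does not.

```python
P_REF = 20e-6  # dB SPL reference pressure, 20 µPa

def spl_to_pressure(db_spl):
    """Convert a level in dB SPL to sound pressure in Pa."""
    return P_REF * 10 ** (db_spl / 20)

print(spl_to_pressure(0))     # 2e-05 Pa, detection threshold
print(spl_to_pressure(60))    # 0.02 Pa, conversational speech
print(spl_to_pressure(120))   # 20 Pa, upper limit quoted above

# Receptor displacement of ~1 nm at threshold:
x0 = 1e-9
print(x0 * 10 ** (60 / 20))          # linear growth: 1e-06 m at 60 dB SPL
print(x0 * 10 ** (0.33 * 60 / 20))   # compressive growth (~0.33 dB/dB): ~1e-08 m
```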
B. Anatomical and Physiological Framework

In mammals, the cochlear duct is divided longitudinally, from base to apex, by the cochlear partition made of the basilar membrane (BM) and the overlying auditory sensory epithelium, the organ of Corti. This partition separates two of the three internal compartments of the cochlea, scala tympani (ST) and scala media (FIGURE 1A). Reissner’s membrane separates scala media from the third cochlear compartment, scala vestibuli (SV). Both ST and SV are filled with perilymph (FIGURE 1A, light yellow area) while scala media contains endolymph (FIGURE 1A, light-blue area), a fluid that differs from perilymph and other extracellular fluids by its high K+ concentration and a positive electrical potential of about +100 mV. Sound pressure waves are transmitted to cochlear fluids via the outer and middle ears to SV by an ossicle, the stapes, which, by vibrating in the oval window of the cochlea, produces a pressure gradient through the BM.

The organ of Corti contains the auditory sensory cells, the inner hair cells (IHCs) and outer hair cells (OHCs). The transduction of sound is achieved by a unique structural feature of hair cells, protruding microvilli, namely F-actin-filled processes called stereovilli or commonly, though improperly, stereocilia, which are organized in bundles (FIGURE 1, B AND C). When mechanical vibrations produced by sound propagate along the BM (see sect. IVA), the resulting shearing motion between the BM and the tectorial membrane overlying the hair cells displaces stereocilia bundles. In these bundles, stereocilia organized in three rows are connected by multiple fibrous links, among which the so-called tip-links extend from the tip of a stereocilium to the side of the adjacent taller one (FIGURE 1C). When tensed by vibrational movements, tip-links produce a conformational change of MET channels located at the tip of stereocilia of the small and medium rows (18), thereby modulating their conductance (gating). In consequence, driven by their positive electrochemical gradient, K+ ions, which abound in the endolymph, enter the cells, together with Ca2+ ions. The ion currents lead to cell depolarization and to interactions between Ca2+ ions and various molecules inside the hair cell (35, 71).

It was suggested in the 1940s (77), then confirmed in the 1980s, that amplification of acoustic stimuli in mammals stems from a positive feedback mechanism (45) resulting from bidirectional transduction in OHCs (287), namely, their ability to respond mechanically to membrane-potential changes produced by MET (30; for a review, see Ref. 8). Electromotility is due to the presence of a motor protein, prestin, in their lateral plasma membrane (307). The resulting forces influence the mechanical impedance of the cochlear partition by feeding energy back into its movements. When applied with the proper timing, this feedback partly compensates for the loss of energy due to friction with cochlear fluids, thereby achieving amplification in an intensity-dependent, nonlinear, compressive manner (228). The sharpened BM resonance elicited by the positive feedback process converts the originally passive and broadly tuned frequency filtering of the cochlear partition into a highly tuned frequency representation, in which the high and low frequencies are processed at the base and apex of the cochlea, respectively, forming a tonotopic map along the longitudinal cochlear axis (200). Whereas OHCs are essentially confined to a micromechanical activity, IHCs, aligned along the longitudinal axis in a single row of ~3,500 cells (in humans) and innervated by afferent neurons of the auditory nerve (FIGURE 1A), form the actual sensory cells of the cochlea.
FIGURE 1. A: schematic cross-section of a mammalian cochlea, showing the three fluid-filled compartments scala vestibuli (SV), scala media (SM) and scala tympani (ST), the basilar membrane (BM), and Reissner’s membrane (RM). The organ of Corti contains one row of inner hair cells (IHCs, black arrow) and three rows of outer hair cells (OHCs, red arrow), overlaid by the tectorial membrane (TM). Afferent auditory neurons forming synapses at the base of IHCs make the auditory nerve (AN). B: scanning electron micrograph of the organ of Corti of a mouse. The tectorial membrane (TM) is lifted, revealing the W-shaped stereocilia bundles protruding from the tops of OHCs organized in three spiraling rows. Beneath the reticular lamina (RL), the cylindrical bodies of OHCs can be seen behind the thin, winding elongations (phalangeal processes, PP) of supporting cells (the Deiters cells). The IHCs are not visible; they form a spiraling longitudinal row hidden beneath the tectorial membrane. On the lower side of the tectorial membrane, the imprints of the tallest OHC stereocilia can be recognized (one imprint is delineated by arrows). C: scanning electron micrograph of the top of an OHC, with its stereocilia bundle organized in three rows. The tips of small and intermediate stereocilia are connected to the sides of their taller neighbors by tip-links (arrows), essential for opening the MET channels during auditory stimulation. At this early stage (postnatal day 7), the tops of stereocilia are not yet connected by transverse links (see FIGURE 5). Scale bars: 4 µm (B) and 1 µm (C). [Courtesy of Aziz El Amraoui (A) and Vincent Michel (B and C).]

Hair cells fulfilling comparable functions are found in a broad spectrum of vertebrate orders, from frogs through lizards and birds, both in hearing and vestibular organs (104, 158). Compared with mammals, the ways sound-induced vibrations reach the hair cells and set their stereocilia bundles into motion are different, and auditory hair cells are less specialized than OHCs and IHCs. [Frequency tuning can stem from the intrinsic resonance of hair bundles or from electrical properties of basolateral membranes (158).] Compressive amplification is also ensured by a positive feedback principle even though the somatic, prestin-driven electromotility is substituted by “active” hair-bundle motility (104). The topic of this review pertains to hair cell-driven nonlinearities and not to the exact feedback mechanism affording amplification and frequency tuning. Examples will be borrowed from nonmammalian auditory sensory organs when needed.

The gating principle of a mechanosensitive channel necessarily produces cycle-by-cycle WD (101) and even clipping at saturating levels. Because of the feedback process, the sound wave of the stimulus itself gets contaminated by WD. Flattening and clipping add harmonics 2f, 3f, etc., to the sinusoidal wave of a tone at frequency f. If the stimulus contains a combination of tones, as the processing of auditory stimulation on the BM is spatially distributed, waves at neighboring and even remote frequencies travel together and can interact. The response to one tone of a complex stimulus may result in smaller displacements of the BM than it would have done if the tone had been presented in isolation. The stimuli undergoing this so-called suppression become less audible.
Even though suppression diminishes the audibility of weak probe tones in the presence of noise, it has also been suggested that by increasing contrasts, it has overall positive influences on sound analysis (262). Another result of WD is that new sinusoids at arithmetic combinations of the original frequencies emerge in the response. This cochlear nonlinearity is found even at the lowest measurable stimulus levels, which led it to be called “essential” (79) in contrast to the nonlinearities only observed at saturation. Although WD deforms the frequency spectrum of incoming sounds, it does not dramatically disturb perception as the cochlear compressive and filtering activity, applied to WD, ensures that it remains small in most circumstances.

All manifestations of auditory nonlinearities, compression, WD, suppression, even spontaneous activity, have been proposed to emerge from the same processes and coexist or vanish together. Thus their study should highlight basic cochlear mechanisms of sound processing including the analysis of complex sounds. Otoacoustic emissions (OAEs), sounds emitted by a nonlinear cochlea (121), deserve a particular emphasis. Although usually mixed with the stimulus that evokes them, they are easily singled out by their characteristic nonlinear properties and long latencies. They have become unique tools for noninvasive evaluation of peripheral hearing, whether for human audiology or in animal models with cochlear deficits.

II. HISTORICAL PERSPECTIVE: AUDITORY STIMULUS PROCESSING AND THE NONLINEARITY PRINCIPLE

The existence of WD in a system supposed to faithfully convert sound messages into information for the brain is counterintuitive, e.g., for high-fidelity specialists who obsessively fight any form of distortion. Thus before 1978, almost all reports of objective measurements of WD, apart from a few exceptions (193, 289), ascribed them to nonlinearities outside the cochlea, e.g., either in the middle ear (95) or in the earphones. Actually, the middle ear does not distort at physiological SPLs (4), and earphone nonlinearities can be controlled (260). However, a few early psychophysical studies, which will be reviewed first, noticed unique features of WD that identified the cochlea as its source. With regard to objective measurements, it took two sets of seemingly unconnected observations to make clear that an intact cochlea works nonlinearly and is not a passive resonator (FIGURE 2). The first set of data (228) concerned the compressive growth of BM motion with the level of the stimulus at the cochlear places tuned to the frequency of the probe tone (225). The second set of data described instantaneous WD in normal cochleas, the central topic of this review.

FIGURE 2. Two manifestations of nonlinearity in the cochlea. Left: BM velocity against tone level, at frequencies equal to and higher than the characteristic frequency, 10 kHz, at the measurement place: the closer the frequency to 10 kHz, the more compressive (shallower) the BM growth. [From Robles et al. (230).] Right: spectrum of the sound detected in the ear canal of a mouse in response to a pair of pure tones at frequencies f1 and f2 (60 dB SPL), displaying distortion products at combination frequencies (solid arrows, odd-order combinations; dashed arrows, even-order ones).
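The spectral picture of FIGURE 2 (right) can be reproduced qualitatively by passing a two-tone signal through any memoryless saturating transfer function; the sketch below uses an arbitrary Boltzmann-like sigmoid with an off-center operating point (illustrative parameters, not a cochlear model) and lists the levels found at a few combination frequencies.

```python
import numpy as np

fs = 48000
t = np.arange(fs) / fs                      # 1 s of signal, 1-Hz frequency bins
f1, f2 = 2000, 2400                         # primary tones, f2/f1 = 1.2
x = np.sin(2 * np.pi * f1 * t) + np.sin(2 * np.pi * f2 * t)

y = 1.0 / (1.0 + np.exp(-(1.5 * x - 0.5)))  # sigmoid with operating point off center

spec = np.abs(np.fft.rfft(y)) / len(y)
freqs = np.fft.rfftfreq(len(y), 1 / fs)

for f in (f1, f2, f2 - f1, 2 * f1 - f2, 2 * f2 - f1):
    level = 20 * np.log10(spec[np.argmin(np.abs(freqs - f))] + 1e-12)
    print(f"{f:5d} Hz: {level:6.1f} dB re full scale")
```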
A. Perceptive Evidence

It has been acknowledged for a long time that when two tones at frequencies f1 and f2 are mixed, intermodulation WD results in the production of audible tonal components at arithmetic combinations (f2-f1, 2f1-f2, 2f2-f1, etc.), absent in the stimulus, called distortion products (DPs) or combination tones. Their discovery is attributed to the musician Tartini (267), and the first physically based investigation, to Helmholtz, although doubts can be raised as to the exact physiological significance of these early observations (see sect. XI). The very unusual dependence of DPs on stimulus level and frequency spacing was first disclosed by psychoacoustical methods (79, 259) ascertaining their cochlear origin. Goldstein (79) evaluated the most prominent endogenous DP, i.e., at 2f1-f2, by cancelling it with an adjustable external tone at exactly 2f1-f2. At a precise combination of level and phase of the external tone, trained subjects, in whose ear the experiment was performed, no longer heard the 2f1-f2 component, as it was at the same level and exactly out of phase with the canceling tone. The concept of “essential nonlinearity” coined by Goldstein stemmed from his key observation that when the amplitude of stimuli decreased, that of the response at 2f1-f2 decreased in similar proportions (1 dB per dB decrease of the primary tones at f1 and f2) so that this nonlinear response was always approximately at the same dB level relative to the stimuli. In contrast, nonlinearities due to saturation in a passive system do not behave this way. It can be shown (see sect. VF) that the amplitude of the 2f1-f2 component depends on the square of the amplitude at f1 times the amplitude at f2, so that when both increase by 1 dB at the input of a passive, saturating nonlinear system, the 2f1-f2 component should increase by 3 dB.

The importance of Goldstein’s work was not immediately grasped. Today, its groundbreaking consequences are clear. Goldstein’s results indicate that there must exist a compressive amplifying device in the cochlea. More precisely, one has to assume that the amplitude of BM displacement increases by only 0.33 dB/dB increase of the stimulus, to explain why the 2f1-f2 response increases by the observed 1 dB/dB, i.e., 3 times 0.33 dB/dB. Rhode was the first to detect compression by direct mechanical measurements of BM vibrations (225), but the significance of his results was also overlooked for years. Another finding of Goldstein pointing to the cochlear origin of the DP was its strong dependence on frequency ratio f2/f1. Here was the first explicit statement that WD, cochlear sensitivity, compression, and frequency selectivity may be associated (79). Last, the audibility of the 2f1-f2 DP for stimuli down to 20 dB SPL (79) called for another explanation of this distortion than saturation.

B. Objective Evidence: Otoacoustic Emissions

Goldstein’s discovery relied on perceptive reports from two subjects, including himself. Objective evidence of intermodulation from the cochlea was provided by the pioneer studies of David Kemp (120, 121), revealing the existence of OAEs from the ear. These low-intensity sounds emitted by healthy ears, either spontaneously or in response to sound stimuli, could be collected by a microphone sealed in the ear canal. Three different stimulus paradigms, using transient sounds (clicks, with a broad frequency spectrum), a single pure tone or a pair of pure tones, evoke three main categories of sound-evoked OAEs.
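Goldstein’s level-growth argument can be restated with a generic power-series expansion (a textbook calculation, not specific to the cochlea). Writing the transfer function as y = c1x + c2x² + c3x³ and the input as two tones of amplitudes a1 and a2,

\[
x(t) = a_1 \cos\omega_1 t + a_2 \cos\omega_2 t
\quad\Rightarrow\quad
c_3 x^3 \;\supset\; \tfrac{3}{4}\,c_3\,a_1^{2}a_2 \cos\!\big[(2\omega_1 - \omega_2)t\big],
\]

so that the 2f1-f2 component grows by 2 + 1 = 3 dB when both primaries are raised by 1 dB in a passive saturating system. If, instead, the intracochlear amplitudes driving the nonlinearity grow by only about 0.33 dB per dB of stimulus, the predicted DP growth is 3 × 0.33 ≈ 1 dB/dB, as Goldstein observed.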
The first observations were reported by Kemp (121) as transient-evoked OAEs (TEOAEs), initially called “Kemp echoes” (FIGURE 3). This finding was as groundbreaking a discovery as that of Goldstein because the existence of OAEs showed that the cochlea responds to sound by creating vibrations. Kemp echoes could not be regular linear echoes as they grew nonlinearly, not in proportion to click intensity. The second category of OAEs, evoked by a pure-tone stimulus, was found at the same frequency as the stimulus, stimulus-frequency OAEs (SFOAEs), which made them technically more difficult to separate from the stimulus. Last, intermodulation OAEs showed up in the spectrum of sound in the ear canal, in response to pairs of pure-tone stimuli with close enough frequencies f1 and f2, less than half an octave apart (79). In 1977, Kemp (120) had found that the DP heard at 2f1-f2 was also emitted in the ear canal as an OAE, the so-called DPOAE, yet its weak level in human ears led its discoverer to set it aside. Henceforth, a distinction will be made between DPs, which encompass intracochlear vibrations at combination frequencies, their neuronal correlates and their perception, and DPOAEs, the acoustic correlates of DPs detected in the spectrum of sound in the ear canal after backward propagation of DP vibrations through the middle ear.

FIGURE 3. Transient-evoked otoacoustic emission (TEOAE) evoked by a click and detected by a miniature microphone in the sealed ear canal of a normal subject (stimulus pressure in Pa, response pressure in mPa, time in ms). Stimulus ringing (in red, left-hand pressure scale) does not last beyond 3 ms. After this time, the averaging procedure adds the responses to clicks at opposing polarities and different amplitudes, chosen in such a way that the overall average of clicks is zero. Because the linearly growing responses are cancelled, a derived TEOAE remains, demonstrating the existence of a compressively growing response.
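The cancellation principle described in the FIGURE 3 legend is easy to illustrate with a toy calculation. The 3 + 1 click arrangement below (three clicks of one polarity and a fourth, inverted, at triple amplitude, so that the set averages to zero) is one common way of implementing it; the "ear response" is a crude stand-in with a cube-root emission, not a cochlear model.

```python
import numpy as np

def ear_response(click_amplitude, n=480):
    """Toy ear-canal waveform: linear stimulus ringing plus a delayed,
    compressively growing (cube-root) 'emission'. Arbitrary shapes and constants."""
    t = np.arange(n)
    ringing = click_amplitude * np.exp(-t / 20.0)
    emission = (np.sign(click_amplitude) * np.abs(click_amplitude) ** (1 / 3)
                * 0.05 * np.sin(2 * np.pi * t / 60.0)
                * np.exp(-((t - 150.0) ** 2) / 2000.0))
    return ringing + emission

clicks = [1.0, 1.0, 1.0, -3.0]            # amplitudes summing to zero
derived = sum(ear_response(a) for a in clicks)

# The linear ringing cancels exactly (1 + 1 + 1 - 3 = 0); what survives averaging
# is the nonlinear residual, i.e., the derived TEOAE of FIGURE 3.
print(np.round(np.max(np.abs(derived)), 4))
```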
Strong doubts were initially raised as to whether otoacoustic emissions might be an experimental artifact. The original experiment was carefully replicated (295, 298), and the only alternative explanation for the compressive growth of OAEs with stimulus intensity, middle-ear muscle function, was excluded. Clear evidence of a close relationship between the presence of OAEs and cochlear sensitivity (7) ruled out the possibility that emissions were produced elsewhere than in the cochlea.

C. Cochlear Origin of Nonlinearities

Because of their large levels above the noise floor in laboratory animals (127, 128), DPOAEs have been extensively used in pathophysiological experiments conducted between 1980 and 1996, which unequivocally demonstrated their cochlear origin. In non-mammals, and despite the differences in the physiology of their auditory organs, WD and OAEs were also found (15, 160). Evidence of the physiological links between DPOAEs and OHCs was collected from a variety of experiments enumerated in TABLE 1, the principles of which were either to contrast the effects on DPOAEs of damage to OHCs versus other cells, or to activate known modulators of OHC function. Nonetheless, when explored more thoroughly, the correlations between DPOAEs and the status of OHCs at places tuned to stimulus frequencies were less straightforward than initially reported, which required the issue of the places and mechanisms of DP generation to be revisited.

Table 1. Early evidence of the links between OAEs and OHCs

Experiment | OHC activity | OAEs | Reference nos.
Acoustic overstimulation | Abolished | Abolished | 7, 196, 309 (before the mid 1980s, the role of OHCs being unknown, the effects of noise on OAEs could be attributed to cochlear, but not OHC, impairment)
Aminoglycoside administration | Abolished | Abolished | 127, 241, 252
Medial olivocochlear efferent neural bundle stimulation | Temporarily affected (slightly inhibited) | Decrease then recovery (slightly decreased) | 28, 40, 128, 211
Electrical stimulation of OHCs | Increased | Increased | 188
Assayed losses of OHCs by platinum salts | Abolished | Abolished | 269
Assayed losses of IHCs by platinum salts | Unaffected | Unaffected | 269
DP-grams in human hearing losses due to sensory-cell disease | Abolished | Abolished | 89, 166
DP-grams in human hearing losses due to neural dysfunction | Unaffected | Unaffected | 196, 261

OAE, otoacoustic emission; OHC, outer hair cell; IHC, inner hair cell; DP, distortion product.

Objections to the idea that OHCs in the cochlea could drive BM motion strongly enough to produce detectable sound in the ear canal were lifted by studies revealing large changes in the stiffness of the cochlear partition upon tip-link disruption. These changes were large enough to show that OHC stereocilia can indeed influence the impedance of the cochlear partition (36). In turn, the cochlear partition can drive sound pressure in a sealed ear canal, and measurements of the reverse middle-ear transfer function in guinea pig (157) and gerbil (52) have shown that intracochlear DPs come out as DPOAEs with a level approximately reflecting the middle-ear transformer effect, that is, −35 dB regardless of frequency.

It was soon suggested that the links between noninvasively measured OAEs, regardless of their category, and OHC functions could be exploited for inferring useful information regarding cochlear frequency tuning. In the frequency domain, so-called suppression tuning curves of OAEs were built. These represent, for an interfering tone, how the level required to influence an OAE varies with interfering frequency. They reveal that interference happens only in a narrow frequency interval with a width similar to that of cochlear resonators (27). In the time domain, the delay of OAEs relative to their stimuli betrays the interplay of underlying resonators, such that the sharper their tuning, the longer they ring, as do OHC responses (200, 228). Nonphysiological, equipment-related causes of distortion show only a delay corresponding to the time it takes for sound to travel a few centimeters in the air and through the external or middle ear (in humans, a fraction of a ms). Conversely, TEOAEs ring several tens of milliseconds after the click evoking them has faded away (121) (FIGURE 3). Measurements of the latency of the other two types of OAEs are more delicate as they are usually produced by continuous tones with no definite onset. Either special methods utilizing interrupted stimuli allow DPOAE onset latency to be visualized directly (293), or the frequency dependence of the phase of DPOAEs reveals their so-called group delays (see sect. IVF) (130, 185, 186, 221, 231).
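In practice, the group delay mentioned above is obtained by recording DPOAE phase at several neighboring frequencies (for example, at a fixed f2/f1 ratio) and taking the slope of the unwrapped phase against frequency, τ = −(1/2π) dφ/df; the numbers below are made-up values meant only to show the computation.

```python
import numpy as np

f_dp = np.array([1900.0, 1950.0, 2000.0, 2050.0, 2100.0])   # DP frequencies, Hz
phase = np.array([-0.8, -2.1, -3.3, -4.6, -5.8])             # hypothetical phases, rad

slope = np.polyfit(f_dp, np.unwrap(phase), 1)[0]   # dphi/df, rad/Hz
group_delay = -slope / (2 * np.pi)                 # seconds

print(f"group delay ~ {group_delay * 1e3:.1f} ms")  # ~4 ms with these values
```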
D. Essential Links Between Nonlinearity and Cochlear Activity

Around 2000, several publications independently emphasized that the Hopf-bifurcation formalism provides a parsimonious, generic model of an active ear that unites all its aforementioned properties. It describes the behavior of critical oscillators, poised (e.g., by negative feedback) for operating near their dynamic instability (34, 60, 105). With the assumption that the damping term of an oscillator contains some adjustable parameter, a bifurcation occurs when this parameter sets damping to zero. On one side of the bifurcation, damping is negative and the (unstable) system oscillates spontaneously, while on the other side, damping is positive and the oscillator is stable, yet neither sensitive nor finely tuned. It is when poised near the bifurcation that the critical oscillator displays the key features of an active cochlea, i.e., high sensitivity, compressive amplification, and level-dependent frequency tuning, combined with the above-described nonlinearities. Stability is ensured by a distortion-generating damping term increasing with the third power of oscillation amplitude (see sect. VF). Several specific models of the cochlea present a Hopf bifurcation, even though some of them were not explicitly built for this purpose (265).

The confirmations of a shared origin for cochlear activity and nonlinearity have changed the status of OAEs since the 1980s. Initially, OAEs were considered as anecdotal byproducts, worth being studied only because they might have a clinical utility, such as universal screening of congenital hearing impairment in neonates (22, 108, 290). From the 1990s on, their potential as tools for assessing cochlear nonlinearities has been understood. OAEs open a unique window on the intimacy of auditory stimulus processing in a hearing organ.
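The critical-oscillator behavior summarized in this section is usually written in the Hopf normal form; the equation below is the generic textbook form (not taken from any specific model cited here), with μ the control parameter that sets the damping and the cubic term providing both stabilization and distortion:

\[
\frac{dz}{dt} = (\mu + i\,\omega_0)\,z \;-\; |z|^{2}z \;+\; F e^{i\omega t},
\]

where z is the complex oscillation amplitude. Exactly at the bifurcation (μ = 0) and at resonance (ω = ω0), the steady state obeys |z|³ = F, i.e., |z| = F^(1/3): the response grows by only one third of a dB per dB of stimulus, while the same cubic term generates the distortion products discussed above.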
III. MECHANOELECTRICAL TRANSDUCTION AND NONLINEARITY

Mammalian OHCs act within a feedback loop, which processes the auditory stimulation. For modeling purposes, four functional stages of this feedback loop can be separated, all of them being theoretically able to distort to some extent. Locally, transverse sound-induced BM vibrations propagated from the basal cochlea produce a radial shearing motion between the reticular lamina, formed by the apical cell surface of hair cells and pillar cells supporting them, and the tectorial membrane, which deflects the stereocilia bundles. The tallest stereocilia, embedded in the tectorial membrane, are directly deflected in the excitatory direction, away from the modiolus, when the BM moves upward toward scala media (stage 1). The displacement of OHC bundles away from the modiolus opens MET channels by pulling on the tip-links (see sect. IB; FIGURE 1C). The endocochlear potential, about +100 mV, combined with the negative polarization of OHCs, near −40 mV (109), strongly drives endolymphatic K+ ions into the cells through open MET channels (stage 2). The sound-modulated current through an OHC produces a change in its membrane potential. As OHCs are endowed with electromotility (see sect. I), forces are generated (stage 3) which, by acting with the appropriate timing, produce positive feedback and a negative-damping effect that enhances BM vibration (stage 4) (FIGURE 4).

In the hair cells of nonmammalian species, as mentioned previously, no somatic electromotility has been reported. The active feedback loop, instead, involves the stereocilia bundles (104). It has been proposed that hair-bundle mechanics contributes to the active process also in the cochlea (194, 205), which, in the four-stage model, would imply active force generation at stages 2 and 3, that is, not only stage 3.

FIGURE 4. Left: stages of OHC activity organized in a feedback loop at the origin of amplification (see text for explanation). Right: at stage 2 (mechanoelectrical transduction; MET), the current versus stereocilia bundle deflection transfer function is a sigmoid-shaped Boltzmann function (plot labeled Po). MET-channel functioning is also at the origin of the nonlinear stiffness of hair bundles, which reaches a minimum when the opening probability of MET channels is 0.50 (plot labeled “stiffness”). As a result, the force exerted on a stereocilia bundle as against its deflection (solid line labeled “force”) departs from a straight line (dotted line labeled “linear”). [From van Netten (276). Copyright 1997, with permission from Elsevier.]

A. Which Stage Is Nonlinear?

At stage 1, the shearing motion between BM and tectorial membrane contains nonlinear terms due to the complex trigonometry of their relative displacements, but only at large deflections (>10°), beyond the physiological scale of a few hundred nanometers (deflection <3°). Stage 3 is essentially linear, even though in vitro experiments pinpoint two potential causes of nonlinearity. The somatic current versus voltage characteristic of OHCs (97) is indeed complex, due to the interplay of ionic fluxes through membrane channels of the cell soma, and the OHC cell-body length relates to membrane potential in a slightly nonlinear manner when its changes exceed tens of millivolts (8). However, for stimulations above 1 kHz, the somatic current-voltage relationship becomes linear, and in the physiological range of changes, the voltage-dependent electromechanical activity of OHCs distorts negligibly (240).

Therefore, relevant contributions of OHCs to significant mechanical nonlinearity and WD should be searched for at stage 2. Here, two possible mechanisms can be found. First, the directly measured stiffness of hair bundles is asymmetrical, larger in the excitatory than inhibitory direction (68). Second, when zooming in on the OHC MET channels, the plot relating K+ ion current versus hair-bundle displacement, the so-called transducer transfer function, is sigmoid shaped. Experimental plots can be fitted by a first-order Boltzmann transfer function describing the statistical process by which MET channels in a hair bundle switch between an open and a closed state (288). These two nonlinear contributions, deflection-dependent change in hair-bundle stiffness and MET function, actually are tightly interconnected, as shown by experiments that used antibiotics to block MET function and also abolished the asymmetry of hair-bundle stiffness (102, 235). Therefore, gating of MET channels and activity-dependent hair-bundle stiffness might be two sides of the same coin, which suggested the term gating stiffness (or its inverse, “compliance”) to describe this nonlinearity (see sect. IIIB).

Nowadays, second-order asymmetrical Boltzmann transfer functions assuming one open and two closed states are more widely utilized for describing the activity of OHC MET channels (153). Their point of inflection lies closer to the lower saturation level than in a first-order function. The current varies from zero, when all channels are closed, to a saturated value corresponding to all channels open, and the operating point (OP) describes the percentage of open channels at rest, thought to be near 50% in OHCs (109, 234). The slope of the transfer function around the OP is S = ΔI/Δα, where ΔI is the current through the cell produced by a deflection Δα of the hair bundle.
The larger S, the larger the change in membrane potential in response to a given Δα, thus the larger the strength of feedback onto the BM and the gain of the OHC-based cochlear amplifier. If the OP shifts from the maximum-gain position at the inflection point toward a saturating part of the transfer function, its curvature and ability to generate WD change too. The sinusoidal input tends to get clipped more on one side, as opposed to symmetrical clipping when the OP coincides with the center of symmetry of the function. The exact shape of the transfer function at the OP, where the hair bundle operates, is of importance because, as will be shown in section VF, the magnitudes of DPs depend on the coefficients of its expansion in a Taylor power series. The lowest-order DP at f2-f1 is called quadratic, and the next terms at 2f1-f2 and 2f2-f1, cubic, as they depend on the terms of the series to the powers 2 and 3, respectively. As a rule, even-order DPs are such that the sum of coefficients of f1 and f2 in the DP frequency formula is an even integer (1+1, for the f2-f1 DP), whereas for odd-order DPs, this sum is an odd integer number (2+1 for the cubic DPs). Characteristically, an even-order DP is generated by symmetric components of the transfer function so that the pattern of WDs and their changing content in even-order DPs betrays the asymmetry of the transfer function, which can be adjusted by changing the OP (21, 29, 70, 253).

In summary, in the widely accepted view that in mammals the processing of auditory stimuli by OHCs is organized in a feedback loop, the compressive amplification of stimulus vibrations stems from the whole feedback process, whereas the other nonlinear manifestation, WD, is associated with MET-channel gating. Not surprisingly, cochlear amplification, compression, and WD generally exist and vanish together. For example, in mutant mice with nonfunctioning MET channels, there is no transduction and no cochlear activity, and all cochlear nonlinearities are absent (33, 178). Nevertheless, WD originating from MET-channel gating can exist in the absence of amplification, in a passive cochlea (9, 183). The most common situation of passive WD happens in response to low-frequency stimuli of ~70 dB SPL at basal cochlear places which are not tuned to them, but will vibrate passively (169) (see sect. VIA).

B. Mechanisms of Nonlinearity on a Subcellular Level

The mechanosensitive channels of auditory sensory cells, whose gating is directly controlled by hair-bundle deflection and not by enzymatic reactions as in nonmechanical sensory modalities, can respond to frequencies up to ~100 kHz. The stereocilia of a mature OHC hair bundle, arrayed in three rows of increasing height, are coupled together by tip-links (207), made of cadherin-23 and protocadherin-15 (119) and likely anchored to the MET channel. The deflection of a hair bundle by the acoustic stimulus in the excitatory direction increases the tension in tip-links and related gating springs, putative elastic elements which pull on MET-channel gates and increase their open probability (Po).
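How the local shape of such a transfer function around the OP sets the balance between quadratic and cubic DPs (sect. IIIA) can be checked numerically; the sketch below uses a first-order Boltzmann with arbitrary units and simply estimates its Taylor coefficients at a symmetric and at a shifted operating point.

```python
import numpy as np

def po(x, a=1.0, x0=0.0):
    """First-order Boltzmann open probability versus bundle displacement (arbitrary units)."""
    return 1.0 / (1.0 + np.exp(-a * (x - x0)))

def taylor_coefficients(x_op, h=1e-3):
    """Numerical 1st, 2nd, and 3rd derivatives of Po at the operating point x_op."""
    d1 = (po(x_op + h) - po(x_op - h)) / (2 * h)
    d2 = (po(x_op + h) - 2 * po(x_op) + po(x_op - h)) / h ** 2
    d3 = (po(x_op + 2 * h) - 2 * po(x_op + h) + 2 * po(x_op - h) - po(x_op - 2 * h)) / (2 * h ** 3)
    return d1, d2, d3

for x_op in (0.0, 0.5):      # OP at the point of symmetry (Po = 0.5) versus a shifted OP
    s, c2, c3 = taylor_coefficients(x_op)
    print(f"OP x = {x_op}: slope S = {s:.3f}, quadratic coeff. = {c2:.3f} "
          f"(even-order DPs), cubic coeff. = {c3:.3f} (odd-order DPs)")
```

At the symmetric OP the quadratic coefficient vanishes, so even-order DPs such as f2-f1 disappear while cubic DPs persist; shifting the OP restores them, in line with the use of even-order DPs as an index of transfer-function asymmetry.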
At present, even though attractive candidates have been proposed for the MET channel (118, 129), the exact molecular architecture of the MET machinery remains unknown. However, many basic characteristics of its gating have been inferred from ex vivo measurements of the ionic current through the channel and of hair-bundle stiffness, both against hair-bundle deflection, using calibrated flexible fibers to deflect stereocilia tips (e.g., Refs. 68, 102).

To understand the origin of the deflection dependence of stereocilia bundle stiffness (68), the first requirement is the identification of its components. Apart from the tip-link molecular complex and the stereociliary rootlets around which stereocilia rotate as rigid rods (23, 132), the stereocilia of OHCs are expected to be stiffened by zipper-like clustered fibrous links, the horizontal top-connectors that join the upper parts of adjacent stereocilia within and between rows (81, 280, 282) (FIGURE 5). All contributions are thought to be independent of deflection, which was confirmed experimentally for the first two (44, 100). Let K∞, the sum of these contributions, be the deflection-independent hair-bundle stiffness. The gating process also introduces a deflection-dependent stiffness term Kg due to the fact that when a MET channel opens, the associated conformational change leads to a significant shortening of the associated gating spring by a few nanometers, which relaxes it partially. As a consequence of this decrease in stiffness, the gating springs of the remaining channels bear additional tension, a gating force Fg that increases Po (101). As shown by the current versus deflection mechanoelectrical transfer function, Po obeys a sigmoid Boltzmann statistical law (FIGURE 4B)

Po = 1/{1 + exp[−Fg(X − X0)/kBT]}   (1)

in which Fg is the gating force, X − X0 is the displacement of the tip of the hair bundle, kB is the Boltzmann constant, and T is the absolute temperature. If one considers more than one open and one closed state for the MET channel, higher-order Boltzmann statistics replace the first-order one in Equation 1. Thermodynamics then predicts that the stiffness of a hair bundle carrying N transduction units is (101)

KHB = K∞ + Kg = K∞ − (N Fg²/kBT) Po(1 − Po)   (2)

When all channels are either closed or open, KHB = K∞. When a MET channel is pulled open by a stimulus so that Po increases, Kg, the nonlinear Po-dependent term in Equation 2, describes a decrease in stiffness, i.e., an increase in compliance, with momentous consequences. Indeed, as the gating force Fg adds up to the stimulus (FIGURE 6), it becomes easier to deflect the hair bundle and thus to open more MET channels. The single-channel gating force can be expressed as a function of the transducer current I(X) and its derivative dI/dX (277)

Fg(X) = [kBT/I(X)] dI(X)/dX   (3)

FIGURE 5. Morphology of OHC hair bundles. The low-magnification scanning electron micrograph (top) shows the three rows of OHCs and their aligned stereocilia. At higher magnification (bottom), the top connectors are visible, marked by arrows. Scale bars: 2.5 µm (top) and 0.25 µm (bottom). [From Verpy et al. (282).]

Thus both the nonlinear gating stiffness of the hair bundle and the gating force that modulates the stimulus directly relate to the sigmoid Boltzmann transfer function and to MET-channel operation. As a result, the larger the size of Kg relative to K∞ in Equation 2, the more conspicuous the overall hair-bundle nonlinearity.
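Equations 1 and 2 translate directly into code; the parameter values below (N, Fg, K∞) are illustrative only, chosen to contrast a mammalian-like bundle, where K∞ dominates, with a frog-saccule-like bundle, where the gating term drives the total stiffness negative, as discussed next.

```python
import numpy as np

kB, T = 1.38e-23, 300.0            # Boltzmann constant (J/K), absolute temperature (K)
kBT = kB * T

def hair_bundle_stiffness(x, K_inf, N, Fg, x0=0.0):
    """Equations 1 and 2: Po(x) and total bundle stiffness KHB (N/m)."""
    po = 1.0 / (1.0 + np.exp(-Fg * (x - x0) / kBT))
    return K_inf - (N * Fg ** 2 / kBT) * po * (1.0 - po)

x = np.linspace(-100e-9, 100e-9, 5)     # bundle displacement, m

# Mammal-like case: large K_inf, the stiffness only dips near Po = 0.5.
print(hair_bundle_stiffness(x, K_inf=3e-3, N=50, Fg=0.5e-12))

# Frog-saccule-like case: smaller K_inf and larger gating force; the dip,
# K_inf - N*Fg**2/(4*kBT) at Po = 0.5, becomes negative (unstable bundle).
print(hair_bundle_stiffness(x, K_inf=0.5e-3, N=50, Fg=0.7e-12))
```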
In hair-cell preparations from the frog saccule, a vestibular organ, Kg is so large that the overall hair-bundle stiffness becomes negative within the range of MET channel operation, which leads to bundle instability and spontaneous oscillations (172). In mammalian hair cells, the linear stiffness K∞ dominates so that gating forces elicit much less mechanical nonlinearity (K∞ and the K∞/Kg ratio are 10 times larger than in frogs) (277).

FIGURE 6. Principle of gating compliance. Tip-links (green) connect neighboring stereocilia (left). A MET channel (blue) can be pulled open by hair bundle deflection by a stimulus (force F) in the excitatory direction (middle). The conformational change due to channel opening (right) shortens the gating spring, a linear elastic element activating the gate. This shortening reduces tension in the spring, which amounts to a gating force Fg. The molecular complex anchoring the tip-link to the taller stereocilium cytoskeleton (pink) may adjust its tension by slipping along the actin filaments, thereby controlling the open probability of the MET channel (MET adaptation process).

C. Horizontal Top Connectors: An Unexpectedly Large Part

Measurements on mutant mice that lack stereocilin (Strc−/− mice), a model for the hereditary recessive form of human hearing impairment DFNB16 (281), raised a surprising issue regarding the part played by MET channels in the generation of cochlear nonlinearity (282). Stereocilin is a protein associated with horizontal top connectors joining adjacent stereocilia within and between rows, and with the links that attach the tallest stereocilia to the tectorial membrane (280). In the absence of stereocilin, in Strc−/− mice, horizontal top connectors do not develop, and the tips of OHC stereocilia are less clearly aligned than in control mice (FIGURE 7A). Their tip-links persist until postnatal day 15, P15 (FIGURE 7B), and MET currents are found within normal range, ensuring near-normal auditory thresholds during a few days after the onset of hearing. Yet, DPOAEs and WDs in the electrical responses of OHCs (FIGURE 7, C, right panel, and D, dashed line) are abolished in response to stimuli below 90 dB SPL, whereas in wild-type mice they are evident in the 20- to 90-dB SPL interval (FIGURE 7, C, left panel, and D, solid line). A strong decrease of masking, in its suppressive component, is also observed (FIGURE 7E). As discussed, suppressive masking is viewed as enhancing contrasts among simultaneously present spectral components (262, 282). Whether decreased suppressive masking is detrimental to the intelligibility of speech in noisy environments remains to be tested in patients affected by DFNB16. Between P15 and P60, Strc−/− mice gradually lose OHC function entirely, which is attributed to the irreversible loss of the tip-links. Of note, in a recently published cohort of affected patients (69), mutations in the stereocilin-encoding gene, STRC, emerged as a major contributor to pediatric bilateral sensorineural hearing impairment. Yet it frequently came with only a mild increase in hearing thresholds, which thus indicates that OHC hair bundles without the stereocilin-composed horizontal top-connectors can still function in humans.

The intriguing issue raised by the phenotype of Strc−/− mice is that it apparently contradicts the tenet that distortion is an exclusive property of MET channels in relation to their ability to open and close normally.
By functioning enough in young Strc−/− mice to ensure the detection of low-level sounds, MET channels must distort. Yet no sign of WD was detected between 20 and 90 dB SPL. Of note, this divorce between normal responses at SPLs near the auditory threshold, and the absence of WD and the defects in masking (FIGURE 7), may only be apparent, and be compatible with a normal functioning of MET channels when hair-bundle deflection is minimal (≤40 dB SPL, for threshold measurements), which could become abnormal at larger hair-bundle deflections (≥70 dB SPL, for WD, DPOAE, and masking measurements).

FIGURE 7. In the Strc−/− mice, which lack stereocilin, the horizontal top connectors are missing. The tops of OHC hair bundles are less aligned than they should be (scanning electron micrograph, A), while a close-up view of stereocilia (B) shows that tip-links are present at postnatal day 14 (arrowheads), which accounts for preserved MET and auditory sensitivity. C: contrary to wild-type controls at the same age (left), the cochlear microphonic potential (CM) in Strc−/− mice (right) in response to pairs of pure tones contains no WD (arrows mark intermodulation frequencies). D: Lissajous patterns of CM in response to 80-dB SPL tones display a sigmoid pattern in Strc+/+ mice (solid line) and a linear pattern in Strc−/− mice (dashed line), despite similar CM amplitudes suggesting an equivalent function of the cochlear base. E: masking tuning curves plotted at postnatal day 14 (representing the threshold level of a masker tone, against its frequency, required to suppress the neural response to a fixed probe tone at 10 kHz) are equally narrow in Strc−/− mice and Strc+/+ mice, yet although probe tones are presented at similar levels (diamonds), masker levels have to be set ~20 dB higher in Strc−/− mice to exert masking. [From Verpy et al. (282).]

Several hypotheses may be brought forward to explain these results. The most radical line of thought would be to challenge the current view by positing that the recorded distortion stems, not from MET channel nonlinearity (Kg in Equation 2 would be too small), but from some passive nonlinear mechanism of the hair bundle, so far overlooked, that involves the horizontal top connectors. This implies that the molecular assembly containing stereocilin has specific biophysical properties able to confer nonlinear stiffness to a hair bundle. Notably, some models of hair bundles predict that nonlinear stiffness can stem from changes in bundle geometry due to deflection when a cohesive hair bundle is formed (190). An even simpler view would assume the horizontal top connectors to be tight when a hair bundle is deflected in one direction, and relaxed in the other direction, without having any specific nonlinear mechanical property per se. Equation 2 would be replaced by KHB = K∞ + Kd, with Kd depending on the state of tension or relaxation of horizontal top connectors, i.e., on the direction of hair-bundle deflection.

Alternative explanations integrate the role of MET channel nonlinearity in the generation of WD. The coordinated opening of MET channels implicating the horizontal top connectors might combine weakly nonlinear transfer functions of individual MET channels into a markedly nonlinear mechanoelectrical transfer function of the hair bundle.
The interest of a coordinated motion of bundled stereocilia has been recently emphasized in the context of transduction sensitivity rather than nonlinearity, but the same arguments may be relevant for both topics. One important issue is whether MET channels work in series or in parallel. In a series mode of action of the MET channels, only the stereocilia in a radial relationship mediated by tip-links are involved. The opening of one MET channel is expected to reduce the force exerted on the other MET channels within the same radial column, which results in negative channel cooperativity (117). A possible exception was observed in a model of hair bundle that examined the series engagement of two transducer channels, with one of them assumed to be sensitive at small displacements and the second one at larger ones. In this case, the series mode of activation allowed complementary activation with improved dynamic range and intact sensitivity (278). It is noteworthy that hair-cell activation curves in this two-channel series model showed greater asymmetry and nonlinearity and much more compression at higher levels, relative to the single engagement case. Cooperativity between only two channels was sufficient for deeply influencing the nonlinearity of hair cell transfer functions (278). This strong impact of radial mechanical coupling between MET channels of one column of a stereocilia bundle suggests that a possible contribution of the inter-row horizontal top connectors to channel cooperativity and the ensuing nonlinearity might have to be considered in addition to the tip-links invoked (278).

Actually, several investigations of hair-bundle preparations whereby stereocilia splaying is prohibited suggested that the transduction channels work mechanically in parallel, rather than in series. Parallel coupling results in positive cooperativity in response to a force stimulus, i.e., the more MET channels open, the larger the force transmitted to the closed ones (116, 136). In hair cells from the bullfrog sacculus, the constraints on hair-bundle geometry were found to be strong enough for spontaneous stereocilia movements at opposite edges of the bundle to show high coherence and synchrony (136). In the same type of hair cell, fast video analysis of hair-bundle motion (117) revealed the sliding displacement of adjacent stereociliary tips, indicative of parallel gating. In these saccular hair cells, links other than the horizontal top connectors could be severed without influencing bundle motion, thereby suggesting that sliding motion, at least in these cells, rests on horizontal top connectors. Mathematical models of three-dimensional motion of a whole hair bundle (190) support the idea that the asymmetry of the MET current versus hair-bundle deflection relationship increases with the number of MET channels working together. Altogether, these studies suggest that strong lateral coupling among MET channels might impinge on the Po Boltzmann pattern, and thereby may increase distortion enough to produce the major WD that the Strc−/− mice fail to produce. Could the W shape of the OHC hair bundles have a special impact on the proposed stereocilin-mediated positive MET channel cooperativity?
The fact that the basal insertion points, around which neighbor stereocilia pivot, lie at different distances from the bilateral symmetry axis of the hair bundle generates shearing motion between stereocilia of the same row when the hair bundle is deflected. Parallel coupling via intrarow horizontal top connectors, possibly reinforced by the anchoring of the tallest stereocilia in the overlying tectorial membrane, should allow, when a number of MET channels open, the tension release of their gating springs to affect the force acting on other MET channels of the hair bundle. In the absence of stereocilin in a W-shaped hair bundle, only the three stereocilia (FIGURE 1C) of the same radial column can interact via their tip-links, because the lateral constraint between stereocilia of neighboring radial columns has vanished. [Notably, in hair bundles organized in a straight line, as in IHCs, whether connected or not, stereocilia of neighboring radial columns are expected not to slide relative to each other, or to slide to a far more modest extent.] At hearing threshold, cooperative opening of channels should be irrelevant so that one hardly expects the presence of top connectors to make any difference on hearing sensitivity. But at sound pressure levels large enough for hair-bundle deflection to engage several MET channels, in the 20–90 dB SPL range where WD is explored and was not found in the Strc−/− mice, the large difference in nonlinearity between the normal and Strc−/− mice might be related to the presence versus absence of positive cooperativity resulting in parallel coupling. At present, horizontal top connectors can be concluded to be essential to the generation of WD, but which of the proposed mechanisms underlies their contribution remains to be clarified.

D. Other Consequences of Coordinated Stereocilia Motion Possibly Contributing to Hair-Bundle Nonlinearity

The effects of coordinated stereocilia motion have been mostly studied in terms of interaction between stereocilia and surrounding fluid. One possibly detrimental consequence of this interaction is damping by friction. The complex hydrodynamics of stereocilia have been studied either by finite-element modeling or interferometric measurements. Their outcomes concur in showing that in a cohesive hair bundle, such that the fluid trapped between stereocilia is immobilized, the viscous drag is reduced compared with a vibration mode allowing relative squeezing among stereocilia shafts (135). An additional advantage of trapping the viscous fluid between stereocilia might consist of a more concerted activation of MET channels (135). Brownian motion in the fluid bathing stereocilia is another phenomenon that might hamper MET efficiency by introducing noise in MET channel openings. It has been suggested that parallel coupling of transduction elements should reduce the negative impact of noise by increasing the synchrony of MET channels (50, 117, 191). Numerical simulations of hair bundle motion, in the simpler case of the straight rows of IHC hair bundles (258), also emphasized the importance, for MET-channel dynamics, of controlling the degree of splay among stereocilia. The authors stressed the role of tectorial-membrane positioning in this dynamics and, furthermore, predicted the formation of nanovortices between rows of stereocilia. Of note, nanovortices vanish when stereocilia lose cohesion (Chadwick, personal communication).
Vortices being a known source of WD, this may provide one more link between hair-bundle cohesion and nonlinearity.

E. OHC Bundles: Their Environment and Operating Point

If the processes allowing appropriate gating of MET channels form the source of WD and are an essential stage of nonlinear amplification, this does not preclude that linear elements in the environment of hair bundles modulate WD or contribute to transferring it into a detectable form. The electromotility of OHCs, for example, transfers the nonlinearity of the membrane potential due to MET-channel operation into a mechanical nonlinearity (see sect. IIIA). Despite the absence of DPOAEs at low stimulus levels in prestin-null mouse mutants lacking somatic motility and amplification (143), the observation of physiologically vulnerable DPOAEs around 70 dB SPL indicates, however, that prestin is not necessary for distortion generation when a high stimulus level overcomes the lack of amplification (144).

The OP defined by the percentage of open MET channels at rest controls the gain of the cochlear amplifier and influences the amount and symmetry of WD. What known processes modulate OP position? The percentage of open channels at rest tends to be stabilized by the process of adaptation (101). Conversely, the OP can be shifted by biasing the BM by infrasound (21, 239); endolymphatic hydrops, i.e., swelling of the scala media (11, 268, 270); and modulation of OHC function by medial olivocochlear efferent neurons activated by the ipsi- or contralateral presentation of sound (3, 83, 189). It has even been suggested that the resting position of the BM and the OP of OHCs are continually adjusted by a control mechanism that modulates cochlear sensitivity and tuning (140). Displacement-sensitive experimental setups indeed reveal long-lasting shifts of the BM from its resting position (31) to which regular, velocity-sensitive measuring devices are insensitive. Fluid pressure in scala media may provide a suitable control mechanism, with hydrops representing an extreme situation (142).

The tectorial membrane, which embeds the tallest stereocilia of OHCs and, incidentally, possibly contributes to the cohesion of OHC stereocilia bundles when stereocilin is present, might influence sound-evoked OAEs. Nonetheless, OAEs do exist in ears of lower vertebrates without tectorial membrane (17). In TectaΔENT/ΔENT mice, with mutations in the Tecta gene encoding α-tectorin (139) and a tectorial membrane detached from the cochlear epithelium, DPOAEs are still produced at stimulus levels above 65 dB SPL (151). Their extant level can be explained by an abnormal mode of OHC stimulation by viscous coupling to the endolymph bathing the hair bundles, resulting in a 35-dB increase in hearing thresholds. The source of nonlinearity thus requires higher stimulus levels to compensate for the decreased cochlear amplification. It produces DPOAEs that decrease in amplitude with increasing frequency, whereas DPOAE amplitude depends little on frequency in control mice, with an effect of the frequency ratio f2/f1 more irregular than in control mice (151). This supports the view of a tectorial membrane acting as a mechanical filter shaping the properties of sound-evoked OAEs, without influencing their generating mechanism, which depends on OHC hair bundles with normal MET channels and horizontal top connectors, as they are in Tecta mutants (280).
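The link between OP position and even-order distortion invoked here (and in sect. IIIA) can be illustrated with a toy two-tone calculation through a Boltzmann stage whose input is given a static bias, standing in for infrasound, hydrops, or efferent effects; all parameter values are arbitrary.

```python
import numpy as np

fs = 44100
t = np.arange(fs) / fs
f1, f2 = 1000, 1200
x = 0.8 * (np.sin(2 * np.pi * f1 * t) + np.sin(2 * np.pi * f2 * t))

def dp_level(bias, f_dp):
    """Level (dB) of one DP at the output of a biased first-order Boltzmann stage."""
    y = 1.0 / (1.0 + np.exp(-(x + bias)))
    spec = np.abs(np.fft.rfft(y)) / len(y)
    freqs = np.fft.rfftfreq(len(y), 1 / fs)
    return 20 * np.log10(spec[np.argmin(np.abs(freqs - f_dp))] + 1e-15)

for bias in (0.0, 0.3, 0.6):     # 0 corresponds to the symmetric OP (Po = 0.5 at rest)
    print(f"bias {bias:.1f}: f2-f1 at {dp_level(bias, f2 - f1):7.1f} dB, "
          f"2f1-f2 at {dp_level(bias, 2 * f1 - f2):7.1f} dB")
```

Even a small static bias makes the quadratic (f2-f1) product reappear, whereas the cubic (2f1-f2) product changes comparatively little in this toy example, which is the signature used experimentally to track OP shifts.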
In summary, overwhelming evidence points to the nonlinear processes that ensure the gating of MET channels in sensory-cell hair bundles, i.e., their gating compliance and the cohesive motion of stereocilia, as the source of WD, one of the manifestations of auditory nonlinearity. In OHCs, MET processes are also the first stage of a feedback loop that dynamically controls BM motion and decreases cochlear amplification at increasing input levels. In this way, MET contributes to the other manifestation of auditory nonlinearity, the compressive (~0.33 dB/dB) growth of responses with stimulus level at frequencies near resonance. Off resonance, compression no longer exists. This will lead us, in the next sections, to distinguish between WDs coming from hair bundles at resonance (nonlinearly growing, "essential" WD) and off resonance (more linearly growing, associated with saturation of MET currents on either side of the Boltzmann transfer function, i.e., saturation WD), to examine the spatial distribution of WD sources, and to review how WDs from different locations propagate and recombine to form noninvasively recorded OAEs. Although OAEs conveniently reveal the existence of WD and reflect many of its important properties, this does not mean that their generation mechanisms are purely nonlinear, and these mechanisms will have to be reviewed according to OAE type.

IV. NONLINEAR RESPONSE OF THE COCHLEA TO A SINGLE TONE

This section examines the simplest case of a pure tone at frequency f. It briefly reviews how acoustic vibrations travel along the cochlear BM and peak according to the rules of cochlear tonotopy (200); what nonlinearities the vibrating OHCs generate; and from where and how the resulting OAEs can propagate backward to the external auditory canal.

A. Forward Travel

The thickness and width of the BM, which separates ST from the scala media and on which the organ of Corti rests (FIGURE 1A), gradually vary from base to apex, and so does the local resonance frequency, which decreases from base to apex. Reissner's membrane, rather uniformly compliant, is usually thought to play little mechanical part. Incoming sounds enter SV at the oval window, where the piston-like vibration of the stapes produces a compression wave traveling at the speed of sound in water, ~1,500 m/s (206); however, it likely causes little motion of the sensory cells. As a general rule, the impedance of a resonant system driven below its resonance frequency is dominated by its stiffness. Thus, with respect to audible frequencies except the highest ones, the basal-most cochlea behaves as a stiff partition: the amplitude of the pressure wave in ST is small, and there is a pressure difference across the BM. Energy transfer occurs as the stiff BM responds to this pressure difference by rapid vibrations. As shown by Békésy (14a), coupled interactions between the BM and the cochlear fluids of ST and SV initiate a transverse, forward-moving wave traveling along the stiff BM surface at a few hundred meters per second, which increases in amplitude and gradually slows down to a few meters per second while approaching the site of resonance, called the CF (for characteristic frequency) location (218, 227). This traveling wave excites the auditory sensory cells by efficiently moving their stereocilia bundles. Toward the CF location, the wavelength (the ratio of speed to frequency) decreases and a large phase delay accumulates. Theory indicates that the place of maximum vibration along the longitudinal cochlear axis is just basal to the CF location.
Exactly at the CF location, at resonance, the stiffness and mass terms of the cochlear impedance cancel each other, and the purely resistive impedance absorbs the energy of the tone by so-called critical-layer absorption (147), so that beyond this point the amplitude of vibration drops: the envelope of BM vibrations at f peaks very near the CF location. For the sake of simplicity, no distinction will be made henceforth between the CF location and this peak. More apically, the pressure difference between SV and ST vanishes, and BM vibrations decline sharply. If by some means a vibration is generated at frequency f at a location tuned to a frequency lower than f (it will be shown later that WD may generate such frequency components), this vibration finds too slack a BM to propagate in either direction. Frequency is an important variable controlling the shape of BM vibrations. Yet an even more relevant variable is the scaled ratio f/CF between the stimulus frequency f and the characteristic frequency of the measurement location. From accumulating observations of BM motion, Zweig (310) was the first to pinpoint that the amplitude and phase patterns of BM motion are determined solely by f/CF, a property that he called cochlear scaling symmetry [except in the most apical part of the cochlea, where scaling symmetry apparently breaks (49, 168)]. In sensitive cochleas, the traveling wave is amplified by OHCs along its way. Their feedback, by compensating damping, boosts the BM response to soft sounds provided feedback occurs at an appropriate phase. This happens only between 1 and 2 mm basal to the CF location (in the chinchilla, at CFs around 10 kHz) (228). The enhancement of cochlear resonance and sensitivity comes with finer tuning, as the feedback is efficient only at frequencies near f (200). Cochlear amplification is nonlinear, such that at low stimulus levels the vibration at the CF location is ~1,000 times larger than it would be with damaged OHCs, whereas above 90 dB SPL, BM vibration is independent of OHC status (225, 228). The CF-to-stapes vibration ratio is >1,000 at low stimulus levels, as a result of energy concentration due to traveling-wave speed reduction and active reduction of damping by feedback. Its decrease by more than two orders of magnitude as the stimulus level increases (e.g., FIGURE 1D in Ref. 220) means that the growth of the response at the CF location is strongly compressive (between 0.2 and 0.5 dB/dB increase in SPL), but again only along the short section of the BM where OHCs efficiently boost the traveling wave. Where OHCs are far from resonance (passive) or damaged, feedback exerts no amplification and BM vibrations at f grow linearly, 1 dB/dB (FIGURE 2). BM displacement thus has two components: near the CF location, a high and sharp peak strongly dependent on the "activity" of the OHC-driven cochlear amplifier, and between the stapes and the CF location, a broad "passive" tail region hardly sensitive to OHC status. Around the peak, amplification ensures a large input to the nonlinearly working MET channels of active OHCs, accounting for essential distortion at low SPLs. Along the broad tail, OHC MET channels still open if stimulated strongly enough and produce "passive" nonlinear phenomena, including WD. Their existence will be discussed in the next sections. Compared with essential distortion from around CF, their growth is expected to be steeper (see sect.
IIA), and their time delays shorter, as their generation place is nearer the oval window.

B. Always Forward?

Section IVA may give the false impression that cochlear waves can only propagate from the stapes to the CF location. Actually, from a vibrating site on the BM, mechanics prescribes that its elementary contribution to the wave, a "wavelet," starts traveling both ways. The physicist Huygens introduced the principle that every point of a wavefront acts as the source of wavelets that, by spreading out in all directions, not only that of the incoming wave, combine together to continue the propagation. Fresnel (70a) later completed this principle by stating that the amplitude of a wave at any given measurement point equals the vector sum of the amplitudes of all the wavelets reaching that point. The result of this summation depends on whether the wavelet phases are similar or widely different. Fast phase variations among wavelets lead to small contributions even though individual amplitudes may be large (251). A combination of large amplitudes and slowly varying phase (phase coherence; Ref. 247) leads to a large outcome. Thus the way spatially distributed wavelets can or cannot efficiently combine introduces emergent directionality, critical for accepting the concept of OAEs. In a linear medium, the wavelets have the same frequency as the stimulus, but in the nonlinear cochlea, the incoming wave also generates wavelets at frequencies different from its own (see sect. V). For a regular pure tone coming via the stapes or the bony wall, forward directionality emerges from the nondirectional principle of Huygens. Wavelets from basal places, launched when the incoming stimulus reaches their place, incur an increasing phase lag while going forward. In this direction, they meet more apically produced wavelets and are in phase with them, because the latter were launched with an increasing phase lag, the stimulus having reached their places later. The phase lags accumulate in parallel, which favors wavelet recombination in the forward direction. Wavelets starting basal-ward, conversely, experience strong cancellation: those from apical places, launched with an already large phase lag, incur an additional lag while traveling basally and interfere with wavelets produced more basally, which present much smaller phase lags (launched earlier and having less distance to travel). The case of OAEs generated within the cochlea is different and will be examined in sections IVF and V.

C. Vibrations at f Generate Harmonic Distortion

In the cochlea stimulated by a single tone, WD is revealed by the production of acoustic harmonic distortion at integer multiples of the stimulus frequency, the fundamental frequency f, i.e., 2f, 3f, etc. (FIGURE 8). Through a tiny opening, a laser beam is focused on the BM (41, 125, 232), or a displacement-sensitive probe is placed near it (197), and the stimulus frequency is usually set near the CF of the tested site, CFsite. Few reports exist on basal harmonic distortion, possibly because it seems small and labile: absent in Reference 232, -30 dB relative to stimulus level in Reference 41, and -25 dB in Reference 197. Apical harmonic distortion is conspicuously larger, -7 dB relative to stimulus level (125), perhaps in relation to the different geometry of OHCs in the cochlear apex addressed in section VII.
In Reference 197, the sensitivity of the preparations was not optimal, and the origin of all detected harmonic pressure components, at 2f and 3f, was tracked to the place tuned to f (because the delays of the harmonics and of the fundamental frequency f, inferred from their phase accumulation against frequency, were the same). Cooper (41) also set the stimulus frequency to f = CFsite/2. A response was measured at CFsite, due to the harmonic 2 x (CFsite/2). Harmonics, wherever produced, are vibrations and propagate accordingly. Where had this vibration been generated? It could hardly be the location tuned to the stimulus frequency, as it was when CFsite tones were used, because the second harmonic cannot travel along the BM at places tuned to lower frequencies. The only alternative hypothesis is that it was produced at basal places and traveled, forward, to the detector. On reaching its CF location there, it was amplified and became detectable, hence its name "amplified distortion" (41). The existence in this experiment of harmonics generated at their frequency place on the BM, and not at the CF location of the stimulus, highlights the importance of considering basal contributions to WD and not only CF ones.

Two noteworthy facts were reported at basal sites (41). First, and this will be encountered again in nonlinearities produced by spectrally richer sounds, the observed harmonic distortion levels are low. The energy carried by harmonics never exceeds 4% of that of the stimulus. There is likely an AGC in the cochlea that tends to lower the cochlear gain, when the sound level increases, enough for WD to be kept to a minimum (315). Harmonic distortions can be heard only at large stimulus levels (286). Their contributions produced at places tuned to f are filtered out; furthermore, they cannot propagate to their CF locations. As for "amplified distortions," they likely suffer from the upward spread of masking by the louder, lower-frequency tone at f (see sect. IXA). Second, at variance with the outcome of two-tone stimulus experiments (see sect. VF), which favor odd-order WD, the 2f (even-order) harmonic is the largest, not the 3f, 5f . . . harmonics. The balance between odd- and even-order WD is very sensitive to the position of the OP of the nonlinear source (see sects. VF and VIII), which may differ from one experiment to another depending on how the invasive measurements affect cochlear homeostasis.

D. Waveform Distortions at Off-Resonance Places

Section IVA suggested the existence of WD even at off-resonance basal places. This can be checked easily. An electrode at the round window detects several electric correlates of the vibrations of basal hair cells: an oscillating change in electric potential at the stimulus frequency, called the cochlear microphonic (CM) potential, and a steady change following the envelope of the stimulus, called the summating potential (SP). They are thought to relate to the oscillating and steady components, respectively, of the receptor potentials of hair cells. The electrode collects spatially averaged responses in which, at stimulus frequencies below a few kilohertz, basal hair-cell contributions to the CM prevail.
FIGURE 8. Waveform (right) and spectrum (left) of the velocity signal from an apical Hensen cell, a support cell in lateral position relative to the outer row of OHCs in the mammalian organ of Corti, extracted from interferometric measurements, in response to a 256-Hz (f) tonal stimulus in a healthy cochlea. Harmonic components at 2f, 3f, etc. (arrows) are visible at decreasing levels, 10-20 dB below the fundamental component, in relation to the distorted waveform (top right). Harmonics vanished in a few hours after death, while the Hensen-cell response became almost sinusoidal. [From Khanna and Hao (125). Copyright 2000, with permission from Elsevier.]

Being off-resonance, these contributions are small yet combine in unison, owing to the long wavelength of the traveling wave. In contrast, the larger responses from hair cells near the CF location undergo fast phase rotations leading to spatial cancellation. The CM is largely dominated by contributions from OHCs, as its amplitude changes little after selective destruction of IHCs (57). The SP originates from the asymmetry of the transfer function of the hair-cell transduction process at the OP, which generates a sustained depolarizing component. The main sources of SP are IHCs, but a sizeable component from OHCs can be identified all along the cochlear spiral (57). The CM waveforms display considerable distortion, with large harmonics in response to a pure tone (193, 289). The CM WD is attributed to OHCs (38, 139). Its presence shows that basal OHCs do distort, even though excited by off-resonance tones. The sinusoidal deflection of stereocilia produces a saturating, nonsinusoidal current through the MET channels, which reveals the underlying Boltzmann-shaped deflection/current transfer function and the OP of basal OHCs (FIGURE 7). Mechanically induced position changes of the organ of Corti, e.g., by gel injections (238) or application of a biasing force on the bony cochlea (308), come with a shifted OP and modified CM harmonics, but also an increased SP. This pinpoints the practical interest of monitoring any of these compound signals from off-resonance places, as they are sensitive to subtle changes in hair-cell transduction.

E. Emissions Produced by a Single Tone: Stimulus-Frequency Otoacoustic Emissions

Reemission of additional sound from within the cochlea is illustrated by the measurement of SFOAEs, emitted at the same frequency as the stimulus. Although their measurement is straightforward, their properties and relation to cochlear nonlinearity are complex. It is worthwhile to study them carefully, however, because they reveal many properties of an active cochlea in a noninvasive manner. Elliott (61) reported that auditory threshold plots against frequency from normal ears display quasi-periodic ripples of a few decibels. Following up on psychoacoustic investigations of this auditory threshold microstructure, Kemp (120) slowly varied the frequency of a low-level pure tone while recording the SPL in the sealed ear canal. He detected narrow ripples superimposed on broad changes in SPL due to earphone calibration. The ripples, signaling periodic changes in the ear's impedance, correlated with the auditory threshold microstructure. The most straightforward interpretation was that of an interference mechanism between the stimulus itself and a pressure wave emitted from within the ear and likely partially reflected at the middle ear boundary.
Fast phase rotation of the acoustic emission with frequency would create a regular pattern, with periodic crests when emission and stimulus add together in phase, and troughs when they are out of phase and cancel each other. This OAE, elicited by a single continuous tone and emitted at the stimulus frequency, was called the SFOAE. How can an SFOAE be separated from its stimulus when they share the same frequency? Their different growths at increasing stimulus level provide a first solution. The relative size of the ripples decreases and eventually vanishes around 70 dB SPL as the SFOAE saturates. The SFOAE can thus be extracted by subtraction of the scaled, high-level pattern from the low-level one. The suppressive action of a tone superimposed on the stimulus at a slightly lower frequency affords a convenient alternative (120). Suppression removes the stimulus-related wave traveling inside the cochlea, so the SFOAE may be extracted by subtraction of the suppressed from the unsuppressed plot (FIGURE 9, A for the amplitude and B for the phase).

F. Backward Propagation of SFOAEs

Kemp and Chum (123) suggested that the SFOAE came from the peak of the incoming traveling wave, where nonlinear reflection occurred on some impedance discontinuity. The current dominant view is that this reflection mechanism is linear. Its description is necessary, however, as its interplay accounts for many properties of all types of OAEs and shapes their nonlinear characteristics. How can the wavelets of an SFOAE combine in a regular manner to form a significantly large backward wave (see sect. IVB)? Bragg scattering offered a possibility (264). Strong positive interference happens in the backward direction when an X-ray wave falls upon a distributed array of periodic scattering centers with a spatial period of half its wavelength. The difficulty is that there seems to be no regular scattering array of OHCs in the cochlea (150, 304), unless regularity arises from periodic disruptions induced by the curvature of the coiled cochlea (159); but there are SFOAEs from uncoiled auditory organs. A breakthrough occurred when a class of models of an active cochlea was shown (311) to generate a periodic structure of coherent reflecting sites out of spatially irregular distributions of scattering centers. The theoretical analysis showed that Bragg-like scattering could occur from a disorderly array satisfying several assumptions. First, if the peak of the mechanical response to the stimulus is tall, scattering is localized to the place where the response is largest. At this place, if the peak of the response is broad enough to encompass many scattering centers, some will present a spatial frequency matching the wavelength of the incoming sound. These will be singled out by a sort of bandpass-filter-like mechanism extracting a few coherent components from noise. The requirements of a tall and a broad peak may seem conflicting since, to produce a tall peak, the cochlear amplifier generates frequency tuning, which sharpens the peak.

FIGURE 9. A: amplitude of the residual sound signal corresponding to an SFOAE, plotted against frequency, in response to a swept tone at 30 dB SPL in a chinchilla ear. Open circles, initial measurement; solid line, repeated measurement after ~1 h. B: phase versus frequency function. C: group delay against frequency, extracted, at each frequency, from the slope of the phase versus frequency function.
Arrows marking the positions of amplitude notches, phase reversals, and extreme group-delay values, respectively, happen to coincide. [From Siegel et al. (251). Copyright 2005, with permission from Acoustical Society of America.] D and E: principle of group-delay measurements. The phase rotation of a signal from a local site, when its evoking stimulus is swept in frequency, accumulates in proportion to the time it takes for the signal to propagate from its generation place to the measurement point (e.g., the stapes or external ear canal), and therefore to their distance. The red and green sine waves start from the same site at different frequencies, thus undergoing different phase rotations in a given time interval. D: short propagation delay, small phase gradient. E: long propagation delay, large phase gradient, for the same change in frequency.

Actually, the wavelength is around 0.2 mm at a 16-kHz CF location in the gerbil (218), and the 1- to 2-mm-wide region over which amplification generates a tall peak (228) covers several wavelengths and many scattering centers. The backward contributions from scattering centers in the peak region will add in phase and generate a large backward-emitted SFOAE. Bragg scattering is fundamentally linear and additive, contrary to what was initially suggested (123). Nonetheless, cochlear activity requiring the nonlinear action of OHCs is essential for the generation of a tall and broad peak allowing spatial filtering and, ultimately, efficient Bragg-like emission of SFOAEs. The nonlinear characteristics that SFOAEs carry with them are their saturation at increasing stimulus level and their suppressibility. This view of purely linear coherent reflection has been challenged by recent observations based on time-delay considerations. If an SFOAE results from backward scattering on irregularities near the peak of the BM response to the stimulus, theory predicts that the time it takes for the SFOAE to reach the ear canal is the sum of two delays of similar size, one for the forward journey to the characteristic place on the BM and one for the backward journey (249). Because an SFOAE is a steady response to a steady tone, its timing cannot be directly evaluated. However, a classic way to derive temporal information from sinusoidal responses is to study how their phase relative to the stimulus varies with frequency. Let us consider a system through which the travel time is t, for example, to some characteristic place and back. For sound at frequency f, this travel time, or delay, translates into a phase shift Δφ = 2πft (every period, i.e., every 1/f, the phase of the sound rotates by one cycle). At frequency f + δf, the travel time leads to a phase shift Δφ' = 2π(f + δf)t, assuming that t varies little with f. It follows that t = (1/2π) dφ/df; thus the travel time can be deduced from how much the phase of the response to a pure tone varies with frequency (FIGURE 9, C AND D). The delay measured by this technique is called group delay in signal processing, as the envelope of a group of waves at different frequencies (or wave packet), moving at the group velocity vg, undergoes a group delay t = Δx/vg when moving by Δx. Average SFOAE delays are ~2 ms in cats and guinea pigs at 1 kHz (2 cycles), and even longer in humans, 10 ms at 1 kHz (10 cycles) (247). By comparing BM group delays to SFOAE group delays over a broad frequency interval in the chinchilla cochlea, Siegel et al.
(251) found that SFOAE delays were too short, well below twice the BM group delay, especially at frequencies <4 kHz. They ranged from 0.7 to 2 ms, well below the estimated round-trip travel time between the stapes and the CF location. The authors proposed two possible explanations. One is that the backward travel of the SFOAE is much shorter than the forward travel of the stimulus along the BM, which suggests a different propagation mechanism, perhaps by a compression wave through the cochlear fluids (see sect. VH). Another possibility is that contributions to the SFOAE are widely distributed, not only near the peak of the traveling wave but also basally, with shorter delays. These basal sources, although smaller in size than the CF-location ones, would undergo smooth spatial summation in long-wavelength cochlear regions, while cancellation in the short-wavelength region would offset the advantage of larger levels, finally emphasizing short-latency contributions (251). Siegel's criticism has since been addressed (248) by qualifying the previous model and incorporating more general considerations no longer based on "SFOAE delay = twice the BM delay." New evidence has been produced that SFOAEs (in cats) are as sensitive to low-frequency interference as cochlear responses from the place tuned to f, which is thus likely their dominant site of production (145). Other evidence, in the guinea pig, accommodates combined mechanisms of SFOAE generation, by showing that nonlinear mechanisms (independent of the presence of coherent reflection sites, but dependent on the high-level wave at f) can strongly influence the emission at high stimulus levels, whereas linear coherent reflection dominates at moderate and low levels (80). Subtle mechanisms such as position-dependent reflectance along the cochlea, multiple intracochlear reflections, or generation of "intermodulation" SFOAEs (difference tones occurring between harmonics, e.g., 2f - f or 3f - 2f; see sect. VB and Ref. 64) might also have to be taken into account.

G. SFOAEs Without a Traveling Wave

The previous explanation refers to the BM and its traveling wave, but the existence of scattering and of SFOAEs does not require them. For instance, it is possible to measure SFOAEs in lizard ears, which lack a BM-supported traveling wave (16), resonance being incorporated in hair-cell bundles (161, 204). Lizard SFOAEs, including their latency, are similar to those measured in mammalian ears. Obviously, in this case, latency cannot be attributed to the propagation time of a traveling wave that slows down while approaching its CF location. Delays arise not only from travel times but also from filter responses. The more finely tuned a filter, the later the beginning of its response and the longer its duration. It has been shown with the help of a physical model (16) that the only requirement for long-delay SFOAEs is the presence of a slightly disarrayed battery of active filters. Similar qualitative conclusions have been reached regarding the presence of long delays in TEOAEs evoked by short impulse stimuli. Their mathematical principles, simple and generic, are described in APPENDIX I. They account for how MET nonlinearity produces detectable OAEs in all tetrapod ears, however diverse their anatomy. The basic requirement is the presence of an active filtering principle, regardless of how it is achieved.
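The group-delay estimate used in section IVF, t = (1/2π) dφ/df, is simple enough to be sketched numerically. The data below are synthetic: a pure 10-ms delay (a value merely chosen to echo the human SFOAE delays quoted above) imposed on a phase-versus-frequency curve, from which the slope recovers the delay; sign conventions write the lag as a negative phase.

```python
import numpy as np

# Synthetic SFOAE phase data: a pure delay t produces a phase lag of 2*pi*f*t,
# written here with a negative sign (phase decreases as frequency increases).
true_delay = 0.010                        # 10 ms, illustrative
freqs = np.linspace(800.0, 1200.0, 81)    # Hz
phase = -2 * np.pi * freqs * true_delay   # unwrapped phase (radians)

# Group delay: t = -(1/2*pi) * d(phase)/d(frequency)
group_delay = -np.gradient(phase, freqs) / (2 * np.pi)
print(f"estimated group delay: {group_delay.mean() * 1e3:.2f} ms")   # ~10 ms
```

With real measurements, the phase must first be unwrapped and the slope taken over a frequency span wide enough to average out the fine structure, but the arithmetic is the same.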
In summary, this section shows how the properties of OAEs evoked by a single tone can be understood by considering several key functions of an auditory organ: how waves propagate and interact with sensory cells; how fast they travel; what amplification and compression they undergo and whether a tall and broad peak is produced at the CF location; and how finely waves are filtered. By measuring SFOAE properties, it is possible to probe several aspects of auditory stimulus processing and perception without disrupting their fragile anatomical support. The next section elaborates on these methods by examining a spectrally richer situation.

V. INTERACTION OF TWO TONES IN THE COCHLEA

When two tones at frequencies f1 and f2, called the primaries (f2 > f1), are presented simultaneously, the envelopes of BM motion, which peak at two different places, overlap over a broad interval. The amplitudes of the peaks at f1 and f2 on the BM still grow with stimulus level in a compressive manner, and harmonics are still produced at 2f1, 2f2, 3f1, etc. However, as BM and OHC vibrations are nonlinear, two other important nonlinear manifestations show up wherever the vibrations at f1 and f2 interact, i.e., all over the basal part of the cochlea and particularly near the place tuned to f2, the higher frequency. The presence of vibrations at one frequency tends to suppress the vibrations at the other frequency, and, conversely, intermodulation due to the combined presence of vibrations at two frequencies generates vibrations at new frequencies not present in the stimulus.

A. Two-Tone Suppression

Two-tone suppression (2TS) occurs when the presentation of a (suppressor) tone at frequency fs decreases the mechanical response to a probe tone at frequency f, usually set near the resonance frequency of the measured spot (228). Suppression is one of the mechanisms of masking, whereby a weak probe tone is no longer audible in the presence of a louder masker sound (see sect. IXA). An objective correlate is two-tone rate suppression in auditory-nerve fibers (107). As the firing rate of neurons that code for a probe tone is reduced by the presentation of a suppressor tone at a different frequency, so is the audibility of the probe tone. Early single-fiber recordings of auditory neurons revealed two domains of the frequency-versus-level map of neuronal activity, on both sides of the excitatory region centered on the probe frequency, such that a suppressor tone decreased the activity due to the probe even when too weak to elicit an activity of its own (236). As IHCs and neurons were not excited by the suppressor tone, suppression had to be a micromechanical mechanism, rather than a synaptic or neural one. Direct evidence of 2TS in BM vibrations (226, 233) confirmed this mechanical interpretation. The characteristics of 2TS on the BM have been dissected in the hope that they would shed light on its underlying nonlinear mechanism. The growth of the BM response to a probe tone presented alone is compressive at intermediate intensities, at frequencies near the CF of the measured place. Extensive experimental studies of 2TS have evaluated how the BM vibration at the probe frequency changes when a second, suppressor tone is added at a frequency either higher or lower than CF (233).
Although suppressors presented alone produce responses smaller than the probe does, they induce a clear suppression, monotonically increasing with suppressor intensity and decreasing with probe intensity. In addition to decreasing the sensitivity to the probe at CF, 2TS decreases the compressive nonlinearity of BM growth. Suppression is also a tuned phenomenon, so that for a probe at CF, suppressor frequencies close to CF are the most efficient (42, 43). When the probe frequency departs from CF, low- or high-frequency suppressors lose their suppressive influence (233). The phase changes of BM vibrations at CF in the presence of a suppressor tone are reminiscent of those produced by stimulus-intensity changes (42, 228). Last, 2TS is labile and requires healthy OHCs (202, 233). In summary, experimental facts concur in suggesting a close correlation of 2TS with the interaction of OHCs and BM and with its results: enhanced sensitivity, compressive amplification, and frequency tuning. To give these facts a common theoretical framework requires an analysis of the complex behavior of a nonlinear oscillator driven by a mixture of input signals at two different frequencies and different combinations of levels. Of note, the simplest experimental model eliciting realistic 2TS is a single hair cell from the bullfrog's sacculus placed in a two-compartment preparation allowing cell activity and resonance (14), which indicates that 2TS is not specific to the complex architecture of the mammalian cochlea but can be ascribed to the active resonance of a nonlinear oscillator. This suggests the use of a generic model of a compressive, self-tuned resonator with mass, stiffness, and damping, such as the one provided by the mathematical formalism of a critical oscillator near a Hopf bifurcation (105), solvable using reasonable approximations (110). At low stimulus levels, a controlled negative resistance almost cancelling the viscous drag poises a critical oscillator near self-oscillation and allows cochlear amplification. The resistive term also contains a stabilizing amplitude-dependent term |z|²z (where z denotes the amplitude) that generates compression near resonance by decreasing amplification at increasing input levels (34, 60). Most characteristics of 2TS can thus be predicted in terms of a competition between probe and suppressor to drive a critical oscillator (FIGURE 10) (64, 110). In the presence of a single tone, the oscillator tuned to it behaves as a sharp filter with a narrow passband. Thus, if a second, interfering tone is added at a frequency far enough from resonance, it is filtered out and bears no influence. When the interfering tone gets nearer to the resonance of the filter, all the more when it is louder, the oscillator starts responding to it, which increases z and the damping, |z|²z. Increased damping decreases the response to the probe tone, which gets suppressed. A critical oscillator becomes linear if it is stimulated at frequencies differing from its resonance frequency, which explains why the suppressive effect of a low- or high-frequency masker decreases when the probe tone frequency moves away from CF (233).

FIGURE 10. Two-tone suppression in a critical oscillator, predicted by the Hopf-bifurcation formalism. The probe tone is at resonance, and the suppressor frequency is variable. A: when probe and suppressor tones have the same amplitudes, the response of the oscillator to the probe tone (red line) decreases as the suppressor approaches its frequency.
In the meantime, the response of the oscillator to the suppressor (black line) increases when the suppressor gets within the bandwidth of nonlinear amplification, while remaining smaller than it would be in the absence of the probe tone (thin line). B: influence of the level of the probe tone on the responses to the probe (red line) and suppressor (black line), for a suppressor with fixed amplitude and frequency (near that of the probe). The thin red line and thin horizontal black line indicate the responses to the probe and to the suppressor, respectively, when presented alone. The vertical distance between thin or thick lines indicates the contrast between probe and suppressor, which is larger when there are suppressive interactions. [From Jülicher et al. (110). Copyright 2001, with permission from National Academy of Sciences, USA.]

In FIGURE 10 and in Reference 110, it can be seen how the general critical-oscillator framework accounts for the outcomes of most experimental situations of 2TS mentioned above. The model also reveals how suppression acts the other way round, as the probe tone itself, by also influencing the damping term, competes with the responses to simultaneously presented tones and tends to suppress them. A general rule that holds regardless of the exact mathematical model of nonlinearity is that, for suppression to be significant, the suppressor level at the input of the nonlinearity must get within 10 dB of the probe stimulus (64), which can be achieved either when the suppressor frequency moves closer to CF or, at fixed suppressor frequency, when its intensity increases. Thus the fact that realistic 2TS stems from the parsimonious formalism of Hopf oscillators does not refute other models of cochlear nonlinearity (see sect. VIIIA). Whereas, for a single nonlinear oscillator, suppressors below and above the resonance frequency play symmetrical parts, this is not the case for the cochlea. The pattern of Békésy's traveling wave accounts for the so-called upward spread of masking (see sect. IXA), whereby a low-frequency suppressor influences higher-frequency probes whenever the tail of the suppressing envelope presents a significant amplitude at the place tuned to the probe (see sect. IVA). In contrast, high-frequency interfering tones, unable to reach beyond their characteristic place, can suppress only if they peak near enough to the probe to influence its traveling wave along the short section where its amplification takes place (203). The extent of the active mechanism along the BM can be evaluated in this manner. Patuzzi (203) concluded that activity spans a relatively broad distribution of OHCs, e.g., 650 µm basal to the characteristic place of an 18-kHz probe tone in the guinea pig. It might seem that suppression means "loss of information." However, by ensuring that when a given filter receives several inputs, the larger one (likely the one at resonance) will win the competition, 2TS also improves spectral analysis. The spectral components most easily suppressed at one site of the cochlea are off-resonance ones. Actually, they will propagate to their own CF locations, where the competition will favor them at the expense of off-frequency competitors. In this respect, 2TS acts as a contrast enhancer (14, 262). Nonetheless, very loud low-frequency suppressors will irretrievably suppress all competitors.
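A minimal simulation of this critical-oscillator account of 2TS can be written from the Hopf normal form, dz/dt = iω0 z − |z|²z + F(t), with the oscillator poised exactly at the bifurcation and the nonlinear frequency-pulling term omitted. This is only a toy integration of the formalism cited above (105, 110), not the authors' calculation: the forcing amplitudes, frequencies, and integration scheme below are illustrative choices. The probe sits at resonance, a suppressor is added 10% above it, and the projected response at the probe frequency decreases monotonically as the suppressor forcing grows, qualitatively as in FIGURE 10.

```python
import numpy as np

# Critical (Hopf) oscillator at the bifurcation (mu = 0):
#   dz/dt = i*w0*z - |z|^2 * z + F(t)
w0 = 2 * np.pi * 100.0          # resonance frequency, 100 Hz (illustrative)
wp, ws = w0, 2 * np.pi * 110.0  # probe at resonance, suppressor 10% above
Fp = 1e3                        # probe forcing amplitude (arbitrary units)
dt, T = 1e-5, 0.5               # time step and duration (s)
t = np.arange(0.0, T, dt)
rot = np.exp(1j * w0 * dt)      # exact rotation of the linear part per time step

def probe_response(Fs):
    """Steady-state amplitude of the oscillator response at the probe frequency."""
    F = Fp * np.exp(1j * wp * t) + Fs * np.exp(1j * ws * t)
    z, out = 0.0 + 0.0j, np.empty(len(t), dtype=complex)
    for n in range(len(t)):
        out[n] = z
        z = rot * (z + dt * (F[n] - (abs(z) ** 2) * z))   # exponential-Euler step
    sel = t >= 0.2                                        # discard the transient
    # Projection onto exp(i*wp*t); the window spans whole beat periods of ws - wp.
    return abs(np.mean(out[sel] * np.exp(-1j * wp * t[sel])))

for Fs in [0.0, 1e3, 3e3, 1e4]:
    print(f"suppressor forcing {Fs:8.0f} -> probe amplitude {probe_response(Fs):6.2f}")
```

With the suppressor off, the response amplitude equals the cube root of the probe forcing (compressive growth); as the suppressor forcing grows, the extra |z|² damping it creates pulls the probe response down, which is the essence of 2TS in this framework.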
B. Distortion-Product Otoacoustic Emissions Conserved Throughout Evolution

Two-tone suppression can lead to the complete masking of a soft tone by a loud one, but in the two-tone paradigm discussed in this section, with similar moderate levels at f1 and f2, 2TS is weak. The second nonlinearity typical of this two-tone paradigm is an enriched frequency spectrum of BM motion, with new spectral components at integer combinations of the frequencies f1 and f2. As already seen in section II, these audible DPs, objectively detected in the ear canal (120), differ from instrumental nonlinearities by their existence even at low stimulus levels (essential nonlinearity; Ref. 79) and by their spectral content, with a predominance of odd-order DPs (see sect. XI; Ref. 267). The presence of DPs has been ascertained at many spots along the mammalian cochlea: in the first place on the BM, with the help of optic interferometric measurements at basal locations (FIGURE 11) (229, 230) and near the apex, with slightly different characteristics (43, 86, 125). DPs are found in the pressure wave in SV throughout the cochlea (13), and near the BM in ST (54). Last, the CM reflecting transduction currents through basal OHCs contains DPs (289). Because the common source of DPs is the nonlinear mechanics inherent to the gating of mechanosensitive channels, it is not surprising to observe DPs and DPOAEs in lower vertebrates, where they are as ubiquitous as SFOAEs (15). Even in insects, DP equivalents in vibrating structures have been described (82, 285). The idea of taking advantage of this evolutionarily rooted similarity has led to the development of simple ex vivo systems in which DPs and their attendant characteristics, activity, frequency tuning, and compressive gain can be studied together in the absence of the complex propagation issues specific to the cochlea, such as BM resonance, propagation of vibrations, interaction of DPs from many neighboring cells, and recombination from discrete DP generation sites. For example, hair cells from the neuroepithelium of bullfrogs, in settings containing two fluid compartments preserving their natural ionic environment, can be excited individually either by fluid jets or by calibrated glass micropipettes acting on their stereocilia bundle. In this nonmammalian, nonauditory hair cell, the MET channel gating compliance is qualitatively similar to that of OHCs, and amplification activity exists even though it involves only the hair bundle. DPs have been observed in the current through hair cells, in their membrane potentials (106), and in hair-bundle displacements (14). The latter case offers a neat illustration of how 2TS and DP properties emerge together from the active or passive properties of hair-bundle motion (14). The nonlinearities in the mechanical response of the frog saccular hair bundle were explored. FIGURE 12A, which is modified from Barral and Martin (14), depicts the compressive increase of the hair bundle's displacement observed for a stimulus at the characteristic frequency of spontaneous oscillations of the hair bundle (its resonance frequency). An interfering tone of about the same frequency produces strong 2TS and decreases the compression (FIGURE 12A). Off resonance, the mechanical response of the hair bundle to a single stimulus is smaller at low stimulus levels but grows faster with increasing level than near resonance, as compression does not exist (FIGURE 12B).
FIGURE 11. When pairs of equal-level tones at f1 and f2 are presented near CF (8 kHz, in a chinchilla cochlea) at frequencies such that 2f1-f2 = CF and f2/f1 = 1.05 (left), the spectrum of BM vibrations at the CF site displays a series of equidistant peaks flanking the primary responses, the odd-order DPs at intermodulation frequencies. The largest DPs are the cubic ones at 2f1-f2 and 2f2-f1, and DPs decrease in amplitude at increasing orders, forming a tent-like pattern. At decreasing stimulus levels (in dB SPL on the left side of each spectrum), the DP levels show a slow decrease, all approximately at the same rate in dB/dB. At a higher f2/f1 (1.25, on the right side), only the 2f1-f2 response is visible in addition to small peaks at the stimulus frequencies, because the primaries and the DPs other than 2f1-f2 are too far away in frequency and are filtered out at the BM place of measurement. Again, its decrease in amplitude, in dB per dB decrease of the stimuli, is compressive. [From Robles et al. (230).]

Likewise, 2TS no longer acts (FIGURE 12B). The narrow range of efficiency of 2TS was as predicted by the model of Jülicher et al. (110) (FIGURE 10). As regards intermodulation, with stimuli near resonance, the DPs differed from the off-resonance case by the shallow slope of their decrease when stimulus levels decreased [near 1 dB/dB, as in Goldstein's experiment (79), vs. >2.7 dB/dB off resonance], which ensured their persistence at low stimulus levels, and by the fact that fewer DP frequencies stood out, essentially the nf1 - (n-1)f2 and nf2 - (n-1)f1 odd-order series (with n = 2, 3, 4 . . .) flanking f1 and f2 (FIGURE 12, C VERSUS D) (14). As we will see in sections V, C-F, and VIIIA, many of these characteristics of single-hair-cell DPs also show up in mammalian DPOAEs.

C. DPOAEs, a (Not so Transparent) Window on the Inner Ear

Direct access to cochlear micromechanics and OHC operation, to assess the enhanced sensitivity, compression, and filtering, requires opening the bony cochlea, with a high risk of damaging the vulnerable active mechanisms. Even minimal damage would affect the validity of measurements. Furthermore, only a few cochlear spots are accessible, near the round window and at the apex. Conversely, DPOAE measurements are totally harmless and straightforward over a large range of frequencies, and DPOAEs are easier to extract than SFOAEs as their frequencies differ from those of the stimuli. Yet their relation to cochlear amplification is not straightforward, and this requires the issues of generation sites and of propagation mechanisms [absent from the single hair-cell model of Barral and Martin (14)] to be teased apart. A prerequisite is to understand the logic behind the conspicuous changes in DP amplitudes and phases when stimulus characteristics are modified. Consider, for example, the 2f1-f2 DP, usually the largest one. For a given f2, its amplitude varies considerably when the f2/f1 ratio varies, from 1.60 to near 1.00 (a so-called f1-sweep). This dependence clashes with the intuitive assumption that the dominant contributions to a DP come from where maximum interference occurs between f1 and f2, near the peak response of the BM to f2. As the DPs of the f1-sweep come from the same OHCs, should not their amplitude be constant? Experimentally, a "bandpass filter"-like characteristic is observed, such that the DPOAE at 2f1-f2 reaches a maximum near f2/f1 = 1.20-1.25 (FIGURE 13) (5, 88). The heard DP is not constant either, but gets louder at decreasing ratios (79).
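The tent-like family of odd-order DPs in FIGURE 11 can be mimicked, very crudely, by passing two tones through a memoryless, odd-symmetric saturating nonlinearity (a hyperbolic tangent here, chosen purely for convenience). This toy model ignores tuning, propagation, compression, and the OP, so only the spectral bookkeeping, not the growth rates, should be compared with the cochlear data; all frequencies and the drive level are arbitrary.

```python
import numpy as np

fs = 100_000.0
t = np.arange(0.0, 0.5, 1.0 / fs)            # 0.5 s -> every frequency below is an exact FFT bin
f1, f2 = 8000.0, 8400.0                      # primaries, f2/f1 = 1.05
x = np.sin(2 * np.pi * f1 * t) + np.sin(2 * np.pi * f2 * t)

y = np.tanh(1.5 * x)                         # odd-symmetric saturating nonlinearity
spec = np.abs(np.fft.rfft(y)) / len(y)
freqs = np.fft.rfftfreq(len(y), 1.0 / fs)
level = lambda f: 20 * np.log10(spec[np.argmin(np.abs(freqs - f))] + 1e-15)

ref = level(f1)
for name, f in [("f1", f1), ("f2", f2),
                ("2f1-f2", 2 * f1 - f2), ("2f2-f1", 2 * f2 - f1),
                ("3f1-2f2", 3 * f1 - 2 * f2), ("3f2-2f1", 3 * f2 - 2 * f1),
                ("2f1 (even)", 2 * f1)]:
    print(f"{name:10s} {f:7.0f} Hz  {level(f) - ref:7.1f} dB re f1")
```

Because the nonlinearity is odd-symmetric, the even-order product at 2f1 sits at the numerical noise floor while the odd-order series 2f1-f2, 2f2-f1, 3f1-2f2, . . . stands out and decreases with order, the pattern the text describes for physiological DPs when the OP is symmetric.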
A tentative explanation for the "bandpass filter"-like outcome of an f1-sweep is the interplay of a mechanical filter near the generation site, tuned to a lower frequency. At optimal ratios, 2f1-f2 ≈ f2/1.4, so that a suitable filter would have to be tuned half an octave below the main filter.

FIGURE 12. A: growth function of the motion of the stereocilia bundle of a hair cell from a frog sacculus in response to a tone at frequency f1 (f1 = 13 Hz, near the characteristic frequency of spontaneous oscillations, 11 Hz), in the presence of a suppressor tone at frequency f2 = 12 Hz, added to the test tone at increasing amplitudes (symbols: right panel). B: same hair bundle as in A, tested off-frequency, with f1 = 130 Hz and f2 = 120 Hz. C: frequency spectrum of hair-bundle motion in response to a two-tone stimulation at f1 = 20 Hz and f2 = 22 Hz, near resonance, with rather high stimulus forces (red labels: main intermodulation frequencies). D: same cell as in C, with off-resonance two-tone stimulation at f1 = 200 Hz and f2 = 220 Hz. At lower stimulus forces (FIG. 3, A AND C, in Ref. 14), the two-tone stimulus of C still elicited visible DPs, whereas that of D did not. [From Barral and Martin (14). Copyright 2012, with permission from National Academy of Sciences, USA.]

The tectorial membrane does present such a characteristic radial resonance (84, 314), which suggested it as a possible candidate (5, 26). Another, nonexclusive possibility would be that at f2/f1 ratios near 1, the 2f1-f2 and other lower-band DPs, by falling near the primary frequencies, undergo 2TS (113), strongly decreasing their amplitudes. In these frameworks, the noninvasively accessible DPOAEs would uncover the influence of subtle cochlear micromechanics. Yet the most conclusive interpretation so far relies on the phase behavior and directionality of interfering traveling waves from distributed DP generation sites (65, 244), developed in the following section.

D. Distributed Sources and Propagation(s?) of DPOAEs

Potential DP generation sites are distributed over a broad interval where the envelopes at f1 and f2 overlap and OHCs respond to both frequencies, particularly, but not only, near f2. The lowest stimulus levels in most publications range from 40 to 60 dB SPL, and at these levels the envelope of BM motion definitely has a broad tail (228). If we consider a DP-generating region along the BM, stimulation evokes DP wavelets, each at a particular location of this region from which it travels in two directions, apically toward the CF location and basally. At the stapes or at their CF location, all wavelets are vectorially added and, depending on their respective phases, cancellation or enhancement occurs. The DP level at the stapes determines the DPOAE level, and the DP level at CF determines the loudness of the heard combination tone. The propagation of DPs from their generation places follows the same principle as that of pure tones and SFOAEs (see sect. IV, B and E). As intermodulation creates new frequencies and wavelengths, the prediction of phase relationships between the wavelets emitted from different generation sites at the frequency of each DP is significantly more complex. Its mathematics (244, 266) are outlined accessibly in APPENDIX II. The particular case of 2f1-f2, the largest DP, is examined there without loss of generality.
Due to scaling symmetry, the vibration patterns of f2, f1, and 2f1-f2 are just translated versions of each other along a logarithmic representation of the cochlea. The phase at f2 rotates fast when this primary nears its CF location, while at the same spot, f1 and 2f1-f2 accumulate less and less phase with increasing f2/f1 ratio, as the peaks at f1 and 2f1-f2 move farther apically (see APPENDIX II). A DP wavelet emitted at some place along the region of generation starts with a phase lag due to the phase lag of the stimuli on their way to the generation site, then accumulates another phase lag while traveling from its generation site to the measurement place, either the stapes or the CF location. There the DP results from the vector addition of wavelets from all generation sites. The dependence of the DP amplitude on the f2/f1 ratio is explained by the computation, for each ratio, of the difference in overall phase among wavelets. A large DP stems from wavelets with the same phase accumulation, i.e., phase coherence (244). In the absence of phase coherence, the DP is small. Inspection of the vibration patterns of f2, f1, and 2f1-f2 leads to the prediction of an f2/f1 ratio-dependent directionality, properly accounting for a large DPOAE amplitude (corresponding to backward propagation of wavelets) at intermediate ratios (near 1.25), and for a loud heard DP cubic difference tone at small ratios (near 1). The fact that a theory robustly explains an experimental fact is encouraging yet constitutes no definite proof. This theory makes use of the powerful mathematical wavelet description of Huygens, the main difficulties of which are to make sure that all possible wave sources are included and that all propagation modes are properly described. It cannot be excluded that new sources will be uncovered and that the reverse propagation of DP waves eventually turns out to use another mode than the BM traveling wave (see sect. VH, although we will see there that it is unlikely). Then the theoretical basis of the ratio dependence of DPOAEs and DPs would have to incorporate the new sources and propagation paths.

FIGURE 13. Average "bandpass filter-like" pattern of DPOAEs in normal mice (top: CBA/J strain) and slightly hearing-impaired mice (bottom: CD1) when an f1-sweep is performed at fixed f2, so that the ratio f2/f1 varies from 1.1 to 1.5. The level of the DPOAE at 2f1-f2 elicited by a 60 dB SPL equal-level pair of (f1, f2) tones displays a blunt peak for f2/f1 around 1.25, whereas lower and higher ratios yield smaller DPOAEs. In mice with 20- to 40-dB loss in sensitivity due to progressive degeneration of OHCs near the places tuned to f1 and f2, the residual DPOAEs have a similar behavior. [From Le Calvez et al. (138). Copyright 1998, with permission from Elsevier.]

E. One Nonlinearity, Several Paths

According to the two-mechanism model of Shera (246), the forward-propagating wavelets at 2f1-f2, when their phases are coherent enough to let them reach their CF location, are back-scattered there on cochlear irregularities, by linear coherent reflection (see sect. IVF). The DPs reflected by this mechanism primarily arose from the basal nonlinear interference between f1 and f2. At the stapes, the DP from the coherent reflection site combines with the DP directly emitted backward from the places of nonlinear interaction. The two mechanisms and propagation modes identified by this model for the measured DPOAEs share the same initial nonlinear stage.
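The vector addition of wavelets invoked here (and in sect. IVB) can be made concrete with a toy calculation in which each site along a stretch of the partition launches a unit wavelet whose phase follows the incoming forward wave. The geometry and the constant phase gradient below are invented for illustration and ignore amplitude envelopes, dispersion, and amplification; the sketch only shows how directionality emerges from phase coherence.

```python
import numpy as np

# Toy model: n_sites emitting sites spread along the cochlear position axis.
n_sites = 200
x = np.linspace(1.0, 3.0, n_sites)       # site positions (arbitrary units)
k = 8.0                                   # local phase accumulation per unit length (illustrative)
phi = k * x                               # phase of the incoming forward wave at each site

# Wavelets reaching a forward point x_f beyond the emitting region: extra lag k*(x_f - x).
x_f = 4.0
forward = np.exp(-1j * (phi + k * (x_f - x)))   # total phase k*x_f for every site -> coherent

# Wavelets traveling back to the stapes (x = 0): extra lag k*x, total phase 2*k*x.
backward = np.exp(-1j * (phi + k * x))          # phase spreads over several cycles -> cancellation

print("forward  |vector sum| =", round(abs(forward.sum()), 1))   # ~ n_sites (fully coherent)
print("backward |vector sum| =", round(abs(backward.sum()), 1))  # << n_sites
```

The same bookkeeping, with phases taken at the DP frequency and generation sites weighted by the local strength of the f1-f2 interaction, is what makes the DPOAE level depend so strongly on the f2/f1 ratio in the distributed-source picture.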
As we will see, experiments reveal two distinct phase behaviors of DPOAEs: one with a uniform phase, independent of the primary frequencies, and a second one with a phase that rotates rapidly with the primary frequencies. This accords quite well with a two-path model, either the one developed by Shera (246) or any alternative predicting the two phase behaviors. The first path is straightforward and its existence unchallenged. Part of the DP comes directly backward from the place of nonlinear interaction, and its uniform phase relates to scaling symmetry (see sect. VD). As for the second path, the model of coherent reflection on local BM irregularities (246) implies a suitable frequency-dependent phase. One alternative to it has been proposed recently (217), in which Reissner's membrane (FIGURE 1A) is assumed to support a bidirectional propagation mode, the backward component of which would contribute to the overall DP at the stapes. Reichenbach's model postulates that distortion on the BM evokes a traveling wave along Reissner's membrane which, below 1 kHz but not at high frequencies, can significantly interact with the BM. It also predicts that DPs, both at 2f1-f2 and 2f2-f1, can travel along Reissner's membrane, forward and backward, by a short-wavelength mechanism similar to capillary waves at the surface of a water tank. Interestingly, along the nonresonant Reissner's membrane, these waves do not incur the limitation that prohibits propagation across a resonant position. For the BM-propagated 2f2-f1 DP, backward travel from the place where maximum nonlinear interaction between the primary tones occurs is impossible, as this place is apical to the 2f2-f1 CF location. Upper-band DPs experience a shift of their generation place toward their own CF location (163, 165), where nonlinearity is less. The Reissner mode, by letting 2f2-f1 DPs travel from the place tuned to f2 along the whole cochlea, might produce a larger outcome. Experimental validation, by interferometric measurements in rodents (217), was restricted to showing the presence of a 1-kHz 2f1-f2 DP on Reissner's membrane, traveling forward beyond its CF location with a fast-rotating phase, over places where the BM could not vibrate at 2f1-f2. Important unsettled issues are whether large DPs could show up on Reissner's membrane, particularly at 2f2-f1, and not only travel backward but also influence the BM-propagated motion at 2f2-f1, even at higher frequencies where the motion of Reissner's membrane is supposed to influence BM motion very little. The extent of generation sites and the mixture of propagation paths are essential issues for the outcome of DPOAE measurements from a locally damaged cochlea to be interpreted usefully. An unfortunate combination of stimulus parameters, favoring the coexistence of several contributions with similar levels and different phases in a normal cochlea, might generate intractable complexity offsetting the advantage of DPOAEs as local probes of active cochlear mechanics (51). Local damage removing only one contribution might even fail to be noticed on the overall map of DPOAE levels. This is why intense research has been devoted to identifying choices of stimulus parameters that emphasize one mechanism and path at the expense of the others. It is helpful in practice that these choices for the 2f1-f2 DP extend to all other lower-band DPs (3f1-2f2, 4f1-3f2, etc.).
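The coexistence of the two phase behaviors can be caricatured with two complex components: a wave-fixed one of roughly constant phase and a place-fixed one whose phase rotates as if delayed by a fixed round-trip time. The 10-ms delay and the relative amplitudes below are arbitrary; the sketch is only meant to show why a two-path mixture produces a quasi-periodic fine structure in the DPOAE level, with a spectral period equal to the reciprocal of the delay.

```python
import numpy as np

# Toy two-path DPOAE: a "wave-fixed" component with constant phase plus a
# "place-fixed" component whose phase rotates rapidly with DP frequency.
fdp = np.linspace(1000.0, 2000.0, 501)            # DP frequency (Hz)
wave_fixed = 1.0 * np.exp(1j * 0.3)               # constant amplitude and phase (illustrative)
reflection_delay = 0.010                          # 10-ms round trip (illustrative)
place_fixed = 0.5 * np.exp(-2j * np.pi * fdp * reflection_delay)

total = wave_fixed + place_fixed
level = 20 * np.log10(np.abs(total))
print("fine-structure period ~", 1.0 / reflection_delay, "Hz")
print(f"level range: {level.min():.1f} to {level.max():.1f} dB")
```

Suppressing either component (setting its amplitude to zero) flattens the fine structure, which is the logic behind the interfering-tone and phase-banding methods discussed next.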
Phase criteria have been developed for separating the sources when they coexist (134). As seen above, the lower-band DPs traveling straight back from the nonlinear place dominate at intermediate f2/f1 ratios (>1.10 for 2f1-f2). Their principle of generation and propagation essentially depends on how the waves at f1 and f2 overlap. When f2 and f1 vary at constant f2/f1 ratio, scaling symmetry applies, the vibration patterns are conserved, and the phase at any point moving with the traveling-wave envelopes changes little; thus the generated DPs are "wave-fixed" (134). In two-dimensional plots of phase against DPOAE frequency at varying f2/f1 ratios (133, 168) (FIGURE 14), the independence of DPOAE phase from DPOAE frequency at constant f2/f1 ratio translates into a horizontal banding pattern (FIGURE 14B, top). Conversely, the DP components traveling forward from the generation place and then backscattered on local irregularities near the DP CF place (dominant at small f2/f1 ratios) are "place-fixed," as they strongly depend on the irregular places. As the stimuli are swept in frequency, the excitation patterns move along the BM and different reflecting sites are encountered, with different phase rotations as a result. The reflection-mechanism DPs are thus characterized by a vertical banding pattern (FIGURE 14B, bottom), contrasting with the horizontal pattern of wave-fixed DPs. Within one vertical band, the measured phase is independent of f2/f1 since, for a chosen DP frequency, the scattering places remain fixed (coherent reflection being supposed to occur at the place tuned to the DP frequency). A caveat is that, even though the lower-band DPs arising from coherent reflection at the place tuned to their frequency are expected to generate a vertical-banding pattern, evidence has been produced recently that some vertically banding DP components originate from another place, basal to f2, using methods more extensively described in section VIA and APPENDIX III.

FIGURE 14. Maps of DPOAE level (A and C) and phase (B and D) against DPOAE frequency (horizontal axis) and f2/f1 frequency ratio (vertical axis) in the ear of a normally hearing rabbit. The horizontal dashed white line (corresponding to f2/f1 = 1.00) splits the plots into an upper and a lower part representing the characteristics at 2f1-f2 and 2f2-f1, respectively. Different colors are used to represent DPOAEs with amplitudes or phases falling in different dB or 45°-wide intervals. Horizontal banding in the phase map means that the DPOAE phase at fixed f2/f1 ratio is independent of DPOAE frequency: it is determined by f2/f1 and obeys frequency scaling. Vertical banding means that the DPOAE phase, regardless of the f2/f1 ratio, is determined by some coherently reflecting microstructure, e.g., at the (fixed) place tuned to the DPOAE. Yet, compared with A and B, C and D are plotted in the presence of an interfering tonal stimulus near 2f1-f2, meant to jam the coherent-reflection process at the place tuned to 2f1-f2. Its failure to influence the banding pattern suggests that vertical bands may be due to coherent reflection on a more basal microstructure. [From Martin et al. (168). Copyright 2009, with permission from Acoustical Society of America.]
For example, FIGURE 14D shows a persistent vertical banding pattern despite the presence of an interference tone, which, by being placed near the DP frequency, suppresses coherent reflection occurring at the DP CF place (hence the decrease in overall DPOAE level seen in FIGURE 14C). In other words, although consistent with coherently reflected DP components, vertical banding is not a signature of apical coherent reflection (167). This issue is also raised by upper-band DPs. In their case, only place-fixed components have been observed, with fast phase rotation and vertical banding at all frequency ratios (see FIGURE 14B, bottom). It is a common observation in mammals that their levels are smaller than those of their lower-band cousins of the same order, but not in chickens, lizards, or frogs (15). A first explanation considers that, as the place near f2 with maximum nonlinear interaction has a resonance frequency less than DP frequencies, the BM is too slack for launching these DPs directly toward the stapes. Upper-band DPs would be more efficiently produced more basally (163, 169), with a region of place-fixed reflections occurring somewhere between the DP generation region and the DP CF place (133). With regard to both upper- and lower-band DPs, it is noteworthy that at high primary levels, the tails of the envelopes of BM motion at f1 and f2 extend far basally, where the absence of compression allows a steeper growth than at resonance. At >70 dB SPL, they are almost as large as they would be more apically at the CF location. Conceivably, the interactions in OHCs responding at both frequencies remain nonlinear enough, and phases along the interaction interval vary smoothly enough, that significant DPs could come from basal places. Some authors found no evidence of it (302), while others did (170). The complexity of wave combinations is such that DPs from basal sites may be easily overlooked because their removal (by basal lesions or by application of a third interacting tone well above f2; see APPENDIX III) hardly influences the overall DP levels. Vector subtraction has to be used for separating the basal contributions from the standard ones. Computations of vector differences between 2f1-f2 DPOAEs without versus with an interfering tone (that turns off the contribution from the place where it peaks) recently led to the observations that in small rodents, there is no significant contribution from the CF place of the DPOAE, and that a large DPOAE component sensitive to high-frequency interfering tones, one-third octave above f2, thus likely generated basally, displays a vertical banding pattern (167). The proposed explanation still rests on the idea that vertically banding components come from a relatively fixed location when primary frequencies are swept, but assumes that what determines this fixed basal location is some constraint to which the DPOAEs have to conform. For example, for the 2f2-f1 DPOAE, contributions apical to its CF location would be absorbed when traveling over this place, and for the 2f1-f2 DPOAE, suppression by high-level primary stimuli (113) would spare only the contributions from sites well basal to stimulus CF locations (167). Without contradicting the existence of coherent reflection, and without the need to invoke the Reissner's membrane as an alternative propagation medium (217), such evidence strongly supports a complementary place-fixed mechanism involving basal generation sites.
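The banding logic can be illustrated with a minimal numerical sketch (not taken from the studies cited above; the delay and phase constants are arbitrary): a component whose phase is set by the f2/f1 ratio alone maps onto horizontal bands in a plot of phase against (DPOAE frequency, ratio), whereas a component whose phase rotates with DP frequency through a fixed delay maps onto vertical bands.

```python
# Toy illustration of "wave-fixed" vs. "place-fixed" phase behavior (assumed
# delays and phases, not data): the first depends on the f2/f1 ratio only,
# the second on DP frequency only, as in the banding of FIGURE 14B.
import numpy as np

f_dp = np.linspace(1_000, 8_000, 200)          # DP frequency axis, Hz
ratio = np.linspace(1.05, 1.40, 100)           # f2/f1 axis
F, R = np.meshgrid(f_dp, ratio)                # F varies along columns, R along rows

phase_wave_fixed = (2 * np.pi * 1.5 * (R - 1)) % (2 * np.pi)   # set by the ratio only
tau = 5e-3                                                      # fixed round-trip delay, s
phase_place_fixed = (2 * np.pi * F * tau) % (2 * np.pi)         # set by DP frequency only

# Phase varies along exactly one axis in each map: horizontal vs. vertical bands.
print("wave-fixed constant along the frequency axis :",
      np.ptp(phase_wave_fixed, axis=1).max() < 1e-9)
print("place-fixed constant along the ratio axis    :",
      np.ptp(phase_place_fixed, axis=0).max() < 1e-9)
```

Real data, as in FIGURE 14, superimpose both behaviors, which is why phase criteria and interfering tones are needed to separate them.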
At increasing stimulus intensities (70 dB SPL or more), other combinations of interactions have been suggested. For example, 2f2-f1 could be produced in two stages: generation of the harmonic 2f2, which in turn could interact near its own basal peak with f1, to generate a quadratic product [2f2] − f1 (64). However, these mechanisms can likely be neglected as long as stimulus levels are kept ≤65 dB SPL, as was the case for the rodent ears measured in Reference 167.

A currently active issue is whether this complex behavior of DPOAEs in a cochlea can be found in other vertebrates without tonotopically organized BM, electromotile hair cells, or tectorial membrane (15). In birds, in which the auditory sensory-cell dichotomy is less marked than in mammals, the dependence of DPOAE phase on primary ratio is similar to that in mammals, with clear place- and wave-fixed components. In geckos, with no BM traveling wave, more similarities with the DPOAEs of mammals have been found above than below 1 kHz, as for the latter frequencies, no evidence exists for the distortion versus reflection classification. In frogs, with no flexible BM, no wave-fixed behavior has been found (15), and the question is open whether a traveling wave could still exist along another flexible structure. Solving these issues would help to establish what minimum architecture leads to a physiological complexity comparable to mammalian DPOAEs, which in the previous sections had been explained mainly in terms of bidirectional wave propagation from distributed places, with little reference to anatomy.

F. Spectral Characteristics and Growth

From an engineering standpoint, the spectral pattern of distortions coming out of any nonlinear black box provides specific insights into its distorting mechanism, nonlinear transfer function, and OP. A caveat for applying this approach to the cochlea is the requirement that DPs come directly from one emission place, which we now know is at best an approximation. Nonetheless, it reveals several relevant features.

At small enough input amplitudes, a nonlinear transfer function can be developed as a Taylor power series for which the coefficient of the nth power term is the nth derivative of the transfer function evaluated at the OP and divided by n! (For a linear system, only the first derivative differs from 0 as the transfer function is a straight line.) Input signals considered throughout this section are of the form A1 sin(2πf1t) + A2 sin(2πf2t). In the power series, terms such as A1²A2 × sin²(2πf1t) × sin(2πf2t) show up and, with the help of some elementary trigonometric formulas, can be converted into sums of terms corresponding to all combination tones, among which sin 2π(2f1 − f2)t, the 2f1-f2 DP (its cubic nature, mentioned in sect. IIIA, reveals itself in its coefficient A1²A2, proportional to A³ in the commonest situation with A1 = A2 = A). The resulting expression of the amplitude at 2f1-f2, for example, is far from simple, except its lowest-order term (62, 66)

A_DP ≈ A1²A2 [3/4 a3 + 5/8 a5 (2A1² + 3A2²) + 105/64 a7 (A1⁴ + 4A1²A2² + 2A2⁴) + . . .]   (4)

A similar treatment of all terms of the Taylor series predicts the distortion spectrum as a series of equidistant peaks at intermodulation frequencies. Indeed, this is observed in physiological responses from auditory sensory organs (230), the largest DPs being odd-order ones at (n + 1)f1 − nf2 (lower band) and (n + 1)f2 − nf1 (upper band) (n = 1, 2, . . .)
surrounding the main two peaks at primary frequencies. The higher the order (defined by p = 2n + 1), the smaller the DP, as the terms in the Taylor series (assuming small amplitudes) get increasingly smaller with increasing exponent. At increasing primary levels, the observed tentlike pattern (FIGURE 11) changes little, as all DP peaks grow at approximately the same rate of about a decibel per decibel increase of the stimuli. In linear physical units, the amplitude of the pth-order DP grows as the amplitude of the input to the nonlinear system to the power p (thus at p dB per dB increase in input). Other features are not generic but specific to the distorting system, and here, to its physiology, as already seen for the slow average growth of DPs with SPL, which betrays the compressive amplification (see sect. IIA; Ref. 79). Another specific property of physiological DPs is that even-order components, such as the quadratic DP at f2-f1, are smaller and more labile in many species (37). In contrast to odd-order DPs that depend on asymmetric components of the nonlinear transfer function of the MET channels (a3, a5, a7, etc., in Equation 4), even-order ones rely on symmetric components (70). The local symmetry of the MET transfer function, and the exact content in odd- and even-order terms of its Taylor expansion, depend on the OP position. It is thus expected that displacements of the OP influence the whole series of DPs in a coordinated manner. With the assumption that the OP moves along a second-order Boltzmann function, it was modeled that when the OP is at the inflection point where the steepest slope ensures maximum cochlear amplification (at low sound levels), the even-order f2-f1 DP is minimum whereas the 2f1-f2 is large (70). This provides one explanation of the predominance of odd-order components in usual DPOAE spectra (FIGURE 2). However, as soon as the OP moves away from the inflection point, the predicted DP pattern dramatically changes, with a sharp increase of the f2-f1 DP, together with the 2f1 and 2f2 (even-order) harmonics, whereas the 2f1-f2 DP sharply decreases in amplitude. Larger OP shifts lead to reciprocal changes of the even- versus odd-order DPs (70). Experimentally, low-frequency biasing of the cochlear partition has been used as a tool for displacing the OP along the cochlear transducer function and reconstructing its shape (21, 29). Other experiments have exploited the changes in the balance between odd- versus even-order DPs to monitor endolymph volume disturbances produced by direct fluid injection in scala media (253). The outcome of these manipulations, in terms of complex combinations of 2f1-f2 versus f2-f1 DP changes, was similar to that of former experiments attempting to modulate the cochlear amplifier by decreasing the endocochlear potential using furosemide, a loop diuretic (180, 187, 253), which suggests that volume-induced OP shifts occurred in such circumstances.

Even-order DPOAEs may also provide a convenient tool for monitoring the medial olivocochlear efferents, a neuronal pathway that innervates the OHCs via cholinergic synapses. Physiological activation of these efferents by the presentation of sound in either ear produces a decrease in the gain of the cochlear amplifier, likely beneficial in the presence of background noise (83). By affecting OHC function, efferent activity influences DPOAEs, which can therefore serve for assaying the efferent control (40, 131, 187, 211). The 2f1-f2 DPOAE responds to contralateral noise by a disappointingly small decrease in amplitude, but the initially weak f2-f1 DPOAE component can almost reach the level of its 10-dB-larger 2f1-f2 counterpart when medial efferents are activated by low noise levels (30 dB SPL) (3). As, in this experiment, efferent activation also strongly modified the phase of the changes exerted by a biasing infrasound, a common contribution of OP shift in both paradigms was suggested (3). The changes in even- versus odd-order DPOAEs after continuous low-level sound stimulation, noticed long ago, had already been attributed to efferent activity (25).
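The preceding points can be sketched numerically with a toy transducer (this is not the model of Ref. 70; the Boltzmann parameters, input amplitudes, and operating-point shift below are arbitrary): a memoryless saturating function driven by two tones reproduces the equidistant intermodulation series, and shifting its operating point changes the balance between the odd-order 2f1-f2 and the even-order f2-f1 components.

```python
# Minimal sketch: two-tone intermodulation through a saturating transducer,
# evaluated at two arbitrary operating points (OP). Illustrative only.
import numpy as np

fs = 100_000                          # sampling rate, Hz
t = np.arange(0, 1.0, 1 / fs)         # 1 s of signal
f1, f2 = 4000.0, 4800.0               # primary frequencies, Hz (f2/f1 = 1.2)
x = 0.3 * np.sin(2 * np.pi * f1 * t) + 0.3 * np.sin(2 * np.pi * f2 * t)

def transducer(x, op):
    """Toy second-order Boltzmann-like saturating function; op shifts the OP."""
    return 1.0 / (1.0 + np.exp(-4.0 * (x + op)) * (1.0 + np.exp(-2.0 * (x + op))))

def component_db(y, f):
    """Magnitude (dB, arbitrary reference) of the spectral component nearest f."""
    spec = np.abs(np.fft.rfft((y - y.mean()) * np.hanning(len(y))))
    freqs = np.fft.rfftfreq(len(y), 1 / fs)
    return 20 * np.log10(spec[np.argmin(np.abs(freqs - f))])

for op in (0.0, 0.4):                 # two arbitrary operating points
    y = transducer(x, op)
    ref = component_db(y, f1)
    print(f"OP shift {op:.1f}: 2f1-f2 = {component_db(y, 2 * f1 - f2) - ref:5.1f} dB, "
          f"f2-f1 = {component_db(y, f2 - f1) - ref:5.1f} dB (re f1 component)")
```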
G. Nonmonotonic DPOAE Changes With Stimulus Parameters

Despite the prediction, in the previous section, of a simple growth of DPs with SPL, real-life data disclose conspicuously nonmonotonic DPOAE growth functions, with deep minima (notches) that separate two portions with different slopes (66, 152, 173, 180, 292). Such notches are consistently observed in small mammals for primary levels around 60–70 dB SPL, at many combinations of primary frequencies (179, 292), but they are less frequent in human ears (291). A single measurement at the notch could lead a naive experimenter to conclude that absent DPOAEs pinpoint damaged OHCs, whereas a slight arbitrary shift of stimulus level would have produced different results. The theoretical background of these notches, therefore, has to be explained; otherwise, one would have to deny the practical validity of DP measurements.

A first generic explanation of a notch has been brought forward (179), drawing on the characteristic 180° phase change observed whenever a nulling happens (152, 180, 182, 292) (FIGURE 15). This pattern suggests interference between two DPs of opposite phases coming from two different sites or mechanisms. When they happen to have equal amplitudes, cancellation happens. On one side of the notch, one mechanism dominates, while on the other side, the other mechanism takes over, and their physiological meanings may be totally different. The observation of different slopes on both sides of a notch has suggested that the DPs from active compressive mechanisms, the ones discussed so far, dominate at lower levels, while at high levels, passive DP sources with a steep 3 dB/dB growth would take over. Their lack of vulnerability to cochlear damage, at a time when the low-level, slow-growing DPs have vanished, might point to their being valueless. Actually, there is mounting evidence that the nonlinear sources of slow- and fast-growing DPs likely are the same MET channels of the same OHCs. When impaired, these OHCs no longer exert compression, so that the DPs that they produce are only the fast-growing ones (9, 183). The reported lack of vulnerability depends on what means were used to "silence" OHCs. Loop diuretics or transient hypoxia, for example, by decreasing the endocochlear potential without affecting OHC structures, kill low-level DPs without affecting high-level ones (181, 292). Conversely, factors destroying OHCs and their MET channels [aminoglycosides (28, 292); genetic defects (33, 96); excessive noise (9)] readily remove any trace of the "passive" DP component, right where OHCs have been damaged, while in the same cochlea, places where OHCs survive in an inactive state still produce DPs at high stimulus levels (9).
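The two-source interference explanation of growth-function notches can be sketched numerically (illustrative slopes and offsets only, chosen here so that the notch falls near 70 dB SPL; nothing is fitted to data): a slow-growing component and a steeply growing component of opposite phase sum to a growth function with a deep notch and a half-cycle phase jump where their magnitudes cross, as in FIGURE 15.

```python
# Minimal sketch of a notch produced by two out-of-phase DP components with
# different growth slopes (arbitrary parameters, not measured values).
import numpy as np

L = np.arange(20.0, 91.0, 1.0)                # primary level, dB SPL
slow = 10 ** ((0.3 * L - 5.0) / 20)           # ~0.3 dB/dB "active" component
fast = 10 ** ((3.0 * L - 195.0) / 20)         # ~3 dB/dB "passive" component
total = slow - fast                           # opposite phases: vector sum is a difference

level_db = 20 * np.log10(np.abs(total))
phase_cycles = np.where(total >= 0, 0.0, 0.5) # 0 below the notch, half a cycle above
notch_level = L[np.argmin(level_db)]
print(f"notch near {notch_level:.0f} dB SPL; phase jumps from "
      f"{phase_cycles[L < notch_level][-1]} to {phase_cycles[L > notch_level][0]} cycles")
```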
In hair-bundle preparations from the bullfrog's sacculus (14), the DPs produced when a hair bundle is passive, either when stimulated far from its intrinsic resonance or at high level, keep providing specific information on the function of MET channels, even though the hallmarks of activity have been lost. In the more complex situation of the cochlea in vivo, the possibility that significant DP contributions come from basal OHCs stimulated off-resonance is now strongly substantiated (170).

FIGURE 15. Dependence of the 2f1-f2 DPOAE level (A) and phase (B) on primary levels in the ear of a guinea pig before (thick line) and at varying times (as indicated in the inset) after injection of furosemide, a loop diuretic which, by decreasing the endocochlear potential, affects cochlear amplification. A deep notch splits the initial growth function into a shallower low-level segment and a steeper high-level segment (A), with a half-cycle difference in phase between the two segments (B). Furosemide essentially affects the low-level segment, as if two different sources of DPOAEs were combined, an active one at low levels, vulnerable to furosemide, and a passive one at higher levels and with a steeper growth, not physiologically vulnerable (180). Their combination would generate a notch when their levels are similar because they happen to be out of phase. An alternative explanation assumes a single nonlinear transfer function with changing gain and operating point. This function would generate both segments and even predict the observed shift (ΔL) in notch position with decreasing amplifier gain. [From Lukashkin et al. (152). Copyright 2002, with permission from Acoustical Society of America.]

Nevertheless, an alternative generic explanation of growth-function notches has taken shape, implying a single compressive nonlinearity and a changing level (66, 153). Several mathematical models of nonlinear transfer functions have been tested for their influential parameters, in search of signatures of a single-site mechanism producing a notched pattern. The origin of nulls has been interpreted in terms of the cancellation of terms in the Taylor series describing the nonlinear transfer function, due to the mixing of an odd-order nonlinear term with the next highest odd-order nonlinear term (66). In this framework, nulling turned out to be a commonplace result of the nonlinear nature of a transfer function.
Several properties of this function (even symmetry, degree of asymmetry, severity of clipping) could equally easily produce nulling, while a mere shift of the OP could profoundly alter the ability to produce it (TABLE 2).

Table 2. Possible (not mutually exclusive) explanations for the presence of notches in DPOAE (2f1-f2) growth functions and of microstructure in DPOAE versus frequency plots

Explanation (Reference Nos.):
Two emission principles, one active and vulnerable, one passive and robust (180, 292)
Two emission sites, one near f2 or basal to it (distortion), one at the place tuned to 2f1-f2 (coherent reflection), and/or two propagation modes with different phase rotations (133, 170, 246)
One generation site, two propagation modes at different speeds along BM and Reissner's membrane (217)
One generation site, one propagation mechanism, notch due to a change in saturation on the transfer function at increasing level (152)
One generation site, one propagation mechanism, shift in OP (70)
Retarded action of an AGC on each primary, generating two spectral series, and interference between them (271)
Single hair bundle in its passive (off-resonance) regime, with a gating compliance strong enough to elicit negative stiffness (14)
DPOAE, distortion product otoacoustic emission; BM, basilar membrane; AGC, automatic gain control.

Small changes in the frequencies of stimuli, not only their levels, can also lead to large changes in DPOAE levels. A so-called fine structure shows up in the DP grams (DP vs. frequency plots) of human ears, consisting of up to 20-dB-deep troughs (92) (FIGURE 16) that modulate the overall level-versus-frequency pattern in a quasi-periodic manner. Notches are more densely packed in human ears than in other species, about 1/10th of an octave apart (92). It appears that their density is inversely proportional to DP time delay, of the order of 10 ms around 1 kHz in humans. The fine structure persists in an interval of primary levels over 20 dB wide (FIGURE 16, A VERSUS B). Its acknowledged explanation is an already familiar one (see sect. VE), building on the existence of the two discrete DP contributions from the nonlinear and coherent-reflection sites (111, 133, 263). As their relative phases rotate with frequency, an interference pattern of regularly spaced nulls is created in the DP gram. The lack of fine structure in small mammals comes with the lack of a reflection component from the 2f1-f2 place, inferred in Reference 167 from the failure of an interfering third tone (see APPENDIX III) to exert any change in the DP pattern when set near 2f1-f2, where it was meant to suppress the reflection mechanism. Human ears contrast with rodent ears in that no notch in growth function is observed around 60–70 dB SPL primary levels; nonetheless, another type of notch in the growth function has been reported in close relation to the commonly observed deep fine structure (92). These notches correspond to the fact that the fine structure tends to shift in frequency when the primary level changes, so that the DP grams at different primary levels are not parallel. At fixed frequency, this generates irregular growth functions with dips around 50 dB SPL, with little difference between the slopes below and above a dip (92). For human DPOAEs, growth notches and fine structure thus rest on a common background, the distortion versus coherent reflection dualism.

FIGURE 16. The high-resolution plot of DPOAE amplitude against frequency at fixed stimulus level and frequency ratio commonly displays pseudo-periodic notches, 5–10 dB deep (example in one human ear; A: high-level stimuli, f1 = 70 dB SPL and f2 = 60 dB SPL; B: low levels, f1 = 55 dB SPL and f2 = 40 dB SPL). Salicylate intake, which decreases cochlear amplification by affecting the electromotility of OHCs, affects the depth of notches and shifts their frequency (open circles: initial measurement; dashed and dotted black lines: measurements at various times during salicylate treatment; closed circles: post-control showing the stability of DPOAE fine structure). [From Rao and Long (213). Copyright 2011, with permission from Acoustical Society of America.]
In rodents, the explanation of growth notches is more contentious. To decide whether the notch in a DPOAE growth function is due to the interference between two contributions from different sites (not necessarily the coherent-reflection one; other possibilities have been invoked, Ref. 170) or to the dynamic behavior of a single nonlinear source, a suitable experimental test would be to perform well-defined local damage, e.g., to OHCs in the basal cochlea, and examine whether it affects the notches produced by primary frequencies with more apical CF locations. Alternatively, DPs produced by the same stimuli could be measured at different places to check whether a notch is stable at all locations, in which case it would be an inherent part of a single nonlinearity. The two-site model predicts that the notch exists because of a particular 180° phase difference, which would not hold if the measurement site moved closer to one site and further from the other one. The latter test was performed in a few guinea pigs as a complement of the protocol of Reference 13, the 2f1-f2 signal being measured in the ear canal as a DPOAE and in the perilymph of SV1 as a pressure wave, and it revealed that the notch in the DP growth function was present at both measurement sites (P. Avan, P. Magnan, and A. Dancer, unpublished data). In contrast, in the cat, the stable dip visible in the DPOAE fine structure at one f2/f1 ratio vanished when the intracochlear DP was noninvasively assayed using a third tone as a calibration probe (245), in support of a two-mechanism model. This discrepancy, yet to be explored, suggests that indeed, notches in DPOAE plots may arise from different mechanisms, among those theoretically available (TABLE 2), and thus bear different significances.

H. DPOAEs: Their Timing and Backward Propagation

Evaluation of cochlear tuning is an important challenge in audiology, as decreased tuning may account for impaired detection of sounds in noisy environments. Direct measurements of tuning on the BM or in single-neuron responses are highly invasive, thus prone to damage the system. Behavioral measurements of tuning (psychophysical tuning curves or notched-noise masking experiments; Ref. 184) are time-consuming even in experienced subjects. Indirect objective measurements are suggested by the rule that the more narrowly tuned a filter, the longer the delay of its response. The cochlea can be considered as a filterbank, which led Shera (248) to analyze how cochlear tuning, cochlear filter delay, and OAE delay covary. Thus the topic of this section, i.e., how to measure the timing of OAEs, might provide reliable information about the other two. A thorny issue is that filter delay is only one part of OAE delay, beside the time traveling waves need to reach the nonlinear place(s) and the backward propagation time(s) of OAEs from the generation place(s) to the recording site in the ear canal (FIGURE 17). This is why so much energy had to be devoted to the first task of identifying the sites where OAEs are produced and their propagation pathways to the detection system. Strong evidence is now available concerning the existence of distributed sites of DPOAE production, all along the sites where primary envelopes overlap and not only where they peak. One direct backward pathway from the nonlinear generation sites and one forward-then-backward pathway ascribed to coherent reflection have been described, both influenced by the f2/f1 ratio (see sect. VE). Noninvasive measurements of OAEs only provide access to their round-trip travel time, and this indirectly, except when viewed in the time domain using special averaging techniques (293).
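As a hedged aside linking the fine structure of the previous section to the delays discussed here (only the delay and spacing already quoted in the text are used; no new data), two interfering DP components whose relative phase rotates through a round-trip delay τ produce adjacent nulls in the DP gram separated by

\[ \Delta f \;\approx\; \frac{1}{\tau}, \qquad \tau \approx 10\ \mathrm{ms} \;\Rightarrow\; \Delta f \approx 100\ \mathrm{Hz}, \]

that is, of the order of a tenth of an octave near 1 kHz, in line with the human notch spacing reported above; equivalently, the round-trip group delay itself can be read from the slope of DP phase against frequency, τ ≈ −(1/2π)(dφ/df).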
The phase plot of DPOAEs evoked by a fixed tone and a swept-frequency tone is approximately linear, and from its negative slope, the DPOAE group delay in the ear canal can be inferred (see sect. IVF). The filter delay is the DPOAE group delay minus the travel times of the stimulus to the generation place and of OAEs along their propagation path(s). The forward travel time is given by the onset delay of the BM response (231). The mainstream hypothesis is that DPs reach the stapes via retrograde BM traveling waves, by the same mechanism and at the same speed as the forward traveling wave, in which case DPOAE delay = filter delay + 2 onset delays. In a thorough analysis of available data, Ruggero (231) listed all possible alternatives to this view, among which that whereby OAEs propagate via very fast acoustic compression waves in the cochlear fluids, implying a different prediction: DPOAE delay = filter delay + onset delay. For DPs to move the stapes and produce DPOAEs, the important parameter is the pressure difference between stapes footplate and round window, not BM motion as was the case for sensory cells, and a compression wave might achieve pressure transmission efficiently (296). The stapes footplate of mammals is far from the basal BM, so that to reach the oval window, DPs have to travel through the bulk of SV fluid at least for the last stage of their journey (219).

A first observation is that the formula giving the filter delay combines several components, and cumulative errors can lead to largely erroneous estimations. The most recent experimental attempt at improving the accuracy of OAE delay measurements (176) used a modified emission technique designed for emphasizing the contribution of the return journey to the emission delay. Partial suppression of the SFOAE at frequency f2 was produced by a complex of closely spaced low-frequency tones around frequency f1, inspired from Reference 192. The resulting modulation of the f2 SFOAE is nothing but intermodulation, which thus translates into DPOAEs consisting of a series of spectral components, at f2 plus the frequency separation between the components of the f1 complex. Frequency f1 itself bears no influence on this series and thus can be shifted at will, for instance set very low. Then, the forward journey of an f1 tone to the place tuned to f2, where OAEs emerge, is shortened to a minimum, and the OAE phase behavior is dominated by the reverse travel of the OAE complex. The authors found a reverse travel time of ~400 μs, or 2.5 cycles (176), seemingly shorter than the forward travel time of a BM wave at the same frequency, estimated at 900 μs (223).

FIGURE 17. Main contributions to the delays of cochlear responses from different locations and mechanisms, when the cochlea is excited by a pair of pure tones at frequencies f1 and f2 (vibration patterns in blue and red, respectively; delays, likely similar at f1 and f2, in violet). The 2f1-f2 DPOAE propagates both forward and backward via different putative mechanisms associated with different speeds and delays (vibration patterns and delays in green); the delays labeled in the figure are τforward + filter, τbackward-BM, τback-compression, τbasal, and τstapes-microphone. Upward-pointing arrows mark measurement sites between which propagation delays can be evaluated. The basal site, from which the propagation delay to the stapes is τbasal, actually corresponds to multiple locations, likely broadly distributed between the stapes and the generation site tuned to f2.
Straight horizontal lines indicate propagation via a fast compression wave, and undulating ones, via a slow transverse traveling wave.

As indirect measurements are inconclusive even when confounding factors are kept to a minimum, invasive experiments are required to definitely establish by which mechanism DPOAEs reach the stapes to exit the cochlea: slow backward traveling waves along the BM, fast compression waves through SV fluid, or alternative paths such as that recently suggested along the Reissner's membrane (217). The common goal of the scarcely published experiments was to assess the timing of DPs along their way, from comparisons of phase data collected at different spots along the cochlear partition. Three different experimental protocols were used. In the first one (13), the authors inserted a sensitive piezoresistive hydrophone in the SVs and STs of the first and second turns of the guinea pig cochlea. A second group (219) performed interferometric measurements of gerbil BM motion at different places extending for ~1 mm in the longitudinal direction (93) and of stapes motion (94). Last, spatial variations in intracochlear pressure were measured with the help of a fiber-optic pressure detector at different places close to the BM in ST in the basal turn of gerbil cochleas (53). All investigators measured DPs in response to intermediate-level primaries (40–70 dB SPL) in cochleas with preserved sensitivity, with the thresholds of neural responses after the DP measurements being within a few decibels of their initial values. What sparked the controversy were the consistent findings by Ren and co-workers (94), with continually improving experimental setups, of DPs reaching the stapes earlier than any other BM measurement spot in the cochlea (FIGURE 18A). Measurements at different longitudinal places of the BM, thought to be well basal to the place where the DP was allegedly generated (the f2 CF place), should have revealed a backward traveling wave with an increasing phase lag toward the stapes. Instead of that, systematically increasing phase lags were found at increasing distances from the stapes, suggesting that on the BM, DP waves traveled in the forward direction and not the reverse one. This led Ren to propose that the DPs reach the stapes and round window almost instantaneously via a compression wave through the fluid, and that from there, the asymmetric motion of cochlear windows due to their difference in impedance generates a forward traveling wave along the BM.

FIGURE 18. A: delay between stapes and BM responses to a DP at one basal cochlear site computed from phase measurements, suggesting that the BM response lags, as if the stapes were reached almost instantaneously by the DP via a compression wave, before the DP started traveling forward as a transverse wave. [From He et al. (94). Copyright 2010, with permission from Elsevier.] B: DP pressure against DP frequency in the ST of one preparation, at two distances from the BM (thick line, 37 μm; thin line, 117 μm). The DP amplitude drops off by >5 dB with increasing distance from the BM, suggesting a dominant traveling-wave mode close to the BM, as a compression mode should fill up the ST. [From Dong and Olson (53). Copyright 2008, with permission from Acoustical Society of America.]
C: phase difference between pressures at the DP frequency in SV of the second and first turn of guinea pig cochleas (SV2 vs. SV1), against DP frequency. For f2 = 5 kHz, the average difference is 0 (dashed vertical line) as the DPOAE site of generation falls midway between the two measurement points. A propagation speed of 40 m/s from apex to base is inferred, in line with published interferometric BM measurements at similar frequencies, i.e., between 20 and 100 m/s (227). D: the DP level in SV displays little spatial variation and matches the DPOAE level in the ear canal when taking into account the reverse middle-ear transfer function. [From Avan et al. (13). Copyright 1998, with permission from John Wiley & Sons.]

In the other experiments using a fiber-optic pressure sensor (53), the technique, although requiring stimulus levels ≥70 dB SPL, allowed physiologically vulnerable DPs to be mapped near the BM at one cochlear location approximately tuned to 20 kHz. The complexity of DP patterns confirmed the combination of basally produced DPs traveling forward and apically produced DPs going backward. The fact that the DP level fell off when the tip of the pressure sensor was moved away from the BM by a few hundred microns (FIGURE 18B) revealed the existence of a slow DP traveling wave along the BM, without precluding a fast DP pressure wave near the noise floor of the system, i.e., 55–60 dB SPL. Previous data with the same setup (54) had shown evidence of a fast, spatially invariant compression wave dominating at >100 μm from the BM, whereas the slow backward propagating wave took over at closer distances. Most of Ren's findings were thus confirmed including, in a number of configurations, DPOAEs leading their corresponding BM DPs in phase. The interpretation clashed with Ren's in terms of which DP wave was most relevant, as the larger slow wave near the BM was emphasized. The phase lead of DPOAEs was explained, not in terms of the stapes vibrating at the DP frequency before the BM, but by assuming the combination of a forward-traveling DP produced basal to the measurement point and propagating toward its CF location, and of a flat-phased DPOAE obeying scaling symmetry, i.e., the wave-fixed component of Kemp. The fiber-optic work, though showing DP waves traveling on the BM in the reverse direction, could not assess which, these or the fast compression waves away from the BM, dominate in the external ear canal. Another experiment (13) addressed this issue, by measuring the properties of the pressure wave far from the BM, near the bony wall of the cochlea, both in SV and ST in turns 1 and 2 of the guinea pig cochlea (sites SV1 and SV2, tuned to 9 and 3 kHz, respectively). The single pressure detector could be moved back and forth without perturbing cochlear sensitivity and ear-canal DPOAEs, and its sensitivity allowed stimuli at 50 and 60 dB SPL to be used for generating DPs at 2f1-f2 well above the noise of the detector. Phases were compared between the two measurement sites for stimulus frequencies between 18 and 1.5 kHz (FIGURE 18C). The DPs, larger in SV than in ST by >10 dB, were comparable in magnitude (55–60 dB SPL in SV) in the first and second turns, within a few decibels on average (FIGURE 18D).
Their phase difference between SV2 and SV1 was about +90° for f2 = 3 kHz, i.e., with the main DP generation site near SV2; thus the DP pressure wave took ~160 μs to travel the 6 mm from its generation site toward SV1, at an average speed of ~36 m/s. For comparison, the pressure difference between SV and ST at stimulus frequencies traveled from SV1 to SV2 at ~40 m/s. As for levels, as the reverse middle-ear transfer function applies a flat 35-dB loss (52, 157), the 55- to 60-dB SPL level of the DP wave filling up the SV correctly matched the DPOAE level in the ear canal, i.e., 22 dB SPL. At f2 = 9 kHz, with the dominant DP generation site near SV1, the DP wave at SV2 lagged by 90°, as expected from a DP wave traveling apically from its generation place. This experimental setup reveals a clearly shorter backward travel time than the one derived from OAE studies, including recent ones (160 vs. 400 μs; Ref. 176). It conclusively indicates that the measured pressure throughout SV is the relevant signal at the origin of DPOAEs beyond the stapes footplate, thus addressing a question left unanswered in Reference 53. Nonetheless, it contradicts the compression wave theory in several respects. In the first place, 40 m/s is 40 times slower than the speed of sound in water, and actually comparable to the average speed of transverse traveling waves at the cochlear base (227). Second, a compression wave should spread through the whole cochlea and produce, almost instantly, similar pressures in SV and ST, whereas the finding of a >10-dB difference suggests that the BM, acting as a high-impedance device, was coupled to the SV pressure wave, as would a membrane serving as support for a backward traveling wave. The relationships between the space-filling pressure wave in SV measured in Reference 13 near the bony wall and the pressure measured in Reference 53 near the BM on its ST side, at two places where the wavelengths were clearly very different, remain to be clarified, i.e., how do sound waves propagate between these two places, and how fast?
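The arithmetic behind the speed estimate above can be spelled out (only the figures already quoted from Ref. 13 are used): a phase difference Δφ read at the DP frequency converts to a delay and a speed as

\[ \Delta t = \frac{\Delta\phi}{360^{\circ}} \cdot \frac{1}{f_{DP}}, \qquad v = \frac{d}{\Delta t}, \]

so that Δφ ≈ 90° (a quarter of a cycle), corresponding to Δt ≈ 160 μs over the d ≈ 6 mm separating SV2 from SV1, gives v ≈ 37 m/s, in line with the 36-40 m/s quoted above and far below the ~1,500 m/s expected of a compression (sound) wave in water.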
Going back to data modeling, the most thought-provoking finding of Ren is undisputed, the negative slope of phase data. As the initial explanation, i.e., that the DPs reached the oval window ahead of the BM responses, has lost ground, this negative slope calls for an alternative explanation. A recently proposed one (250) rests on the outcome of the so-called Allen-Fahey paradigm (6), invoked by Ren in support of the compression-wave theory (222). This consists in keeping constant, not only the DP CF site, but also the DP level at this site by monitoring the responses of auditory neurons tuned to the DP frequency. The ensuing measurements of DPOAEs under different combinations of stimulus frequencies and ratio f2/f1 reveal a strong influence of the latter parameter. At f2/f1 ≈ 1, wavelets from distributed sources recombine much more strongly toward the DP CF site than toward the stapes. This can be explained only if wavelet phases undergo the large rotations that come with slow-wave retro-propagation (see sect. VD). A hybrid model might solve the conundrum by assuming DP source coupling into both slow and fast waves, with the DPOAE arising predominantly by fast-wave coupling, and the DP at its CF location by slow-wave coupling. The simplest class of hybrid models fails to improve the accuracy of predictions of the Allen-Fahey experiment. More complex models, which remain to be designed, are criticized on the grounds of suspiciously ad hoc requirements, such as the need to radiate similar amounts of energy into the fast and slow waves (250). Nevertheless, both fast and slow waves have been detected and their interferences produced notches (54), which suggests that they do radiate similar amounts. Another recent theoretical approach (254) examines several conventional models of sound propagation in the cochlea, which consistently predict that slow reverse-traveling transverse waves can produce negative phase slopes if two conditions are met: a broad enough distribution of DP sites of generation and a reasonably large stapes reflectivity. Even more recently, application of a two-dimensional nonlinear hydrodynamic cochlea model (283) to the analysis of Ren's experiments suggested that the DP generation sites were well basal to the measurement sites, thus reconciling the negative phase slopes with the possibility of slow-wave backward propagation. The distribution of generation sites basal to the f2 CF location was estimated at 0.5 mm for f2 = 12 kHz, which emphasizes, once again, two important caveats. What relates DPOAE levels to local cochlear status is not a simple proportionality relationship, as contributions undergoing significant phase rotations within the generation sites interfere in a complex manner; and DPOAE versus lesion maps may not coincide, as DPs basal to local lesions still contribute (170). At present, evidence that the dominant mode of OAE backward propagation is by slow transverse waves along the BM has become overwhelming, supported as it is both by theoretical models and experiments, even though hints of a fast compression mode and, perhaps, of a second slow transverse mode along the Reissner's membrane have been reported. However, this propagation controversy had the positive consequences of fostering innovative experimental setups allowing fine measurements of longitudinal BM responses and intracochlear sound pressures, and of emphasizing the inescapable complexity of DPs. For a model to realistically predict even the simplest outcomes (effects of the f2/f1 ratio, the Allen-Fahey paradigm, suppression patterns), it must acknowledge that DP sites of production span a broad cochlear interval, much broader than the place tuned to f2, and that DPs travel and recombine along intricate paths. Thus the naive idea that a DP gram provides an "undistorted" map of cochlear function no longer safely supports audiological applications. On the other hand, DP grams provide much more useful insights into the intimacy of cochlear micromechanics, for experimenters willing to spend some time designing proper paradigms and deciphering their outcomes.

VI. FROM COMPLEX STIMULI TO NO STIMULUS

A. Three or More Tones

Suppression induced by cochlear nonlinearity provides a simple means for probing DP generation sites. As suppression arises from compressive amplification, which is tuned, a suppressor tone is most efficient near the place tuned to its frequency. Thus, when a signal from an unknown site undergoes strong suppression, this site is likely to be located near the characteristic place of the suppressor. The addition to two DPOAE-generating primaries of a third tone swept in frequency and level allows suppression tuning curves to be built, depicting the suppressor level required to induce a given amount of DPOAE level change, against suppressor frequency (FIGURE 19).
These curves display a sharp tip around f2, indicating that DPOAE suppressibility is tuned. It confirms the sensitivity of the 2f1-f2 DPOAE to cochlear status around f2, the place of maximum interaction between primaries (27, 163, 165). The suppression patterns of DPOAEs also display a fine structure, notably confirming the existence of the secondary DP mechanism (the coherent-reflection one) around the place tuned to 2f1-f2 (72). The suppression paradigm has been refined to explore all possible sources of DPs and all possible nonlinear interactions in the cochlea, and the maps now include the effect of suppression on DPOAE phases (64, 169). The addition of a third tone f3 to a two-tone stimulus (f1, f2) at the input of a nonlinear system also creates a huge increase in complexity by multiplying the possible arrangements of combination tones. Their full analysis is outlined in APPENDIX III. It has revealed the importance of basal generation of DPs, more than one octave above f2, even at moderately loud stimulus levels (103, 168, 169) (FIGURE 19). These results stress the need to take into account the broad distribution of nonlinear sources to correctly explain initially unexpected properties of DPs (see sect. VH for the paradoxical phase lead of DPs at the stapes, for example). Novel combinatorial possibilities that lead to the production of additional tones have been discovered or inferred from three-tone studies, extending the two-mechanism taxonomy of nonlinear versus coherent-reflection contributions (246). The need for faster paradigms, e.g., for collecting phase-gradient data to infer from them cochlear travel times and tuning information, has also inspired attempts at replacing f1, in the (f1, f2) pair of stimuli, by a group of closely spaced tones with carefully chosen spacings (176, 272). A whole new set of DPs has been disclosed in this manner, thus opening a new field of investigations (see APPENDIX III).

FIGURE 19. Interference response areas in a rabbit ear, obtained as color maps of the changes in amplitude (top) and phase (bottom) of a 2f1-f2 DPOAE (primary tones near 3 kHz, vertical arrows, at 60 dB SPL), in the presence of a third interfering tone swept in frequency (horizontal axis) and level (vertical axis). In addition to a first suppression lobe in the (f1, f2) region, a second prominent lobe peaks about an octave higher. The phase change in the high-frequency suppression lobe differs from that in the primary region (dark red vs. blue-green), suggesting the presence of two different DPOAE sources with different phases, each in turn being removed by the interference tone at different places. [From Martin et al. (169). Copyright 1999, with permission from Elsevier.]

B. Impulses

The briefer a sound impulse (a click), the broader its spectral range. Thus a click spreads over most of the BM. An advantage is that the entire cochlea can be probed for its ability to emit OAEs simultaneously by collecting the TEOAEs generated by this sudden excitation (121). The tradeoff is that, while the response of a linear system to a click is a mere superposition of the responses to each spectral component of the click, the click-evoked response of the nonlinear cochlea bears complex relationships to the responses to tones. These already require a careful analysis of complex issues regarding the processing and propagation of sound-induced vibrations in the cochlea (see sects. IV and V), and TEOAEs add two degrees of complexity in relation to nonlinearity and traveling-wave dispersion along the BM, i.e., the fact that the lower-frequency OAE components, being processed more apically, will increasingly lag the higher-frequency ones and spread out in time.
The wide popularity that TEOAEs have gained as an audiological tool, for giving an idea of OHC function over a broad range of frequencies in a single recording, justifies a brief overview of their properties in relation to cochlear nonlinearity and other types of OAEs.

Recordings of TEOAEs in human ears display a complex pattern of oscillations (FIGURE 3) generally lasting 10–20 ms after click onset, with the lowest frequency components tending to fade out the latest. The onset of TEOAEs is generally hidden by the fact that the recording equipment removes the first millisecond, so as to ensure that the processed response cannot be contaminated by stimulus ringing in the external ear canal. This procedure may remove a sizeable part of the short-lived, high-frequency TEOAE components (>6 kHz, typically). The spectrum of TEOAEs thus extends from 6 kHz down to ~0.5–0.7 kHz, below which middle-ear transfer is thought to be unfavorable. Characteristically, the growth of TEOAEs is compressively nonlinear, much less than 1 dB per dB increase in click level, which inspires the design of the so-called nonlinear-artifact rejection protocol. In the original equipment designed by Kemp (122), one block of regularly spaced stimuli is made of three clicks at level L and positive polarity, followed by one click at level L + 10 dB (i.e., three times larger) and with negative polarity. In the averaging process, linearly growing contributions, among which the echo of the stimulus on passive structures, get cancelled, and only the nonlinearly growing part of the TEOAE remains. Another typical feature of TEOAEs, as conspicuous in the time domain as in the spectral one, is an idiosyncratic distribution of beats among higher and lower frequency oscillations, and of rather irregular peaks and troughs in the spectrum (210). Such irregularities suggest that TEOAEs arise from place-fixed perturbations along the BM, as their exact distribution, differing from cochlea to cochlea, would strongly influence how backward-scattered wavelets recombine to form the final TEOAE. Indeed, simple hardware or computational models of banks of filters (300, 313) excited by impulse stimuli produce outputs closely resembling natural TEOAEs once some degree of disarray has been introduced in their frequency-to-place map. Such models also called into question the idea that narrow-band frequency components extracted from a TEOAE could be literally interpreted in terms of round-trip travel time of the corresponding frequency component along the BM (which does not mean that it does not relate to it). First, some models of the cochlea as a travel-less filterbank accurately predict the observed spectral structure and temporal delays of TEOAEs (300), and second, the temporal properties of oscillations generated by filter disarray reflect the way oscillations from neighboring filters, by interfering with one another, produce waxing and waning envelopes, the peak delays of which depend on the frequency differences among filters and not on their central frequencies (313).
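The derived nonlinear averaging described above can be sketched in a few lines (a toy simulation with made-up response shapes and a compressive exponent standing in for the much-less-than-1 dB/dB growth, not Kemp's actual implementation): three clicks at level L and one inverted click three times larger are summed, so that anything growing linearly with the click cancels exactly while a compressively growing, OAE-like component survives.

```python
# Minimal sketch of the "derived nonlinear" click protocol; all waveforms and
# constants are arbitrary illustrative choices.
import numpy as np

rng = np.random.default_rng(0)
n = 512                                           # samples per response window

def ear_response(a):
    """Toy response to a click of amplitude a: a linear stimulus echo plus a
    compressively growing, OAE-like component."""
    t = np.arange(n)
    linear_echo = a * np.exp(-t / 40.0) * np.sin(0.5 * t)
    emission = np.sign(a) * np.abs(a) ** 0.3 * np.exp(-t / 150.0) * np.sin(0.1 * t)
    return linear_echo + emission + 1e-3 * rng.standard_normal(n)

a = 1.0                                           # click amplitude at level L
block = [ear_response(a) for _ in range(3)] + [ear_response(-3.0 * a)]  # ~L+10 dB, inverted
derived_nonlinear = np.sum(block, axis=0)

print("linear echo scale left in the sum :", 3 * a - 3 * a)                # cancels exactly
print("emission scale left in the sum    :", 3 * a ** 0.3 - (3 * a) ** 0.3)  # ~1.61, survives
print("peak of derived-nonlinear trace   :", float(np.abs(derived_nonlinear).max()))
```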
One anticipates that TEOAEs may relate to SFOAEs, in response to every single frequency component of the click, or to DPOAEs, responses to any pair of components of the click (fi, fj) such that nfi − mfj = f, with n and m integers, or to both, which raises the issue of which relationship is closest. As SFOAEs mainly stem from a coherent-reflection mechanism (part IV), whereas DPOAEs result from a mixture of contributions from a nonlinear, wave-fixed mechanism and from (linear) coherent reflection (part V), which mechanism accounts best for TEOAE properties must be examined. The mechanism issue is also important for interpreting the pattern of changes in TEOAEs produced by OHC impairment over some frequency interval along the BM. It has been seen with DPOAEs that the nonlinear mechanism of generation can spread over a broad basal interval where OHC damage may impinge on OAE levels at lower frequencies. The dominant view is that TEOAEs arise mainly from (linear) coherent reflection (112, 246). If TEOAEs are principally made of a mixture of SFOAEs, a close correspondence is expected between frequencies such that OHCs tuned to them are healthy (efficient emitters giving rise to coherent reflection) and frequency bands represented in the TEOAE, and such a correspondence is indeed often reported (210). The structure of TEOAE spectral peaks, highly stable in a given ear, supports the idea that place-fixed irregularities serve as primers for coherent reflection. Furthermore, TEOAE and SFOAE growth functions at fixed frequency, and TEOAE and SFOAE spectra at fixed stimulus intensity, tend to match one another in a given ear (112). The rapid variation of TEOAE phase with frequency and that already reported with SFOAEs closely resemble each other and differ from the nearly frequency-independent phase of DPOAEs (244). Yet, a few discrepancies suggest a more complex picture for the mechanism of TEOAEs. For example, the correlation between the map of healthy OHCs and the TEOAE spectrum is only approximate. After basal OHCs had been selectively exposed to acute damage, TEOAE responses were found to be decreased in lower-frequency intervals, although OHCs tuned to these frequencies had been unaffected (10) (FIGURE 20). Cogent evidence of a nonlinear mechanism other than coherent reflection has been brought about by the finding, in TEOAE responses, of frequency components not present in the stimulus and, thus, produced by a distortion kind of mechanism (305). Such a mechanism predicts a not-so-straightforward mapping of healthy versus damaged cochlear intervals, as indeed observed (10). Insight into the coherent reflection versus distortion contributions (place- versus wave-fixed) was provided by reports that in guinea pigs and humans, with an open-canal recording technique allowing the onset of TEOAEs to be reliably analyzed (301, 303), TEOAE generation appeared to shift from a wave-fixed to a place-fixed mechanism over the time course of the response, with the phase slope shifting with poststimulus onset time from a shallow to a steep pattern.
This framework helps understand the differences between TEOAEs in humans or non-human primates, e.g., in Reference 164, and small laboratory rodents (TABLE 3). In the latter, TEOAEs are smaller in amplitude and much shorter in duration (e.g., Refs. 10, 12, 295). Conversely, DPOAEs usually have much larger levels in small laboratory animals (30–40 dB SPL are common) than in humans (~10 dB SPL). These observations, summarized in TABLE 3, fit the two-mechanism model of Shera (246) if one assumes that the coherent-reflection mechanism is larger in primates than in nonprimate mammals. Indeed, in four species of small rodents, the presence of an interacting tone at the frequency of the DP, meant to suppress the coherent-reflection component, actually does not affect any of the DP properties (167), which suggests a weak or nonexistent coherent-reflection mechanism. In addition to accounting for large rodent DPOAEs made only of the nonlinear wave-fixed component, this observation, combined with the conclusion of Reference 301, explains why the TEOAEs of rodents are small and short-lasting, as only their early, wave-fixed component exists. The longer-latency TEOAEs relying on coherent reflection are strong only in primates. The short latency of TEOAEs in small rodents finds its counterpart in the large frequency spacing of the periodic crests found in SFOAE spectra, as it is inversely proportional to the emission latency.

Table 3. OAE mechanisms in human versus rodent ears, in TEOAEs versus DPOAEs

Distortion mechanism. Human ears: present. Rodent ears: present.
Coherent-reflection mechanism. Human ears: present, yet not strong, −3 to −15 dB relative to the distortion component (1); present in only 3 of 10 ears (167). Rodent ears: absent or weak (167).
DPOAE amplitude. Human ears: small (120, 167), ceiling 15 dB SPL. Rodent ears: large (127, 167), ceilings 35–40 dB SPL.
DPOAE growth notches. Human ears: few, around 50 dB SPL (92). Rodent ears: many, around 60–70 dB SPL (291).
DPOAE fine structure. Human ears: conspicuous (91). Rodent ears: weak or absent (167).
TEOAE duration. Human ears: long (121). Rodent ears: short (10).
TEOAE amplitude. Human ears: large (121). Rodent ears: small (if integrated over the same 20-ms window as for human ears) (10).
TEOAE structure. Human ears: two parts, the early one attributed to distortion, the later one to coherent reflection (303). Rodent ears: only the early (distortion-related) part (10).
TEOAE, transient evoked otoacoustic emission; DPOAE, distortion product otoacoustic emission; SPL, sound pressure level. Reference numbers are given in parentheses.

FIGURE 20. TEOAE amplitude spectrum in the ear of a guinea pig before versus after exposure to a loud high-frequency sound (top part of the graph; dotted line, before exposure; solid line, after exposure) against audiometric changes induced by the exposure [changes in compound action potential (CAP) thresholds between 1 and 30 kHz: inverted triangles]. While CAP threshold changes did not extend below 4 kHz, a decrease in TEOAE (solid line, bottom part of the graph) occurred even in the 1.5- to 3-kHz interval. [From Avan et al. (10). Copyright 1995, with permission from Acoustical Society of America.]

Taking advantage of the sharp onset of the TEOAE-evoking stimulus, the timing of suppression of active mechanisms has been demonstrated directly, which was impossible with other types of OAEs. A click suppressor presented at varying times before the click stimulus (114) produces a TEOAE change depending on postsuppressor delay (90, 279). This change discloses some typical features of an AGC, a system able to adjust its gain according to the input level integrated over the immediate past, and such that the larger the integrated level, the smaller the updated setting of the gain. Experiments revealed the presence of a cochlear AGC acting with a time constant of about two periods, with a controlling slope of ~0.5 dB/dB (90, 279). Its general significance will be examined more closely in section VIIIB.

C. Spontaneous Vibrations

Even without stimulation, the ear can spontaneously emit continuous tones, SOAEs (210). They were objectively detected and identified for the first time by Kemp (120). It was initially argued that SOAEs might be mere filtered noise; however, the statistical properties of the temporal distribution of SOAE amplitudes strongly suggest the existence of a genuine intracochlear source of energy (19).
Many species from all four tetrapod classes have SOAEs, so that the presence of a basilar or tectorial membrane is not critical. These SOAEs are usually made of one or a few narrow spectral peaks, 0.15–40 Hz wide in humans (19, 274), with highly stable frequencies and levels. Apart from rare exceptions (76), a normal cochlear sensitivity is required for SOAEs to be present in humans. However, a sizeable percentage of normally hearing individuals have no SOAE, from 35% in human neonates to ~70% in adults (210). As expected, SOAEs have correlates on the BM (195). In his pioneer work, Gold (77) predicted that spontaneous oscillations would occur in his local regenerative oscillator model, close to the region of instability. However, he wrongly assumed that SOAEs would be objective correlates of tinnitus. His efforts at recording them in tinnitus-complaining subjects failed because tinnitus is a phantom neural percept in relation to peripheral deafferentation due to sensory-cell failure that precludes OAEs. Despite his conceptual error, Gold had made an important point. In the engineering domain, families of limit-cycle oscillators (e.g., the van der Pol oscillator) poised on the verge of auto-oscillation can, in the absence of input other than the inherent noise, generate outputs presenting many highly nonlinear characteristics of SOAEs (104), even though isolated oscillators cannot emulate all SOAE properties (274). A pure tone in the ear canal can enhance or suppress an SOAE at a neighboring frequency, and entrain SOAE frequencies (148). Repeated click stimuli synchronize the existing SOAEs, which show up among the averaged TEOAEs as stable, inordinately long-lasting oscillations. Changes in cochlear status, whether general [aspirin- or temperature-induced (148, 149)] or targeted to a specific frequency interval (exposure to high-level sound), affect the SOAEs by decreasing their level and shifting their frequency, before recovery brings them back to their initial state. Minor changes in the boundary conditions at the stapes, produced by a change in ear-canal, middle-ear, or intracochlear pressure (47), similarly affect SOAEs (FIGURE 21). When the cochlear partition is biased by the application of a loud infrasound, SOAEs exhibit dynamic behaviors consistent with the shifting of the OP of OHCs (20).
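The limit-cycle picture invoked above can be illustrated with a minimal simulation (a generic van der Pol oscillator with arbitrary parameters, not a model of any particular ear): poised just past its instability threshold and driven only by weak noise, it settles on a stable, narrowband oscillation reminiscent of an SOAE.

```python
# Hedged numerical sketch: noise-seeded van der Pol oscillator just past its
# self-oscillation threshold; all parameters are arbitrary illustrative values.
import numpy as np

fs = 50_000.0                     # integration rate, Hz
dt = 1.0 / fs
f0 = 1_200.0                      # preferred frequency, Hz
w0 = 2 * np.pi * f0
mu = 500.0                        # small positive value: weakly unstable, self-oscillating
steps = int(2 * fs)               # 2 s of simulated output

rng = np.random.default_rng(1)
x, v = 0.0, 0.0
out = np.empty(steps)
for i in range(steps):
    # van der Pol: x'' - mu*(1 - x**2)*x' + w0**2 * x = weak noise
    a = mu * (1 - x ** 2) * v - w0 ** 2 * x + 1e3 * rng.standard_normal()
    v += a * dt                    # semi-implicit Euler keeps the oscillator stable
    x += v * dt
    out[i] = x

tail = out[int(fs):]               # discard the first second (transient)
spec = np.abs(np.fft.rfft(tail * np.hanning(tail.size)))
freqs = np.fft.rfftfreq(tail.size, dt)
print(f"spectral peak near {freqs[np.argmax(spec)]:.0f} Hz (expected ~{f0:.0f} Hz)")
```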
Privileged frequency spacings between adjacent SOAEs have been reported (24), similar to the average frequency spacings between the crests in SFOAE and TEOAE spectra (313). All these properties illustrate typical characteristics of cochlear micromechanics, i.e., subtle impedance adaptation for optimal wave propagation; strong nonlinearities adjusting the amplification such that near threshold, when the cochlea amplifies optimally, cochlear oscillators are close to instability; and, as a result of the former two properties, a tendency for coupled oscillating elements to lock to a common frequency. Linear coupling (297, 299), in sharp contrast, always pulls resonance frequencies apart.

FIGURE 21. Frequency spectrum of the sound collected by a microphone in the sealed ear canal of a human subject, initially displaying three peaks (SOAEs) in the 1,150- to 1,350-Hz frequency interval (bottom). Sound recordings were then repeated every 10 min while the subject was moved from an upright to a −20° head-down posture, before being taken back to upright (horizontal arrows on the left mark the times when his body was tilted). Body tilt produces an increase in intracranial pressure that stiffens the boundary between inner and middle ear. This change strongly influenced the SOAE spectrum, as the initial three SOAEs vanished while two novel spectral peaks appeared in turn (vertical arrows, a then b). When the subject went back to his initial posture, the three initial peaks immediately reappeared, yet with amplitudes differing from the initial recordings, the central peak being smaller while its neighbors were larger and narrower. [From de Kleine et al. (47). Copyright 2000, with permission from Acoustical Society of America.]

Two theories of SOAE generation have been proposed. The first one is inspired by the behavior of van der Pol oscillators, and supported by experiments demonstrating that isolated hair bundles can spontaneously oscillate. This striking phenomenon has been described in sensory cells from bullfrog saccules surviving in controlled ionic environments (171). Oscillations relate to the existence of an unstable interval of negative stiffness, calcium-dependent and due to the influence of the gating compliance. Every time a deflected bundle moves toward its resting position, it enters the unstable region and flips back and forth without stopping. If deflection-independent stiffness terms exceed the negative stiffness associated with gating compliance, which is likely the case in mammalian cochleas (277), spontaneous bundle oscillations may no longer occur. An alternative mechanism for SOAE generation is that of cochlear standing-wave resonances. In this view, SOAEs result from multiple internal reflections of forward- and backward-traveling waves between the stapes boundary and a place with disarrayed reflection sites giving rise to coherent reflection. In its passive version, the energy comes either from environmental sounds or from physiological noise, while in the active, prevailing view, the cochlea acts as a biological "laser cavity" maintaining stable standing waves by coherent wave amplification (243).

VII. THE PARTICULAR STATUS OF THE APICAL COCHLEA

Experimental data from the cochlear apex are scarce, mostly from interferometry (43, 85, 125). Little amplification can be observed even in a sensitive cochlea, and the growth function is said to be "linear," a misnomer only referring to the absence of compression.
Broad and symmetrical tuning curves are observed, in stark contrast with the high-frequency pattern of a sharp tip with a steep cutoff on the high-frequency side and a U-shaped low-frequency tail. Nonetheless, WD is definitely present and physiologically vulnerable, and large harmonic components have been recorded from the motion of the apical organ of Corti in healthy guinea pig cochleas (125). A revealing pattern of changes was observed after death, at different spots in the organ of Corti where the probing laser beam was focused. Whereas the motion of reticular lamina connecting the tops of OHC bodies decreased, as intuition would have suggested, BM motion increased almost a hundredfold, with a sharp increase in tuning. It suggests that at variance with the rest of the cochlea, the apex is subject to a different feedback mechanism with a different purpose. Active negative feedback (instead of the positive basal one) exerted by OHCs would tend to decrease BM motion. Fundamental principles of negative feedback are to reduce noise, to stabilize the amplification, and to make the growth of BM motion “linear” even if it was compressive without feedback (124, 125). Feedback manipulations did not affect the size of harmonics, which the authors saw as an indication that their hypothesized feedback loop coupling fluid velocity acting on the BM to the Hensen support cells on which they focused their interferometer, did not include the WDgenerating OHC stereocilia bundles (125). The apical DPs, not only harmonics but also intermodulation products at f2-f1 and 2f1-f2, can be heard, have correlates in the responses of neurons in the auditory midbrain (2), and usefully underpin psychophysical tasks (99, 174, 209). Below some corner frequency, measured at 1.3 kHz in gerbils (2), DPOAEs are smaller than would be expected from the large intracochlear DPs. This might be due to a decrease in middle-ear gain, as happens in human temporal bones (212). Yet, large interspecies differences in the frequency dependence of the middle-ear gain have been reported, perhaps in relation to differences in ear-canal volumes (157, 212), making it difficult to find a general explanation for the issue of low-frequency DPOAE levels. Another model exploring the peculiarities of apical responses has been proposed (216), hypothesizing a ratchet mechanism that strives to explain how frequency selectivity may be achieved in low-frequency auditory-nerve fibers, despite the fact that the resonance frequency of the apical BM is likely higher than the frequencies processed apically. Indeed, predictions from realistic values of mass and stiffness indicate a discrepancy between apical BM resonance and the actual limit of low-frequency hearing. The ratchet model also examines why the steep high-frequency cutoff that characterizes wave propagation along the basal BM is not visible apically. The ratchet mechanism (216) is such that every second half-cycle, OHC electromotility uncouples hair-bundle forces from the BM. The resulting unidirectional coupling is unfavorable to amplification of BM motion, hence its observed lack of significant compressive nonlinearity. The main perceptive interest would be that contrary to BM motion, hair-bundle motion would be tuned, with open and symmetrical tuning curves, as observed in auditory neurons, extending the hearing sensitivity to frequencies well below the low-frequency limit of BM resonance. This ratchet model may well be anatomically grounded. 
The orientation of OHCs in the apical cochlear turn strikingly differs from basal ones in that, instead of standing perpendicular to the BM, their axis is tilted by ~67° relative to the reticular lamina joining the apexes of hair cells (125), which may favor periodical uncoupling of hair bundles from the tectorial membrane.

The specificity of low frequencies affects not only the tuning mechanism, but also wave propagation. A break of scaling symmetry has been pinned down at the cochlear apex (49). Although suppression experiments show that human DPOAEs below 1.5 kHz still come directly backward from the nonlinear site of generation, and not from secondary coherent reflection, they lose their characteristic phase invariance at constant f2/f1, the one underpinning the horizontal-banding pattern of FIGURE 14B. A possible explanation is suggested by the recent model of cochlear mechanics positing two sound propagation modes, the classical one along the BM and another one along Reissner's membrane (217). Although the latter wave may be of uncertain relevance at high frequencies, the model derives from impedance and wavelength considerations that, at low frequencies, it can interact with BM motion and disturb BM resonance and DP propagation.

VIII. MATHEMATICAL MODELS OF NONLINEARITY

The ultimate sources of peripheral auditory nonlinearities are clearly identified as the MET channels of hair cells. Their nonlinear behavior can thus be incorporated in models of cochlear mechanics that attempt to predict realistic characteristics. Many of these models correctly describe the nonlinear compressive amplification and frequency tuning. Auditory WDs are more difficult to predict accurately, as they present distinctive features such as a rate of growth that depends on the passive or active modality at the generation site, and the preeminence of odd-order distortions. These idiosyncrasies were unknown at the time of Helmholtz, when he tried to propose the first mathematical model of auditory nonlinearity (95), which led him to dismiss some of Tartini's observations that contradicted the outcome of his model (see sect. XI: Historical Insert). In the field of cochlear mechanics, modeling efforts endeavor to address a few questions, e.g., what is the most parsimonious model of active resonators in which nonlinear compressive amplification, frequency tuning, and distortions are coupled and present realistic features? Which parameters of this model are critical for accurate predictions? If the cochlea gives sluggish responses to brief excitations, is it due to mere filter delays that come with any battery of highly tuned oscillators, or does it betray the presence of stimulus envelope-extracting devices from which an AGC can be built, with the advantage that its compression is almost WD-free? Last, if a model predicts a correct outcome, does it vindicate the architecture of the model, or can its correct outcome stem from generic considerations that any other model of a similar class would also satisfy?

A. Generic or Specific?

The last decade has witnessed a thriving literature using the Hopf-bifurcation formalism to describe the motion of auditory hair cells or of their stereocilia bundles as critical oscillators poised on the verge of self-oscillation (34, 60, 105).
The advantage of this model lies in its ability to describe many features of auditory stimulus processing by hair cells, compressive amplification and nonlinearity with limited risk of instability, just by introducing a control parameter in the behavior of a mass-stiffness resonator with damping. This control term allows viscous damping to be offset at very low oscillation amplitudes. Damping contains an additional nonlinear term varying as the third power of displacement. Its sharp increase at higher amplitudes reduces the gain of the oscillator and avoids instability. The dynamics of a critical oscillator described by the complex variable Z, in the presence of a force f(t), is described by the differential equation

dZ/dt = −(r + iω₀)Z + B|Z|²Z + (e^{iθ}/Λ) f(t)    (5)

where ω₀ is the resonance radian frequency and the real parameter r controls the proximity to the bifurcation (which happens when r = 0). The multiplicative factor acting on the external force expresses the existence of friction (Λ) and of a phase shift θ at which f acts on Z. The amplitude-dependent B|Z|²Z cubic damping term predicts compressive amplification at a rate of 0.3 dB/dB, consistent with experimental data. By slowing down the decrease of DPs with decreasing stimulus level, this rate also underpins the prediction of essential nonlinearity with the characteristics that Goldstein described long ago. Two-tone suppression, DPOAEs, and many of their characteristics can also be correctly predicted, at least within the approximation of small displacements as a Taylor expansion. Refinements such as the incorporation of a traveling wave for exciting the resonators have been successfully implemented (56, 110).

Other systems of time-dependent differential equations can be built from attempts to model a realistic auditory organ with its BM and OHCs. For example, they can include MET transfer functions as the sole source of nonlinearity, longitudinal coupling between OHCs for introducing some degree of delayed feed-forward, and a combination of hair-bundle dynamics and OHC-body electromotility reconciling the two alternative models of an amplifying mechanism (265). A Hopf bifurcation may even emerge from such systems, without the need for individual oscillators to be critically poised near a bifurcation (265). Modeling the active nonlinear cochlea is still a very open field in which many details remain to be sorted out. To choose the most suitable set of models, the ability to simulate realistic nonlinearities may be a good criterion. It is thus important to establish which characteristics are generic and which are specific to a particular model. For example, does the success of Hopf-bifurcation views of auditory stimulus processing in hair cells, for predicting realistic cochlear nonlinearities and not only the compressive growth, definitely vindicate the underlying model?

Actually, theories published in the 1930-50s in electrical engineering (55, 64) have shown that several properties of a (time-independent) nonlinear system mapped by an input/output sigmoid transfer function automatically derive from a minimum set of hypotheses. Any transfer function can be modeled as a combination of odd- and even-order terms [an odd function is such that f(x) = −f(−x), while f(x) = f(−x) for an even one] that underlie odd and even DPs, respectively.
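To make this odd/even decomposition concrete, here is a minimal numerical sketch (the tanh shape, operating points, and primary frequencies are illustrative assumptions, not taken from any cited model): two tones pass through a saturating, MET-like transfer function; with the operating point at the symmetry point of the sigmoid, only odd-order DPs emerge, and shifting the operating point brings up the even-order ones.

```python
import numpy as np

fs = 100_000                       # sampling rate (Hz)
t = np.arange(0, 1.0, 1 / fs)      # 1 s of signal
f1, f2 = 4000, 4800                # primary frequencies (Hz), f2/f1 = 1.2

x = 0.4 * np.sin(2 * np.pi * f1 * t) + 0.4 * np.sin(2 * np.pi * f2 * t)

def met_like(x, op=0.0):
    """Saturating (sigmoid) transfer function; 'op' shifts the operating point."""
    return np.tanh(x + op) - np.tanh(op)   # subtract the resting output

def dp_level(y, f):
    """Level (dB, arbitrary reference) of the spectral line at frequency f."""
    spec = np.abs(np.fft.rfft(y)) / len(y)
    freqs = np.fft.rfftfreq(len(y), 1 / fs)
    return 20 * np.log10(spec[np.argmin(np.abs(freqs - f))] + 1e-12)

for op in (0.0, 0.5):   # symmetric operating point, then a shifted one
    y = met_like(x, op)
    print(f"operating point {op:+.1f}:",
          f"2f1-f2 {dp_level(y, 2 * f1 - f2):6.1f} dB,",
          f"3f1-2f2 {dp_level(y, 3 * f1 - 2 * f2):6.1f} dB,",
          f"f2-f1 {dp_level(y, f2 - f1):6.1f} dB")
```

The specific levels depend on the chosen sigmoid, but the generic outcome, dominance of odd-order products for a symmetric characteristic and emergence of even-order ones as soon as the operating point leaves the inflection, holds for any such transfer function.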
Experimentally derived voltage/displacement hair-cell transfer functions share with any arbitrary nonlinear function displaying large second and third derivatives many nonspecific properties of their combination tones. For example, the condition allowing two-tone interference to generate maximum DPs is always that the amplitudes of both tones be equal at the input of the nonlinear system, that is, at the interference place (137). The growth of a DP with the amplitude of one primary tone when the amplitude of the second primary is kept constant is also stereotyped. With A1 (amplitude at f1) held constant, the amplitude of the 2f1-f2 DP increases linearly when A2 (at f2) increases from a small value. With A2 constant, the amplitude of the 2f1-f2 DP increases in proportion to A1², reaches a maximum, then decreases. As for suppression, it always requires that the suppressor drive the nonlinear function into its saturation region and that, if the suppressed signal is itself in saturation, the suppressor be no less than 10 dB below it. Not surprisingly, Equation 5 predicts these features equally well. Another intriguing, yet generic property is the occurrence of sharp nulls in the output of the system (TABLE 2). Although sometimes interpreted as proof of interference between sources that would happen to be equal in amplitude and out of phase, they can actually mark transitions between saturations from one to both sides of a nonlinear transfer function (64, 154).

However, modeling issues are far from straightforward because, in addition to local properties of nonlinear oscillators, several other aspects of auditory processing in the cochlea shape the outcome of nonlinearities. Wave mixing from distributed cochlear sources of DPs is one compounding factor. A second one is that the input to the nonlinear transfer function producing cochlear nonlinearities is not the sound in the external ear canal, but what reaches the DP-generating system after propagation, amplification, and compression have occurred. Likewise, a measured DP is what eventually reaches the measurement spot, and after leaving its site of generation, the DP has undergone several modulating interactions, such as filtering at the generation place (tuned to another frequency), absorption at its own CF location, coherent reflection, and reflection at the impedance discontinuity at the cochlea/middle-ear boundary.

B. Instantaneous or Sluggish?

Another intriguing question has been little explored, namely, whether cochlear nonlinearities are static or dynamic (59, 155, 315). Above, this review has largely rested on descriptions of the nonlinearity associated with hair bundle motion as instantaneous, with immediate distortion of the current through the bundle. A dynamic nonlinearity would introduce some degree of sluggishness, due to its dependence on the past history of the system. The most typical example is provided by an AGC, a design that engineers use to obtain compressive amplification, similar to that in the cochlea, while minimizing waveform distortion. It has already been mentioned that the moderate levels of distortion in the ear are striking, in view of the strong nonlinearity of the transfer function of MET channels. In an AGC, the gain is adjusted as a function of the average level of the output from the system. This level is extracted with the help of some low-pass filter, providing an integrated view of the stimulus over a finite interval of recent history. Different integration times define a whole continuum of AGC systems, and even a Hopf-bifurcation system implicitly assumes sluggish gain control (271).
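The following sketch suggests why such a scheme can compress strongly yet distort little (illustrative parameter values, not a cochlear model): a feedback AGC, whose gain is set by the output magnitude low-pass filtered over roughly two periods, is compared with a memoryless (instantaneous) compressor acting on the same tone.

```python
import numpy as np

fs = 50_000
f0 = 1_000                       # probe frequency (Hz)
t = np.arange(0, 0.2, 1 / fs)
tau = 2 / f0                     # AGC integration time: about two periods
alpha = 1 / (fs * tau)           # one-pole low-pass coefficient

def harmonic_dist(y):
    """Level of the 3rd harmonic relative to the fundamental, in dB."""
    spec = np.abs(np.fft.rfft(y * np.hanning(len(y))))
    freqs = np.fft.rfftfreq(len(y), 1 / fs)
    fund = spec[np.argmin(np.abs(freqs - f0))]
    h3 = spec[np.argmin(np.abs(freqs - 3 * f0))]
    return 20 * np.log10(h3 / fund)

def agc(x):
    """Feedback AGC: gain driven by the low-pass-filtered output magnitude."""
    y = np.zeros_like(x)
    env = 0.0
    for n, xn in enumerate(x):
        gain = 1.0 / (1.0 + 3.0 * env)        # larger integrated level -> smaller gain
        y[n] = gain * xn
        env = (1 - alpha) * env + alpha * abs(y[n])
    return y

x = 2.0 * np.sin(2 * np.pi * f0 * t)          # moderately intense tone
print("instantaneous compressor, 3rd harmonic:", round(harmonic_dist(np.tanh(x)), 1), "dB")
print("sluggish AGC, 3rd harmonic:", round(harmonic_dist(agc(x)), 1), "dB")
```

Both arrangements reduce the output, but only the instantaneous one leaves a conspicuous third harmonic; the sluggish gain control keeps the waveform close to a smaller, nearly pure sinusoid, which is the type of behavior the observations discussed next seem to require.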
However, neither sluggishness nor the moderate distortion level suffices to imply the existence of an AGC. Regarding the time issue, any tuned system stores energy and responds to a stimulus with a delay that increases with the degree of tuning. As for the pattern of distortions, their frequencies, particularly those of harmonics and even-order intermodulation, are far enough from the CF of their generation site to be filtered out more than the odd-order intermodulation components, again in proportion to the narrowness of tuning. To probe the existence of an AGC experimentally, more distinctive features must be sought.

Several experimental observations strongly support the presence of a noninstantaneous control of cochlear gain. For example, in the auditory nerve, the period histogram of neuronal discharges in response to a pure tone, up to a few kilohertz, remains a half-wave sinusoid over a broad range of levels (the sinusoid is rectified because there cannot be any action potential when hair bundles move in the inhibitory direction). Yet the growth of the synchronized (i.e., phase-locked) rate of action potentials with stimulus level is highly compressive (74). With a system displaying an instantaneous nonlinearity, even one made sluggish by its tuning, it would be paradoxical that such compression results in a sinusoidal, thus undistorted, period histogram; with an AGC system it would not. Initially, the proposed AGC was a neural mechanism, either a reservoir of neurotransmitter or a possible slow modulation of cochlear gain by an efferent control of OHCs via the medial olivocochlear neural pathway (155). Later, indirect evidence of a cochlear AGC has been produced (315). Electric responses to tone bursts from supporting cells, thought to be quite similar to OHC responses yet easier to record directly, displayed several signatures of an AGC in addition to small WD despite high compression, i.e., asymmetry of the on- and off-transients, and a frequency increase with time during the off-transient, within a modulated envelope. In an AGC, the gain reduction lags the signal onset so that the on-response builds up faster, while the increase in gain after signal termination generates a complex shape in the off-transient.

Direct evidence of the size of the AGC-like delay has emerged from several paradigms. The first one (see sect. VIB) consisted in measuring the timing of TEOAE changes when suppressed by a click emitted a few milliseconds before the regular TEOAE-evoking stimulus (90, 115). A time constant of two periods was inferred. Another protocol (273) studied the timing of 2TS in auditory neurons tuned to a broad interval of frequencies, by evaluating the group delays of their responses. The delay of suppression was larger than the delay of excitation by several hundred microseconds, which suggested an AGC with, again, an integration time of about two periods at the CF location. Last, responses of the BM of guinea pig cochleas to white noise were analyzed by cross-correlating the input noise to the BM velocity response (214), which yields a so-called "first-order Wiener kernel" (see sect. VIIIC, APPENDIX V). In a linear system, this is a standard procedure providing its impulse response.
Comparison of the BM responses to noise and to clicks therefore evaluated how much the cochlea departed from linearity. It was found to depart very little, so that cochlear filtering could be said to be "quasi-linear." This is as surprising as the fact that cochlear responses to tones exhibit so little harmonic distortion (43, 224, 230), despite the high measured compression rate. If cochlear processing worked as a bandpass linear filter combined with an instantaneous nonlinearity, the wave shapes of responses to noise could not be accurately predicted by a linear method. Thus it is not only convenient but also necessary to invoke an AGC in the cochlea.

The existence of an AGC drastically reduces harmonic distortion, but interestingly, without necessarily precluding intermodulation. Assume an AGC with a response time extending over several periods, and an amplitude-modulated signal. Its envelope, extracted by the AGC filter, adjusts the gain. A sine wave is transformed, after a few periods, into a sine wave with smaller amplitude. The amplitude-modulated signal, conversely, gets distorted, as its envelope is flattened by the decrease in gain triggered by the growth of this envelope. This generates DPs, and it can even be shown that these DPs, even when generated at a single location, present a rather realistic microstructure suggestive of interference (this time among terms with different mathematical origins; Ref. 271). Overall, a variety of experimental clues point to the existence of a functional AGC in the cochlea, even though an open issue is the actual identity of the low-pass filter mechanism extracting the average level of the output from the system, to provide an integrated view of the stimulus over about two periods.

C. From Particular Stimuli to a General Characterization

Most published investigations of cochlear function have used only a very restricted set of stimuli, i.e., either pure tones, pairs of pure tones for the specific topic of DPOAEs, or brief impulses. What is straightforward for a linear system, that is, to predict the response of the system to a spectrally complex stimulus from the responses to each spectral line of this stimulus, becomes only approximately true in the case of a nonlinear cochlea. One then has to assume that the prominent nonlinearities of the cochlea are small enough to be treated as mere perturbations of an almost linear cochlea. When using a pair of tones, for example, 2TS is conveniently evaluated by measuring how the amplitude of the response to the probe tone changes in the presence of a second, suppressing tone. Likewise, it is possible to study the interaction of two simultaneous tones at similar levels by detecting the combination tones that their interference generates, and perturbation theory is valid as combination-tone levels are generally found more than 30 dB below primary levels, i.e., they are more than 30 times smaller.

The Wiener-Volterra kernel formalism provides a powerfully general method in strongly nonlinear systems. Briefly presented in APPENDIX IV, it describes the responses of a system as a sum of functionals unfolding the nonlinearities at increasing orders, extracted from cross-correlations between the output of the system and its input stimulus at varying time delays between input and output.
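As a minimal illustration of the lowest-order term of this expansion (a toy linear resonator standing in for the cochlea; all parameter values are arbitrary), cross-correlating a Gaussian white-noise input with the output recovers the impulse response, and any systematic residue beyond estimation noise would betray nonlinear contributions.

```python
import numpy as np

rng = np.random.default_rng(0)
fs = 20_000
n = 500_000                            # long noise record for a clean estimate
x = rng.standard_normal(n)             # Gaussian white-noise input, variance 1

# Toy "cochlear filter": a damped resonance at 2 kHz (arbitrary stand-in system)
tk = np.arange(0, 0.01, 1 / fs)
h_true = np.exp(-tk / 0.002) * np.sin(2 * np.pi * 2000 * tk)

y = np.convolve(x, h_true)[:n]         # output of the (here purely linear) system

# First-order Wiener kernel: h1(tau) ~ E[y(t) x(t - tau)] / var(x)
lags = len(h_true)
h1 = np.array([np.dot(y[tau:], x[:n - tau]) for tau in range(lags)]) / (n * x.var())

err = np.linalg.norm(h1 - h_true) / np.linalg.norm(h_true)
print(f"relative error between kernel estimate and true impulse response: {err:.3f}")
```

With a genuinely nonlinear system the same estimate still converges, but to the impulse response plus corrective terms, as stated above.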
Gaussian white noise is the privileged input, as its mathematical interest is to provide Wiener kernels that are independent of each other and can be numerically computed. Its ecological interest is that, better than tones or clicks, it allows both simultaneous and prolonged stimulation of the whole cochlea. The first-order Wiener kernel of a linear system would give its impulse response, but in a nonlinear one, it contains the impulse response plus corrective terms. Second-order Wiener kernels give a measure of how much the presence of a first impulse stimulus influences the response to a second one (162). They have sometimes been successfully applied to characterizing the temporal coding of auditory nerve fibers even when the carrier frequency of the stimulus is too high for phase-locking of the action potentials to occur. In this case, it is from the envelope modulations also produced by the stimulus that, indirectly but efficiently, temporal coding in these neurons can be inferred. The extended use of Wiener kernels for studying cochlear mechanics, seldom attempted (156, 275), usually requires their outcome to be compared with that of a model, in search of an optimal match ensured by adjusting the model parameters. Thus the advantages of these general methods, unconstrained by the limitations of low-level approximations to explore the nonlinearities of a system, are somewhat offset by the need to start with a few hypotheses regarding the block diagram of the system under study.

IX. AUDITORY MASKING AND THE COCHLEA

Auditory masking is one typical nonlinear phenomenon. It is revealed by the psychophysical task of detecting a probe tone, first alone and at a level slightly above detection threshold, enough to be easily detected, and then at the same level but mixed with a second sound of varying frequency and level, called the masker. The masker characteristics are then adjusted so that, in the masking configuration, the probe tone is no longer heard, or the auditory potential evoked by the probe is no longer detected. Differences between masking and nonmasking configurations probe the nonlinearity of the system. Mixtures of sounds at different frequencies and levels are frequently encountered in everyday life, either when the vibrations from different sources are mixed in the transmitting medium before reaching the ears, or when a single source produces a combination of components at different frequencies (e.g., a musical instrument or a voice). Nonlinear interactions among frequency components of the mixture allow some components to mask others, according to well-acknowledged empirical rules (58, 286). The detection of speech, alone or in the presence of background noise, can become difficult if crucial components get masked. The study of masking should help quantify the perceptive consequences of cochlear nonlinearity. However, whether measured by perceptive tests in subjects or by neuronal recordings, masking actually results from a blend of three mechanisms, only one of which, suppressive masking, is of cochlear origin, while the other two mechanisms are neuronal. This section examines the distinctive features of suppressive masking, from which information on cochlear nonlinearity can be derived.
A. Basic Characteristics of Auditory Masking

For a tone serving as a probe, presented with a narrowband masker (it is not recommended to use a pure tone as masker, as the amplitude modulation of the mixture of probe and masker might provide unwanted perceptive cues), the threshold of masking increases roughly in proportion to masker level for probe frequencies around the masker frequency region. In other words, starting from a probe tone just detected in the presence of a masker, when the level of this masker increases by 10 dB, the probe tone has to be set 10 dB higher to be heard again. In contrast, thresholds increase more rapidly for probe frequencies well above the masker frequency region (58, 286). This so-called "upward spread of masking" expresses the detrimental effect of high-level, lower-frequency maskers on the detection of higher-frequency probe tones. As the probe must increase by over 1 dB/dB of increase in masker level to remain audible, the psychophysical literature often calls this masking "nonlinear," as opposed to a "linear," 1 dB/dB race for audibility between probe and masker levels. With reference to the meaning of linear and nonlinear throughout this review, these terms are misnomers and, as such, will be mentioned with quotes when necessary, as even the "linear" masking is an essentially nonlinear effect.

B. Line-Busy Masking

The earliest systematic investigations of the mechanisms of masking relied upon recordings of auditory nerve fibers (48, 107, 256), and contributed to the identification of the three masking mechanisms. The earliest to be invoked (67) is purely neuronal, the line-busy mechanism. Let us consider a probe tone presented alone, above the threshold of a recorded auditory nerve fiber. This induces an increase in discharge rate of this fiber. If, when the same probe tone is added to a continuous masker, no change happens to the discharge pattern produced by the masker alone, a simple interpretation is that the neuron is busy responding to the masker, so that the probe tone can no longer modify its activity; thus it can no longer make itself detectable. [A caveat is that even though the neuronal activity in response to the probe tone is no longer detectable in the group of auditory fibers tuned to the probe, this does not mean that the probe can no longer be detected at all. It may still activate other neurons enough to modify their activity. This detection is called off-frequency listening, as it is mediated by neurons not tuned to the probe. The perceptive properties of the detected signal may of course differ from what they should be if detected through the neurons tuned to it, in terms of pitch for example, but since the psychoacoustic tasks are usually based on mere binary detection, the fact that tested subjects exploit off-frequency listening to achieve a task may be difficult to guess, even by the subjects themselves.]

The line-busy explanation is a sketchy one, positing that, as neurons have a limited rate of action potentials due to their refractory period of a few milliseconds, if the masker drives them at the maximum rate, the occurrence of the probe tone, which when presented alone increases the discharge rate, can no longer influence it as it is already saturated. This explanation does not hold when the masker is weak enough not to saturate the neuron, and yet masking still occurs. The term swamping is more conservative, suggesting that masking results from the fact that the probe tone increases the discharge rate, but not enough to be detected, because the probe-induced activity is swamped by the masker. Swamping happens for two reasons. The first one is that the rate-level functions have a compressive slope, so that the increment in discharge rate is smaller when the tone is mixed with the masker than when alone. The second reason is the increase in variance of the discharge rate of a nerve fiber when its mean rate increases, which makes it more difficult to detect the slight increase in rate due to the probe tone (48). The swamping masking mechanism requires that the masker itself excite the neurons responding to the probe tone (excitatory masking). It is not cochlear, and not even specific to hearing. Yet, whenever masking experiments investigate the nonlinear cochlear mechanisms, their results will most often be contaminated by this nonlinear behavior of neurons.

C. Adaptation

The second mechanism by which masking happens in auditory neurons is called adaptation, again an excitatory, specifically neuronal mechanism, happening when a masker has been presented long enough to adapt the neurons responding to it, thus decreasing the response of these neurons when the probe tone is presented (87, 257). Adaptation still goes on for a few milliseconds after the masker has been turned off. This phenomenon is called nonsimultaneous masking, here, forward masking.

D. Suppressive Masking

The third mechanism, suppressive masking, is an inherent property of nonlinear cochlear mechanics, whereby the masker suppresses the mechanical response to the probe tone along the basilar membrane, by jamming the active mechanism by which the probe tone is processed. The simplest protocol revealing this masking mechanism is 2TS, described in section VA. In contrast to line-busy and adaptation mechanisms, suppressive masking is not caused by an excitatory neuronal mechanism. Even before direct measurements on the BM and hair cells revealed the mechanical nature of the suppressive nonlinearity (201, 233), experiments using microelectrode recordings of neuronal activity showed that a masker can suppress the response of a neuron to a probe tone despite not exciting this neuron when presented alone (236). More complex situations than the 2TS paradigm have been studied, i.e., noise (126) and speech sounds (237), for which evidence of suppression in the discharge rate of auditory neurons has also been obtained. Suppression is strong enough that, for a probe tone at the CF of a neuron, the level required to produce a given average discharge rate may shift upward by 40-50 dB in the presence of a suppressor (48, 73, 75). Despite this large effect and, more generally, the impact of nonlinearity on the cochlear processing of sound, traditional psychophysical models rest on an excitatory picture of masking, explaining the influence of a masker by the way its pattern of excitation spreads along the cochlea to the place of the probe tone (67, 286) and swamps neuronal activity in response to the probe. Accordingly, many models used for interpreting masking, loudness, and frequency and intensity discrimination do not take nonlinear interactions into account. Nowadays, the accuracy of these linear masking models is increasingly called into question (48).
Recent reports of strong changes in masking, in the particular case of Strc−/− mutant mice with no suppressive interaction despite normal cochlear sensitivity and frequency tuning, raise the same issue (282) (see sect. IIID). In an extensive study at the level of the auditory nerve, Delgutte (48) therefore sought to differentiate between excitatory and suppressive mechanisms over ranges of masker levels and of frequency spacings between probe and masker. Comparisons were made of masking thresholds obtained in simultaneous and forward masking. Suppression may contribute to the former, but in no way to the latter. Thus the differences in masker levels required to mask a fixed signal at the CF of the recorded fiber could be attributed to suppression. For masker frequencies well below the probe frequency, the amount of suppressive masking was found to be large, increasing with masker level more rapidly than did excitatory masking. The results of Delgutte, obtained with probes <40 dB SPL, thus indicate that the upward spread of masking is largely due to the growth of suppression rather than to that of excitation. This view, supported by other physiological studies, e.g., Reference 199, suggests that psychoacoustic models should be revisited to incorporate suppression. Accordingly, physiologically motivated models that describe cochlear compression with the help of nonlinear, velocity-dependent feedback characteristics similar to those discussed in section VIIIA now achieve realistic predictions of OAE properties, of 2TS, and of the outcome of some psychophysical tasks (63).

The psychoacoustic finding by Oxenham and Plack (198) of a strong "nonlinear" (i.e., >1 dB/dB) upward spread of masking in nonsimultaneous masking apparently contradicts the physiological results of Delgutte (48), who found no suppression in the nonsimultaneous masking condition, no strong upward spread of masking, and a merely "linear" (1 dB/dB) growth of masking. A likely cause of this discrepancy is that Oxenham's probes fell between 40 and 90 dB SPL. It is in this range of levels that the compressive behavior of the BM is most intense, whereas at <40 dB SPL, compression decreases and probably vanishes below 20 dB SPL (228). In the region tuned to the probe tone, BM vibrations in response to maskers between 40 and 90 dB SPL and near the probe frequency ("on-frequency") grow slowly as they undergo strong compression, whereas the responses to off-frequency maskers, with a masker frequency well below that of the signal, grow at a rate of 1 dB/dB of increase in masker level. This differential growth of BM responses to the probe versus the masker suffices to explain a large growth of masking for probes with a frequency above the masker frequency, even in the nonsimultaneous, i.e., nonsuppressive, masking conditions observed by Oxenham and Plack. In addition, and in agreement with physiological studies, the same authors did also find a major role for suppression in determining thresholds at high masker levels, for an off-frequency masker (198). Playing with simultaneous versus nonsimultaneous masking conditions and with on- versus off-frequency maskers thus turns out to be suitable for separating the various causes of masking and studying their distinctive properties.
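A toy calculation (with purely illustrative numbers, not taken from the cited studies) shows how the differential growth alone produces a steep growth of masking:

```python
# Toy model of the upward spread of masking through differential BM compression.
# At the place tuned to the probe, the probe (on-frequency) response is assumed to
# grow at 0.3 dB/dB (strong compression), while a low-frequency (off-frequency)
# masker drives the same place linearly at 1 dB/dB. Masked threshold is taken as
# the probe level at which the two responses are equal. All numbers are illustrative.

def probe_response_db(probe_level_db):
    return 0.3 * probe_level_db          # compressed on-frequency growth

def masker_response_db(masker_level_db):
    return 1.0 * masker_level_db - 50.0  # linear off-frequency growth (arbitrary offset)

def masked_threshold_db(masker_level_db):
    # Solve probe_response == masker_response for the probe level.
    return masker_response_db(masker_level_db) / 0.3

for m in (60, 70, 80):
    print(f"masker {m} dB SPL -> probe threshold {masked_threshold_db(m):5.1f} dB SPL")
# Each 10-dB masker increment raises the masked threshold by about 33 dB:
# a growth of masking of roughly 1/0.3 = 3.3 dB/dB, with no suppression involved.
```

The >1 dB/dB growth emerges solely from the mismatch between compressed and uncompressed growth rates, with no suppression involved, which is why a strong upward spread of masking can be observed even in nonsimultaneous conditions.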
When more than two tones are presented simultaneously to the cochlea, a more complex situation than 2TS and one pertaining to the intelligibility of speech sounds, unmasking phenomena may occur and reinforce auditory-contrast sharpening. A two-tone stimulus can be less effective at masking a probe tone than one of its components alone (98, 242). In such a nonsimultaneous masking experiment, the masker, made up of a mixture of two sounds, is presented first, then a gap, then the probe tone. This probe tone can be better heard if mutual suppression happens between the two simultaneously present components of the forward masker, as this suppression decreases the efficiency of the overall masker. Conversely, simultaneous presentation of all sounds would lead to no clear result because the signal, being present simultaneously, would be suppressed by the masker, and not only the weak component of the masker by the stronger one. This explains an apparent paradox. It is widely accepted in the psychoacoustical literature that effects of suppression are not revealed in simultaneous masking (98), which might seem odd since suppression requires the simultaneous presence of suppressor and probe. The paradox disappears when one distinguishes the suppression exerted by the masker on the probe ("masker-to-probe suppression") from the suppression among frequency components of a complex masker ("within-masker suppression") (48).

X. CONCLUSIONS AND PERSPECTIVES: NONLINEARITIES AT THE CORE OF AUDITORY MECHANISMS

While the physiological and psychophysical relevance of compressive amplification in the nonlinear cochlea is well acknowledged, WD, the other manifestation of auditory nonlinearity, for the time being seems to draw its major interest as a tool for probing cochlear mechanisms. A coherent picture has gradually emerged from three decades of investigations of the active mechanics in auditory organs and of OAEs, relating amplification and frequency tuning to all manifestations of nonlinearity. An amplitude-dependent positive damping term ensures mechanical stability and produces compression. Nonlinearity itself is a mandatory consequence of MET-channel functioning, and WD and its consequences afford a privileged insight not only into the processes that ensure an appropriate gating of MET channels, but also into the integrated feedback mechanism which, within auditory sensory cells, actively processes auditory stimuli.

While the basic functional principles of auditory nonlinearity are increasingly well understood from mammals to lower vertebrates, the molecular assembly surrounding its core, the MET channel, remains incompletely established. The solution to this protracted molecular hunt would not radically change any physical principle of the gating process leading to the generation of WD, anchored as it is in thermodynamic laws. However, it would make it easier to dissect the interaction of MET-machinery subparts, e.g., the pore, gating spring, and tip-link, and the mechanisms that reset the resting position of the MET channel. Our knowledge of how interstereociliary links work also remains imperfect. An improved understanding of hair-bundle architecture might help to explain why DPs become undetectable in the absence of top connectors even though MET channels still operate, and more generally, how the shape of the MET transfer function and the characteristics of cochlear nonlinearities can depend on a coordinated motion of the stereocilia bundle.
The most readily detected WDs are the DPOAEs, yet their propagation from stereocilia bundles to the stapes is a complex topic so that, in practice, it may be useful to choose eliciting stimuli such that one generation site or one propagation mode dominates the DPOAE pattern. The controversy regarding backward propagation of sound by fast compression waves versus slow traveling waves has led to considerable refinements of measuring devices and theoret- ical tools. The part played by slow traveling waves seems dominant; however, intermediate structures and paths ensuring DP propagation between hair bundles and the stapes remain a matter of active research. DPs present a strikingly complex microstructure that challenges simplistic interpretations, e.g., of a change in DP amplitude. Some DP features, deep notches for example, are reminiscent of interference patterns among different sources, yet some possibly are mere products of the complexity of the nonlinear transfer function of a single source, with its saturations, asymmetries, and mobile OP. No less convincingly, other DP features have been shown to stem from genuine interference among sources or paths. Is it possible that still other paths and mechanisms exist that have been missed? Propagation is only one aspect of the broader issue of DPs as noninvasive windows on cochlear frequency tuning. Their use heavily draws on the possibility to extract reliable timing data which, with steady tones as stimuli, remains a nagging issue. In particular, despite evidence that the cochlea operates as a (minimally distorting) AGC exploiting a noninstantaneous nonlinearity, surprisingly little hard data can be found regarding the timing and hard-wiring of what controls the noninstantaneous extraction of stimulus level to adjust the gain. Several reports point to an interaction time of a few periods (90, 273), while thought-provoking, extreme models incorporate the whole history of cochlear activity over hours (142), by assuming that the scala media may act as a pressure accumulator, with pressure and the resulting shift in resting position (140) as a control parameter of cochlear operation. In contrast, Hopf bifurcation-based models predict noninstantaneous distortion, but its mechanisms are not explicit, and at present, a tangible way to control critical oscillators, e.g., by Ca2⫹ ions entering the MET channels, is suggested only in models of nonmammalian ears. The perceptive benefits of WD remain an open issue even though the coherence of all cochlear nonlinear properties grants it, at least, the status of a sensitive probe. Several mechanisms tend to maintain WD at low levels, i.e., compression by controlling their growth; filtering (14) as WDs are produced at places not tuned to them; and an AGC mode of functioning. One likely advantage of WD is that the resulting suppressive interactions may improve the analysis of complex sounds by enhancing local contrasts and favoring resonant components (262). Suppressive interactions within the spectral components of complex maskers might decrease their masking effect on a probe tone (98, 242). Last, DPs may usefully contribute to the processing of complex sounds as the activity of auditory centers contains a robust representation of the f2-f1 and 2f1-f2 components (2). Pitch extraction is likely assisted by neuronal responses to these DPs, as suggested by masking experiments that decrease pitch salience and neuronal representations of DPs in parallel (209, 255). 
[The missing-fundamental perceptive phenomenon is different (see sect. XI, Historical Insert), as it relies on central processes that persist when DPs have been cancelled.] Regardless of its influence on perception, WD still deserves the efforts required to disentangle its intricate origin and propagation mechanisms, as its detection moved from the laboratory to everyday use in audiological setups decades ago. Several examples deserve to be mentioned owing to their everyday usefulness. With sound-evoked OAEs, universal screening of sensorineural hearing impairment, which most often includes OHC dysfunction, has been implemented in neonates (22, 122). Being intracochlear sources of sound, OAEs act as a virtual calibrated earphone on the internal side of the stapes, probing the influence of acoustic impedance mismatches along sound-propagation pathways. Impedance changes due to increased pressure in labyrinthine fluids creating endolymphatic hydrops, or even to causes outside the field of otology such as increased cerebrospinal fluid pressure, can be monitored and applied to clinical and physiological investigations (32). Longitudinal studies in large human cohorts have suggested that sound-evoked OAE decline with noise exposure might detect latent damage ahead of audiometric changes, and thus serve as a forewarning of hearing loss (141). In mutant laboratory mice, sound-evoked OAEs rapidly sieve the auditory function and help unveil the associated impaired subcellular structures. Last, central modulations of cochlear function via the olivocochlear medial efferent nerve bundle, possibly affording protection against noise overstimulation and attention-driven improvement of the signal-to-noise ratio, can also be assayed by OAEs (83).

XI. HISTORICAL INSERT: TARTINI'S TONES

The composer Giuseppe Tartini (1692-1770) reported the perception of an additional tone ("terzo suono") when two tones at frequencies f1 and f2, played in a loud and sustained manner by two violins or oboes, were listened to by an experimenter, himself, placed midway, near enough for the sounds to be loud and with similar levels, thus 5-6 steps away. He published a detailed account of his experiments in Trattato di Musica secondo la vera scienza dell'armonia, printed in 1754 in Padua, Italy. This captivating book (available online as a free e-book) classically leads his name to be used in the alternative designation of intermodulation WD components at arithmetic combinations (2f1-f2, 3f1-2f2, f2-f1, etc.) as "Tartini tones." A few years before, the organist Sorge had published similar findings in his book Vorgemach der musikalischen Komposition (Lobenstein, 1745-1747). However, doubts arise whether Tartini and, especially, Tartini's readers were actually interested in combination tones. From Tartini's own words, "one has discovered a new harmonic phenomenon . . . if two sounds of just intonation be sounded clearly and loudly together, there will result a third sound lower in pitch than the other two, and which will be the fundamental sound of the harmonic series of which the first two sounds form an integral part . . ." The suspicion increases at the reason invoked by Tartini in support of his claim to have discovered combination tones well before anybody else, in another of his books, published in 1767, De' principii dell'armonia musicale contenuta nel diatonico genere.
There he tells that many people could testify that, when aged 22, at Ancona, he had discovered the third sound, later (1728) used in Padua to teach pupils how to obtain pure intonation on the violin. When playing double stops (by pressing two strings simultaneously), the pupils' ability to hear the "terzo suono" served as a guide to correct intonation, since its occurrence indicated that the intonation was pure. As a music teacher, Leopold Mozart had in mind a similar scope when he wrote that Tartini tones are a device that "a violinist can use in playing double-stopping, and which will help him to play with good tone, strongly, and in tune." Tartini also reported that any two consecutive sounds of the series of harmonics produced by a vibrating body invariably produce the same "terzo suono," and he often used the term "basso fondamentale" throughout his works. Pitch, the key element invoked in all these reports, is the quality that distinguishes two different notes played on the same instrument with equal force, and that allows notes to be organized on a scale from low to high. While the pitch of a pure tone and its frequency are isomorphic, the pitch of a mixture of tones relates to physical cues in a less straightforward manner. The so-called "periodicity pitch" (146) derives from the detection in central auditory nuclei of periodic patterns of discharges in afferent neurons at the period 1/f of the fundamental component, when several harmonics of the same fundamental f (mf, nf, etc., with m and n integers) coexist. The pitch of a sound complex (mf, nf) matches that of f presented alone, as the main periodicity of the discharge patterns is the same regardless of the presence of energy at f. When it is missing, the so-called "missing fundamental" (306) shows up as a phantom perception from the neural auditory system. This phenomenon is universally experienced by telephone users. Most telephone lines cut low frequencies <300 Hz, so that all male voices are deprived of their fundamental, but the perception of their pitch is not in the least affected. The identification of the fundamental pitch can even persist, although in a weaker form, if each harmonic of a pair is presented in a dichotic manner, showing that the reconstruction of pitch from the two ears is made in the central auditory system. Both the missing fundamental and the Tartini tones are spectacular perceptive effects involving additive mechanisms in relation to the occurrence of a series of regularly spaced spectral lines. Yet they are fundamentally different in that a Tartini "terzo suono" is physically produced in the cochlea irrespective of any particular relationship between f1 and f2, while a missing fundamental can be perceived, centrally, only if the primaries are consonant. The insistence of Tartini and especially his followers on pitch may suggest that if they fell upon combination tones, as they were actually interested in harmonic series and, for Tartini, in the use of double stops in the pieces he composed, their interest was largely caught by the missing-fundamental phenomenon.
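The periodicity argument is easy to verify numerically. The sketch below (illustrative values only) builds a complex from harmonics 2f to 5f of a 200-Hz fundamental that carries no energy at 200 Hz; its autocorrelation nevertheless peaks at 1/f = 5 ms, the periodicity on which central pitch extraction is thought to rely.

```python
import numpy as np

fs = 20_000
f0 = 200                                   # absent fundamental (Hz)
t = np.arange(0, 0.5, 1 / fs)

# Harmonic complex with NO energy at f0 itself
x = sum(np.sin(2 * np.pi * k * f0 * t) for k in (2, 3, 4, 5))

# Spectrum: confirm that the 200-Hz line is absent
spec = np.abs(np.fft.rfft(x)) / len(x)
freqs = np.fft.rfftfreq(len(x), 1 / fs)
print("energy at 200 Hz:", round(spec[np.argmin(np.abs(freqs - 200))], 6))
print("energy at 400 Hz:", round(spec[np.argmin(np.abs(freqs - 400))], 6))

# Autocorrelation: the first major peak falls at 1/f0 = 5 ms
ac = np.correlate(x, x, mode="full")[len(x) - 1:]
lag = np.argmax(ac[int(0.002 * fs):]) + int(0.002 * fs)   # skip the zero-lag peak
print("autocorrelation peak at", 1000 * lag / fs, "ms")
```

The spectral line at the fundamental is strictly absent, yet the 5-ms periodicity, the cue exploited centrally, is fully preserved.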
An incidental paragraph on page 17 of Tartini's Trattato, however, shows that Tartini, as the keen observer he was known to be, did hear genuine combination tones: "perche il terzo suono si ha non solo dagl' intervalli composti da quantita razionale, ma si ha ancora dagl' intervalli composti da quantita irrazionale" ("because the third sound stems not only from intervals composed of rational quantities, but also from intervals composed of irrational quantities," in which case no missing fundamental can exist).

A century later, Hermann von Helmholtz was the first to propose a mathematical theory of Tartini's tones, based on a Taylor expansion of the mechanical response of a nonlinear system to two primary tones, limited to the quadratic term assumed to generate the largest contribution (it does, unless the nonlinear system displays a peculiar kind of symmetry). Regardless of any harmonic or inharmonic relationship between f1 and f2, components at f2-f1 and f2+f1 emerge (Die Lehre von den Tonempfindungen, APPENDIX XVI). Helmholtz thus ascribed Tartini's observations to the f2-f1 quadratic combination tone. Intriguingly, however, he noticed discrepancies between his work and Tartini's, subtle yet most revealing. According to Helmholtz, Tartini had been an unreliable observer . . . In his chapter IV, Helmholtz wrote that "Tartini notated all secondary pitches an octave too high." In hindsight, now that the properties of cochlear distortion are understood, this rebuke provides a forceful vindication of Tartini's outstanding skills as an experimenter, and of his indisputable discovery of Tartini's tones as a characteristic cochlear phenomenon: not only did Tartini notate his pitches correctly, but this remark shows that Helmholtz could not have detected Tartini's tones in his studies!

Among the ratios tested by Tartini were 4:3:(2), 5:4:(2), 6:5:(2), 8:5:(2) and 5:3:(2), the first two numbers of each series referring to the primary sounds played by each violin, deliberately chosen as integer multiples of the same fundamental frequency, and, in parentheses, the terzo suono identified by Tartini (177). With the difference tone at f2-f1 in mind, Helmholtz predicted that the difference tones reported by Tartini should have been 4:3:(1), 5:4:(1), 6:5:(1), 8:5:(3) and 5:3:(2), thus, in many cases, secondary pitches different from Tartini's. Helmholtz thought, wrongly, as we now know, that higher-order combination tones did not exist as such, but were first-order combination tones of particular harmonics of the primaries (208). For reasons explained in section V, it is now acknowledged that physiological odd-order DPs are much louder than even-order ones, the 2f1-f2 tone reaching −15 dB relative to the primary tones while the f2-f1 is not audible at stimulus levels below 50 dB SL (79). It is therefore likely that Tartini detected the (n+1)f1-nf2 combination tones, with the right pitch and not an octave too high, and, to give another example, 8:5:(2 = 2 × 5 − 8) and not at all the 8:5:(3 = 8 − 5) of Helmholtz. Tartini's combinations 5:4:(2) and 6:5:(2) correspond to higher odd-order combinations than 2f1-f2 [that would have been 5:4:(3) and 6:5:(4), respectively], the 3f1-2f2 and 4f1-3f2 DPOAEs that can be quite large and fully audible. Moreover, Tartini also understood that the terzo suono was best heard when the primary tones were of similar level, so that the musicians playing each primary could not hear it as clearly as himself standing midway.
This is now attributed to the need for the two primaries to have envelopes of similar size on the BM to maximize their nonlinear interaction and the resulting DPs. Tartini thus deserves full credit for the discovery of physiological combination tones. His power of analysis led him to disclose several of their unusual properties, now known to be signatures of cochlear nonlinearity, i.e., odd-order predominance and the envelope-overlap requirement. But this raises a new, shocking question: could an outstanding scientist like Helmholtz have so deeply misunderstood the issue of Tartini tones? The (obvious) answer is that he did not. Helmholtz actually never claimed to have specifically studied physiological combination tones; on the contrary, he explains thoroughly, in chapters VII and XI of his book, that many of his sound sources distorted sound, especially when the primary sounds producing combination tones were played on the same instrument or by mechanically coupled sound sources. They also produced higher harmonics. Helmholtz sometimes used purer tones from tuning forks, but mostly <200 Hz (177). Helmholtz also mentioned his misgivings regarding his, and others', ability to identify the pitches of combination tones, all the more when buried in a complex mixture of spectral components, as seen above, hence his lack of confidence in Tartini's reports. In many cases, his measurements were done with the help of one of his resonators, a cumbersome yet remarkable ancestor of digital spectral analyzers. This could be done only for sounds generated, outside the ear, by the aforementioned nonlinear sources. Helmholtz's only mistake was to have assumed that the characteristics of these sources, and of their carefully studied exogenous combination tones, also applied to the case of combination tones from the ear. This would have been justified if the distorting element had been a passive structure, such as the eardrum or ossicles as Helmholtz proposed. Actually, being produced by the MET channels of OHCs, the endogenous combination tones bear the signatures of a slow rate of increase with stimulus level and of odd-order symmetry. All the quantitative characteristics reported by Helmholtz, at variance with these signatures, point to exogenous combination tones: they are produced by even-order, quadratic distortion (his appendix XVI); their growth function is steeper than that of the primary tones (in the second page of his chapter VII; we know that it should be 2 dB/dB in the absence of compression); and the dominant spectral lines in the case of quadratic distortion are at f2-f1 and f2+f1, the summation tone, which Helmholtz most likely heard, not as an endogenous DP, as this one is inaudible (208), but as the product of a distorting music instrument. The reasons why the properties of cochlear DPs are drastically different from instrumental ones stem from the dynamic properties of the hair bundle, discussed above, which have only recently been unveiled.

XII. APPENDIX I: OTOACOUSTIC EMISSIONS WITHOUT TRAVELING WAVE

The emission of SFOAEs with long delays by the auditory organs of all tested tetrapods, i.e., mammals, birds, lizards, and frogs, regardless of the existence of a BM-supported traveling wave, raises the issue of the minimum system able to produce responses with long delays. Delays can be seen as the sum of traveling and filtering contributions.
The current view is that the only requirement for producing realistic SFOAEs is the presence of a slightly disarrayed battery of tuned resonators (16). That study starts from a detailed, realistic model of a nonmammalian inner ear and derives an analytic expression for the SFOAE delay. Other authors have reached similar qualitative conclusions regarding the presence of long delays in another category of OAEs, evoked by short impulse stimuli [clicks; these transiently evoked emissions are thought to stem, at least partially, from a backscattering mechanism similar to that of SFOAEs (see sect. IVE)]. The core of their argument rests on minimal assumptions. Assume a bank of overlapping bandpass filters connected in parallel, regularly spaced, with gradually increasing center frequencies. Let the bank be excited by an impulse. Every frequency component of the impulse preferentially goes through the filter centered on its frequency. The outputs of this filter and of all the other filters recombine to generate an output signal. As a whole, the battery passes all frequencies equally; thus it responds to the impulse with a similar impulse. However, the output of each individual filter is a slowly damped oscillation at its center frequency. As the oscillations from neighboring filters gradually shift in frequency, when combined they oscillate out of phase with one another and cancel each other, except at the very beginning when they oscillate in unison, hence the impulse as overall output. If the filters get slightly disarrayed, the cancellation between neighboring components becomes imperfect, so that remnants of individual responses show up, with time delays corresponding to several periods of the center frequencies of the disarrayed area. Zwicker (312), in his hardware model of the cochlea simulated as a battery of electrically tuned filters, observed this phenomenon and aptly remarked that although the disarrayed system seemed to emit transient-evoked signals with delays that could suggest the propagation of traveling waves, these delays, in this case, merely corresponded to the time it took for the oscillations of the disarrayed filters to build up. In a workshop held in 1989 in honor of Thomas Gold and his pioneering idea of an active cochlea, Gold (78) put forward the analogy of the sound of a concert piano when one claps one’s hands nearby. Every string vibrates appreciably for several seconds and a ringing is heard; the ringing would not be heard if the accuracy of frequency mapping were improved by fitting 20 times as many strings and, conversely, it would get louder if the strings were mistuned. Wit et al. (300) quantitatively modeled the behavior of a disarrayed bank of gammatone filters and showed that not only do its “emissions” resemble TEOAEs in the time domain, but they also present, in the spectral domain, a realistic quasi-periodic pattern of peaks and troughs, with the periodicity emerging not from the structure of the disarray (there was none, as it was random) but from the tuning properties of individual filters, just as shown for SFOAEs (16). This happened with no intervention whatsoever of a traveling wave or of specific cochlear micromechanics, just a battery of tall and broad resonators with neighboring resonance frequencies and some degree of disarray.
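A minimal numerical sketch of this argument (not the model of Refs. 300 or 312; the filter shapes, number, Q, and amount of disarray are arbitrary toy choices): a parallel bank of damped resonators with regularly spaced center frequencies responds to a click essentially with a click, whereas the same bank with slightly jittered center frequencies leaks delayed, long-lasting ringing.

import numpy as np

fs = 40000.0                        # sampling rate (Hz); all values are toy choices
t = np.arange(0.0, 0.03, 1.0 / fs)  # 30 ms of response
n_filt, f_lo, f_hi, q = 200, 1000.0, 2000.0, 40.0

weights = np.hanning(n_filt)        # taper the bank gains so the finite span of the
                                    # toy bank does not by itself create ringing

def bank_click_response(center_freqs):
    """Summed impulse ('click') response of damped resonators in parallel."""
    out = np.zeros_like(t)
    for w, f in zip(weights, center_freqs):
        tau = q / (np.pi * f)             # each filter rings for about q cycles
        out += w * np.exp(-t / tau) * np.cos(2.0 * np.pi * f * t)
    return out / weights.sum()

rng = np.random.default_rng(0)
regular = np.linspace(f_lo, f_hi, n_filt)                      # ordered bank
spacing = regular[1] - regular[0]
disarrayed = regular + spacing * rng.standard_normal(n_filt)   # slight disarray

for name, freqs in (("regular", regular), ("disarrayed", disarrayed)):
    y = bank_click_response(freqs)
    late = y[(t > 0.005) & (t < 0.020)]    # window well after the initial impulse
    print(f"{name:10s} peak {np.abs(y).max():.3f}  "
          f"late RMS {np.sqrt((late ** 2).mean()):.2e}")

Both banks produce the same initial impulse, but only the disarrayed bank keeps emitting appreciable energy in the late window, with delays set by the build-up and decay times of the individual filters rather than by any traveling wave, in line with Zwicker’s remark.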
XIII. APPENDIX II: BACKWARD VERSUS FORWARD PROPAGATION OF DISTORTION PRODUCTS

The principles allowing DPs to travel backward from their generation place to give rise to DPOAEs should remind the reader of those discussed for SFOAEs, even though here more mechanisms and more nonlinearities come into play. Assume that potential DP generation sites are distributed along the places where the envelopes at f1 and f2 overlap and OHCs respond to both frequencies. Consider the case of the 2f1-f2 DP. Likely, its dominant contributions come from where the displacement is largest at the primary frequencies, that is, near the peak response of the BM to f2. This place covers a broad interval. Let B and A be its limits on the basal and apical sides (FIGURE 22). Since the generation segment is long and not pointlike, during stimulation DP wavelets are evoked along its entire length, every one of them at a particular location X between A and B (FIGURE 22). From X they start to travel in two directions, apically to their CF location D, and basally to the stapes S. At the stapes or at their CF location, all of them are vectorially added and, depending on their respective phases, cancellation or enhancement occurs. The DP level at the stapes determines the DPOAE level, and the DP level at the CF location determines the loudness of the heard combination tone.

FIGURE 22. Sketch of the interactions between waveforms at f1 and f2 along the BM (envelopes, solid lines; instantaneous vibrations, dotted lines) that lead to the generation of DPs and DPOAEs. Three different frequency ratios f2/f1 are represented from top to bottom, with f2 fixed. S, stapes, from where DPOAEs are collected; D, place tuned to the 2f1-f2 DP frequency; locus X, within the BA segment, hypothetical site of nonlinear interactions between waves at f1 and f2, along which wavelets at 2f1-f2 are launched forward (toward D) and backward (toward S). Boxplots on the right and left sides depict, for each f2/f1 ratio, the phase accumulation of 2f1-f2 wavelets launched from X along the paths SXD (on the right) and SXS (on the left). Phase coherence is ensured when phase accumulation no longer depends on X between B and A, and allows a large DP to be measured where all wavelets recombine.

As seen in section IVA, the phase rotation of a tone increases during propagation, by one cycle every time the wave travels by one wavelength, and as the wavelength gets shorter when the incoming wave gets closer to its CF place, the phase accumulation speeds up. Scaling symmetry implies that the vibration patterns of f2, f1, and 2f1-f2 are just translated versions of one another along a logarithmic representation of the cochlea. At the apex, where scaling symmetry is broken, this representation is probably not valid and its predictions fail. It is important for what follows that the phase at f2 rotates fast when this primary nears its CF location, while at the same spot, f1 and 2f1-f2 accumulate less and less phase with increasing f2/f1 ratio, as the peaks at f1 and 2f1-f2 move farther apically.
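The statement that phase accumulates by one cycle per traveled wavelength, and therefore faster and faster as a wave approaches its CF place, can be written compactly (a standard restatement, with notation introduced here for convenience): with λ(x) the local wavelength of a given tone at position x along the BM and k(x) = 2π/λ(x) the corresponding local wavenumber, the phase accumulated between the stapes (x = 0) and position x is

\[
\Phi(x) = \int_{0}^{x} k(x')\,dx' = \int_{0}^{x} \frac{2\pi}{\lambda(x')}\,dx' ,
\qquad
\frac{d\Phi}{dx} = \frac{2\pi}{\lambda(x)} .
\]

The phase gradient is thus largest where λ is shortest, i.e., in the short-wavelength region just basal to the CF place of that tone. This is why, over the B-A segment near the f2 peak, Φ2 varies rapidly with position, whereas Φ1 and ΦDP, whose peaks lie farther apically, vary comparatively little there.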
The DP amplitude dependence on the f2/f1 ratio can be explained according to the following line of thought [FIGURE 22; from the sketches of the phase behavior of the primary tones and the DP, diagrams have been built displaying the phase accumulation of wavelets as a function of their generation place X, when they travel either forward (right-hand side) or backward (left-hand side); three different f2/f1 frequency ratios are illustrated from top to bottom].

A. Forward Propagating Wavelets

In the case of wavelets traveling in the forward (apical) direction, to the place tuned to the DP, the total phase accumulation can be shown to be 2Φ1 - Φ2 - ΦDP (244, 266). The journey splits into two segments, SX and XD. Along SX, the primary tones f2 and f1 accumulate phase rotations Φ2 and Φ1 while reaching the place of nonlinear interaction X, somewhere between A and B near the f2 peak. A DP wavelet starts from X with a 2Φ1 - Φ2 initial phase and travels the next segment of the journey, XD, at a new frequency, 2f1-f2. The ΦDP term expresses the phase accumulation of the DP wavelet along its forward path XD. The nearer X is to B, the smaller the initial DP-wavelet lag, because both f2 and f1 have accumulated less phase on their way to their resonance places. Yet the nearer X is to B, the longer XD, so the larger the phase lag that the wavelet accumulates on its own. Because of the “-” sign in (2Φ1 - Φ2) - ΦDP, wherever X lies along the BA segment, the stimulus- and wavelet-associated phase lags at the CF location, D, act in opposite directions. At small f2/f1 ratios (close to 1.0), the phase accumulations Φ2, Φ1, and ΦDP tend to equalize, despite the necessary phase rotation along the A-B interval. For more basal Xs, wavelets experience this phase rotation during their journey forward; in the case of more apical Xs, their stimuli accumulated the same phase change instead. That wavelets experience similar phase accumulations regardless of where they were generated between B and A (bottom right diagram of FIGURE 22) is favorable to their adding up coherently to produce a high-amplitude vibration on the BM at the CF location (point D), so that the DP is clearly audible. At large f2/f1 ratios, however, the forward phase accumulation of DP wavelets, 2Φ1 - Φ2 - ΦDP, is dominated by -Φ2. It can be seen in FIGURE 22 (top sketch) that, between B and A, as the places tuned to f1 and 2f1-f2 are much more apical than that tuned to f2 at large f2/f1 ratios, Φ1 and ΦDP remain small, most of their phase accumulation happening apically to A. Conversely, Φ2 rotates fast from B to A, and the term -Φ2 dominates the phase difference between DP wavelets launched from different Xs. The diagram depicting how 2Φ1 - Φ2 - ΦDP varies against X for wavelets traveling forward to D (FIGURE 22, top right) shows a spreading spectrum of phases, unfavorable to the generation of a DP at its CF location.

B. Backward Propagating Wavelets

Backward-propagating wavelets are responsible for the DPOAE in the ear canal. In FIGURE 22, the paths along which phase rotations are to be compared are SXS (diagrams on the left side of FIGURE 22, from top to bottom according to the f2/f1 ratio), with the same patterns of vibrations at f2, f1, and 2f1-f2 as for the forward-propagation case supporting the evaluation of phase accumulation. The cumulated phase at the stapes S is now 2Φ1 - Φ2 + ΦDP, with 2Φ1 - Φ2, as before, representing the stimulus phase accumulations along SX, their forward path to the interference place X, and ΦDP representing the phase accumulation of the DP wavelet at 2f1-f2 from X to S.
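The bookkeeping for the two directions can be condensed into two expressions (a compact restatement of the two preceding paragraphs, with X the generation site and the arrows indicating the segment over which the DP-wavelet phase is accumulated):

\[
\Phi_{fwd}(X) = 2\Phi_1(X) - \Phi_2(X) - \Phi_{DP}(X \to D) ,
\qquad
\Phi_{bwd}(X) = 2\Phi_1(X) - \Phi_2(X) + \Phi_{DP}(X \to S) .
\]

Wavelets recombine constructively at D or at S when the corresponding total phase is nearly independent of X over the B-A segment, that is, when its derivative with respect to X is close to zero; the following paragraphs examine when this condition can be met for the backward path.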
The sign flip of ΦDP, compared with the previous case, expresses the backward propagation of the DP. Once again, the more basal X, the smaller the initial phase lag with which the DP wavelet is launched. However, this time, the more basal X, the less the DP wavelet has to travel to reach the stapes [actually, the quantitative evaluation of ΦDP depends on which mechanism is accepted for the reverse propagation of DPs; whether it is a slow reverse traveling wave along the BM, according to the same mechanism as forward propagation, or a fast compression wave in the scala vestibuli (see sect. VH) is a topic of intense controversy; the slow traveling-wave mechanism predicts the same ΦDP forward and backward, while the fast compression-wave mechanism predicts a smaller ΦDP backward]. At large f2/f1 ratios, the phase accumulation of DP wavelets at S, 2Φ1 - Φ2 + ΦDP, is the same as at D, dominated by -Φ2, as the resonance at 2f1-f2 occurs well beyond A in the apical direction, so that ΦDP is negligible. This -Φ2 term is again unfavorable to the generation of a large overall DPOAE. For f2/f1 ≈ 1, Φ1 ≈ Φ2 ≈ ΦDP, as in the forward-propagation case, but because of the sign flip of ΦDP in 2Φ1 - Φ2 + ΦDP, the phase difference among backward-propagating wavelets varies approximately like +2Φ2. This is too fast a phase rotation to build a large DPOAE. At decreasing f2/f1 ratios, the phase lag 2Φ1 - Φ2 + ΦDP thus varies monotonically, from -Φ2 at high ratios to +2Φ2 at low ratios. The change in sign implies that, between these extremes, an intermediate f2/f1 ratio can be found at which the dependence on X vanishes. Near this ratio, the phases of all wavelets remain coherent (FIGURE 22, left side, intermediate diagram) and the resulting DPOAE is maximal. According to the models and to the results of human measurements, this happens around f2/f1 = 1.20-1.25.

XIV. APPENDIX III: COCHLEAR DISTORTION IN THE CASE OF THREE OR MORE TONAL STIMULI

The presence of at least three tones in a nonlinear cochlea, f1, f2 (the primary tones at the origin of a DPOAE), and f3 (a suppressor tone probing the two-tone interaction), creates considerable complexity, as the number of possible combinations, notably if sound levels are high enough, increases greatly compared with the two-tone situation. The problem faced by experimenters is to avoid situations in which different combinations lead to the same spectral line and, more generally, to disentangle the possible combinations so as to identify the sources of the observed combination tones. A first issue is order aliasing, which happens when two different combinations of f1, f2, and f3 lead to the same combination tone, for example, if there are integers L, M, N and frequencies f3 such that Lf1 + Mf2 + Nf3 = 2f1-f2. The resulting spectral line at 2f1-f2 is then a complex vector addition of two contributions, one of which comes from places and mechanisms that were not acting in the absence of f3. The choice of f3 can be made so as to avoid obvious aliasing problems, and data can be collected using phase rotations of the suppressing tone. For example, if the f3 suppressor is presented four times with its phase set at 0, π/2, π, and 3π/2 in succession, most combination tones containing f3 are cancelled when the four recordings are averaged, without the directly produced 2f1-f2 being affected.
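A minimal numerical sketch of this phase-rotation trick (the frequencies and the memoryless polynomial nonlinearity are arbitrary toy choices, not a cochlear model): averaging the distorted waveform over the four suppressor phases removes the combination tones that contain f3, while the directly produced 2f1-f2 line is untouched.

import numpy as np

fs, dur = 48000, 0.1                      # toy sampling rate and duration
t = np.arange(int(fs * dur)) / fs
f1, f2, f3 = 1000.0, 1200.0, 1850.0       # toy primaries and suppressor

def ear_like(x):
    """Memoryless polynomial nonlinearity (arbitrary toy choice)."""
    return x + 0.3 * x ** 2 + 0.2 * x ** 3

def line_amp(y, f):
    """Amplitude of the spectral line at frequency f (an exact FFT bin here)."""
    spec = np.fft.rfft(y) / len(y)
    return 2.0 * np.abs(spec[int(round(f * dur))])

phases = [0.0, np.pi / 2, np.pi, 3 * np.pi / 2]
outputs = [ear_like(np.sin(2 * np.pi * f1 * t) + np.sin(2 * np.pi * f2 * t)
                    + np.sin(2 * np.pi * f3 * t + p)) for p in phases]
single = outputs[0]                       # one suppressor phase only
averaged = np.mean(outputs, axis=0)       # average over the four phases

for label, comp in [("2f1-f2", 2 * f1 - f2), ("f3-f1", f3 - f1),
                    ("f1+f2-f3", f1 + f2 - f3)]:
    print(f"{label:9s} {comp:6.0f} Hz   single: {line_amp(single, comp):.4f}"
          f"   4-phase average: {line_amp(averaged, comp):.4f}")

Any combination whose f3 coefficient is not a multiple of 4 is cancelled exactly by the averaging, which is why “most,” though not all, combination tones containing f3 disappear, whereas the 2f1-f2 line, which does not contain f3, is identical in the single and averaged recordings.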
To characterize the influence of an interfering tone at f3, interference response areas are built (FIGURE 19), representing how this f3 tone influences the amplitude and phase of a given DPOAE, as a function of the interfering level (the term interference is more appropriate than suppression, as the DPOAE amplitude can sometimes increase in the presence of f3). A striking feature of these areas is the presence, at moderately loud primary levels, of sharp high-frequency interference lobes when f3 is set at frequencies that can be more than one octave above f2 (FIGURE 19) (66, 163, 169). A simple explanation is that there is a secondary site of DP generation in the high-frequency region, in addition to those already described; that the DP generated in this region is comparable in level to the contributions from the f2 and 2f1-f2 sites; and that enhancement or suppression is observed depending on the phase difference among the mixed DP contributions, so that when the f3 tone removes the basal contribution, the overall level can increase if this contribution is out of phase with the other ones. Therefore, if the plotting procedure takes into account only level changes and not phase shifts, the interference effect can well be missed (66, 302). The mechanism of basal generation of two-tone DPOAEs, revealed by a third tone, calls for an explanation. A first possibility is a harmonic mechanism (64): for example, a sound at 2f1 could be produced around the peak response to f1 and induce a response at its characteristic 2f1 place. Although small, because it has to travel in the wrong direction over too loose a BM, in certain circumstances the 2f1 harmonic might remain large enough to interact nonlinearly with the lower frequency f2 stimulus, thereby generating a 2f1-f2 difference tone through a quadratic mechanism akin to that producing f2-f1 at the place tuned to f2. Direct measurements at the cochlear apex confirm the existence of a very large mechanical component at 2f1 (85), and the high-frequency cutoff of the BM may be less sharp at high levels than it usually is. Another mechanism, termed catalyst by Fahey et al. (64), assumes the existence of richer nonlinear interactions in the presence of a third tone, which could generate 2f1-f2 and 2f2-f1 responses via intermediate stages involving f1, f3 and f2, f3 combinations. Contrary to the harmonic mechanism, for which the third tone serves only to unveil an existing source, the catalyst mechanism requires the f3 component. As an example of a catalytic pathway, the combination tones f3 - f1 and f3 - 2f2 could combine at the f3 - f1 place and generate a difference tone at 2f2 - f1. Likewise, the mixes (f3 - f2) - (f3 - f1) + f1 or (f3 - f2 + f1) - f3 + f1 could generate a 2f1-f2 component adding up to the cubic 2f1-f2 produced near f2 (see the identities written out below). Last, cogent evidence has been produced (170), with the help of accurately mapped lesions of OHCs, of basal interactions between the tails of the f1 and f2 excitation patterns able to produce sizeable components. Surprisingly, phase characteristics previously attributed to a coherent-reflection mechanism thought to occur at the 2f1-f2 CF place were observed (vertical banding), in humans and rabbits, which suggests that the two-mechanism taxonomy may have to be completed. Whether the possible contribution of a traveling mode along Reissner’s membrane would better account for this vertical banding pattern remains to be investigated (217).
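The frequency bookkeeping of the catalytic pathways just mentioned can be verified directly; writing the intermediate combination tones explicitly,

\[
(f_3 - f_1) - (f_3 - 2 f_2) = 2 f_2 - f_1 , \qquad
(f_3 - f_2) - (f_3 - f_1) + f_1 = 2 f_1 - f_2 , \qquad
(f_3 - f_2 + f_1) - f_3 + f_1 = 2 f_1 - f_2 ,
\]

so that, whatever the value of f3, the end products fall exactly on the usual cubic DP frequencies, whereas every intermediate stage involves f3 and therefore vanishes when the third tone is turned off, in keeping with the statement that the catalyst mechanism requires the f3 component.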
As for the simpler case of OAEs produced by a single tonal stimulus (SFOAEs), a similar application of the catalyst principle may account for the outcome of suppression experiments in which a suppressor tone at fs is added to the SFOAE-evoking probe stimulus at frequency fp (251). Nonlinear combinations of fs and fp can produce, for example, 3fp - fs, 2fp - fs, fs + 2fp, fs + fp, etc., distortion components, from which [(3fp - fs) - (2fp - fs)] or [(fs - 2fp) + (3fp - fs)] combinations produce fp, perhaps more efficiently than the fp produced by coherent reflection, the strength of which has been called into question in small rodents (167). The production of an SFOAE at fp by catalyst processes would occur at more basal sites than by coherent reflection, thus with a shorter group latency (see the timing and backward-propagation controversy in sect. IVF). Recent protocols have been designed with combinations of even more than three tones. DP phase-gradient paradigms (for example, an f1 sweep at fixed f2) afford a noninvasive means to evaluate cochlear travel times. With the goal of acquiring data simultaneously, so as to avoid interrecording variability and to improve the computation of phase gradients, the lower of the two tones of the traditional f1, f2 stimulus can be replaced by a closely spaced group of tones (gi, gj, etc.) (175). The resulting DPs are made of several groups of simultaneous frequency components, from which group delays can be derived from a single recording. Among the DPs, a novel category emerged, made of sidebands around the single-tone primary f2. This group has no two-tone equivalent, as it consists of a set of sidebands around the single stimulus component f2 at frequencies f2 + gi - gj, with i ≠ j. With stimulus frequencies chosen such that all possible difference and sum frequencies were unique [a so-called “zwuis” complex (272)], each i ≠ j combination resulted in a unique DPOAE frequency, unambiguously assigned to a single pair (gi, gj). Another advantage of this sideband complex of DPs is that it persists at wide frequency separations (several octaves) between f2 and gi, whereas with an (f1, f2) pair, the ratio f2/f1 can hardly exceed 1.5-1.6. Moreover, the DPOAE frequencies depend only on the frequency differences gi - gj and are unaffected by a collective frequency shift of the entire tone complex g. It is therefore possible to use a quite low average frequency for g, such that the forward travel time of the g stimuli to the place of nonlinear interaction at the f2 CF place is very brief. Thus the time properties of these DPs should be associated principally with the reverse journey of the DPs, lately a hotly debated matter (231) (see sect. VH).
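A minimal sketch of what “all possible difference and sum frequencies are unique” means in practice (the tone values are invented for illustration and are not those of Refs. 175 or 272): a brute-force check that every pairwise sum and difference of a candidate tone complex occurs only once, so that each DP line can be assigned unambiguously to a single pair (gi, gj).

from itertools import combinations

def is_zwuis_like(freqs):
    """True if all pairwise sums and absolute differences are distinct,
    so that every second-order combination tone points back to one pair."""
    seen = {}
    for a, b in combinations(freqs, 2):
        for f in (a + b, abs(a - b)):
            if f in seen:
                return False, f, seen[f], (a, b)
            seen[f] = (a, b)
    return True, None, None, None

good = [413, 467, 541, 601, 659]    # invented example: all sums/differences distinct
bad = [400, 500, 600, 700]          # ambiguous: e.g., 500 + 600 = 400 + 700

for name, g in (("good", good), ("bad", bad)):
    ok, f, pair1, pair2 = is_zwuis_like(g)
    print(name, "unique" if ok else f"collision at {f} Hz between {pair1} and {pair2}")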
XV. APPENDIX IV: CHARACTERIZATION OF NONLINEARITIES BY WIENER KERNELS

In terms of the mathematics for describing a biological system such as the cochlea, it is customary to test it with the help of a restricted set of stimuli, for example, pure tones, and then to extrapolate the results so as to predict the response of the system to any stimulation, however complex, for example, speech. This is straightforward for a linear system, as it can be equally well described by a transfer function in the spectral domain (for example, the bell-shaped transfer function of a bandpass filter) or, in the time domain, by its impulse response (the damped oscillation obtained when the bandpass filter is excited by a short click). A third method is to use Gaussian white noise as a stimulus; the first-order cross-correlation h1(τ) between this stimulus and the system’s response is called the first-order Wiener kernel. In a linear system, h1(τ) is identical to the impulse response of the system. In a nonlinear system, the impulse response of the system also contains other components. The use of Gaussian white noise as input to the auditory system, with spikes in auditory nerve fibers as the output, was first introduced by Egbert de Boer (46), who computed the revcor (for “reverse correlation”) function as the average value of the stimulus x(t) at time τ before the occurrence of a spike; normalization to the stimulus power spectral density provides the cross-correlation h1(τ). In a linear system, from any of these equivalent descriptions it is possible to correctly predict any response; for example, the response to a sum of sinusoids is the sum of the responses to each sinusoid presented alone. Nonlinear systems are more difficult to analyze, as their response to a sum of sinusoids is not the sum of the responses to each sinusoid presented alone. The use of perturbation theory is one convenient solution for weakly nonlinear systems. For more strongly nonlinear systems, a first possibility is to use a model and derive from its analysis a set of differential equations describing the response of the system. The parameters of the model can then be adjusted by comparing model predictions with experimental data. The topology of a strongly nonlinear system, however, is seldom well known, and a more general and systematic method relies on the Volterra and Wiener formalisms (284, 294; reviewed in Ref. 59), which describe the response of a system as a sum of functionals including correction terms unfolding the nonlinearities at increasing orders. With Gaussian white noise as input to the system, its Wiener kernels are independent of one another and can be computed from input-output cross-correlations. So the outcome of the revcor method of de Boer is proportional to the first-order Wiener kernel. It provides an idea of the impulse response of the system connected to the auditory nerve fiber under investigation, from which, by Fourier transform, the bandpass filter characteristics (i.e., frequency tuning) can be derived. For nonlinear systems, the first-order Wiener kernel is only one component of the impulse response, to which higher kernels also contribute. In general, second-order kernels give a measure of the “cross-talk” between the responses to two impulses, i.e., of how much the presence of the first impulse influences the response to the second one (162). Another prominent advantage of the second-order Wiener kernel, regarding auditory nerve fibers, is that, thanks to even-order cochlear nonlinearities encoded in the low-frequency response envelopes, it still provides information on temporal coding in auditory nerve fibers innervating the entire length of the cochlea, whereas the revcor function vanishes as neuronal phase-locking decreases (215, 275). Studies of the nonlinearities themselves have been less rewarding so far, because their translation into physiological terms often requires models that can be probed against experimental data (215).
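A minimal numerical sketch of the first-order kernel idea (the bandpass filter, the sampling rate, and the use of a directly observable output waveform are arbitrary simplifications; de Boer’s actual revcor protocol averaged the stimulus preceding neural spikes): cross-correlating a Gaussian white-noise input with the output of a linear bandpass filter, normalized by the input variance, recovers that filter’s impulse response.

import numpy as np
from scipy.signal import butter, lfilter, unit_impulse

fs = 20000.0                                   # toy sampling rate (Hz)
n = 200000
rng = np.random.default_rng(1)
x = rng.standard_normal(n)                     # Gaussian white-noise stimulus

b, a = butter(2, [800.0, 1200.0], btype="band", fs=fs)  # stand-in bandpass "system"
y = lfilter(b, a, x)                           # linear system response

n_lags = 400
# First-order Wiener kernel estimate: input-output cross-correlation,
# normalized by the input variance (flat spectrum for white noise).
h1 = np.array([np.mean(y[lag:] * x[:n - lag]) for lag in range(n_lags)]) / np.var(x)

true_ir = lfilter(b, a, unit_impulse(n_lags))  # the filter's actual impulse response
print("max |impulse response|      =", np.max(np.abs(true_ir)))
print("max |h1 - impulse response| =", np.max(np.abs(h1 - true_ir)))

For this linear toy system the estimated kernel matches the impulse response to within the statistical noise of the finite-length recording; for a nonlinear system, h1(τ) would capture only part of the response, the remainder being described by the higher-order kernels discussed above.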
Two types of conclusions can then be reached. In Reference 275, the authors, who described the hearing organ and auditory neurons of frogs with the help of a sandwich model incorporating filters and nonlinearities in cascade, deduced the orders of the filters from a careful inspection of the second-order Wiener kernel. Yet they also had to acknowledge the failure of their model to account for the two-tone interactions revealed by Wiener analysis. In Reference 156, Wiener analysis applied to noise-evoked OAEs demonstrated the consistency of the first-order kernel with click-evoked OAEs. The compressive level dependence of cochlear responses was revealed by the nonlinear dependence of the first-order Wiener kernel on stimulus level, but the fact that no emission component was found in the higher-order kernels indicated that, at each particular noise level, the cochlea behaved nearly linearly, as a system with an AGC would once it has settled after a time delay.

ACKNOWLEDGMENTS

We are particularly indebted to Jean-Pierre Hardelin at Institut Pasteur for his many apt suggestions for improving the manuscript and for his meticulous editing work. The recommendations of two anonymous reviewers were of great help for improving the consistency of arguments from section to section and better emphasizing the harmony of all manifestations of distortion reviewed in this text.

Address for reprint requests and other correspondence: P. Avan, Laboratory of Neurosensory Biophysics (INSERM UMR 1107), School of Medicine, 28 Place Henri Dunant, 63000 Clermont-Ferrand, France (e-mail: paul.avan@udamail.fr).

GRANTS

The authors’ research was supported by grants from ERC 2011-AdG nr 294570, Fondation Louis-Jeantet, Novalis-Taitbout, Réunica-Prévoyance, ANR MNP P007354 “Presbycusis,” ANR 08-ETEC-001-01 “Audiapic,” Groupe Entendre, and Fondation de l’Avenir pour la Recherche Médicale (ET1-617).

DISCLOSURES

No conflicts of interest, financial or otherwise, are declared by the authors.

REFERENCES

1. Abdala C, Dhar S. Distortion product otoacoustic emission phase and component analysis in human newborns. J Acoust Soc Am 127: 316–325, 2010.
2. Abel C, Kossl M. Sensitive response to low-frequency cochlear distortion products in the auditory midbrain. J Neurophysiol 101: 1560–1574, 2009.
3. Abel C, Wittekindt A, Kossl M. Contralateral acoustic stimulation modulates low-frequency biasing of DPOAE: efferent influence on cochlear amplifier operating state? J Neurophysiol 101: 2362–2371, 2009.
4. Aerts JR, Dirckx JJ. Nonlinearity in eardrum vibration as a function of frequency and sound pressure. Hear Res 263: 26–32, 2010.
5. Allen JB, Fahey PF. A second cochlear-frequency map that correlates distortion product and neural tuning measurements. J Acoust Soc Am 94: 809–816, 1993.
6. Allen JB, Fahey PF. Using acoustic distortion products to measure the cochlear amplifier gain on the basilar membrane. J Acoust Soc Am 92: 178–188, 1992.
7. Anderson SD, Kemp DT. The evoked cochlear mechanical response in laboratory primates. A preliminary report. Arch Oto-rhino-laryngol 224: 47–54, 1979.
8. Ashmore J. Cochlear outer hair cell motility. Physiol Rev 88: 173–210, 2008.
9. Avan P, Bonfils P, Gilain L, Mom T. Physiopathological significance of distortion-product otoacoustic emissions at 2f1-f2 produced by high- versus low-level stimuli. J Acoust Soc Am 113: 430–441, 2003.
10. Avan P, Bonfils P, Loth D, Elbez M, Erminy M. Transient-evoked otoacoustic emissions and high-frequency acoustic trauma in the guinea pig. J Acoust Soc Am 97: 3012–3020, 1995.
11. Avan P, Giraudet F, Chauveau B, Gilain L, Mom T. Unstable distortion-product otoacoustic emission phase in Meniere’s disease. Hear Res 277: 88–95, 2011.
12. Avan P, Loth D, Menguy C, Teyssou M. Evoked otoacoustic emissions in guinea pig: basic characteristics. Hear Res 44: 151–160, 1990.
13. Avan P, Magnan P, Smurzynski J, Probst R, Dancer A. Direct evidence of cubic difference tone propagation by intracochlear acoustic pressure measurements in the guinea-pig. Eur J Neurosci 10: 1764–1770, 1998.
14. Barral J, Martin P. Phantom tones and suppressive masking by active nonlinear oscillation of the hair-cell bundle. Proc Natl Acad Sci USA 109: E1344–1351, 2012.
14a. Békésy G von. Experiments in Hearing, translated and edited by E. G. Wever. New York: McGraw-Hill, 1960.
15. Bergevin C, Freeman DM, Saunders JC, Shera CA. Otoacoustic emissions in humans, birds, lizards, and frogs: evidence for multiple generation mechanisms. J Comp Physiol A Neuroethol Sens Neural Behav Physiol 194: 665–683, 2008.
16. Bergevin C, Shera CA. Coherent reflection without traveling waves: on the origin of long-latency otoacoustic emissions in lizards. J Acoust Soc Am 127: 2398–2409, 2010.
17. Bergevin C, Velenovsky DS, Bonine KE. Tectorial membrane morphological variation: effects upon stimulus frequency otoacoustic emissions. Biophys J 99: 1064–1072, 2010.
18. Beurg M, Fettiplace R, Nam JH, Ricci AJ. Localization of inner hair cell mechanotransducer channels using high-speed calcium imaging. Nat Neurosci 12: 553–558, 2009.
19. Bialek W, Wit HP. Quantum limits to oscillator stability: theory and experiments on an acoustic emission from the human ear. Phys Lett A 104: 173–178, 1984.
20. Bian L. Effects of low-frequency biasing on spontaneous otoacoustic emissions: frequency modulation. J Acoust Soc Am 124: 3009–3021, 2008.
21. Bian L, Chertoff ME, Miller E. Deriving a cochlear transducer function from low-frequency modulation of distortion product otoacoustic emissions. J Acoust Soc Am 112: 198–210, 2002.
22. Bonfils P, Uziel A, Pujol R. Screening for auditory dysfunction in infants by evoked oto-acoustic emissions. Arch Otolaryngol Head Neck Surg 114: 887–890, 1988.
23. Boutet de Monvel J, Petit C. Wrapping up stereocilia rootlets. Cell 141: 748–750, 2010.
24. Braun M. Frequency spacing of multiple spontaneous otoacoustic emissions shows relation to critical bands: a large-scale cumulative study. Hear Res 114: 197–203, 1997.
25. Brown AM. Continuous low level sound alters cochlear mechanics: an efferent effect? Hear Res 34: 27–38, 1988.
26. Brown AM, Gaskill SA, Williams DM. Mechanical filtering of sound in the inner ear. Proc Biol Sci 250: 29–34, 1992.
27. Brown AM, Kemp DT. Suppressibility of the 2f1-f2 stimulated acoustic emissions in gerbil and man. Hear Res 13: 29–37, 1984.
28. Brown AM, McDowell B, Forge A. Acoustic distortion products can be used to monitor the effects of chronic gentamicin treatment. Hear Res 42: 143–156, 1989.
29. Brown DJ, Hartsock JJ, Gill RM, Fitzgerald HE, Salt AN. Estimating the operating point of the cochlear transducer using low-frequency biased distortion products. J Acoust Soc Am 125: 2129–2145, 2009.
30. Brownell WE, Bader CR, Bertrand D, de Ribaupierre Y. Evoked mechanical responses of isolated cochlear outer hair cells. Science 227: 194–196, 1985.
31. Brundin L, Flock A, Khanna SM, Ulfendahl M. Frequency-specific position shift in the guinea pig organ of Corti.
Neurosci Lett 128: 77– 80, 1991. 32. Büki B, Avan P, Lemaire JJ, Dordain M, Chazal J, Ribari O. Otoacoustic emissions: a new tool for monitoring intracranial pressure changes through stapes displacements. Hear Res 94: 125–139, 1996. 33. Caberlotto E, Michel V, Foucher I, Bahloul A, Goodyear RJ, Pepermans E, Michalski N, Perfettini I, Alegria-Prevot O, Chardenoux S, Do Cruzeiro M, Hardelin JP, Richardson Physiol Rev • VOL 93 • OCTOBER 2013 • www.prv.org 1613 AVAN, BÜKI, AND PETIT GP, Avan P, Weil D, Petit C. Usher type 1G protein sans is a critical component of the tip-link complex, a structure controlling actin polymerization in stereocilia. Proc Natl Acad Sci USA 108: 5825–5830, 2011. 34. Camalet S, Duke T, Jülicher F, Prost J. Auditory sensitivity provided by self-tuned critical oscillations of hair cells. Proc Natl Acad Sci USA 97: 3183–3188, 2000. 59. Eggermont JJ. Wiener and Volterra analyses applied to the auditory system. Hear Res 66: 177–201, 1993. 60. Eguiluz VM, Ospeck M, Choe Y, Hudspeth AJ, Magnasco MO. Essential nonlinearities in hearing. Phys Rev Lett 84: 5232–5235, 2000. 61. Elliott E. A ripple effect in the audiogram. Nature 181: 1076, 1958. 35. Chan DK, Hudspeth AJ. Ca2⫹ current-driven nonlinear amplification by the mammalian cochlea in vitro. Nat Neurosci 8: 149 –155, 2005. 36. Chan DK, Hudspeth AJ. Mechanical responses of the organ of corti to acoustic and electrical stimulation in vitro. Biophys J 89: 4382– 4395, 2005. 37. Chang KW, Norton SJ. The effects of continuous versus interrupted noise exposures on distortion product otoacoustic emissions in guinea pigs. Hear Res 96: 1–12, 1996. 38. Cheatham MA, Dallos P. Two-tone interactions in the cochlear microphonic. Hear Res 8: 29 – 48, 1982. 62. Engebretson AM, Eldredge DH. Model for the nonlinear characteristics of cochlear potentials. J Acoust Soc Am 44: 548 –554, 1968. 63. Epp B, Verhey JL, Mauermann M. Modeling cochlear dynamics: interrelation between cochlea mechanics and psychoacoustics. J Acoust Soc Am 128: 1870 –1883, 2010. 64. Fahey PF, Stagner BB, Lonsbury-Martin BL, Martin GK. Nonlinear interactions that could explain distortion product interference response areas. J Acoust Soc Am 108: 1786 –1802, 2000. 39. Cherry EC. Some experiments on the recognition of speech, with one and two ears. J Acoust Soc Am 25: 975–979, 1953. 65. Fahey PF, Stagner BB, Martin GK. Mechanism for bandpass frequency characteristic in distortion product otoacoustic emission generation. J Acoust Soc Am 119: 991–996, 2006. 40. Collet L, Kemp DT, Veuillet E, Duclaux R, Moulin A, Morgon A. Effect of contralateral auditory stimuli on active cochlear micro-mechanical properties in human subjects. Hear Res 43: 251–261, 1990. 66. Fahey PF, Stagner BB, Martin GK. Source of level dependent minima in rabbit distortion product otoacoustic emissions. J Acoust Soc Am 124: 3694 –3707, 2008. 67. Fletcher H. Auditory patterns. Rev Mod Phys 12: 47– 65, 1940. 41. Cooper NP. Harmonic distortion on the basilar membrane in the basal turn of the guinea-pig cochlea. J Physiol 509: 277–288, 1998. 42. Cooper NP. Two-tone suppression in cochlear mechanics. J Acoust Soc Am 99: 3087– 3098, 1996. 43. Cooper NP, Rhode WS. Mechanical responses to two-tone distortion products in the apical and basal turns of the mammalian cochlea. J Neurophysiol 78: 261–270, 1997. 44. Crawford AC, Fettiplace R. The mechanical properties of ciliary bundles of turtle cochlear hair cells. J Physiol 364: 359 –379, 1985. 45. Davis H. An active process in cochlear mechanics. 
Hear Res 9: 79 –90, 1983. 46. De Boer E, Kuyper P. Triggered correlation. IEEE Trans Biomed Eng 15: 169 –179, 1968. 47. De Kleine E, Wit HP, van Dijk P, Avan P. The behavior of spontaneous otoacoustic emissions during and after postural changes. J Acoust Soc Am 107: 3308 –3316, 2000. 48. Delgutte B. Physiological mechanisms of psychophysical masking: observations from auditory-nerve fibers. J Acoust Soc Am 87: 791– 809, 1990. 49. Dhar S, Rogers A, Abdala C. Breaking away: violation of distortion emission phasefrequency invariance at low frequencies. J Acoust Soc Am 129: 3115–3122, 2011. 50. Dierkes K, Lindner B, Jülicher F. Enhancement of sensitivity gain and frequency tuning by coupling of active hair bundles. Proc Natl Acad Sci USA 105: 18669 –18674, 2008. 51. Dong W, Olson ES. Local cochlear damage reduces local nonlinearity and decreases generator-type cochlear emissions while increasing reflector-type emissions. J Acoust Soc Am 127: 1422–1431, 2010. 52. Dong W, Olson ES. Middle ear forward and reverse transmission in gerbil. J Neurophysiol 95: 2951–2961, 2006. 53. Dong W, Olson ES. Supporting evidence for reverse cochlear traveling waves. J Acoust Soc Am 123: 222–240, 2008. 54. Dong W, Olson ES. Two-tone distortion in intracochlear pressure. J Acoust Soc Am 117: 2999 –3015, 2005. 55. Duifhuis H. Power law nonlinearities: a review of some less familiar properties. In: Cochlear Mechanisms: Structure, Function and Models, edited by Wilson JP and Kemp DT. New York: Plenum, 1989, p. 395– 403. 68. Flock A, Strelioff D. Graded and nonlinear mechanical properties of sensory hairs in the mammalian hearing organ. Nature 310: 597–599, 1984. 69. Francey LJ, Conlin LK, Kadesch HE, Clark D, Berrodin D, Sun Y, Glessner J, Hakonarson H, Jalas C, Landau C, Spinner NB, Kenna M, Sagi M, Rehm HL, Krantz ID. Genome-wide SNP genotyping identifies the Stereocilin (STRC) gene as a major contributor to pediatric bilateral sensorineural hearing impairment. Am J Med Genet A 158: 298 –308, 2012. 70. Frank G, Kossl M. The acoustic two-tone distortions 2f1-f2 and f2-f1 and their possible relation to changes in the operating point of the cochlear amplifier. Hear Res 98: 104 –115, 1996. 70a.Fresnel AJ. Oeuvres Completes d’Augustin Fresnel Tome Premier, edited by H. de Senarmont, E. Verdet, and L. Fresnel. Paris: Imprimerie Impériale, 1866. 71. Frolenkov GI, Mammano F, Belyantseva IA, Coling D, Kachar B. Two distinct Ca2⫹dependent signaling pathways regulate the motor output of cochlear outer hair cells. J Neurosci 20: 5940 –5948, 2000. 72. Gaskill SA, Brown AM. Suppression of human acoustic distortion product: dual origin of 2f1-f2. J Acoust Soc Am 100: 3268 –3274, 1996. 73. Geisler CD. Two-tone suppression by a saturating feedback model of the cochlear partition. Hear Res 63: 203–210, 1992. 74. Geisler CD, Greenberg S. A two-stage nonlinear cochlear model possesses automatic gain control. J Acoust Soc Am 80: 1359 –1363, 1986. 75. Geisler CD, Yates GK, Patuzzi RB, Johnstone BM. Saturation of outer hair cell receptor currents causes two-tone suppression. Hear Res 44: 241–256, 1990. 76. Glanville JD, Coles RR, Sullivan BM. A family with high-tonal objective tinnitus. J Laryngol Otol 85: 1–10, 1971. 77. Gold T. Hearing II: the physical basis of the action of the cochlea. Proc Roy Soc Lond B Biol Sci 135: 492– 491, 1948. 78. Gold T. Historical background to the proposal 40 years ago of an active model for cochlear frequency analysis. 
In: Cochlear Mechanisms: Structure, Function and Models, edited by Wilson JP and Kemp DT. London: Plenum, 1989, p. 299 –305. 79. Goldstein JL. Auditory nonlinearity. J Acoust Soc Am 41: 676 – 689, 1967. 56. Duke T, Jülicher F. Active traveling wave in the cochlea. Phys Rev Lett 90: 158101, 2003. 80. Goodman SS, Withnell RH, Shera CA. The origin of SFOAE microstructure in the guinea pig. Hear Res 183: 7–17, 2003. 57. Durrant JD, Wang J, Ding DL, Salvi RJ. Are inner or outer hair cells the source of summating potentials recorded from the round window? J Acoust Soc Am 104: 370 – 377, 1998. 81. Goodyear RJ, Marcotti W, Kros CJ, Richardson GP. Development and properties of stereociliary link types in hair cells of the mouse cochlea. J Comp Neurol 485: 75– 85, 2005. 58. Egan JP, Hake HW. On the masking pattern of a simple auditory stimulus. J Acoust Soc Am 22: 622– 630, 1950. 82. Gopfert MC, Robert D. Active auditory mechanics in mosquitoes. Proc Biol Sci 268: 333–339, 2001. 1614 Physiol Rev • VOL 93 • OCTOBER 2013 • www.prv.org AUDITORY DISTORTIONS 83. Guinan JJ Jr. Cochlear efferent innervation and function. Curr Opin Otolaryngol Head Neck Surg 18: 447– 453, 2010. 108. Johnsen NJ, Bagi P, Elberling C. Evoked acoustic emissions from the human ear. III. Findings in neonates. Scand Audiol 12: 17–24, 1983. 84. Gummer AW, Hemmert W, Zenner HP. Resonant tectorial membrane motion in the inner ear: its crucial role in frequency tuning. Proc Natl Acad Sci USA 93: 8727– 8732, 1996. 109. Johnson SL, Beurg M, Marcotti W, Fettiplace R. Prestin-driven cochlear amplification is not limited by the outer hair cell membrane time constant. Neuron 70: 1143–1154, 2011. 85. Hao LF, Khanna SM. Mechanical nonlinearity in the apical turn of the guinea pig organ of Corti. Hear Res 148: 31– 46, 2000. 110. Jülicher F, Andor D, Duke T. Physical basis of two-tone interference in hearing. Proc Natl Acad Sci USA 98: 9080 –9085, 2001. 86. Hao LF, Khanna SM. Vibrations of the guinea pig organ of Corti in the apical turn. Hear Res 148: 47– 62, 2000. 111. Kalluri R, Shera CA. Distortion-product source unmixing: a test of the two-mechanism model for DPOAE generation. J Acoust Soc Am 109: 622– 637, 2001. 87. Harris DM, Dallos P. Forward masking of auditory nerve fiber responses. J Neurophysiol 42: 1083–1107, 1979. 112. Kalluri R, Shera CA. Near equivalence of human click-evoked and stimulus-frequency otoacoustic emissions. J Acoust Soc Am 121: 2097–2110, 2007. 88. Harris FP, Lonsbury-Martin BL, Stagner BB, Coats AC, Martin GK. Acoustic distortion products in humans: systematic changes in amplitudes as a function of f2/f1 ratio. J Acoust Soc Am 85: 220 –229, 1989. 113. Kanis LJ, de Boer E. Frequency dependence of acoustic distortion products in a locally active model of the cochlea. J Acoust Soc Am 101: 1527–1531, 1997. 89. Harris FP, Probst R. Reporting click-evoked and distortion-product otoacoustic emission results with respect to the pure-tone audiogram. Ear Hear 12: 399 – 405, 1991. 90. Harte JM, Elliott SJ, Kapadia S, Lutman ME. Dynamic nonlinear cochlear model predictions of click-evoked otoacoustic emission suppression. Hear Res 207: 99 –109, 2005. 91. He N, Schmiedt RA. Fine structure of the 2 f1-f2 acoustic distortion products: effects of primary level and frequency ratios. J Acoust Soc Am 101: 3554 –3565, 1997. 92. He NJ, Schmiedt RA. Fine structure of the 2f1-f2 acoustic distortion product: changes with primary level. J Acoust Soc Am 94: 2659 –2669, 1993. 93. He W, Fridberger A, Porsov E, Grosh K, Ren T. 
Reverse wave propagation in the cochlea. Proc Natl Acad Sci USA 105: 2729 –2733, 2008. 94. He W, Fridberger A, Porsov E, Ren T. Fast reverse propagation of sound in the living cochlea. Biophys J 98: 2497–2505, 2010. 95. Helmholtz Hv. Die Lehre von den Tonempfindungen als physiologische Grundlage für die Theorie der Musik. Brunswick: Vieweg, 1863. 96. Horner KC, Lenoir M, Bock GR. Distortion product otoacoustic emissions in hearingimpaired mutant mice. J Acoust Soc Am 78: 1603–1611, 1985. 97. Housley GD, Ashmore JF. Ionic currents of outer hair cells isolated from the guineapig cochlea. J Physiol 448: 73–98, 1992. 98. Houtgast T. Lateral Suppression in Hearing. Amsterdam: Academische Pers., 1974. 99. Houtsma AJM, Goldstein JL. The central origin of the pitch of complex tones: evidence from musical interval recognition. J Acoust Soc Am 51: 520 –529, 1972. 100. Howard J, Ashmore JF. Stiffness of sensory hair bundles in the sacculus of the frog. Hear Res 23: 93–104, 1986. 101. Howard J, Hudspeth AJ. Compliance of the hair bundle associated with gating of mechanoelectrical transduction channels in the bullfrog’s saccular hair cell. Neuron 1: 189 –199, 1988. 102. Howard J, Hudspeth AJ. Mechanical relaxation of the hair bundle mediates adaptation in mechanoelectrical transduction by the bullfrog’s saccular hair cell. Proc Natl Acad Sci USA 84: 3064 –3068, 1987. 103. Howard MA, Stagner BB, Foster PK, Lonsbury-Martin BL, Martin GK. Suppression tuning in noise-exposed rabbits. J Acoust Soc Am 114: 279 –293, 2003. 104. Hudspeth AJ. Making an effort to listen: mechanical amplification in the ear. Neuron 59: 530 –545, 2008. 105. Hudspeth AJ, Jülicher F, Martin P. A critique of the critical cochlea: Hopf–a bifurcation–is better than none. J Neurophysiol 104: 1219 –1229, 2010. 114. Kapadia S, Lutman ME. Nonlinear temporal interactions in click-evoked otoacoustic emissions. I. Assumed model and polarity-symmetry. Hear Res 146: 89 –100, 2000. 115. Kapadia S, Lutman ME. Nonlinear temporal interactions in click-evoked otoacoustic emissions. II. Experimental data. Hear Res 146: 101–120, 2000. 116. Karavitaki KD, Corey DP. Hair bundle mechanics at high frequencies: a test of series or parallel transduction. In: Auditory Mechanisms: Processes and Models, edited by Nuttall AL, Ren T, Gillespie P, Grosh K, and de Boer E. Hackensack, NJ: World Scientific, 2005, p. 286 –292. 117. Karavitaki KD, Corey DP. Sliding adhesion confers coherent motion to hair cell stereocilia and parallel gating to transduction channels. J Neurosci 30: 9051–9063, 2010. 118. Kawashima Y, Geleoc GS, Kurima K, Labay V, Lelli A, Asai Y, Makishima T, Wu DK, Della Santina CC, Holt JR, Griffith AJ. Mechanotransduction in mouse inner ear hair cells requires transmembrane channel-like genes. J Clin Invest 121: 4796 – 4809, 2011. 119. Kazmierczak P, Sakaguchi H, Tokita J, Wilson-Kubalek EM, Milligan RA, Muller U, Kachar B. Cadherin 23 and protocadherin 15 interact to form tip-link filaments in sensory hair cells. Nature 449: 87–91, 2007. 120. Kemp DT. Otoacoustic emissions: concepts and origins. In: Active Processes and Otoacoustic Emissions, edited by Manley GA, Fay RR, and Popper AN. New York: Springer, 2008, p. 1–38. 121. Kemp DT. Stimulated acoustic emissions from within the human auditory system. J Acoust Soc Am 64: 1386 –1391, 1978. 122. Kemp DT, Bray P, Alexander L, Brown AM. Acoustic emission cochleography–practical aspects. Scand Audiol Suppl 25: 71–95, 1986. 123. Kemp DT, Chum R. 
Properties of the generator of stimulated acoustic emissions. Hear Res 2: 213–232, 1980. 124. Khanna SM. The response of the apical turn of cochlea modeled with a tuned amplifier with negative feedback. Hear Res 194: 97–108, 2004. 125. Khanna SM, Hao LF. Amplification in the apical turn of the cochlea with negative feedback. Hear Res 149: 55–76, 2000. 126. Kiang NY, Moxon EC. Tails of tuning curves of auditory-nerve fibers. J Acoust Soc Am 55: 620 – 630, 1974. 127. Kim DO, Molnar CE, Matthews JW. Cochlear mechanics: nonlinear behavior in twotone responses as reflected in cochlear-nerve-fiber responses and in ear-canal sound pressure. J Acoust Soc Am 67: 1704 –1721, 1980. 128. Kim DO, Siegel JH, Molnar CE. Cochlear nonlinear phenomena in two-tone responses. Scand Audiol Suppl 63– 81, 1979. 129. Kim KX, Fettiplace R. Developmental changes in the cochlear hair cell mechanotransducer channel and their regulation by transmembrane channel-like proteins. J Gen Physiol 141: 141–148, 2013. 106. Jaramillo F, Markin VS, Hudspeth AJ. Auditory illusions and the single hair cell. Nature 364: 527–529, 1993. 130. Kimberley BP, Brown DK, Eggermont JJ. Measuring human cochlear traveling wave delay using distortion product emission phase responses. J Acoust Soc Am 94: 1343– 1350, 1993. 107. Javel E, Geisler CD, Ravindran A. Two-tone suppression in auditory nerve of the cat: rate-intensity and temporal analyses. J Acoust Soc Am 63: 1093–1104, 1978. 131. Kirk DL, Johnstone BM. Modulation of f2-f1: evidence for a GABAergic efferent system in apical cochlea of the guinea pig. Hear Res 67: 20 –34, 1993. Physiol Rev • VOL 93 • OCTOBER 2013 • www.prv.org 1615 AVAN, BÜKI, AND PETIT 132. Kitajiri S, Sakamoto T, Belyantseva IA, Goodyear RJ, Stepanyan R, Fujiwara I, Bird JE, Riazuddin S, Riazuddin S, Ahmed ZM, Hinshaw JE, Sellers J, Bartles JR, Hammer JA 3rd, Richardson GP, Griffith AJ, Frolenkov GI, Friedman TB. Actin-bundling protein TRIOBP forms resilient rootlets of hair cell stereocilia essential for hearing. Cell 141: 786 –798, 2010. 133. Knight RD, Kemp DT. Indications of different distortion product otoacoustic emission mechanisms from a detailed f1,f2 area study. J Acoust Soc Am 107: 457– 473, 2000. 134. Knight RD, Kemp DT. Wave and place fixed DPOAE maps of the human ear. J Acoust Soc Am 109: 1513–1525, 2001. 135. Kozlov AS, Baumgart J, Risler T, Versteegh CP, Hudspeth AJ. Forces between clustered stereocilia minimize friction in the ear on a subnanometre scale. Nature 474: 376 –379, 2011. 136. Kozlov AS, Risler T, Hudspeth AJ. Coherent motion of stereocilia assures the concerted gating of hair-cell transduction channels. Nat Neurosci 10: 87–92, 2007. 137. Kummer P, Janssen T, Hulin P, Arnold W. Optimal L(1)-L(2) primary tone level separation remains independent of test frequency in humans. Hear Res 146: 47–56, 2000. 138. Le Calvez S, Guilhaume A, Romand R, Aran JM, Avan P. CD1 hearing-impaired mice. II. Group latencies and optimal f2/f1 ratios of distortion product otoacoustic emissions, and scanning electron microscopy. Hear Res 120: 51– 61, 1998. 139. Legan PK, Lukashkina VA, Goodyear RJ, Kossi M, Russell IJ, Richardson GP. A targeted deletion in alpha-tectorin reveals that the tectorial membrane is required for the gain and timing of cochlear feedback. Neuron 28: 273–285, 2000. 140. LePage EL. Frequency-dependent self-induced bias of the basilar membrane and its potential for controlling sensitivity and tuning in the mammalian cochlea. J Acoust Soc Am 82: 139 –154, 1987. 141. LePage EL, Murray NM. 
Latent cochlear damage in personal stereo users: a study based on click-evoked otoacoustic emissions. Med J Aust 169: 588 –592, 1998. 142. LePage EL, Olofsson A. Modeling scala media as a pressure vessel. In: What Fire Is in Mine Ears, Progress in Auditory Biomechanics, edited by Shera CA and Olson ES. Melville, NY: American Institute of Physics, 2011, p. 212–217. 143. Liberman MC, Gao J, He DZ, Wu X, Jia S, Zuo J. Prestin is required for electromotility of the outer hair cell and for the cochlear amplifier. Nature 419: 300 –304, 2002. 144. Liberman MC, Zuo J, Guinan JJ Jr. Otoacoustic emissions without somatic motility: can stereocilia mechanics drive the mammalian cochlea? J Acoust Soc Am 116: 1649 –1655, 2004. 145. Lichtenhan JT. Effects of low-frequency biasing on otoacoustic and neural measures suggest that stimulus-frequency otoacoustic emissions originate near the peak region of the traveling wave. J Assoc Res Otolaryngol 13: 17–28, 2012. 146. Licklider JCR. “Periodicity” Pitch and “Place” Pitch. J Acoust Soc Am 26: 945, 1954. 147. Lighthill J. Energy flow in the cochlea. J Fluid Mech 106: 149 –213, 1981. 148. Long GR, Tubis A, Jones KL. Modeling synchronization and suppression of spontaneous otoacoustic emissions using Van der Pol oscillators: effects of aspirin administration. J Acoust Soc Am 89: 1201–1212, 1991. 149. Long GR, Van Dijk P, Wit HP. Temperature dependence of spontaneous otoacoustic emissions in the edible frog (Rana esculenta). Hear Res 98: 22–28, 1996. 150. Lonsbury-Martin BL, Martin GK, Probst R, Coats AC. Spontaneous otoacoustic emissions in a nonhuman primate. II. Cochlear anatomy. Hear Res 33: 69 –93, 1988. 151. Lukashkin AN, Lukashkina VA, Legan PK, Richardson GP, Russell IJ. Role of the tectorial membrane revealed by otoacoustic emissions recorded from wild-type and transgenic Tecta(deltaENT/deltaENT) mice. J Neurophysiol 91: 163–171, 2004. 152. Lukashkin AN, Lukashkina VA, Russell IJ. One source for distortion product otoacoustic emissions generated by low- and high-level primaries. J Acoust Soc Am 111: 2740 – 2748, 2002. 153. Lukashkin AN, Russell IJ. A descriptive model of the receptor potential nonlinearities generated by the hair cell mechanoelectrical transducer. J Acoust Soc Am 103: 973– 980, 1998. 1616 154. Lukashkin AN, Russell IJ. Modifications of a single saturating non-linearity account for post-onset changes in 2f1-f2 distortion product otoacoustic emission. J Acoust Soc Am 112: 1561–1568, 2002. 155. Lyon RF. Automatic gain control in cochlear mechanics. In: The Mechanics and Biophysics of Hearing, edited by Dallos P, Geisler CD, Matthews JW, Ruggero MA, and Steele CR. Berlin: Springer-Verlag, 1990, p. 395– 402. 156. Maat B, Wit HP, van Dijk P. Noise-evoked otoacoustic emissions in humans. J Acoust Soc Am 108: 2272–2280, 2000. 157. Magnan P, Avan P, Dancer A, Smurzynski J, Probst R. Reverse middle-ear transfer function in the guinea pig measured with cubic difference tones. Hear Res 107: 41– 45, 1997. 158. Manley GA. Cochlear mechanisms from a phylogenetic viewpoint. Proc Natl Acad Sci USA 97: 11736 –11743, 2000. 159. Manley GA. Frequency spacing of acoustic emissions: a possible explanation. In: Mechanisms of Hearing, edited by Webster WR and Aitken LM. Clayton: Monash Univ. Press, 1983, p. .36 –39 160. Manley GA, Koppl C. What have lizard ears taught us about auditory physiology? Hear Res 238: 3–11, 2008. 161. Manley GA, Koppl C, Sneary M. Reversed tonotopic map of the basilar papilla in Gekko gecko. Hear Res 131: 107–116, 1999. 162. 
162. Marmarelis PZ, Marmarelis VZ. Analysis of Physiological Systems: The White-Noise Approach. New York: Plenum, 1978.
163. Martin GK, Jassir D, Stagner BB, Whitehead ML, Lonsbury-Martin BL. Locus of generation for the 2f1-f2 vs. 2f2-f1 distortion-product otoacoustic emissions in normal-hearing humans revealed by suppression tuning, onset latencies, and amplitude correlations. J Acoust Soc Am 103: 1957–1971, 1998.
164. Martin GK, Lonsbury-Martin BL, Probst R, Coats AC. Spontaneous otoacoustic emissions in a nonhuman primate. I. Basic features and relations to other emissions. Hear Res 33: 49–68, 1988.
165. Martin GK, Lonsbury-Martin BL, Probst R, Scheinin SA, Coats AC. Acoustic distortion products in rabbit ear canal. II. Sites of origin revealed by suppression contours and pure-tone exposures. Hear Res 28: 191–208, 1987.
166. Martin GK, Ohlms LA, Franklin DJ, Harris FP, Lonsbury-Martin BL. Distortion product emissions in humans. III. Influence of sensorineural hearing loss. Ann Otol Rhinol Laryngol Suppl 147: 30–42, 1990.
167. Martin GK, Stagner BB, Chung YS, Lonsbury-Martin BL. Characterizing distortion-product otoacoustic emission components across four species. J Acoust Soc Am 129: 3090–3103, 2011.
168. Martin GK, Stagner BB, Fahey PF, Lonsbury-Martin BL. Steep and shallow phase gradient distortion product otoacoustic emissions arising basal to the primary tones. J Acoust Soc Am 125: EL85–92, 2009.
169. Martin GK, Stagner BB, Jassir D, Telischi FF, Lonsbury-Martin BL. Suppression and enhancement of distortion-product otoacoustic emissions by interference tones above f(2). I. Basic findings in rabbits. Hear Res 136: 105–123, 1999.
170. Martin GK, Stagner BB, Lonsbury-Martin BL. Evidence for basal distortion-product otoacoustic emission components. J Acoust Soc Am 127: 2955–2972, 2010.
171. Martin P, Bozovic D, Choe Y, Hudspeth AJ. Spontaneous oscillation by hair bundles of the bullfrog's sacculus. J Neurosci 23: 4533–4548, 2003.
172. Martin P, Mehta AD, Hudspeth AJ. Negative hair-bundle stiffness betrays a mechanism for mechanical amplification by the hair cell. Proc Natl Acad Sci USA 97: 12026–12031, 2000.
173. Mauermann M, Uppenkamp S, van Hengel PW, Kollmeier B. Evidence for the distortion product frequency place as a source of distortion product otoacoustic emission (DPOAE) fine structure in humans. II. Fine structure for different shapes of cochlear hearing loss. J Acoust Soc Am 106: 3484–3491, 1999.
174. McAlpine D. Neural sensitivity to periodicity in the inferior colliculus: evidence for the role of cochlear distortions. J Neurophysiol 92: 1295–1311, 2004.
175. Meenderink SW, van der Heijden M. Distortion product otoacoustic emissions evoked by tone complexes. J Assoc Res Otolaryngol 12: 29–44, 2011.
176. Meenderink SW, van der Heijden M. Reverse cochlear propagation in the intact cochlea of the gerbil: evidence for slow traveling waves. J Neurophysiol 103: 1448–1455, 2010.
177. Meyer MF. Tartini versus Helmholtz judged by modern sensory observation. J Acoust Soc Am 26: 761–764, 1954.
178. Michalski N, Michel V, Caberlotto E, Lefevre GM, van Aken AF, Tinevez JY, Bizard E, Houbron C, Weil D, Hardelin JP, Richardson GP, Kros CJ, Martin P, Petit C. Harmonin-b, an actin-binding scaffold protein, is involved in the adaptation of mechanoelectrical transduction by sensory hair cells. Pflügers Arch 459: 115–130, 2009.
179. Mills DM. Interpretation of distortion product otoacoustic emission measurements. I. Two stimulus tones. J Acoust Soc Am 102: 413–429, 1997.
180. Mills DM, Norton SJ, Rubel EW. Vulnerability and adaptation of distortion product otoacoustic emissions to endocochlear potential variation. J Acoust Soc Am 94: 2108–2122, 1993.
181. Mills DM, Rubel EW. Variation of distortion product otoacoustic emissions with furosemide injection. Hear Res 77: 183–199, 1994.
182. Mom T, Avan P, Romand R, Gilain L. Monitoring of functional changes after transient ischemia in gerbil cochlea. Brain Res 751: 20–30, 1997.
183. Mom T, Bonfils P, Gilain L, Avan P. Origin of cubic difference tones generated by high-intensity stimuli: effect of ischemia and auditory fatigue on the gerbil cochlea. J Acoust Soc Am 110: 1477–1488, 2001.
184. Moore BCJ. Psychophysical tuning curves measured in simultaneous and forward masking. J Acoust Soc Am 63: 524–532, 1978.
185. Moulin A, Kemp DT. Multicomponent acoustic distortion product otoacoustic emission phase in humans. I. General characteristics. J Acoust Soc Am 100: 1617–1639, 1996.
186. Moulin A, Kemp DT. Multicomponent acoustic distortion product otoacoustic emission phase in humans. II. Implications for distortion product otoacoustic emissions generation. J Acoust Soc Am 100: 1640–1662, 1996.
187. Mountain DC. Changes in endolymphatic potential and crossed olivocochlear bundle stimulation alter cochlear mechanics. Science 210: 71–72, 1980.
188. Murata K, Moriyama T, Hosokawa Y, Minami S. Alternating current induced otoacoustic emissions in the guinea pig. Hear Res 55: 201–214, 1991.
189. Murugasu E, Russell IJ. The effect of efferent stimulation on basilar membrane displacement in the basal turn of the guinea pig cochlea. J Neurosci 16: 325–332, 1996.
190. Nam JH, Cotton JR, Grant W. A virtual hair cell. I. Addition of gating spring theory into a 3-D bundle mechanical model. Biophys J 92: 1918–1928, 2007.
191. Nam JH, Fettiplace R. Theoretical conditions for high-frequency hair bundle oscillations in auditory hair cells. Biophys J 95: 4948–4962, 2008.
192. Neely ST, Johnson TA, Garner CA, Gorga MP. Stimulus-frequency otoacoustic emissions measured with amplitude-modulated suppressor tones (L). J Acoust Soc Am 118: 2124–2127, 2005.
193. Newman EB, Stevens SS, Davis H. Factors in the production of aural harmonics and combination tones. J Acoust Soc Am 9: 107–118, 1937.
194. Nin F, Reichenbach T, Fisher JA, Hudspeth AJ. Contribution of active hair-bundle motility to nonlinear amplification in the mammalian cochlea. Proc Natl Acad Sci USA 109: 21076–21080, 2012.
195. Nuttall AL, Grosh K, Zheng J, de Boer E, Zou Y, Ren T. Spontaneous basilar membrane oscillation and otoacoustic emission at 15 kHz in a guinea pig. J Assoc Res Otolaryngol 5: 337–348, 2004.
196. Ohlms LA, Lonsbury-Martin BL, Martin GK. Acoustic-distortion products: separation of sensory from neural dysfunction in sensorineural hearing loss in human beings and rabbits. Otolaryngol Head Neck Surg 104: 159–174, 1991.
197. Olson ES. Harmonic distortion in intracochlear pressure and its analysis to explore the cochlear amplifier. J Acoust Soc Am 115: 1230–1241, 2004.
198. Oxenham AJ, Plack CJ. Suppression and the upward spread of masking. J Acoust Soc Am 104: 3500–3510, 1998.
199. Pang XD, Guinan JJ Jr. Growth rate of simultaneous masking in cat auditory-nerve fibers: relationship to the growth of basilar-membrane motion and the origin of two-tone suppression. J Acoust Soc Am 102: 3564–3575, 1997.
200. Patuzzi R, Robertson D. Tuning in the mammalian cochlea. Physiol Rev 68: 1009–1082, 1988.
201. Patuzzi R, Sellick PM, Johnstone BM. The modulation of the sensitivity of the mammalian cochlea by low frequency tones. I. Primary afferent activity. Hear Res 13: 1–8, 1984.
202. Patuzzi R, Sellick PM, Johnstone BM. The modulation of the sensitivity of the mammalian cochlea by low frequency tones. III. Basilar membrane motion. Hear Res 13: 19–27, 1984.
203. Patuzzi RB. Cochlear micromechanics and macromechanics. In: The Cochlea, edited by Dallos P, Popper AN, and Fay RR. New York: Springer, 1996, p. 186–257.
204. Peake WT, Ling A Jr. Basilar-membrane motion in the alligator lizard: its relation to tonotopic organization and frequency selectivity. J Acoust Soc Am 67: 1736–1745, 1980.
205. Peng AW, Ricci AJ. Somatic motility and hair bundle mechanics, are both necessary for cochlear amplification? Hear Res 273: 109–122, 2011.
206. Peterson LC, Bogert BP. A dynamical theory of the cochlea. J Acoust Soc Am 22: 369–381, 1950.
207. Pickles JO, Comis SD, Osborne MP. Cross-links between stereocilia in the guinea pig organ of Corti, and their possible relation to sensory transduction. Hear Res 15: 103–112, 1984.
208. Plomp R. Detectability threshold for combination tones. J Acoust Soc Am 37: 1110–1123, 1965.
209. Pressnitzer D, Patterson R. Distortion products and the perceived pitch of harmonic complex tones. In: Physiological and Psychophysical Bases of Auditory Function, edited by Breebart DJ, Houtsma AJM, Kohlrausch A, Prijs VF, and Schoonhoven R. Maastricht, The Netherlands: Shaker Publishing BV, 2001, p. 97–104.
210. Probst R, Lonsbury-Martin BL, Martin GK. A review of otoacoustic emissions. J Acoust Soc Am 89: 2027–2067, 1991.
211. Puel JL, Rebillard G. Effect of contralateral sound stimulation on the distortion product 2f1–f2: evidence that the medial efferent system is involved. J Acoust Soc Am 87: 1630–1635, 1990.
212. Puria S. Measurements of human middle ear forward and reverse acoustics: implications for otoacoustic emissions. J Acoust Soc Am 113: 2773–2789, 2003.
213. Rao A, Long GR. Effects of aspirin on distortion product fine structure: interpreted by the two-source model for distortion product otoacoustic emissions generation. J Acoust Soc Am 129: 792–800, 2011.
214. Recio-Spinoso A, Narayan SS, Ruggero MA. Basilar membrane responses to noise at a basal site of the chinchilla cochlea: quasi-linear filtering. J Assoc Res Otolaryngol 10: 471–484, 2009.
215. Recio-Spinoso A, Temchin AN, van Dijk P, Fan YH, Ruggero MA. Wiener-kernel analysis of responses to noise of chinchilla auditory-nerve fibers. J Neurophysiol 93: 3615–3634, 2005.
216. Reichenbach T, Hudspeth AJ. A ratchet mechanism for amplification in low-frequency mammalian hearing. Proc Natl Acad Sci USA 107: 4973–4978, 2010.
217. Reichenbach T, Stefanovic A, Nin F, Hudspeth AJ. Waves on Reissner's membrane: a mechanism for the propagation of otoacoustic emissions from the cochlea. Cell Reports 1: 374–384, 2012.
218. Ren T. Longitudinal pattern of basilar membrane vibration in the sensitive cochlea. Proc Natl Acad Sci USA 99: 17101–17106, 2002.
219. Ren T. Reverse propagation of sound in the gerbil cochlea. Nat Neurosci 7: 333–334, 2004.
220. Ren T, He W, Porsov E. Localization of the cochlear amplifier in living sensitive ears. PLoS One 6: e20149, 2011.
221. Ren T, He W, Scott M, Nuttall AL. Group delay of acoustic emissions in the ear. J Neurophysiol 96: 2785–2791, 2006.
222. Ren T, Nuttall AL. Cochlear compression wave: an implication of the Allen-Fahey experiment. J Acoust Soc Am 119: 1940–1942, 2006.
223. Rhode WS. Basilar membrane mechanics in the 6–9 kHz region of sensitive chinchilla cochleae. J Acoust Soc Am 121: 2792–2804, 2007.
224. Rhode WS. Distortion product otoacoustic emissions and basilar membrane vibration in the 6–9 kHz region of sensitive chinchilla cochleae. J Acoust Soc Am 122: 2725–2737, 2007.
225. Rhode WS. Observations of the vibration of the basilar membrane in squirrel monkeys using the Mössbauer technique. J Acoust Soc Am 49 Suppl 2: 1218, 1971.
226. Rhode WS. Some observations on two-tone interaction measured with the Mössbauer effect. In: Psychophysics and Physiology of Hearing, edited by Evans EF and Wilson JP. London: Academic, 1977, p. 27–41.
227. Rhode WS, Recio A. Study of mechanical motions in the basal region of the chinchilla cochlea. J Acoust Soc Am 107: 3317–3332, 2000.
228. Robles L, Ruggero MA. Mechanics of the mammalian cochlea. Physiol Rev 81: 1305–1352, 2001.
229. Robles L, Ruggero MA, Rich NC. Two-tone distortion in the basilar membrane of the cochlea. Nature 349: 413–414, 1991.
230. Robles L, Ruggero MA, Rich NC. Two-tone distortion on the basilar membrane of the chinchilla cochlea. J Neurophysiol 77: 2385–2399, 1997.
231. Ruggero MA. Comparison of group delays of 2f(1)-f(2) distortion product otoacoustic emissions and cochlear travel times. Acoust Res Lett Online 5: 143–147, 2004.
232. Ruggero MA, Rich NC, Recio A, Narayan SS, Robles L. Basilar-membrane responses to tones at the base of the chinchilla cochlea. J Acoust Soc Am 101: 2151–2163, 1997.
233. Ruggero MA, Robles L, Rich NC. Two-tone suppression in the basilar membrane of the cochlea: mechanical basis of auditory-nerve rate suppression. J Neurophysiol 68: 1087–1099, 1992.
234. Russell IJ, Cody AR, Richardson GP. The responses of inner and outer hair cells in the basal turn of the guinea-pig cochlea and in the mouse cochlea grown in vitro. Hear Res 22: 199–216, 1986.
235. Russell IJ, Kossl M, Richardson GP. Nonlinear mechanical responses of mouse cochlear hair bundles. Proc Biol Sci 250: 217–227, 1992.
236. Sachs MB, Kiang NY. Two-tone inhibition in auditory-nerve fibers. J Acoust Soc Am 43: 1120–1128, 1968.
237. Sachs MB, Young ED. Effects of nonlinearities on speech encoding in the auditory nerve. J Acoust Soc Am 68: 858–875, 1980.
238. Salt AN, Brown DJ, Hartsock JJ, Plontke SK. Displacements of the organ of Corti by gel injections into the cochlear apex. Hear Res 250: 63–75, 2009.
239. Salt AN, Hullar TE. Responses of the ear to low frequency sounds, infrasound and wind turbines. Hear Res 268: 12–21, 2010.
240. Santos-Sacchi J. Harmonics of outer hair cell motility. Biophys J 65: 2217–2227, 1993.
241. Schmiedt RA. Acoustic distortion in the ear canal. I. Cubic difference tones: effects of acute noise injury. J Acoust Soc Am 79: 1481–1490, 1986.
242. Shannon RV. Two-tone unmasking and suppression in a forward-masking situation. J Acoust Soc Am 59: 1460–1470, 1976.
243. Shera CA. Mammalian spontaneous otoacoustic emissions are amplitude-stabilized cochlear standing waves. J Acoust Soc Am 114: 244–262, 2003.
244. Shera CA, Guinan JJ. Mechanisms of mammalian otoacoustic emission. In: Active Processes and Otoacoustic Emissions, edited by Manley GA, Lonsbury-Martin BL, Popper AN, and Fay RR. New York: Springer, 2007.
245. Shera CA, Guinan JJ Jr. Cochlear traveling-wave amplification, suppression, and beamforming probed using noninvasive calibration of intracochlear distortion sources. J Acoust Soc Am 121: 1003–1016, 2007.
246. Shera CA, Guinan JJ Jr. Evoked otoacoustic emissions arise by two fundamentally different mechanisms: a taxonomy for mammalian OAEs. J Acoust Soc Am 105: 782–798, 1999.
247. Shera CA, Guinan JJ Jr. Stimulus-frequency-emission group delay: a test of coherent reflection filtering and a window on cochlear tuning. J Acoust Soc Am 113: 2762–2772, 2003.
248. Shera CA, Guinan JJ Jr, Oxenham AJ. Otoacoustic estimation of cochlear tuning: validation in the chinchilla. J Assoc Res Otolaryngol 11: 343–365, 2010.
249. Shera CA, Guinan JJ Jr, Oxenham AJ. Revised estimates of human cochlear tuning from otoacoustic and behavioral measurements. Proc Natl Acad Sci USA 99: 3318–3323, 2002.
250. Shera CA, Tubis A, Talmadge CL, de Boer E, Fahey PF, Guinan JJ Jr. Allen-Fahey and related experiments support the predominance of cochlear slow-wave otoacoustic emissions. J Acoust Soc Am 121: 1564–1575, 2007.
251. Siegel JH, Cerka AJ, Recio-Spinoso A, Temchin AN, van Dijk P, Ruggero MA. Delays of stimulus-frequency otoacoustic emissions and cochlear vibrations contradict the theory of coherent reflection filtering. J Acoust Soc Am 118: 2434–2443, 2005.
252. Siegel JH, Kim DO, Molnar CE. Effects of altering organ of Corti on cochlear distortion products f2-f1 and 2f1-f2. J Neurophysiol 47: 303–328, 1982.
253. Sirjani DB, Salt AN, Gill RM, Hale SA. The influence of transducer operating point on distortion generation in the cochlea. J Acoust Soc Am 115: 1219–1229, 2004.
254. Sisto R, Moleti A, Botti T, Bertaccini D, Shera CA. Distortion products and backward-traveling waves in nonlinear active models of the cochlea. J Acoust Soc Am 129: 3141–3152, 2011.
255. Smalt CJ, Krishnan A, Bidelman GM, Ananthakrishnan S, Gandour JT. Distortion products and their influence on representation of pitch-relevant information in the human brainstem for unresolved harmonic complex tones. Hear Res 292: 26–34, 2012.
256. Smith RL. Adaptation, saturation and physiological masking in single auditory-nerve fibers. J Acoust Soc Am 65: 166–178, 1979.
257. Smith RL. Short-term adaptation in single auditory-nerve fibers: some poststimulatory effects. J Neurophysiol 40: 1098–1112, 1977.
258. Smith ST, Chadwick RS. Simulation of the response of the inner hair cell stereocilia bundle to an acoustical stimulus. PLoS One 6: e18161, 2011.
259. Smoorenburg GF. Combination tones and their origin. J Acoust Soc Am 52: 615–632, 1972.
260. Sokolich WG. Improved acoustic system for auditory research. J Acoust Soc Am Suppl 1: S12, 1977.
261. Starr A, Picton TW, Sininger Y, Hood LJ, Berlin CI. Auditory neuropathy. Brain 119: 741–753, 1996.
262. Stoop R, Kern A. Essential auditory contrast-sharpening is preneuronal. Proc Natl Acad Sci USA 101: 9179–9181, 2004.
263. Stover LJ, Neely ST, Gorga MP. Latency and multiple sources of distortion product otoacoustic emissions. J Acoust Soc Am 99: 1016–1024, 1996.
264. Strube HW. Evoked otoacoustic emissions as cochlear Bragg reflections. Hear Res 38: 35–45, 1989.
265. Szalai R, Tsaneva-Atanasova K, Homer ME, Champneys AR, Kennedy HJ, Cooper NP. Nonlinear models of development, amplification and compression in the mammalian cochlea. Philos Transact A Math Phys Eng Sci 369: 4183–4204, 2011.
266. Talmadge CL, Tubis A, Long GR, Piskorski P. Modeling otoacoustic emission and hearing threshold fine structures. J Acoust Soc Am 104: 1517–1543, 1998.
267. Tartini G. Trattato di musica secondo la vera scienza dell'armonia. Padova: Nella Stamperia del Seminario, Appresso Giovanni Manfrè, 1754.
268. Tonndorf J. Endolymphatic hydrops: mechanical causes of hearing loss. Arch Oto-Rhino-Laryngol 212: 293–299, 1976.
269. Trautwein P, Hofstetter P, Wang J, Salvi R, Nostrant A. Selective inner hair cell loss does not alter distortion product otoacoustic emissions. Hear Res 96: 71–82, 1996.
270. Valk WL, Wit HP, Albers FW. Evaluation of cochlear function in an acute endolymphatic hydrops model in the guinea pig by measuring low-level DPOAEs. Hear Res 192: 47–56, 2004.
271. Van der Heijden M. Cochlear gain control. J Acoust Soc Am 117: 1223–1233, 2005.
272. Van der Heijden M, Joris PX. Cochlear phase and amplitude retrieved from the auditory nerve at arbitrary frequencies. J Neurosci 23: 9194–9198, 2003.
273. Van der Heijden M, Joris PX. The speed of auditory low-side suppression. J Neurophysiol 93: 201–209, 2005.
274. Van Dijk P, Wit HP. Amplitude and frequency fluctuations of spontaneous otoacoustic emissions. J Acoust Soc Am 88: 1779–1793, 1990.
275. Van Dijk P, Wit HP, Segenhout JM, Tubis A. Wiener kernel analysis of inner ear function in the American bullfrog. J Acoust Soc Am 95: 904–919, 1994.
276. Van Netten SM. Hair cell mechano-transduction: its influence on the gross mechanical characteristics of a hair cell sense organ. Biophys Chem 68: 43–52, 1997.
277. Van Netten SM, Kros CJ. Gating energies and forces of the mammalian hair cell transducer channel and related hair bundle mechanics. Proc Biol Sci 267: 1915–1923, 2000.
278. Van Netten SM, Meulenberg CJ, Lennan GW, Kros CJ. Pairwise coupling of hair cell transducer channels links auditory sensitivity and dynamic range. Pflügers Arch 458: 273–281, 2009.
279. Verhulst S, Harte JM, Dau T. Temporal suppression of the click-evoked otoacoustic emission level-curve. J Acoust Soc Am 129: 1452–1463, 2011.
280. Verpy E, Leibovici M, Michalski N, Goodyear RJ, Houdon C, Weil D, Richardson GP, Petit C. Stereocilin connects outer hair cell stereocilia to one another and to the tectorial membrane. J Comp Neurol 519: 194–210, 2011.
281. Verpy E, Masmoudi S, Zwaenepoel I, Leibovici M, Hutchin TP, Del Castillo I, Nouaille S, Blanchard S, Laine S, Popot JL, Moreno F, Mueller RF, Petit C. Mutations in a new gene encoding a protein of the hair bundle cause non-syndromic deafness at the DFNB16 locus. Nat Genet 29: 345–349, 2001.
282. Verpy E, Weil D, Leibovici M, Goodyear RJ, Hamard G, Houdon C, Lefevre GM, Hardelin JP, Richardson GP, Avan P, Petit C. Stereocilin-deficient mice reveal the origin of cochlear waveform distortions. Nature 456: 255–258, 2008.
283. Vetesnik A, Gummer AW. Transmission of cochlear distortion products as slow waves: a comparison of experimental and model data. J Acoust Soc Am 131: 3914–3934, 2012.
284. Volterra V. Theory of Functionals and of Integral and Integro-Differential Equations. Glasgow: Blackie and Son, 1930.
285. Warren B, Gibson G, Russell IJ. Sex recognition through midflight mating duets in Culex mosquitoes is mediated by acoustic distortion. Curr Biol 19: 485–491, 2009.
286. Wegel RL, Lane CE. The auditory masking of one pure tone by another and its possible relation to the dynamics of the inner ear. Phys Rev 23: 266–285, 1924.
287. Weiss TF. Bidirectional transduction in vertebrate hair cells: a mechanism for coupling mechanical and electrical processes. Hear Res 7: 353–360, 1982.
288. Weiss TF, Leong R. A model for signal transmission in an ear having hair cells with free-standing stereocilia. IV. Mechanoelectric transduction stage. Hear Res 20: 175–195, 1985.
289. Wever EG, Bray CW, Lawrence M. The origin of combination tones. J Exp Psychol 27: 217–226, 1940.
290. White KR, Vohr BR, Maxon AB, Behrens TR, McPherson MG, Mauk GW. Screening all newborns for hearing loss using transient evoked otoacoustic emissions. Int J Pediatr Otorhinolaryngol 29: 203–217, 1994.
291. Whitehead ML. Species differences of distortion-product otoacoustic emissions: comment on "Interpretation of distortion product otoacoustic emission measurements. I. Two stimulus tones" [J Acoust Soc Am 102, 413–429 (1997)]. J Acoust Soc Am 103: 2740–2742, 1998.
292. Whitehead ML, Lonsbury-Martin BL, Martin GK. Evidence for two discrete sources of 2f1-f2 distortion-product otoacoustic emission in rabbit. II. Differential physiological vulnerability. J Acoust Soc Am 92: 2662–2682, 1992.
293. Whitehead ML, Stagner BB, Martin GK, Lonsbury-Martin BL. Visualization of the onset of distortion-product otoacoustic emissions, and measurement of their latency. J Acoust Soc Am 100: 1663–1679, 1996.
294. Wiener N. Nonlinear Problems in Random Theory. New York: Wiley, 1958.
295. Wilson JP. Evidence for a cochlear origin for acoustic re-emissions, threshold fine-structure and tonal tinnitus. Hear Res 2: 233–252, 1980.
296. Wilson JP. Model for cochlear echoes and tinnitus based on an observed electrical correlate. Hear Res 2: 527–532, 1980.
297. Wit HP. Amplitude fluctuations of spontaneous otoacoustic emissions caused by internal and externally applied noise sources. Prog Brain Res 97: 59–65, 1993.
298. Wit HP, Ritsma RJ. Evoked acoustical responses from the human ear: some experimental results. Hear Res 2: 253–261, 1980.
299. Wit HP, van Dijk P. Are human spontaneous otoacoustic emissions generated by a chain of coupled nonlinear oscillators? J Acoust Soc Am 132: 918–926, 2012.
300. Wit HP, van Dijk P, Avan P. On the shape of (evoked) otoacoustic emission spectra. Hear Res 81: 208–214, 1994.
301. Withnell RH, Hazlewood C, Knowlton A. Reconciling the origin of the transient evoked otoacoustic emission in humans. J Acoust Soc Am 123: 212–221, 2008.
302. Withnell RH, Lodde J. In search of basal distortion product generators. J Acoust Soc Am 120: 2116–2123, 2006.
303. Withnell RH, McKinley S. Delay dependence for the origin of the nonlinear derived transient evoked otoacoustic emission. J Acoust Soc Am 117: 281–291, 2005.
304. Wright A. Dimensions of the cochlear stereocilia in man and the guinea pig. Hear Res 13: 89–98, 1984.
305. Yates GK, Withnell RH. The role of intermodulation distortion in transient-evoked otoacoustic emissions. Hear Res 136: 49–64, 1999.
306. Zatorre RJ. Neuroscience: finding the missing fundamental. Nature 436: 1093–1094, 2005.
307. Zheng J, Shen W, He DZ, Long KB, Madison LD, Dallos P. Prestin is the motor protein of cochlear outer hair cells. Nature 405: 149–155, 2000.
308. Zou Y, Zheng J, Ren T, Nuttall A. Cochlear transducer operating point adaptation. J Acoust Soc Am 119: 2232–2241, 2006.
309. Zurek PM, Clark WW, Kim DO. The behavior of acoustic distortion products in the ear canals of chinchillas with normal or damaged ears. J Acoust Soc Am 72: 774–780, 1982.
310. Zweig G. Basilar membrane motion. Cold Spring Harb Symp Quant Biol 40: 619–633, 1976.
311. Zweig G, Shera CA. The origin of periodicity in the spectrum of evoked otoacoustic emissions. J Acoust Soc Am 98: 2018–2047, 1995.
312. Zwicker E. Suppression and (2f1-f2)-difference tones in a nonlinear cochlear preprocessing model with active feedback. J Acoust Soc Am 80: 163–176, 1986.
313. Zwicker E, Schloth E. Interrelation of different oto-acoustic emissions. J Acoust Soc Am 75: 1148–1154, 1984.
314. Zwislocki JJ, Kletsky EJ. Tectorial membrane: a possible effect on frequency analysis in the cochlea. Science 204: 639–641, 1979.
315. Zwislocki JJ, Szymko YM, Hertig LY. The cochlea is an automatic gain control system after all. In: Diversity in Auditory Mechanics, edited by Lewis ER, Long GR, Lyon RF, Narins PM, Steele CR, and Hecht-Poinar E. Singapore: World Scientific, 1996, p. 354–360.