Estimation of the level and phase of the simple distortion tone in the modulation domain

Aleksander Sek
Institute of Acoustics, Adam Mickiewicz University, 85 Umultowska, 61-614 Poznan, Poland

Brian C. J. Moore a)
Department of Experimental Psychology, University of Cambridge, Downing Street, Cambridge CB2 3EB, England

(Received 15 October 2003; revised 26 July 2004; accepted 27 July 2004)

These experiments were designed to test the idea that nonlinearities in the auditory system can introduce a distortion component into the internal representation of the envelope of a sound, and to estimate the phase of the hypothetical distortion component. In experiment 1, a two-alternative forced-choice (2AFC) task with feedback was used to measure psychometric functions for detecting 5-Hz probe modulation of a 4-kHz sinusoidal carrier in the presence of a masker modulator with components at 50 and 55 Hz (m = 0.3 for each component). Performance was measured as a function of the relative phase, Δ, of the probe relative to the "venelope" (envelope of the envelope) of the masker. Performance was poorest for Δ = 135°. In experiment 2, Δ was fixed at 135°, m was set to 0.48 for each masker component, and psychometric functions for detecting probe modulation were measured using a 2AFC task without feedback. For small probe modulation depths (m ≈ 0.03), the detectability index, d′, was consistently negative, consistent with the existence of a weak distortion product which can "cancel" the probe modulation. The distortion component for the conditions of the experiment was estimated to have a phase of about −25° relative to the venelope.

© 2004 Acoustical Society of America. [DOI: 10.1121/1.1795331]

PACS numbers: 43.66.Dc, 43.66.Mk, 43.66.Nm, 43.66.Ba [NFV]

I.
INTRODUCTION

Several recent models for the perception of amplitude modulation (AM) in sounds are based on the idea that the envelopes of the outputs of the (peripheral) auditory filters are fed to a second array of overlapping bandpass filters tuned to different envelope modulation rates (Kay, 1982; Dau et al., 1997a; 1997b; Ewert and Dau, 2000; Ewert et al., 2002; Verhey et al., 2003). This set of filters is usually called a "modulation filter bank" (MFB). Psychoacoustical evidence consistent with the concept of an MFB has come from experiments involving detection of "probe" modulation in the presence of masker modulation; these experiments appear to show frequency selectivity in the modulation domain (Bacon and Grantham, 1989; Houtgast, 1989).

a) Electronic mail: bcjm@cam.ac.uk
J. Acoust. Soc. Am. 116 (5), November 2004, Pages: 3031–3037

Dau et al. (1997a) conducted an experiment to assess whether modulation masking could be explained in terms of the temporal similarity of the envelopes of the signal and masker, rather than in terms of the MFB. They amplitude modulated a 5-kHz sinusoidal carrier with a masker that consisted of the third to seventh harmonics of a 30-Hz fundamental frequency; the phases of the components were random. The task was to detect sinusoidal probe modulation in the range 20 to 120 Hz. The amount of modulation masking increased progressively as the probe frequency was increased from 20 to about 100 Hz. There was no maximum in the masking function at 30 Hz, even though the temporal envelope pattern of the masker and signal was similar at this frequency. The results were consistent with the idea that the auditory system performs a spectral analysis of the envelope. However, Verhey et al. (2003) later suggested that the failure of Dau et al.
(1997a) to find a peak in the modulation masking pattern at the frequency corresponding to the "missing fundamental" resulted from the modulation masker impairing detection of the signal modulation on some trials and enhancing it on others, depending on the specific choice of (random) masker component phases.

Moore et al. (1999) examined modulation masking for cases where the probe modulation was at a frequency remote from any spectral frequency in the masker modulation, but there was nevertheless a similarity between the temporal pattern of the masker modulation and the probe modulation. This was achieved by using a two-component modulator. The "beats" between these two components had a rate that was equal to or close to the probe frequency. A similar method had been used earlier by Sheft and Yost (1997) to examine modulation detection interference (MDI). Moore et al. found that the threshold for detecting 5-Hz probe modulation was affected by the presence of a pair of masker modulators beating at a 5-Hz rate (40 and 45 Hz, 50 and 55 Hz, or 60 and 65 Hz). The threshold was dependent on the phase of the probe modulation relative to the beat cycle of the masker modulators; the threshold elevation was greatest when the peak amplitude of the probe modulation coincided with a peak in the beat cycle. The maximum threshold elevation of the 5-Hz probe produced by the beating masker modulators was 7–12 dB greater than that produced by the individual components of the masker modulators. These results cannot be explained in terms of the spectra of the envelopes of the stimuli, as the beating masker modulators did not produce a 5-Hz component in the spectra of the envelopes.

0001-4966/2004/116(5)/3031/7/$20.00 © 2004 Acoustical Society of America

Moore et al.
(1999) proposed an explanation for their results based on the idea that nonlinearities within the auditory system, such as basilar-membrane compression, introduce distortion in the internal representation of the envelopes of the stimuli. This notion was initially suggested by Shofner et al. (1996) on the basis of an electrophysiological study using two-component modulators. In the case of two-component beating modulators, a weak component, corresponding to the simple difference component, would be introduced at the beat rate.

Verhey et al. (2003) conducted experiments similar to those of Moore et al. (1999), but included conditions using both two-component and three-component masker modulators. Following Ewert et al. (2002), they used the term "venelope" to refer to the (ac-coupled) envelope of the envelope. Like Moore et al., Verhey et al. found that, for a probe modulation frequency equal to the masker venelope periodicity, the probe modulation depth at threshold varied with the phase of the probe relative to the venelope. However, unlike Moore et al., Verhey et al. found that thresholds were lower for the in-phase condition, where maxima in the probe coincided with maxima in the venelope, than for the antiphase condition. In comparable experiments, described later, Füllgrabe et al. (2004) found large individual differences in the relative probe phase leading to the poorest detectability of the probe modulation. Verhey et al. argued that basilar-membrane compression could not explain their results, as it leads to the prediction of a phase effect opposite to that found by them. They proposed that the auditory system extracts the venelope prior to the MFB, and in a separate pathway, as suggested earlier by Ewert et al. (2002). The concept of venelope extraction may be regarded as a functional way of creating an internal representation that contains both the envelope and the venelope.
Several researchers have also noted that venelope or "beat" cues may be present at the outputs of modulation filters tuned to the first-order or "carrier" rates (Ewert et al., 2002; Füllgrabe and Lorenzi, 2003; Millman et al., 2003; Verhey et al., 2003); such cues might be used for the detection of the envelope beats (Millman et al., 2003).

The present paper is particularly concerned with two issues. First, we wished to establish more clearly how the detectability of probe modulation depends on the relative phase of the probe and the venelope of a two-component masker modulator beating at the same rate as the probe. Second, we wished to establish whether the effect produced by the two-component masker modulator could be explained in terms of the introduction by the auditory system of a distortion component at the venelope rate. In experiment 1 we measured psychometric functions for the detection of probe modulation for eight different relative phases of the probe and venelope. We measured psychometric functions rather than estimating thresholds using an adaptive procedure, since it was not obvious that the psychometric functions would always be monotonic (or that d′ would always be positive), as explained below. Also, we wished to assess whether the form of the psychometric functions could be explained using the assumption of a distortion component. In experiment 2 we measured psychometric functions for the detection of probe modulation in the presence of a two-component masker modulator using the relative phase that had been found in experiment 1 to lead to the poorest performance. A two-alternative forced-choice task without feedback was used.
We anticipated that if a distortion component at the venelope rate was present, and was out of phase with the probe modulation, it might lead to negative d′ values for some probe modulation depths, i.e., the probe would be consistently identified in the wrong interval. The results showed that this was indeed the case.

II. EXPERIMENT 1: PSYCHOMETRIC FUNCTIONS FOR DIFFERENT PROBE AND VENELOPE PHASES

A. Stimuli

The carrier was a 4-kHz sinusoid with a level of 70 dB SPL. This relatively high carrier frequency was chosen so that the spectral sidebands produced by the modulation would not be resolved. The probe modulation frequency was 5 Hz. The masker modulator was composed of two sinusoids with frequencies of 50 and 55 Hz. The modulation index, m, for each masker modulator component was 0.3. Each modulator started in sine phase. The equation describing the masker envelope, E(t), is

$$E(t) = 1 + 0.3\sin(2\pi\,55t) + 0.3\sin(2\pi\,50t), \qquad (1)$$

where t is time. Although the individual modulator components had zero amplitude at time zero, the venelope had its maximum value (0.6) at time zero. The venelope of the masker modulator repeated at a 5-Hz rate, but there was no 5-Hz component in the modulation spectrum of the masker. The phase of the probe modulation relative to the venelope is defined in terms of Δ, where Δ is zero when the peak in the amplitude of the probe modulation coincides with the peak in the venelope. This meant that, for Δ = 0, the signal starting phase was advanced by 90° (π/2 radians) relative to sine phase. Values of Δ were 0, 45, 90, 135, 180, 225, 270, and 315°.

On each trial, the carrier was presented in two bursts separated by a silent interval of 300 ms. Each burst had 20-ms raised-cosine rise and fall ramps, and an overall duration (including rise/fall times) of 1000 ms. The modulation was applied during the whole of the carrier, and the starting phase of the modulation is defined relative to the start of the carrier.
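The two properties of the masker stated above (no 5-Hz component in the modulation spectrum, but a venelope that repeats at 5 Hz with a peak of 0.6 at time zero) can be checked numerically. The following is a minimal sketch, not the authors' stimulus-generation code. The function names, the 1-s analysis window, and the closed-form expression used for the venelope, 2m|cos(π(f2 − f1)t)|, are assumptions introduced for this illustration.

```python
import math

M = 0.3                      # modulation index of each masker component (from the text)
F_LOW, F_HIGH = 50.0, 55.0   # masker modulator frequencies in Hz (from the text)
FS = 50_000                  # samples per second; a 1-s window is analysed (assumption)


def masker_envelope(t):
    """Eq. (1): envelope imposed on the carrier by the two-component masker."""
    return (1.0
            + M * math.sin(2 * math.pi * F_HIGH * t)
            + M * math.sin(2 * math.pi * F_LOW * t))


def venelope(t):
    """Envelope of the two-component modulator (closed form, before ac coupling)."""
    return 2 * M * abs(math.cos(math.pi * (F_HIGH - F_LOW) * t))


def component_amplitude(signal, freq_hz):
    """Amplitude of the sinusoidal component at freq_hz, by direct projection over 1 s."""
    re = im = 0.0
    for n in range(FS):
        t = n / FS
        x = signal(t)
        re += x * math.cos(2 * math.pi * freq_hz * t)
        im += x * math.sin(2 * math.pi * freq_hz * t)
    return 2 * math.hypot(re, im) / FS


env_5hz = component_amplitude(masker_envelope, 5.0)   # 5-Hz content of E(t)
ven_5hz = component_amplitude(venelope, 5.0)          # 5-Hz content of the venelope

print(f"5-Hz component of E(t):     {env_5hz:.6f}")
print(f"5-Hz component of venelope: {ven_5hz:.4f}")
print(f"venelope at t = 0:          {venelope(0.0):.2f}")
```

Under these assumptions the 5-Hz component of E(t) is zero to within rounding error, whereas the venelope contains a 5-Hz component of roughly 0.6 × 4/(3π) ≈ 0.25. This is why any masking of the 5-Hz probe has to be attributed to processing of the venelope rather than to the envelope spectrum itself.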
Stimuli were generated using a Tucker-Davis Technologies array processor (TDT-AP2) in a host PC, and a 16-bit digital to analog converter (TDT-DD1) operating at a 50-kHz sampling rate. The stimuli were attenuated (TDT-PA4) and sent through an output amplifier (TDT-HB6) to a Sennheiser HD580 earphone. Only one ear was tested for each subject. Subjects were seated in a double-walled sound-attenuating chamber.

A. Sek and B. C. J. Moore: Modulation distortion

B. Procedure

Psychometric functions were measured using a two-interval forced-choice procedure. The masker modulation was present in both intervals of a trial, and the probe modulation was presented in either the first or the second interval, selected at random. The task of the subject was to indicate, by pressing one of two buttons, the interval containing the probe modulation. Feedback was provided by lights following each response.

For each subject and each value of Δ, five different values were used for the modulation depth of the probe, m_p. The values were chosen individually for each subject and each value of Δ on the basis of pilot trials, so as to give values for the detectability index, d′, ranging from just above zero to about 2–3. A run started with five trials using the largest value of m_p. Then, in successive trials, stimuli with each value of m_p were presented once, in descending order. This sequence was repeated ten times to give a total of 55 trials per run. With this procedure, subjects receive an easily detected stimulus once every five trials, which helps them to "remember" what aspect of the stimulus they should be listening to. Without such a reminder, subjects may "lose" the most effective detection cue, leading to unduly poor performance for low probe modulation depths (Taylor et al., 1983; Moore and Sek, 1992). Results from the first five trials of each run were discarded. Each run was repeated at least 20 times, so that at least 200 judgments were obtained for each value of m_p.
C. Subjects

Three subjects were tested. One was author AS. The other two subjects were paid for their services. All subjects had absolute thresholds less than 20 dB HL at all audiometric frequencies and had no history of hearing disorders. All had previous experience in psychoacoustic tasks, including tasks similar to the one used here. They received extensive practice during the pilot trials used to determine appropriate values of m_p to be used in the main experiment.

D. Results

The percent-correct scores were converted to d′ values using standard tables (Hacker and Ratcliff, 1979). The pattern of results was similar across subjects. Psychometric functions for a representative subject (AW) are shown by the open squares in Fig. 1; d′ is plotted as a function of 20 log(m_p) (the solid squares connected by dashed lines show predictions which are explained later). Performance varied markedly with Δ. Poorest performance was found for Δ = 135° and 180°. Best performance was found for Δ = 0° and 315°. This pattern of results was found for all three subjects and is similar to that found by Verhey et al. (2003), but differs from that found by Moore et al. (1999).

FIG. 1. Open squares show results of experiment 1 for subject AW. The detectability index, d′, for detecting 5-Hz probe modulation is plotted as a function of 20 log(m_p), where m_p is the modulation depth of the probe. Each panel shows results for one relative phase of the probe modulation relative to the beat cycle (the venelope) of the masker modulator; the relative phase is denoted Δ. Filled squares connected by dashed lines show predictions derived as described in the text.

There were some cases of negative d′ values in the results. However, none of the d′ values was significantly below 0, based on confidence intervals for d′ calculated as described by Miller (1996).

To estimate threshold values of m_p giving a d′ value of 1, the data were fitted with functions of the form

$$\log_{10}(d') = a + b\,\log_{10}(m_p), \qquad (2)$$

where a and b are fitting constants. Since the d′ values were sometimes negative for the two smallest values of m_p, the fitting was done using only the data for the three largest values of m_p. The resulting threshold estimates are shown in Table I. The pattern of results is similar across subjects, all three showing the highest thresholds for Δ = 135°. A within-subjects analysis of variance on the threshold values with factor phase showed a highly significant effect of phase: F(7,14) = 15.598, p < 0.001. The threshold values are somewhat lower than those estimated by Verhey et al. (2003) in their most similar condition (5-kHz carrier, masker modulators at 40 and 45 Hz), perhaps because the signal duration used here was longer (1000 ms versus 600 ms).

TABLE I. Thresholds (d′ = 1) estimated from the functions relating log(d′) to log(m_p).

Phase, degrees    JL       AW       AS       Mean
0                 −28.4    −26.8    −26.0    −27.1
45                −26.6    −26.6    −23.0    −25.4
90                −23.7    −22.1    −19.5    −21.8
135               −18.3    −18.6    −18.4    −18.4
180               −19.0    −19.2    −21.6    −19.9
225               −20.6    −22.4    −22.4    −21.8
270               −24.1    −26.1    −26.5    −25.6
315               −26.4    −29.0    −27.3    −27.6

Two methods were used to estimate the value of Δ giving poorest performance. Both methods are based on the assumption that the function relating the threshold to the value of Δ is symmetrical about the value of Δ giving the highest threshold. In the first method, the mean thresholds were fitted with a function of the following form:

$$\text{slope} = \text{max} - A(\Delta - \text{offset})^2, \qquad (3)$$

where max is the maximum value of the function, A is a constant, and offset is the value of Δ at the maximum of the function. The best-fitting value of offset was 155°. In the second method, the data were fitted with a single cycle of a sine function, where the amplitude, the dc-offset, and the phase were free parameters.
The best-fitting function had a maximum for Δ = 155°. The two methods are consistent in indicating that the poorest performance was obtained for Δ = 155°. The results are consistent with the idea that a nonlinearity in the auditory system introduced a weak distortion component into the internal representation of the envelope with a phase of −25° relative to the venelope. We denote the effective modulation depth of the hypothetical distortion component by m_d.

The pattern of results can be understood in the following way. The probe modulation was probably detected as a change in the depth of 5-Hz modulation; for comparable effects of phase using noise carriers, see Bacon and Grantham (1989) and Strickland and Viemeister (1996). Regardless of the value of Δ, subjects had to distinguish the 5-Hz modulation of depth m_d in the nonsignal interval from the 5-Hz modulation in the signal interval resulting from the vector sum of the distortion component and the probe modulation, which is denoted m_sum. For some values of Δ (135° and 180°), the distortion component and probe modulation tend to cancel, leading to a small value of m_sum and to poor performance. For other values of Δ (0° and 315°), the distortion component and probe modulation are almost in phase, leading to a large value of m_sum and to good performance. However, the value of d′ should be monotonically related to m_sum − m_d (the difference in modulation depth in the two intervals), and the relationship should be the same for all values of Δ.

To test this prediction, for each subject a starting value was assumed for m_d. Assuming that the distortion component had a phase relative to the venelope of −25°, the value of m_sum was calculated for each value of m_p. The correlation of the d′ values with the values of m_sum − m_d was then determined, and the value of m_d was systematically varied to determine the value giving the highest correlation.
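The sine-function fit and the vector-sum calculation described above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' analysis code. The function names `sine_fit_peak` and `m_sum` are hypothetical. The sine fit uses the closed-form least-squares solution that applies when the phases are equally spaced over one full cycle. The value of m_d in the example is simply 10^(−28.3/20), the estimate quoted in the text for subject AW.

```python
import math

# Mean thresholds from Table I (dB), one per relative phase (degrees).
PHASES_DEG = [0, 45, 90, 135, 180, 225, 270, 315]
MEAN_THRESHOLDS_DB = [-27.1, -25.4, -21.8, -18.4, -19.9, -21.8, -25.6, -27.6]


def sine_fit_peak(phases_deg, values):
    """Fit a + b*cos(phase) + c*sin(phase) and return the phase (deg) of the maximum.

    For phases equally spaced over a full cycle the least-squares
    coefficients reduce to simple projections onto cos and sin.
    """
    n = len(values)
    b = 2.0 / n * sum(v * math.cos(math.radians(p)) for p, v in zip(phases_deg, values))
    c = 2.0 / n * sum(v * math.sin(math.radians(p)) for p, v in zip(phases_deg, values))
    return math.degrees(math.atan2(c, b)) % 360.0


def m_sum(m_p, delta_deg, m_d, distortion_phase_deg=-25.0):
    """Magnitude of the vector sum of the probe and the distortion component.

    m_p, delta_deg: probe depth and probe phase relative to the venelope.
    m_d, distortion_phase_deg: depth and phase of the hypothetical distortion component.
    """
    diff = math.radians(delta_deg - distortion_phase_deg)
    squared = m_p ** 2 + m_d ** 2 + 2.0 * m_p * m_d * math.cos(diff)
    return math.sqrt(max(0.0, squared))  # guard against tiny negative rounding errors


peak_deg = sine_fit_peak(PHASES_DEG, MEAN_THRESHOLDS_DB)
distortion_phase_deg = peak_deg - 180.0   # cancellation implies opposite phase

m_d_example = 10 ** (-28.3 / 20)          # 20 log(m_d) = -28.3 (subject AW, from the text)
for delta in PHASES_DEG:
    diff_depth = m_sum(0.02, delta, m_d_example) - m_d_example
    print(f"delta = {delta:3d} deg: m_sum - m_d = {diff_depth:+.4f}")
print(f"fitted peak at {peak_deg:.1f} deg, implied distortion phase {distortion_phase_deg:.1f} deg")
```

With the Table I means, the fitted maximum falls close to 155°, which corresponds to a distortion phase near −25°. For probe phases near 135° to 180° the quantity m_sum − m_d becomes negative when m_p is smaller than about twice m_d, which is the condition under which the model allows negative d′ values.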
The resulting values of m_d, expressed as 20 log(m_d), were −29.1, −28.3, and −30.2 for JL, AW, and AS, respectively, and the corresponding correlations were 0.97, 0.97, and 0.94. These values for 20 log(m_d) suggest that the effective magnitude of the hypothetical distortion component is very low, corresponding to a barely detectable amount of modulation for a sinusoidal carrier (Zwicker, 1952; Sek and Moore, 1994; Dau et al., 1997a; Kohlrausch et al., 2000; Moore and Glasberg, 2001). Increasing the assumed value of m_d by, for example, 6 dB resulted in substantial decreases in the correlation of the d′ values with the values of m_sum − m_d. Averaging across subjects, the correlation decreased from 0.96 to 0.75. Decreasing the assumed value of m_d by 6 dB resulted in somewhat smaller decreases in the correlation, to a mean value of 0.91. Thus, the magnitude of the distortion component is unlikely to be much bigger than estimated above, but it could be somewhat smaller.

FIG. 2. Scatter plots of the values of d′ against the values of m_sum − m_d (see the text), denoted here "difference in effective modulation depth." Each panel shows results for one subject. Each symbol shows results for one value of Δ, as indicated in the key. Linear regression lines are also shown.

Figure 2 shows scatter plots of the values of d′ against the values of m_sum − m_d. It is clear that the data for the different values of Δ all lie along the same function for each subject and that d′ is almost linearly related to m_sum − m_d. The scatter plots in Fig. 2 were fitted with linear regression lines, which are shown in the figure, and these lines were used to generate predicted values of d′ for each value of Δ and m_p. The predictions for AW are shown as filled squares and dashed lines in Fig. 1. There is no evidence for any systematic discrepancy between the predicted and obtained values. This was also true for the results of the other subjects.

In summary, the results are consistent with the idea that the masking of the 5-Hz probe modulation by the two-component masker modulator was caused by a low-level 5-Hz distortion component in the internal representation of the masker envelope. This distortion component appears to have a phase of about −25° relative to the venelope of the masker.

FIG. 3. The squares and circles show results of experiment 2, in which no feedback was given and each component of the masker had a modulation depth of 0.48. The triangles reproduce results from experiment 1 in which feedback was given and each component of the masker had a modulation depth of 0.3. The value of Δ was fixed at 135°. Each panel shows results for one subject.

III. EXPERIMENT 2: PSYCHOMETRIC FUNCTIONS DETERMINED WITHOUT FEEDBACK USING VERY LOW PROBE MODULATION DEPTHS

A. Rationale

In experiment 2, we sought further evidence for the hypothetical envelope distortion product. The experiment was similar to experiment 1, but was modified in the following ways.

(1) Only a single value of Δ was used, namely 135°. This was the value that led to the poorest performance in experiment 1. For this value of Δ, the envelope distortion product should have been almost opposite in phase to the probe modulation.

(2) No feedback was given. This meant that subjects could not use the feedback to modify their strategy, and made it more likely that they would always pick the interval in which the modulation depth sounded greater.

(3) The modulation depth of the two-component masker modulator was increased. The value of m for each component of the masker was set to 0.48. This was done since it seemed likely that the magnitude of the hypothetical distortion product would increase with increasing modulation depth of the masker (Moore and Sek, 2000; Sek and Moore, 2003).
One possible problem here is that the phase of the hypothetical distortion product might change with the modulation depth of the masker. Thus, the value of Δ of 135° might not be optimal for producing cancellation of the probe modulation. The maximum amplitude of the masker modulator, given that the two modulator components started in sine phase, was 0.9573. To avoid overmodulation, the modulation depth of the probe was not allowed to exceed 0.0447. Given that Δ was 135°, the maximum amplitude of the masker and probe modulators combined was 0.9839.

(4) The probe modulation depths were chosen to be small, so that they were likely to be comparable with the modulation depth of the hypothetical distortion product, as estimated in experiment 1. This was done to increase the likelihood of finding cancellation effects.

(5) Several closely spaced values of m_p were used, to avoid the possibility of missing the range of values of m_p over which d′ was negative.

B. Method

The subjects were the same as for experiment 1. The stimuli and method were also almost the same as for experiment 1, except that no feedback was provided. Two sets of runs were conducted, covering different ranges of the probe modulation depth, m_p. In one set, the values were 0.005, 0.007, 0.01, 0.014, and 0.02. In a second set, the values were 0.02, 0.028, 0.035, 0.04, and 0.045. Each run was repeated at least 40 times, so that at least 400 judgments were obtained for each value of m_p.

C. Results

The results for each subject are shown in Fig. 3. The results for the two sets of values of m_p used in experiment 2 are shown by squares and circles. The triangles show results from experiment 1 for Δ = 135°; note, however, that the masker modulation depth was greater in experiment 2 than in experiment 1. The results for small values of m_p show that there is a range over which d′ is consistently negative.
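The overmodulation limits quoted in the rationale of experiment 2 (peak masker modulator amplitude of 0.9573, and 0.9839 when a probe of depth 0.0447 is added at Δ = 135°) can be reproduced with a short numerical check. This is a sketch under two assumptions that are not spelled out in the text: the probe is written as m_p cos(2π·5t + Δ), i.e., Δ is treated as a phase advance relative to cosine phase, and "maximum amplitude" is taken to mean the largest absolute excursion of the modulator, which is what determines overmodulation. All names are illustrative.

```python
import math

FS = 50_000          # samples per second, evaluated over 1 s
M_MASKER = 0.48      # modulation index of each masker component in experiment 2
M_PROBE_MAX = 0.0447 # largest permitted probe modulation depth (from the text)
DELTA_DEG = 135.0    # relative phase of probe and venelope in experiment 2


def masker(t):
    """Two-component masker modulator, both components starting in sine phase."""
    return M_MASKER * (math.sin(2 * math.pi * 55.0 * t) + math.sin(2 * math.pi * 50.0 * t))


def probe(t, m_p, delta_deg):
    """5-Hz probe; delta = 0 puts its peak at t = 0, on the venelope peak (assumed convention)."""
    return m_p * math.cos(2 * math.pi * 5.0 * t + math.radians(delta_deg))


times = [n / FS for n in range(FS)]
peak_masker = max(abs(masker(t)) for t in times)
peak_combined = max(abs(masker(t) + probe(t, M_PROBE_MAX, DELTA_DEG)) for t in times)

print(f"peak |masker|:         {peak_masker:.4f}")
print(f"peak |masker + probe|: {peak_combined:.4f}")
print(f"overmodulation avoided: {peak_combined < 1.0}")
```

Under these assumptions both peak values agree with the figures given in the text to four decimal places, and the combined modulator stays below 1, so the envelope never reaches zero.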
The minimum value of d′ occurs for 20 log(m_p) ≈ −30, although the value varies across subjects from −34 to −27. At the minimum, the value of d′ is about −0.43. Confidence intervals for d′ were calculated as described by Miller (1996). For 400 forced-choice trials and for d′ ≈ 0.4, the 95% confidence interval is about ±0.2. Thus, d′ values in the vicinity of the minimum differ significantly from zero (p < 0.05).

The results support the idea that the auditory system generates a weak envelope distortion component at the venelope rate (5 Hz). When the probe modulation depth is comparable to the effective modulation depth of the distortion component, and when Δ = 135°, the probe and distortion component modulation nearly cancel, leading subjects to select the "wrong" interval as containing the probe modulation.

It seems reasonable to assume that the probe and distortion component are nearly equal in effective modulation depth when d′ is at its most negative value. As noted above, the minimum value of d′ occurred for 20 log(m_p) ≈ −30. Thus, the results suggest that the distortion component in the internal representation of the envelope has a magnitude approximately equal to that produced by an input modulation depth of 0.032. The estimated values of m_d for individual subjects are similar to those estimated from the data of experiment 1, and are in the same rank order; the value is highest for AW and lowest for AS. It is curious that the estimated values were not higher for experiment 2 than for experiment 1, as the masker modulation depth was greater in experiment 2. Possibly, the relative phase of the distortion component varies with the masker modulation depth, and the value of Δ chosen for experiment 2 was not optimal for producing cancellation of the distortion component and probe modulation.
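To make the link between percent correct, negative d′, and the quoted confidence interval concrete, the sketch below uses the standard two-alternative forced-choice relation d′ = √2·z(p), where z is the inverse of the standard normal distribution function. The confidence interval here is a simple delta-method approximation based on the binomial standard error. It is an assumption made for illustration and is not claimed to be the procedure of Miller (1996), which the paper used. Function names are hypothetical.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # zero mean, unit variance


def dprime_2afc(p_correct):
    """d' for a two-alternative forced-choice task: sqrt(2) * z(proportion correct)."""
    return math.sqrt(2.0) * _STD_NORMAL.inv_cdf(p_correct)


def dprime_ci_halfwidth(p_correct, n_trials, z_crit=1.96):
    """Approximate half-width of the 95% confidence interval for d' (delta method).

    The binomial standard error of p is propagated through d' = sqrt(2) * z(p),
    whose derivative with respect to p is sqrt(2) / pdf(z).
    """
    z = _STD_NORMAL.inv_cdf(p_correct)
    se_p = math.sqrt(p_correct * (1.0 - p_correct) / n_trials)
    se_dprime = math.sqrt(2.0) * se_p / _STD_NORMAL.pdf(z)
    return z_crit * se_dprime


p = 0.38      # proportion "correct" below chance (illustrative value)
n = 400       # number of judgments per probe depth in experiment 2 (from the text)
d = dprime_2afc(p)
half = dprime_ci_halfwidth(p, n)
print(f"p = {p:.2f}: d' = {d:.2f}, approximate 95% CI = +/-{half:.2f}")
```

Under this approximation, a score of about 38% correct over 400 trials corresponds to d′ near −0.43 with a confidence half-width a little under 0.2, which is in line with the statement that d′ values near the minimum differ significantly from zero.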
The effective level of the distortion component estimated from experiments 1 and 2 is comparable to that estimated by Moore et al. (1999). For a two-component modulator with m = 0.3 for each component, the distortion product was estimated to have an effective modulation index of 0.027 (20 log m = −33.7). In their model, Ewert et al. (2002) and Verhey et al. (2003) assumed that the effective magnitude of the venelope component was scaled by a factor of 0.3 relative to the envelope. For example, for a two-component modulator with m = 0.3 for each component, the venelope amplitude fluctuates between 0 and 0.6, so the scaled venelope would have a peak-to-valley ratio of 0.2. Due to the fact that the two-component modulator does not produce a sinusoidal venelope, the venelope component at the difference frequency would have a value of m of about 0.076 (20 log m = −22.4). This is somewhat larger than estimated from experiments 1 and 2. The difference across studies may simply reflect individual differences.

IV. DISCUSSION

There are various ways in which nonlinearities in the auditory system might introduce a distortion component at the venelope rate into the internal representation of the envelope. Basilar-membrane compression is probably not involved, since that nonlinearity would introduce a distortion component that was 180° out of phase with the venelope. In any case, the detection of AM of a sinusoidal carrier probably depends strongly on the use of information from the high-frequency side of the excitation pattern evoked by the carrier (Zwicker, 1956; Moore and Sek, 1994; Kohlrausch et al., 2000). This part of the excitation pattern appears to be processed almost linearly on the basilar membrane, at least for medium to high frequencies (Rhode and Robles, 1974; Sellick et al., 1982), so a distortion component at the venelope rate would not be introduced.
Basilar-membrane nonlinearity might play a greater role if subjects were forced to attend to the outputs of auditory filters tuned close to the carrier frequency (which is not the case for most previous studies, or the current one).

In experiments similar to the present ones, Füllgrabe et al. (2004) measured the detectability of 5-Hz probe modulation of a 5-kHz carrier in the presence of a "second-order" modulator (Lorenzi et al., 2001a; 2001b), as a function of the relative phase, Δ, between the probe modulation and the venelope of the second-order modulation. They included conditions with a notched noise centered at 5 kHz, which was intended to restrict off-frequency listening. In the absence of the notched noise, the value of Δ leading to poorest detectability of 5-Hz second-order modulation varied with the first-order modulation rate, suggesting that at least one component of the nonlinearity that generates the envelope distortion product is time varying; an instantaneous nonlinearity would lead to an envelope distortion component with a relative phase that was independent of the first-order rate. In the presence of the notched noise, the value of Δ giving poorest detectability hardly varied with first-order modulation rate, but it did vary across subjects, from about 45° to 135°. These results suggest that more than one mechanism may contribute to the nonlinearity.

Possible nonlinearities contributing to distortion in the internal representation of the envelope occur in peripheral transduction processes (Yates, 1990) and in peripheral adaptation effects (Smith, 1977). The model of Dau and co-workers (Dau et al., 1997a; 1997b) incorporates "adaptation loops" to simulate adaptation processes, which introduce strong nonlinearity, but only for very low modulation rates (below about 2 Hz). Verhey et al. (2003) considered several models for generating a distortion component at the venelope rate.
These included a ‘‘threshold’’ model, which effectively produced half-wave rectification of the ac-coupled envelope, and a model in which the venelope was explicitly extracted. They concluded that the venelope model gave the best fit to their data. However, for this model, the best performance is predicted to occur when ⌬ is exactly equal to 0° and the worst performance is predicted when ⌬ is exactly 180°. In our experiment 1, performance was poorest for ⌬⫽135 and 180°, and the form of the data suggested a threshold maximum centered at about 155°. In the study of Füllgrabe et al. 共2004兲, described above, the value of ⌬ leading to poorest performance was often below 180° when no notched noise was used, and was consistently below 180° when a notchednoise was used. Thus, it seems clear that the envelope distortion component is not always exactly in phase with the venelope, and that the distortion component phase may vary from one subject to another. This casts some doubt upon the idea that the envelope distortion component results from explicit extraction of the venelope at some stage in the auditory system. V. CONCLUSIONS The following conclusions can be drawn from this study. 共1兲 Experiment 1 showed that, in the presence of a pair of masker modulators beating at a 5-Hz rate 共50 and 55 Hz兲, the detectability of 5-Hz probe modulation was dependent on the phase of the probe modulation relative to the beat cycle of the masker modulators. The relative phase, ⌬, is defined as zero when the peak amplitude of the probe modulation coincides with a peak in the beat cycle 共the peak in the venelope of the masker兲. The best A. Sek and B. C. J. Moore: Modulation distortion 共2兲 共3兲 共4兲 共5兲 performance occurred when ⌬ was 0° or 315°. The poorest performance occurred when ⌬ was 135° or 180°. 
(2) The pattern of the results for experiment 1 could be fitted well based on the assumption that the auditory system introduced a weak distortion component in the modulation spectrum at a 5-Hz rate. Performance appears to be based on the difference between the modulation depth of the distortion component (in the nonsignal interval) and of the vector sum of the distortion component and the probe modulation (in the signal interval). The value of d′ is linearly related to this difference.
(3) Experiment 2 used a fixed value of Δ of 135°; this was the value that led to the poorest performance in experiment 1. In contrast to experiment 1, no feedback was given. The results showed that d′ values were consistently negative over a range of probe modulation depths, m_p; in other words, over this range of m_p subjects consistently identified the probe modulation as being in the wrong interval of the two-alternative forced-choice task. The value of m_p leading to the most negative value of d′ was about 0.032 [20 log(m_p) = −30].
(4) The results are consistent with the idea that nonlinearities within the auditory system can introduce a weak distortion component in the internal representation of the envelopes of the stimuli, although a compressive nonlinearity does not account for the results. In the case of two-component beating modulators, a weak component is introduced at the beat rate. Even for large modulation depths of the two-component modulator, the effective modulation depth of the distortion component appears to be only about 0.03.
(5) For the conditions of our experiment, the envelope distortion component appears to have a phase between 0° and −45° relative to the venelope; the best estimate of the relative phase was −25°. However, the relative phase may vary across conditions (e.g., with the frequencies of the masker modulator components) and across subjects.

ACKNOWLEDGMENTS

This work was supported by the Wellcome Trust and the Medical Research Council (UK).
We thank Neal Viemeister, Torsten Dau, Christian Lorenzi, and one anonymous reviewer for helpful comments on an earlier version of this paper.

Bacon, S. P., and Grantham, D. W. (1989). “Modulation masking: Effects of modulation frequency, depth, and phase,” J. Acoust. Soc. Am. 85, 2575–2580.
Dau, T., Kollmeier, B., and Kohlrausch, A. (1997a). “Modeling auditory processing of amplitude modulation. I. Detection and masking with narrow-band carriers,” J. Acoust. Soc. Am. 102, 2892–2905.
Dau, T., Kollmeier, B., and Kohlrausch, A. (1997b). “Modeling auditory processing of amplitude modulation. II. Spectral and temporal integration,” J. Acoust. Soc. Am. 102, 2906–2919.
Ewert, S. D., and Dau, T. (2000). “Characterizing frequency selectivity for envelope fluctuations,” J. Acoust. Soc. Am. 108, 1181–1196.
Ewert, S. D., Verhey, J. L., and Dau, T. (2002). “Spectro-temporal processing in the envelope-frequency domain,” J. Acoust. Soc. Am. 112, 2921–2931.
Füllgrabe, C., and Lorenzi, C. (2003). “The role of envelope beat cues in the detection and discrimination of second-order amplitude modulation,” J. Acoust. Soc. Am. 113, 49–52.
Füllgrabe, C., Moore, B. C. J., Demany, L., Ewert, S., Sheft, S., and Lorenzi, C. (2004). “Modulation masking produced by 2nd-order modulators,” J. Acoust. Soc. Am. (submitted).
Hacker, M. J., and Ratcliff, R. (1979). “A revised table of d′ for M-alternative forced choice,” Percept. Psychophys. 26, 168–170.
Houtgast, T. (1989). “Frequency selectivity in amplitude-modulation detection,” J. Acoust. Soc. Am. 85, 1676–1680.
Kay, R. H. (1982). “Hearing of modulation in sounds,” Physiol. Rev. 62, 894–975.
Kohlrausch, A., Fassel, R., and Dau, T. (2000). “The influence of carrier level and frequency on modulation and beat-detection thresholds for sinusoidal carriers,” J. Acoust. Soc. Am. 108, 723–734.
Lorenzi, C., Soares, C., and Vonner, T. (2001a).
“Second-order temporal modulation transfer functions,” J. Acoust. Soc. Am. 110, 1030–1038.
Lorenzi, C., Simpson, M. I., Millman, R. E., Griffiths, T. D., Woods, W. P., Rees, A. et al. (2001b). “Second-order modulation detection thresholds for pure-tone and narrow-band noise carriers,” J. Acoust. Soc. Am. 110, 2470–2478.
Miller, J. (1996). “The sampling distribution of d′,” Percept. Psychophys. 58, 65–72.
Millman, R. E., Green, G. G., Lorenzi, C., and Rees, A. (2003). “Effect of a noise modulation masker on the detection of second-order amplitude modulation,” Hear. Res. 178, 1–11.
Moore, B. C. J., and Glasberg, B. R. (2001). “Temporal modulation transfer functions obtained using sinusoidal carriers with normally hearing and hearing-impaired listeners,” J. Acoust. Soc. Am. 110, 1067–1073.
Moore, B. C. J., and Sek, A. (1992). “Detection of combined frequency and amplitude modulation,” J. Acoust. Soc. Am. 92, 3119–3131.
Moore, B. C. J., and Sek, A. (1994). “Effects of carrier frequency and background noise on the detection of mixed modulation,” J. Acoust. Soc. Am. 96, 741–751.
Moore, B. C. J., and Sek, A. (2000). “Effects of relative phase and frequency spacing on the detection of three-component amplitude modulation,” J. Acoust. Soc. Am. 108, 2337–2344.
Moore, B. C. J., Sek, A., and Glasberg, B. R. (1999). “Modulation masking produced by beating modulators,” J. Acoust. Soc. Am. 106, 908–918.
Rhode, W. S., and Robles, L. (1974). “Evidence from Mössbauer experiments for nonlinear vibration in the cochlea,” J. Acoust. Soc. Am. 55, 588–596.
Sek, A., and Moore, B. C. J. (1994). “The critical modulation frequency and its relationship to auditory filtering at low frequencies,” J. Acoust. Soc. Am. 95, 2606–2615.
Sek, A., and Moore, B. C. J. (2003). “Testing the concept of a modulation filter bank: The audibility of component modulation and detection of phase change in three-component modulators,” J. Acoust. Soc. Am. 113, 2801–2811.
Sellick, P.
M., Patuzzi, R., and Johnstone, B. M. (1982). “Measurement of basilar membrane motion in the guinea pig using the Mössbauer technique,” J. Acoust. Soc. Am. 72, 131–141.
Sheft, S., and Yost, W. A. (1997). “Modulation detection interference with two-component masker modulators,” J. Acoust. Soc. Am. 102, 1106–1112.
Shofner, S., Sheft, S., and Guzman, S. J. (1996). “Responses of ventral cochlear nucleus units in the chinchilla to amplitude modulation by low-frequency, two-tone complexes,” J. Acoust. Soc. Am. 99, 3592–3605.
Smith, R. L. (1977). “Short-term adaptation in single auditory-nerve fibers: Some poststimulatory effects,” J. Neurophysiol. 49, 1098–1112.
Strickland, E. A., and Viemeister, N. F. (1996). “Cues for discrimination of envelopes,” J. Acoust. Soc. Am. 99, 3638–3646.
Taylor, M. M., Forbes, S. M., and Creelman, C. D. (1983). “PEST reduces bias in forced-choice psychophysics,” J. Acoust. Soc. Am. 74, 1367–1374.
Verhey, J. L., Ewert, S., and Dau, T. (2003). “Modulation masking produced by complex tone modulators,” J. Acoust. Soc. Am. 114, 2135–2146.
Yates, G. K. (1990). “Basilar membrane nonlinearity and its influence on auditory nerve rate-intensity functions,” Hear. Res. 50, 145–162.
Zwicker, E. (1952). “Die Grenzen der Hörbarkeit der Amplitudenmodulation und der Frequenzmodulation eines Tones (The limits of audibility of amplitude modulation and frequency modulation of a pure tone),” Acustica 2, 125–133.
Zwicker, E. (1956). “Die elementaren Grundlagen zur Bestimmung der Informationskapazität des Gehörs (The foundations for determining the information capacity of the auditory system),” Acustica 6, 356–381.
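
Supplementary sketch. The vector-sum account stated in the conclusions (performance depends on the difference between the depth of the distortion component alone and the depth of the vector sum of distortion component and probe) can be illustrated numerically. The following is a minimal sketch and not the authors' fitting procedure. The distortion depth (0.03) and phase (−25° relative to the venelope) are the estimates quoted in the conclusions; the function names and the slope `k` relating the depth difference to d′ are hypothetical and chosen only for illustration.

```python
import cmath
import math


def depth_difference(m_p, delta_deg, m_d=0.03, phi_d_deg=-25.0):
    """Depth of (distortion + probe) minus depth of distortion alone.

    m_p        probe modulation depth
    delta_deg  probe phase relative to the venelope, in degrees
    m_d        assumed depth of the distortion component (about 0.03 in the text)
    phi_d_deg  assumed phase of the distortion component (about -25 deg in the text)
    """
    distortion = cmath.rect(m_d, math.radians(phi_d_deg))
    probe = cmath.rect(m_p, math.radians(delta_deg))
    # Signal interval: vector sum; nonsignal interval: distortion component only.
    return abs(distortion + probe) - m_d


def predicted_d_prime(m_p, delta_deg, k=100.0, **kwargs):
    """d' taken as proportional to the depth difference.

    k is a hypothetical slope, not a value from the paper.
    """
    return k * depth_difference(m_p, delta_deg, **kwargs)


if __name__ == "__main__":
    # Probe fixed at delta = 135 deg, as in experiment 2.
    for level_db in (-40, -35, -30, -25, -20):
        m_p = 10 ** (level_db / 20)
        diff = depth_difference(m_p, 135.0)
        print(f"20 log(m_p) = {level_db} dB, m_p = {m_p:.4f}, difference = {diff:+.4f}")
```

Under these assumptions the difference is negative for small probe depths at Δ = 135° (the probe partly cancels the distortion component, so the signal interval contains less 5-Hz modulation than the nonsignal interval) and becomes positive once the probe depth is well above that of the distortion component. At Δ = 0° the difference is positive for any probe depth.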