Solutions to P105 Problems due 12 Nov. 2001

Berg 6.9

(a) What is Ohm’s Law of Hearing? The sound quality of a complex tone depends only on the amplitudes of its harmonics and not on their relative phases.

(b) Give examples and applications and a counterexample. Two complex waveforms may look completely different because the relative phases of their harmonics differ, but if they have the same frequency spectrum, they will have the same timbre (tone quality). An application is electronic sound recording: the use of magnetic fields changes the phases of different frequency components by different amounts, which produces phase distortion. However, due to Ohm’s Law of Hearing, the resultant audio signal may have a different complex waveform but the same frequency spectrum, and hence it is perceived as having been faithfully reproduced. A counterexample: when two tones are presented at an interval of nearly, but not exactly, one octave, second-order beating occurs. The amplitude of the sum wave remains approximately the same, while the wave shape changes continually at a rate that depends on the deviation of the two frequencies from an exact ratio of 2. In this case the frequency spectrum remains constant, but you will perceive a type of beating.

(c) Define masking. Masking occurs when the presence of one tone affects the perception of a second tone, which may be at a different frequency and intensity. In general, a softer tone, called the masked tone, can be “drowned out” by a louder masking tone.

(d) In general, low-frequency tones readily mask higher-frequency tones, while it is more difficult for higher-frequency tones to mask a low-frequency tone. Therefore, a 1000-Hz tone is more readily masked by a lower-frequency tone, i.e., an 800-Hz tone masks it better than a 1200-Hz tone.
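The claim in Berg 6.9 (b), that two waveforms can look different yet share one frequency spectrum, can be checked numerically. The following is only a minimal sketch: the harmonic amplitudes, the phases, the sample count, and the function names `waveform` and `harmonic_amplitudes` are illustrative assumptions, not values from the problem.

```python
import math

def waveform(amplitudes, phases, n_samples=256):
    """Sample one period of a sum of harmonics k = 1, 2, 3, ..."""
    return [
        sum(a * math.sin(2 * math.pi * k * t / n_samples + p)
            for k, (a, p) in enumerate(zip(amplitudes, phases), start=1))
        for t in range(n_samples)
    ]

def harmonic_amplitudes(samples, n_harmonics):
    """Recover the amplitude of each harmonic by direct Fourier projection."""
    n = len(samples)
    result = []
    for k in range(1, n_harmonics + 1):
        re = sum(s * math.cos(2 * math.pi * k * t / n) for t, s in enumerate(samples))
        im = sum(s * math.sin(2 * math.pi * k * t / n) for t, s in enumerate(samples))
        result.append(2 * math.hypot(re, im) / n)
    return result

# Assumed (hypothetical) harmonic amplitudes; only the phases differ between the two tones.
amps = [1.0, 0.5, 0.25]
wave_a = waveform(amps, [0.0, 0.0, 0.0])
wave_b = waveform(amps, [0.0, math.pi / 2, math.pi])

# The wave shapes differ sample by sample ...
max_shape_difference = max(abs(x - y) for x, y in zip(wave_a, wave_b))
# ... but the recovered harmonic amplitudes (the frequency spectrum) agree.
spectrum_a = harmonic_amplitudes(wave_a, len(amps))
spectrum_b = harmonic_amplitudes(wave_b, len(amps))

print("largest difference in wave shape:", max_shape_difference)
print("spectrum of wave a:", spectrum_a)
print("spectrum of wave b:", spectrum_b)
```

By Ohm’s Law of Hearing, these two tones would be expected to have the same timbre, since only the phases were changed.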
Berg 6.12

(a) What is Bernoulli’s Principle? For fluid flow (e.g., air or any other fluid), the pressure is lower where the velocity of the flow is greater and higher where the velocity is lower.

(b) How is it used to explain the operation of the human vocal folds? The pressure of the air from the lungs forces the vocal folds (flaps) open. When the folds are open, the velocity of the air rushing through the opening is large. By Bernoulli’s principle, the pressure of the air going through the folds then drops. With this reduced pressure, the force from the surrounding tissue is greater than the force exerted by the air, and the folds slam shut. This cycle repeats at a regular interval (period), so the air comes through as a pulse train of puffs.

Berg 6.13

(a) What are vocal formants and how are they important in determining vocal sounds? Vocal formants are the resonances of the vocal tract: frequency regions in which the amplitudes of input frequencies are enhanced relative to other regions.

(b) What is the singing formant and how does it help a singer? The singing formant is an enhancement of frequencies in the region between about 2500 and 3000 Hz. Emphasis of vocal harmonics in this range raises their amplitudes above the average level of the orchestral accompaniment and allows the voice to be heard above the orchestra. Those male voices having a dominant singing formant have better projection than singers lacking such a formant. The normal speaking voice generally lacks emphasis of frequencies in the region of the singing formant.

Averaging all of these ratios, we get $F_n^{\text{male}} / F_n^{\text{female}} = 0.851$. Using the simple model in which the vocal tract is a column of air open at one end and closed at the other, the resonant frequencies, and hence the formants, are given by

$$F_n = \frac{n\,v_{\text{sound}}}{4L}, \qquad n = 1, 3, 5, \ldots$$

Therefore the ratio

$$\frac{F_n^{\text{male}}}{F_n^{\text{female}}} = \frac{n\,v_{\text{sound}} / (4L_{\text{male}})}{n\,v_{\text{sound}} / (4L_{\text{female}})} = \frac{L_{\text{female}}}{L_{\text{male}}} = 0.851$$

is an estimate of the relative male and female vocal tract lengths.

S24.

First, the peaks shown in the “input” plot represent the harmonics of some complex wave with harmonics at 200, 400, 600, 800 Hz, and so on. In class, we have been drawing them as “spikes” (i.e., peaks of infinitely narrow width!). What is shown is more true to real life, where each harmonic peak has some finite width.

Let’s try a few frequencies to understand the procedure. At 400 Hz, the harmonic peak of the input has an amplitude of –5 dB. At that point, the filter function has a value of 0 dB (i.e., it does not suppress the input in any way, which is equivalent to multiplying the input by a factor of 1.0). Therefore, the amplitude of the harmonic peak in the output remains at –5 dB. Now at 200 Hz, the amplitude of the input harmonic peak is 0 dB, and the filter function has a value of –13 dB, so it suppresses the peak by 13 dB (equivalent to multiplying the input by a number less than 1.0). This suppression gives an output amplitude for the first harmonic peak of 0 dB + (–13 dB) = –13 dB. On to 600 Hz: the amplitude of the output harmonic peak is –8 dB + (–12 dB) = –20 dB, and so on. The result is shown in the figure:

[Figure: relative amplitude (0 to –30 dB) versus frequency (200 to 1600 Hz), three panels: (a) Frequency Spectrum at Vocal Cords (source); (b) Filter Function; (c) Frequency Spectrum at end of Vocal Tract.]

S25.
(i) The perceived pitch is almost always given by the fundamental frequency, or first harmonic, $f_1$. Looking at the plot, one sees a harmonic series with frequencies at 200, 400, 600, 800, 1000 Hz, etc. Clearly, $f_1 = 200$ Hz, $f_2 = 400$ Hz, $f_3 = 600$ Hz, and so on. Therefore, the vocal cords were vibrating at a frequency of 200 Hz, and this would be the perceived pitch.

(ii) Now one sees a harmonic series with frequencies at 100, 200, 300, 400, 500 Hz, etc. Clearly, $f_1 = 100$ Hz, $f_2 = 200$ Hz, $f_3 = 300$ Hz, and so on. Therefore, the vocal cords were vibrating at a frequency of 100 Hz, and this would be the perceived pitch.

(iii) Examine closely the “envelope” of each of the frequency spectra. If you were asked to sketch the filter function of (a) and (b), you would find the exact same filter function, i.e., the same frequencies where the spectrum is enhanced (formants). Since the filter functions and formants are the same for (a) and (b), the vocal tract must be held in an identical configuration. Therefore, (a) and (b) would be perceived as identical vowels, but with (a) voiced at a pitch of 200 Hz and (b) at a pitch of 100 Hz.

(iv) Assuming that the input frequency spectrum is flat (it actually falls slowly with frequency, as in question S24, so this is not wildly precise, but it is good enough here), the filter function can be sketched from the common envelope of (a) and (b):

[Figure: pressure amplitude versus frequency (0 to 1800 Hz), three panels: (a) "Envelope"; (b) "Identical envelope"; (c) Filter Function.]
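The dB bookkeeping of S24, and the flat-input assumption used in S25 (iv), can be sketched in a few lines of Python. The three source and filter values are the ones read off in the S24 discussion. The function names are hypothetical, and the conversion to an amplitude ratio assumes the usual 20 log10 convention for amplitudes, which the solutions do not state explicitly.

```python
# Harmonic levels in dB, keyed by frequency in Hz (values from the S24 discussion).
source_db = {200: 0.0, 400: -5.0, 600: -8.0}
filter_db = {200: -13.0, 400: 0.0, 600: -12.0}

def apply_filter(source, filt):
    """Output level in dB = source level + filter value, harmonic by harmonic."""
    return {freq: source[freq] + filt[freq] for freq in source}

def db_to_amplitude_ratio(db):
    """Convert dB to a multiplying factor (assumes the 20*log10 amplitude convention)."""
    return 10 ** (db / 20)

# S24: adding dB values is equivalent to multiplying amplitudes.
output_db = apply_filter(source_db, filter_db)
for freq in sorted(output_db):
    factor = db_to_amplitude_ratio(filter_db[freq])
    print(f"{freq} Hz: output {output_db[freq]} dB (filter multiplies amplitude by {factor:.3f})")

# S25 (iv): with a flat (0 dB) source, the output envelope is just the filter function.
flat_source = {freq: 0.0 for freq in filter_db}
flat_output = apply_filter(flat_source, filter_db)
```

Running the S25 (iv) case backwards is what justifies reading the filter function directly off the envelope of the output spectrum when the source is taken to be flat.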