The source/filter theory of speech acoustics

Sound sources
- vocal fold vibration (complex periodic sound)
- frication (turbulent noise)
- stop release burst (impulse source)

Filter
- vocal tract resonances

Vowel acoustics
- voice source: complex periodic wave
- vocal tract filter: resonant tube
- [ə]: Fn = (2n-1)c/(4L)
- [ɑ]: ??
- [i]: ??

A sequence of glottal pulses
One pulse about every 6.7 ms, so the glottal frequency is 150 Hz (150 pulses per second). The "glottal frequency" is also called the fundamental frequency of voicing and is abbreviated F0.

The spectrum of the sequence of glottal pulses
- It has a series of peaks; these are called "harmonics".
- Higher-frequency harmonics generally have lower amplitude than lower-frequency harmonics.
- The frequency of the lowest harmonic (the first harmonic) is equal to the fundamental frequency of the voice: 1st harmonic = F0 = 150 Hz (in this example).
- The higher harmonics are integer multiples of the fundamental; thus the 10th harmonic is at 1500 Hz.
- This is the spectrum of the voice SOURCE. It has yet to be filtered by a vocal tract!

Review schwa acoustics
- Vocal tract is a uniform tube (same diameter for the length of the tube)
- This tube has certain resonant frequencies
- We can calculate the resonances given the length of the vocal tract (assume 17.5 cm for now) and the speed of sound (assume 35,000 cm/s)
- Fn = (2n-1)c/(4L)
  F1 = 500 Hz
  F2 = 1500 Hz
  F3 = 2500 Hz
  ....

The filtered voice source - [ə]
Same set of harmonics as in the unfiltered source, but now the harmonics at 500, 1500 and 2500 Hz are louder.

Not all vowels sound like schwa.
- Today we'll look at the "tube models" approach to modeling vowel differences.
- Next time we'll look at the "perturbation theory" approach.
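The harmonic series of the source and the schwa resonances reviewed above can be checked numerically. The following is a minimal sketch, not part of the original slides: the function names are hypothetical, and the values (F0 = 150 Hz, L = 17.5 cm, c = 35,000 cm/s) are the ones assumed in the example.

```python
# Sketch only: values follow the example above
# (F0 = 150 Hz, L = 17.5 cm, c = 35,000 cm/s).
SPEED_OF_SOUND_CM_S = 35_000.0  # speed of sound assumed in the example


def harmonic_frequencies(f0_hz, count):
    """Return the first `count` harmonics: integer multiples of F0."""
    return [n * f0_hz for n in range(1, count + 1)]


def uniform_tube_resonances(length_cm, count, c=SPEED_OF_SOUND_CM_S):
    """Resonances Fn = (2n - 1) c / (4 L) of a uniform tube of length L."""
    return [(2 * n - 1) * c / (4.0 * length_cm) for n in range(1, count + 1)]


harmonics = harmonic_frequencies(150.0, 10)
formants = uniform_tube_resonances(17.5, 3)

print(harmonics[0], harmonics[9])  # first and tenth harmonic
print(formants)                    # F1, F2, F3 of the schwa tube
```

With these inputs the tenth harmonic comes out at 1500 Hz and the three resonances at 500, 1500 and 2500 Hz, matching the values in the review. Changing `length_cm` shows how a shorter or longer vocal tract shifts every resonance together.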
A tube model of the vowel [ɑ] (low back unrounded)
- narrow back tube, wide front tube
- back tube: fn = (2n-1)c/(4 l_b)
- front tube: fn = (2n-1)c/(4 l_f)

A nomogram of the two-tube model: resonant frequencies for different back tube lengths
If the back tube is 3 cm long, the front tube is 13 cm long.
  F1 = c/(4*13) = 673 Hz (front tube)
  F2 = 3c/(4*13) = 2019 Hz (front tube)
  F3 = c/(4*3) = 2917 Hz (back tube)

In the vowel [ɑ] the front and back tubes are about equal in length.
  F1 and F2 are both near c/(4*8) ≈ 1094 Hz

Prediction of the tube model is basically correct: we find that F1 and F2 in [ɑ] are very close to each other in frequency.
[Figure: [ɑ] with F1 and F2 labeled]

A tube model for the vowel [i]
- back tube, constriction, front tube
- back tube: fn = nc/(2 l_b)
- front tube: fn = (2n-1)c/(4 l_f)
- back tube plus constriction act as a Helmholtz resonator: f = (c/2π) √(A_c/(A_b l_b l_c))
  (A = cross-sectional area, l = length; subscripts b, c, f = back tube, constriction, front tube)

[i] with constriction 11 cm from the glottis
  F1 = 300 Hz (Helmholtz)
  F2 ≈ 1900 Hz, from c/(2*9) with l_b = 9 cm
  F3 ≈ 2200 Hz, from c/(4*4) with l_f = 4 cm

This tube model prediction is approximately correct for [i].
[Figure: [i] with F1 = 400 Hz, F2 = 1900 Hz, F3 = 2500 Hz labeled]
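The tube-model arithmetic for [ɑ] and [i] can be reproduced with a short script. This is a rough sketch under stated assumptions rather than the method behind the slides: all function names are hypothetical, c = 35,000 cm/s is the value assumed earlier, and the cross-sectional areas and constriction length passed to the Helmholtz formula are illustrative guesses, since the slides do not give them.

```python
import math

C = 35_000.0  # speed of sound in cm/s, as assumed in the slides


def quarter_wave(length_cm, n, c=C):
    """n-th resonance (2n - 1) c / (4 l), tube closed at one end, open at the other."""
    return (2 * n - 1) * c / (4.0 * length_cm)


def half_wave(length_cm, n, c=C):
    """n-th resonance n c / (2 l), as used for the [i] back tube."""
    return n * c / (2.0 * length_cm)


def helmholtz(a_c, l_c, a_b, l_b, c=C):
    """Helmholtz resonance (c / 2 pi) * sqrt(A_c / (A_b * l_b * l_c))."""
    return c / (2.0 * math.pi) * math.sqrt(a_c / (a_b * l_b * l_c))


def two_tube_formants(back_cm, front_cm, count=3, per_tube=3):
    """Pool the quarter-wave resonances of both tubes; return the lowest `count`."""
    pooled = [quarter_wave(length, n)
              for length in (back_cm, front_cm)
              for n in range(1, per_tube + 1)]
    return sorted(pooled)[:count]


# [ɑ]-like configuration from the slides: 3 cm back tube, 13 cm front tube.
a_formants = two_tube_formants(3.0, 13.0)

# [i]-like configuration from the slides: 9 cm back tube, 4 cm front tube.
i_f2 = half_wave(9.0, 1)
i_f3 = quarter_wave(4.0, 1)

# HYPOTHETICAL areas (cm^2) and constriction length (cm): not given in the slides.
i_f1 = helmholtz(a_c=0.5, l_c=3.0, a_b=8.0, l_b=9.0)

print([round(f) for f in a_formants])
print(round(i_f1), round(i_f2), round(i_f3))
```

For the [ɑ] configuration the three lowest pooled resonances round to 673, 2019 and 2917 Hz, as in the worked example. The [i] values for F2 and F3 come out near 1944 Hz and 2188 Hz, which the slides round to 1900 and 2200 Hz. The Helmholtz result depends entirely on the assumed areas, so treat it as illustrating the formula and not as a prediction.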