ACOUSTICAL THEORY OF
SPEECH PRODUCTION
Robert A. Prosek, Ph.D.
CSD 301
Acoustical Theory
• There is nothing more practical than a good
theory
• The linear source-filter theory is one of the
best in our field
• Based on Gunnar Fant’s “Acoustic Theory
of Speech Production”
• The theory expresses articulatory-acoustic
relationships
Acoustical Theory
• The source is vocal fold vibration
• for some consonants, the source is more
complex
• it can be in the vocal tract, or a combination
of vocal fold vibration and a vocal tract source
• The filter is the vocal tract
• extending from the vocal folds to the lips or
nares
• like all filters, the vocal tract is frequency
dependent
Acoustical Theory
• The source and the filter are assumed to be
independent
• this is an assumption made for
convenience
• it implies that you can change the output of
the vocal folds without changing the vocal
tract
• and vice versa
Vowels
• Modeled as a tube closed
at one end and open at the
other
• the closure is a membrane with a slit in it
• the tube has uniform cross sectional area
• membrane represents the source of energy (vocal
folds)
• the energy travels through the tube
• the tube generates no energy on its own
• the tube represents an important class of resonators
• odd-quarter-wavelength relationship: resonances occur at odd multiples of c/4l
Vowels (2)
• There are an infinite number of resonances for this tube
• we need only consider the first three or four
• the model is valid to only about 5 kHz
• The model was developed by Chiba and Kajiyama in
1941
• based on pipe organs for which a great deal was
known
Vowels (3)
• If c = 35000 cm/s and l = 17.5 cm, what are the first three resonances?
• The simple tube closed at one end and open at the
other, with the above length, is a reasonable
approximation of /ʌ/ produced by a male talker
• Some points to note:
Vowels (4)
• A curved tube (vocal tract) and a straight tube (model) behave essentially
identically, acoustically, out to about 5 kHz
• this is because the curve only begins to affect acoustic signals with short
wavelengths (high frequencies)
• The resonances are equally spaced if the tube has uniform cross sectional
area
• Remember: all of the energy comes from the source (vocal fold vibration for
vowels)
• Changing the length of the tube changes the resonance frequencies
• Influenced by age and sex
• l= 14.5 cm for females
• l= 8.75 cm for children
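The worked question on the previous slide can be checked numerically. The sketch below assumes the standard quarter-wavelength formula for a uniform tube closed at one end and open at the other, Fn = (2n - 1)c/4l, with c = 35000 cm/s. The function name `tube_resonances` is hypothetical and used only for illustration.

```python
def tube_resonances(length_cm, c_cm_per_s=35000.0, count=3):
    """Resonances (Hz) of a uniform tube closed at one end, open at the other.

    Uses the odd-quarter-wavelength relation Fn = (2n - 1) * c / (4 * l).
    """
    return [
        (2 * n - 1) * c_cm_per_s / (4.0 * length_cm)
        for n in range(1, count + 1)
    ]


if __name__ == "__main__":
    # Tube lengths quoted in the slides for male, female and child talkers.
    for label, length in [("male", 17.5), ("female", 14.5), ("child", 8.75)]:
        print(label, [round(f) for f in tube_resonances(length)])
```

For l = 17.5 cm this gives 500, 1500 and 2500 Hz. Halving the length to 8.75 cm doubles each resonance to 1000, 3000 and 5000 Hz.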
Vowels (5)
• A one-vowel model isn’t very useful
• Different vowels are modeled, acoustically, by different
vocal tract shapes
• Phonetically, how are vowels distinguished?
• If we place a constriction in the tube (vocal tract)
• the resonances change
• if you change the articulation, you change the vocal
tract shape, and the resonance frequencies,
amplitudes and bandwidths
Vowels (6)
• The output energy of a vowel is the product of
• the source energy
• the size and shape of the resonator
• the radiation characteristic
• Glottal source characteristics for vowels
• vocal fold vibration is periodic
• what does this imply for the spectrum?
• f0 or F0 is used to indicate the vocal fundamental frequency
• the amplitude of the harmonics decreases by about 12
dB/octave (a spectral slope of -12 dB/octave)
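A periodic source has a line spectrum: energy only at integer multiples of F0. The sketch below lists the first few harmonics with levels that fall along the 12 dB/octave slope quoted above. It is an idealization, and both the function name `harmonic_levels` and the F0 of 100 Hz used in the example are assumptions made for illustration.

```python
import math


def harmonic_levels(f0_hz, count=5, slope_db_per_octave=-12.0):
    """Return (frequency_hz, level_db) pairs for the first `count` harmonics.

    Levels are relative to the fundamental (0 dB) and fall along a straight
    line of `slope_db_per_octave` on a log-frequency axis.
    """
    return [
        (n * f0_hz, slope_db_per_octave * math.log2(n))
        for n in range(1, count + 1)
    ]


if __name__ == "__main__":
    for freq, level in harmonic_levels(100):
        print(f"{freq:5d} Hz  {level:7.2f} dB")
```

With F0 = 100 Hz, the second harmonic (200 Hz, one octave up) sits 12 dB below the fundamental and the fourth (400 Hz, two octaves up) sits 24 dB below it.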
Vowels (7)
• Filter characteristics for vowels
• the vocal tract is a dynamic filter
• it is frequency dependent
• it has, theoretically, an infinite number of resonances
• each resonance has a center frequency, an amplitude and a bandwidth
• for speech, these resonances are called formants
• formants are numbered in succession from the lowest
• F1, F2, F3, etc.
• A1, A2, A3, etc.
• B1, B2, B3, etc.
• the formants together form the transfer function
• input-output relationship
• formants become physically evident only when energized
Vowels (8)
• Radiation characteristic
• acoustic effect when a sound leaves a small area
and enters a large one
• The effect is to raise the slope of the spectrum by +6
dB/octave
• Acoustic Phonetic Relationships for Vowels
• F1 is inversely related to tongue height
• F2 is directly related to tongue advancement
• Lip rounding lowers all formant frequencies
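Because the output is a product of source, filter and radiation terms, the contributions add when expressed in decibels. The sketch below combines only the two slopes quoted in these slides (-12 dB/octave for the glottal source, +6 dB/octave for radiation) and leaves the formant filter out. The function names are hypothetical, and the 20·log10 amplitude convention is an assumption.

```python
SOURCE_SLOPE_DB_PER_OCTAVE = -12.0    # glottal source slope, from the slides
RADIATION_SLOPE_DB_PER_OCTAVE = 6.0   # radiation characteristic, from the slides


def db_to_gain(level_db):
    """Convert a level in dB to a linear amplitude ratio (20*log10 convention)."""
    return 10.0 ** (level_db / 20.0)


def net_level_db(octaves_above_reference):
    """Source plus radiation level (dB), a number of octaves above a reference.

    The formant (filter) contribution is deliberately left out, so this is
    only the smooth tilt on which the formant peaks would sit.
    """
    source_db = SOURCE_SLOPE_DB_PER_OCTAVE * octaves_above_reference
    radiation_db = RADIATION_SLOPE_DB_PER_OCTAVE * octaves_above_reference
    return source_db + radiation_db


if __name__ == "__main__":
    for octaves in range(4):
        print(octaves, net_level_db(octaves))
```

The net tilt is -6 dB/octave: multiplying the linear gains for -12 dB and +6 dB gives the same result as the gain for -6 dB.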
Vowels (9)
• Perturbation Theory
• Volume velocity variations reflect the way air particles
vibrate at a particular point in the vocal tract
• At some points, vibration is minimal (nodes); at others,
maximal (antinodes)
• For F1, the antinode is at the open end and the node
is at the closed end
• For F2, there are two antinodes and two nodes
• For F3, there are three antinodes and three nodes
• etc.
Vowels (10)
• Perturbation Theory (continued)
• if a constriction (a local reduction in cross sectional area, a
perturbation) is applied
• the acoustic effect depends on proximity to a node
or an antinode
• near an antinode the formant frequency lowers
• near a node the formant frequency rises
• lip constrictions lower all formant frequencies
• laryngeal constrictions raise all formant frequencies
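The node and antinode counts above can be located along the tube. The sketch below assumes the idealized uniform tube closed at the glottis and open at the lips, where the volume velocity for formant n varies as sin((2n - 1)πx/2), with x the distance from the glottis as a fraction of tube length. The function name is hypothetical.

```python
from fractions import Fraction


def velocity_nodes_and_antinodes(formant_number):
    """Positions of volume-velocity nodes and antinodes for one resonance.

    Positions are fractions of the tube length measured from the closed
    (glottal) end: 0 is the glottis, 1 is the lips.
    """
    m = 2 * formant_number - 1  # odd multiple of a quarter wavelength
    nodes = [Fraction(2 * k, m) for k in range(formant_number)]
    antinodes = [Fraction(2 * k + 1, m) for k in range(formant_number)]
    return nodes, antinodes


if __name__ == "__main__":
    for n in (1, 2, 3):
        nodes, antinodes = velocity_nodes_and_antinodes(n)
        print(f"F{n}: nodes at {[str(p) for p in nodes]}, "
              f"antinodes at {[str(p) for p in antinodes]}")
```

In this model every formant has a node at the glottis (position 0) and an antinode at the lips (position 1), which is consistent with lip constrictions lowering all formants and laryngeal constrictions raising them.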
Vowels (11)
• Amplitude relationships
• amplitudes depend on formant frequencies
• if F1 is lowered (raised), A1 lowers (rises)
• if two formant frequencies move closer together,
then both peaks increase in amplitude
• how do you raise or lower formant frequencies?
Vowels (12)
• Source-Filter Interactions
• Some vocal tract shapes may affect vocal fold
vibration
• Singers’ formant
• High impedance constrictions require greater
subglottal air pressure
• Vocal tract - vocal fold coupling during open phase
of vibratory cycle
Consonants (1)
• The linear source-filter theory can be used to describe
the acoustics of consonants as well as vowels
• For consonants, however, the source is not always at
the level of the vocal folds
• some sources are in the vocal tract
• these sources are aperiodic
• durations and amplitudes also are different from
vowels
• Nonetheless, source-filter theory gives us a series of
expectations for the acoustic characteristics for
consonants
Consonants (2)
Fricatives
• Modeled as a tube with a very severe constriction
• The air exiting the constriction is turbulent
• The Reynolds number gives the conditions for turbulence
• Re = vh/ν, where v is the flow velocity, h is a characteristic dimension
of the constriction, and ν is the kinematic viscosity of air
• Notice that turbulence can be generated in two ways
• Zeros or antiformants can be found in the spectrum
• Because of the turbulence, there is no periodicity unless
accompanied by voicing
• What does an aperiodic spectrum look like?
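The Reynolds number formula can be tried with rough numbers. In the sketch below, the kinematic viscosity of air (about 0.15 cm²/s) is an approximate value, and the critical value of 1800 is only an assumed illustrative threshold; neither comes from the slides. The function names and the example velocities are hypothetical.

```python
KINEMATIC_VISCOSITY_AIR_CM2_PER_S = 0.15  # assumed approximate value for air
ASSUMED_CRITICAL_REYNOLDS = 1800.0        # illustrative threshold, not from the slides


def reynolds_number(velocity_cm_per_s, dimension_cm,
                    viscosity_cm2_per_s=KINEMATIC_VISCOSITY_AIR_CM2_PER_S):
    """Re = v * h / nu, a dimensionless ratio."""
    return velocity_cm_per_s * dimension_cm / viscosity_cm2_per_s


def is_turbulent(velocity_cm_per_s, dimension_cm,
                 critical=ASSUMED_CRITICAL_REYNOLDS):
    """True when Re exceeds the assumed critical value."""
    return reynolds_number(velocity_cm_per_s, dimension_cm) > critical


if __name__ == "__main__":
    # Hypothetical flow velocities through a 0.1 cm constriction.
    for v in (1000.0, 3000.0):
        print(v, round(reynolds_number(v, 0.1)), is_turbulent(v, 0.1))
```

From the formula alone, Re grows when either the flow velocity or the characteristic dimension increases, and falls when the viscosity increases.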
Consonants (3)
• When a fricative constriction is tapered
• the back cavity is involved
• this resembles a tube closed at both ends
• Fn = nc/(2l), for n = 1, 2, 3, ...
• such a situation occurs primarily for articulation
disorders
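The half-wavelength relation for a tube closed at both ends, Fn = nc/(2l), can be evaluated the same way as the vowel tube. The sketch below assumes c = 35000 cm/s; the 10 cm back-cavity length in the example is purely hypothetical, and so is the function name.

```python
def closed_closed_resonances(length_cm, c_cm_per_s=35000.0, count=3):
    """Resonances (Hz) of a uniform tube closed at both ends: Fn = n*c/(2*l)."""
    return [n * c_cm_per_s / (2.0 * length_cm) for n in range(1, count + 1)]


if __name__ == "__main__":
    # Hypothetical 10 cm back cavity, chosen only to illustrate the formula.
    print(closed_closed_resonances(10.0))
```

Unlike the closed-open vowel tube, whose resonances fall at odd multiples of c/4l, these resonances fall at every integer multiple of c/2l.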
Consonants (4)
• Nasal consonants
• Velopharyngeal port is open and the oral cavity is
completely blocked at some point
• The side-branch resonator produces antiformants
(zeros)
• The overall vocal tract is longer than for vowels
• What effect does this have on the spectrum?
• Oral formants, nasal formants, nasal antiformants
• Nasal murmur
Consonants (5)
• Stops
• The tube model is not altered very much for stops
• However, the time domain becomes critical
• There is a complete closure of the vocal tract
somewhere
• Pressure builds up behind the closure
• Rapid release
• The articulation results in a burst and transitions
Consonants (6)
• Other consonants are variations of these
• Affricates
• Liquids
• Glides
• Diphthongs