Dept. for Speech, Music and Hearing Quarterly Progress and Status Report Formant amplitude measurements Fant, G. and Mártony, J. journal: volume: number: year: pages: STL-QPSR 4 1 1963 001-005 http://www.speech.kth.se/qpsr I, SPEECH ANALYSIS A. FORMANT AMPLITUDE MEASUREhIE;NTS A survey of various measures of formant amplitudes and their interrelations was made in an earlier quarterly report ( 1 9 2 ) and in a paper to the Stockholm Speech Communication Seminar ( 3 ) It is the purpose of the following report to summarize some of the results and present a revised system of symbols for the various measures together with additional material illustrating the superposition effects at a gliding pitch. The various concepts of formant amplitude we have studied are illustrated by Fig. 1-1. As Spectrum envelope amplitude Ae Root mean square amplitude These are Aa Average amplitude Ai AP Initial voice period peak amplitude Peak amplitude In common for all of these measures is that they are taken at the frequency of the formant and not necessarily at the The formant frequency F n of formant number n are simply defined exact frequency of a spectral maximum. and the formant bandwidth B n by the imaginary and real part of the corresponding pole. The amplitude of formant number n is denoted by An if no reference is made to how it is measured and otherwise with an additional subscript, s, e, a, p or i, Thus AZs is the spectrum envelope amplitude of the second formant, If amplitudes are to be expressed in dB it is recommended to adopt the symbol L for level. denotes the level of formant No. n where A The A spectrum. S 0 is a reference amplitude. measure is confined to the envelope of a line The ke measure can be calculated from an r.m.s. summation of a suitable number of harmonics within the range of the Formant amplitude concepts Harmonic spectrum Spectrum envelope amplitude A, Broad- band spectrum cm.s. amplitude A, Average amplitude A, Initial amplitude Onset Peak amplitude AP Ai I Stationary conditions Sinqle formant timefunction Fig. 1-1. Illustrations of the various concepts of formant amplitude. Note that spectral amplitudes are taken at the formant frequency (pole frequency) which not necessarily coincides with a point of maximum amplitude. formant peak. Especially at high Fo this involves rather arbitrary decisions of which harmonics to include, especially if two formants lie fairly close. Both the Be and the Aa measures are more conveniently derived from a prefiltering rectification and smoothing of the time function in which case the rectifier shall have square law characteristics for A, and linear characteristics for Aa. The band-pass filter used for the prefiltering shall have a recommended width of 500 c/s and be centered at the formant frequency. The peak value of the time function envelope within a voice period is denoted by A P . Ideally assuming a single point of excitation it is also the initial value of an exponential. In general, however, because of the possibility of a distributed excitation pattern the envelope may take an arbitrary shape. If the effect of superposition from previous voice periods is subtracted the peak value is by definition Ai. Assuming a build-up with constant periodicity and waveshape the hypothetical value Ai of the initial or reference voice period is either smaller or greater than A depending on whether the superposition is in P phase or out of phase. A study of the simple model adopted for standard synthesis procedure reveals the following interrelations where Yn = ~B,/F~ These measures a r e normalized i n terms of r . m . s . Be and Aa become n u m e r i c a l l y e q u a l i f Y amplitudes s o t h a t A P t e n d s t o z e r o , i , e . under c o n d i t i o n s when t h e o s c i l l a t i o n decays very l i t t l e during a voice period. The peak f a c t o r A /Ae and t h e form f a c t o r A,/A, P may be c a l c u l a t e d from t h e r e l a t i o n s above. It i s of i n t e r e s t t o s t u d y how t h e v a r i o u s formant amplitude measures v a r y a s a f u n c t i o n of an i n c r e a s i n g F i.e., 0' when v o c a l p u l s e s of a c o n s t a n t shape and s i z e a r e o m i t t e d a t an increasing rate. The peak amplitude of t h e f i r s t v o c a l p e r i o d A. i s by d e f i n i t i o n a c o n s t a n t . When s t a t i o n a r y p e r i o d i c c o n d i t i o n s 1 have been reached t h e measures A P f u n c t i o n w i t h i n c r e a s i n g Foe $ ' Be and Aa v a r y i n an o s c i l l a t o r y Superimposed i s a 6 d ~ / o c t r i s e o f and a 3 d ~ / o c t r i s e of Ae. The spectrum envelope amplitude A s does not o s c i l l a t e and i n c r e a s e s at a r a t e of 6 d ~ / o c t . The simple b e h a v i o r of As may t h u s be d e s c r i b e d a s follows. An i n c r e a s e of F by an o c t a v e i s followed by a r i s e i n 0 As by 6 dB. However, t h e number of harmonics w i t h i n any l i m i t e d frequency range i s halved and t h e n e t g a i n i n Ae i s t h u s 3 dB only. A ~ / Fo r~ Ai would be i d e a l parameters f o r p r a c t i c a l work c o u l d be a u t o m a t i c a l l y measured! - i f they Apart from t h e superimposed o s c i l l a t i o n s t h e measure A might be u s e f u l f o r t h e e x t r a c t i o n of P formant amplitude p a r a m e t e r s i n vocoders. It h a s t h e b e n e f i t of a r e l a t i v e small average i n c r e a s e w i t h i n c r e a s i n g F The A parameter a which i s t h e most commonly used measure of formant amplitude i n 0' a u t o m a t i c speech a n a l y z i n g systems i s j u s t as s e n s i t i v e t o t h e s u p e r p o s i t i o n e f f e c t as Be and A P and h a s t h e a d d i t i o n a l disadvantage of Fo p r o p o r t i o n a l i t y . It i s always of i n t e r e s t t o s e e how a t h e o r y developed from a mathematical model compares w i t h t h e t r u e system. e a r l i e r progress repor 6' gave ) d a t a derived male s p e a k e r u t t e r i n g t h e vowel fundamentals. [E] The from measurements of a a t a large v a r i e t y of voice An i l l u s t r a t i o n w a s a l s o g i v e n of t h e time-frequency- i n t e n s i t y spectrum of a sample phonated a t a g l i d i n g p i t c h . p a r t i c u l a r speaker W.J. This executed a h i g h degree of c o n t r o l over h i s v o i c e and t h e s u p e r p o s i t i o n e f f e c t s were r a t h e r similar comparing t h i s sample w i t h t h a t of a s y n t h e t i c i m i t a t i o n . t 1 I 0 0.5 1 F i g . 1-2. -t t I I sec. 0 0.5 1 Broad-bard Sonagram; nnd c u r v e s of o v e r a l l a m p l i t u d e ( i n t e n s i t y ) and s e l e c t i v e m e a s u r e s of P i r s t and s e c o n d f o r m a n t a m p l i t u d e of a s u s t a i n e d vowel [ E ] a t a g l i d i n g p i t c h p r o d u c e d b y t h e s p e e c h s y n t h e s i z e r OVE I1 and by Note t h e b e h a v i ~ ro f s e c o n d a human s u b j e c t (B.L.). f o r m a n t a m p l i t u d e o f t h e hm,m s p e a k e r . -t sec. (3) Fant, G., Fintoft, K., Liljencrants, J., Lindblom B., and M&rtony, J r "Formant amplitude measurements" Paper C2 presented at the Speech Communication Seminar, Stockholm 1962. . , (4) F m t , G .: "Acoustic Anslysis 2nd Synthesis of Spoech with Applications to Swzdish", Sricsson Technics 5 (1 959). . : i~Myoelastic-aerodynamic theory of voice prod~ction~.~, J. of Speech and Hearing Research, 1 (1 958) pp. 227-2440 ( 5 ) van den Berg, Jw (6) Peterson, G.E, and McKinney, N.P. : "The measurement of speech powerIi,Phonetica, 1 (1 961 ) pp. 65-84 B. SPECTROGWHIC STUDY OF TFE DYNAMICS OF VOWEL ARTICULA!J?ION B spectrographic study of vowel reduction has recently been concluded (' ) . It will be briefly summarized below. Vowel reduction is said to be a characteristic feature of languages with heavy stress but has to certain extent also been associated with rate of utterances and contextual influence. There is some evidence in the literature that, articulatorily as well as acoustically, the process of reduction amounts to centralieation. Thus in the acoustic domain a reduced vowel is located somewhere along a continuum whose extreme ends are the formant pattern of the unaffected vowel and that of the neutral vowel or schwa. An experiment was designed to test this hypothesis and to provide some insight into the dynamic properties of vowel articulation. It involved the examination of vowels pronounced under varying timing conditions and in systematically varied consonantal environment. 24 nonsense words were formed by commuting /I, 9 , Y, e, a, 8, 3, tr/ in three consonantal environments: /b-b/, d and / Preliminary experimentation prcceded the selection of sentence frames that generated durations of these vowels within 80 to 300 msec. There were four carrier phrases. Each made up one list in which each of the 24 CVC syllables occurred 5 times.