Formant amplitude measurements

Dept. for Speech, Music and Hearing
Quarterly Progress and Status Report
Fant, G. and Mártony, J.
journal: STL-QPSR
volume: 4
number: 1
year: 1963
pages: 001-005
http://www.speech.kth.se/qpsr
I. SPEECH ANALYSIS
A. FORMANT AMPLITUDE MEASUREMENTS
A survey of various measures of formant amplitudes and their interrelations was made in an earlier quarterly report (1,2) and in a paper to the Stockholm Speech Communication Seminar (3). It is the purpose of the following report to summarize some of the results and present a revised system of symbols for the various measures, together with additional material illustrating the superposition effects at a gliding pitch.
The various concepts of formant amplitude we have studied are illustrated by Fig. 1-1. These are

A_s  Spectrum envelope amplitude
A_e  Root mean square amplitude
A_a  Average amplitude
A_p  Peak amplitude
A_i  Initial voice period peak amplitude
Common to all of these measures is that they are taken at the frequency of the formant and not necessarily at the exact frequency of a spectral maximum. The formant frequency F_n and the formant bandwidth B_n of formant number n are simply defined by the imaginary and real part of the corresponding pole.
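As a small illustration of this definition, the sketch below recovers a formant frequency and bandwidth from a complex pole. It assumes the common convention that the pole is written s_n = -pi*B_n + j*2*pi*F_n in rad/s. The report only states that the imaginary and real parts define the two quantities, so the scaling factors, the function name `formant_from_pole` and the example values are assumptions.

```python
import math


def formant_from_pole(pole: complex) -> tuple[float, float]:
    """Return (F_n, B_n) in c/s for a pole s_n = -pi*B_n + j*2*pi*F_n.

    Assumed convention: the pole is given in rad/s in the s-plane.
    """
    f_n = abs(pole.imag) / (2.0 * math.pi)   # imaginary part -> formant frequency
    b_n = -pole.real / math.pi               # real part -> formant bandwidth
    return f_n, b_n


# Hypothetical example: a formant at 500 c/s with a 50 c/s bandwidth.
pole = complex(-math.pi * 50.0, 2.0 * math.pi * 500.0)
f_n, b_n = formant_from_pole(pole)
print(f"F_n = {f_n:.1f} c/s, B_n = {b_n:.1f} c/s")
```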
The amplitude of formant number n is denoted by A_n if no reference is made to how it is measured, and otherwise with an additional subscript s, e, a, p or i. Thus A_2s is the spectrum envelope amplitude of the second formant. If amplitudes are to be expressed in dB it is recommended to adopt the symbol L for level.

L_n = 20 log10 (A_n / A_0)

denotes the level of formant No. n, where A_0 is a reference amplitude.

The A_s measure is confined to the envelope of a line spectrum.
Fig. 1-1. Illustrations of the various concepts of formant amplitude. Note that spectral amplitudes are taken at the formant frequency (pole frequency), which does not necessarily coincide with a point of maximum amplitude. [Figure labels: Formant amplitude concepts; Harmonic spectrum; Spectrum envelope amplitude A_s; Broad-band spectrum; r.m.s. amplitude A_e; Average amplitude A_a; Single formant time function; Onset; Initial amplitude A_i; Stationary conditions; Peak amplitude A_p.]

The A_e measure can be calculated from an r.m.s. summation of a suitable number of harmonics within the range of the formant peak. At high F_0 in particular, this involves rather arbitrary decisions as to which harmonics to include, especially if two formants lie fairly close. Both the A_e and the A_a measures are more conveniently derived from a prefiltering, rectification and smoothing of the time function, in which case the rectifier shall have square law characteristics for A_e and linear characteristics for A_a. The band-pass filter used for the prefiltering shall have a recommended width of 500 c/s and be centered at the formant frequency.
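The rectification and smoothing step can be sketched as follows. This is a minimal illustration, not the measuring equipment described in the report: it assumes that the signal has already passed the band-pass prefilter, it replaces the smoothing filter by a plain average over the whole record, and the function name `rectified_level` and the test signal are hypothetical.

```python
import math


def rectified_level(samples, law):
    """Rectify and smooth (here: average over the whole record).

    law="square" gives the r.m.s. value (square-law rectifier, for A_e);
    law="linear" gives the mean absolute value (linear rectifier, for A_a).
    """
    n = len(samples)
    if law == "square":
        return math.sqrt(sum(x * x for x in samples) / n)
    if law == "linear":
        return sum(abs(x) for x in samples) / n
    raise ValueError("law must be 'square' or 'linear'")


# Hypothetical test signal: a steady 500 c/s sinusoid of amplitude 1,
# sampled at 20 kc/s for exactly 0.1 s (50 full cycles).
fs, f, amp = 20000, 500.0, 1.0
signal = [amp * math.sin(2.0 * math.pi * f * k / fs) for k in range(2000)]

a_e = rectified_level(signal, "square")   # close to amp / sqrt(2)
a_a = rectified_level(signal, "linear")   # close to 2 * amp / pi
print(f"square-law: {a_e:.4f}, linear: {a_a:.4f}")
```

For a steady sinusoid the two outputs differ by the constant factor pi / (2 * sqrt(2)), so a linear-rectifier reading can be rescaled to an r.m.s. value when the two measures are to be compared.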
The peak value of the time function envelope within a voice period is denoted by A_p. Ideally, assuming a single point of excitation, it is also the initial value of an exponential. In general, however, because of the possibility of a distributed excitation pattern, the envelope may take an arbitrary shape.
If the effect of superposition from previous voice periods is subtracted, the peak value is by definition A_i. Assuming a build-up with constant periodicity and waveshape, the hypothetical value A_i of the initial or reference voice period is either smaller or greater than A_p, depending on whether the superposition is in phase or out of phase.
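The in-phase and out-of-phase cases can be illustrated with a short numerical sketch. It rests on an assumed idealization that is not taken from the report: each voice pulse starts a complex, exponentially damped oscillation of initial amplitude A_i, and the stationary envelope at the start of a period is the geometric sum of the tails of all earlier periods. The function name and the example values are hypothetical.

```python
import cmath
import math


def peak_to_initial_ratio(f_n, b_n, f0):
    """A_p / A_i for an idealized, exponentially damped formant oscillation.

    Assumed model: each voice pulse starts a complex oscillation
    A_i * exp((-pi*b_n + 2j*pi*f_n) * t); the stationary envelope at the
    start of a period is the geometric sum over all earlier periods.
    """
    r = math.exp(-math.pi * b_n / f0)      # decay over one voice period
    theta = 2.0 * math.pi * f_n / f0       # phase advance over one voice period
    return 1.0 / abs(1.0 - r * cmath.exp(1j * theta))


in_phase = peak_to_initial_ratio(500.0, 50.0, 100.0)      # F_n = 5 * F_0
out_of_phase = peak_to_initial_ratio(550.0, 50.0, 100.0)  # F_n = 5.5 * F_0
print(f"in phase: {in_phase:.3f}, out of phase: {out_of_phase:.3f}")
```

Under this model the ratio exceeds 1 when the formant frequency falls on a harmonic of F_0 and drops below 1 when it falls midway between two harmonics, which is the behaviour described above.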
A study of the simple model adopted for standard synthesis procedure reveals interrelations between these measures that are expressed in terms of

Y_n = πB_n / F_0

These measures are normalized in terms of r.m.s. amplitudes so that A_p, A_e and A_a become numerically equal if Y_n tends to zero, i.e. under conditions when the oscillation decays very little during a voice period. The peak factor A_p/A_e and the form factor A_e/A_a may be calculated from these relations.
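To make the role of the decay exponent concrete, the sketch below evaluates a peak factor and a form factor for a purely exponential envelope. The formulas are an assumption derived here from that idealized envelope (exp(-Y_n t / T_0) over one voice period, with the formant frequency well above F_0); they are not reproduced from the report, and the function name is hypothetical.

```python
import math


def envelope_measures(y):
    """Return (A_e / A_p, A_a / A_p) for an envelope exp(-y * t / T_0).

    y is the decay exponent over one voice period, y = pi * B_n / F_0.
    All three measures are taken as r.m.s.-normalized envelope values.
    """
    a_e = math.sqrt(-math.expm1(-2.0 * y) / (2.0 * y))  # r.m.s. of envelope
    a_a = -math.expm1(-y) / y                           # mean of envelope
    return a_e, a_a


for y in (0.01, 0.5, 1.5, 3.0):
    a_e, a_a = envelope_measures(y)
    peak_factor = 1.0 / a_e        # A_p / A_e
    form_factor = a_e / a_a        # A_e / A_a
    print(f"Y_n = {y:4.2f}: peak factor {peak_factor:.3f}, form factor {form_factor:.3f}")
```

Both factors approach 1 as the decay exponent tends to zero, in line with the normalization stated above, and grow as the oscillation decays more strongly within a voice period.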
It is of interest to study how the various formant amplitude measures vary as a function of an increasing F_0, i.e., when vocal pulses of a constant shape and size are emitted at an increasing rate. The peak amplitude of the first vocal period, A_i, is by definition a constant. When stationary periodic conditions have been reached, the measures A_p, A_e and A_a vary in an oscillatory manner with increasing F_0. Superimposed is a 6 dB/oct rise of A_a and a 3 dB/oct rise of A_e. The spectrum envelope amplitude A_s does not oscillate and increases at a rate of 6 dB/oct.
The simple behavior of A_s may thus be described as follows. An increase of F_0 by an octave is followed by a rise in A_s by 6 dB. However, the number of harmonics within any limited frequency range is halved and the net gain in A_e is thus 3 dB only.
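The arithmetic behind the 6 dB and 3 dB figures can be checked with a few lines. The sketch assumes that doubling F_0 doubles the amplitude of each harmonic (pulses of constant shape emitted at twice the rate) and that the harmonics inside the band have equal amplitude; the harmonic count of 8 and the helper `level_db` are illustrative choices only.

```python
import math


def level_db(amplitude, reference=1.0):
    """Level in dB of an amplitude relative to a reference amplitude."""
    return 20.0 * math.log10(amplitude / reference)


# Doubling F_0 doubles the pulse rate, so each harmonic amplitude doubles.
rise_a_s = level_db(2.0)

# Within a fixed band, N harmonics become N/2 harmonics of doubled amplitude.
n = 8                                         # hypothetical harmonic count at the lower F_0
rms_low = math.sqrt(n * 1.0 ** 2)             # r.m.s. sum of N unit harmonics
rms_high = math.sqrt((n // 2) * 2.0 ** 2)     # r.m.s. sum of N/2 doubled harmonics
rise_a_e = level_db(rms_high, rms_low)

print(f"A_s rise: {rise_a_s:.2f} dB/oct, A_e rise: {rise_a_e:.2f} dB/oct")
```

The r.m.s. sum gains a factor of 4 in power per harmonic but loses half of the harmonics, leaving a factor of 2 in power, i.e. about 3 dB.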
A_s/F_0 or A_i would be ideal parameters for practical work, if they could be automatically measured! Apart from the superimposed oscillations, the measure A_p might be useful for the extraction of formant amplitude parameters in vocoders. It has the benefit of a relatively small average increase with increasing F_0. The A_a parameter, which is the most commonly used measure of formant amplitude in automatic speech analyzing systems, is just as sensitive to the superposition effect as A_e and A_p and has the additional disadvantage of F_0 proportionality.
It is always of interest to see how a theory developed from a mathematical model compares with the true system. The earlier progress report cited above gave data derived from measurements of a male speaker uttering the vowel [E] at a large variety of voice fundamentals. An illustration was also given of the time-frequency-intensity spectrum of a sample phonated at a gliding pitch. This particular speaker, W.J., exercised a high degree of control over his voice, and the superposition effects were rather similar when comparing this sample with that of a synthetic imitation.
Fig. 1-2. Broad-band sonagrams and curves of overall amplitude (intensity) and selective measures of first and second formant amplitude of a sustained vowel [E] at a gliding pitch produced by the speech synthesizer OVE II and by a human subject (B.L.). Note the behavior of second formant amplitude of the human speaker. [Time axes: t, 0 to 1 sec.]
(3) Fant, G., Fintoft, K., Liljencrants, J., Lindblom, B., and Mártony, J.: "Formant amplitude measurements", Paper C2 presented at the Speech Communication Seminar, Stockholm 1962.
(4) Fant, G.: "Acoustic Analysis and Synthesis of Speech with Applications to Swedish", Ericsson Technics 5 (1959).
(5) van den Berg, Jw.: "Myoelastic-aerodynamic theory of voice production", J. of Speech and Hearing Research, 1 (1958) pp. 227-244.
(6) Peterson, G.E. and McKinney, N.P.: "The measurement of speech power", Phonetica, 1 (1961) pp. 65-84.
B. SPECTROGRAPHIC STUDY OF THE DYNAMICS OF VOWEL ARTICULATION
A spectrographic study of vowel reduction has recently been concluded (1). It will be briefly summarized below.
Vowel reduction is said to be a characteristic feature of languages with heavy stress but has to a certain extent also been associated with rate of utterance and contextual influence. There is some evidence in the literature that, articulatorily as well as acoustically, the process of reduction amounts to centralization. Thus in the acoustic domain a reduced vowel is located somewhere along a continuum whose extreme ends are the formant pattern of the unaffected vowel and that of the neutral vowel or schwa.
An experiment was designed to test this hypothesis
and to provide some insight into the dynamic properties of vowel
articulation.
It involved the examination of vowels pronounced
under varying timing conditions and in systematically varied
consonantal environment.
24 nonsense words were formed by commuting /I, 9, Y, e, a, 8, 3, tr/ in three consonantal environments: /b-b/, /d-d/ and /g-g/.
Preliminary experimentation preceded the selection of sentence frames that generated durations of these vowels within 80 to 300 msec. There were four carrier phrases. Each made up one list in which each of the 24 CVC syllables occurred 5 times.