STANFORD ARTIFlCl,4t INTELLIGENCE LABORATORY : MEMO AIM- 208 STAN-CS-73 -379 THE HETRODYNE FILTER-AS A TOOL FOR ANALYSIS OF TRANSIENT WAVEFORMS BY JAMES ANDERSON MOORER SUPPORTED BY ADVANCED RESEARCH PROJECTS AGENCY ARPA ORDER NO. 457 JULY 1973 COMPUTER SCIENCE DEPARTMENT School of Humanities and Sciences STANFORD UNIVERSITY STANFORD ARTIFICIAL INTELLIGENCE LABORATORY MEMO A I M-208 JULY 14, 1973 COMPUTER SCIENCE DEPARTMENT REPORT CS-379 THE HETROOYNE FILTER AS A TOOL FOR ANALYSIS OF TRANSIENT WAVEFORMS bY James Anderson Moorer ABSTRACT: A method of analysis of transient waveforms is discussed. Its properties and limitations are presented in the context of musical tones. The method is shown to be useful when the risetimes of the partials of the tone are not too short. An extention to inharmonic partials and polyphonic musical sound is discussed. This research was supported in part by the Advanced Research Projects Agency of the Office of the Secretary of Defense under Contract No. SD-183. The views and conclusions contained in this document are those of the author and should not be interpreted as necessarily representing the o f f i c i a l p o l i c i e s , either expressed or implied, of the Advanced Research Projects Agency or the U.S. Government. Reproduced in the USA. Available from the National Technical Information S e r v i c e , S p r i n g f i e l d , V i r g i n i a 22151 List of Figures Figure 1 - Frequency response of a hetrodyne filter with center frequency of 400 Hz and summation period of 10 milliseconds. Figure 2 - Frequency response of a hetrodyne filter with center frequency of 100 Hz and summation period of 10 milliseconds. F i g u r e 3 - Frequency response of a hetrodyne filter with center frequency of 300 Hz and summation period of 10 milliseconds with the signal beginning halfway through the summation. Figure 4 - Hetrodyne filter applied to a signal which is a sum of sinusoids of frequencies 100, 200, and 300 Hz, with exponential attacks of time constants 30, 20 and 10 milliseconds respectively. The graphs are of the outputs of a hetrodyne fi lter with summation period of 10 mi I I iseconds. The center frequencies are, top to bottom, 100, 200 and 300 Hz. F i g u r e 5 - Same data as in figure 4, but the attacks are linear rather than exponential. F i g u r e 6 - The magnitude and phase of the output of the hetrodyne f i l t e r w h e n applied to a 132 Hz guitar tone. The apparent modulation i s t h e r e s u It of beating with an inharmonic partial at 186 Hz. F i g u r e 7 - Frequency frequency frequency, is not response of a hetrodyne f i I ter exactly an integral when the center multiple of the summation INTRODUCTION i f L The analysis of the attack transients of vocal or musical tones goes back as far as 1932 with Backhaus’ [I,21 tunable resonator and drum recorder. Lute a n d C l a r k [3,4,51 used fi I tering methods to L s e l e c t p a r t i a l t o n e s f o r a n a l y s i s a n d r e c o r d i n g . More r e c e n t l y , w i t h the advent of computer music, analysis of musical instruments for the purpose of simulation of timbre has been done by what wil I be called IL a “hetrodyne f i I ter” f o r w a n t o f a b e t t e r n a m e , Beauchamp E6l analysed each partial of a complex waveform by first multiplying the waveform by a sin and cosine at question. the frequency of the partial The result was then low-pass filtered, then squared and s u m m e d . F r e e d m a n 17,8,91, and later Keeler finite in summation over one period [10,111 used a discrete of the fundamental frequency in place of Beauchamp’s low-pass filter, an effect which as we shall show later conveniently places a zero of transmission at all harmonic partials other than the one in question. It is the purpose of this article to explore this method,’ report its characteristics, its limitations, its uses and some simple extensions. THE METHOD L I L Let us define the hetrodyne fi I ter as fol lows, We begin with a d i s c r e t e f u n c t i o n F i w h i c h r e p r e s e n t s a c o n t i n u o u s f u n c t i o n F(t) a t d i s c r e t e i n t e r v a l s t=ih, where h is the time between samples. h is cal l e d t h e “ s a m p l i n g i n t e r v a l . ” T h e r e c i p r o c a l o f h i s c a l l e d t h e “sampl ing frequency” or the “sampling rate.” Let us define a and b as fol lows: &N-l a = I i Fi cos(moih + eo) 1=Qf I I z b = , UC &-N-l Fi sin(woih + 00) t i=a (11 wi I I be c a l l e d t h e “ c e n t e r f r e q u e n c y ” . Without loss of generality, we may define TO = 80 + ah (2) i and thus rewrite the sum as going from 0 to N-l. N-l a = >Fi eos(aoih + 'P,) PO N-l b = ) Fi sin(woih + 'Oo) i.=O i L (3) Thi s change of filter variables hides the t i me-pos i t i on of the in the phase. We must remember that as the filter is advanced t h r o u g h t i m e , the phase angle will increase, and that any results which depend on this phase angle will be functions of time. Since the summation operation is linear, we may represent the input waveform Fi as a sum of sinusiods and may thus examine the response of the fi I ter to a sinusiodal excitation, as is commonly d o n e w i t h I inear fi 1 t e r s . L i N-l a = r A cos(tih + vo) i=O N-l b = x A cos(wih + Q) sin(xoih + u)o) i=O (4) With the he I p o f t h e s u m m a t i o n c a l c u l u s 1121 and some trigonometric ? identities, we may compute the summation in closed form without e r r o r a s f o I lows: L A a =:2N i 1 L I L i L + sinr(tim)Nh/2] cos[(tiw)(N-l)h/2++w] sin[(w=wo)h/2] sin[ w-cl,o)Nh 2] COS[(~-~O)(N-~)~/~+CD~~O] $In[(w=wo)h/2] (5) b=& c sin[(dao)Nh/2] sin[(wftuo)(N-l)n/2+co+Qo sin[(w+wo)h/2] + sin[( wwo)Nh/2] sin[(wwo) (N-l)h/2+Q- CDO sin[ (up(go) h/2 Th S S simpl fied somewhat by computing the sum of the squares of a and b. not a very useful expression as it stands, but it may be a2+b2= iI3 A2 ( ~~~1+ -f&gyp sin[(&Jo)Nh/2]sin[(w-w)Nh/2] $- sin[(&cl)o)h/2] sin[(a=wo)h/2] cos[wo(N-L)h + "'] (6) Now if one chooses N to be such that e Nhwo =27&k= any integer # 0 L L Then some terms in equation 6 collapse to produce a2+b2 = A2 4N2 i L 2c0s[~oh - 2uo] + sin[ (u+dh/2] sin[ (w-ruo)h/2] ) f t L L T h e s q u a r e -root o f t h e a b o v e e x p r e s s i o n I e hetrodyne f i I ter. will be termed the ” magn i t ude” of the output of the hetrodyne filter. The arctangent of the ratio of a to b will be called the “phase” of the output of the Thi s process i s s i m i l a r t o t h e d i s c r e t e F o u r i er transform, e x c e p t t h a t o n l y o n e f r e q u e n c y i s p r o c e s s e d i n s t e a d of many. The results of this analysis can e i the output of the DFT by setting the period of the center frequency to a L i easily be generalized to represent multiple of the sampling interval. If one further assumes that the frequency of the input i sinusoid is close to the center frequency, then we see that lim w-w0 a2+bp =A2 4N2 t, = $2 O+N2+0 define au)= ~-u)o W-W sin~2,oh[(N-l)/2+~+Nsin~f&h[(N-l)/2+~]j cos{2woh[(N-1)/2+oJ~Ncos@h[(N-1)/2+0;]? if N > 1 then L lim $ a tan [&d-4 (N-1)/2 + a13 , I L T h u s w e s e e t h a t i n t h e l i m i t , t h e m a g n i t u d e o f t h e o u t p u t ofthe f i I ter becomes independant of time and becomes a measure of the amplitude of the input sinusoid, The phase of the output filter remains always a function of time, function of time, i the but is also a linear its slope being determined by the difference of the center frequency and the input frequency. i L of L This reveals a method of determining the amplitude of the input sinusiod and getting a better estimate of its frequency. If t h e L center frequency of the summation is near the actual frequency of the input sinusoid, the phase will be very nearly a linear function of t inie, thus we may find the frequency deviation by fitting the phase with a straight line and observing its slope. The consequences of choosing N as above are significant. If the center frequency is L L L a multiple of some fundamental frequency, then we may choose N to coincide with the period of the fundamental and thus cancel out frequency. of all the harmonic partials except the center Figure 1 shows the log of the magnitude of the output t h e hetr&yne filter for a range of sinusoidal inputs. The , center frequency in this plot is 400 Hz and the summation period, Nh, i is L L L f k 10 mi I I i seconds. Figure 2 shows the log of the magnitude versus frequency for a center frequency of 100 Hz and the same summation per i od. Notice the zeros of t r a n s m i s s i o n at al I multiples of the summation period except the center frequency. a L ERROR ANALYSIS i F r e e d m a n [91 a n d K e e l e r [111 both show that this method is f i s u f f i c i e n t l y a c c u r a t e f o r t h e i r purposes even when the input L iL i fL L L L L I L is not a perfect Keeler does not even bother negl igable. The above work may well seduce one as it did the author into believing that this is a perfectly accurate method, universally appl icable. This is not so. T o persue t h e m a t t e r f u r t h e r , l e t u s t h e e q u a t i o n s s o m e w h a t . We s h a l l c o m p u t e n o t t w o reformulate summations but one: &-N-l G = t o? x Fie iwoh+@o (9) i=a The result will be a complex quantity whose real and imaginary parts - correspond t o t h e a a n d b d i s c u s s e d e a r l i e r . We w i l l b e i n t e r e s t e d in the magnitude and the phase shall of G. For the input waveform, Fi, we take a complex sinusoid with exponential decay. G L s inusoids. us with an elegant proof that the error in doing so is and presents . i L of computing the summation at each p o i n t , b u t a t r e g u l a r i n t e r v a l s o n l y , L i L sum signal o! = F-' .(a+j,)ih+e e jLI,oih+Qo i=o = eah[a+j(uI+wo)]+j(@+Oo) eNh'a-+j(w'wo)l- 1 (10) Again, we see the magnitude is related to the amplitude of I the input sinusoid and the phase drift with time is related to the t L L ,‘L I frequency causes differece. T h e e x p o n e n t i a l d e c a y o f t h e i n p u t s i g n a l depending on other harmonic partials, cancellation of imperfect of the attack, the speed and the deviation can be important. A more reveal ing case would be to assume the signal begins and rises exponentially to its steady-state value at zero amplitude, and that the signal begins somewhere during the summation, say at i=@. t jh(urtcuo) - 1 e _ p&+p) e(N-p)hb+hhdl e L L 1 - 1 This is equivalent to setting N to some smaller value. As the filter progresses through the attack, window wi I I t h e e f f e c t i v e widtl h of the approach N, and the response of the filter w i I I become more representative. L hb3+jh+~~)l _ (11) if N and the center frequency see that even Here we are L careful ly chosen, the frequency response is not the same, Figure 3 L L shows the response for a ten ter frequency of 300 Hz, a s u m m a t i o n interval of 10 mi I I i seconds, and P=NJZ. W e s e e that the neighboring not canceled out. This shows that if the signal begins harmonics are anywhere within the sindow, the output should not be taken to be an of the amplitude of the partial. accurate indication It i s e x a c t l y taking N to be a non-multiple of the period of the analagous to f u n d a m e n t a l f r e q u e n c y . F i g u r e 4 shows the output for an averaging The input was a sum of 300 Hz. i frequent i es 180, e center frequencies of 100, 200, and of 10 mi I I iseconds and window 200, and 300 Hz with exponential attacks of time and 10 milliseconds respectively. 30, 20, constants sinusoids of unit amplitude and The leakage among the harmonics is apparent here. If t h e a t t a c k i s n o t e x p o n e n t i a l , a n o t h e r f o r m o f d i s t o r t i o n i can occur. F i g u r e 5 shows the magnitude of the filter output for the same i f r e q u e n c i e s a s f i g u r e 4 , b u t w i t h linear and ten ter input attacks rather than exponential. i I I i a sinii lar to phase is caused by of a guitar an c a u s e an e f f e c t Figure 6 shows the magnitude and ampl i tude modulation. of the fundamental modulation i i n h a r m o n i c partials can The presence of note at 132 Hz. The apparent inharmonic partial at 186 H z , the frequency of a known box resonance. Another source of error is that of frequency quantization. N L can i not in general be the fundamental frequency and still have N be an integer. This can be tolerated, but i multiple frequency. t L chosen such that Nh is exactly the period of it also implies that o f Zn/Nh r a t h e r If we do not than set a the the center frequency must be a multiple center of the frequency fundamental to exactly a L L L L L i;i L t L 7 L !L. m u l t i p l e o f Zn/Nh, w e g e t i m p e r f e c t c a n c e l l a t i o n o f a p o l e a n d a zero. Figure 7 shows the magnitude versus frequency of such a case. Note the doublet around 400 Hz, the center frequency. Since Nh is not exactly the period of the fundamental frequency, the harmonic partials are Fur thermore, most not exactly cancelled string instruments show integral multiples of out. a deviation from perfect the fundamental frequency. This deviation also contributes to leakage among the harmonics. Equation 10 may be to determine exactly how much leakage is present. used A SIMPLE EXTENSION Although there problem of would seem :L before analysis of partials may L L LI no good solution to the the inharmonic partials. if the attack not too swift, it can be filtered time of the partial in question is out be the note beginning within the summation period, there is a technique for dealing with Ii to then the harmonic partials is done. The harmonic be filtered out to analysis allow of the inharmonic partials. is The filter advocated here a comb filter. This is described simply by the recurrence relation: *I-l = x* - x* m (12) L L L L L I i With frequency response peJdl = Jsin*(*h) . We see that the multiples of the comb filter + [co&nh)-l]* has a of transmission at all base frequency l/mh. The only hazards with the comb filter are those of transient response i zero (13) and preturbing the harmonic p a r t i a l s . T h e t r a n s i e n t r e s p o n s e o f a c o m b f i l t e r i s e x p l i c i t . It is identical ly zero beyond mh s e c o n d s . If t h i s can be tolerated, then the f i lter may be useful, If the frequencyy o f s o m e h a r m o n i c p a r t i a l f a l l s n e a r o n e o f the zeros of transmission of the comb, it will be attenuated. We m a y F L prevent this by a At those zeros of t resonater is method which appeared in the comb that we wish to eliminate, a digital used configuration are G o l d and R a d e r [13,141. to cancel out the zero. described in ful I detail The details of the in the references and will not be repeated here. r L L I L L I t L L I i For analysis of an inharmonic partial, the harmonic partials may a l l be e l i m i n a t e d w i t h a s i n g l e comb, subject to the limitation that the partials may not be exact multiples of the fundamental, and that the frequency of the fundamental becomes quantized when the comb length, m, is chosen. The comb wi I I, of course, attenuate the signal under a n a l y s i s b y s o m e a m o u n t . We m a y p r e d i c t t h e a t t e n u a t i o n f r o m equation 13 and then multiply the results of the hetrodyne analysis by the reciprocal of the attenuation factor to obtain a better estimate of the amplitudes of the partials. CURRENT USES The technique of combing out unwanted signals and then I L waveform is the basis for an automated the remaining analysing system for the analysis of polyphonic music currently When two or more instruments are playing developed by the author. simultaneously, a pitch and duration being Fourier analysis is used to get an estimate of the of each note. All notes but one are then I eliminated by combing, L determine the attack time of the note and to correct the estimated pitch of the note. This L piece. The result and the hetrodyne filter is applied to is iterated for each of the notes in is subjected the to a heuristic analysis and is eventually displayed as a musical score of the piece under analysis. i Also under L parameters of musical tones trenielo, L study by a colleague and steady-state is how the physical such as risetimes of the partials, value contribute to the perceived timbre o f t h e i n s t r u m e n t . The hetrodyne method as outlined above is used to determine these parameters from digitized recordings of tones of L ! L L I b actual musical instruments. I CONCLUS ON The hetrodyne filter has been shown to be a useful tones as long as its for the analysis of the partials of musical L 4L L I imi tat ions are observed. quick, perfect It can fail if the frequencies of integral multiples of sampling rate is significant, method if the attack times are the partial3 too deviate too far from the fundamental frequency, or if the so low that frequency quantization effects become It can also fail if substantial frequency modulation (vi brato) i s present. An extention of the method to tones with inharmonic partials and even to multiple use of the comb simultaneous notes was shown to be possible by filter as long as the effects of the transient response of the comb was judged to be tolerable. These methods are currently in use by the author and his colleagues in the analysis of digitized musical sound, , L t 1 L i (_/---/(_ ,A-c.’ir--- --Yz==--/-,--- L L L t L t L L L A-- /--c -*----/---c--- -/’ _--c‘---, /--------c.-x=-m a---L-=z/-c.--------*- /--c/----c--/_-L. -t// 3 . 1 l I i L 1 L L I” C .-I x 0 c (1, 3 cr ul Lk i L ! i t i . P m 64 - t.5 - :: .7 - o- I-” - -- 0 0 . 1- .2 .4 71.d P-- - r-- f----- 25 I f---- f------ r- 50 f-- Figure 4 r-- i--- I-- r-- 75 r-=-- r-- f--- I i i I , i L L i i- -Ln iU - Li, i'l J L c_:, -8 I----J <.Q-,!-I - I L i L f i i i I I I I I I L REFERENCES 1) H. Backhaus, "Ober Ge i genk I ange, " Zei tschr i f t f u r Techni sche Physik, Vol 8, #ll, 1 9 2 7 , pp509-515 L r L 2) H. Backhaus, L L in der Austr i k, " Ze i tschri ft fur Technische Physik, vol 13, #l, ~~31-46, 1932 3) D a v i d A . L “Uber d ie Bedeutung der Ausgl eichsvorgange Lute “Phys i ca I Correlates of Nonpercuss i v e f l u s ical Instrument Tones, " PhD thesi s, MIT, 1963 4) D a v i d Lute, flelvil l e C l a r k , 'Physical Correlates of Brass Instrument Tones," J. Acoust, SOL Am, vol 42, ~1232, 1 9 6 7 . 5) D a v i d Lute, Nonpercussive flelvi I le Clark, 'Duration of Attack Transients of O r c h e s t r a l I n s t r u m e n t s , " J . A u d i o E n g . Sot., vol 13, #3, ~194, 1 9 6 5 . i I t 6) James W. Beauchamp, "A Computer System for Time-Variant Harmonic Analysis and Synthesi s o f M u s i c a l T o n e s , " i n "flusic by Computers, " J. W. Beauchamp and He i nz Von Foerster eds, John Wiley and Sons, i L t . York, 7) M . D . Freedman, "Analysis of M u s i c a l I n s t r u m e n t T o n e s , ' J, A c o u s t . SOL Am vol 41, ~793, 1 9 6 7 t i New A 8) M . D . Freedman, "A Method For Analysing Musical Tones," J Audio E n g . Sot., V o l 1 6 , #4, ~419, O c t . 1 9 6 8 ., . “ A Techni q u e f o r Ana Iysis o f M u s i c a l I n s t r u m e n t 9) M.D. Freedman, IL L Tones, ” PhD d i s s e r t a t i o n , U n i v e r s i t y o f I I I inoi s, Urbana. 1965 ‘The Attack Transients of Some Organ Pipes,” IEEE 10) J.S. K e e ler, Tran. and Electroacoustics, on Aud io V o l A U - 2 0 , #5, pp378-391, December 1372 t L i L I L i L I 11) J . S . K e e l e r , ‘ P i e c e w i s e - P e r i o d i c A n a l y s i s o f A l m o s t - P e r i o d i c Sounds and Musical Transients,” IEEE Tran. on Audio and Electroacoustics, Vol AU-20, #5, ~~338-344, December 1 9 7 2 12) R.W. Hamming, ‘Numerical Methods for Scientists and Engineers,” New ‘fork: McGraw-Hi I I, 1962 Rader, 13) C h a r l e s fl. Techniques in Bernard Go I d, the Frequency Domain,” Proc. “Digi tat Fi I t e r D e s i g n I E E E , v o l 5 5 , #2, F e b . 1 9 6 7 , pp149-171 i 14) B e r n a r d G o l d , C h a r l e s M. R a d e r , “Digital Processing of S i gna I s, ” McGraw-Hill Book Company, New York, 1969 I L L i L