Guerino Mazzola (Fall 2014©): Introduction to Music Technology

III Digital Audio
III.9 (We Oct 29) Phase vocoder for tempo and pitch changes

Phase Vocoder

This algorithm enables tempo and pitch changes of a digital sound file.

Tempo change: same music, but played at a different tempo.
Pitch change: same tempo, but transposed pitch.

The algorithm was first described by James L. Flanagan and R. M. Golden in the paper “Phase Vocoder”, published in Bell System Technical Journal, Vol. 45, No. 9, p. 1493, November 1966.

[Portrait: James L. Flanagan]

The basic technique starts with so-called resampling.

[Figure: two amplitude-versus-time plots, one with time step Δ and one with time step Δ′]

frequencies′ = frequencies · Δ/Δ′

Problem: the sound changes dramatically! Why?

The fundamental idea here is this:

1. Construct a new sample with longer duration and the same pitch ⇒ resampling brings it back to the original duration with higher pitch.
2. Construct a new sample with shorter duration and the same pitch ⇒ resampling brings it back to the original duration with lower pitch.

So basically we are dealing with the time change problem!

The procedure is that we first cover the original signal by a sequence of sound frames of equal length. In order to grasp their commonalities, we choose overlapping frames. Typically the overlap is 75%, and the frame duration is typically 1/20 sec (corresponding to a fundamental frequency of 20 Hz for the finite Fourier transform).
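The covering of the signal by overlapping frames can be sketched in a few lines. This is only an illustration under assumed values: the 48 kHz sampling rate, the one-second signal length, and the helper name `frame_starts` are not from the lecture; only the 75% overlap and the 1/20 sec frame duration are.

```python
def frame_starts(num_samples, frame_len, overlap=0.75):
    """Start indices of equal-length frames covering the signal.

    Successive frames share `overlap` of their length, so the hop
    between frame starts is frame_len * (1 - overlap).
    """
    hop = round(frame_len * (1.0 - overlap))
    if hop <= 0:
        raise ValueError("overlap too large for this frame length")
    # Keep only frames that fit entirely inside the signal.
    return list(range(0, num_samples - frame_len + 1, hop))


# Assumed example values (not from the lecture): 48 kHz sampling rate.
SAMPLE_RATE = 48000
FRAME_LEN = SAMPLE_RATE // 20                  # 1/20 sec = 2400 samples
starts = frame_starts(SAMPLE_RATE, FRAME_LEN)  # one second of audio

hop = starts[1] - starts[0]                    # 600 samples = 1/80 sec

# Count how many frames contain a sample in the middle of the signal.
mid = SAMPLE_RATE // 2
coverage = sum(1 for s in starts if s <= mid < s + FRAME_LEN)
```

With 75% overlap the hop is a quarter of the frame, so every sample away from the signal edges lies in four frames at once.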
[Figure: amplitude versus time, the signal covered by overlapping frames]

The idea is to work on these frames, processing them in the frequency domain, and then to generate a synthesized sound by adding the new frames with different overlapping times, thereby changing the tempo of the overall signal.

A frame is generated from the original sample by multiplying it with a Hanning window function H(t).

[Figure: three amplitude-versus-time plots illustrating the windowing of a frame]

[Figure: amplitude versus time, a frame of duration D]

First step of the algorithm: Analysis

The frame is then transformed to its frequency representation via FFT. The fundamental frequency is of course $f = 1/\text{frame duration} = 1/D$. The highest frequency is $f_s = n \cdot f$, so that we have n frequency intervals from 0 to $f_s (n-1)/n$ Hz. Attention: n has nothing to do with the original sample frequency of the signal!

The frame has 2n temporal samples (it is sampled at rate $2 f_s$), so the temporal delay of ¼ frame has 2n/4 (temporal) samples; this number is called the analytical hop size $\mathrm{hop}_a$. In other words, we have the equation

$$\frac{\mathrm{hop}_a}{2 f_s} = \frac{2n}{4 \cdot 2nf} = \frac{D}{4} = D_a,$$

where $D_a$ is the analytical hop time between successive frames.

What is the problem now? The reproduction of the frames with different distances causes phase problems.

[Figure: analytical hop time $D_a$ compared with synthetic hop time $D_s$]

Second step (processing): We look at the phase problem. Take a sinusoidal signal component $\sin(2\pi f t)$.

[Figure: the sinusoidal signal component in frame i−1 and in frame i]

$$\sin(2\pi f (t + D_a)) = \sin(2\pi f t + \Delta\Phi)$$

$\Delta\Phi$ can be calculated (we omit this). From

$$\Delta\Phi = 2\pi f \cdot D_a$$

we get

$$f = \frac{\Delta\Phi}{2\pi D_a} = \text{“true frequency”}.$$

Third step (synthesis): Replace $D_a$ by the synthetic frame distance $D_s$ and then set a new phase of frame i,

$$\Delta\Phi_{s,i} = \Delta\Phi_{s,i-1} + 2\pi f \cdot D_s,$$

where the (i−1)th phase has been calculated by recursion.
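The phase relations of the second and third steps can be checked numerically on a single sinusoid. The following is a minimal sketch, not the lecture's implementation: the sampling rate, frame length, and test frequency are assumed values, all function names are hypothetical, and the unwrapping of ΔΦ around the bin frequency is common practice that the lecture leaves out.

```python
import cmath
import math


def hann(n):
    """Periodic Hanning window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]


def bin_phase(frame, k):
    """Phase of DFT coefficient k of a Hanning-windowed frame (naive sum)."""
    n = len(frame)
    w = hann(n)
    coeff = sum(w[i] * frame[i] * cmath.exp(-2j * math.pi * k * i / n)
                for i in range(n))
    return cmath.phase(coeff)


def wrap(phi):
    """Wrap an angle into [-pi, pi)."""
    return (phi + math.pi) % (2 * math.pi) - math.pi


def true_frequency(phase_prev, phase_cur, f_bin, d_a):
    """f = delta_phi / (2 pi D_a), with delta_phi unwrapped around the bin.

    The unwrapping is an assumption (standard practice); the lecture
    only states that delta_phi can be calculated.
    """
    expected = 2 * math.pi * f_bin * d_a   # phase advance at the bin frequency
    delta_phi = expected + wrap(phase_cur - phase_prev - expected)
    return delta_phi / (2 * math.pi * d_a)


# Illustrative values (assumptions, not from the lecture).
SAMPLE_RATE = 8000
FRAME_LEN = 256
HOP_A = FRAME_LEN // 4            # quarter-frame hop, i.e. 75% overlap
D_A = HOP_A / SAMPLE_RATE         # analytical hop time in seconds
D_S = 2 * D_A                     # synthetic hop time: half tempo
F_TEST = 450.0                    # sinusoid lying between two bins

signal = [math.sin(2 * math.pi * F_TEST * t / SAMPLE_RATE)
          for t in range(FRAME_LEN + HOP_A)]
k = round(F_TEST * FRAME_LEN / SAMPLE_RATE)   # nearest frequency bin
f_bin = k * SAMPLE_RATE / FRAME_LEN

phase_0 = bin_phase(signal[:FRAME_LEN], k)
phase_1 = bin_phase(signal[HOP_A:HOP_A + FRAME_LEN], k)
f_true = true_frequency(phase_0, phase_1, f_bin, D_A)

# Recursion for the synthesis phase: new phase = previous phase + 2 pi f D_s.
phase_s_0 = phase_0               # the first frame keeps its analysis phase
phase_s_1 = phase_s_0 + 2 * math.pi * f_true * D_S
```

The nearest bin sits at 437.5 Hz, yet the phase difference between the two frames recovers a value very close to the 450 Hz of the test tone. With the synthetic hop time twice the analytical one, the synthesis phase advances twice as far per frame, which keeps the component at the same frequency when the frames are spaced further apart.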
Correct the complex coefficients of the FFT accordingly. Then transform back with the inverse FFT, multiply each frame by a Hanning window, and add all the frames at the synthetic frame distance.
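The coefficient correction and the final windowed addition can be sketched as follows. This is an illustration under assumptions: the function names are hypothetical, the inverse FFT between the two steps is left out, the window is the Hanning window used for the analysis frames, and the frames in the example are constant test data rather than audio.

```python
import cmath
import math


def hann(n):
    """Periodic Hanning window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]


def corrected_coefficient(coeff, new_phase):
    """Keep the magnitude of an FFT coefficient and replace its phase."""
    return cmath.rect(abs(coeff), new_phase)


def overlap_add(frames, hop_s):
    """Window each time-domain frame and add it at multiples of hop_s."""
    frame_len = len(frames[0])
    window = hann(frame_len)
    out = [0.0] * (hop_s * (len(frames) - 1) + frame_len)
    for i, frame in enumerate(frames):
        start = i * hop_s
        for j in range(frame_len):
            out[start + j] += window[j] * frame[j]
    return out


# Hypothetical check: six constant frames of 8 samples, quarter-frame hop.
frames = [[1.0] * 8 for _ in range(6)]
result = overlap_add(frames, hop_s=2)

# Example of a phase correction: magnitude 5 is kept, phase becomes pi/2.
c = corrected_coefficient(3 + 4j, math.pi / 2)
```

At a quarter-frame hop the shifted Hanning windows add up to a constant in the fully overlapped region, so constant input frames give a constant output there, scaled by that constant. For other hop sizes the sum of the windows is in general not constant, so a practical implementation would likely need a gain normalization that the lecture does not discuss.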