Time Frequency Processing
Direct FFT/IFFT Approach

Andrew Nelder 0422391
Colin MacIntosh 0423916
University of Victoria
ELEC 484 – Audio Signal Processing

Abstract

Vocoding technology has been around since the 1930s, when it was used to send securely encoded messages between two points. It can also be used to apply effects to the transmitted sounds, such as robotization and whisperization. These are not the only uses for vocoders, but they are the ones investigated in this project. Applications of phase vocoding continue to change as new uses for it appear. For these new uses, particularly digital telecommunications, efficiency is a necessity, and the standard Linear Predictive Coding (LPC) vocoders are no longer efficient enough. This project will develop an LPC vocoder, experiment with the sound effects, and describe the procedures necessary to improve efficiency and quality while maintaining intelligibility.

1 INTRODUCTION

Due to the large increase in the number of commercial communications technologies, research into phase vocoder technology is becoming more valuable. Understanding the fundamentals of producing a phase vocoder is necessary to produce a system constrained by limited processing and computing power. The objective of this project is to produce a phase vocoder using the Fast Fourier Transform and Inverse Fast Fourier Transform method described in the DAFX: Digital Audio Effects textbook. Once a basic working model is obtained, focused research can be carried out to yield a variety of audio effects subject to theoretical constraints.

Figure 1: Block Diagram of Phase Vocoder Using STFT

Figure 1 displays the steps necessary to implement a phase vocoder. These steps describe the theory behind the direct Fast Fourier Transform and Inverse Fast Fourier Transform method of phase vocoding. Effectively, the input signal is broken down into small packets of data (frames) by windowing. The Fast Fourier Transform is applied to the signal in each of these windows.
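The signal path in Figure 1 can be sketched in a few lines of code. The following is a minimal illustration in Python using only the standard library, not the project's implementation. The function names (fft, ifft, hann, process, robotize), the 64-sample window, and the 16-sample hop are assumptions chosen for the example. Robotization is shown in its commonly described form: the magnitude of every frame's spectrum is kept and its phase is set to zero.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def ifft(X):
    """Inverse FFT computed from the forward FFT by conjugation."""
    n = len(X)
    y = fft([v.conjugate() for v in X])
    return [v.conjugate() / n for v in y]


def hann(n):
    """Periodic Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]


def process(signal, win_len=64, hop=16, effect=None):
    """Window -> FFT -> optional effect -> IFFT -> overlap-add.

    win_len and hop are illustrative defaults (assumptions), and
    win_len must be a power of two for the FFT above.
    """
    w = hann(win_len)
    out = [0.0] * len(signal)
    norm = [0.0] * len(signal)
    for start in range(0, len(signal) - win_len + 1, hop):
        # Analysis: cut one frame and apply the window.
        frame = [signal[start + i] * w[i] for i in range(win_len)]
        spectrum = fft(frame)
        # Frequency-domain manipulation (the audio effect), if any.
        if effect is not None:
            spectrum = effect(spectrum)
        # Synthesis: back to the time domain, window again, overlap-add.
        y = ifft(spectrum)
        for i in range(win_len):
            out[start + i] += y[i].real * w[i]
            norm[start + i] += w[i] * w[i]
    # Divide by the summed squared window so an unmodified signal
    # is reconstructed wherever the window weight is non-zero.
    return [o / n if n > 1e-9 else 0.0 for o, n in zip(out, norm)]


def robotize(spectrum):
    """Keep each bin's magnitude and set its phase to zero."""
    return [complex(abs(v), 0.0) for v in spectrum]
```

Calling process(signal) with no effect is a convenient sanity check: the output should match the input to within rounding error, apart from the very first sample, where the Hann window weight is zero. Calling process(signal, effect=robotize) applies the zero-phase modification to every frame before resynthesis.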
Following the Fast Fourier Transform, signal manipulation occurs in the frequency domain to produce a variety of audio effects. After these manipulations are applied, the altered signal is transformed back to the time domain using the Inverse Fast Fourier Transform. The altered packets of data are then recombined to form the new output signal.

2 BACKGROUND

Improved Phase Vocoder Time-Scale Modification of Audio, by Jean Laroche and Mark Dolson, addresses the problem of “phasiness” present in the time-scaled outputs of phase vocoders. Despite the many commercial applications of vocoders, the phasiness they create has been a major barrier to their wider use. Another artifact that vocoding leaves behind is “transient smearing.” This undesired effect causes the modified sound to have less “bite”, and is usually most noticeable with sounds such as a piano attack. The paper does two things: it examines what causes these artifacts, and it presents two new solutions to the problem while reviewing two previously proposed ones.

In New Phase-Vocoder Techniques for Pitch-Shifting, Harmonizing and Other Exotic Effects, Laroche and Dolson present and examine two new “phase-vocoder-based techniques” that, when applied to a signal, allow the user to apply a wider range of effects than is usual for vocoders. Pitch-shifting, chorusing, harmonizing and partial stretching are among the new effects examined in this paper. Laroche and Dolson present the standard techniques and comment on some of their more common drawbacks. New methods to achieve the effects mentioned above are then offered.

Phase-Vocoder: About This Phasiness Business, by Jean Laroche and Mark Dolson, describes methodologies that may be used to reduce the “phasiness” of signals passing through a vocoder.
Unfortunately, due to the non-linear phase relationships inherent to most windowing methods, it can be extremely difficult to reduce these effects. Their approach discards any physical phase calculations entirely and presents a rather surprising alternative: using fine-tuned approximations to calculate a series of theoretical phases. These methods are unfortunately very computationally intensive and can sometimes yield a decrease in performance; however, they may prove fruitful, so the paper has not been discarded.

Chris Duxbury, Mike Davies, and Mark Sandler present a “method using temporal information of musical audio” to improve time-scaling while preserving pitch in their paper Improved Time-Scaling of Musical Audio Using Phase Locking at Transients.

3 IDEAL RESULTS

Under ideal circumstances, this project will produce a completely self-contained vocoder test application. The final objective is to produce a piece of MATLAB software that performs the vocoding algorithm based on the Fast Fourier Transform and Inverse Fast Fourier Transform method.

4 TIMELINE

[Figure: project timeline for Tasks 1 to 5, with a date axis marked 11/06, 21/06, 01/07, 11/07, 21/07]

The project was split into several smaller tasks to keep it manageable. The figure above details the timeline for the expected (red), worst case (green), and best case (violet) scenarios.

Task 1: Initial Research
Conduct a thorough review of journal articles for information on the implementation and uses of phase vocoders.

Task 2: Proposal
Write a project proposal detailing the steps required to complete the project.

Task 3: Initial MATLAB Design
Begin drafting a MATLAB implementation of the theoretical phase vocoder and experimenting with effects.

Task 4: MATLAB Debugging
Debug the initial MATLAB designs and remove any fragments and errors in the output.

Task 5: Finalize Implementation
Finalize the project implementation by producing seamless effects and optimized code.
5 DATA COLLECTION

We will be using MATLAB to implement the phase vocoder. Sources include articles from audio engineering publications such as IEEE Transactions on Acoustics, Speech, and Signal Processing, and the Audio Engineering Society monthly journals.

6 BIBLIOGRAPHY

[1] J. Laroche and M. Dolson. Improved phase vocoder time-scale modification of audio. IEEE Trans. on Speech and Audio Processing, 7(3):323-332, 1999.
[2] J. Laroche and M. Dolson. New phase-vocoder techniques for real-time pitch shifting, chorusing, harmonizing and other exotic audio modifications. Journal of the Audio Engineering Society, 47(11):928-936, 1999.
[3] M. R. Portnoff. Implementation of the digital phase vocoder using the fast Fourier transform. IEEE Transactions on Acoustics, Speech, and Signal Processing, 24(3):243-248, June 1976.
[4] Z. Settel and C. Lippe. Real-time musical applications using the FFT-based resynthesis. In Proc. Int. Computer Music Conference (ICMC), 1994.
[5] J. Laroche and M. Dolson. Phase vocoder: About this phasiness business. IEEE Trans. on Speech and Audio Processing, 19-22 Oct. 1997.
[6] A. De Götzen, N. Bernardini, and D. Arfib. Traditional implementations of a phase vocoder: The tricks of the trade. In Proceedings of the COST G-6 Conference on Digital Audio Effects, December 7-9, 2000.