Reduction of Additive Noise in the Digital Processing of Speech Avner Halevy AMSC 664 Final Presentation May 2009 Dr. Radu Balan Department of Mathematics Center for Scientific Computation and Math. Modeling 1 Background and Goal • Digital speech signals • Clean signals are degraded by noise – From environment – From processing (recording, transmitting, etc.) • Speech enhancement: reduce noise without distorting original (clean) signal Scope and Goal • • • • Focus on additive white Gaussian noise Improve quality and intelligibility Use short time analysis (frames) Explore two algorithms: – Spectral Subtraction – Iterative Wiener Filtering • Test objectively using TIMIT 3 Spectral Subtraction • Estimate the magnitude of the noise spectrum when speech is absent • Subtract the estimate from the magnitude of the spectrum of the noisy signal • Keep the noisy phase • Inverse Fourier to obtain enhanced signal in time domain 4 Spectral Subtraction y ( n) x ( n) d ( n) Y ( ) X ( ) D( ) Y ( ) Y ( ) e i y ( ) p 1/ p p ˆ X ( w) Y ( ) Dˆ ( ) Xˆ ( ) Xˆ ( ) e i y ( ) xˆ (n) Inverse Fourier { Xˆ ( )} 5 Wiener Filter - Theory y ( n) x ( n) d ( n) xˆ (n) h(k ) y(n k ), n k e(n) x(n) xˆ (n), minimize Pxx ( ) H ( ) Pxx ( ) Pdd ( ) E e 2 ( n) Pxx ( ) ( ) H ( ) , where ( ) 1 ( ) Pdd ( ) 6 Iterative Wiener - Theory V ( z) g 1 k 1 ak z p k all - pole model p x(n) ak x(n k ) gw(n) time domain k 1 Pxx ( ) g2 1 kp1 ak e jk 2 Estimate a, x iterativel y using two step linear procedure : 1. Find argmax p a | x (using linear prediction ) a 2. Use estimate for a to find argmax p ( x | a ) (using WF) x 7 Implementation, Validation, Testing • Both algorithms were implemented in MATLAB on a standard PC • Implementation validated using zero noise, stability ensured using small nonzero noise • Testing based on TIMIT • Main objective measure used: segmental SNR Noisy Clean Testing 9 Comparison of SS and IW 10 Iterative Wiener Spectral Subtraction Comparison of SS and IW 11 Comparison of Results 12 Comparison of Results Comparison of Results Convergence of IW: Female, 1st sent. Convergence of IW: Female, 2nd sent. Convergence of IW: Male, 1st sent. Summary of Iteration Results Conclusions • Objectively, IW produced better results • Subjectively, both methods suffer from severe artifacts (musical noise) • Possible improvement by more sophisticated noise estimation Questions? Bibliography • [1] Deller, J., Hansen, J., and Proakis, J. (2000) Discrete Time Processing of Speech Signals, New York, NY: Institute of Electrical and Electronics Engineers • [2] Quatieri, T. (2002) Discrete Time Speech Signal Processing, Upper Saddle River, NJ: Prentice Hall • [3] Loizou, P. (2007) Speech Enhancement: Theory and Practice, Boca Raton, FL: Taylor & Francis Group • [4] Rabiner, L., Schafer, R. (1978) Digital Processing of Speech Signals, Englewood Cliffs, NJ: Prentice Hall • [5] Vaseghi, S. (1996) Advanced Signal Processing and Digital Noise Reduction, Stuttgart, Germany: B.G. Teubner • [6] Lim, J. and Oppenheim, A.V. (1978) All-pole modeling of degraded speech, IEEE Trans. Acoust. Speech Signal Process., 26(3), 197-200 21