Signal Reconstruction from its Spectrogram
Radu Balan
IMAHA 2010, Northern Illinois University, April 24, 2010

Overview
1. Problem formulation
2. Reconstruction from absolute value of frame coefficients
3. Our approach
– Embedding into the Hilbert-Schmidt space
– Discrete Gabor multipliers
– Quadratic reconstruction
4. Numerical example

1. Problem formulation
• Typical signal processing “pipeline”: In → Analysis → Processing → Synthesis → Out
• Features:
– Relatively low complexity, $O(N\log N)$
– On-line version if possible

The Analysis/Synthesis Components
• Analysis: $x \in H \mapsto c = (c_i)_{i \in I} \in l^2(I)$, with $c_i = \langle x, g_i \rangle$
• Synthesis: $c \in l^2(I) \mapsto y = \sum_{i \in I} c_i \hat{g}_i \in H$
• Example: Short-Time Fourier Transform
\[ c_{k,f} = \langle x, g_{k,f} \rangle, \qquad g_{k,f}(t) = e^{2\pi i f (t-kb)/F}\, g(t-kb), \]
\[ I = \{(k,f) : k \in \mathbb{Z},\ 0 \le f \le F-1\} = \mathbb{Z} \times \mathbb{Z}_F, \qquad H = l^2(\mathbb{Z}). \]

[Diagram: analysis stage. Successive blocks of the signal, such as x(t+kb : t+kb+M-1), are multiplied by the window g(t) to give x(t+kb)g(t), x(t+(k+1)b)g(t), and so on. An FFT of each windowed block gives the coefficients c_{k,0}, ..., c_{k,F-1}. Axes: frequency f versus data frame index k.]

[Diagram: synthesis stage. Each column c_{k,0}, ..., c_{k,F-1} passes through an inverse FFT, is multiplied by the synthesis window ĝ(t), and the results are summed.]

Problem: Given the Short-Time Fourier Amplitudes (STFA):
\[ d_{k,f} = |\langle x, g_{k,f} \rangle| = \left| \sum_{t=kb}^{kb+M-1} x(t)\, g(t-kb)\, e^{-2\pi i f t/F} \right| \]
we want an efficient reconstruction algorithm:
– Reduced computational complexity
– On-line (“on-the-fly”) processing
[Diagram: $c_{k,f} \to |\cdot| \to d_{k,f} \to$ Reconstruction $\to x$]

• Where is this problem important:
– Speech enhancement
– Speech separation
– Old recording processing

2. Reconstruction from absolute value of frame coefficients
• Setup:
– $H = E^n$, where $E = \mathbb{R}$ or $E = \mathbb{C}$
– $F = \{f_1, f_2, \dots, f_m\}$ a spanning set of $m > n$ vectors
• Consider the map:
\[ N : E^n/\!\sim\ \to \mathbb{R}^m, \qquad N(x) = \big( |\langle x, f_k \rangle| \big)_{1 \le k \le m}, \qquad x \sim y \iff x = zy \text{ for some scalar } |z| = 1 \]
• Problem 1: When is $N$ injective?
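• Problem 2: Assume $N$ is injective. Given $c = N(x)$, construct a vector $y$ equivalent to $x$ (that is, invert $N$ up to a constant phase factor).

\[ N : \mathbb{R}^n/\!\sim\ \to \mathbb{R}^m, \qquad N(x) = \big( |\langle x, f_k \rangle| \big)_{1 \le k \le m}, \qquad x \sim y \iff x = zy \text{ for some scalar } z = \pm 1 \]
Theorem [R.B., Casazza, Edidin, ACHA (2006)] For $E = \mathbb{R}$:
• if $m \ge 2n-1$, then for a generic frame set $F$, $N$ is injective;
• if $m \le 2n-2$, then for any set $F$, $N$ cannot be injective;
• $N$ is injective iff for any subset $J \subset F$ either $J$ or $F \setminus J$ spans $\mathbb{R}^n$;
• if every $n$-element subset of $F$ is linearly independent, then $N$ is injective; for $m = 2n-1$ this is a necessary and sufficient condition.

\[ N : \mathbb{C}^n/\!\sim\ \to \mathbb{R}^m, \qquad N(x) = \big( |\langle x, f_k \rangle| \big)_{1 \le k \le m}, \qquad x \sim y \iff x = zy \text{ for some scalar } |z| = 1 \]
Theorem [R.B., Casazza, Edidin, ACHA (2006)] For $E = \mathbb{C}$:
• if $m \ge 4n-2$, then for a generic frame set $F$, $N$ is injective;
• if $m \ge 2n$, then for a generic frame set $F$, the set of points in $\mathbb{C}^n$ where $N$ fails to be injective is thin (its complement has dense interior).

3. Our approach
Recall:
\[ g_{k,f}(t) = e^{2\pi i f (t-kb)/F}\, g(t-kb), \quad (k,f) \in I, \qquad H = l^2(\mathbb{Z}), \quad I = \{(k,f) : k \in \mathbb{Z},\ 0 \le f \le F-1\} = \mathbb{Z} \times \mathbb{Z}_F \]
• First observation:
\[ d_{k,f}^2 = |\langle x, g_{k,f} \rangle|^2 = \mathrm{tr}\big( K_x K_{g_{k,f}}^* \big) = \langle K_x, K_{g_{k,f}} \rangle_{HS}, \]
where
\[ K_x(y) = \langle y, x \rangle\, x, \qquad K_{g_{k,f}}(y) = \langle y, g_{k,f} \rangle\, g_{k,f}. \]
[Diagram: the nonlinear embedding $x \mapsto K_x$ maps the signal space $l^2(\mathbb{Z})$ into the Hilbert-Schmidt space $HS(l^2(\mathbb{Z}))$, which contains the operators $K_{g_{k,f}}$ and their span $E = \mathrm{span}\{K_{g_{k,f}}\}$.]

• Assume $\{K_{g_{k,f}}\}$ forms a frame for its span, $E$. Then the projection $P_E$ can be written as:
\[ P_E = \sum_{k,f} \langle \,\cdot\,, K_{g_{k,f}} \rangle_{HS}\, Q_{k,f} \]
where $\{Q_{k,f}\}$ is the canonical dual of $\{K_{g_{k,f}}\}$.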
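Frame operator:
\[ S : X \mapsto S(X) = \sum_{k,f} \langle X, K_{g_{k,f}} \rangle_{HS}\, K_{g_{k,f}}, \qquad Q_{k,f} = S^{-1}\big( K_{g_{k,f}} \big) \]

• Second observation: since
\[ g_{k,f} = M^f T^k g, \quad \text{where } M : h \mapsto Mh(t) = e^{2\pi i t/F} h(t), \quad T : h \mapsto Th(t) = h(t-b) \]
(up to a unimodular constant, which leaves $K_{g_{k,f}}$ unchanged), it follows:
\[ K_{g_{k,f}} = \Lambda^f \Gamma^k K_g, \quad \text{where } \Lambda : X \mapsto \Lambda X = M X M^*, \quad \Gamma : X \mapsto \Gamma X = T X T^* \]

• Moreover:
\[ \Lambda S = S \Lambda \ \text{ and } \ \Gamma S = S \Gamma \quad \Longrightarrow \quad Q_{k,f} = \Lambda^f \Gamma^k Q_{0,0} \]
• Explicitly:
\[ (Q_{k,f})_{t_1,t_2} = e^{2\pi i f (t_1 - t_2)/F}\, (Q_{0,0})_{t_1 - kb,\, t_2 - kb} \]

Short digression: Gabor Multipliers
• Goes back to Weyl, Klauder, Daubechies
• More recently: Feichtinger (2000), Benedetto-Pfander (2006), Dörfler-Torrésani (2008)
• STFT multiplier with symbol $m$: $x \mapsto \int_{\mathbb{R}^2} m(\lambda)\, \langle x, g_\lambda \rangle\, g_\lambda \, d\lambda$
• Gabor multiplier with symbol $m$: $x \mapsto \sum_{\lambda \in \text{Lattice}} m(\lambda)\, \langle x, g_\lambda \rangle\, g_\lambda$

Theorem [F’00] Assume $\{g_\lambda,\ \lambda \in \text{Lattice}\}$ is a frame for $L^2(\mathbb{R})$. Then the following are equivalent:
1. $\{\langle \cdot, g_\lambda \rangle g_\lambda,\ \lambda \in \text{Lattice}\}$ is a frame for its span in $HS(L^2(\mathbb{R}))$;
2. $\{\langle \cdot, g_\lambda \rangle g_\lambda,\ \lambda \in \text{Lattice}\}$ is a Riesz basis for its span in $HS(L^2(\mathbb{R}))$;
3. The function $H$ does not vanish, where
\[ H(e) = \sum_{\lambda \in \text{Lattice}} e(\lambda)\, |\langle g, g_\lambda \rangle|^2, \qquad e \in \text{DualGroup(Lattice)} \]

• Return to our setting. Let
\[ H(\omega, m) = \sum_{k \in \mathbb{Z}} \sum_{f \in \mathbb{Z}_F} e^{2\pi i (k\omega + mf/F)}\, |\langle g, g_{k,f} \rangle|^2 \]
Theorem Assume $\{g_{k,f}\}_{(k,f) \in \mathbb{Z} \times \mathbb{Z}_F}$ is a frame for $l^2(\mathbb{Z})$. Then $\{K_{g_{k,f}} ;\ (k,f) \in \mathbb{Z} \times \mathbb{Z}_F\}$
1. is a frame for its span in $HS(l^2(\mathbb{Z}))$ iff for each $m \in \mathbb{Z}_F$, $H(\omega, m)$ either vanishes identically in $\omega$, or is never zero;
2. is a Riesz basis for its span in $HS(l^2(\mathbb{Z}))$ iff for each $m \in \mathbb{Z}_F$ and each $\omega$, $H(\omega, m)$ is never zero.

• Third observation. Under the following settings:
– translation step $b = 1$;
– window support $\mathrm{supp}(g) = \{0, 1, 2, \dots, L-1\}$;
– $F \ge 2L$;
the span of $\{K_{g_{k,f}} ;\ (k,f) \in \mathbb{Z} \times \mathbb{Z}_F\}$ is the set of $(2L-1)$-diagonal band matrices. Here $K_g$ is zero outside the $L \times L$ block
\[ \begin{pmatrix} |g(0)|^2 & g(0)\overline{g(1)} & \cdots & g(0)\overline{g(L-1)} \\ g(1)\overline{g(0)} & |g(1)|^2 & \cdots & g(1)\overline{g(L-1)} \\ \vdots & \vdots & \ddots & \vdots \\ g(L-1)\overline{g(0)} & g(L-1)\overline{g(1)} & \cdots & |g(L-1)|^2 \end{pmatrix} \]
and $K_{g_{k,f}}$ is the same block shifted by $k$ along the diagonal and modulated by $f$.

• The reproducing condition (i.e. that of the projection onto $E$) implies that $Q$ must satisfy:
\[ \sum_{k,f} \langle X g_{k,f}, g_{k,f} \rangle\, (Q_{k,f})_{t_1,t_2} = X_{t_1,t_2}, \quad \text{for all } X \in E \text{ and all } (t_1,t_2) \text{ in the band.} \]
By working out this condition we obtain:
\[ (Q_{0,0})_{t,\,t-\tau} = \frac{1}{F} \int_0^1 \frac{e^{2\pi i \omega t}}{\sum_p \overline{g(p)}\, g(p-\tau)\, e^{2\pi i \omega p}} \, d\omega \]

• The fourth observation: We are now able to reconstruct up to $L-1$ diagonals of $K_x$.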
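This means we can estimate
\[ |x_t|^2,\ x_t \overline{x_{t-1}},\ \dots,\ x_t \overline{x_{t-L+1}}. \]
Assuming we have already estimated $x_s$ for $s < t$, we estimate $x_t$ by a minimization problem:
\[ \min_x \ \big|\, |x|^2 - (K_x)_{t,t} \big|^2 + w_1 \big| x\, \overline{\hat{x}_{t-1}} - (K_x)_{t,t-1} \big|^2 + \cdots + w_J \big| x\, \overline{\hat{x}_{t-J}} - (K_x)_{t,t-J} \big|^2 \]
for some $J \le L-1$ and weights $w_1, \dots, w_J$.
Remark: This algorithm is similar to the one in the Nawab, Quatieri, Lim [’83] IEEE paper.

Reconstruction Scheme
• Putting all blocks together we get:
[Diagram: Stage 1: for each $k$, the squared amplitudes $|c_{k,0}|^2, \dots, |c_{k,F-1}|^2$ pass through an IFFT and then through the filters $W_0, \dots, W_{L-1}$, giving $\hat{z}_t^0, \dots, \hat{z}_t^{L-1}$. Stage 2: a least-squares solver turns these into $\hat{x}_t$.]
\[ W_\tau(z^{-1}) = \sum_t (Q_{0,0})_{t,\,t-\tau}\, z^{t} \]

4. Numerical Example

Conclusions
All is well but ...
• For nice analysis windows (Hamming, Hanning, Gaussian) the set $\{K_{g_{k,f}}\}$ DOES NOT form a frame for its span! The lower frame bound is 0. This is the (main) reason for the observed numerical instability!
• Solution: Regularization.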
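
The two-stage reconstruction scheme can be sketched numerically as follows. This is a minimal illustration under assumptions that are not in the slides: the signal is treated as periodic (so the filters $W_\tau$ become FFT divisions), the window is a short hand-picked one whose denominators never vanish (unlike the Hamming or Gaussian windows mentioned in the conclusions), and Stage 2 drops the quartic term $||x|^2 - (K_x)_{t,t}|^2$ so that each step has a closed-form least-squares solution. The function names `stfa_squared`, `diagonals_from_stfa` and `sequential_solve` are hypothetical.

```python
import numpy as np

def stfa_squared(x, g, F):
    """Squared amplitudes |c_{k,f}|^2 for step b = 1 (signal taken as periodic)."""
    N, L = len(x), len(g)
    # Row k holds the block x(k), ..., x(k+L-1), indices taken mod N.
    idx = (np.arange(N)[:, None] + np.arange(L)[None, :]) % N
    blocks = x[idx] * np.conj(g)[None, :]
    return np.abs(np.fft.fft(blocks, n=F, axis=1)) ** 2

def diagonals_from_stfa(d2, g):
    """Stage 1: recover z[tau, t] = x(t) * conj(x(t - tau)), tau = 0..L-1."""
    N, L = d2.shape[0], len(g)
    r = np.fft.ifft(d2, axis=1)[:, :L]   # inverse FFT over f, keep lags 0..L-1
    z = np.empty((L, N), dtype=complex)
    for tau in range(L):
        a = np.zeros(N, dtype=complex)
        a[tau:L] = np.conj(g[tau:]) * g[:L - tau]   # a(p) = conj(g(p)) g(p - tau)
        denom = np.conj(np.fft.fft(np.conj(a)))     # sum_p a(p) exp(+2 pi i p nu / N)
        z[tau] = np.fft.ifft(np.fft.fft(r[:, tau]) / denom)   # deconvolve along k
    return z

def sequential_solve(z, weights):
    """Stage 2 (simplified, assumption): weighted least squares without the quartic term."""
    L, N = z.shape
    J = len(weights)                       # number of off-diagonals used, J <= L - 1
    x_hat = np.zeros(N, dtype=complex)
    x_hat[0] = np.sqrt(abs(z[0, 0]))       # fixes the free global phase
    for t in range(1, N):
        num, den = 0.0, 0.0
        for j in range(1, min(J, t) + 1):
            past = x_hat[t - j]
            num += weights[j - 1] * z[j, t] * past
            den += weights[j - 1] * abs(past) ** 2
        x_hat[t] = num / den
    return x_hat

rng = np.random.default_rng(0)
N, F = 64, 8                               # F >= 2L with L = 3
g = np.array([1.0, 0.6, 0.3])              # hand-picked, well-conditioned window
x = (1.0 + rng.random(N)) * np.exp(2j * np.pi * rng.random(N))

d2 = stfa_squared(x, g, F)
z = diagonals_from_stfa(d2, g)
x_hat = sequential_solve(z, weights=[1.0, 1.0])

phase = np.vdot(x_hat, x)                  # align the global phase before comparing
phase /= abs(phase)
print(np.max(np.abs(x_hat * phase - x)))
```

With noisy amplitudes or a window whose denominators come close to zero, the plain division in `diagonals_from_stfa` would amplify errors, which is where the regularization mentioned in the conclusions would enter.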