Sparse Time-Frequency Transforms and Applications
Bruno Torrésani
http://www.cmi.univ-mrs.fr/~torresan
LATP, Université de Provence, Marseille
DAFx, Montreal, September 2006

1 Introduction
2 Signal waveform representations
   Bases
   Frames
   Multiple frames
   More realistic time-frequency atoms?
3 Coefficient domain models
   Hybrid random waveform models
   Estimation algorithms based on observed coefficients
   Estimation algorithms based on synthesis coefficients
4 Conclusion
5 References

Introduction

Over the last twenty years (and in fact for much longer), harmonic analysis has provided many new techniques for expanding signals into "elementary" waveforms:
- Redundant Gabor or wavelet systems (frames)
- Wavelet bases
- MDCT and Wilson bases
- Matching pursuit and cognates
- ...
Most often, sparsity of the representation was a key issue.
In this talk: we review a number of such approaches, in view of a few selected applications.

Introduction: what is sparsity?

A signal representation is sparse when most information is concentrated in a small amount of data (coefficients). For example, a sine wave is sparsely represented in the Fourier domain, not in the time domain.
Sparsity is a "vague" concept.
Ideally, the volume of data (the number of coefficients, for example) would be a good sparsity measure. In noisy situations, this measure is generally polluted by a large number of small coefficients originating from noise.
Other measures may be used (entropies), but they often do not yield the same results [Jaillet & Torrésani 2003].

Introduction: sparsity, what for?

A sparse time-frequency representation concentrates the relevant information in a small number of coefficients: the pdf of the coefficients is peaked at 0 and heavy tailed.
Most popular applications:
- Signal coding, if the cost of encoding the representation itself is not too high.
- Signal modeling: expand signals into components that make sense.
- Denoising: most often, noise is not sparse.
- Source separation (exploiting dimension reduction).
- ...
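To make the sine-wave example and the two kinds of sparsity measure concrete, here is a minimal sketch. The signal length, the frequency, the relative threshold of 1% and the function names are illustrative assumptions, not choices made in the talk. It counts the "significant" coefficients of a sine wave in the time and Fourier domains, and also computes an entropy of the normalized energies.

```python
import cmath
import math


def unitary_dft(x):
    """Unitary DFT computed directly (O(N^2), fine for a short demo signal)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)) / math.sqrt(n)
        for k in range(n)
    ]


def count_significant(coeffs, rel_threshold=0.01):
    """Count coefficients larger than rel_threshold times the largest magnitude."""
    peak = max(abs(c) for c in coeffs)
    return sum(1 for c in coeffs if abs(c) > rel_threshold * peak)


def energy_entropy(coeffs):
    """Shannon entropy (bits) of the normalized energies |c|^2 / sum |c|^2."""
    energies = [abs(c) ** 2 for c in coeffs]
    total = sum(energies)
    probs = [e / total for e in energies]
    return -sum(p * math.log2(p) for p in probs if p > 0.0)


N = 64
sine = [math.sin(2 * math.pi * 5 * t / N) for t in range(N)]  # 5 cycles over N samples
spectrum = unitary_dft(sine)

n_time = count_significant(sine)       # almost every sample is significant
n_freq = count_significant(spectrum)   # only the two bins at +/- the sine frequency
h_time = energy_entropy(sine)
h_freq = energy_entropy(spectrum)
print(n_time, n_freq, round(h_time, 2), round(h_freq, 2))
```

Both measures agree here because the signal is noise free. Adding noise would inflate the coefficient count unless the threshold is tuned, which is the difficulty mentioned above.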
Signal waveform representations

Signal representations

Signal waveform expansion: decompose a signal as a linear combination of "elementary waveforms" $\psi_\lambda$, often generated using simple rules:
$$x(t) = \sum_{\lambda} \alpha_\lambda \psi_\lambda(t) ,$$
with $\alpha_\lambda$ the coefficients and $\psi_\lambda$ the waveforms. Examples:
- Time-frequency atoms (MDCT or Wilson bases, Gabor atoms, ...)
- Time-scale atoms (wavelets, multiwavelets, ...)
- Chirplets, ...
- Higher dimensional versions
See [Mallat 1998], [Carmona et al. 1998] or [Wickerhauser 1994].

Signal representations: bases

The mathematically simplest situation: orthonormal bases. The waveform system $W = \{\psi_\lambda, \lambda \in \Lambda\}$ is an orthonormal basis of the signal space $H$ (inner product space, or Hilbert space) if:
- The atoms are mutually orthogonal and normalized: $\langle \psi_\lambda, \psi_\mu \rangle = \delta_{\lambda\mu}$.
- They form a complete set in $H$: if the signal $x \in H$ is such that $\langle x, \psi_\lambda \rangle = 0$ for all $\lambda \in \Lambda$, then $x = 0$.
Then any signal may be written in a unique way as
$$x(t) = \sum_{\lambda \in \Lambda} \alpha_\lambda \psi_\lambda(t) , \quad \text{with } \alpha_\lambda = \langle x, \psi_\lambda \rangle .$$
Thus analysis and synthesis involve the same atoms. In addition, the "coefficient mapping" $x \to \{\alpha_\lambda, \lambda \in \Lambda\}$ preserves energy (Parseval's formula):
$$\sum_{\lambda \in \Lambda} |\alpha_\lambda|^2 = \|x\|^2 .$$

MDCT basis: smooth windows modulated by a sinusoidal function. In the continuous-time setting, the following (infinite) family of functions forms an orthonormal basis of $L^2(\mathbb{R})$:
$$u_{kn}(t) = \sqrt{\frac{2}{\ell_k}}\, w_k(t) \cos\left( \frac{\pi}{\ell_k} \left( n + \frac{1}{2} \right) (t - a_k) \right) , \quad k \in \mathbb{Z},\ n = 0, 1, 2, \dots$$
In bounded intervals, as well as in finite dimensional settings, similar bases may be constructed (Malvar, Suter, ...).
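The orthonormal-basis properties above (same atoms for analysis and synthesis, Parseval's formula) can be checked numerically. The sketch below is an assumption-laden stand-in: it uses a single-block DCT-IV cosine basis in dimension N, which shares the $(n + 1/2)$ cosine modulation of the MDCT atoms but has no overlapping windows $w_k$, so it is not the lapped MDCT basis itself. All function names are hypothetical.

```python
import math
import random


def cosine_basis(n_dim):
    """Rows are the DCT-IV vectors u_n[t] = sqrt(2/N) cos(pi/N (n + 1/2)(t + 1/2))."""
    scale = math.sqrt(2.0 / n_dim)
    return [
        [scale * math.cos(math.pi / n_dim * (n + 0.5) * (t + 0.5)) for t in range(n_dim)]
        for n in range(n_dim)
    ]


def analyze(basis, x):
    """Coefficients alpha_n = <x, u_n>."""
    return [sum(u_t * x_t for u_t, x_t in zip(u, x)) for u in basis]


def synthesize(basis, coeffs):
    """x[t] = sum_n alpha_n u_n[t], using the same atoms as the analysis."""
    n_dim = len(basis[0])
    return [sum(c * u[t] for c, u in zip(coeffs, basis)) for t in range(n_dim)]


rng = random.Random(0)
N = 16
basis = cosine_basis(N)
x = [rng.gauss(0.0, 1.0) for _ in range(N)]

alpha = analyze(basis, x)
x_rec = synthesize(basis, alpha)

energy_signal = sum(v * v for v in x)
energy_coeffs = sum(a * a for a in alpha)          # equals energy_signal (Parseval)
max_rec_error = max(abs(a - b) for a, b in zip(x, x_rec))
print(energy_signal, energy_coeffs, max_rec_error)
```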
More precisely, the only assumption is that the window functions $w_k$ must satisfy some symmetry conditions at boundaries. In general, windows are taken as regular translates of a single one. More freedom may be introduced, as long as the symmetry conditions are fulfilled. For example, some audio coders use systems with wide and narrow windows.

Simple implementations are available in the Wavelab package (Stanford): http://www-stat.stanford.edu/~wavelab

The MDCT basis is well adapted to audio signals: the expansion of most signals is sparse.
Figure: pdf (log scale) of the MDCT coefficients of an organ recording.

Besides signal coding/compression, sparsity also helps in several applications.

Application: denoising. As noise is generally not sparse in the MDCT basis, simply threshold the MDCT coefficients of the noisy signal before reconstruction.
Sound examples: organ signal; noisy organ signal; denoised organ signal.

Application: source separation. Consider two mixtures (linear combinations): Mix 1; Mix 2.
Figure: scatter plots of the samples of mix 1 against mix 2 (left), and of the MDCT coefficients of mix 1 against mix 2 (right).
Method: identify the two directions, and project. Result: reconstructed organ.

Advantages:
- "Optimal" in terms of redundancy.
- There exist bases for which fast algorithms have been developed (MDCT, Wilson, wavelets, ...).
Drawbacks:
- Being an orthonormal basis has a price: rigidity. Not every window function generates a basis. Mathematically speaking, windows are not as smooth as one would like.
- Being a basis also imposes constraints on the sampling in time and frequency. No "free access" to the time-frequency domain.
Question: can we make the representation sparser by introducing redundancy?

Signal representations: frames

As an alternative to orthonormal bases, frames (wavelet or Gabor) offer more flexibility.
A frame is a (generally overcomplete) system of waveforms $W = \{\psi_\lambda, \lambda \in \Lambda\}$ with respect to which signals may be expanded, with stable synthesis.
In the case of frames, the energy conservation (Parseval's formula) is generally replaced by an inequality of the form
$$A \|x\|^2 \le \sum_{\lambda \in \Lambda} |\langle x, \psi_\lambda \rangle|^2 \le B \|x\|^2 ,$$
for some constants $0 < A \le B < \infty$ and for all signals $x$.
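As a small illustration of the frame inequality, the sketch below computes the optimal constants A and B for two toy frames of the plane. This is a hypothetical finite-dimensional example, chosen because the frame operator is then a 2 x 2 symmetric matrix whose extreme eigenvalues have a closed form. It is not a time-frequency frame, and the function names are assumptions.

```python
import math


def frame_bounds_2d(frame):
    """Optimal frame bounds: extreme eigenvalues of S = sum_k psi_k psi_k^T (2 x 2)."""
    s11 = sum(p[0] * p[0] for p in frame)
    s12 = sum(p[0] * p[1] for p in frame)
    s22 = sum(p[1] * p[1] for p in frame)
    mean = 0.5 * (s11 + s22)
    radius = math.hypot(0.5 * (s11 - s22), s12)
    return mean - radius, mean + radius


def coefficient_energy(frame, x):
    """sum_k |<x, psi_k>|^2 for a real signal x in the plane."""
    return sum((p[0] * x[0] + p[1] * x[1]) ** 2 for p in frame)


# Three unit vectors 120 degrees apart: a tight frame (A = B).
tight = [(math.cos(2 * math.pi * k / 3), math.sin(2 * math.pi * k / 3)) for k in range(3)]
# An orthonormal basis with one vector repeated: a frame with A < B.
loose = [(1.0, 0.0), (0.0, 1.0), (1.0, 0.0)]

a_tight, b_tight = frame_bounds_2d(tight)
a_loose, b_loose = frame_bounds_2d(loose)

x = (0.3, -1.2)
norm2 = x[0] ** 2 + x[1] ** 2
energy = coefficient_energy(loose, x)   # lies between a_loose * norm2 and b_loose * norm2
print(a_tight, b_tight, a_loose, b_loose, energy, norm2)
```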
In such cases, synthesizing with the analysis atoms no longer gives exact reconstruction as before, but an approximation:
$$\left\| x - \frac{2}{A+B} \sum_{\lambda \in \Lambda} \langle x, \psi_\lambda \rangle \psi_\lambda \right\| \le \frac{B-A}{B+A}\, \|x\| .$$
Good news: there exists a (non-unique) dual waveform system $\{\tilde\psi_\lambda, \lambda \in \Lambda\}$ such that for all signals $x$:
$$x = \sum_{\lambda \in \Lambda} \langle x, \psi_\lambda \rangle \tilde\psi_\lambda = \sum_{\lambda \in \Lambda} \langle x, \tilde\psi_\lambda \rangle \psi_\lambda .$$
Hence analysis and synthesis do not involve the same waveforms.

An example: as an alternative to MDCT bases, the Gabor frames
$$\psi_{mn}(t) = e^{2i\pi n \nu_0 t}\, \psi(t - m b_0)$$
provide a regular sampling of the time-frequency plane: a regular grid with mesh sizes $b_0$ and $\nu_0$.
For $b_0 \nu_0$ small enough, these indeed form a frame of the considered signal space, and the (canonical) dual system is a Gabor frame too: there exists a dual window $\tilde\psi$ such that the dual atoms are of the form $\tilde\psi_{mn}$.
The smaller $b_0 \nu_0$, the more redundant the system, the closer $A$ and $B$, and the closer $\psi$ and $\tilde\psi$.

Sampling grids in the time-frequency domain have to be adapted to the time/frequency resolution of the atoms.
Figure: sampling grids. Left: good frequency resolution (wide windows). Right: good time resolution (narrow windows). Full circles correspond to "large" coefficients.
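The frame bounds, the approximate reconstruction and the dual window can all be checked on a small discrete Gabor system. The sketch below makes several simplifying assumptions that are not in the talk: signals are periodic of length N, all N frequency channels are used, and the window is a periodized Gaussian of arbitrary width. With all frequency channels present, the frame operator is diagonal, which gives A, B and the canonical dual window in closed form. Names such as `HOP` and `dual_window` are hypothetical.

```python
import cmath
import math

N = 16    # signal length (periodic signals in C^N)
HOP = 4   # time step b0 in samples; with N frequency channels the redundancy is N / HOP


def window(t):
    """Periodized Gaussian-like window centered at 0 (arbitrary width for the demo)."""
    dist = min(t % N, (-t) % N)   # circular distance to 0
    return math.exp(-0.5 * (dist / 3.0) ** 2)


def atom(m, n, win):
    """Gabor atom psi_mn[t] = exp(2 i pi n t / N) win(t - m HOP)."""
    return [cmath.exp(2j * math.pi * n * t / N) * win(t - m * HOP) for t in range(N)]


def inner(x, y):
    return sum(a * b.conjugate() for a, b in zip(x, y))


def analysis(x, win):
    return {(m, n): inner(x, atom(m, n, win)) for m in range(N // HOP) for n in range(N)}


def synthesis(coeffs, win):
    out = [0j] * N
    for (m, n), c in coeffs.items():
        for t, v in enumerate(atom(m, n, win)):
            out[t] += c * v
    return out


# With all N frequency channels, (S x)[t] = diag[t] x[t] with
# diag[t] = N * sum_m |window(t - m HOP)|^2, so the frame bounds are its extremes.
diag = [N * sum(window(t - m * HOP) ** 2 for m in range(N // HOP)) for t in range(N)]
A, B = min(diag), max(diag)


def dual_window(t):
    """Canonical dual window: S^{-1} applied to the window (diag is HOP-periodic)."""
    return window(t) / diag[t % N]


x = [complex(math.cos(2 * math.pi * 3 * t / N) + (1.0 if t == 5 else 0.0)) for t in range(N)]
coeffs = analysis(x, window)

norm2 = sum(abs(v) ** 2 for v in x)
energy = sum(abs(c) ** 2 for c in coeffs.values())

approx = [2.0 / (A + B) * v for v in synthesis(coeffs, window)]
approx_error = math.sqrt(sum(abs(a - b) ** 2 for a, b in zip(x, approx)))
bound = (B - A) / (B + A) * math.sqrt(norm2)

exact = synthesis(coeffs, dual_window)
exact_error = max(abs(a - b) for a, b in zip(x, exact))
print(A, B, approx_error, bound, exact_error)
```

For this redundancy the window and its dual are nearly proportional and A is close to B. Increasing `HOP` should widen the gap, in line with the remark on $b_0 \nu_0$ above.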
Figure: a Gaussian Gabor atom (blue), and its duals for low redundancy (left) and high redundancy (right).
To play with Gabor atoms, dual atoms, ...: the Linear Time-Frequency Analysis Toolbox LTFAT (P. Söndergaard): http://www.univie.ac.at/nuhag-php/ltfat

Signal representations: multiple frames

Gabor frames offer more flexibility than MDCT bases in the choice of the window. However, in the standard construction, the Gabor atoms are of constant size, which is not always convenient for describing all features of audio signals.
The time-frequency resolution of the atoms plays a significant role. Wide windows (from 20 to 40 ms) are well adapted to tonals (partials), while shorter ones (or wavelets instead of Gabor atoms) are better suited for transients.
Example: xylophone.

Question: can we get the best of the two frames?
Answer: yes, provided we can select the right signal expansion (among infinitely many), and control sparsity.

Idea: expand signals with respect to a larger system, involving both wide atoms $W_g = \{g_{mn}, (m,n) \in \Lambda_g\}$ and narrow atoms $W_h = \{h_{mn}, (m,n) \in \Lambda_h\}$:
$$D = W_g \cup W_h .$$
$D$ is still a frame, i.e. stable signal expansions on $D$ exist.
For every finite-energy signal $x$, there exist infinitely many expansions
$$x = x_g + x_h = \sum_{\lambda \in \Lambda_g} \alpha_\lambda g_\lambda + \sum_{\mu \in \Lambda_h} \beta_\mu h_\mu .$$
The coefficients $\alpha$ and $\beta$ provide information on the time-frequency content of $x$, but some expansions are more meaningful than others.

How to pick the right time-frequency atoms?
- Quilted frames [Dörfler 2002]: tile the time-frequency plane into domains corresponding to different time-frequency resolutions.
- Time-Frequency Jigsaw Puzzle [Jaillet & Torrésani 2006]: let the computer choose the "right" atoms in the time-frequency domain, using a sparsity requirement.
- Matching Pursuit and Orthogonal Matching Pursuit [Mallat & Zhang 1993]: recursive search of the atoms that correlate best with the signal.
- Basis Pursuit and Basis Pursuit Denoising [Chen et al. 1998].
In all cases, partial synthesis from atoms with similar properties (i.e. time-frequency resolution) is possible.

Figure: example with the TFJP algorithm [Jaillet & Torrésani 2006].

More realistic time-frequency atoms?

In the previous approaches, sparsity was the only requirement. Can we do more?
In addition, whatever the choice of the waveform system, the time-frequency atoms are generally not realistic as "sound atoms".
Several possible approaches for improvement:
- Learn more realistic atoms from sound databases: the "dictionary learning" approach (e.g. [Blumensath & Davies 2004]).
- Build "time-frequency molecules" from atoms as compound objects (see for example [Daudet 2006]).
- Model dependencies between atoms in the coefficient domain.
In the rest of the lecture, we focus on this last approach, using a pair of orthonormal bases (following [Daudet & Torrésani 2005]).
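Before moving to coefficient-domain models, the Matching Pursuit idea listed among the atom-selection strategies (recursive search of the atoms that correlate best with the signal) can be sketched on a pair of orthonormal bases. This is a minimal sketch, not the implementation of [Mallat & Zhang 1993]. The dictionary is an assumption: a DCT-IV cosine basis playing the role of "wide" atoms and a Dirac basis playing the role of "narrow" atoms. The test signal and all names are hypothetical.

```python
import math


def cosine_basis(n_dim):
    """Orthonormal DCT-IV vectors, standing in for 'wide' (tonal) atoms."""
    scale = math.sqrt(2.0 / n_dim)
    return [
        [scale * math.cos(math.pi / n_dim * (n + 0.5) * (t + 0.5)) for t in range(n_dim)]
        for n in range(n_dim)
    ]


def dirac_basis(n_dim):
    """Canonical basis, standing in for 'narrow' (transient) atoms."""
    return [[1.0 if t == k else 0.0 for t in range(n_dim)] for k in range(n_dim)]


def inner(x, y):
    return sum(a * b for a, b in zip(x, y))


def matching_pursuit(x, dictionary, n_iter):
    """Greedy selection: dictionary is a list of (label, unit-norm vector) pairs."""
    residual = list(x)
    picks = []
    for _ in range(n_iter):
        # Atom that correlates best with the current residual.
        label, vec = max(dictionary, key=lambda item: abs(inner(residual, item[1])))
        coeff = inner(residual, vec)
        residual = [r - coeff * v for r, v in zip(residual, vec)]
        picks.append((label, coeff))
    return picks, residual


N = 32
wide = cosine_basis(N)
narrow = dirac_basis(N)
dictionary = [(("cosine", n), wide[n]) for n in range(N)]
dictionary += [(("dirac", t), narrow[t]) for t in range(N)]

# Test signal: one cosine atom plus one spike.
x = [3.0 * wide[4][t] + 2.0 * narrow[10][t] for t in range(N)]

picks, residual = matching_pursuit(x, dictionary, n_iter=2)
residual_norm = math.sqrt(inner(residual, residual))
print(picks, residual_norm)
```

Because the two bases are not orthogonal to each other, two iterations select the right atoms but leave a small residual. Orthogonal Matching Pursuit would remove it by re-projecting onto the selected atoms.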
Coefficient domain models

"Experimental observation": interesting features in signals are "often" characterized by localized families of large coefficients, forming structured sets:
- Tonals: horizontal lines in the short time Fourier domain.
- Transients: vertical lines in the short time Fourier domain, or vertical trees in the wavelet domain.
To encode such structures separately, explicit models may be introduced in the coefficient domain.
Strategy: characterize the behavior of certain indicators (coefficients, or others) in the framework of the model, in view of estimation from real data.

Hybrid random waveform models

The generic form of such models (in N-dimensional space) is the following [Kowalski & Torrésani 2006]:
$$x = \sum_{\lambda \in \Lambda} \alpha_\lambda g_\lambda + \sum_{\delta \in \Delta} \beta_\delta h_\delta + r ,$$
where
- $W_g = \{g_\lambda, \lambda = 1, \dots, N\}$ and $W_h = \{h_\delta, \delta = 1, \dots, N\}$ are two orthonormal bases of waveforms.
- The coefficients $\alpha_\lambda$ and $\beta_\delta$ are independent Gaussian random variables, with frequency-dependent variances ($\lambda$ and $\delta$ are time-frequency indices), and $r$ is a small residual signal, modeled as white noise.
- The sets $\Lambda$ and $\Delta$ are sparse random subsets of the index set. The simplest model is the Bernoulli model: indices belong to $\Lambda$ and $\Delta$ independently, with membership probabilities $p$ and $\tilde p$. More complex models (for example Markov models) introduce dependencies between coefficients.
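A draw from the Bernoulli version of this hybrid model can be sketched as follows. The concrete choices are assumptions made for illustration only: a DCT-IV cosine basis for $W_g$, a Dirac basis for $W_h$, and a single variance per layer instead of frequency-dependent variances. The function and parameter names are hypothetical.

```python
import math
import random


def cosine_basis(n_dim):
    """Orthonormal DCT-IV vectors, used here as the basis W_g."""
    scale = math.sqrt(2.0 / n_dim)
    return [
        [scale * math.cos(math.pi / n_dim * (n + 0.5) * (t + 0.5)) for t in range(n_dim)]
        for n in range(n_dim)
    ]


def sample_hybrid(n_dim, p, p_tilde, sigma, sigma_tilde, sigma_res, rng):
    """Draw x = sum_{Lambda} alpha g + sum_{Delta} beta h + r under the Bernoulli model."""
    g = cosine_basis(n_dim)
    # Bernoulli significance maps: each index is kept independently.
    big_lambda = [k for k in range(n_dim) if rng.random() < p]
    big_delta = [k for k in range(n_dim) if rng.random() < p_tilde]
    # Gaussian synthesis coefficients on the selected indices only.
    alpha = {k: rng.gauss(0.0, sigma) for k in big_lambda}
    beta = {k: rng.gauss(0.0, sigma_tilde) for k in big_delta}
    first_layer = [sum(alpha[k] * g[k][t] for k in big_lambda) for t in range(n_dim)]
    # W_h is the Dirac basis here, so the second layer is just the spikes beta_k at t = k.
    second_layer = [beta.get(t, 0.0) for t in range(n_dim)]
    residual = [rng.gauss(0.0, sigma_res) for _ in range(n_dim)]
    x = [a + b + c for a, b, c in zip(first_layer, second_layer, residual)]
    return x, big_lambda, big_delta


N = 64
rng = random.Random(0)
x, big_lambda, big_delta = sample_hybrid(N, 0.05, 0.05, 1.0, 1.0, 0.01, rng)
print(len(big_lambda), len(big_delta))
```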
Are such models able to reproduce the "experimental observations"?
Study the behavior of the observed coefficients (which differ from the synthesis coefficients $\alpha_n$ and $\beta_n$):
$$a_n = \langle x, g_n \rangle , \quad b_m = \langle x, h_m \rangle .$$
Introduce the membership variables $X_n^\Lambda = 1$ if $n \in \Lambda$ and 0 otherwise, and similarly for $X_n^\Delta$. Then
$$a_n = \langle x, g_n \rangle = \alpha_n X_n^\Lambda + \sum_{m=1}^{N} \beta_m X_m^\Delta \langle h_m, g_n \rangle + \langle r, g_n \rangle ,$$
$$b_n = \langle x, h_n \rangle = \beta_n X_n^\Delta + \sum_{m=1}^{N} \alpha_m X_m^\Lambda \langle g_m, h_n \rangle + \langle r, h_n \rangle .$$

Estimation: observed coefficients

In particular, assuming for simplicity that all coefficients $\alpha$ (resp. $\beta$) have the same variance $\sigma^2$ (resp. $\tilde\sigma^2$), and writing $\sigma_0^2$ for the variance of the residual, one has, given the significance maps,
$$\mathrm{var}\{a_k\} = \sigma^2 X_k^\Lambda + \tilde\sigma^2 \left( \sum_{\delta \in \Delta} |\langle g_k, h_\delta \rangle|^2 \right) + \sigma_0^2 .$$
If the significance maps are sparse, and if the two bases are sufficiently different (small inner products $|\langle g_k, h_\delta \rangle|$), one recovers the "experimental" observations.
This justifies approximating the signal by keeping the largest coefficients.
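The variance formula can be evaluated directly to see why large observed coefficients point to the significance map. The sketch below rests on assumptions: a DCT-IV cosine basis for the $g_k$, a Dirac basis for the $h_\delta$ (so that each cross term is at most 2/N), and arbitrary maps and variances. Names are hypothetical.

```python
import math


def cosine_basis(n_dim):
    """Orthonormal DCT-IV vectors, used here as the basis of the g_k."""
    scale = math.sqrt(2.0 / n_dim)
    return [
        [scale * math.cos(math.pi / n_dim * (n + 0.5) * (t + 0.5)) for t in range(n_dim)]
        for n in range(n_dim)
    ]


def observed_variances(g, h, lam, delta, sigma, sigma_tilde, sigma_res):
    """var{a_k} = sigma^2 1[k in lam] + sigma_tilde^2 sum_{d in delta} <g_k, h_d>^2 + sigma_res^2."""
    lam = set(lam)
    variances = []
    for k, g_k in enumerate(g):
        cross = sum(sum(a * b for a, b in zip(g_k, h[d])) ** 2 for d in delta)
        variances.append(sigma ** 2 * (k in lam) + sigma_tilde ** 2 * cross + sigma_res ** 2)
    return variances


N = 32
g = cosine_basis(N)
h = [[1.0 if t == k else 0.0 for t in range(N)] for k in range(N)]  # Dirac basis

lam, delta = [3, 7], [10, 20]          # sparse significance maps (arbitrary)
variances = observed_variances(g, h, lam, delta, 1.0, 1.0, 0.05)

in_map = [variances[k] for k in lam]
out_map = [v for k, v in enumerate(variances) if k not in lam]
print(min(in_map), max(out_map))
```

With sparse maps and weakly correlated bases, the variances outside the map stay well below those inside it, which is what makes thresholding the observed coefficients a reasonable estimate of the map.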
Bernoulli model: it may be proved that the observed coefficients follow a mixture distribution:
- A "small variance" Gaussian mixture for coefficients whose time-frequency index does not belong to the significance map.
- A "large variance" Gaussian mixture for coefficients whose time-frequency index does belong to the significance map.
Exploiting this result numerically yields an algorithm for estimating the significance maps, which provides significant dimension reduction and allows one to estimate the coefficients.

Bernoulli-based estimation algorithm: 3 steps

Goal: decompose the signal into two layers (+ residual).
1. Parameter estimation: membership probabilities and variances of the synthesis coefficients (EM algorithm).
2. Estimation of the significance maps $\Lambda$ and $\Delta$ (maximum likelihood, which amounts to thresholding for the Bernoulli model).
3. Estimation of the layers: orthogonal projection onto the subspace generated by the selected time-frequency atoms.
The algorithm is in fact more complex, and involves several iterations of steps 1 and 2.
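Steps 1 and 2 can be sketched on synthetic coefficients. This is a deliberately simplified stand-in, not the algorithm of the talk: each of the two populations is modeled by a single zero-mean Gaussian rather than a Gaussian mixture, the data are simulated, step 3 (orthogonal projection) is omitted, and all names and numerical values are assumptions.

```python
import math
import random


def gauss_pdf(a, s):
    return math.exp(-0.5 * (a / s) ** 2) / (s * math.sqrt(2.0 * math.pi))


def em_two_variances(coeffs, n_iter=50):
    """EM for the mixture p N(0, s_large^2) + (1 - p) N(0, s_small^2)."""
    n = len(coeffs)
    rms = math.sqrt(sum(a * a for a in coeffs) / n)
    p, s_small, s_large = 0.5, 0.5 * rms, 2.0 * rms
    for _ in range(n_iter):
        # E step: posterior probability that each coefficient is in the map.
        resp = []
        for a in coeffs:
            large = p * gauss_pdf(a, s_large)
            small = (1.0 - p) * gauss_pdf(a, s_small)
            resp.append(large / (large + small))
        # M step: update the membership probability and the two variances.
        total = sum(resp)
        p = total / n
        s_large = math.sqrt(sum(r * a * a for r, a in zip(resp, coeffs)) / total)
        s_small = math.sqrt(sum((1.0 - r) * a * a for r, a in zip(resp, coeffs)) / (n - total))
    return p, s_small, s_large


def map_threshold(p, s_small, s_large):
    """|a| above this value is more likely to come from the large-variance component."""
    log_ratio = math.log((1.0 - p) * s_large / (p * s_small))
    return math.sqrt(2.0 * log_ratio / (1.0 / s_small ** 2 - 1.0 / s_large ** 2))


# Synthetic observed coefficients: 10% in the map (std 5.0), the rest small (std 0.3).
rng = random.Random(1)
truth = [rng.random() < 0.1 for _ in range(2000)]
coeffs = [rng.gauss(0.0, 5.0 if in_map else 0.3) for in_map in truth]

p_hat, s_small_hat, s_large_hat = em_two_variances(coeffs)
threshold = map_threshold(p_hat, s_small_hat, s_large_hat)
estimated_map = [abs(a) > threshold for a in coeffs]
print(p_hat, s_small_hat, s_large_hat, threshold, sum(estimated_map))
```

The decision rule reduces to comparing $|a|$ with a single threshold, which is the sense in which maximum likelihood estimation of the map becomes thresholding under the Bernoulli model.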
Bernoulli-based estimation algorithm: example

Figure: decomposition of a xylophone signal (top) into transient (bottom left) and tonal (bottom right) layers.

Structured model-based estimation algorithm

Structured model: implements other a priori information, such as the fact that significant coefficients tend to form clusters, or lines (horizontal or vertical). Several models may be developed, among which:
- Markov models for the significance maps [Molla & Torrésani 2005].
- Two-level Bernoulli models (M. Kowalski).
The estimation procedure has to be modified accordingly.

Example: Markov model (see [Molla & Torrésani 2005]). For the tonal layer, the distribution of the indicator random variables $X^\Delta_{kn}$ is characterized by transition matrices
$$P_n = \begin{pmatrix} \pi_n & 1 - \pi_n \\ 1 - \pi'_n & \pi'_n \end{pmatrix} ,$$
with
$$\pi_n = P\{X^\Delta_{k+1,n} = 1 \mid X^\Delta_{k,n} = 1\} , \quad \pi'_n = P\{X^\Delta_{k+1,n} = 0 \mid X^\Delta_{k,n} = 0\} .$$
In particular, when the states are persistent ($\pi_n + \pi'_n > 1$),
$$P\{X^\Delta_{k+1,n} = 1, X^\Delta_{k,n} = 1\} > P\{X^\Delta_{k+1,n} = 1\}\, P\{X^\Delta_{k,n} = 1\} ,$$
$$P\{X^\Delta_{k+1,n} = 0, X^\Delta_{k,n} = 0\} > P\{X^\Delta_{k+1,n} = 0\}\, P\{X^\Delta_{k,n} = 0\} .$$

The estimation algorithm keeps a structure similar to the previous one, but becomes more complex: the estimation of the significance maps is no longer local (in the coefficient domain).
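Because the map is no longer estimated coefficient by coefficient, a dynamic-programming decoder is a natural tool for a Markov prior of this kind. The sketch below is a generic two-state Viterbi decoder along one frequency line, under assumptions that are not taken from [Molla & Torrésani 2005]: zero-mean Gaussian emissions with one small and one large standard deviation, known parameters, and a uniform initial state distribution. All names are hypothetical.

```python
import math


def log_gauss(a, s):
    return -0.5 * (a / s) ** 2 - math.log(s) - 0.5 * math.log(2.0 * math.pi)


def viterbi_map(coeffs, pi_stay1, pi_stay0, s_small, s_large, p_init=0.5):
    """Most likely 0/1 membership sequence along one frequency line."""
    log_trans = {
        (1, 1): math.log(pi_stay1), (1, 0): math.log(1.0 - pi_stay1),
        (0, 0): math.log(pi_stay0), (0, 1): math.log(1.0 - pi_stay0),
    }
    sigma = {0: s_small, 1: s_large}
    score = {
        0: math.log(1.0 - p_init) + log_gauss(coeffs[0], s_small),
        1: math.log(p_init) + log_gauss(coeffs[0], s_large),
    }
    back = []
    for a in coeffs[1:]:
        new_score, pointers = {}, {}
        for state in (0, 1):
            # Best predecessor for this state at this time index.
            prev = max((0, 1), key=lambda q: score[q] + log_trans[(q, state)])
            new_score[state] = score[prev] + log_trans[(prev, state)] + log_gauss(a, sigma[state])
            pointers[state] = prev
        back.append(pointers)
        score = new_score
    # Backtrack from the best final state.
    state = max((0, 1), key=lambda q: score[q])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


coeffs = [0.1, -0.2, 3.0, 0.2, -2.5, 4.0, 0.1, 0.0, -0.1, 0.2]
path = viterbi_map(coeffs, pi_stay1=0.9, pi_stay0=0.9, s_small=0.3, s_large=3.0)
print(path)
```

In this toy sequence the small coefficient at index 3 sits between large ones. A purely local threshold would drop it, whereas the persistence encoded in the transition probabilities keeps it in the map.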
- Parameter estimation may be performed via EM algorithms.
- The estimation of membership probabilities has to be replaced with the estimation of Markov matrices.
- The estimation of the maps is done using the Viterbi algorithm.

Structured model-based algorithm: example

"Blues Brothers" recording: original, tonal, transient and residual.

Estimation: synthesis coefficients

Alternative: in the framework of such random models, work directly on the distribution of the synthesis coefficients.
A hierarchical Bayesian model can be constructed [Févotte et al. 2006], implementing:
- Sparse signal decomposition into a dictionary of two MDCT bases, with Gaussian random coefficients.
- Markov significance maps, implementing "structured" sets of coefficients.
- Suitable priors for the model coefficients.
MAP and MMSE estimates are obtained by MCMC algorithms, which are significantly heavier than the previous ones.
Example: S.
Raman; noisy version; denoised version (MMSE estimate); tonal; transient.

Conclusions

- Hybrid expansions generally provide sparser signal representations.
- The introduction of structured significance maps also improves sparsity, generally at the price of an increased computational burden. In addition, tonal layers turn out to be more difficult to model accurately (MDCT bases do not offer the same flexibility as harmonic models).
- These techniques yield "transient + tonal + residual" signal decompositions: a sort of elementary (single sensor) source separation.
- These may be exploited for various tasks (denoising, source separation, coding, ...), in situations where the residual signal is not too important. Otherwise, the residual will have to be modelled.
- To do next: relax the assumption of independence of the tonal and transient layers.

Bibliography

- T. Blumensath and M.E. Davies. Unsupervised learning of sparse and shift-invariant decompositions of polyphonic music. Proceedings of ICASSP '04, vol. 5, 2004, pp. V:497-V:500.
- R. Carmona, W.L. Hwang and B. Torrésani. Practical Time-Frequency Analysis: continuous wavelet and Gabor transforms, with an implementation in S. Wavelet Analysis and its Applications, vol. 9. Academic Press, San Diego, 1998.
- S.S. Chen, D.L. Donoho and M.A. Saunders. Atomic Decomposition by Basis Pursuit. SIAM Journal on Scientific Computing 20:1 (1998), pp. 33-61.
- L. Daudet. Sparse and structured decompositions of signals with the molecular matching pursuit. IEEE Transactions on Acoustics, Speech, and Signal Processing, 2006, to appear.
- L. Daudet and B. Torrésani. Sparse adaptive representations for musical signals. Technical report, to appear in Signal processing for music transcription, M. Davy and A. Klapuri Eds. (2005).
- M. Dörfler. Gabor Analysis for a Class of Signals called Music. PhD Dissertation, Mathematics Department, University of Vienna, 2002. http://www.mat.univie.ac.at/~moni/diss.pdf
- C. Févotte, L. Daudet, S.J. Godsill and B. Torrésani. Sparse Regression with Structured Priors: Application to Audio Denoising. Proceedings of ICASSP 2006, vol. 3, pp. III-57 - III-60.
- C. Févotte, B. Torrésani, L. Daudet and S. Godsill. Denoising of musical audio using sparse linear regression and structured priors. Submitted.
- F. Jaillet and B. Torrésani. Remarques sur l'adaptativité des représentations temps-fréquence. Proceedings of the GRETSI'03 conference, vol. 1, pp. 145-148.
- F. Jaillet and B. Torrésani. Time-Frequency Jigsaw Puzzles. To appear in Int. J. on Wavelets and Multiresolution Information Processing (2006).
- M. Kowalski and B. Torrésani. A Family of Random Waveform Models for Audio Coding. Proceedings of ICASSP 2006, vol. 3, pp. III-472 - III-475.
- S. Mallat and Z. Zhang. Matching pursuits with time-frequency dictionaries. IEEE Transactions on Signal Processing, 41:3397-3415, 1993.
- S. Mallat. A wavelet tour of signal processing. Academic Press, 1998.
- S. Molla and B. Torrésani. Hybrid Audio Scheme using Hidden Markov Models of Waveforms. Applied and Computational Harmonic Analysis 18 (2005), pp. 137-166.
- M.V. Wickerhauser. Adapted Wavelet Analysis from Theory to Software. AK Peters, Boston, MA, USA, 1994.