Overview of the Mathematics of Compressive Sensing

Simon Foucart
Reading Seminar on "Compressive Sensing, Extensions, and Applications"
Texas A&M University
1 October 2015

Part 1: Invitation to Compressive Sensing

Keywords

- Sparsity: essential.
- Randomness: nothing better so far (measurement process).
- Optimization: preferred, but competitive alternatives are available (reconstruction process).

The Standard Compressive Sensing Problem

- $x$: unknown signal of interest in $\mathbb{K}^N$,
- $y$: measurement vector in $\mathbb{K}^m$ with $m \ll N$,
- $s$: sparsity of $x$, i.e., $s = \mathrm{card}\{ j \in \{1,\dots,N\} : x_j \ne 0 \}$.

Find concrete sensing/recovery protocols, i.e., find

- measurement matrices $A : x \in \mathbb{K}^N \mapsto y \in \mathbb{K}^m$,
- reconstruction maps $\Delta : y \in \mathbb{K}^m \mapsto x \in \mathbb{K}^N$,

such that $\Delta(Ax) = x$ for any $s$-sparse vector $x \in \mathbb{K}^N$.

In realistic situations, two issues to consider:
- Stability: $x$ not sparse but compressible,
- Robustness: measurement error in $y = Ax + e$.

A Selection of Applications

- Magnetic resonance imaging
  [Figure: left, traditional MRI reconstruction; right, compressive sensing reconstruction (courtesy of M. Lustig and S. Vasanawala).]
- Sampling theory
  [Figure: time-domain signal with 16 samples.]
- Error correction
- and many more...
Application in Metagenomics [Koslicki–F.–Rosen]

- $x \in \mathbb{R}^N$ ($N = 273{,}727$): concentrations of known bacteria in a given environmental sample. Sparsity assumption is realistic. Note also that $x \ge 0$ and $\sum_j x_j = 1$.
- $y \in \mathbb{R}^m$ ($m = 4^6 = 4{,}096$): frequencies of length-6 subwords (in 16S rRNA gene reads or in whole-genome shotgun reads); see the sketch after this list.
- $A \in \mathbb{R}^{m \times N}$: frequencies of length-6 subwords in all known (i.e., sequenced) bacteria. It is a frequency matrix, that is, $A_{i,j} \ge 0$ and $\sum_{i=1}^m A_{i,j} = 1$.
- Quikr improves on traditional read-by-read methods, especially in terms of speed.
- Codes available at sourceforge.net/projects/quikr/ and sourceforge.net/projects/wgsquikr/
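As a concrete illustration of how the measurement vector $y$ is formed, here is a minimal Python sketch that tallies length-6 subword frequencies from a batch of reads. The lexicographic 6-mer indexing and the function name are illustrative assumptions, not the conventions of the Quikr codebase.

```python
import numpy as np
from itertools import product

# Assumed convention: index 6-mers lexicographically over the alphabet A, C, G, T.
BASES = "ACGT"
K = 6
KMER_INDEX = {"".join(p): i for i, p in enumerate(product(BASES, repeat=K))}

def kmer_frequencies(reads):
    """Tally the normalized length-6 subword frequency vector (m = 4**6 = 4096)."""
    counts = np.zeros(len(BASES) ** K)
    for read in reads:
        for start in range(len(read) - K + 1):
            kmer = read[start:start + K]
            if kmer in KMER_INDEX:          # skip subwords with ambiguous bases
                counts[KMER_INDEX[kmer]] += 1
    return counts / counts.sum()            # entries are nonnegative and sum to 1

y = kmer_frequencies(["ACGTACGTAACGT", "TTGACGTACGTTTT"])
print(y.shape)   # (4096,)
```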
$\ell_0$-Minimization

Since
$$\|x\|_p^p := \sum_{j=1}^N |x_j|^p \xrightarrow[p \to 0]{} \sum_{j=1}^N \mathbf{1}_{\{x_j \ne 0\}},$$
the notation $\|x\|_0$ [sic] has become usual for
$$\|x\|_0 := \mathrm{card}(\mathrm{supp}(x)), \quad \text{where } \mathrm{supp}(x) := \{ j \in [N] : x_j \ne 0 \}.$$

For an $s$-sparse $x \in \mathbb{K}^N$, observe the equivalence of
- $x$ is the unique $s$-sparse solution of $Az = y$ with $y = Ax$,
- $x$ can be reconstructed as the unique solution of
$$(P_0) \qquad \underset{z \in \mathbb{K}^N}{\text{minimize}} \ \|z\|_0 \quad \text{subject to } Az = y.$$

This is a combinatorial problem, NP-hard in general.
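To see the combinatorial nature of $(P_0)$ concretely, the following minimal sketch enumerates every candidate support of size at most $s$ and solves a least-squares problem on each. With on the order of $\binom{N}{s}$ supports to try, this is feasible only for toy sizes; it is an illustration of why the problem is intractable at scale, not a practical solver.

```python
import numpy as np
from itertools import combinations

def solve_p0(A, y, s, tol=1e-10):
    """Brute-force (P0): try every support of size <= s (exponential in N)."""
    N = A.shape[1]
    if np.linalg.norm(y) < tol:
        return np.zeros(N)                     # the zero vector is 0-sparse
    for size in range(1, s + 1):               # smallest supports first
        for S in combinations(range(N), size):
            S = list(S)
            zS, *_ = np.linalg.lstsq(A[:, S], y, rcond=None)
            if np.linalg.norm(A[:, S] @ zS - y) < tol:
                z = np.zeros(N)
                z[S] = zS
                return z                       # a sparsest z consistent with y
    return None                                # no s-sparse solution exists
```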
Minimal Number of Measurements

Given $A \in \mathbb{K}^{m \times N}$, the following are equivalent:
1. Every $s$-sparse $x$ is the unique $s$-sparse solution of $Az = Ax$,
2. $\ker A \cap \{ z \in \mathbb{K}^N : \|z\|_0 \le 2s \} = \{0\}$,
3. For any $S \subset [N]$ with $\mathrm{card}(S) \le 2s$, the matrix $A_S$ is injective,
4. Every set of $2s$ columns of $A$ is linearly independent.

As a consequence, exact recovery of every $s$-sparse vector forces $m \ge 2s$. This can be achieved using partial Vandermonde matrices.

Exact s-Sparse Recovery from 2s Fourier Measurements

Identify an $s$-sparse $x \in \mathbb{C}^N$ with a function $x$ on $\{0, 1, \dots, N-1\}$ with support $S$, $\mathrm{card}(S) = s$. Consider the $2s$ Fourier coefficients
$$\hat{x}(j) = \sum_{k=0}^{N-1} x(k)\, e^{-i 2\pi jk/N}, \qquad 0 \le j \le 2s-1.$$
Consider a trigonometric polynomial vanishing exactly on $S$, i.e.,
$$p(t) := \prod_{k \in S} \left( 1 - e^{-i 2\pi k/N}\, e^{i 2\pi t/N} \right).$$
Since $p \cdot x \equiv 0$, discrete convolution gives
$$0 = (\hat{p} * \hat{x})(j) = \sum_{k=0}^{N-1} \hat{p}(k)\, \hat{x}(j-k), \qquad 0 \le j \le N-1.$$
Note that $\hat{p}(0) = 1$ and that $\hat{p}(k) = 0$ for $k > s$. The equations for $j = s, \dots, 2s-1$ therefore involve only $\hat{x}(0), \dots, \hat{x}(2s-1)$ and translate into a Toeplitz system with unknowns $\hat{p}(1), \dots, \hat{p}(s)$. This determines $\hat{p}$, hence $p$, then $S$ (as the zero set of $p$), and finally $x$ (by solving the Fourier equations on $S$); see the sketch below.
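The argument transcribes directly into code. Below is a minimal sketch under the assumptions that $x$ has exactly $s$ nonzero entries (so the Toeplitz matrix is invertible) and the measurements are exact; the procedure is sensitive to noise, which is one reason this elegant scheme is not the practical answer.

```python
import numpy as np

def prony_recover(x_hat, N, s):
    """Recover an s-sparse x in C^N from x_hat(0), ..., x_hat(2s-1)."""
    x_hat = np.asarray(x_hat, dtype=complex)
    # Equations j = s, ..., 2s-1:  x_hat(j) + sum_{k=1}^{s} p_hat(k) x_hat(j-k) = 0,
    # a Toeplitz system for the unknowns p_hat(1), ..., p_hat(s).
    M = np.array([[x_hat[s + r - k] for k in range(1, s + 1)] for r in range(s)])
    p_hat = np.concatenate(([1.0], np.linalg.solve(M, -x_hat[s:2 * s])))
    # p(t) = sum_{k=0}^{s} p_hat(k) e^{i 2 pi k t / N} vanishes exactly on S.
    z = np.exp(2j * np.pi * np.arange(N) / N)
    p_vals = np.polyval(p_hat[::-1], z)           # highest-degree coefficient first
    S = np.sort(np.argsort(np.abs(p_vals))[:s])   # s smallest values of |p(t)|
    # Solve the first 2s Fourier equations restricted to the support S.
    F = np.exp(-2j * np.pi * np.outer(np.arange(2 * s), S) / N)
    xS, *_ = np.linalg.lstsq(F, x_hat, rcond=None)
    x = np.zeros(N, dtype=complex)
    x[S] = xS
    return x

# Self-check: np.fft.fft uses the same sign convention as x_hat above.
N, s = 64, 3
x_true = np.zeros(N, dtype=complex)
x_true[[5, 17, 40]] = [1.0, 2.0j, -0.5]
assert np.allclose(prony_recover(np.fft.fft(x_true)[:2 * s], N, s), x_true)
```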
$\ell_1$-Minimization (Basis Pursuit)

Replace $(P_0)$ by
$$(P_1) \qquad \underset{z \in \mathbb{K}^N}{\text{minimize}} \ \|z\|_1 \quad \text{subject to } Az = y.$$

- Geometric intuition
- Unique $\ell_1$-minimizers are at most $m$-sparse (when $\mathbb{K} = \mathbb{R}$)
- Convex optimization program, hence solvable in practice
- In the real setting, recast as the linear optimization program (see the sketch after this list)
$$\underset{c,\, z \in \mathbb{R}^N}{\text{minimize}} \ \sum_{j=1}^N c_j \quad \text{subject to } Az = y \text{ and } -c_j \le z_j \le c_j,$$
- In the complex setting, recast as a second-order cone program
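As an illustration of the recast, here is a minimal sketch using scipy.optimize.linprog (the solver choice is incidental; any LP solver serves). The decision vector stacks $z$ and $c$, the equality block encodes $Az = y$, and the inequality blocks encode $-c_j \le z_j \le c_j$.

```python
import numpy as np
from scipy.optimize import linprog

def basis_pursuit_lp(A, y):
    """Solve (P1): min ||z||_1 s.t. Az = y, via the LP recast (real setting)."""
    m, N = A.shape
    # Variables: [z; c] in R^{2N}; objective: sum of the c_j.
    obj = np.concatenate((np.zeros(N), np.ones(N)))
    # Inequalities z_j - c_j <= 0 and -z_j - c_j <= 0 encode |z_j| <= c_j.
    I = np.eye(N)
    A_ub = np.block([[I, -I], [-I, -I]])
    b_ub = np.zeros(2 * N)
    # The equality constraint Az = y acts on the z block only.
    A_eq = np.hstack((A, np.zeros((m, N))))
    res = linprog(obj, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=y,
                  bounds=(None, None))       # z free; c >= 0 is implied
    return res.x[:N]                          # the l1 minimizer z
```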
Basis Pursuit — Null Space Property

$\Delta_1(Ax) = x$ for every vector $x$ supported on $S$ if and only if
$$(\text{NSP}) \qquad \|u_S\|_1 < \|u_{\bar{S}}\|_1, \qquad \text{all } u \in \ker A \setminus \{0\}.$$

For real measurement matrices, real and complex NSPs read
$$\sum_{j \in S} |u_j| < \sum_{\ell \in \bar{S}} |u_\ell|, \qquad \text{all } u \in \ker_{\mathbb{R}} A \setminus \{0\},$$
$$\sum_{j \in S} \sqrt{v_j^2 + w_j^2} < \sum_{\ell \in \bar{S}} \sqrt{v_\ell^2 + w_\ell^2}, \qquad \text{all } (v, w) \in (\ker_{\mathbb{R}} A)^2 \setminus \{0\}.$$

Real and complex NSPs are in fact equivalent.

Orthogonal Matching Pursuit

Starting with $S^0 = \emptyset$ and $x^0 = 0$, iterate:
$$(\text{OMP}_1) \qquad S^{n+1} = S^n \cup \Big\{ j^{n+1} := \underset{j \in [N]}{\mathrm{argmax}} \, \big|(A^*(y - Ax^n))_j\big| \Big\},$$
$$(\text{OMP}_2) \qquad x^{n+1} = \underset{z \in \mathbb{C}^N}{\mathrm{argmin}} \, \big\{ \|y - Az\|_2 : \mathrm{supp}(z) \subseteq S^{n+1} \big\}.$$

- The norm of the residual decreases according to
$$\|y - Ax^{n+1}\|_2^2 \le \|y - Ax^n\|_2^2 - \big|(A^*(y - Ax^n))_{j^{n+1}}\big|^2.$$
- Every vector $x \ne 0$ supported on $S$, $\mathrm{card}(S) = s$, is recovered from $y = Ax$ after at most $s$ iterations of OMP if and only if $A_S$ is injective and
$$(\text{ERC}) \qquad \max_{j \in S} |(A^* r)_j| > \max_{\ell \in \bar{S}} |(A^* r)_\ell|$$
for all $r \ne 0$ in $\{ Az : \mathrm{supp}(z) \subseteq S \}$.
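A direct NumPy transcription of $(\text{OMP}_1)$–$(\text{OMP}_2)$ might look as follows. It is a plain sketch: $(\text{OMP}_2)$ is re-solved from scratch with lstsq at every iteration, whereas a careful implementation would update a QR factorization of $A_S$ instead.

```python
import numpy as np

def omp(A, y, s, tol=1e-10):
    """Orthogonal Matching Pursuit: at most s iterations of (OMP1)-(OMP2)."""
    N = A.shape[1]
    S = []                                        # support S^n, grown greedily
    x = np.zeros(N, dtype=A.dtype)
    for _ in range(s):
        r = y - A @ x                             # current residual y - A x^n
        if np.linalg.norm(r) < tol:
            break
        j = int(np.argmax(np.abs(A.conj().T @ r)))         # (OMP1)
        if j not in S:
            S.append(j)
        xS, *_ = np.linalg.lstsq(A[:, S], y, rcond=None)   # (OMP2)
        x = np.zeros(N, dtype=xS.dtype)
        x[S] = xS
    return x
```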
Iterative Hard Thresholding and Hard Thresholding Pursuit

- Solving the rectangular system $Ax = y$ amounts to solving the square system $A^* A x = A^* y$,
- classical iterative methods suggest the iteration $x^{n+1} = x^n + A^*(y - Ax^n)$,
- at each iteration, keep the $s$ largest absolute entries and set the other ones to zero.

IHT: Start with an $s$-sparse $x^0 \in \mathbb{C}^N$ and iterate
$$(\text{IHT}) \qquad x^{n+1} = H_s\big( x^n + A^*(y - Ax^n) \big)$$
until a stopping criterion is met.

HTP: Start with an $s$-sparse $x^0 \in \mathbb{C}^N$ and iterate
$$(\text{HTP}_1) \qquad S^{n+1} = \big\{ \text{indices of the } s \text{ largest absolute entries of } x^n + A^*(y - Ax^n) \big\},$$
$$(\text{HTP}_2) \qquad x^{n+1} = \underset{z}{\mathrm{argmin}} \, \big\{ \|y - Az\|_2 : \mathrm{supp}(z) \subseteq S^{n+1} \big\},$$
until a stopping criterion is met ($S^{n+1} = S^n$ is natural here).
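Both schemes are a few lines of NumPy. The sketch below uses a fixed iteration cap in place of a genuine stopping criterion for IHT and the natural rule $S^{n+1} = S^n$ for HTP; convergence of these iterations is only guaranteed under suitable conditions on $A$, which the code does not check.

```python
import numpy as np

def hard_threshold(z, s):
    """H_s: keep the s largest absolute entries of z, zero out the rest."""
    out = np.zeros_like(z)
    keep = np.argsort(np.abs(z))[-s:]
    out[keep] = z[keep]
    return out

def iht(A, y, s, n_iter=100):
    """Iterative Hard Thresholding: x <- H_s(x + A*(y - Ax))."""
    x = np.zeros(A.shape[1], dtype=A.dtype)
    for _ in range(n_iter):
        x = hard_threshold(x + A.conj().T @ (y - A @ x), s)
    return x

def htp(A, y, s, n_iter=100):
    """Hard Thresholding Pursuit: threshold for the support, then least squares."""
    N = A.shape[1]
    x = np.zeros(N, dtype=A.dtype)
    S_prev = None
    for _ in range(n_iter):
        g = x + A.conj().T @ (y - A @ x)
        S = np.sort(np.argsort(np.abs(g))[-s:])           # (HTP1)
        if S_prev is not None and np.array_equal(S, S_prev):
            break                                         # natural stopping rule
        xS, *_ = np.linalg.lstsq(A[:, S], y, rcond=None)  # (HTP2)
        x = np.zeros(N, dtype=xS.dtype)
        x[S] = xS
        S_prev = S
    return x
```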