Error bounds for noisy compressive phase retrieval

Bernhard G. Bodmann and Nathaniel Hammen
Mathematics Department, University of Houston, Houston, TX 77204-3008

Abstract—This paper provides a random complex measurement matrix and an algorithm for complex phase retrieval of sparse or approximately sparse signals from the noisy magnitudes of the measurements obtained with this matrix. We compute explicit error bounds for the recovery which depend on the noise-to-signal ratio, the sparsity $s$, the number of measured quantities $m$, and the dimension of the signal $N$. This requires $m$ to be of the order of $s \ln(N/s)$. In comparison with sparse recovery from complex linear measurements, our phase retrieval algorithm requires six times the number of measured quantities.

I. INTRODUCTION

Some types of measurement devices only record magnitudes of a linear transform, while the phase information is unavailable. This is the case in many optical, acoustic, electromagnetic, and quantum systems [10, 17, 19], and in particular in X-ray crystallography [15]. In this paper, we assume an $N$-dimensional complex signal $x \in \mathbb{C}^N$ and noisy measurements given by $b_i = |\langle x, f_i\rangle|^2 + \epsilon_i$ for some set of measurement vectors $\{f_i\}_{i=1}^M$ and measurement noise $\{\epsilon_i\}_{i=1}^M$. We wish to recover an approximation to the vector $x$ from the measurements $\{b_i\}_{i=1}^M$, up to a unimodular factor that remains undetermined. This is the abstract formulation of the phase retrieval problem [13] in finite-dimensional Hilbert spaces.

In many applications, the dimension of the signal is much larger than the number of measurements that can be taken feasibly. Thus, we would like to combine phase retrieval results with compressive sensing results, which allow a sufficiently sparse vector to be recovered from fewer measurements than the dimension of the vector space. This idea of combining phase retrieval with compressive sensing has been explored in recent years [4, 5, 8, 11, 12, 14, 18]. Some of these methods have provable performance guarantees in the presence of noise, but they do not include precise conditions on the number of measured quantities that are sufficient [12, 14, 18]. In addition, the known error bounds are either generic, or depend quadratically on the noise-to-signal ratio [4]. This paper provides an explicit error bound for sparse signals that is linear in the noise-to-signal ratio for a concrete number of measured quantities. It also includes an error bound for approximately sparse signals.

To this end, we combine the phase retrieval algorithm of [6] with the generic two-stage sparse phase retrieval method of [12], both of which have an error bound that is linear in terms of the input noise. This results in a sparse phase retrieval algorithm with a small number of measurements and a uniform error bound that depends linearly on the noise-to-signal ratio. In the first stage of recovery, the relative phases of a number of linear measurements are recovered with a deterministic algorithm. In the second stage, we use that these linear measurements are chosen according to a randomization strategy from compressive sensing, and thus provide a method to recover sparse signals accurately. The main challenge of controlling the error with the two-stage method is that a naive combination of the random measurements with the phase retrieval algorithm only implies a random error bound. However, using the RIP constant of the measurement matrix allows us to deduce a deterministic error bound which holds with overwhelming probability.
II. PRELIMINARIES

A. Compressive Sensing

We say a vector $x$ is $s$-sparse if $x$ has $s$ or fewer nonzero entries, and we say a vector $x$ is nearly $s$-sparse if there exists an $s$-sparse vector that is a small $\ell_1$ distance away from $x$. For any vector $x$, we define $\|x\|_0$ to be the number of nonzero entries of $x$; thus $\|x\|_0$ is the smallest number $s$ such that $x$ is $s$-sparse. A central task in compressive sensing is to create an underdetermined system of measurements from which a sparse or nearly sparse vector can be recovered to a high degree of accuracy. Typically, the accuracy is measured in terms of the Euclidean norm $\|\cdot\|_2$. Recovery guarantees are usually established using a restricted isometry property or a null space property of the measurement matrix [9]. Here, we use the restricted isometry property.

Definition 1. For a real or complex $m \times N$ matrix $A$ and a positive integer $s \le N$, we say that $A$ satisfies the $s$-restricted isometry property with isometry constant $\delta_s$ if for each $s$-sparse vector $x \in \mathbb{R}^N$ or $x \in \mathbb{C}^N$, respectively, we have

$$(1-\delta_s)\|x\|_2^2 \le \|Ax\|_2^2 \le (1+\delta_s)\|x\|_2^2 .$$

Foucart and Rauhut show how a suitably bounded restricted isometry constant provides robust and stable sparse recovery by $\ell_1$-norm minimization.

Theorem 2 (Foucart and Rauhut, Theorem 6.12 in [9]). Suppose that $A \in \mathbb{C}^{m \times N}$ satisfies the $2s$-restricted isometry property with isometry constant $\delta_{2s} < \frac{4}{\sqrt{41}}$. Then, for any $x \in \mathbb{C}^N$ and $y \in \mathbb{C}^m$ satisfying $\|y - Ax\|_2 \le \eta$, the solution $x^{\#}$ to

$$\arg\min_{\tilde{x} \in \mathbb{C}^N} \|\tilde{x}\|_1 \quad \text{subject to} \quad \|y - A\tilde{x}\|_2 \le \eta$$

satisfies

$$\|x - x^{\#}\|_2 \le \frac{C}{\sqrt{s}} \inf_{z \in \mathbb{C}^N, \|z\|_0 \le s} \|x - z\|_1 + D\eta$$

with $C$ and $D$ only depending on $\delta_{2s}$ by

$$C = \frac{2(1+\rho)}{1-\rho} \quad \text{and} \quad D = \frac{(3+\rho)\tau}{1-\rho}$$

and, in turn,

$$\rho = \frac{\delta_{2s}}{\sqrt{1-\delta_{2s}^2} - \delta_{2s}/4} \quad \text{and} \quad \tau = \frac{\sqrt{1+\delta_{2s}}}{\sqrt{1-\delta_{2s}^2} - \delta_{2s}/4} .$$

There are many algorithms that have been created to solve the minimization problem in the above theorem efficiently, such as iteratively re-weighted least squares [7].
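For illustration, a minimal sketch of such an iteratively re-weighted least squares scheme, in the spirit of [7], is given below for the equality-constrained program $\min \|x\|_1$ subject to $Ax = y$. The initialization, iteration count, and smoothing update are simplified choices of ours rather than prescriptions from [7], and the noise-aware program of Theorem 2 would replace the equality constraint by $\|y - A\tilde{x}\|_2 \le \eta$.

```python
import numpy as np

def irls_l1(A, y, s, num_iter=50):
    """Illustrative IRLS sketch for min ||x||_1 subject to A x = y,
    in the spirit of [7]; parameter choices here are hypothetical."""
    x = np.linalg.lstsq(A, y, rcond=None)[0]        # least-squares start
    eps = 1.0
    for _ in range(num_iter):
        w = np.sqrt(np.abs(x) ** 2 + eps ** 2)      # smoothed weights ~ |x_j|
        # weighted least-squares step: x = W A^* (A W A^*)^{-1} y, W = diag(w)
        AW = A * w                                  # scales column j of A by w_j
        x = w * (A.conj().T @ np.linalg.solve(AW @ A.conj().T, y))
        # shrink the smoothing parameter via the (s+1)-st largest entry,
        # following the update rule suggested in [7]
        eps = min(eps, np.sort(np.abs(x))[-(s + 1)] / A.shape[1])
    return x

# toy usage: recover a 3-sparse vector from 20 complex Gaussian measurements
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 40)) + 1j * rng.standard_normal((20, 40))
x0 = np.zeros(40, dtype=complex)
x0[[3, 17, 31]] = [1.0, -2.0, 1j]
print(np.linalg.norm(irls_l1(A, A @ x0, s=3) - x0))   # should be small
```

The weighted least-squares step has a closed form because the constraint is linear, which is what keeps each iteration a single $m \times m$ solve.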
B. Phase Retrieval

Our algorithm for phase retrieval recovers an approximation to the vector $x \in \mathbb{C}^m$, up to a unimodular constant, from measurements of the form $\{b_i = |\langle x, f_i\rangle|^2 + \epsilon_i\}_{i=1}^n$ for some set of measurement vectors $\{f_i\}_{i=1}^n$ and a noise vector $\epsilon = (\epsilon_i)_{i=1}^n \in \mathbb{R}^n$. The magnitude of the noise is measured by the norm $\|\epsilon\|_\infty = \max_i |\epsilon_i|$. We use the three-step algorithm from [6] to provide a robust solution to the recovery problem using $n = 6m-3$ measurements.

(1) Let $D_{m-1}$ be the normalized Dirichlet kernel of degree $m-1$, so that for any $z \in \mathbb{C}$ with $|z| = 1$,

$$D_{m-1}(z) = \frac{1}{2m-1} \sum_{k=-(m-1)}^{m-1} z^k .$$

Then, with $\omega = e^{2\pi i/(2m-1)}$, the set of functions $\{z \mapsto D_{m-1}(z\omega^{-l})\}_{l=1}^{2m-1}$ is an orthogonal basis for the set of trigonometric polynomials of degree at most $m-1$ with respect to the $L^2$ inner product. Thus, any such trigonometric polynomial $g$ can be interpolated as

$$g(z) = \sum_{l=1}^{2m-1} g(\omega^l)\, D_{m-1}(z\omega^{-l}) .$$

If we represent the coefficients of the vector $x$ to be recovered as the coefficients of a complex polynomial $p$ having degree at most $m-1$, and we let $\nu = e^{2\pi i/m}$, then the functions $z \mapsto |p(z)|^2$, $z \mapsto |p(z) - p(z\nu)|^2$, and $z \mapsto |p(z) - ip(z\nu)|^2$ are equal to trigonometric polynomials of degree at most $m-1$ when restricted to the unit circle $\mathbb{T} = \{z \in \mathbb{C} : |z| = 1\}$. The $6m-3$ measurements that are taken are perturbed samples of these functions at each of the $(2m-1)$-st roots of unity,

$$b_l = \begin{cases} |p(\omega^l)|^2 + \epsilon_l, & l \le 2m-1,\\ |p(\omega^l) - p(\omega^l\nu)|^2 + \epsilon_l, & 2m \le l \le 4m-2,\\ |p(\omega^l) - ip(\omega^l\nu)|^2 + \epsilon_l, & 4m-1 \le l . \end{cases}$$

These measurements are interpolated using the Dirichlet kernel to three approximating trigonometric polynomials,

$$g_0(z) = \sum_{l=1}^{2m-1} b_l\, D_{m-1}(z\omega^{-l}), \quad g_1(z) = \sum_{l=1}^{2m-1} b_{2m-1+l}\, D_{m-1}(z\omega^{-l}), \quad g_2(z) = \sum_{l=1}^{2m-1} b_{4m-2+l}\, D_{m-1}(z\omega^{-l}) .$$

Note that this process is robust to noise in the measurements, with an error at any point that is at most $(2m-1)\|\epsilon\|_\infty$, so that on the unit circle $g_0(z) \approx |p(z)|^2$, $g_1(z) \approx |p(z) - p(z\nu)|^2$, and $g_2(z) \approx |p(z) - ip(z\nu)|^2$. Thus, the finite number of perturbed measurements has been expanded to an infinite family of approximate measurements of each of these functions on the unit circle.

(2) Next, magnitude measurements of $g_0$ are selected at the points $\xi\nu^k$, for $k$ from $1$ to $m$, for multiple values of $\xi \in \mathbb{C}$ with $|\xi| = 1$ whose angles are less than that of $\nu$. Let $z_0$ equal the value of $\xi$ that maximizes $\min_k g_0(\xi\nu^k)$. Thus, $\{g_0(z_0\nu^k)\}_{k=1}^m$ is a set of $m$ equally spaced magnitude measurements on the unit circle such that the smallest of the $m$ measurements is suitably bounded away from zero. Magnitude measurements of $g_1$ and $g_2$ are also taken at the same $m$ points $z_0\nu^k$.

(3) The evaluations of the polynomial $p$ at the $m$ sample points $\{z_0\nu^k\}_{k=1}^m$ from step (2) are equal to inner products with measurement vectors, $p(z_0\nu^k) = \langle p, K_{z_0\nu^k}\rangle$, given by

$$K_{z_0\nu^k}(z) = \sum_{j=0}^{m-1} \overline{(z_0\nu^k)^j}\, z^j ,$$

which form an orthogonal basis for the space of complex polynomials of degree at most $m-1$. Thus, if the $m$ sample points are ordered with increasing angle, the values of $g_0(z_0\nu^k)$, $g_1(z_0\nu^k)$, and $g_2(z_0\nu^k)$ correspond to perturbed values of $|x_k|^2$, $|x_k - x_{k+1}|^2$, and $|x_k - ix_{k+1}|^2$ with respect to this orthogonal basis, respectively. An approximation $y$ for $x$ is created in this basis by letting $y_1 = \sqrt{g_0(z_0\nu)} \approx |x_1|$, and for each $k$ from $1$ to $m-1$, the (noiseless) identity

$$\overline{x_k}\,x_{k+1} = \frac{1}{2}(1-i)\left(|x_k|^2 + |x_{k+1}|^2\right) - \frac{1}{2}|x_k - x_{k+1}|^2 + \frac{i}{2}|x_k - ix_{k+1}|^2$$

is used to create an iterative process which assigns, for any $1 \le k \le m-1$,

$$t_k = \frac{1}{2}(1-i)\left(g_0(z_0\nu^k) + g_0(z_0\nu^{k+1})\right) - \frac{1}{2}g_1(z_0\nu^k) + \frac{i}{2}g_2(z_0\nu^k)$$

and

$$y_{k+1} = \frac{t_k\, y_k}{g_0(z_0\nu^k)} \approx \frac{\overline{x_k}\,x_{k+1}}{|x_k|^2}\, y_k .$$

A change of basis converts from the coefficients of $y$ back to the canonical polynomial basis,

$$\tilde{p} = \sum_{k=1}^m y_k \frac{1}{\sqrt{m}} K_{z_0\nu^k} .$$
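For concreteness, the following Python sketch walks through steps (1) through (3) on noiseless toy data. It tracks the values $p(z_0\nu^k)$ directly rather than the normalized coefficients $x_k$, which absorbs the $1/\sqrt{m}$ normalization of the kernel basis; the grid resolution used to choose $\xi$ in step (2), the random seed, and the function names are illustrative choices of ours, not prescribed by [6].

```python
import numpy as np

def noisy_phase_retrieval(b, m, grid=128):
    """Sketch of steps (1)-(3): recover the monomial coefficients of p from
    the 6m-3 noisy samples b, ordered as in step (1).  The grid resolution
    used to choose xi in step (2) is a hypothetical choice."""
    n = 2 * m - 1
    omega, nu = np.exp(2j * np.pi / n), np.exp(2j * np.pi / m)
    nodes = omega ** np.arange(1, n + 1)
    kk = np.arange(-(m - 1), m)

    def g(samples, z):
        """Step (1): evaluate sum_l samples[l] D_{m-1}(z omega^{-l})."""
        z = np.asarray(z, dtype=complex)
        D = np.power(np.multiply.outer(z, nodes.conj())[..., None], kk).sum(-1) / n
        return (D @ samples).real

    # step (2): choose xi on a grid of angles below that of nu so that the
    # smallest of the m magnitude samples of g0 is maximal
    xis = np.exp(1j * np.linspace(0, 2 * np.pi / m, grid, endpoint=False))
    pts = np.multiply.outer(xis, nu ** np.arange(1, m + 1))
    z0 = xis[np.argmax(g(b[:n], pts).min(axis=1))]
    w = z0 * nu ** np.arange(1, m + 1)          # the sample points z0 nu^k
    g0, g1, g2 = g(b[:n], w), g(b[n:2 * n], w), g(b[2 * n:], w)

    # step (3): propagate relative phases; y[k] tracks p(z0 nu^{k+1}), which
    # absorbs the 1/sqrt(m) normalization of the kernel basis
    y = np.zeros(m, dtype=complex)
    y[0] = np.sqrt(max(g0[0], 0.0))
    for k in range(m - 1):
        t = (0.5 * (1 - 1j) * (g0[k] + g0[k + 1])   # t_k, approximately
             - 0.5 * g1[k] + 0.5j * g2[k])          # conj(p(w_k)) p(w_{k+1})
        y[k + 1] = t * y[k] / g0[k]
    # change of basis: coefficients c_j = (1/m) sum_k y_k conj(w_k^j)
    return np.conj(np.power.outer(w, np.arange(m))).T @ y / m

# noiseless toy check: random polynomial of degree m-1 = 7
m, n = 8, 15
rng = np.random.default_rng(1)
c = rng.standard_normal(m) + 1j * rng.standard_normal(m)
p = lambda z: np.polyval(c[::-1], z)            # p(z) = sum_j c_j z^j
pts = np.exp(2j * np.pi / n) ** np.arange(1, n + 1)
nu = np.exp(2j * np.pi / m)
b = np.concatenate([np.abs(p(pts)) ** 2,
                    np.abs(p(pts) - p(pts * nu)) ** 2,
                    np.abs(p(pts) - 1j * p(pts * nu)) ** 2])
c_hat = noisy_phase_retrieval(b, m)
gamma = np.vdot(c_hat, c) / abs(np.vdot(c_hat, c))  # align unimodular factor
print(np.linalg.norm(c - gamma * c_hat) / np.linalg.norm(c))  # near machine precision
```

In the noiseless case the interpolants $g_0$, $g_1$, $g_2$ coincide exactly with the three squared-magnitude functions, so the sketch recovers the coefficients up to a single unimodular factor, which the last two lines align before comparing.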
The following theorem from [6] gives error bounds for the above algorithm that are linear in the noise-to-signal ratio. For the proof, it is most natural to measure this ratio with the quotient of $\|\epsilon\|_\infty$ and $\|p\|_2$.

Theorem 3 ([6]). Let $P_m$ be the space of polynomials with maximal degree $m-1$, $\omega = e^{2\pi i/(2m-1)}$, $\nu = e^{2\pi i/m}$, $r = \sin\!\left(\frac{2\pi}{(m-1)m^2}\right)$, and $0 < \alpha < 1$. For any nonzero $p \in P_m$ and any $\epsilon \in \mathbb{R}^{6m-3}$, if

$$\beta = \frac{r^{(m-1)m/2}\, m \left(\frac{m-1}{2m}\right)^{(m-1)/2}}{\prod_{k=1}^{m-1}(r^k+1)}$$

and $\|\epsilon\|_\infty \le \frac{\alpha\beta^2}{2m-1}\,\|p\|_2^2$, then the approximation $\tilde{p} \in P_m$ constructed with the above algorithm using the values of the measurement map $\tilde{A} : P_m \times \mathbb{R}^{6m-3} \to \mathbb{R}^{6m-3}$ defined by

$$\tilde{A}(p, \epsilon)_j = \begin{cases} |p(\omega^j)|^2 + \epsilon_j, & j \le 2m-1,\\ |p(\omega^j) - p(\omega^j\nu)|^2 + \epsilon_j, & 2m \le j \le 4m-2,\\ |p(\omega^j) - ip(\omega^j\nu)|^2 + \epsilon_j, & 4m-1 \le j \end{cases}$$

satisfies the error bound

$$\min_{|c|=1} \|\tilde{p} - cp\|_2 \le \tilde{E}\, \frac{\|\epsilon\|_\infty}{\|p\|_2}$$

with the constant

$$\tilde{E} = \left(\tilde{D} + \frac{2\beta}{\sqrt{(1-\tilde{C})\,m}}\right) \frac{m(2m-1)}{\sqrt{m}\,(1-\alpha)(1-\tilde{C})} ,$$

which depends on $m$ and $\alpha$ through

$$\tilde{C} = \frac{(1+\sqrt{2})\,\alpha\,\beta^2}{(2+\sqrt{m})\,\beta^2(1-\alpha)} \quad \text{and} \quad \tilde{D} = \frac{2\left(m^2 + 2m - m\tilde{C} - 1 + \tilde{C}\right)}{\beta^2(1-\alpha)(1-\tilde{C})^2} .$$

III. MAIN RESULT

In this section we apply the two theorems quoted above to the two-stage technique for sparse phase retrieval shown in [12]. The two stages are dimensional reduction by a randomized approximate projection and phase retrieval for the compressed components. The concrete error bounds for phase retrieval together with the performance guarantees for randomized measurements from compressed sensing allow us to control the accuracy of compressive phase retrieval.

In the real case, it is standard knowledge that for a random $m \times N$ matrix $A$ whose entries are drawn independently from a subgaussian distribution with variance one, there is a universal constant $C$ that only depends on the distribution such that $A/\sqrt{m}$ has the restricted isometry constant $\delta_s$ with probability exceeding $1 - 2\exp(-\delta_s^2 m/(2C))$ provided $m \ge 2C\delta_s^{-2} s \ln(eN/s)$ [9, Theorem 9.2].

In the complex case, we restrict ourselves to Gaussian measurement matrices and assemble several parts from [9]. The elementary starting point is that if $A$ is a complex Gaussian $m \times s$ matrix with $s < m$, whose entries have standard normal distributed real and imaginary parts, then the maximal and minimal singular values $\sigma_{\max}$ and $\sigma_{\min}$ of $A/\sqrt{2m}$ are contained in the interval $[1 - \sqrt{s/m} - t,\, 1 + \sqrt{s/m} + t]$ for $t > 0$ with a probability of [9, Exercise 9.5]

$$P\left[1 - \sqrt{\tfrac{s}{m}} - t \le \sigma_{\min},\ \sigma_{\max} \le 1 + \sqrt{\tfrac{s}{m}} + t\right] \ge 1 - 2e^{-mt^2} .$$

The proof of this is analogous to the real case [9, Theorem 9.26]. Using a union bound as in the proof of [9, Theorem 9.27], we then get that the restricted isometry constant $\delta_s$ of $A/\sqrt{2m}$ is bounded by

$$P\left[\delta_s > 2\left(\sqrt{\tfrac{s}{m}} + t\right) + \left(\sqrt{\tfrac{s}{m}} + t\right)^2\right] \le 2\left(\frac{eN}{s}\right)^s e^{-mt^2} .$$

If the sparsity $s$, the number of measurements $m$, and the dimension of the Hilbert space $N$ are such that for a given sparsity ratio $s/N$, $m/s$ is sufficiently large, then the desired RIP constant is achieved with overwhelming probability for large dimensions. We summarize these results from [9].

Proposition 4. A complex random matrix with entries whose real and imaginary parts are drawn independently at random from a normal distribution with mean zero and variance $1/(2m)$ achieves an RIP constant $\delta_{2s} < 4/\sqrt{41}$ with overwhelming probability if there exists

$$t > \sqrt{\frac{2s}{m} \ln\!\left(\frac{eN}{2s}\right)}$$

such that

$$2\left(\sqrt{2s/m} + t\right) + \left(\sqrt{2s/m} + t\right)^2 < \frac{4}{\sqrt{41}} .$$
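The condition in Proposition 4 is straightforward to check numerically. The following sketch scans for the smallest $m$ that admits a valid $t$ for given $s$ and $N$, taking $t$ just above its lower bound; the margin factor, scan limit, and function name are illustrative assumptions of ours, and the reported probability uses the bound stated in Corollary 7 below.

```python
import numpy as np

def smallest_m(s, N, margin=1.01, m_max=10**6):
    """Scan for the smallest m admitting a t as in Proposition 4.  The margin
    factor (placing t just above its lower bound) and the scan limit are
    illustrative choices, not part of the statement."""
    for m in range(2 * s, m_max):
        u = np.sqrt(2 * s / m)
        t = margin * np.sqrt((2 * s / m) * np.log(np.e * N / (2 * s)))
        if 2 * (u + t) + (u + t) ** 2 < 4 / np.sqrt(41):
            # success probability as stated in Corollary 7
            prob = 1 - 2 * (np.e * N / s) ** s * np.exp(-m * t ** 2)
            return m, 6 * m - 3, prob
    return None

# s = 5, N = 1000: the scan returns m on the order of s ln(N/s) times a
# moderate constant, together with the 6m - 3 measured quantities in total
print(smallest_m(5, 1000))
```

This also makes the scaling claim of the abstract concrete: the returned $m$ grows like $s \ln(N/s)$ up to the constant forced by the RIP threshold $4/\sqrt{41}$.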
If we let $A$ be the random Gaussian $m \times N$ matrix satisfying the conditions of Theorem 2, and we let $B$ be the matrix associated with the linear portion of the measurement map $\tilde{A}$ from Theorem 3, then the measurements of a vector $x$ are of the form $|BAx|^2 + \epsilon$, where $\epsilon \in \mathbb{R}^{6m-3}$ is a noise vector and $|\cdot|^2$ is the squared modulus taken component-wise.

Theorem 5. Let $x \in \mathbb{C}^N$, let $\epsilon \in \mathbb{R}^{6m-3}$, and let $A \in \mathbb{C}^{m \times N}$ satisfy the $2s$-restricted isometry property with isometry constant $\delta_{2s} < \frac{4}{\sqrt{41}}$. If $\omega = e^{2\pi i/(2m-1)}$ and $\nu = e^{2\pi i/m}$, let $B \in \mathbb{C}^{(6m-3) \times m}$ be given by

$$B_{j,k} = \begin{cases} \omega^{j(k-1)}, & 1 \le j \le 2m-1,\\ \omega^{j(k-1)} - (\omega^j\nu)^{k-1}, & 2m \le j \le 4m-2,\\ \omega^{j(k-1)} - i(\omega^j\nu)^{k-1}, & 4m-1 \le j \le 6m-3 . \end{cases}$$

Let $r = \sin\!\left(\frac{2\pi}{(m-1)m^2}\right)$, $0 < \alpha < 1$, let $\beta$ be as in Theorem 3, and assume $\|\epsilon\|_\infty \le \frac{\alpha\beta^2}{2m-1}\,\|Ax\|_2^2$. If $x$ satisfies the approximate sparsity requirement

$$\sigma_s(x)_1 < \frac{\sqrt{1-\delta_s}\,\|x\|_2}{\gamma_s}$$

with $\sigma_s(x)_1 = \inf_{z \in \mathbb{C}^N, \|z\|_0 \le s} \|x - z\|_1$ and $\gamma_s = \sqrt{1-\delta_s} + \sqrt{1+\delta_s}$, then an approximation $x^{\#}$ for $x$ can be reconstructed from the vector $|BAx|^2 + \epsilon$ (where $|\cdot|^2$ is taken component-wise), such that

$$\|c_0 x - x^{\#}\|_2 \le \frac{C_1}{\sqrt{s}}\,\sigma_s(x)_1 + C_2\, \frac{\|\epsilon\|_\infty}{\sqrt{1-\delta_s}\,\|x\|_2 - \gamma_s\,\sigma_s(x)_1}$$

where $C_1 = C$ and $C_2 = D\tilde{E}$, with $C$ and $D$ from Theorem 2, $\tilde{E}$ from Theorem 3, and $c_0 \in \mathbb{C}$, $|c_0| = 1$.

Proof. Consider the polynomial $p_x \in P_m$ defined such that $p_x(z) = \sum_{k=1}^m \langle x, \phi_k^*\rangle z^{k-1}$, where $\phi_k$ is the $k$-th row of $A$. This polynomial has monomial coefficients that are precisely equal to the coefficients of the vector $Ax$. Thus, the map $\tilde{A}$ defined in Theorem 3 satisfies

$$\tilde{A}(p_x, \epsilon)_j = \Big|\sum_{k=1}^m \langle x, \phi_k^*\rangle B_{j,k}\Big|^2 + \epsilon_j = \left(|BAx|^2 + \epsilon\right)_j .$$

Using Theorem 3, we obtain a polynomial $\tilde{p} \in P_m$ satisfying

$$\|\tilde{p} - c_0 p_x\|_2 \le \tilde{E}\, \frac{\|\epsilon\|_\infty}{\|p_x\|_2} = \tilde{E}\, \frac{\|\epsilon\|_\infty}{\|Ax\|_2}$$

for some $c_0 \in \mathbb{C}$ with $|c_0| = 1$ and some $\tilde{E} \in \mathbb{R}^+$ that depends only on $\alpha$ and $m$. If $y$ is the vector of monomial coefficients of $\tilde{p}$, then

$$\|y - c_0 Ax\|_2 = \|\tilde{p} - c_0 p_x\|_2 \le \tilde{E}\, \frac{\|\epsilon\|_\infty}{\|Ax\|_2} .$$

Using this, we may apply Theorem 2 to show that the solution $x^{\#}$ to

$$\arg\min_{\tilde{x} \in \mathbb{C}^N} \|\tilde{x}\|_1 \quad \text{subject to} \quad \|y - c_0 A\tilde{x}\|_2 \le \tilde{E}\, \frac{\|\epsilon\|_\infty}{\|Ax\|_2}$$

satisfies the random bound

$$\|c_0 x - x^{\#}\|_2 \le \frac{C}{\sqrt{s}} \inf_{z \in \mathbb{C}^N, \|z\|_0 \le s} \|c_0 x - z\|_1 + D\tilde{E}\, \frac{\|\epsilon\|_\infty}{\|Ax\|_2} = \frac{C_1}{\sqrt{s}}\,\sigma_s(x)_1 + C_2\, \frac{\|\epsilon\|_\infty}{\|Ax\|_2} .$$

To eliminate the random term $\|Ax\|_2$ in the denominator, we split $x$ into a sum of $s$-sparse vectors. Let

$$z_1 = \arg\min_{z \in \mathbb{C}^N, \|z\|_0 \le s} \|x - z\|_1$$

and, for each $k \in \mathbb{N}$ with $k < \lceil N/s \rceil$, let

$$z_{k+1} = \arg\min_{z \in \mathbb{C}^N, \|z\|_0 \le s} \Big\|x - \sum_{j=1}^{k} z_j - z\Big\|_1 ,$$

so that for any $j \ne k$, we have that $z_j$ is zero in each component in which $z_k$ is non-zero. Thus,

$$x = \sum_{j=1}^{\lceil N/s \rceil} z_j \quad \text{and} \quad \|x - z_1\|_1 = \Big\|\sum_{j=2}^{\lceil N/s \rceil} z_j\Big\|_1 = \sum_{j=2}^{\lceil N/s \rceil} \|z_j\|_1 .$$

Then

$$\|Ax\|_2 = \Big\|\sum_{j=1}^{\lceil N/s \rceil} Az_j\Big\|_2 = \Big\|Az_1 + \sum_{j=2}^{\lceil N/s \rceil} Az_j\Big\|_2$$

and by a few uses of the triangle inequality

$$\|Ax\|_2 \ge \|Az_1\|_2 - \sum_{j=2}^{\lceil N/s \rceil} \|Az_j\|_2 .$$

Next, with the $s$-restricted isometry property of $A$,

$$\|Ax\|_2 \ge \sqrt{1-\delta_s}\,\|z_1\|_2 - \sqrt{1+\delta_s} \sum_{j=2}^{\lceil N/s \rceil} \|z_j\|_2 = \sqrt{1-\delta_s}\,\|x - (x - z_1)\|_2 - \sqrt{1+\delta_s} \sum_{j=2}^{\lceil N/s \rceil} \|z_j\|_2 ,$$

by one more use of the reverse triangle inequality

$$\|Ax\|_2 \ge \sqrt{1-\delta_s}\,\left(\|x\|_2 - \|x - z_1\|_2\right) - \sqrt{1+\delta_s} \sum_{j=2}^{\lceil N/s \rceil} \|z_j\|_2 ,$$

and by the relation $\|v\|_2 \le \|v\|_1$ between the $\ell_1$ and $\ell_2$ norms

$$\|Ax\|_2 \ge \sqrt{1-\delta_s}\,\left(\|x\|_2 - \|x - z_1\|_1\right) - \sqrt{1+\delta_s} \sum_{j=2}^{\lceil N/s \rceil} \|z_j\|_1 .$$

Finally, using the earlier $\ell_1$ identity, we get

$$\|Ax\|_2 \ge \sqrt{1-\delta_s}\,\|x\|_2 - \left(\sqrt{1-\delta_s} + \sqrt{1+\delta_s}\right)\|x - z_1\|_1 = \sqrt{1-\delta_s}\,\|x\|_2 - \gamma_s\,\sigma_s(x)_1 .$$

Thus,

$$\|c_0 x - x^{\#}\|_2 \le \frac{C_1}{\sqrt{s}}\,\sigma_s(x)_1 + C_2\, \frac{\|\epsilon\|_\infty}{\|Ax\|_2} \le \frac{C_1}{\sqrt{s}}\,\sigma_s(x)_1 + C_2\, \frac{\|\epsilon\|_\infty}{\sqrt{1-\delta_s}\,\|x\|_2 - \gamma_s\,\sigma_s(x)_1} .$$

If $x$ is $s$-sparse, then the error bound simplifies to a linear bound in the noise-to-signal ratio.

Corollary 6. If the assumptions of the preceding theorem hold and $x$ is $s$-sparse, then the recovery algorithm results in $x^{\#}$ such that

$$\min_{|c|=1} \|cx - x^{\#}\|_2 \le C_2\, \frac{\|\epsilon\|_\infty}{\sqrt{1-\delta_s}\,\|x\|_2} .$$

Together with the random selection of normal, independently distributed entries as in Proposition 4, we achieve overwhelming probability of approximate recovery.

Corollary 7. If $A$ is a complex random matrix with entries whose real and imaginary parts are drawn independently at random from a normal distribution with mean zero and variance $1/(2m)$, with $s$, $m$, $N$ and $t > 0$ chosen according to the assumptions of Proposition 4, then the error bound in the preceding theorem holds for each $x \in \mathbb{C}^N$ with a probability bounded below by $1 - 2\left(\frac{eN}{s}\right)^s e^{-mt^2}$.
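To see how the pieces of Theorem 5 fit together, the following hypothetical end-to-end run composes the sketches `noisy_phase_retrieval` and `irls_l1` from above. The toy dimensions, seed, and noiseless measurements are illustrative assumptions; in practice $m$ would be chosen according to Proposition 4, and the second stage would use the noise-aware $\ell_1$ program of Theorem 2 with $\eta = \tilde{E}\|\epsilon\|_\infty/\|Ax\|_2$ rather than plain equality-constrained IRLS.

```python
import numpy as np

def build_B(m):
    """The (6m-3) x m matrix B from Theorem 5."""
    om, nu = np.exp(2j * np.pi / (2 * m - 1)), np.exp(2j * np.pi / m)
    j = np.arange(1, 6 * m - 2)[:, None]   # rows j = 1, ..., 6m-3
    k = np.arange(m)[None, :]              # column exponents k-1 = 0, ..., m-1
    B = om ** (j * k)
    B = np.where((2 * m <= j) & (j <= 4 * m - 2),
                 om ** (j * k) - (om ** j * nu) ** k, B)
    B = np.where(4 * m - 1 <= j,
                 om ** (j * k) - 1j * (om ** j * nu) ** k, B)
    return B

s, N, m = 3, 64, 16                        # toy sizes, not from Proposition 4
rng = np.random.default_rng(2)
A = (rng.standard_normal((m, N)) + 1j * rng.standard_normal((m, N))) / np.sqrt(2 * m)
x = np.zeros(N, dtype=complex)
x[rng.choice(N, s, replace=False)] = rng.standard_normal(s) + 1j * rng.standard_normal(s)

b = np.abs(build_B(m) @ (A @ x)) ** 2      # measurements |BAx|^2, noiseless here
y = noisy_phase_retrieval(b, m)            # stage one: y ~ c0 * Ax
x_hat = irls_l1(A, y, s=s)                 # stage two: sparse recovery
gamma = np.vdot(x_hat, x) / abs(np.vdot(x_hat, x))
print(np.linalg.norm(x - gamma * x_hat) / np.linalg.norm(x))  # small up to phase
```

Note that stage one only determines $Ax$ up to the unimodular factor $c_0$, which is why the comparison at the end aligns the recovered vector with $x$ before measuring the error, exactly as in the bounds of Theorem 5.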
Acknowledgment. This paper was supported in part by NSF grant DMS-1412524.

REFERENCES

[1] Boris Alexeev, Afonso S. Bandeira, Matthew Fickus, and Dustin G. Mixon, Phase retrieval with polarization, SIAM J. Imaging Sci. 7 (2014), no. 1, 35–66.
[2] Radu Balan, Bernhard G. Bodmann, Peter G. Casazza, and Dan Edidin, Painless reconstruction from magnitudes of frame coefficients, J. Fourier Anal. Appl. 15 (August 2009), no. 4, 488–501.
[3] Radu Balan and Yang Wang, Invertibility and robustness of phaseless reconstruction, Appl. Comput. Harmon. Anal. 38 (May 2015), no. 3, 469–488.
[4] Afonso S. Bandeira and Dustin G. Mixon, Near-optimal phase retrieval of sparse vectors, Proceedings of SPIE, 2013.
[5] S. Barel, O. Cohen, Y. C. Eldar, D. G. Mixon, and P. Sidorenko, Sparse phase retrieval from short-time Fourier measurements, IEEE Signal Processing Letters 22 (2015), no. 5, 638–642.
[6] Bernhard G. Bodmann and Nathaniel Hammen, Algorithms and error bounds for noisy phase retrieval with low-redundancy frames, preprint (December 2014), available at arXiv:1412.6678.
[7] Ingrid Daubechies, Ronald DeVore, Massimo Fornasier, and C. Sinan Güntürk, Iteratively re-weighted least squares minimization for sparse recovery, Comm. Pure Appl. Math. 63 (2010), 1–38.
[8] Roy Dong, Henrik Ohlsson, Shankar Sastry, and Allen Yang, Compressive phase retrieval from squared output measurements via semidefinite programming, 16th IFAC Symposium on System Identification (SYSID 2012), July 2012.
[9] Simon Foucart and Holger Rauhut, A Mathematical Introduction to Compressive Sensing, Springer, 2013.
[10] David Gross, Felix Krahmer, and Richard Kueng, A partial derandomization of PhaseLift using spherical designs, J. Fourier Anal. Appl. 21 (2015), 229–266.
[11] Babak Hassibi, Kishore Jaganathan, and Samet Oymak, Recovery of sparse 1-D signals from the magnitudes of their Fourier transform, 2012 IEEE International Symposium on Information Theory Proceedings (ISIT), July 2012.
[12] Mark Iwen, Aditya Viswanathan, and Yang Wang, Robust sparse phase retrieval made easy, preprint (October 2014), available at arXiv:1410.5295.
[13] Monson H. Hayes, Jae S. Lim, and Alan V. Oppenheim, Signal reconstruction from phase or magnitude, IEEE Trans. Acoust., Speech, Signal Process. 28 (December 1980), no. 6, 672–680.
[14] Xiaodong Li and Vladislav Voroninski, Sparse signal recovery from quadratic measurements via convex programming, SIAM J. Math. Anal. 45 (2013), no. 5, 3019–3033.
[15] Arthur L. Patterson, A direct method for the determination of the components of interatomic distances in crystals, Zeitschrift für Kristallographie 90 (1935), 517–542.
[16] Volker Pohl, Fanny Yang, and Holger Boche, Phaseless signal recovery in infinite dimensional spaces using structured modulations, J. Fourier Anal. Appl. 20 (December 2014), 1212–1233.
[17] Lawrence Rabiner and Biing-Hwang Juang, Fundamentals of Speech Recognition, Prentice Hall, 1993.
[18] Vladislav Voroninski and Zhiqiang Xu, A strong restricted isometry property, with an application to phaseless compressed sensing, preprint (April 2014), available at arXiv:1404.3811.
[19] Adriaan Walther, The question of phase retrieval in optics, Journal of Modern Optics 10 (1963), no. 1, 41–49.
[20] Fanny Yang, Volker Pohl, and Holger Boche, Phase retrieval via structured modulations in Paley-Wiener spaces, Proc. 10th Intern. Conf. on Sampling Theory and Applications (SampTA), July 2013.