Convex recovery from interferometric measurements

Laurent Demanet and Vincent Jugnon∗

July 2013

Abstract

This note formulates a deterministic recovery result for vectors $x$ from quadratic measurements of the form $(Ax)_i \overline{(Ax)_j}$ for some left-invertible $A$. Recovery is exact, or stable in the noisy case, when the couples $(i, j)$ are chosen as edges of a well-connected graph. One possible way of obtaining the solution is as a feasible point of a simple semidefinite program. Furthermore, we show how the proportionality constant in the error estimate depends on the spectral gap of a data-weighted graph Laplacian. Such quadratic measurements have found applications in phase retrieval, angular synchronization, and more recently interferometric waveform inversion.

Acknowledgments. The authors would like to thank Amit Singer for interesting discussions.

1 Introduction

The goal of this note is to formulate an analogue, for certain quadratic systems, of the well-known relative error bound

\[ \frac{\|x - x_0\|}{\|x_0\|} \le \kappa(A)\, \frac{\|e\|}{\|b\|} \tag{1} \]

for the least-squares solution of the overdetermined linear system $Ax = b$ with $b = Ax_0 + e$, and where $\kappa(A)$ is the condition number of $A$. We consider complex quadratic measurements of $x \in \mathbb{C}^n$ of the form

\[ B_{ij} = (Ax)_i \overline{(Ax)_j}, \qquad (i, j) \in E, \tag{2} \]

for certain well-chosen couples of indices $(i, j)$, a scenario that we qualify as "interferometric". This combination is special in that it is symmetric in $x$, and of rank 1 with respect to the indices $i$ and $j$, hence the problem can be thought of as rank-1 symmetric matrix completion.

The regime that interests us is when the number $m$ of measurements, i.e., of couples $(i, j)$ in $E$, is comparable to the number $n$ of unknowns. While phaseless measurements $b_i = |(Ax)_i|^2$ only admit recovery when $A$ has very special structure (such as being a tall random matrix with Gaussian i.i.d. entries [5, 9]), products $(Ax)_i \overline{(Ax)_j}$ for $i \ne j$ correspond to the idea of phase differences, hence encode much more information.
As a consequence, stable recovery occurs under very general conditions: left-invertibility of $A$ and "connectedness" of the set $E$ of couples $(i, j)$. These conditions suffice to allow $m$ to be on the order of $n$. Various algorithms return $x$ accurately up to a global phase; we mostly discuss variants of lifting with semidefinite relaxation in this paper. In contrast to other recovery results in matrix completion [18, 6], no randomness is needed in the data model, and our proof technique involves elementary spectral graph theory rather than dual certification or uncertainty principles.

Our result is that an inequality of the form (1) still holds, but with the square root of the spectral gap of a data-weighted graph Laplacian in place of $\|b\|$ in the right-hand side. This spectral gap quantifies the connectedness of $E$, and has the proper homogeneity with respect to $b$.

∗ Department of Mathematics, MIT. VJ is supported by the Earth Resources Laboratory at MIT. This work was supported by AFOSR, ONR, NSF, Total SA, and the Alfred P. Sloan Foundation.

1.1 Interferometry

In optical imaging, an interference fringe of two (possibly complex-valued) wavefields $f(t)$ and $g(t)$, where $t$ is either a time or a space variable, is any combination of the form $|f(t) + g(t + t_0)|^2$. The sum is a result of the linearity of amplitudes in the fundamental equations of physics (such as Maxwell or Schrödinger), while the modulus squared is simply the result of a detector measuring intensities. The cross term $2\,\Re\big(f(t)\overline{g(t + t_0)}\big)$ in the expansion of the square manifestly carries the information of destructive vs. constructive interference, hence is a continuous version of what we referred to earlier as an "interferometric measurement". In particular, when the two signals are sinusoidal at the same frequency, the interferometric combination highlights a phase difference.
In astronomical interferometry, the delay $t_0$ is for instance chosen so that the two signals interfere constructively, yielding better resolution. Interferometric synthetic aperture radar (InSAR) is a remote sensing technique that uses the fringe from two datasets taken at different times to infer small displacements. In X-ray ptychography [19], imaging is done by undoing the interferometric combinations that the diffracted X-rays undergo from encoding masks. These are but three examples in a long list of applications.

Interferometry is also playing an increasingly important role in geophysical imaging, i.e., inversion of the elastic parameters of the portions of the Earth's upper crust from scattered seismic waves. In this context however, the signals are often impulsive rather than monochromatic. As a result, it is more common to perform quadratic combinations of the Fourier transform of seismograms at different receivers, such as $|\hat f(\omega) + \hat g(\omega)|^2$. The cross-term involves $\hat f(\omega)\overline{\hat g(\omega)}$, the Fourier transform of the cross-correlation of $f$ and $g$. It highlights a time lag in the case when $f$ and $g$ are impulses.

Cross-correlations have been shown to play an important role in imaging, mostly because of their stability to statistical fluctuations of a scattering medium [3] or an incoherent source [16, 11]. Though seismic interferometry is a vast research area [4, 22, 27, 21], explicit inversion of reflectivity parameters from interferometric data has to our knowledge only been considered in [10, 14]. Interferometric inversion offers great promise for model-robust imaging, i.e., recovery of reflectivity maps in a way that is less sensitive to specific kinds of errors in the forward model.

Finally, interferometric measurements also play an important role in quantum optical imaging. See [20] for a nice solution to the inverse problem of recovering a scattering dielectric susceptibility from measurements of two-point correlation functions relative to two-photon entangled states.
1.2 Broader context and related work

The setting of this paper is discrete, hence we let $i$ and $j$ stand in place of either a time or a frequency variable. We also specialize to $f = g$, and we let $f = Ax$ to possibly allow an explanation of the signal $f$ by a linear forward model$^1$ $A$.

$^1$ Such as scattering from a reflectivity profile $x$ in the Born approximation, for which $A$ is a wave equation Green's function.

The link between products of the form $f \overline{g}$ and squared measurements $|f + g|^2$ goes both ways, as shown by the polarization identity

\[ \overline{f_i}\, f_j = \frac{1}{4} \sum_{k=1}^{4} e^{-i\pi k/2}\, |f_i + e^{i\pi k/2} f_j|^2 , \]

whose complex conjugate gives $f_i \overline{f_j}$. Hence any result of robust recovery of $f$, or $x$, from couples $f_i \overline{f_j}$ implies the same result for recovery from phaseless measurements of the form $|f_i + e^{i\pi k/2} f_j|^2$. This latter setting was precisely considered by Candès et al. in [5], where an application to X-ray diffraction imaging with a specific choice of masks is discussed. In [1], Alexeev et al. use the same polarization identity to design good measurements for phase retrieval, such that recovery is possible with $m = O(n)$.

Recovery of $f_i$ from $f_i \overline{f_j}$ for some $(i, j)$ when $|f_i| = 1$ (interferometric phase retrieval) can be seen as a special case of the problem of angular synchronization considered by Singer [23]. There, rotation matrices $R_i$ are to be recovered (up to a global rotation) from measurements of relative rotations $R_i R_j^{-1}$ for some $(i, j)$. This problem has an important application to cryo-electron microscopy, where the measurements of relative rotations are further corrupted in an a priori unknown fashion (i.e., the set $E$ is to be recovered as well). An impressive recovery result under a Bernoulli model of gross corruption, with a characterization of the critical probability, was recently obtained by Wang and Singer [25]. The spectrum of an adapted graph Laplacian plays an important role in their analysis [2], much as it does in this paper.
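Returning to the polarization identity stated earlier in this section, the following is a minimal numerical sketch (not part of the original note) that evaluates the four-term sum for one arbitrary pair of complex numbers. The function name `polarization_sum` and the sample values are illustrative assumptions. With the exponents as written ($e^{-i\pi k/2}$ outside the modulus, $e^{i\pi k/2}$ inside), the sum reproduces $\overline{f_i} f_j$, and its complex conjugate gives $f_i \overline{f_j}$.

```python
import cmath


def polarization_sum(fi: complex, fj: complex) -> complex:
    """Evaluate (1/4) * sum_{k=1..4} e^{-i pi k/2} |fi + e^{i pi k/2} fj|^2."""
    total = 0j
    for k in range(1, 5):
        phase = cmath.exp(1j * cmath.pi * k / 2)
        # Each term uses only a phaseless (intensity) measurement of fi + phase * fj.
        total += phase.conjugate() * abs(fi + phase * fj) ** 2
    return total / 4


# Arbitrary illustrative values (assumptions, not data from the note).
fi, fj = 0.3 - 1.2j, -0.7 + 0.4j
recovered = polarization_sum(fi, fj)
expected = fi.conjugate() * fj
```

Only the four intensities enter the sum, which is why a stability result for products transfers to this particular family of phaseless measurements.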
Singer and Cucuringu also considered the angular synchronization problem from the viewpoint of rigidity theory [24]. For the similar problem of recovery of positions from relative distances, with applications to sensor network localization, see for instance [13].

The algorithmic approach considered in this paper for solving interferometric inversion problems is to formulate them via lifting and semidefinite relaxation. This idea was considered by many groups in recent years [7, 5, 13, 23, 26], and finds its origin in theoretical computer science [12].

1.3 Recovery of unknown phases

Let us start by describing the simpler problem of interferometric phase recovery, when $A = I$ and we furthermore assume $|x_i| = 1$. Given a vector $x_0 \in \mathbb{C}^n$ such that $|(x_0)_i| = 1$, a set $E$ of pairs $(i, j)$, and noisy interferometric data $B_{ij} = (x_0)_i \overline{(x_0)_j} + \varepsilon_{ij}$, find a vector $x$ such that

\[ |x_i| = 1, \qquad \sum_{(i,j) \in E} |x_i \overline{x_j} - B_{ij}| \le \sigma, \tag{3} \]

for some $\sigma > 0$. Here and below, if no heuristic is provided for $\sigma$, we may cast the problem as a minimization problem for the misfit and obtain $\sigma$ a posteriori. The choice of the elementwise $\ell_1$ norm over $E$ is arbitrary, but convenient for the analysis in the sequel$^2$. We aim to find situations in which this problem has a solution $x$ close to $x_0$, up to a global phase. Notice that $x_0$ is feasible for (3), hence a solution exists, as soon as $\sigma \ge \sum_{(i,j) \in E} |\varepsilon_{ij}|$.

$^2$ The choice of $\ell_1$ norm as "least unsquared deviation" is central in [25] for the type of outlier-robust recovery behavior documented there.

The relaxation by lifting of this problem is to find $X$ (a proxy for $xx^*$) such that

\[ X_{ii} = 1, \qquad \sum_{(i,j) \in E} |X_{ij} - B_{ij}| \le \sigma, \qquad X \succeq 0, \tag{4} \]

then let $x$ be the top eigenvector of $X$ with $\|x\|^2 = n$.

The notation $X \succeq 0$ means that $X$ is symmetric and positive semi-definite. Again, the feasibility problem (4) has at least one solution ($X_0 = x_0 x_0^*$) as soon as $\sigma \ge \sum_{(i,j) \in E} |\varepsilon_{ij}|$.

The set $E$ generates edges of a graph $G = (V, E)$, where the nodes in $V$ are indexed by $i$.
Without loss of generality, we consider $E$ to be symmetric. By convention, $G$ does not contain loops, i.e., the diagonal $j = i$ is not part of $E$. (Measurements on the diagonal are not informative for the phase recovery problem, since $|(x_0)_i|^2 = 1$.)

The graph Laplacian on $G$ is

\[ L_{ij} = \begin{cases} d_i & \text{if } i = j; \\ -1 & \text{if } (i, j) \in E; \\ 0 & \text{otherwise,} \end{cases} \]

where $d_i$ is the node degree $d_i = \sum_{j : (i,j) \in E} 1$. Observe that $L$ is symmetric and $L \succeq 0$ by Gershgorin. Denote by $\lambda_1 \le \lambda_2 \le \ldots \le \lambda_n$ the eigenvalues of $L$ sorted in increasing order. Then $\lambda_1 = 0$ with the constant eigenvector $v_1$, whose entries all equal $1/\sqrt{n}$. The second eigenvalue is zero if and only if $G$ has two or more disconnected components. When $\lambda_2 > 0$, its value is a measure of connectedness of the graph. Note that $\lambda_n \le 2d$ by Gershgorin again, where $d = \max_i d_i$ is the maximum degree.

Since $\lambda_1 = 0$, the second eigenvalue $\lambda_2$ is called the spectral gap. It is a central quantity in the study of expander graphs: it relates to

• the edge expansion (Cheeger constant, large if $\lambda_2$ is large);
• the degree of separation between any two nodes (small if $\lambda_2$ is large); and
• the speed of mixing of a random walk on the graph (fast if $\lambda_2$ is large).

More information about spectral graph theory can be found, e.g., in the lecture notes by Lovasz [17]. It is easy to show with interlacing theorems that adding an edge to $E$, or removing a node from $V$, both increase $\lambda_2$. The spectral gap plays an important role in the following stability result. In the sequel, we denote the componentwise $\ell_1$ norm on the set $E$ by $\| \cdot \|_1$.

Theorem 1. Assume $\|\varepsilon\|_1 + \sigma \le n \lambda_2$, where $\lambda_2$ is the second eigenvalue of the graph Laplacian $L$ on $G$. Any solution $x$ of (3) or (4) obeys

\[ \|x - e^{i\alpha} x_0\| \le 4 \sqrt{\frac{\|\varepsilon\|_1 + \sigma}{\lambda_2}}, \]

for some $\alpha \in [0, 2\pi)$.

Manifestly, recovery is exact (up to the global phase ambiguity) as soon as $\varepsilon = 0$ and $\sigma = 0$, provided $\lambda_2 \ne 0$, i.e., the graph $G$ is connected.
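As an illustration of how the spectral gap enters theorem 1, the sketch below builds the graph Laplacian for two small graphs and evaluates the right-hand side of the bound. It is not taken from the original note: it assumes NumPy is available, and the helper names (`graph_laplacian`, `spectral_gap`, `theorem1_bound`) as well as the choice of a path and a cycle on 8 nodes are illustrative assumptions.

```python
import numpy as np


def graph_laplacian(n, edges):
    """Unweighted graph Laplacian: node degree on the diagonal, -1 on each edge."""
    L = np.zeros((n, n))
    for i, j in edges:
        if i == j:
            continue  # loops are excluded by convention
        L[i, j] = L[j, i] = -1.0
    # Off-diagonal row sums equal minus the degree, so negate them for the diagonal.
    np.fill_diagonal(L, -L.sum(axis=1))
    return L


def spectral_gap(L):
    """Second smallest eigenvalue lambda_2 of a symmetric Laplacian."""
    return np.linalg.eigvalsh(L)[1]


def theorem1_bound(eps_l1, sigma, lam2):
    """Right-hand side of theorem 1: 4 * sqrt((||eps||_1 + sigma) / lambda_2)."""
    return 4.0 * np.sqrt((eps_l1 + sigma) / lam2)


n = 8
path = [(i, i + 1) for i in range(n - 1)]   # path graph P_n
cycle = path + [(n - 1, 0)]                 # one extra edge closes the path into a cycle
gap_path = spectral_gap(graph_laplacian(n, path))
gap_cycle = spectral_gap(graph_laplacian(n, cycle))
bound_path = theorem1_bound(0.0, 1e-4, gap_path)
bound_cycle = theorem1_bound(0.0, 1e-4, gap_cycle)
```

For the path graph the gap has the closed form $2 - 2\cos(\pi/n)$, so it shrinks like $1/n^2$; adding a single edge to close the cycle already enlarges the gap and tightens the bound for the same misfit level.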
The easiest way to construct expander graphs (graphs with large $\lambda_2$) is to set up a probabilistic model with a Bernoulli distribution for each edge in an i.i.d. fashion, a model known as the Erdős-Rényi random graph. It can be shown that such graphs have a spectral gap bounded away from zero independently of $n$ with $m = O(n \log n)$ edges.

A stronger result is available when the noise $\varepsilon$ originates at the level of $x_0$, i.e., $B = x_0 x_0^* + \varepsilon$ has the form $(x_0 + e)(x_0 + e)^*$.

Corollary 2. Assume $\varepsilon = (x_0 + e)(x_0 + e)^* - x_0 x_0^*$ and $\sigma \le n \lambda_2$, where $\lambda_2$ is the second eigenvalue of the graph Laplacian $L$ on $G$. Any solution $x$ of (3) or (4) obeys

\[ \|x - e^{i\alpha} x_0\| \le 4 \sqrt{\frac{\sigma}{\lambda_2}} + \|e\|, \]

for some $\alpha \in [0, 2\pi)$.

Proof. Apply theorem 1 with $\varepsilon = 0$, $x_0 + e$ in place of $x_0$, then use the triangle inequality.

In the setting of the corollary, problem (3) always has $x = x_0 + e$ as a solution, hence is feasible even when $\sigma = 0$.

Let us briefly review the eigenvector method for interferometric recovery. In [23], Singer proposed to use the first eigenvector of the (noisy) data-weighted graph Laplacian as an estimator of the vector of phases. A similar idea appears in the work of Montanari et al. as the first step of their OptSpace algorithm [15], and in the work of Chatterjee on universal thresholding [8]. In our setting, this means defining

\[ \tilde{L}_{ij} = \begin{cases} d_i & \text{if } i = j; \\ -B_{ij} & \text{if } (i, j) \in E; \\ 0 & \text{otherwise,} \end{cases} \]

and letting $x = \tilde{v}_1 \sqrt{n}$ where $\tilde{v}_1$ is the unit-norm eigenvector of $\tilde{L}$ with smallest eigenvalue. Denote by $\tilde\lambda_1 \le \tilde\lambda_2 \le \ldots$ the eigenvalues of $\tilde{L}$. The following result is known from [2], but we provide an elementary proof for completeness.

Theorem 3. Assume $\|\varepsilon\| \le \tilde\lambda_2 / 2$. Then the result $x$ of the eigenvector method obeys

\[ \|x - e^{i\alpha} x_0\| \le \sqrt{2n}\, \frac{\|\varepsilon\|}{\tilde\lambda_2}, \]

for some $\alpha \in [0, 2\pi)$.

Alternatively, we may express the inequality in terms of $\lambda_2$, the spectral gap of the noise-free Laplacian $L$ defined earlier, by noticing$^3$ that $\tilde\lambda_2 \ge \lambda_2 - \|\varepsilon\|$. Both $\lambda_2$ and $\tilde\lambda_2$ are computationally accessible. In the case when $|B_{ij}| = 1$, we have $\tilde\lambda_1 \ge 0$, hence $\tilde\lambda_2$ is (slightly) greater than the spectral gap $\tilde\lambda_2 - \tilde\lambda_1$ of $\tilde{L}$.

$^3$ This owes to $\|\mathcal{L} - \tilde{L}\| \le \|\varepsilon\|$, with $\mathcal{L} = \Lambda L \Lambda^*$ the noise-free Laplacian with phases introduced at the beginning of section 2.1.

Note that the $1/\tilde\lambda_2$ scaling appears to be sharp in view of the numerical experiments reported in section 3. The inverse square root scaling of theorem 1 is stronger in the presence of small spectral gaps, but the noise scaling is weaker in theorem 1 than in theorem 3.

1.4 Interferometric recovery

The more general version of the interferometric recovery problem is to consider a left-invertible tall matrix $A$, linear measurements $b = Ax_0$ for some vector $x_0$ (without condition on the modulus of either $b_i$ or $(x_0)_i$), noisy interferometric measurements $B_{ij} = b_i \overline{b_j} + \varepsilon_{ij}$ for $(i, j)$ in some set $E$, and find $x$ subject to

\[ \sum_{(i,j) \in E \cup D} |(Ax)_i \overline{(Ax)_j} - B_{ij}| \le \sigma. \tag{5} \]

Notice that we now take the union of the diagonal $D = \{(i, i)\}$ with $E$. Without loss of generality we assume that $\varepsilon_{ij} = \overline{\varepsilon_{ji}}$, which can be achieved by symmetrizing the measurements.

Since we no longer have a unit-modulus condition, the relevant notion of graph Laplacian is now data-dependent. It reads

\[ \big(L_{|b|}\big)_{ij} = \begin{cases} \sum_{k : (i,k) \in E} |b_k|^2 & \text{if } i = j; \\ -|b_i|\,|b_j| & \text{if } (i, j) \in E; \\ 0 & \text{otherwise.} \end{cases} \]

The connectedness properties of the underlying graph now depend on the size of $|b_i|$: the edge $(i, j)$ carries valuable information if and only if both $|b_i|$ and $|b_j|$ are large.

A few different recovery formulations arise naturally in the context of lifting and semidefinite relaxation.

• The basic lifted formulation is to find some $X$ such that

\[ \sum_{(i,j) \in E \cup D} |(A X A^*)_{ij} - B_{ij}| \le \sigma, \qquad X \succeq 0, \tag{6} \]

then let $x = x_1 \sqrt{\eta_1}$, where $(\eta_1, x_1)$ is the top eigen-pair of $X$.

Our main result is as follows.

Theorem 4. Assume $\|\varepsilon\|_1 + \sigma \le \lambda_2 / 2$, where $\lambda_2$ is the second eigenvalue of the data-weighted graph Laplacian $L_{|b|}$.
Any solution $x$ of (6) obeys

\[ \frac{\|x - e^{i\alpha} x_0\|}{\|x_0\|} \le 15\, \kappa(A)^2 \sqrt{\frac{\|\varepsilon\|_1 + \sigma}{\lambda_2}}, \]

for some $\alpha \in [0, 2\pi)$, and where $\kappa(A)$ is the condition number of $A$.

The quadratic dependence on $\kappa(A)$ is necessary$^4$. In section 3, we numerically verify the inverse square root scaling in terms of $\lambda_2$. The numerical experiments also indicate that the noise scaling is not in general tight; we do not know whether this is a consequence of the choice of regularization to pick a solution in the feasibility set or not.

$^4$ The following example shows why that is the case. For any $X_0$ and invertible $A$, the solution to $A X A^* = A X_0 A^* + \varepsilon$ is $X = X_0 + A^+ \varepsilon (A^*)^+$. Let $X_0 = e_1 e_1^T$, $\varepsilon = \delta e_1 e_1^*$ for some small $\delta$, and $A^+ = I + N e_1 e_1^T$. Then $X = (1 + \delta N^2) e_1 e_1^T$, and the square root of its leading eigenvalue is $\sqrt{\eta_1} \simeq 1 + \frac{1}{2} \delta N^2$. As a result, $x$ is a perturbation of $x_0$ by a quantity of magnitude $O(\delta \|A^+\|^2)$.

If the noise originates from $b + e$ rather than $b b^* + \varepsilon$, the error bound is again improved to

\[ \frac{\|x - e^{i\alpha} x_0\|}{\|x_0\|} \le 15\, \kappa(A)^2 \sqrt{\frac{\sigma}{\lambda_2}} + \kappa(A)\, \frac{\|e\|}{\|b\|}, \]

for the same reason as earlier.

• An alternative, two-step lifting formulation is to find $x$ through $Y$ such that

\[ \sum_{(i,j) \in E \cup D} |Y_{ij} - B_{ij}| \le \sigma, \qquad Y \succeq 0, \tag{7} \]

then let $x = A^+ y_1 \sqrt{\eta_1}$, where $(\eta_1, y_1)$ is the top eigen-pair of $Y$.

The dependence on the condition number of $A$ is more favorable than for the basic lifting formulation.

Theorem 5. Assume $\|\varepsilon\|_1 + \sigma \le \lambda_2 / 2$, where $\lambda_2$ is the second eigenvalue of the data-weighted graph Laplacian $L_{|b|}$. Any solution $x$ of (5) or (7) obeys

\[ \frac{\|x - e^{i\alpha} x_0\|}{\|x_0\|} \le 15\, \kappa(A) \sqrt{\frac{\|\varepsilon\|_1 + \sigma}{\lambda_2}}, \]

for some $\alpha \in [0, 2\pi)$.

The quantity $\lambda_2$ is not computationally accessible in general, but it can be related to the second eigenvalue $\tilde\lambda_2$ of the noisy data-weighted Laplacian,

\[ \big(\tilde{L}_B\big)_{ij} = \begin{cases} \sum_{k : (i,k) \in E} B_{kk} & \text{if } i = j; \\ -B_{ij} & \text{if } (i, j) \in E; \\ 0 & \text{otherwise.} \end{cases} \]

It is straightforward to show that $\lambda_2 \ge \tilde\lambda_2 - [\, (d + 1) \|\varepsilon\|_\infty + \|\varepsilon\| \,]$, where $\| \cdot \|_\infty$ is the elementwise maximum on $E \cup D$, $\| \cdot \|$ is the spectral norm, and $d$ is the maximum node degree.
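The relation between the noise-free and noisy data-weighted Laplacians can be explored numerically. The sketch below is an illustration under stated assumptions, not the authors' code: it assumes NumPy, a small hand-picked edge set, Hermitian noise, and a hypothetical helper `weighted_laplacian`. Rather than the explicit constant $(d + 1)\|\varepsilon\|_\infty + \|\varepsilon\|$, it checks the underlying Weyl inequality $\lambda_2 \ge \tilde\lambda_2 - \|\tilde{L}_B - \mathcal{L}_b\|$, where $\mathcal{L}_b$ is the noise-free Laplacian carrying the phases of $b$ (unitarily similar to $L_{|b|}$, hence with the same eigenvalues).

```python
import numpy as np


def weighted_laplacian(diag_weights, offdiag, edges):
    """Hermitian Laplacian: sum of neighbor weights on the diagonal, -offdiag[i, j] on edges.

    `edges` lists each undirected pair (i, j) once; both (i, j) and (j, i) are filled.
    """
    n = len(diag_weights)
    L = np.zeros((n, n), dtype=complex)
    for i, j in edges:
        L[i, j] = -offdiag[i, j]
        L[j, i] = -offdiag[j, i]
        L[i, i] += diag_weights[j]
        L[j, j] += diag_weights[i]
    return L


rng = np.random.default_rng(0)
n = 6
edges = [(i, i + 1) for i in range(n - 1)] + [(0, 3), (2, 5)]  # illustrative graph
b = rng.standard_normal(n) + 1j * rng.standard_normal(n)       # plays the role of A x_0
noise = 1e-3 * (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))
eps = (noise + noise.conj().T) / 2                             # symmetrized (Hermitian) noise
B = np.outer(b, b.conj()) + eps                                # B_ij = b_i conj(b_j) + eps_ij

mod = np.abs(b)
L_abs = weighted_laplacian(mod ** 2, np.outer(mod, mod), edges)        # L_{|b|}
L_phase = weighted_laplacian(mod ** 2, np.outer(b, b.conj()), edges)   # Laplacian with phases
L_noisy = weighted_laplacian(np.diag(B).real, B, edges)                # noisy Laplacian

lam2 = np.linalg.eigvalsh(L_abs)[1]
lam2_phase = np.linalg.eigvalsh(L_phase)[1]
lam2_noisy = np.linalg.eigvalsh(L_noisy)[1]
perturbation = np.linalg.norm(L_noisy - L_phase, 2)  # spectral norm of the difference
```

In this sketch `lam2_noisy - perturbation` is a computable lower bound on `lam2`, which is the mechanism behind the statement that $\lambda_2$ can be controlled from noisy data.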
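2 Proofs

2.1 Proof of theorem 1.

Observe that if $x$ is feasible for (3), then $xx^*$ is feasible for (4), and has $e^{i\alpha} x$ as leading eigenvector. Hence we focus without loss of generality on (4).

As in [23], consider the Laplacian matrix weighted with the unknown phases,

\[ \mathcal{L} = \Lambda L \Lambda^*, \]

with $\Lambda = \mathrm{diag}(x_0)$. In other words $\mathcal{L}_{ij} = (X_0)_{ij} L_{ij}$ with $X_0 = x_0 x_0^*$. We still have $\mathcal{L} \succeq 0$ and $\lambda_1 = 0$, but now $v_1 = \frac{1}{\sqrt{n}} x_0$. Here and below, $\lambda$ and $v$ refer to $\mathcal{L}$, and $v$ has unit $\ell_2$ norm.

The idea of the proof is to compare $X$ with the rank-1 spectral projectors $v_j v_j^*$ of $\mathcal{L}$. Let $\langle A, B \rangle = \mathrm{tr}(A B^*)$ be the Frobenius inner product. Any $X$ obeying (4) can be written as $X = X_0 + \tilde\varepsilon$ with $\|\tilde\varepsilon\|_1 \le \|\varepsilon\|_1 + \sigma$. We have

\[ \langle X, \mathcal{L} \rangle = \langle X_0, \mathcal{L} \rangle + \langle \tilde\varepsilon, \mathcal{L} \rangle . \]

A short computation, using $\mathcal{L}_{ii} = d_i$ and $\mathcal{L}_{ij} = -(x_0)_i \overline{(x_0)_j}$ on $E$, shows that

\[
\begin{aligned}
\langle X_0, \mathcal{L} \rangle &= \sum_i (X_0)_{ii}\, \overline{\mathcal{L}_{ii}} + \sum_{(i,j) \in E} (X_0)_{ij}\, \overline{\mathcal{L}_{ij}} \\
&= \sum_i d_i - \sum_{(i,j) \in E} (x_0)_i \overline{(x_0)_j}\; \overline{(x_0)_i} (x_0)_j \\
&= \sum_i \Big[ d_i - \sum_{j : (i,j) \in E} 1 \Big] \\
&= 0 .
\end{aligned}
\]

Since $|\mathcal{L}_{ij}| = 1$ on $E$, the error term is simply bounded as

\[ |\langle \tilde\varepsilon, \mathcal{L} \rangle| \le \|\tilde\varepsilon\|_1 . \]

On the other hand the Laplacian expands as

\[ \mathcal{L} = \sum_j v_j \lambda_j v_j^* , \]

so we can introduce a convenient normalization factor $1/n$ and write

\[ \Big\langle \frac{X}{n}, \mathcal{L} \Big\rangle = \sum_j c_j \lambda_j , \tag{8} \]

with

\[ c_j = \Big\langle \frac{X}{n}, v_j v_j^* \Big\rangle = \frac{v_j^* X v_j}{n} . \]

Notice that $c_j \ge 0$ since we require $X \succeq 0$. Their sum is

\[ \sum_j c_j = \Big\langle \frac{X}{n}, \sum_j v_j v_j^* \Big\rangle = \Big\langle \frac{X}{n}, I \Big\rangle = \frac{\mathrm{tr}(X)}{n} = 1 . \]

Hence (8) is a convex combination of the eigenvalues of $\mathcal{L}$, bounded by $\|\tilde\varepsilon\|_1 / n$. The smaller this bound, the more lopsided the convex combination toward $\lambda_1$, i.e., the larger $c_1$. The following lemma makes this observation precise.

Lemma 1. Let $\mu = \sum_j c_j \lambda_j$ with $c_j \ge 0$, $\sum_j c_j = 1$, and $\lambda_1 = 0$. If $\mu \le \lambda_2$, then

\[ c_1 \ge 1 - \frac{\mu}{\lambda_2} . \]

Proof of lemma 1.

\[ \mu = \sum_{j \ge 2} c_j \lambda_j \ge \lambda_2 \sum_{j \ge 2} c_j = \lambda_2 (1 - c_1), \]

then isolate $c_1$.

Assuming $\|\tilde\varepsilon\|_1 \le n \lambda_2$, we now have

\[ \Big\langle \frac{X}{n}, v_1 v_1^* \Big\rangle \ge 1 - \frac{\|\tilde\varepsilon\|_1}{n \lambda_2} . \]

We can further bound

\[ \Big\| \frac{X}{n} - v_1 v_1^* \Big\|_F^2 = \mathrm{tr}\Big[ \Big( \frac{X}{n} - v_1 v_1^* \Big)^2 \Big] = \mathrm{tr}\big( (v_1 v_1^*)^2 \big) + \frac{\mathrm{tr}(X^2)}{n^2} - 2\, \mathrm{tr}\Big( \frac{X}{n} v_1 v_1^* \Big) . \]

The first term is 1. The second term is less than 1, since $\mathrm{tr}(X^2) \le \mathrm{tr}(X)^2$ for positive semidefinite matrices.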
Therefore,

\[ \Big\| \frac{X}{n} - v_1 v_1^* \Big\|_F^2 \le 2 - 2\, \mathrm{tr}\Big( \frac{X}{n} v_1 v_1^* \Big) \le \frac{2 \|\tilde\varepsilon\|_1}{n \lambda_2} . \]

We can now control the departure of the top eigenvector of $X/n$ from $v_1$ by the following lemma. It is analogous to the sin theta theorem of Davis-Kahan, except for the choice of normalization of the vectors. (It is also a generalization of a lemma used by one of us in [9] (section 4.2).) The proof is only given for completeness.

Lemma 2. Consider any Hermitian $X \in \mathbb{C}^{n \times n}$, and any $v \in \mathbb{C}^n$, such that

\[ \|X - vv^*\| < \frac{\|v\|^2}{2} . \]

Let $\eta_1$ be the leading eigenvalue of $X$, and $x_1$ the corresponding unit-norm eigenvector. Let $x$ be defined either as (a) $x_1 \|v\|$, or as (b) $x_1 \sqrt{\eta_1}$. Then

\[ \big\|\, x \|x\| - e^{i\alpha} v \|v\| \,\big\| \le 2 \sqrt{2}\, \|X - vv^*\| , \]

for some $\alpha \in [0, 2\pi)$.

Proof of Lemma 2. Let $\delta = \|X - vv^*\|$. Notice that $\|vv^*\| = \|v\|^2$. Decompose $X = \sum_{j=1}^n x_j \eta_j x_j^*$ with eigenvalues $\eta_j$ sorted in decreasing order. By perturbation theory for symmetric matrices (Weyl's inequalities),

\[ \max\{\, |\|v\|^2 - \eta_1|,\ |\eta_2|,\ \ldots,\ |\eta_n| \,\} \le \delta, \tag{9} \]

so it is clear that $\eta_1 > 0$, and that the eigenspace of $\eta_1$ is one-dimensional, as soon as $\delta < \frac{\|v\|^2}{2}$.

Let us deal first with the case (a) when $x = x_1 \|v\|$. Consider

\[ vv^* - xx^* = vv^* - X + Y , \]

where $Y = X - xx^*$, that is,

\[ Y = x_1 (\eta_1 - \|v\|^2) x_1^* + \sum_{j=2}^n x_j \eta_j x_j^* . \]

From (9), it is clear that $\|Y\| \le \delta$. Let $v_1 = v / \|v\|$. We get

\[ \|vv^* - xx^*\| \le \|vv^* - X\| + \|Y\| \le 2\delta . \]

Pick $\alpha$ so that $|v^* x| = e^{-i\alpha} v^* x$. Then

\[
\begin{aligned}
\big\|\, v \|v\| - e^{-i\alpha} x \|x\| \,\big\|^2 &= \|v\|^4 + \|x\|^4 - 2 \|v\| \|x\|\, \Re\big( e^{-i\alpha} v^* x \big) \\
&= \|v\|^4 + \|x\|^4 - 2 \|v\| \|x\|\, |v^* x| && \text{by definition of } \alpha \\
&\le \|v\|^4 + \|x\|^4 - 2 |v^* x|^2 && \text{by Cauchy-Schwarz} \\
&= \|vv^* - xx^*\|_F^2 \\
&\le 2 \|vv^* - xx^*\|^2 && \text{because } vv^* - xx^* \text{ has rank 2} \\
&\le 8 \delta^2 .
\end{aligned}
\]

The case (b) when $x = x_1 \sqrt{\eta_1}$ is treated analogously. The only difference is now that

\[ Y = \sum_{j=2}^n x_j \eta_j x_j^* . \]

A fortiori, $\|Y\| \le \delta$ as well.

Part (a) of lemma 2 is applied with $X/n$ in place of $X$, and $v_1$ in place of $v$. In that case, $\|v_1\| = 1$. We conclude the proof by noticing that $v_1 = \frac{x_0}{\sqrt{n}}$, and that the output $x$ of the lifting method is normalized so that $x_1 = \frac{x}{\sqrt{n}}$.
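Lemma 2 can be sanity-checked numerically on a random instance. The sketch below is illustrative only and assumes NumPy; the helper name `lemma2_sides`, the dimension, and the perturbation size (chosen so that $\|X - vv^*\| = 0.1 \|v\|^2$, which satisfies the hypothesis) are assumptions. The global phase is chosen to align $x$ with $v$, which is the choice made in the proof.

```python
import numpy as np


def lemma2_sides(X, v, variant="a"):
    """Return (lhs, rhs) of the inequality in lemma 2 for Hermitian X and vector v."""
    eta, vecs = np.linalg.eigh(X)            # eigenvalues in ascending order
    eta1, x1 = eta[-1], vecs[:, -1]          # leading eigen-pair
    scale = np.linalg.norm(v) if variant == "a" else np.sqrt(eta1)
    x = scale * x1
    inner = np.vdot(v, x)                    # v^* x (vdot conjugates its first argument)
    phase = inner / abs(inner)               # e^{i alpha} that best aligns v with x
    lhs = np.linalg.norm(x * np.linalg.norm(x) - phase * v * np.linalg.norm(v))
    rhs = 2 * np.sqrt(2) * np.linalg.norm(X - np.outer(v, v.conj()), 2)
    return lhs, rhs


rng = np.random.default_rng(0)
n = 5
v = rng.standard_normal(n) + 1j * rng.standard_normal(n)
G = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
H = (G + G.conj().T) / 2                                    # Hermitian perturbation
H *= 0.1 * np.linalg.norm(v) ** 2 / np.linalg.norm(H, 2)    # spectral norm 0.1 * ||v||^2
X = np.outer(v, v.conj()) + H

lhs_a, rhs_a = lemma2_sides(X, v, "a")   # normalization (a): x = x_1 ||v||
lhs_b, rhs_b = lemma2_sides(X, v, "b")   # normalization (b): x = x_1 sqrt(eta_1)
```

Both normalizations share the same right-hand side; only the scaling of the leading eigenvector differs.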
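2.2 Proof of theorem 3

The proof is a simple argument of perturbation of eigenvectors. We either assume $\varepsilon_{ij} = \overline{\varepsilon_{ji}}$ or enforce it by symmetrizing the measurements. Define $\mathcal{L}$ as previously, and notice that $\|\mathcal{L} - \tilde{L}\| \le \|\varepsilon\|$. Consider the eigen-decompositions

\[ \mathcal{L} v_j = \lambda_j v_j , \qquad \tilde{L} \tilde{v}_j = \tilde\lambda_j \tilde{v}_j , \]

with $\lambda_1 = 0$. Form

\[ \tilde{L} v_j = \lambda_j v_j + r_j , \]

with $\|r_j\| \le \|\varepsilon\|$. Take the dot product of the equation above with $\tilde{v}_k$ to obtain

\[ \langle \tilde{v}_k, r_j \rangle = (\tilde\lambda_k - \lambda_j) \langle \tilde{v}_k, v_j \rangle . \]

Let $j = 1$, and use $\lambda_1 = 0$. We get

\[ \sum_{k \ge 2} |\langle \tilde{v}_k, v_1 \rangle|^2 \le \frac{\sum_{k \ge 2} |\langle \tilde{v}_k, r_1 \rangle|^2}{\min_{k \ge 2} |\tilde\lambda_k|^2} \le \frac{\|\varepsilon\|^2}{\tilde\lambda_2^2} . \]

As a result,

\[ |\langle \tilde{v}_1, v_1 \rangle|^2 \ge 1 - \frac{\|\varepsilon\|^2}{\tilde\lambda_2^2} . \]

Choose $\alpha$ so that $\langle e^{i\alpha} \tilde{v}_1, v_1 \rangle = |\langle \tilde{v}_1, v_1 \rangle|$. Then

\[
\begin{aligned}
\|v_1 - e^{i\alpha} \tilde{v}_1\|^2 &= 2 - 2\, \Re \langle e^{i\alpha} \tilde{v}_1, v_1 \rangle \\
&= 2 - 2 |\langle \tilde{v}_1, v_1 \rangle| \\
&\le 2 - 2 |\langle \tilde{v}_1, v_1 \rangle|^2 \\
&\le 2\, \frac{\|\varepsilon\|^2}{\tilde\lambda_2^2} .
\end{aligned}
\]

Conclude by multiplying through by $n$ and taking a square root.

2.3 Proof of theorem 4.

The proof follows the argument in section 2.1; we mostly highlight the modifications. Let $b_i = |b_i| e^{i\phi_i}$. The Laplacian with phases is $\mathcal{L}_b = \Lambda_\phi L_{|b|} \Lambda_\phi^*$, with $\Lambda_\phi = \mathrm{diag}(e^{i\phi_i})$. Explicitly,

\[ (\mathcal{L}_b)_{ij} = \begin{cases} \sum_{k : (i,k) \in E} |b_k|^2 & \text{if } i = j; \\ -b_i \overline{b_j} & \text{if } (i, j) \in E; \\ 0 & \text{otherwise.} \end{cases} \]

The matrix $Y = A X A^*$ is compared to the rank-1 spectral projectors of $\mathcal{L}_b$. We can write it as $Y = bb^* + \tilde\varepsilon$ with $\|\tilde\varepsilon\|_1 \le \|\varepsilon\|_1 + \sigma$. The computation of $\langle bb^*, \mathcal{L}_b \rangle$ is now

\[
\begin{aligned}
\langle bb^*, \mathcal{L}_b \rangle &= \sum_i b_i \overline{b_i}\; \overline{(\mathcal{L}_b)_{ii}} + \sum_{(i,j) \in E} b_i \overline{b_j}\; \overline{(\mathcal{L}_b)_{ij}} \\
&= \sum_i |b_i|^2 \sum_{k : (i,k) \in E} |b_k|^2 - \sum_{(i,j) \in E} b_i \overline{b_j}\; \overline{b_i} b_j \\
&= \sum_i |b_i|^2 \Big[ \sum_{j : (i,j) \in E} |b_j|^2 - \sum_{j : (i,j) \in E} |b_j|^2 \Big] \\
&= 0 .
\end{aligned}
\]

The error term is now bounded (in a rather crude fashion) as

\[ |\langle \tilde\varepsilon, \mathcal{L}_b \rangle| \le \|\mathcal{L}_b\|_\infty \|\tilde\varepsilon\|_1 \le \Big( \max_i \sum_{j : (i,j) \in E} |b_j|^2 \Big) \|\tilde\varepsilon\|_1 \le \|b\|^2 \|\tilde\varepsilon\|_1 . \]

Upon normalizing $Y$ to unit trace, we get

\[ \Big| \Big\langle \frac{Y}{\mathrm{tr}(Y)}, \mathcal{L}_b \Big\rangle \Big| \le \frac{\|b\|^2 \|\tilde\varepsilon\|_1}{\|b\|^2 + \mathrm{tr}(\tilde\varepsilon)} \le 2 \|\tilde\varepsilon\|_1 , \]

where the last inequality follows from

\[
\begin{aligned}
|\mathrm{tr}(\tilde\varepsilon)| \le \|\tilde\varepsilon\|_1 &\le \|\varepsilon\|_1 + \sigma \\
&\le \lambda_2 / 2 && \text{(assumption of the theorem)} \\
&\le \|b\|^2 / 2 && \text{(by Gershgorin).}
\end{aligned}
\]

On the other hand, we expand

\[ \Big\langle \frac{Y}{\mathrm{tr}(Y)}, \mathcal{L}_b \Big\rangle = \sum_j c_j \lambda_j , \]

and use $X \succeq 0 \Rightarrow Y \succeq 0$ to get $c_j \ge 0$, $\sum_j c_j = 1$. Since $2 \|\tilde\varepsilon\|_1 \le \lambda_2$, we conclude as in section 2.1 that

\[ \Big\langle \frac{Y}{\mathrm{tr}(Y)}, v_1 v_1^* \Big\rangle \ge 1 - \frac{2 \|\tilde\varepsilon\|_1}{\lambda_2} , \]

hence

\[ \Big\| \frac{Y}{\mathrm{tr}(Y)} - v_1 v_1^* \Big\|_F^2 \le 4\, \frac{\|\tilde\varepsilon\|_1}{\lambda_2} . \tag{10} \]

For $X = A^+ Y (A^*)^+$, we get

\[ \Big\| \frac{X}{\mathrm{tr}(Y)} - (A^+ v_1)(A^+ v_1)^* \Big\|_F^2 \le 4 \|A^+\|^4\, \frac{\|\tilde\varepsilon\|_1}{\lambda_2} . \]

Call the right-hand side $\delta^2$. Recall that $v_1 = b / \|b\|$ hence $A^+ v_1 = x_0 / \|b\|$. Using $\mathrm{tr}(Y) = \|b\|^2 + \mathrm{tr}(\tilde\varepsilon)$, we get

\[ \|X - x_0 x_0^*\| \le \delta\, \mathrm{tr}(Y) + \frac{\|x_0\|^2}{\|b\|^2}\, |\mathrm{tr}(\tilde\varepsilon)| . \tag{11} \]

Elementary calculations based on the bound $\|\tilde\varepsilon\|_1 \le \lambda_2 / 2 \le \|b\|^2 / 2$ allow us to further bound the above quantity by $\frac{6 + \sqrt{2}}{4}\, \delta \|b\|^2$. We can now call upon lemma 2, part (b), to obtain

\[ \big\|\, x \|x\| - e^{i\alpha} x_0 \|x_0\| \,\big\| \le 2 \sqrt{2}\, \frac{6 + \sqrt{2}}{4}\, \delta \|b\|^2 , \]

where $x = x_1 \sqrt{\lambda_1(X)}$ is the leading eigenvector of $X$ normalized so that $\|x\|^2 = \lambda_1(X)$ is the leading eigenvalue of $X$. We use (11) one more time to bound

\[ |\lambda_1(X) - \|x_0\|^2| \le \frac{6 + \sqrt{2}}{4}\, \delta \|b\|^2 , \]

hence

\[
\begin{aligned}
\|x_0\|\, \|x - e^{i\alpha} x_0\| &\le \big\|\, x \|x\| - e^{i\alpha} x_0 \|x_0\| \,\big\| + \|x\|\, \big| \|x\| - \|x_0\| \big| \\
&\le 2 \sqrt{2}\, \frac{6 + \sqrt{2}}{4}\, \delta \|b\|^2 + \frac{\|x\|}{\|x\| + \|x_0\|}\, \big| \|x\|^2 - \|x_0\|^2 \big| \\
&\le (2 \sqrt{2} + 1)\, \frac{6 + \sqrt{2}}{4}\, \delta \|b\|^2 .
\end{aligned}
\]

Use $\|b\| \le \|A\| \|x_0\|$ and the formula for $\delta$ to conclude that

\[ \|x - e^{i\alpha} x_0\| \le C\, \|x_0\|\, \kappa(A)^2 \sqrt{\frac{\|\tilde\varepsilon\|_1}{\lambda_2}} , \]

with $C = 2 (2 \sqrt{2} + 1)\, \frac{6 + \sqrt{2}}{4} \le 15$.

2.4 Proof of theorem 5.

The proof proceeds as in the previous section, up to equation (10). The rest of the reasoning is a close mirror of the one in the previous section, with $Y$ in place of $X$, $y$ in place of $x$, $b$ in place of $x_0$, and $\delta$ re-set to $2 \sqrt{\|\tilde\varepsilon\|_1 / \lambda_2}$. We obtain

\[ \|y - e^{i\alpha} b\| \le 15\, \|b\| \sqrt{\frac{\|\tilde\varepsilon\|_1}{\lambda_2}} . \]

We conclude by letting $x = A^+ y$, $x_0 = A^+ b$, and using $\|b\| \le \|A\| \|x_0\|$.

3 Numerical illustrations

We investigate the scalings of the bounds for phase recovery given by theorems 1 and 3 on toy examples ($n = 2^7$).

3.1 Influence of the spectral gap

We first focus on the scaling with respect to the spectral gap.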
This is achieved by considering three types of graphs:

• the path $P_n$, which is proven to be the connected graph with the smallest spectral gap$^5$;
• graphs obtained by randomly adding $K$ edges to $P_n$, with $K$ ranging from 1 to 50;
• Erdős-Rényi random graphs with probability ranging from 0.03 to 0.05, conditioned on connectedness (positive spectral gap).

$^5$ As justified by the decreasing property of $\lambda_2$ under edge removal, mentioned earlier.

A realization of the two latter types of graphs is given in figure 1.

Figure 1: $P_n$ + random edges (left), Erdős-Rényi random graph (right).

To study the eigenvector method, we draw one realization of a symmetric error matrix $\varepsilon$ with $\varepsilon_{ij} \sim \mathcal{CN}(0, \eta^2)$, with $\eta = 10^{-8}$. The spectral norm of the noise (used in theorem 3) is $\|\varepsilon\| \sim 2 \times 10^{-7}$. For different realizations of the aforementioned graphs, we estimate the solution with the eigenvector method and plot the $\ell_2$ recovery error as a function of $\tilde\lambda_2$. See figure 2.

Figure 2: Recovery error for the eigenvector method as a function of $\tilde\lambda_2$. (Log-log plot. Horizontal axis: noisy spectral gap of the graph Laplacian $\tilde\lambda_2$. Vertical axis: $\ell_2$ recovery error. Legend: Erdős-Rényi graphs; graphs obtained by adding edges randomly to $P_n$; path $P_n$; bound of theorem 3.)

To study the feasibility method, we consider the case of an approximate fit ($\sigma = 10^{-4}$) in the noiseless case ($\varepsilon = 0$). The feasibility problem (4) is solved using the Matlab toolbox cvx, which calls the toolbox SeDuMi. An interior point algorithm (centering predictor-corrector) is used. The recovery error as a function of the spectral gap $\lambda_2$ is illustrated in figure 3.

Figure 3: Recovery error for the feasibility method as a function of $\lambda_2$. (Log-log plot. Horizontal axis: spectral gap of the graph Laplacian $\lambda_2$. Vertical axis: $\ell_2$ recovery error. Legend: Erdős-Rényi graphs; $P_n$ + random edges; path $P_n$; bound of theorem 1.)

3.2 Influence of the noise level

We now fix the set $E$ as one realization of adding $K = 15$ edges randomly to $P_n$. We then draw realizations of the noise, $\varepsilon \sim \mathcal{CN}(0, \eta^2)$, with $\eta$ logarithmically equally spaced between $10^{-6} \lambda_2$ and $10^{-1} \lambda_2$. The recovery for the eigenvector method is illustrated in figure 4.

Figure 4: Recovery error for the eigenvector method as a function of the spectral norm of the noise $\|\varepsilon\|$. (Log-log plot. Horizontal axis: spectral norm of the noise $\|\varepsilon\|$. Vertical axis: $\ell_2$ recovery error. Legend: recovery from the eigenvector method; bound of theorem 3.)

For the feasibility problem, we chose $\sigma$ to be two times the $\ell_1$ norm of $\varepsilon$ on $E$. The recovery for the feasibility method is illustrated in figure 5. As mentioned earlier, it is unclear to us whether the bound could be strengthened or if the discrepancy owes to the particular method by which a point is chosen in the feasibility set.

Figure 5: Recovery error for the feasibility method as a function of the $\ell_1$ norm of the noise $\|\varepsilon\|_1$. (Log-log plot. Horizontal axis: $\ell_1$ norm of the noise on $E$. Vertical axis: $\ell_2$ recovery error. Legend: recovery from the feasibility method; bound of theorem 1.)

3.3 Interferometric inverse scattering

An important application of interferometric ideas is to the inversion of a medium's index of refraction from recordings of waves scattered by that medium, as in seismic imaging.
In a first numerical experiment, we let $b = Ax$ where $x$ is a reflectivity profile in a rectangle (perturbation of the index of refraction), and $A$ is an acoustic wave equation that maps this reflectivity profile in a linearized way to the solution wavefield $b$ sampled at receivers placed at the top edge of the rectangle (surface). The wave equation is discretized via finite differences, with different schemes for the data modeling step and for the inversion step. The data index $i$ runs over receiver locations $x_r$, frequencies $\omega$, and source positions $x_s$ (which define different wave equations with different right-hand sides).

Figure 6 shows robust recovery of $x$ from noisy $b$, both by least-squares and by interferometric inversion. Here the noise model is taken to be Gaussian,

\[ \tilde{b}_i = b_i + \eta_i , \qquad \eta_i \sim \mathcal{CN}(0, \sigma^2) , \]

where $\mathcal{CN}$ is the complex normal distribution and $\sigma = 0.1\, \frac{\|b\|_2}{\sqrt{2n}}$ so that $\frac{\|\eta\|_2}{\|b\|_2} = 0.1$ (10% additive noise). In this case the graph $E$ is taken to be an Erdős-Rényi random graph with $p = 1.5\, \frac{\log(N)}{N}$ to ensure connectedness. The computational method used for handling this example is a rank-2 relaxation scheme explained in the companion note [14].

In a second numerical experiment, we show that interferometric inversion is still accurate and stable, even when the forward model $b = A(x)$ is the full wave equation that maps the index of refraction $x$ to the wavefield $b$ nonlinearly (no Born approximation). Again, 10% Gaussian noise is added. Figure 7 shows the result of nonlinear least-squares inversion, and the corresponding interferometric inversion result.

These numerical experiments merely show that interferometric inversion can be accurate and stable under minimal assumptions on the graph $E$ of data pair products. One rationale for switching to the interferometric formulation is that its results display robustness vis-a-vis uncertainties in the forward model $A$, an aspect that we briefly document in [14] and intend to further investigate.
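The 10% additive noise model and the Erdős-Rényi edge probability used in the first experiment can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: it assumes NumPy, it reads $\mathcal{CN}(0, \sigma^2)$ as independent real and imaginary parts each with standard deviation $\sigma$ (the convention under which $\sigma = 0.1 \|b\|_2 / \sqrt{2n}$ yields a relative noise level of 0.1), it takes $\log$ to be the natural logarithm, and the stand-in data vector `b` is random rather than a simulated wavefield. The helper names are hypothetical.

```python
import numpy as np


def add_relative_noise(b, level, rng):
    """Add complex Gaussian noise so that ||eta|| / ||b|| is approximately `level`.

    Assumption: real and imaginary parts of each eta_i have standard deviation sigma.
    """
    n = b.size
    sigma = level * np.linalg.norm(b) / np.sqrt(2 * n)
    eta = sigma * (rng.standard_normal(n) + 1j * rng.standard_normal(n))
    return b + eta, eta


def erdos_renyi_probability(num_nodes):
    """Edge probability p = 1.5 * log(N) / N (natural logarithm assumed)."""
    return 1.5 * np.log(num_nodes) / num_nodes


rng = np.random.default_rng(1)
b = rng.standard_normal(20000) + 1j * rng.standard_normal(20000)  # stand-in for A x
b_noisy, eta = add_relative_noise(b, 0.1, rng)
ratio = np.linalg.norm(eta) / np.linalg.norm(b)   # close to 0.1 for large n
p = erdos_renyi_probability(100)
```

The ratio only concentrates around 0.1 for long data vectors; for small $n$ a given realization can deviate noticeably.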
Figure 6: First: true, unknown reflectivity profile $x$. Second: least-squares solution. Third: result of interferometric inversion.

Figure 7: First: true, unknown map of the index of refraction $x$. Second: initial guess for either inversion scheme. Third: nonlinear least-squares solution. Fourth: result of interferometric inversion.

References

[1] B. Alexeev, A. S. Bandeira, M. Fickus, and D. G. Mixon, Phase retrieval with polarization, arXiv preprint arXiv:1210.7752, 2012
[2] A. S. Bandeira, A. Singer, and D. A. Spielman, A Cheeger inequality for the graph connection Laplacian, arXiv preprint arXiv:1204.3873, 2012
[3] P. Blomgren, G. Papanicolaou, and H. Zhao, Super-resolution in time-reversal acoustics, J. Acoust. Soc. Am. 111(1), 230-248, 2002
[4] L. Borcea, G. Papanicolaou, and C. Tsogka, Interferometric array imaging in clutter, Inv. Prob. 21(4), 1419-1460, 2005
[5] E. J. Candès, Y. C. Eldar, T. Strohmer, and V. Voroninski, Phase retrieval via matrix completion, SIAM Journal on Imaging Sciences 6(1), 199-225, 2013
[6] E. J. Candès and B. Recht, Exact matrix completion via convex optimization, Found. Comput. Math. 9-6, 717-772, 2009
[7] A. Chai, M. Moscoso, and G. Papanicolaou, Array imaging using intensity-only measurements, Inverse Problems 27.1, 015005, 2011
[8] S. Chatterjee, Matrix estimation by Universal Singular Value Thresholding, arXiv preprint arXiv:1212.1247, 2012
[9] L. Demanet and P. Hand, Stable optimizationless recovery from phaseless linear measurements, preprint, 2013
[10] E. Dussaud, Velocity analysis in the presence of uncertainty, Ph.D. thesis, Computational and Applied Mathematics, Rice University, 2005
[11] J. Garnier, Imaging in randomly layered media by cross-correlating noisy signals, SIAM Multiscale Model. Simul. 4, 610-640, 2005
[12] M. Goemans and D. Williamson, Improved approximation algorithms for maximum cut and satisfiability problems using semidefinite programming, Journal of the ACM 42(6), 1115-1145, 1995
[13] A. Javanmard and A. Montanari, Localization from incomplete noisy distance measurements, Found. Comput. Math. 13, 297-345, 2013
[14] V. Jugnon and L. Demanet, Interferometric inversion: a robust approach to linear inverse problems, to appear in Proc. SEG Annual Meeting, 2013
[15] R. Keshavan, A. Montanari, and S. Oh, Matrix completion from noisy entries, Journal of Machine Learning Research 11, 2057-2078, 2010
[16] O. I. Lobkis and R. L. Weaver, On the emergence of the Green's function in the correlations of a diffuse field, J. Acoustic. Soc. Am. 110, 3011-3017, 2001
[17] L. Lovasz, Eigenvalues of graphs, Lecture notes, 2007
[18] B. Recht, M. Fazel, and P. A. Parrilo, Guaranteed minimum-rank solutions of linear matrix equations via nuclear norm minimization, SIAM Review 52-3, 471-501, 2010
[19] J. M. Rodenburg, A. C. Hurst, A. G. Cullis, B. R. Dobson, F. Pfeiffer, O. Bunk, C. David, K. Jefimovs, and I. Johnson, Hard-x-ray lensless imaging of extended objects, Physical Review Letters 98, no. 3, 034801, 2007
[20] J. Schotland, Quantum imaging and inverse scattering, Optics Letters 35(20), 3309-3311, 2010
[21] G. T. Schuster, Seismic Interferometry, Cambridge University Press, 2009
[22] G. T. Schuster, J. Yu, J. Sheng, and J. Rickett, Interferometric/daylight seismic imaging, Geophysics 157(2), 838-852, 2004
[23] A. Singer, Angular synchronization by eigenvectors and semidefinite programming, Applied and Computational Harmonic Analysis 30.1, 20-36, 2011
[24] A. Singer and M. Cucuringu, Uniqueness of low-rank matrix completion by rigidity theory, SIAM J. Matrix Anal. Appl. 31(4), 1621-1641, 2010
[25] L. Wang and A. Singer, Exact and stable recovery of rotations for robust synchronization, arXiv preprint arXiv:1211.2441, 2012
[26] I. Waldspurger, A. d'Aspremont, and S. Mallat, Phase recovery, maxcut and complex semidefinite programming, arXiv preprint arXiv:1206.0102, 2012
[27] K. Wapenaar and J. Fokkema, Green's function representations for seismic interferometry, Geophysics 71, SI33-SI46, 2006