Error bounds for noisy compressive phase retrieval
Bernhard G. Bodmann
Nathaniel Hammen
Mathematics Department
University of Houston
Houston, TX 77204-3008
Abstract—This paper provides a random complex measurement matrix and an algorithm for complex phase retrieval of sparse or approximately sparse signals from the noisy magnitudes of the measurements obtained with this matrix. We compute explicit error bounds for the recovery which depend on the noise-to-signal ratio, the sparsity $s$, the number of measured quantities $m$, and the dimension of the signal $N$. This requires $m$ to be of the order of $s \ln(N/s)$. In comparison with sparse recovery from complex linear measurements, our phase retrieval algorithm requires six times the number of measured quantities.
I. INTRODUCTION
Some types of measurement devices only record magnitudes of a linear transform, while the phase information is unavailable. This is the case in many optical, acoustic, electromagnetic, and quantum systems [10, 17, 19], and in particular in X-ray crystallography [15].
In this paper, we assume an $N$-dimensional complex signal $x \in \mathbb{C}^N$ and noisy measurements given by $b_i = |\langle x, f_i\rangle|^2 + \epsilon_i$ for some set of measurement vectors $\{f_i\}_{i=1}^M$ and measurement noise $\{\epsilon_i\}_{i=1}^M$. We wish to recover an approximation to the vector $x$ from the measurements $\{b_i\}_{i=1}^M$, up to a unimodular factor that remains undetermined. This is the abstract formulation of the phase retrieval problem [13] in finite-dimensional Hilbert spaces. In many applications, the dimension of the signal is much larger than the number of measurements that can be taken feasibly. Thus, we would like to combine phase retrieval results with compressive sensing results, which allow a sufficiently sparse vector to be recovered from fewer measurements than the dimension of the vector space.
This idea of combining phase retrieval with compressive
sensing has been explored in recent years [4,5,8,11,12,14,18].
Some of these methods have provable performance guarantees
in the presence of noise, but they do not include precise
conditions on the number of measured quantities that are
sufficient [12, 14, 18]. In addition, the known error bounds are
either generic, or depend quadratically on the noise-to-signal
ratio [4]. This paper provides an explicit error bound for sparse
signals that is linear in the noise-to-signal ratio for a concrete
number of measured quantities. It also includes an error bound
for approximately sparse signals.
To this end, we combine the phase retrieval algorithm of
[6] with the generic two-stage sparse phase retrieval method
of [12], both of which have an error bound that is linear
in terms of the input noise. This results in a sparse phase
retrieval algorithm with a small number of measurements and
a uniform error bound that depends linearly on the noise-to-signal ratio. In the first stage of recovery, the relative phases
of a number of linear measurements are recovered with a
deterministic algorithm. In the second stage, we use that these
linear measurements are chosen according to a randomization
strategy from compressive sensing, and thus provide a method
to recover sparse signals accurately. The main challenge of
controlling the error with the two-stage method is that a
naive combination of the random measurements with the
phase retrieval algorithm only implies a random error bound.
However, using the RIP constant of the measurement matrix
allows us to deduce a deterministic error bound which holds
with overwhelming probability.
II. PRELIMINARIES
A. Compressive Sensing
We say a vector $x$ is $s$-sparse if $x$ has only $s$ or fewer nonzero entries, and we say a vector $x$ is nearly $s$-sparse if there exists an $s$-sparse vector that is a small $\ell_1$ distance away from $x$. For any vector $x$, we define $\|x\|_0$ to be equal to the number of nonzero entries of $x$. This means that $\|x\|_0$ is the smallest number $s$ such that $x$ is $s$-sparse. A central task in compressive sensing is the technique of creating an underdetermined system of measurements that can recover a sparse or nearly sparse vector to a high degree of accuracy. Typically, the accuracy is measured in terms of the Euclidean norm $\|\cdot\|_2$. This is usually established using a restricted isometry property or a null space property of the measurement matrix [9]. Here, we use the restricted isometry property.
Definition 1. For a real or complex $m \times N$ matrix $A$ and a positive integer $s \le N$, we say that $A$ satisfies the $s$-restricted isometry property with isometry constant $\delta_s$ if for each $s$-sparse vector $x \in \mathbb{R}^N$ or $x \in \mathbb{C}^N$, respectively, we have
$$(1 - \delta_s)\|x\|_2^2 \le \|Ax\|_2^2 \le (1 + \delta_s)\|x\|_2^2.$$
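As an illustration, the restricted isometry constant of a small random matrix can be probed empirically by sampling $s$-sparse test vectors and recording how far $\|Ax\|_2^2$ strays from $\|x\|_2^2$. The following Python sketch is not part of the paper and only yields a lower estimate of $\delta_s$, since Monte Carlo sampling cannot certify the supremum over all sparse supports.

import numpy as np

def estimate_rip_constant(A, s, trials=20000, rng=None):
    # Monte Carlo lower estimate of delta_s: largest observed deviation of
    # ||Ax||_2^2 from 1 over random s-sparse unit vectors x.
    rng = np.random.default_rng(rng)
    m, N = A.shape
    worst = 0.0
    for _ in range(trials):
        x = np.zeros(N, dtype=complex)
        support = rng.choice(N, size=s, replace=False)
        x[support] = rng.standard_normal(s) + 1j * rng.standard_normal(s)
        x /= np.linalg.norm(x)
        worst = max(worst, abs(np.linalg.norm(A @ x) ** 2 - 1.0))
    return worst

# Complex Gaussian matrix normalized as in Section III.
m, N, s = 64, 256, 4
rng = np.random.default_rng(0)
A = (rng.standard_normal((m, N)) + 1j * rng.standard_normal((m, N))) / np.sqrt(2 * m)
print("estimated delta_s >=", estimate_rip_constant(A, s, rng=rng))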
Foucart and Rauhut show how a suitably bounded restricted isometry constant provides robust and stable sparse recovery by $\ell_1$-norm minimization.
Theorem 2 (Foucart and Rauhut, Theorem 6.12 in [9]). Suppose that $A \in \mathbb{C}^{m \times N}$ satisfies the $2s$-restricted isometry property with isometry constant $\delta_{2s} < \frac{4}{\sqrt{41}}$. Then, for any $x \in \mathbb{C}^N$ and $y \in \mathbb{C}^m$ satisfying $\|y - Ax\|_2 \le \eta$, the solution $x^\#$ to
$$\arg\min_{\tilde{x} \in \mathbb{C}^N} \|\tilde{x}\|_1 \quad \text{subject to} \quad \|y - A\tilde{x}\|_2 \le \eta$$
satisfies
$$\|x - x^\#\|_2 \le \frac{C}{\sqrt{s}} \inf_{z \in \mathbb{C}^N, \|z\|_0 \le s} \|x - z\|_1 + D\eta$$
with $C$ and $D$ only depending on $\delta_{2s}$ by
$$C = \frac{2(1+\rho)}{1-\rho} \quad\text{and}\quad D = \frac{(3+\rho)\tau}{1-\rho}$$
and, in turn,
$$\rho = \frac{\delta_{2s}}{\sqrt{1-\delta_{2s}^2} - \delta_{2s}/4} \quad\text{as well as}\quad \tau = \frac{\sqrt{1+\delta_{2s}}}{\sqrt{1-\delta_{2s}^2} - \delta_{2s}/4}.$$
There are many algorithms that have been created to solve
the minimization problem in the above theorem efficiently,
such as iteratively re-weighted least squares [7].
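To make the minimization in Theorem 2 concrete, the following sketch solves the noise-constrained $\ell_1$ problem with the convex modeling package cvxpy. The use of cvxpy is an assumption of this illustration, not a tool referenced by the paper; any solver for second-order cone programs would do.

import numpy as np
import cvxpy as cp

def l1_recover(A, y, eta):
    # Quadratically constrained basis pursuit:
    # minimize ||x||_1 subject to ||y - A x||_2 <= eta.
    N = A.shape[1]
    x = cp.Variable(N, complex=True)
    problem = cp.Problem(cp.Minimize(cp.norm1(x)),
                         [cp.norm(y - A @ x, 2) <= eta])
    problem.solve()
    return x.value

# Example with an s-sparse vector and small measurement noise.
rng = np.random.default_rng(1)
m, N, s = 64, 256, 4
A = (rng.standard_normal((m, N)) + 1j * rng.standard_normal((m, N))) / np.sqrt(2 * m)
x_true = np.zeros(N, dtype=complex)
x_true[rng.choice(N, s, replace=False)] = rng.standard_normal(s) + 1j * rng.standard_normal(s)
noise = 1e-3 * rng.standard_normal(m)
y = A @ x_true + noise
x_hat = l1_recover(A, y, np.linalg.norm(noise))
print("relative error:", np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true))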
B. Phase Retrieval
Our algorithm for phase retrieval recovers an approximation to the vector $x \in \mathbb{C}^m$, up to a unimodular constant, from measurements of the form $\{b_i = |\langle x, f_i\rangle|^2 + \epsilon_i\}_{i=1}^n$ for some set of measurement vectors $\{f_i\}_{i=1}^n$ and a noise vector $\epsilon = \{\epsilon_i\}_{i=1}^n \in \mathbb{R}^n$. The magnitude of the noise is measured by the norm $\|\epsilon\|_\infty = \max_i |\epsilon_i|$. We use the three-step algorithm from [6] to provide a robust solution to the recovery problem using $n = 6m - 3$ measurements.
(1) Let $D_{m-1}$ be the normalized Dirichlet kernel of degree $m-1$, so that for any $z \in \mathbb{C}$ with $|z| = 1$, $D_{m-1}(z) = \frac{1}{2m-1}\sum_{k=-(m-1)}^{m-1} z^k$. Then the set of functions $\{z \mapsto D_{m-1}(z\omega^{-l})\}_{l=1}^{2m-1}$, with $\omega = e^{2i\pi/(2m-1)}$, is an orthogonal basis for the set of trigonometric polynomials of degree at most $m-1$ with respect to the $L^2$ inner product. Thus, any such trigonometric polynomial $g$ can be interpolated as $g(z) = \sum_{l=1}^{2m-1} g(\omega^l) D_{m-1}(z\omega^{-l})$. If we represent the coefficients of the vector $x$ to be recovered as the coefficients of a complex polynomial $p$ having degree at most $m-1$, and we let $\nu = e^{\frac{2i\pi}{m}}$, then the functions $z \mapsto |p(z)|^2$, $z \mapsto |p(z) - p(z\nu)|^2$, and $z \mapsto |p(z) - ip(z\nu)|^2$ are equal to trigonometric polynomials of degree at most $m-1$ when restricted to the unit circle $\mathbb{T} = \{z \in \mathbb{C} : |z| = 1\}$. The $6m-3$ measurements that are taken are perturbed samples of these functions at each of the $(2m-1)$st roots of unity,
$$b_l = \begin{cases} |p(\omega^l)|^2 + \epsilon_l, & l \le 2m-1,\\ |p(\omega^l) - p(\omega^l\nu)|^2 + \epsilon_l, & 2m \le l \le 4m-2,\\ |p(\omega^l) - ip(\omega^l\nu)|^2 + \epsilon_l, & 4m-1 \le l. \end{cases}$$
These measurements are interpolated using the Dirichlet kernel to three approximating trigonometric polynomials,
$$g_0(z) = \sum_{l=1}^{2m-1} b_l\, D_{m-1}(z\omega^{-l}),$$
$$g_1(z) = \sum_{l=1}^{2m-1} b_{2m-1+l}\, D_{m-1}(z\omega^{-l}),$$
$$g_2(z) = \sum_{l=1}^{2m-1} b_{4m-2+l}\, D_{m-1}(z\omega^{-l}).$$
Note that this process is robust to noise in the measurements, with an error at any point that is at most $(2m-1)\|\epsilon\|_\infty$, so that on the unit circle $g_0(z) \approx |p(z)|^2$, $g_1(z) \approx |p(z) - p(z\nu)|^2$, and $g_2(z) \approx |p(z) - ip(z\nu)|^2$. Thus, the finite number of perturbed measurements has been expanded to an infinite family of measurements of each of these functions on the unit circle.
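The interpolation step is short to state in code. The following Python sketch, an illustration under the notation above rather than the authors' implementation, evaluates the normalized Dirichlet kernel and assembles $g_0$ from the first block of $2m-1$ samples.

import numpy as np

def dirichlet(z, m):
    # Normalized Dirichlet kernel D_{m-1}(z) = (1/(2m-1)) sum_{k=-(m-1)}^{m-1} z^k.
    ks = np.arange(-(m - 1), m)
    return np.sum(z ** ks) / (2 * m - 1)

def interpolate(samples, m, z):
    # g(z) = sum_{l=1}^{2m-1} b_l D_{m-1}(z omega^{-l}),
    # from samples taken at the (2m-1)st roots of unity.
    w = np.exp(2j * np.pi / (2 * m - 1))
    return sum(b * dirichlet(z * w ** (-l), m) for l, b in enumerate(samples, start=1))

# Example: reproduce |p|^2 on the circle from its noiseless samples.
m = 5
rng = np.random.default_rng(2)
coeffs = rng.standard_normal(m) + 1j * rng.standard_normal(m)
p = lambda z: np.polyval(coeffs[::-1], z)          # p(z) = sum_k coeffs[k] z^k
w = np.exp(2j * np.pi / (2 * m - 1))
b0 = np.array([abs(p(w ** l)) ** 2 for l in range(1, 2 * m)])
z = np.exp(0.3j)
print(interpolate(b0, m, z).real, abs(p(z)) ** 2)  # agree up to rounding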
(2) Next, magnitude measurements of $g_0$ are selected from points $\xi\nu^k$, for $k$ from 1 to $m$, and multiple values $\xi \in \mathbb{C}$ with $|\xi| = 1$ that have angles less than that of $\nu$. Let $z_0$ equal the value of $\xi$ that maximizes $\min_k g_0(\xi\nu^k)$. Thus, $\{g_0(z_0\nu^k)\}_{k=1}^m$ is a set of $m$ equally spaced magnitude measurements on the unit circle such that the smallest of the $m$ measurements is suitably bounded away from zero. Magnitude measurements of $g_1$ and $g_2$ are also taken from the same $m$ points $z_0\nu^k$.
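A direct way to pick $z_0$ is a grid search over candidate base points $\xi$ on the arc between $1$ and $\nu$. This sketch is again only an illustration, with an ad hoc grid resolution, and reuses the interpolate helper and samples b0 from the previous snippet.

import numpy as np

def choose_base_point(g0, m, grid=200):
    # z0 = argmax over unimodular xi (angle below that of nu) of min_k g0(xi nu^k).
    nu = np.exp(2j * np.pi / m)
    best_xi, best_val = 1.0, -np.inf
    for theta in np.linspace(0.0, 2 * np.pi / m, grid, endpoint=False):
        xi = np.exp(1j * theta)
        val = min(g0(xi * nu ** k).real for k in range(1, m + 1))
        if val > best_val:
            best_xi, best_val = xi, val
    return best_xi

# Usage with the interpolated g0 from the previous sketch:
g0 = lambda z: interpolate(b0, m, z)
z0 = choose_base_point(g0, m)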
(3) The evaluations of the polynomial $p$ at the $m$ sample points $\{z_0\nu^k\}_{k=1}^m$ from step (2) are equal to inner products with measurement vectors,
$$p(z_0\nu^k) = \langle p, K_{z_0\nu^k}\rangle,$$
given by
$$K_{z_0\nu^k}(z) = \sum_{j=0}^{m-1} \overline{(z_0\nu^k)^j}\, z^j,$$
which form an orthogonal basis for the space of complex polynomials of degree at most $m-1$. Thus, if the $m$ sample points are ordered with increasing angle, the values of $g_0(z_0\nu^k)$, $g_1(z_0\nu^k)$, and $g_2(z_0\nu^k)$ correspond to perturbed values of the coefficients $|x_k|^2$, $|x_k - x_{k+1}|^2$, and $|x_k - ix_{k+1}|^2$ with respect to this orthogonal basis, respectively. An approximation $y$ for $x$ is created in this basis by letting
$$y_1 = \sqrt{g_0(z_0\nu)} \approx |x_1|,$$
and for each $k$ from 1 to $m-1$, the (noiseless) identity
$$\overline{x_k}\, x_{k+1} = \frac{1}{2}(1-i)\left(|x_k|^2 + |x_{k+1}|^2\right) - \frac{1}{2}|x_k - x_{k+1}|^2 + \frac{i}{2}|x_k - ix_{k+1}|^2$$
is used to create an iterative process which assigns for any $1 \le k \le m-1$
$$t_k = \frac{1}{2}(1-i)\left(g_0(z_0\nu^k) + g_0(z_0\nu^{k+1})\right) - \frac{1}{2}g_1(z_0\nu^k) + \frac{i}{2}g_2(z_0\nu^k)$$
and
$$y_{k+1} = y_k\, \frac{t_k}{g_0(z_0\nu^k)} \approx y_k\, \frac{\overline{x_k}\, x_{k+1}}{|x_k|^2}.$$
A change of basis converts from the coefficients of $y$ back to the canonical polynomial basis,
$$\tilde{p} = \sum_{k=1}^m y_k \frac{1}{\sqrt{m}} K_{z_0\nu^k}.$$
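The recursion in step (3) is short enough to spell out. This sketch, illustrative only, takes the three interpolated polynomials as callables in the notation above and propagates the phase from $y_1$; the returned $y_k$ approximate the coefficients of $p$ in the orthogonal basis up to a global unimodular factor.

import numpy as np

def propagate_phases(g0, g1, g2, z0, m):
    # Step (3): y_1 = sqrt(g0(z0 nu)), then y_{k+1} = y_k t_k / g0(z0 nu^k),
    # with t_k from the polarization identity for conj(x_k) x_{k+1}.
    nu = np.exp(2j * np.pi / m)
    pts = [z0 * nu ** k for k in range(1, m + 1)]
    y = np.zeros(m, dtype=complex)
    y[0] = np.sqrt(max(g0(pts[0]).real, 0.0))
    for k in range(m - 1):
        t = (0.5 * (1 - 1j) * (g0(pts[k]) + g0(pts[k + 1]))
             - 0.5 * g1(pts[k]) + 0.5j * g2(pts[k]))
        y[k + 1] = y[k] * t / g0(pts[k])
    return y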
The following theorem from [6] gives error bounds for the above algorithm that are linear in the noise-to-signal ratio. For the proof, it is most natural to measure this ratio with the quotient of $\|\epsilon\|_\infty$ and $\|p\|_2$.

Theorem 3 ([6]). Let $\mathcal{P}_m$ be the space of polynomials with maximal degree $m-1$, $\omega = e^{\frac{2i\pi}{2m-1}}$, $\nu = e^{\frac{2i\pi}{m}}$, $r = \sin\left(\frac{2\pi}{(m-1)m^2}\right)$, and $0 < \alpha < 1$. For any nonzero $p \in \mathcal{P}_m$ and any $\epsilon \in \mathbb{R}^{6m-3}$, if
$$\beta = r^{\frac{(m-1)m}{2}} \left(\frac{m-1}{2m}\right)^m \frac{2^{m-1}}{\left(\prod_{k=1}^{m-1}(r^k+1)\right)}$$
and $\|\epsilon\|_\infty \le \frac{\alpha\beta^2 \|p\|_2^2}{2m-1}$, then the approximation $\tilde{p} \in \mathcal{P}_m$ constructed with the above algorithm using the values of the measurement map $\tilde{A} : \mathcal{P}_m \times \mathbb{R}^{6m-3} \to \mathbb{R}^{6m-3}$ defined by
$$\tilde{A}(p, \epsilon)_j = \begin{cases} |p(\omega^j)|^2 + \epsilon_j, & j \le 2m-1,\\ |p(\omega^j) - p(\omega^j\nu)|^2 + \epsilon_j, & 2m \le j \le 4m-2,\\ |p(\omega^j) - ip(\omega^j\nu)|^2 + \epsilon_j, & 4m-1 \le j \end{cases}$$
satisfies the error bound
$$\min_{|c|=1} \|\tilde{p} - cp\|_2 \le \tilde{E}\, \frac{\|\epsilon\|_\infty}{\|p\|_2}$$
with the constant
$$\tilde{E} = \left(\tilde{D} + \frac{2\beta}{\sqrt{1-\tilde{C}}\,\sqrt{m}}\right) \frac{\sqrt{m(2m-1)}}{m(1-\alpha)(1-\tilde{C})}$$
that depends on $m$ and $\alpha$ through
$$\tilde{C} = \frac{(1+\sqrt{2})\,\alpha\beta^2 + \sqrt{m}}{\beta^2(1-\alpha)} \quad\text{and}\quad \tilde{D} = \frac{m^2 + 2m - m\tilde{C} - 1 + \tilde{C}}{\beta^2(1-\alpha)(1-\tilde{C})^2}.$$
III. MAIN RESULT

In this section we apply the two theorems quoted above to the two-stage technique for sparse phase retrieval shown in [12]. The two stages are dimensional reduction by a randomized approximate projection and phase retrieval for the compressed components. The concrete error bounds for phase retrieval together with the performance guarantees for randomized measurements from compressed sensing allow us to control the accuracy of compressive phase retrieval.
In the real case, it is standard knowledge that for a random $m \times N$ matrix $A$ whose entries are drawn independently from a subgaussian distribution with variance one, there is a universal constant $C$ that only depends on the distribution such that $A/\sqrt{m}$ has the restricted isometry constant $\delta_s$ with probability exceeding $1 - 2\exp(-\delta_s^2 m/(2C))$ provided $m \ge 2C\delta_s^{-2} s \ln(eN/s)$ [9, Theorem 9.2].
In the complex case, we restrict ourselves to Gaussian measurement matrices and assemble several parts from [9]. The elementary starting point is that if $A$ is a complex Gaussian $m \times s$ matrix with $s < m$, whose entries have standard normal distributed real and imaginary parts, then the maximal and minimal singular values $\sigma_{\max}$ and $\sigma_{\min}$ of $A/\sqrt{2m}$ are for $t > 0$ contained in the interval $[1 - \sqrt{s/m} - t,\ 1 + \sqrt{s/m} + t]$ with a probability of [9, Exercise 9.5]
$$\mathbb{P}\left(1 - \sqrt{\tfrac{s}{m}} - t \le \sigma_{\min},\ \sigma_{\max} \le 1 + \sqrt{\tfrac{s}{m}} + t\right) \ge 1 - 2e^{-mt^2}.$$
The proof of this is analogous to the real case [9, Theorem 9.26]. Using a union bound as in the proof of [9, Theorem 9.27], we then get that the restricted isometry constant $\delta_s$ of $A/\sqrt{2m}$ is bounded by
$$\mathbb{P}\left(\delta_s > 2\left(\sqrt{\tfrac{s}{m}} + t\right) + \left(\sqrt{\tfrac{s}{m}} + t\right)^2\right) \le 2\left(\frac{eN}{s}\right)^s e^{-mt^2}.$$
If the sparsity s, the number of measurements m and the
dimension of the Hilbert space N are such that for a given
sparsity ratio s/N , m/s is sufficiently large, then the desired
RIP constant is achieved with overwhelming probability for
large dimensions. We summarize these results from [9].
Proposition 4. A complex random matrix with entries whose real and imaginary parts are drawn independently at random from a normal distribution with mean zero and variance $1/(2m)$ achieves an RIP constant $\delta_{2s} < 4/\sqrt{41}$ with overwhelming probability if there exists
$$t > \sqrt{\frac{2s}{m} \ln\left(\frac{eN}{2s}\right)}$$
such that
$$2\left(\sqrt{\tfrac{2s}{m}} + t\right) + \left(\sqrt{\tfrac{2s}{m}} + t\right)^2 < \frac{4}{\sqrt{41}}.$$
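Proposition 4 can be turned into a quick feasibility check: for given $(s, N, m)$, scan over $t$ and test both conditions. A small sketch, illustrative only and with an arbitrary grid for $t$:

import numpy as np

def rip_condition_holds(s, N, m, t_grid=None):
    # Check whether some t satisfies t > sqrt((2s/m) ln(eN/(2s)))
    # and 2(sqrt(2s/m)+t) + (sqrt(2s/m)+t)^2 < 4/sqrt(41).
    if t_grid is None:
        t_grid = np.linspace(1e-4, 1.0, 10000)
    t_min = np.sqrt((2 * s / m) * np.log(np.e * N / (2 * s)))
    root = np.sqrt(2 * s / m)
    ok = (t_grid > t_min) & (2 * (root + t_grid) + (root + t_grid) ** 2 < 4 / np.sqrt(41))
    return bool(np.any(ok))

# Find a sufficient number of measurements m for s = 5, N = 1000.
s, N = 5, 1000
m = next(m for m in range(s, 100000) if rip_condition_holds(s, N, m))
print("smallest m on this grid:", m)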
If we let $A$ be the random Gaussian $m \times N$ matrix satisfying the conditions of Theorem 2, and we let $B$ be the matrix associated with the linear portion of the measurement map $\tilde{A}$ from Theorem 3, then the measurements of a vector $x$ are of the form $|BAx|^2 + \epsilon$, where $\epsilon \in \mathbb{R}^{6m-3}$ is a noise vector and $|\cdot|^2$ is the squared modulus taken component-wise.
Theorem 5. Let $x \in \mathbb{C}^N$, let $\epsilon \in \mathbb{R}^{6m-3}$, and let $A \in \mathbb{C}^{m \times N}$ satisfy the $2s$-restricted isometry property with isometry constant $\delta_{2s} < \frac{4}{\sqrt{41}}$. If $\omega = e^{\frac{2i\pi}{2m-1}}$ and $\nu = e^{\frac{2i\pi}{m}}$, let $B \in \mathbb{C}^{(6m-3)\times m}$ be given by
$$B_{j,k} = \begin{cases} \omega^{j(k-1)}, & 1 \le j \le 2m-1,\\ \omega^{j(k-1)} - (\omega^j\nu)^{k-1}, & 2m \le j \le 4m-2,\\ \omega^{j(k-1)} - i(\omega^j\nu)^{k-1}, & 4m-1 \le j \le 6m-3. \end{cases}$$
Let $r = \sin\left(\frac{2\pi}{(m-1)m^2}\right)$, $0 < \alpha < 1$,
$$\beta = r^{\frac{(m-1)m}{2}} \left(\frac{m-1}{2m}\right)^m \frac{2^{m-1}}{\left(\prod_{k=1}^{m-1}(r^k+1)\right)},$$
and $\|\epsilon\|_\infty \le \frac{\alpha\beta^2 \|Ax\|_2^2}{2m-1}$. If $x$ satisfies the approximate sparsity requirement
$$\sigma_s(x)_1 < \frac{\sqrt{1-\delta_s}}{\gamma_s}\, \|x\|_2$$
with
$$\sigma_s(x)_1 = \inf_{z \in \mathbb{C}^N, \|z\|_0 \le s} \|x - z\|_1 \quad\text{and}\quad \gamma_s = \sqrt{1-\delta_s} + \sqrt{1+\delta_s},$$
then an approximation $x^\#$ for $x$ can be reconstructed from the vector $|BAx|^2 + \epsilon$ (where $|\cdot|^2$ is taken component-wise), such that
$$\|c_0 x - x^\#\|_2 \le \frac{C_1}{\sqrt{s}}\, \sigma_s(x)_1 + C_2\, \frac{\|\epsilon\|_\infty}{\sqrt{1-\delta_s}\,\|x\|_2 - \gamma_s\, \sigma_s(x)_1},$$
where $C_1 = C$ and $C_2 = D\tilde{E}$ with $C$ and $D$ from Theorem 2, $\tilde{E}$ from Theorem 3, and $c_0 \in \mathbb{C}$, $|c_0| = 1$.
Proof. Consider the polynomial $p_x \in \mathcal{P}_m$ defined such that $p_x(z) = \sum_{k=1}^m \langle x, \phi_k^*\rangle z^{k-1}$, where $\phi_k$ is the $k$-th row of $A$. This polynomial has monomial coefficients that are precisely equal to the coefficients of the vector $Ax$. Thus, the map $\tilde{A}$ defined in Theorem 3 satisfies
$$\tilde{A}(p_x, \epsilon)_j = \Big|\sum_{k=1}^m \langle x, \phi_k^*\rangle B_{j,k}\Big|^2 + \epsilon_j = \left(|BAx|^2 + \epsilon\right)_j.$$
Using Theorem 3, we obtain a polynomial $\tilde{p} \in \mathcal{P}_m$ satisfying
$$\|\tilde{p} - c_0 p_x\|_2 \le \tilde{E}\, \frac{\|\epsilon\|_\infty}{\|p_x\|_2} = \tilde{E}\, \frac{\|\epsilon\|_\infty}{\|Ax\|_2}$$
for some $c_0 \in \mathbb{C}$ with $|c_0| = 1$ and some $\tilde{E} \in \mathbb{R}^+$ that depends only on $\alpha$ and $m$. If $y$ is the vector of monomial coefficients of $\tilde{p}$, then
$$\|y - c_0 Ax\|_2 = \|\tilde{p} - c_0 p_x\|_2 \le \tilde{E}\, \frac{\|\epsilon\|_\infty}{\|Ax\|_2}.$$
Using this, we may apply Theorem 2 to show that the solution $x^\#$ to
$$\arg\min_{\tilde{x} \in \mathbb{C}^N} \|\tilde{x}\|_1 \quad\text{subject to}\quad \|y - c_0 A\tilde{x}\|_2 \le \tilde{E}\, \frac{\|\epsilon\|_\infty}{\|Ax\|_2}$$
satisfies the random bound
$$\|c_0 x - x^\#\|_2 \le \frac{C}{\sqrt{s}} \inf_{z \in \mathbb{C}^N, \|z\|_0 \le s} \|c_0 x - z\|_1 + D\tilde{E}\, \frac{\|\epsilon\|_\infty}{\|Ax\|_2} = \frac{C_1}{\sqrt{s}}\, \sigma_s(x)_1 + C_2\, \frac{\|\epsilon\|_\infty}{\|Ax\|_2}.$$
To eliminate the random term $\|Ax\|_2$ in the denominator, we split $x$ into a sum of $s$-sparse vectors. Let
$$z_1 = \arg\min_{z \in \mathbb{C}^N, \|z\|_0 \le s} \|x - z\|_1,$$
and for each $k \in \mathbb{N}$ with $k < \lceil N/s\rceil$, let
$$z_{k+1} = \arg\min_{z \in \mathbb{C}^N, \|z\|_0 \le s} \Big\|x - \sum_{j=1}^k z_j - z\Big\|_1,$$
so that for any $j \ne k$, we have that $z_j$ is zero in each component where $z_k$ is non-zero. Thus,
$$x = \sum_{j=1}^{\lceil N/s\rceil} z_j$$
and
$$\|x - z_1\|_1 = \Big\|\sum_{j=2}^{\lceil N/s\rceil} z_j\Big\|_1 = \sum_{j=2}^{\lceil N/s\rceil} \|z_j\|_1.$$
Then,
$$\|Ax\|_2 = \Big\|\sum_{j=1}^{\lceil N/s\rceil} A z_j\Big\|_2 = \Big\|A z_1 + \sum_{j=2}^{\lceil N/s\rceil} A z_j\Big\|_2,$$
and by a few uses of the triangle inequality
$$\|Ax\|_2 \ge \|A z_1\|_2 - \Big\|\sum_{j=2}^{\lceil N/s\rceil} A z_j\Big\|_2 \ge \|A z_1\|_2 - \sum_{j=2}^{\lceil N/s\rceil} \|A z_j\|_2.$$
Next, with the $s$-restricted isometry property of $A$,
$$\|Ax\|_2 \ge \sqrt{1-\delta_s}\, \|z_1\|_2 - \sqrt{1+\delta_s} \sum_{j=2}^{\lceil N/s\rceil} \|z_j\|_2 = \sqrt{1-\delta_s}\, \|x - (x - z_1)\|_2 - \sqrt{1+\delta_s} \sum_{j=2}^{\lceil N/s\rceil} \|z_j\|_2,$$
by one more use of the reverse triangle inequality
$$\ge \sqrt{1-\delta_s}\, \left(\|x\|_2 - \|x - z_1\|_2\right) - \sqrt{1+\delta_s} \sum_{j=2}^{\lceil N/s\rceil} \|z_j\|_2,$$
and by the relation between $\ell_1$ and $\ell_2$ norms
$$\ge \sqrt{1-\delta_s}\, \left(\|x\|_2 - \|x - z_1\|_1\right) - \sqrt{1+\delta_s} \sum_{j=2}^{\lceil N/s\rceil} \|z_j\|_1,$$
and finally, using the earlier $\ell_1$ identity, we get
$$= \sqrt{1-\delta_s}\, \|x\|_2 - \left(\sqrt{1-\delta_s} + \sqrt{1+\delta_s}\right) \|x - z_1\|_1 = \sqrt{1-\delta_s}\, \|x\|_2 - \gamma_s\, \sigma_s(x)_1.$$
Thus,
$$\|c_0 x - x^\#\|_2 \le \frac{C_1}{\sqrt{s}}\, \sigma_s(x)_1 + C_2\, \frac{\|\epsilon\|_\infty}{\|Ax\|_2} \le \frac{C_1}{\sqrt{s}}\, \sigma_s(x)_1 + C_2\, \frac{\|\epsilon\|_\infty}{\sqrt{1-\delta_s}\,\|x\|_2 - \gamma_s\, \sigma_s(x)_1}.$$
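To see the pieces of Theorem 5 together, the following sketch (illustrative only; recovering $x^\#$ would additionally require the $\ell_1$ stage, e.g. the l1_recover function from Section II-A) builds $B$, forms the measurements $|BAx|^2 + \epsilon$, and checks that the first block of rows reproduces samples of $|p_x|^2$.

import numpy as np

def build_B(m):
    # B from Theorem 5; its three blocks of 2m-1 rows sample p, p - p(. nu),
    # and p - i p(. nu) at powers of omega = exp(2i pi/(2m-1)).
    omega = np.exp(2j * np.pi / (2 * m - 1))
    nu = np.exp(2j * np.pi / m)
    B = np.zeros((6 * m - 3, m), dtype=complex)
    for j in range(1, 6 * m - 2):
        for k in range(1, m + 1):
            base = omega ** (j * (k - 1))
            if j <= 2 * m - 1:
                B[j - 1, k - 1] = base
            elif j <= 4 * m - 2:
                B[j - 1, k - 1] = base - (omega ** j * nu) ** (k - 1)
            else:
                B[j - 1, k - 1] = base - 1j * (omega ** j * nu) ** (k - 1)
    return B

m, N, s = 4, 64, 2
rng = np.random.default_rng(3)
A = (rng.standard_normal((m, N)) + 1j * rng.standard_normal((m, N))) / np.sqrt(2 * m)
x = np.zeros(N, dtype=complex)
x[rng.choice(N, s, replace=False)] = 1.0 + 0.5j
eps = 1e-8 * rng.standard_normal(6 * m - 3)
b = np.abs(build_B(m) @ (A @ x)) ** 2 + eps        # measurements |BAx|^2 + eps
# Sanity check: the first row reproduces |p_x(omega)|^2 up to the noise.
p_omega = np.polyval((A @ x)[::-1], np.exp(2j * np.pi / (2 * m - 1)))
print(b[0], abs(p_omega) ** 2)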
If $x$ is $s$-sparse, then the error bound simplifies to a linear bound in the noise-to-signal ratio.

Corollary 6. If the assumptions of the preceding theorem hold and $x$ is $s$-sparse, then the recovery algorithm results in $x^\#$ such that
$$\min_{|c|=1} \|cx - x^\#\|_2 \le C_2\, \frac{\|\epsilon\|_\infty}{\sqrt{1-\delta_s}\, \|x\|_2}.$$
Together with the random selection of normal, independently distributed entries as in Proposition 4, we achieve
overwhelming probability of approximate recovery.
Corollary 7. If $A$ is a complex random matrix with entries whose real and imaginary parts are drawn independently at random from a normal distribution with mean zero and variance $1/(2m)$, with $s$, $m$, $N$ and $t > 0$ chosen according to the assumption of Proposition 4, then the error bound in the preceding theorem holds for each $x \in \mathbb{C}^N$ with a probability bounded below by $1 - 2\left(\frac{eN}{s}\right)^s e^{-mt^2}$.
Acknowledgment. This paper was supported in part by NSF
grant DMS-1412524.
REFERENCES
[1] Boris Alexeev, Afonso S. Bandeira, Matthew Fickus, and Dustin G.
Mixon, Phase Retrieval with Polarization, SIAM J. Imaging Sci. 7
(2014), no. 1, 35–66.
[2] Radu Balan, Bernhard G. Bodmann, Peter G. Casazza, and Dan Edidin,
Painless reconstruction from magnitudes of frame coefficients, J. Fourier
Anal. Appl. 15 (August 2009), no. 4, 488–501.
[3] Radu Balan and Yang Wang, Invertibility and Robustness of Phaseless
Reconstruction, Applied and Computational Harmonic Analysis 38 (May
2015), no. 3, 469–488.
[4] Afonso S. Bandeira and Dustin G. Mixon, Near-Optimal Phase Retrieval
of Sparse Vectors, Proceedings of SPIE, 2013.
[5] S. Barel, O. Cohen, Y. C. Eldar, D. G. Mixon, and P. Sidorenko, Sparse
Phase Retrieval from Short-Time Fourier Measurements, IEEE Signal
Processing Letters 22 (2015), no. 5, 638–642.
[6] Bernhard G. Bodmann and Nathaniel Hammen, Algorithms and error
bounds for noisy phase retrieval with low-redundancy frames (December
2014), available at arXiv:1412.6678. pre-print.
[7] Ingrid Daubechies, Ronald DeVore, Massimo Fornasier, and C. Sinan Gunturk, Iteratively Re-weighted Least Squares Minimization for Sparse Recovery, Comm. Pure Appl. Math. 63 (2010), 1–38.
[8] Roy Dong, Henrik Ohlsson, Shankar Sastry, and Allen Yang, Compressive Phase Retrieval From Squared Output Measurements Via Semidefinite Programming, 16th IFAC Symposium on System Identification,
SYSID 2012, July 2012.
[9] Simon Foucart and Holger Rauhut, A Mathematical Introduction to
Compressive Sensing, Springer, 2013.
[10] David Gross, Felix Krahmer, and Richard Kueng, A partial derandomization of PhaseLift using spherical designs, J. Fourier Anal. Appl. 21
(2015), 229–266.
[11] Babak Hassibi, Kishore Jaganathan, and Samet Oymak, Recovery of
Sparse 1-D Signals from the Magnitudes of their Fourier Transform,
2012 IEEE International Symposium on Information Theory Proceedings
(ISIT), July 2012.
[12] Mark Iwen, Aditya Viswanathan, and Yang Wang, Robust sparse phase
retrieval made easy (October 2014), available at arXiv:1410.5295. preprint.
[13] Monson H. Hayes, Jae S. Lim, and Alan V. Oppenheim, Signal reconstruction from phase or magnitude, IEEE Trans. Acoust., Speech, Signal
Process. 28 (December 1980), no. 6, 672–680.
[14] Xiaodong Li and Vladislav Voroninski, Sparse Signal Recovery from Quadratic Measurements via Convex Programming, SIAM J. Math. Anal. 45 (2013), no. 5, 3019–3033.
[15] Arthur L. Patterson, A direct method for the determination of the components of interatomic distances in crystals, Zeitschrift für Kristallographie
90 (1935), 517–542.
[16] Volker Pohl, Fanny Yang, and Holger Boche, Phaseless signal recovery
in infinite dimensional spaces using structured modulations, J. Fourier
Anal. Appl. 20 (December 2014), 1212–1233.
[17] Lawrence Rabiner and Biing-Hwang Juang, Fundamentals of speech recognition, Prentice Hall, 1993.
[18] Vladislav Voroninski and Zhiqiang Xu, A strong restricted isometry
property, with an application to phaseless compressed sensing (April
2014), available at arXiv:1404.3811. pre-print.
[19] Adriaan Walther, The question of phase retrieval in optics, Journal of
Modern Optics 10 (1963), no. 1, 41–49.
[20] Fanny Yang, Volker Pohl, and Holger Boche, Phase retrieval via
structured modulations in Paley-Wiener spaces, Proc. 10th Intern. Conf.
on Sampling Theory and Applications (SampTA), July 2013.