Nonuniform Sparse Recovery with Random Convolutions

David James
Institute for Numerical and Applied Mathematics
University of Goettingen, Germany
Email: d.james@math.uni-goettingen.de

Holger Rauhut
Chair for Mathematics C (Analysis)
RWTH Aachen University, Germany
Email: rauhut@mathc.rwth-aachen.de
Abstract
We discuss the use of random convolutions for Compressed Sensing applications. In particular, we show that after convolving an $N$-dimensional, $s$-sparse signal with a Rademacher or Steinhaus sequence, it can be recovered via $\ell_1$-minimization using only $m \gtrsim s \log(N/\varepsilon)$ arbitrarily chosen samples with probability at least $1 - \varepsilon$.
1. Introduction
The fast evolving area of Compressed Sensing has had a tremendous impact on data acquisition in recent years. It states the paradigm that, instead of first measuring information and then compressing it, data acquisition and compression should be done simultaneously. One underlying concept of Compressed Sensing is sparsity. A signal $x \in \mathbb{C}^N$ is said to be $s$-sparse if it has at most $s$ non-vanishing entries. A key task is then to design a measurement matrix $A \in \mathbb{C}^{m \times N}$ such that a signal $x \in \mathbb{C}^N$ can be recovered from the linear measurements $y = Ax$ via Basis Pursuit, i.e. such that for

$$x^\sharp = \operatorname*{arg\,min}_{z \in \mathbb{C}^N} \|z\|_1 \quad \text{subject to } Az = Ax \tag{BP}$$

it holds that $x^\sharp = x$. In this paper, we consider measurement matrices which arise from the circular convolution of a signal with a random vector, followed by subsampling with respect to an arbitrary deterministic subset. This is a physically relevant sensing architecture and can be realized as an FIR filter. Convolution matrices also allow for fast matrix-vector multiplication, since they can be diagonalized using the Fast Fourier Transform.
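Since this fast multiplication is central to the sensing architecture, we sketch it below. This is a minimal Python illustration of diagonalizing a circulant matrix by the FFT, not code from the paper; the helper name `partial_circulant_apply` and the 0-based index array `omega` (standing in for $\Omega$) are our own choices.

```python
import numpy as np

def partial_circulant_apply(z, x, omega):
    """Evaluate R_Omega Phi(z) x without forming the N x N matrix.

    Phi(z) x is the circular convolution of z and x, which the FFT
    diagonalizes, so the cost is O(N log N) instead of O(N^2).
    """
    full = np.fft.ifft(np.fft.fft(z) * np.fft.fft(x))  # Phi(z) x
    return full[omega]                                  # subsample on Omega
```

For small $N$ the result can be cross-checked against the dense product `scipy.linalg.circulant(z)[omega, :] @ x`.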
2. Notation
Throughout this paper, we denote by $[N]$ the set of integers from one to $N$. For any $k \in [N]$, the translation operator $T_k \in \mathbb{C}^{N \times N}$ is the linear map that, applied to an arbitrary vector $z \in \mathbb{C}^N$, gives for all $j \in [N]$

$$(T_k(z))_j = z_{j \ominus k},$$

where $\ominus$ denotes subtraction modulo $N$. The circulant matrix $\Phi(z) \in \mathbb{C}^{N \times N}$ generated by a vector $z \in \mathbb{C}^N$ is the matrix consisting of all possible translations of the vector $z$, i.e.

$$\Phi(z) = \big(T_k(z)\big)_{0 \le k \le N-1} = \begin{pmatrix} z_1 & z_N & \cdots & z_2 \\ z_2 & z_1 & \cdots & z_3 \\ \vdots & \vdots & \ddots & \vdots \\ z_N & z_{N-1} & \cdots & z_1 \end{pmatrix}.$$
The convolution is followed by deterministic subsampling, and we denote by ${}_\Omega\Phi(z)$ the partial circulant matrix subsampled by the subset $\Omega \subset [N]$,

$${}_\Omega\Phi(z) = R_\Omega \Phi(z), \tag{1}$$

where $R_\Omega$ is the linear restriction operator $\mathbb{C}^N \to \mathbb{C}^\Omega$ which restricts any vector $x \in \mathbb{C}^N$ to its entries indexed by $\Omega$. The adjoint operator $R_\Omega^*$ extends any vector $x \in \mathbb{C}^\Omega$ to $\mathbb{C}^N$ by filling it with zeros. It follows that $R_\Omega R_\Omega^* = I_\Omega$ is the identity on $\mathbb{C}^\Omega$ and $R_\Omega^* R_\Omega = P_\Omega$ is the projection onto the subspace spanned by the canonical basis vectors indexed by $\Omega$. We will apply the partial circulant matrix to an individual signal which is supported on a subset $\Lambda \subset [N]$ and denote the column-restricted partial circulant matrix by

$${}_\Omega\Phi(z)_\Lambda = R_\Omega \Phi(z) R_\Lambda^*.$$
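As a concrete sanity check of these identities, the following minimal Python sketch (our own illustration, using 0-based indices rather than the paper's one-based $[N]$) builds $R_\Omega$ as a 0/1 matrix and forms the partial circulant matrix from (1).

```python
import numpy as np
from scipy.linalg import circulant

N = 8
omega = np.array([0, 2, 5])                  # 0-based stand-in for Omega, m = 3

R_om = np.eye(N)[omega, :]                   # restriction operator R_Omega
assert np.allclose(R_om @ R_om.T, np.eye(len(omega)))  # R_Omega R_Omega^* = I_Omega
P_om = R_om.T @ R_om                         # R_Omega^* R_Omega = P_Omega
assert np.allclose(P_om @ P_om, P_om)        # P_Omega is indeed a projection

z = np.random.default_rng(0).standard_normal(N)
A = R_om @ circulant(z)                      # partial circulant matrix R_Omega Phi(z)
```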
The random convolutions in this paper are partial random circulant matrices generated by either a Rademacher or a Steinhaus sequence, which we define next.
Definition 2.1. A Rademacher random variable is a random variable taking only the values $+1$ or $-1$ with equal probability, while a Steinhaus random variable $\sigma$ is distributed uniformly on the complex unit circle $S^1 \subset \mathbb{C}$. We call an $N$-dimensional random vector $\xi$ a Rademacher or Steinhaus sequence if its entries $\xi_j$, $j \in [N]$, are independent Rademacher or Steinhaus random variables, respectively.

For the rest of this paper, a Rademacher sequence will usually be denoted by $\epsilon$, while $\sigma$ refers to a Steinhaus sequence.
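Both kinds of sequences are easy to sample; the following sketch (our own helper names) draws them with NumPy. Note that in both cases every entry has modulus one, matching Definition 2.1.

```python
import numpy as np

rng = np.random.default_rng(0)

def rademacher(N):
    """Rademacher sequence: independent entries, +1 or -1 with equal probability."""
    return rng.choice([-1.0, 1.0], size=N)

def steinhaus(N):
    """Steinhaus sequence: independent entries uniform on the complex unit circle."""
    return np.exp(2j * np.pi * rng.uniform(size=N))
```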
3. Main result

Theorem 3.1. Let $x \in \mathbb{C}^N$ be an arbitrary $s$-sparse signal and let $A = m^{-1/2}\,{}_\Omega\Phi(\xi) \in \mathbb{C}^{m \times N}$ be a normalized partial random circulant matrix generated by either a Rademacher sequence $\xi = \epsilon$ or a Steinhaus sequence $\xi = \sigma$ and subsampled by an arbitrary deterministic subset $\Omega \subset [N]$ of cardinality $\sharp\Omega = m$. Then with probability at least $1 - \varepsilon$, Basis Pursuit (BP) recovers $x$ from $y = Ax$ provided

$$m \ge C_1 s \log(N/\varepsilon) \quad\text{and}\quad m \ge C_\xi\big(s \log(s^4/\varepsilon) + C_2\big), \tag{2}$$

with absolute constants depending only on the choice of the random vector $\xi$, i.e. $C_1 \approx 305.7$, $C_\sigma \approx 67.87$, $C_\epsilon \approx 135.74$ and $C_2 \approx 16.70$.

Since $s < N$, there exists an absolute constant $C > 0$ such that both inequalities in (2) are implied by $m \ge C s \log(N/\varepsilon)$.

Nonuniform recovery with partial random circulant matrices has been studied before. In [2] a similar statement was proven under the additional assumption that the signs of the signal $x$ also form a Rademacher sequence, requiring $m \ge 57 s \log(17 N^2/\varepsilon)$. To the best of our knowledge, our result is the first nonuniform recovery result for partial random circulant matrices in which $m$ scales linearly in the log-factors. It falls only slightly short, in terms of the constant and the factors inside the logarithm, of a similar result for Gaussian matrices, where, for large $N$, only $m \ge 2s \log(N/s)$ measurements are needed [1]. However, our result does not apply to noisy measurements or to compressible instead of sparse signals. A stronger recovery guarantee for partial random circulant matrices, proved in [4], achieves uniform, stable and robust recovery implied by the Restricted Isometry Property at the cost of additional log-factors, i.e. $m \gtrsim Cs \max\{(\log s)^2(\log N)^2, \log(\varepsilon^{-1})\}$ measurements.

Note that there are also other measurement schemes using convolutions for sparse recovery, such as random convolution followed by random subsampling, studied in [8], and random subsampling after deterministic convolution, studied e.g. in [5].

4. Proof of the main result

Let $A = {}_\Omega\Phi(\xi)_\Lambda$ be the partial random circulant matrix generated by $\xi$, subsampled by $\Omega \subset [N]$ and with its columns additionally restricted to the set $\Lambda \subset [N]$. We will now investigate the two matrices

$$H(\xi) = m^{-1}\big({}_\Omega\Phi(\xi)_\Lambda\big)^*\,{}_\Omega\Phi(\xi)_\Lambda - I_\Lambda = m^{-1}\,{}_\Lambda\Phi^*(\xi) P_\Omega \Phi(\xi)_\Lambda - I_\Lambda$$

and

$$K(\xi) = m^{-1}\big({}_\Omega\Phi(\xi)\big)^*\,{}_\Omega\Phi(\xi)_\Lambda - R_\Lambda^* = m^{-1}\,\Phi^*(\xi) P_\Omega \Phi(\xi)_\Lambda - R_\Lambda^*.$$

Note that since $R_\Lambda R_\Lambda^* = I_\Lambda$, we have the identity $R_\Lambda K(\xi) = H(\xi)$. The following recovery lemma holds for any matrix $A \in \mathbb{C}^{m \times N}$, but for convenience it will be stated only for partial random circulant matrices. A proof can be found in [6, Lemma 5.1].
Proposition 4.1. Let $x \in \mathbb{C}^N$ be an $s$-sparse signal with support set $\Lambda$, let $\xi \in \mathbb{C}^N$ be a random vector, and let $A = m^{-1/2}\,{}_\Omega\Phi(\xi)$ and $H(\xi)$, $K(\xi)$ be as above. If $\beta > 0$, $\kappa > 0$, $k \in \mathbb{N}$ and $L_t \in \mathbb{N}$, $t \in [k]$, are chosen such that they satisfy

$$\frac{\kappa}{1-\kappa} \le \frac{1-a}{1+a}\, s^{-3/2}, \qquad a := \sum_{t \in [k]} \beta^{\,k/L_t} < 1, \tag{3}$$

then $x^\sharp \ne x$ in (BP) with probability at most

$$p = \kappa^{-2}\,\mathbb{E}\big[\operatorname{Tr} H(\xi)^{2k}\big] + \beta^{-2k} \sum_{\rho \in \Lambda^c} \sum_{t \in [k]} \mathbb{E}\big[|(K(\xi)H(\xi)^{t-1} R_\Lambda \operatorname{sgn}(x))_\rho|^{2L_t}\big]. \tag{4}$$

In order to get a reasonably small bound on $p$ in (4), we have to bound both expectations separately. To this end, we first introduce some notation.
Definition 4.2. Let $D_2^k$ be the set of all derangements (fixed-point-free permutations) of the set $[k]$, where $k$ is an even integer, and let $D_2(k, l) \subset D_2^k$ be the set of all derangements of $[k]$ which can be decomposed into $l \le k/2$ cycles. Their cardinalities

$$\sharp D_2(k, l) = d_2(k, l)$$

are called associate Stirling numbers of the first kind. They can be computed inductively, see e.g. [7, p. 75], as

$$d_2(0, 0) = 1, \quad d_2(k, 0) = 0 \text{ for } k \ge 1, \quad d_2(k, l) = 0 \text{ if } l > \tfrac{k}{2},$$
$$d_2(k+1, l) = k\big(d_2(k, l) + d_2(k-1, l-1)\big), \quad 1 \le l \le \tfrac{k}{2}.$$

We further define the functions

$$G_k(z) := z^{-k} \sum_{l \in [k/2]} d_2(k, l)\, z^l.$$
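The recurrence translates directly into code. The sketch below (our own helper names) computes $d_2(k, l)$ by memoized recursion, shifted so that the rule $d_2(k+1, l) = k(d_2(k, l) + d_2(k-1, l-1))$ is applied at index $k$, and evaluates $G_k(z)$.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def d2(k, l):
    """Associate Stirling numbers of the first kind (Definition 4.2)."""
    if k == 0:
        return 1 if l == 0 else 0
    if l < 1 or l > k // 2:      # d2(k, 0) = 0 for k >= 1, and d2(k, l) = 0 for l > k/2
        return 0
    # shifted form of d2(k+1, l) = k * (d2(k, l) + d2(k-1, l-1))
    return (k - 1) * (d2(k - 1, l) + d2(k - 2, l - 1))

def G(k, z):
    """G_k(z) = z^(-k) * sum over l in [k/2] of d2(k, l) * z^l."""
    return sum(d2(k, l) * z ** l for l in range(1, k // 2 + 1)) / z ** k

# Sanity check: summing d2(k, l) over l recovers the derangement numbers 1, 0, 1, 2, 9.
assert [sum(d2(k, l) for l in range(k // 2 + 1)) for k in range(5)] == [1, 0, 1, 2, 9]
```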
In terms of the definition above, the upper bounds on the expectations in Proposition 4.1 read as follows.

Lemma 4.3. Let $\Lambda \subset [N]$ be a subset of cardinality $s$ and let $k$ be an even integer. Let $\epsilon$ be a Rademacher sequence and $\sigma$ a Steinhaus sequence. Then

$$\mathbb{E}\big[\operatorname{Tr} H^k(\sigma)\big] \le s\, G_k(m/s), \qquad \mathbb{E}\big[\operatorname{Tr} H^k(\epsilon)\big] \le s\, G_k(m/(2s)).$$

Lemma 4.4. Let $\Lambda \subset [N]$ be a subset of cardinality $s$ and let $L, t \in \mathbb{N}$ be some integers. Then for any $\rho \notin \Lambda$,

$$\mathbb{E}\big[|(K(\sigma)H(\sigma)^{t-1} R_\Lambda \operatorname{sgn}(x))_\rho|^{2L}\big] \le G_{2tL}(m/s),$$
$$\mathbb{E}\big[|(K(\epsilon)H(\epsilon)^{t-1} R_\Lambda \operatorname{sgn}(x))_\rho|^{2L}\big] \le G_{2tL}(m/(2s)).$$

Due to page limitations, we only sketch the proof of Lemma 4.3. The proofs of Lemmas 4.3 and 4.4 are given in full detail in [3]. They are inspired by [6], where a similar technique was used to bound the expectations in Proposition 4.1 for $H(\xi)$ and $K(\xi)$ corresponding to time-frequency structured random matrices generated by a Steinhaus sequence $\sigma$.
Sketch of the proof of Lemma 4.3: Let $\xi$ be either $\epsilon$ or $\sigma$, let $\lambda_1, \dots, \lambda_k \in \Lambda$, and denote by $\langle\cdot,\cdot\rangle_\Omega$ the usual inner product with the summation restricted to the subset $\Omega \subset [N]$. Then, by elementary calculations, it follows that

$$\mathbb{E}\big[\operatorname{Tr} H^k(\xi)\big] = m^{-k} \sum_{\substack{\lambda_1,\dots,\lambda_k \in \Lambda \\ \lambda_1 \ne \lambda_2 \ne \cdots \ne \lambda_k \ne \lambda_1}} \mathbb{E}\Big[\prod_{p \in [k]} \langle T_{\lambda_p}\xi,\, T_{\lambda_{p\oplus 1}}\xi\rangle_\Omega\Big] \tag{5}$$

$$=: m^{-k} \sum_{\substack{\lambda_1,\dots,\lambda_k \in \Lambda \\ \lambda_1 \ne \cdots \ne \lambda_k \ne \lambda_1}}\ \sum_{\omega_1,\dots,\omega_k \in \Omega} \mathbb{E}\big[P^{\omega_1,\dots,\omega_k}_{\lambda_1,\dots,\lambda_k}(\xi)\big],$$

where $\oplus$ denotes summation modulo $k$, $\ominus$ denotes subtraction modulo $N$, in $\lambda_1 \ne \lambda_2 \ne \cdots \ne \lambda_k \ne \lambda_1$ the inequality only has to hold for neighbouring quantities, and we defined

$$P^{\omega_1,\dots,\omega_k}_{\lambda_1,\dots,\lambda_k}(\xi) := \prod_{p \in [k]} \xi_{\omega_p \ominus \lambda_p}\, \overline{\xi}_{\omega_p \ominus \lambda_{p\oplus 1}}.$$

By independence of the random variables $\xi_i$, $i \in [N]$, the expectation $\mathbb{E}[P^{\omega_1,\dots,\omega_k}_{\lambda_1,\dots,\lambda_k}(\xi)]$ factorizes into

$$\mathbb{E}\big[P^{\omega_1,\dots,\omega_k}_{\lambda_1,\dots,\lambda_k}(\xi)\big] = \prod_{i \in [N]} \mathbb{E}\big[\xi_i^{a_i} \overline{\xi}_i^{b_i}\big], \qquad \sum_{i \in [N]} a_i + b_i = 2k, \tag{6}$$

where

$$a_i = \sharp\{j \in [k] \mid \omega_j \ominus \lambda_j = i\}, \qquad b_i = \sharp\{j \in [k] \mid \omega_j \ominus \lambda_{j\oplus 1} = i\}.$$

To investigate the expectations occurring in (6), we have to distinguish between $\xi = \epsilon$ being a Rademacher sequence and $\xi = \sigma$ being a Steinhaus sequence. It follows that

$$\mathbb{E}\big[P^{\omega_1,\dots,\omega_k}_{\lambda_1,\dots,\lambda_k}(\xi)\big] = \begin{cases} 1 & \text{if } \xi = \sigma,\ a_i = b_i\ \forall i, \text{ or } \xi = \epsilon,\ a_i + b_i \text{ even } \forall i, \\ 0 & \text{else.} \end{cases} \tag{7}$$
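The case distinction in (7) comes from the scalar moments $\mathbb{E}[\xi_i^{a_i}\overline{\xi}_i^{b_i}]$: for Steinhaus variables the phases cancel exactly when $a_i = b_i$, and for Rademacher variables even total powers equal one. A quick Monte Carlo sketch (our own illustration) makes the pattern visible.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
eps = rng.choice([-1.0, 1.0], size=n)             # Rademacher samples
sig = np.exp(2j * np.pi * rng.uniform(size=n))    # Steinhaus samples

def moment(xi, a, b):
    """Monte Carlo estimate of E[xi^a * conj(xi)^b]."""
    return np.mean(xi ** a * np.conj(xi) ** b)

print(moment(sig, 2, 2))   # ~1: Steinhaus with a = b
print(moment(sig, 2, 1))   # ~0: Steinhaus with a != b
print(moment(eps, 3, 1))   # ~1: Rademacher with a + b even
print(moment(eps, 2, 1))   # ~0: Rademacher with a + b odd
```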
We now estimate the number of all possible values of $(\lambda_1, \dots, \lambda_k, \omega_1, \dots, \omega_k)$ for which $\mathbb{E}[P^{\omega_1,\dots,\omega_k}_{\lambda_1,\dots,\lambda_k}(\xi)] = 1$. To this end, we define, for $\xi = \sigma$ being a Steinhaus sequence or $\xi = \epsilon$ being a Rademacher sequence, bipartite graphs $G^{\omega_1,\dots,\omega_k}_{\lambda_1,\dots,\lambda_k}(\xi)$ whose edge sets $E^{\omega_1,\dots,\omega_k}_{\lambda_1,\dots,\lambda_k}(\xi)$ depend on the choice of the random sequence and the values of $(\lambda_1, \dots, \lambda_k, \omega_1, \dots, \omega_k) \in \Lambda^k \times \Omega^k$. We can show that $\mathbb{E}[P^{\omega_1,\dots,\omega_k}_{\lambda_1,\dots,\lambda_k}(\xi)] = 1$ holds if and only if $G^{\omega_1,\dots,\omega_k}_{\lambda_1,\dots,\lambda_k}(\xi)$ has a perfect matching. To upper bound the sum in (5), we conclude the proof by counting the number of graphs $G^{\omega_1,\dots,\omega_k}_{\lambda_1,\dots,\lambda_k}(\xi)$ which possess a perfect matching. Owing to their similar structure, analogous ideas prove Lemma 4.4.
We now show how our main result, Theorem 3.1, follows from Lemmas 4.3 and 4.4.

Proof of Theorem 3.1: We aim to apply Proposition 4.1 and set $z := m/s$ if $\xi = \sigma$ and $z := m/(2s)$ if $\xi = \epsilon$. We apply Lemma 4.3 to bound the first expectation and Lemma 4.4 to bound the second expectation in (4) and get

$$p \le \kappa^{-2} s\, G_{2k}(z) + \beta^{-2k} \sum_{\rho \in \Lambda^c} \sum_{t \in [k]} G_{2tL_t}(z) \le \kappa^{-2} s\, G_{2k}(z) + \beta^{-2k} N \sum_{t \in [k]} G_{2tL_t}(z), \tag{8}$$

where the second inequality follows since $\sharp\Lambda^c = N - s \le N$.
We now proceed with a bound on the associate Stirling numbers of the first kind $d_2(k, l)$. To this end, we prove inductively that

$$d_2(k+1, l) \le (2k)^{k-l} \qquad \forall k \in \mathbb{N},\ 0 \le l \le k/2. \tag{9}$$

Inequality (9) clearly holds true for all pairs $(k, l)$ given in the base cases of Definition 4.2 and also for $(k, l) = (2, 1)$, since $d_2(2, 1) = 1$. Now suppose that $k \ge 2$ and that the inequality in (9) holds true for all $\tilde{k} \le k$, $l \ge 0$. Then

$$d_2(k+1, l) = k\big(d_2(k, l) + d_2(k-1, l-1)\big) \le k\big((2(k-1))^{k-1-l} + (2(k-2))^{k-2-(l-1)}\big) \le 2k\,(2k)^{k-1-l} = (2k)^{k-l},$$

which proves the claim.
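As a quick sanity check on (9), the recursion can be compared against the bound numerically, reusing the d2 helper from the sketch after Definition 4.2; this is our own verification, not part of the original argument.

```python
# Spot check of d2(k+1, l) <= (2k)^(k-l) for small parameters.
for k in range(1, 15):
    for l in range(k // 2 + 1):
        assert d2(k + 1, l) <= (2 * k) ** (k - l)
```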
Using inequality (9) and the formula for partial sums of the geometric series, we can conclude that

$$G_{2k}(z) \le z^{-2k} \sum_{l \in [k]} \big(2(2k-1)\big)^{2k-1-l} z^l \le \frac{1}{4k}\left(\frac{4k}{z}\right)^{2k} \sum_{l \in [k]} \left(\frac{z}{4k}\right)^{l} = \frac{1}{4k}\left(\frac{4k}{z}\right)^{k} \frac{1 - (4k/z)^k}{1 - (4k/z)},$$

where we assumed that

$$4k_z/z \le \alpha < 1 \tag{10}$$

for $k = k_z \ge 1$ to be chosen below. Consequently,

$$G_{2k_z}(z) \le \frac{1}{4k_z}\,\frac{(4k_z/z)^{k_z}}{1 - (4k_z/z)} \le \frac{1}{4k_z}\,\frac{\alpha^{k_z}}{1 - \alpha} \le \frac{\alpha^{k_z}}{4(1-\alpha)}. \tag{11}$$
We can now upper bound the sum in (8) and choose $L_t$ to be $k/t$ rounded to the nearest integer. Then $tL_t$ is an integer contained in the set $\big[\tfrac{2k}{3}, \tfrac{4k}{3}\big] \cap \mathbb{N}$. Indeed, if $t > \tfrac{2k}{3}$, then $\tfrac{k}{t} < 1.5$, i.e. $L_t = 1$ and therefore

$$|tL_t - k| = |t - k| < \big|\tfrac{2k}{3} - k\big| = \tfrac{k}{3}.$$

Assume now that $t \le \tfrac{2k}{3}$; then

$$|tL_t - k| = \big|tL_t - t\tfrac{k}{t}\big| = t\big|L_t - \tfrac{k}{t}\big| \le \tfrac{t}{2} \le \tfrac{k}{3},$$

which proves the claim.
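A small loop (again our own check, with Python's round standing in for rounding to the nearest integer) confirms the claim for a range of even $k$.

```python
# Verify that t * L_t lies in [2k/3, 4k/3] for L_t = round(k / t).
for k in range(2, 41, 2):
    for t in range(1, k + 1):
        L_t = round(k / t)
        assert 2 * k / 3 <= t * L_t <= 4 * k / 3
```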
Applying now (11) and setting

$$k := k_z := \left\lfloor \frac{3\alpha z}{16} \right\rfloor \tag{12}$$

for some $\alpha > 0$ to be chosen below, which satisfies $4k'/z \le \alpha < 1$ for all $k' \in \big[\tfrac{2k_z}{3}, \tfrac{4k_z}{3}\big] \cap \mathbb{N}$, yields

$$\sum_{t \in [k_z]} G_{2tL_t}(z) \le k_z \max_{k' \in [\frac{2k_z}{3}, \frac{4k_z}{3}] \cap \mathbb{N}} G_{2k'}(z) \le k_z \max_{k' \in [\frac{2k_z}{3}, \frac{4k_z}{3}] \cap \mathbb{N}} \frac{\alpha^{k'}}{4(1-\alpha)} \le k_z\, \frac{\alpha^{2k_z/3}}{4(1-\alpha)}.$$
This gives for the second term in (8),

$$\beta^{-2k} N \sum_{t \in [k]} G_{2tL_t}(z) \le N k_z\, \frac{(\beta^{-3}\alpha)^{2k_z/3}}{4(1-\alpha)}.$$

Choosing $\alpha := \beta^3 e^{-3/2}$ and $\varepsilon \in (0, 1)$, the latter is upper bounded by $\varepsilon/2$ if

$$\frac{k_z e^{-k_z}}{2(1-\alpha)} \le \frac{\varepsilon}{N}.$$
The inequality above holds true, by monotonicity of the logarithm, if and only if

$$\log(N/\varepsilon) \le -\log\frac{k_z e^{-k_z}}{2(1-\alpha)} = k_z\left(1 - k_z^{-1}\log\frac{k_z}{2(1-\alpha)}\right).$$
We now choose $\beta = 0.47$, which is valid since $\alpha = \beta^3 e^{-3/2} \le 0.0232 < 1$ and the corresponding $a$ in Proposition 4.1 is always less than $0.957 < 1$. Noting that the function $t \mapsto t^{-1}\log\big(\tfrac{t}{2(1-\alpha)}\big)$ is monotonically decreasing for $t \ge 6$ and our choice of $\alpha$, we now assume that $k_z \ge K$ for some integer $K \ge 6$ and derive

$$k_z\left(1 - k_z^{-1}\log\frac{k_z}{2(1-\alpha)}\right) \ge k_z\left(1 - \frac{\log(K(1-\alpha)^{-1}/2)}{K}\right). \tag{13}$$
Since for any $x \ge K$,

$$\lfloor x \rfloor \ge x - 1 = \left(1 - \frac{1}{x}\right)x = \frac{x-1}{x}\,x \ge \frac{K-1}{K}\,x, \tag{14}$$

we can lower bound $k_z$ by

$$k_z = \left\lfloor \frac{3\alpha z}{16} \right\rfloor \ge \frac{3\alpha z (K-1)}{16K}.$$
Applying this bound in (13) yields

$$k_z - \log\frac{k_z}{2(1-\alpha)} \ge \frac{3\alpha z(K-1)}{16K}\left(1 - \frac{\log(K(1-\alpha)^{-1}/2)}{K}\right) \ge \log(N/\varepsilon),$$

provided

$$z \ge \frac{16K}{3\alpha(K-1)}\left(1 - \frac{\log(K(1-\alpha)^{-1}/2)}{K}\right)^{-1}\log(N/\varepsilon) =: C(\alpha, K)\log(N/\varepsilon). \tag{15}$$
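For concreteness, $C(\alpha, K)$ can be evaluated numerically; the sketch below (our own code) reproduces the constant $C_1 = C(\alpha, 10) \approx 305.7$ used next, under the choices $\beta = 0.47$ and $\alpha = \beta^3 e^{-3/2}$ from above.

```python
import math

beta = 0.47
alpha = beta ** 3 * math.exp(-1.5)       # alpha = beta^3 e^{-3/2} ~ 0.0232

def C(alpha, K):
    """Constant C(alpha, K) defined in (15)."""
    return 16 * K / (3 * alpha * (K - 1)) / (1 - math.log(K / (2 * (1 - alpha))) / K)

print(C(alpha, 10))                      # ~305.7, the constant C_1 of Theorem 3.1
```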
Now consider the Steinhaus case, where $z = m/s$, and choose $K = 10$. With the previous choice of $\alpha := \beta^3 e^{-3/2}$, $\beta = 0.47$, we then get

$$C_1 := C(\alpha, 10) \approx 305.7,$$

and we still have to ensure that (15) implies $k_z \ge K = 10$, which was assumed above. We may assume that $s \ge 1$, since $s = 0$ would imply that $x = 0$ is the zero vector, which is clearly recovered via Basis Pursuit. Then (15) demands that

$$m/\log(m) \ge m/\log(N/\varepsilon) \ge C_1 s \ge C_1,$$

and a numerical test shows that this is the case if $m \ge 2377$, which gives

$$z \ge C_1\log(N/\varepsilon) \ge C_1\log(m) \ge C_1\log(2377).$$

Hence the minimal choice of $z$ yields in (12)

$$k_z = \left\lfloor \frac{3\alpha z}{16} \right\rfloor \ge \left\lfloor \frac{3\alpha C_1 \log(2377)}{16} \right\rfloor = \lfloor 10.325 \rfloor = 10,$$

and our assumption that $k_z \ge K = 10$ is satisfied.
We now turn to the Rademacher case, where $z = m/(2s)$; we again set $K = 10$ and assume $s \ge 1$. Then (15) requires, similarly as above,

$$m/\log(m) \ge 2sC_1 \ge 2C_1,$$

which, by a numerical test, is the case if $m \ge 5236$. Plugging this again into (12) yields for the minimal choice of $z \ge C_1\log(m) \ge C_1\log(5236)$,

$$k_z = \left\lfloor \frac{3\alpha z}{16} \right\rfloor \ge \left\lfloor \frac{3\alpha C_1 \log(5236)}{16} \right\rfloor = \lfloor 11.372 \rfloor = 11,$$

and $k_z \ge K$ is satisfied. Consequently,

$$\beta^{-2k} N \sum_{t \in [k]} G_{2tL_t}(z) \le \varepsilon/2 \quad\text{and}\quad k_z \ge 10$$

are implied by

$$m \ge C_1 s \log(N/\varepsilon). \tag{16}$$

For the first term in (8) we choose the smallest possible $\kappa$ such that (3) is satisfied, i.e.

$$\kappa = \frac{\frac{1-a}{1+a}\,s^{-3/2}}{1 + \frac{1-a}{1+a}\,s^{-3/2}} \ge \frac{1-a}{2(1+a)}\, s^{-3/2}.$$

We use (11) and the inequality above to obtain

$$\kappa^{-2} s\, G_{2k}(z) \le \kappa^{-2} s\, \frac{\alpha^{k_z}}{4(1-\alpha)} \le \left(\frac{2(1+a)}{1-a}\right)^2 s^4\, \frac{\alpha^{k_z}}{4(1-\alpha)}.$$

Hence the first term in (8) is upper bounded by $\varepsilon/2$ if

$$\alpha^{-k_z} \ge \left(\frac{2(1+a)}{1-a}\right)^2 s^4\, \frac{2/\varepsilon}{4(1-\alpha)},$$

which holds, by monotonicity of the logarithm, if and only if

$$\log(\alpha^{-1})\, k_z \ge \log\frac{2(1+a)^2}{(1-a)^2(1-\alpha)} + \log(s^4/\varepsilon).$$

Choosing $\alpha$ as above and assuming again that $k_z \ge K$, we can apply (14) once again and see that the above holds if

$$z \ge \frac{16K}{3\alpha(K-1)\log(\alpha^{-1})}\left(\log\frac{2(1+a)^2}{(1-a)^2(1-\alpha)} + \log(s^4/\varepsilon)\right).$$

Since (16) already ensures that $k_z \ge 10$, we can now plug $\beta = 0.47$, $a \le 0.957$ and $K = 10$ into the above inequality to get

$$z \ge \tilde{C}\big(\log(s^4/\varepsilon) + C_2\big)$$

with constants $\tilde{C} \approx 67.87$ and $C_2 \approx 8.353$. The claim now follows by replacing $z$ by $z = m/s$ if $\xi = \sigma$ is a Steinhaus sequence or by $z = m/(2s)$ if $\xi = \epsilon$ is a Rademacher sequence.
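Evaluating these constants numerically (our own sketch, plugging in $a = 0.957$ and the $\alpha$ from above) gives values matching the quoted ones up to rounding.

```python
import math

beta, a, K = 0.47, 0.957, 10
alpha = beta ** 3 * math.exp(-1.5)

C_tilde = 16 * K / (3 * alpha * (K - 1) * math.log(1 / alpha))
C_2 = math.log(2 * (1 + a) ** 2 / ((1 - a) ** 2 * (1 - alpha)))
print(C_tilde, C_2)                      # ~67.9 and ~8.35
```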
References

[1] V. Chandrasekaran, B. Recht, P. Parrilo, and A. Willsky. The convex geometry of linear inverse problems. Found. Comput. Math., 12(6):805–849, 2012.

[2] M. Fornasier. Theoretical Foundations and Numerical Methods for Sparse Recovery. Radon Series on Computational and Applied Mathematics. De Gruyter, 2010.

[3] D. James. Sparse recovery with random convolutions. Master's thesis, University of Bonn, 2013. http://num.math.uni-goettingen.de/~djames/publications/MasterThesis_James.pdf

[4] F. Krahmer, S. Mendelson, and H. Rauhut. Suprema of chaos processes and the restricted isometry property. Comm. Pure Appl. Math., to appear.

[5] K. Li, L. Gan, and C. Ling. Convolutional compressed sensing using deterministic sequences. IEEE Trans. Signal Process., 61(3):740–752, 2013.

[6] G. E. Pfander and H. Rauhut. Sparsity in time-frequency representations. J. Fourier Anal. Appl., 16(2):233–260, 2010.

[7] J. Riordan. An Introduction to Combinatorial Analysis. Dover Books on Mathematics. Dover Publications, 2002.

[8] J. Romberg. Compressive sensing by random convolution. SIAM J. Imaging Sci., 2(4):1098–1128, 2009.