Recovery of Periodic Clustered Sparse Signals From Compressive Measurements

Chia Wei Lim and Michael B. Wakin
Department of Electrical Engineering and Computer Science, Colorado School of Mines

Abstract—The theory of Compressive Sensing (CS) enables the efficient acquisition of signals which are sparse or compressible in an appropriate domain. In the sub-field of CS known as model-based CS, prior knowledge of the signal sparsity profile is used to improve compression and sparse signal recovery rates. In this paper, we show that by exploiting the periodic support of Periodic Clustered Sparse (PCS) signals, model-based CS improves upon classical CS. We quantify this improvement in terms of simulations performed with a proposed greedy algorithm for PCS signal recovery and provide sampling bounds for the recovery of PCS signals from compressive measurements.

Index Terms—Compressive Sensing, model-based Compressive Sensing, Periodic Clustered Sparse signals

I. INTRODUCTION

Compressive Sensing (CS) has recently enabled efficient signal acquisition protocols with acquisition requirements much lower than those required by the Nyquist sampling theorem [1, 2]. CS exploits the compressibility of signals that can be well represented by just a few coefficients in an appropriate basis. In the classical setting, CS acquisition protocols involve taking a small number of "compressive" linear measurements of a signal and subsequently recovering the signal from such measurements by solving a convex optimization problem or using a greedy algorithm. More recently, it has been shown [3] that by imposing further structural dependencies on the locations of the signal coefficients, the number of compressive measurements required for signal recovery can be further reduced. These structural dependencies require a priori information about the structure of the signal support, and signal recovery that relies on such information is therefore termed model-based CS.
In this paper, the theory of model-based CS [3] is used to reduce the number of compressive measurements required for recovery of Periodic Clustered Sparse (PCS) signals. PCS signals, which we define formally in Section III, are sparse signals whose supports are concentrated in a periodically repeating cluster. The model-based CoSaMP [3] algorithm, incorporated with a new model-based approximation for the PCS signal model, is shown to outperform CoSaMP [4]. The proposed PCS model-based approximation algorithm estimates key parameters associated with the PCS signal on the fly. PCS signals can often be found in Time Division Multiple Access (TDMA) networks having low traffic density. Typically, TDMA networks allow multiple users to transmit on the same frequency channel by allocating specific time slots to their users. During periods of low traffic intensity, when only one of many users is transmitting, the PCS signal aptly models such TDMA networks.

Email: clim@mines.edu, mwakin@mines.edu. This work was partially supported by DSO National Laboratories Singapore and by NSF CAREER grant CCF-1149225.

The rest of this paper is organized as follows. Section II briefly reviews the theory of CS and model-based CS. Section III introduces the PCS signal model. In Section IV, model-based CoSaMP is first introduced, followed by a discussion of a proposed PCS model-based approximation algorithm which is directly incorporated into the model-based CoSaMP algorithm for signal recovery. Section IV continues with the derivation of sampling bounds using subspace counting arguments. Finally, simulation results based on the PCS-adapted model-based CoSaMP algorithm and concluding remarks are found in Sections V and VI, respectively.

II. PRELIMINARIES

A. CS Preliminaries

Sparsity and compressibility are key notions fundamental to CS signal acquisition. These concepts can be explained as follows.
Given a signal vector x ∈ R^N and an orthonormal basis Ψ ∈ R^{N×N}, the signal's coefficients (representation) η ∈ R^N in Ψ are related to x via x = Ψη. A signal x is said to be s-sparse in Ψ if η contains only s ≪ N nonzero entries, and compressible in Ψ if η is dominated by a few large entries. Most CS acquisition protocols obtain compressive linear measurements y ∈ R^M of x via y = Φx = ΦΨη = Aη, where the sensing matrix Φ ∈ R^{M×N} is typically constructed randomly, A = ΦΨ, and M ≪ N. A matrix A is said to satisfy the Restricted Isometry Property (RIP) of order s with isometry constant δ_s ∈ (0, 1) if

(1 - \delta_s)\|\eta\|_2^2 \le \|A\eta\|_2^2 \le (1 + \delta_s)\|\eta\|_2^2

holds for all s-sparse vectors η. If A satisfies the RIP, recovery of an s-sparse signal x ∈ R^N from the compressive measurements y is possible [5]. For any fixed orthonormal basis Ψ, if M = O(s log(N/s)) and the sensing matrix Φ is generated randomly with independent and identically distributed (IID) Gaussian or Bernoulli (±1/√M) entries, then A will satisfy the RIP with high probability.

B. Model-based CS Preliminaries

An s-sparse coefficient vector η ∈ R^N lives in a union of m_s = \binom{N}{s} subspaces of dimension s. When additional a priori information on the support of η is available, the number of possible subspaces m_s can be reduced. This reduction allows for an improvement upon conventional CS in terms of the number of measurements required for signal recovery. Formally, such a priori information is referred to as a signal model M_s and is defined [3] as

M_s = \bigcup_{m=1}^{m_s} X_m, \quad X_m := \{\eta : \eta|_{\Omega_m} \in R^s, \ \eta|_{\Omega_m^C} = 0\},   (1)

where η|_Ω denotes the entries of η with indices in Ω ⊆ {1, …, N}, Ω^C is the complement of Ω, and each subspace X_m in (1) contains all η supported on Ω_m. In other words, the index sets Ω_m enumerate all supports allowed in M_s. All coefficient vectors η belonging to such a model M_s (including the PCS signals to be discussed in Section III) are referred to as s-model sparse [3].
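As a quick numerical illustration of the acquisition model above (a minimal sketch of our own, not from the paper; all parameter values are arbitrary), the following generates an s-sparse coefficient vector, takes M = O(s log(N/s)) Gaussian measurements, and checks that the measurement energy stays close to the signal energy, as the RIP predicts:

```python
import numpy as np

rng = np.random.default_rng(0)
N, s = 256, 8
M = int(4 * s * np.log(N / s))  # M = O(s log(N/s)) measurements

# s-sparse coefficient vector eta (here Psi = I, so x = eta)
eta = np.zeros(N)
support = rng.choice(N, size=s, replace=False)
eta[support] = rng.standard_normal(s)

# IID Gaussian sensing matrix, scaled so A approximately preserves norms
A = rng.standard_normal((M, N)) / np.sqrt(M)
y = A @ eta  # compressive measurements, M << N

# Empirical isometry ratio ||A eta||^2 / ||eta||^2 should be near 1
ratio = np.linalg.norm(y) ** 2 / np.linalg.norm(eta) ** 2
print(M, round(ratio, 2))
```

With these values M ≈ 110 measurements replace N = 256 samples while the energy ratio concentrates near 1, which is the empirical face of the RIP bound above.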
For a collection of s-model sparse signals, a matrix A is said to obey the model-based RIP [6, 7] if

(1 - \delta_{M_s})\|\eta\|_2^2 \le \|A\eta\|_2^2 \le (1 + \delta_{M_s})\|\eta\|_2^2   (2)

holds for all η ∈ M_s with model-based isometry constant δ_{M_s}. Blumensath et al. [6] proved that an M × N IID subgaussian random matrix A satisfies the model-based RIP with probability at least 1 − e^{−t} if

M \ge \frac{2}{C \delta_{M_s}^2} \left( \ln(2 m_s) + s \ln\frac{12}{\delta_{M_s}} + t \right).   (3)

In conventional CS, where m_s = \binom{N}{s}, substituting this value of m_s into (3) gives M = O(s log(N/s)). Eqn. (3) shall be used in Section IV to derive the sampling bound required for successful PCS signal recovery from compressive measurements.

III. PCS SIGNAL MODEL

In this paper, a PCS signal η ∈ R^N is one that is s-sparse, parameterized by a fixed period L ∈ {1, 2, …, N}, start-phase α ∈ {1, 2, …, L}, and cluster size D ∈ {1, 2, …, L}. All indices β belonging to the support of such a PCS signal obey

\beta = \alpha + (i - 1) + jL,

where all values of i ∈ {1, …, D} and j ∈ Z are taken such that 1 ≤ β ≤ N. In other words, a PCS signal is simply one whose support has a periodic clustered structure with start-phase α, period L, and duty-cycle D/L ≈ s/N. A typical PCS signal sparsity profile is shown in Fig. 2(a). Evidently, due to the periodic structure of the sparsity profile of the PCS signal, such a priori information can be further used to improve the recovery of PCS signals from compressive measurements. This will be further discussed in the next section. The PCS signal can be seen as a special case of the (s, C)-clustered sparse signal introduced in [8]. However, as we discuss in Section IV-C, and in contrast to what one would obtain by applying the results in [8], the sampling bound we derive for PCS signals is independent of the number of clusters C. In this paper we will assume that s and N are known for a given signal, but that L, α, and D are unknown. In some cases, however, we may have prior information that the period L ∈ {L_min, . . .
, L_max}, for some minimum and maximum possible periods L_min and L_max, respectively. In general, if L_min and L_max are unknown, appropriate values would be L_min = 1 and L_max = N/2 (since a minimum of two consecutive periods is required to determine the period). We note that although L, α, and D may all be unknown a priori, since s and N are known and D/L ≈ s/N, only α and L remain as free parameters defining a PCS signal. As such, the PCS signal model M_s^PCS can be formulated as

M_s^{PCS} = \bigcup_{m=1}^{m_s^{PCS}} X(\alpha_m, L_m) = \bigcup_{L=L_{min}}^{L_{max}} \left( \bigcup_{\alpha=1}^{L} X(\alpha, L) \right),   (4)

where m_s^PCS is the number of possible subspaces of the PCS signal model and X(α, L) is the subspace corresponding to the support having start-phase α and period L.

IV. PCS SIGNAL RECOVERY FROM COMPRESSIVE MEASUREMENTS

A. Model-Based CoSaMP

In the CS signal recovery problem, one wishes to recover the entries of η ∈ R^N given the measurements y and the matrix A. Given further the restricted union of subspaces M_s^PCS, one could solve the following optimization problem:

\eta = \arg\min \|\tilde{\eta}\|_0 \quad \text{s.t.} \quad y = A\tilde{\eta} \ \text{and} \ \tilde{\eta} \in M_s^{PCS}.   (5)

Solving (5) can be achieved by solving a least squares problem for every candidate support and returning the η̃ corresponding to the minimum least squares error. For the PCS model, we note that solving (5) in this manner has a worst-case complexity of O(N^2) iterations (with each iteration involving a least squares problem), which may be prohibitive when N is large. Therefore, faster algorithms are sought in the sequel. The model-based CoSaMP [3] algorithm, which is based on the CoSaMP [4] algorithm, belongs to the class of greedy algorithms frequently used for CS signal recovery. Due to its established robust signal recovery guarantees and simple iterative greedy structure, model-based CoSaMP shall be used as the primary CS recovery algorithm in this paper. Its pseudocode is listed in Algorithm 1.
Algorithm 1 Model-based CoSaMP
Inputs: CS matrix A, measurements y, model M_s
Output: s-sparse approximation η̂ to true coefficients η
1: η̂_0 = 0, r = y, T = ∅, i = 0 {initialize}
2: while halting criterion false do
3:   i ← i + 1
4:   e ← A^T r {form coefficient residual estimate}
5:   Ω ← supp(M_2(e, s)) {estimate support}
6:   T ← Ω ∪ supp(η̂_{i−1}) {merge supports}
7:   b|_T ← A_T^† y, b|_{T^C} ← 0 {form coefficient estimate}
8:   η̂_i ← M_1(b, s) {prune coefficient estimate}
9:   r ← y − Aη̂_i {update residual}
10: end while
11: return η̂ ← η̂_i

Model-based CoSaMP, being a modified CoSaMP algorithm, distinguishes itself through the signal support estimation steps 5 and 8 (in Algorithm 1), which are based on M_s. For a positive integer B, M_B(η, s) is defined [3] as

M_B(\eta, s) = \arg\min_{\bar{\eta} \in M_s^B} \|\eta - \bar{\eta}\|_2,   (6)

where M_s^B is the B-Minkowski sum [3] for the set M_s:

M_s^B = \left\{ \eta = \sum_{r=1}^{B} \eta^{(r)}, \ \text{with} \ \eta^{(r)} \in M_s \right\}.   (7)

Note that when B = 1, M_s^B = M_s. In words, M_B(η, s) is the model-based approximation algorithm which returns the best approximation of η in the enlarged union of subspaces M_s^B. In the next section, the design of an M_B^PCS(η, s) (based on M_s^PCS) is discussed. However, due to its high complexity, an alternative heuristic approximation M̂_B^PCS(η, s) with lower complexity is proposed.

B. PCS Model-based Approximation Algorithm

The PCS signal model M_s^PCS can be used to gain further intuition as to how a PCS model-based approximation algorithm M_B^PCS(η, s) should be designed. Intuitively, with B = 1, M_1^PCS(η, s) should search over a range of α's and L's and return the approximate support set defined by an α_opt and an L_opt which capture most of the energy of η. Thanks to the periodic structure of the PCS signal support, such a search can be implemented with a worst-case complexity of O((L_max − L_min + 1)^2). If L_min and L_max are unknown, the complexity increases to O(N^2).
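Putting Algorithm 1 together with the (α, L) energy search just described gives a compact reference implementation. The sketch below is our own simplified rendering, not the authors' code: the halting criterion is a small residual or a fixed iteration cap, and for B ≥ 2 the search greedily unions B single-PCS supports rather than performing a joint search. All names and parameter values are illustrative.

```python
import numpy as np

def pcs_support(alpha, L, D, N):
    """0-based support {alpha-1 + (i-1) + j*L : 1 <= i <= D, j >= 0}, clipped to [0, N)."""
    return sorted({(alpha - 1) + (i - 1) + j * L
                   for j in range(N // L + 1) for i in range(1, D + 1)
                   if (alpha - 1) + (i - 1) + j * L < N})

def pcs_approx(v, s, B, N, L_range):
    """B = 1 search of Section IV-B: scan all (alpha, L) and keep the PCS support
    capturing the most energy of v. For B >= 2 we greedily union B supports
    (a simplification of the joint search described in the text)."""
    chosen, vv = set(), v.copy()
    for _ in range(B):
        best, best_S = -1.0, []
        for L in L_range:
            D = max(1, round(L * s / N))          # duty-cycle D/L ~ s/N
            for alpha in range(1, L + 1):
                S = pcs_support(alpha, L, D, N)
                energy = float(np.sum(vv[S] ** 2))
                if energy > best:
                    best, best_S = energy, S
        chosen |= set(best_S)
        vv[best_S] = 0.0                          # zero out before the next pick
    return sorted(chosen)

def model_cosamp(A, y, s, model_approx, n_iter=10):
    """Algorithm 1 with a pluggable model approximation step M_B."""
    M, N = A.shape
    eta_hat, r = np.zeros(N), y.copy()
    for _ in range(n_iter):
        Omega = model_approx(A.T @ r, s, 2)                    # lines 4-5
        T = sorted(set(Omega) | set(np.flatnonzero(eta_hat)))  # line 6
        b = np.zeros(N)
        b[T] = np.linalg.lstsq(A[:, T], y, rcond=None)[0]      # line 7
        keep = model_approx(b, s, 1)                           # line 8 (prune)
        eta_hat = np.zeros(N)
        eta_hat[keep] = b[keep]
        r = y - A @ eta_hat                                    # line 9
        if np.linalg.norm(r) < 1e-10:                          # halting criterion
            break
    return eta_hat

# Recover a PCS signal: N=64, L=16, alpha=3, D=2 (so s=8), M=40 measurements
rng = np.random.default_rng(3)
N, L, alpha, D, M = 64, 16, 3, 2, 40
S = pcs_support(alpha, L, D, N)
eta = np.zeros(N); eta[S] = rng.standard_normal(len(S))
A = rng.standard_normal((M, N)) / np.sqrt(M)
approx = lambda v, s, B: pcs_approx(v, s, B, N, range(8, 25))
eta_hat = model_cosamp(A, A @ eta, len(S), approx)
print(np.linalg.norm(eta_hat - eta) < 1e-6)
```

On noiseless toy problems like this, the greedy simplification of the B = 2 search typically suffices for exact recovery; the joint search would matter in harder regimes.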
When B = 2, M_2^PCS(η, s) should return the best approximation to η based on candidate supports generated by the union of two PCS signals. Hence, a natural extension of M_1^PCS(η, s) to M_2^PCS(η, s) would be to jointly search over a range of α's and L's (for PCS signal 1) and over a second range of α's and L's (for PCS signal 2), and return the support (generated by the union of these individual supports) which captures most of the energy of η. Such a search has a worst-case complexity of O((L_max − L_min + 1)^4), or O(N^4) if L_min and L_max are unknown. In general, these search-based M_B^PCS(η, s) algorithms have a worst-case complexity of O(N^{2B}).

Algorithm 2 lists the pseudocode of a heuristic approximation algorithm M̂_B^PCS(η, s) that has lower computational complexity compared to the optimal M_B^PCS(η, s) previously discussed. Complexity reduction is achieved by observing that, due to the periodic support of the PCS signal, the support of its signal proxy A^T y is also periodic. Subsequent Fourier analysis on the energy of A^T y results in a few significant peaks occurring at frequencies f = j/L, j ∈ Z. Therefore, FFT-based methods can be used to decouple the PCS period estimation and PCS start-phase estimation (cf. M_B^PCS(η, s)). Using a variant of the harmonic comb [9], the PCS signal period L̂_b can be estimated using Algorithm 2 line 6, where h_num is the number of harmonics used for maximization, and h_min and h_max are the minimum and maximum harmonics considered, respectively. If a priori information on the range of L's is available, the accuracy of L̂_b can be improved by restricting h_min and h_max.

Algorithm 2 PCS Model-based Approximation M̂_B^PCS(η, s)
Inputs: nominal duty-cycle s/N, input signal η, parameter B, FFT length N_fft, range of harmonics h_min, h_max, number of harmonics h_num
Output: approximation β to supp(η)
1: β = ∅ {initialize}
2: for b = 1 to B do
3:   η_t|_β ← 0, η_t|_{β^C} ← η|_{β^C} {compute proxy on reduced support}
4:   η_e ← abs(η_t)^2 {compute signal energy}
5:   H[k] = abs( Σ_{n=0}^{N_fft−1} η_e[n] e^{−j2πkn/N_fft} ) {compute FFT}
6:   k̂_{0,b} ← arg max_{k̂_0 ∈ [h_min, h_max]} Σ_{a=0}^{h_num} H[a k̂_0] {estimate PCS frequency}
7:   L̂_b ← N_fft / k̂_{0,b} {estimate PCS period}
8:   D̂_b ← round(L̂_b s / N) {compute block length}
9:   η_ae[k] ← (1/J) Σ_{a=1}^{J} η_e(a L̂_b + k) {compute energy average}
10:  α̂_b ← arg max_k (η_ae ∗ g_MA^{D̂_b})[k] {estimate PCS start-phase}
11:  β_b ← α̂_b + (ℓ − 1) + j L̂_b {generate PCS support based on start-phase α̂_b and period L̂_b}
12:  β ← β ∪ β_b {union PCS support}
13: end for
14: return β {estimated support}

Then, using the estimated period L̂_b, PCS start-phase estimation can be achieved via a series of operations (Algorithm 2 lines 8–10) which obtain the start-phase corresponding to the maximum average signal energy η_ae captured, assuming a PCS signal template with period L̂_b and duty-cycle D̂_b. η_ae is easily computed by averaging η_e across segments of length L̂_b (line 9). Next, η_ae is filtered using a moving-average (MA) filter g_MA of length D̂_b. The PCS start-phase α̂_b is estimated as the time index corresponding to the maximum value of the output of the MA filter. Overall, the algorithm loops over B total estimates of PCS signals, and in line 3 the supports of the previous estimates are zeroed out. The complexity of Algorithm 2 can be derived as follows: the complexity of the FFT is O(N log N), the worst-case complexity of the harmonic comb is O(N), the worst-case complexity of the averaging operation is O(N), and the worst-case complexity of moving-average filtering is O(N log N).
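The estimation pipeline of Algorithm 2 (lines 5–11) can be sketched on a clean synthetic proxy as follows. This is our own illustrative rendering with arbitrary parameter choices; a circular moving average stands in for g_MA:

```python
import numpy as np

def estimate_period_phase(eta, s, Nfft, h_min, h_max, h_num=3):
    """Sketch of Algorithm 2, lines 5-10: harmonic-comb period estimation
    followed by moving-average start-phase estimation."""
    N = len(eta)
    eta_e = np.abs(eta) ** 2                       # line 4: signal energy
    H = np.abs(np.fft.fft(eta_e, Nfft))            # line 5: FFT magnitude
    cands = np.arange(h_min, h_max + 1)
    comb = [sum(H[a * k] for a in range(h_num + 1)) for k in cands]  # line 6
    k0 = cands[int(np.argmax(comb))]
    L_hat = Nfft // k0                             # line 7: period estimate
    D_hat = max(1, round(L_hat * s / N))           # line 8: block length
    J = N // L_hat                                 # line 9: average energy over periods
    eta_ae = eta_e[:J * L_hat].reshape(J, L_hat).mean(axis=0)
    # line 10: circular moving average of length D_hat; argmax gives start-phase
    score = [sum(eta_ae[(k + d) % L_hat] for d in range(D_hat)) for k in range(L_hat)]
    alpha_hat = int(np.argmax(score)) + 1          # 1-based start-phase
    return L_hat, alpha_hat, D_hat

# Clean synthetic proxy: period L=32, start-phase alpha=5, cluster D=4, N=256
N, L, alpha, D = 256, 32, 5, 4
eta = np.zeros(N)
for j in range(N // L):
    eta[(alpha - 1) + j * L:(alpha - 1) + j * L + D] = 1.0

L_hat, alpha_hat, D_hat = estimate_period_phase(eta, s=N * D // L, Nfft=256,
                                                h_min=2, h_max=50)
print(L_hat, alpha_hat, D_hat)  # -> 32 5 4

# line 11: regenerate the (1-based) PCS support from the estimates
beta = [k + 1 for k in range(N) if (k - (alpha_hat - 1)) % L_hat < D_hat]
```

Here the energy spectrum peaks at multiples of N_fft/L = 8, the harmonic comb therefore picks k_0 = 8 (so L̂ = 32), and the averaged energy is maximal for the window starting at α̂ = 5, exactly the parameters used to build the proxy.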
Hence the overall worst-case complexity of Algorithm 2 is O(BN log N). Figure 1 shows a comparison of the timing profiles of model-based CoSaMP with M_B^PCS(η, s), model-based CoSaMP with M̂_B^PCS(η, s), and ℓ_0 minimization.

Fig. 1. Timing profile comparison across different algorithms.

Fig. 2. Simulation results for PCS signal recovery from compressive measurements. (a) Typical realization of a PCS coefficient vector (with cluster size D, start-phase α, and period L). (b) Recovery success rate. (c) Normalized mean square error.

C. Sampling Bound

Given a family of s-model sparse signals M_s, model-based CoSaMP can be shown to perfectly recover any η ∈ M_s from the measurements y = Aη if the model-based RIP (2) holds for all η ∈ M_s^4 with constant δ_{M_s^4} ≤ 0.1 [3]. The recovery is also robust with respect to noise. In the context of PCS signals, when M_s = M_s^PCS, we can compute the number of measurements M required for an M × N IID subgaussian matrix A to satisfy the model-based RIP for all η ∈ M_s^4.

Theorem 4.1: With high probability, the number of measurements sufficient for successful recovery of s-sparse PCS signals is given by M_PCS = O(s + log N).

Proof: For the PCS model, m_s, the number of subspaces of M_s, can be easily derived using the following argument. For a given hypothesized period L, the number of possible start-phases is exactly L. Therefore, under the assumptions L_min = 1 and L_max = N/2,

m_s = 1 + 2 + \cdots + \frac{N}{2} = \frac{N}{4}\left(\frac{N}{2} + 1\right).

To compute a bound on M_PCS, we employ (3). However, because we require the model-based RIP for all η ∈ M_s^4 rather than merely all η ∈ M_s, we must substitute (m_s)^4 in place of m_s and 4s in place of s. Since m_s = O(N^2), we have ln(2(m_s)^4) = O(log N), and so the final result changes only by a constant.

We note that Theorem 4.1 applies to model-based CoSaMP (Algorithm 1) using M_B^PCS(η, s) and is not technically guaranteed to apply to model-based CoSaMP using M̂_B^PCS(η, s).
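The subspace count in the proof is easy to check numerically, and comparing the ln(2 m_s) term of (3) for the PCS model against the unstructured binomial count shows where the O(s + log N) versus O(s log(N/s)) gap comes from (a sketch of ours; the values are arbitrary):

```python
from math import comb, log

N, s = 256, 40

# PCS model: one subspace per (alpha, L) pair, with L_min = 1 and L_max = N/2
m_pcs = sum(L for L in range(1, N // 2 + 1))      # = (N/4)(N/2 + 1)
assert m_pcs == (N // 4) * (N // 2 + 1)

# Unstructured s-sparse model: (N choose s) subspaces
m_cs = comb(N, s)

# The ln(2 m_s) term in (3) drives the measurement bounds:
print(round(log(2 * m_pcs), 1))   # O(log N): small
print(round(log(2 * m_cs), 1))    # O(s log(N/s)): much larger
```

Because m_pcs grows only quadratically in N, raising it to the fourth power in the proof merely multiplies the logarithmic term by four, which is absorbed into the constant of Theorem 4.1.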
We also note that the (s, C)-clustered sparse signal model M_(s,C) [8] allows supports which occur in clusters (C is the maximum number of clusters). In contrast to the (s, C)-clustered sparse signal sampling bound M_(s,C) = O(s + C log(N/C)) [8], M_PCS is not dependent on the number of clusters, thanks to the assumed periodicity of the PCS signal clusters.

V. SIMULATION RESULTS

In this section,^1 the performance of model-based CoSaMP (Algorithm 1) incorporated with M̂_B^PCS (Algorithm 2 executed with N_fft = 256, h_num = 3, h_min = 2 and h_max = 50) is compared with that of model-based CoSaMP incorporated with M_B^PCS (executed with^2 L_min = 30 and L_max = 34), the standard CoSaMP [4], and ℓ_0 minimization. PCS coefficient vectors η with s = 40, L = 32, N = 256 and α = 10 were constructed with entries generated from an IID normal distribution; Figure 2(a) shows a typical realization. 500 independent trials were performed using independent realizations of η and of the matrix A, which contained entries generated from an IID Bernoulli (±1) distribution. In these simulations, we take Ψ = I_{N×N}, and hence η = x.

^1 Here, we denote M_B^PCS(η, s) by M_B^PCS and M̂_B^PCS(η, s) by M̂_B^PCS.
^2 These values were chosen to reduce overall simulation run-time (cf. Fig. 1).

Figure 2(b) shows the probability of successful PCS signal recovery from compressive measurements as a function of M/s. A recovered signal η̂ is considered successfully recovered when ‖η̂ − η‖_∞ ≤ 10^{−4}. We note that, as compared to conventional CoSaMP, model-based CoSaMP performs significantly better in terms of PCS signal recovery success rate, especially when M/s < 3. Model-based CoSaMP achieves better than an 80% success rate even when M/s is 2, whereas CoSaMP is unable to successfully recover the signal at all. We note that, as compared to model-based CoSaMP (with M_B^PCS), model-based CoSaMP (with M̂_B^PCS) offers comparable performance down to M/s = 2.2, with significant degradation when M/s = 2.
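The experimental protocol (Bernoulli ±1 measurements, IID normal coefficients, the ℓ_∞ success criterion, and per-trial NMSE) is straightforward to reproduce at a small scale. The harness below is a hedged sketch of ours, with a random rather than PCS support; any recovery routine with signature `recover(A, y, s)` can be plugged in:

```python
import numpy as np

def run_trials(recover, N=64, s=8, M=32, n_trials=50, seed=0):
    """Scaled-down Monte Carlo harness mirroring the experimental setup:
    Bernoulli(+-1) sensing matrices, IID normal coefficients on a random
    support (a PCS support generator can be substituted), the success
    criterion ||eta_hat - eta||_inf <= 1e-4, and the average NMSE."""
    rng = np.random.default_rng(seed)
    successes, nmse = 0, 0.0
    for _ in range(n_trials):
        eta = np.zeros(N)
        eta[rng.choice(N, s, replace=False)] = rng.standard_normal(s)
        A = rng.choice([-1.0, 1.0], size=(M, N)) / np.sqrt(M)
        eta_hat = recover(A, A @ eta, s)
        successes += np.linalg.norm(eta_hat - eta, np.inf) <= 1e-4
        nmse += np.linalg.norm(eta_hat - eta) ** 2 / np.linalg.norm(eta) ** 2
    return successes / n_trials, nmse / n_trials

# Sanity check with a trivial "recovery" that returns all zeros: it never
# meets the success criterion, and its NMSE is exactly 1 by definition.
print(run_trials(lambda A, y, s: np.zeros(A.shape[1])))  # -> (0.0, 1.0)
```

Sweeping M while holding s fixed and plotting the two returned quantities against M/s reproduces the structure of Figs. 2(b) and 2(c) for whichever recovery algorithm is supplied.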
We note that because line 7 of Algorithm 1 involves solving a least squares problem with up to 2s unknowns, we did not pursue simulations with M < 2s. Figure 2(c) shows the Normalized Mean Squared Error (NMSE) of the recovered signal η̂ as a function of M/s, where the NMSE is computed as

\mathrm{NMSE} = \frac{1}{500} \sum_{i=1}^{500} \frac{\|\hat{\eta}_i - \eta_i\|_2^2}{\|\eta_i\|_2^2}.

Model-based CoSaMP again performs significantly better than CoSaMP in terms of NMSE, especially when M/s < 3.5. Model-based CoSaMP (with M_B^PCS) performs significantly better than model-based CoSaMP (with M̂_B^PCS) when M/s < 3.

VI. CONCLUSION

Relying on a priori information about the support of the PCS signal, a PCS signal model M_s^PCS (a union of subspaces) was formulated. Then, using insights gained from the M_s^PCS formulation, a PCS model-based approximation algorithm M̂_B^PCS was designed, and a sampling bound for ideal PCS signal recovery was derived. Finally, through Monte Carlo simulations, it was demonstrated that model-based CoSaMP using M̂_B^PCS offered comparable performance to model-based CoSaMP using M_B^PCS down to M/s = 2 and still performed better than CoSaMP, especially in the regime where M/s < 3.

REFERENCES

[1] E. Candès, J. Romberg, and T. Tao. Robust uncertainty principles: Exact signal reconstruction from highly incomplete frequency information. IEEE Trans. Inf. Theory, 52(2):489–509, February 2006.
[2] D. Donoho. Compressed sensing. IEEE Trans. Inf. Theory, 52(4), April 2006.
[3] R. G. Baraniuk, V. Cevher, M. F. Duarte, and C. Hegde. Model-based compressive sensing. IEEE Trans. Inf. Theory, 56(4):1982–2001, April 2010.
[4] D. Needell and J. A. Tropp. CoSaMP: Iterative signal recovery from incomplete and inaccurate samples. Appl. Comput. Harmonic Anal., 26(3):301–321, 2009.
[5] E. J. Candès. Compressive sampling. In Proceedings of the International Congress of Mathematicians, Madrid, August 22–30, 2006, invited lectures, pages 1433–1452, 2006.
[6] T. Blumensath and M. E. Davies.
Sampling theorems for signals from the union of finite-dimensional linear subspaces. IEEE Trans. Inf. Theory, 55(4):1872–1882, April 2009.
[7] Y. M. Lu and M. N. Do. Sampling signals from a union of subspaces. IEEE Signal Process. Mag., 25(2):41–47, March 2008.
[8] V. Cevher, P. Indyk, C. Hegde, and R. G. Baraniuk. Recovery of clustered sparse signals from compressive measurements. In Sampling Theory and Applications (SAMPTA), Marseille, May 2009.
[9] J. O. Smith. Physical Audio Signal Processing. http://ccrma.stanford.edu/~jos/pasp/, online book, 2010 edition.