INCOHERENT DICTIONARIES AND THE STATISTICAL RESTRICTED ISOMETRY PROPERTY

SHAMGAR GUREVICH AND RONNY HADANI

Abstract. In this paper we formulate and prove a statistical version of the restricted isometry property (SRIP for short), which holds in general for any incoherent dictionary $D$ which is a disjoint union of orthonormal bases. In addition, we prove that, under appropriate normalization, the spectral distribution of the associated Gram operator converges in probability to the Sato-Tate (also called semi-circle) distribution. The result is then applied to various dictionaries that arise naturally in the setting of finite harmonic analysis, giving, in particular, a better understanding of a conjecture of Calderbank concerning RIP for the Heisenberg dictionary of chirp-like functions.

0. Introduction

Digital signals, or simply signals, can be thought of as complex valued functions on the finite field $\mathbb{F}_p$, where $p$ is a prime number. The space of signals $\mathcal{H} = \mathbb{C}(\mathbb{F}_p)$ is a Hilbert space of dimension $p$, with the inner product given by the standard formula
$$\langle f, g \rangle = \sum_{t \in \mathbb{F}_p} f(t)\,\overline{g(t)}.$$

A dictionary $D$ is simply a set of vectors (also called atoms) in $\mathcal{H}$. The number of vectors in $D$ can exceed the dimension of the Hilbert space $\mathcal{H}$; in fact, the most interesting situation is when $|D| \ge p = \dim \mathcal{H}$. In this set-up one can define a resolution of the Hilbert space $\mathcal{H}$ via $D$, which is the morphism of vector spaces $\Theta : \mathbb{C}(D) \to \mathcal{H}$ defined by
$$\Theta(f) = \sum_{\varphi \in D} f(\varphi)\,\varphi,$$
for every $f \in \mathbb{C}(D)$. A more concrete way to think of the morphism $\Theta$ is as a $p \times |D|$ matrix whose columns are the atoms in $D$.

0.1. The restricted isometry property. Fix a natural number $n \in \mathbb{N}$ and a pair of positive real numbers $\delta_1, \delta_2 \in \mathbb{R}_{>0}$.

Definition 0.1. A dictionary $D$ satisfies the restricted isometry property (RIP for short) with coefficients $(\delta_1, \delta_2, n)$ if for every subset $S \subset D$ such that $|S| \le n$,
$$(1 - \delta_2)\,\|f\| \le \|\Theta(f)\| \le (1 + \delta_1)\,\|f\|,$$
for every function $f \in \mathbb{C}(D)$ which is supported on the set $S$.

Date: July 2008. Copyright by S. Gurevich and R. Hadani, July 1, 2008. All rights reserved.

Equivalently, RIP can be formulated in terms of the spectral radius of the corresponding Gram operator. Let $G(S)$ denote the composition $\Theta_S^{\ast} \circ \Theta_S$, with $\Theta_S$ denoting the restriction of $\Theta$ to the subspace $\mathbb{C}_S(D) \subset \mathbb{C}(D)$ of functions supported on the set $S$. The dictionary $D$ satisfies $(\delta_1, \delta_2, n)$-RIP if for every subset $S \subset D$ such that $|S| \le n$,
$$\|G(S) - \mathrm{Id}_S\| \le \delta, \qquad \delta = \max\big((1 + \delta_1)^2 - 1,\ 1 - (1 - \delta_2)^2\big),$$
where $\mathrm{Id}_S$ is the identity operator on $\mathbb{C}_S(D)$; indeed, the eigenvalues of $G(S)$ are the squared singular values of $\Theta_S$, which Definition 0.1 confines to the interval $[(1 - \delta_2)^2, (1 + \delta_1)^2]$.

Problem 0.2. Find a deterministic construction of a dictionary $D$ with $|D| \ge p$ which satisfies RIP with coefficients in the critical regime
(0.1)  $\delta_1, \delta_2 \ll 1$ and $n = \alpha p$,
for some constant $0 < \alpha < 1$. As far as we know, to this day, there is no example of such a construction.

0.2. Incoherent dictionaries. Fix a positive real number $\mu \in \mathbb{R}_{>0}$.

Definition 0.3. A dictionary $D$ is called incoherent with coherence coefficient $\mu$ (also called $\mu$-coherent) if for every pair of distinct atoms $\varphi, \phi \in D$,
$$|\langle \varphi, \phi \rangle| \le \frac{\mu}{\sqrt{p}}.$$

There are various explicit constructions of incoherent dictionaries. We will be interested, particularly, in three examples which arise naturally in the setting of finite harmonic analysis (for the sake of completeness we review the construction of these examples in Section 3):

The first example [H, HCM], referred to as the Heisenberg dictionary $D_H$, is constructed using the Heisenberg representation of the finite Heisenberg group $H(\mathbb{F}_p)$. The Heisenberg dictionary is of size $|D_H| \approx p^2$ and its coherence coefficient is $\mu = 1$.

The second example [GHS], referred to as the oscillator dictionary $D_O$, is constructed using the Weil representation of the finite symplectic group $SL_2(\mathbb{F}_p)$. The oscillator dictionary is of size $|D_O| \approx p^3$ and its coherence coefficient is $\mu = 4$.

The third example [GHS], referred to as the extended oscillator dictionary $D_{EO}$, is constructed using the Heisenberg-Weil representation of the finite Jacobi group $J(\mathbb{F}_p) = SL_2(\mathbb{F}_p) \ltimes H(\mathbb{F}_p)$.
The extended oscillator dictionary is of size $|D_{EO}| \approx p^5$ and its coherence coefficient is $\mu = 4$.

The three examples of dictionaries we just described constitute reasonable candidates for solving Problem 0.2: they are large in the sense that $|D| \gg p$ and, moreover, empirical evidence suggests that they satisfy RIP with coefficients in the critical regime (0.1). We summarize this as follows:

Conjecture 0.4. The dictionaries $D_H$, $D_O$ and $D_{EO}$ satisfy RIP with coefficients $\delta_1, \delta_2 \ll 1$ and $n = \alpha p$, for some $0 < \alpha < 1$.

0.3. Main results. In this paper we formulate a relaxed statistical version of RIP, called the statistical restricted isometry property (SRIP for short), and we prove that it holds for any incoherent dictionary $D$ which is, in addition, a disjoint union of orthonormal bases:
(0.2)  $$D = \coprod_{x \in X} B_x,$$
where $B_x = \{b_x^1, \dots, b_x^p\}$ is an orthonormal basis of $\mathcal{H}$, for every $x \in X$.

0.3.1. The statistical isometry property. Let $D$ be an incoherent dictionary of the form (0.2). Roughly, the statement is that for $S \subset D$, $|S| = n$ with $n = p^{1-\varepsilon}$ for $0 < \varepsilon < 1$, chosen uniformly at random, the operator norm $\|G(S) - \mathrm{Id}_S\|$ is small with high probability.

Theorem 0.5 (SRIP property). For every $k \in \mathbb{N}$ there exists a constant $C(k)$ such that
(0.3)  $$P\big(\|G(S) - \mathrm{Id}_S\| \ge p^{-\varepsilon/2}\big) \le C(k)\,p^{1 - \varepsilon k/2}.$$

The above theorem, in particular, implies that $P\big(\|G(S) - \mathrm{Id}_S\| \ge p^{-\varepsilon/2}\big) \to 0$ as $p \to \infty$, faster than $p^{-l}$ for any $l \in \mathbb{N}$.

0.3.2. The statistics of the eigenvalues. A natural thing to know is how the eigenvalues of the Gram operator $G(S)$ fluctuate around $1$. In this regard, we study the spectral statistics of the normalized error term
$$E(S) = (p/n)^{1/2}\,(G(S) - \mathrm{Id}_S).$$
Let $\mu_{E(S)} = n^{-1}\sum_{i=1}^{n}\delta_{\lambda_i}$ denote the spectral distribution of $E(S)$, where $\lambda_i$, $i = 1, \dots, n$, are the real eigenvalues of the Hermitian operator $E(S)$. We prove that $\mu_E$ converges in probability, as $p \to \infty$, to the Sato-Tate (also called semi-circle) distribution $\mu_{ST}(x) = (2\pi)^{-1}\sqrt{4 - x^2}\,\mathbf{1}_{[-2,2]}(x)$, where $\mathbf{1}_{[-2,2]}$ is the characteristic function of the interval $[-2, 2]$.

Theorem 0.6 (Sato-Tate).
(0.4)  $$\lim_{p \to \infty} \mu_E \overset{P}{=} \mu_{ST}.$$

Remark 0.7. A limit of the form (0.4) is familiar in random matrix theory as the asymptotic of the spectral distribution of Wigner matrices. Interestingly, the same asymptotic distribution appears in our situation, albeit the probability spaces are of a different nature (our probability spaces are, in particular, much smaller).

In particular, Theorems 0.5 and 0.6 can be applied to the three examples $D_H$, $D_O$ and $D_{EO}$, which are all of the appropriate form (0.2). Finally, we remark that our result gives new information on a conjecture of Calderbank concerning RIP of the Heisenberg dictionary.

Remark 0.8. For practical applications, it might be important to compute explicitly the constant $C(k)$ which appears in (0.3). This constant depends on the incoherence coefficient $\mu$; therefore, for a fixed $p$, having $\mu$ as small as possible is preferable.

0.3.3. Structure of the paper. Apart from the introduction, the paper consists of three sections and an appendix. In Section 1, we develop the statistical theory of systems of incoherent orthonormal bases. We begin by specifying the basic set-up; then we proceed to formulate and prove the main theorems of this paper: Theorem 1.2, Theorem 1.3 and Theorem 1.4. The main technical statement underlying the proofs is formulated in Theorem 1.5. In Section 2, we prove Theorem 1.5. In Section 3, we review the constructions of the dictionaries $D_H$, $D_O$ and $D_{EO}$. Finally, in Appendix A, we prove all technical statements which appear in the body of the paper.

Acknowledgement 0.9. It is a pleasure to thank J. Bernstein for his interest and guidance in the mathematical aspects of this work. We are grateful to R. Calderbank for explaining to us his work. We would like to thank A. Sahai for many interesting discussions. Finally, we thank B. Sturmfels for encouraging us to proceed in this line of research.

1. The statistical theory of incoherent bases

1.1. Standard terminology.

1.1.1. Terminology from asymptotic analysis. Let $\{a_p\}, \{b_p\}$ be a pair of sequences of positive real numbers. We write $a_p = O(b_p)$ if there exist $C > 0$ and $P_0 \in \mathbb{N}$ such that $a_p \le C\,b_p$ for every $p \ge P_0$. We write $a_p = o(b_p)$ if $\lim_{p\to\infty} a_p/b_p = 0$. Finally, we write $a_p \sim b_p$ if $\lim_{p\to\infty} a_p/b_p = 1$.

1.1.2. Terminology from set theory. Let $n \in \mathbb{N}_{\ge 1}$. We denote by $[1, n]$ the set $\{1, 2, \dots, n\}$. Given a finite set $A$, we denote by $|A|$ the number of elements in $A$.

1.2. Basic set-up.

1.2.1. Incoherent orthonormal bases. Let $\{(\mathcal{H}_p, \langle\cdot,\cdot\rangle_p)\}$ be a sequence of Hilbert spaces such that $\dim \mathcal{H}_p = p$.

Definition 1.1. Two (sequences of) orthonormal bases $B_p, B'_p$ of $\mathcal{H}_p$ are called $\mu$-coherent if
$$|\langle b, b'\rangle_p| \le \frac{\mu}{\sqrt{p}},$$
for every $b \in B_p$ and $b' \in B'_p$, where $\mu$ is some fixed positive real number (not depending on $p$).

Fix $\mu \in \mathbb{R}_{>0}$. Let $\{X_p\}$ be a sequence of index sets with $\lim_{p\to\infty} |X_p| = \infty$ (usually we will have $p = o(|X_p|)$), such that each $X_p$ parametrizes orthonormal bases of $\mathcal{H}_p$ which are pairwise $\mu$-coherent; that is, for every $x \in X_p$ there is an orthonormal basis $B_x = \{b_x^1, \dots, b_x^p\}$ of $\mathcal{H}_p$ so that
(1.1)  $$\big|\langle b_x^i, b_y^j\rangle\big| \le \frac{\mu}{\sqrt{p}},$$
for every $x \ne y \in X_p$. Denote
$$D_p = \coprod_{x \in X_p} B_x.$$
The set $D_p$ will be referred to as an incoherent dictionary or, sometimes more precisely, as a $\mu$-coherent dictionary.

1.2.2. Resolutions of Hilbert spaces. Let $\Theta_p : \mathbb{C}(D_p) \to \mathcal{H}_p$ be the morphism of vector spaces given by
$$\Theta_p(f) = \sum_{b \in D_p} f(b)\,b.$$
The map $\Theta_p$ will be referred to as the resolution of $\mathcal{H}_p$ via $D_p$.

Convention: For the sake of clarity we will usually omit the subscript $p$ from the notations.

1.3. Statistical restricted isometry property (SRIP). The main statement of this paper concerns a formulation of a statistical restricted isometry property (SRIP for short) of the resolution maps $\Theta$. Let $n = n(p) = p^{1-\varepsilon}$, for some $0 < \varepsilon < 1$.
Let $\Omega_n$ denote the set of injective maps
$$\Omega_n = \{S : [1, n] \hookrightarrow D\}.$$
We consider the set $\Omega_n$ as a probability space equipped with the uniform probability measure. A map $S \in \Omega_n$ induces a morphism of vector spaces $\theta_S : \mathbb{C}([1, n]) \to \mathbb{C}(D)$, given by $\theta_S(\delta_i) = \delta_{S(i)}$. Let us denote by $\Theta_S : \mathbb{C}([1, n]) \to \mathcal{H}$ the composition $\Theta \circ \theta_S$, and by $G(S) \in \mathrm{Mat}_{n \times n}(\mathbb{C})$ the Hermitian matrix
$$G(S) = \Theta_S^{\ast} \circ \Theta_S.$$
Concretely, $G(S)$ is the matrix $(g_{ij})$ with $g_{ij} = \langle S(i), S(j)\rangle$. In plain language, $G(S)$ is the Gram matrix associated with the ordered set of vectors $(S(1), \dots, S(n))$ in $\mathcal{H}$. We consider $G : \Omega_n \to \mathrm{Mat}_{n\times n}(\mathbb{C})$ as a matrix valued random variable on the probability space $\Omega_n$. The following theorem asserts that, with high probability, the matrix $G$ is close to the unit matrix $I_n \in \mathrm{Mat}_{n\times n}(\mathbb{C})$.

Theorem 1.2. Let $0 < \varepsilon \le 1$ and let $k \in \mathbb{N}$ be an even number such that $\varepsilon k \ge 1$. Then
$$P\Big(\|G - I_n\| \ge (n/p)^{1/(2+\varepsilon)}\Big) = O\Big(n\,(n/p)^{\varepsilon k/(2+\varepsilon)}\Big).$$

For a proof, see Subsection 1.6. In the above theorem, substituting $n = p^{1-\varepsilon}$ yields
$$P\Big(\|G - I_n\| \ge (n/p)^{1/(2+\varepsilon)}\Big) = O\Big(p^{\,1 - \varepsilon - \varepsilon^2 k/(2+\varepsilon)}\Big),$$
which tends to $0$ as $p \to \infty$ once $k$ is taken large enough. Equivalently, Theorem 1.2 can be formulated as a statistical restricted isometry property of the resolution morphism $\Theta$. A given $S \in \Omega_n$ defines a morphism of vector spaces $\Theta_S = \Theta \circ \theta_S : \mathbb{C}([1, n]) \to \mathcal{H}$; in this respect, $\Theta$ can be considered as a random variable
$$\Theta : \Omega_n \to \mathrm{Mor}\big(\mathbb{C}([1, n]), \mathcal{H}\big).$$

Theorem 1.3 (SRIP property). Let $0 < \varepsilon \le 1$ and let $k \in \mathbb{N}$ be an even number such that $\varepsilon k \ge 1$. Then
$$P\Big(\sup_{\substack{f \in \mathbb{C}([1,n]) \\ \|f\| = 1}}\big|\,\|\Theta_S(f)\| - \|f\|\,\big| \ge (n/p)^{1/(2+\varepsilon)}\Big) = O\Big(n\,(n/p)^{\varepsilon k/(2+\varepsilon)}\Big).$$

1.4. Statistics of the error term. Let $E$ denote the normalized error term
$$E = (p/n)^{1/2}\,(G - I_n).$$
Our goal is to describe the statistics of the random variable $E$. Let $\mu_E$ denote the spectral distribution of $E$, namely
$$\mu_E = n^{-1}\sum_{i=1}^{n}\delta_{\lambda_i(E)},$$
where $\lambda_1(E) \ge \lambda_2(E) \ge \dots \ge \lambda_n(E)$ are the eigenvalues of $E$ indexed in decreasing order (we note that the eigenvalues of $E$ are real since $E$ is a Hermitian matrix).
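The convergence just described can be observed numerically. The following sketch is our illustration and is not part of the paper: it builds a $1$-coherent dictionary out of the $p + 1$ chirp bases over $\mathbb{F}_p$ (the standard explicit model of the Heisenberg dictionary reviewed in Section 3), samples a uniform point $S \in \Omega_n$, and inspects $\|G(S) - I_n\|$ together with the low moments of the spectral distribution of $E$; the names (`bases`, `psi`) and the chosen values of $p$ and $\varepsilon$ are ours.

```python
import numpy as np

# Illustration only (not from the paper). Dictionary: the standard basis of C^p
# together with the p chirp bases {t -> psi(a*t^2/2 + b*t)/sqrt(p) : b in F_p},
# a in F_p -- pairwise 1-coherent for an odd prime p.
rng = np.random.default_rng(1)
p = 101                              # an odd prime
eps = 0.3
n = int(round(p ** (1 - eps)))       # |S| = n = p^(1 - eps)

t = np.arange(p)
inv2 = pow(2, -1, p)                 # the inverse of 2 in F_p
psi = lambda x: np.exp(2j * np.pi * (x % p) / p)
bases = [np.eye(p, dtype=complex)]
for a in range(p):
    bases.append(psi((a * t * t * inv2)[:, None] + np.outer(t, t)) / np.sqrt(p))
D = np.concatenate(bases, axis=1)    # the p x p(p+1) resolution matrix

S = rng.choice(D.shape[1], size=n, replace=False)   # a uniform point of Omega_n
G = D[:, S].conj().T @ D[:, S]                      # Gram matrix G(S)
E = np.sqrt(p / n) * (G - np.eye(n))                # normalized error term

lam = np.linalg.eigvalsh(E)                         # real: E is Hermitian
print("n =", n, " ||G - I_n|| =", np.linalg.norm(G - np.eye(n), 2))
print("m_2, m_4 =", np.mean(lam ** 2), np.mean(lam ** 4))
# as p grows, m_2 and m_4 approach the Catalan numbers 1 and 2
```

For moderate $p$ the empirical histogram of the $\lambda_i(E)$ already resembles the semicircle on $[-2, 2]$, in line with the theorem stated next.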
The following theorem asserts that the spectral distribution $\mu_E$ converges in probability to the Sato-Tate distribution
(1.2)  $$\mu_{ST}(x) = \frac{1}{2\pi}\sqrt{4 - x^2}\;\mathbf{1}_{[-2,2]}(x).$$

Theorem 1.4. $\lim_{p\to\infty}\mu_E \overset{P}{=} \mu_{ST}$.

For a proof, see Subsection 1.7.

1.5. The method of moments. The proofs of Theorems 1.2 and 1.4 will be based on the method of moments. Let $m_k$ denote the $k$th moment of the distribution $\mu_E$, that is,
$$m_k = \int_{\mathbb{R}} x^k\,\mu_E(x) = \frac{1}{n}\sum_{i=1}^{n}\lambda_i(E)^k = \frac{1}{n}\,\mathrm{Tr}\,E^k.$$
Similarly, let $m_{ST,k}$ denote the $k$th moment of the semicircle distribution.

Theorem 1.5. For every $k \in \mathbb{N}$,
(1.3)  $$\lim_{p\to\infty} E(m_k) = m_{ST,k}.$$
In addition,
(1.4)  $$\mathrm{Var}(m_k) = O(n^{-1}).$$
For a proof, see Section 2.

1.6. Proof of Theorem 1.2. Theorem 1.2 follows from Theorem 1.5 using the Markov inequality. Let $\lambda > 0$ and let $k \in \mathbb{N}$ be an even number. First, observe that the condition $\|G - I_n\| \ge \lambda$ is equivalent to the condition $\|E\| \ge (p/n)^{1/2}\lambda$ which, in turn, is equivalent to the spectral condition $\lambda_{\max}(E) \ge (p/n)^{1/2}\lambda$, where $\lambda_{\max}(E)$ denotes the largest eigenvalue of $E$ in absolute value. Since $\lambda_{\max}(E)^k \le \lambda_1(E)^k + \lambda_2(E)^k + \dots + \lambda_n(E)^k$ (recall that $k$ is even), we can write
$$P\Big(\lambda_{\max}(E) \ge (p/n)^{1/2}\lambda\Big) = P\Big(\lambda_{\max}(E)^k \ge (p/n)^{k/2}\lambda^k\Big) \le P\Big(\sum_{i=1}^{n}\lambda_i(E)^k \ge (p/n)^{k/2}\lambda^k\Big) = P\Big(m_k \ge n^{-1}(p/n)^{k/2}\lambda^k\Big).$$
By the triangle inequality, $m_k \le |m_k - Em_k| + Em_k$ (recall that $k$ is even, hence $m_k \ge 0$); therefore we can write
$$P\Big(m_k \ge n^{-1}(p/n)^{k/2}\lambda^k\Big) \le P\Big(|m_k - Em_k| \ge n^{-1}(p/n)^{k/2}\lambda^k - Em_k\Big).$$
By (1.3), $Em_k = O(1)$; in addition, substituting $\lambda = (n/p)^{1/(2+\varepsilon)}$ with $0 < \varepsilon \le 1$, we get
$$n^{-1}(p/n)^{k/2}\lambda^k = n^{-1}(p/n)^{\varepsilon k/(2(2+\varepsilon))}.$$
Altogether, we can summarize the previous development with the following inequality:
$$P\Big(\|G - I_n\| \ge (n/p)^{1/(2+\varepsilon)}\Big) \le P\Big(|m_k - Em_k| \ge n^{-1}(p/n)^{\varepsilon k/(2(2+\varepsilon))} - O(1)\Big).$$
By the Markov inequality, $P(|m_k - Em_k| \ge \tau) \le \mathrm{Var}(m_k)/\tau^2$, we get
$$P\Big(|m_k - Em_k| \ge n^{-1}(p/n)^{\varepsilon k/(2(2+\varepsilon))} - O(1)\Big) = O\Big(n\,(n/p)^{\varepsilon k/(2+\varepsilon)}\Big),$$
where in the last equality we used the estimate $\mathrm{Var}(m_k) = O(n^{-1})$ (see Theorem 1.5). This concludes the proof of the theorem.

1.7. Proof of Theorem 1.4. Theorem 1.4 follows from Theorem 1.5 using the Markov inequality.
In order to show that $\lim_{p\to\infty}\mu_E \overset{P}{=} \mu_{ST}$, it is enough to show that for every $k \in \mathbb{N}$ and $\tau > 0$ we have
$$\lim_{p\to\infty} P\big(|m_k - m_{ST,k}| \ge \tau\big) = 0.$$
The proof of the last assertion proceeds as follows. By the triangle inequality we have $|m_k - m_{ST,k}| \le |m_k - Em_k| + |Em_k - m_{ST,k}|$, therefore
$$P\big(|m_k - m_{ST,k}| \ge \tau\big) \le P\big(|m_k - Em_k| + |Em_k - m_{ST,k}| \ge \tau\big).$$
By (1.3) there exists $P_0 \in \mathbb{N}$ such that $|Em_k - m_{ST,k}| \le \tau/2$ for every $p \ge P_0$, hence
$$P\big(|m_k - m_{ST,k}| \ge \tau\big) \le P\big(|m_k - Em_k| \ge \tau/2\big),$$
for every $p \ge P_0$. Now, using the Markov inequality,
$$P\big(|m_k - Em_k| \ge \tau/2\big) \le \frac{\mathrm{Var}(m_k)}{(\tau/2)^2}.$$
This implies that
$$P\big(|m_k - m_{ST,k}| \ge \tau\big) \le \frac{\mathrm{Var}(m_k)}{(\tau/2)^2} \xrightarrow[p\to\infty]{} 0,$$
where we use the estimate $\mathrm{Var}(m_k) = O(1/n)$ (Equation (1.4)). This concludes the proof of the theorem.

2. Proof of Theorem 1.5

2.1. Preliminaries on matrix multiplication.

2.1.1. Paths.

Definition 2.1. A path of length $k$ on a set $A$ is a function $\pi : [0, k] \to A$. The path $\pi$ is called closed if $\pi(0) = \pi(k)$. The path $\pi$ is called strict if $\pi(j) \ne \pi(j+1)$ for every $j = 0, \dots, k - 1$.

Given a path $\pi : [0, k] \to A$, an element $\pi(j) \in A$ is called a vertex of the path $\pi$. A pair of consecutive vertices $(\pi(j), \pi(j+1))$ is called an edge of the path $\pi$. Let $P_k(A)$ denote the set of strict closed paths of length $k$ on the set $A$, and let $P_k(A; a, b)$, for $a, b \in A$, denote the set of strict paths of length $k$ on $A$ which begin at the vertex $a$ and end at the vertex $b$.

Conventions: We will consider only strict paths and refer to these simply as paths. When considering a closed path $\pi \in P_k(A)$, it will sometimes be convenient to think of it as a function $\pi : \mathbb{Z}/k\mathbb{Z} \to A$.

2.1.2. Graphs associated with paths. Given a path $\pi$, we can associate to it an undirected graph $G_\pi = (V_\pi, E_\pi)$, where the set of vertices is $V_\pi = \mathrm{Im}\,\pi$ and the set of edges $E_\pi$ consists of all sets $\{a, b\} \subset A$ such that either $(a, b)$ or $(b, a)$ is an edge of $\pi$.

Remark 2.2. Since the graph $G_\pi$ is obtained from a path, it is connected and, moreover, $|V_\pi|, |E_\pi| \le k$, where $k$ is the length of $\pi$.

Definition 2.3. A closed path $\pi \in P_k(A)$ is called a tree if the associated graph $G_\pi$ is a tree and every edge $\{a, b\} \in E_\pi$ is crossed exactly twice by $\pi$: once as $(a, b)$ and once as $(b, a)$.

Let $T_k(A) \subset P_k(A)$ denote the set of trees of length $k$.

Remark 2.4. If $\pi$ is a tree of length $k$ then $k$ must be even; moreover, $k = 2(|V_\pi| - 1)$.

2.1.3. Isomorphism classes of paths. Let us denote by $\Pi(A)$ the permutation group $\mathrm{Aut}(A)$. The group $\Pi(A)$ acts on all sets which can be derived functorially from the set $A$; in particular it acts on the set of closed paths $P_k(A)$ as follows: an element $\sigma \in \Pi(A)$ sends a path $\pi : [0, k] \to A$ to $\sigma \circ \pi$. An isomorphism class $\alpha = [\pi] \in P_k(A)/\Pi(A)$ can be uniquely specified by a $(k+1)$-tuple of positive integers $(\alpha_0, \dots, \alpha_k)$, where $\alpha_j = i$ if $\pi(j)$ is the $i$th distinct vertex crossed by $\pi$. For example, the isomorphism class of the path $\pi = (a, b, c, a, b, a)$ is specified by $[\pi] = (1, 2, 3, 1, 2, 1)$. As a consequence we get that
(2.1)  $$|[\pi]| = |A|^{(|V_\pi|)} := |A|\,(|A| - 1)\cdots(|A| - |V_\pi| + 1).$$

2.1.4. The combinatorics of matrix multiplication. First let us fix some general notations: when the set $A$ is $[1, n]$ we will write $P_k = P_k([1, n])$, $P_k(i, j) = P_k([1, n]; i, j)$, $T_k = T_k([1, n])$ and $\Pi_n = \Pi([1, n])$.

Let $M \in \mathrm{Mat}_{n\times n}(\mathbb{C})$ be a matrix such that $m_{ii} = 0$ for every $i \in [1, n]$. The $(i, j)$ entry $m^k_{i,j}$ of the $k$th power matrix $M^k$ can be described as a sum of contributions indexed by strict paths, that is,
$$m^k_{i,j} = \sum_{\pi \in P_k(i,j)} w_\pi,$$
where $w_\pi = m_{\pi(0),\pi(1)}\,m_{\pi(1),\pi(2)}\cdots m_{\pi(k-1),\pi(k)}$. Consequently, we can describe the trace of $M^k$ as
(2.2)  $$\mathrm{Tr}\,M^k = \sum_{i\in[1,n]}\;\sum_{\pi\in P_k(i,i)} w_\pi = \sum_{\pi\in P_k} w_\pi.$$

2.2. Fundamental estimates. Our goal here is to formulate the fundamental estimates that we will require for the proof of Theorem 1.5. Recall that
$$m_k = n^{-1}\,\mathrm{Tr}\,E^k = n^{-1}(p/n)^{k/2}\,\mathrm{Tr}\,(G - I_n)^k.$$
Since $(G - I_n)_{ii} = 0$ for every $i \in [1, n]$ we can write, using Equation (2.2), the moment $m_k$ in the form
(2.3)  $$m_k = n^{-1}(p/n)^{k/2}\sum_{\pi\in P_k} w_\pi,$$
where $w_\pi : \Omega_n \to \mathbb{C}$ is the random variable given by
$$w_\pi(S) = \langle S\pi(0), S\pi(1)\rangle\cdots\langle S\pi(k-1), S\pi(k)\rangle.$$
Consequently, we get that
(2.4)  $$Em_k = n^{-1}(p/n)^{k/2}\sum_{\pi\in P_k} Ew_\pi.$$

Lemma 2.5. Let $\sigma \in \Pi_n$; then $Ew_\pi = Ew_{\sigma\circ\pi}$.

For a proof, see Appendix A. Lemma 2.5 implies that the expectation $Ew_\pi$ depends only on the isomorphism class $[\pi]$; therefore we can write the sum (2.4) in the form
$$Em_k = n^{-1}(p/n)^{k/2}\sum_{\alpha\in P_k/\Pi_n} |\alpha|\,Ew_\alpha,$$
where $Ew_\alpha$ denotes the expectation $Ew_\pi$ for any $\pi \in \alpha$. Let us denote
$$n(\alpha) = n^{-1}(p/n)^{k/2}\,|\alpha| \sim p^{k/2}\,n^{|V_\pi| - 1 - k/2},$$
where in the second estimate we used (2.1). We conclude the previous development with the following formula:
(2.5)  $$Em_k = \sum_{\alpha\in P_k/\Pi_n} n(\alpha)\,Ew_\alpha.$$

Theorem 2.6 (Fundamental estimates). Let $\alpha = [\pi] \in P_k/\Pi_n$.
(1) If $k > 2(|V_\pi| - 1)$ then
(2.6)  $$\lim_{p\to\infty} n(\alpha)\,Ew_\alpha = 0.$$
(2) If $k \le 2(|V_\pi| - 1)$ and $\pi$ is not a tree then
(2.7)  $$\lim_{p\to\infty} n(\alpha)\,Ew_\alpha = 0.$$
(3) If $k \le 2(|V_\pi| - 1)$ and $\pi$ is a tree then
(2.8)  $$\lim_{p\to\infty} n(\alpha)\,Ew_\alpha = 1.$$
For a proof, see Subsection 2.4.

2.3. Proof of Theorem 1.5. The proof is a direct consequence of the fundamental estimates (Theorem 2.6).

2.3.1. Proof of Equation (1.3). Our goal is to show that $\lim_{p\to\infty} Em_k = m_{ST,k}$. Using Equation (2.5) we can write
(2.9)  $$\lim_{p\to\infty} Em_k = \lim_{p\to\infty}\sum_{\alpha\in P_k/\Pi_n} n(\alpha)\,Ew_\alpha.$$
When $k$ is odd, no class $\alpha \in P_k/\Pi_n$ is a tree (see Remark 2.4); therefore, by Theorem 2.6, all the terms on the right-hand side of (2.9) are equal to zero, which implies that in this case $\lim_{p\to\infty} Em_k = 0$. When $k$ is even then, again by Theorem 2.6, only terms associated to trees yield a non-zero contribution to the right-hand side of (2.9); therefore in this case
$$\lim_{p\to\infty} Em_k = \lim_{p\to\infty}\sum_{\alpha\in T_k/\Pi_n} n(\alpha)\,Ew_\alpha = \sum_{\alpha\in T_k/\Pi_n} 1 = |T_k/\Pi_n|.$$
For every $m \in \mathbb{N}$, let $C_m$ denote the $m$th Catalan number, that is,
$$C_m = \frac{1}{m+1}\binom{2m}{m}.$$
On the one hand, the number of isomorphism classes of trees in $T_k/\Pi_n$ can be described in terms of the Catalan numbers:

Lemma 2.7. If $k = 2m$, $m \in \mathbb{N}$, then $|T_{2m}/\Pi_n| = C_m$.

For a proof, see Appendix A.
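Lemma 2.7 (and, below, Lemma 2.8) can be cross-checked by brute force for small $m$. The following sketch is our illustration, not part of the paper; the function names are ours. It enumerates the canonical representatives of classes in $T_{2m}/\Pi_n$ exactly as in the tuple description of Subsection 2.1.3 and compares the count with the Catalan number $C_m$; it also recovers the even moments of the semicircle distribution by numerical integration.

```python
from collections import Counter
from math import comb
import numpy as np

def catalan(m):
    # C_m = (1/(m+1)) * binom(2m, m)
    return comb(2 * m, m) // (m + 1)

def is_tree(path):
    """Tree in the sense of Definition 2.3: the associated graph is a tree and
    every undirected edge is crossed exactly once in each direction."""
    directed = Counter(zip(path, path[1:]))
    edges = {frozenset(e) for e in directed}
    if len(edges) != len(set(path)) - 1:   # connected + |E| = |V| - 1 <=> tree
        return False
    return all(directed[(a, b)] == 1 and directed[(b, a)] == 1
               for a, b in (tuple(e) for e in edges))

def tree_classes(k):
    """Count T_k / Pi_n by enumerating canonical representatives: strict closed
    paths starting at 1 in which the j-th new vertex gets label j."""
    count, stack = 0, [((1,), 1)]
    while stack:
        path, used = stack.pop()
        if len(path) == k + 1:
            count += path[-1] == 1 and is_tree(path)
            continue
        for v in range(1, used + 2):       # existing labels or one new label
            if v != path[-1]:              # strictness
                stack.append((path + (v,), max(used, v)))
    return count

def st_moment(m, N=200000):
    """2m-th moment of the semicircle density, midpoint rule on [-2, 2]."""
    h = 4.0 / N
    x = -2.0 + h * (np.arange(N) + 0.5)
    return float(np.sum(x ** (2 * m) * np.sqrt(4.0 - x * x)) * h / (2 * np.pi))

for m in range(1, 5):
    print(2 * m, tree_classes(2 * m), round(st_moment(m), 4), catalan(m))
```

For $k = 2, 4, 6, 8$ the enumeration returns $1, 2, 5, 14$, matching both $C_m$ and the numerically computed moments $m_{ST,2m}$.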
On the other hand, the moments $m_{ST,k}$ of the Sato-Tate distribution are well known and can be described in terms of the Catalan numbers as well:

Lemma 2.8. If $k = 2m$ then $m_{ST,k} = C_m$; otherwise, if $k$ is odd, then $m_{ST,k} = 0$.

Consequently we obtain that for every $k \in \mathbb{N}$,
$$\lim_{p\to\infty} Em_k = m_{ST,k}.$$
This concludes the proof of the first part of the theorem.

2.3.2. Proof of Equation (1.4). By definition, $\mathrm{Var}(m_k) = Em_k^2 - (Em_k)^2$. Equation (2.3) implies that
$$Em_k^2 = n^{-2}(p/n)^k \sum_{\pi_1, \pi_2 \in P_k} E(w_{\pi_1} w_{\pi_2}).$$
Equation (2.4) implies that
$$(Em_k)^2 = n^{-2}(p/n)^k \sum_{\pi_1, \pi_2 \in P_k} Ew_{\pi_1}\,Ew_{\pi_2}.$$
When $V_{\pi_1} \cap V_{\pi_2} = \varnothing$ we have $E(w_{\pi_1} w_{\pi_2}) = Ew_{\pi_1}\,Ew_{\pi_2}$. If we denote by $I_k \subset P_k \times P_k$ the set of pairs $(\pi_1, \pi_2)$ such that $V_{\pi_1} \cap V_{\pi_2} \ne \varnothing$, then we can write
$$\mathrm{Var}(m_k) = n^{-2}(p/n)^k \sum_{(\pi_1,\pi_2)\in I_k} \big(E(w_{\pi_1} w_{\pi_2}) - Ew_{\pi_1}\,Ew_{\pi_2}\big).$$
The estimate of the variance now follows from:

Lemma 2.9.
$$n^{-2}(p/n)^k \sum_{(\pi_1,\pi_2)\in I_k} E(w_{\pi_1} w_{\pi_2}) = O(n^{-1}), \qquad n^{-2}(p/n)^k \sum_{(\pi_1,\pi_2)\in I_k} Ew_{\pi_1}\,Ew_{\pi_2} = O(n^{-1}).$$
For a proof, see Appendix A. This concludes the proof of the second part of the theorem.

2.4. Proof of Theorem 2.6. We begin by introducing notation: given a set $A$, we denote by $\Omega(A)$ the set of injective maps
$$\Omega(A) = \{S : A \hookrightarrow D\},$$
and consider $\Omega(A)$ as a probability space equipped with the uniform probability measure.

2.4.1. Proof of Equation (2.6). Let $\alpha = [\pi] \in P_k/\Pi_n$ be an isomorphism class and assume that $k > 2(|V_\pi| - 1)$. Our goal is to show that
$$\lim_{p\to\infty} n(\alpha)\,|Ew_\alpha| = 0.$$
On the one hand, by Equation (2.1), we have $|[\pi]| \le n^{|V_\pi|}$, therefore
(2.10)  $$n(\alpha) = n^{-1}(p/n)^{k/2}\,|[\pi]| \le p^{k/2}\,n^{|V_\pi| - 1 - k/2}.$$
On the other hand,
$$Ew_\pi = |\Omega_n|^{-1}\sum_{S\in\Omega_n} w_\pi(S) = |\Omega(V_\pi)|^{-1}\sum_{S\in\Omega(V_\pi)} w_\pi(S).$$
By the triangle inequality, $|Ew_\pi| \le |\Omega(V_\pi)|^{-1}\sum_{S\in\Omega(V_\pi)} |w_\pi(S)|$; moreover, by the incoherence condition (Equation (1.1)), $|w_\pi(S)| \le \mu^k\,p^{-k/2}$ for every $S \in \Omega(V_\pi)$. In conclusion, we get $|Ew_\pi| \le \mu^k\,p^{-k/2}$, which combined with (2.10) yields
$$n(\alpha)\,|Ew_\alpha| = O\big(n^{|V_\pi| - 1 - k/2}\big) \xrightarrow[p\to\infty]{} 0,$$
since, by assumption, $|V_\pi| - 1 - k/2 < 0$. This concludes the proof of Equation (2.6).

2.4.2. Proof of Equations (2.7) and (2.8). Let $\alpha = [\pi] \in P_k/\Pi_n$ be an isomorphism class and assume that $k \le 2(|V_\pi| - 1)$. We prove Equations (2.7), (2.8) by induction on $|V_\pi|$. Since $k \le 2(|V_\pi| - 1)$, there exists a vertex $v = \pi(i_0)$, with $0 \le i_0 \le k - 1$, which is crossed exactly once by the path $\pi$. Let $v_l = \pi(i_0 - 1)$ and $v_r = \pi(i_0 + 1)$ be the vertices adjacent to $v$. We will deal with the following two cases separately: Case 1: $v_l \ne v_r$; Case 2: $v_l = v_r$.

We introduce the following auxiliary constructions. If $v_l \ne v_r$, let $\pi_{\widehat{v}} : [0, k-1] \to [1, n]$ denote the closed path of length $k - 1$ defined by
$$\pi_{\widehat{v}}(j) = \begin{cases} \pi(j), & j \le i_0 - 1, \\ \pi(j + 1), & i_0 \le j \le k - 1. \end{cases}$$
In words, the path $\pi_{\widehat{v}}$ is obtained from $\pi$ by deleting the vertex $v$ and inserting an edge connecting $v_l$ to $v_r$. If $v_l = v_r$, let $\pi_{\widehat{v}} : [0, k-2] \to [1, n]$ denote the closed path of length $k - 2$ defined by
$$\pi_{\widehat{v}}(j) = \begin{cases} \pi(j), & j \le i_0 - 1, \\ \pi(j + 2), & i_0 \le j \le k - 2. \end{cases}$$
In words, the path $\pi_{\widehat{v}}$ is obtained from $\pi$ by deleting the vertex $v$ and identifying the vertices $v_l$ and $v_r$. In addition, for every $u \in V_\pi \smallsetminus \{v, v_l, v_r\}$, let $\pi_u : [0, k] \to [1, n]$ denote the closed path of length $k$ defined by
$$\pi_u(j) = \begin{cases} \pi(j), & j \le i_0 - 1, \\ u, & j = i_0, \\ \pi(j), & i_0 + 1 \le j \le k. \end{cases}$$
In words, the path $\pi_u$ is obtained from $\pi$ by deleting the vertex $v$ and inserting an edge connecting $v_l$ to $u$ followed by an edge connecting $u$ to $v_r$.

Important fact: the number of vertices in each of the paths $\pi_{\widehat{v}}$, $\pi_u$ is $|V_\pi| - 1$.

The main technical statement is the following relation between the expectation $Ew_\pi$ and the expectations $Ew_{\pi_{\widehat{v}}}$, $Ew_{\pi_u}$; it is obtained by averaging over the value $S(v)$, using the fact that $\sum_{b\in D}\langle\varphi, b\rangle\langle b, \phi\rangle = |X|\,\langle\varphi, \phi\rangle$ for every $\varphi, \phi \in \mathcal{H}$ (the boundary terms $u \in \{v_l, v_r\}$ fold into the coefficient of the first summand):

Proposition 2.10.
(2.11)  $$Ew_\pi = \frac{\big(|X| + O(1)\big)\,Ew_{\pi_{\widehat{v}}} - \sum_{u} Ew_{\pi_u}}{p\,|X| - |V_\pi| + 1}.$$

For a proof, see Appendix A.

Analysis of Case 1. In this case the path $\pi$ is not a tree, hence our goal is to show that $\lim_{p\to\infty} n(\alpha)\,Ew_\alpha = 0$. The length of $\pi_{\widehat{v}}$ is $k - 1$ and $|V_{\pi_{\widehat{v}}}| = |V_\pi| - 1$, therefore
$$n(\alpha) \sim p^{k/2}\,n^{|V_\pi| - 1 - k/2} = p^{1/2} n^{1/2}\, n([\pi_{\widehat{v}}]).$$
The length of $\pi_u$ is $k$ and $|V_{\pi_u}| = |V_\pi| - 1$, therefore
$$n(\alpha) \sim p^{k/2}\,n^{|V_\pi| - 1 - k/2} = n\, n([\pi_u]).$$
Applying the above to (2.11), we obtain
$$n(\alpha)\,Ew_\alpha \lesssim (n/p)^{1/2}\, n([\pi_{\widehat{v}}])\,Ew_{\pi_{\widehat{v}}} + \frac{n}{p\,|X|}\sum_u n([\pi_u])\,Ew_{\pi_u}.$$
By estimate (2.6) and the induction hypothesis, $n([\pi_{\widehat{v}}])\,Ew_{\pi_{\widehat{v}}} = O(1)$, and likewise each $n([\pi_u])\,Ew_{\pi_u} = O(1)$; therefore $\lim_{p\to\infty} n(\alpha)\,Ew_\alpha = 0$, since $(n/p)^{1/2} = o(1)$ (recall that we take $n = p^{1-\varepsilon}$). This concludes the proof of Equation (2.7).

Analysis of Case 2. The length of $\pi_{\widehat{v}}$ is $k - 2$ and $|V_{\pi_{\widehat{v}}}| = |V_\pi| - 1$, therefore
$$n(\alpha) \sim p^{k/2}\,n^{|V_\pi| - 1 - k/2} = p\, n([\pi_{\widehat{v}}]).$$
The length of $\pi_u$ is $k$ and $|V_{\pi_u}| = |V_\pi| - 1$, therefore
$$n(\alpha) \sim p^{k/2}\,n^{|V_\pi| - 1 - k/2} = n\, n([\pi_u]).$$
Applying the above to (2.11) yields
$$n(\alpha)\,Ew_\alpha \sim n([\pi_{\widehat{v}}])\,Ew_{\pi_{\widehat{v}}},$$
the contribution of the paths $\pi_u$ being negligible as in Case 1. If $\pi$ is a tree then $\pi_{\widehat{v}}$ is also a tree, with a smaller number of vertices; therefore, by the induction hypothesis, $\lim_{p\to\infty} n([\pi_{\widehat{v}}])\,Ew_{\pi_{\widehat{v}}} = 1$, which implies by (2.11) that $\lim_{p\to\infty} n(\alpha)\,Ew_\alpha = 1$ as well. (If $\pi$ is not a tree, then neither is $\pi_{\widehat{v}}$, and the induction hypothesis gives the limit $0$, proving (2.7) in this case too.) This concludes the proof of Equation (2.8).

3. Examples of incoherent dictionaries

3.1. Representation theory.

3.1.1. The Heisenberg group. Let $(V, \omega)$ be a two-dimensional symplectic vector space over the finite field $\mathbb{F}_p$. The reader should think of $V$ as $\mathbb{F}_p \times \mathbb{F}_p$ with the standard symplectic form
$$\omega\big((\tau, w), (\tau', w')\big) = \tau w' - w\tau'.$$
Considering $V$ as an abelian group, it admits a non-trivial central extension called the Heisenberg group. Concretely, the group $H$ can be presented as the set $H = V \times \mathbb{F}_p$ with the multiplication given by
$$(v, z)\cdot(v', z') = \big(v + v',\ z + z' + \tfrac{1}{2}\omega(v, v')\big).$$
The center of $H$ is $Z = Z(H) = \{(0, z) : z \in \mathbb{F}_p\}$. The symplectic group $Sp = Sp(V, \omega)$, which in this case is just isomorphic to $SL_2(\mathbb{F}_p)$, acts by automorphisms of $H$ through its tautological action on the $V$-coordinate; that is, a matrix
$$g = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$$
sends an element $(v, z)$, where $v = (\tau, w)$, to the element $(gv, z)$, where $gv = (a\tau + bw,\ c\tau + dw)$.

3.1.2. The Heisenberg representation. One of the most important attributes of the group $H$ is that it admits, principally, a unique irreducible representation. The precise statement goes as follows. Let $\psi : Z \to S^1$ be a non-trivial unitary character of the center; for example, in this paper we take $\psi(z) = e^{\frac{2\pi i}{p} z}$.
It is not difficult to show [T] that:

Theorem 3.1 (Stone-von Neumann). There exists a unique (up to isomorphism) irreducible unitary representation $\pi : H \to U(\mathcal{H})$ with central character $\psi$, that is, $\pi(z) = \psi(z)\,\mathrm{Id}_{\mathcal{H}}$ for every $z \in Z$.

The representation $\pi$ which appears in the above theorem will be called the Heisenberg representation. The representation $\pi : H \to U(\mathcal{H})$ can be realized as follows: $\mathcal{H}$ is the Hilbert space $\mathbb{C}(\mathbb{F}_p)$ of complex valued functions on the finite line, with the standard inner product
$$\langle f, g\rangle = \sum_{t\in\mathbb{F}_p} f(t)\,\overline{g(t)},$$
for every $f, g \in \mathbb{C}(\mathbb{F}_p)$, and the action $\pi$ is given by
$$\pi(\tau, 0)[f](t) = f(t + \tau); \qquad \pi(0, w)[f](t) = \psi(wt)\,f(t); \qquad \pi(z)[f](t) = \psi(z)\,f(t),\ z \in Z.$$
Here we are using $\tau$ to indicate the first coordinate and $w$ to indicate the second coordinate of $V \simeq \mathbb{F}_p \times \mathbb{F}_p$. We will call this explicit realization the standard realization.

3.1.3. The Weil representation. A direct consequence of Theorem 3.1 is the existence of a projective unitary representation $\widetilde{\rho} : Sp \to PU(\mathcal{H})$. The construction of $\widetilde{\rho}$ out of the Heisenberg representation $\pi$ is due to Weil [W] and it goes as follows. Consider the Heisenberg representation $\pi : H \to U(\mathcal{H})$ and an element $g \in Sp$; one can define a new representation $\pi^g : H \to U(\mathcal{H})$ by $\pi^g(h) = \pi(g(h))$. Clearly both $\pi$ and $\pi^g$ have the same central character $\psi$, hence by Theorem 3.1 they are isomorphic. Since the space of intertwining morphisms $\mathrm{Hom}_H(\pi, \pi^g)$ is one dimensional, choosing for every $g \in Sp$ a non-zero representative $\widetilde{\rho}(g) \in \mathrm{Hom}_H(\pi, \pi^g)$ gives the required projective representation. In more concrete terms, the projective representation $\widetilde{\rho}$ is characterized by Egorov's condition:
(3.1)  $$\widetilde{\rho}(g)\,\pi(h)\,\widetilde{\rho}(g)^{-1} = \pi(g(h)),$$
for every $g \in Sp$ and $h \in H$. The important and non-trivial statement is that the projective representation $\widetilde{\rho}$ can be linearized, in a unique manner, into an honest unitary representation:

Theorem 3.2. There exists a unique¹ unitary representation
$$\rho : Sp \to U(\mathcal{H}),$$
such that every operator $\rho(g)$ satisfies Equation (3.1).
¹Unique, except in the case where the finite field is $\mathbb{F}_3$.

For the sake of concreteness, let us give an explicit description (which can be directly verified using Equation (3.1)) of the operators $\rho(g)$, for different elements $g \in Sp$, as they appear in the standard realization. The operators will be specified up to a unitary scalar.

The standard diagonal subgroup $A \subset Sp$ acts by (normalized) scaling: an element
$$a = \begin{pmatrix} a & 0 \\ 0 & a^{-1} \end{pmatrix}$$
acts by
(3.2)  $$S_a[f](t) = \sigma(a)\,f(a^{-1}t),$$
where $\sigma : \mathbb{F}_p^{\times} \to \{\pm 1\}$ is the unique non-trivial quadratic character of the multiplicative group $\mathbb{F}_p^{\times}$ (also called the Legendre character), given by $\sigma(a) = a^{(p-1)/2} \pmod p$.

The subgroup $U \subset Sp$ of strictly upper triangular elements acts by quadratic exponents (chirps): an element
$$u = \begin{pmatrix} 1 & u \\ 0 & 1 \end{pmatrix}$$
acts by
$$M_u[f](t) = \psi\big(\tfrac{u}{2}\,t^2\big)\,f(t).$$

The Weyl element
$$w = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$$
acts by the discrete Fourier transform
$$F[f](w) = \frac{1}{\sqrt{p}}\sum_{t\in\mathbb{F}_p}\psi(wt)\,f(t).$$

3.2. The Heisenberg dictionary. The Heisenberg dictionary is a collection of $p + 1$ orthonormal bases, each characterized, roughly, as eigenvectors of a specific linear operator. An elegant way to define this dictionary is using the Heisenberg representation [H, HCM].

3.2.1. Bases associated with lines. The Heisenberg group is non-commutative, yet it contains various commutative subgroups which can easily be described as follows. Let $L \subset V$ be a line in $V$. One can associate to $L$ a commutative subgroup $A_L \subset H$, given by $A_L = \{(l, 0) : l \in L\}$. It will be convenient to identify the group $A_L$ with the line $L$. Restricting the Heisenberg representation $\pi$ to the subgroup $L$, namely, considering the restricted representation $\pi : L \to U(\mathcal{H})$, one obtains a collection of operators $\{\pi(l) : l \in L\}$ which commute pairwise. This, in turn, yields an orthogonal decomposition into character spaces
$$\mathcal{H} = \bigoplus_{\chi} \mathcal{H}_{\chi},$$
where $\chi$ runs in the set $\widehat{L}$ of unitary characters of $L$. A more concrete way to specify the above decomposition is by choosing a non-zero vector $l_0 \in L$.
After such a choice, the character space $\mathcal{H}_\chi$ naturally corresponds to the eigenspace of the linear operator $\pi(l_0)$ associated with the eigenvalue $\lambda = \chi(l_0)$. It is not difficult to verify that in this case:

Lemma 3.3. For every $\chi \in \widehat{L}$ we have $\dim \mathcal{H}_\chi = 1$.

Choosing a vector $\varphi_\chi \in \mathcal{H}_\chi$ of unit norm, $\|\varphi_\chi\| = 1$, for every $\chi \in \widehat{L}$ which appears in the decomposition, we obtain an orthonormal basis which we denote by $B_L$.

Theorem 3.4 ([H, HCM]). For every pair of different lines $L, M \subset V$ and for every $\varphi \in B_L$, $\phi \in B_M$,
$$|\langle \varphi, \phi \rangle| = \frac{1}{\sqrt{p}}.$$

Since there exist $p + 1$ different lines in $V$, we obtain in this manner a collection of $p + 1$ orthonormal bases
$$D_H = \coprod_{L \subset V} B_L,$$
which are $\mu = 1$-coherent. We will call this dictionary, for obvious reasons, the Heisenberg dictionary.

3.3. The oscillator dictionary. Reflecting back on the Heisenberg dictionary, we see that it consists of a collection of orthonormal bases characterized in terms of commutative families of unitary operators, where each such family is associated with a commutative subgroup of the Heisenberg group $H$ via the Heisenberg representation $\pi : H \to U(\mathcal{H})$. In comparison, the oscillator dictionary [GHS] is characterized in terms of commutative families of unitary operators which are associated with commutative subgroups of the symplectic group $Sp$ via the Weil representation $\rho : Sp \to U(\mathcal{H})$.

3.3.1. Maximal tori. The commutative subgroups of $Sp$ that we consider are called maximal algebraic tori [B] (not to be confused with the notion of a topological torus). A maximal (algebraic) torus in $Sp$ is a maximal commutative subgroup which becomes diagonalizable over some field extension. The most standard example of a maximal algebraic torus is the standard diagonal torus
$$A = \left\{\begin{pmatrix} a & 0 \\ 0 & a^{-1}\end{pmatrix} : a \in \mathbb{F}_p^{\times}\right\}.$$
Standard linear algebra shows that, up to conjugation², there exist two classes of maximal (algebraic) tori in $Sp$.
The …rst class consists of those tori which are diagonalizable already over Fp , namely, those are tori T which are conjugated to the standard diagonal torus A or more precisely such that there exists an element g 2 Sp so that g T g 1 = A. A torus in this class is called a split torus. The second class consists of those tori which become diagonalizable over the quadratic extension Fp2 , namely, those are tori which are not conjugated to the standard diagonal torus A. A torus in this class is called a non-split torus (sometimes it is called inert torus). All split (non-split) tori are conjugated to one another, therefore the number of split tori is the number of elements in the coset space Sp=N (see [A] for basics of group theory), where N is the normalizer group of A; we have # (Sp=N ) = p (p + 1) ; 2 and the number of non-split tori is the number of elements in the coset space Sp=M , where M is the normalizer group of some non-split torus; we have # (Sp=M ) = p (p 1) : Example of a non-split maximal torus. It might be suggestive to explain further the notion of non-split torus by exploring, …rst, the analogue notion in the more familiar setting of the …eld R. Here, the standard example of a maximal non-split torus is the circle group SO(2) SL2 (R). Indeed, it is a maximal commutative subgroup which becomes diagonalizable when considered over the extension …eld C of complex numbers. The above analogy suggests a way to construct examples of maximal non-split tori in the …nite …eld setting as well. Let us assume for simplicity that 1 does not admit a square root in Fp or equivalently that p 1 mod 4. The group Sp acts naturally on the plane V = 2 Two elements h ; h in a group G are called conjugated elements if there exists an element 1 2 g 2 G such that g h1 g 1 = h2 . More generally, Two subgroups H1 ; H2 G are called conjugated subgroups if there exists an element g 2 G such that g H1 g 1 = H2 . STATISTICAL RESTRICTED ISOM ETRY PROPERTY Fp 17 Fp . 
Consider the standard symmetric form $B$ on $V$ given by
$$B((x,y),(x',y')) = xx' + yy'.$$
An example of a maximal non-split torus is the subgroup $SO = SO(V,B) \subset Sp$ consisting of all elements $g \in Sp$ preserving the form $B$, namely $g \in SO$ if and only if $B(gu,gv) = B(u,v)$ for every $u,v \in V$. In coordinates, $SO$ consists of all matrices $A \in SL_2(\mathbb{F}_p)$ which satisfy $AA^t = I$. The reader might think of $SO$ as the "finite circle".

3.3.2. Bases associated with maximal tori. Restricting the Weil representation $\rho$ to a maximal torus $T \subset Sp$ yields an orthogonal decomposition into character spaces
$$\mathcal{H} = \bigoplus_{\chi} \mathcal{H}_\chi, \tag{3.3}$$
where $\chi$ runs in the set $\widehat{T}$ of unitary characters of the torus $T$. A more concrete way to specify the above decomposition is by choosing a generator³ $t_0 \in T$, that is, an element such that every $t \in T$ can be written in the form $t = t_0^n$ for some $n \in \mathbb{N}$. After such a choice, each character space $\mathcal{H}_\chi$ which appears in (3.3) naturally corresponds to the eigenspace of the linear operator $\rho(t_0)$ associated with the eigenvalue $\lambda = \chi(t_0)$.

The decomposition (3.3) depends on the type of $T$ in the following manner. In the case where $T$ is a split torus, we have $\dim \mathcal{H}_\chi = 1$ unless $\chi = \sigma$, where $\sigma : T \to \{\pm 1\}$ is the unique non-trivial quadratic character of $T$ (also called the Legendre character of $T$); in the latter case $\dim \mathcal{H}_\sigma = 2$. In the case where $T$ is a non-split torus, $\dim \mathcal{H}_\chi = 1$ for every character $\chi$ which appears in the decomposition; in this case the quadratic character $\sigma$ does not appear in the decomposition (for details see [GH]).

Choosing for every character $\chi \in \widehat{T}$, $\chi \neq \sigma$, a vector $\varphi_\chi \in \mathcal{H}_\chi$ of unit norm, we obtain an orthonormal system of vectors $B_T = \{\varphi_\chi : \chi \neq \sigma\}$.

Important fact: In the case when $T$ is a non-split torus, the set $B_T$ is an orthonormal basis.

Example 3.5. It would be beneficial to describe explicitly the system $B_A$ when $A \simeq \mathbb{G}_m$ is the standard diagonal torus. The torus $A$ acts on the Hilbert space $\mathcal{H}$ by scaling (see Equation (3.2)).
For every $\chi \neq \sigma$, define a function $\varphi_\chi \in \mathcal{C}(\mathbb{F}_p)$ as follows:
$$\varphi_\chi(t) = \begin{cases} \frac{1}{\sqrt{p-1}}\, \chi(t) & t \neq 0, \\ 0 & t = 0. \end{cases}$$
It is easy to verify that $\varphi_\chi$ is a character vector with respect to the action $\rho : A \to U(\mathcal{H})$, associated to the character $\chi$. Concluding, the orthonormal system $B_A$ is the set $\{\varphi_\chi : \chi \in \widehat{\mathbb{G}}_m,\ \chi \neq \sigma\}$.

Theorem 3.6 ([GHS]). Let $\psi \in B_{T_1}$ and $\varphi \in B_{T_2}$ be distinct vectors. Then
$$|\langle \psi, \varphi \rangle| \leq \frac{4}{\sqrt{p}}.$$

³ A maximal torus $T$ in $SL_2(\mathbb{F}_p)$ is a cyclic group, thus there exists a generator.

Since there exist $p(p-1)/2$ distinct non-split tori in $Sp$, we obtain in this manner a collection of $p(p-1)/2$ orthonormal bases
$$D_O = \coprod_{\substack{T \subset Sp \\ \text{non-split}}} B_T,$$
which are $\mu = 4$-coherent. We will call this dictionary the oscillator dictionary.

3.4. The extended oscillator dictionary.

3.4.1. The Jacobi group. Let us denote by $J$ the semi-direct product of groups $J = Sp \ltimes H$. The group $J$ will be referred to as the Jacobi group.

3.4.2. The Heisenberg–Weil representation. The Heisenberg representation $\pi : H \to U(\mathcal{H})$ and the Weil representation $\rho : Sp \to U(\mathcal{H})$ combine to a representation of the Jacobi group
$$\tau = \rho \ltimes \pi : J \to U(\mathcal{H}),$$
defined by $\tau(g,h) = \rho(g)\pi(h)$. The fact that $\tau$ is indeed a representation is a direct consequence of Egorov's condition, Equation (3.1). We will refer to the representation $\tau$ as the Heisenberg–Weil representation.

3.4.3. Maximal tori in the Jacobi group. Given a non-split torus $T \subset Sp$, the conjugate subgroup $T_v = vTv^{-1} \subset J$, for every $v \in V$, will be called a maximal non-split torus in $J$. It is easy to verify that the subgroups $T_v, T_u$ are distinct for $v \neq u$; moreover, for different tori $T \neq T' \subset Sp$ the subgroups $T_v, T'_u$ are distinct for every $v, u \in V$. This implies that there are $p(p-1)p^2/2$ non-split maximal tori in $J$.

3.4.4. Bases associated with maximal tori. Restricting the Heisenberg–Weil representation $\tau$ to a maximal non-split torus $T \subset J$ yields a basis $B_T$ consisting of character vectors.
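As an aside (not part of the original text), the explicit system $B_A$ of Example 3.5 is easy to verify numerically: the vectors $\varphi_\chi$ are orthonormal, and each is an eigenvector of the scaling operators $(S_a f)(t) = f(at)$ through which the diagonal torus acts. The sketch below ignores the normalizing Legendre multiplier of the actual Weil action, checks only the scaling part, and includes $\sigma$ in the system; the generator $3$ of $\mathbb{F}_7^\times$ is a choice made here, not taken from the paper.

```python
import numpy as np

p = 7          # a small prime
g = 3          # a generator of the cyclic group F_p^* (3 generates F_7^*)

# Discrete logarithm base g, so that multiplicative characters can be
# written as chi_j(g^k) = exp(2*pi*i*j*k/(p-1)).
log = {pow(g, k, p): k for k in range(p - 1)}

def chi(j, t):
    """The j-th multiplicative character of F_p^*, evaluated at t != 0."""
    return np.exp(2j * np.pi * j * log[t] / (p - 1))

def phi(j):
    """phi_chi(t) = chi(t)/sqrt(p-1) for t != 0, and phi_chi(0) = 0."""
    v = np.zeros(p, dtype=complex)
    for t in range(1, p):
        v[t] = chi(j, t) / np.sqrt(p - 1)
    return v

# Note: this list also contains the quadratic character j = (p-1)/2,
# which the text excludes from B_A; orthonormality holds either way.
B_A = [phi(j) for j in range(p - 1)]

# Orthonormality: <phi_i, phi_j> = delta_ij (np.vdot conjugates its
# first argument).
Gram = np.array([[np.vdot(u, v) for v in B_A] for u in B_A])
assert np.allclose(Gram, np.eye(p - 1))

# Character-vector property: (S_a f)(t) = f(a t) sends phi_chi to
# chi(a) * phi_chi, since chi(a t) = chi(a) chi(t).
a = 2
for j, v in enumerate(B_A):
    Sv = np.array([v[(a * t) % p] for t in range(p)])
    assert np.allclose(Sv, chi(j, a) * v)
```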
A way to think of the basis $B_{T_v}$ is as follows: if $T_v = vTv^{-1}$ where $T$ is a maximal torus in $Sp$, then the basis $B_{T_v}$ can be derived from the already known basis $B_T$ by $B_{T_v} = \pi(v)B_T$; namely, the basis $B_{T_v}$ consists of the vectors $\pi(v)\varphi$ where $\varphi \in B_T$. Interestingly, given any two tori $T_1, T_2 \subset J$, the bases $B_{T_1}, B_{T_2}$ remain $\mu = 4$-coherent; this is a direct consequence of the following generalization of Theorem 3.6:

Theorem 3.7 ([GHS]). Given (not necessarily distinct) tori $T_1, T_2 \subset Sp$ and a pair of vectors $\varphi \in B_{T_1}$, $\psi \in B_{T_2}$,
$$|\langle \varphi, \pi(v)\psi \rangle| \leq \frac{4}{\sqrt{p}},$$
for every $v \in V$.

Since there exist $p(p-1)p^2/2$ distinct non-split tori in $J$, we obtain in this manner a collection of $p(p-1)p^2/2 \approx p^4$ orthonormal bases
$$D_{EO} = \coprod_{\substack{T \subset J \\ \text{non-split}}} B_T,$$
which are $\mu = 4$-coherent. We will call this dictionary the extended oscillator dictionary.

Remark 3.8. A way to interpret Theorem 3.7 is to say that any two different vectors $\varphi \neq \psi \in D_O$ are incoherent in a stable sense, that is, their coherency is at most $4/\sqrt{p}$ no matter whether either one of them undergoes an arbitrary time/phase shift. This property seems to be important in communication, where a transmitted signal may acquire a time shift due to asynchronous communication and a phase shift due to the Doppler effect.

Appendix A. Proof of statements

A.1. Proof of Lemma 2.5. Let $\pi : [0,k] \to [1,n]$, $\pi \in P_k$, and $\sigma \in \Sigma_n$. By definition,
$$\mathbb{E} w_{\sigma(\pi)} = |\Sigma_n|^{-1} \sum_{S \in \Sigma_n} w_{\sigma(\pi)}(S).$$
Direct verification reveals that $w_{\sigma(\pi)}(S) = w_\pi(\sigma(S))$, where $\sigma(S) = S \circ \sigma : [1,n] \to D$; hence
$$\sum_{S \in \Sigma_n} w_{\sigma(\pi)}(S) = \sum_{S \in \Sigma_n} w_\pi(\sigma(S)) = \sum_{S \in \Sigma_n} w_\pi(S),$$
which implies that $\mathbb{E} w_{\sigma(\pi)} = \mathbb{E} w_\pi$. This concludes the proof of the lemma.

A.2. Proof of Lemma 2.7. We need to introduce the notion of a Dyck word.

Definition A.1. A Dyck word of length $2m$ is a sequence $D = d_1 d_2 \cdots d_{2m}$, where $d_i = \pm 1$, which satisfies
$$\sum_{i=1}^{l} d_i \geq 0,$$
for every $l = 1, \ldots, 2m$, with equality for $l = 2m$.

Let us denote by $\mathcal{D}_{2m}$ the set of Dyck words of length $2m$. It is well known that $|\mathcal{D}_{2m}| = C_m = \frac{1}{m+1}\binom{2m}{m}$, the $m$-th Catalan number. In addition, let us denote by $T_{2m} \subset P_{2m}$ the subset of trees of length $2m$.
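The count $|\mathcal{D}_{2m}| = C_m$ can be confirmed by direct enumeration for small $m$ (an illustration, not part of the proof):

```python
from itertools import product
from math import comb

def dyck_words(m):
    """All +-1 sequences of length 2m with nonnegative partial sums
    that return to zero at the end (Dyck words, as in Definition A.1)."""
    words = []
    for w in product((1, -1), repeat=2 * m):
        total, ok = 0, True
        for d in w:
            total += d
            if total < 0:   # a partial sum went negative: not a Dyck word
                ok = False
                break
        if ok and total == 0:
            words.append(w)
    return words

def catalan(m):
    """The m-th Catalan number C_m = binom(2m, m)/(m+1)."""
    return comb(2 * m, m) // (m + 1)

# Check |D_{2m}| = C_m for m = 1, ..., 6 (C_m = 1, 2, 5, 14, 42, 132).
for m in range(1, 7):
    assert len(dyck_words(m)) == catalan(m)
```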
Our goal is to establish a bijection
$$D : T_{2m}/\Sigma_n \xrightarrow{\ \sim\ } \mathcal{D}_{2m}.$$
Given a tree $\pi \in T_{2m}$, define the word $D(\pi) = d_1 d_2 \cdots d_{2m}$ as follows:
$$d_i = \begin{cases} 1 & \text{if the edge } \{\pi(i-1), \pi(i)\} \text{ is crossed for the first time on the } i\text{-th step}, \\ -1 & \text{otherwise}. \end{cases}$$
The word $D(\pi)$ is a Dyck word, since $\sum_{i=1}^{l} d_i$ counts the number of vertices visited exactly once by $\pi$ in the first $l$ steps; therefore it is greater than or equal to zero.

In one direction, if two trees $\pi_1, \pi_2$ are isomorphic then $D(\pi_1) = D(\pi_2)$. In addition, it is easy to verify that the tree $\pi$ can be reconstructed from the pair $(D(\pi), V_\pi)$, where $V_\pi$ is the set of vertices of $\pi$ equipped with the following linear order: $v < u$ if and only if $\pi$ crosses $v$ for the first time before it crosses $u$ for the first time. This implies that $D$ defines an injection from $T_{2m}/\Sigma_n$ into $\mathcal{D}_{2m}$. Conversely, it is easy to verify that for every Dyck word $D \in \mathcal{D}_{2m}$ there is a tree $\pi \in T_{2m}$ such that $D = D(\pi)$, which implies that the map $D$ is surjective. This concludes the proof of the lemma.

A.3. Proof of Lemma 2.9. We begin with an auxiliary construction. Define a map $\sqcup : I_k \to P_{2k}$ as follows. Given $(\pi_1, \pi_2) \in I_k$, let $0 \leq i_1 \leq k$ be the first index so that $\pi_1(i_1) \in V_{\pi_2}$, and let $0 \leq i_2 \leq k$ be the first index such that $\pi_1(i_1) = \pi_2(i_2)$. Define
$$(\pi_1 \sqcup \pi_2)(j) = \begin{cases} \pi_1(j) & 0 \leq j \leq i_1, \\ \pi_2(i_2 - i_1 + j) & i_1 \leq j \leq i_1 + k, \\ \pi_1(j - k) & i_1 + k \leq j \leq 2k, \end{cases}$$
where the argument of $\pi_2$ is taken modulo $k$. In words, the path $\pi_1 \sqcup \pi_2$ is obtained, roughly, by substituting the path $\pi_2$ for the vertex $\pi_1(i_1)$. Clearly, the map $\sqcup$ is injective and commutes with the action of $\Sigma_n$; therefore we get, in particular, that the number of elements in the isomorphism class $[\pi_1, \pi_2] \in I_k/\Sigma_n$ is smaller than or equal to the number of elements in the isomorphism class $[\pi_1 \sqcup \pi_2] \in P_{2k}/\Sigma_n$:
$$|[\pi_1, \pi_2]| \leq |[\pi_1 \sqcup \pi_2]|. \tag{A.1}$$

First estimate. We need to show
$$n^{-2}(p/n)^k \sum_{(\pi_1,\pi_2)\in I_k} \mathbb{E}(w_{\pi_1} w_{\pi_2}) = O(n^{-1}).$$
Write
$$\sum_{(\pi_1,\pi_2)\in I_k} \mathbb{E}(w_{\pi_1} w_{\pi_2}) = \sum_{[\pi_1,\pi_2]\in I_k/\Sigma_n} |[\pi_1,\pi_2]|\, \mathbb{E}(w_{\pi_1} w_{\pi_2}).$$
It is enough to show that for every $[\pi_1,\pi_2] \in I_k/\Sigma_n$
$$n^{-2}(p/n)^k\, |[\pi_1,\pi_2]|\, |\mathbb{E}(w_{\pi_1} w_{\pi_2})| = O(n^{-1}). \tag{A.2}$$
Fix an isomorphism class $[\pi_1,\pi_2] \in I_k/\Sigma_n$. By Equation (A.1) we have $|[\pi_1,\pi_2]| \leq |[\pi_1 \sqcup \pi_2]|$. In addition, a simple observation reveals that $w_{\pi_1} w_{\pi_2} = w_{\pi_1 \sqcup \pi_2}$, which implies that $\mathbb{E}(w_{\pi_1} w_{\pi_2}) = \mathbb{E}(w_{\pi_1 \sqcup \pi_2})$. In conclusion, since the length of $\pi_1 \sqcup \pi_2$ is $2k$, we get that
$$n^{-2}(p/n)^k\, |[\pi_1,\pi_2]|\, |\mathbb{E}(w_{\pi_1} w_{\pi_2})| \leq n^{-1}\, m_n([\pi_1 \sqcup \pi_2])\, \mathbb{E}(w_{\pi_1 \sqcup \pi_2}).$$
Finally, by Theorem 2.6, we have $m_n([\pi_1 \sqcup \pi_2])\, \mathbb{E}(w_{\pi_1 \sqcup \pi_2}) = O(1)$; hence Equation (A.2) follows. This concludes the proof of the first estimate.

Second estimate. We need to show
$$n^{-2}(p/n)^k \sum_{(\pi_1,\pi_2)\in I_k} \mathbb{E} w_{\pi_1}\, \mathbb{E} w_{\pi_2} = O(n^{-1}).$$
Write
$$\sum_{(\pi_1,\pi_2)\in I_k} \mathbb{E} w_{\pi_1}\, \mathbb{E} w_{\pi_2} = \sum_{[\pi_1,\pi_2]\in I_k/\Sigma_n} |[\pi_1,\pi_2]|\, \mathbb{E} w_{\pi_1}\, \mathbb{E} w_{\pi_2}.$$
It is enough to show that for every $[\pi_1,\pi_2] \in I_k/\Sigma_n$
$$n^{-2}(p/n)^k\, |[\pi_1,\pi_2]|\, \mathbb{E} w_{\pi_1}\, \mathbb{E} w_{\pi_2} = O(n^{-1}). \tag{A.3}$$
Fix an isomorphism class $[\pi_1,\pi_2] \in I_k/\Sigma_n$. By Equation (A.1) we have $|[\pi_1,\pi_2]| \leq |[\pi_1 \sqcup \pi_2]|$. For every path $\pi$, we have $|[\pi]| = n(n-1)\cdots(n-|V_\pi|+1) \leq n^{|V_\pi|}$ (since always $|V_\pi| \leq k$ and we assume that $k$ is fixed, that is, it does not depend on $p$); in particular $|[\pi]| \asymp n^{|V_\pi|}$. By construction, $|V_{\pi_1 \sqcup \pi_2}| = |V_{\pi_1}| + |V_{\pi_2}| - 1$ (we assume that $V_{\pi_1} \cap V_{\pi_2} \neq \varnothing$); therefore
$$|[\pi_1 \sqcup \pi_2]| = O\!\left(n^{|V_{\pi_1}| + |V_{\pi_2}| - 1}\right) = O\!\left(n^{-1}\, |[\pi_1]|\, |[\pi_2]|\right).$$
In conclusion, we get that
$$n^{-2}(p/n)^k\, |[\pi_1,\pi_2]|\, \mathbb{E} w_{\pi_1}\, \mathbb{E} w_{\pi_2} = O\!\left(n^{-1}\, m_n(\pi_1)\, m_n(\pi_2)\, \mathbb{E} w_{\pi_1}\, \mathbb{E} w_{\pi_2}\right),$$
where we used the identity $n^{-2}(p/n)^k\, |[\pi_1]|\, |[\pi_2]| = m_n(\pi_1)\, m_n(\pi_2)$. Finally, by Theorem 2.6, we have $m_n(\pi_i)\, \mathbb{E} w_{\pi_i} = O(1)$, $i = 1,2$; hence Equation (A.3) follows. This concludes the proof of the second estimate and concludes the proof of the lemma.

A.4. Proof of Proposition 2.10. Write
$$\mathbb{E} w_\pi = |\Sigma(V_\pi)|^{-1} \sum_{S \in \Sigma(V_\pi)} w_\pi(S) = |\Sigma(V_\pi)|^{-1} \sum_{S \in \Sigma(V_\pi \smallsetminus \{v\})} \; \sum_{b \in D \smallsetminus S(V_\pi \smallsetminus \{v\})} w_\pi(S \sqcup b), \tag{A.4}$$
where $S \sqcup b : V_\pi \to D$ is given by
$$(S \sqcup b)(u) = \begin{cases} S(u) & u \neq v, \\ b & u = v. \end{cases}$$
Write
$$\sum_{b \in D \smallsetminus S(V_\pi \smallsetminus \{v\})} w_\pi(S \sqcup b) = \sum_{b \in D} w_\pi(S \sqcup b) - \sum_{b \in S(V_\pi \smallsetminus \{v\})} w_\pi(S \sqcup b).$$
Let us analyze separately the two terms on the right side of the above equation.

First term. Write
$$\sum_{b \in D} w_\pi(S \sqcup b) = \sum_{x \in X} \sum_{b_x \in B_x} w_\pi(S \sqcup b_x).$$
Furthermore,
$$w_\pi(S \sqcup b_x) = \langle \cdot, \cdot \rangle \cdots \langle S(v_l), b_x \rangle \langle b_x, S(v_r) \rangle \cdots \langle \cdot, \cdot \rangle.$$
Since $B_x$ is an orthonormal basis,
$$\sum_{b_x \in B_x} \langle S(v_l), b_x \rangle \langle b_x, S(v_r) \rangle = \langle S(v_l), S(v_r) \rangle,$$
which implies that $\sum_{b_x \in B_x} w_\pi(S \sqcup b_x) = w_{\pi_{\widehat{v}}}(S)$. Concluding, we obtain
$$\sum_{b \in D} w_\pi(S \sqcup b) = |X|\, w_{\pi_{\widehat{v}}}(S). \tag{A.5}$$

Second term. Let $b \in S(V_\pi \smallsetminus \{v\})$. Since $S$ is injective, there exists a unique $u \in V_\pi \smallsetminus \{v\}$ such that $b = S(u)$; therefore
$$w_\pi(S \sqcup b) = \langle \cdot, \cdot \rangle \cdots \langle S(v_l), b \rangle \langle b, S(v_r) \rangle \cdots \langle \cdot, \cdot \rangle = w_{\pi_{v \to u}}(S).$$
Furthermore, observe that when $u = v_l$ or $u = v_r$ we have $w_{\pi_{v \to u}}(S) = w_{\pi_{\widehat{v}}}(S)$. In conclusion, we obtain
$$\sum_{b \in S(V_\pi \smallsetminus \{v\})} w_\pi(S \sqcup b) = 2\, w_{\pi_{\widehat{v}}}(S) + \sum_{u \in V_\pi \smallsetminus \{v_l, v_r, v\}} w_{\pi_{v \to u}}(S). \tag{A.6}$$
Combining (A.5) and (A.6) yields
$$\sum_{b \in D \smallsetminus S(V_\pi \smallsetminus \{v\})} w_\pi(S \sqcup b) = (|X| - 2)\, w_{\pi_{\widehat{v}}}(S) - \sum_{u \in V_\pi \smallsetminus \{v_l, v_r, v\}} w_{\pi_{v \to u}}(S).$$
Substituting the above in (A.4) yields
$$\mathbb{E} w_\pi = (|X| - 2)\, |\Sigma(V_\pi)|^{-1} \sum_{S \in \Sigma(V_\pi \smallsetminus \{v\})} w_{\pi_{\widehat{v}}}(S) \;-\; |\Sigma(V_\pi)|^{-1} \sum_{u \in V_\pi \smallsetminus \{v_l, v_r, v\}} \; \sum_{S \in \Sigma(V_\pi \smallsetminus \{v\})} w_{\pi_{v \to u}}(S). \tag{A.7}$$
Finally, a direct counting argument reveals that
$$|\Sigma(V_\pi)| \approx p|X| \cdot |\Sigma(V_{\pi_{\widehat{v}}})|, \qquad |\Sigma(V_\pi)| \approx p|X| \cdot |\Sigma(V_{\pi_{v \to u}})|.$$
Hence (A.7) yields
$$\mathbb{E} w_\pi \leq p^{-1}\, \mathbb{E} w_{\pi_{\widehat{v}}} + (p|X|)^{-1} \sum_{u \in V_\pi \smallsetminus \{v_l, v_r, v\}} \mathbb{E} w_{\pi_{v \to u}}.$$
This concludes the proof of the proposition.

References

[A] Artin M., Algebra. Prentice Hall, Englewood Cliffs, NJ (1991).
[B] Borel A., Linear algebraic groups. Graduate Texts in Mathematics, 126. Springer-Verlag, New York (1991).
[BDE] Bruckstein A.M., Donoho D.L. and Elad M., From sparse solutions of systems of equations to sparse modeling of signals and images. To appear in SIAM Review (2007).
[DE] Donoho D.L. and Elad M., Optimally sparse representation in general (non-orthogonal) dictionaries via l¹ minimization. Proc. Natl. Acad. Sci. USA 100, no. 5, 2197–2202 (2003).
[DGM] Daubechies I., Grossmann A. and Meyer Y., Painless non-orthogonal expansions. J. Math. Phys. 27 (5), 1271–1283 (1986).
[EB] Elad M. and Bruckstein A.M., A generalized uncertainty principle and sparse representation in pairs of bases. IEEE Trans. Inform. Theory 48, 2558–2567 (2002).
[GG] Golomb S.W. and Gong G., Signal design for good correlation. For wireless communication, cryptography, and radar. Cambridge University Press, Cambridge (2005).
[GH] Gurevich S. and Hadani R., Self-reducibility of the Weil representation and applications. arXiv:math/0612765 (2005).
[GHS] Gurevich S., Hadani R. and Sochen N., The finite harmonic oscillator and its associated sequences. Proc. Natl. Acad. Sci. USA (accepted: Feb. 2008).
[GMS] Gilbert A.C., Muthukrishnan S. and Strauss M.J., Approximation of functions over redundant dictionaries using coherence. Proceedings of the Fourteenth Annual ACM-SIAM Symposium on Discrete Algorithms (Baltimore, MD, 2003), 243–252, ACM, New York (2003).
[GN] Gribonval R. and Nielsen M., Sparse representations in unions of bases. IEEE Trans. Inform. Theory 49, no. 12, 3320–3325 (2003).
[H] Howe R., Nice error bases, mutually unbiased bases, induced representations, the Heisenberg group and finite geometries. Indag. Math. (N.S.) 16, no. 3-4, 553–583 (2005).
[HCM] Howard S.D., Calderbank A.R. and Moran W., The finite Heisenberg-Weyl groups in radar and communications. EURASIP J. Appl. Signal Process. (2006).
[S] Sarwate D.V., Meeting the Welch bound with equality. Sequences and their applications (Singapore, 1998), 79–102, Springer Ser. Discrete Math. Theor. Comput. Sci., Springer, London (1999).
[Se] Serre J.P., Linear representations of finite groups. Graduate Texts in Mathematics, Vol. 42. Springer-Verlag, New York-Heidelberg (1977).
[SH] Strohmer T. and Heath R.W. Jr.,
Grassmannian frames with applications to coding and communication. Appl. Comput. Harmon. Anal. 14, no. 3 (2003).
[T] Terras A., Fourier analysis on finite groups and applications. London Mathematical Society Student Texts, 43. Cambridge University Press, Cambridge (1999).
[Tr] Tropp J.A., Greed is good: algorithmic results for sparse approximation. IEEE Trans. Inform. Theory 50, no. 10, 2231–2242 (2004).
[W] Weil A., Sur certains groupes d'opérateurs unitaires. Acta Math. 111, 143–211 (1964).

Department of Mathematics, University of California, Berkeley, CA 94720, USA.
E-mail address: shamgar@math.berkeley.edu

Department of Mathematics, University of Chicago, IL 60637, USA.
E-mail address: hadani@math.uchicago.edu