(1 + ε)-approximate Sparse Recovery

Eric Price (MIT)    David P. Woodruff (IBM Almaden)

2011-08-12

Abstract

The problem central to sparse recovery and compressive sensing is that of stable sparse recovery: we want a distribution $\mathcal{A}$ of matrices $A \in \mathbb{R}^{m \times n}$ such that, for any $x \in \mathbb{R}^n$ and with probability $1 - \delta > 2/3$ over $A \in \mathcal{A}$, there is an algorithm to recover $\hat{x}$ from $Ax$ with

$$\|\hat{x} - x\|_p \le C \min_{k\text{-sparse } x'} \|x - x'\|_p \qquad (1)$$

for some constant $C > 1$ and norm $p$. The measurement complexity of this problem is well understood for constant $C > 1$. However, in a variety of applications it is important to obtain $C = 1 + \epsilon$ for a small $\epsilon > 0$, and this complexity is not well understood. We resolve the dependence on $\epsilon$ in the number of measurements required of a $k$-sparse recovery algorithm, up to polylogarithmic factors for the central cases of $p = 1$ and $p = 2$. Namely, we give new algorithms and lower bounds that show the number of measurements required is $k/\epsilon^{p/2} \cdot \mathrm{polylog}(n)$. For $p = 2$, our bound of $\frac{1}{\epsilon} k \log(n/k)$ is tight up to constant factors. We also give matching bounds when the output is required to be $k$-sparse, in which case we achieve $k/\epsilon^p \cdot \mathrm{polylog}(n)$. This shows the distinction between the complexity of sparse and non-sparse outputs is fundamental.

1 Introduction

Over the last several years, substantial interest has been generated in the problem of solving underdetermined linear systems subject to a sparsity constraint. The field, known as compressed sensing or sparse recovery, has applications to a wide variety of fields that includes data stream algorithms [Mut05], medical or geological imaging [CRT06, Don06], and genetics testing [SAZ10]. The approach uses the power of a sparsity constraint: a vector $x'$ is $k$-sparse if at most $k$ coefficients are non-zero. A standard formulation for the problem is that of stable sparse recovery: we want a distribution $\mathcal{A}$ of matrices $A \in \mathbb{R}^{m \times n}$ such that, for any $x \in \mathbb{R}^n$ and with probability $1 - \delta > 2/3$ over $A \in \mathcal{A}$, there is an algorithm to recover $\hat{x}$ from $Ax$ with

$$\|\hat{x} - x\|_p \le C \min_{k\text{-sparse } x'} \|x - x'\|_p \qquad (2)$$

for some constant $C > 1$ and norm $p$.¹ We call this a $C$-approximate $\ell_p/\ell_p$ recovery scheme with failure probability $\delta$. We refer to the elements of $Ax$ as measurements.

It is known [CRT06, GLPS10] that such recovery schemes exist for $p \in \{1, 2\}$ with $C = O(1)$ and $m = O(k \log \frac{n}{k})$. Furthermore, it is known [DIPW10, FPRU10] that any such recovery scheme requires $\Omega(k \log_{1+C} \frac{n}{k})$ measurements. This means the measurement complexity is well understood for $C = 1 + \Omega(1)$, but not for $C = 1 + o(1)$.

¹ Some formulations allow the two norms to be different, in which case $C$ is not constant. We only consider equal norms in this paper.

                          Lower bound                                                              Upper bound
$k$-sparse output
  $\ell_1$    $\Omega(\frac{1}{\epsilon}(k \log \frac{1}{\epsilon} + \log \frac{1}{\delta}))$      $O(\frac{1}{\epsilon} k \log n)$ [CM04]
  $\ell_2$    $\Omega(\frac{1}{\epsilon^2}(k + \log \frac{1}{\delta}))$                            $O(\frac{1}{\epsilon^2} k \log n)$ [CCF02, CM06, Wai09]
Non-$k$-sparse output
  $\ell_1$    $\Omega(\frac{1}{\sqrt{\epsilon} \log^2(k/\epsilon)} k)$                             $O(\frac{\log^3(1/\epsilon)}{\sqrt{\epsilon}} k \log n)$
  $\ell_2$    $\Omega(\frac{1}{\epsilon} k \log(n/k))$                                             $O(\frac{1}{\epsilon} k \log(n/k))$ [GLPS10]

Figure 1: Our results, along with existing upper bounds. Fairly minor restrictions on the relative magnitude of parameters apply; see the theorem statements for details.
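To make the recovery guarantee (1) concrete, here is a minimal check in Python; it is our own illustration, not code from the paper, and the function names are ours. It computes the best $k$-sparse approximation error and tests whether a candidate $\hat{x}$ meets the $(1+\epsilon)$-approximate $\ell_p/\ell_p$ criterion.

```python
import numpy as np

def best_k_sparse_error(x, k, p):
    """l_p norm of x with its k largest-magnitude entries removed,
    i.e. min over k-sparse x' of ||x - x'||_p."""
    tail = np.sort(np.abs(x))[:-k] if k > 0 else np.abs(x)
    return np.linalg.norm(tail, ord=p)

def satisfies_guarantee(x_hat, x, k, eps, p):
    """Check the (1 + eps)-approximate l_p/l_p recovery criterion (1)."""
    return np.linalg.norm(x_hat - x, ord=p) <= (1 + eps) * best_k_sparse_error(x, k, p)
```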
A number of applications would like to have $C = 1 + \epsilon$ for small $\epsilon$. For example, a radio wave signal can be modeled as $x = x^* + w$ where $x^*$ is $k$-sparse (corresponding to a signal over a narrow band) and the noise $w$ is i.i.d. Gaussian with $\|w\|_p \approx D \|x^*\|_p$ [TDB09]. Then sparse recovery with $C = 1 + \alpha/D$ allows the recovery of a $(1 - \alpha)$ fraction of the true signal $x^*$. Since $x^*$ is concentrated in a small band while $w$ is located over a large region, it is often the case that $\alpha/D \ll 1$.

The difficulty of $(1 + \epsilon)$-approximate recovery has seemed to depend on whether the output $x'$ is required to be $k$-sparse or can have more than $k$ elements in its support. Having $k$-sparse output is important for some applications (e.g. the aforementioned radio waves) but not for others (e.g. imaging). Algorithms that output a $k$-sparse $x'$ have used $\Theta(\frac{1}{\epsilon^p} k \log n)$ measurements [CCF02, CM04, CM06, Wai09]. In contrast, [GLPS10] uses only $\Theta(\frac{1}{\epsilon} k \log(n/k))$ measurements for $p = 2$ and outputs a non-$k$-sparse $x'$.

Our results   We show that the apparent distinction between complexity of sparse and non-sparse outputs is fundamental, for both $p = 1$ and $p = 2$. We show that for sparse output, $\Omega(k/\epsilon^p)$ measurements are necessary, matching the upper bounds up to a $\log n$ factor. For general output and $p = 2$, we show $\Omega(\frac{1}{\epsilon} k \log(n/k))$ measurements are necessary, matching the upper bound up to a constant factor. In the remaining case of general output and $p = 1$, we show $\widetilde{\Omega}(k/\sqrt{\epsilon})$ measurements are necessary. We then give a novel algorithm that uses $O(\frac{\log^3(1/\epsilon)}{\sqrt{\epsilon}} k \log n)$ measurements, beating the $1/\epsilon$ dependence given by all previous algorithms. As a result, all our bounds are tight up to factors logarithmic in $n$. The full results are shown in Figure 1.

In addition, for $p = 2$ and general output, we show that thresholding the top $2k$ elements of a Count-Sketch [CCF02] estimate gives $(1 + \epsilon)$-approximate recovery with $\Theta(\frac{1}{\epsilon} k \log n)$ measurements. This is interesting because it highlights the distinction between sparse output and non-sparse output: [CM06] showed that thresholding the top $k$ elements of a Count-Sketch estimate requires $m = \Theta(\frac{1}{\epsilon^2} k \log n)$. While [GLPS10] achieves $m = \Theta(\frac{1}{\epsilon} k \log(n/k))$ for the same regime, it only succeeds with constant probability while ours succeeds with probability $1 - n^{-\Omega(1)}$; hence ours is the most efficient known algorithm when $\delta = o(1)$, $\epsilon = o(1)$, and $k < n^{0.9}$.

Related work   Much of the work on sparse recovery has relied on the Restricted Isometry Property [CRT06]. None of this work has been able to get better than 2-approximate recovery, so there are relatively few papers achieving $(1 + \epsilon)$-approximate recovery. The existing ones with $O(k \log n)$ measurements are surveyed above (except for [IR08], which has worse dependence on $\epsilon$ than [CM04] for the same regime).

A couple of previous works have studied the $\ell_\infty/\ell_p$ problem, where every coordinate must be estimated with small error. This problem is harder than $\ell_p/\ell_p$ sparse recovery with sparse output. For $p = 2$, [Wai09] showed that schemes using Gaussian matrices $A$ require $m = \Omega(\frac{1}{\epsilon^2} k \log(n/k))$. For $p = 1$, [CM05] showed that any sketch requires $\Omega(k/\epsilon)$ bits (rather than measurements).

Independently of this work and of each other, multiple authors [CD11, IT10, ASZ10] have matched our $\Omega(\frac{1}{\epsilon} k \log(n/k))$ bound for $\ell_2/\ell_2$ in related settings.
The details vary, but all proofs are broadly similar in structure to ours: they consider observing a large set of "well-separated" vectors under Gaussian noise. Fano's inequality gives a lower bound on the mutual information between the observation and the signal; then, an upper bound on the mutual information is given by either the Shannon-Hartley theorem or a KL-divergence argument. This technique does not seem useful for the other problems we consider in this paper, such as lower bounds for $\ell_1/\ell_1$ or the sparse output setting.

Our techniques   For the upper bounds for non-sparse output, we observe that the hard case for sparse output is when the noise is fairly concentrated, in which case the estimation of the top $k$ elements can have $\sqrt{\epsilon}$ error. Our goal is to recover enough mass from outside the top $k$ elements to cancel this error. The upper bound for $p = 2$ is a fairly straightforward analysis of the top $2k$ elements of a Count-Sketch data structure.

The upper bound for $p = 1$ proceeds by subsampling the vector at rate $2^{-i}$ and performing a Count-Sketch with size proportional to $\frac{1}{\sqrt{\epsilon}}$, for $i \in \{0, 1, \ldots, O(\log(1/\epsilon))\}$. The intuition is that if the noise is well spread over many (more than $k/\epsilon^{3/2}$) coordinates, then the $\ell_2$ bound from the first Count-Sketch gives a very good $\ell_1$ bound, so the approximation is $(1 + \epsilon)$-approximate. However, if the noise is concentrated over a small number $k/\epsilon^c$ of coordinates, then the error from the first Count-Sketch is proportional to $1 + \epsilon^{c/2 + 1/4}$. But in this case, one of the subsamples will only have $O(k/\epsilon^{c/2 - 1/4}) < k/\sqrt{\epsilon}$ of the coordinates with large noise. We can then recover those coordinates with the Count-Sketch for that subsample. Those coordinates contain an $\epsilon^{c/2 + 1/4}$ fraction of the total noise, so recovering them decreases the approximation error by exactly the error induced from the first Count-Sketch.

The lower bounds use substantially different techniques for sparse output and for non-sparse output. For sparse output, we use reductions from communication complexity to show a lower bound in terms of bits. Then, as in [DIPW10], we embed $\Theta(\log n)$ copies of this communication problem into a single vector. This multiplies the bit complexity by $\log n$; we also show we can round $Ax$ to $\log n$ bits per measurement without affecting recovery, giving a lower bound in terms of measurements.

We illustrate the lower bound on bit complexity for sparse output using $k = 1$. Consider a vector $x$ containing $1/\epsilon^p$ ones and zeros elsewhere, such that $x_{2i} + x_{2i+1} = 1$ for all $i$. For any $i$, set $z_{2i} = z_{2i+1} = 1$ and $z_j = 0$ elsewhere. Then successful $(1 + \epsilon/3)$-approximate sparse recovery from $A(x + z)$ returns $\hat{z}$ with $\mathrm{supp}(\hat{z}) = \mathrm{supp}(x) \cap \{2i, 2i+1\}$. Hence we can recover each bit of $x$ with probability $1 - \delta$, requiring $\Omega(1/\epsilon^p)$ bits.² We can generalize this to $k$-sparse output for $\Omega(k/\epsilon^p)$ bits, and to $\delta$ failure probability with $\Omega(\frac{1}{\epsilon^p} \log \frac{1}{\delta})$. However, the two generalizations do not seem to combine.

² For $p = 1$, we can actually set $|\mathrm{supp}(z)| = 1/\epsilon$ and search among a set of $1/\epsilon$ candidates. This gives $\Omega(\frac{1}{\epsilon} \log(1/\epsilon))$ bits.
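The $k = 1$ gadget above can be simulated directly. The following sketch is our own illustration (not the authors' code): `recover_top1` is an idealized stand-in for a $(1+\epsilon/3)$-approximate 1-sparse recovery oracle, used only to show how probing with $z$ reveals one bit of $x$ per query.

```python
import numpy as np

rng = np.random.default_rng(0)
eps = 0.1
n_pairs = int(1 / eps)                      # x has 1/eps ones, one per pair
bits = rng.integers(0, 2, n_pairs)          # the "message" encoded in x
x = np.zeros(2 * n_pairs)
x[2 * np.arange(n_pairs) + bits] = 1        # enforce x_{2i} + x_{2i+1} = 1

def recover_top1(v):
    """Idealized stand-in for (1 + eps/3)-approximate 1-sparse recovery:
    any valid output must be supported on v's unique largest entry."""
    out = np.zeros_like(v)
    j = int(np.argmax(np.abs(v)))
    out[j] = v[j]
    return out

# Decode bit i by probing with z supported on the i-th pair.
for i in range(n_pairs):
    z = np.zeros_like(x)
    z[2 * i] = z[2 * i + 1] = 1             # (x + z) equals 2 exactly where x's one sits
    z_hat = recover_top1(x + z)
    assert int(np.argmax(z_hat)) == 2 * i + bits[i]   # support reveals the bit
```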
For non-sparse output, we split between $\ell_2$ and $\ell_1$. In $\ell_2$, we consider $A(x + w)$ where $x$ is sparse and $w$ is uniform Gaussian noise with $\|w\|_2^2 \approx \|x\|_2^2/\epsilon$. Then each coordinate of $y = A(x + w) = Ax + Aw$ is a Gaussian channel with signal-to-noise ratio $\epsilon$. This channel has channel capacity $\epsilon$, showing $I(y; x) \le \epsilon m$. Correct sparse recovery must either get most of $x$ or an $\epsilon$ fraction of $w$; the latter requires $m = \Omega(\epsilon n)$ and the former requires $I(y; x) = \Omega(k \log(n/k))$. This gives a tight $\Theta(\frac{1}{\epsilon} k \log(n/k))$ result.

Unfortunately, this does not easily extend to $\ell_1$, because it relies on the Gaussian distribution being both stable and maximum entropy under $\ell_2$; the corresponding distributions in $\ell_1$ are not the same. Therefore for $\ell_1$ non-sparse output, we have yet another argument. The hard instances for $k = 1$ must have one large value (or else $0$ is a valid output) but small other values (or else the 2-sparse approximation is significantly better than the 1-sparse approximation). Suppose $x$ has one value of size $\epsilon$ and $d$ values of size $1/d$ spread through a vector of size $d^2$. Then a $(1 + \epsilon/2)$-approximate recovery scheme must either locate the large element or guess the locations of the $d$ values with $\Omega(\epsilon d)$ more correct than incorrect. The former requires $1/(d\epsilon^2)$ bits by the difficulty of a novel version of the Gap-$\ell_\infty$ problem. The latter requires $\epsilon d$ bits because it allows recovering an error correcting code. Setting $d = \epsilon^{-3/2}$ balances the terms at $\epsilon^{-1/2}$ bits.

Because some of these reductions are very intricate, this extended abstract does not manage to embed $\log n$ copies of the problem into a single vector. As a result, we lose a $\log n$ factor in a universe of size $n = \mathrm{poly}(k/\epsilon)$ when converting to measurement complexity from bit complexity.

2 Preliminaries

Notation   We use $[n]$ to denote the set $\{1, \ldots, n\}$. For any set $S \subset [n]$, we use $\overline{S}$ to denote the complement of $S$, i.e., the set $[n] \setminus S$. For any $x \in \mathbb{R}^n$, $x_i$ denotes the $i$th coordinate of $x$, and $x_S$ denotes the vector $x' \in \mathbb{R}^n$ given by $x'_i = x_i$ if $i \in S$, and $x'_i = 0$ otherwise. We use $\mathrm{supp}(x)$ to denote the support of $x$.

3 Upper bounds

The algorithms in this section are indifferent to permutation of the coordinates. Therefore, for simplicity of notation in the analysis, we assume the coefficients of $x$ are sorted such that $|x_1| \ge |x_2| \ge \cdots \ge |x_n| \ge 0$.

Count-Sketch   Both our upper bounds use the Count-Sketch [CCF02] data structure. The structure consists of $c \log n$ hash tables of size $O(q)$, for $O(cq \log n)$ total space; it can be represented as $Ax$ for a matrix $A$ with $O(cq \log n)$ rows. Given $Ax$, one can construct $x^*$ with

$$\|x^* - x\|_\infty^2 \le \frac{1}{q} \|x_{\overline{[q]}}\|_2^2 \qquad (3)$$

with failure probability $n^{1-c}$.
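The following is a minimal, untuned sketch of the Count-Sketch structure just described, written by us for illustration (the class and method names are ours). Each row holds an independent (hash, sign) pair; the point estimate of $x_i$ is the median over rows of the signed bucket values, which yields the guarantee (3) with suitable constants.

```python
import numpy as np

class CountSketch:
    """Minimal Count-Sketch: each of `rows` tables has q buckets and an
    independent (hash, sign) pair per coordinate."""
    def __init__(self, n, q, rows, seed=0):
        rng = np.random.default_rng(seed)
        self.h = rng.integers(0, q, size=(rows, n))     # bucket of each coordinate
        self.s = rng.choice([-1, 1], size=(rows, n))    # random signs
        self.n, self.q, self.rows = n, q, rows

    def sketch(self, x):
        """The linear map x -> Ax, with rows * q measurements in total."""
        tables = np.zeros((self.rows, self.q))
        for r in range(self.rows):
            np.add.at(tables[r], self.h[r], self.s[r] * x)
        return tables

    def estimate(self, tables):
        """x*_i = median over rows r of s[r,i] * tables[r, h[r,i]]."""
        ests = np.array([self.s[r] * tables[r][self.h[r]] for r in range(self.rows)])
        return np.median(ests, axis=0)
```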
3.1 Non-sparse $\ell_2$

It was shown in [CM06] that, if $x^*$ is the result of a Count-Sketch with hash table size $O(k/\epsilon^2)$, then outputting the top $k$ elements of $x^*$ gives a $(1 + \epsilon)$-approximate $\ell_2/\ell_2$ recovery scheme. Here we show that a seemingly minor change, selecting $2k$ elements rather than $k$ elements, turns this into a $(1 + \epsilon^2)$-approximate $\ell_2/\ell_2$ recovery scheme.

Theorem 3.1. Let $\hat{x}$ be the top $2k$ estimates from a Count-Sketch structure with hash table size $O(k/\epsilon)$. Then with failure probability $n^{-\Omega(1)}$,
$$\|\hat{x} - x\|_2 \le (1 + \epsilon) \|x_{\overline{[k]}}\|_2.$$
Therefore, there is a $(1 + \epsilon)$-approximate $\ell_2/\ell_2$ recovery scheme with $O(\frac{1}{\epsilon} k \log n)$ rows.

Proof. Let the hash table size be $O(ck/\epsilon)$ for a constant $c$, and let $x^*$ be the vector of estimates for each coordinate. Define $S$ to be the indices of the largest $2k$ values in $x^*$, and let $E = \|x_{\overline{[k]}}\|_2$. By (3), the standard analysis of Count-Sketch gives
$$\|x^* - x\|_\infty^2 \le \frac{\epsilon}{ck} E^2,$$
so
$$\|x^*_S - x\|_2^2 - E^2 = \|x^*_S - x\|_2^2 - \|x_{\overline{[k]}}\|_2^2 \le \|(x^* - x)_S\|_2^2 + \|x_{[n] \setminus S}\|_2^2 - \|x_{\overline{[k]}}\|_2^2$$
$$\le |S| \cdot \|x^* - x\|_\infty^2 + \|x_{[k] \setminus S}\|_2^2 - \|x_{S \setminus [k]}\|_2^2 \le \frac{2\epsilon}{c} E^2 + \|x_{[k] \setminus S}\|_2^2 - \|x_{S \setminus [k]}\|_2^2. \qquad (4)$$

Let $a = \max_{i \in [k] \setminus S} x_i$ and $b = \min_{i \in S \setminus [k]} x_i$, and let $d = |[k] \setminus S|$. The algorithm passes over an element of value $a$ to choose one of value $b$, so
$$a \le b + 2\|x^* - x\|_\infty \le b + 2\sqrt{\frac{\epsilon}{ck}} E.$$
Then
$$\|x_{[k] \setminus S}\|_2^2 - \|x_{S \setminus [k]}\|_2^2 \le d a^2 - (k + d) b^2 \le d \Big(b + 2\sqrt{\frac{\epsilon}{ck}} E\Big)^2 - (k + d) b^2$$
$$\le -k b^2 + 4\sqrt{\frac{\epsilon}{ck}}\, d b E + \frac{4\epsilon}{ck} d E^2 \le -k \Big(b - 2\sqrt{\frac{\epsilon}{ck^3}}\, d E\Big)^2 + \frac{4\epsilon}{ck^2} d(k - d) E^2 \le \frac{4 d(k - d)\epsilon}{ck^2} E^2 \le \frac{\epsilon}{c} E^2,$$
and combining this with (4) gives
$$\|x^*_S - x\|_2^2 - E^2 \le \frac{3\epsilon}{c} E^2,$$
or $\|x^*_S - x\|_2 \le (1 + \frac{3\epsilon}{2c}) E$, which proves the theorem for $c \ge 3/2$.

3.2 Non-sparse $\ell_1$

Theorem 3.2. There exists a $(1 + \epsilon)$-approximate $\ell_1/\ell_1$ recovery scheme with $O(\frac{\log^3(1/\epsilon)}{\sqrt{\epsilon}} k \log n)$ measurements and failure probability $e^{-\Omega(k/\sqrt{\epsilon})} + n^{-\Omega(1)}$.

Set $f = \sqrt{\epsilon}$, so our goal is to get $(1 + f^2)$-approximate $\ell_1/\ell_1$ recovery with $O(\frac{\log^3(1/f)}{f} k \log n)$ measurements.

For intuition, consider 1-sparse recovery of the following vector $x$: let $c \in [0, 2]$ and set $x_1 = 1/f^9$ and $x_2, \ldots, x_{1 + 1/f^{1+c}} \in \{\pm 1\}$. Then we have $\|x_{\overline{[1]}}\|_1 = 1/f^{1+c}$ and, by (3), a Count-Sketch with $O(1/f)$-sized hash tables returns $x^*$ with
$$\|x^* - x\|_\infty \le \sqrt{f}\, \|x_{\overline{[1/f]}}\|_2 \approx 1/f^{c/2} = f^{1 + c/2} \|x_{\overline{[1]}}\|_1.$$
The reconstruction algorithm therefore cannot reliably find any of the $x_i$ for $i > 1$, and its error on $x_1$ is at least $f^{1+c/2} \|x_{\overline{[1]}}\|_1$. Hence the algorithm will not do better than a $(1 + f^{1+c/2})$-approximation.

However, consider what happens if we subsample an $f^c$ fraction of the vector. The result probably has about $1/f$ non-zero values, so a $O(1/f)$-width Count-Sketch can reconstruct it exactly. Putting this in our output improves the overall $\ell_1$ error by about $1/f = f^c \|x_{\overline{[1]}}\|_1$. Since $c < 2$, this more than cancels the $f^{1+c/2} \|x_{\overline{[1]}}\|_1$ error the initial Count-Sketch makes on $x_1$, giving an approximation factor better than 1.

This tells us that subsampling can help. We don't need to subsample at a scale below $k/f$ (where we can reconstruct well already) or above $k/f^3$ (where the $\ell_2$ bound is small enough already), but in the intermediate range we need to subsample. Our algorithm subsamples at all $\log(1/f^2)$ rates in between these two endpoints, and combines the heavy hitters from each.
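Here is a schematic of that level structure, reusing the `CountSketch` sketch above. It is our own simplification: the constants, the per-level bookkeeping, and the merge rule are stripped down relative to the analysis that follows, and it is meant only to show the shape of the algorithm (subsample at rate $2^{-j}$, Count-Sketch with hash size about $ck/f$, keep the largest new estimates per level).

```python
import numpy as np

def l1_recover(x, k, f, c, seed=0):
    """Schematic of the l1 scheme: for j = 0..r, subsample at rate 2^-j,
    Count-Sketch the subsample, and union the largest new estimates."""
    n = len(x)
    r = int(2 * np.log2(1 / f))
    rng = np.random.default_rng(seed)
    x_hat = np.zeros(n)
    found = np.zeros(n, dtype=bool)
    for j in range(r + 1):
        mask = rng.random(n) < 2.0 ** -j              # subsample with prob 2^-j
        cs = CountSketch(n, q=int(c * k / f), rows=int(np.log2(n)) + 1, seed=seed + j)
        est = cs.estimate(cs.sketch(np.where(mask, x, 0.0)))
        est[~mask | found] = 0.0                      # only new, sampled coordinates
        top = np.argsort(-np.abs(est))[: int(2 ** (j / 2) * k)]
        x_hat[top], found[top] = est[top], True
    return x_hat
```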
First we analyze how subsampled Count-Sketch works.

Lemma 3.3. Suppose we subsample with probability $p$ and then apply Count-Sketch with $\Theta(\log n)$ rows and $\Theta(q)$-sized hash tables. Let $y$ be the subsample of $x$. Then with failure probability $e^{-\Omega(q)} + n^{-\Omega(1)}$ we recover a $y^*$ with
$$\|y^* - y\|_\infty \le \sqrt{p/q}\, \|x_{\overline{[q/p]}}\|_2.$$

Proof. Recall the following form of the Chernoff bound: if $X_1, \ldots, X_m$ are independent with $0 \le X_i \le M$, and $\mu \ge \mathrm{E}[\sum X_i]$, then
$$\Pr\Big[\sum X_i \ge \tfrac{4}{3}\mu\Big] \le e^{-\Omega(\mu/M)}.$$
Let $T$ be the set of coordinates in the sample. Then $\mathrm{E}[|T \cap [\frac{3q}{2p}]|] = 3q/2$, so
$$\Pr\Big[\big|T \cap \big[\tfrac{3q}{2p}\big]\big| \ge 2q\Big] \le e^{-\Omega(q)}.$$
Suppose this event does not happen, so $|T \cap [\frac{3q}{2p}]| < 2q$. We also have
$$\|x_{\overline{[q/p]}}\|_2 \ge \sqrt{\frac{q}{2p}}\, x_{3q/(2p)}.$$
Let $Y_i = 0$ if $i \notin T$ and $Y_i = x_i^2$ if $i \in T$. Then
$$\mathrm{E}\Big[\sum_{i > 3q/(2p)} Y_i\Big] = p \|x_{\overline{[3q/(2p)]}}\|_2^2 \le p \|x_{\overline{[q/p]}}\|_2^2.$$
For $i > \frac{3q}{2p}$ we have
$$Y_i \le x_{3q/(2p)}^2 \le \frac{2p}{q} \|x_{\overline{[q/p]}}\|_2^2,$$
giving by Chernoff that
$$\Pr\Big[\sum Y_i \ge \tfrac{4}{3} p \|x_{\overline{[q/p]}}\|_2^2\Big] \le e^{-\Omega(q/2)}.$$
But if this event does not happen, then
$$\|y_{\overline{[2q]}}\|_2^2 \le \sum_{i \in T,\, i > 3q/(2p)} x_i^2 = \sum_{i > 3q/(2p)} Y_i \le \tfrac{4}{3} p \|x_{\overline{[q/p]}}\|_2^2.$$
By (3), using $O(2q)$-size hash tables gives a $y^*$ with
$$\|y^* - y\|_\infty \le \frac{1}{\sqrt{2q}} \|y_{\overline{[2q]}}\|_2 \le \sqrt{p/q}\, \|x_{\overline{[q/p]}}\|_2$$
with failure probability $n^{-\Omega(1)}$, as desired.

Let $r = 2 \log(1/f)$. Our algorithm is as follows: for $j \in \{0, \ldots, r\}$, we find and estimate the $2^{j/2} k$ largest elements not found in previous levels in a subsampled Count-Sketch with probability $p = 2^{-j}$ and hash size $q = ck/f$ for some parameter $c = \Theta(r^2)$. We output $\hat{x}$, the union of all these estimates. Our goal is to show
$$\|\hat{x} - x\|_1 - \|x_{\overline{[k]}}\|_1 \le O(f^2) \|x_{\overline{[k]}}\|_1.$$

For each level $j$, let $S_j$ be the $2^{j/2} k$ largest coordinates in our estimate not found in $S_0 \cup \cdots \cup S_{j-1}$, and let $S = \cup S_j$. By Lemma 3.3, for each $j$ we have (with failure probability $e^{-\Omega(k/f)} + n^{-\Omega(1)}$) that
$$\|(\hat{x} - x)_{S_j}\|_1 \le |S_j| \sqrt{\frac{2^{-j} f}{ck}}\, \|x_{\overline{[2^j ck/f]}}\|_2 \le 2^{-j/2} \sqrt{\frac{fk}{c}}\, \|x_{\overline{[2k/f]}}\|_2,$$
and so
$$\|(\hat{x} - x)_S\|_1 = \sum_{j=0}^r \|(\hat{x} - x)_{S_j}\|_1 \le \frac{1}{(1 - 1/\sqrt{2})\sqrt{c}} \sqrt{fk}\, \|x_{\overline{[2k/f]}}\|_2. \qquad (5)$$

By standard arguments, the $\ell_\infty$ bound for $S_0$ gives
$$\|x_{[k]}\|_1 \le \|x_{S_0}\|_1 + k \|\hat{x}_{S_0} - x_{S_0}\|_\infty \le \|x_{S_0}\|_1 + \sqrt{\frac{fk}{c}}\, \|x_{\overline{[2k/f]}}\|_2. \qquad (6)$$

Combining Equations (5) and (6) gives
$$\|\hat{x} - x\|_1 - \|x_{\overline{[k]}}\|_1 = \|(\hat{x} - x)_S\|_1 + \|x_{\overline{S}}\|_1 - \|x_{\overline{[k]}}\|_1 = \|(\hat{x} - x)_S\|_1 + \|x_{[k]}\|_1 - \|x_S\|_1$$
$$= \|(\hat{x} - x)_S\|_1 + (\|x_{[k]}\|_1 - \|x_{S_0}\|_1) - \sum_{j=1}^r \|x_{S_j}\|_1 \le \Big(\frac{1}{(1 - 1/\sqrt{2})\sqrt{c}} + \frac{1}{\sqrt{c}}\Big) \sqrt{fk}\, \|x_{\overline{[2k/f]}}\|_2 - \sum_{j=1}^r \|x_{S_j}\|_1$$
$$= O\Big(\frac{1}{\sqrt{c}}\Big) \sqrt{fk}\, \|x_{\overline{[2k/f]}}\|_2 - \sum_{j=1}^r \|x_{S_j}\|_1. \qquad (7)$$

We would like to convert the first term to depend on the $\ell_1$ norm. For any $u$ and $s$ we have, by splitting into chunks of size $s$, that
$$\|u_{\overline{[2s]}}\|_2 \le \sqrt{\tfrac{1}{s}}\, \|u_{\overline{[s]}}\|_1 \qquad\text{and}\qquad \|u_{\overline{[s]} \cap [2s]}\|_2 \le \sqrt{s}\, |u_s|.$$
Along with the triangle inequality, this gives us that
$$\sqrt{fk}\, \|x_{\overline{[2k/f]}}\|_2 \le \sqrt{fk}\, \|x_{\overline{[2k/f^3]}}\|_2 + \sqrt{fk} \sum_{j=1}^r \|x_{\overline{[2^j k/f]} \cap [2^{j+1} k/f]}\|_2 \le f^2 \|x_{\overline{[k/f^3]}}\|_1 + \sum_{j=1}^r k 2^{j/2} x_{2^j k/f},$$
so
$$\|\hat{x} - x\|_1 - \|x_{\overline{[k]}}\|_1 \le O\Big(\frac{1}{\sqrt{c}}\Big) f^2 \|x_{\overline{[k/f^3]}}\|_1 + \sum_{j=1}^r O\Big(\frac{1}{\sqrt{c}}\Big) k 2^{j/2} x_{2^j k/f} - \sum_{j=1}^r \|x_{S_j}\|_1. \qquad (8)$$

Define $a_j = k 2^{j/2} x_{2^j k/f}$. The first term grows as $f^2$ so it is fine, but $a_j$ can grow as $f 2^{-j/2} > f^2$. We need to show that they are canceled by the corresponding $\|x_{S_j}\|_1$. In particular, we will show that $\|x_{S_j}\|_1 \ge \Omega(a_j) - O(2^{-j/2} f^2 \|x_{\overline{[k/f^3]}}\|_1)$ with high probability, at least wherever $a_j \ge \|a\|_1/(2r)$.

Let $U \subseteq [r]$ be the set of $j$ with $a_j \ge \|a\|_1/(2r)$, so that $\|a_U\|_1 \ge \|a\|_1/2$. We have
$$\|x_{\overline{[2^j k/f]}}\|_2^2 = \|x_{\overline{[2k/f^3]}}\|_2^2 + \sum_{i=j}^r \|x_{\overline{[2^i k/f]} \cap [2^{i+1} k/f]}\|_2^2 \le \|x_{\overline{[2k/f^3]}}\|_2^2 + \frac{1}{kf} \sum_{i=j}^r a_i^2. \qquad (9)$$
For $j \in U$, we have
$$\sum_{i=j}^r a_i^2 \le a_j \|a\|_1 \le 2r a_j^2,$$
so, along with $(y^2 + z^2)^{1/2} \le y + z$, we turn Equation (9) into
$$\|x_{\overline{[2^j k/f]}}\|_2 \le \|x_{\overline{[2k/f^3]}}\|_2 + \sqrt{\frac{1}{kf} \sum_{i=j}^r a_i^2} \le \sqrt{\frac{f^3}{k}}\, \|x_{\overline{[k/f^3]}}\|_1 + \sqrt{\frac{2r}{kf}}\, a_j.$$

When choosing $S_j$, let $T \subseteq [n]$ be the set of indices chosen in the sample. Applying Lemma 3.3, the estimate $x^*$ of $x_T$ has
$$\|x^* - x_T\|_\infty \le \sqrt{\frac{f}{2^j ck}}\, \|x_{\overline{[2^j k/f]}}\|_2 \le \sqrt{\frac{1}{2^j c}} \frac{f^2}{k} \|x_{\overline{[k/f^3]}}\|_1 + \sqrt{\frac{2r}{2^j c}} \frac{a_j}{k} = \sqrt{\frac{1}{2^j c}} \frac{f^2}{k} \|x_{\overline{[k/f^3]}}\|_1 + \sqrt{\frac{2r}{c}}\, x_{2^j k/f}$$
for $j \in U$. Let $Q = [2^j k/f] \setminus (S_0 \cup \cdots \cup S_{j-1})$.
We have $|Q| \ge 2^{j-1} k/f$, so $\mathrm{E}[|Q \cap T|] \ge k/(2f)$ and $|Q \cap T| \ge k/(4f)$ with failure probability $e^{-\Omega(k/f)}$. Conditioned on $|Q \cap T| \ge k/(4f)$, since $x_T$ has at least $|Q \cap T| \ge k/(4f) = 2^{r/2} k/4 \ge 2^{j/2} k/4$ possible choices of value at least $x_{2^j k/f}$, $x_{S_j}$ must have at least $k 2^{j/2}/4$ elements at least $x_{2^j k/f} - \|x^* - x_T\|_\infty$. Therefore, for $j \in U$,
$$\|x_{S_j}\|_1 \ge -\frac{1}{4\sqrt{c}} f^2 \|x_{\overline{[k/f^3]}}\|_1 + \frac{k 2^{j/2}}{4}\Big(1 - \sqrt{\frac{2r}{c}}\Big) x_{2^j k/f},$$
and therefore
$$\sum_{j=1}^r \|x_{S_j}\|_1 \ge \sum_{j \in U} \|x_{S_j}\|_1 \ge \sum_{j \in U} \Big(-\frac{1}{4\sqrt{c}} f^2 \|x_{\overline{[k/f^3]}}\|_1 + \frac{k 2^{j/2}}{4}\Big(1 - \sqrt{\frac{2r}{c}}\Big) x_{2^j k/f}\Big)$$
$$\ge -\frac{r}{4\sqrt{c}} f^2 \|x_{\overline{[k/f^3]}}\|_1 + \frac{1}{4}\Big(1 - \sqrt{\frac{2r}{c}}\Big) \|a_U\|_1 \ge -\frac{r}{4\sqrt{c}} f^2 \|x_{\overline{[k/f^3]}}\|_1 + \frac{1}{8}\Big(1 - \sqrt{\frac{2r}{c}}\Big) \sum_{j=1}^r k 2^{j/2} x_{2^j k/f}. \qquad (10)$$

Using (8) and (10) we get
$$\|\hat{x} - x\|_1 - \|x_{\overline{[k]}}\|_1 \le \Big(\frac{r}{4\sqrt{c}} + O\Big(\frac{1}{\sqrt{c}}\Big)\Big) f^2 \|x_{\overline{[k/f^3]}}\|_1 + \sum_{j=1}^r \Big(O\Big(\frac{1}{\sqrt{c}}\Big) + \frac{1}{8}\sqrt{\frac{2r}{c}} - \frac{1}{8}\Big) k 2^{j/2} x_{2^j k/f}$$
$$\le f^2 \|x_{\overline{[k/f^3]}}\|_1 \le f^2 \|x_{\overline{[k]}}\|_1$$
for some $c = O(r^2)$. Hence we use a total of $r c \frac{k}{f} \log n = \frac{\log^3(1/f)}{f} k \log n$ measurements for $(1 + f^2)$-approximate $\ell_1/\ell_1$ recovery.

For each $j \in \{0, \ldots, r\}$ we had failure probability $e^{-\Omega(k/f)} + n^{-\Omega(1)}$ (from Lemma 3.3 and $|Q \cap T| \ge k/(4f)$). By the union bound, our overall failure probability is at most
$$(\log \tfrac{1}{f})(e^{-\Omega(k/f)} + n^{-\Omega(1)}) \le e^{-\Omega(k/f)} + n^{-\Omega(1)},$$
proving Theorem 3.2.

4 Lower bounds for non-sparse output and $p = 2$

In this case, the lower bound follows fairly straightforwardly from the Shannon-Hartley information capacity of a Gaussian channel. We will set up a communication game. Let $\mathcal{F} \subset \{S \subset [n] \mid |S| = k\}$ be a family of $k$-sparse supports such that:

- $|S \Delta S'| \ge k$ for $S \ne S' \in \mathcal{F}$,
- $\Pr_{S \in \mathcal{F}}[i \in S] = k/n$ for all $i \in [n]$, and
- $\log|\mathcal{F}| = \Omega(k \log(n/k))$.

This is possible; for example, a random linear code on $[n/k]^k$ with relative distance $1/2$ has these properties [Gur10].³

³ This assumes $n/k$ is a prime power larger than 2. If $n/k$ is not prime, we can choose $n' \in [n/2, n]$ to be a prime multiple of $k$, and restrict to the first $n'$ coordinates. This works unless $n/k < 3$, in which case a bound of $\Theta(\min(n, \frac{1}{\epsilon} k \log(n/k))) = \Theta(k)$ is trivial.

Let $X = \{x \in \{0, \pm 1\}^n \mid \mathrm{supp}(x) \in \mathcal{F}\}$. Let $w \sim N(0, \alpha \frac{k}{n} I_n)$ be i.i.d. normal with variance $\alpha k/n$ in each coordinate. Consider the following process:

Procedure   First, Alice chooses $S \in \mathcal{F}$ uniformly at random, then $x \in X$ uniformly at random subject to $\mathrm{supp}(x) = S$, then $w \sim N(0, \alpha \frac{k}{n} I_n)$. She sets $y = A(x + w)$ and sends $y$ to Bob. Bob performs sparse recovery on $y$ to recover $x' \approx x$, rounds to $X$ by $\hat{x} = \arg\min_{\hat{x} \in X} \|\hat{x} - x'\|_2$, and sets $S' = \mathrm{supp}(\hat{x})$. This gives a Markov chain $S \to x \to y \to x' \to S'$.

If sparse recovery works for any $x + w$ with probability $1 - \delta$ as a distribution over $A$, then there is some specific $A$ and random seed such that sparse recovery works with probability $1 - \delta$ over $x + w$; let us choose this $A$ and the random seed, so that Alice and Bob run deterministic algorithms on their inputs.
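For illustration, here is a small simulation of this hard distribution, written by us (not from the paper). To stay short it draws the support $S$ as a uniform random $k$-subset; the proof needs $S$ drawn from the code-based family $\mathcal{F}$ with the separation properties listed above.

```python
import numpy as np

def hard_instance(n, k, alpha, seed=0):
    """Sample (x, w) in the style of the Section 4 process; S is a uniform
    k-subset here, standing in for the code-based family F."""
    rng = np.random.default_rng(seed)
    S = rng.choice(n, size=k, replace=False)
    x = np.zeros(n)
    x[S] = rng.choice([-1.0, 1.0], size=k)          # x in {0, +-1}^n with supp(x) = S
    w = rng.normal(0.0, np.sqrt(alpha * k / n), n)  # w ~ N(0, (alpha*k/n) I_n)
    return x, w

# Signal-to-noise ratio: ||x||^2 = k while E||w||^2 = alpha*k, so the ratio
# is about 1/alpha (alpha = 1/eps recovers the SNR-eps channel of Section 1).
x, w = hard_instance(n=10_000, k=50, alpha=10.0)
print(np.dot(x, x) / np.dot(w, w))                  # ~ 1/alpha = 0.1
```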
Lemma 4.1. $I(S; S') = O(m \log(1 + \frac{1}{\alpha}))$.

Proof. Let the columns of $A^T$ be $v_1, \ldots, v_m$. We may assume that the $v_i$ are orthonormal, because this can be accomplished via a unitary transformation on $Ax$. Then we have that $y_i = \langle v_i, x + w \rangle = \langle v_i, x \rangle + w'_i$, where $w'_i \sim N(0, \alpha \frac{k}{n} \|v_i\|_2^2) = N(0, \alpha k/n)$ and
$$\mathrm{E}_x[\langle v_i, x \rangle^2] = \mathrm{E}_S\Big[\sum_{j \in S} (v_i)_j^2\Big] = \frac{k}{n}.$$
Hence $y_i = z_i + w'_i$ is a Gaussian channel with power constraint $\mathrm{E}[z_i^2] \le \frac{k}{n} \|v_i\|_2^2$ and noise variance $\mathrm{E}[(w'_i)^2] = \alpha \frac{k}{n} \|v_i\|_2^2$. Hence by the Shannon-Hartley theorem this channel has information capacity
$$\max_{v_i} I(z_i; y_i) = C \le \frac{1}{2} \log\Big(1 + \frac{1}{\alpha}\Big).$$
By the data processing inequality for Markov chains and the chain rule for entropy, this means
$$I(S; S') \le I(z; y) = H(y) - H(y \mid z) = H(y) - H(y - z \mid z) = H(y) - \sum_i H(w'_i \mid z, w'_1, \ldots, w'_{i-1})$$
$$= H(y) - \sum_i H(w'_i) \le \sum_i \big(H(y_i) - H(w'_i)\big) = \sum_i \big(H(y_i) - H(y_i \mid z_i)\big) = \sum_i I(y_i; z_i) \le \frac{m}{2} \log\Big(1 + \frac{1}{\alpha}\Big). \qquad (11)$$

We will show that successful recovery either recovers most of $x$, in which case $I(S; S') = \Omega(k \log(n/k))$, or recovers an $\epsilon$ fraction of $w$. First we show that recovering $w$ requires $m = \Omega(\epsilon n)$.

Lemma 4.2. Suppose $w \in \mathbb{R}^n$ with $w_i \sim N(0, \sigma^2)$ for all $i$ and $n = \Omega(\frac{1}{\epsilon^2} \log(1/\delta))$, and $A \in \mathbb{R}^{m \times n}$ for $m < \delta \epsilon n$. Then any algorithm that finds $w'$ from $Aw$ must have $\|w' - w\|_2^2 > (1 - \epsilon) \|w\|_2^2$ with probability at least $1 - O(\delta)$.

Proof. Note that $Aw$ merely gives the projection of $w$ onto $m$ dimensions, giving no information about the other $n - m$ dimensions. Since $w$ and the $\ell_2$ norm are rotation invariant, we may assume WLOG that $A$ gives the projection of $w$ onto the first $m$ dimensions, namely $T = [m]$. By the norm concentration of Gaussians, with probability $1 - \delta$ we have $\|w\|_2^2 < (1 + \epsilon) n \sigma^2$, and by Markov with probability $1 - \delta$ we have $\|w_T\|_2^2 < \epsilon n \sigma^2$.

For any fixed value $d$, since $w$ is uniform Gaussian and $w'_{\overline{T}}$ is independent of $w_{\overline{T}}$,
$$\Pr[\|w' - w\|_2^2 < d] \le \Pr[\|(w' - w)_{\overline{T}}\|_2^2 < d] \le \Pr[\|w_{\overline{T}}\|_2^2 < d].$$
Therefore
$$\Pr[\|w' - w\|_2^2 < (1 - 3\epsilon)\|w\|_2^2] \le \Pr[\|w' - w\|_2^2 < (1 - 2\epsilon) n \sigma^2] \le \Pr[\|w_{\overline{T}}\|_2^2 < (1 - 2\epsilon) n \sigma^2]$$
$$\le \Pr[\|w_{\overline{T}}\|_2^2 < (1 - \epsilon)(n - m)\sigma^2] \le \delta,$$
as desired. Rescaling $\epsilon$ gives the result.

Lemma 4.3. Suppose $n = \Omega(\frac{1}{\epsilon^2} + \frac{k}{\epsilon} \log(k/\epsilon))$ and $m = O(\epsilon n)$. Then $I(S; S') = \Omega(k \log(n/k))$ for some $\alpha = \Omega(1/\epsilon)$.

Proof. Consider the $x'$ recovered from $A(x + w)$, and let $T = S \cup S'$. Suppose that $\|w\|_\infty^2 \le O(\frac{\alpha k}{n} \log n)$ and $\|w\|_2^2/(\alpha k) \in [1 \pm \epsilon]$, as happens with probability at least (say) $3/4$. Then we claim that if recovery is successful, one of the following must be true:
$$\|x'_T - x\|_2^2 \le 9\epsilon \|w\|_2^2 \qquad (12)$$
$$\|x'_{\overline{T}} - w\|_2^2 \le (1 - 2\epsilon) \|w\|_2^2 \qquad (13)$$

To show this, suppose $\|x'_T - x\|_2^2 > 9\epsilon \|w\|_2^2 \ge 9 \|w_T\|_2^2$ (the last by $|T| = 2k = O(\epsilon n/\log n)$). Then
$$\|(x' - (x + w))_T\|_2^2 > (\|x'_T - x\|_2 - \|w_T\|_2)^2 \ge (2\|x'_T - x\|_2/3)^2 \ge 4\epsilon \|w\|_2^2.$$
Because recovery is successful, $\|x' - (x + w)\|_2^2 \le (1 + \epsilon)\|w\|_2^2$. Therefore
$$\|x'_{\overline{T}} - w_{\overline{T}}\|_2^2 + \|x'_T - (x + w)_T\|_2^2 = \|x' - (x + w)\|_2^2$$
$$\|x'_{\overline{T}} - w_{\overline{T}}\|_2^2 + 4\epsilon\|w\|_2^2 < (1 + \epsilon)\|w\|_2^2$$
$$\|x'_{\overline{T}} - w\|_2^2 - \|w_T\|_2^2 < (1 - 3\epsilon)\|w\|_2^2 \le (1 - 2\epsilon)\|w\|_2^2 - \|w_T\|_2^2,$$
as desired. Thus with $3/4$ probability, at least one of (12) and (13) is true.

Suppose Equation (13) holds with at least $1/4$ probability. Then there must be some $x$ and $S$ such that the same equation holds with $1/4$ probability. For this $S$, given $x'$ we can find $T$ and thus $x'_{\overline{T}}$.
Hence for a uniform Gaussian $w_{\overline{T}}$, given $A w_{\overline{T}}$ we can compute $A(x + w_{\overline{T}})$ and recover $x'_{\overline{T}}$ with $\|x'_{\overline{T}} - w_{\overline{T}}\|_2^2 \le (1 - \epsilon)\|w_{\overline{T}}\|_2^2$. By Lemma 4.2 this is impossible, since $n - |T| = \Omega(\frac{1}{\epsilon^2})$ and $m = O(\epsilon n)$ by assumption.

Therefore Equation (12) holds with at least $1/2$ probability, namely
$$\|x'_T - x\|_2^2 \le 9\epsilon\|w\|_2^2 \le 9\epsilon(1 + \epsilon)\alpha k < k/2$$
for appropriate $\alpha$. But if the nearest $\hat{x} \in X$ to $x'$ is not equal to $x$, then
$$\|x' - \hat{x}\|_2^2 = \|x'_{\overline{T}}\|_2^2 + \|x'_T - \hat{x}\|_2^2 \ge \|x'_{\overline{T}}\|_2^2 + (\|x - \hat{x}\|_2 - \|x'_T - x\|_2)^2 > \|x'_{\overline{T}}\|_2^2 + (\sqrt{k} - \sqrt{k/2})^2 > \|x'_{\overline{T}}\|_2^2 + \|x'_T - x\|_2^2 = \|x' - x\|_2^2,$$
a contradiction. Hence $S' = S$. But Fano's inequality states $H(S \mid S') \le 1 + \Pr[S' \ne S] \log|\mathcal{F}|$, and hence
$$I(S; S') = H(S) - H(S \mid S') \ge -1 + \frac{1}{4} \log|\mathcal{F}| = \Omega(k \log(n/k)),$$
as desired.

Theorem 4.4. Any $(1 + \epsilon)$-approximate $\ell_2/\ell_2$ recovery scheme with $\epsilon > \sqrt{\frac{k \log n}{n}}$ and failure probability $\delta < 1/2$ requires $m = \Omega(\frac{1}{\epsilon} k \log(n/k))$.

Proof. Combine Lemmas 4.3 and 4.1 with $\alpha = 1/\epsilon$ to get $m = \Omega(\frac{k \log(n/k)}{\log(1 + \epsilon)}) = \Omega(\frac{1}{\epsilon} k \log(n/k))$, $m = \Omega(\epsilon n)$, or $n = O(\frac{1}{\epsilon} k \log(k/\epsilon))$. For $\epsilon$ as in the theorem statement, the first bound is controlling.

5 Bit complexity to measurement complexity

The remaining lower bounds proceed by reductions from communication complexity. The following lemma (implicit in [DIPW10]) shows that lower bounding the number of bits for approximate recovery is sufficient to lower bound the number of measurements. Let $B_p^n(R) \subset \mathbb{R}^n$ denote the $\ell_p$ ball of radius $R$.

Definition 5.1. Let $X \subset \mathbb{R}^n$ be a distribution with $x_i \in \{-n^d, \ldots, n^d\}$ for all $i \in [n]$ and $x \in X$. We define a $(1 + \epsilon)$-approximate $\ell_p/\ell_p$ sparse recovery bit scheme on $X$ with $b$ bits, precision $n^{-c}$, and failure probability $\delta$ to be a deterministic pair of functions $f: X \to \{0,1\}^b$ and $g: \{0,1\}^b \to \mathbb{R}^n$, where $f$ is linear so that $f(a + b)$ can be computed from $f(a)$ and $f(b)$. We require that, for $u \in B_p^n(n^{-c})$ uniform and $x$ drawn from $X$, $g(f(x))$ is a valid result of $(1 + \epsilon)$-approximate recovery on $x + u$ with probability $1 - \delta$.

Lemma 5.2. A lower bound of $\Omega(b)$ bits for such a sparse recovery bit scheme with $p \le 2$ implies a lower bound of $\Omega(b/((1 + c + d) \log n))$ measurements for regular $(1 + \epsilon)$-approximate sparse recovery with failure probability $\delta - 1/n$.

Proof. Suppose we have a standard $(1 + \epsilon)$-approximate sparse recovery algorithm $\mathcal{A}$ with failure probability $\delta$ using $m$ measurements $Ax$. We will use this to construct a (randomized) sparse recovery bit scheme using $O(m(1 + c + d) \log n)$ bits and failure probability $\delta + 1/n$. Then by averaging, some deterministic sparse recovery bit scheme performs better than average over the input distribution.

We may assume that $A \in \mathbb{R}^{m \times n}$ has orthonormal rows (otherwise, if $A = U \Sigma V^T$ is its singular value decomposition, $\Sigma^+ U^T A$ has this property and can be inverted before applying the algorithm). When applied to the distribution $X + u$ for $u$ uniform over $B_p^n(n^{-c})$, we may assume that $A$ and $\mathcal{A}$ are deterministic and fail with probability $\delta$ over their input.

Let $A'$ be $A$ rounded to $t \log n$ bits per entry for some parameter $t$. Let $x$ be chosen from $X$. By Lemma 5.1 of [DIPW10], for any $x$ we have $A'x = A(x - s)$ for some $s$ with $\|s\|_1 \le n^2 2^{-t \log n} \|x\|_1$, so $\|s\|_p \le n^{2.5 - t} \|x\|_p \le n^{3.5 + d - t}$.
Let $u \in B_p^n(n^{5.5 + d - t})$ be uniform at random. With probability at least $1 - 1/n$, $u \in B_p^n((1 - 1/n^2) n^{5.5 + d - t})$, because the balls are similar so the ratio of volumes is $(1 - 1/n^2)^n > 1 - 1/n$. In this case $u + s \in B_p^n(n^{5.5 + d - t})$; hence the random variables $u$ and $u + s$ overlap in at least a $1 - 1/n$ fraction of their volumes, so $x + s + u$ and $x + u$ have statistical distance at most $1/n$. Therefore $\mathcal{A}(A(x + u)) = \mathcal{A}(A'x + Au)$ with probability at least $1 - 1/n$.

Now, $A'x$ uses only $(t + d + 1) \log n$ bits per entry, so we can set $f(x) = A'x$ for $b = m(t + d + 1) \log n$. Then we set $g(y) = \mathcal{A}(y + Au)$ for uniformly random $u \in B_p^n(n^{5.5 + d - t})$. Setting $t = 5.5 + d + c$, this gives a sparse recovery bit scheme using $b = m(6.5 + 2d + c) \log n$ bits.

6 Non-sparse output lower bound for $p = 1$

First, we show that recovering the locations of an $\epsilon$ fraction of $d$ ones in a vector of size $n > d/\epsilon$ requires $\widetilde{\Omega}(\epsilon d)$ bits. Then, we show high bit complexity of a distributional product version of the Gap-$\ell_\infty$ problem. Finally, we create a distribution for which successful sparse recovery must solve one of the previous problems, giving a lower bound in bit complexity. Lemma 5.2 converts the bit complexity to measurement complexity.

6.1 $\ell_1$ lower bound for recovering noise bits

Definition 6.1. We say a set $C \subset [q]^d$ is a $(d, q, \epsilon)$ code if any two distinct $c, c' \in C$ agree in at most $\epsilon d$ positions. We say a set $X \subset \{0,1\}^{dq}$ represents $C$ if $X$ is $C$ concatenated with the trivial code $[q] \to \{0,1\}^q$ given by $i \to e_i$.

Claim 6.2. For $\epsilon \ge 2/q$, there exist $(d, q, \epsilon)$ codes $C$ of size $q^{\Omega(\epsilon d)}$ by the Gilbert-Varshamov bound (details in [DIPW10]).

Lemma 6.3. Let $X \subset \{0,1\}^{dq}$ represent a $(d, q, \epsilon)$ code. Suppose $y \in \mathbb{R}^{dq}$ satisfies $\|y - x\|_1 \le (1 - \epsilon)\|x\|_1$ for some $x \in X$. Then we can recover $x$ uniquely from $y$.

Proof. We assume $y_i \in [0, 1]$ for all $i$; thresholding otherwise only decreases $\|y - x\|_1$. We will show that there exists no other $x' \in X$ with $\|y - x'\|_1 \le (1 - \epsilon)\|x'\|_1$; thus choosing the nearest element of $X$ is a unique decoder. Suppose otherwise, and let $S = \mathrm{supp}(x)$, $T = \mathrm{supp}(x')$. Then
$$(1 - \epsilon)\|x\|_1 \ge \|x - y\|_1 = \|x\|_1 - \|y_S\|_1 + \|y_{\overline{S}}\|_1,$$
so
$$\|y_S\|_1 \ge \|y_{\overline{S}}\|_1 + \epsilon d.$$
Since the same is true relative to $x'$ and $T$, we have
$$\|y_S\|_1 + \|y_T\|_1 \ge \|y_{\overline{S}}\|_1 + \|y_{\overline{T}}\|_1 + 2\epsilon d$$
$$2\|y_{S \cap T}\|_1 \ge 2\|y_{\overline{S \cup T}}\|_1 + 2\epsilon d$$
$$\|y_{S \cap T}\|_1 \ge \epsilon d$$
$$|S \cap T| \ge \epsilon d.$$
This violates the distance of the code represented by $X$.
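For concreteness, here is a small sketch of the representation in Definition 6.1 and the nearest-codeword decoder from Lemma 6.3. It is our own illustration; the function names are ours, and `decode` brute-forces over the code, which is fine only for toy sizes.

```python
import numpy as np

def represent(codeword, q):
    """Map c in [q]^d (0-indexed) to x in {0,1}^(dq): block a holds e_{c[a]}."""
    d = len(codeword)
    x = np.zeros(d * q)
    x[np.arange(d) * q + np.asarray(codeword)] = 1.0
    return x

def decode(y, code, q):
    """Nearest-codeword decoding in l1, as in Lemma 6.3: if the true x has
    ||y - x||_1 <= (1 - eps)||x||_1, this recovers x's codeword."""
    y = np.clip(y, 0.0, 1.0)                 # thresholding only helps (Lemma 6.3)
    dists = [np.abs(y - represent(c, q)).sum() for c in code]
    return code[int(np.argmin(dists))]
```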
Lemma 6.4. Let $R = [s, cs]$ for some constant $c$ and parameter $s$. Let $X$ be a permutation-independent distribution over $\{0,1\}^n$ with $\|x\|_1 \in R$ with probability $p$. If $y$ satisfies $\|x - y\|_1 \le (1 - \epsilon)\|x\|_1$ with probability $p'$, where $p' - (1 - p) = \Omega(1)$, then $I(x; y) = \Omega(\epsilon s \log(n/s))$.

Proof. For each integer $i \in R$, let $X_i \subset \{0,1\}^n$ represent an $(i, n/i, \epsilon)$ code. Let $p_i = \Pr_{x \in X}[\|x\|_1 = i]$. Let $S_n$ be the set of permutations of $[n]$. Then the distribution $X'$ given by (a) choosing $i \in R$ proportional to $p_i$, (b) choosing $\sigma \in S_n$ uniformly, (c) choosing $x^i \in X_i$ uniformly, and (d) outputting $x' = \sigma(x^i)$ is equal to the distribution $(x \in X \mid \|x\|_1 \in R)$.

Now, because $p' \ge \Pr[\|x\|_1 \notin R] + \Omega(1)$, $x'$ chosen from $X'$ satisfies $\|x' - y\|_1 \le (1 - \epsilon)\|x'\|_1$ with probability $\delta \ge p' - (1 - p)$. Therefore, with at least $\delta/2$ probability, $i$ and $\sigma$ are such that $\|\sigma(x^i) - y\|_1 \le (1 - \epsilon)\|\sigma(x^i)\|_1$ with $\delta/2$ probability over uniform $x^i \in X_i$. But given $y$ with $\|y - \sigma(x^i)\|_1$ small, we can compute $y' = \sigma^{-1}(y)$ with $\|y' - x^i\|_1$ equally small. Then by Lemma 6.3 we can recover $x^i$ from $y'$ with probability $\delta/2$ over $x^i \in X_i$. Thus for this $i$ and $\sigma$, $I(x; y \mid i, \sigma) \ge \Omega(\log|X_i|) = \Omega(\delta \epsilon s \log(n/s))$ by Fano's inequality. But then $I(x; y) = \mathrm{E}_{i,\sigma}[I(x; y \mid i, \sigma)] = \Omega(\delta^2 \epsilon s \log(n/s)) = \Omega(\epsilon s \log(n/s))$.

6.2 Distributional Indexed Gap $\ell_\infty$

Consider the following communication game, which we refer to as Gap $\ell_\infty^B$, studied in [BYJKS04]. The legal instances are pairs $(x, y)$ of $m$-dimensional vectors, with $x_i, y_i \in \{0, 1, 2, \ldots, B\}$ for all $i$, such that either:

- NO instance: for all $i$, $y_i - x_i \in \{0, 1\}$, or
- YES instance: there is a unique $i$ for which $y_i - x_i = B$, and for all $j \ne i$, $y_j - x_j \in \{0, 1\}$.

The distributional communication complexity $D_{\sigma,\delta}(f)$ of a function $f$ is the minimum over all deterministic protocols computing $f$ with error probability at most $\delta$, where the probability is over inputs drawn from $\sigma$.

Consider the distribution $\sigma$ which chooses a random $i \in [m]$. Then for each $j \ne i$, it chooses a random $d \in \{0, \ldots, B\}$, and $(x_j, y_j)$ is uniform in $\{(d, d), (d, d+1)\}$. For coordinate $i$, $(x_i, y_i)$ is uniform in $\{(0, 0), (0, B)\}$. Using similar arguments to those in [BYJKS04], Jayram [Jay02] showed $D_{\sigma,\delta}(\text{Gap }\ell_\infty^B) = \Omega(m/B^2)$ (this is reference [70] on p. 182 of [BY02]) for $\delta$ less than a small constant.

We define the one-way distributional communication complexity $D^{1\text{-way}}_{\sigma,\delta}(f)$ of a function $f$ to be the smallest distributional complexity of a protocol for $f$ in which only a single message is sent from Alice to Bob.

Definition 6.5 (Indexed Ind $\ell_\infty^{r,B}$ Problem). There are $r$ pairs of inputs $(x^1, y^1), (x^2, y^2), \ldots, (x^r, y^r)$ such that every pair $(x^i, y^i)$ is a legal instance of the Gap $\ell_\infty^B$ problem. Alice is given $x^1, \ldots, x^r$. Bob is given an index $I \in [r]$ and $y^1, \ldots, y^r$. The goal is to decide whether $(x^I, y^I)$ is a NO or a YES instance of Gap $\ell_\infty^B$.

Let $\eta$ be the distribution $\sigma^r \times U_r$, where $U_r$ is the uniform distribution on $[r]$. We bound $D^{1\text{-way}}_{\eta,\delta}(\text{Ind }\ell_\infty^{r,B})$ as follows. For a function $f$, let $f^r$ denote the problem of computing $r$ instances of $f$. For a distribution $\zeta$ on instances of $f$, let $D^{1\text{-way},*}_{\zeta^r,\delta}(f^r)$ denote the minimum communication cost of a deterministic protocol computing $f$ with error probability at most $\delta$ in each of the $r$ copies of $f$, where the inputs come from $\zeta^r$.

Theorem 6.6 (special case of Corollary 2.5 of [BR11]). Assume $D_{\sigma,\delta}(f)$ is larger than a large enough constant. Then $D^{1\text{-way},*}_{\sigma^r,\delta/2}(f^r) = \Omega(r D_{\sigma,\delta}(f))$.
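To make the distribution $\sigma$ concrete, here is a small sampler, our own illustration (names and 0-indexing are ours, not the paper's):

```python
import numpy as np

def sample_gap_linf(m, B, seed=0):
    """Draw (x, y, is_yes) from sigma for Gap-l_inf^B: off the special index i,
    y_j - x_j is uniform in {0, 1}; at i, (x_i, y_i) is uniform in {(0,0),(0,B)}."""
    rng = np.random.default_rng(seed)
    d = rng.integers(0, B + 1, size=m)
    x = d.copy()
    y = d + rng.integers(0, 2, size=m)
    i = int(rng.integers(0, m))
    is_yes = bool(rng.integers(0, 2))
    x[i], y[i] = 0, (B if is_yes else 0)
    return x, y, is_yes
```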
Theorem 6.7. For $\delta$ less than a sufficiently small constant, $D^{1\text{-way}}_{\eta,\delta}(\text{Ind }\ell_\infty^{r,B}) = \Omega(\delta^2 r m/(B^2 \log r))$.

Proof. Consider a deterministic 1-way protocol $\Pi$ for Ind $\ell_\infty^{r,B}$ with error probability $\delta$ on inputs drawn from $\eta$. Then for at least $r/2$ values $i \in [r]$,
$$\Pr[\Pi(x^1, \ldots, x^r, y^1, \ldots, y^r, I) = \text{Gap }\ell_\infty^B(x^I, y^I) \mid I = i] \ge 1 - 2\delta.$$
Fix a set $S = \{i_1, \ldots, i_{r/2}\}$ of indices with this property.

We build a deterministic 1-way protocol $\Pi'$ for $f^{r/2}$ with input distribution $\sigma^{r/2}$ and error probability at most $6\delta$ in each of the $r/2$ copies of $f$. For each $\ell \in [r] \setminus S$, independently choose $(x^\ell, y^\ell) \sim \sigma$. For each $j \in [r/2]$, let $Z_j^1$ be the probability that $\Pi(x^1, \ldots, x^r, y^1, \ldots, y^r, I) = \text{Gap }\ell_\infty^B(x^{i_j}, y^{i_j})$ given $I = i_j$ and the choice of $(x^\ell, y^\ell)$ for all $\ell \in [r] \setminus S$. If we repeat this experiment independently $s = O(\delta^{-2} \log r)$ times, obtaining independent $Z_j^1, \ldots, Z_j^s$, and let $Z_j = \sum_t Z_j^t$, then $\Pr[Z_j \ge s - 3\delta s] \ge 1 - \frac{1}{r}$. So there exists a set of $s = O(\delta^{-2} \log r)$ repetitions for which, for each $j \in [r/2]$, $Z_j \ge s - 3\delta s$. We hardwire these into $\Pi'$ to make the protocol deterministic.

Given inputs $((X^1, \ldots, X^{r/2}), (Y^1, \ldots, Y^{r/2})) \sim \sigma^{r/2}$ to $\Pi'$, Alice and Bob run $s$ executions of $\Pi$, each with $x^{i_j} = X^j$ and $y^{i_j} = Y^j$ for all $j \in [r/2]$, filling in the remaining values using the hardwired inputs. Bob runs the algorithm specified by $\Pi$ for each $i_j \in S$ and each execution. His output for $(X^j, Y^j)$ is the majority of the outputs of the $s$ executions with index $i_j$.

Fix an index $i_j$. Let $W$ be the number of repetitions for which Gap $\ell_\infty^B(X^j, Y^j)$ does not equal the output of $\Pi$ on index $i_j$, for a random $(X^j, Y^j) \sim \sigma$. Then $\mathrm{E}[W] \le 3\delta s$. By a Markov bound, $\Pr[W \ge s/2] \le 6\delta$, and so the coordinate is correct with probability at least $1 - 6\delta$.

The communication of $\Pi'$ is a factor $s = \Theta(\delta^{-2} \log r)$ more than that of $\Pi$. The theorem now follows by Theorem 6.6, using that $D_{\sigma,12\delta}(\text{Gap }\ell_\infty^B) = \Omega(m/B^2)$.

6.3 Lower bound for sparse recovery

Fix the parameters $B = \Theta(1/\epsilon^{1/2})$, $r = k$, $m = 1/\epsilon^{3/2}$, and $n = k/\epsilon^3$. Given an instance $(x^1, y^1), \ldots, (x^r, y^r), I$ of Ind $\ell_\infty^{r,B}$, we define the input signal $z$ to a sparse recovery problem. We allocate a set $S^i$ of $m$ disjoint coordinates in a universe of size $n$ for each pair $(x^i, y^i)$, and on these coordinates place the vector $y^i - x^i$. The locations are important for arguing the sparse recovery algorithm cannot learn much information about the noise, and will be placed uniformly at random. Let $\rho$ denote the induced distribution on $z$.

Fix a $(1 + \epsilon)$-approximate $k$-sparse recovery bit scheme Alg that takes $b$ bits as input and succeeds with probability at least $1 - \delta/2$ over $z \sim \rho$ for some small constant $\delta$. Let $S$ be the set of top $k$ coordinates in $z$. Alg has the guarantee that if it succeeds for $z \sim \rho$, then there exists a small $u$ with $\|u\|_1 < n^{-2}$ so that $v = \text{Alg}(z)$ satisfies
$$\|v - z - u\|_1 \le (1 + \epsilon)\|(z + u)_{[n] \setminus S}\|_1$$
$$\|v - z\|_1 \le (1 + \epsilon)\|z_{[n] \setminus S}\|_1 + (2 + \epsilon)/n^2 \le (1 + 2\epsilon)\|z_{[n] \setminus S}\|_1,$$
and thus
$$\|(v - z)_S\|_1 + \|(v - z)_{[n] \setminus S}\|_1 \le (1 + 2\epsilon)\|z_{[n] \setminus S}\|_1. \qquad (14)$$

Lemma 6.8. For $B = \Theta(1/\epsilon^{1/2})$ sufficiently large, suppose that
$$\Pr_{z \sim \rho}[\|(v - z)_S\|_1 \le 10\epsilon \cdot \|z_{[n] \setminus S}\|_1] \ge 1 - \delta.$$
Then Alg requires $b = \Omega(k/(\epsilon^{1/2} \log k))$.

Proof. We show how to use Alg to solve instances of Ind $\ell_\infty^{r,B}$ with probability at least $1 - C$ for some small $C$, where the probability is over input instances to Ind $\ell_\infty^{r,B}$ distributed according to $\eta$, inducing the distribution $\rho$. The lower bound will follow by Theorem 6.7.
Since Alg is a deterministic sparse recovery bit scheme, it receives a sketch $f(z)$ of the input signal $z$ and runs an arbitrary recovery algorithm $g$ on $f(z)$ to determine its output $v = \text{Alg}(z)$. Given $x^1, \ldots, x^r$, for each $i = 1, 2, \ldots, r$, Alice places $-x^i$ on the appropriate coordinates in the block $S^i$ used in defining $z$, obtaining a vector $z_{\text{Alice}}$, and transmits $f(z_{\text{Alice}})$ to Bob. Bob uses his inputs $y^1, \ldots, y^r$ to place $y^i$ on the appropriate coordinates in $S^i$. He thus creates a vector $z_{\text{Bob}}$ for which $z_{\text{Alice}} + z_{\text{Bob}} = z$. Given $f(z_{\text{Alice}})$, Bob computes $f(z)$ from $f(z_{\text{Alice}})$ and $f(z_{\text{Bob}})$, then $v = \text{Alg}(z)$. We assume all coordinates of $v$ are rounded to the real interval $[0, B]$, as this can only decrease the error.

We say that $S^i$ is bad if either

- there is no coordinate $j$ in $S^i$ for which $|v_j| \ge \frac{B}{2}$ yet $(x^i, y^i)$ is a YES instance of Gap $\ell_\infty^B$, or
- there is a coordinate $j$ in $S^i$ for which $|v_j| \ge \frac{B}{2}$ yet either $(x^i, y^i)$ is a NO instance of Gap $\ell_\infty^B$ or $j$ is not the unique $j^*$ for which $y^i_{j^*} - x^i_{j^*} = B$.

The $\ell_1$-error incurred by a bad block is at least $B/2 - 1$. Hence, if there are $t$ bad blocks, the total error is at least $t(B/2 - 1)$, which must be smaller than $10\epsilon \cdot \|z_{[n] \setminus S}\|_1$ with probability $1 - \delta$. Suppose this happens. We bound $t$. All coordinates in $z_{[n] \setminus S}$ have value in the set $\{0, 1\}$. Hence, $\|z_{[n] \setminus S}\|_1 < rm$. So $t \le 20\epsilon rm/(B - 2)$. For $B \ge 6$, $t \le 30\epsilon rm/B$. Plugging in $r$, $m$ and $B$, $t \le Ck$, where $C > 0$ is a constant that can be made arbitrarily small by increasing $B = \Theta(1/\epsilon^{1/2})$.

If a block $S^i$ is not bad, then it can be used to solve Gap $\ell_\infty^B$ on $(x^i, y^i)$ with probability 1. Bob declares that $(x^i, y^i)$ is a YES instance if and only if there is a coordinate $j$ in $S^i$ for which $|v_j| \ge B/2$. Since Bob's index $I$ is uniform on the $r$ coordinates in Ind $\ell_\infty^{r,B}$, with probability at least $1 - C$ the players solve Ind $\ell_\infty^{r,B}$ given that the $\ell_1$ error is small. Therefore they solve Ind $\ell_\infty^{r,B}$ with probability $1 - \delta - C$ overall. By Theorem 6.7, for $C$ and $\delta$ sufficiently small, Alg requires $b = \Omega(rm/(B^2 \log r)) = \Omega(k/(\epsilon^{1/2} \log k))$ bits.

Lemma 6.9. Suppose $\Pr_{z \sim \rho}[\|(v - z)_{[n] \setminus S}\|_1 \le (1 - 8\epsilon) \cdot \|z_{[n] \setminus S}\|_1] \ge \delta/2$. Then Alg requires $b = \Omega(\frac{1}{\sqrt{\epsilon}} k \log(1/\epsilon))$.

Proof. The distribution $\rho$ consists of $B(mr, 1/2)$ ones placed uniformly throughout the $n$ coordinates, where $B(mr, 1/2)$ denotes the binomial distribution with $mr$ events of probability $1/2$ each. Therefore with probability at least $1 - \delta/4$, the number of ones lies in $[\delta mr/8, (1 - \delta/8) mr]$. Thus by Lemma 6.4, $I(v; z) \ge \Omega(\epsilon mr \log(n/(mr)))$. Since the mutual information only passes through a $b$-bit string, $b = \Omega(\epsilon mr \log(n/(mr)))$ as well.

Theorem 6.10. Any $(1 + \epsilon)$-approximate $\ell_1/\ell_1$ recovery scheme with sufficiently small constant failure probability $\delta$ must make $\Omega(\frac{1}{\sqrt{\epsilon}} k/\log^2(k/\epsilon))$ measurements.

Proof. We will lower bound any $\ell_1/\ell_1$ sparse recovery bit scheme Alg. If Alg succeeds, then in order to satisfy inequality (14), we must either have $\|(v - z)_S\|_1 \le 10\epsilon \cdot \|z_{[n] \setminus S}\|_1$, or we must have $\|(v - z)_{[n] \setminus S}\|_1 \le (1 - 8\epsilon) \cdot \|z_{[n] \setminus S}\|_1$. Since Alg succeeds with probability at least $1 - \delta$, it must either satisfy the hypothesis of Lemma 6.8 or the hypothesis of Lemma 6.9.
But by these two lemmas, it follows that $b = \Omega(\frac{1}{\sqrt{\epsilon}} k/\log k)$. Therefore by Lemma 5.2, any $(1 + \epsilon)$-approximate $\ell_1/\ell_1$ sparse recovery algorithm requires $\Omega(\frac{1}{\sqrt{\epsilon}} k/\log^2(k/\epsilon))$ measurements.

7 Lower bounds for $k$-sparse output

Theorem 7.1. Any $(1 + \epsilon)$-approximate $\ell_1/\ell_1$ recovery scheme with $k$-sparse output and failure probability $\delta$ requires $m = \Omega(\frac{1}{\epsilon}(k \log \frac{1}{\epsilon} + \log \frac{1}{\delta}))$, for $32 \le \frac{1}{\delta} \le n \epsilon^2/k$.

Theorem 7.2. Any $(1 + \epsilon)$-approximate $\ell_2/\ell_2$ recovery scheme with $k$-sparse output and failure probability $\delta$ requires $m = \Omega(\frac{1}{\epsilon^2}(k + \log \frac{\epsilon^2}{\delta}))$, for $32 \le \frac{1}{\delta} \le n \epsilon^2/k$.

These two theorems correspond to four statements: one for large $k$ and one for small $\delta$, for both $\ell_1$ and $\ell_2$. All the lower bounds proceed by reductions from communication complexity. The following lemma (implicit in [DIPW10]) shows that lower bounding the number of bits for approximate recovery is sufficient to lower bound the number of measurements.

Lemma 7.3. Let $p \in \{1, 2\}$ and $\alpha = \Omega(1) < 1$. Suppose $X \subset \mathbb{R}^n$ has $\|x\|_p \le D$ and $\|x\|_\infty \le D'$ for all $x \in X$, and all coefficients of elements of $X$ are expressible in $O(\log n)$ bits. Further suppose that we have a recovery algorithm that, for any $\nu$ with $\|\nu\|_p < \alpha D$ and $\|\nu\|_\infty < \alpha D'$, recovers $x \in X$ from $A(x + \nu)$ with constant probability. Then $A$ must have $\Omega(\log|X|)$ measurements.

Proof. First, we may assume that $A \in \mathbb{R}^{m \times n}$ has orthonormal rows (otherwise, if $A = U \Sigma V^T$ is its singular value decomposition, $\Sigma^+ U^T A$ has this property and can be inverted before applying the algorithm). Let $A'$ be $A$ rounded to $c \log n$ bits per entry. By Lemma 5.1 of [DIPW10], for any $v$ we have $A'v = A(v - s)$ for some $s$ with $\|s\|_1 \le n^2 2^{-c \log n}\|v\|_1$, so $\|s\|_p \le n^{2.5-c}\|v\|_p$.

Suppose Alice has a bit string of length $r \log|X|$ for $r = \Theta(\log n)$. By splitting into $r$ blocks, this corresponds to $x_1, \ldots, x_r \in X$. Let $\beta$ be a power of 2 between $\alpha/4$ and $\alpha/2$, and define
$$z_j = \sum_{i=j}^r \beta^i x_i.$$
Alice sends $A'z_1$ to Bob; this is $O(m \log n)$ bits. Bob will solve the augmented indexing problem: given $A'z_1$, an arbitrary $j \in [r]$, and $x_1, \ldots, x_{j-1}$, he must find $x_j$ with constant probability. This requires $A'z_1$ to have $\Omega(r \log|X|)$ bits, giving the result.

Bob receives $A'z_1 = A(z_1 + s)$ for $\|s\|_p \le n^{2.5-c}\|z_1\|_p \le n^{2.5-c} D$. Bob then chooses $u \in B_p^n(n^{4.5-c} D)$ uniformly at random. With probability at least $1 - 1/n$, $u \in B_p^n((1 - 1/n^2) n^{4.5-c} D)$ by a volume argument. In this case $u + s \in B_p^n(n^{4.5-c} D)$; hence the random variables $u$ and $u + s$ overlap in at least a $1 - 1/n$ fraction of their volumes, so $z_j + s + u$ and $z_j + u$ have statistical distance at most $1/n$. The distribution of $z_j + u$ is independent of $A$ (unlike $z_j + s$), so running the recovery algorithm on $A(z_j + s + u)$ succeeds with constant probability as well.

We also have $\|z_j\|_p \le \frac{\beta^j - \beta^{r+1}}{1 - \beta} D < 2(\beta^j - \beta^{r+1}) D$. Since $r = O(\log n)$ and $\beta$ is a constant, there exists a $c = O(1)$ with
$$\|z_j + s + u\|_p < (2\beta^j + n^{4.5-c} + n^{2.5-c} - 2\beta^{r+1}) D \le \beta^{j-1} \alpha D$$
for all $j$. Therefore, given $x_1, \ldots, x_{j-1}$, Bob can compute
$$\frac{1}{\beta^j}\Big(A'z_1 + Au - A'\sum_{i<j} \beta^i x_i\Big) = \frac{1}{\beta^j} A(z_j + s + u),$$
to which the recovery algorithm applies; he thus finds $x_j$ with constant probability, solving augmented indexing. Hence $A'z_1$ has $\Omega(r \log|X|)$ bits; since it has $O(m \log n)$ bits and $r = \Theta(\log n)$, we get $m = \Omega(\log|X|)$.
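The geometric peeling in this proof can be made concrete with a short sketch. This is our own illustration; it ignores the rounding term $s$ and the smoothing noise $u$ from the proof, and uses 0-indexed blocks, purely to show how Bob isolates $x_j$ from $z_1$ and the known prefix.

```python
def embed(xs, beta):
    """z_1 = sum over i of beta^i * x_i, as in the proof of Lemma 7.3
    (paper is 1-indexed; here block i of `xs` carries weight beta^(i+1))."""
    return sum(beta ** (i + 1) * x for i, x in enumerate(xs))

def peel(z1, xs_prefix, beta, j):
    """Given z_1 and the prefix x_1..x_{j-1}, form beta^{-j} * (z_1 - known part)
    = x_j + a geometrically small remainder, to which recovery applies."""
    zj = z1 - sum(beta ** (i + 1) * x for i, x in enumerate(xs_prefix))
    return zj / beta ** (j + 1)   # = x_j + sum_{i>j} beta^(i-j) * x_i
```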
7.1 Large $k$

Suppose $p$, $s$, $3r$ satisfy Lemma 7.4 for some parameter $c$, and let $q = s/k$. The Gilbert-Varshamov bound implies that there exists a code $V \subset [q]^r$ with $\log|V| = \Omega(r \log q)$ and minimum Hamming distance $r/4$. Let $X \subset \{0,1\}^{qr}$ be in one-to-one correspondence with $V$: $x \in X$ corresponds to $v \in V$ when $x_{(a-1)q + b} = 1$ if and only if $v_a = b$. Let $x$ and $v$ correspond. Let $S \subset [r]$ with $|S| = k$, so $S$ corresponds to a set $T \subset [n]$ with $|T| = kq = s$.

Consider arbitrary $\nu$ that satisfies $\|\nu\|_p < \alpha\|x\|_p$ and $\|\nu\|_\infty \le \alpha$ for some small constant $\alpha \le 1/4$. We would like to apply Lemma 7.3, so we just need to show we can recover $x$ from $A(x + \nu)$ with constant probability. Let $\nu' = x_{\overline{T}} + \nu$, so
$$\|\nu'\|_p^p \le p(\|x_{\overline{T}}\|_p^p + \|\nu\|_p^p) \le p(r - k + \alpha^p r) \le 3r$$
$$\|\nu'_{\overline{T}}\|_\infty \le 1 + \alpha \qquad \|\nu'_T\|_\infty \le \alpha.$$
Therefore Lemma 7.4 implies that with probability $1 - \delta$, if Bob is given $A(x_T + \nu') = A(x + \nu)$ he can recover $\hat{x}$ that agrees with $x_T$ in all but $k/c$ locations. Hence in all but $k/c$ of the $i \in S$, $x_{\{(i-1)q+1, \ldots, iq\}} = \hat{x}_{\{(i-1)q+1, \ldots, iq\}}$, so he can identify $v_i$. Hence Bob can recover an estimate of $v_S$ that is accurate in $(1 - 1/c)k$ characters with probability $1 - \delta$, so it agrees with $v_S$ in $(1 - 1/c)(1 - \delta)k$ characters in expectation. If we apply this in parallel to the sets $S_i = \{k(i-1) + 1, \ldots, ki\}$ for $i \in [r/k]$, we recover $(1 - 1/c)(1 - \delta) r$ characters in expectation. Hence with probability at least $1/2$, we recover more than $(1 - 2(1/c + \delta)) r$ characters of $v$. If we set $\delta$ and $1/c$ to less than $1/32$, this gives that we recover all but $r/8$ characters of $v$. Since $V$ has minimum distance $r/4$, this allows us to recover $v$ (and hence $x$) exactly. By Lemma 7.3 this gives a lower bound of $m = \Omega(\log|V|) = \Omega(r \log q)$. Hence $m = \Omega(\frac{1}{\epsilon} k \log \frac{1}{\epsilon})$ for $\ell_1/\ell_1$ recovery and $m = \Omega(\frac{1}{\epsilon^2} k)$ for $\ell_2/\ell_2$ recovery.

7.2 $k = 1$, $\delta = o(1)$

To achieve the other half of our lower bounds for sparse outputs, we restrict to the $k = 1$ case. A $k$-sparse algorithm implies a 1-sparse algorithm by inserting $k - 1$ dummy coordinates of value $\infty$, so this is valid. Let $p$, $s$, $51r$ satisfy Lemma 7.4 for some $\alpha$ and $D$ to be determined, and let our recovery algorithm have failure probability $\delta$. Let $C = 1/(2r\delta)$ and $n = Cr$. Let $V = [(s-1)C]^r$ and let $X' \subset \{0,1\}^{(s-1)Cr}$ be the corresponding set of binary vectors. Let $X = \{0\} \times X'$ be defined by adding $x_0 = 0$ to each vector.

Now, consider arbitrary $x \in X$ and noise $\nu \in \mathbb{R}^{1 + (s-1)Cr}$ with $\|\nu\|_p < \alpha\|x\|_p$ and $\|\nu\|_\infty \le \alpha$ for some small constant $\alpha \le 1/20$. Let $e_0/5$ be the vector that is $1/5$ at coordinate 0 and 0 elsewhere. Consider the sets $S_i = \{0, (s-1)(i-1) + 1, (s-1)(i-1) + 2, \ldots, (s-1)i\}$. We would like to apply Lemma 7.4 to recover $(x + \nu + e_0/5)_{S_i}$ for each $i$. To see what it implies, there are two cases: $\|x_{S_i}\|_1 = 1$ and $\|x_{S_i}\|_1 = 0$ (since $S_i$ lies entirely in one character, $\|x_{S_i}\|_1 \in \{0, 1\}$).

In the former case, we have $\nu' = x_{\overline{S_i}} + \nu + e_0/5$ with
$$\|\nu'\|_p^p \le (2^p - 1)(\|x_{\overline{S_i}}\|_p^p + \|\nu\|_p^p + \|e_0/5\|_p^p) \le 3(r + \alpha^p r + 1/5^p) < 4r$$
$$\|\nu'_{\overline{S_i}}\|_\infty \le 1 + \alpha \qquad \|\nu'_{S_i}\|_\infty \le 1/5 + \alpha \le 1/4.$$
Hence Lemma 7.4 will, with failure probability $\delta$, recover $\hat{x}_{S_i}$ that differs from $x_{S_i}$ in at most $1/c < 1$ positions, so $x_{S_i}$ is correctly recovered.

Now, suppose $\|x_{S_i}\|_1 = 0$.
Then we observe that Lemma 7.4 would apply to recovery from $5A(x + \nu + e_0/5)$, with $\nu' = 5x + 5\nu$ and $x' = e_0$, so
$$\|\nu'\|_p^p \le 5^p p(\|x\|_p^p + \|\nu\|_p^p) \le 5^p p(r + \alpha^p r) < 51r$$
$$\|\nu'_{\overline{S_i}}\|_\infty \le 5 + 5\alpha \qquad \|\nu'_{S_i}\|_\infty \le 5\alpha.$$
Hence Lemma 7.4 would recover, with failure probability $\delta$, an $\hat{x}_{S_i}$ with support equal to $\{0\}$.

Now, we observe that the algorithm in Lemma 7.4 is robust to scaling the input $A(x' + \nu')$ by 5; the only difference is that the effective $\mu$ changes by the same factor, which increases the number of errors $k/c$ by a factor of at most 5. Hence if $c > 5$, we can apply the algorithm once and have it work regardless of whether $\|x_{S_i}\|_1$ is 0 or 1: if $\|x_{S_i}\|_1 = 1$ the result has support $\mathrm{supp}(x_{S_i})$, and if $\|x_{S_i}\|_1 = 0$ the result has support $\{0\}$. Thus we can recover $x_{S_i}$ exactly with failure probability $\delta$.

If we apply this to the $Cr = 1/(2\delta)$ sets $S_i$, we recover all of $x$ correctly with failure probability at most $1/2$. Hence Lemma 7.3 implies that $m = \Omega(\log|X|) = \Omega(r \log \frac{s}{r\delta})$. For $\ell_1/\ell_1$, this means $m = \Omega(\frac{1}{\epsilon} \log \frac{1}{\delta})$; for $\ell_2/\ell_2$, this means $m = \Omega(\frac{1}{\epsilon^2} \log \frac{\epsilon^2}{\delta})$.

Acknowledgment   We thank T.S. Jayram for helpful discussions.

References

[ASZ10] S. Aeron, V. Saligrama, and M. Zhao. Information theoretic bounds for compressed sensing. IEEE Transactions on Information Theory, 56(10):5111-5130, 2010.

[BR11] Mark Braverman and Anup Rao. Information equals amortized communication. In STOC, 2011.

[BY02] Ziv Bar-Yossef. The Complexity of Massive Data Set Computations. PhD thesis, UC Berkeley, 2002.

[BYJKS04] Ziv Bar-Yossef, T. S. Jayram, Ravi Kumar, and D. Sivakumar. An information statistics approach to data stream and communication complexity. J. Comput. Syst. Sci., 68(4):702-732, 2004.

[CCF02] M. Charikar, K. Chen, and M. Farach-Colton. Finding frequent items in data streams. ICALP, 2002.

[CD11] E. J. Candès and M. A. Davenport. How well can we estimate a sparse vector? Arxiv preprint arXiv:1104.5246, 2011.

[CM04] G. Cormode and S. Muthukrishnan. Improved data stream summaries: The count-min sketch and its applications. LATIN, 2004.

[CM05] Graham Cormode and S. Muthukrishnan. Summarizing and mining skewed data streams. In SDM, 2005.

[CM06] G. Cormode and S. Muthukrishnan. Combinatorial algorithms for compressed sensing. Sirocco, 2006.

[CRT06] E. J. Candès, J. Romberg, and T. Tao. Stable signal recovery from incomplete and inaccurate measurements. Comm. Pure Appl. Math., 59(8):1208-1223, 2006.

[DIPW10] K. Do Ba, P. Indyk, E. Price, and D. Woodruff. Lower bounds for sparse recovery. SODA, 2010.

[Don06] D. L. Donoho. Compressed sensing. IEEE Trans. Info. Theory, 52(4):1289-1306, Apr. 2006.

[FPRU10] S. Foucart, A. Pajor, H. Rauhut, and T. Ullrich. The Gelfand widths of lp-balls for 0 < p ≤ 1. 2010.

[GLPS10] Anna C. Gilbert, Yi Li, Ely Porat, and Martin J. Strauss. Approximate sparse recovery: optimizing time and measurements. In STOC, pages 475-484, 2010.

[Gur10] V. Guruswami. Introduction to coding theory. Graduate course notes, available at http://www.cs.cmu.edu/~venkatg/teaching/codingtheory/, 2010.

[IR08] Piotr Indyk and Milan Ruzic. Near-optimal sparse recovery in the l1 norm. In FOCS, pages 199-207, 2008.
[IT10] M. A. Iwen and A. H. Tewfik. Adaptive group testing strategies for target detection and localization in noisy environments. IMA Preprint Series, (2311), 2010.

[Jay02] T. S. Jayram. Unpublished manuscript, 2002.

[Mut05] S. Muthukrishnan. Data streams: Algorithms and applications. FTTCS, 2005.

[SAZ10] N. Shental, A. Amir, and Or Zuk. Identification of rare alleles and their carriers using compressed se(que)nsing. Nucleic Acids Research, 38(19):1-22, 2010.

[TDB09] J. Treichler, M. Davenport, and R. Baraniuk. Application of compressive sensing to the design of wideband signal acquisition receivers. In Proc. U.S./Australia Joint Work. Defense Apps. of Signal Processing (DASP), 2009.

[Wai09] Martin J. Wainwright. Information-theoretic limits on sparsity recovery in the high-dimensional and noisy setting. IEEE Transactions on Information Theory, 55(12):5728-5741, 2009.