A Deterministic Sub-linear Time Sparse Fourier Algorithm via Non-adaptive Compressed Sensing Methods

M. A. Iwen*
University of Michigan
markiwen@umich.edu

Abstract

We study the problem of estimating the best $B$-term Fourier representation for a given frequency-sparse signal (i.e., vector) $A$ of length $N \gg B$. More explicitly, we investigate how to deterministically identify $B$ of the largest magnitude frequencies of $\hat{A}$, and estimate their coefficients, in $\mathrm{polynomial}(B, \log N)$ time. Randomized sub-linear time algorithms which have a small (controllable) probability of failure for each processed signal exist for solving this problem. However, for failure-intolerant applications such as those involving mission-critical hardware designed to process many signals over a long lifetime, deterministic algorithms with no probability of failure are highly desirable. In this paper we build on the deterministic Compressed Sensing results of Cormode and Muthukrishnan (CM) [26, 6, 7] in order to develop the first known deterministic sub-linear time sparse Fourier Transform algorithm suitable for failure-intolerant applications. Furthermore, in the process of developing our new Fourier algorithm, we present a simplified deterministic Compressed Sensing algorithm which improves on CM's algebraic compressibility results while simultaneously maintaining their results concerning exponential decay.

1 Introduction

In many applications only the top few most energetic terms of a signal's Fourier Transform (FT) are of interest. In such applications the Fast Fourier Transform (FFT), which computes all FT terms, is computationally wasteful.
To make our point, we next consider a simple application-based example in which the FFT can be replaced by faster approximate Fourier methods.

Motivating Example: sub-Nyquist frequency acquisition. Imagine a signal/function $f : [0, 2\pi] \to \mathbb{C}$ of the form $f(x) = C \cdot e^{i \omega x}$ consisting of a single unknown frequency $\omega \in (-N, N]$ (e.g., consider a windowed sinusoidal portion of a wideband frequency-hopping signal [21]). Sampling at the Nyquist rate would dictate the need for at least $2N$ equally spaced samples from $f$ in order to discover $\omega$ via the FFT without aliasing [3]. Thus, we would have to compute the FFT of the $2N$-length vector $A(j) = f\left(\frac{\pi j}{N}\right)$, $0 \le j < 2N$. However, if we use aliasing to our advantage we can correctly determine $\omega$ with significantly fewer $f$-samples taken in parallel.

* Supported in part by NSF DMS-0510203.

Consider, for example, the two-sample Discrete Fourier Transform of $f$. It has $\hat{f}(0) = C \cdot \frac{1 + (-1)^{\omega}}{2}$ and $\hat{f}(1) = C \cdot \frac{1 + (-1)^{\omega - 1}}{2}$. Clearly $\hat{f}(0) = 0$ implies that $\omega \equiv 1$ modulo 2, while $\hat{f}(1) = 0$ implies that $\omega \equiv 0$ modulo 2. In this fashion we may use several potentially aliased Fast Fourier Transforms in parallel to discover $\omega$ modulo $3, 5, \ldots$, the $O(\log N)$th prime. Once we have collected these moduli we can reconstruct $\omega$ via the famous Chinese Remainder Theorem (CRT).

Theorem 1 (Chinese Remainder Theorem (CRT)): Any integer $x$ is uniquely specified mod $N$ by its remainders modulo $m$ relatively prime integers $p_1, \ldots, p_m$ as long as $\prod_{l=1}^{m} p_l \ge N$.

To finish our example, suppose that $N = 500{,}000$ and that we have used three FFTs with 100, 101, and 103 samples to determine that $\omega \equiv 34 \bmod 100$, $\omega \equiv 3 \bmod 101$, and $\omega \equiv 1 \bmod 103$, respectively. Using that $\omega \equiv 1 \bmod 103$ we can see that $\omega = 103 \cdot a + 1$ for some integer $a$. Thus,

$(103 \cdot a + 1) \equiv 3 \bmod 101 \;\Rightarrow\; 2a \equiv 2 \bmod 101 \;\Rightarrow\; a \equiv 1 \bmod 101.$
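The residue-combining steps being carried out by hand here follow the standard constructive CRT procedure. A minimal sketch in Python (the `crt` helper is illustrative, not part of the paper):

```python
def crt(residues, moduli):
    """Combine x ≡ r_i (mod m_i) for pairwise coprime moduli by successive substitution."""
    x, M = 0, 1
    for r, m in zip(residues, moduli):
        # Solve x + M*t ≡ r (mod m) for t, using the modular inverse of M mod m.
        t = ((r - x) * pow(M, -1, m)) % m
        x, M = x + M * t, M * m
    return x

# The three aliased FFTs gave ω ≡ 34 (mod 100), ω ≡ 3 (mod 101), ω ≡ 1 (mod 103):
omega = crt([34, 3, 1], [100, 101, 103])  # → 104134
```

The three-argument `pow` (Python 3.8+) supplies the modular inverse that the hand derivation computes implicitly at each substitution step.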
Therefore, $a = 101 \cdot b + 1$ for some integer $b$. Substituting for $a$ we get that $\omega = 10403 \cdot b + 104$. By similar work we can see that $b \equiv 10 \bmod 100$ after considering $\omega$ modulo 100. Hence, $\omega = 104{,}134$ by the CRT. As an added bonus we note that our three FFTs will have also provided us with three different estimates of $\omega$'s coefficient $C$.

The end result is that we have used significantly fewer than $2N$ samples to determine $\omega$. Using the CRT we required only $100 + 101 + 103 = 304$ samples from $f$ to determine $\omega$, since $100 \cdot 101 \cdot 103 > 1{,}000{,}000$. In contrast, a million $f$-samples would be gathered during Nyquist-rate sampling. Besides needing significantly fewer samples than the FFT, this CRT-based single frequency method dramatically reduces the required computational effort. And, it is deterministic: there is no chance of failure.

Of course, a single frequency signal is incredibly simple. Signals involving more than one non-zero frequency are much more difficult to handle, since frequency moduli may begin to collide modulo various numbers. Dealing with the potential difficulties caused by such frequency collisions in a deterministic way comprises the majority of this paper.

1.1 Compressed Sensing and Related Work

Compressed Sensing (CS) methods [4, 28, 26, 6, 7] provide a robust framework for reducing the number of measurements required to estimate a sparse signal. For this reason CS methods are useful in areas such as MR imaging [23, 24] and analog-to-digital conversion [21, 20] where measurement costs are high. The general CS setup is as follows: Let $A$ be an $N$-length signal/vector with complex valued entries and $\Psi$ be a full rank $N \times N$ change of basis matrix. Furthermore, suppose that $\Psi \cdot A$ is sparse (i.e., only $k \ll N$ entries of $\Psi \cdot A$ are significant/large in magnitude).
CS methods deal with generating a $K \times N$ measurement matrix, $M$, with the smallest number of rows possible (i.e., $K$ minimized) so that the $k$ significant entries of $\Psi \cdot A$ can be well approximated by the $K$-element vector result of

$M \cdot \Psi \cdot A.$ (1)

Note that CS is inherently algorithmic, since a procedure for recovering $\Psi \cdot A$'s largest $k$ entries from the result of Equation 1 must be specified. For the remainder of this paper we will consider the special CS case where $\Psi$ is the $N \times N$ Discrete Fourier Transform matrix. Hence, we have

$\Psi_{i,j} = e^{\frac{-2\pi \mathrm{i} \cdot i \cdot j}{N}}.$ (2)

Our problem of interest is to find, and estimate the coefficients of, the $k$ significant entries (i.e., frequencies) of $\hat{A}$ given a frequency-sparse (i.e., smooth) signal $A$. In this case the deterministic Fourier CS measurement matrices, $M \cdot \Psi$, produced by [28, 26, 6, 7] require super-linear $O(KN)$-time to multiply by $A$ in Equation 1. Similarly, the energetic frequency recovery procedure of [4, 9] requires super-linear time in $N$. Hence, none of [4, 28, 9, 26, 6, 7] have both sub-linear measurement and reconstruction time.

Existing randomized sub-linear time Fourier algorithms [15, 19, 16] not only show great promise for decreasing measurement costs, but also for speeding up the numerical solution of computationally challenging multi-scale problems [8, 18]. However, these algorithms are not deterministic and so can produce incorrect results with some small probability on each input signal. Thus, they are not appropriate for long-lived failure-intolerant applications.

In this paper we build on the deterministic Compressed Sensing methods of Cormode and Muthukrishnan (CM) [26, 6, 7] in order to construct the first known deterministic sub-linear time sparse Fourier algorithm. In order to produce our new Fourier algorithm we must modify CM's work in two ways.
First, we alter CM's measurement construction in order to allow sub-linear time computation of Fourier measurements via aliasing. Thus, our algorithm can deterministically approximate the result of Equation 1 in time $K^2 \cdot \mathrm{polylog}(N)$. Second, CM use a $k$-strongly selective collection of sets [17] to construct their measurements for algebraically compressible signals. We introduce the generalized notion of a $K$-majority $k$-strongly selective collection of sets, which leads us to a new reconstruction algorithm with better algebraic compressibility results than CM's algorithm. As a result, our deterministic sub-linear time Fourier algorithm has better than previously possible algebraic compressibility behavior.

The main contributions of this paper are:

1. We present a new deterministic compressed sensing algorithm that both (i) improves on CM's algebraically compressible signal results, and (ii) has comparable measurement/run time requirements to CM's algorithm for exponentially decaying signals.

2. We present the first known deterministic sub-linear time sparse DFT. In the process, we explicitly demonstrate the connection between compressed sensing and sub-linear time Fourier transform methods.

3. We introduce $K$-majority $k$-strongly selective collections of sets, which have potential applications to streaming algorithms along the lines of [25, 13].

The remainder of this paper is organized as follows: In Section 2 we introduce relevant definitions and terminology. Then, in Section 3 we define $K$-majority $k$-strongly selective collections of sets and use them to construct our compressed sensing measurements. Section 4 contains our new deterministic compressed sensing algorithm along with analysis of its accuracy and run time. Finally, we present our deterministic sub-linear time Fourier algorithm in Sections 5 and 5.1. Section 6 contains a short conclusion.
2 Preliminaries

Throughout the remainder of this paper we will be interested in complex-valued functions $f : [0, 2\pi] \to \mathbb{C}$ and signals (or arrays) of length $N$ containing $f$ values at various $x \in [0, 2\pi]$. We shall denote such signals by $A$, where $A(j) \in \mathbb{C}$ is the signal's $j$th complex value for all $j \in [0, N-1] \subset \mathbb{N}$. Hereafter we will refer to the process of either calculating, measuring, or retrieving the $f$ value associated with any $A(j) \in \mathbb{C}$ from machine memory as sampling from $f$ and/or $A$.

Given a signal $A$ we define its discrete $L^2$-norm, or Euclidean norm, to be

$\|A\|_2 = \sqrt{\sum_{j=0}^{N-1} |A(j)|^2}.$

We will also refer to $\|A\|_2^2$ as $A$'s energy. For any signal, $A$, its Discrete Fourier Transform (DFT), denoted $\hat{A}$, is another signal of length $N$ defined as follows:

$\hat{A}(\omega) = \frac{1}{\sqrt{N}} \sum_{j=0}^{N-1} e^{\frac{-2\pi \mathrm{i} \omega j}{N}} A(j), \quad \forall \omega \in [0, N-1].$

Furthermore, we may recover $A$ from its DFT via the Inverse Discrete Fourier Transform (IDFT) as follows:

$A(j) = \frac{1}{\sqrt{N}} \sum_{\omega=0}^{N-1} e^{\frac{2\pi \mathrm{i} \omega j}{N}} \hat{A}(\omega), \quad \forall j \in [0, N-1].$

We will refer to any index, $\omega$, of $\hat{A}$ as a frequency. Furthermore, we will refer to $\hat{A}(\omega)$ as frequency $\omega$'s coefficient for each $\omega \in [0, N-1]$. Parseval's equality tells us that $\|\hat{A}\|_2 = \|A\|_2$ for any signal. In other words, the DFT preserves Euclidean norm and energy. Note that any frequency with a non-zero coefficient will contribute to $\hat{A}$'s energy. Hence, we will also refer to $|\hat{A}(\omega)|^2$ as frequency $\omega$'s energy. If $|\hat{A}(\omega)|$ is relatively large we will say that $\omega$ is energetic.

Our algorithm produces output of the form $(\omega_1, C_1), \ldots, (\omega_B, C_B)$ where each $(\omega_j, C_j) \in [0, N-1] \times \mathbb{C}$. We will refer to any such set of $B < N$ tuples $\left\{ (\omega_j, C_j) \in [0, N-1] \times \mathbb{C} \text{ s.t. } 1 \le j \le B \right\}$ as a sparse Fourier representation and denote it with a superscript 's'.
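As a quick numerical sanity check of these conventions (not in the paper): the $\frac{1}{\sqrt{N}}$ factor above corresponds to numpy's "ortho" FFT normalization, under which Parseval's equality and the DFT/IDFT inversion hold to machine precision.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 128
A = rng.standard_normal(N) + 1j * rng.standard_normal(N)

# The paper's DFT carries a 1/sqrt(N) factor, i.e. numpy's "ortho" normalization.
A_hat = np.fft.fft(A, norm="ortho")

# Parseval's equality: the DFT preserves energy, ||A_hat||_2 = ||A||_2.
assert np.isclose(np.linalg.norm(A_hat), np.linalg.norm(A))

# The IDFT with the matching normalization recovers A.
assert np.allclose(np.fft.ifft(A_hat, norm="ortho"), A)
```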
Note that if we are given a sparse Fourier representation, $\hat{R}^s$, we may consider $\hat{R}^s$ to be a length-$N$ signal. We simply view $\hat{R}^s$ as the $N$-length signal

$\hat{R}(j) = \begin{cases} C_j & \text{if } (j, C_j) \in \hat{R}^s \\ 0 & \text{otherwise} \end{cases} \quad \text{for all } j \in [0, N-1].$

Using this idea we may, for example, compute $R$ from $\hat{R}^s$ via the IDFT.

A $B$-term/tuple sparse Fourier representation is $B$-optimal for a signal $A$ if it contains $B$ of the most energetic frequencies of $\hat{A}$ along with their coefficients. More precisely, we will say that a sparse Fourier representation $\hat{R}^s = \left\{ (\omega_j, C_j) \in [0, N-1] \times \mathbb{C} \text{ s.t. } 1 \le j \le B \right\}$ is $B$-optimal for $A$ if there exists a valid ordering of $\hat{A}$'s coefficients by magnitude

$|\hat{A}(\omega_1)| \ge |\hat{A}(\omega_2)| \ge \ldots \ge |\hat{A}(\omega_j)| \ge \ldots \ge |\hat{A}(\omega_N)|$ (3)

so that $\left\{ (\omega_l, \hat{A}(\omega_l)) \mid l \in [1, B] \right\} = \hat{R}^s$. Note that a signal may have several $B$-optimal Fourier representations if its frequency coefficient magnitudes are non-unique. For example, there are two 1-optimal sparse Fourier representations for the signal $A(j) = 2 e^{\frac{2\pi \mathrm{i} j}{N}} + 2 e^{\frac{4\pi \mathrm{i} j}{N}}$, $N > 2$. However, all $B$-optimal Fourier representations, $\hat{R}^s_{\mathrm{opt}}$, for any signal $A$ will always have both the same unique $\|R_{\mathrm{opt}}\|_2$ and $\|A - R_{\mathrm{opt}}\|_2$ values.

We continue with two final definitions: Let $\omega_b$ be a $b$th most energetic frequency as per Equation 3. We will say that a signal $\hat{A}$ is (algebraically) $p$-compressible for some $p > 1$ if $|\hat{A}(\omega_b)| = O(b^{-p})$ for all $b \in [1, N)$. If $\hat{R}^s_{\mathrm{opt}}$ is a $B$-optimal Fourier representation we can see that

$\|A - R_{\mathrm{opt}}\|_2^2 = \sum_{b=B}^{N-1} |\hat{A}(\omega_b)|^2 = O\left( \int_B^{\infty} b^{-2p} \, db \right) = O\left( B^{1-2p} \right).$ (4)

Hence, any $p$-compressible signal $A$ (i.e., any signal with a fixed $c \in \mathbb{R}$ so that $|\hat{A}(\omega_b)| \le c \cdot b^{-p}$ for all $b \in [1, N)$) will have $\|A - R^{\mathrm{opt}}_B\|_2^2 \le \tilde{c}_p \cdot B^{1-2p}$ for some $\tilde{c}_p \in \mathbb{R}$.
For any $p$-compressible signal class (i.e., for any choice of $p$ and $c$) we will refer to the related optimal $O(B^{1-2p})$-size worst case error value (i.e., Equation 4 above) as $\|C^{\mathrm{opt}}_B\|_2^2$. Similarly, we define an exponentially compressible (or exponentially decaying) signal for a fixed $\alpha$ to be one for which $|\hat{A}(\omega_b)| = O(2^{-\alpha b})$. The optimal worst case error is then

$\|C^{\mathrm{opt}}_B\|_2^2 = O\left( \int_B^{\infty} 4^{-\alpha b} \, db \right) = O\left( 4^{-\alpha B} \right).$ (5)

Fix $\delta$ small (e.g., $\delta = 0.1$). Given a compressible input signal, $A$, our deterministic Fourier algorithm will identify $B$ of the most energetic frequencies from $\hat{A}$ and approximate their coefficients to produce a Fourier representation $\hat{R}^s$ with $\|A - R\|_2^2 \le \|A - R_{\mathrm{opt}}\|_2^2 + \delta \|C^{\mathrm{opt}}_B\|_2^2$. These are the same types of compressible signal results proven by CM [6, 7].

3 Construction of Measurements

We will use the following types of subset collections to form our measurements:

Definition 1: A collection, $S$, of $s$ subsets of $[0, N)$ is called $K$-majority $k$-strongly selective if for any $X \subset [0, N)$ with $|X| \le k$, and for all $x \in X$, the following are true: (i) $x$ belongs to $K$ subsets in $S$, and (ii) more than two-thirds of the $S_j \in S$ containing $x$ are such that $S_j \cap X = \{x\}$ (i.e., every member of $X$ occurs separated from all other members of $X$ in more than two-thirds of the $K$ $S$-subsets it belongs to).

A 1-majority $k$-strongly selective collection of sets is an example of a $k$-strongly selective collection of sets [17, 26]. Note that a $K$-majority $k$-strongly selective collection of subsets contains many $k$-strongly selective collections of subsets (i.e., it has repeated strong selectivity). Thus, our newly defined $K$-majority $k$-strongly selective collections help us count how many times each small subset element is isolated.
This added structure allows a new reconstruction algorithm (Algorithm 1) with better algebraic compressibility properties than previous methods.

Next, we will build $O(\log N)$ $K$-majority $k$-strongly selective collections of subsets. Each of these $O(\log N)$ collections will ultimately be used to determine energetic frequencies modulo a small prime $< N$. These moduli will then be used along with the Chinese Remainder Theorem to reconstruct each energetic frequency in a manner akin to the introduction's motivating example. Our technique is motivated by the method of prime groupings first employed in [25].

To begin, we will denote each of the $O(\log N)$ collections of subsets by $S_l$ where $0 \le l \le O(\log N)$. We construct each of these $K$-majority $k$-strongly selective collections as follows: Define $p_0 = 1$ and let $p_1, p_2, \ldots, p_l, \ldots, p_m$ be the first $m$ primes, where $m$ is such that

$\prod_{l=1}^{m-1} p_l \le \frac{N}{k} \le \prod_{l=1}^{m} p_l.$

Hence, $p_l$ is the $l$th prime natural number and we have $p_0 = 1$, $p_1 = 2$, $p_2 = 3$, $p_3 = 5, \ldots, p_m = O(m \log m)$. Note that we know $p_m = O(m \log m)$ via the Prime Number Theorem, and so $p_m = O(\log N \log\log N)$. Each $p_l$ will correspond to a different $K$-majority $k$-strongly selective collection of subsets of $[0, N) = \{0, \ldots, N-1\}$. Along the same lines we let $q_1$ through $q_K$ be the first $K$ (to be specified later) consecutive primes such that

$\max(p_m, k) \le q_1 \le q_2 \le \ldots \le q_K.$

We are now ready to build $S_0$, our first $K$-majority $k$-strongly selective collection of sets. We begin by letting, for all $1 \le j \le K$ and $0 \le h \le q_j - 1$,

$S_{0,j,h} = \{ n \in [0, N) \mid n \equiv h \bmod q_j \}.$

Next, we progressively define $S_{0,j}$ to be all integer residues mod $q_j$, i.e., $S_{0,j} = \{ S_{0,j,h} \mid h \in [0, q_j) \}$, and conclude by setting $S_0$ equal to all $K$ such $q_j$ residue groups: $S_0 = \cup_{j=1}^{K} S_{0,j}$.
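Before stating the lemma that justifies this construction, here is a small empirical check (with illustrative parameters not taken from the paper; for simplicity the demo starts the $q$-primes at $k$ rather than $\max(p_m, k)$) that $S_0$ isolates every member of a small set $X$ in a better-than-two-thirds majority of its sets when $K = 3(k-1)\lfloor \log_k N \rfloor + 1$:

```python
import math

def primes_from(start, count):
    # First `count` primes >= start (trial division; fine for a small demo).
    ps, n = [], max(start, 2)
    while len(ps) < count:
        if all(n % p for p in range(2, math.isqrt(n) + 1)):
            ps.append(n)
        n += 1
    return ps

N, k = 10**6, 4
X = [5, 777, 123456, 999999]          # a small "support" set, |X| <= k
L = math.floor(math.log(N, k))        # any x != y collide mod at most L q-primes
K = 3 * (k - 1) * L + 1               # the choice of K justified by Lemma 1
q = primes_from(k, K)                 # K consecutive primes >= k

# x and its fellow X-members share a residue class mod q_j only when q_j | (x - y),
# which the CRT limits to at most (k-1)*L of the K primes, so > 2K/3 sets isolate x.
for x in X:
    isolated = sum(1 for qj in q
                   if all(y % qj != x % qj for y in X if y != x))
    assert isolated > 2 * K / 3
```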
More generally, we define $S_l$ for $0 \le l \le m$ as follows:

$S_l = \cup_{j=1}^{K} \left\{ \{ n \in [0, N) \mid n \equiv h \bmod p_l q_j \} \;\middle|\; h \in [0, p_l q_j) \right\}.$

Lemma 1: Fix $k$. If we set $K \ge 3(k-1)\lfloor \log_k N \rfloor + 1$ then $S_0$ will be a $K$-majority $k$-strongly selective collection of sets. Furthermore, if $K = O(k \log_k N)$ then $|S_0| = O\left( k^2 \log_k^2 N \cdot \max(\log k, \log\log_k N) \right)$.

Proof: Let $X \subset [0, N)$ be such that $|X| \le k$. Furthermore, let $x, y \in X$ be such that $x \ne y$. By the Chinese Remainder Theorem we know that $x$ and $y$ may only collide modulo at most $\lfloor \log_k N \rfloor$ of the $K$ $q$-primes $q_K \ge \ldots \ge q_1 \ge k$. Hence, $x$ may collide with all the other elements of $X$ (i.e., with $X - \{x\}$) modulo at most $(k-1)\lfloor \log_k N \rfloor$ $q$-primes. We can now see that $x$ will be isolated from all other elements of $X$ modulo at least

$K - (k-1)\lfloor \log_k N \rfloor \ge 2(k-1)\lfloor \log_k N \rfloor + 1 > \frac{2K}{3}$

$q$-primes. This leads us to the conclusion that $S_0$ is indeed $K$-majority $k$-strongly selective. Finally, we have that

$|S_0| \le \sum_{j=1}^{K} q_j \le K \cdot q_K.$

Furthermore, given that $K > \max(k, m)$, the Prime Number Theorem tells us that $q_K = O(K \log K)$. Thus, we can see that $S_0$ will indeed contain $O\left( k^2 \log_k^2 N \cdot \max(\log k, \log\log_k N) \right)$ sets. $\Box$

Note that at least $O(k \log_k N)$ primes are required in order to create a ($K$-majority) $k$-strongly separating collection of subsets using primes in this fashion. Given any $x \in [0, N)$, a $(k-1)$-element subset $X$ can be created via the Chinese Remainder Theorem and $x$'s moduli so that every element of $X$ collides with $x$ in any desired $O(\log_k N)$ $q$-primes.

We next consider the properties of the other $m$ collections we have defined: $S_1, \ldots, S_m$.

Lemma 2: Let $S_{l,j,h} = \{ n \in [0, N) \mid n \equiv h \bmod p_l q_j \}$, $X \subset [0, N)$ have $\le k$ elements, and $x \in X$. Furthermore, suppose that $S_{0,j,h} \cap X = \{x\}$. Then, for all $l \in [1, m]$, there exists a unique $b \in [0, p_l)$ so that $S_{l,j,h + b \cdot q_j} \cap X = \{x\}$.
Proof: Fix any $l \in [1, m]$. $S_{0,j,h} \cap X = \{x\}$ implies that $x = h + a \cdot q_j$ for some unique integer $a$. Using $a$'s unique representation modulo $p_l$ (i.e., $a = b + c \cdot p_l$) we get that $x = h + b \cdot q_j + c \cdot q_j p_l$. Hence, we can see that $x \in S_{l,j,h + b q_j}$. Furthermore, no other element of $X$ is in $S_{l,j,h + t \cdot q_j}$ for any $t \in [0, p_l)$, since its inclusion therein would imply that it was also an element of $S_{0,j,h}$. $\Box$

Note that Lemma 2 and Lemma 1 together imply that each of $S_1, \ldots, S_m$ is also a $K$-majority $k$-strongly separating collection of subsets. Also, we can see that if $x \in S_{l,j,h + b \cdot q_j}$ we can find $x \bmod p_l$ by simply computing $h + b q_j \bmod p_l$.

Finally, we form our measurement matrix. Set $S = \cup_{l=0}^{m} S_l$. To form our measurement matrix, $M$, we simply create one row for each $S_{l,j,h} \in S$ by computing the $N$-length characteristic function vector of $S_{l,j,h}$, denoted $\chi_{S_{l,j,h}}$. This leads to $M$ being a $\tilde{O}(k^2) \times N$ measurement matrix. Here we bound the number of rows in $M$ by noting that: (i) $|S| < m \cdot K \cdot p_m q_K$, (ii) $m = O(\log N)$, (iii) $p_m = O(\log N \cdot \log\log N)$, (iv) $K = O(k \log N)$, and (v) $q_K = O(K \log K)$.

4 Signal Reconstruction from Measurements

Let $\hat{A}$ be an $N$-length signal of complex numbers with its $N$ entries numbered 0 through $N - 1$. Our goal is to identify $B$ of the largest magnitude entries of $\hat{A}$ (i.e., the first $B$ entries in a valid ordering of $\hat{A}$ as in Equation 3) and then estimate their signal values. Toward this end, set

$\epsilon = \frac{|\hat{A}(\omega_B)|}{\sqrt{2}\, C}$ (6)

where $C > 1$ is a constant to be specified later, and let $B'$ be the smallest integer such that

$\sum_{b=B'}^{N-1} |\hat{A}(\omega_b)| < \frac{\epsilon}{2}.$ (7)
Algorithm 1 SPARSEAPPROXIMATE

1: Input: Signal $\hat{A}$, integers $B$, $B'$
2: Output: $\hat{R}^s$, a sparse representation for $\hat{A}$
3: Initialize $\hat{R}^s \leftarrow \emptyset$
4: Set $K = 3 B' \lfloor \log_{B'} N \rfloor + 1$
5: Form measurement matrix, $M$, via $K$-majority $B'$-strongly selective collections (Section 3)
6: Compute $M \cdot \hat{A}$

IDENTIFICATION
7: for $j$ from 1 to $K$ do
8:   Sort $\langle \chi_{S_{0,j,0}}, \hat{A} \rangle, \langle \chi_{S_{0,j,1}}, \hat{A} \rangle, \ldots, \langle \chi_{S_{0,j,q_j - 1}}, \hat{A} \rangle$ by magnitude
9:   for $b$ from 1 to $B' + 1$ do
10:     $k_{j,b} \leftarrow$ $b$th largest magnitude $\langle \chi_{S_{0,j,\cdot}}, \hat{A} \rangle$-measurement
11:     $r_{0,b} \leftarrow$ $k_{j,b}$'s associated residue mod $q_j$ (i.e., the $\cdot$ in $\langle \chi_{S_{0,j,\cdot}}, \hat{A} \rangle$)
12:     for $l$ from 1 to $m$ do
13:       $t_{\min} \leftarrow \arg\min_{t \in [0, p_l)} \left| k_{j,b} - \langle \chi_{S_{l,j,t \cdot q_j + r_{0,b}}}, \hat{A} \rangle \right|$
14:       $r_{l,b} \leftarrow r_{0,b} + t_{\min} \cdot q_j \bmod p_l$
15:     end for
16:     Construct $\omega_{j,b}$ from $r_{0,b}, \ldots, r_{m,b}$ via the Chinese Remainder Theorem
17:   end for
18: end for
19: Sort the $\omega_{j,b}$'s, maintaining duplicates, and set $C(\omega_{j,b})$ = the number of times $\omega_{j,b}$ was constructed via line 16

ESTIMATION
20: for $j$ from 1 to $K$ do
21:   for $b$ from 1 to $B' + 1$ do
22:     if $C(\omega_{j,b}) > \frac{2K}{3}$ then
23:       $C(\omega_{j,b}) \leftarrow 0$
24:       $x = \mathrm{median}\{ \mathrm{real}(k_{j',b'}) \mid \omega_{j',b'} = \omega_{j,b} \}$
25:       $y = \mathrm{median}\{ \mathrm{imag}(k_{j',b'}) \mid \omega_{j',b'} = \omega_{j,b} \}$
26:       $\hat{R}^s \leftarrow \hat{R}^s \cup \{ (\omega_{j,b}, x + \mathrm{i} y) \}$
27:     end if
28:   end for
29: end for
30: Output $B$ largest magnitude entries in $\hat{R}^s$

Note that $B'$ is defined to be the last possible significant frequency (i.e., with energy greater than a fraction of $|\hat{A}(\omega_B)|$). We expect to work with sparse/compressible signals, so that $B \le B' \ll N$. Later we will give specific values for $C$ and $B'$ depending on $B$, the desired approximation error, and $\hat{A}$'s compressibility characteristics. For now we show that we can identify/approximate $B$ of $\hat{A}$'s largest magnitude entries each to within $\epsilon$-precision via Algorithm 1.
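The coordinatewise medians in lines 24-25 are what make the estimates collision-tolerant: when more than two-thirds of a frequency's measurements are clean, the median lands among the clean values. A toy illustration with synthetic numbers (assumed for this sketch, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(1)
true_c = 2.0 - 1.0j                      # true coefficient of an isolated frequency

# 20 clean measurements (tiny noise) and 9 collision-contaminated ones (< 1/3 of 29).
clean = true_c + 0.01 * (rng.standard_normal(20) + 1j * rng.standard_normal(20))
bad = true_c + 10 * (rng.standard_normal(9) + 1j * rng.standard_normal(9))
meas = np.concatenate([clean, bad])

# Coordinatewise median, as in Algorithm 1 lines 24-26.
est = np.median(meas.real) + 1j * np.median(meas.imag)
assert abs(est - true_c) < 0.1           # the contaminated minority cannot move the median far
```

Because fewer than half of the 29 values are contaminated, both the real and imaginary medians necessarily fall inside the clean cluster.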
Algorithm 1 works by using $S_0$ measurements to separate $\hat{A}$'s significantly energetic frequencies $\Omega = \{\omega_1, \ldots, \omega_{B'}\} \subset [0, N)$. Every measurement which successfully separates an energetic frequency $\omega_j$ from all other members of $\Omega$ will both (i) provide a good (i.e., within $\frac{\epsilon}{2} \le \frac{|\hat{A}(\omega_B)|}{2\sqrt{2}}$) coefficient estimate for $\omega_j$, and (ii) yield information about $\omega_j$'s identity. Frequency separation occurs because our $S_0$ measurements cannot collide any fixed $\omega_j \in \Omega$ with any other member of $\Omega$ modulo more than $(B' - 1) \log_{B'} N$ $q$-primes (see Lemma 1). Therefore, more than two-thirds of $S_0$'s $3 B' \log_{B'} N + 1$ $q$-primes will isolate any fixed $\omega_j \in \Omega$. This means that our reconstruction algorithm will identify all frequencies at least as energetic as $\omega_B$ at least $2 B' \log_{B'} N + 1$ times. We can ignore any frequencies that are not recovered this often. On the other hand, for any frequency that is identified more than $2 B' \log_{B'} N$ times, at most $B' \log_{B'} N$ of the measurements which lead to this identification can be significantly contaminated via collisions with $\Omega$ members. Therefore, we can take a median of the more than $2 B' \log_{B'} N$ measurements leading to the recovery of each frequency as that frequency's coefficient estimate. Since more than half of these measurements must be accurate, the median will be accurate. The following theorem is proved in the appendix.

Algorithm 2 FOURIERMEASURE

1: Input: $f$-samples, integers $m$, $K$
2: Output: $\langle \chi_{S_{l,j,h}}, \hat{f} \rangle$-measurements
3: Zero an $O(q_K p_m)$-element array, $A$
4: for $j$ from 1 to $K$ do
5:   for $l$ from 1 to $m$ do
6:     $A \leftarrow f(0), f\!\left(\frac{2\pi}{q_j p_l}\right), \ldots, f\!\left(\frac{2\pi (q_j p_l - 1)}{q_j p_l}\right)$
7:     Calculate $\hat{A}$ via FFT
8:     $\langle \chi_{S_{l,j,h}}, \hat{f} \rangle \leftarrow \hat{A}(h)$ for each $h \in [0, q_j p_l)$
9:   end for
10: end for
11: Output $\langle \chi_{S_{l,j,h}}, \hat{f} \rangle$-measurements
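Each inner iteration of Algorithm 2 computes a length-$q_j p_l$ FFT whose $h$th bin equals the sum of $\hat{f}(\omega)$ over all $\omega \equiv h \bmod q_j p_l$ (this aliasing identity is derived in Section 5). A small check with toy coefficients (assumed for illustration):

```python
import numpy as np

# A frequency-sparse f with known Fourier coefficients (toy values).
coeffs = {3: 1.0, 17: 2.0, 44: -0.5}     # f_hat(omega) = c for these omega, else 0
f = lambda t: sum(c * np.exp(1j * w * t) for w, c in coeffs.items())

n = 14                                   # one q_j * p_l subsampling length
t = 2 * np.pi * np.arange(n) / n
bins = np.fft.fft(f(t)) / n              # (1/n) * sum_k f(2*pi*k/n) e^{-2*pi*i*h*k/n}

for h in range(n):
    aliased = sum(c for w, c in coeffs.items() if w % n == h)
    assert np.isclose(bins[h], aliased)  # bin h = sum of f_hat over w ≡ h (mod n)
# Frequencies 3 and 17 collide in bin 3 (17 ≡ 3 mod 14), while 44 sits alone in bin 2.
```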
Theorem 2: Let $\hat{R}_{\mathrm{opt}}$ be a $B$-optimal Fourier representation for our input signal $A$. Then, the $B$-term representation $\hat{R}^s$ returned from Algorithm 1 is such that

$\|A - R\|_2^2 \le \|A - R_{\mathrm{opt}}\|_2^2 + \frac{6 B \cdot |\hat{A}(\omega_B)|^2}{C}.$

Furthermore, Algorithm 1's Identification and Estimation (lines 7-30) run time is $O(B'^2 \log^4 N)$. The number of measurements used is $O(B'^2 \log^6 N)$.

Theorem 2 immediately indicates that Algorithm 1 gives us a deterministic $O(B^2 \log^6 N)$-measurement, $O(B^2 \log^4 N)$-reconstruction time method for exactly recovering $B$-support vectors. If $\hat{A}$ is a $B$-support vector, then setting $B' = B + O(1)$ and $C = 1$ will be sufficient to guarantee that both $|\hat{A}(\omega_B)|^2 = 0$ and $\sum_{b=B'}^{N-1} |\hat{A}(\omega_b)| = 0$ are true. Hence, we may apply Theorem 2 with $B' = B + O(1)$ and $C = 1$ to obtain a perfect reconstruction via Algorithm 1. However, we are mainly interested in the more realistic cases where $\hat{A}$ is either algebraically or exponentially compressible. The following theorem (proved in the appendix) presents itself.

Theorem 3: Let $\hat{A}$ be $p$-compressible. Then, Algorithm 1 can return a $B$-term sparse representation $\hat{R}^s$ with

$\|A - R\|_2^2 \le \|A - R_{\mathrm{opt}}\|_2^2 + \delta \|C^{\mathrm{opt}}_B\|_2^2$

using $O\!\left( B^{\frac{2p}{p-1}} \delta^{\frac{2}{1-p}} \log^4 N \right)$ total identification/estimation time and $O\!\left( B^{\frac{2p}{p-1}} \delta^{\frac{2}{1-p}} \log^6 N \right)$ measurements. If $\hat{A}$ decays exponentially, Algorithm 1 can return a $B$-term sparse representation, $\hat{R}^s$, with $\|A - R\|_2^2 \le \|A - R_{\mathrm{opt}}\|_2^2 + \delta \|C^{\mathrm{opt}}_B\|_2^2$ using both $\left( B^2 + \frac{\log^2 \delta^{-1}}{\alpha} \right) \cdot \mathrm{polylog}(N)$ measurements and identification/estimation time.

For $p$-compressible signals, $p > 2$, CM's algorithm [6, 7] takes $O\!\left( B^{\frac{6p}{p-2}} \delta^{\frac{6}{2-p}} \log^6 N \right)$ identification/estimation time and $O\!\left( B^{\frac{4p}{p-2}} \delta^{\frac{4}{2-p}} \log^4 N \right)$ measurements to achieve the same error bound.
As a concrete comparison, CM's algorithm requires $O(B^{18} \delta^{-6} \log^6 N)$ identification/estimation time and $O(B^{12} \delta^{-4} \log^4 N)$ measurements for 3-compressible signals. Algorithm 1, on the other hand, requires only $O(B^3 \delta^{-1} \log^4 N)$ identification/estimation time and $O(B^3 \delta^{-1} \log^6 N)$ measurements. Hence, we have improved on CM's algebraic compressibility results. All that is left to do in order to develop a deterministic sub-linear time Fourier algorithm is to compute our CS Fourier measurements (Algorithm 1, lines 1-6) in sub-linear time.

5 Sub-linear Time Fourier Measurement Acquisition

Our goal in this section is to demonstrate how to use Algorithm 1 as a means to approximate the Fourier transform of a signal/function $f : [0, 2\pi] \to \mathbb{C}$, where (i) $f$ has an integrable $p$th derivative, and (ii) $f(0) = f(2\pi), f'(0) = f'(2\pi), \ldots, f^{(p-2)}(0) = f^{(p-2)}(2\pi)$. In this case we know the Fourier coefficients of $f$ to be $p$-compressible [3, 12]. Hence, for $N = q_1 \cdot p_1 \cdots p_m$ sufficiently large, if we can collect the necessary Algorithm 1 (lines 5 and 6) measurements in sub-linear time we will indeed be able to use Algorithm 1 as a sub-linear time Fourier algorithm for $f$.

Note that in order to validate the use of Algorithm 1 (or any other sparse approximate Fourier Transform method [15, 16]) we must assume that $f$ exhibits some multi-scale behavior. If $\hat{f}$ contains no unpredictably energetic large (relative to the number of desired Fourier coefficients) frequencies, then it is more computationally efficient to simply use standard FFT/USFFT methods [5, 22, 1, 10, 11]. The responsible user, therefore, is not entirely released from the obligation to consider $\hat{f}$'s likely characteristics before proceeding with computations.

Choose any Section 3 $q$-prime $q_j$, $j \in [1, K]$, and any $p$-prime $p_l$ with $l \in [0, m]$.
Furthermore, pick $h \in [0, q_j p_l)$. Throughout the rest of this discussion we will consider $f$ to be accessible to sampling at any desired predetermined positions $t \in [0, 2\pi]$. Given this assumption we may sample $f$ at $t = 0, \frac{2\pi}{q_j p_l}, \ldots, \frac{2\pi (q_j p_l - 1)}{q_j p_l}$ in order to perform the following DFT computation:

$\langle \chi_{S_{l,j,h}}, \hat{f} \rangle = \frac{1}{q_j p_l} \sum_{k=0}^{q_j p_l - 1} f\!\left( \frac{2\pi k}{q_j p_l} \right) e^{\frac{-2\pi \mathrm{i} h k}{q_j p_l}}.$

Via aliasing [3] this reduces to

$\frac{1}{q_j p_l} \sum_{k=0}^{q_j p_l - 1} f\!\left( \frac{2\pi k}{q_j p_l} \right) e^{\frac{-2\pi \mathrm{i} h k}{q_j p_l}} = \frac{1}{q_j p_l} \sum_{k=0}^{q_j p_l - 1} \sum_{\omega=-\infty}^{\infty} \hat{f}(\omega)\, e^{\frac{2\pi \mathrm{i} \omega k}{q_j p_l}} e^{\frac{-2\pi \mathrm{i} h k}{q_j p_l}} = \frac{1}{q_j p_l} \sum_{\omega=-\infty}^{\infty} \hat{f}(\omega) \sum_{k=0}^{q_j p_l - 1} e^{\frac{2\pi \mathrm{i} (\omega - h) k}{q_j p_l}} = \sum_{\omega \equiv h \bmod q_j p_l} \hat{f}(\omega).$

Using Sections 3 and 4 we can see that these measurements are exactly what we need in order to determine $B$ of the most energetic frequencies of $\hat{f}$ modulo $N = q_1 \cdot p_1 \cdots p_m$ (i.e., $B$ of the most energetic frequencies of $f$'s band-limited interpolant's DFT).

We are now in the position to modify Algorithm 1 in order to find a sparse Fourier representation for $\hat{f}$. To do so we proceed as follows: First, remove lines 5 and 6 and replace them with Algorithm 2 for computing all the necessary $\langle \chi_{S_{l,j,h}}, \hat{f} \rangle$-measurements. Second, replace each "$\langle \chi_{S_{l,j,h}}, \hat{A} \rangle$" by "$\langle \chi_{S_{l,j,h}}, \hat{f} \rangle$" in Algorithm 1's Identification section. It remains to show that these Algorithm 1 modifications indeed yield a sub-linear time approximate Fourier transform. The following theorem presents itself (see appendix for proof):

Theorem 4: Let $f : [0, 2\pi] \to \mathbb{C}$ have (i) an integrable $p$th derivative, and (ii) $f(0) = f(2\pi), f'(0) = f'(2\pi), \ldots, f^{(p-2)}(0) = f^{(p-2)}(2\pi)$ for some $p > 1$. Furthermore, assume that $\hat{f}$'s $B' = O\!\left( B^{\frac{2p}{p-1}} \epsilon^{\frac{2}{1-p}} \right)$ largest magnitude frequencies all belong to $\left( -\left\lceil \frac{N}{2} \right\rceil, \left\lfloor \frac{N}{2} \right\rfloor \right]$.
Then, we may use Algorithm 1 to return a $B$-term sparse Fourier representation, $\hat{R}^s$, for $\hat{f}$ such that

$\|\hat{f} - \hat{R}\|_2^2 \le \|\hat{f} - \hat{R}_{\mathrm{opt}}\|_2^2 + \delta \|C^{\mathrm{opt}}_B\|_2^2$

using $O\!\left( B^{\frac{2p}{p-1}} \delta^{\frac{2}{1-p}} \log^7 N \right)$-time and $O\!\left( B^{\frac{2p}{p-1}} \delta^{\frac{2}{1-p}} \log^6 N \right)$-measurements from $f$.

If $f : [0, 2\pi] \to \mathbb{C}$ is smooth (i.e., has infinitely many continuous derivatives on the unit circle where 0 is identified with $2\pi$) it follows from Theorem 4 that Algorithm 1 can be used to find a $\delta$-accurate, with $\delta = O\left( \frac{1}{N} \right)$, sparse $B$-term Fourier representation for $\hat{f}$ using $\tilde{O}(B^2)$-time/measurements. This result differs from previous sub-linear time Fourier algorithms [15, 16] in that both the algorithm and the measurements/samples it requires are deterministic. Recall that the deterministic nature of the algorithm's required samples is potentially beneficial for failure-intolerant hardware. In signal processing applications the sub-Nyquist sampling required to compute Algorithm 1's $\langle \chi_{S_{l,j,h}}, \hat{f} \rangle$-measurements could be accomplished via $\tilde{O}(B)$ parallel low-rate analog-to-digital converters.

5.1 DFT from Inaccessible Signal Samples

Throughout the remainder of this section we will consider our $N$-length compressible vector $\hat{A}$ to be the product of the $N \times N$ DFT matrix, $\Psi$, and a non-sparse $N$-length vector $A$. Thus, $\hat{A} = \Psi A$. Furthermore, we will assume that $A$ contains equally spaced samples from some unknown smooth function $f : [0, 2\pi] \to \mathbb{C}$ (e.g., $A$'s band-limited interpolant). Hence,

$A(j) = f\!\left( \frac{2\pi j}{N} \right), \quad j \in [0, N).$

We would like to use our modified Algorithm 1 along with Algorithm 2 to find a sparse Fourier representation for $\hat{A}$. However, unless $\frac{N}{q_j p_l} \in \mathbb{N}$ for all $q_j p_l$-pairs (which would imply $f$ had been grossly oversampled), $A$ will not contain all the $f$-samples required by Algorithm 2.
Not having access to $f$ directly, and restricting ourselves to sub-linear time approaches only, we have little recourse but to locally interpolate $f$ around our required samples. For each required Algorithm 2 $f$-sample at $t = \frac{2\pi h}{q_j p_l}$, $h \in [0, q_j p_l)$, we may approximate $f(t)$ to within $O(N^{-2\kappa})$-error by constructing two local interpolants (one real, one imaginary) around $t$ using $A$'s nearest $2\kappa$ entries [14]. These errors in $f$-samples can lead to errors of size $O(N^{-2\kappa} \cdot p_m q_K \log p_m q_K)$ in our $\langle \chi_{S_l,j,h}, \hat{f} \rangle$ calculations. However, as long as the $\langle \chi_{S_l,j,h}, \hat{f} \rangle$-measurement errors are small enough (i.e., of size $O(\delta \cdot B^{-p})$ in the $p$-compressible case), Theorem 4 and all related Section 4 results will still hold. Using the proofs of Theorems 2 and 3 along with some scratch work we can see that using $2\kappa = O(\log \delta^{-1} + p)$ interpolation points per $f$-sample ensures all our $\langle \chi_{S_l,j,h}, \hat{f} \rangle$-measurement errors are $O(\delta \cdot B^{-p})$. We have the following result:

Theorem 5 Let $\hat{A} = \Psi A$ be $p$-compressible. Then, we may use Algorithms 1 and 2 to return a $B$ term sparse representation, $\hat{R}_s$, for $\hat{A}$ such that
$$\|A - R\|_2^2 \le \|A - R_{\mathrm{opt}}\|_2^2 + \delta \|C^{\mathrm{opt}}_B\|_2^2$$
using $\tilde{O}\big(B^{\frac{2p}{p-1}} \delta^{\frac{2}{1-p}} (\log \delta^{-1} + p)^2\big)$-time and $\tilde{O}\big(B^{\frac{2p}{p-1}} \delta^{\frac{2}{1-p}} (\log \delta^{-1} + p)\big)$-samples from $A$.

Notice that Theorem 5 no longer guarantees a $\delta = O(\frac{1}{N})$-accurate $\tilde{O}(B^2)$-time DFT algorithm for smooth data (i.e., $A$'s containing samples from a smooth function $f$). This is because as $p \to \infty$ we require an increasingly large number of interpolation points per $f$-sample in order to guarantee our $\langle \chi_{S_l,j,h}, \hat{f} \rangle$-measurements remain $O(\delta \cdot B^{-p})$-accurate. However, for $\delta = O(\log^{-1} N)$, we can still consider smooth data $A$ to be $O(\log N)$-compressible and so achieve a $\tilde{O}(B^2)$-time DFT algorithm.
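The local interpolation step above can be sketched as follows. This is illustrative only: the helper `local_interp`, the particular $\kappa$, and the use of `numpy.polyfit` are our own choices, not the specific procedure of [14]; we fit separate real and imaginary polynomial interpolants through the $2\kappa$ grid samples nearest an off-grid point $t$:

```python
import numpy as np

def local_interp(A, N, t, kappa):
    """Approximate f(t) from the equally spaced samples A(j) = f(2*pi*j/N)
    using polynomial fits through the nearest 2*kappa samples."""
    j0 = int(round(t * N / (2 * np.pi)))       # grid index nearest to t
    idx = np.arange(j0 - kappa, j0 + kappa)    # 2*kappa neighboring indices
    ts = 2 * np.pi * idx / N                   # their sample positions
    vals = A[idx % N]                          # wrap indices: f is 2*pi-periodic
    # Two interpolants (one real, one imaginary), centered at t for numerical
    # stability, each evaluated at t (i.e., at 0 after centering).
    re = np.polyval(np.polyfit(ts - t, vals.real, 2 * kappa - 1), 0.0)
    im = np.polyval(np.polyfit(ts - t, vals.imag, 2 * kappa - 1), 0.0)
    return re + 1j * im

# Example: recover an off-grid sample of f(t) = e^{3it} from N grid samples.
N, kappa = 64, 4
A = np.exp(1j * 3 * (2 * np.pi * np.arange(N) / N))
approx = local_interp(A, N, 0.3, kappa)
```

With $2\kappa = 8$ points at spacing $2\pi/N$, the interpolation error for a smooth $f$ is far below the measurement-error budget discussed above.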
6 Conclusion

Compressed Sensing (CS) methods provide algorithms for approximating the result of any large matrix multiplication as long as it is known in advance that the result will be sparse/compressible. Hence, CS is potentially valuable for many numerical applications such as those involving multi-scale aspects [8, 18]. In this paper we used CS methods to develop the first known deterministic sub-linear time sparse Fourier transform algorithm. In the process, we introduced a new deterministic Compressed Sensing algorithm along the lines of Cormode and Muthukrishnan (CM) [6, 7]. Our new deterministic CS algorithm improves on CM's algebraic compressibility results while simultaneously maintaining their results concerning exponential compressibility.

Compressed Sensing is closely related to hashing methods, combinatorial group testing, and many other algorithmic problems [25, 13]. Thus, $K$-majority $k$-strongly selective collections of sets and Algorithm 1 should help improve results concerning algebraically compressible (at each moment in time) stream hashing/heavy-hitter identification. Further development of these and other algorithmic applications is left as future work.

It is also worthwhile to note that Monte Carlo Fourier results similar to those of [16] may be obtained by altering our measurement construction in Section 3. If we construct our $S_l$ collections by using only a small subset of randomly chosen $q_j$'s, we will still locate all sufficiently energetic entries of $\hat{A}$ with high probability. The entries' coefficients can then be approximated by standard USFFT techniques [16, 10, 11, 22].

7 Acknowledgments

We would like to thank Graham Cormode and S. Muthukrishnan for answering questions about their work. We would also like to thank Martin Strauss, Anna Gilbert, Joel Lepak, and Hualong Feng for helpful discussions, advice, and comments.

References

[1] C.
Anderson and M. D. Dahleh. Rapid computation of the discrete Fourier transform. SIAM J. Sci. Comput., 17:913–919, 1996.

[2] L. I. Bluestein. A linear filtering approach to the computation of the discrete Fourier transform. IEEE Transactions on Audio and Electroacoustics, 18:451–455, 1970.

[3] J. P. Boyd. Chebyshev and Fourier Spectral Methods. Dover Publications, Inc., 2001.

[4] E. Candes, J. Romberg, and T. Tao. Robust uncertainty principles: Exact signal reconstruction from highly incomplete frequency information. IEEE Trans. Inform. Theory, 52:489–509, 2006.

[5] J. Cooley and J. Tukey. An algorithm for the machine calculation of complex Fourier series. Math. Comput., 19:297–301, 1965.

[6] G. Cormode and S. Muthukrishnan. Combinatorial algorithms for Compressed Sensing. Technical Report DIMACS TR 2005-40, 2005.

[7] G. Cormode and S. Muthukrishnan. Combinatorial algorithms for Compressed Sensing. Conference on Information Sciences and Systems, March 2006.

[8] I. Daubechies, O. Runborg, and J. Zou. A sparse spectral method for homogenization multiscale problems. Multiscale Model. Sim., 2007.

[9] R. A. DeVore. Deterministic constructions of compressed sensing matrices. http://www.ima.umn.edu/2006-2007/ND6.4-15.07/activities/DeVore-Ronald/Henrykfinal.pdf, 2007.

[10] A. Dutt and V. Rokhlin. Fast Fourier transforms for nonequispaced data. SIAM J. Sci. Comput., 14:1368–1383, 1993.

[11] J. A. Fessler and B. P. Sutton. Nonuniform fast Fourier transforms using min-max interpolation. IEEE Trans. Signal Proc., 51:560–574, 2003.

[12] G. B. Folland. Fourier Analysis and Its Applications. Brooks/Cole Publishing Company, 1992.

[13] S. Ganguly and A. Majumder. CR-precis: A deterministic summary structure for update data streams. ArXiv Computer Science e-prints, Sept. 2006.

[14] C. F. Gerald and P. O. Wheatley. Applied Numerical Analysis.
Addison-Wesley Publishing Company, 1994.

[15] A. Gilbert, S. Guha, P. Indyk, S. Muthukrishnan, and M. Strauss. Near-optimal sparse Fourier estimation via sampling. ACM STOC, pages 152–161, 2002.

[16] A. Gilbert, S. Muthukrishnan, and M. Strauss. Improved time bounds for near-optimal sparse Fourier representations. SPIE, 2005.

[17] P. Indyk. Explicit constructions of selectors and related combinatorial structures, with applications. In SODA '02: Proceedings of the thirteenth annual ACM-SIAM symposium on Discrete algorithms, pages 697–704, Philadelphia, PA, USA, 2002. Society for Industrial and Applied Mathematics.

[18] M. A. Iwen. Unpublished results. http://www-personal.umich.edu/~markiwen/.

[19] M. A. Iwen, A. C. Gilbert, and M. J. Strauss. Empirical evaluation of a sub-linear time sparse DFT algorithm. Submitted for publication, 2007.

[20] S. Kirolos, J. Laska, M. Wakin, M. Duarte, D. Baron, T. Ragheb, Y. Massoud, and R. Baraniuk. Analog-to-information conversion via random demodulation. Proc. IEEE Dallas Circuits and Systems Conference, 2006.

[21] J. Laska, S. Kirolos, Y. Massoud, R. Baraniuk, A. Gilbert, M. Iwen, and M. Strauss. Random sampling for analog-to-information conversion of wideband signals. Proc. IEEE Dallas Circuits and Systems Conference, 2006.

[22] J.-Y. Lee and L. Greengard. The type 3 nonuniform FFT and its applications. J. Comput. Phys., 206:1–5, 2005.

[23] M. Lustig, D. Donoho, and J. Pauly. Sparse MRI: The application of compressed sensing for rapid MR imaging. Submitted for publication, 2007.

[24] R. Maleh, A. C. Gilbert, and M. J. Strauss. Signal recovery from partial information via orthogonal matching pursuit. IEEE Int. Conf. on Image Processing, 2007.

[25] S. Muthukrishnan. Data streams: Algorithms and applications. Foundations and Trends in Theoretical Computer Science, 1, 2005.

[26] S. Muthukrishnan.
Some algorithmic problems and results in Compressed Sensing. Allerton Conference, 2006.

[27] L. Rabiner, R. Schafer, and C. Rader. The chirp $z$-transform algorithm. IEEE Transactions on Audio and Electroacoustics, AU-17(2):86–92, June 1969.

[28] J. Tropp and A. Gilbert. Signal recovery from partial information via orthogonal matching pursuit. Submitted for publication, 2005.

A Proof of Theorem 2

We begin by proving two lemmas.

Lemma 3 IDENTIFICATION: Lines 7 through 19 of Algorithm 1 are guaranteed to recover all valid $\omega_1, \dots, \omega_B$ (i.e., all $\omega$ with $|\hat{A}(\omega)|^2 \ge |\hat{A}(\omega_B)|^2$; there may be more than $B$ such entries) more than $\frac{2K}{3}$ times. Hence, despite line 22, an entry for each such $\omega_b$, $1 \le b \le B$, will be added to $\hat{R}_s$ in line 26.

Proof: Because of the construction of $S_0$ (i.e., the proof of Lemma 1) we know that for each $b \in [1, B]$ there exist more than $\frac{2K}{3}$ subsets $S \in S_0$ such that $S \cap \{\omega_{b'} \mid b' \in [1, B']\} = \{\omega_b\}$. Choose any $b \in [1, B]$. Denote the $q$-primes that isolate $\omega_b$ from all of $\omega_1, \dots, \omega_{b-1}, \omega_{b+1}, \dots, \omega_{B'}$ by $q_{j_1}, q_{j_2}, \dots, q_{j_{K'}}$, where $\frac{2K}{3} < K' \le K$. We next show that, for each $k' \in [1, K']$, we get $\langle \chi_{S_0,j_{k'},\omega_b \bmod q_{j_{k'}}}, \hat{A} \rangle$ as one of the $B' + 1$ largest magnitude $\langle \chi_{S_0,j_{k'},\cdot}, \hat{A} \rangle$-measurements identified in line 10. Choose any $k' \in [1, K']$. We know that
$$\frac{\epsilon}{2} < \frac{\epsilon}{\sqrt{2}} < |\hat{A}(\omega_B)| - \sqrt{2} \sum_{b' = B'}^{N-1} |\hat{A}(\omega_{b'})| \le |\hat{A}(\omega_b)| - \Bigg| \sum_{\substack{b' \in [B', N) \\ \omega_{b'} \equiv \omega_b \bmod q_{j_{k'}}}} \hat{A}(\omega_{b'}) \Bigg| \le \Big| \big\langle \chi_{S_0,j_{k'},\omega_b \bmod q_{j_{k'}}}, \hat{A} \big\rangle \Big|.$$
We also know that the $(B' + 1)$st largest measurement $L^2$-magnitude must be $< \frac{\epsilon}{2}$. Hence, we are guaranteed to execute lines 12–15 with an $r_{0,\cdot} = \omega_b \bmod q_{j_{k'}}$. Choose any $l \in [1, m]$ and set
$$\Omega = \Big\{ \omega_{b'} \;\Big|\; b' \in [B', N),\ \omega_{b'} \equiv \omega_b \bmod q_{j_{k'}},\ \omega_{b'} \not\equiv \omega_b \bmod q_{j_{k'}} p_l \Big\}.$$
Using Lemma 2 we can see that line 13 inspects all the necessary residues of $\omega_b \bmod p_l q_{j_{k'}}$. To see that $t_{\min}$ will be chosen correctly we note first that
$$\Big| \big\langle \chi_{S_0,j_{k'},\omega_b \bmod q_{j_{k'}}}, \hat{A} \big\rangle - \big\langle \chi_{S_0,j_{k'},\omega_b \bmod p_l q_{j_{k'}}}, \hat{A} \big\rangle \Big| = \Bigg| \sum_{\omega_{b'} \in \Omega} \hat{A}(\omega_{b'}) \Bigg| \le \sqrt{2} \sum_{\omega_{b'} \in \Omega} |\hat{A}(\omega_{b'})|.$$
Furthermore, setting $r_{0,\cdot} = \omega_b \bmod q_{j_{k'}}$ and
$$\Omega' = \Big\{ \omega_{b'} \;\Big|\; b' \in [B', N),\ \omega_{b'} \equiv \omega_b \bmod q_{j_{k'}},\ \omega_{b'} \equiv (r_{0,\cdot} + t q_{j_{k'}}) \bmod q_{j_{k'}} p_l \text{ for some } t \text{ with } (r_{0,\cdot} + t q_{j_{k'}}) \not\equiv \omega_b \bmod q_{j_{k'}} p_l \Big\},$$
we have
$$\sqrt{2} \sum_{\omega_{b'} \in \Omega} |\hat{A}(\omega_{b'})| < \frac{\epsilon}{\sqrt{2}} < |\hat{A}(\omega_B)| - \sqrt{2} \sum_{b' = B'}^{N-1} |\hat{A}(\omega_{b'})| \le |\hat{A}(\omega_b)| - \Bigg| \sum_{\omega_{b'} \in \Omega'} \hat{A}(\omega_{b'}) \Bigg|.$$
Finally, we can see that
$$|\hat{A}(\omega_b)| - \Bigg| \sum_{\omega_{b'} \in \Omega'} \hat{A}(\omega_{b'}) \Bigg| \le \Big| \big\langle \chi_{S_0,j_{k'},\omega_b \bmod q_{j_{k'}}}, \hat{A} \big\rangle - \big\langle \chi_{S_0,j_{k'},(r_{0,\cdot} + t q_{j_{k'}}) \bmod p_l q_{j_{k'}}}, \hat{A} \big\rangle \Big|$$
for every $t$ with $(r_{0,\cdot} + t q_{j_{k'}}) \not\equiv \omega_b \bmod p_l q_{j_{k'}}$. Hence, lines 13 and 14 will indeed select the correct residue for $\omega_b$ modulo $p_l$. Therefore, line 16 will correctly reconstruct $\omega_b$ at least $K' > \frac{2K}{3}$ times. ✷

Lemma 4 ESTIMATION: Every $(\omega, \tilde{\hat{A}}_\omega)$ stored in $\hat{R}_s$ in line 27 is such that $|\hat{A}(\omega) - \tilde{\hat{A}}_\omega|_2 < \epsilon$.

Proof: Suppose that $(\omega, \tilde{\hat{A}}_\omega)$ is stored in $\hat{R}_s$. This only happens if $\hat{A}(\omega)$ has been estimated by
$$\big\langle \chi_{S_0,j,\omega \bmod q_j}, \hat{A} \big\rangle = \sum_{\tilde{\omega} \equiv \omega \bmod q_j} \hat{A}(\tilde{\omega})$$
for more than $\frac{2K}{3}$ $q_j$-primes. The only way that any such estimate can have $|\hat{A}(\omega) - \langle \chi_{S_0,j,\omega \bmod q_j}, \hat{A} \rangle|_1 \ge \frac{\epsilon}{\sqrt{2}}$ is if $\omega$ collides with one of $\omega_1, \dots, \omega_{B'}$ modulo $q_j$ (this is due to the definition of $B'$ in Equation 7). By the proof of Lemma 1 we know this can happen at most $B' \lfloor \log_{B'} N \rfloor < \frac{K}{3}$ times. Hence, more than half of the $\frac{2K}{3}$ estimates, $\tilde{\hat{A}}'_\omega$, must be such that $|\hat{A}(\omega) - \tilde{\hat{A}}'_\omega|_1 < \frac{\epsilon}{\sqrt{2}}$. It follows that taking medians as per lines 24 and 25 will result in the desired $\epsilon$-accurate estimate for $\hat{A}(\omega)$. ✷

We are now ready to prove Theorem 2.
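The median argument in Lemma 4 can be illustrated with a small numerical toy (ours, not the paper's code; the particular error values are arbitrary): as long as a majority of the repeated estimates of a coefficient are accurate, the remaining estimates, however badly corrupted by collisions, cannot move the coordinate-wise medians far from the true value.

```python
import numpy as np

# Five estimates of a true coefficient 2 - 1j: three are accurate, while
# two are badly corrupted (modeling collisions with energetic frequencies).
true_val = 2.0 - 1.0j
good = [true_val + 0.01, true_val - 0.02j, true_val + 0.01 + 0.01j]
bad = [true_val + 5.0, true_val - 7.0j]
estimates = np.array(good + bad)

# Median the real and imaginary parts separately, as in lines 24 and 25
# of Algorithm 1: since more than half of the estimates are accurate,
# neither median can be pulled away from the truth by the corrupted ones.
median_est = np.median(estimates.real) + 1j * np.median(estimates.imag)
```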
Theorem 2 Let $\hat{R}_{\mathrm{opt}}$ be a $B$-optimal Fourier representation for our input signal $\hat{A}$. Then, the $B$ term representation $\hat{R}_s$ returned from Algorithm 1 is such that
$$\|A - R\|_2^2 \le \|A - R_{\mathrm{opt}}\|_2^2 + \frac{6B \cdot |\hat{A}(\omega_B)|^2}{C}.$$
Furthermore, Algorithm 1's Identification and Estimation (lines 7–30) run time is $O(B'^2 \log^4 N)$. The number of measurements used is $O(B'^2 \log^6 N)$.

Proof: Choose any $b \in (0, B]$. Using Lemmas 3 and 4 we can see that the only way some $\omega_b \notin \hat{R}^B_s$ is if there exists some associated $b' \in (B, N)$ so that $\omega_{b'} \in \hat{R}_s$ and
$$|\hat{A}(\omega_B)| + \epsilon \ge |\hat{A}(\omega_{b'})| + \epsilon > |\tilde{\hat{A}}_{\omega_{b'}}| \ge |\tilde{\hat{A}}_{\omega_b}| > |\hat{A}(\omega_b)| - \epsilon \ge |\hat{A}(\omega_B)| - \epsilon.$$
In this case we'll have $2\epsilon > |\hat{A}(\omega_b)| - |\hat{A}(\omega_{b'})| \ge 0$, so that
$$|\hat{A}(\omega_{b'})|^2 + 4\epsilon\big(\epsilon + |\hat{A}(\omega_B)|\big) \ge |\hat{A}(\omega_{b'})|^2 + 4\epsilon\big(\epsilon + |\hat{A}(\omega_{b'})|\big) > |\hat{A}(\omega_b)|^2. \quad (8)$$
Now using Lemma 4 we can see that
$$\|\hat{A} - \hat{R}\|_2^2 = \sum_{(\omega,\cdot) \notin \hat{R}_s} |\hat{A}(\omega)|^2 + \sum_{(\omega, \tilde{\hat{A}}_\omega) \in \hat{R}_s} |\hat{A}(\omega) - \tilde{\hat{A}}_\omega|^2 < \sum_{(\omega,\cdot) \notin \hat{R}_s} |\hat{A}(\omega)|^2 + B \cdot \epsilon^2.$$
Furthermore, we have
$$B \cdot \epsilon^2 + \sum_{(\omega,\cdot) \notin \hat{R}_s} |\hat{A}(\omega)|^2 = B \cdot \epsilon^2 + \sum_{\substack{b \in (0, B] \\ \omega_b \notin \hat{R}_s}} |\hat{A}(\omega_b)|^2 + \sum_{\substack{b' \in (B, N) \\ \omega_{b'} \notin \hat{R}_s}} |\hat{A}(\omega_{b'})|^2.$$
Using observation (8) above we can see that this last expression is bounded above by
$$B \cdot \big(5\epsilon^2 + 4\epsilon |\hat{A}(\omega_B)|\big) + \sum_{\substack{b' \in [B, N) \\ \omega_{b'} \in \hat{R}_s}} |\hat{A}(\omega_{b'})|^2 + \sum_{\substack{b' \in (B, N) \\ \omega_{b'} \notin \hat{R}_s}} |\hat{A}(\omega_{b'})|^2 \le \|\hat{A} - \hat{R}_{\mathrm{opt}}\|_2^2 + B \cdot \big(5\epsilon^2 + 4\epsilon |\hat{A}(\omega_B)|\big).$$
Substituting for $\epsilon$ (see Equation 6) gives us our result. Namely,
$$B \cdot \big(5\epsilon^2 + 4\epsilon |\hat{A}(\omega_B)|\big) = \frac{B |\hat{A}(\omega_B)|^2}{C} \left( \frac{5}{2C} + 2\sqrt{2} \right) < \frac{6B |\hat{A}(\omega_B)|^2}{C}.$$
We next focus on run time. Algorithm 1's Identification (i.e., lines 7 through 19) run time is dominated by the $O(KB'm)$ executions of line 13. And, each execution of line 13 takes $O(p_m)$ time.
Hence, given that $m = O(\log N)$, $p_m = O(\log N \cdot \log\log N)$, and $K = O(B' \log_{B'} N)$, we can see that Identification requires $O(B'^2 \cdot \log_{B'} N \cdot \log^2 N \cdot \log\log N)$-time. Continuing, Algorithm 1's Estimation (i.e., lines 20 through 30) run time is ultimately determined by line 22's IF-statement. Although line 22 is executed $O(KB') = O(B'^2 \log_{B'} N)$ times, it can only evaluate to true $O(B')$ times. Hence, each line 24/25 $O(B' \log_{B'} N \log B')$-time median operation will be evaluated at most $O(B')$ times. The resulting Estimation runtime is therefore $O(B'^2 \log_{B'} N \log B')$. To bound the number of measurements we recall that: (i) the number of measurements is $< m \cdot K \cdot p_m q_K$, (ii) $m = O(\log N)$, (iii) $p_m = O(\log N \cdot \log\log N)$, (iv) $K = O(B' \log N)$, and (v) $q_K = O(K \log K)$. Hence, the number of measurements is $O\big(K^2 \log K \log^2 N \log\log N\big)$. Substituting for $K$ gives us the desired bound. ✷

B Proof of Theorem 3

Theorem 3 Let $\hat{A}$ be $p$-compressible. Then, Algorithm 1 can return a $B$ term sparse representation $\hat{R}_s$ with
$$\|A - R\|_2^2 \le \|A - R_{\mathrm{opt}}\|_2^2 + \delta \|C^{\mathrm{opt}}_B\|_2^2$$
using $O\big(B^{\frac{2p}{p-1}} \delta^{\frac{2}{1-p}} \log^4 N\big)$ total identification/estimation time and $O\big(B^{\frac{2p}{p-1}} \delta^{\frac{2}{1-p}} \log^6 N\big)$ measurements. If $\hat{A}$ decays exponentially, Algorithm 1 can return a $B$ term sparse representation, $\hat{R}_s$, with
$$\|A - R\|_2^2 \le \|A - R_{\mathrm{opt}}\|_2^2 + \delta \|C^{\mathrm{opt}}_B\|_2^2$$
using both $\big(B^2 + \frac{\log^2 \delta^{-1}}{\alpha}\big) \cdot \mathrm{polylog}(N)$ measurements and identification/estimation time.

Proof: We first deal with the algebraically compressible case. We have to determine Algorithm 1's $B'$ and Theorem 2's $C$ variables. Moving toward that goal we note that
$$\frac{6B \cdot |\hat{A}(\omega_B)|^2}{C} = \frac{1}{C}\, O\big(B^{-2p+1}\big) = O\Big(\frac{1}{C} \|C^{\mathrm{opt}}_B\|_2^2\Big).$$
After looking at Theorem 2 we can see that we must use $C = O\big(\frac{1}{\delta}\big)$ and a $B'$ (see Equations 6 and 7) so that
$$\sum_{b = B'}^{N-1} |\hat{A}(\omega_b)| = O\big(\delta \cdot |\hat{A}(\omega_B)|\big) = O\big(\delta \cdot B^{-p}\big).$$
Continuing,
$$\sum_{b = B'}^{N-1} |\hat{A}(\omega_b)| = O\left( \int_{B'}^{\infty} b^{-p}\, db \right) = O\big(B'^{\,1-p}\big).$$
Hence, solving $B'^{\,1-p} = O(\delta \cdot B^{-p})$ for $B'$, we must use $B' = O\big(\delta^{\frac{1}{1-p}} B^{\frac{p}{p-1}}\big)$. Applying Theorem 2 gives us Algorithm 1's run time and number of required measurements.

We next deal with the exponentially compressible case. As before, we have to determine Algorithm 1's $B'$ and Theorem 2's $C$ variables. To do so we note that
$$\frac{6B \cdot |\hat{A}(\omega_B)|^2}{C} = \frac{B}{C}\, O\big(4^{-\alpha B}\big) = O\Big(\frac{B}{C} \|C^{\mathrm{opt}}_B\|_2^2\Big).$$
After looking at Theorem 2 we can see that we must use $C = O\big(\frac{B}{\delta}\big)$ and a $B'$ so that
$$\sum_{b = B'}^{N-1} |\hat{A}(\omega_b)| = O\left( \frac{\delta \cdot |\hat{A}(\omega_B)|}{B} \right) = O\big(\delta \cdot 2^{-\alpha B - \log B}\big).$$
Continuing,
$$\sum_{b = B'}^{N-1} |\hat{A}(\omega_b)| = O\left( \sum_{b = B'}^{\infty} 2^{-\alpha b} \right) = O\big(2^{-\alpha B'}\big).$$
Hence, we must use $B' = O\big(B + \frac{1}{\alpha} \log \frac{B}{\delta}\big)$. Applying Theorem 2 gives us Algorithm 1's runtime and number of required measurements. ✷

C Proof of Theorem 4

Theorem 4 Let $f: [0, 2\pi] \to \mathbb{C}$ have (i) an integrable $p$th derivative, and (ii) $f(0) = f(2\pi)$, $f'(0) = f'(2\pi)$, $\dots$, $f^{(p-2)}(0) = f^{(p-2)}(2\pi)$ for some $p > 1$. Furthermore, assume that $\hat{f}$'s $B' = O\big(B^{\frac{2p}{p-1}} \epsilon^{\frac{2}{1-p}}\big)$ largest magnitude frequencies all belong to $\big({-\lceil \frac{N}{2} \rceil}, \lfloor \frac{N}{2} \rfloor\big]$. Then, we may use Algorithm 1 to return a $B$ term sparse Fourier representation, $\hat{R}_s$, for $\hat{f}$ such that
$$\|\hat{f} - \hat{R}\|_2^2 \le \|\hat{f} - \hat{R}_{\mathrm{opt}}\|_2^2 + \delta \|C^{\mathrm{opt}}_B\|_2^2$$
using $O\big(B^{\frac{2p}{p-1}} \delta^{\frac{2}{1-p}} \log^7 N\big)$-time and $O\big(B^{\frac{2p}{p-1}} \delta^{\frac{2}{1-p}} \log^6 N\big)$-measurements from $f$.

Proof: We note that Theorems 2 and 3 still hold for $p$-compressible infinite signals/vectors $\hat{A}$ (i.e., signals with $\infty$-length).
For the purposes of the proof we may consider $\hat{A}$ to be formed by any bijective mapping $g: \mathbb{N} \to \mathbb{Z}$ so that both $g([0, N)) = \big({-\lceil \frac{N}{2} \rceil}, \lfloor \frac{N}{2} \rfloor\big]$ and $g(n) \equiv n \bmod N$, for all $n \in \mathbb{N}$, are true. We then set $\hat{A}(n) = \hat{f}(g(n))$, for all $n \in \mathbb{N}$. In this case we note that $\hat{f}$'s $B'$ largest magnitude frequencies belonging to $\big({-\lceil \frac{N}{2} \rceil}, \lfloor \frac{N}{2} \rfloor\big]$ implies that our $K$-majority $B'$-strongly separating collection of (infinite) subsets, $S$, will still correctly isolate all the $B'$ most energetic frequency positions in $\hat{A}$.

Continuing, we may still consider our infinite length $p$-compressible, with $p > 1$, signal $\hat{A}$ (i.e., the Fourier coefficient series of $\hat{f}$) to be sorted by magnitude for the purpose of identifying valid $\omega_1, \omega_2$, etc. Furthermore, we may bound the (now infinite) sums of $\hat{A}$'s entries' magnitudes by the same integrals as above. The proofs of the theorems/supporting lemmas will go through exactly as before if we consider all the finite sums in their proofs to be absolutely convergent infinite sums.

The only real difference from our work in Section 4 is that we are computing our $\langle \chi_{S_l,j,h}, \hat{A} \rangle$-measurements differently. Similar to Theorem 2, the required number of measurements from $f$ will be $O\big(B^{\frac{2p}{p-1}} \epsilon^{\frac{2}{1-p}} \log^6 N\big)$. This is exactly because, for each $(q_j, p_l)$-pair, we compute all the measurements $\big\{ \langle \chi_{S_l,j,h}, \hat{f} \rangle \;\big|\; h \in [0, q_j p_l) \big\}$ via one FFT requiring $q_j p_l$ samples. Hence, the number of samples from $f$ we use is once again bounded above by $m \cdot K \cdot p_m q_K$. Furthermore, each of the $mK$ FFTs will take $O(p_m q_K \log p_m q_K)$-time (despite the signal lengths' factorizations [2, 27]). The result follows. ✷