Uncertainty Principle and Sparse Reconstruction in Pairs of Orthonormal Rational Function Bases

Dan Xiong$^1$, Li Chai$^2$, Jingxin Zhang$^3$

1. School of Information Science and Engineering
2. Engineering Research Center of Metallurgical Automation and Measurement Technology
Wuhan University of Science and Technology, Hubei, Wuhan, 430081, China
3. School of Software and Electrical Engineering
Swinburne University of Technology, Melbourne, VIC 3122, Australia

Abstract: Most rational systems can be described in terms of orthonormal basis functions. This paper considers the reconstruction of a sparse coefficient vector for a rational transfer function under a pair of orthonormal rational function bases and from a limited number of linear frequency-domain measurements. We prove the uncertainty principle concerning pairs of compressible representations of orthonormal rational functions in the infinite dimensional function space. The uniqueness of the compressible representation using such pairs is provided as a direct consequence of the uncertainty principle. We also provide a bound on the number of measurements which guarantees, with high probability, that the $l_0$ optimization searching for the unique sparse reconstruction can be replaced by $l_1$ optimization using random sampling on the unit circle.

Keywords: sparse system representation; pairs of orthonormal rational function bases; uncertainty principle; $l_1$ optimization.

1 Introduction

A number of signal representations have been developed for a variety of bases, such as sinusoids, wavelets [1], Wilson bases [2], ridgelets [3] and curvelets [4]. The object of interest is an effective representation, that is, one which requires very few significant coefficients. However, each basis shows its own advantage in a different aspect when representing a signal effectively.
For example, wavelets perform relatively poorly on high-frequency sinusoids, for which sinusoids are very effective. On the other hand, sinusoids perform poorly on impulsive events, for which wavelets are very effective. Then a natural question arises: can we get a much shorter (sparser) representation using terms from each of several different bases? This question was posed by Donoho and his coworkers, and it has led to the use of dictionaries made from a concatenation of several orthonormal bases, and to seeking representations of a signal $S$ as
\[
S = \sum_{i=1}^{n} \gamma_i^{\phi} \phi_i + \sum_{i=1}^{n} \gamma_i^{\psi} \psi_i = [\Phi\ \Psi]\,\gamma, \tag{1.1}
\]
where $\Phi = \{\phi_1, \phi_2, \cdots, \phi_n\}$ and $\Psi = \{\psi_1, \psi_2, \cdots, \psi_n\}$ are two orthonormal bases for $\mathbb{R}^n$, each with $n$ orthogonal vectors of unit length, and $\gamma^T = [\gamma_1^{\phi}, \gamma_2^{\phi}, \cdots, \gamma_n^{\phi}, \gamma_1^{\psi}, \gamma_2^{\psi}, \cdots, \gamma_n^{\psi}] \in \mathbb{R}^{2n}$ is the representation coefficient.

The uniqueness of the sparse representation and the computation of the unique representation are two key problems. Notice that (1.1) is an underdetermined set of $n$ equations with $2n$ unknowns, so a unique representation needs an additional requirement, namely sparsity. Hence we need to minimize the number of nonzero elements in $\gamma$. This can be expressed as the optimization problem
\[
(P_0):\quad \min_{\gamma} \|\gamma\|_0 \quad \text{s.t.} \quad S = [\Phi\ \Psi]\,\gamma.
\]
This problem has been studied by Donoho and Huo in [5]: if $S$ is representable as a highly sparse superposition of atoms from a time-frequency dictionary ($\Phi$ is the spike basis and $\Psi$ is the Fourier basis), then there is only one such highly sparse representation of $S$. Specifically, the uniqueness of the solution to the $(P_0)$ problem is ensured for $\|\gamma\|_0 < \frac{1}{2}\left(1 + \frac{1}{M}\right)$, where $M = \sup_{1 \le i,j \le n} |\langle \phi_i, \psi_j \rangle|$, and the solution can be obtained by solving the convex optimization problem that minimizes the 1-norm of the coefficients among all decompositions:
\[
(P_1):\quad \min_{\gamma} \|\gamma\|_1 \quad \text{s.t.} \quad S = [\Phi\ \Psi]\,\gamma.
\]
Underlying this result is a general uncertainty principle which states that if two orthonormal bases are mutually incoherent, no nonzero signal can have a sparse representation in both bases simultaneously. In mathematical terms, if a signal $S$ is expressed in each basis respectively as
\[
S = \sum_{i=1}^{n} \alpha_i \phi_i = \sum_{i=1}^{n} \beta_i \psi_i,
\]
then we have $\|\alpha\|_0 + \|\beta\|_0 \ge 1 + M^{-1}$, where $\alpha := [\alpha_1, \alpha_2, \cdots, \alpha_n]^T$ and $\beta := [\beta_1, \beta_2, \cdots, \beta_n]^T$.

The uncertainty principle can be dated back to its presentation in the setting of discrete sequences, continuous and discrete-time functions, and for several measures of "concentration" (e.g., $L_2$ and $L_1$ measures) [6]. A very general uncertainty principle for operators on Banach spaces is given in [7], including uncertainty principles for Bessel sequences in Hilbert spaces and for integral operators between measure spaces. In the setting of pairs of orthonormal bases, the uncertainty principle holds for a variety of interesting basis pairs, not just sinusoids and spikes [5]. Related phenomena hold for functions of a real variable, with basis pairs such as sinusoids and wavelets, and for functions of two variables, with basis pairs such as wavelets and ridgelets. In these settings, if a function is representable by a sufficiently sparse superposition of terms taken from both bases, then there is only one such sparse representation, and it may be obtained by minimum 1-norm atomic decomposition.

In the field of signals and systems, since the uncertainty principle leads to the uniqueness of a sparse solution and provides the theoretical foundation for reconstruction methods, it plays an important role in compressed sensing (CS) [8-11], which is a new framework for simultaneous sampling and compression of signals and has drawn much attention since its advent several years ago.
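The finite-dimensional uncertainty principle recalled above can be checked numerically on the spike and Fourier pair. The following is a minimal sketch, not taken from the paper: it assumes the unitary DFT as the Fourier basis (which is complex, whereas the text states the bases for $\mathbb{R}^n$), picks $n = 16$, and uses a Dirac comb as the test signal. All variable names are illustrative.

```python
import numpy as np

n = 16                                   # a perfect square, so a Dirac comb exists
spike = np.eye(n)                        # columns phi_i: spike (identity) basis
fourier = np.fft.fft(np.eye(n)) / np.sqrt(n)   # columns psi_j: unitary DFT basis

# Mutual coherence M = max_{i,j} |<phi_i, psi_j>|; it is 1/sqrt(n) for this pair.
M = np.max(np.abs(spike.conj().T @ fourier))

# Dirac comb: ones at every sqrt(n)-th sample, zeros elsewhere.
comb = np.zeros(n)
comb[:: int(np.sqrt(n))] = 1.0

alpha = spike.conj().T @ comb            # coefficients in the spike basis
beta = fourier.conj().T @ comb           # coefficients in the Fourier basis

tol = 1e-10                              # threshold for counting nonzeros
nnz_alpha = int(np.sum(np.abs(alpha) > tol))
nnz_beta = int(np.sum(np.abs(beta) > tol))

print("M =", M, " 1 + 1/M =", 1 + 1 / M)
print("||alpha||_0 + ||beta||_0 =", nnz_alpha + nnz_beta)
```

The comb is sparse in both bases at once, yet the sum of the two nonzero counts still respects the bound $1 + M^{-1}$, which is the behaviour the principle describes.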
It is well known from the CS literature that a signal may have a much sparser representation in an overcomplete basis (redundant dictionary) consisting of concatenated orthogonal bases [5, 12-16]. In the context of finite dimensional vector spaces, [12] and [13] have presented and analyzed the sparse representation of vector signals under a pair of orthonormal bases. In comparison to the result of [5], [12] has presented an improved uncertainty principle with better bounds yielding uniqueness of the $(P_0)$ solution. The bound ensuring the uniqueness of the $(P_1)$ solution is achieved for $\|\gamma\|_0 < \frac{1}{M}$, which enlarges the class of signals whose optimal sparse representation can be found by a simple, linear-programming based search. As analyzed in [12] and [14], in the finite dimensional vector space, the uncertainty principle concerning pairs of representations of $\mathbb{R}^N$ vectors in different orthonormal bases has a direct impact on the uniqueness property of the sparse representation of such vectors using pairs of orthonormal bases as overcomplete dictionaries. Hence the uncertainty principle and uniqueness are fundamentally instrumental to the sparse representation of signals under pairs of orthonormal bases.

Rational functions are widely used in signal processing and control to model both signals and dynamic systems. This paper investigates the sparse representation of a rational transfer function under pairs of orthonormal rational function bases, and the reconstruction of the sparse coefficient of the rational transfer function under such pairs.

Reconstruction using generalized orthogonal basis functions (GOBFs) has been used in system identification since the work of [17].
Rational transfer functions show advantages in improving the efficiency of the representation of linear systems, and orthonormality leads to a great simplification of the analysis and synthesis involved in using the basis functions. Identification and control of linear stable dynamic systems using orthogonal rational functions (ORFs) have been widely studied over the last twenty years, see for instance [17-31]. In the context of signal processing, the class of rational orthonormal basis functions has turned out to be particularly useful. These bases and the transformations that are related to them, such as the rational orthonormal filter structures introduced in the 1950's by Kautz, Huggins and Young [32], [33], have proved to be instrumental to the analysis of several problems arising from signals and systems theory. Orthonormal bases are useful tools in many other branches of science as well. For example, in the mathematical literature, rational orthonormal functions are utilized in rational approximation and interpolation, which are proved to be interconnected with the least-squares problem and have been assembled and further developed by Walsh [34].

Recently, combined with the theory of compressed sensing, sparse system identification using ORFs has been investigated [35-39]. In the context of finite dimensional function spaces, [40] has discussed random sampling in bounded orthonormal systems with one orthonormal basis from the perspective of structured random matrices. However, the essence of these methods is to find a sparse representation of the system under a single ORF basis, which may not yield the sparsest solution.
Different from the aforementioned work, we first extend the results of [12] and [14] to the infinite dimensional function space to establish the uncertainty principle and uniqueness of the compressible representation for rational transfer functions, using the uniform bound of the maximal absolute inner product of the pair of ORFs as an index. We then derive a compressed sensing formulation for finding a sparse representation of a rational transfer function in pairs of ORF bases using frequency domain measurements sampled on the unit circle. We give a lower bound on the number of measurements which guarantees the replacement of the $l_0$ optimization by the $l_1$ optimization under a pair of ORF bases.

The contributions of this paper are:

• Analysis of the uncertainty principle and uniqueness for the compressible representation of rational transfer functions in pairs of ORF bases, which extends the results of [12] to the infinite dimensional function space.

• A novel reconstruction method for rational transfer functions with a finite-order combination of two ORF bases.

• Analysis of the lower bound on the number of measurements which guarantees exact reconstruction by the $l_1$ optimization under a pair of ORF bases.

The rest of this paper is organized as follows. In Section 2, the uncertainty principle and uniqueness for the compressible representation of rational transfer functions are given. The sparse system reconstruction using two orthonormal bases is given in Section 3. Section 4 presents the lower bound on the number of measurements which guarantees the replacement of the $l_0$ optimization by the $l_1$ optimization, the proof of which is given in Section 5. Section 6 concludes the paper and discusses future work. All the omitted proofs are presented in Appendices A-C.
2 Uncertainty Principle and Uniqueness for compressible representation of transfer functions

Transfer functions of LTI systems can be put in a Hilbert space framework. The Hardy space $H_2$ is a Hilbert space with the inner product between two rational functions $F(z)$ and $G(z)$ defined as
\[
\langle F(z), G(z) \rangle = \frac{1}{2\pi i} \oint_{\mathbb{T}} F(z)\,\overline{G(z)}\,\frac{dz}{z} = \frac{1}{2\pi} \int_{0}^{2\pi} F(e^{i\omega})\,\overline{G(e^{i\omega})}\,d\omega, \tag{2.1}
\]
where $\mathbb{T} = \{z : |z| = 1\}$. The corresponding norm in the space $H_2$ is defined as
\[
\|F(z)\|_{H_2} = \sqrt{\langle F(z), F(z) \rangle}, \tag{2.2}
\]
which is simply denoted by $\|F(z)\|$ hereafter. In this paper, the space of proper, stable, real-rational transfer functions is of interest and is referred to as $RH_2$, a subspace of $H_2$. A transfer function $H(z) \in RH_2$ has a unique representation in every ORF basis of this space. If $\{\phi_k(z)\}_{k=1}^{\infty}$ is an ORF basis, then we have
\[
H(z) = \sum_{k=1}^{\infty} \alpha_k \phi_k(z), \quad \text{where } \alpha_k = \langle H(z), \phi_k(z) \rangle.
\]
Suppose we have two different ORF bases $\{\phi_k(z)\}_{k=1}^{\infty}$ and $\{\psi_l(z)\}_{l=1}^{\infty}$ in the $RH_2$ space. Then every transfer function has a unique representation under each of the two bases, denoted as
\[
H(z) = \sum_{k=1}^{\infty} \alpha_k \phi_k(z) = \sum_{l=1}^{\infty} \beta_l \psi_l(z). \tag{2.3}
\]
Obviously, if $H(z)$ has a stable pole away from the origin, then the representation of $H(z)$ using its impulse response is infinite. According to classical sampling theory, a large number of samples are required to guarantee the approximation performance. However, if the pole of the selected rational bases is exactly the same as that of $H(z)$, then the representation will be much shorter.
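The effect of matching the basis pole to the system pole can be illustrated numerically. The sketch below is an illustration under stated assumptions rather than the paper's construction: it takes the discrete Laguerre functions as the ORF basis, a hypothetical first-order system $H(z) = 1/(z - p)$ with $p = 0.8$, and approximates the inner product (2.1) by an $N$-point average on the unit circle. The helper names `laguerre_basis` and `coefficients` are made up for this example, and the index $k$ starts at 0 in the code while the text starts at 1.

```python
import numpy as np

def laguerre_basis(z, pole, n_terms):
    """Discrete Laguerre functions (an assumed choice of ORF basis):
    phi_k(z) = sqrt(1 - pole^2) / (z - pole) * ((1 - pole*z) / (z - pole))**k
    for k = 0, ..., n_terms - 1. Returns an array of shape (len(z), n_terms)."""
    first = np.sqrt(1.0 - pole**2) / (z - pole)
    allpass = (1.0 - pole * z) / (z - pole)
    return np.stack([first * allpass**k for k in range(n_terms)], axis=1)

def coefficients(h_samples, basis_samples):
    """Approximate alpha_k = <H, phi_k> of (2.1) by averaging over the sample points."""
    return basis_samples.conj().T @ h_samples / len(h_samples)

N = 512
z = np.exp(2j * np.pi * np.arange(N) / N)   # equally spaced points on the unit circle
p = 0.8
H = 1.0 / (z - p)                           # hypothetical system with a single pole at p

# Basis pole equal to the system pole: a single nonzero coefficient is expected.
alpha_matched = coefficients(H, laguerre_basis(z, p, 8))
# Basis pole at 0: the basis reduces to pure delays z^{-(k+1)}, i.e. the impulse response.
alpha_delay = coefficients(H, laguerre_basis(z, 0.0, 8))

print(np.round(alpha_matched.real, 6))
print(np.round(alpha_delay.real, 6))
```

With the matched pole only the first coefficient is significant, while the delay basis returns the slowly decaying impulse response $p^k$, which is the contrast described in the paragraph above.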
A natural question arises now: can we get a much shorter (sparser) representation of $H(z)$ in a joint, overcomplete set of general rational bases, say $\{\Phi(z), \Psi(z)\} = \{\phi_1(z), \phi_2(z), \cdots, \psi_1(z), \psi_2(z), \cdots\}$, and reconstruct the transfer function with fewer observations? The following two theorems are established to ensure the uniqueness property of the compressible representation of a transfer function in the space $H_2$ using pairs of ORF bases. As the compressible representation is of interest, we assume that the coefficients of the transfer functions discussed in this paper form a sequence in $l_1$, that is
\[
0 \le \|\alpha\|_1 = \sum_{k=1}^{\infty} |\alpha_k| < \infty. \tag{2.4}
\]
The sparsity defined in [12] cannot be used for rational transfer functions, which have infinitely many impulse response coefficients. We first present a new definition of sparsity, called $\varepsilon$-sparsity, and then establish the uncertainty principle that leads to the bound yielding uniqueness of the sparse representation.

Definition 1. For a fixed threshold $\varepsilon > 0$ and an infinite sequence $\alpha = (\alpha_1, \alpha_2, \cdots)^T$ in $l_1$, let
\[
N_\varepsilon(\alpha) = \min\Big\{K : \sum_{k=K}^{\infty} |\alpha_k| \le \varepsilon\Big\}.
\]
The $\varepsilon$-support of $\alpha$ is defined as $\Gamma_\varepsilon(\alpha) = \{k : |\alpha_k| \ne 0,\ 1 \le k < N_\varepsilon(\alpha)\}$, and the cardinality of $\Gamma_\varepsilon(\alpha)$ is called the $\varepsilon$-0 norm of $\alpha$, denoted by $\|\alpha\|_{0(\varepsilon)}$.

Remark 1. Equation (2.4) guarantees the existence of $N_\varepsilon(\alpha)$. If $\varepsilon = 0$, then the $\varepsilon$-support $\Gamma_\varepsilon(\alpha)$ of $\alpha$ is the support of $\alpha$ in the general sense, i.e. $\{k : |\alpha_k| \ne 0\}$.

Definition 2. For a given positive integer $s$, the coefficient $\alpha$ is $(\varepsilon, s)$-sparse in the sense of the $\varepsilon$-0 norm if $\|\alpha\|_{0(\varepsilon)} \le s$. For brevity, we call the coefficient $\alpha$ $\varepsilon$-sparse if the value of $s$ is not of concern.

Definition 3. A rational transfer function is $(\varepsilon, s)$-sparse if its representation coefficient under an orthonormal rational function basis is $(\varepsilon, s)$-sparse.
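Definitions 1 and 2 translate directly into a short computation. The sketch below is illustrative only: the function name `eps_support` is hypothetical, and it assumes the sequence has been truncated to a finite array, with every entry beyond the array treated as zero.

```python
import numpy as np

def eps_support(alpha, eps):
    """Return (N_eps, eps-support, eps-0 norm) of Definition 1 for a finite
    array alpha, using the paper's 1-based indexing. Entries beyond the end
    of the array are assumed to be zero."""
    a = np.abs(np.asarray(alpha, dtype=float))
    # tails[j] = sum of |alpha_k| over 0-based positions k >= j; tails[len(a)] = 0
    tails = np.concatenate([np.cumsum(a[::-1])[::-1], [0.0]])
    # first 0-based position whose tail sum is <= eps, converted to 1-based N_eps
    n_eps = int(np.argmax(tails <= eps)) + 1
    # nonzero entries with 1 <= k < N_eps
    support = [k for k in range(1, n_eps) if a[k - 1] != 0.0]
    return n_eps, support, len(support)

alpha = [1.0, 0.0, 0.5, 0.05, 0.02, 0.01]
n_eps, support, norm0 = eps_support(alpha, eps=0.1)
print(n_eps, support, norm0)
```

For this sequence the tail starting at the fourth entry sums to about 0.08, which is below the threshold, so the $\varepsilon$-support keeps only the nonzero entries among the first three; in the sense of Definition 2 the sequence is $(0.1, 2)$-sparse.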
If $\varepsilon = 0$, then $\|\alpha\|_{0(\varepsilon)} = \|\alpha\|_0$, which is the number of nonzeros in $\alpha$, and the $(\varepsilon, s)$-sparsity becomes the $s$-sparsity of compressed sensing. However, it should be noted that for a rational transfer function with poles away from zero, the impulse response is usually infinite and cannot be $(0, s)$-sparse. This shows that the definition of sparsity in traditional compressed sensing is not applicable to rational transfer functions, and the uncertainty principle cannot be derived from it either. In the following, based on the $(\varepsilon, s)$-sparsity, we present the uncertainty principle and uniqueness for the representation of rational transfer functions under two ORF bases.

Theorem 1. (Uncertainty Principle) Let $H(z) \in RH_2$ be a transfer function that can be represented both as $H(z) = \sum_{k=1}^{\infty} \alpha_k \phi_k(z)$ and $H(z) = \sum_{l=1}^{\infty} \beta_l \psi_l(z)$. For a fixed threshold $\varepsilon > 0$, let $\Gamma_\varepsilon(\alpha)$ and $\Gamma_\varepsilon(\beta)$ be the $\varepsilon$-supports of $\alpha$ and $\beta$, respectively, whose cardinalities are $\|\alpha\|_{0(\varepsilon)}$ and $\|\beta\|_{0(\varepsilon)}$. Then for all such pairs of representations we have
\[
\Big(\sqrt{\|\alpha\|_{0(\varepsilon)}} + \varepsilon\Big)^2 + \Big(\sqrt{\|\beta\|_{0(\varepsilon)}} + \varepsilon\Big)^2 \ge \frac{2}{\mu}, \quad \text{where } \mu = \sup_{k,l} |\langle \phi_k(z), \psi_l(z) \rangle| \tag{2.5}
\]
and $\langle \phi_k(z), \psi_l(z) \rangle$ is the inner product of $\phi_k(z)$ and $\psi_l(z)$ defined in (2.1). Following the terminology of compressed sensing, we call $\mu$ the mutual coherence of the two ORF bases $\{\phi_k(z)\}$ and $\{\psi_l(z)\}$.

Proof: See Appendix A.

The uncertainty principle reveals that, from the perspective of the $\varepsilon$-0 norm, a transfer function $H(z)$ having a sparse representation in the joint set of two ORF bases will have a highly nonsparse representation in either of these bases alone. The uncertainty principle also directly determines the bound which guarantees the uniqueness of the sparse representation in a pair of ORF bases.
We now give a sparsity definition for pairs of bases, which is used to build the uniqueness theorem.

Definition 4. A rational transfer function $H(z)$ is $(\varepsilon, s)$-sparse in a pair of orthonormal rational function bases if $H(z)$ has a representation under the pair,
\[
H(z) = \sum_{k=1}^{\infty} \theta_k^{\phi} \phi_k(z) + \sum_{l=1}^{\infty} \theta_l^{\psi} \psi_l(z), \tag{2.6}
\]
whose coefficients satisfy $(\varepsilon, s)$-sparsity. For brevity, denote $\theta_1 = (\theta_1^{\phi}, \theta_2^{\phi}, \cdots)$ and $\theta_2 = (\theta_1^{\psi}, \theta_2^{\psi}, \cdots)$, so that the $(\varepsilon, s)$-sparsity of the rational transfer function $H(z)$ is equivalent to $\|\theta_1\|_{0(\varepsilon)} + \|\theta_2\|_{0(\varepsilon)} \le s$.

Theorem 2. (Uniqueness) For a fixed threshold $\varepsilon > 0$, assume $H(z)$ is $(\varepsilon, s)$-sparse under a pair of ORF bases $\{\phi_k(z)\}_{k=1}^{\infty}$ and $\{\psi_l(z)\}_{l=1}^{\infty}$. Then the representation (2.6) is unique if
\[
\Big(\sqrt{\|\theta_1\|_{0(\varepsilon)}} + \varepsilon\Big)^2 + \Big(\sqrt{\|\theta_2\|_{0(\varepsilon)}} + \varepsilon\Big)^2 < \frac{1}{\mu},
\]
where $\theta_1$ and $\theta_2$ are as defined in Definition 4.

Proof: See Appendix A.

Remark 2. If $\varepsilon = 0$ and the sequence is finite, then the results of Theorems 1 and 2 are parallel to the results in [12].

The sparsity bound determined by $\mu$ is crucial. Since the transfer function can be represented using impulse responses, a simple formula for calculating $\mu$ can be obtained, as shown below.

Lemma 1. For two arbitrary ORF bases $\{\phi_k(z)\}_{k=1}^{\infty}$ and $\{\psi_l(z)\}_{l=1}^{\infty}$,
\[
\mu = \sup_{k,l} \Big| \sum_{d=0}^{\infty} b_{dk}\, a_{dl} \Big|,
\]
where $\{b_{dk}\}$ and $\{a_{dl}\}$ are the impulse responses of the two bases, respectively.

Proof. The impulse responses of the bases $\phi_k(z)$ and $\psi_l(z)$ can be written as
\[
\phi_k(z) := \sum_{d'=0}^{\infty} b_{d'k}\, z^{-d'}, \quad k = 1, 2, \cdots, \qquad \text{and} \qquad \psi_l(z) := \sum_{d=0}^{\infty} a_{dl}\, z^{-d}, \quad l = 1, 2, \cdots,
\]
respectively.
Then the inner product of $\phi_k(z)$ and $\psi_l(z)$ becomes
\[
\langle \phi_k(z), \psi_l(z) \rangle = \Big\langle \sum_{d'=0}^{\infty} b_{d'k}\, z^{-d'}, \sum_{d=0}^{\infty} a_{dl}\, z^{-d} \Big\rangle = \sum_{d'=0}^{\infty} \sum_{d=0}^{\infty} b_{d'k}\, a_{dl}\, \langle z^{-d'}, z^{-d} \rangle = \sum_{d=0}^{\infty} b_{dk}\, a_{dl},
\]
which equals the inner product of the impulse responses of the two basis functions. With $\mu$ defined in (2.5), the claim follows.

Lemma 1 shows that the mutual coherence $\mu$ is the uniform bound of the maximal absolute inner product of the impulse responses of the two basis functions.

3 Sparse system reconstruction using pairs of orthonormal rational function bases

Given a pair of ORF bases $\{\phi_k(z)\}$ and $\{\psi_l(z)\}$, assume that $H(z)$ has an $\varepsilon$-sparse representation under such a pair as in (2.6). We want to reconstruct $H(z)$ from a small fraction of the measurements of $H(z)$ on the unit circle. Precisely, define
\[
\mathbb{T}_N := \{z_r = e^{2\pi i (r-1)/N},\ r = 1, 2, \cdots, N\}.
\]
We focus on the underdetermined case in which only a few of the components of $\{H(z_r),\ r = 1, 2, \cdots, N\}$ are sampled or observed, that is, $H(z)$ is known only on a small fraction of $\mathbb{T}_N$. Given a subset $\Omega \subset \{1, 2, \cdots, N\}$ of size $|\Omega| = m$, the goal is to reconstruct the representation coefficient, and hence the transfer function $H(z)$, from the much shorter $m$-dimensional measurements $\{H(z_r),\ r \in \Omega\}$.

Now we restate this problem in matrix form. Combining with Definition 1, $H(z)$ can be rewritten as
\[
H(z) = \sum_{k=1}^{n_1} \theta_k^{\phi} \phi_k(z) + \sum_{l=1}^{n_2} \theta_l^{\psi} \psi_l(z) + \Delta_1 + \Delta_2,
\]
where $n_1 = N_\varepsilon(\theta_1) - 1$, $n_2 = N_\varepsilon(\theta_2) - 1$, $\Delta_1 = \sum_{k=n_1+1}^{\infty} \theta_k^{\phi} \phi_k(z)$ and $\Delta_2 = \sum_{l=n_2+1}^{\infty} \theta_l^{\psi} \psi_l(z)$. By orthonormality of the basis and the definition of $N_\varepsilon$, we have
\[
\|\Delta_1\|^2 = \Big\langle \sum_{k=n_1+1}^{\infty} \theta_k^{\phi} \phi_k(z), \sum_{k=n_1+1}^{\infty} \theta_k^{\phi} \phi_k(z) \Big\rangle = \sum_{k=n_1+1}^{\infty} |\theta_k^{\phi}|^2 \le \Big(\sum_{k=n_1+1}^{\infty} |\theta_k^{\phi}|\Big)^2 \le \varepsilon^2.
\]
Similarly, we have $\|\Delta_2\|^2 \le \varepsilon^2$. Denote $\Delta = \Delta_1 + \Delta_2$; then
\[
\|\Delta\| \le \sqrt{2\big(\|\Delta_1\|^2 + \|\Delta_2\|^2\big)} \le 2\varepsilon.
\]
Now the transfer function $H(z)$ can be simplified as
\[
H(z) = \sum_{k=1}^{n_1} \theta_k^{\phi} \phi_k(z) + \sum_{l=1}^{n_2} \theta_l^{\psi} \psi_l(z) + \Delta, \tag{3.1}
\]
with $\|\Delta\| \le 2\varepsilon$. With a slight abuse of notation, the unknown coefficients to be determined here are denoted as $\theta_1 = [\theta_1^{\phi}, \theta_2^{\phi}, \cdots, \theta_{n_1}^{\phi}]^T$ and $\theta_2 = [\theta_1^{\psi}, \theta_2^{\psi}, \cdots, \theta_{n_2}^{\psi}]^T$. Since $\varepsilon$ is arbitrary, the norm of the term $\Delta$ can be made small enough, and $\Delta = 0$ when $\varepsilon$ is exactly zero. In the sequel, we discuss equation (3.1) with the term $\Delta$ omitted. Define
\[
[\Phi\ \Psi] := \big[(\phi_k(z_r))_{k=1}^{n_1}\ \ (\psi_l(z_r))_{l=1}^{n_2}\big], \quad r = 1, 2, \cdots, N, \tag{3.2}
\]
and
\[
H := [H(z_1), H(z_2), \cdots, H(z_N)]^T. \tag{3.3}
\]
Then
\[
H = [\Phi\ \Psi] \begin{bmatrix} \theta_1 \\ \theta_2 \end{bmatrix}. \tag{3.4}
\]
We randomly select the subset $\Omega$ of size $m$ ($\ll n_1 + n_2$) drawn from the uniform distribution over the index set $\{1, 2, \cdots, N\}$, and denote the measurement by
\[
H_\Omega = [\Phi\ \Psi]_\Omega \begin{bmatrix} \theta_1 \\ \theta_2 \end{bmatrix}, \tag{3.5}
\]
where $H_\Omega$ is the $m \times 1$ vector consisting of $\{H(z_r),\ r \in \Omega\}$, and $[\Phi\ \Psi]_\Omega$ is the $m \times (n_1 + n_2)$ matrix $\big[(\phi_k(z_r))_{k=1}^{n_1}\ \ (\psi_l(z_r))_{l=1}^{n_2}\big]$, $r \in \Omega$.

As $[\Phi\ \Psi]_\Omega$ is the concatenation of two bases, the representation is in general not unique. Uniqueness of the representation is, however, guaranteed if the representation is sufficiently sparse. The goal is to find the sparsest decomposition from the $l_0$ minimization
\[
(P_0):\quad \min_{\theta_1, \theta_2} \left\| \begin{bmatrix} \theta_1 \\ \theta_2 \end{bmatrix} \right\|_0 \quad \text{subject to} \quad H_\Omega = [\Phi\ \Psi]_\Omega \begin{bmatrix} \theta_1 \\ \theta_2 \end{bmatrix},
\]
which is a computationally infeasible search problem [41]. An alternative approach is Basis Pursuit [8], [11], [42]:
\[
(P_1):\quad \min_{\theta_1, \theta_2} \left\| \begin{bmatrix} \theta_1 \\ \theta_2 \end{bmatrix} \right\|_1 \quad \text{subject to} \quad H_\Omega = [\Phi\ \Psi]_\Omega \begin{bmatrix} \theta_1 \\ \theta_2 \end{bmatrix}. \tag{3.6}
\]
For sparse representation using only one basis, compressed sensing theory has established the equivalence of $l_0$ optimization and $l_1$ minimization when the representation is sufficiently sparse [41], [42], and has provided sufficient conditions on the number of measurements needed to recover the sparse coefficient from randomly sampled measurements by solving the $l_1$-minimization problem [40], [41].

However, the existing results cannot be applied to the setting of (3.6), since two bases are involved. Hence the sufficient conditions for the equivalence of $l_0$ optimization and $l_1$ minimization should be reconsidered. Here we first show the orthonormality of $[\Phi\ \Psi]$ when $N$ is sufficiently large. With the orthonormality satisfied, we then present a sufficient condition for exact reconstruction by $l_1$ optimization in pairs of orthonormal bases in the next section.

Theorem 3. When $N$ is sufficiently large, the composite sampling matrix $[\Phi\ \Psi]$ satisfies:
(1) $\Phi^* \Phi \approx N I_{n_1}$, where $*$ is the conjugate transpose and $I_{n_1}$ is the identity matrix of dimension $n_1$.
(2) $\Psi^* \Psi \approx N I_{n_2}$.
(3) $(\Phi^* \Psi)_{kl} = \sum_{r=1}^{N} \overline{\phi_k(z_r)}\, \psi_l(z_r) \approx N \sum_{d=0}^{\infty} b_{dk}\, a_{dl}$, $(k = 1, \cdots, n_1,\ l = 1, \cdots, n_2)$.

Proof: See Appendix A.

4 The lower bound of the number of measurements for exact reconstruction by $l_1$ optimization

In this section, we give the lower bound on the number of measurements which guarantees the replacement of the $l_0$ optimization $(P_0)$ by the $l_1$ optimization $(P_1)$ under two ORF bases for a fixed (but arbitrary) support. We first introduce some notation. Let $[\Phi\ \Psi]$ be a composite matrix satisfying
\[
[\Phi\ \Psi]^* [\Phi\ \Psi] = N \begin{bmatrix} I_{n_1} & \Phi^* \Psi / N \\ \Psi^* \Phi / N & I_{n_2} \end{bmatrix}, \tag{4.1}
\]
where $\Phi^* \Psi / N$ is a matrix whose $(i, j)$-th element is the coherence of the $i$-th column vector $\Phi_i$ of $\Phi$ and the $j$-th column vector $\Psi_j$ of $\Psi$, i.e. $\dfrac{\langle \Phi_i, \Psi_j \rangle}{\|\Phi_i\| \|\Psi_j\|}$.
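Theorem 3 and the coherence matrix $\Phi^* \Psi / N$ can be inspected numerically. The following sketch rests on assumptions not made in the paper: both bases are discrete Laguerre bases, with poles $0.5$ and $-0.5$ chosen arbitrarily, $N = 1024$, and six functions per basis. The helper `laguerre_basis` is a hypothetical name. For this particular pair, the impulse-response formula of Lemma 1 gives $\langle \phi_1, \psi_1 \rangle = \sqrt{(1-a^2)(1-b^2)}/(1-ab) = 0.6$ for the first functions of the two bases, which the sampled value should reproduce.

```python
import numpy as np

def laguerre_basis(z, pole, n_terms):
    """Samples of the discrete Laguerre functions (assumed example of an ORF basis):
    sqrt(1 - pole^2) / (z - pole) * ((1 - pole*z) / (z - pole))**k, k = 0..n_terms-1.
    One column per basis function, one row per sample point."""
    first = np.sqrt(1.0 - pole**2) / (z - pole)
    allpass = (1.0 - pole * z) / (z - pole)
    return np.stack([first * allpass**k for k in range(n_terms)], axis=1)

N, n1, n2 = 1024, 6, 6
z = np.exp(2j * np.pi * np.arange(N) / N)   # z_r = exp(2*pi*i*(r-1)/N), r = 1..N

Phi = laguerre_basis(z, 0.5, n1)            # N x n1 sampling matrix of the first basis
Psi = laguerre_basis(z, -0.5, n2)           # N x n2 sampling matrix of the second basis

gram_phi = Phi.conj().T @ Phi / N           # expected to be close to the identity
gram_psi = Psi.conj().T @ Psi / N           # expected to be close to the identity
cross = Phi.conj().T @ Psi / N              # entries approximate basis inner products

# Largest cross-correlation among the sampled functions: an estimate of the
# mutual coherence restricted to the first n1 and n2 basis functions.
mu_est = np.max(np.abs(cross))

print(np.round(np.abs(cross), 3))
print("estimated coherence:", mu_est)
```

Because only finitely many basis functions are sampled, `mu_est` is a lower estimate of the supremum in (2.5), not its exact value.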
The mutual coherence of the matrices $\Phi$ and $\Psi$ is defined as
\[
\mu(\Phi, \Psi) = \max_{i,j} \frac{|\langle \Phi_i, \Psi_j \rangle|}{\|\Phi_i\| \|\Psi_j\|}.
\]
Define $\mu_M = \max\{\mu_\Phi, \mu_\Psi\}$, where $\mu_\Phi = \max |\Phi_{ij}|$ and $\mu_\Psi = \max |\Psi_{ij}|$ are the largest magnitudes among the entries of $\Phi$ and $\Psi$, respectively. Denote by $T_1 = \{k : |\theta_k^{\phi}| \ne 0\}$ and $T_2 = \{l : |\theta_l^{\psi}| \ne 0\}$ the supports of the coefficients $\theta_1$ and $\theta_2$, respectively. Let $\Phi_{T_1}$ be the $N \times |T_1|$ matrix corresponding to the columns of $\Phi$ indexed by $T_1$, and let $\Phi_{\Omega T_1}$ be the $m \times |T_1|$ matrix corresponding to the rows of $\Phi_{T_1}$ indexed by $\Omega$. The matrices $\Psi_{T_2}$ and $\Psi_{\Omega T_2}$ are defined similarly. Further let
\[
F = \begin{bmatrix} I & \Phi_{T_1}^* \Psi_{T_2} / N \\ \Psi_{T_2}^* \Phi_{T_1} / N & I \end{bmatrix}.
\]
Theorem 4. Let $[\Phi\ \Psi]$ be a matrix obeying (4.1). Fix a subset $T = T_1 \cup T_2$ of the coefficient domain, with $T_1$ and $T_2$ being subsets of the coefficient domains of $\Phi$ and $\Psi$, respectively. Choose a subset $\Omega$ of the measurement domain of size $|\Omega| = m$, and a sign sequence $\tau$ on $T$, uniformly at random. Suppose $m$ satisfies
\[
m \ge C \mu_M^2 \max{}^2\Big\{ |T|,\ \log\Big(\frac{n_1 + n_2}{\delta}\Big),\ C_{F, \mu(\Phi,\Psi), T, \delta} \Big\},
\]
where
\[
C_{F, \mu(\Phi,\Psi), T, \delta} = \frac{4}{\Big[ \big(\tfrac{1}{2} + \|\Phi_{T_1}^* \Psi_{T_2} / N\|\big) \big[2 \log\big(\tfrac{2(n_1+n_2)}{\delta}\big)\big]^{-1/2} - \mu(\Phi, \Psi) \sqrt{|T|} \Big]^2}.
\]
Then with probability exceeding $1 - 6\delta$, every coefficient vector $\theta = \begin{bmatrix} \theta_1 \\ \theta_2 \end{bmatrix}$ supported on $T$ with sign matching $\tau$ can be recovered by solving
\[
\min \|\hat{\theta}\|_1 \quad \text{s.t.} \quad [\Phi_\Omega\ \Psi_\Omega]\, \hat{\theta} = H_\Omega \tag{4.2}
\]
for the coefficient vector $\hat{\theta} = \begin{bmatrix} \hat{\theta}_1 \\ \hat{\theta}_2 \end{bmatrix}$, where $H_\Omega = [\Phi_\Omega\ \Psi_\Omega]\, \theta$.

Theorem 4 shows that for most sparse coefficients $\theta$ supported on a fixed (but arbitrary) set $T$, the coefficient can be recovered with overwhelming probability if the sign of $\theta$ on $T$ and the observations $H_\Omega = [\Phi_\Omega\ \Psi_\Omega]\, \theta$ are drawn at random.

Remark 3. (a) If $\mu(\Phi, \Psi)$ is small enough, then $C_{F, \mu(\Phi,\Psi), T, \delta}$ approximately equals $32 \log\big(\frac{2(n_1+n_2)}{\delta}\big)$.
Hence the condition on $m$ in Theorem 4 simplifies to
\[
m \ge C \mu_M^2 \max{}^2\Big\{ |T|,\ \log\Big(\frac{n_1 + n_2}{\delta}\Big) \Big\}.
\]
(b) Another important parameter in Theorem 4 is $\mu_M$. Since each column of $\Phi$ and $\Psi$ has an $l_2$-norm equal to $\sqrt{N}$, $\mu_M$ takes a value between $1$ and $\sqrt{N}$. When the columns of $\Phi$ and $\Psi$ are perfectly flat (all elements have modulus equal to 1), we have $\mu_M = 1$, and the bound on $m$ is good. But when a column is maximally concentrated (all the column entries but one vanish), then $\mu_M = \sqrt{N}$, and Theorem 4 offers no guarantee of recovery from a limited number of samples.

5 Proof of Theorem 4

To prove Theorem 4, we need a result essentially proposed in [11]. Here we restate it in our notation.

Lemma 2. The coefficient $\theta$ defined in Theorem 4 is the unique solution to (4.2) if and only if there exists a dual vector $\pi \in \mathbb{R}^{n_1+n_2}$ satisfying the following properties:
(1) $\pi$ is in the row space of $[\Phi_\Omega\ \Psi_\Omega]$,
(2) $\pi(t) = \operatorname{sgn} \theta(t)$ for $t \in T = T_1 \cup T_2$,
(3) $|\pi(t)| < 1$ for $t \in T^c$.

In the following, we first define the dual vector $\pi$ as
\[
\pi = [\Phi_\Omega\ \Psi_\Omega]^* [\Phi_{\Omega T_1}\ \Psi_{\Omega T_2}] \big( [\Phi_{\Omega T_1}\ \Psi_{\Omega T_2}]^* [\Phi_{\Omega T_1}\ \Psi_{\Omega T_2}] \big)^{-1} \tau_0, \tag{5.1}
\]
where $\tau_0$ is a $|T|$-dimensional vector whose entries are the signs of $\theta$ on $T$. Then we will show that $\pi$ in (5.1) satisfies conditions (1)-(3) in Lemma 2. For this purpose, the invertibility of $[\Phi_{\Omega T_1}\ \Psi_{\Omega T_2}]^* [\Phi_{\Omega T_1}\ \Psi_{\Omega T_2}]$, which guarantees that $\pi$ is well defined, is given in Theorem 5, and the conditions leading to property (3) are given in Theorem 6. The theorems are presented below and their proofs are given in Appendices B and C, respectively.

Theorem 5. Let $[\Phi\ \Psi]$ be a matrix obeying (4.1). Consider a fixed set $T$ and let $\Omega$ be a random set sampled using the Bernoulli model.
Suppose that the number of measurements $m$ obeys
\[
m \ge |T|\, \mu_M^2 \cdot \max\big\{ 4 C_R^2 (1 + 3\|F\|) \log |T|,\ C_T \log(3/\delta) \big\},
\]
for some positive constants $C_R$, $C_T$. Then
(1) $P\Big( \big\| \tfrac{1}{m} [\Phi_{\Omega T_1}\ \Psi_{\Omega T_2}]^* [\Phi_{\Omega T_1}\ \Psi_{\Omega T_2}] - F \big\| > 1/2 \Big) \le \delta$, where $\|\cdot\|$ is the spectral norm, i.e. the magnitude of the largest eigenvalue.
(2) $P\Big( \big\| \tfrac{1}{m} [\Phi_{\Omega T_1}\ \Psi_{\Omega T_2}]^* [\Phi_{\Omega T_1}\ \Psi_{\Omega T_2}] - I \big\| > 1/2 + \|\Phi_{T_1}^* \Psi_{T_2} / N\| \Big) \le \delta$.

Theorem 5 reveals that for small values of $\delta$ and $\mu(\Phi, \Psi)$, the eigenvalues of $[\Phi_{\Omega T_1}\ \Psi_{\Omega T_2}]^* [\Phi_{\Omega T_1}\ \Psi_{\Omega T_2}]$ are all close to $m$ with high probability, which is an uncertainty principle. Let $\theta$ be a sequence supported on $T$; then with probability exceeding $1 - \delta$, we have
\[
\big\| \tfrac{1}{m} [\Phi_{\Omega T_1}\ \Psi_{\Omega T_2}]^* [\Phi_{\Omega T_1}\ \Psi_{\Omega T_2}] - I \big\| \le 1/2 + \|\Phi_{T_1}^* \Psi_{T_2} / N\|.
\]
It follows that
\[
\big( \tfrac{1}{2} + \|\Phi_{T_1}^* \Psi_{T_2} / N\| \big)\, m \|\theta\|^2 \le \|(\Phi_{\Omega T_1}\ \Psi_{\Omega T_2})\, \theta\|^2 \le \big( \tfrac{3}{2} + \|\Phi_{T_1}^* \Psi_{T_2} / N\| \big)\, m \|\theta\|^2,
\]
which shows that only a small portion of the energy of $\theta$ will be concentrated on the set $\Omega$ in the $[\Phi\ \Psi]$-domain.

Theorem 6. Let $[\Phi\ \Psi]$, $T$ and $\Omega$ be as defined in Theorem 5. Denote
\[
\lambda = \big( \tfrac{1}{2} + \|\Phi_{T_1}^* \Psi_{T_2} / N\| \big)^{-1} \Big( \mu(\Phi, \Psi) \sqrt{|T|} + \frac{a \bar{\sigma}}{m} + \frac{\sqrt{|T|}\, \mu_M}{\sqrt{m}} \Big)
\]
with $\bar{\sigma}^2 = m \mu_M^2 \max\big\{ 2,\ \frac{|T| \mu_M}{\sqrt{m}} \big\}$. For each $a > 0$ obeying $a \le \frac{\sqrt{2m}}{\sqrt{|T|}\, \mu_M}$ if $\frac{|T| \mu_M}{\sqrt{m}} \le 2$ and $a \le \big( \frac{m}{\mu_M^2} \big)^{1/4}$ otherwise, we have
\[
P\Big( \sup_{t \in T^c} |\pi(t)| \ge 1 \Big) \le 2(n_1 + n_2)\, e^{-\frac{1}{2\lambda^2}} + 3(n_1 + n_2)\, e^{-\gamma a^2} + P\Big( \big\| [\Phi_{\Omega T_1}\ \Psi_{\Omega T_2}]^* [\Phi_{\Omega T_1}\ \Psi_{\Omega T_2}] \big\| \le \big( \tfrac{1}{2} + \|\Phi_{T_1}^* \Psi_{T_2} / N\| \big) m \Big).
\]
Proof of Theorem 4. By the construction (5.1), properties (1) and (2) in Lemma 2 hold. By Theorem 5, the invertibility of $[\Phi_{\Omega T_1}\ \Psi_{\Omega T_2}]^* [\Phi_{\Omega T_1}\ \Psi_{\Omega T_2}]$ is guaranteed with high probability. Now we show that property (3) holds with high probability.
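With $\lambda$ and $a > 0$ satisfying the condition in Theorem 6, we have
\[
P\Big( \sup_{t \in T^c} |\pi(t)| \ge 1 \Big) \le 2(n_1 + n_2)\, e^{-\frac{1}{2\lambda^2}} + 3(n_1 + n_2)\, e^{-\gamma a^2} + P\Big( \big\| [\Phi_{\Omega T_1}\ \Psi_{\Omega T_2}]^* [\Phi_{\Omega T_1}\ \Psi_{\Omega T_2}] \big\| \le \big( \tfrac{1}{2} + \|\Phi_{T_1}^* \Psi_{T_2} / N\| \big) m \Big).
\]
For the second term to be less than $\delta$, we choose $a$ such that
\[
a^2 = \gamma^{-1} \log\Big( \frac{3(n_1 + n_2)}{\delta} \Big). \tag{5.2}
\]
The first term is less than $\delta$ if
\[
\frac{1}{\lambda^2} \ge 2 \log\Big( \frac{2(n_1 + n_2)}{\delta} \Big). \tag{5.3}
\]
When $\frac{|T| \mu_M}{\sqrt{m}} > 2$, the condition in Theorem 6 is $a \le \big( \frac{m}{\mu_M^2} \big)^{1/4}$. Combining with (5.2),
\[
a^2 = \gamma^{-1} \log\Big( \frac{3(n_1 + n_2)}{\delta} \Big) \le \Big( \frac{m}{\mu_M^2} \Big)^{1/2},
\]
or equivalently
\[
m \ge \mu_M^2 \gamma^{-2} \big[ \log\big( 3(n_1 + n_2)/\delta \big) \big]^2. \tag{5.4}
\]
In this case, $a \bar{\sigma} \le \sqrt{m |T|}\, \mu_M$, and then
\[
\lambda \le \big( \tfrac{1}{2} + \|\Phi_{T_1}^* \Psi_{T_2} / N\| \big)^{-1} \Big( \mu(\Phi, \Psi) \sqrt{|T|} + 2 \mu_M \sqrt{\frac{|T|}{m}} \Big). \tag{5.5}
\]
When $\frac{|T| \mu_M}{\sqrt{m}} \le 2$, the condition in Theorem 6 is $a \le \frac{\sqrt{2m}}{\sqrt{|T|}\, \mu_M}$. That is, when
\[
m \ge \frac{|T|^2 \mu_M^2}{4}, \tag{5.6}
\]
equation (5.2) gives
\[
a^2 = \gamma^{-1} \log\Big( \frac{3(n_1 + n_2)}{\delta} \Big) \le \frac{2m}{|T| \mu_M^2},
\]
or equivalently
\[
m \ge (2\gamma)^{-1} \log\Big( \frac{3(n_1 + n_2)}{\delta} \Big) |T| \mu_M^2. \tag{5.7}
\]
If $|T| \ge 2a^2$, then $a \bar{\sigma} \le \sqrt{\frac{|T|}{2}} \sqrt{2m}\, \mu_M = \sqrt{m |T|}\, \mu_M$, which gives again (5.5). On the other hand, if $|T| \le 2a^2$, then $\sqrt{\frac{|T|}{m}}\, \mu_M \le \frac{\sqrt{2}\, a}{\sqrt{m}}\, \mu_M = \frac{a \bar{\sigma}}{m}$, which gives
\[
\lambda \le \big( \tfrac{1}{2} + \|\Phi_{T_1}^* \Psi_{T_2} / N\| \big)^{-1} \Big( \mu(\Phi, \Psi) \sqrt{|T|} + \frac{2 a \bar{\sigma}}{m} \Big) = \big( \tfrac{1}{2} + \|\Phi_{T_1}^* \Psi_{T_2} / N\| \big)^{-1} \Big( \mu(\Phi, \Psi) \sqrt{|T|} + \frac{2\sqrt{2}\, a \mu_M}{\sqrt{m}} \Big). \tag{5.8}
\]
From (5.5) and (5.8),
\[
\lambda \big( \tfrac{1}{2} + \|\Phi_{T_1}^* \Psi_{T_2} / N\| \big) \le \mu(\Phi, \Psi) \sqrt{|T|} + \frac{2 \mu_M}{\sqrt{m}} \max\big\{ \sqrt{|T|},\ \sqrt{2}\, a \big\}.
\]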
To verify (5.3), it suffices to take $m$ obeying
\[
\mu(\Phi, \Psi) \sqrt{|T|} + \frac{2 \mu_M}{\sqrt{m}} \max\big\{ \sqrt{|T|},\ \sqrt{2}\, a \big\} \le \big( \tfrac{1}{2} + \|\Phi_{T_1}^* \Psi_{T_2} / N\| \big) \Big[ 2 \log\Big( \frac{2(n_1 + n_2)}{\delta} \Big) \Big]^{-1/2},
\]
which is equivalent to
\[
\begin{aligned}
m &\ge \left( \frac{2 \mu_M \max\{ \sqrt{|T|},\ \sqrt{2}\, a \}}{\big( \tfrac{1}{2} + \|\Phi_{T_1}^* \Psi_{T_2} / N\| \big) \big[ 2 \log\big( \tfrac{2(n_1+n_2)}{\delta} \big) \big]^{-1/2} - \mu(\Phi, \Psi) \sqrt{|T|}} \right)^2 \\
&= \frac{4 \mu_M^2 \max\{ |T|,\ 2a^2 \}}{\Big[ \big( \tfrac{1}{2} + \|\Phi_{T_1}^* \Psi_{T_2} / N\| \big) \big[ 2 \log\big( \tfrac{2(n_1+n_2)}{\delta} \big) \big]^{-1/2} - \mu(\Phi, \Psi) \sqrt{|T|} \Big]^2} \\
&= \frac{4 \mu_M^2 \max\big\{ |T|,\ 2\gamma^{-1} \log\big( \tfrac{3(n_1+n_2)}{\delta} \big) \big\}}{\Big[ \big( \tfrac{1}{2} + \|\Phi_{T_1}^* \Psi_{T_2} / N\| \big) \big[ 2 \log\big( \tfrac{2(n_1+n_2)}{\delta} \big) \big]^{-1/2} - \mu(\Phi, \Psi) \sqrt{|T|} \Big]^2}.
\end{aligned} \tag{5.9}
\]
This analysis shows that the first two terms are each less than $\delta$ if $m$ satisfies (5.4), (5.6), (5.7) and (5.9), which can be simplified as
\[
m \ge K_1 \mu_M^2 \max{}^2\Big\{ |T|,\ \log\Big( \frac{n_1 + n_2}{\delta} \Big),\ C_{F, \mu(\Phi,\Psi), T, \delta} \Big\},
\]
where
\[
C_{F, \mu(\Phi,\Psi), T, \delta} = \frac{4}{\Big[ \big( \tfrac{1}{2} + \|\Phi_{T_1}^* \Psi_{T_2} / N\| \big) \big[ 2 \log\big( \tfrac{2(n_1+n_2)}{\delta} \big) \big]^{-1/2} - \mu(\Phi, \Psi) \sqrt{|T|} \Big]^2}.
\]
Finally, by Theorem 5, the last term will be bounded by $\delta$ if
\[
m \ge K_2 |T| \mu_M^2 \log\Big( \frac{n_1 + n_2}{\delta} \Big).
\]
In conclusion, when using the Bernoulli model the reconstruction is exact with probability at least $1 - 3\delta$ provided that the number of measurements $m$ satisfies
\[
m \ge K_3 \mu_M^2 \max{}^2\Big\{ |T|,\ \log\Big( \frac{n_1 + n_2}{\delta} \Big),\ C_{F, \mu(\Phi,\Psi), T, \delta} \Big\}.
\]
Following [11], the desired properties hold with $\Omega$ sampled using the uniform model whenever they hold with $\Omega$ sampled using a Bernoulli model. In fact, suppose $\Omega_1$ of size $m$ is sampled uniformly at random and $\Omega_2$ is sampled by setting $\Omega_2 = \{k : \delta_k = 1\}$, where $\{\delta_k\}$ is a sequence of independent identically distributed 0/1 Bernoulli random variables with probability
\[
P(\delta_k = 1) = \frac{m}{N}. \tag{5.10}
\]
Then
\[
P(\mathrm{Failure}(\Omega_1)) \le 2 P(\mathrm{Failure}(\Omega_2)).
\]
Hence, for $\Omega$ sampled using the uniform model, the existence of a dual vector for $\theta$ is guaranteed with probability exceeding $1 - 6\delta$. The theorem is proved.
∎

6 Conclusion

Based on the principle of compressed sensing, we have proposed a reconstruction method for a sparse rational transfer function under two ORF bases. We have established the uncertainty principle concerning compressible representations of rational functions under two ORF bases, and we have also presented the uniqueness of the compressible representation. We have further provided a lower bound on the number of measurements which guarantees, with high probability, that the $\ell_0$ optimization searching for the unique sparse reconstruction can be replaced by $\ell_1$ optimization using random sampling on the unit circle. Since signals and systems can both be represented by rational functions, the proposed reconstruction method can be applied to sparse reconstruction of signals and to sparse system identification. The linearity of the parameters in the representation shows that the reconstruction method can be applied to sparse rational MIMO systems as well.

A Proof of Theorems 1, 2 and 3

Proof of Theorem 1. Notice that the transfer function considered is in the $RH_2$ space, satisfying

$$H(z)=\sum_{k=1}^{\infty}\alpha_k\phi_k(z)=\sum_{l=1}^{\infty}\beta_l\psi_l(z).$$

For $\{\alpha_k\}$ and $\{\beta_l\}$, denote by $\Gamma_\varepsilon(\alpha)$ and $\Gamma_\varepsilon(\beta)$ the $\varepsilon$-supports of $\alpha$ and $\beta$, respectively. Then we have $\sum_{k\notin\Gamma_\varepsilon(\alpha)}|\alpha_k|\le\varepsilon$ and $\sum_{l\notin\Gamma_\varepsilon(\beta)}|\beta_l|\le\varepsilon$. Without loss of generality, we assume $\langle H(z),H(z)\rangle=1$. Then

$$1=\langle H(z),H(z)\rangle=\Big\langle\sum_{k=1}^{\infty}\alpha_k\phi_k(z),\sum_{l=1}^{\infty}\beta_l\psi_l(z)\Big\rangle=\sum_{k=1}^{\infty}\sum_{l=1}^{\infty}\alpha_k\langle\phi_k(z),\psi_l(z)\rangle\beta_l.$$

Note that

$$1=|\langle H(z),H(z)\rangle|=\Big|\sum_{k=1}^{\infty}\sum_{l=1}^{\infty}\alpha_k\langle\phi_k(z),\psi_l(z)\rangle\beta_l\Big|\le\mu\sum_{k=1}^{\infty}\sum_{l=1}^{\infty}|\alpha_k||\beta_l|=\mu\sum_{k=1}^{\infty}|\alpha_k|\sum_{l=1}^{\infty}|\beta_l|$$

$$=\mu\Big(\sum_{k\in\Gamma_\varepsilon(\alpha)}|\alpha_k|+\sum_{k\notin\Gamma_\varepsilon(\alpha)}|\alpha_k|\Big)\Big(\sum_{l\in\Gamma_\varepsilon(\beta)}|\beta_l|+\sum_{l\notin\Gamma_\varepsilon(\beta)}|\beta_l|\Big)\le\mu\Big(\sum_{k\in\Gamma_\varepsilon(\alpha)}|\alpha_k|+\varepsilon\Big)\Big(\sum_{l\in\Gamma_\varepsilon(\beta)}|\beta_l|+\varepsilon\Big).$$
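Similarly, we have

$$1=\langle H(z),H(z)\rangle=\Big\langle\sum_{k=1}^{\infty}\alpha_k\phi_k(z),\sum_{l=1}^{\infty}\alpha_l\phi_l(z)\Big\rangle=\sum_{k=1}^{\infty}\sum_{l=1}^{\infty}\alpha_k\langle\phi_k(z),\phi_l(z)\rangle\alpha_l=\sum_{k=1}^{\infty}\alpha_k\langle\phi_k(z),\phi_k(z)\rangle\alpha_k=\sum_{k=1}^{\infty}|\alpha_k|^2$$

$$=\sum_{k\in\Gamma_\varepsilon(\alpha)}|\alpha_k|^2+\sum_{k\notin\Gamma_\varepsilon(\alpha)}|\alpha_k|^2\le\sum_{k\in\Gamma_\varepsilon(\alpha)}|\alpha_k|^2+\sum_{k\notin\Gamma_\varepsilon(\alpha)}|\alpha_k|\,\varepsilon\le\sum_{k\in\Gamma_\varepsilon(\alpha)}|\alpha_k|^2+\varepsilon^2$$

and

$$1=\sum_{l\in\Gamma_\varepsilon(\beta)}|\beta_l|^2+\sum_{l\notin\Gamma_\varepsilon(\beta)}|\beta_l|^2\le\sum_{l\in\Gamma_\varepsilon(\beta)}|\beta_l|^2+\varepsilon^2.$$

The bound of the above expression can be obtained from the optimization problem

$$\max_{\alpha_k,\beta_l}\Big(\sum_{k\in\Gamma_\varepsilon(\alpha)}|\alpha_k|+\varepsilon\Big)\Big(\sum_{l\in\Gamma_\varepsilon(\beta)}|\beta_l|+\varepsilon\Big) \tag{A.1}$$

subject to $|\alpha_k|>0$, $|\beta_l|>0$, $\sum_{k\in\Gamma_\varepsilon(\alpha)}|\alpha_k|^2\ge 1-\varepsilon^2$, $\sum_{l\in\Gamma_\varepsilon(\beta)}|\beta_l|^2\ge 1-\varepsilon^2$. This can be separated into two optimization problems:

$$\max_{\alpha_k}\sum_{k\in\Gamma_\varepsilon(\alpha)}|\alpha_k|\quad\text{subject to}\quad|\alpha_k|>0,\ \sum_{k\in\Gamma_\varepsilon(\alpha)}|\alpha_k|^2\ge 1-\varepsilon^2, \tag{A.2}$$

$$\max_{\beta_l}\sum_{l\in\Gamma_\varepsilon(\beta)}|\beta_l|\quad\text{subject to}\quad|\beta_l|>0,\ \sum_{l\in\Gamma_\varepsilon(\beta)}|\beta_l|^2\ge 1-\varepsilon^2. \tag{A.3}$$

The optimization (A.2) can be solved through

$$\max_{\alpha_k}\sum_{k\in\Gamma_\varepsilon(\alpha)}|\alpha_k|\quad\text{subject to}\quad|\alpha_k|>0,\ \sum_{k\in\Gamma_\varepsilon(\alpha)}|\alpha_k|^2=C, \tag{A.4}$$

where $C\in[1-\varepsilon^2,1]$ is a constant. Using the Lagrange multiplier method, we have the Lagrangian function

$$F(|\alpha_k|,\lambda)=\sum_{k\in\Gamma_\varepsilon(\alpha)}|\alpha_k|+\lambda\Big(\sum_{k\in\Gamma_\varepsilon(\alpha)}|\alpha_k|^2-C\Big).$$

Setting the partial derivative to zero,

$$\frac{\partial F(|\alpha_k|,\lambda)}{\partial|\alpha_k|}=1+2\lambda|\alpha_k|=0,$$

we see that all the $|\alpha_k|$ are equal for $k\in\Gamma_\varepsilon(\alpha)$. Denote $\|\alpha\|_{0(\varepsilon)}=A$. Then $|\alpha_k|=\sqrt{C/A}$, and hence the maximum of (A.4) is $A\sqrt{C/A}=\sqrt{AC}$. Since $C\in[1-\varepsilon^2,1]$, the maximum of (A.2) is $\sqrt A=\sqrt{\|\alpha\|_{0(\varepsilon)}}$. Similarly, the maximum of (A.3) is $\sqrt{\|\beta\|_{0(\varepsilon)}}$. Therefore, the maximum of (A.1) is $\big(\sqrt{\|\alpha\|_{0(\varepsilon)}}+\varepsilon\big)\big(\sqrt{\|\beta\|_{0(\varepsilon)}}+\varepsilon\big)$, and

$$1\le\mu\Big(\sum_{k\in\Gamma_\varepsilon(\alpha)}|\alpha_k|+\varepsilon\Big)\Big(\sum_{l\in\Gamma_\varepsilon(\beta)}|\beta_l|+\varepsilon\Big)\le\mu\Big(\sqrt{\|\alpha\|_{0(\varepsilon)}}+\varepsilon\Big)\Big(\sqrt{\|\beta\|_{0(\varepsilon)}}+\varepsilon\Big).$$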
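The value $\sqrt{AC}$ obtained from the Lagrangian for (A.4) can be checked numerically. The sketch below is illustrative only: the names `l1_norm_on_sphere`, `best_random` and `equal_entries`, as well as the values of `A` and `C`, are assumptions chosen for the example. It draws random nonnegative vectors of length $A$ with squared sum $C$ and compares their $\ell_1$ norms with the equal-entry stationary point.

```python
import math
import random

def l1_norm_on_sphere(A, C, rng):
    """l1 norm of a random positive vector of length A whose squares sum to C."""
    x = [rng.random() + 1e-9 for _ in range(A)]
    scale = math.sqrt(C / sum(v * v for v in x))
    return sum(v * scale for v in x)

rng = random.Random(0)
A, C = 7, 0.96                       # illustrative support size and energy
best_random = max(l1_norm_on_sphere(A, C, rng) for _ in range(10000))
equal_entries = A * math.sqrt(C / A)  # stationary point |alpha_k| = sqrt(C/A)
```

By the Cauchy-Schwarz inequality no feasible vector can exceed $\sqrt{AC}$, so `best_random` stays below `equal_entries`, which agrees with the maximum derived above.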
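Using the inequality between the geometric and arithmetic means, we have

$$1\le\mu\Big(\sqrt{\|\alpha\|_{0(\varepsilon)}}+\varepsilon\Big)\Big(\sqrt{\|\beta\|_{0(\varepsilon)}}+\varepsilon\Big)\le\mu\,\frac{\big(\sqrt{\|\alpha\|_{0(\varepsilon)}}+\varepsilon\big)^2+\big(\sqrt{\|\beta\|_{0(\varepsilon)}}+\varepsilon\big)^2}{2}.$$

That is,

$$\Big(\sqrt{\|\alpha\|_{0(\varepsilon)}}+\varepsilon\Big)^2+\Big(\sqrt{\|\beta\|_{0(\varepsilon)}}+\varepsilon\Big)^2\ge\frac{2}{\mu}.$$

∎

Proof of Theorem 2. Suppose there are two different sparse representations of the transfer function $H(z)$ under the two ORF bases $\{\phi_k(z)\}_{k=1}^{\infty}$ and $\{\psi_l(z)\}_{l=1}^{\infty}$, that is,

$$H(z)=\sum_{k=1}^{\infty}\theta_k^{\phi}\phi_k(z)+\sum_{l=1}^{\infty}\theta_l^{\psi}\psi_l(z)=\sum_{k=1}^{\infty}\xi_k^{\phi}\phi_k(z)+\sum_{l=1}^{\infty}\xi_l^{\psi}\psi_l(z)$$

and

$$\Big(\sqrt{\|\theta_1\|_{0(\varepsilon)}}+\varepsilon\Big)^2+\Big(\sqrt{\|\theta_2\|_{0(\varepsilon)}}+\varepsilon\Big)^2<\frac1\mu,\qquad\Big(\sqrt{\|\xi_1\|_{0(\varepsilon)}}+\varepsilon\Big)^2+\Big(\sqrt{\|\xi_2\|_{0(\varepsilon)}}+\varepsilon\Big)^2<\frac1\mu,$$

where $\xi_1=[\xi_1^{\phi},\xi_2^{\phi},\cdots]^T$ and $\xi_2=[\xi_1^{\psi},\xi_2^{\psi},\cdots]^T$. Then

$$\sum_{k=1}^{\infty}(\theta_k^{\phi}-\xi_k^{\phi})\phi_k(z)=\sum_{l=1}^{\infty}(\xi_l^{\psi}-\theta_l^{\psi})\psi_l(z).$$

According to the uncertainty principle, we have

$$\Big(\sqrt{\|\theta_1-\xi_1\|_{0(\varepsilon)}}+\varepsilon\Big)^2+\Big(\sqrt{\|\theta_2-\xi_2\|_{0(\varepsilon)}}+\varepsilon\Big)^2\ge\frac2\mu. \tag{A.5}$$

However, based on the sparsity assumptions above,

$$\Big(\sqrt{\|\theta_1-\xi_1\|_{0(\varepsilon)}}+\varepsilon\Big)^2+\Big(\sqrt{\|\theta_2-\xi_2\|_{0(\varepsilon)}}+\varepsilon\Big)^2<\Big(\sqrt{\|\theta_1\|_{0(\varepsilon)}+\|\xi_1\|_{0(\varepsilon)}}+\varepsilon\Big)^2+\Big(\sqrt{\|\theta_2\|_{0(\varepsilon)}+\|\xi_2\|_{0(\varepsilon)}}+\varepsilon\Big)^2$$

$$<\Big(\sqrt{\|\theta_1\|_{0(\varepsilon)}}+\varepsilon\Big)^2+\Big(\sqrt{\|\xi_1\|_{0(\varepsilon)}}+\varepsilon\Big)^2+\Big(\sqrt{\|\theta_2\|_{0(\varepsilon)}}+\varepsilon\Big)^2+\Big(\sqrt{\|\xi_2\|_{0(\varepsilon)}}+\varepsilon\Big)^2<\frac2\mu,$$

which contradicts (A.5). ∎

Proof of Theorem 3. The integral definition of the inner product shows that when $N$ is sufficiently large,

$$\frac{1}{2\pi}\sum_{r=1}^{N}z_r^{-k}z_r^{-l}\,\frac{2\pi}{N}\to\langle z^{-k},z^{-l}\rangle=\delta_{kl},$$

where the Kronecker symbol $\delta_{kl}$ equals 1 if $k=l$ and 0 if $k\ne l$.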
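The Riemann-sum approximation of $\langle z^{-k},z^{-l}\rangle$ can be illustrated with a short computation. This is a sketch under two stated assumptions that the text does not spell out: the sample points are taken as $z_r=e^{2\pi i r/N}$ (a uniform grid on the unit circle), and the second factor is complex-conjugated, as in the standard inner product on the unit circle. The function name `discrete_inner` is hypothetical.

```python
import cmath

def discrete_inner(k, l, N):
    """(1/N) * sum over r of z_r^{-k} * conj(z_r^{-l}), z_r = exp(2*pi*i*r/N)."""
    total = 0j
    for r in range(1, N + 1):
        z = cmath.exp(2j * cmath.pi * r / N)   # assumed uniform sample point
        total += z ** (-k) * (z ** (-l)).conjugate()
    return total / N
```

On such a grid the sum equals $\delta_{kl}$ exactly (up to rounding) whenever $|k-l|<N$; indices that differ by a multiple of $N$ alias to 1, which is why the limit statement needs $N$ sufficiently large.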
Then the $(k,l)$ element of $\Phi^*\Phi$ is

$$\sum_{r=1}^{N}\phi_k(z_r)\phi_l(z_r)=\sum_{r=1}^{N}\sum_{d'=0}^{\infty}b_{d'k}z_r^{-d'}\sum_{d=0}^{\infty}b_{dl}z_r^{-d}=\sum_{d'=0}^{\infty}\sum_{d=0}^{\infty}b_{d'k}b_{dl}\sum_{r=1}^{N}z_r^{-d'}z_r^{-d}\to N\sum_{d=0}^{\infty}b_{dk}b_{dl}=N\delta_{kl},$$

where the last equality is based on the orthonormality of $\{\phi_k(z)\}$:

$$\langle\phi_k(z),\phi_l(z)\rangle=\Big\langle\sum_{d'=0}^{\infty}b_{d'k}z^{-d'},\sum_{d=0}^{\infty}b_{dl}z^{-d}\Big\rangle=\sum_{d'=d}\langle b_{d'k}z^{-d'},b_{d'l}z^{-d}\rangle+\sum_{d'\ne d}\langle b_{d'k}z^{-d'},b_{dl}z^{-d}\rangle$$

$$=\sum_{d=0}^{\infty}b_{dk}b_{dl}\langle z^{-d},z^{-d}\rangle+\sum_{d'\ne d}b_{d'k}b_{dl}\langle z^{-d'},z^{-d}\rangle=\sum_{d=0}^{\infty}b_{dk}b_{dl}=\delta_{kl},$$

which implies (1). The $(k,l)$ element of $\Psi^*\Psi$ is

$$\sum_{r=1}^{N}\psi_k(z_r)\psi_l(z_r)=\sum_{r=1}^{N}\sum_{d'=0}^{\infty}a_{d'k}z_r^{-d'}\sum_{d=0}^{\infty}a_{dl}z_r^{-d}=\sum_{d'=0}^{\infty}\sum_{d=0}^{\infty}a_{d'k}a_{dl}\sum_{r=1}^{N}z_r^{-d'}z_r^{-d}\to N\sum_{d=0}^{\infty}a_{dk}a_{dl}=N\delta_{kl},$$

where the last equality is based on the orthonormality of $\{\psi_l(z)\}$:

$$\langle\psi_k(z),\psi_l(z)\rangle=\Big\langle\sum_{d'=0}^{\infty}a_{d'k}z^{-d'},\sum_{d=0}^{\infty}a_{dl}z^{-d}\Big\rangle=\sum_{d'=d}\langle a_{d'k}z^{-d'},a_{d'l}z^{-d}\rangle+\sum_{d'\ne d}\langle a_{d'k}z^{-d'},a_{dl}z^{-d}\rangle$$

$$=\sum_{d=0}^{\infty}a_{dk}a_{dl}\langle z^{-d},z^{-d}\rangle+\sum_{d'\ne d}a_{d'k}a_{dl}\langle z^{-d'},z^{-d}\rangle=\sum_{d=0}^{\infty}a_{dk}a_{dl}=\delta_{kl},$$

which implies (2). The $(k,l)$ element of $\Phi^*\Psi$ is

$$\sum_{r=1}^{N}\phi_k(z_r)\psi_l(z_r)=\sum_{r=1}^{N}\sum_{d'=0}^{\infty}b_{d'k}z_r^{-d'}\sum_{d=0}^{\infty}a_{dl}z_r^{-d}=\sum_{d'=0}^{\infty}\sum_{d=0}^{\infty}b_{d'k}a_{dl}\sum_{r=1}^{N}z_r^{-d'}z_r^{-d}\to N\sum_{d=0}^{\infty}b_{dk}a_{dl}.$$

∎

B Proof of Theorem 5

We now give the general idea for proving Theorem 5. The matrix $[\Phi_{\Omega T_1}\ \Psi_{\Omega T_2}]^*[\Phi_{\Omega T_1}\ \Psi_{\Omega T_2}]$ can be written as

$$[\Phi_{\Omega T_1}\ \Psi_{\Omega T_2}]^*[\Phi_{\Omega T_1}\ \Psi_{\Omega T_2}]=\sum_{k=1}^{N}\delta_k[u_k\ v_k]\otimes[u_k\ v_k]=\sum_{k=1}^{N}\delta_k\begin{pmatrix}u_k\otimes u_k&u_k\otimes v_k\\v_k\otimes u_k&v_k\otimes v_k\end{pmatrix},$$

where $u_k$ and $v_k$ are the row vectors of $\Phi_{T_1}$ and $\Psi_{T_2}$, respectively, and $\delta_k$ obeys (5.10). Denote

$$Y=\frac1m[\Phi_{\Omega T_1}\ \Psi_{\Omega T_2}]^*[\Phi_{\Omega T_1}\ \Psi_{\Omega T_2}]-F=\frac1m\sum_{k=1}^{N}\delta_k\begin{pmatrix}u_k\otimes u_k&u_k\otimes v_k\\v_k\otimes u_k&v_k\otimes v_k\end{pmatrix}-F.$$
(B.1)

The proof includes two steps: (1) $E\|Y\|$ is upper bounded by a small constant, i.e. on average the matrix $\frac1m[\Phi_{\Omega T_1}\ \Psi_{\Omega T_2}]^*[\Phi_{\Omega T_1}\ \Psi_{\Omega T_2}]$ deviates little from $F$. (2) $\|Y\|-E\|Y\|$ is bounded with high probability. We present these results in Lemma 3 and Lemma 4, respectively.

Lemma 3. Let $[\Phi\ \Psi]$, $T$ and $\Omega$ be as defined in Theorem 5. Then

$$E\Big\|\frac1m[\Phi_{\Omega T_1}\ \Psi_{\Omega T_2}]^*[\Phi_{\Omega T_1}\ \Psi_{\Omega T_2}]-F\Big\|\le C_R\,\frac{\sqrt{1+3\|F\|}}{2}\cdot\frac{\mu_M\sqrt{|T|\log|T|}}{\sqrt m},$$

for some positive constant $C_R$, provided that $\frac{C_R}{2}\cdot\frac{\sqrt{\log|T|}}{\sqrt m}\cdot\max_{1\le k\le N}\|[u_k\ v_k]\|$ is less than 1.

Proof. First we have

$$E\,Y=\begin{pmatrix}E\frac1m\sum_{k=1}^{N}\delta_k u_k\otimes u_k-I&E\frac1m\sum_{k=1}^{N}\delta_k u_k\otimes v_k-\mu\langle\Phi_{T_1},\Psi_{T_2}\rangle\\E\frac1m\sum_{k=1}^{N}\delta_k v_k\otimes u_k-\mu\langle\Phi_{T_1},\Psi_{T_2}\rangle&E\frac1m\sum_{k=1}^{N}\delta_k v_k\otimes v_k-I\end{pmatrix}$$

$$=\begin{pmatrix}\frac1m\sum_{k=1}^{N}\frac mN u_k\otimes u_k-I&\frac1m\sum_{k=1}^{N}\frac mN u_k\otimes v_k-\mu\langle\Phi_{T_1},\Psi_{T_2}\rangle\\\frac1m\sum_{k=1}^{N}\frac mN v_k\otimes u_k-\mu\langle\Phi_{T_1},\Psi_{T_2}\rangle&\frac1m\sum_{k=1}^{N}\frac mN v_k\otimes v_k-I\end{pmatrix}$$

$$=\begin{pmatrix}\frac1N\sum_{k=1}^{N}u_k\otimes u_k-I&\frac1N\sum_{k=1}^{N}u_k\otimes v_k-\mu\langle\Phi_{T_1},\Psi_{T_2}\rangle\\\frac1N\sum_{k=1}^{N}v_k\otimes u_k-\mu\langle\Phi_{T_1},\Psi_{T_2}\rangle&\frac1N\sum_{k=1}^{N}v_k\otimes v_k-I\end{pmatrix}=O.$$

We use the symmetrization technique in [44]. Let $\delta'_1,\cdots,\delta'_N$ be independent copies of $\delta_1,\cdots,\delta_N$, and let $\epsilon_1,\cdots,\epsilon_N$ be a sequence of Bernoulli variables taking values $\pm1$ with probability $\frac12$ (and independent of the sequences $\delta$ and $\delta'$). Then

$$E\|Y\|\le E_{\delta,\delta'}\Big\|\frac1m\sum_{k=1}^{N}(\delta_k-\delta'_k)[u_k\ v_k]\otimes[u_k\ v_k]\Big\|=E_{\epsilon}E_{\delta,\delta'}\Big\|\frac1m\sum_{k=1}^{N}\epsilon_k(\delta_k-\delta'_k)[u_k\ v_k]\otimes[u_k\ v_k]\Big\|\le 2E_{\epsilon}E_{\delta}\Big\|\frac1m\sum_{k=1}^{N}\epsilon_k\delta_k[u_k\ v_k]\otimes[u_k\ v_k]\Big\|,$$

where the equality follows from the symmetry of the random variable $(\delta_k-\delta'_k)[u_k\ v_k]\otimes[u_k\ v_k]$, while the last inequality follows from the triangle inequality.
Rudelson's lemma [50] states that

$$E_{\epsilon}\Big\|\sum_{k=1}^{N}\epsilon_k\delta_k[u_k\ v_k]\otimes[u_k\ v_k]\Big\|\le\frac{C_R}{4}\cdot\sqrt{\log|T|}\cdot\max_{k:\delta_k=1}\|[u_k\ v_k]\|\cdot\sqrt{\Big\|\sum_{k=1}^{N}\delta_k[u_k\ v_k]\otimes[u_k\ v_k]\Big\|},$$

for some universal constant $C_R>0$. Taking expectation over $\delta$ then gives

$$E\|Y\|\le\frac{C_R}{2}\cdot\frac{\sqrt{\log|T|}}{m}\cdot\max_{1\le k\le N}\|[u_k\ v_k]\|\cdot E\sqrt{\Big\|\sum_{k=1}^{N}\delta_k[u_k\ v_k]\otimes[u_k\ v_k]\Big\|}\le\frac{C_R}{2}\cdot\frac{\sqrt{\log|T|}}{m}\cdot\max_{1\le k\le N}\|[u_k\ v_k]\|\cdot\sqrt{E\Big\|\sum_{k=1}^{N}\delta_k[u_k\ v_k]\otimes[u_k\ v_k]\Big\|}.$$

Observe that

$$E\Big\|\sum_{k=1}^{N}\delta_k[u_k\ v_k]\otimes[u_k\ v_k]\Big\|=E\|mY+mF\|\le m\,(E\|Y\|+\|F\|).$$

Therefore

$$E\|Y\|\le a\sqrt{E\|Y\|+\|F\|},\qquad a=\frac{C_R}{2}\cdot\frac{\sqrt{\log|T|}}{\sqrt m}\cdot\max_{1\le k\le N}\|[u_k\ v_k]\|,$$

which implies

$$0\le E\|Y\|\le\frac{a^2+\sqrt{a^4+4a^2\|F\|}}{2}.$$

It then follows that if $a\le1$,

$$E\|Y\|\le\sqrt{1+3\|F\|}\;a.$$

This inequality is based on the following fact: when $a\le1$,

$$\Big(a^2+\sqrt{a^4+4a^2\|F\|}\Big)^2=2a^4+2a^3\sqrt{a^2+4\|F\|}+4a^2\|F\|\le 2a^2+2a^2\sqrt{1+4\|F\|}+4a^2\|F\|\le(4+12\|F\|)\,a^2.$$

Since

$$\max_{1\le k\le N}\|[u_k\ v_k]\|\le\sqrt{|T_1|\mu_\Phi^2+|T_2|\mu_\Psi^2}\le\sqrt{|T_1|+|T_2|}\,\max\{\mu_\Phi,\mu_\Psi\}=\sqrt{|T|}\,\mu_M,$$

we obtain

$$a\le\frac{C_R}{2}\cdot\frac{\sqrt{\log|T|}}{\sqrt m}\cdot\sqrt{|T|}\,\mu_M,$$

which completes the proof.

Lemma 4. Let $[\Phi\ \Psi]$, $T$ and $\Omega$ be as defined in Theorem 5. Then for all $t\ge0$,

$$P\big\{\big|\|Y\|-E\|Y\|\big|>t\big\}\le 3\exp\Big(-\frac{tm}{K|T|\mu_M^2}\log\Big(1+\frac{t}{2+E\|Y\|}\Big)\Big),$$

where $K$ is a numerical constant.

The proof of Lemma 4 uses a concentration inequality about the large deviation of suprema of sums of independent variables [47], which is stated below.

Lemma 5. Let $Y_1,Y_2,\cdots,Y_n$ be a sequence of independent random variables taking values in a Banach space and let $Z$ be the supremum defined as

$$Z=\sup_{f\in\mathcal F}\sum_{i=1}^{n}f(Y_i),$$

where $\mathcal F$ is a countable family of real-valued functions. Assume that $|f|\le B$ for every $f$ in $\mathcal F$, and $E f(Y_i)=0$ for every $f$ in $\mathcal F$ and $i=1,2,\cdots,n$.
Then for all $t\ge0$,

$$P(|Z-EZ|>t)\le 3\exp\Big(-\frac{t}{KB}\log\Big(1+\frac{Bt}{\sigma^2+B\,E\bar Z}\Big)\Big),$$

where $\sigma^2=\sup_{f\in\mathcal F}\sum_{k=1}^{n}E f^2(Y_k)$, $\bar Z=\sup_{f\in\mathcal F}\big|\sum_{i=1}^{n}f(Y_i)\big|$, and $K$ is a numerical constant.

Proof of Lemma 4. Since $\frac1N\sum_{k=1}^{N}[u_k\ v_k]\otimes[u_k\ v_k]=F$, we can express $Y$ in (B.1) as

$$Y=\sum_{k=1}^{N}\Big(\delta_k-\frac mN\Big)\frac{[u_k\ v_k]\otimes[u_k\ v_k]}{m}:=\sum_{k=1}^{N}Y_k,\qquad Y_k:=\Big(\delta_k-\frac mN\Big)\frac{[u_k\ v_k]\otimes[u_k\ v_k]}{m}.$$

According to the definition of the spectral norm,

$$\|Y\|=\sup_{f_1,f_2}\langle f_1,Yf_2\rangle=\sup_{f_1,f_2}\sum_{k=1}^{N}\langle f_1,Y_kf_2\rangle,$$

where the supremum is over a countable collection of unit vectors. Let $f(Y_k)$ denote the mapping $\langle f_1,Y_kf_2\rangle$. Then we have $E f(Y_k)=0$ and, for all $k$,

$$|f(Y_k)|=|\langle f_1,Y_kf_2\rangle|\le\frac{|\langle f_1,[u_k\ v_k]\rangle\langle[u_k\ v_k],f_2\rangle|}{m}\le\frac{\|[u_k\ v_k]\|^2}{m}\le B,$$

where $B=\max_{1\le k\le N}\frac{\|[u_k\ v_k]\|^2}{m}$. Hence, according to Lemma 5, for all $t\ge0$,

$$P\big(\big|\|Y\|-E\|Y\|\big|>t\big)\le 3\exp\Big(-\frac{t}{KB}\log\Big(1+\frac{Bt}{\sigma^2+B\,E\|Y\|}\Big)\Big), \tag{B.2}$$

where $\sigma^2=\sup_f\sum_{k=1}^{N}E f^2(Y_k)$.
We now compute

$$E f^2(Y_k)=\frac mN\Big(1-\frac mN\Big)\frac{|\langle f_1,[u_k\ v_k]\rangle\langle[u_k\ v_k],f_2\rangle|^2}{m^2}\le\frac mN\Big(1-\frac mN\Big)\frac{\|[u_k\ v_k]\|^2}{m^2}|\langle[u_k\ v_k],f_2\rangle|^2,$$

where the inequality is based on the Cauchy inequality, and

$$\sum_{k=1}^{N}E f^2(Y_k)\le\sum_{k=1}^{N}\frac mN\Big(1-\frac mN\Big)\frac{\|[u_k\ v_k]\|^2}{m^2}|\langle[u_k\ v_k],f_2\rangle|^2\le\frac1N\Big(1-\frac mN\Big)\max_{1\le k\le N}\frac{\|[u_k\ v_k]\|^2}{m}\sum_{k=1}^{N}|\langle[u_k\ v_k],f_2\rangle|^2$$

$$\le\frac1N\Big(1-\frac mN\Big)B\sum_{k=1}^{N}|\langle[u_k\ v_k],f_2\rangle|^2\le 2B.$$

The last inequality is based on

$$\sum_{k=1}^{N}|\langle[u_k\ v_k],f_2\rangle|^2=\sum_{k=1}^{N}\big(\Phi_{k1}f_{2,1}+\cdots+\Phi_{kT_1}f_{2,T_1}+\Psi_{k1}f_{2,T_1+1}+\cdots+\Psi_{kT_2}f_{2,T_1+T_2}\big)^2$$

$$\le 2\sum_{k=1}^{N}\big(\Phi_{k1}f_{2,1}+\cdots+\Phi_{kT_1}f_{2,T_1}\big)^2+2\sum_{k=1}^{N}\big(\Psi_{k1}f_{2,T_1+1}+\cdots+\Psi_{kT_2}f_{2,T_1+T_2}\big)^2$$

$$=2\sum_{k=1}^{N}\big[(\Phi_{k1}f_{2,1})^2+\cdots+(\Phi_{kT_1}f_{2,T_1})^2\big]+2\sum_{k=1}^{N}\big[(\Psi_{k1}f_{2,T_1+1})^2+\cdots+(\Psi_{kT_2}f_{2,T_1+T_2})^2\big]$$

$$=2N\big(f_{2,1}^2+\cdots+f_{2,T_1}^2\big)+2N\big(f_{2,T_1+1}^2+\cdots+f_{2,T_1+T_2}^2\big)\le 2N, \tag{B.3}$$

where $f_{2,j}$ is the $j$-th element of $f_2$, and the second and third equalities are based on (4.1). Then $\sigma^2\le 2B$, so (B.2) can be rewritten as

$$P\big(\big|\|Y\|-E\|Y\|\big|>t\big)\le 3\exp\Big(-\frac{t}{KB}\log\Big(1+\frac{t}{2+E\|Y\|}\Big)\Big).$$

Since $B\le\frac{|T|\mu_M^2}{m}$, this implies

$$P\big(\big|\|Y\|-E\|Y\|\big|>t\big)\le 3\exp\Big(-\frac{tm}{K|T|\mu_M^2}\log\Big(1+\frac{t}{2+E\|Y\|}\Big)\Big).$$

Proof of Theorem 5. By Lemma 3, taking $m\ge 4C_R^2(1+3\|F\|)\mu_M^2|T|\log|T|$ ensures that $E\|Y\|\le\frac14$. Picking $t=\frac14$, Lemma 4 then gives

$$P(\|Y\|>1/2)\le 3\exp\Big(-\frac{m}{C_T|T|\mu_M^2}\Big),$$

where $C_T=4K/\log(10/9)$. Taking $m\ge C_T|T|\mu_M^2\log\big(\frac3\delta\big)$ ensures that $3\exp\big(-\frac{m}{C_T|T|\mu_M^2}\big)\le\delta$, which completes the proof of the first result (1) in Theorem 5.
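The constants in this step follow from substituting $t=\frac14$ and $E\|Y\|\le\frac14$ into Lemma 4, and the arithmetic can be reproduced as below. This is only a sketch: the function name `theorem5_threshold` is hypothetical, and `C_R=1.0` and `K=1.0` are placeholder values, because the proof treats $C_R$ and $K$ as unspecified numerical constants.

```python
import math

def theorem5_threshold(T, mu_M, F_norm, delta, C_R=1.0, K=1.0):
    """Return (log_term, C_T, m_mean, m_tail) for result (1) of Theorem 5.

    C_R and K are placeholders for the unspecified constants; T must be >= 2
    so that log|T| is positive.
    """
    # log(1 + t / (2 + E||Y||)) at t = 1/4 and E||Y|| = 1/4
    log_term = math.log(1 + 0.25 / (2 + 0.25))
    C_T = 4 * K / math.log(10 / 9)
    # Condition from Lemma 3 that makes E||Y|| <= 1/4
    m_mean = 4 * C_R ** 2 * (1 + 3 * F_norm) * mu_M ** 2 * T * math.log(T)
    # Condition that makes the tail bound 3*exp(-m/(C_T*|T|*mu_M^2)) <= delta
    m_tail = C_T * T * mu_M ** 2 * math.log(3 / delta)
    return log_term, C_T, m_mean, m_tail
```

The first returned value coincides with $\log(10/9)$, which is where the definition $C_T=4K/\log(10/9)$ comes from; a number of measurements at least `max(m_mean, m_tail)` satisfies both requirements used in the proof.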
For the result (2), we have

$$P\Big(\Big\|\frac1m[\Phi_{\Omega T_1}\ \Psi_{\Omega T_2}]^*[\Phi_{\Omega T_1}\ \Psi_{\Omega T_2}]-I\Big\|>\tfrac12+\|F-I\|\Big)=P\Big(\Big\|\frac1m[\Phi_{\Omega T_1}\ \Psi_{\Omega T_2}]^*[\Phi_{\Omega T_1}\ \Psi_{\Omega T_2}]-F+F-I\Big\|>\tfrac12+\|F-I\|\Big)$$

$$\le P\Big(\Big\|\frac1m[\Phi_{\Omega T_1}\ \Psi_{\Omega T_2}]^*[\Phi_{\Omega T_1}\ \Psi_{\Omega T_2}]-F\Big\|+\|F-I\|>\tfrac12+\|F-I\|\Big)=P\Big(\Big\|\frac1m[\Phi_{\Omega T_1}\ \Psi_{\Omega T_2}]^*[\Phi_{\Omega T_1}\ \Psi_{\Omega T_2}]-F\Big\|>\tfrac12\Big)\le\delta.$$

By simple calculation, we have $\|F-I\|=\|\Phi_{T_1}^*\Psi_{T_2}/N\|$, which completes the proof. ∎

C Proof of Theorem 6

In the sequel, we will show that $|\pi(t)|<1$ for $t\in T^c$ with high probability. For a particular $t_0\in T^c$, rewrite

$$\pi(t_0)=\big\langle v_0,\big[[\Phi_{\Omega T_1}\ \Psi_{\Omega T_2}]^*[\Phi_{\Omega T_1}\ \Psi_{\Omega T_2}]\big]^{-1}\tau\big\rangle=\langle w_0,\tau\rangle,$$

where $v_0$ is the $t_0$-th row vector of $[\Phi_\Omega\ \Psi_\Omega]^*[\Phi_{\Omega T_1}\ \Psi_{\Omega T_2}]$ and $w_0=\big[[\Phi_{\Omega T_1}\ \Psi_{\Omega T_2}]^*[\Phi_{\Omega T_1}\ \Psi_{\Omega T_2}]\big]^{-1}v_0$. Set $\lambda_k^0=[\Phi\ \Psi]_{kt_0}$. The vector $v_0$ is given by

$$v_0=\sum_{k=1}^{N}\delta_k\lambda_k^0[u_k\ v_k].$$

Let

$$\tilde v_0=v_0-E v_0=\sum_{k=1}^{N}(\delta_k-E\delta_k)\lambda_k^0[u_k\ v_k].$$

The following lemmas give estimates for the size of these vectors.

Lemma 6. The second moments of $\|\tilde v_0\|$ and $\|v_0\|$ obey

$$E\|\tilde v_0\|^2\le m|T|\mu_M^2\qquad\text{and}\qquad E\|v_0\|^2\le 2m|T|\mu_M^2+2m^2\mu^2\max\{|T_1|,|T_2|\},$$

respectively.

Proof. $\tilde v_0$ can be viewed as a sum of independent random variables:

$$\tilde v_0=\sum_{k=1}^{N}Y_k,\qquad Y_k=(\delta_k-E\delta_k)\lambda_k^0[u_k\ v_k],$$

where $E Y_k=0$. It follows that

$$E\|\tilde v_0\|^2=\sum_kE\langle Y_k,Y_k\rangle+\sum_{k'\ne k}E\langle Y_k,Y_{k'}\rangle=\sum_kE\langle Y_k,Y_k\rangle=\sum_k\frac mN\Big(1-\frac mN\Big)|\lambda_k^0|^2\|[u_k\ v_k]\|^2$$

$$\le\frac mN\Big(1-\frac mN\Big)\big(|T_1|\mu_\Phi^2+|T_2|\mu_\Psi^2\big)\sum_k|\lambda_k^0|^2=m\Big(1-\frac mN\Big)\big(|T_1|\mu_\Phi^2+|T_2|\mu_\Psi^2\big)\le m|T|\mu_M^2,$$

where the last equality is based on $\sum_k|\lambda_k^0|^2=N$.
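The identity $E\|\tilde v_0\|^2=\sum_k\frac mN(1-\frac mN)|\lambda_k^0|^2\|[u_k\ v_k]\|^2$ relies only on the independence and zero mean of $\delta_k-E\delta_k$, so it can be checked exactly on a small instance by enumerating all outcomes of the Bernoulli selectors. The sketch below is illustrative: the function names and the test vectors (each standing for a product $\lambda_k^0[u_k\ v_k]$) are assumptions, and `p` plays the role of $m/N$.

```python
import itertools

def second_moment_exact(vectors, p):
    """E || sum_k (delta_k - p) c_k ||^2, enumerating all 0/1 outcomes of delta."""
    n = len(vectors)
    dim = len(vectors[0])
    total = 0.0
    for outcome in itertools.product((0, 1), repeat=n):
        prob = 1.0
        for d in outcome:
            prob *= p if d == 1 else 1 - p
        s = [sum((d - p) * c[j] for d, c in zip(outcome, vectors))
             for j in range(dim)]
        total += prob * sum(abs(x) ** 2 for x in s)
    return total

def second_moment_formula(vectors, p):
    """p * (1 - p) * sum_k ||c_k||^2: the cross terms vanish by independence."""
    return p * (1 - p) * sum(sum(abs(x) ** 2 for x in c) for c in vectors)
```

For example, with the four vectors `[[1, 2], [0.5, -1], [3, 0], [-1, 1]]` and `p = 0.25`, both functions return the same value, since every cross term $E\langle Y_k,Y_{k'}\rangle$ with $k\ne k'$ is zero.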
Hence

$$E\|v_0\|^2=E\Big\|\tilde v_0+\sum_{k=1}^{N}E\delta_k\,\lambda_k^0[u_k\ v_k]\Big\|^2=E\Big\|\tilde v_0+\frac mN\sum_{k=1}^{N}\lambda_k^0[u_k\ v_k]\Big\|^2\le 2E\Big(\|\tilde v_0\|^2+\Big\|\frac mN\sum_{k=1}^{N}\lambda_k^0[u_k\ v_k]\Big\|^2\Big).$$

Notice that if $t_0\in T_1^c$, then

$$\sum_{k=1}^{N}\lambda_k^0[u_k\ v_k]=\Phi_{t_0}^*[\Phi_{T_1}\ \Psi_{T_2}]=\Big[\vec 0\ \ \sum_{k=1}^{N}\lambda_k^0v_k\Big],$$

which implies that

$$\Big\|\frac mN\sum_{k=1}^{N}\lambda_k^0[u_k\ v_k]\Big\|=\Big\|\frac mN\sum_{k=1}^{N}\lambda_k^0v_k\Big\|=m\Big\|\frac{\langle\Phi_{t_0},\Psi_{T_2}\rangle}{N}\Big\|\le m\mu\sqrt{|T_2|}.$$

Similarly, if $t_0\in T_2^c$, then

$$\Big\|\frac mN\sum_{k=1}^{N}\lambda_k^0[u_k\ v_k]\Big\|\le m\mu\sqrt{|T_1|}.$$

In summary, if $t_0\in T^c$, then

$$\Big\|\frac mN\sum_{k=1}^{N}\lambda_k^0[u_k\ v_k]\Big\|\le m\mu\max\{\sqrt{|T_1|},\sqrt{|T_2|}\}. \tag{C.1}$$

Hence,

$$E\|v_0\|^2\le 2m|T|\mu_M^2+2m^2\mu^2\max\{|T_1|,|T_2|\}.$$

Lemma 7. Fix $t_0\in T^c$. Define $\bar\sigma$ as

$$\bar\sigma^2=\max\{2m\mu_M^2,\ \sqrt m\,|T|\mu_M^3\}=m\mu_M^2\max\Big\{2,\ \frac{|T|\mu_M}{\sqrt m}\Big\}.$$

For $a>0$ obeying $a\le\frac{\sqrt{2m}}{\sqrt{|T|}\,\mu_M}$ if $\frac{|T|\mu_M}{\sqrt m}\le2$ and $a\le\big(\frac{m}{\mu_M^2}\big)^{1/4}$ otherwise, we have

$$P\big(\|v_0\|>a\bar\sigma+m\mu\sqrt{|T|}+\sqrt{m|T|}\,\mu_M\big)\le 3e^{-\gamma a^2},$$

for some positive constant $\gamma>0$.

Proof. By definition, $\|\tilde v_0\|$ is given by

$$\|\tilde v_0\|=\sup_{\|f\|=1}\langle\tilde v_0,f\rangle=\sup_{\|f\|=1}\sum_{k=1}^{N}\langle Y_k,f\rangle.$$

For a fixed unit vector $f$, let $f(Y_k)$ denote the mapping $\langle Y_k,f\rangle$. Since $E Y_k=0$, we have $E f(Y_k)=0$, and

$$|f(Y_k)|\le|\lambda_k^0|\,|\langle f,[u_k\ v_k]\rangle|\le|\lambda_k^0|\,\|[u_k\ v_k]\|\le\max\{\mu_\Phi,\mu_\Psi\}\cdot\sqrt{|T_1|\mu_\Phi^2+|T_2|\mu_\Psi^2}\le\sqrt{|T|}\,\mu_M^2=B',$$

for all $k$. By applying (B.3), we have

$$\sum_{k=1}^{N}E f^2(Y_k)=\sum_{k=1}^{N}\frac mN\Big(1-\frac mN\Big)|\lambda_k^0|^2|\langle[u_k\ v_k],f\rangle|^2\le\frac mN\Big(1-\frac mN\Big)\mu_M^2\cdot2N\le 2m\mu_M^2,$$

which implies $\sigma^2=2m\mu_M^2$. According to Lemma 5, we have

$$P\big(\big|\|\tilde v_0\|-E\|\tilde v_0\|\big|>t\big)\le 3\exp\Big(-\frac{t}{KB'}\log\Big(1+\frac{B't}{\sigma^2+B'E\|\tilde v_0\|}\Big)\Big),$$

which implies

$$P\big(\|\tilde v_0\|>t+E\|\tilde v_0\|\big)\le 3\exp\Big(-\frac{t}{KB'}\log\Big(1+\frac{B't}{\sigma^2+B'E\|\tilde v_0\|}\Big)\Big).$$

For $E\|\tilde v_0\|$, we simply use $E\|\tilde v_0\|\le\sqrt{E\|\tilde v_0\|^2}\le\mu_M\sqrt{m|T|}$.
Since $v_0=\tilde v_0+\sum_{k=1}^{N}E\delta_k\,\lambda_k^0[u_k\ v_k]=\tilde v_0+\frac mN\sum_{k=1}^{N}\lambda_k^0[u_k\ v_k]$, we have

$$P\Big(\|v_0\|>t+\frac mN\Big\|\sum_{k=1}^{N}\lambda_k^0[u_k\ v_k]\Big\|+E\|\tilde v_0\|\Big)=P\Big(\Big\|\tilde v_0+\frac mN\sum_{k=1}^{N}\lambda_k^0[u_k\ v_k]\Big\|>t+\frac mN\Big\|\sum_{k=1}^{N}\lambda_k^0[u_k\ v_k]\Big\|+E\|\tilde v_0\|\Big)$$

$$\le P\Big(\|\tilde v_0\|+\frac mN\Big\|\sum_{k=1}^{N}\lambda_k^0[u_k\ v_k]\Big\|>t+\frac mN\Big\|\sum_{k=1}^{N}\lambda_k^0[u_k\ v_k]\Big\|+E\|\tilde v_0\|\Big)=P\big(\|\tilde v_0\|>t+E\|\tilde v_0\|\big)$$

$$\le 3\exp\Big(-\frac{t}{KB'}\log\Big(1+\frac{B't}{\sigma^2+B'E\|\tilde v_0\|}\Big)\Big)\le 3\exp\Big(-\frac{t}{KB'}\log\Big(1+\frac{B't}{2m\mu_M^2+\sqrt m\,|T|\mu_M^3}\Big)\Big).$$

Recall that $\bar\sigma^2=\max\{2m\mu_M^2,\sqrt m\,|T|\mu_M^3\}=m\mu_M^2\max\big\{2,\frac{|T|\mu_M}{\sqrt m}\big\}$, and fix $t=a\bar\sigma$. Then, using the Taylor expansion of the logarithm, we have

$$P\Big(\|v_0\|>t+\frac mN\Big\|\sum_{k=1}^{N}\lambda_k^0[u_k\ v_k]\Big\|+E\|\tilde v_0\|\Big)\le 3e^{-\gamma a^2},$$

provided that $B't\le\bar\sigma^2$, which is equivalent to the following two situations: (1) when $\frac{|T|\mu_M}{\sqrt m}\le2$, then $a\le\frac{\sqrt{2m}}{\sqrt{|T|}\,\mu_M}$; (2) when $\frac{|T|\mu_M}{\sqrt m}>2$, then $a\le\big(\frac{m}{\mu_M^2}\big)^{1/4}$. As seen from (C.1) in Lemma 6,

$$\frac mN\Big\|\sum_{k=1}^{N}\lambda_k^0[u_k\ v_k]\Big\|\le m\mu(\Phi,\Psi)\max\{\sqrt{|T_1|},\sqrt{|T_2|}\}\le m\mu(\Phi,\Psi)\sqrt{|T_1|+|T_2|}=m\mu(\Phi,\Psi)\sqrt{|T|},$$

and therefore

$$P\big(\|v_0\|>a\bar\sigma+m\mu(\Phi,\Psi)\sqrt{|T|}+\sqrt{m|T|}\,\mu_M\big)\le 3e^{-\gamma a^2}.$$

Lemma 8. Let $w_0=\big([\Phi_{\Omega T_1}\ \Psi_{\Omega T_2}]^*[\Phi_{\Omega T_1}\ \Psi_{\Omega T_2}]\big)^{-1}v_0$. With the same notations and hypotheses as in Lemma 7,

$$P\Big(\sup_{t_0\in T^c}\|w_0\|\ge\big(\tfrac12+\|\Phi_{T_1}^*\Psi_{T_2}/N\|\big)^{-1}\Big(\mu(\Phi,\Psi)\sqrt{|T|}+\frac{a\bar\sigma}{m}+\frac{\sqrt{|T|}\,\mu_M}{\sqrt m}\Big)\Big)\le 3(n_1+n_2)e^{-\gamma a^2}+P\Big(\big\|[\Phi_{\Omega T_1}\ \Psi_{\Omega T_2}]^*[\Phi_{\Omega T_1}\ \Psi_{\Omega T_2}]\big\|\le\big(\tfrac12+\|F-I\|\big)m\Big).$$

Proof. Let $A$ and $B$ be the events $\big\{\|[\Phi_{\Omega T_1}\ \Psi_{\Omega T_2}]^*[\Phi_{\Omega T_1}\ \Psi_{\Omega T_2}]\|\ge(\tfrac12+\|\Phi_{T_1}^*\Psi_{T_2}/N\|)m\big\}$ and $\big\{\sup_{t_0\in T^c}\|v_0\|\le a\bar\sigma+m\mu(\Phi,\Psi)\sqrt{|T|}+\sqrt{m|T|}\,\mu_M\big\}$, respectively. Lemma 7 gives $P(B^c)\le 3(n_1+n_2)e^{-\gamma a^2}$.
Then on the event $A\cap B$ we have

$$\sup_{t_0\in T^c}\|w_0\|\le\big(\tfrac12+\|\Phi_{T_1}^*\Psi_{T_2}/N\|\big)^{-1}\Big(\mu(\Phi,\Psi)\sqrt{|T|}+\frac{a\bar\sigma}{m}+\frac{\sqrt{|T|}\,\mu_M}{\sqrt m}\Big),$$

and since $P((A\cap B)^c)=P(A^c\cup B^c)\le P(A^c)+P(B^c)$, the claim follows.

Lemma 9. Assume that $\tau(t)$, $t\in T$, is an i.i.d. sequence of symmetric Bernoulli random variables. For each $\lambda>0$, we have

$$P\Big(\sup_{t\in T^c}|\pi(t)|\ge1\Big)\le 2(n_1+n_2)e^{-\frac{1}{2\lambda^2}}+P\Big(\sup_{t_0\in T^c}\|w_0\|\ge\lambda\Big).$$

Proof. Recall that $\pi(t_0)=\langle w_0,\tau\rangle$. By Hoeffding's inequality, we have

$$P\big(|\pi(t_0)|>1\,\big|\,w_0\big)=P\big(|\langle w_0,\tau\rangle|>1\,\big|\,w_0\big)\le 2e^{-\frac{1}{2\|w_0\|^2}}.$$

It then follows that

$$P\Big(\sup_{t_0\in T^c}|\pi(t_0)|>1\ \Big|\ \sup_{t_0\in T^c}\|w_0\|\le\lambda\Big)\le 2(n_1+n_2)e^{-\frac{1}{2\lambda^2}}.$$

Let $A$ and $B$ be the events $\{\sup_{t_0\in T^c}|\pi(t_0)|>1\}$ and $\{\sup_{t_0\in T^c}\|w_0\|\le\lambda\}$, respectively. Then

$$P(A)=P(A|B)P(B)+P(A|B^c)P(B^c)\le P(A|B)+P(B^c),$$

which proves the result.

Proof of Theorem 6. Set $\lambda=\big(\tfrac12+\|\Phi_{T_1}^*\Psi_{T_2}/N\|\big)^{-1}\Big(\mu(\Phi,\Psi)\sqrt{|T|}+\frac{a\bar\sigma}{m}+\frac{\sqrt{|T|}\,\mu_M}{\sqrt m}\Big)$. Combining Lemmas 8 and 9, we have, for $\bar\sigma$ and each $a>0$ satisfying the conditions in Lemma 7,

$$P\Big(\sup_{t\in T^c}|\pi(t)|\ge1\Big)\le 2(n_1+n_2)e^{-\frac{1}{2\lambda^2}}+3(n_1+n_2)e^{-\gamma a^2}+P\Big(\big\|[\Phi_{\Omega T_1}\ \Psi_{\Omega T_2}]^*[\Phi_{\Omega T_1}\ \Psi_{\Omega T_2}]\big\|\le\big(\tfrac12+\|\Phi_{T_1}^*\Psi_{T_2}/N\|\big)m\Big).$$

∎

References

[1] Daubechies I. Ten lectures on wavelets. Philadelphia, PA: SIAM, 1992.
[2] Daubechies I, Jaffard S, Journé J L. A simple Wilson orthonormal basis with exponential decay. SIAM Journal on Mathematical Analysis, 1991, 22(2): 554-572.
[3] Candès E J, Donoho D L. Ridgelets: a key to higher-dimensional intermittency? Philosophical Transactions Mathematical Physical & Engineering Sciences, 1999, 357(1760): 2495-2509.
[4] Candès E J, Donoho D L. Curvelets, multiresolution representation, and scaling laws.
Conference on Wavelet Applications in Signal and Image Processing VIII. 2000: 1-12.
[5] Donoho D L, Huo X. Uncertainty principles and ideal atomic decomposition. IEEE Transactions on Information Theory, 2001, 47(7): 2845-2862.
[6] Donoho D L, Stark P B. Uncertainty principles and signal recovery. SIAM Journal on Applied Mathematics, 1989, 49(3): 906-931.
[7] Goh S S, Goodman T N T. Uncertainty principles in Banach spaces and signal recovery. Journal of Approximation Theory, 2006, 143(1): 26-35.
[8] Donoho D L. Compressed sensing. IEEE Transactions on Information Theory, 2006, 52(4): 1289-1306.
[9] Candès E J, Tao T. Near-optimal signal recovery from random projections: universal encoding strategies. IEEE Transactions on Information Theory, 2006, 52(12): 5406-5425.
[10] Candès E J, Romberg J, Tao T. Stable signal recovery from incomplete and inaccurate measurements. Communications on Pure and Applied Mathematics, 2006, 59(8): 410-412.
[11] Candès E J, Romberg J, Tao T. Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information. IEEE Transactions on Information Theory, 2006, 52(2): 489-509.
[12] Elad M, Bruckstein A M. A generalized uncertainty principle and sparse representation in pairs of bases. IEEE Transactions on Information Theory, 2002, 48(9): 2558-2567.
[13] Dragotti P L, Lu Y M. On sparse representation in Fourier and local bases. IEEE Transactions on Information Theory, 2014, 60(12): 7888-7899.
[14] Donoho D L, Elad M. Optimally sparse representation in general (nonorthogonal) dictionaries via ℓ1 minimization. Proceedings of the National Academy of Sciences, 2003, 100(5): 2197-2202.
[15] Gribonval R, Nielsen M. Sparse representations in unions of bases. IEEE Transactions on Information Theory, 2003, 49(12): 3320-3325.
[16] Malioutov D M, Cetin M, Willsky A S. Optimal sparse representations in general overcomplete bases. IEEE International Conference on Acoustics, Speech and Signal Processing, 2004, 2(2): ii-793-6.
[17] Ninness B, Gustafsson F. A unifying construction of orthonormal bases for system identification. Proceedings of the IEEE Conference on Decision and Control, 1994: 515-521.
[18] Akcay H, Ninness B. Rational basis functions for robust identification from frequency and time-domain measurements. American Control Conference, 1998: 1101-1117.
[19] Ninness B. Frequency domain estimation using orthonormal bases. Proceedings of IFAC World Congress, 1996: 381-386.
[20] Ward N F D, Partington J R. Robust identification in the disc algebra using rational wavelets and orthonormal basis functions. International Journal of Control, 1996, 64(3): 409-423.
[21] Gucht P V, Bultheel A. Orthogonal rational functions for system identification: numerical aspects. IEEE Transactions on Automatic Control, 2003, 48(4): 705-709.
[22] Ninness B, Gómez J C, Weller S. MIMO system identification using orthonormal basis functions. IEEE Conference on Decision & Control, 2000: 703-708.
[23] Mi W, Qian T. Frequency-domain identification: An algorithm based on an adaptive rational orthogonal system. Automatica, 2012, 48(6): 1154-1162.
[24] Chen Q, Mai W, Zhang L, et al. System identification by discrete rational atoms. Automatica, 2015, 56: 53-59.
[25] Ohta Y. Stochastic system transformation using generalized orthonormal basis functions with applications to continuous-time system identification. Automatica, 2011, 47(5): 1001-1006.
[26] Hidayat E, Medvedev A. Laguerre domain identification of continuous linear time-delay systems from impulse response data. Automatica, 2012, 48(11): 2902-2907.
[27] Tiels K, Schoukens J.
Wiener system identification with generalized orthonormal basis functions. Automatica, 2014, 50(12): 3147-3154.
[28] Heuberger P S C. Modelling and identification with rational orthogonal basis functions. Springer London, 2005.
[29] Hof P M J V D, Heuberger P S C, Bokor J. System identification with generalized orthonormal basis functions. Automatica, 1995, 31(12): 1821-1834.
[30] Wahlberg B. System identification using Kautz models. IEEE Transactions on Automatic Control, 1994, 39(6): 1276-1282.
[31] Wahlberg B, Mäkilä P M. On approximation of stable linear dynamical systems using Laguerre and Kautz functions. Automatica, 1996, 32(5): 693-708.
[32] Kautz W H. Transient synthesis in the time domain. Transactions of the IRE Professional Group on Circuit Theory, 1954, CT-1: 29-39.
[33] Young T Y, Huggins W H. 'Complementary' signals and orthogonalized exponentials. IRE Transactions on Circuit Theory, 1962, 9(4): 362-370.
[34] Walsh J L. Interpolation and approximation by rational functions in the complex domain. American Mathematical Society, 1956.
[35] Gu Y, Jin J, Mei S. l0 norm constraint LMS algorithm for sparse system identification. IEEE Signal Processing Letters, 2009, 16(9): 774-777.
[36] Chen Y, Gu Y, Hero A O. Sparse LMS for system identification. IEEE International Conference on Acoustics, Speech and Signal Processing, 2009: 3125-3128.
[37] Kalouptsidis N, Mileounis G, Babadi B, et al. Adaptive algorithms for sparse system identification. Signal Processing, 2011, 91(8): 1910-1919.
[38] Kopsinis Y, Slavakis K, Theodoridis S. Online sparse system identification and signal reconstruction using projections onto weighted l1 balls. IEEE Transactions on Signal Processing, 2010, 59(3): 936-952.
[39] Slavakis K, Kopsinis Y, Theodoridis S. Adaptive algorithm for sparse system identification using projections onto weighted l1 balls.
IEEE International Conference on Acoustics, Speech and Signal Processing, 2010: 3742-3745.
[40] Rauhut H. Compressive sensing and structured random matrices. Radon Series on Computational and Applied Mathematics, 2010, 9: 1-92.
[41] Natarajan B K. Sparse approximate solutions to linear systems. SIAM Journal on Computing, 1995, 24(2): 227-234.
[42] Chen S, Donoho D L, Saunders M. Atomic decomposition by basis pursuit. SIAM Review, 2001, 43(1): 33-61.
[43] Boyd S, Vandenberghe L. Convex optimization. Cambridge University Press, Cambridge, 2004.
[44] Candès E, Romberg J. Sparsity and incoherence in compressive sampling. Inverse Problems, 2006, 23(3): 969-985(17).
[45] Daubechies I, Defrise M, De Mol C. An iterative thresholding algorithm for linear inverse problems with a sparsity constraint. Communications on Pure and Applied Mathematics, 2004, 57(11): 1413-1457.
[46] Daubechies I, Devore R, Fornasier M, et al. Iteratively reweighted least squares minimization for sparse recovery. Communications on Pure and Applied Mathematics, 2010, 63(1): 1-38.
[47] Daubechies I, Fornasier M, Loris L. Accelerated projected gradient method for linear inverse problems with sparsity constraints. Journal of Fourier Analysis and Applications, 2008, 14(5): 764-792.
[48] Donoho D L, Tsaig Y. Fast solution of l1-norm minimization problems when the solution may be sparse. IEEE Transactions on Information Theory, 2008, 54(11): 4789-4812.
[49] Kim S, Koh K, Lustig M, Boyd S, Gorinevsky D. An interior-point method for large-scale ℓ1-regularized least squares. IEEE Journal of Selected Topics in Signal Processing, 2007, 1(4): 606-617.
[50] Rudelson M. Random vectors in the isotropic position. Journal of Functional Analysis, 1999, 164(1): 60-72.
