Segmented compressed sampling for analog-to-information conversion: Method and performance analysis

Omid Taheri, Student Member, IEEE, and Sergiy A. Vorobyov, Senior Member, IEEE

Abstract: A new segmented compressed sampling method for analog-to-information conversion (AIC) is proposed. An analog signal measured by a number of parallel branches of mixers and integrators (BMIs), each characterized by a specific random sampling waveform, is first segmented in time into $M$ segments. Then the sub-samples collected on different segments and different BMIs are reused so that a larger number of samples than the number of BMIs is collected. This technique is shown to be equivalent to extending the measurement matrix, which consists of the BMI sampling waveforms, by adding new rows without actually increasing the number of BMIs. We prove that the extended measurement matrix satisfies the restricted isometry property with overwhelming probability if the original measurement matrix of BMI sampling waveforms satisfies it. We also show that the signal recovery performance can be improved significantly if our segmented AIC is used for sampling instead of the conventional AIC. Simulation results verify the effectiveness of the proposed segmented compressed sampling method and the validity of our theoretical studies.

Index Terms: Compressed sampling, analog-to-information converter, correlated random variables, $l_1$-norm minimization, empirical risk minimization.

The authors are with the Department of Electrical and Computer Engineering, University of Alberta, Edmonton, Alberta, Canada. The contacting emails are {otaheri, vorobyov}@ece.ualberta.ca. Corresponding author: Sergiy A. Vorobyov, Dept. of Electrical and Computer Engineering, University of Alberta, 9107-116 St., Edmonton, Alberta, T6G 2V4, Canada; Phone: +1 (780) 492 9702, Fax: +1 (780) 492 1811.
This work was supported in part by research grants from the Natural Sciences and Engineering Research Council (NSERC) of Canada and an Alberta Ingenuity New Faculty Award.

I. INTRODUCTION

According to Shannon's sampling theorem, an analog band-limited signal can be recovered from its discrete-time samples if the sampling rate is at least twice the maximum frequency present in the signal. The recent theory of compressed sampling (CS), however, suggests that a signal can be recovered from fewer samples if it is sparse or compressible [1]-[4]. CS theory also suggests that a universal sampling matrix (for example, a random projection matrix) can be designed and used for all sparse signals regardless of their nature [2]. CS has already found a wide range of applications such as image acquisition [5], sensor networks [6], cognitive radios [7], communication channel estimation [8], [9], etc.

The sampling process often used in the CS literature consists of two steps. First, an analog signal is sampled at the Nyquist rate, and then a measurement matrix is applied to the time-domain samples in order to collect the compressed samples (see, for example, [7]). This sampling approach, however, defeats one of the primary purposes of CS, which is avoiding high-rate sampling. A more practical approach for "direct" sampling and compression of analog signals has been presented in [10]. The analog signal is assumed to belong to the class of signals in shift-invariant spaces, that is, it can be represented as a linear combination of a set of $m$ basis functions defined over a period $T$. The analog signal is first passed through a filter bank where each filter is matched to one of the $m$ basis functions, and the output is sampled at time instances $nT$, where $n$ is an integer. If the signal is sparse, then only $S < m$ samples are nonzero.
The set of $m$ output samples is then passed through a measurement matrix to create $K \geq S$ compressed samples representing the analog signal in a specific period $[(n-1)T, nT]$. It is worth mentioning that this method is a generalization of another method in [11], which is devised for sub-Nyquist sampling of multi-band signals. The limits of this method come from the underlying assumption that the signal belongs to the class of signals in shift-invariant spaces. Although this assumption is argued to be valid for a variety of engineering applications [10], [12] and can be generalized to signals in a union of subspaces [13], [14], it is still a limiting assumption. Moreover, the complexity of this method is by no means lower than the complexity of another practical approach to CS, which avoids high-rate sampling [1], [15]. The name analog-to-information converter (AIC) has been coined for the latter method. The AIC consists of several parallel branches of mixers and integrators (BMIs) in which the analog signal is measured against different random sampling waveforms. Therefore, for every collected compressed sample, there is a BMI that multiplies the signal by a sampling waveform and then integrates the result over a period $T$.

In this paper, we propose a new segmented AIC structure with the goal of reducing the hardware complexity.^1 The contributions of this work are the following. (i) A new segmented AIC structure is developed. In this structure, the integration period $T$ is divided into $M$ equal subperiods, so that the sampling rate of our segmented AIC scheme is $M$ times higher than that of the AIC of [1]. The sub-samples collected over different subperiods are then reused, by combining sub-samples from different BMIs, in order to build additional samples. In this way, a number of samples larger than the number of BMIs can be collected, although such samples will be correlated.
We show that our segmented AIC technique is equivalent to extending the measurement matrix, which consists of the BMI sampling waveforms, by adding new rows without actually increasing the number of BMIs. In this respect, the following works also need to be mentioned [17], [18]. In [17], Toeplitz-structured measurement matrices are considered, while measurement matrices built on one random vector with shifts of $D \geq 1$ between the rows appear in the radar imaging application considered in [18]. (ii) We show that the restricted isometry property (RIP), which is a sufficient condition for signal recovery based on compressed samples, is satisfied by the extended measurement matrix resulting from the segmented AIC structure with overwhelming probability if the original matrix of BMI sampling waveforms satisfies the RIP. Thus, our segmented AIC is a valid candidate for CS. (iii) We also show that the signal recovery performance improves if our segmented AIC is used for sampling instead of the AIC of [1] with the same number of BMIs. The mathematical challenge in this part of the work is that the samples collected by our segmented AIC are correlated, while all available results on performance analysis of signal recovery are obtained for the case of uncorrelated samples.

The rest of this paper is organized as follows. Necessary background on CS, CS signal recovery, and AIC is briefly summarized in Section II. The main idea of the paper, that is, the segmented AIC structure, is explained in Section III. We prove in Section IV that the extended measurement matrix resulting from the proposed segmented AIC satisfies the RIP and, therefore, the segmented AIC is a legitimate CS method. The signal recovery performance analysis for our segmented AIC is summarized in Section V. Section VI demonstrates the simulation results and Section VII concludes the paper.
II. BACKGROUND

CS basics and notation: CS deals with a low-rate representation of sparse signals, i.e., signals which have few nonzero projections on the vectors of an orthogonal basis (the sparsity basis). Let $\Psi = (\psi_1^T, \psi_2^T, \ldots, \psi_N^T)^T$ be an $N \times N$ matrix of basis vectors $\psi_i$, $i = 1, \ldots, N$, i.e., the sparsity basis, and let $f$ be a discrete-time sparse signal^2 represented in this basis as

f = \sum_{i=1}^{N} x_i \psi_i^H = \Psi^H x   (1)

where $x = (x_1, x_2, \ldots, x_N)^T$ is the $N \times 1$ vector of coefficients, and $(\cdot)^T$ and $(\cdot)^H$ stand for the transpose and Hermitian transpose, respectively. A signal is $S$-sparse if at most $S$ projections on the rows of $\Psi$, i.e., coefficients of $x$, are nonzero. It is known that a universal compressed sampling method can be designed to effectively sample and recover $S$-sparse signals regardless of the specific sparsity domain [1], [2]. Among various bounds on the sufficient number of collected compressed samples^3 $K$ ($S < K < N$) required for recovering an $S$-sparse signal, the first and most popular one is given by the inequality $S \leq C K / \log(N/K)$, where $C$ is some constant [1]. This bound is derived based on the uniform uncertainty principle [20]. Let $\Phi$ be a $K \times N$ measurement matrix applied to a sparse signal for collecting $K$ compressed samples. Then the uniform uncertainty principle states that $\Phi$ must satisfy the following restricted isometry property (RIP) [1]. Let $\Phi_T$ be a sub-matrix of $\Phi$ retaining only the columns with indexes in the set $T \subset \{1, \ldots, N\}$. Then the $S$-restricted isometry constant $\delta_S$ is the smallest number satisfying the inequality

(K/N)(1 - \delta_S) \|c\|_{l_2}^2 \leq \|\Phi_T c\|_{l_2}^2 \leq (K/N)(1 + \delta_S) \|c\|_{l_2}^2   (2)

for all sets $T$ of cardinality less than or equal to $S$ and all vectors $c$ (here $\|\cdot\|_{l_2}$ denotes the Euclidean norm of a vector).

^1 Some preliminary results have been reported in [16].
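The scaled isometry in (2) can be illustrated numerically. The short sketch below (all variable names are ours, not from the paper) draws a Gaussian matrix with entries of variance $1/N$ and checks how far $\|\Phi_T c\|_{l_2}^2$ deviates from $(K/N)\|c\|_{l_2}^2$ over random supports of size $S$:

```python
import numpy as np

# Empirical look at (2): for a K x N matrix with i.i.d. N(0, 1/N) entries,
# ||Phi_T c||^2 concentrates around (K/N)||c||^2 for small supports T.
# Illustrative sketch; names and parameter values are our own choices.
rng = np.random.default_rng(0)
K, N, S = 128, 512, 10

Phi = rng.standard_normal((K, N)) / np.sqrt(N)   # entries with variance 1/N

ratios = []
for _ in range(200):
    T = rng.choice(N, size=S, replace=False)     # random support set of size S
    c = rng.standard_normal(S)
    num = np.linalg.norm(Phi[:, T] @ c) ** 2
    den = (K / N) * np.linalg.norm(c) ** 2
    ratios.append(num / den)                     # close to 1 means small delta_S

delta_est = max(abs(r - 1.0) for r in ratios)
print(f"empirical restricted-isometry deviation over 200 trials: {delta_est:.3f}")
```

A small `delta_est` over many random supports is consistent with, but of course does not prove, a small restricted isometry constant $\delta_S$.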
As shown in [2], [21], if the entries of $\Phi$ are, for example, independent zero-mean Gaussian variables with variance $1/N$, then $\Phi$ satisfies the RIP for $S \leq C K / \log(N/K)$ with high probability.^4

Recovery methods: Using the measurement matrix $\Phi$, the $K \times 1$ vector of compressed samples $y$ can be calculated as $y = \Phi f = \Phi' x$, where $\Phi' = \Phi \Psi^H$. A signal can be recovered from its noiseless sample vector $y$ based on the following convex optimization problem, which can be solved by a linear program [2], [22]

\min \|\tilde{x}\|_{l_1}  subject to  \Phi' \tilde{x} = y   (3)

where $\|\cdot\|_{l_1}$ denotes the $l_1$-norm of a vector.

If the compressed samples are noisy, the sampling process can be expressed as

y = \Phi f + w   (4)

where $w$ is a zero-mean noise vector with independently and identically distributed (i.i.d.) entries of variance $\sigma^2$. Then the recovery problem is modified as [23]

\min \|\tilde{x}\|_{l_1}  subject to  \|\Phi' \tilde{x} - y\|_{l_2} \leq \gamma   (5)

where $\gamma$ is a bound on the square root of the noise energy. Another technique for sparse signal recovery from noisy samples (see [4]) uses the empirical risk minimization method, which was first developed in statistical learning theory for approximating an unknown function based on noisy measurements [24]. Note that the empirical risk minimization-based recovery method is of particular interest since under some simplifications (see [4, p. 4041]) it reduces to the well-known least absolute shrinkage and selection operator (LASSO) method [25]. Therefore, the risk minimization-based method of [4] provides the generality which we need in this paper.

^2 It can be in $\mathbb{R}^N$ or $\mathbb{C}^N$.
^3 See [19] for a broader review.
^4 Note that in order to ensure consistency throughout the paper, the variance of the elements of $\Phi$ is taken to be $1/N$ instead of $1/K$ as, for example, in [2]. Thus, the multiplier $K/N$ is added on the left- and right-hand sides of (2).
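Problem (3) becomes a standard linear program after splitting $\tilde{x}$ into its positive and negative parts, $\tilde{x} = u - v$ with $u, v \geq 0$. A minimal sketch using a generic LP solver (the function name and parameters are ours; any LP solver would do):

```python
import numpy as np
from scipy.optimize import linprog

# Noiseless basis pursuit (3): min ||x||_1 s.t. A x = y, posed as a linear
# program with x = u - v, u >= 0, v >= 0.  Illustrative sketch, not the
# authors' code.
def basis_pursuit(A, y):
    K, N = A.shape
    c = np.ones(2 * N)                 # objective: sum(u) + sum(v) = ||x||_1
    A_eq = np.hstack([A, -A])          # equality constraint A(u - v) = y
    res = linprog(c, A_eq=A_eq, b_eq=y, bounds=(0, None), method="highs")
    u, v = res.x[:N], res.x[N:]
    return u - v

rng = np.random.default_rng(1)
K, N, S = 40, 80, 4
A = rng.standard_normal((K, N)) / np.sqrt(N)
x_true = np.zeros(N)
x_true[rng.choice(N, S, replace=False)] = rng.standard_normal(S)

x_hat = basis_pursuit(A, A @ x_true)
print("max recovery error:", np.max(np.abs(x_hat - x_true)))
```

With $K = 40$ samples of an $S = 4$-sparse length-$80$ signal, the sparsity level is well inside the bound $S \leq C K / \log(N/K)$, so exact recovery is expected.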
In application to CS, the unknown function is the sparse signal and the noisy compressed samples are the collected data. Let the entries of the measurement matrix $\Phi$ be selected with equal probability as $\pm 1/\sqrt{N}$, and let the energy of the signal $f$ be bounded so that $\|f\|^2 \leq N B^2$. The risk $r(\hat{f})$ of a candidate reconstruction $\hat{f}$ and its empirical risk $\hat{r}(\hat{f})$ are defined as follows [24]

r(\hat{f}) = \|\hat{f} - f\|^2 / N + \sigma^2,   \hat{r}(\hat{f}) = (1/K) \sum_{j=1}^{K} (y_j - \phi_j \hat{f})^2.   (6)

Then the candidate reconstruction $\hat{f}_K$ obtained based on $K$ samples can be found as [4]

\hat{f}_K = \arg\min_{\hat{f} \in \mathcal{F}(B)} \{ \hat{r}(\hat{f}) + c(\hat{f}) \log 2 / (\epsilon K) \}   (7)

where $\mathcal{F}(B) = \{f : \|f\|^2 \leq N B^2\}$, $c(\hat{f})$ is a nonnegative number assigned to a candidate signal $\hat{f}$, and $\epsilon = 1/(50(B + \sigma)^2)$. Moreover, $\hat{f}_K$ given by (7) satisfies the following inequality [4]

E\{ \|\hat{f}_K - f\|^2 / N \} \leq C_1 \min_{\hat{f} \in \mathcal{F}(B)} \{ \|\hat{f} - f\|^2 / N + (c(\hat{f}) \log 2 + 4) / (\epsilon K) \}   (8)

where $C_1 = [(27 - 4e)(B/\sigma)^2 + (50 - 4\sqrt{2}) B/\sigma + 26] / [(23 - 4e)(B/\sigma)^2 + (50 - 4\sqrt{2}) B/\sigma + 24]$, $e = 2.7183\ldots$, and $E\{\cdot\}$ stands for the expectation operation.

Let a compressible signal $f$ be defined as a signal for which $\|f^{(m)} - f\|^2 \leq N C_A m^{-2\alpha}$, where $f^{(m)}$ is the best $m$-term approximation of $f$, obtained by retaining the $m$ most significant coefficients of the vector $x$ ($x$ being the representation of $f$ in the sparsity basis $\Psi$), and $C_A > 0$ and $\alpha \geq 0$ are some constants. Let also $\mathcal{F}_c(B, \alpha, C_A) = \{f : \|f\|^2 \leq N B^2, \|f^{(m)} - f\|^2 \leq N C_A m^{-2\alpha}\}$ be the set of compressible signals. Then, based on the weight assignment $c(f) = 2 \log(N) N_x$ (here $N_x$ is the actual number of nonzero coefficients of $x$), the following inequality holds [4]

\sup_{f \in \mathcal{F}_c(B, \alpha, C_A)} E\{ \|\hat{f}_K - f\|^2 / N \} \leq C_1 C_2 (K / \log N)^{-2\alpha/(2\alpha+1)}   (9)

where $C_2 = C_2(B, \sigma, C_A) > 0$ is a constant.
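Since the estimator (7) reduces to LASSO under simplifications, a standard LASSO solver serves as a concrete stand-in. The sketch below minimizes $\|y - A x\|^2/(2K) + \lambda \|x\|_{l_1}$ by iterative soft-thresholding (ISTA), a classical LASSO algorithm; the solver, variable names, and parameter values are ours, not taken from [4]:

```python
import numpy as np

# LASSO, min_x ||y - A x||^2 / (2K) + lam * ||x||_1, solved by iterative
# soft-thresholding (ISTA).  A stand-in for the empirical risk minimization
# estimator; algorithm choice and names are ours, not from the paper.
def soft_threshold(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def ista(A, y, lam, n_iter=500):
    K, N = A.shape
    L = np.linalg.norm(A, 2) ** 2 / K        # Lipschitz constant of the gradient
    x = np.zeros(N)
    for _ in range(n_iter):
        grad = A.T @ (A @ x - y) / K         # gradient of the quadratic term
        x = soft_threshold(x - grad / L, lam / L)
    return x

rng = np.random.default_rng(2)
K, N, S, sigma = 60, 128, 4, 0.01
A = rng.choice([-1.0, 1.0], size=(K, N)) / np.sqrt(N)   # +-1/sqrt(N) entries
x_true = np.zeros(N)
x_true[rng.choice(N, S, replace=False)] = 1.0
y = A @ x_true + sigma * rng.standard_normal(K)

x_hat = ista(A, y, lam=0.001)
err = np.linalg.norm(x_hat - x_true) ** 2 / N
print("per-coordinate squared error:", err)
```

The resulting error is far below that of the trivial all-zero estimate ($\|x\|^2/N \approx 0.031$ here), in line with the error bounds (9) and (10).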
If the signal $f$ is indeed sparse and belongs to $\mathcal{F}_s(B, S) = \{f : \|f\|^2 \leq N B^2, \|f\|_{l_0} \leq S\}$, then there exists a constant $C_2' = C_2'(B, \sigma) > 0$ such that [4]

\sup_{f \in \mathcal{F}_s(B, S)} E\{ \|\hat{f}_K - f\|^2 / N \} \leq C_1 C_2' (K / (S \log N))^{-1}.   (10)

AIC: The random modulation preintegration (RMPI) structure is proposed for AIC in [1]. The RMPI multiplies the signal by the sampling waveforms in the analog domain and then integrates the product over the signal period to produce samples. It implies that the sampling device has a number of parallel BMIs in order to process the analog signal in real time. The RMPI structure is shown in Fig. 1, where $f(t)$ is the analog signal being sampled, $\phi_i(t)$, $i = 1, \ldots, K$ are the sampling waveforms (rows of the measurement matrix $\Phi$), and $y_i$, $i = 1, \ldots, K$ are the compressed samples.

[Fig. 1. The structure of the AIC based on RMPI: each of the $K$ branches multiplies $f(t)$ by its waveform $\phi_i(t)$ and integrates over $[0, T]$ to produce the sample $y_i$.]

III. SEGMENTED COMPRESSED SAMPLING METHOD

AIC removes the need for high-speed sampling, but in many practical applications it may still be necessary to collect a larger number of compressed samples than the AIC hardware (the number of parallel BMIs) may allow. Indeed, a smaller number of samples may have a negative effect on the signal recovery accuracy, which can be an issue in a number of applications. In order to collect a larger number of compressed samples using the AIC, we need to increase the hardware complexity by adding more BMIs. The latter makes the AIC device complex and expensive, although its sampling rate is much lower than that of an analog-to-digital converter (ADC). Therefore, it is desirable to reduce the number of parallel BMIs in the AIC without sacrificing the signal recovery accuracy.
This can be achieved by endowing the AIC with the capability of sampling at a higher rate which is, however, still significantly lower than the sampling rate required by an ADC. Specifically, the integration period $T$ in every BMI of the AIC in Fig. 1 is split into shorter subperiods, which is equivalent to generating a number of incomplete samples of the signal. Note that since the original integration period is divided into a number of smaller subperiods, the samples collected over all parallel BMIs during one subperiod do not carry complete information about the signal. Therefore, they are called incomplete samples. Hereafter, the complete samples obtained over the whole period $T$ are referred to simply as samples, while the incomplete samples are referred to as sub-samples.

A. The Basic Idea and the Model

The basic idea is to collect the sub-samples as described above and then reuse them in order to build additional samples. In this manner, a larger number of samples than the number of BMIs can be collected. This allows for a tradeoff between AIC and ADC: as in the AIC, the signal is measured at a low rate by correlating it with a number of sampling waveforms, while the integration period is split into shorter sub-intervals, which resembles the higher sampling rate of an ADC. However, the required sampling rate in the proposed scheme is still significantly lower than that required by an ADC.

Let the integration period be split into $M$ sub-intervals, and let $y_k = (y_{k,1}, \ldots, y_{k,M})^T$, $k = 1, \ldots, K$ be the vectors of sub-samples collected against the sampling waveforms $\phi_k$, $k = 1, \ldots, K$, where $K$ is the original number of sampling waveforms, i.e., the number of BMIs. The sub-sample $y_{k,j}$ is given by

y_{k,j} = \int_{(j-1)T/M}^{jT/M} f(t) \phi_k(t) dt.   (11)

Then the total number of sub-samples collected in all BMIs over all subperiods is $MK$.
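In a discretized sketch, the integral in (11) becomes an inner product between matching length-$N/M$ chunks of the waveform and of the signal, and summing a BMI's $M$ sub-samples must reproduce the sample taken over the whole period $T$. The names and sizes below are illustrative:

```python
import numpy as np

# Discrete sketch of the segmentation in (11): with the waveforms and the
# signal held as length-N vectors, the sub-sample Y[k, j] is the inner product
# of the j-th length-(N/M) chunk of phi_k with the matching chunk of f.
# Summing the M sub-samples of a row must reproduce the full sample phi_k . f.
rng = np.random.default_rng(3)
K, N, M = 8, 64, 4
seg = N // M

Phi = rng.standard_normal((K, N)) / np.sqrt(N)   # BMI sampling waveforms (rows)
f = rng.standard_normal(N)                        # Nyquist-rate signal samples

Y = np.empty((K, M))                              # sub-sample matrix
for j in range(M):
    sl = slice(j * seg, (j + 1) * seg)
    Y[:, j] = Phi[:, sl] @ f[sl]                  # sub-sample of BMI k, subperiod j

full_samples = Phi @ f                            # samples over the whole period T
print("row sums reproduce full samples:", np.allclose(Y.sum(axis=1), full_samples))
```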
These sub-samples can be gathered in the following $K \times M$ matrix

Y = \begin{pmatrix} y_{1,1} & y_{1,2} & \cdots & y_{1,M} \\ y_{2,1} & y_{2,2} & \cdots & y_{2,M} \\ \vdots & \vdots & \ddots & \vdots \\ y_{K,1} & y_{K,2} & \cdots & y_{K,M} \end{pmatrix}   (12)

where the $k$-th row contains the sub-samples obtained by correlating the measured signal with the waveform $\phi_k$ over $M$ subperiods, each of length $T/M$.

The original $K$ samples, i.e., the samples collected at the BMIs over the whole time period $T$, are

y_k = \sum_{m=1}^{M} [Y]_{k,m},  k = 1, \ldots, K   (13)

where $[Y]_{k,m}$ denotes the $(k,m)$-th element of $Y$, that is, $[Y]_{k,m} = y_{k,m}$.

In order to construct samples additional to those obtained using (13), we consider columnwise permuted versions of $Y$. The following definitions are then in order. A permutation $\pi$ is a one-to-one mapping of the elements of a set $D$ to itself which simply changes the order of the elements; $\pi(k)$ stands for the index of the $k$-th element in the permuted set. For example, let $D$ consist of the elements of a $K \times 1$ vector $z$, with the order of the elements in $D$ the same as in $z$. After applying the permutation function $\pi$ to $z$, the permuted vector is $z_\pi = (z_{\pi(1)}, \ldots, z_{\pi(k)}, \ldots, z_{\pi(K)})^T$. If the vector $z$ is itself the vector of indexes, i.e., $z = (1, \ldots, K)^T$, then obviously $z_{\pi(k)} = \pi(k)$.

The permuted versions of the sub-sample matrix $Y$ can be obtained by applying different permutations to different columns of $Y$. Specifically, let $P^{(i)} = \{\pi_1^{(i)}, \ldots, \pi_j^{(i)}, \ldots, \pi_M^{(i)}\}$ be the $i$-th set of column permutations, with $\pi_j^{(i)}$ being the permutation function applied to the $j$-th column of $Y$, and let $I$ stand for the number of such permutation sets. Then, according to the above notation, the matrix resulting from applying the set of permutations $P^{(i)}$ to the columns of $Y$ can be expressed as $Y_{P^{(i)}} = [(y_1)_{\pi_1^{(i)}}, \ldots, (y_j)_{\pi_j^{(i)}}, \ldots, (y_M)_{\pi_M^{(i)}}]$, where $y_j$ is the $j$-th column of $Y$.

The permutation sets $P^{(i)}$, $i = 1, \ldots, I$ are chosen in such a way that all sub-samples in a specific row of $Y_{P^{(i)}}$ come from different rows of the original sub-sample matrix $Y$ as well as from different rows of the other permuted matrices $Y_{P^{(1)}}, \ldots, Y_{P^{(i-1)}}$. For example, all sub-samples in a specific row of $Y_{P^{(1)}}$ must come from different rows of the original matrix $Y$ only, while the sub-samples in a specific row of $Y_{P^{(2)}}$ must come from different rows of $Y$ and $Y_{P^{(1)}}$, and so on. This requirement is enforced to make sure that any additional sample has the least possible correlation with the original samples of (13). Then the additional $KI$ samples can be obtained based on the permuted matrices $Y_{P^{(i)}}$, $i = 1, \ldots, I$ as

y_k^{P^{(i)}} = \sum_{m=1}^{M} [Y_{P^{(i)}}]_{k,m},  k = 1, \ldots, K,  i = 1, \ldots, I.   (14)

It is worth noting that, in terms of the hardware structure, the sub-samples used to generate additional samples must be chosen from different BMIs as well as from different integration subperiods. This is equivalent to collecting additional samples by correlating the signal with additional sampling waveforms which are not present among the actual BMI sampling waveforms. Each of these additional sampling waveforms comprises the non-overlapping subperiods of $M$ different original waveforms.

Now the question is how many permuted matrices satisfying the above conditions can be generated based on $Y$. Consider the following $K \times M$ matrix

Z \triangleq (z, z, \ldots, z)  (M times)   (15)

where $z$ is the vector of indexes. Applying the column permutation set $P^{(i)}$ to the columns of $Z$, we obtain a permuted matrix $Z_{P^{(i)}} = [z_{\pi_1^{(i)}}, \ldots, z_{\pi_j^{(i)}}, \ldots, z_{\pi_M^{(i)}}]$. Then the set of all permuted versions of $Z$ can be denoted as $S_Z = \{Z_{P^{(1)}}, \ldots, Z_{P^{(I)}}\}$.
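The construction in (14) and the "different rows" requirement can be sketched numerically. Below, each column of $Y$ is cyclically shifted by a different amount (one simple choice of column permutations; the names and sizes are ours), additional samples are formed by row sums, and the code verifies that every additional sample shares at most one sub-sample with any original row of $Y$:

```python
import numpy as np

# Sketch of building additional samples via column permutations of the
# sub-sample matrix, as in (14).  Cyclic shifts are one simple choice meeting
# the "at most one shared sub-sample" requirement; names are illustrative.
rng = np.random.default_rng(4)
K, M = 8, 4
Y = rng.standard_normal((K, M))                 # sub-sample matrix, as in (12)

def permute_columns(Y, shifts):
    """Cyclically shift each column: row k of column j takes the sub-sample
    from row (k + shifts[j]) mod K of the original matrix."""
    K, M = Y.shape
    cols = [Y[(np.arange(K) + shifts[j]) % K, j] for j in range(M)]
    return np.column_stack(cols)

shifts = [0, 1, 2, 3]                           # column j shifted by j rows
Y_p = permute_columns(Y, shifts)
extra = Y_p.sum(axis=1)                         # additional samples, as in (14)

# Additional sample k uses sub-samples (l, m) with l = (k + m) mod K, so it
# overlaps each original row of Y in at most one position (since M <= K).
for k in range(K):
    used = {((k + m) % K, m) for m in range(M)}
    for row in range(K):
        assert sum(1 for (l, m) in used if l == row) <= 1

print("additional samples built:", extra.shape[0])
```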
With these notations, the following theorem is in order.

Theorem 1. The size of $S_Z$, i.e., the number $I$ of permutation sets $P^{(i)}$, $i = 1, \ldots, I$ which satisfy the conditions

[Z_{P^{(i)}}]_{k,j} \neq [Z_{P^{(i)}}]_{k,r},  \forall Z_{P^{(i)}} \in S_Z,  j \neq r,  k \in \{1, \ldots, K\},  j, r \in \{1, \ldots, M\}   (16)

\exists! j or \nexists j such that [Z_{P^{(i)}}]_{k,j} = [Z_{P^{(l)}}]_{h,j},  \forall Z_{P^{(i)}}, Z_{P^{(l)}} \in S_Z,  Z_{P^{(i)}} \neq Z_{P^{(l)}},  \forall j \in \{1, \ldots, M\},  \forall k, h \in \{1, \ldots, K\}   (17)

is at most $K - 1$. Here $[Z_{P^{(i)}}]_{k,j}$ stands for the $(k,j)$-th element of the permuted matrix $Z_{P^{(i)}}$.

Remark 1. Using the property that $z_{\pi(k)} = \pi(k)$ for the vector of indexes $z$, the conditions (16) and (17) can also be expressed in terms of permutations as

\pi_j^{(i)}(k) \neq \pi_r^{(i)}(k),  \forall i \in \{1, \ldots, I\},  j \neq r,  k \in \{1, \ldots, K\},  j, r \in \{1, \ldots, M\}   (18)

\exists! j or \nexists j such that \pi_j^{(i)}(k) = \pi_j^{(l)}(h),  \forall i, l \in \{1, \ldots, I\},  i \neq l,  \forall j \in \{1, \ldots, M\},  \forall k, h \in \{1, \ldots, K\}.   (19)

Proof: See Appendix A.

Example 1: Let the specific choice of index permutations be $\pi_s(k) = ((s + k - 2) \bmod K) + 1$, $s, k = 1, \ldots, K$, with $\pi_1$ being the identity permutation and 'mod' standing for the modulo operation. For this specific choice, $\pi_j^{(i)} = \pi_{[i(j-1) \bmod K]+1}$, $i = 1, \ldots, K-1$, $j = 1, \ldots, M$. Consider the following matrix notation for the set $P$, where the elements along the $i$-th row are the permutations of $P^{(i)}$, $i = 1, \ldots, I$:

P \triangleq \begin{pmatrix} P^{(1)} \\ P^{(2)} \\ P^{(3)} \\ \vdots \\ P^{(K-2)} \\ P^{(K-1)} \end{pmatrix} = \begin{pmatrix} \pi_1^{(1)} & \pi_2^{(1)} & \pi_3^{(1)} & \cdots & \pi_M^{(1)} \\ \pi_1^{(2)} & \pi_2^{(2)} & \pi_3^{(2)} & \cdots & \pi_M^{(2)} \\ \pi_1^{(3)} & \pi_2^{(3)} & \pi_3^{(3)} & \cdots & \pi_M^{(3)} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \pi_1^{(K-2)} & \pi_2^{(K-2)} & \pi_3^{(K-2)} & \cdots & \pi_M^{(K-2)} \\ \pi_1^{(K-1)} & \pi_2^{(K-1)} & \pi_3^{(K-1)} & \cdots & \pi_M^{(K-1)} \end{pmatrix} = \begin{pmatrix} \pi_1 & \pi_2 & \pi_3 & \cdots & \pi_M \\ \pi_1 & \pi_3 & \pi_5 & \cdots & \pi_{[2(M-1) \bmod K]+1} \\ \pi_1 & \pi_4 & \pi_7 & \cdots & \pi_{[3(M-1) \bmod K]+1} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \pi_1 & \pi_{K-1} & \pi_{K-3} & \cdots & \pi_{[(K-2)(M-1) \bmod K]+1} \\ \pi_1 & \pi_K & \pi_{K-1} & \cdots & \pi_{[(K-1)(M-1) \bmod K]+1} \end{pmatrix}.   (20)

Note that not all permutation sets $P^{(i)}$, $i = 1, \ldots, I$ used in (20) may be permissible. In fact, a set of permutations $P^{(i)}$ with $K/\gcd(i, K) < M$ has at least one repeated permutation, which contradicts the condition (18). Here $\gcd(\cdot, \cdot)$ stands for the greatest common divisor of two numbers. For example, for $K = 8$ and $M = 4$, $K/\gcd(4, K) = 2 < M$, and so $P^{(4)}$ is impermissible. Therefore, instead of $K - 1 = 7$, only the following six sets of permutations are allowed

P = \begin{pmatrix} \pi_1^{(1)} & \pi_2^{(1)} & \pi_3^{(1)} & \pi_4^{(1)} \\ \pi_1^{(2)} & \pi_2^{(2)} & \pi_3^{(2)} & \pi_4^{(2)} \\ \pi_1^{(3)} & \pi_2^{(3)} & \pi_3^{(3)} & \pi_4^{(3)} \\ \pi_1^{(4)} & \pi_2^{(4)} & \pi_3^{(4)} & \pi_4^{(4)} \\ \pi_1^{(5)} & \pi_2^{(5)} & \pi_3^{(5)} & \pi_4^{(5)} \\ \pi_1^{(6)} & \pi_2^{(6)} & \pi_3^{(6)} & \pi_4^{(6)} \end{pmatrix} = \begin{pmatrix} \pi_1 & \pi_2 & \pi_3 & \pi_4 \\ \pi_1 & \pi_3 & \pi_5 & \pi_7 \\ \pi_1 & \pi_4 & \pi_7 & \pi_2 \\ \pi_1 & \pi_6 & \pi_3 & \pi_8 \\ \pi_1 & \pi_7 & \pi_5 & \pi_3 \\ \pi_1 & \pi_8 & \pi_7 & \pi_6 \end{pmatrix}.   (21)

Theorem 1 shows how many different permuted versions of the original sub-sample matrix $Y$ can be obtained such that the correlation between the original and additional samples is minimal. Indeed, since the sets of sub-samples used to build additional samples are chosen in such a way that additional samples have at most one sub-sample in common with the previous samples, i.e., conditions (18) and (19) are satisfied, the set of permutations (20) is a valid candidate. The $i$-th element of $P$, i.e., the element $P^{(i)} = (\pi_1^{(i)}, \ldots, \pi_M^{(i)})$, is the set of permutations applied to $Y$ to obtain $Y_{P^{(i)}}$. Adding up the entries along the rows of $Y_{P^{(i)}}$, a set of $K$ additional samples can be obtained.

Example 2: Let the number of new samples $K_a$ be at most $K$. This means that all permutations are given by only $P^{(1)}$ in (20). In this special case, the sub-sample selection method can be summarized as follows.
To construct the $(K+1)$-st sample, the $M$ sub-samples on the main diagonal of $Y$ are summed up together. Then the $M$ sub-samples on the second diagonal are used to construct the $(K+2)$-nd sample, and so on up to the $K_a$-th additional sample. Mathematically, the so-constructed additional samples can be expressed in terms of the elements of $Y$ as

y_{K+k} = \sum_{m=1}^{M} y_{l,m},  k = 1, \ldots, K_a   (22)

where $l = [(k + m - 2) \bmod K] + 1$ and $K_a \leq K$. Fig. 2 shows schematically how the sub-samples are selected in this example.

[Fig. 2. Sub-sample selection principle for building additional samples in Example 2: the sample $y_{K+k}$ sums the $M$ sub-samples along the $k$-th (cyclically wrapped) diagonal of the $K \times M$ matrix $Y$.]

Our segmented sampling process can be equivalently expressed in terms of the measurement matrix. Let $\Phi$ be the original $K \times N$ measurement matrix, and let the $k$-th row of $\Phi$ be $\phi_k = (\phi_{k,1}, \ldots, \phi_{k,M})$, where $\phi_{k,j}$, $j = 1, \ldots, M$ are some vectors. For simplicity, let the length of $\phi_{k,j}$ be $N/M$, with $N/M$ an integer. Recall that the set of permutations applied to $Y$ in order to obtain $Y_{P^{(i)}}$ is $P^{(i)}$. Then the operation $\Phi_{P^{(i)}}$ can be expressed as follows. The first $N/M$ columns of $\Phi$, which form the vectors $\phi_{k,1}$, $k \in \{1, \ldots, K\}$, are permuted with $\pi_1^{(i)}$; the second $N/M$ columns of $\Phi$ are permuted with $\pi_2^{(i)}$, and so on, until the last $N/M$ columns of $\Phi$, which are permuted with $\pi_M^{(i)}$. Then the extended measurement matrix which combines all possible permutations $P^{(i)}$, $i = 1, \ldots, I$ can be expressed as

\Phi_e = [ \Phi^T, (\Phi_{P^{(1)}})^T, \ldots, (\Phi_{P^{(I)}})^T ]^T   (23)

and its number of rows is $K_e \triangleq K + K_a = K + KI$.

Example 3: Continuing with the setup used in Example 2, let $K_a \leq K$.
Then the extended measurement matrix is

\Phi_e = \begin{pmatrix} \Phi \\ \Phi_1 \end{pmatrix} = \begin{pmatrix} \phi_{1,1} & \phi_{1,2} & \cdots & \phi_{1,M} \\ \vdots & \vdots & \ddots & \vdots \\ \phi_{K,1} & \phi_{K,2} & \cdots & \phi_{K,M} \\ \phi_{1,1} & \phi_{2,2} & \cdots & \phi_{M,M} \\ \vdots & \vdots & \ddots & \vdots \\ \phi_{K_a,1} & \phi_{\pi_2(K_a),2} & \cdots & \phi_{\pi_M(K_a),M} \end{pmatrix}   (24)

where $\Phi_1$ contains only $K_a$ rows of $\Phi_{P^{(1)}}$, and $\Phi_1 = \Phi_{P^{(1)}}$ if $K_a = K$.

B. Implementation Issues and Discussion

Due to the special structure of the extended measurement matrix $\Phi_e$, the sampling hardware needs only $K$ parallel BMIs for collecting $K_e = K + KI$ samples. These BMIs are essentially the same as those in Fig. 1. The only difference is that the integration period $T$ is divided into $M$ equal subperiods. After every subperiod, each integrator's output is sampled and the integrator is reset. In addition, a multiplexer which selects the sub-samples for constructing additional samples is needed. Note that partial sums can be kept for constructing the samples (original and additional), that is, the results of the integration are updated and accumulated for each sample iteratively after each subperiod. In this way, there is no need to design circuitry that memorizes the matrix of sub-samples $Y$; only the partial sums for each sample are memorized at any current subperiod.

Since the proposed segmented AIC scheme collects the sub-samples at an $M$ times higher rate than the AIC in Fig. 1, an improved signal recovery performance is expected. This agrees with the convention that the recovery performance cannot be improved by post-processing alone. Moreover, note that since the original random sampling waveforms are linearly independent with high probability, the additional sampling waveforms of our segmented compressed sampling method are also linearly independent with overwhelming probability. However, a sufficient condition that guarantees that the extended measurement matrix of the proposed segmented AIC scheme is an eligible choice is the RIP.
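The equivalence behind (24) can be checked numerically: the additional samples formed by summing diagonally selected sub-samples, as in (22), coincide with the product of a block-permuted matrix $\Phi_1$ and the signal. A sketch with illustrative names, using the cyclic shift of Example 2 in 0-based indexing (block $j$ of row $k$ of $\Phi_1$ comes from row $(k + j) \bmod K$ of $\Phi$):

```python
import numpy as np

# Numerical check of (24): additional samples formed from permuted sub-samples
# equal Phi_1 @ f, where Phi_1 permutes the rows of each N/M-column block of
# Phi.  Illustrative sketch, not the authors' code.
rng = np.random.default_rng(5)
K, N, M = 8, 64, 4
seg = N // M

Phi = rng.standard_normal((K, N)) / np.sqrt(N)
f = rng.standard_normal(N)

# Sub-sample matrix: Y[k, j] = <block j of phi_k, block j of f>
Y = np.empty((K, M))
for j in range(M):
    sl = slice(j * seg, (j + 1) * seg)
    Y[:, j] = Phi[:, sl] @ f[sl]

# Additional samples from diagonally selected sub-samples, as in (22)
extra_from_Y = np.array(
    [sum(Y[(k + j) % K, j] for j in range(M)) for k in range(K)]
)

# The same samples via the extended matrix: block j of row k of Phi_1 is
# block j of original row (k + j) mod K
Phi_1 = np.empty_like(Phi)
for j in range(M):
    sl = slice(j * seg, (j + 1) * seg)
    Phi_1[:, sl] = Phi[(np.arange(K) + j) % K, sl]
extra_from_Phi1 = Phi_1 @ f

print("sub-sample reuse equals extended matrix:",
      np.allclose(extra_from_Y, extra_from_Phi1))
```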
Therefore, the RIP for the proposed segmented compressed sampling scheme is analyzed in the next section.

IV. RIP FOR THE SEGMENTED COMPRESSED SAMPLING METHOD

The purpose of this section is to show that the extended measurement matrix $\Phi_e$ in (23) satisfies the RIP if the original measurement matrix $\Phi$ satisfies it. The latter will also imply that $\Phi_e$ can be used as a valid CS measurement matrix. In our setup it is only assumed that the elements of the original measurement matrix are i.i.d. zero-mean Gaussian variables and that the measurement matrix is extended by adding its permuted versions as described in the previous section.

Let us first consider the special case of Example 3. In this case, $\Phi$, $\Phi_1$, and $\Phi_e$ are the original measurement matrix, the matrix of additional sampling waveforms, and the extended measurement matrix given by (24), respectively. Let the matrix $\Phi$ satisfy the RIP with sufficiently high probability. For example, let the elements of $\Phi$ be i.i.d. zero-mean Gaussian random variables with variance $1/N$. Let $T$ be any subset of size $S$ of the set $\{1, \ldots, N\}$. Then, for any $0 < \delta_S < 1$, the matrix $\Phi_T$, i.e., the sub-matrix of $\Phi$ consisting of only the columns with indexes in the set $T$, satisfies (2) with the following probability [21]

Pr\{\Phi_T satisfies (2)\} \geq 1 - 2 (12/\delta_S)^S e^{-C_0(\delta_S/2) K}   (25)

where $C_0(\delta_S/2) = \delta_S^2/16 - \delta_S^3/48$. Hereafter, the notation $C_0$ is used instead of $C_0(\delta_S/2)$ for brevity.

First, the following auxiliary result on the extended measurement matrix $\Phi_e$ is of interest.

Lemma 1. Let the elements of the measurement matrix $\Phi$ be i.i.d. zero-mean Gaussian variables with variance $1/N$, let $\Phi_e$ be formed as shown in (24), and let $T \subset \{1, \ldots, N\}$ be of size $S$.
If K a is chose n such that min { K, K a + M − 1 } ≤ ⌈ ( K + K a ) / 2 ⌉ , the n for any 0 < δ S < 1 , the following inequality holds Pr { ( Φ e ) T satisfies (2) } ≥ 1 − 4 (12 /δ S ) S e − C 0 ⌊ K + K a 2 ⌋ (26) where ⌈ x ⌉ a nd ⌊ x ⌋ are the smallest inte ger lar ger than or e qual to x a nd the larg est inte ger smaller than or equa l to x , respectively , and C 0 is a c onstant g iven after (25) . Pr oof: Se e Append ix B. Using the above lemma, the following main resu lt, which s tates that the extended mea surement matrix Φ e in (24) sa tisfies the RIP , c an be also proved. Theorem 2. Le t Φ e be for med as in (24) and le t the e lements of Φ be i.i.d. z er o mean Gaus sian var iables with va riance 1 / N . If min { K, K a + M − 1 } ≤ ⌈ ( K + K a ) / 2 ⌉ , then for a ny 0 < δ S < 1 , the r e exist constants C 3 and C 4 , wh ich depend only on δ S , s uch tha t for S ≤ C 3 ⌊ ( K + K a ) / 2 ⌋ / log ( N/S ) the inequality (2) ho lds for all S -sparse vectors with pr obability that satisfie s the following inequa lity Pr { Φ e satisfies RIP } ≥ 1 − 4 e − C 4 ⌊ ( K + K a ) / 2 ⌋ (27) where C 4 = C 0 − C 3 [1 + (1 + log (12 / δ S )) / log ( N/S )] a nd C 3 is small enough that guaran tees that C 4 is positive. June 6, 2018 DRAFT 14 Pr oof: Se e Append ix C. Let us cons ider now the gen eral case when the n umber of additional samples K a is larger than the number of BMIs K , i.e., K a > K , K e > 2 K , and the extend ed mea surement matrix is given by (23). Note that wh ile proving L emma 1 for the spe cial ca se of Examp le 3 , we were able to split the rows of Φ e into two sets each consisting of independe nt e ntries. In the general case, some of the entries of the original measureme nt ma trix appear more than twi ce in the extende d me asurement matrix Φ e , and it is no longer p ossible to split the rows o f Φ e into on ly two sets wit h indepen dent entries. Due to the way that the additional sa mples are buil t, the s amples y lK +1 , y lK +2 , . . . 
, y_{(l+1)K} obtained based on the permuted matrix Y_P(l), i.e., the l-th set of additional samples, are uncorrelated with each other, but they are correlated with every other set of samples based on the original matrix Y and the permuted matrices Y_P(i), for all i ≠ l. Thus, the following principle can be used when partitioning the rows of Φ_e into sets with independent entries. First, the rows corresponding to the original samples form a single set with independent entries; then the rows corresponding to the first set of additional samples based on the matrix Y_P(1) form another set, and so on. The number of such sets is n_p = ⌈K_e/K⌉, while the size of each set is

K_i = K,  for 1 ≤ i ≤ ⌈K_e/K⌉ − 1
K_i = K_e − (⌈K_e/K⌉ − 1) K,  for i = ⌈K_e/K⌉    (28)

The extended measurement matrix (23) can be rewritten as

Φ_e = [(Φ_e)_1^T, (Φ_e)_2^T, . . . , (Φ_e)_{n_p}^T]^T    (29)

where (Φ_e)_i is the i-th partition of Φ_e, of size given by (28). Then the general form of Lemma 1 is as follows.

Lemma 2. Let the elements of the measurement matrix Φ be i.i.d. zero mean Gaussian variables with variance 1/N, let Φ_e be the extended measurement matrix (23), and let T ⊂ {1, . . . , N} be of size S. Let also K_a > K and n_p = ⌈K_e/K⌉. Then, for any 0 < δ_S < 1, the following inequality holds

Pr{(Φ_e)_T satisfies (2)} ≥ 1 − 2(n_p − 1)(12/δ_S)^S e^{−C_0 K} − 2 (12/δ_S)^S e^{−C_0 K_{n_p}}    (30)

where K_{n_p} = K_e − (⌈K_e/K⌉ − 1) K and C_0 is the constant given after (25).

Proof: See Appendix D.

Lemma 2 is needed to prove that the extended measurement matrix (29) satisfies the RIP. Therefore, the general version of Theorem 2 is as follows.

Theorem 3. Let the elements of Φ be i.i.d. zero mean Gaussian variables with variance 1/N and let Φ_e be formed as in (23).
If K_a > K, then for any 0 < δ_S < 1, there exist constants C_3, C_4 and C_4′ such that for S ≤ C_3 K_{n_p} / log(N/S) the inequality (2) holds for all S-sparse vectors with probability that satisfies

Pr{Φ_e satisfies RIP} ≥ 1 − 2(n_p − 1) e^{−C_4′ K} − 2 e^{−C_4 K_{n_p}}    (31)

where C_4′ = C_0 − (C_3 K_{n_p}/K) [1 + (1 + log(12/δ_S)) / log(N/S)], C_4 is given after (27), and C_3 is small enough to guarantee that C_4 and C_4′ are both positive.

Proof: See Appendix E.

When splitting the rows of Φ_e into a number of sets as described before Lemma 2, it may happen that the last subset (Φ_e)_{n_p} has the smallest size K_{n_p}. As a result, the dominant term in (31) will likely be the term 2 e^{−C_4 K_{n_p}}. Moreover, it may lead to a more stringent sparsity condition, that is, S ≤ C_3 K_{n_p} / log(N/S). To improve the lower bound in (31), we can move some of the rows from (Φ_e)_{n_p−1} to (Φ_e)_{n_p} in order to make the last two partitions of almost the same size. Then the requirement on the sparsity level becomes S ≤ C_3 K′ / log(N/S) where K′ = ⌊(K + K_{n_p})/2⌋. Therefore, the lower bound on the probability calculated in (31) improves.

V. PERFORMANCE ANALYSIS OF THE RECOVERY

In this section, we aim at answering the question of whether signal recovery also improves if the proposed segmented AIC method, i.e., the extended measurement matrix Φ_e in (23), is used instead of the original matrix Φ. The study is performed based on the empirical risk minimization method for signal recovery from noisy random projections [4]. As mentioned in Section II, the LASSO method can be viewed as one of the possible implementations of the empirical risk minimization method. We first consider the special case of Example 3 when the extended measurement matrix is given by (24).
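Before specializing to particular entry distributions, it may help to see the extension (24) in code. The sketch below is an illustrative reconstruction under stated assumptions, not the paper's exact indexing: each additional row reuses one length-L segment from each of M cyclically chosen rows of Φ, in the spirit of the cyclic permutations (73), so the first K rows of the result are Φ itself.

```python
import numpy as np

def extend_matrix(Phi, M):
    """Sketch of the segmented extension (Example 3 style, K_a = K): each
    additional row concatenates segments taken from cyclically shifted rows
    of Phi, so every new row shares exactly one segment with each of M
    original rows. Illustrative only; the cyclic shift is an assumption."""
    K, N = Phi.shape
    L = N // M                          # segment length
    extra = np.empty_like(Phi)          # K_a = K additional rows
    for k in range(K):
        for m in range(M):
            src = (k + m) % K           # hypothetical cyclic choice of source row
            extra[k, m * L:(m + 1) * L] = Phi[src, m * L:(m + 1) * L]
    return np.vstack([Phi, extra])      # the extended matrix, as in (24)

Phi = np.random.default_rng(1).standard_normal((16, 128)) / np.sqrt(128)
Phi_e = extend_matrix(Phi, M=8)
print(Phi_e.shape)   # (32, 128)
```

Note that the rows of the lower block are built entirely from entries of Φ, which is exactly why the RIP analysis above has to handle correlated rows.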
Let the entries of the measurement matrix Φ be selected with equal probability as ±1/√N, i.e., be i.i.d. Bernoulli distributed with variance 1/N. This assumption is the same as in [4] and is used here in order to shorten our derivations by emphasizing only the differences caused by our construction of the matrix Φ_e, where some rows are correlated with each other, as compared to the case analyzed in [4], where the measurement matrix consists of all i.i.d. entries. Note that our results can be easily applied to the case of Gaussian distributed entries of Φ by only changing the moments of the Bernoulli distribution to the moments of the Gaussian distribution.

Let r(f̂, f) ≜ r(f̂) − r(f) be the "excess risk" between the candidate reconstruction f̂ of the signal sampled using the extended measurement matrix Φ_e and the actual signal f, and let r̂(f̂, f) ≜ r̂(f̂) − r̂(f) be the "empirical excess risk" between the candidate signal reconstruction and the actual signal, where r(f̂) and r̂(f̂) are defined in (6). Then the difference between the "excess risk" and the "empirical excess risk" can be found as

r(f̂, f) − r̂(f̂, f) = (1/K_e) Σ_{j=1}^{K_e} (U_j − E[U_j])    (32)

where U_j ≜ (y_j − φ_j f)² − (y_j − φ_j f̂)². The mean-square error (MSE) between the candidate reconstruction and the actual signal can be expressed as [24]

MSE ≜ E{‖g‖²} = N r(f̂, f)    (33)

where g ≜ f̂ − f. Therefore, if we know an upper bound on the right-hand side of (32), denoted hereafter as U, we can immediately find an upper bound on the MSE in the form MSE ≤ N r̂(f̂, f) + N U. In other words, to find the candidate reconstruction f̂ one can minimize r̂(f̂, f) + U, which will also result in a bound on the MSE as in (8). The Craig–Bernstein inequality [4], [26] can be used in order to find an upper bound U on the right-hand side of (32).
In the notation of our paper, this inequality states that the probability of the event

(1/K_e) Σ_{j=1}^{K_e} (U_j − E{U_j}) ≤ log(1/δ)/(K_e ε) + ε var{Σ_{j=1}^{K_e} U_j} / (2 K_e (1 − ζ))    (34)

is greater than or equal to 1 − δ for 0 < εh ≤ ζ < 1, if the random variables U_j satisfy the following moment condition for some h > 0 and all k ≥ 2:

E{|U_j − E{U_j}|^k} ≤ (k!/2) var{U_j} h^{k−2}.    (35)

The second term on the right-hand side of (34) contains the variance var{Σ_{j=1}^{K_e} U_j}, which we need to calculate or at least upper bound. In the case of the extended measurement matrix, the random variables U_j, j = 1, . . . , K_e, all satisfy the moment condition of the Craig–Bernstein inequality [26] with the same coefficient h = 16eB² + 8√2 Bσ, where σ² is the variance of the Gaussian noise. (The derivation of the coefficient h coincides with a similar derivation in [4] and is therefore omitted.) Moreover, it is easy to show that the following bound on the variance of U_j is valid for the extended measurement matrix (this bound also coincides with a similar one in [4]):

var{U_j} ≤ 2(‖g‖²/N)² + 4σ² ‖g‖²/N ≤ (8B² + 4σ²) r(f̂, f).    (36)

However, unlike [4], in the case of the extended measurement matrix the variables U_j are not independent of each other. Thus, we cannot simply replace the term var{Σ_{j=1}^{K_e} U_j} with the sum of the variances of U_j, j = 1, . . . , K_e. Using the definition of the variance, we can write

var{Σ_{j=1}^{K_e} U_j} ≜ E{(Σ_{j=1}^{K_e} U_j)²} − (E{Σ_{j=1}^{K_e} U_j})²
= Σ_{j=1}^{K_e} E{U_j²} + 2 Σ_{i=1}^{K_e−1} Σ_{j=i+1}^{K_e} E{U_i U_j} − K_e² (‖g‖²/N)²
= Σ_{j=1}^{K_e} (E{U_j²} − (‖g‖²/N)²) + 2 Σ_{i=1}^{K_e−1} Σ_{j=i+1}^{K_e} (E{U_i U_j} − (‖g‖²/N)²)
= Σ_{j=1}^{K_e} var{U_j} + 2 Σ_{i=1}^{K_e−1} Σ_{j=i+1}^{K_e} (E{U_i U_j} − (‖g‖²/N)²)    (37)

where the upper bound on var{U_j} is given by (36).
Using the fact that the random noise components w_i and w_j are independent of φ_i g and φ_j g, respectively (see the noisy model (4)), E{U_i U_j} can be expressed as

E{U_i U_j} = E{[2 w_i φ_i g − (φ_i g)²][2 w_j φ_j g − (φ_j g)²]}
= 4 E{w_i w_j} E{(φ_i g)(φ_j g)} − 2 E{w_i} E{(φ_i g)(φ_j g)²} − 2 E{w_j} E{(φ_j g)(φ_i g)²} + E{(φ_i g)²(φ_j g)²}.    (38)

The latter expression can be further simplified using the fact that E{w_i} = E{w_j} = 0. Thus, we obtain

E{U_i U_j} = 4 E{w_i w_j} E{(φ_i g)(φ_j g)} + E{(φ_i g)²(φ_j g)²}.    (39)

It is easy to verify that if φ_i and φ_j are independent, then E{U_i U_j} = E{(φ_i g)²} E{(φ_j g)²} = (‖g‖²/N)², which indeed coincides with [4]. However, in our case φ_i and φ_j may depend on each other. If they do, they have L = N/M common entries, while the rest of the entries are independent. In addition, the additive noise terms w_i and w_j are then no longer independent random variables either, and thus E{w_i w_j} = σ²/M. Without loss of generality, let the first L entries of φ_i and φ_j be the same, that is,

φ_i g = A + P_i,  where A = g_1 a_1 + . . . + g_L a_L and P_i = g_{L+1} φ_{i,L+1} + . . . + g_N φ_{i,N}    (40)
φ_j g = A + P_j,  where P_j = g_{L+1} φ_{j,L+1} + . . . + g_N φ_{j,N}    (41)

with a_1, . . . , a_L being the common part between φ_i and φ_j. Let g_A be the sub-vector of g containing the L elements of g corresponding to the common part between φ_i and φ_j, and let g_{A′} be the sub-vector comprising the rest of the elements.
Then, using the fact that A, P_i, and P_j are all zero mean independent random variables, we can express E{(φ_i g)(φ_j g)} from the first term on the right-hand side of (39) as

E{(φ_i g)(φ_j g)} = E{(A + P_i)(A + P_j)} = E{A²} + E{A P_i} + E{A P_j} + E{P_i P_j} = E{A²} = Σ_{k=1}^{L} g_k²/N = ‖g_A‖²/N.    (42)

Similarly, the second term on the right-hand side of (39) can be expressed as

E{(φ_i g)²(φ_j g)²} = E{(A² + P_i² + 2AP_i)(A² + P_j² + 2AP_j)}.    (43)

Using the facts that 4 E{w_i w_j} = 4σ²/M, E{A²} = ‖g_A‖²/N, and E{P_i²} = ‖g_{A′}‖²/N, the expression (43) can be further rewritten as

E{(φ_i g)²(φ_j g)²} = E{A⁴ + A²P_i² + A²P_j² + P_i²P_j²}
= E{A⁴} + 2 (‖g_A‖²/N)(‖g_{A′}‖²/N) + (‖g_{A′}‖²/N)²
= E{A⁴} + (‖g‖²/N)² − (‖g_A‖²/N)².    (44)

Substituting (42) and (44) into (39), we obtain

E{U_i U_j} = (4σ²/M)(‖g_A‖²/N) + E{A⁴} + (‖g‖²/N)² − (‖g_A‖²/N)².    (45)

Moreover, substituting (45) into (37), we find that

var{Σ_{j=1}^{K_e} U_j} = Σ_{j=1}^{K_e} var{U_j} + 2 Σ_{φ_i, φ_j dependent} (E{A⁴} − (‖g_A‖²/N)² + (4σ²/M)(‖g_A‖²/N)).    (46)

Using the fact that the extended measurement matrix is constructed such that each of the waveforms φ_i, i = K+1, . . . , K_e, is built upon M rows of the original matrix, and also using the inequality E{A⁴} − (‖g_A‖²/N)² ≤ 2(‖g_A‖²/N)² (we skip the derivation of this inequality since it is relatively well known and can be found, for example, in [4, p. 4039]), we obtain for every φ_i, i = K+1, . . . , K_e, that

Σ_{k=1}^{M} (E{A⁴} − (‖g_A‖²/N)² + (4σ²/M)(‖g_A‖²/N)) ≤ Σ_{k=1}^{M} (2(‖g_A‖²/N)² + (4σ²/M)(‖g_A‖²/N))    (47)

where g_A corresponds to the first L entries of g for k = 1, to the entries from L + 1 to 2L for k = 2, and so on.
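The key identity (42) is easy to check by simulation. A minimal Monte Carlo sketch with hypothetical sizes (N = 128, M = 8, Bernoulli ±1/√N rows sharing their first L entries):

```python
import numpy as np

rng = np.random.default_rng(0)
N, M = 128, 8
L = N // M                                   # number of shared entries
g = rng.standard_normal(N)

# Two rows phi_i, phi_j that share the common part A on their first L
# entries; (42) says E{(phi_i g)(phi_j g)} = ||g_A||^2 / N.
trials = 30000
common = rng.choice([-1.0, 1.0], size=(trials, L)) / np.sqrt(N)
pi = rng.choice([-1.0, 1.0], size=(trials, N - L)) / np.sqrt(N)
pj = rng.choice([-1.0, 1.0], size=(trials, N - L)) / np.sqrt(N)
vi = common @ g[:L] + pi @ g[L:]
vj = common @ g[:L] + pj @ g[L:]

empirical = float(np.mean(vi * vj))
theory = float(g[:L] @ g[:L] / N)            # ||g_A||^2 / N
print(empirical, theory)                     # the two values agree closely
```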
Applying also the triangle inequality, we find that

Σ_{k=1}^{M} (2(‖g_A‖²/N)² + (4σ²/M)(‖g_A‖²/N)) ≤ 2(‖g‖²/N)² + (4σ²/M)(‖g‖²/N).    (48)

Combining (47) and (48) and using the fact that there are K_a additional rows in the extended measurement matrix, we obtain

2 Σ_{φ_i, φ_j dependent} (E{A⁴} − (‖g_A‖²/N)² + (4σ²/M)(‖g_A‖²/N)) ≤ 4 K_a (‖g‖²/N)² + (8σ² K_a/M)(‖g‖²/N).    (49)

Noticing that ‖g‖²/N = r(f̂, f) and ‖g‖² ≤ 4NB², the right-hand side of the inequality (49) can be further upper bounded as

4 K_a (‖g‖²/N)² + (8σ² K_a/M)(‖g‖²/N) ≤ 16 K_a B² r(f̂, f) + (8σ² K_a/M) r(f̂, f).    (50)

Using the upper bound (50) for the second term in (46) and the upper bound (36) for the first term in (46), we can finally upper bound var{Σ_{j=1}^{K_e} U_j} as

var{Σ_{j=1}^{K_e} U_j} ≤ K_e [8B²(1 + 2K_a/K_e) + 4σ²(1 + 2K_a/(M K_e))] r(f̂, f).    (51)

Therefore, based on the Craig–Bernstein inequality, the probability that for a given candidate signal f̂ the inequality

r(f̂, f) − r̂(f̂, f) ≤ log(1/δ)/(K_e ε) + [8B²(1 + 2K_a/K_e) + 4σ²(1 + 2K_a/(M K_e))] r(f̂, f) ε / (2(1 − ζ))    (52)

holds is greater than or equal to 1 − δ. Let c(f̂) be chosen such that the Kraft inequality Σ_{f̂∈F(B)} 2^{−c(f̂)} ≤ 1 is satisfied (see also [4]), and let δ(f̂) = 2^{−c(f̂)} δ. Applying the union bound to (52), it can be shown that for all f̂ ∈ F(B) and for all δ > 0, the following inequality holds with probability at least 1 − δ:

r(f̂, f) − r̂(f̂, f) ≤ [c(f̂) log 2 + log(1/δ)]/(K_e ε) + [8B²(1 + 2K_a/K_e) + 4σ²(1 + 2K_a/(M K_e))] r(f̂, f) ε / (2(1 − ζ)).
(53)

Finally, setting ζ = εh and

a = [8B²(1 + 2K_a/K_e) + 4σ²(1 + 2K_a/(M K_e))] ε / (2(1 − ζ))    (54)

ε < 1 / [4B²(1 + 2K_a/K_e) + 16eB² + 8√2 Bσ + 2σ²(1 + 2K_a/(M K_e))]    (55)

where 0 < εh ≤ ζ < 1 as required by the Craig–Bernstein inequality, the following inequality holds with probability at least 1 − δ for all f̂ ∈ F(B):

(1 − a) r(f̂, f) ≤ r̂(f̂, f) + [c(f̂) log 2 + log(1/δ)]/(K_e ε).    (56)

The following result on the recovery performance of the empirical risk minimization method is in order.

Theorem 4. Let ε be chosen as

ε = 1/(60 (B + σ)²)    (57)

which satisfies the inequality (55). Then the signal reconstruction f̂_{K_e} given by

f̂_{K_e} = arg min_{f̂∈F(B)} { r̂(f̂) + c(f̂) log 2 / (ε K_e) }    (58)

satisfies the inequality

E{ ‖f̂_{K_e} − f‖²/N } ≤ C_1e min_{f̂∈F(B)} { ‖f̂ − f‖²/N + (c(f̂) log 2 + 4)/(ε K_e) }    (59)

where C_1e is the constant given as

C_1e = (1 + a)/(1 − a),  a = [2(1 + 2K_a/K_e)(B/σ)² + (1 + 2K_a/(M K_e))] / [(30 − 8e)(B/σ)² + (60 − 4√2)(B/σ) + 30]    (60)

with a obtained from (54) for the specific choice of ε in (57).

Proof: The proof follows the exact steps of the proof of the related result for the uncorrelated case [4, pp. 4039–4040], with the exception of using, in our correlated case, the above calculated values of ε (57) and a (60).

Example 4: Let one set of samples be obtained based on the measurement matrix Φ_e with K_a = K, K_e = 2K, and M = 8, and let another set of samples be obtained using a 2K × N measurement matrix with all i.i.d. (Bernoulli) elements. Let also ε be selected as in (57). Then the MSE bounds for these two cases differ from each other only by a constant factor, given for the former case by C_1e in (60) and in the latter case by C_1 (see (8) and the row after).
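For concreteness, the constant C_1e of Theorem 4 can be evaluated numerically from (60); its limiting values as B/σ → 0 and B/σ → ∞ can then be read off directly (a small sketch, with the limits approximated by very small and very large arguments):

```python
import math

def C1e(snr_amp, K_a, K_e, M):
    """Evaluate C_1e = (1 + a)/(1 - a) with a given by (60); snr_amp = B/sigma."""
    r = snr_amp
    a = (2 * (1 + 2 * K_a / K_e) * r**2 + (1 + 2 * K_a / (M * K_e))) / \
        ((30 - 8 * math.e) * r**2 + (60 - 4 * math.sqrt(2)) * r + 30)
    return (1 + a) / (1 - a)

# Example 4 setting: K_a = K, K_e = 2K, M = 8 (only the ratios matter).
print(round(C1e(1e-8, 1, 2, 8), 2))   # B/sigma -> 0 limit: 1.08
print(round(C1e(1e8, 1, 2, 8), 2))    # B/sigma -> infinity limit: 2.88
```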
Considering the two limiting cases when B/σ → 0 and B/σ → ∞, the intervals of change for the corresponding coefficients can be obtained as 1.08 ≤ C_1e ≤ 2.88 and 1.06 ≤ C_1 ≤ 1.63, respectively.

The following result on the achievable recovery performance for a sparse or compressible signal sampled based on the extended measurement matrix Φ_e is also of great interest.

Theorem 5. For a sparse signal f ∈ F_s(B, S) = {f : ‖f‖² ≤ NB², ‖f‖_{l_0} ≤ S} and the corresponding reconstructed signal f̂_{K_e} obtained according to (58), there exists a constant C_2e′ = C_2e′(B, σ) > 0 such that

sup_{f∈F_s(B,S)} E{ ‖f̂_{K_e} − f‖²/N } ≤ C_1e C_2e′ (K_e / (S log N))^{−1}.    (61)

Similarly, for a compressible signal f ∈ F_c(B, α, C_A) = {f : ‖f‖² ≤ NB², ‖f^{(m)} − f‖² ≤ N C_A m^{−2α}} and the corresponding reconstructed signal f̂_{K_e} obtained according to (58), there exists a constant C_2e = C_2e(B, σ, C_A) > 0 such that

sup_{f∈F_c(B,α,C_A)} E{ ‖f̂_{K_e} − f‖²/N } ≤ C_1e C_2e (K_e / log N)^{−2α/(2α+1)}.    (62)

Proof: The proof follows the exact steps of the proofs of the related results for the uncorrelated case [4, pp. 4040–4041], with the exception of using, in our correlated case, the above calculated values of ε (57) and a (60).

Example 5: Let one set of samples be obtained based on the extended measurement matrix Φ_e with K_a = K, K_e = 2K, and M = 8, and let another set of samples be obtained using the K × N measurement matrix with all i.i.d. (Bernoulli) elements. The error bounds corresponding to the case of K uncorrelated samples of [4] and our case of K_e correlated samples are (10) and (61), respectively. The comparison between these two error bounds boils down in this example to comparing 2 C_1 C_2′ and C_1e C_2e′. Assuming the same ε as in (57) for both methods, it holds that C_2e′ = C_2′. Fig.
3 compares C_1e and 2C_1 versus the signal-to-noise ratio (SNR) B²/σ². Since C_1e < 2C_1 for all values of SNR, the quality of the signal recovery, i.e., the corresponding MSE, for the case of the 2K × N extended measurement matrix is expected to be better than the quality of the signal recovery for the case of the K × N measurement matrix with all i.i.d. entries.

Fig. 3. C_1e and 2C_1 versus SNR.

The above results can be easily generalized to the case when K_a > K. Indeed, we only need to recalculate var{Σ_{j=1}^{K_e} U_j} for K_a > K. The only difference from the previous case of K_a ≤ K is the increased number of pairs of dependent rows in the extended measurement matrix Φ_e, which now has a larger size. The latter affects only the second term in (46). In particular, every row in Φ_P(1) depends on M rows of the original measurement matrix Φ. Moreover, the term Σ_{i=1}^{2K−1} Σ_{j=i+1}^{2K} E{U_i U_j} over all these M rows is bounded as in (48). Then, considering all KM pairs of dependent rows from Φ and Φ_P(1), we have

2 Σ_{φ_i, φ_j dependent} (E{A⁴} − (‖g_A‖²/N)² + (4σ²/M)(‖g_A‖²/N)) ≤ 4K (‖g‖²/N)² + (8σ² K/M)(‖g‖²/N).    (63)

Similarly, every row of Φ_P(2) depends on M rows of Φ_P(1) and M rows of Φ. Considering all these 2KM pairs of dependent rows, we have

2 Σ_{φ_i, φ_j dependent} (E{A⁴} − (‖g_A‖²/N)² + (4σ²/M)(‖g_A‖²/N)) ≤ 4(2K)(‖g‖²/N)² + (8σ²(2K)/M)(‖g‖²/N).    (64)

Finally, the number of rows in the last matrix (Φ_e)_{n_p} is K_{n_p} (see (28) and (29)). Every row of (Φ_e)_{n_p} depends on M rows of each of the previous n_p − 1 matrices Φ_P(i), i = 1, . . . , n_p − 1. Considering all (n_p − 1) K_{n_p} M pairs of dependent rows, we have

2 Σ_{φ_i, φ_j dependent} (E{A⁴} − (‖g_A‖²/N)² + (4σ²/M)(‖g_A‖²/N))
≤ 4(n_p − 1) K_{n_p} (‖g‖²/N)² + (8σ²(n_p − 1) K_{n_p}/M)(‖g‖²/N).    (65)

Based on the equations (37) and (63)–(65), we can find the following bound

var{Σ_{j=1}^{K_e} U_j} ≤ K_e [8B²(1 + D/K_e) + 4σ²(1 + D/(M K_e))] r(f̂, f)    (66)

where D = 2K Σ_{i=1}^{n_p−2} i + 2 K_{n_p}(n_p − 1). Note that in the case K_e = n_p K, we have D/K_e = n_p − 1.

Therefore, it can be shown for the general extended matrix (23) that the inequality (56) holds with the following values of a and ε:

a = [8B²(1 + D/K_e) + 4σ²(1 + D/(M K_e))] ε / (2(1 − ζ))    (67)
ε < 1 / [4B²(1 + D/K_e) + 16eB² + 8√2 Bσ + 2σ²(1 + D/(M K_e))]    (68)

Moreover, theorems similar to Theorems 4 and 5 follow straightforwardly with the corrections to a and ε, which are now given by (67) and (68), respectively.

We finally make some remarks on non-RIP conditions for l_1-norm-based recovery. Since the extended measurement matrix of the proposed segmented compressed sampling method satisfies the RIP, the results of [23] on recoverability and stability of the l_1-norm minimization apply straightforwardly. A different non-RIP-based approach for studying the recoverability and stability of the l_1-norm minimization, which uses some properties of the null space of the measurement matrix, is used in [27]. Then the non-RIP sufficient condition for recoverability of a sparse signal from its noiseless compressed samples with the algorithm (3) is [27]

√S < min{ 0.5 ‖v‖_{l_1}/‖v‖_{l_2} : v ∈ N(Φ) \ {0} }    (69)

where N(Φ) denotes the null space of the measurement matrix Φ. Let us show that the condition (69) is also satisfied for the extended measurement matrix Φ_e. Let d be any vector in the null space of Φ_e, i.e., d ∈ N(Φ_e). Therefore, [Φ_e]_i d = 0, i = 1, . . . , K_e, where [Φ_e]_i is the i-th 1 × N row-vector of Φ_e.
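As a brief aside, the coefficient D defined after (66) is easy to tabulate, and the stated identity D/K_e = n_p − 1 for K_e = n_p K can be checked directly (a small sketch with hypothetical sizes):

```python
import math

def D_term(K, K_e):
    """D = 2K * sum_{i=1}^{n_p - 2} i + 2 K_{n_p} (n_p - 1), as defined after (66)."""
    n_p = math.ceil(K_e / K)
    K_np = K_e - (n_p - 1) * K           # size of the last partition, as in (28)
    return 2 * K * sum(range(1, n_p - 1)) + 2 * K_np * (n_p - 1)

K, n_p = 16, 4
print(D_term(K, n_p * K) / (n_p * K))    # D / K_e equals n_p - 1 = 3.0
```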
Since the first K rows of Φ_e are exactly the same as the K rows of Φ, we have [Φ]_i d = 0, i = 1, . . . , K. Therefore, d ∈ N(Φ), and we can conclude that N(Φ_e) ⊂ N(Φ). Due to this property, we have min{0.5 ‖v‖_{l_1}/‖v‖_{l_2} : v ∈ N(Φ)} ≤ min{0.5 ‖v‖_{l_1}/‖v‖_{l_2} : v ∈ N(Φ_e)}. Therefore, if the original measurement matrix Φ satisfies (69), so does the extended measurement matrix Φ_e, and the signal is recoverable from the samples taken by Φ_e. Moreover, the necessary and sufficient condition for all signals with ‖x‖_{l_0} < S to be recoverable from noiseless compressed samples using the l_1-norm minimization (3) is that [27]

‖v‖_{l_1} > 2 ‖v_T‖_{l_1},  ∀ v ∈ N(Φ) \ {0}    (70)

where T is the set of indexes corresponding to the nonzero coefficients of x. It is easy to see that since N(Φ_e) ⊂ N(Φ), the condition (70) also holds for the extended measurement matrix if the original measurement matrix satisfies it.

VI. SIMULATION RESULTS

Throughout our simulations we use a sparse signal of dimension 128 with only 3 nonzero entries, which are set to ±1 with equal probabilities. Since the signal is sparse in the time domain, Ψ = I. The collected samples are assumed to be noisy, i.e., the model (4) applies. In all our simulation examples, three different measurement matrices (sampling schemes) are used: (i) the K × N measurement matrix Φ with i.i.d. entries, referred to as the original measurement matrix; (ii) the extended K_e × N measurement matrix Φ_e obtained using the proposed segmented compressed sampling method, referred to as the extended measurement matrix; and (iii) the K_e × N measurement matrix with all i.i.d. entries, referred to as the enlarged measurement matrix. This last measurement matrix corresponds to the sampling scheme with K_e independent BMIs in the AIC in Fig. 1.
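The basic simulation ingredients can be sketched as follows; the seed and the 15 dB operating point are arbitrary illustrative choices, and the noise variance is obtained from the approximate SNR relation 10 log10(3/(Nσ²)) used in this section:

```python
import numpy as np

rng = np.random.default_rng(1)
N, K, sparsity = 128, 16, 3

# 3-sparse test signal with +-1 nonzero entries (Psi = I, so x = f).
f = np.zeros(N)
support = rng.choice(N, size=sparsity, replace=False)
f[support] = rng.choice([-1.0, 1.0], size=sparsity)

# Original K x N Gaussian measurement matrix, entries i.i.d. N(0, 1/N).
Phi = rng.standard_normal((K, N)) / np.sqrt(N)

# Noise variance from the approximate relation SNR_dB = 10*log10(3 / (N*sigma^2)).
snr_db = 15
sigma2 = 3 / (N * 10 ** (snr_db / 10))

# Noisy samples according to model (4).
y = Phi @ f + np.sqrt(sigma2) * rng.standard_normal(K)
print(y.shape)   # (16,)
```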
The number of segments M in the proposed segmented compressed sampling method is set to 8. To make sure that the measurement noise for the additional samples obtained based on the extended measurement matrix is correlated with the measurement noise of the original samples, the K × M matrix of noisy sub-samples with noise variance σ²/M is generated first. Then the permutations are applied to this matrix, and the sub-samples along each row of the original and permuted matrices are added up to build the noisy samples.

The recovery performance for the three aforementioned sampling schemes is measured using the MSE between the recovered and original signals. In all examples, MSE values are computed based on 5000 independent simulation runs for all sampling schemes tested. The SNR is defined as ‖Φf‖²_{l_2}/‖w‖²_{l_2}. Approximating ‖Φf‖²_{l_2} by (K′/N)‖f‖²_{l_2}, which is valid because of (2), the corresponding noise variance σ² can be calculated if the SNR is given, and vice versa. Here K′ = K for the sampling scheme based on the original measurement matrix, while K′ = K_e in the other two schemes. For example, the approximate SNR in dB can be calculated as 10 log_10(3/(Nσ²)).

Recovery based on the l_1-norm minimization algorithm: In our first simulation example, the l_1-norm minimization algorithm (5) is used to recover a signal sampled using the three aforementioned sampling schemes. Since Ψ = I, we have Φ′ = Φ in (5). The number of BMIs in the sampling device is taken to be K = 16, while γ in (5), which is the bound on the square root of the noise energy, is set to √(K′) σ. The entries of the original and enlarged measurement matrices are generated as i.i.d. Gaussian distributed random variables with zero mean and variance 1/N. Fig.
4 shows the MSEs corresponding to all three aforementioned measurement matrices versus the ratio of the number of additional samples to the number of original samples, K_a/K. The results are shown for three different SNR values of 5, 15 and 25 dB.

Fig. 4. Recovery based on the l_1-norm minimization algorithm: MSEs versus K_a/K.

It can be seen from the figure that better recovery quality is achieved by using the extended measurement matrix as compared to the original measurement matrix. The improvements are more significant at high SNRs since the recovery error is proportional to the noise power [23]. As expected, the recovery performance in the case of the extended measurement matrix is not as good as in the case of the enlarged measurement matrix. This difference, however, is small compared to the performance improvement over the original measurement matrix. Note also that in the case of the enlarged measurement matrix, the AIC in Fig. 1 consists of K_e BMIs, while only K BMIs are required in the case of the extended measurement matrix. Thus, the segmented AIC requires K_e − K fewer BMIs. For example, the number of BMIs halves if K_a/K = 1. Additionally, it can be seen that the rate of MSE improvement decreases as the number of collected samples increases. The latter can be observed for both the extended and enlarged measurement matrices and for all three values of SNR.

Recovery based on the empirical risk minimization method: In our second simulation example, the empirical risk minimization method is used to recover a signal sampled using the three aforementioned sampling schemes, tested with K = 24.
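The constrained problem (5) is typically solved with a dedicated convex solver; as a self-contained stand-in, the sketch below uses iterative soft thresholding on the LASSO form mentioned in Section II. The regularization weight, iteration count, the noiseless setting, and the larger K = 32 are illustrative assumptions, not the paper's settings:

```python
import numpy as np

def ista(Phi, y, lam, n_iter=2000):
    """Iterative soft thresholding for min_x 0.5*||y - Phi x||^2 + lam*||x||_1,
    a LASSO stand-in for the constrained l1 problem (5)."""
    step = 1.0 / np.linalg.norm(Phi, 2) ** 2      # 1/L, L = largest eig of Phi^T Phi
    x = np.zeros(Phi.shape[1])
    for _ in range(n_iter):
        z = x + step * (Phi.T @ (y - Phi @ x))    # gradient step on the quadratic
        x = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)  # soft threshold
    return x

rng = np.random.default_rng(5)
N, K = 128, 32                 # more rows than the K = 16 experiment, to keep
f = np.zeros(N)                # this noiseless one-shot demo reliable
f[[5, 40, 90]] = [1.0, -1.0, 1.0]
Phi = rng.standard_normal((K, N)) / np.sqrt(N)
x_hat = ista(Phi, Phi @ f, lam=0.005)
print(np.flatnonzero(np.abs(x_hat) > 0.5))        # recovered support
```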
The minimization problem (7) is solved to obtain a candidate reconstruction f̂_{K′} of the original sparse signal f. Considering f̂_{K′} = Ψ^H x̂_{K′}, the problem (7) can be rewritten in terms of x̂_{K′} as

x̂_{K′} = arg min_{x̂∈X} { r̂(Ψ^H x̂) + c(x̂) log 2 / (ε K′) } = arg min_{x̂∈X} { ‖y − ΦΨ^H x̂‖²_{l_2} + (2 log 2 · log N / ε) ‖x̂‖_{l_0} }    (71)

and solved using the iterative bound optimization procedure [4]. This procedure uses the threshold √(2 log 2 · log N / (λε)), where λ is the largest eigenvalue of the matrix Φ^T Φ. In our simulations, this threshold is set to 0.035 for the case of the extended measurement matrix and to 0.05 for the cases of the original and enlarged measurement matrices. These threshold values are optimized as recommended in [4]. The stopping criterion for the iterative bound optimization procedure is ‖x̂^{(i+1)} − x̂^{(i)}‖_{l_∞} ≤ θ, where ‖·‖_{l_∞} is the l_∞ norm and x̂^{(i)} denotes the value of x̂ obtained in the i-th iteration. The value θ = 0.001 is selected.

Fig. 5 shows the MSEs obtained based on the empirical risk minimization method for all three measurement matrices versus the ratio K_a/K. The results are shown for three different SNR values of 5, 15 and 25 dB.

Fig. 5. Recovery based on the empirical risk minimization method: MSEs versus K_a/K. (a) Measurement matrix with Gaussian distributed entries; (b) measurement matrix with Bernoulli distributed entries.

Two cases are considered: (a) the entries of the original and enlarged measurement matrices are generated as i.i.d.
zero mean Gaussian distributed random variables with variance 1/N, and (b) the entries of the original and enlarged measurement matrices are generated as i.i.d. zero mean Bernoulli distributed random variables with the same variance as in case (a). The same conclusions as in the first example can be drawn in this example. Moreover, the results for cases (a) and (b) are similar to each other. Therefore, the proposed segmented AIC indeed leads to significantly improved signal recovery performance without increasing the number of BMIs.

VII. CONCLUSION

A new segmented compressed sampling method for AIC has been proposed. According to this method, the signal is segmented into M segments and passed through the K BMIs of the AIC to generate a K × M matrix of sub-samples. Then, a number of correlated samples larger than the number of BMIs is constructed by adding up different subsets of sub-samples selected in a specific manner. Due to the inherent structure of the method, the complexity of the sampling device is almost unchanged, while the signal recovery performance is shown to be significantly improved. The complexity increase is only due to the M times higher sampling rate and the necessity to solve a larger optimization problem at the recovery stage, while the number of BMIs remains the same at the sampling stage. The validity and superiority of the proposed segmented AIC method over the conventional AIC are justified through theoretical analysis of the RIP and of the quality of signal recovery. Simulation results also verify the effectiveness and superiority of the proposed segmented AIC method and confirm our theoretical studies.

APPENDIX A: PROOF OF THEOREM 1

The total number of possible permutations of z is K!. Let A be the set of permutations π_s, s = 1, . . . , |A|, that satisfy the following condition:

π_s(k) ≠ π_t(k), s ≠ t, ∀ s, t ∈ {1, . .
. , |A|}, ∀ k ∈ {1, . . . , K}.    (72)

It is easy to see that the number of distinct permutations satisfying the condition (72) is K, so |A| = K. It is also straightforward to see that the choice of such K distinct permutations is not unique. As a specific choice, let the elements of A, i.e., the permutations π_s, s = 1, . . . , K, be

π_s(k) = ((s + k − 2) mod K) + 1,  s, k = 1, . . . , K    (73)

with π_1 being the identity permutation, i.e., the permutation that does not change z. Consider now the matrix Z which consists of M columns z. The i-th set of column permutations of the matrix Z is P(i) = {π_1^(i), . . . , π_M^(i)} and the corresponding permuted matrix is Z_P(i). Let {π_1^(i), . . . , π_M^(i)} be any combination of the K permutations in (73). Then there are K^M possible choices for P(i). However, not all of these possible choices are permissible under the conditions of the theorem. Indeed, let the set P(1) be a combination of permutations from A that satisfies (18). There are I − 1 other sets P(i), i = 2, . . . , I, which satisfy both (18) and (19). Gathering all such sets in one set, we obtain the set P = {P(1), . . . , P(I)}. Now let P(I+1) = [π_1^(I+1), . . . , π_M^(I+1)] be one more set of permutations such that there exists π_m^(I+1), m = 1, . . . , M, with π_m^(I+1) ∉ A. An arbitrary k-th row of Z_P(I+1) is [Z_P(I+1)]_{k,1}, . . . , [Z_P(I+1)]_{k,M}, where [Z_P(I+1)]_{k,1}, . . . , [Z_P(I+1)]_{k,M} ∈ {1, . . . , K}. This exact same row can be found as the first row of one of the permuted matrices Z_P(i), P(i) ∈ P. Specifically, this is the permuted matrix Z_P(i) that is obtained by applying the permutations P(i) = {π_{[Z_P(I+1)]_{k,1}}, . . . , π_{[Z_P(I+1)]_{k,M}}}.
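The cyclic family (73) and the disjointness condition (72) can be checked mechanically; K = 4 below is an arbitrary small example:

```python
def cyclic_perms(K):
    """The K permutations of (73), written 1-indexed as in the text:
    pi_s(k) = ((s + k - 2) mod K) + 1."""
    return [[((s + k - 2) % K) + 1 for k in range(1, K + 1)]
            for s in range(1, K + 1)]

K = 4
perms = cyclic_perms(K)
print(perms[0])                     # pi_1 is the identity: [1, 2, 3, 4]

# Condition (72): two distinct permutations of the family never agree
# at any position, so |A| = K such permutations indeed exist.
ok = all(perms[s][k] != perms[t][k]
         for s in range(K) for t in range(K) if s != t
         for k in range(K))
print(ok)                           # True
```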
The permutation set P^(i) either has to belong to P or has been crossed out from P because of a conflict with some other element P^(l) ∈ P, l ≠ i. In both cases, P^(I+1) cannot be added to P because it would contradict the conditions (18) and (19). Therefore, the set P can be built using only the permutations from the set A, i.e., the K permutations in (73). Rearranging the rows of Z_{P^(i)} in a certain way, one can force the elements in the first column of Z_{P^(i)} to appear in the original increasing order, i.e., enforce the first column to be equivalent to the vector of indices z. This can be done by applying to each permutation in the set P^(i) the inverse permutation (π^(i)_1)^(−1), which itself is one of the permutations in (73). Therefore, the set P^(i) = {π^(i)_1, ..., π^(i)_M} can be replaced by the equivalent set {(π^(i)_1)^(−1) π^(i)_1, ..., (π^(i)_1)^(−1) π^(i)_M} = {π_1, ..., (π^(i)_1)^(−1) π^(i)_M}, where π_1 is the identity permutation and (π^(i)_1)^(−1) π^(i)_j ∈ A. Hence, we can consider only the permutation sets of the form P^(i) = {π_1, ..., π^(i)_j, ..., π^(i)_M}. Since the condition (18) requires that π^(i)_2 be different from π_1, the only available options for the permutations on the second column of Z are the K − 1 permutations π_2, ..., π_K in (73). Therefore, I at most equals K − 1. Note that I can be smaller than K − 1 if for some i ∈ {1, ..., K − 1}, K/gcd(i, K) < M (see also Example 1 after Theorem 1). Thus, in general, I ≤ K − 1.

APPENDIX B: PROOF OF LEMMA 1

Let all the rows of (Φ_e)_T be partitioned into two sets of sizes (cardinalities) as close to each other as possible, where all elements in each set are guaranteed to be statistically independent.
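The splitting argument rests on the fact that squared l_2 norms add across any row partition of a matrix, which is what lets RIP-type bounds for the two blocks combine into one bound for the full matrix. A small numerical sketch (plain i.i.d. Gaussian blocks with arbitrary illustrative sizes, not the actual extended measurement matrix):

```python
import numpy as np

rng = np.random.default_rng(0)
Ke, N_cols = 11, 6  # illustrative sizes: Ke rows, split into two halves
A = rng.normal(0.0, 1.0, (Ke, N_cols))

# Split the rows into two sets of sizes ceil(Ke/2) and floor(Ke/2),
# as in the proof of Lemma 1.
half = (Ke + 1) // 2
A1, A2 = A[:half], A[half:]

c = rng.normal(size=N_cols)
# Squared l2 norms add across the row partition; this identity is what
# turns the two per-block bounds of (75) into the single bound (76).
lhs = np.linalg.norm(A1 @ c) ** 2 + np.linalg.norm(A2 @ c) ** 2
rhs = np.linalg.norm(A @ c) ** 2
assert np.isclose(lhs, rhs)
print("norm additivity holds:", round(lhs, 6), "==", round(rhs, 6))
```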
In particular, note that the elements of the new K_a rows of Φ_e are chosen either from the first K_a + M − 1 rows of Φ, if K_a + M − 1 < K, or from the whole matrix Φ. Therefore, if K_a + M − 1 < K, the last K − K_a − M + 1 rows of Φ play no role whatsoever in the process of extending the measurement matrix, and they are independent of the rows of Φ_1 in (24). These rows are called unused rows. Thus, one can freely add any number of such unused rows to the set of rows in Φ_1 without disrupting its status of being formed by independent Gaussian variables. Since min{K, K_a + M − 1} ≤ ⌈(K + K_a)/2⌉, there exist at least ⌊(K + K_a)/2⌋ − K_a unused rows which can be added to the set of rows in Φ_1. This process describes how the rows of (Φ_e)_T are split into the desired sets (Φ_e)^1_T and (Φ_e)^2_T of statistically independent elements. As a result, the first matrix (Φ_e)^1_T includes the first ⌈(K + K_a)/2⌉ rows of (Φ_e)_T, while the rest of the rows are included in (Φ_e)^2_T. Since the elements of the matrices (Φ_e)^1_T and (Φ_e)^2_T are i.i.d. Gaussian, they satisfy (2) with probabilities equal to or larger than 1 − 2(12/δ_S)^S e^(−C_0 ⌈K_e/2⌉) and 1 − 2(12/δ_S)^S e^(−C_0 ⌊K_e/2⌋), respectively. Therefore, both matrices (Φ_e)^1_T and (Φ_e)^2_T satisfy (2) simultaneously with the common probability

Pr{(Φ_e)^i_T satisfies (2)} ≥ 1 − 2(12/δ_S)^S e^(−C_0 ⌊K_e/2⌋), i = 1, 2.  (74)

Let K'_1 ≜ ⌈K_e/2⌉ and K'_2 ≜ ⌊K_e/2⌋. Consider the event when both (Φ_e)^1_T and (Φ_e)^2_T satisfy (2). Then the following inequalities hold for any vector c ∈ R^S:

Σ_{i=1}^{2} (K'_i/N)(1 − δ_S) ‖c‖²_{l_2} ≤ Σ_{i=1}^{2} ‖(Φ_e)^i_T c‖²_{l_2} ≤ Σ_{i=1}^{2} (K'_i/N)(1 + δ_S) ‖c‖²_{l_2}  (75)

or, equivalently,

(K_e/N)(1 − δ_S) ‖c‖²_{l_2} ≤ ‖(Φ_e)_T c‖²_{l_2} ≤ (K_e/N)(1 + δ_S) ‖c‖²_{l_2}.  (76)

Therefore, if both matrices (Φ_e)^1_T and (Φ_e)^2_T satisfy (2), then the matrix (Φ_e)_T also satisfies (2). Moreover, the probability that (Φ_e)_T does not satisfy (2) can be found as

Pr{(Φ_e)_T does not satisfy (2)} ≤ Pr{(Φ_e)^1_T or (Φ_e)^2_T does not satisfy (2)}
  (a)≤ Σ_{i=1}^{2} Pr{(Φ_e)^i_T does not satisfy (2)}
  (b)≤ 4(12/δ_S)^S e^(−C_0 ⌊K_e/2⌋)  (77)

where the inequality (a) follows from the union bound and the inequality (b) follows from (74). Thus, the inequality (26) holds.

APPENDIX C: PROOF OF THEOREM 2

According to (26), the matrix (Φ_e)_T does not satisfy (2) with probability less than or equal to 4(12/δ_S)^S e^(−C_0 ⌊K_e/2⌋) for any subset T ⊂ {1, ..., N} of cardinality S. Since there are (N choose S) ≤ (Ne/S)^S different subsets T of cardinality S, Φ_e does not satisfy the RIP with probability

Pr{Φ_e does not satisfy the RIP} ≤ 4 (N choose S) (12/δ_S)^S e^(−C_0 ⌊K_e/2⌋)
  ≤ 4 (Ne/S)^S (12/δ_S)^S e^(−C_0 ⌊K_e/2⌋)
  = 4 e^(−(C_0 ⌊K_e/2⌋ − S[log(Ne/S) + log(12/δ_S)]))
  ≤ 4 e^(−(C_0 ⌊K_e/2⌋ − C_3 [log(Ne/S) + log(12/δ_S)] ⌊K_e/2⌋ / log(N/S)))
  = 4 e^(−(C_0 − C_3 [1 + (1 + log(12/δ_S)) / log(N/S)]) ⌊K_e/2⌋).  (78)

Setting C_4 = C_0 − C_3 [1 + (1 + log(12/δ_S)) / log(N/S)] and choosing C_3 small enough to guarantee that C_4 is positive, we obtain (27).

APPENDIX D: PROOF OF LEMMA 2

The method of the proof is the same as the one used to prove Lemma 1 and is based on splitting the rows of Φ_e into a number of sets with independent entries. Here, the splitting is carried out as shown in (29). Let (Φ_e)^i_T, i = 1, ..., n_p − 1, be the matrix containing the ((i − 1)K + 1)-th to the (iK)-th rows of (Φ_e)_T. The last K_e − (n_p − 1)K rows of (Φ_e)_T form the matrix (Φ_e)^{n_p}_T. Since the matrices (Φ_e)^i_T, i = 1, . . .
, n_p − 1, consist of independent entries, each of them satisfies (2) with probability of at least 1 − 2(12/δ_S)^S e^(−C_0 K). For the same reason, the matrix (Φ_e)^{n_p}_T satisfies (2) with probability greater than or equal to 1 − 2(12/δ_S)^S e^(−C_0 K_{n_p}). In the event that all the matrices (Φ_e)^i_T, i = 1, ..., n_p, satisfy (2) simultaneously, for c ∈ R^S we have

Σ_{i=1}^{n_p} (K_i/N)(1 − δ_S) ‖c‖²_{l_2} ≤ Σ_{i=1}^{n_p} ‖(Φ_e)^i_T c‖²_{l_2} ≤ Σ_{i=1}^{n_p} (K_i/N)(1 + δ_S) ‖c‖²_{l_2}
⇒ (K_e/N)(1 − δ_S) ‖c‖²_{l_2} ≤ ‖(Φ_e)_T c‖²_{l_2} ≤ (K_e/N)(1 + δ_S) ‖c‖²_{l_2}.  (79)

Therefore, using the union bound and (79), we can conclude that

Pr{(Φ_e)_T does not satisfy (2)} ≤ Σ_{i=1}^{n_p} Pr{(Φ_e)^i_T does not satisfy (2)}
  ≤ 2(n_p − 1)(12/δ_S)^S e^(−C_0 K) + 2(12/δ_S)^S e^(−C_0 K_{n_p})  (80)

which proves the lemma.

APPENDIX E: PROOF OF THEOREM 3

According to Lemma 2, for any subset T ⊂ {1, ..., N} of cardinality S, the probability that (Φ_e)_T does not satisfy (2) is less than or equal to 2(n_p − 1)(12/δ_S)^S e^(−C_0 K) + 2(12/δ_S)^S e^(−C_0 K_{n_p}). Using the fact that there are (N choose S) ≤ (Ne/S)^S different subsets T, the probability that the extended measurement matrix Φ_e does not satisfy the RIP can be computed as

Pr{Φ_e does not satisfy the RIP}
  ≤ 2(n_p − 1) (N choose S) (12/δ_S)^S e^(−C_0 K) + 2 (N choose S) (12/δ_S)^S e^(−C_0 K_{n_p})
  ≤ 2(n_p − 1)(Ne/S)^S (12/δ_S)^S e^(−C_0 K) + 2(Ne/S)^S (12/δ_S)^S e^(−C_0 K_{n_p})
  = 2(n_p − 1) e^(−(C_0 K − S[log(Ne/S) + log(12/δ_S)])) + 2 e^(−(C_0 K_{n_p} − S[log(Ne/S) + log(12/δ_S)]))
  ≤ 2(n_p − 1) e^(−(C_0 K − C_3 (K_{n_p}/K) K [log(Ne/S) + log(12/δ_S)] / log(N/S))) + 2 e^(−(C_0 K_{n_p} − C_3 K_{n_p} [log(Ne/S) + log(12/δ_S)] / log(N/S)))
  = 2(n_p − 1) e^(−(C_0 − C_3 (K_{n_p}/K)[1 + (1 + log(12/δ_S)) / log(N/S)]) K) + 2 e^(−(C_0 − C_3 [1 + (1 + log(12/δ_S)) / log(N/S)]) K_{n_p}).  (81)

Denoting the constant terms as C_4 = C_0 − C_3 [1 + (1 + log(12/δ_S)) / log(N/S)] and C'_4 = C_0 − (C_3 K_{n_p}/K) [1 + (1 + log(12/δ_S)) / log(N/S)], and choosing C_3 small enough to guarantee that C_4 and C'_4 are positive, we obtain (31).
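For intuition on how the failure-probability bound of Lemma 2 behaves, the right-hand side of (80) can be evaluated numerically. All constants below (S, δ_S, C_0, n_p, and the block sizes) are illustrative assumptions for the sketch only; the excerpt does not fix their values:

```python
import math

def lemma2_bound(n_p, K, K_np, S, delta_S, C0):
    """Right-hand side of (80): union bound over the n_p row blocks."""
    per_block = (12.0 / delta_S) ** S
    return (2.0 * (n_p - 1) * per_block * math.exp(-C0 * K)
            + 2.0 * per_block * math.exp(-C0 * K_np))

# Illustrative constants only; not taken from the paper.
S, delta_S, C0, n_p = 4, 0.5, 0.3, 3

# The bound decays exponentially as the block sizes K and K_np grow,
# which is the content of the positivity condition on C_4 and C'_4.
b_small = lemma2_bound(n_p, K=100, K_np=50, S=S, delta_S=delta_S, C0=C0)
b_large = lemma2_bound(n_p, K=200, K_np=100, S=S, delta_S=delta_S, C0=C0)
assert b_large < b_small
print(f"bound at K=100: {b_small:.3e}; bound at K=200: {b_large:.3e}")
```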