Approximate Decoding Approaches for Network Coded Correlated Data


Authors: Hyunggon Park, Nikolaos Thomos, Pascal Frossard

Hyunggon Park∗, Nikolaos Thomos† and Pascal Frossard†
∗ Multimedia Communications and Networking Laboratory, Ewha Womans University, Seoul, Korea
† Signal Processing Laboratory (LTS4), Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
hyunggon.park@ewha.ac.kr, {nikolaos.thomos, pascal.frossard}@epfl.ch

Abstract

This paper considers a framework where data from correlated sources are transmitted with the help of network coding in ad-hoc network topologies. The correlated data are encoded independently at the sensors, and network coding is employed at the intermediate nodes in order to improve the data delivery performance. In such settings, we focus on the problem of reconstructing the sources at the decoder when perfect decoding is not possible due to losses or bandwidth bottlenecks. We first show that the source data similarity can be used at the decoder to permit decoding based on a novel and simple approximate decoding scheme. We analyze the influence of the network coding parameters, and in particular the size of the finite coding fields, on the decoding performance. We further determine the optimal field size that maximizes the expected decoding performance as a trade-off between the information loss incurred by limiting the resolution of the source data and the error probability in the reconstructed data. Moreover, we show that the performance of the approximate decoding improves when the accuracy of the source model increases, even with simple approximate decoding techniques. We provide illustrative examples of possible applications of our algorithms in sensor networks and distributed imaging. In both cases, the experimental results confirm the validity of our analysis and demonstrate the benefits of our low-complexity solution for the delivery of correlated data sources.
Index Terms: Network coding, approximate decoding, correlated data, distributed transmission, ad-hoc networks.

This work has been supported by the Swiss National Science Foundation (grants PZ00P2-121906 and 200021-118230) and by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education, Science and Technology (2010-0009717). This work was mostly performed while the first author was with EPFL.

I. INTRODUCTION

The rapid deployment of distributed networks such as sensor networks and cloud networks has motivated a plethora of research on the design of low-complexity and efficient solutions for information delivery. Since coordination among intermediate nodes is often difficult to achieve, information dissemination at the intermediate nodes often has to be performed in a distributed manner on ad-hoc or overlay mesh network topologies. Network coding [1] has recently been proposed as a method to build efficient distributed delivery algorithms in networks with path and source diversity. It is based on a paradigm where the network nodes are allowed to perform basic processing operations on information streams. The network nodes can combine information packets and transmit the resulting data to the next network nodes. Such a strategy permits to improve the throughput of the system and to better approach the max-flow min-cut limit of networks [2], [3]. When the decoder receives enough data, it can recover the original source information by performing inverse operations (e.g., with Gaussian elimination). These advantages motivate the deployment of network coding in various scenarios where the network diversity is significant (e.g., [4]–[9]). Many of these solutions are based on random linear network coding (RLNC) [10], which permits to implement distributed solutions with low communication costs.
RLNC represents an interesting solution for the deployment of practical systems, where it can work in conjunction with data dissemination protocols such as gossiping algorithms [8]. The resulting systems are robust against link failures, do not require reconciliation between the network nodes, and can significantly improve the performance of data delivery compared to 'store and forward' approaches. Most research so far has, however, focused either on theoretical aspects of network coding, such as achievable capacity and coding gain, or on its practical aspects, such as robustness and increased throughput when the number of innovative packets is sufficient for perfect decoding. It generally does not consider the problematic cases where the clients receive an insufficient number of innovative packets for perfect decoding, due to losses or timing constraints for example. This is the main problem addressed in this paper. We consider a framework where network coding is used for the delivery of correlated data that are discretized and independently encoded at the sensors. The information streams are delivered with the help of RLNC in lossy ad-hoc networks. When an insufficient number of symbols at the decoder prevents exact data recovery, we design a novel low-complexity approximate decoding algorithm that uses the data correlation for signal reconstruction. The information about source similarity typically provides additional constraints in the decoding process, such that well-known approaches for matrix inversion (e.g., Gaussian elimination) can be efficiently used even in the case where the decoding problem is a priori underdetermined. We show analytically that the use of source models in the decoding process leads to improved data recovery.
Then, we analyze the impact of accurate knowledge of the data similarity at the decoder, where more precise information leads to better performance in the approximate decoding. We further analyze the influence of the choice of the Galois Field (GF) size in the coding operations on the performance of the approximate decoding framework. We demonstrate that the field size should be selected by considering the trade-off between the resolution in representing the source and the approximate decoding performance. Specifically, when the GF size increases, the quantization error of the source data decreases, while the decoding error probability increases. We show that there is an optimal value for the GF size when approximate decoding is enabled at the receivers. Finally, we illustrate the performance of the network coding algorithm with approximate decoding on two types of correlated data, i.e., seismic data and video sequences. The simulation results confirm the validity of the GF size analysis and show that the approximate decoding scheme leads to efficient reconstruction when accurate correlation information is used during decoding. In summary, the main contributions of our paper are (i) a new framework for the distributed delivery of correlated data with network coding, (ii) a novel approximate decoding strategy that exploits the data similarity with low complexity when the received data do not permit perfect decoding, (iii) an analysis of the influence of the accuracy of the data similarity information and of the GF size on the decoding performance, and (iv) the implementation of illustrative examples with external or intrinsic source correlation.
In general, the transmission of correlated sources is studied in the framework of distributed coding [11] (i.e., in the context of the Slepian-Wolf problem), where the sources are typically encoded by systematic channel encoders and eventually decoded jointly [12], [13]. Distributed source coding (DSC) is also combined with network coding schemes [14]–[17] in the gathering of correlated data. Our focus is, however, not on the design of a distributed compression scheme, which generally assumes that the sensors are aware of the similarity between the data sources. Rather, we focus on the transmission of correlated data that are encoded independently, transmitted with the help of network coding over an overlay network, and jointly decoded at the receivers. However, due to the network dynamics, there is no guarantee that each node receives enough useful packets for successful data recovery. Hence, it is essential to have a low-complexity methodology that enables the recovery of the original data with good accuracy when the number of useful packets is not sufficient for perfect decoding. When RLNC is implemented in the network, the encoding and decoding processes at each node are based on linear operations (e.g., linear combinations, matrix inversion, etc.) in a finite algebraic field. In the case of an insufficient number of innovative packets for perfect decoding, one can simply deploy an existing regularization technique that minimizes the norm of the errors using the pseudo-inverse of the encoding matrix. However, it is generally known that this type of regularization technique may result in significantly unreasonable approximations [18]. Alternatively, Tikhonov regularization provides improved performance by slightly modifying the standard least squares formula. However, this technique requires determining additional optimization parameters, which is nontrivial in practice.
Sparsity assumptions might also be used [19] for regularized decoding in underdetermined systems, in cases where a sparse model of the signal of interest is known a priori. However, all these regularization techniques have been designed and developed in the continuous domain, and not for the finite fields that are used in network coding approaches. Thus, they may show significantly poor performance if they are blindly applied in our framework, as they cannot account for several properties (e.g., cyclic properties) of finite field operations. Underdetermined systems can also be solved approximately based on maximum likelihood estimation (MLE) techniques (see, e.g., [20] (Part II)), but these techniques require effective data models and typically involve large computational complexity.

The paper is organized as follows. In Section II, we present our framework and describe the approximate decoding algorithm. We discuss the influence of the source model information on the approximate decoding process in Section III. In Section IV, we analyze the relation between the decoding performance and the GF size, and then determine an optimal GF size that achieves the smallest expected decoding error. Section V and Section VI provide illustrative examples that show how the proposed approach can be implemented in sensor network or video delivery applications.

II. APPROXIMATE DECODING FRAMEWORK

We begin by describing the general framework considered in this paper and by presenting the proposed distributed delivery strategy for correlated data sources. We also discuss the concept of approximate decoding, which enables receivers to estimate the source information when the number of data packets is not sufficient for perfect decoding.

A. RLNC Encoding

We consider an overlay network with sources, intermediate nodes, and clients distributed over a network (e.g., an ad-hoc network).
We denote by s_1, . . . , s_N the symbols generated by N discrete and correlated sources, where s_n ∈ S (⊂ R) for 1 ≤ n ≤ N. S is the alphabet set of s_n, and |S| denotes the size of S. These source data are transmitted to the clients via intermediate nodes that are able to perform network coding (i.e., RLNC). Hence, each s_n also needs to be considered as an element in a GF. In order to explicitly specify whether s_n is in the field of real numbers or in a GF, we define the identity functions

    1_{RG} : R → GF,  1_{RG}(s_i) = x_i
    1_{GR} : GF → R,  1_{GR}(x_i) = s_i        (1)

which means that x_i is the element in the GF representing s_i. Thus, an intermediate node k using RLNC transmits a packet generated as

    y(k) = ⊕_{n=1}^{N} ( c_n(k) ⊗ x_n ) = ( c_1(k) ⊗ x_1 ) ⊕ ( c_2(k) ⊗ x_2 ) ⊕ · · · ⊕ ( c_N(k) ⊗ x_N )

which is a linear combination of the x_n and the coding coefficients c_n(k) in the GF. ⊕ and ⊗ denote the additive and multiplicative operations defined in the GF, respectively. The coding coefficients are uniformly and randomly chosen from a GF of size 2^r, denoted GF(2^r). This implies that the GF size is determined by r and that c_n(k) ∈ GF(2^r). In our implementation, the addition in a GF with characteristic 2, i.e., GF(2^r), is performed by the exclusive-OR (XOR) operation. The size of the field determines the set of coding operations that can be performed on the source symbols. We thus assume that the size of the input set is |S| ≤ 2^r. If |S| > 2^r, the input set is reduced (using, e.g., source binning or quantization), such that the input set does not exceed the GF size (i.e., 2^r). The encoded symbols at each node are transmitted to the neighboring nodes towards the client nodes. If a decoder receives K innovative (i.e., linearly independent) symbols y(1), . . .
, y(K), where all y(k) ∈ GF(2^r), a linear system y = C ⊙ x can be formed as

    [ y(1), . . . , y(K) ]^T = [ c_1 · · · c_N ] ⊙ [ x_1, . . . , x_N ]^T = ⊕_{n=1}^{N} ( c_n ⊗ x_n )        (2)

where ⊙ denotes the multiplication between matrices in a finite field. The K × N matrix C is referred to as the coding coefficient matrix, and consists of the column vectors c_n = [ c_n(1), c_n(2), · · · , c_n(K) ]^T, where A^T denotes the transpose of a matrix A. An illustrative example for N = 3 is shown in Fig. 1, where the symbols s_1, s_2, and s_3 from the sources, mapped into x_1, x_2 and x_3 respectively, are network encoded at the intermediate nodes using randomly chosen coding coefficients.

B. Approximate Decoding

Upon receiving a set of symbols y generated by (2), the decoder attempts to recover the source data. If K = N, i.e., the coding coefficient matrix C is full-rank as N innovative symbols are available, then x is uniquely determined as x = C^{−1} ⊙ y (and correspondingly, s = 1_{GR}(x)) from the linear system in (2). Note that C^{−1} represents the inverse of the coding coefficient matrix C and can be obtained by well-known approaches such as Gaussian elimination over a GF. However, if the number of received symbols is insufficient (i.e., K < N), there are multiple solutions x̂ = [ x̂_1, . . . , x̂_N ]^T to the system in (2), as C is not full-rank. Hence, additional constraints should be imposed so that the coding coefficient matrix becomes full-rank. We therefore modify the decoding system in (2) in order to include external information as coding constraints that permit decoding.
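Concretely, the encoding in (2) and the full-rank case K = N can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes GF(2^8) with the AES reduction polynomial 0x11B, and the source symbols and seed are arbitrary choices.

```python
import random

R = 8            # field GF(2^R); illustrative choice
POLY = 0x11B     # x^8 + x^4 + x^3 + x + 1, irreducible over GF(2)

def gf_mul(a, b):
    """Multiplication in GF(2^R) via carry-less 'Russian peasant' steps."""
    p = 0
    while b:
        if b & 1:
            p ^= a
        a <<= 1
        if a & (1 << R):
            a ^= POLY       # reduce modulo the field polynomial
        b >>= 1
    return p

def gf_inv(a):
    """Multiplicative inverse as a^(2^R - 2) (square-and-multiply)."""
    result, e = 1, (1 << R) - 2
    while e:
        if e & 1:
            result = gf_mul(result, a)
        a = gf_mul(a, a)
        e >>= 1
    return result

def encode(C, x):
    """One coded symbol per row of C: y(k) = XOR_n c_n(k) (x) x_n, as in (2)."""
    y = []
    for row in C:
        acc = 0
        for c, xn in zip(row, x):
            acc ^= gf_mul(c, xn)
        y.append(acc)
    return y

def solve(C, y):
    """Gauss-Jordan elimination over GF(2^R) for a square system C (.) x = y."""
    n = len(C)
    A = [row[:] + [yk] for row, yk in zip(C, y)]       # augmented matrix
    for col in range(n):
        piv = next(i for i in range(col, n) if A[i][col])  # raises if singular
        A[col], A[piv] = A[piv], A[col]
        inv = gf_inv(A[col][col])
        A[col] = [gf_mul(inv, v) for v in A[col]]
        for i in range(n):
            if i != col and A[i][col]:
                f = A[i][col]
                A[i] = [v ^ gf_mul(f, w) for v, w in zip(A[i], A[col])]
    return [A[i][n] for i in range(n)]

# Full-rank example: N = K = 4, random coefficients (resampled if singular).
random.seed(1)
x = [17, 42, 99, 200]      # source symbols mapped into GF(2^8)
while True:
    C = [[random.randrange(256) for _ in x] for _ in x]
    try:
        x_hat = solve(C, encode(C, x))
        break
    except StopIteration:  # singular draw: pick new random coefficients
        continue
print(x_hat == x)          # True: exact recovery when C is full-rank
```

When K < N the solver above runs out of pivots, which is precisely the situation that the additional constraints D and ν are meant to address.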
This leads to approximate decoding, where the correlation of the input data is exploited to construct additional constraints D (all elements of D are in the GF as well) and ν in the decoding process, so that the system becomes solvable. (In this paper, vectors and matrices are represented by boldface lowercase and boldface capital letters, respectively.)

Fig. 1. Illustrative example of network coding with N = 3 source data and three network coding nodes. The input data s_n, which are mapped into x_n in the GF, are linearly combined with random coefficients in each network coding node, to generate the vector y.

With the additional constraints D and ν, an approximate decoding solution can be expressed as

    x̂ = [ C ; D ]^{−1} ⊙ [ y ; ν ]        (3)

where [ C ; D ] denotes the vertical concatenation of C and D; this can again be implemented by Gaussian elimination in a finite field. The additional constraints D and ν typically depend on the problem under consideration, i.e., on the source models. An approximation ŝ of the original data can then be obtained by the identity functions defined in (1), i.e., ŝ = 1_{GR}(x̂). The distortion between s and ŝ is denoted by ‖s − ŝ‖_l, where ‖·‖_l denotes the l-norm operation [21]. An illustrative example of the approximate decoding algorithm is described in Algorithm 1.

C. Simple Implementation of Approximate Decoding

While the approximate decoding framework is generic, we present a simple instance of the algorithm in this paper. Our focus is thus on highlighting the potential advantages achieved by deploying a simple approximate decoding approach for the delivery of correlated data in resource-constrained environments.
Since K innovative symbols are received, the rank of C in (3) is K, and correspondingly, D in (3) is an (N − K) × N matrix of coefficients. The coefficients in D are determined based on the source correlation or similarity model. The source similarity is measured by the distance between data [22], [23]. More specifically, the most similar data s_i and s_j have the smallest distance |s_i − s_j|. Then, we construct D with each row consisting of zeros (i.e., the additive identity of GF(2^r)), except for two elements of value "1" and "1" (because 1 is also the additive inverse of 1 in GF(2^r)) that correspond to the positions of the most similar data x_i and x_j. Accordingly, ν is set to a zero vector of size (N − K), which is appended to y and represents the results of the additional conditions set in D. Thus, the implementation is expressed as

    x̂ = [ C ; D ]^{−1} ⊙ [ y ; 0_{(N−K)} ].        (4)

This enables the decoder to reconstruct the original symbols whenever the number of received symbols is not sufficient for perfect decoding.

Algorithm 1 Approximate Decoding
Given: received symbols y, coefficient matrix C, a data source model, data size N, GF size 2^r.
1: if rank(C) = N then
2:     ŝ = 1_{GR}( C^{−1} ⊙ y )
3: else // rank(C) < N; use approximate decoding
4:     Construct D and ν based on the available source model information
5:     Compute ŝ = 1_{GR}( [ C ; D ]^{−1} ⊙ [ y ; ν ] )
6: end if

(Footnotes to this section: Alternatively, the source model information and the received symbols can be translated from the GF into the field of real numbers, and the decoding performed there; however, this may incur more computational complexity. Moreover, by deploying more general source models and sophisticated algorithms on top of the proposed framework, better performance can be achieved.)
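The construction in (4) can be sketched numerically. The setup below is a hypothetical illustration, not taken from the paper: GF(2^8) with the AES polynomial 0x11B, N = 3 sources of which the first two happen to be identical, and K = 2 received packets. The single constraint row [1 1 0] with ν = 0 makes the system square, and because the most similar symbols coincide exactly, the decoder recovers all sources despite K < N.

```python
import random

R, POLY = 8, 0x11B   # GF(2^8) with the AES polynomial (illustrative choice)

def gf_mul(a, b):
    """Carry-less 'Russian peasant' multiplication in GF(2^R)."""
    p = 0
    while b:
        if b & 1:
            p ^= a
        a <<= 1
        if a & (1 << R):
            a ^= POLY
        b >>= 1
    return p

def gf_inv(a):
    """Multiplicative inverse as a^(2^R - 2)."""
    res, e = 1, (1 << R) - 2
    while e:
        if e & 1:
            res = gf_mul(res, a)
        a = gf_mul(a, a)
        e >>= 1
    return res

def solve(A_rows, b):
    """Gauss-Jordan elimination over GF(2^R) for a square system."""
    n = len(A_rows)
    M = [row[:] + [v] for row, v in zip(A_rows, b)]
    for col in range(n):
        piv = next(i for i in range(col, n) if M[i][col])  # raises if singular
        M[col], M[piv] = M[piv], M[col]
        inv = gf_inv(M[col][col])
        M[col] = [gf_mul(inv, v) for v in M[col]]
        for i in range(n):
            if i != col and M[i][col]:
                f = M[i][col]
                M[i] = [v ^ gf_mul(f, w) for v, w in zip(M[i], M[col])]
    return [M[i][n] for i in range(n)]

random.seed(7)
x = [120, 120, 60]        # sources 1 and 2 are the most similar (here: equal)
while True:
    C = [[random.randrange(256) for _ in x] for _ in range(2)]  # K = 2 < N = 3
    y = [0, 0]
    for k, row in enumerate(C):                 # received coded symbols, as in (2)
        for c, xn in zip(row, x):
            y[k] ^= gf_mul(c, xn)
    D = [[1, 1, 0]]        # constraint row: forces x_1 (+) x_2 = 0
    nu = [0]               # zero vector appended to y, as in (4)
    try:
        x_hat = solve(C + D, y + nu)            # augmented square system
        break
    except StopIteration:  # augmented matrix singular: draw new coefficients
        continue
print(x_hat == x)
```

When the similar pair is close but not identical, the same construction still applies; the reconstruction is then approximate, with the error bounded as analyzed in Section III.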
With these additional equations, the decoder can then invert the linear system and approximate the data x with classical decoding algorithms. Note that the coding coefficient matrix in (4) is assumed here to be non-singular, which happens with high probability if the size of the GF is large enough. However, the probability that the coding coefficient matrix becomes singular increases as the size of D is enlarged, since the system then includes a large number of similarity-driven coefficient rows relative to the random coefficients of the original coding matrix. The impact of the singularity of the coding coefficient matrix on the performance of the approximate decoding is quantified in Section VI-B. Finally, we generally consider that there exists a solution to the decoding problem formed by the augmented coefficient matrix in (4). Otherwise, the decoder outputs a decoding error signal. We study in the next sections the influence of the accuracy of the source model information and the influence of the finite field size (GF size) on the proposed approximate decoding algorithm. Specific implementations of the approximate decoding are later discussed in detail in Section V and Section VI with illustrative examples.

III. APPROXIMATE DECODING BASED ON A PRIORI INFORMATION ON THE SOURCE MODEL

We discuss in this section the performance of the proposed approximate decoding algorithm for recovering the source data from an insufficient number of network coded packets. In particular, we analyze and quantify the impact of the accuracy of the source model information (i.e., the expected similarity between source values) at the decoder, when the augmented system in (4) enforces that the most similar data should have similar values after decoding.
Recall that if approximate decoding is not deployed, conventional network decoding approaches cannot recover any source data from the network coded data. We first show that the decoding error in our approximate decoding algorithm has an upper bound that decreases as the source data become more similar. This is described in Property 1.

Property 1: The reconstruction error decreases as the sources are more similar.

Proof: Let y be a set of K received innovative packets (with K smaller than the number of original symbols N, i.e., K < N). Let further C be the corresponding coding coefficient matrix and x be the original source data, as in (2). Since only K < N innovative packets are available at the decoder, (N − K) additional constraints are imposed through the coding coefficient matrix D, based on the approach discussed in Section II-C. This leads to the approximate decoding solution x̂ in (4). We now analyze the error incurred by the proposed approximate decoding algorithm. The recovered symbol ŝ = 1_{GR}(x̂) from the approximate solution x̂ is compared to the exact solution s. This exact solution is reconstructed based on the set of coding coefficients C and the coefficients D, but with the exact constraints d (all the elements of d are in GF(2^r)) and not their approximation by a zero vector as done in (4). We denote these actual constraints by the vector d, defined as

    d = D ⊙ x = [ d(1), . . . , d(N − K) ]^T        (5)

which is computed by applying the additional coefficients in D to the original vector x. Equivalently, x can be computed as

    x = [ C ; D ]^{−1} ⊙ [ y ; d ].        (6)

Note that x̂ in (4) and x in (6) are obtained based on the operations defined in GF(2^r), and thus the resulting elements of x and x̂ are in GF(2^r). However, they originally represent data in R (e.g., source data).
Hence, in order to quantify the performance of the proposed algorithm, we are interested in the error between the exact and approximate solutions, i.e., ‖s − ŝ‖_l. From the assumption that [ C ; D ] in (4) is not singular, its inverse can be written as

    [ C ; D ]^{−1} = [ M^{(K)}  M^{(N−K)} ] = [ m(1) · · · m(K)  m(K+1) · · · m(N) ]

where M^{(K)} and M^{(N−K)} denote the sub-matrices with column vectors { m(1), . . . , m(K) } and { m(K+1), . . . , m(N) }, respectively. Thus, ŝ and s can be expressed from (4) and (6), respectively, as

    ŝ = 1_{GR}( x̂ ) = 1_{GR}( M^{(K)} ⊙ y )        (7)
    s = 1_{GR}( x ) = 1_{GR}( ( M^{(K)} ⊙ y ) ⊕ ( M^{(N−K)} ⊙ d ) ).        (8)

Therefore, the error between the exact and the approximate solutions can be expressed as

    ‖s − ŝ‖_l = ‖ 1_{GR}(x) − 1_{GR}(x̂) ‖_l        (9)
    = ‖ 1_{GR}( (M^{(K)} ⊙ y) ⊕ (M^{(N−K)} ⊙ d) ) − 1_{GR}( M^{(K)} ⊙ y ) ‖_l        (10)
    ≤ ‖ 1_{GR}( (M^{(K)} ⊙ y) ⊕ (M^{(N−K)} ⊙ d) ⊕ (M^{(K)} ⊙ y) ) ‖_l        (11)
    = ‖ 1_{GR}( M^{(N−K)} ⊙ d ) ‖_l = ‖ 1_{GR}( ⊕_{k=1}^{N−K} m(K+k) ⊗ ( x_{i,k} ⊕ x_{j,k} ) ) ‖_l        (12)
    ≤ ‖ Σ_{k=1}^{N−K} 1_{GR}( m(K+k) ⊗ ( x_{i,k} ⊕ x_{j,k} ) ) ‖_l.        (13)

The inequalities from (10) to (11) and from (12) to (13) stem from the properties of the operations in the field of real numbers and in the GF, i.e.,

    s_i − s_j ≤ 1_{GR}( x_i ⊕ x_j ) ≤ s_i + s_j        (14)

where x_i and x_j are the GF representations of s_i and s_j, respectively (see (1)). Moreover, d = [ d(1) · · · d(N − K) ]^T, where d(k) = x_{i,k} ⊕ x_{j,k}, 0 ≤ i, j ≤ N, as each element of d depends on the two non-zero elements in the corresponding row of D, and thus on our choice of the additional constraints.
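The bound in (13) is only useful if the terms 1_{GR}(x_i ⊕ x_j) tend to be small for similar data. A quick numerical check of this behavior (an illustrative sketch over 8-bit symbols; the trial counts and distance thresholds are arbitrary choices, not from the paper):

```python
import random

random.seed(0)

def mean_xor(max_diff, trials=20000):
    """Average of x_i XOR x_j over random pairs with |s_i - s_j| <= max_diff."""
    total = 0
    for _ in range(trials):
        si = random.randrange(256)
        sj = min(255, si + random.randrange(max_diff + 1))
        total += si ^ sj
    return total / trials

close = mean_xor(2)     # strongly similar sources
far = mean_xor(100)     # weakly similar sources
print(close < far)      # True: similar data yield smaller XOR terms on average
```

Individual pairs can still produce a large XOR despite a small distance (e.g., 127 ⊕ 128 = 255), which is why the statement in the text is probabilistic (a concentrated probability mass function) rather than a hard per-pair bound.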
When the data s_i and s_j are very similar, the distance between them, |s_i − s_j|, becomes small, which leads to small values of 1_{GR}(x_i ⊕ x_j) with very high probability, i.e., the probability mass function is concentrated near 1_{GR}(x_i ⊕ x_j) = 0 and decays very sharply for larger values of 1_{GR}(x_i ⊕ x_j). If the data s_i and s_j are less similar, however, the distance |s_i − s_j| is larger, and the probability mass function of 1_{GR}(x_i ⊕ x_j) is widely spread. This is shown experimentally in Appendix I. Therefore, given the vectors m(K+1), . . . , m(N) in (13), the error between the exact and approximate solutions, i.e., ‖s − ŝ‖_l, decreases on average when the data are more similar.

Property 1 implies that the decoding error is bounded, and that this bound becomes smaller when the original data are more similar. This means that the best way to construct D consists in building additional constraints with the source symbols that are expected to have the highest similarity. In order to show this analytically, consider D and D̃ (with D̃ ≠ D), where D̃ is constructed from a set of less similar data than D, which is constructed from the most similar data. This means from (13) that the upper bounds of the

Fig. 2. Illustrative examples: performance comparison of the proposed approximate decoding algorithm with higher and lower similarities (i.e., with matrices D and D̃).
In order to emulate the higher and lower similarities in these examples, the source data are generated as s_i = s_1 + N(0, σ_i), where s_1 is given and N(0, σ_i) denotes a zero-mean Gaussian random variable with standard deviations σ_2 and σ_3. In these experiments, 1 of 3 packets is lost, and the approximate decoding algorithm uses D = [1 1 0] for higher similarity and D = [1 0 1] for lower similarity. Average performance (shown left, with a fixed σ_2 = 0.2 and variable σ_3 (≥ 0.2)) and instantaneous performance (σ_2 = 1, σ_3 = 10) in independent experiments.

distances with D and D̃ are, respectively,

    ‖ Σ_{k=1}^{N−K} 1_{GR}( m(K+k) ⊗ ( x_{i,k} ⊕ x_{j,k} ) ) ‖_l        (15)

and

    ‖ Σ_{k=1}^{N−K} 1_{GR}( m(K+k) ⊗ ( x̃_{i,k} ⊕ x̃_{j,k} ) ) ‖_l.        (16)

Since x_{i,k} and x_{j,k} are specified by D while x̃_{i,k} and x̃_{j,k} are specified by D̃, it holds with high probability that

    x_{i,k} ⊕ x_{j,k} ≤ x̃_{i,k} ⊕ x̃_{j,k}        (17)

as discussed in Appendix I. Therefore, we can conclude from (9)–(13), (15), (16) and (17) that D leads to better performance (or, equivalently, smaller errors) than D̃ on average, if the approximate decoding is deployed in conjunction with the implementation proposed in Section II-C. An illustrative set of simulation results is shown in Fig. 2. In summary, we observe that the efficiency of approximate decoding increases with the source similarity and with the accuracy of the correlation information that is used to derive the additional constraints for decoding.

IV. OPTIMAL FINITE FIELD SIZE

We study here the design of the coding coefficient matrix, and in particular the influence of the size of the finite field (i.e., the GF) on the performance of the approximate decoding framework. This size has an influence on the reconstruction error when the number of symbols is insufficient for perfect decoding.
The GF size determines the resolution of the source encoding, since only a finite number of symbols (equal to the GF size) can be uniquely represented by the identity functions defined in Section II-A. Thus, as the GF size is enlarged, the error that may be incurred by quantizing the source data becomes smaller. At the same time, however, there is a higher probability that a large distortion is induced by the approximate reconstruction. We therefore determine the optimal GF size that minimizes the expected decoding error by trading off the source approximation error and the decoding error probability. We first prove the following property, which states that the decoding errors increase as the GF size is enlarged. While this property may seem contradictory, it holds because a perfect source model that identifies which source data are exactly the same is not available. Rather, the source model can only provide information about the most similar data, so that the approximate decoding can use it for data recovery with best effort. In the analysis, we consider a worst-case scenario, where the data recovered through the constraints in the D matrix of the approximate decoding are uniformly distributed over S. (If the distribution of the decoded data is known, it can be used for better approximate decoding; this may be an interesting direction for future research.)

Property 2: Given a finite set of data S, the average reconstruction error increases as the GF size for the coding operations increases.

Proof: Let s ∈ S be an original symbol, where the size of the original data space is given by |S| = 2^r. Let further ŝ_r = 1_{GR}(x̂_r) and ŝ_R = 1_{GR}(x̂_R) be the decoded symbols when coding is performed in GF(2^r) and GF(2^R) respectively, with R > r for r, R ∈ N, i.e., GF(2^R) is an extension of GF(2^r). In this scenario, the decoding errors are uniformly distributed over S. Thus, the probability mass function of ŝ_k is given by

    p_k( ŝ_k ) = 1/2^k  if ŝ_k ∈ [0, 2^k − 1],  and 0 otherwise

for k ∈ { r, R }.
To prove that a larger GF size results in a higher decoding error, we have to show that

    Pr( |s − ŝ_R| ≥ |s − ŝ_r| ) > 1/2.        (18)

If this condition is satisfied, the expected distortion is larger for ŝ_R than for ŝ_r, or, equivalently, for the larger GF size. The left-hand side of (18) can be expressed as

    Pr( ŝ_R ≥ ŝ_r, s ≤ (ŝ_R + ŝ_r)/2 ) + Pr( ŝ_R < ŝ_r, s > (ŝ_R + ŝ_r)/2 )
    = Pr( ŝ_R ≥ ŝ_r ) Pr( s ≤ (ŝ_R + ŝ_r)/2 | ŝ_R ≥ ŝ_r ) + Pr( ŝ_R < ŝ_r ) Pr( s > (ŝ_R + ŝ_r)/2 | ŝ_R < ŝ_r )
    = ( 1 − 2^{r−R−1} ) P̂ + 2^{r−R−1} ( 1 − P̂ )
    = 2^{r−R−1} + ( 1 − 2^{r−R} ) P̂

because ŝ_R and ŝ_r are both uniformly distributed, where we have set P̂ ≜ Pr( s ≤ (ŝ_R + ŝ_r)/2 | ŝ_R ≥ ŝ_r ). We further show in Appendix II that P̂ > 1/2. Therefore, we have

    2^{r−R−1} + ( 1 − 2^{r−R} ) P̂ > 2^{r−R−1} + ( 1 − 2^{r−R} ) · 1/2 = 1/2        (19)

which completes the proof.

Property 2 implies that a small GF size is preferable in terms of the expected decoding error. In particular, it is preferable not to enlarge the GF size beyond the size of the input space, since approximate decoding performs worse in a very large field. Conversely, if the GF size becomes smaller than the input alphabet size, the maximum number of source symbols that can be distinctly represented decreases correspondingly. Specifically, if we choose a GF size of 2^{r′} such that |S| > 2^{r′} for r′ < r, part of the data in S needs to be discarded to form a subset S′ such that |S′| ≤ 2^{r′}.
In this case, we assume that if the GF size is reduced from GF(2^r) to GF(2^{r−z}), where 0 ≤ z (∈ ℤ) ≤ r − 1, the z least significant bits in the representation of the original data x ∈ S are discarded first. Then, all the data in S′ can be distinctly encoded in GF(2^{r−z}). In summary, while reducing the GF size may result in a lower decoding error, it may induce a larger information loss in the source data. Based on this tradeoff, we propose Property 3, which shows the existence of an optimal GF size. Note that discarding part of the source data information results in errors at the source, similar to data quantization. Thus, in the following analysis, we assume that the corresponding source information loss is uniformly distributed, and that the decoded data are also uniformly distributed.

Property 3: There exists an optimal GF size that minimizes the expected error in data reconstruction at the decoder. The optimal GF size is given by GF(2^{r−z*}), where z* = ⌈(r−1)/2⌉ or z* = ⌊(r−1)/2⌋.

Proof: Suppose that the number of original source symbols is |S| = 2^r and that the coding field is GF(2^r). As discussed in Property 2, the GF size does not need to be enlarged beyond 2^r, as this only increases the expected decoding error. If the GF size is reduced from GF(2^r) to GF(2^{r−z}), the approximate decoding is more efficient and the decoding errors are uniformly distributed over [−r_D, r_D], where r_D = 2^{r−1−z} − 1, i.e.,

p_{e_D}(e_D) = 1/(2 r_D + 1), if e_D ∈ [−r_D, r_D], and 0 otherwise.   (20)

At the same time, if the GF size is reduced, the input data set S is reduced to S′ and the number of input symbols is decreased. By discarding the z least significant bits, the number of input symbols becomes |S′| = 2^{r−z}.
Such an information loss also results in errors over [−r_I, r_I], where r_I = 2^z − 1, i.e.,

p_{e_I}(e_I) = 1/(2 r_I + 1), if e_I ∈ [−r_I, r_I], and 0 otherwise.   (21)

Since these two distortions are independent, the distribution of the total error e_T = e_D + e_I is given by the convolution of (20) and (21), which yields [24]

p_{e_T}(e_T) = (H/2) { |e_T + r_I + r_D + 1| − |e_T + r_I − r_D| − |e_T − r_I + r_D| + |e_T − r_I − r_D − 1| }

for |e_T| ≤ r_I + r_D ≜ e_T^max, where H = (2 r_I + 1)^{−1} (2 r_D + 1)^{−1}. Since e_T + r_I + r_D + 1 ≥ 0 and e_T − r_I − r_D − 1 ≤ 0 for all |e_T| ≤ e_T^max, by substituting r_I and r_D, we have

p_{e_T}(e_T) = (H/2) { 2(2^z + 2^{r−1−z} − 1) − |e_T + 2^z − 2^{r−1−z}| − |e_T − 2^z + 2^{r−1−z}| }.   (22)

By denoting a(z) ≜ 2^z − 2^{r−1−z} and b(z) ≜ 2^z + 2^{r−1−z}, the expected decoding error can be expressed as

E[|e_T|] = Σ_{e_T=−∞}^{∞} |e_T| · p_{e_T}(e_T)
= Σ_{e_T=−e_T^max}^{e_T^max} (H/2) |e_T| · [2(b(z) − 1) − |e_T + a(z)| − |e_T − a(z)|].   (23)

Since both |e_T| and [2(b(z) − 1) − |e_T + a(z)| − |e_T − a(z)|] are even functions of e_T, the sum can be folded onto positive values of e_T; moreover, E[|e_T|] is symmetric about z = ⌈(r−1)/2⌉ and z = ⌊(r−1)/2⌋ (see Appendix III). Thus,

E[|e_T|] = H Σ_{e_T=1}^{e_T^max} e_T · {2(b(z) − 1) − |e_T + a(z)| − |e_T − a(z)|}
= H Σ_{e_T=1}^{e_T^max} e_T · 2(b(z) − 1) − H Σ_{e_T=1}^{e_T^max} e_T · {|e_T + a(z)| + |e_T − a(z)|}
= H (b(z) − 1) e_T^max (e_T^max + 1) − H Σ_{e_T=1}^{e_T^max} e_T · {|e_T + a(z)| + |e_T − a(z)|}.   (24)

If we consider the case where a(z) > 0, which corresponds to r/2 < z ≤ r − 1, we have

Σ_{e_T=1}^{e_T^max} e_T · {|e_T + a(z)| + |e_T − a(z)|} = Σ_{e_T=1}^{a(z)−1} e_T · 2a(z) + Σ_{e_T=a(z)}^{e_T^max} 2 e_T^2
= (1/3) e_T^max (e_T^max + 1)(2 e_T^max + 1) + (1/3) a(z)(a(z)^2 − 1).

Note that e_T^max = b(z) − 2.
Therefore, for the case where a(z) > 0, E[|e_T|] can be expressed as

E[|e_T|] = H · { (b(z) − 1)^2 (b(z) − 2) − (1/3)(b(z) − 1)(b(z) − 2)(2b(z) − 3) − (1/3) a(z)(a(z)^2 − 1) }
= H · { (1/3) b(z)(b(z) − 1)(b(z) − 2) − (1/3) a(z)(a(z)^2 − 1) }   (25)

which is an increasing function for r/2 < z ≤ r − 1 (see Appendix IV). Since E[|e_T|] is symmetric about z = ⌈(r−1)/2⌉ and z = ⌊(r−1)/2⌋, and is increasing over r/2 < z ≤ r − 1, E[|e_T|] is convex over 0 ≤ z ≤ r − 1. Therefore, there exists an optimal z* that minimizes the expected decoding error. Finally, by the symmetry, the minimum of E[|e_T|] is achieved at z* = ⌈(r−1)/2⌉ or z* = ⌊(r−1)/2⌋. The two optimum points coincide for odd r.

V. APPROXIMATE DECODING IN SENSOR NETWORKS

A. System Description

We illustrate in this section an example where the approximate decoding framework is used to recover data transmitted by sensors that capture a source signal at different spatial locations. We consider a sensor network where the sensors transmit RLNC encoded data. Specifically, each sensor makes its own measurements and receives the observations of its neighbor sensors. Each sensor then combines the received data with its own data using RLNC, and transmits the resulting data to its neighbor nodes or to the receivers. In the considered scenario, there are 30 sensors measuring seismic signals, placed 100 m apart from each other. A sensor h captures a signal S_h that represents a series of sampled values in a time window of size w, i.e., S_h = [s_h^1, ..., s_h^w]^T. We assume that the data measured at each sensor are in the range [−s_min, s_max], i.e., s_h^l ∈ S = [−s_min, s_max] for all 1 ≤ l ≤ w.
We further assume that the data are quantized and mapped to the nearest integer values, i.e., s_h^l ∈ ℤ. If the measured data exceed the range [−s_min, s_max], they are clipped to the minimum or maximum value of the range (i.e., s_h^l = −s_min or s_h^l = s_max). The data captured by the different sensors are correlated, as the signals at neighboring positions are mostly time-shifted and energy-scaled versions of each other. The captured data have lower correlation with other signals as the distance between sensors becomes larger. An illustrative example is shown in Fig. 3(a), which presents seismic data recorded by 3 different sensors. The data measured by sensor 1 have much higher temporal correlation with the data measured by sensor 2, in terms of time shift and signal energy, than with the data measured by sensor 30. This is because sensor 2 is significantly closer to sensor 1 than sensor 30.

We consider that the nodes perform network coding for data delivery. We denote by H_n (⊆ H) the set of sensors in the proximity of a sensor n ∈ H. The number of sensors in H_n is |H_n| = N_n. A sensor n receives data S_h from all the sensors h ∈ H_n in its proximity and encodes the received data with RLNC. The coding coefficients c_h(k) are randomly selected from GF(2^r), where the field size is determined such that |S| ≤ 2^r. The encoded symbols are then transmitted to the neighboring nodes or to the receiver.
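The RLNC combination performed at a node, with coefficients drawn from GF(2^r), can be sketched in a few lines. The sketch below is our own illustration: we fix r = 8 and use the AES reduction polynomial 0x11B for the field arithmetic (the paper only requires some irreducible polynomial of degree r), and the helper names are ours.

```python
import random

def gf_mul(a, b, poly=0x11B, r=8):
    # Carry-less ("Russian peasant") multiplication in GF(2^r),
    # reduced modulo an irreducible polynomial (here the AES polynomial).
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a >> r:
            a ^= poly
    return result

def rlnc_encode(sources, rng):
    # sources: dict sensor id h -> list of GF symbols X_h (one window).
    # Returns the random coefficients c_h and the combination Y, where
    # Y is the GF sum (XOR) of c_h * X_h over all neighbors h.
    coeffs = {h: rng.randrange(1, 256) for h in sources}
    length = len(next(iter(sources.values())))
    Y = [0] * length
    for h, X in sources.items():
        for i, x in enumerate(X):
            Y[i] ^= gf_mul(coeffs[h], x)
    return coeffs, Y
```

A node holding data from neighbors {1, 3} would call `rlnc_encode({1: X1, 3: X3}, random.Random())` to produce one encoded packet per coefficient draw.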
Fig. 3. Measured original seismic data (a) and decoded seismic data based on approximate decoding (b).

The kth encoded data packet for a window of samples is denoted by Y(k) = ⊕_{h ∈ H_n} c_h(k) ⊗ X_h, where X_h = 1_R^G(S_h) and the summation is performed in the GF. An illustrative example is shown in Fig. 4, which presents a set of four sensors, denoted by H, that consists of two subsets of neighbors, H_1 = {1, 3, 4} and H_2 = {2, 4}; the corresponding encoded packets are Y(k_1) = c_1(k_1) ⊗ X_1 ⊕ c_3(k_1) ⊗ X_3 ⊕ c_4(k_1) ⊗ X_4 and Y(k_2) = c_2(k_2) ⊗ X_2 ⊕ c_4(k_2) ⊗ X_4.

Fig. 4. Illustrative example of network coding in sensor networks.

The encoded data packets that the receiver collects from sensor 2 and sensor 4 are denoted by Y(k_1) and Y(k_2). When a receiver collects enough innovative packets, it can solve the linear system given in (6) and recover the original data. However, if the number of packets is not sufficient, the receiver applies our proposed approximate decoding strategy, which exploits the similarity between the different signals. With such a strategy, the decoding performance can be improved, as discussed in Property 1. We assume that the system setup is approximately known by the sensors.
In other words, a simple correlation model can be computed, which includes the relative temporal shifts and energy scaling between the signals from the different sensors. In particular, since the sensor positions are known, one can simply assume that the data similarity depends only on the distance between sensors.

B. Simulation Results

We analyze an illustrative scenario where the receiver collects encoded packets from sensors 1, 2 and 30 and tries to reconstruct the original signals from these three sensors. We consider temporal windows of size w = 300 for data representation. The captured data are in the range [0, 1023]; thus, the maximum GF size is 2^10, i.e., GF(2^10). We assume that 2/3 of the linear equations required for perfect decoding are received with no error, and that the remaining 1/3 of the equations are not received. Thus, 1/3 of the system constraints at the decoder are built into the coding coefficient matrix based on the assumption that the signals from sensor 1 and sensor 2 are highly correlated. We study the influence of the size of the coding field on the decoding performance. Fig. 5 shows the MSE (Mean Square Error) distortion of the decoded signals for different numbers of discarded bits z, or equivalently for different GF sizes 2^{10−z}.

Fig. 5. Normalized average MSE for different GF sizes (i.e., GF(2^{10−z})).

The conclusion drawn from Property 3 is confirmed by these results, as the decoding error is minimized at z* = ⌈(10 − 1)/2⌉ = 5. An instance of the seismic data recovered by the approximate decoding is further shown in Fig. 3, where GF(2^{10−z*}) = GF(2^5) is used.
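The optimum predicted by Property 3 can also be checked numerically by evaluating E[|e_T|] directly under the uniform error models (20)-(21). The sketch below is our own illustration (not part of the original simulations): it convolves the two uniform error distributions by brute force and locates the minimizing z.

```python
def expected_abs_total_error(r, z):
    # E[|e_T|] with e_T = e_D + e_I, where e_D ~ U[-r_D, r_D] is the
    # decoding error (20) and e_I ~ U[-r_I, r_I] the information loss (21).
    r_D = 2 ** (r - 1 - z) - 1
    r_I = 2 ** z - 1
    pD = 1.0 / (2 * r_D + 1)
    pI = 1.0 / (2 * r_I + 1)
    total = 0.0
    for eD in range(-r_D, r_D + 1):
        for eI in range(-r_I, r_I + 1):
            total += abs(eD + eI) * pD * pI
    return total
```

For r = 10, the minimum of E[|e_T|] is attained at z = 4 and z = 5, consistent with z* = ⌈(r − 1)/2⌉ and ⌊(r − 1)/2⌋ and with the minimum observed in Fig. 5.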
Since the additional constraints are imposed in the coding coefficient matrix based on the assumption of high correlation between the data measured by sensors 1 and 2, the recovered data of sensors 1 and 2 in Fig. 3(b) are very similar, while at the same time the data are quite accurately recovered. We observe that the error in the correlation estimation results in a higher distortion for the signal recovered from sensor 30.

VI. APPROXIMATE DECODING OF IMAGE SEQUENCES

A. System Description

In this section, we illustrate the application of approximate decoding to the recovery of image sequences. We consider a system where information from successive frames is combined with network coding. Encoded packets are transmitted to a common receiver. Packets may, however, be lost or delayed, which prevents perfect reconstruction of the images. Thus, for improved decoding performance, we exploit the correlation between successive frames. We consider a group of successive images in a video sequence. Each image S_n is divided into N patches S_{n,p}, i.e., S_n = [S_{n,1}, ..., S_{n,N}]. A patch S_{n,p} contains L × L pixels s_{n,p}^b, 1 ≤ b ≤ L × L, i.e., S_{n,p} = [s_{n,p}^1, ..., s_{n,p}^{L×L}]. Such a representation is illustrated in Fig. 6.

Fig. 6. Illustrative example of patches in a group of images (L = 2).

The system implements RLNC and combines patches at similar positions in different frames to produce encoded symbols. In other words, it produces a series of symbols Y_p(k) = ⊕_{n=1}^{N} c_{n,p}(k) ⊗ X_{n,p}, where X_{n,p} = 1_R^G(S_{n,p}), for a patch location p. The coding coefficients c_{n,p}(k) are randomly chosen in GF(2^r).
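The patch decomposition above can be sketched as follows. This is a toy illustration of ours; the row-major ordering of patches and of the pixels s_{n,p}^b inside a patch is our assumption, not something the paper fixes.

```python
def split_into_patches(frame, L):
    # frame: 2-D list of pixels (H x W, with H and W multiples of L).
    # Returns the list of patches, each flattened to L*L pixels, row-major.
    H, W = len(frame), len(frame[0])
    return [[frame[r][c] for r in range(i, i + L) for c in range(j, j + L)]
            for i in range(0, H, L) for j in range(0, W, L)]
```

Each returned patch plays the role of one vector S_{n,p}, so patches at the same index across frames are the ones combined by the RLNC step.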
We assume that the original data (i.e., pixels) take values in [0, 255], and thus we choose the maximal size of the coding field such that |S| = 256 = 2^8. When the receiver collects enough innovative symbols per patch, it can recover the corresponding sub-images in each patch, and eventually the group of images. If, however, the number of encoded symbols is insufficient, additional constraints are added to the decoding system in order to enable approximate decoding. These constraints typically depend on the correlation between the successive images. As an illustration, in our case, the constraints are imposed based on similarities between blocks of pixels in successive frames, i.e., x_{n,p}^{b_1} = x_{n+1,p}^{b_2}, where 1 ≤ b_1, b_2 ≤ L × L. The matched pixels, b_1 and b_2, are determined based on the motion information between the successive image frames n and n + 1, such that the similarity between the patches p is maximized. The motion information permits to add constraints to the decoding system, so that estimates of the original blocks of data can be obtained by Gaussian elimination techniques. Due to our design choices, the decoding system can be decomposed into smaller independent sub-systems that correspond to patches.

Fig. 7. Normalized MSE for different GF sizes (i.e., GF(2^{8−z})) in the approximate decoding of the Silent sequence.

B. Performance of Approximate Decoding

In our experiments, we consider three consecutive frames extracted from the Silent standard MPEG sequence in QCIF format (176 × 144). The patches are constructed with four blocks of 8 × 8 pixels. We assume that only 2/3 of the linear equations required for perfect decoding are received.
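With the similarity constraints appended as extra rows, the decoding step reduces to Gaussian elimination over GF(2^8). The sketch below is a minimal illustration of ours (toy dimensions, AES polynomial 0x11B as the field choice): a constraint x_2 = x_3 becomes the row [0, 1, 1] with right-hand side 0, since equality of two unknowns means their GF sum (XOR) is zero.

```python
def gf_mul(a, b, poly=0x11B, r=8):
    # Multiplication in GF(2^8), reduced modulo an irreducible polynomial.
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a >> r:
            a ^= poly
    return result

def gf_inv(a):
    # Multiplicative inverse via a^(2^8 - 2) = a^254.
    result, base, e = 1, a, 254
    while e:
        if e & 1:
            result = gf_mul(result, base)
        base = gf_mul(base, base)
        e >>= 1
    return result

def solve_gf(A, y):
    # Gauss-Jordan elimination over GF(2^8); A holds the received coding
    # rows plus the similarity-constraint rows, y the right-hand sides.
    n = len(A[0])
    A = [row[:] + [rhs] for row, rhs in zip(A, y)]
    for col in range(n):
        pivot = next(i for i in range(col, len(A)) if A[i][col])
        A[col], A[pivot] = A[pivot], A[col]
        inv = gf_inv(A[col][col])
        A[col] = [gf_mul(inv, v) for v in A[col]]
        for i in range(len(A)):
            if i != col and A[i][col]:
                f = A[i][col]
                A[i] = [v ^ gf_mul(f, p) for v, p in zip(A[i], A[col])]
    return [A[i][n] for i in range(n)]
```

With two received equations over three unknowns, adding the constraint row [0, 1, 1] (RHS 0) makes the toy system full rank and solvable exactly.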
The decoder implements approximate decoding by assuming that the correlation information is known at the decoder. The missing constraints are added to the decoding system based on the best matched pairs of blocks in consecutive frames, in the sense of the smallest distance (i.e., highest similarity) between the pixel values of blocks in different frames. In the first set of experiments, we analyze the influence of the size of the coding field by changing the GF size from GF(2^8) to GF(2^{8−z}). We reduce the size of the field by discarding the z least significant bits of each pixel value. Fig. 7 shows the normalized MSE achieved for the decoded frames for different numbers of discarded bits z. As discussed in Property 3, the expected decoding error is minimized at z* = ⌈(r−1)/2⌉ or z* = ⌊(r−1)/2⌋, which corresponds to z* = 3 and z* = 4. This can be verified in this illustrative example, where the minimum normalized MSE is achieved at z = 4 for frames 1 and 2, and at z = 3 for frame 3. The corresponding decoded images for two different GF sizes are presented in Fig. 8.

Fig. 8. Decoded frames for the Silent sequence for 2 different sizes of the coding field.

From the decoded images, we can observe that several patches are completely black or white. This is because the coding coefficient matrices are singular, leading to the failure of Gaussian elimination during the decoding process. Note that the goal of the results shown in Fig. 7 is to verify Property 3, not to maximize the PSNR performance. In order to further improve the MSE performance, several image and video enhancement techniques, such as error concealment [25], can be deployed.
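The field-size reduction used in these experiments, dropping the z least significant bits of each pixel, amounts to a coarse quantizer. A minimal sketch (helper names are ours; the midpoint reconstruction is our choice of dequantizer, the paper does not specify one):

```python
def discard_lsbs(x, z):
    # Keep only the r - z most significant bits of the pixel value x,
    # so the symbol fits in GF(2^(r-z)).
    return x >> z

def reconstruct(xq, z):
    # Midpoint reconstruction of the discarded bits; the absolute error
    # stays within the information-loss range [-r_I, r_I], r_I = 2^z - 1.
    return (xq << z) + ((1 << (z - 1)) if z > 0 else 0)
```

For z = 5 and an 8-bit field this maps the 256 pixel values onto 8 symbols, trading source resolution for the smaller decoding error predicted by Property 3.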
Next, we compare the approximate decoding approach with MLE-based decoding for RLNC coded data, as the MLE can also use the joint probability distribution of the sources for solving an underdetermined system. In this experiment, our focus is on the case where clients receive a set of encoded packets that is insufficient for building a full-rank coefficient matrix, as this case is meaningful for both the approximate decoding and the MLE-based decoding. The source data are the first three frames of the QCIF Foreman and Silent sequences. They have different characteristics, as the Foreman sequence has much higher motion than the Silent sequence. For a fair comparison, the same correlation information (i.e., the most similar data should be set equal) is used both for the MLE decoding and the approximate decoding. For the approximate decoding, we assume that if the Gaussian elimination for a patch fails due to a singular coefficient matrix with the D constraints, the resulting decoded patch is set to the average value of the image pixel blocks. This choice is motivated by the fact that the MLE-based decoding always selects a solution, even though the selected solution may not be the best one. The results are presented in Fig. 9 with respect to the number of discarded bitplanes z, where the GF size is given by GF(2^{8−z}). From Fig. 9(a), we can observe that the approximate decoding outperforms the MLE for the Silent sequence over the whole range of z values. While the MLE shows a better performance than the approximate decoding for the Foreman sequence in Fig. 9(b), there are several GF sizes for which both methods show similar performance. The gain of the MLE for the Foreman sequence mainly comes from the selection of brighter colors for representing the blocks, while the approximate decoding selects grayer colors.
Fig. 9. Performance comparison of the proposed approximate decoding method with MLE-based decoding with respect to the number of discarded bitplanes z for: (a) the Silent QCIF and (b) the Foreman QCIF sequence.

However, in terms of complexity, the approximate decoding requires significantly less computation than the MLE, as the Gaussian elimination is applied to very sparse matrices. In particular, assume that we have y unknowns and that x equations are received. Then, it is known that Gaussian elimination asymptotically requires at most O(y^3) operations [26], while the MLE with exhaustive search asymptotically requires O(q^{y−x} x^3) operations, where q ≥ 2 is the GF size [20] (Part II). As y increases, q^{y−x} increases much faster than (y/x)^3, which means that the approximate decoding can perform significantly faster than the MLE-based approach. Therefore, we can conclude that the approximate decoding represents an effective solution for decoding with insufficient data at moderate complexity.

We also illustrate the influence of the accuracy of the correlation information by considering zero motion at the decoder. In other words, the additional constraints for approximate decoding simply impose that consecutive frames are identical. Fig. 10 shows the frames decoded with no motion information over GF(32). We can see that the first three frames still provide an acceptable quality, since the motion between these frames is actually very small.
However, in frames 208, 209, and 210, where the motion is higher, we clearly observe a significant performance degradation, especially at the positions where high motion exists.

Fig. 10. Decoded frames with no information about motion estimation.

Next, we study the influence of the size of the group of images (i.e., the window size) that is considered for encoding. It has been discussed that the coding coefficient matrices can be singular, as the coefficients are randomly selected in a finite field. This results in performance degradation for the approximate decoding. Moreover, it is shown in [27] that the probability that random matrices over finite fields are singular becomes smaller as the size of the matrices becomes larger. Thus, if the group of images (i.e., the window size) becomes larger, the coding coefficient matrix becomes larger. As a result, the probability that Gaussian elimination fails is correspondingly smaller. This is quantitatively investigated in the following experiment. We consider 24 frames extracted from the Silent sequence and a set of different window sizes containing 3, 4, 6, 8, and 12 frames. For example, if the window size is 3, then 24/3 = 8 windows are used in this experiment. The average normalized MSE achieved in the lossless case, where the decoder receives enough packets for decoding, is presented in Fig. 11.

Fig. 11. Decoding MSE for different window sizes in the encoding of the sequence Silent.
The normalized MSE decreases as the window size is enlarged. The only reason why all the frames are not perfectly recovered is the failure of Gaussian elimination when the coding coefficient matrices become singular. This confirms the above discussion: if the window size increases, the size of the coding coefficient matrix also increases. Since the probability that the enlarged coding coefficient matrices are singular becomes smaller, a lower average MSE can correspondingly be achieved for a larger window size.

Finally, we study the influence of the window size in the lossy case. We assume a loss rate of 1/24 in all the configurations, and the approximate decoding is implemented. Fig. 11 shows the achieved average MSE across the recovered frames for different window sizes. Since the decoding errors incurred by the approximate decoding are limited to a window and do not influence the decoding of the other windows, a small window size is desirable to limit error propagation. However, as discussed above, a smaller window size results in a higher probability that the coding coefficient matrices become singular and that the Gaussian elimination fails. Due to this tradeoff, we can observe that the best MSE is achieved when the window size is 4 in our example. Note that the computational complexity of decoding (i.e., Gaussian elimination) also increases as the window size increases. Hence, in practice, the proper window size needs to be determined based on several design tradeoffs.

Fig. 12. Average MSE for the transmission of the Container QCIF sequence with respect to various packet loss rates in the BSC and in the GEC. Network coding is performed on windows of four frames.

C.
Performance in Various Network Conditions

We have thus far considered a network with a fixed packet loss rate (i.e., a dedicated final node receives 2/3 of the required linear equations and does not receive the remaining 1/3). We now examine more general network scenarios, which may result in different packet loss rates at the final decoder. As an illustration, we consider a network consisting of three pairs of sources and destinations, with several network nodes performing network coding operations. We assume that there are no losses at the sources and destinations and that they are properly dimensioned. However, the links between the nodes performing network coding operations are lossy, with different packet loss rates. We study the achieved performance (MSE) corresponding to different packet loss rates. The results are shown in Fig. 12. These results show the average MSE that the final node achieves when it experiences a variety of packet loss rates and decodes the received data with the proposed approximate decoding method, for a binary symmetric channel (BSC) and a Gilbert-Elliott channel (GEC) [28], respectively. The source images are from the Container sample MPEG sequence with QCIF resolution. In all cases, the data are encoded with RLNC and a window of four packets is considered. We simulate losses with a GEC model [28] that consists of a two-state Markov chain, where the good and bad states represent the correct reception or the loss of a packet, respectively. We choose an average error burst length of 9 packets, and we vary the average packet loss rate in order to study the performance of our approximate reconstruction algorithm in different channel conditions. For the BSC model, the experiments are performed with a set of different average packet loss rates. As expected, the performance worsens as the packet loss rate increases.
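The two-state GEC loss process can be sketched as follows. This is our own illustration of the standard Gilbert-Elliott construction: the bad-to-good transition probability fixes the mean burst length, and the good-to-bad probability is then chosen to hit the target stationary loss rate (function and parameter names are ours).

```python
import random

def gilbert_elliott_losses(n, avg_loss_rate, avg_burst_len, seed=0):
    # Two-state Markov chain: good = packet received, bad = packet lost.
    # Mean burst length = 1 / p_bg; stationary loss rate = p_gb / (p_gb + p_bg).
    rng = random.Random(seed)
    p_bg = 1.0 / avg_burst_len
    p_gb = p_bg * avg_loss_rate / (1.0 - avg_loss_rate)
    lost, bad = [], False
    for _ in range(n):
        bad = (rng.random() < p_gb) if not bad else (rng.random() >= p_bg)
        lost.append(bad)
    return lost
```

For example, `gilbert_elliott_losses(10000, 0.05, 9.0)` produces a loss pattern with roughly 5% losses clustered in bursts of about 9 packets, as in the experiments above.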
Moreover, these results show that the approximate decoding enables the decoder to achieve a noticeable gain in decoded quality compared to traditional network coding based systems, which may completely fail to recover the data. Alternatively, this means that the approximate decoding may require a lower network load than traditional decoding algorithms in order to achieve the same decoding quality.

VII. CONCLUSIONS

In this paper, we have described a framework for the delivery of correlated information sources with the help of network coding, along with a novel low-complexity approximate decoding algorithm. The approximate decoding algorithm permits to reconstruct an approximation of the source signals even when an insufficient number of innovative packets is available for perfect decoding. We have analyzed the tradeoff between the decoding performance and the size of the coding field. We have determined an optimal field size that leads to the best approximate decoding performance. We have also investigated the impact of the accuracy of the data similarity information used in building the approximate decoding solution. The proposed approach is implemented in illustrative examples of sensor networks and distributed imaging applications, where the experimental results confirm our analytical study, as well as the benefits of approximate decoding as an efficient way to decode underdetermined systems with reasonable complexity when the source data are highly correlated.

VIII. ACKNOWLEDGMENTS

The authors would like to thank Dr. Laurent Duval for providing the seismic data used in the sensor network example.

APPENDIX I

In this appendix, we provide illustrative examples that verify the argument that smaller values of |s_i − s_j| can indeed lead to smaller values of 1_G^R(x_i ⊕ x_j), which is discussed in the proof of Property 1.
In this example, we consider GF(512) and study several cases with |s_i − s_j| = 0, 1, 2, 50, 100, 150, 256.

Fig. 13. The probability mass function of 1_G^R(x_i ⊕ x_j) for various values of |s_i − s_j| in GF(512).

In the cases with small differences between s_i and s_j (e.g., |s_i − s_j| = 0, 1, 2), we can observe that most of the values of 1_G^R(x_i ⊕ x_j) are concentrated around 0. In the cases with larger differences between s_i and s_j (e.g., |s_i − s_j| = 50, 100, 150, 256), however, the values of 1_G^R(x_i ⊕ x_j) are spread over the elements of the GF. These distributions are depicted in Fig. 13, which confirms that smaller values of |s_i − s_j| indeed result in smaller values of 1_G^R(x_i ⊕ x_j).

APPENDIX II

In this appendix, we show that P̂ > 1/2, where P̂ is defined as P̂ ≜ Pr(s ≤ (ŝ_R + ŝ_r)/2 | ŝ_R ≥ ŝ_r) in (19). Note that both ŝ_r and ŝ_R are reconstructed data, and thus they are real values. Using Bayes' rule,

P̂ = Pr(s ≤ (ŝ_R + ŝ_r)/2 | ŝ_R ≥ ŝ_r)
= Σ_{z=0}^{2^r − 1} Pr(z ≤ (ŝ_R + ŝ_r)/2 | ŝ_R ≥ ŝ_r, s = z) Pr(s = z)
= (1/2^r) Σ_{z=0}^{2^r − 1} Pr(z ≤ (ŝ_R + ŝ_r)/2 | ŝ_R ≥ ŝ_r, s = z).

Referring to Fig. 14, we have

Σ_{z=0}^{2^r − 1} Pr(z ≤ (ŝ_R + ŝ_r)/2 | ŝ_R ≥ ŝ_r, s = z)
= (1/2^{r+R}) Σ_{z=0}^{2^r − 1} [ 2^{r+R} − ( 2^{r−1}(2^r − 1) + Σ_{l=0}^{2z} l ) ]
= (1/2^{r+R}) [ 2^{2r+R} − (1/6)( 5 · 2^{3r} − 3 · 2^{2r} − 2 · 2^r ) ].
Thus, P̂ can be expressed as

P̂ = (1/2^r) · (1/2^{r+R}) [ 2^{2r+R} − (1/6)( 5 · 2^{3r} − 3 · 2^{2r} − 2 · 2^r ) ]
= 1 − (1/6)( 5 · 2^r/2^R − 3/2^R − 2/2^{r+R} ).

Fig. 14. An illustration for Appendix II.

Since r, R ∈ ℕ and R > r, R can be expressed as R = r + α, where α ∈ ℕ. Thus,

P̂ = 1 − (1/6)( 5/2^α − 3/2^{r+α} − 2/2^{2r+α} ).

Since lim_{r→∞} P̂ = 1 − (5/6) · (1/2^α) > 1/2 for all α ∈ ℕ, and P̂ is a non-increasing function of r, P̂ > 1/2 for all r, R.

APPENDIX III

In this appendix, we prove that the function g(z) = 2(b(z) − 1) − |e_T + a(z)| − |e_T − a(z)| is symmetric about (r − 1)/2, which is used in the proof of Property 3. To show this, we need to prove that g(z) = g(r − 1 − z) for all 0 ≤ z (∈ ℤ) ≤ r − 1. Note that

a(r − 1 − z) = 2^{r−1−z} − 2^{r−1−(r−1−z)} = −(2^z − 2^{r−1−z}) = −a(z)

and

b(r − 1 − z) = 2^{r−1−z} + 2^{r−1−(r−1−z)} = 2^z + 2^{r−1−z} = b(z).

Thus,

g(r − 1 − z) = 2(b(r − 1 − z) − 1) − |e_T + a(r − 1 − z)| − |e_T − a(r − 1 − z)|
= 2(b(z) − 1) − |e_T − a(z)| − |e_T + a(z)| = g(z)

which completes the proof.

APPENDIX IV

In this appendix, we show that

h(z) = (1/3) b(z)(b(z) − 1)(b(z) − 2) − (1/3) a(z)(a(z)^2 − 1)   (26)

is an increasing function for z ∈ ℤ with r/2 < z ≤ r − 1. This is used in the proof of Property 3. Note that (26) is equivalent to the function h(z) with z ∈ ℝ, r/2 < z ≤ r − 1, sampled at every z ∈ ℤ. Thus, we focus on showing that h(z) is an increasing function over z ∈ ℝ with r/2 < z ≤ r − 1, for which it is sufficient to show that (d/dz) h(z) > 0 for r/2 < z ≤ r − 1. Note that

(d/dz) a(z) = ln 2 · (2^z + 2^{r−1−z}) = b(z) ln 2 and (d/dz) b(z) = ln 2 · (2^z − 2^{r−1−z}) = a(z) ln 2.
Therefore,
$$\frac{d}{dz} h(z) = \frac{1}{3} \left[ \left( 3 b(z)^2 - 6 b(z) + 2 \right) \frac{db(z)}{dz} - \left( 3 a(z)^2 - 1 \right) \frac{da(z)}{dz} \right] = \frac{\ln 2}{3} \left[ 3 a(z) b(z) \big( b(z) - a(z) - 2 \big) + 2 a(z) + b(z) \right].$$
Since $a(z) b(z) = 2^{2z} - 2^{2(r-1-z)} > 0$ and $b(z) - a(z) = 2 \cdot 2^{r-1-z} \geq 2$ for $r/2 < z \leq r-1$,
$$\frac{d}{dz} h(z) = \frac{\ln 2}{3} \left[ 3 a(z) b(z) \big( b(z) - a(z) - 2 \big) + 2 a(z) + b(z) \right] > 0,$$
which implies that $h(z)$ is an increasing function over $r/2 < z \leq r-1$.
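The closed-form results in Appendices II–IV lend themselves to a quick numerical sanity check. The sketch below is not part of the paper; it simply evaluates the final expression for $\hat{P}$ and the functions $a$, $b$, $g$, and $h$ defined above. The sample values of $r$, $\alpha$, and $e_T$ are arbitrary choices for illustration ($e_T$ is treated as a free parameter here).

```python
# Numerical sanity check of the closed-form results in Appendices II-IV.
# Sample values of r, alpha, and e_T below are arbitrary test choices.

def P_hat(r, alpha):
    # Closed form of P_hat from Appendix II, with R = r + alpha.
    return 1.0 - (5.0 / 2**alpha - 3.0 / 2**(r + alpha) - 2.0 / 2**(2*r + alpha)) / 6.0

def a(z, r):
    return 2**z - 2**(r - 1 - z)

def b(z, r):
    return 2**z + 2**(r - 1 - z)

def g(z, r, e_T):
    # Function shown in Appendix III to satisfy g(z) = g(r-1-z).
    return 2 * (b(z, r) - 1) - abs(e_T + a(z, r)) - abs(e_T - a(z, r))

def h(z, r):
    # Function shown in Appendix IV to be increasing on r/2 < z <= r-1.
    az, bz = a(z, r), b(z, r)
    return bz * (bz - 1) * (bz - 2) / 3.0 - az * (az**2 - 1) / 3.0

# Appendix II: P_hat exceeds 1/2 and is non-increasing in r.
for alpha in range(1, 6):
    vals = [P_hat(r, alpha) for r in range(1, 20)]
    assert all(v > 0.5 for v in vals)
    assert all(x >= y for x, y in zip(vals, vals[1:]))

# Appendix III: symmetry g(z) = g(r-1-z) for integer 0 <= z <= r-1.
r, e_T = 8, 3
assert all(g(z, r, e_T) == g(r - 1 - z, r, e_T) for z in range(r))

# Appendix IV: h is increasing at the integers in (r/2, r-1].
zs = [z for z in range(r) if z > r / 2]
hv = [h(z, r) for z in zs]
assert all(x < y for x, y in zip(hv, hv[1:]))
```

For instance, with $r = \alpha = 1$ the closed form gives $\hat{P} = 1 - \frac{1}{6}(\frac{5}{2} - \frac{3}{4} - \frac{1}{4}) = 0.75$, and as $r$ grows $\hat{P}$ decreases toward the limit $1 - \frac{5}{6} \cdot 2^{-\alpha}$, consistent with the monotonicity argument above.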
