Communication over Finite-Field Matrix Channels
Authors: Danilo Silva, Frank R. Kschischang, Ralf Kötter
Abstract: This paper is motivated by the problem of error control in network coding when errors are introduced in a random fashion (rather than chosen by an adversary). An additive-multiplicative matrix channel is considered as a model for random network coding. The model assumes that n packets of length m are transmitted over the network, and up to t erroneous packets are randomly chosen and injected into the network. Upper and lower bounds on capacity are obtained for any channel parameters, and asymptotic expressions are provided in the limit of large field or matrix size. A simple coding scheme is presented that achieves capacity in both limiting cases. The scheme has decoding complexity O(n^2 m) and a probability of error that decreases exponentially both in the packet length and in the field size in bits. Extensions of these results for coherent network coding are also presented.

Index Terms: Error correction, error trapping, matrix channels, network coding, one-shot codes, probabilistic error model.

(This work was supported by the CAPES Foundation, Brazil, and by the Natural Sciences and Engineering Research Council of Canada. The material in this paper was presented in part at the 24th Biennial Symposium on Communications, Kingston, Canada, June 2008. D. Silva and F. R. Kschischang are with The Edward S. Rogers Sr. Department of Electrical and Computer Engineering, University of Toronto, Toronto, ON M5S 3G4, Canada (e-mail: danilo@comm.utoronto.ca, frank@comm.utoronto.ca). R. Kötter is deceased.)

I. INTRODUCTION

Linear network coding [1]–[3] is a promising new approach to information dissemination over networks. The fact that packets may be linearly combined at intermediate nodes affords, in many useful scenarios, higher rates than conventional routing approaches. If the linear combinations are chosen in a random, distributed fashion, then random linear network coding [4] not only maintains most of the benefits of linear network coding, but also affords a remarkable simplicity of design that is practically very appealing. However, linear network coding has the intrinsic drawback of being extremely sensitive to error propagation. Due to packet mixing, a single corrupt packet has the potential to contaminate all packets received by a destination node.

The problem is better understood by looking at a matrix model for (single-source) linear network coding, given by

\[
Y = AX + DZ. \qquad (1)
\]

All matrices are over a finite field. Here, X is an n × m matrix whose rows are packets transmitted by the source node, Y is an N × m matrix whose rows are the packets received by a (specific) destination node, and Z is a t × m matrix whose rows are the additive error packets injected at some network links. The matrices A and D are transfer matrices that describe the linear transformations incurred by packets en route to the destination. Such linear transformations are responsible for the (unconventional) phenomenon of error propagation.

There has been an increasing amount of research on error control for network coding, with results naturally depending on the specific channel model used, i.e., the joint statistics of A, D and Z given X.
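To make the channel law concrete, the following minimal Python sketch simulates one use of (1) over a prime field; the parameter values and the uniform draws of A, D, and Z are illustrative assumptions only, not part of the specific models studied below (where the statistics of A and D are specified per channel).

```python
import numpy as np

q = 7                      # prime field size (illustrative; the paper allows any finite field)
n, N, m, t = 4, 4, 8, 1    # transmitted/received packets, packet length, injected error packets

rng = np.random.default_rng(0)

X = rng.integers(0, q, size=(n, m))   # rows of X are the transmitted packets
A = rng.integers(0, q, size=(N, n))   # network transfer matrix
D = rng.integers(0, q, size=(N, t))   # error transfer matrix
Z = rng.integers(0, q, size=(t, m))   # rows of Z are the injected error packets

# Channel law (1): Y = AX + DZ, with all arithmetic carried out in F_q.
Y = (A @ X + D @ Z) % q
print(Y)
```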
Under a worst-case (or adversarial) error model, the work in [5], [6] (together with [7]–[10]) has obtained the maximum achievable rate for a wide range of conditions. If A is square (N = n) and nonsingular, and m ≥ n, then the maximum information rate that can be achieved in a single use of the channel is exactly n − 2t packets when A is known at the receiver, and approximately ((m − n)/m)(n − 2t) packets when A is unknown. These approaches are inherently pessimistic and share many similarities with classical coding theory.

Recently, Montanari and Urbanke [11] brought the problem to the realm of information theory by considering a probabilistic error model. Their model assumes, as above, that A is invertible and m ≥ n; in addition, they assume that the matrix DZ is chosen uniformly at random among all n × m matrices of rank t. For such a model and, under the assumption that the transmitted matrix X must contain an n × n identity submatrix as a header, they compute the maximal mutual information in the limit of large matrix size, approximately ((m − n − t)/m)(n − t) packets per channel use. They also present an iterative coding scheme with decoding complexity O(n^3 m) that asymptotically achieves this rate.

The present paper is motivated by [11], and by the challenge of computing or approximating the actual channel capacity (i.e., without any prior assumption on the input distribution) for any channel parameters (i.e., not necessarily in the limit of large matrix size). Our contributions can be summarized as follows:

• Assuming that the matrix A is a constant known to the receiver, we compute the exact channel capacity for any channel parameters. We also present a simple coding scheme that asymptotically achieves capacity in the limit of large field or matrix size.

• Assuming that the matrix A is chosen uniformly at random among all nonsingular matrices, we compute upper and lower bounds on the channel capacity for any channel parameters. These bounds are shown to converge asymptotically in the limit of large field or matrix size. We also present a simple coding scheme that asymptotically achieves capacity in both limiting cases. The scheme has decoding complexity O(n^2 m) and a probability of error that decays exponentially fast both in the packet length and in the field size in bits.

• We present several extensions of our results for situations where the matrices A, D and Z may be chosen according to more general probability distributions.

A main assumption that underlies this paper (including the extensions mentioned above) is that the transfer matrix A is always invertible. One might question whether this assumption is realistic for actual network coding systems. For instance, if the field size is small, then random network coding may not produce a nonsingular A with high probability. We believe, however, that removing this assumption complicates the analysis without offering much insight. Under an end-to-end coding (or layered) approach, there is a clear separation between the network coding protocol (which induces a matrix channel) and the error control techniques applied at the source and destination nodes. In this case, it is reasonable to assume that the network coding system will be designed to be feasible (i.e., able to deliver X to all destinations) when no errors occur in the network.
Indeed, a main premise of linear network coding is that the field size is sufficiently large in order to allow a feasible network code. Thus, the results of this paper may be seen as conditional on the network coding layer being successful in its task.

The remainder of this paper is organized as follows. In Section II, we provide general considerations on the type of channels studied in this paper. In Section III, we address a special case of (1) where A is random and t = 0, which may be seen as a model for random network coding without errors. In Section IV, we address a special case of (1) where A is the identity matrix. This channel may be seen as a model for network coding with errors when A is known at the receiver, since the receiver can always compute A^{-1} Y. The complete channel with a random, unknown A is addressed in Section V, where we make crucial use of the results and intuition developed in the previous sections. Section VI discusses possible extensions of our results, and Section VII presents our conclusions.

We will make use of the following notation. Let F_q be the finite field with q elements. We use F_q^{n×m} to denote the set of all n × m matrices over F_q, and T_{n×m,t}(F_q) to denote the set of all n × m matrices of rank t over F_q. We shall write simply T_{n×m,t} = T_{n×m,t}(F_q) when the field F_q is clear from the context. We also use the notation T_{n×m} = T_{n×m,min{n,m}} for the set of all full-rank n × m matrices. The n × m all-zero matrix and the n × n identity matrix are denoted by 0_{n×m} and I_{n×n}, respectively, where the subscripts may be omitted when there is no risk of confusion. The reduced row echelon (RRE) form of a matrix M will be denoted by RRE(M).

II. MATRIX CHANNELS

For clarity and consistency of notation, we recall a few definitions from information theory [12].

A discrete channel (𝒳, 𝒴, p_{Y|X}) consists of an input alphabet 𝒳, an output alphabet 𝒴, and a conditional probability distribution p_{Y|X} relating the channel input X ∈ 𝒳 and the channel output Y ∈ 𝒴. An (M, ℓ) code for a channel (𝒳, 𝒴, p_{Y|X}) consists of an encoding function {1, ..., M} → 𝒳^ℓ and a decoding function 𝒴^ℓ → {1, ..., M, f}, where f denotes a decoding failure. It is understood that an (M, ℓ) code is applied to the ℓth extension of the discrete memoryless channel (𝒳, 𝒴, p_{Y|X}). A rate R (in bits) is said to be achievable if there exists a sequence of (⌈2^{ℓR}⌉, ℓ) codes such that decoding is unsuccessful (either an error or a failure occurs) with probability arbitrarily small as ℓ → ∞. The capacity of the channel is the supremum of all achievable rates. It is well known that the capacity is given by C = max_{p_X} I(X; Y), where p_X denotes the input distribution.

Here, we are interested in matrix channels, i.e., channels for which both the input and output variables are matrices. In particular, we are interested in a family of additive matrix channels given by the channel law

\[
Y = AX + DZ \qquad (2)
\]

where X, Y ∈ F_q^{n×m}, A ∈ F_q^{n×n}, D ∈ F_q^{n×t}, Z ∈ F_q^{t×m}, and X, (A, D) and Z are statistically independent. Since the capacity of a matrix channel naturally scales with nm, we also define a normalized capacity C̄ = C/(nm).

In the following, we assume that the statistics of A, D and Z are given for all q, n, m, t.
In this case, we may denote a matrix channel simply by the tuple (q, n, m, t), and we may also indicate this dependency in both C and C̄. We now define two limiting forms of a matrix channel (strictly speaking, of a sequence of matrix channels). The first form, which we call the infinite-field-size channel, is obtained by taking q → ∞. The capacity of this channel is given by

\[
\lim_{q\to\infty} \frac{1}{\log_2 q}\, C(q, n, m, t)
\]

represented in q-ary units per channel use. The second form, which we call the infinite-rank channel, is obtained by setting t = τn and n = λm, and taking m → ∞. The normalized capacity of this channel is given by

\[
\lim_{m\to\infty} \frac{1}{\log_2 q}\, \bar{C}(q, \lambda m, m, \tau\lambda m)
\]

represented in q-ary units per transmitted q-ary symbol. We will hereafter assume that logarithms are taken to the base q and omit the factor 1/log_2 q from the above expressions.

Note that, to achieve the capacity of an infinite-field-size channel (similarly for an infinite-rank channel), one should find a two-dimensional family of codes: namely, a sequence of codes with increasing block length ℓ for each q, as q → ∞ (or for each m, as m → ∞). We will simplify our task here by considering only codes with block length ℓ = 1, which we call one-shot codes. We will show, however, that these codes can achieve the capacity of both the infinite-field-size and the infinite-rank channels, at least for the classes of channels considered here. In other words, one-shot codes are asymptotically optimal as either q → ∞ or m → ∞.

For completeness, we also define two more versions of the channel: the infinite-packet-length channel, obtained by fixing q, t and n, and letting m → ∞, and the infinite-batch-size channel, obtained by fixing q, t and m, and letting n → ∞. These channels are discussed in Section VI-E.

It is important to note that a (q, n, ℓm, t) channel is not the same as the ℓ-extension of a (q, n, m, t) channel. For instance, the 2-extension of a (q, n, m, t) channel has channel law (Y_1, Y_2) = (A_1 X_1 + D_1 Z_1, A_2 X_2 + D_2 Z_2), where (X_1, X_2) ∈ F_q^{n×m} × F_q^{n×m}, and (A_1, D_1, Z_1) and (A_2, D_2, Z_2) correspond to independent realizations of a (q, n, m, t) channel. This is not the same as the channel law for a (q, n, 2m, t) channel,

\[
\begin{bmatrix} Y_1 & Y_2 \end{bmatrix} = A_1 \begin{bmatrix} X_1 & X_2 \end{bmatrix} + D_1 \begin{bmatrix} Z_1 & Z_2 \end{bmatrix}
\]

since (A_2, D_2) may not be equal to (A_1, D_1).

To the best of our knowledge, the ℓ-extension of a (q, n, m, t) channel has not been considered in previous works, with the exception of [13]. For instance, [14] and [11] consider only limiting forms of a (q, n, m, t) channel. Although both models are referred to simply as "random linear network coding," the model implied by the results in [11] is in fact an infinite-rank channel, while the model implied by the results in [14] is an infinite-packet-length-infinite-field-size channel.

We now proceed to investigate special cases of (2), by considering specific statistics for A, D and Z.

III. THE MULTIPLICATIVE MATRIX CHANNEL

We define the multiplicative matrix channel (MMC) by the channel law

\[
Y = AX
\]

where A ∈ T_{n×n} is chosen uniformly at random among all n × n nonsingular matrices, and independently from X. Note that the MMC is a (q, n, m, 0) channel.
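As a concrete illustration of the MMC, the sketch below draws A uniformly from the nonsingular matrices by rejection sampling (a standard method assumed here for illustration; it is not prescribed by the paper) and applies one channel use. The field is taken to be a prime field so that arithmetic reduces to integer arithmetic modulo q; all parameter values are illustrative.

```python
import numpy as np

def rank_mod_q(M, q):
    """Rank of a matrix over the prime field F_q, computed by Gaussian elimination."""
    M = M.copy() % q
    rank, rows, cols = 0, M.shape[0], M.shape[1]
    for c in range(cols):
        piv = next((r for r in range(rank, rows) if M[r, c]), None)
        if piv is None:
            continue
        M[[rank, piv]] = M[[piv, rank]]                        # move pivot row up
        M[rank] = (M[rank] * pow(int(M[rank, c]), -1, q)) % q  # normalize pivot to 1
        for r in range(rows):
            if r != rank and M[r, c]:
                M[r] = (M[r] - M[r, c] * M[rank]) % q          # clear the pivot column
        rank += 1
    return rank

def random_nonsingular(n, q, rng):
    """Rejection-sample A uniformly from the nonsingular n x n matrices over F_q."""
    while True:
        A = rng.integers(0, q, size=(n, n))
        if rank_mod_q(A, q) == n:
            return A

rng = np.random.default_rng(1)
q, n, m = 5, 4, 8                     # illustrative parameters
A = random_nonsingular(n, q, rng)
X = rng.integers(0, q, size=(n, m))
Y = (A @ X) % q                       # one use of the MMC: Y = AX
print(Y)
```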
A. Capacity and Capacity-Achieving Codes

In order to find the capacity of this channel, we will first solve a more general problem.

Proposition 1: Let G be a finite group that acts on a finite set S. Consider a channel with input variable X ∈ S and output variable Y ∈ S given by Y = AX, where A ∈ G is drawn uniformly at random and independently from X. The capacity of this channel, in bits per channel use, is given by

\[
C = \log_2 |S/G|
\]

where |S/G| is the number of equivalence classes of S under the action of G. Any complete set of representatives of the equivalence classes is a capacity-achieving code.

Proof: For each x ∈ S, let G(x) = {gx | g ∈ G} denote the orbit of x under the action of G. Recall that G(y) = G(x) for all y ∈ G(x) and all x ∈ S, that is, the orbits form equivalence classes. For y ∈ G(x), let G_{x,y} = {g ∈ G | gx = y}. By a few manipulations, it is easy to show that |G_{x,y}| = |G_{x,y'}| for all y, y' ∈ G(x). Since A has a uniform distribution, it follows that P[Y = y | X = x] = 1/|G(x)|, for all y ∈ G(x). For any x ∈ S, consider the same channel but with the input alphabet restricted to G(x). Note that the output alphabet will also be restricted to G(x). This is a |G(x)|-ary channel with uniform transition probabilities; thus, the capacity of this channel is 0. Now, the overall channel can be considered as a sum (union of alphabets) of all the restricted channels. The capacity of a sum of M channels with capacities C_i, i = 1, ..., M, is known to be log_2 Σ_{i=1}^{M} 2^{C_i} bits. Thus, the capacity of the overall channel is log_2 M bits, where M = |S/G| is the number of orbits. A capacity-achieving code (with block length 1) may be obtained by simply selecting one representative from each equivalence class.

Proposition 1 shows that, in a channel induced by a group action where the group elements are selected uniformly at random, the receiver cannot distinguish between transmitted elements that belong to the same equivalence class. Thus, the transmitter can only communicate the choice of a particular equivalence class.

Returning to our original problem, we have S = F_q^{n×m} and G = T_{n×n} (the general linear group GL_n(F_q)). The equivalence classes of S under the action of G are the sets of matrices that share the same row space. Thus, we can identify each equivalence class with a subspace of F_q^m of dimension at most n. Let the Gaussian coefficient

\[
\begin{bmatrix} m \\ k \end{bmatrix}_q = \prod_{i=0}^{k-1} \frac{q^m - q^i}{q^k - q^i}
\]

denote the number of k-dimensional subspaces of F_q^m. We have the following corollary of Proposition 1.

Corollary 2: The capacity of the MMC, in q-ary units per channel use, is given by

\[
C_{\mathrm{MMC}} = \log_q \sum_{k=0}^{n} \begin{bmatrix} m \\ k \end{bmatrix}_q .
\]

A capacity-achieving code C ⊆ F_q^{n×m} can be obtained by ensuring that each k-dimensional subspace of F_q^m, k ≤ n, is the row space of some unique X ∈ C.

Note that Corollary 2 reinforces the idea introduced in [9] that, in order to communicate under random network coding, the transmitter should encode information in the choice of a subspace.
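Corollary 2 is straightforward to evaluate numerically. The sketch below computes the Gaussian coefficients with exact integer arithmetic and prints the resulting C_MMC for a few field sizes; the parameter values are illustrative only.

```python
from math import log

def gaussian_binomial(m, k, q):
    """Number of k-dimensional subspaces of F_q^m (the Gaussian coefficient)."""
    num = den = 1
    for i in range(k):
        num *= q**m - q**i
        den *= q**k - q**i
    return num // den            # the division is exact

def c_mmc(q, n, m):
    """Exact MMC capacity of Corollary 2, in q-ary units per channel use."""
    total = sum(gaussian_binomial(m, k, q) for k in range(n + 1))
    return log(total, q)

# As q grows, the exact capacity approaches (m - n) * n, the infinite-field-size
# limit derived below.
n, m = 4, 8
for q in (2, 16, 256, 2**16):
    print(q, round(c_mmc(q, n, m), 3), "limit:", (m - n) * n)
```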
We now compute the capacity for the two limiting forms of the channel, as discussed in Section II. We have the following result.

Proposition 3: Let λ = n/m and assume 0 < λ ≤ 1/2. Then

\[
\lim_{q\to\infty} C_{\mathrm{MMC}} = (m - n)n \qquad (3)
\]
\[
\lim_{\substack{m\to\infty \\ n = \lambda m}} \bar{C}_{\mathrm{MMC}} = 1 - \lambda. \qquad (4)
\]

Proof: First, observe that

\[
\begin{bmatrix} m \\ n^* \end{bmatrix}_q < \sum_{k=0}^{n} \begin{bmatrix} m \\ k \end{bmatrix}_q < (n+1) \begin{bmatrix} m \\ n^* \end{bmatrix}_q \qquad (5)
\]

where n* = min{n, ⌊m/2⌋}. Using the fact that [9]

\[
q^{(m-k)k} < \begin{bmatrix} m \\ k \end{bmatrix}_q < 4\, q^{(m-k)k} \qquad (6)
\]

it follows that

\[
(m - n^*)n^* < C_{\mathrm{MMC}} < (m - n^*)n^* + \log_q 4(n+1). \qquad (7)
\]

The last term on the right vanishes in both limiting cases.

The case λ ≥ 1/2 can also be readily obtained but is less interesting since, in practice, the packet length m will be much larger than the number of packets n.

Note that an expression similar to (7) has been found in [13] under a different assumption on the transfer matrix (namely, that A is uniform on F_q^{n×n}). It is interesting to note that, also in that case, the same conclusion can be reached about the sufficiency of transmitting subspaces [13].

An intuitive way to interpret (3) is the following: out of the nm symbols obtained by the receiver, n^2 of these symbols are used to describe A, while the remaining ones are used to communicate X. It is interesting to note that (3) precisely matches (4) after normalizing by the total number of transmitted symbols, nm.

Both limiting capacity expressions (3) and (4) can be achieved using a simple coding scheme where an n × (m − n) data matrix U is concatenated on the left with an n × n identity matrix I, yielding a transmitted matrix X = [I  U]. The first n symbols of each transmitted packet may be interpreted as pilot symbols used to perform "channel sounding". Note that this is simply the standard way of using random network coding [15].

IV. THE ADDITIVE MATRIX CHANNEL

We define the additive matrix channel (AMC) according to

\[
Y = X + W
\]

where W ∈ T_{n×m,t} is chosen uniformly at random among all n × m matrices of rank t, independently from X. Note that the AMC is a (q, n, m, t) channel with D ∈ T_{n×t} and Z ∈ T_{t×m} uniformly distributed, and A = I.

A. Capacity

The capacity of the AMC is computed in the next proposition.

Proposition 4: The capacity of the AMC is given by

\[
C_{\mathrm{AMC}} = nm - \log_q |T_{n\times m,t}|.
\]

For λ = n/m and τ = t/n, we have the limiting expressions

\[
\lim_{q\to\infty} C_{\mathrm{AMC}} = (m - t)(n - t) \qquad (8)
\]
\[
\lim_{\substack{m\to\infty \\ n = \lambda m,\ t = \tau n}} \bar{C}_{\mathrm{AMC}} = (1 - \lambda\tau)(1 - \tau). \qquad (9)
\]

Proof: To compute the capacity, we expand the mutual information

\[
I(X; Y) = H(Y) - H(Y|X) = H(Y) - H(W)
\]

where the last equality holds because X and W are independent. Note that H(Y) ≤ nm, and the maximum is achieved when Y is uniform. Since H(W) does not depend on the input distribution, we can maximize H(Y) by choosing, e.g., a uniform p_X. The entropy of W is given by H(W) = log_q |T_{n×m,t}|. The number of n × m matrices of rank t is given by [16, p. 455]

\[
|T_{n\times m,t}| = \frac{|T_{n\times t}|\,|T_{t\times m}|}{|T_{t\times t}|} = |T_{n\times t}| \begin{bmatrix} m \\ t \end{bmatrix}_q \qquad (10)
\]
\[
= q^{(n+m-t)t} \prod_{i=0}^{t-1} \frac{(1 - q^{i-n})(1 - q^{i-m})}{1 - q^{i-t}}. \qquad (11)
\]

Thus,

\[
C_{\mathrm{AMC}} = nm - \log_q |T_{n\times m,t}| = (m - t)(n - t) + \log_q \prod_{i=0}^{t-1} \frac{1 - q^{i-t}}{(1 - q^{i-n})(1 - q^{i-m})}.
\]

The limiting expressions (8) and (9) follow immediately from the equation above.

Remark: The expression (9), which gives the capacity of the infinite-rank AMC, has been previously obtained in [11] for a channel that is equivalent to the AMC. Our proof is a simple extension of the proof in [11].

As can be seen from (11), an n × m matrix of rank t can be specified with approximately (n + m − t)t symbols. Thus, the capacity (8) can be interpreted as the number of symbols conveyed by Y minus the number of symbols needed to describe W.
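The counting formula (10) and Proposition 4 can likewise be evaluated directly. The sketch below computes |T_{n×m,t}| exactly and prints C_AMC, illustrating the convergence to the limit (8) as q grows; the parameter values are illustrative assumptions.

```python
from math import log

def num_full_rank(n, t, q):
    """|T_{n x t}|: number of full-rank n x t matrices over F_q (assumes t <= n)."""
    out = 1
    for i in range(t):
        out *= q**n - q**i
    return out

def num_rank_t(n, m, t, q):
    """|T_{n x m, t}|: number of n x m matrices of rank t over F_q, via (10)."""
    return num_full_rank(n, t, q) * num_full_rank(m, t, q) // num_full_rank(t, t, q)

def c_amc(q, n, m, t):
    """Exact AMC capacity of Proposition 4, in q-ary units per channel use."""
    return n * m - log(num_rank_t(n, m, t, q), q)

# As q grows, the exact capacity approaches the limit (m - t)(n - t) of (8).
n, m, t = 4, 8, 1
for q in (2, 16, 256, 2**16):
    print(q, round(c_amc(q, n, m, t), 3), "limit:", (m - t) * (n - t))
```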
Note that, as in Section III, the normalized capacities of the infinite-field-size AMC and the infinite-rank AMC are the same. An intuitive explanation might be the fact that, for the two channels, both the number of bits per row and the number of bits per column tend to infinity. In contrast, the normalized capacity is different when only one of these quantities grows while the other is fixed. This is the case of the infinite-packet-length AMC and the infinite-batch-size AMC, which are studied in Section VI-E.

B. A Coding Scheme

We now present an efficient coding scheme that achieves (8) and (9). The scheme is based on an "error trapping" strategy.

Let U ∈ F_q^{(n−v)×(m−v)} be a data matrix, where v ≥ t. A codeword X is formed by adding all-zero rows and columns to U so that

\[
X = \begin{bmatrix} 0_{v\times v} & 0_{v\times(m-v)} \\ 0_{(n-v)\times v} & U \end{bmatrix}.
\]

These all-zero rows and columns may be interpreted as the "error traps." Clearly, the rate of this scheme is R = (n − v)(m − v).

Since the noise matrix W has rank t, we can write it as

\[
W = BZ = \begin{bmatrix} B_1 \\ B_2 \end{bmatrix} \begin{bmatrix} Z_1 & Z_2 \end{bmatrix}
\]

where B_1 ∈ F_q^{v×t}, B_2 ∈ F_q^{(n−v)×t}, Z_1 ∈ F_q^{t×v} and Z_2 ∈ F_q^{t×(m−v)}. The received matrix Y is then given by

\[
Y = X + W = \begin{bmatrix} B_1 Z_1 & B_1 Z_2 \\ B_2 Z_1 & U + B_2 Z_2 \end{bmatrix}.
\]

We define an error trapping failure to be the event that rank B_1 Z_1 < t. Intuitively, this corresponds to the situation where either the row space or the column space of the error matrix has not been "trapped."

For now, assume that the error trapping is successful, i.e., rank B_1 = rank Z_1 = t. Consider the submatrix corresponding to the first v columns of Y. Since rank B_1 Z_1 = t, the rows of B_2 Z_1 are completely spanned by the rows of B_1 Z_1. Thus, there exists some matrix T̄ such that B_2 Z_1 = T̄ B_1 Z_1. But (B_2 − T̄ B_1) Z_1 = 0 implies that B_2 − T̄ B_1 = 0, since Z_1 has full row rank. It follows that

\[
T \begin{bmatrix} B_1 \\ B_2 \end{bmatrix} = \begin{bmatrix} B_1 \\ 0 \end{bmatrix}, \quad \text{where } T = \begin{bmatrix} I & 0 \\ -\bar{T} & I \end{bmatrix}.
\]

Note also that TX = X. Thus,

\[
TY = TX + TW = \begin{bmatrix} B_1 Z_1 & B_1 Z_2 \\ 0 & U \end{bmatrix}
\]

from which the data matrix U can be readily obtained.

The complexity of the scheme is computed as follows. In order to obtain T̄, it suffices to perform Gaussian elimination on the left n × v submatrix of Y, for a cost of O(nv^2) operations. The data matrix can be extracted by multiplying T̄ with the top-right v × (m − v) submatrix of Y, which can be accomplished in O((n − v)v(m − v)) operations. Thus, the overall complexity of the scheme is O(nmv) operations in F_q.

Note that B_1 Z_1 is available at the receiver as the top-left submatrix of Y. Moreover, the rank of B_1 Z_1 is already computed during the Gaussian elimination step of the decoding. Thus, the event that the error trapping fails can be readily detected at the receiver, which can then declare a decoding failure. It follows that the error probability of the scheme is zero.
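A minimal sketch of this encoder and decoder is given below for a prime field; the elimination routine, parameter values, and the way W is drawn are illustrative choices, not part of the paper's construction. Pivots are restricted to the top-left v × v block, so a single elimination pass both reveals rank B_1 Z_1 (for failure detection) and clears B_2 Z_1, leaving U in the bottom-right block.

```python
import numpy as np

def amc_encode(U, n, m, v, q):
    """Codeword X: the data matrix U framed by all-zero 'error trap' rows and columns."""
    X = np.zeros((n, m), dtype=int)
    X[v:, v:] = U % q
    return X

def amc_decode(Y, n, m, v, t, q):
    """Error-trapping decoder for the AMC (sketch; q prime)."""
    M = Y.copy() % q
    rank = 0
    for c in range(v):                        # pivot columns restricted to the trap columns
        piv = next((r for r in range(rank, v) if M[r, c]), None)
        if piv is None:
            continue
        M[[rank, piv]] = M[[piv, rank]]
        M[rank] = (M[rank] * pow(int(M[rank, c]), -1, q)) % q
        for r in range(M.shape[0]):           # clear column c in every other row
            if r != rank and M[r, c]:
                M[r] = (M[r] - M[r, c] * M[rank]) % q
        rank += 1
    if rank < t:                              # rank(B1 Z1) < t: error trapping failed
        return None
    return M[v:, v:]                          # bottom-right block is the data matrix U

# Illustrative use (parameters are assumptions, not prescribed by the paper):
rng = np.random.default_rng(2)
q, n, m, t, v = 5, 6, 12, 1, 2
U = rng.integers(0, q, size=(n - v, m - v))
B = rng.integers(0, q, size=(n, t))           # W = B Z has rank t with high probability
Z = rng.integers(0, q, size=(t, m))
Y = (amc_encode(U, n, m, v, q) + B @ Z) % q
U_hat = amc_decode(Y, n, m, v, t, q)
print("recovered:", U_hat is not None and np.array_equal(U_hat, U % q))
print("failure probability bound (12):", 2 * t / q**(1 + v - t))
```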
Let us now compute the probability of decoding failure. Consider, for instance, P_1 = P[rank Z_1 = t], where Z = [Z_1  Z_2] is a full-rank matrix chosen uniformly at random. An equivalent way of generating Z is to first generate the entries of a matrix M ∈ F_q^{t×m} uniformly at random, and then discard M if it is not full-rank. Thus, we want to compute P_1 = P[rank M_1 = t | rank M = t], where M_1 corresponds to the first v columns of M. This probability is

\[
P_1 = \frac{P[\operatorname{rank} M_1 = t]}{P[\operatorname{rank} M = t]} = \frac{q^{mt} \prod_{i=0}^{t-1} (q^v - q^i)}{q^{vt} \prod_{i=0}^{t-1} (q^m - q^i)} > \prod_{i=0}^{t-1} (1 - q^{i-v}) \ge (1 - q^{t-1-v})^t \ge 1 - \frac{t}{q^{1+v-t}}.
\]

The same analysis holds for P_2 = P[rank B_1 = t]. By the union bound, it follows that the probability of failure satisfies

\[
P_f < \frac{2t}{q^{1+v-t}}. \qquad (12)
\]

Proposition 5: The coding scheme described above can achieve both capacity expressions (8) and (9).

Proof: From (12), we see that achieving either of the limiting capacities amounts to setting a suitable v. To achieve (8), we set v = t and let q grow. The resulting code will have the correct rate, namely, R = (n − t)(m − t) in q-ary units, while the probability of failure will decrease exponentially with the field size in bits. Alternatively, to achieve (9), we can choose some small ε > 0 and set v = (τ + ε)n, where both τ = t/n and λ = n/m are assumed fixed. By letting m grow, we obtain a probability of failure that decreases exponentially with m. The (normalized) gap to capacity of the resulting code will be

\[
\bar{g} \triangleq \lim_{m\to\infty} \bar{C}_{\mathrm{AMC}} - R/(nm) = (1 - \lambda\tau)(1 - \tau) - (1 - \lambda(\tau + \epsilon))(1 - (\tau + \epsilon)) = \lambda\epsilon(1 - (\tau + \epsilon)) + \epsilon(1 - \lambda\tau) < \lambda\epsilon + \epsilon = (1 + \lambda)\epsilon
\]

which can be made as small as we wish.

V. THE ADDITIVE-MULTIPLICATIVE MATRIX CHANNEL

Consider a (q, n, m, t) channel with A ∈ T_{n×n}, D ∈ T_{n×t} and Z ∈ T_{t×m} uniformly distributed and independent from the other variables. Since A is invertible, we can rewrite (2) as

\[
Y = AX + DZ = A(X + A^{-1}DZ). \qquad (13)
\]

Now, since T_{n×n} acts transitively on T_{n×t}, the channel law (13) is equivalent to

\[
Y = A(X + W) \qquad (14)
\]

where A ∈ T_{n×n} and W ∈ T_{n×m,t} are chosen uniformly at random and independently from any other variables. We call (14) the additive-multiplicative matrix channel (AMMC).

A. Capacity

One of the main results of this section is the following theorem, which provides an upper bound on the capacity of the AMMC.

Theorem 6: For n ≤ m/2, the capacity of the AMMC is upper bounded by

\[
C_{\mathrm{AMMC}} \le (m - n)(n - t) + \log_q 4(1 + n)(1 + t).
\]

Proof: Let S = X + W. By expanding I(X, S; Y), and using the fact that X, S and Y form a Markov chain, in that order, we have

\[
\begin{aligned}
I(X; Y) &= I(S; Y) - I(S; Y|X) + \underbrace{I(X; Y|S)}_{=0} \\
&= I(S; Y) - I(W; Y|X) \\
&= I(S; Y) - H(W|X) + H(W|X, Y) \\
&= I(S; Y) - H(W) + H(W|X, Y) \qquad (15) \\
&\le C_{\mathrm{MMC}} - \log_q |T_{n\times m,t}| + H(W|X, Y) \qquad (16)
\end{aligned}
\]

where (15) follows since X and W are independent.

We now compute an upper bound on H(W|X, Y). Let R = rank Y and write Y = GȲ, where G ∈ T_{n×R} and Ȳ ∈ T_{R×m}. Note that

\[
X + W = A^{-1}Y = A^{-1}G\bar{Y} = A^{*}\bar{Y}
\]

where A* = A^{-1}G. Since Ȳ is full-rank, it must contain an invertible R × R submatrix. By reordering columns if necessary, assume that the left R × R submatrix of Ȳ is invertible. Write Ȳ = [Ȳ_1  Ȳ_2], X = [X_1  X_2] and W = [W_1  W_2], where Ȳ_1, X_1 and W_1 have R columns, and Ȳ_2, X_2 and W_2 have m − R columns. We have A* = (X_1 + W_1)Ȳ_1^{-1} and W_2 = A*Ȳ_2 − X_2. It follows that W_2 can be computed if W_1 is known.
Thus,

\[
H(W|X, Y) = H(W_1|X, Y) \le H(W_1|R) \le H(W_1|R = n) \le \log_q \sum_{i=0}^{t} |T_{n\times n,i}| \le \log_q (t+1)|T_{n\times n,t}| \qquad (17)
\]

where (17) follows since W_1 may possibly be any n × n matrix with rank ≤ t. Applying this result in (16), and using (5) and (10), we have

\[
I(X; Y) \le \log_q (n+1)\begin{bmatrix} m \\ n \end{bmatrix}_q + \log_q \frac{(t+1)\,|T_{n\times t}| \begin{bmatrix} n \\ t \end{bmatrix}_q}{|T_{n\times t}| \begin{bmatrix} m \\ t \end{bmatrix}_q} \le \log_q (n+1)(t+1)\begin{bmatrix} m-t \\ n-t \end{bmatrix}_q \qquad (18)
\]
\[
\le (m - n)(n - t) + \log_q 4(1 + n)(1 + t)
\]

where (18) follows from \(\begin{bmatrix} m \\ n \end{bmatrix}_q \begin{bmatrix} n \\ t \end{bmatrix}_q = \begin{bmatrix} m \\ t \end{bmatrix}_q \begin{bmatrix} m-t \\ n-t \end{bmatrix}_q\), for t ≤ n ≤ m.

We now develop a connection with the subspace approach of [9] that will be useful to obtain a lower bound on the capacity. From Section III, we know that, in a multiplicative matrix channel, the receiver can only distinguish between transmitted subspaces. Thus, we can equivalently express

\[
C_{\mathrm{AMMC}} = \max_{p_X} I(\mathcal{X}; \mathcal{Y})
\]

where 𝒳 and 𝒴 denote the row spaces of X and Y, respectively. Using this interpretation, we can obtain the following lower bound on capacity.

Theorem 7: Assume n ≤ m. For any ε ≥ 0, we have

\[
C_{\mathrm{AMMC}} \ge (m - n)(n - t - \epsilon t) - \log_q 4 - \frac{2tnm}{q^{1+\epsilon t}}.
\]

In order to prove Theorem 7, we need a few lemmas.

Lemma 8: Let X ∈ F_q^{n×m} be a matrix of rank k, and let W ∈ F_q^{n×m} be a random matrix chosen uniformly among all matrices of rank t. If k + t ≤ min{n, m}, then

\[
P[\operatorname{rank}(X + W) < k + t] < \frac{2t}{q^{\min\{n,m\} - k - t + 1}}.
\]

Proof: Write X = X'X'', where X' ∈ F_q^{n×k} and X'' ∈ F_q^{k×m} are full-rank matrices. We can generate W as W = W'W'', where W' ∈ T_{n×t} and W'' ∈ T_{t×m} are chosen uniformly at random and independently from each other. Then we have

\[
X + W = X'X'' + W'W'' = \begin{bmatrix} X' & W' \end{bmatrix} \begin{bmatrix} X'' \\ W'' \end{bmatrix}.
\]

Note that rank(X + W) = k + t if and only if the column spaces of X' and W' intersect trivially and the row spaces of X'' and W'' intersect trivially. Let P' and P'' denote the probabilities of these two events, respectively. By a simple counting argument, we have

\[
P' = \frac{(q^n - q^k)\cdots(q^n - q^{k+t-1})}{(q^n - 1)\cdots(q^n - q^{t-1})} = \prod_{i=0}^{t-1} \frac{1 - q^{k-n+i}}{1 - q^{-n+i}} > \prod_{i=0}^{t-1} (1 - q^{k-n+i}) \ge (1 - q^{k-n+t-1})^t \ge 1 - tq^{k-n+t-1}.
\]

Similarly, we have P'' > 1 − t q^{k−m+t−1}. Thus,

\[
P[\operatorname{rank}(X + W) < k + t] < \frac{t}{q^{n-k-t+1}} + \frac{t}{q^{m-k-t+1}} \le \frac{2t}{q^{\min\{n,m\} - k - t + 1}}.
\]

For dim 𝒳 ≤ n ≤ m, let S_{𝒳,n} denote the set of all n-dimensional subspaces of F_q^m that contain a subspace 𝒳 ⊆ F_q^m.

Lemma 9: \(|S_{\mathcal{X},n}| = \begin{bmatrix} m-k \\ n-k \end{bmatrix}_q\), where k = dim 𝒳.

Proof: By the fourth isomorphism theorem [17], there is a bijection between S_{𝒳,n} and the set of all (n − k)-dimensional subspaces of the quotient space F_q^m / 𝒳. Since dim F_q^m / 𝒳 = m − k, the result follows.

We can now give a proof of Theorem 7.

Proof of Theorem 7: Assume that X is selected from T_{n×m,k}, where k = n − (1 + ε)t and ε ≥ 0. Define a random variable Q as

\[
Q = \begin{cases} 1 & \text{if } \dim \mathcal{Y} = \operatorname{rank}(X + W) = k + t \\ 0 & \text{otherwise.} \end{cases}
\]

Note that 𝒳 ⊆ 𝒴 when Q = 1. By Lemma 9 and (6), we have

\[
H(\mathcal{Y} \mid \mathcal{X}, Q = 1) \le \log_q |S_{\mathcal{X},n'}| \le (m - n')t + \log_q 4
\]

where n' = k + t. Choosing X uniformly from T_{n×m,k}, we can also make 𝒴 uniform within a given dimension; in particular,

\[
H(\mathcal{Y} \mid Q = 1) = \log_q \begin{bmatrix} m \\ n' \end{bmatrix}_q \ge (m - n')n'.
\]

It follows that

\[
I(\mathcal{X}; \mathcal{Y} \mid Q = 1) = H(\mathcal{Y} \mid Q = 1) - H(\mathcal{Y} \mid \mathcal{X}, Q = 1) \ge (m - n')(n' - t) - \log_q 4 \ge (m - n)(n - t - \epsilon t) - \log_q 4.
\]
Now, using Lemma 8, we obtain

\[
\begin{aligned}
I(\mathcal{X}; \mathcal{Y}) = I(\mathcal{X}; \mathcal{Y}, Q) &= I(\mathcal{X}; Q) + I(\mathcal{X}; \mathcal{Y} \mid Q) \\
&\ge I(\mathcal{X}; \mathcal{Y} \mid Q) \ge P[Q = 1]\, I(\mathcal{X}; \mathcal{Y} \mid Q = 1) \\
&\ge I(\mathcal{X}; \mathcal{Y} \mid Q = 1) - P[Q = 0]\, nm \\
&\ge (m - n)(n - t - \epsilon t) - \log_q 4 - \frac{2tnm}{q^{\epsilon t + 1}}.
\end{aligned}
\]

Note that, differently from the results of previous sections, Theorems 6 and 7 provide only upper and lower bounds on the channel capacity. Nevertheless, it is still possible to compute exact expressions for the capacity of the AMMC in certain limiting cases.

Corollary 10: For 0 < λ = n/m ≤ 1/2 and τ = t/n, we have

\[
\lim_{q\to\infty} C_{\mathrm{AMMC}} = (m - n)(n - t) \qquad (19)
\]
\[
\lim_{\substack{m\to\infty \\ n = \lambda m,\ t = \tau n}} \bar{C}_{\mathrm{AMMC}} = (1 - \lambda)(1 - \tau). \qquad (20)
\]

Proof: The fact that the values in (19) and (20) are upper bounds follows immediately from Theorem 6. The fact that (19) is a lower bound follows immediately from Theorem 7 by setting ε = 0. To obtain (20) from Theorem 7, it suffices to choose ε such that 1/ε grows sublinearly with m, e.g., ε = 1/√m.

Once again, note that (19) agrees with (20) if we consider the normalized capacity.

Differently from the MMC and the AMC, successful decoding in the AMMC does not (necessarily) allow recovery of all sources of channel uncertainty (in this case, the matrices A and W). In general, for every observable (X, Y) pair, there are many valid A and W such that Y = A(X + W). Such coupling between A and W is reflected in the extra term H(W|X, Y) in (15), which provides an additional rate of roughly (2n − t)t as compared to the straightforward lower bound C_AMMC ≥ C_MMC − log_q |T_{n×m,t}| ≈ (m − n)n − (n + m − t)t.

Remark: In [11], the problem of communicating over the AMMC was addressed assuming a specific form of transmission matrices that contained an n × n identity header. Note that, if we consider such a header as being part of the channel (i.e., beyond the control of the code designer) then, with high probability, the resulting channel becomes equivalent to the AMC (see [11] for details). However, as a coding strategy for the AMMC, using such an n × n identity header results in a suboptimal input distribution, as the mutual information achieved is strictly smaller than the capacity. Indeed, the capacity-achieving distribution used in Theorem 7 and Corollary 10 corresponds to transmission matrices of rank n − (1 + ε)t. This result shows that, for the AMMC, using headers is neither fundamental nor asymptotically optimal.

B. A Coding Scheme

We now propose an efficient coding scheme that can asymptotically achieve (19) and (20). The scheme is based on a combination of channel sounding and error trapping strategies.

For a data matrix U ∈ F_q^{(n−v)×(m−n)}, where v ≥ t, let the corresponding codeword be

\[
X = \begin{bmatrix} 0 & \bar{X} \end{bmatrix} = \begin{bmatrix} 0_{v\times v} & 0_{v\times(n-v)} & 0_{v\times(m-n)} \\ 0_{(n-v)\times v} & I_{(n-v)\times(n-v)} & U \end{bmatrix}.
\]

Note that the all-zero matrices provide the error traps, while the identity matrix corresponds to the pilot symbols. Clearly, the rate of this scheme is R = (n − v)(m − n).

Write the noise matrix W as

\[
W = BZ = \begin{bmatrix} B_1 \\ B_2 \end{bmatrix} \begin{bmatrix} Z_1 & Z_2 & Z_3 \end{bmatrix}
\]

where B_1 ∈ F_q^{v×t}, B_2 ∈ F_q^{(n−v)×t}, Z_1 ∈ F_q^{t×v}, Z_2 ∈ F_q^{t×(n−v)} and Z_3 ∈ F_q^{t×(m−n)}. The auxiliary matrix S is then given by

\[
S = X + W = \begin{bmatrix} B_1 Z_1 & B_1 Z_2 & B_1 Z_3 \\ B_2 Z_1 & I + B_2 Z_2 & U + B_2 Z_3 \end{bmatrix}.
\]

As in Section IV, we say that the error trapping is successful if rank B_1 Z_1 = t. Assume that this is the case.
From Section IV, there exists some matrix T ∈ T_{n×n} such that

\[
TS = \begin{bmatrix} B_1 Z_1 & B_1 Z_2 & B_1 Z_3 \\ 0 & I & U \end{bmatrix} = \begin{bmatrix} B_1 & 0 \\ 0 & I \end{bmatrix} \begin{bmatrix} Z_1 & Z_2 & Z_3 \\ 0 & I & U \end{bmatrix}.
\]

Note further that

\[
\mathrm{RRE}\!\left( \begin{bmatrix} Z_1 & Z_2 & Z_3 \\ 0 & I & U \end{bmatrix} \right) = \begin{bmatrix} \tilde{Z}_1 & 0 & \tilde{Z}_3 \\ 0 & I & U \end{bmatrix}
\]

for some Z̃_1 ∈ F_q^{t×v} in RRE form and some Z̃_3 ∈ F_q^{t×(m−n)}. It follows that

\[
\mathrm{RRE}(S) = \mathrm{RRE}\!\left( \begin{bmatrix} B_1 & 0 \\ 0 & I \end{bmatrix} \begin{bmatrix} Z_1 & Z_2 & Z_3 \\ 0 & I & U \end{bmatrix} \right) = \begin{bmatrix} \tilde{Z}_1 & 0 & \tilde{Z}_3 \\ 0 & I & U \\ 0 & 0 & 0 \end{bmatrix}
\]

where the bottom v − t rows are all-zero. Since A is invertible, we have RRE(Y) = RRE(S), from which U can be readily obtained. Thus, decoding amounts to performing Gauss-Jordan elimination on Y. It follows that the complexity of the scheme is O(n^2 m) operations in F_q.

The probability that the error trapping is not successful, i.e., rank B_1 Z_1 < t, was computed in Section IV. Let Â correspond to the first n columns of Y. Note that rank B_1 Z_1 = t if and only if rank Â = n − v + t. Thus, when the error trapping is not successful, the receiver can easily detect this event by looking at RRE(Y) and then declare a decoding failure. It follows that the scheme has zero error probability and probability of failure given by (12).

Theorem 11: The proposed coding scheme can asymptotically achieve (19) and (20).

Proof: Using (12) and the same argument as in the proof of Proposition 5, we can set a suitable v in order to achieve an arbitrarily low gap to capacity while maintaining an arbitrarily low probability of failure, for both cases where q → ∞ or m → ∞.
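A minimal end-to-end sketch of this scheme (error traps plus identity pilots, decoded by Gauss-Jordan elimination) is given below for a prime field; the helper routine and all parameter values are illustrative assumptions, not part of the paper.

```python
import numpy as np

def rref_mod_q(M, q):
    """Reduced row echelon form over the prime field F_q; returns (RRE(M), rank M)."""
    R = M.copy() % q
    rank, rows, cols = 0, M.shape[0], M.shape[1]
    for c in range(cols):
        piv = next((r for r in range(rank, rows) if R[r, c]), None)
        if piv is None:
            continue
        R[[rank, piv]] = R[[piv, rank]]
        R[rank] = (R[rank] * pow(int(R[rank, c]), -1, q)) % q
        for r in range(rows):
            if r != rank and R[r, c]:
                R[r] = (R[r] - R[r, c] * R[rank]) % q
        rank += 1
    return R, rank

def ammc_encode(U, n, m, v, q):
    """X = [0 | X_bar]: v all-zero trap columns/rows, an (n-v) x (n-v) identity pilot, then U."""
    X = np.zeros((n, m), dtype=int)
    X[v:, v:n] = np.eye(n - v, dtype=int)
    X[v:, n:] = U % q
    return X

def ammc_decode(Y, n, m, v, t, q):
    """Decode via RRE(Y) = RRE(S); declare failure unless the first n columns have rank n-v+t."""
    if rref_mod_q(Y[:, :n], q)[1] != n - v + t:
        return None
    R, _ = rref_mod_q(Y, q)
    return R[t:t + (n - v), n:]               # rows t..t+n-v-1, last m-n columns hold U

# Illustrative use:
rng = np.random.default_rng(3)
q, n, m, t, v = 5, 6, 14, 1, 2
U = rng.integers(0, q, size=(n - v, m - n))
X = ammc_encode(U, n, m, v, q)
W = (rng.integers(0, q, size=(n, t)) @ rng.integers(0, q, size=(t, m))) % q
while True:                                   # uniform nonsingular A by rejection
    A = rng.integers(0, q, size=(n, n))
    if rref_mod_q(A, q)[1] == n:
        break
Y = (A @ (X + W)) % q
U_hat = ammc_decode(Y, n, m, v, t, q)
print("recovered:", U_hat is not None and np.array_equal(U_hat, U % q))
```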
VI. EXTENSIONS

In this section, we discuss possible extensions of the results and models presented in the previous sections.

A. Dependent Transfer Matrices

As discussed in Section V, the AMMC is equivalent to a channel of the form (2) where A ∈ T_{n×n} and D ∈ T_{n×t} are chosen uniformly at random and independently from each other. Suppose now that the channel is the same, except for the fact that A and D are not independent. It should be clear that the capacity of the channel cannot be smaller than that of the AMMC. For instance, one can always convert this channel into an AMMC by employing randomization at the source. (This is, in fact, a natural procedure in any random network coding system.) Let X = TX', where T ∈ T_{n×n} is chosen uniformly at random and independently from any other variables. Then A' = AT is uniform on T_{n×n} and independent from D. Thus, the channel given by Y = A'X' + DZ is an AMMC. Note that our coding scheme does not rely on any particular statistics of A given X and W (except the assumption that A is invertible) and therefore works unchanged in this case.

B. Transfer Matrix Invertible but Nonuniform

The model for the AMMC assumes that the transfer matrix A ∈ T_{n×n} is chosen uniformly at random. In a realistic network coding system, the transfer matrix may be a function of both the network code and the network topology, and therefore may not have a uniform distribution. Consider the case where A is chosen according to an arbitrary probability distribution on T_{n×n}. It should be clear that the capacity can only increase as compared with the AMMC, since less "randomness" is introduced in the channel. The best possible situation is to have a constant A, in which case the channel becomes exactly an AMC. Again, note that our coding scheme for the AMMC is still applicable in this case.

C. Nonuniform Packet Errors

When expressed in the form (2), the models for both the AMC and the AMMC assume that the matrix Z is uniformly distributed on T_{t×m}. In particular, each error packet is uniformly distributed on F_q^{1×m} \ {0}. In a realistic situation, however, it may be the case that error packets of low weight are more likely to occur. Consider a model identical to the AMC or the AMMC except for the fact that the matrix Z is chosen according to an arbitrary probability distribution on T_{t×m}. Once again, it should be clear that the capacity can only increase. Note that the exact capacity in Proposition 4 and the upper bound of Theorem 6 can be easily modified to account for this case (by replacing log_q |T_{n×m,t}| with the entropy of W).

Although our coding scheme in principle does not hold in this more general case, we can easily convert the channel into an AMC or AMMC by applying a random transformation at the source (and its inverse at the destination). Let X = X'T, where T ∈ T_{m×m} is chosen uniformly at random and independently from any other variables. Then

\[
Y' = YT^{-1} = (AX + DZ)T^{-1} = AX' + DZ'
\]

where Z' = ZT^{-1}. Since T_{m×m} acts (by right multiplication) transitively on T_{t×m}, we have that Z' is uniform on T_{t×m}. Thus, we obtain precisely an AMMC (or AMC) and the assumptions of our coding scheme hold.

Note, however, that, depending on the error model, the capacity may be much larger than what can be achieved by the scheme described above. For instance, if the rows of Z are constrained to have weight at most s (otherwise chosen, say, uniformly at random), then the capacity would increase by approximately \((m - s - \log_q \binom{m}{s})\, t\), which might be a substantial amount if s is small.
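A minimal sketch of the source/destination randomization just described is given below for a prime field; the specific low-weight error pattern, the choice A = I (the AMC case), and all dimensions are illustrative assumptions.

```python
import numpy as np

def inv_mod_q(T, q):
    """Inverse of a square matrix over the prime field F_q via Gauss-Jordan; raises if singular."""
    dim = T.shape[0]
    M = np.concatenate([T % q, np.eye(dim, dtype=int)], axis=1)
    rank = 0
    for c in range(dim):
        piv = next((r for r in range(rank, dim) if M[r, c]), None)
        if piv is None:
            raise ValueError("singular matrix")
        M[[rank, piv]] = M[[piv, rank]]
        M[rank] = (M[rank] * pow(int(M[rank, c]), -1, q)) % q
        for r in range(dim):
            if r != rank and M[r, c]:
                M[r] = (M[r] - M[r, c] * M[rank]) % q
        rank += 1
    return M[:, dim:]

rng = np.random.default_rng(4)
q, n, m, t = 5, 4, 10, 1
while True:                                    # draw T uniformly from the invertible m x m matrices
    T = rng.integers(0, q, size=(m, m))
    try:
        T_inv = inv_mod_q(T, q)
        break
    except ValueError:
        continue

X_prime = rng.integers(0, q, size=(n, m))      # the matrix the code designer actually chose
X = (X_prime @ T) % q                          # transmitted matrix after source randomization
Z = np.zeros((t, m), dtype=int); Z[0, 0] = 1   # a nonuniform, low-weight error packet
A = np.eye(n, dtype=int)                       # AMC case: A = I
D = rng.integers(0, q, size=(n, t))
Y = (A @ X + D @ Z) % q
Y_prime = (Y @ T_inv) % q                      # = A X' + D Z', with Z' = Z T^{-1}
# Because T is uniform over the invertible matrices, Z' is uniform over the full-rank
# t x m matrices, so the receiver may now apply the AMC/AMMC decoder to Y'.
assert np.array_equal(Y_prime, (A @ X_prime + D @ ((Z @ T_inv) % q)) % q)
```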
D. Error Matrix with Variable Rank (≤ t)

The model we considered for the AMC and the AMMC assumes an error matrix W whose rank is known and equal to t. It is useful to consider the case where rank W is allowed to vary, while still bounded by t. More precisely, we assume that W is chosen uniformly at random from T_{n×m,R}, where R ∈ {0, ..., t} is a random variable with probability distribution P[R = r] = p_r. Since

\[
H(W) = H(W, R) = H(R) + H(W|R) = H(R) + \sum_r p_r H(W|R = r) = H(R) + \sum_r p_r \log_q |T_{n\times m,r}| \le H(R) + \log_q |T_{n\times m,t}|,
\]

we conclude that the capacities of the AMC and the AMMC may be reduced by at most H(R) ≤ log_q(t + 1). This loss is asymptotically negligible for large q and/or large m, so the expressions (8), (9), (19) and (20) remain unchanged.

The steps for decoding and computing the probability of error trapping failure also remain the same, provided we replace t by R. The only difference is that now decoding errors may occur. More precisely, suppose that rank B_1 Z_1 = t' < t. A necessary condition for success is that rank B_1 Z = rank B Z_1 = t'. If this condition is not satisfied, then a decoding failure is declared. However, if the condition is true, then the decoder cannot determine whether t' = R < t (an error trapping success) or t' < R ≤ t (an error trapping failure), and must proceed assuming the former case. If the latter case turns out to be true, we would have an undetected error. Thus, for this model, the expression (12) gives a bound on the probability that decoding is not successful, i.e., that either an error or a failure occurs.

E. Infinite-Packet-Length Channel and Infinite-Batch-Size Channel

We now extend our results to the infinite-packet-length AMC and AMMC and the infinite-batch-size AMC. (Note that, as pointed out in Section III, there is little justification to consider an infinite-batch-size AMMC.)

From the proof of Proposition 4 and the proof of Corollary 10, it is straightforward to see that

\[
\lim_{m\to\infty} \bar{C}_{\mathrm{AMMC}} = \lim_{m\to\infty} \bar{C}_{\mathrm{AMC}} = (n - t)/n, \qquad \lim_{n\to\infty} \bar{C}_{\mathrm{AMC}} = (m - t)/m.
\]

It is not straightforward, however, to obtain capacity-achieving schemes for these channels. The schemes described in Sections IV and V for the infinite-rank AMC and AMMC, respectively, use an error trap whose size (in terms of columns and rows) grows proportionally with m (or n). While this is necessary for achieving vanishingly small error probability, it also implies that these schemes are not suitable for the infinite-packet-length channel (where m → ∞ but not n) or the infinite-batch-size channel (where n → ∞ but not m). In these situations, the proposed schemes can be adapted by replacing the data matrix and part of the error trap with a maximum-rank-distance (MRD) code [18].

Consider first an infinite-packet-length AMC. Let the transmitted matrix be given by

\[
X = \begin{bmatrix} 0_{n\times v} & x \end{bmatrix} \qquad (21)
\]

where x ∈ F_q^{n×(m−v)} is a codeword of a matrix code C. If (column) error trapping is successful then, under the terminology of [10], the decoding problem for C amounts to the correction of t erasures. It is known that, for m − v ≥ n, an MRD code C ⊆ F_q^{n×(m−v)} with rate (n − t)/n can correct exactly t erasures (with zero probability of error) [10]. Thus, decoding fails if and only if column trapping fails.

Similarly, for an infinite-batch-size AMC, let the transmitted matrix be given by

\[
X = \begin{bmatrix} 0_{v\times m} \\ x \end{bmatrix}
\]

where x ∈ F_q^{(n−v)×m} is a codeword of a matrix code C. If (row) error trapping is successful then, under the terminology of [10], the decoding problem for C amounts to the correction of t deviations. It is known that, for n − v ≥ m, an MRD code C ⊆ F_q^{(n−v)×m} with rate (m − t)/m can correct exactly t deviations (with zero probability of error) [10]. Thus, decoding fails if and only if row trapping fails.

Finally, for the infinite-packet-length AMMC, it is sufficient to prepend to (21) an identity matrix, i.e.,

\[
X = \begin{bmatrix} I_{n\times n} & 0_{n\times v} & x \end{bmatrix}.
\]

The same reasoning as for the infinite-packet-length AMC applies here, and the decoder in [10] is also applicable in this case. For more details on the decoding of an MRD code combined with an error trap, we refer the reader to [19]. The decoding complexity is O(tn^2 m) or O(tm^2 n), whichever is smaller [10]. In all cases, the schemes have probability of error upper bounded by t/q^{1+v−t} and therefore are capacity-achieving.

VII. CONCLUSIONS

We have considered the problem of reliable communication over certain additive matrix channels inspired by network coding. These channels provide a reasonable model for both coherent and random network coding systems subject to random packet errors.
In particular, for an additive-multiplicative matrix channel, we have obtained upper and lower bounds on capacity for any channel parameters and asymptotic capacity expressions in the limit of large field size and/or large matrix size; roughly speaking, we need to use t redundant packets in order to be able to correct up to t injected error packets. We have also presented a simple coding scheme that achieves capacity in these limiting cases while requiring a significantly low decoding complexity; in fact, decoding amounts simply to performing Gauss-Jordan elimination, which is already the standard decoding procedure for random network coding. Compared to previous work on correction of adversarial errors (where approximately 2t redundant packets are required), the results of this paper show an improvement of t redundant packets that can be used to transport data, if errors occur according to a probabilistic model.

Several questions remain open and may serve as an interesting avenue for future research:

• Our results for the AMMC assume that the transfer matrix A is always nonsingular. It may be useful to consider a model where rank A is a random variable. Note that, in this case, one cannot expect to achieve reliable (and efficient) communication with a one-shot code, as the channel realization would be unknown at the transmitter. Thus, in order to achieve capacity under such a model (even with arbitrarily large q or m), it is strictly necessary to consider multi-shot codes.

• As pointed out in Section VI-C, our proposed coding scheme may not be even close to optimal when packet errors occur according to a nonuniform probability model. Especially in the case of low-weight errors, it is an important question how to approach capacity with a low-complexity coding scheme. It might also be interesting to know whether one-shot codes are still useful in this case.

• Another important assumption of this paper is the bounded number t < n of packet errors. What if t is unbounded (although with a low number of errors being more likely than a high number)? While the capacity of such a channel may not be too hard to approximate (given the results of this paper), finding a low-complexity coding scheme seems a very challenging problem.

ACKNOWLEDGEMENTS

We would like to thank the associate editor and the anonymous reviewers for their helpful comments.

REFERENCES

[1] R. Ahlswede, N. Cai, S.-Y. R. Li, and R. W. Yeung, "Network information flow," IEEE Trans. Inf. Theory, vol. 46, no. 4, pp. 1204–1216, Jul. 2000.
[2] S.-Y. R. Li, R. W. Yeung, and N. Cai, "Linear network coding," IEEE Trans. Inf. Theory, vol. 49, no. 2, pp. 371–381, Feb. 2003.
[3] R. Koetter and M. Médard, "An algebraic approach to network coding," IEEE/ACM Trans. Netw., vol. 11, no. 5, pp. 782–795, Oct. 2003.
[4] T. Ho, M. Médard, R. Koetter, D. R. Karger, M. Effros, J. Shi, and B. Leong, "A random linear network coding approach to multicast," IEEE Trans. Inf. Theory, vol. 52, no. 10, pp. 4413–4430, Oct. 2006.
[5] D. Silva and F. R. Kschischang, "On metrics for error correction in network coding," IEEE Trans. Inf. Theory, 2008, to be published. [Online]. Available: http://arxiv.org/abs/0805.3824
[6] D. Silva and F. R. Kschischang, "Adversarial error correction for network coding: Models and metrics," in Proc. 46th Annual Allerton Conf.
on Commun., Control, and Computing, Monticello, IL, Sep. 23–26, 2008, pp. 1246–1253.
[7] N. Cai and R. W. Yeung, "Network error correction, part II: Lower bounds," Commun. Inform. Syst., vol. 6, no. 1, pp. 37–54, 2006.
[8] R. W. Yeung and N. Cai, "Network error correction, part I: Basic concepts and upper bounds," Commun. Inform. Syst., vol. 6, no. 1, pp. 19–36, 2006.
[9] R. Kötter and F. R. Kschischang, "Coding for errors and erasures in random network coding," IEEE Trans. Inf. Theory, vol. 54, no. 8, pp. 3579–3591, Aug. 2008.
[10] D. Silva, F. R. Kschischang, and R. Kötter, "A rank-metric approach to error control in random network coding," IEEE Trans. Inf. Theory, vol. 54, no. 9, pp. 3951–3967, 2008.
[11] A. Montanari and R. Urbanke, "Iterative coding for network coding," 2007, submitted for publication. [Online]. Available: http://arxiv.org/abs/0711.3935
[12] T. M. Cover and J. A. Thomas, Elements of Information Theory. John Wiley & Sons, 1991.
[13] M. Siavoshani, C. Fragouli, and S. Diggavi, "Noncoherent multisource network coding," in Proc. IEEE Int. Symp. Information Theory, Toronto, Canada, Jul. 6–11, 2008, pp. 817–821.
[14] S. Jaggi, M. Langberg, S. Katti, T. Ho, D. Katabi, M. Médard, and M. Effros, "Resilient network coding in the presence of Byzantine adversaries," IEEE Trans. Inf. Theory, vol. 54, no. 6, pp. 2596–2603, Jun. 2008.
[15] P. A. Chou, Y. Wu, and K. Jain, "Practical network coding," in Proc. Allerton Conf. on Commun., Control, and Computing, Monticello, IL, Oct. 2003, pp. 40–49.
[16] R. Lidl and H. Niederreiter, Finite Fields. Reading, MA: Addison-Wesley, 1983.
[17] D. S. Dummit and R. M. Foote, Abstract Algebra, 3rd ed. John Wiley & Sons, 2004.
[18] E. M. Gabidulin, "Theory of codes with maximum rank distance," Probl. Inform. Transm., vol. 21, no. 1, pp. 1–12, 1985.
[19] D. Silva and F. R. Kschischang, "A key-based error control scheme for network coding," in Proc. 11th Canadian Workshop Inform. Theory, Ottawa, Canada, May 13–15, 2009, pp. 5–8.

Danilo Silva (S'06) received the B.Sc. degree from the Federal University of Pernambuco, Recife, Brazil, in 2002, the M.Sc. degree from the Pontifical Catholic University of Rio de Janeiro (PUC-Rio), Rio de Janeiro, Brazil, in 2005, and the Ph.D. degree from the University of Toronto, Toronto, Canada, in 2009, all in electrical engineering. He is currently a Postdoctoral Fellow at the University of Toronto. His research interests include channel coding, information theory, and network coding.

Frank R. Kschischang (S'83–M'91–SM'00–F'06) received the B.A.Sc. degree (with honors) from the University of British Columbia, Vancouver, BC, Canada, in 1985 and the M.A.Sc. and Ph.D. degrees from the University of Toronto, Toronto, ON, Canada, in 1988 and 1991, respectively, all in electrical engineering. He is a Professor of Electrical and Computer Engineering and Canada Research Chair in Communication Algorithms at the University of Toronto, where he has been a faculty member since 1991. During 1997–1998, he was a Visiting Scientist at the Massachusetts Institute of Technology, Cambridge, and in 2005 he was a Visiting Professor at the ETH, Zürich, Switzerland. His research interests are focused on the area of channel coding techniques. Prof. Kschischang was the recipient of the Ontario Premier's Research Excellence Award.
From 1997 to 2000, he served as an Associate Editor for Coding Theory for the IEEE TRANSACTIONS ON INFORMATION THEORY. He also served as Technical Program Co-Chair for the 2004 IEEE International Symposium on Information Theory (ISIT), Chicago, IL, and as General Co-Chair for ISIT 2008, Toronto.

Ralf Kötter (S'92–M'95–SM'06–F'09) received a Diploma in Electrical Engineering from the Technical University Darmstadt, Germany, in 1990 and a Ph.D. degree from the Department of Electrical Engineering at Linköping University, Sweden. From 1996 to 1997, he was a Visiting Scientist at the IBM Almaden Research Laboratory in San Jose, CA. He was a Visiting Assistant Professor at the University of Illinois at Urbana-Champaign and a Visiting Scientist at CNRS in Sophia-Antipolis, France, from 1997 to 1998. In the years 1999–2006 he was a member of the faculty of the University of Illinois at Urbana-Champaign, where his research interests included coding and information theory and their application to communication systems. In 2006 he joined the faculty of the Technische Universität München, Munich, Germany, as the Head of the Institute for Communications Engineering. He served as an Associate Editor for both the IEEE TRANSACTIONS ON COMMUNICATIONS and the IEEE TRANSACTIONS ON INFORMATION THEORY. He received an IBM Invention Achievement Award in 1997, an NSF CAREER Award in 2000, an IBM Partnership Award in 2001, and a 2006 Xerox award for faculty research. He is co-recipient of the 2004 Information Theory Society Best Paper Award, of the 2004 IEEE Signal Processing Magazine Best Paper Award, and of the 2009 Joint Communications Society and Information Theory Society Best Paper Award. He received the Vodafone Innovationspreis in 2008. Ralf Kötter died in February 2009.