"Real" Slepian-Wolf Codes

We provide a novel achievability proof of the Slepian-Wolf theorem for i.i.d. sources over finite alphabets. We demonstrate that random codes that are linear over the real field achieve the classical Slepian-Wolf rate-region. For finite alphabets we …

Authors: Bikash Kumar Dey, Sidharth Jaggi, Michael Langberg

"Real" Slepian-Wolf Codes

Bikash Kumar Dey, Sidharth Jaggi, and Michael Langberg

Abstract: We provide a novel achievability proof of the Slepian-Wolf theorem for i.i.d. sources over finite alphabets. We demonstrate that random codes that are linear over the real field achieve the classical Slepian-Wolf rate-region. For finite alphabets we show that typicality decoding is equivalent to solving an integer program. Minimum-entropy decoding is also shown to achieve exponentially small probability of error. The techniques used may be of independent interest for code design for a wide class of information theory problems, and for the field of compressed sensing.

I. INTRODUCTION

A well-known result by Slepian and Wolf [2] characterizes the rate-region for near-lossless source coding of distributed sources. The result demonstrates that if two (or more) sources possess correlated data, even independent encoding of the sources' data can achieve essentially the same performance as when the sources encode jointly. This result has important implications for information-theoretic problems as diverse as sensor networks [3], secrecy [4], and low-complexity video encoding [5]. Unfortunately, for the distributed source coding problem, codes that are provably both rate-optimal and computationally efficient to implement are hard to come by. Section II gives a partial history of results for the Slepian-Wolf (SW) problem. In this work we provide novel codes that asymptotically achieve the SW rate-region with vanishing probability of error. Our encoding procedure comprises random linear operations over the real field $\mathbb{R}$; the resulting codes are hence called Real Slepian-Wolf Codes, or RSWCs. In contrast, most other codes in the literature operate over appropriate finite fields $\mathbb{F}_q$.
We demonstrate that RSWCs can be used in a way that enables the receiver to decode the sources' information by solving a set of integer programs (IPs). Besides being interesting in their own right as a new class of codes achieving the SW rate-region, the relation between RSWCs and IPs has some intriguing implications. In general, IPs are computationally intractable to solve. However, our code design gives us significant flexibility in choosing the particular IPs corresponding to our codes. That is, we show that "almost all" RSWCs result in IPs that have "good" performance for the SW problem. But there are well-studied classes of IPs that are known to be computationally tractable (e.g., IPs corresponding to totally unimodular matrices [6]). It is thus conceivable that suitably chosen RSWCs may be decodable with low computational complexity.

*The work in this paper was presented in part at ISIT 2008, Toronto, Canada, July 2008 [1]. B. K. Dey is with the Department of Electrical Engineering, Indian Institute of Technology Bombay, Mumbai, India, 400076, email: bikash@ee.iitb.ac.in. S. Jaggi is with the Department of Information Engineering, Chinese University of Hong Kong, Shatin, N.T., Hong Kong, email: jaggi@ie.cuhk.edu.hk. M. Langberg is with the Computer Science Division, Open University of Israel, 108 Ravutski St., Raanana 43107, Israel, email: mikel@openu.ac.il.

Linear SW codes over finite fields were introduced in [7], where they were shown to achieve the SW rate-region. Decoding such codes is equivalent to finding a vertex of a hypercube satisfying certain combinatorial properties; such problems are computationally intractable. Our SW codes are linear over $\mathbb{R}$. Though decoding our codes may still be difficult, we can use tools from the mature field of convex optimization for decoding our codes.
Our work also has direct implications for the new field of Compressed Sensing (CS). In the CS setup, $N$ sources each generate a single real number. The resulting length-$N$ sequence is $k$-sparse, i.e., it can be written with at most $k \ll N$ non-zero coefficients in a prespecified basis. A typical result [8] in this setup shows that if a receiver gets $O(k \log N)$ random linear combinations over $\mathbb{R}$ of the sources' sequence, it can, with high probability, reconstruct the source sequence exactly in a computationally efficient manner by solving a linear program. The CS setup is quite similar to that of the RSWCs we design: the source sequence contains a large amount of redundancy, and a random $\mathbb{R}$-linear mixture of the sequence suffices for exact reconstruction via optimization techniques. There are, however, two major differences. First, RSWCs operate at information-theoretically optimal rates, whereas CS codes are bounded away from such performance. Second, CS codes are computationally tractable, whereas we are currently not aware of efficient decoding techniques for RSWCs. We think this tradeoff between computational efficiency and rate-optimality is interesting and worthy of further investigation.

In Section II, we discuss some background and tools to be used in the subsequent sections. In Section III, we present the construction of our RSWCs and the related main results. These results are then proved in Sections IV and V. In Section VI, we present the direct construction of RSWCs for any point on the Slepian-Wolf rate-region without time-sharing between the corner points. The universal minimum-entropy decoding algorithm is shown to work for our RSWCs in Section VII. Section VIII shows that our RSWCs achieve the rate-region of the more general normal source networks without helpers introduced in [9]. Finally, Section IX concludes the paper.
II. BACKGROUND AND DEFINITIONS

Shannon's seminal source coding theorem [10] demonstrates that a sequence of discrete random variables can essentially be compressed down to the entropy of the underlying probability distribution generating the sequence. Of the many extensions sparked by this paper, the Slepian-Wolf theorem [2] is the one this paper builds on.

A. Slepian-Wolf Theorem for i.i.d. sources [2]

Problem Statement: Two sources, named Xavier and Yvonne, generate two sequences of discrete random variables, $X \triangleq X_1, X_2, \ldots, X_n$ over the finite alphabet $\mathcal{X}$ and $Y \triangleq Y_1, Y_2, \ldots, Y_n$ over the finite alphabet $\mathcal{Y}$, respectively. The sequence $(X, Y)$ is assumed to be i.i.d. with a joint distribution $p_{X,Y}(x,y)$ that is known in advance to both Xavier and Yvonne. The corresponding marginal distributions over $\mathcal{X}$ and $\mathcal{Y}$ are denoted by $p_X(x)$ and $p_Y(y)$, respectively. Xavier and Yvonne wish to communicate $(X, Y)$ to a receiver, Zorba. To this end, Xavier uses his encoder to transmit to Zorba a message that is a function only of $X$ and $p_{X,Y}(x,y)$. Similarly, Yvonne uses her encoder to transmit a message that is a function only of $Y$ and $p_{X,Y}(x,y)$. Zorba uses his decoder to attempt to reconstruct $(X, Y)$. Xavier and Yvonne's encoders and Zorba's decoder comprise a SW code $C$. The SW code $C$ is said to be near-lossless if Zorba's reconstruction of $(X, Y)$ is correct with a probability of error (over $p_{X,Y}(x,y)$) that is asymptotically negligible in the block-length $n$. The rate-pair $(R_X, R_Y)$ is said to be achievable for the SW problem if for every $\epsilon > 0$ there exists a near-lossless code $C$ for which the average (over $p_{X,Y}(x,y)$) numbers of bits that $C$ requires Xavier and Yvonne to transmit to Zorba are at most $n(R_X + \epsilon)$ and $n(R_Y + \epsilon)$, respectively.

November 14, 2018 DRAFT
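Theorem 1 below characterizes the achievable region through three entropies of $p_{X,Y}$. For a concrete feel, these quantities are easy to compute for any small joint pmf. The following sketch is our own illustration (the example distribution is ours, not from the paper):

```python
import math

def entropies(p_xy):
    """Given a joint pmf p_xy[x][y], return (H(X|Y), H(Y|X), H(X,Y)) in bits."""
    h_xy = -sum(p * math.log2(p) for row in p_xy for p in row if p > 0)
    p_x = [sum(row) for row in p_xy]
    p_y = [sum(col) for col in zip(*p_xy)]
    h_x = -sum(p * math.log2(p) for p in p_x if p > 0)
    h_y = -sum(p * math.log2(p) for p in p_y if p > 0)
    # Chain rule: H(X|Y) = H(X,Y) - H(Y) and H(Y|X) = H(X,Y) - H(X).
    return h_xy - h_y, h_xy - h_x, h_xy

# Doubly symmetric binary source with crossover probability 0.1 (our example).
p_xy = [[0.45, 0.05],
        [0.05, 0.45]]
h_x_given_y, h_y_given_x, h_joint = entropies(p_xy)
print(h_x_given_y, h_y_given_x, h_joint)
```

For this source, $H(X) = H(Y) = 1$ bit and $H(X|Y) = H(Y|X) \approx 0.469$ bits, so the sum-rate bound $H(X,Y) \approx 1.469$ bits is well below the 2 bits that separate, correlation-blind compression would need.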
The set of all achievable rate-pairs is called the rate-region. Slepian and Wolf's characterization of the rate-region is remarkably clean.

Theorem 1: [2] The rate-region for the Slepian-Wolf problem is given by the intersection of
$$R_X \ge H(X|Y), \qquad R_Y \ge H(Y|X), \qquad R_X + R_Y \ge H(X,Y). \qquad (1)$$
Here $H(X|Y)$ and $H(Y|X)$ denote the conditional entropies and $H(X,Y)$ denotes the joint entropy of $(X,Y)$ (implicitly, over the joint distribution $p_{X,Y}(x,y)$).

B. Linear SW codes over finite fields

The SW codes in [2] have computational complexity that is exponential for both encoding and decoding. An improvement was made in [7], where it was shown that random linear encoders suffice. We briefly restate that result here, restricting ourselves to the case $\mathcal{X} = \mathcal{Y} = \{0, 1\}$ for simplicity. Let $D_X$ and $D_Y$ be, respectively, $\lceil n(R_X+\epsilon)\rceil \times n$ and $\lceil n(R_Y+\epsilon)\rceil \times n$ matrices over the finite field $\mathbb{F}_2$, with each entry of both matrices chosen i.i.d. as either 0 or 1 with probability $1/2$. Here $\epsilon$ is an arbitrary positive constant. Abusing notation, let $X$ and $Y$ also denote length-$n$ column vectors over $\mathbb{F}_2$. Xavier and Yvonne's encoders are then defined respectively via the matrix multiplications $D_X X$ and $D_Y Y$, and their messages to Zorba are respectively the resulting column vectors.

We now define Zorba's decoder. For an arbitrary distribution $p_{X,Y}(x,y)$ over finite alphabets, let the strongly $\epsilon$-jointly typical set $A^n_{\epsilon, p_{X,Y}}$ [11] (henceforth simply called the typical set) be the set of all length-$n$ sequences $(X, Y)$ such that the empirical distribution induced by $(X, Y)$ differs component-wise from $p_{X,Y}(x,y)$ by at most $\epsilon/(|\mathcal{X}||\mathcal{Y}|)$.
That is,
$$A^n_{\epsilon, p_{X,Y}} \triangleq \left\{ (x, y) : \left| \frac{N_{(x,y)}(a,b)}{n} - p_{X,Y}(a,b) \right| < \frac{\epsilon}{|\mathcal{X}||\mathcal{Y}|} \ \text{for every } (a,b) \in \mathcal{X} \times \mathcal{Y} \right\},$$
where $N_{(x,y)}(a,b)$ denotes the number of component pairs $(x_i, y_i)$ in $(x, y)$ that equal $(a,b)$. For simplicity of notation we denote $A^n_{\epsilon, p_{X,Y}}$ by $A_\epsilon$.

Zorba checks whether there exists a unique length-$n$ sequence pair $(\hat X, \hat Y)$ satisfying two conditions: first, that $D_X \hat X$ and $D_Y \hat Y$ respectively match the messages transmitted by Xavier and Yvonne; second, that $(\hat X, \hat Y)$ lies within $A_\epsilon$. If both conditions are satisfied for exactly one sequence $(\hat X, \hat Y)$, Zorba outputs $(\hat X, \hat Y)$; else he declares a decoding error.

Then [7] shows the following result.

Theorem 2: [7] For each rate-pair $(R_X, R_Y)$ in the region defined by (1) and sufficiently large $n$, with high probability over the choices of $D_X$ and $D_Y$ the corresponding SW code is near-lossless.

Many of the SW codes in the literature build on such encoders that are linear over a finite field. Some such codes use iteratively decodable channel codes to attain performance that is empirically "good", but performance guarantees have not been proven (e.g., [12]). Other codes use recent theoretical advances in channel codes to produce near-lossless codes that achieve any point in the SW rate-region, but cannot give guarantees on computational complexity (e.g., [13]).

C. Linear codes over real fields

As mentioned in the introduction, Compressed Sensing codes operate over real (and complex) fields, and are structurally similar to the codes proposed in this work.
The primary difference between the two sets of results is that our focus is on achieving information-theoretically optimal performance (at the cost of potentially high decoding complexity), whereas CS codes have lower decoding complexity at the cost of non-optimal rates. Some intriguing results on CS codes can be found in [14], [8]. Concurrently, codes over the real field $\mathbb{R}$ also seem to have applications for the channel coding problem. Using significantly different techniques, Tao et al. [15] obtained channel codes that can be decoded by solving a linear program (LP). Also, lattice codes have been shown to achieve capacity for the AWGN channel [16].

III. RSWC MODEL

As is common in the SW literature [11], we focus on just the point $(H(X), H(Y|X))$ in the SW rate-region. Time-sharing between this and the symmetric point $(H(X|Y), H(Y))$ enables us to achieve all points in the rate-region. Thus Xavier encodes his data $X$ using a classical lossless source code, and Zorba decodes it losslessly. We henceforth discuss only Yvonne's RSWC encoder for $Y$ and Zorba's corresponding decoder. In Section VI we show how to generalize our proof techniques to get codes that achieve any point in the SW rate-region without time-sharing. We consider only $\mathcal{X}$ and $\mathcal{Y}$ that are ordered finite subsets of $\mathbb{R}$.

RSWC Encoder: We define an $m \times n$ encoding matrix $D$ over $\mathbb{R}$. Here $m$ is a code-design parameter to be specified later, and $D$ is chosen as follows. Each component $D_{ij}$ of $D$ is chosen randomly from a finite set $\mathcal{D}$. More precisely, each element of $D$ is chosen i.i.d. from $\mathcal{D}$ according to a distribution $p_D$. The set $\mathcal{D}$ can be any arbitrary finite subset of $\mathbb{R}$, and the distribution $p_D$ can be chosen arbitrarily on $\mathcal{D}$, as long as at least two elements of $\mathcal{D}$ have non-zero probability.
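One admissible choice is $\mathcal{D} = \{-1, +1\}$ with uniform $p_D$, which also satisfies the zero-mean assumption made below. A quick Monte Carlo sketch (ours, with hypothetical parameter values) illustrates the concentration behavior this choice induces: the sketch values $D_i y$ essentially never exceed $n^{0.5+\epsilon}$ in magnitude, which is what the encoder's quantization range exploits.

```python
import random

rng = random.Random(1)
n, eps, trials = 400, 0.25, 2000
y = [rng.choice((0, 1)) for _ in range(n)]   # a fixed source vector (our example)
threshold = n ** (0.5 + eps)                 # boundary n^{0.5+eps}

# Draw fresh rows D_i with entries uniform on {-1, +1} and count how often
# the inner product D_i y falls outside (-threshold, threshold).
overflow = 0
for _ in range(trials):
    row = [rng.choice((-1, 1)) for _ in range(n)]
    if abs(sum(d * yj for d, yj in zip(row, y))) > threshold:
        overflow += 1
print(overflow / trials)   # vanishingly small for this seed and these sizes
```

Here the standard deviation of $D_i y$ is about $\sqrt{n/2} \approx 14$, while the threshold is $400^{0.75} \approx 89$, roughly a $6\sigma$ event; the empirical overflow rate is essentially zero.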
For ease of proof, we assume that $p_D$ is zero-mean; the more general case requires only small changes in the proof details. The particular values of $\mathcal{D}$ and $p_D$ can be chosen according to the application. We denote the $i$-th row of $D$ by $D_i$. For a fixed block-length $n$, Yvonne's data is arranged as a column vector $Y \triangleq (Y_1, Y_2, \ldots, Y_n)^T$. To encode, $Y$ is multiplied by $D$ to get a length-$m$ real vector $U \triangleq DY$. We denote the real interval $(-n^{0.5+\epsilon}, n^{0.5+\epsilon})$ by $I_q$. Each component $U_i$ of $U$ is uniformly quantized by dividing $I_q$ into steps of size $\Delta_n = 2n^{-\epsilon}$. Thus $\lceil (0.5+2\epsilon)\log n \rceil$ bits suffice for this quantization. Values outside the range $I_q$ are quantized to the quantization levels farthest from the origin. Here and throughout the paper, $\log(\cdot)$ denotes the binary logarithm, and $\epsilon$ is a code-design parameter that can be used to trade off between the probability of error and the rate of the RSWC; it can be chosen as any arbitrarily small positive real number. The quantized value of $U_i$ is denoted by $\hat U_i$, and the corresponding length-$m$ quantized vector is denoted by $\hat U$. We take $m = \lceil n(H(Y|X)+3\epsilon)/(0.5\log n)\rceil$, since then Yvonne's encoder encodes at about $H(Y|X)$ bits per symbol. Thus the total number of bits Yvonne transmits to Zorba equals $m\lceil(0.5+2\epsilon)\log n\rceil$, which for all sufficiently large $n$ can be bounded from above by $nH(Y|X)+\rho\epsilon n$ for a universal constant $\rho$.

RSWC Decoder: Zorba first decodes $X = x$. Suppose he receives $\hat U = \hat u$ from Yvonne. He finds a vector $y$ that is strongly $\epsilon$-jointly typical with $x$ and for which $\widehat{Dy} = \hat u$. If there is no such $y$, or there is more than one such $y$, he declares a decoding error. The ensemble of RSWC encoder-decoder pairs described above is denoted by $C(\epsilon, n, p_{X,Y}, p_D)$.
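The encoder above is just a random linear sketch followed by a scalar quantizer. The following is our own numerical sketch of it, under the assumption (ours, for concreteness) that $\mathcal{D} = \{-1,+1\}$ with uniform $p_D$:

```python
import math
import random

def rswc_encode(y, m, eps, rng):
    """Sketch of Yvonne's RSWC encoder: U = D y with D_ij i.i.d. uniform on
    {-1, +1} (a zero-mean choice of (D, p_D)), then uniform quantization of
    each U_i over I_q = (-n^{0.5+eps}, n^{0.5+eps}) with step Delta_n = 2 n^{-eps}.
    Values outside I_q are clamped to the extreme levels."""
    n = len(y)
    half = n ** (0.5 + eps)                  # half-width of I_q
    step = 2 * n ** (-eps)                   # Delta_n
    levels = math.ceil(2 * half / step)      # n^{0.5+2*eps} bins, so about
                                             # (0.5 + 2*eps) * log2(n) bits each
    D = [[rng.choice((-1, 1)) for _ in range(n)] for _ in range(m)]
    u = [sum(d * yj for d, yj in zip(row, y)) for row in D]

    def quantize(v):
        idx = math.floor((v + half) / step)  # bin index
        idx = min(max(idx, 0), levels - 1)   # clamp out-of-range values
        return -half + (idx + 0.5) * step    # reconstruction level (bin center)

    return D, [quantize(v) for v in u]

rng = random.Random(0)
y = [rng.choice((0, 1)) for _ in range(64)]
D, u_hat = rswc_encode(y, m=8, eps=0.1, rng=rng)
# Every in-range component of U is reproduced to within Delta_n / 2.
```

Note the rate accounting: $m$ quantized components at roughly $(0.5+2\epsilon)\log n$ bits each gives the $nH(Y|X)+\rho\epsilon n$ total claimed above (here $m$ is passed in directly rather than derived from $H(Y|X)$, since the sketch has no distribution attached).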
The probability of error of $C(\epsilon, n, p_{X,Y}, p_D)$ is defined as the probability, over $p_{X,Y}$ and $p_D$, that Zorba makes or declares a decoding error. The rate of $C(\epsilon, n, p_{X,Y}, p_D)$ is defined as the number of bits that Yvonne transmits to Zorba.

We are now in a position to state and prove our main results. The proofs of these results are presented in the next two sections. Theorem 3 shows that our RSWCs achieve the corner point $(H(X), H(Y|X))$ of the Slepian-Wolf rate-region with exponentially small probability of error.

Theorem 3: For all sufficiently large $n$ there are universal positive constants $c, \rho$ such that the probability of error under typicality decoding and the rate of $C(\epsilon, n, p_{X,Y}, p_D)$ are at most $2^{-cn/\log n}$ and $H(Y|X)+\rho\epsilon$, respectively.

We next show that the decoding of Yvonne's message can be done by solving an IP.

Theorem 4: If Yvonne's source is binary, then the typicality decoding of an RSWC for the point $(H(X), H(Y|X))$ is equivalent to solving an IP.

Further, we show that even for discrete memoryless sources over a larger alphabet $\mathcal{Y}$, the encoder can be implemented as a series of RSWC encoders, each of which is for a derived binary source. The typicality decoder can then be implemented as a series of decoders, each of which is equivalent to solving an IP.

Theorem 5: For any finite alphabet $\mathcal{Y}$, the real SW encoding can be done using $|\mathcal{Y}|-1$ RSWC encoders, so that the typicality decoder can be implemented by solving $|\mathcal{Y}|-1$ IPs.

For any rate-pair in the Slepian-Wolf rate-region, a direct construction of the individual RSWC encoders for Xavier and Yvonne, without time-sharing between the corner points, is presented in Section VI. It is shown that RSWCs constructed this way also achieve the Slepian-Wolf rate-region.

Theorem 6: Any point in the Slepian-Wolf rate-region can be achieved directly by RSWCs without time-sharing.
We also show that RSWCs can be decoded by minimum-entropy decoding.

Theorem 7: For all sufficiently large $n$ there are universal positive constants $c, \rho$ such that the probability of error under minimum-entropy decoding and the rate of $C(\epsilon, n, p_{X,Y}, p_D)$ are at most $2^{-cn/\log n}$ and $H(Y|X)+\rho\epsilon$, respectively.

It is argued in Section VIII that the achievable rate-region of the more general class of source networks known as normal source networks without helpers [9] is also achieved by our RSWCs.

Theorem 8: Random RSWCs achieve the rate-region of any normal source network without helpers.

The above results are proved in the subsequent sections. In the rest of the paper, for simplicity of exposition, many different constants independent of $n$ will be denoted by the same symbol "$c$".

IV. PROOF OF THEOREM 3

The probability of decoding error is given by
$$P^n_e \le P_1 + P_2, \qquad (2)$$
where $P_1$ is the probability that $(X, Y)$ is not strongly jointly $\epsilon$-typical, and $P_2$ is the probability that $(X, Y) \in A_\epsilon$ but there is another $y' \ne Y$ such that $(X, y') \in A_\epsilon$ and $\widehat{DY} = \widehat{Dy'}$.

Bounding $P_1$: Note that for any non-typical sequence $(x, y)$, its type $p_{(x,y)}$ satisfies $|p_{X,Y} - p_{(x,y)}|_1 \ge \epsilon/|\mathcal{X}||\mathcal{Y}|$. So, using $D(p_{X,Y} \| p_{(x,y)}) \ge |p_{X,Y} - p_{(x,y)}|_1^2/(2\ln 2)$ [11, Lemma 12.6.1] and Sanov's theorem [11, Theorem 12.4.1], we have
$$P_1 \le (n+1)^{|\mathcal{X}||\mathcal{Y}|} \exp\left(-n \cdot \frac{\epsilon^2}{2|\mathcal{X}|^2|\mathcal{Y}|^2}\right) \le 2^{-cn} \qquad (3)$$
for some positive constant $c$. The rest of this section focuses on bounding $P_2$ in (2).

Fig. 1. Dependence structure of the lemmas. (Figure omitted; it charts how the lemmas below, Sanov's theorem, Eq. (3), and the Berry-Esseen theorem combine to yield Theorem 3.)

Bounding $P_2$: In the following, we present a sequence of lemmas leading to Lemma 13, which gives a bound on $P_2$.
A dependency "graph" of the lemmas is shown in Fig. 1 to ease understanding. We start with a general lemma proved in the Appendix.

Lemma 9: Let $W_1, W_2, \ldots, W_n$ be a sequence of i.i.d. zero-mean random variables taking values in a finite set $\mathcal{W}$, and let $a \triangleq \max\{|w| : w \in \mathcal{W}\}$. Then for any positive constant $A$,
$$\Pr\left\{\left|\sum_{i=1}^n W_i\right| > A\right\} \le 2(n+1)^{|\mathcal{W}|}\exp\left(-\frac{A^2}{2na^2}\right).$$

We now show some properties of our quantization of $U_i = D_i Y$.

Lemma 10: There exists a positive constant $c$ such that for any $y \in \mathcal{Y}^n$, $\Pr\{|D_i y| > n^{0.5+\epsilon}\} \le 2^{-cn^{2\epsilon}}$.

Proof: Let $y_{\max}$ be the element of $\mathcal{Y}$ with maximum absolute value. For any $y \in \mathcal{Y}$, let $S_y$ be the set of indices $j$ such that $y_j = y$, i.e., $S_y \triangleq \{j : y_j = y\}$. If $|D_i y| = \left|\sum_{y\in\mathcal{Y}} \sum_{j\in S_y} D_{ij} y_j\right| > n^{0.5+\epsilon}$, then for at least one $y$, $\left|\sum_{j\in S_y} D_{ij} y_j\right| > (1/|\mathcal{Y}|)n^{0.5+\epsilon}$. So,
$$\begin{aligned}
\Pr\{|D_i y| > n^{0.5+\epsilon}\}
&\le \Pr\left\{\Big|\sum_{j\in S_y} D_{ij} y_j\Big| > \tfrac{1}{|\mathcal{Y}|}n^{0.5+\epsilon} \text{ for at least one } y\right\}\\
&\le \sum_{y\in\mathcal{Y}} \Pr\left\{\Big|\sum_{j\in S_y} D_{ij} y_j\Big| > \tfrac{1}{|\mathcal{Y}|}n^{0.5+\epsilon}\right\}\\
&= \sum_{y\in\mathcal{Y}} \Pr\left\{\Big|\sum_{j\in S_y} D_{ij}\Big| > \tfrac{1}{|\mathcal{Y}||y|}n^{0.5+\epsilon}\right\}\\
&\le \sum_{y\in\mathcal{Y}} 2(|S_y|+1)^{|\mathcal{D}|}\exp\left(-\frac{1}{2|S_y|\alpha^2}\cdot\frac{n^{1+2\epsilon}}{|\mathcal{Y}|^2|y|^2}\right) \qquad (4)\\
&\le \sum_{y\in\mathcal{Y}} 2(n+1)^{|\mathcal{D}|}\exp\left(-\frac{n^{1+2\epsilon}}{2n\alpha^2|\mathcal{Y}|^2|y_{\max}|^2}\right) \qquad (5)\\
&\le |\mathcal{Y}|\cdot 2(n+1)^{|\mathcal{D}|}\exp\left(-\frac{n^{2\epsilon}}{2\alpha^2|\mathcal{Y}|^2|y_{\max}|^2}\right)\\
&\le 2^{-cn^{2\epsilon}}
\end{aligned}$$
for some constant $c$, for large enough $n$, where $\alpha = \max\{|d| : d\in\mathcal{D}\}$. Here (4) follows from Lemma 9, and (5) follows from $|S_y|\le n$ and $|y_{\max}|\ge|y|$ for all $y\in\mathcal{Y}$. ∎

The following lemma gives, for two different $y, y' \in \mathcal{Y}^n$, an upper bound on the probability that $\widehat{D_i y} = \widehat{D_i y'}$. Let $p_\pm$ denote the minimum of $\Pr\{D_{ij} > 0\}$ and $\Pr\{D_{ij} < 0\}$.
Since $D_{ij}$ has zero mean and at least two symbols with non-zero probability, it follows that $p_\pm \ne 0$.

Lemma 11: If $y \in \mathcal{Y}^n$ and $y' \in \mathcal{Y}^n$ differ in $t$ components, then
$$\Pr\{|D_i(y-y')| < \Delta_n\} \le \min\left\{1-p_\pm,\ \frac{c}{\sqrt{t}}\right\}$$
for some fixed constant $c \in \mathbb{R}$.

Proof: Let $b_y$ be the smallest difference in $\mathcal{Y}$, i.e., $b_y \triangleq \min_{y_1,y_2\in\mathcal{Y},\, y_1\ne y_2}|y_1-y_2|$. We denote the $j$-th component $(y_j - y'_j)$ of $y-y'$ by $\alpha_j$. Then there are $t$ nonzero $\alpha_j$, and w.l.o.g. we assume that $\alpha_1, \alpha_2, \ldots, \alpha_t \ne 0$. Note that $|\{y-y' : y, y' \in \mathcal{Y}, y \ne y'\}| \le |\mathcal{Y}|^2$. So there are at least $\tau \triangleq t/|\mathcal{Y}|^2$ elements among $\alpha_1, \ldots, \alpha_t$ that are equal. Assume, w.l.o.g., that $\alpha_1 = \alpha_2 = \cdots = \alpha_\tau$. Let $\sigma^2$ be the variance of $D_{ij}$. Then the random variables $V_1 = \alpha_1 D_{i1}, V_2 = \alpha_2 D_{i2}, \ldots, V_\tau = \alpha_\tau D_{i\tau}$ are i.i.d. with zero mean and variance $\sigma'^2 = |\alpha_1|^2\sigma^2$. The central limit theorem states that the distribution of the normalized sum $W_\tau = \sum_{j=1}^\tau V_j/(\sigma'\sqrt{\tau})$ approaches the normal $N(0,1)$ distribution as $\tau$ increases. The Berry-Esseen theorem [17] gives a uniform upper bound on the deviation of the cumulative distribution function (cdf) of $W_\tau$ from the cdf of $N(0,1)$:
$$|\Pr\{W_\tau < w\} - \Phi(w)| \le \frac{\beta\gamma}{\sigma'^3\sqrt{\tau}} \qquad (6)$$
for any $w \in \mathbb{R}$. Here $\gamma = E\{|V_1|^3\}$ is the third moment of $V_1$, and $\beta$ is a universal constant whose value has been improved over the decades. We use the Berry-Esseen bound to prove the lemma as follows.
$$\begin{aligned}
\Pr\{|D_i(y-y')| < \Delta_n\}
&= \Pr\{-\Delta_n < D_i(y-y') < \Delta_n\}\\
&= \Pr\left\{-\frac{\Delta_n}{|\alpha_1|\sigma\sqrt{\tau}} < \frac{D_i(y-y')}{|\alpha_1|\sigma\sqrt{\tau}} < \frac{\Delta_n}{|\alpha_1|\sigma\sqrt{\tau}}\right\}\\
&\le \Pr\left\{-\frac{\Delta_n}{\sigma b_y\sqrt{\tau}} < \frac{D_i(y-y')}{|\alpha_1|\sigma\sqrt{\tau}} < \frac{\Delta_n}{\sigma b_y\sqrt{\tau}}\right\}\\
&= \Pr\left\{-\frac{\sum_{j=\tau+1}^n D_{ij}(y_j-y'_j)}{|\alpha_1|\sigma\sqrt{\tau}} - \frac{\Delta_n}{\sigma b_y\sqrt{\tau}} < W_\tau < -\frac{\sum_{j=\tau+1}^n D_{ij}(y_j-y'_j)}{|\alpha_1|\sigma\sqrt{\tau}} + \frac{\Delta_n}{\sigma b_y\sqrt{\tau}}\right\} \qquad (7)\\
&\le \frac{2\Delta_n}{\sigma b_y\sqrt{\tau}}\cdot\frac{1}{\sqrt{2\pi}} + 2\cdot\frac{\beta\gamma}{\sigma'^3\sqrt{\tau}} \qquad (8)\\
&= \frac{c}{\sqrt{t}}.
\end{aligned}$$
Eq. (8) follows by using the Berry-Esseen bound (6) on the normalized sum $W_\tau$. The first term $2\cdot\frac{1}{\sqrt{2\pi}}\cdot\frac{\Delta_n}{\sigma b_y\sqrt{\tau}}$ in (8) is an upper bound on the probability of $N(0,1)$ lying in the interval of length $2\Delta_n/(\sigma b_y\sqrt{\tau})$ in (7); it is obtained by multiplying the maximum value $1/\sqrt{2\pi}$ of the probability density function of $N(0,1)$ by the length of the interval. The deviation of the cdf of $W_\tau$ from that of $N(0,1)$ at each boundary point of the interval is bounded by the Berry-Esseen bound; the second term in (8) is the sum of this bound at the two boundary points.

For $t > 0$, there is at least one $j$ such that $y_j \ne y'_j$; assume, w.l.o.g., that $y_1 \ne y'_1$. For large enough $n$, $\Delta_n < b_y \cdot \min_{d\in\mathcal{D},\, d\ne 0}|d|$. So $\Pr\{|D_i(y-y')| < \Delta_n\} \le 1-p_\pm$. This can easily be checked by considering the change in value from $\sum_{j=2}^n D_{ij}(y_j-y'_j)$ to $D_i(y-y')$. ∎

Lemma 12: Let $y$ and $y'$ be any two vectors differing in $t$ components. Then for some constant $c$ and a constant $\tilde p < 1$, both independent of $y$ and $y'$, we have
$$\Pr\{\widehat{D_i y} = \widehat{D_i y'}\} \le \min\left\{\tilde p,\ \frac{c}{\sqrt{t}}\right\}$$
for large enough $n$.

Proof:
$$\begin{aligned}
\Pr\{\widehat{D_i y} = \widehat{D_i y'}\}
&\le \Pr\{\widehat{D_i y} = \widehat{D_i y'} \,\big|\, |D_i y| \le n^{0.5+\epsilon},\, |D_i y'| \le n^{0.5+\epsilon}\} + \Pr\{|D_i y| > n^{0.5+\epsilon}\} + \Pr\{|D_i y'| > n^{0.5+\epsilon}\}\\
&\le \Pr\{|D_i(y-y')| < \Delta_n\} + \Pr\{|D_i y| > n^{0.5+\epsilon}\} + \Pr\{|D_i y'| > n^{0.5+\epsilon}\}\\
&\le \min\left\{1-p_\pm,\ \frac{c}{\sqrt{t}}\right\} + 2\cdot 2^{-cn^{2\epsilon}} \qquad (9)
\end{aligned}$$
for large enough $n$. The second term in (9) is obtained by applying Lemma 10 to the last two terms in the previous line. For any constant $c' > c$, we have $c/\sqrt{t} + 2\cdot 2^{-cn^{2\epsilon}} < c'/\sqrt{t}$ for large enough $n$. Also, for any $\tilde p > 1-p_\pm$, $1-p_\pm + 2\cdot 2^{-cn^{2\epsilon}} < \tilde p$ for large enough $n$. So the result follows. ∎

We are now ready to present an upper bound on $P_2$.

Lemma 13: For large enough $n$,
$$P_2 \le 2^{-cn/\log n}, \qquad (10)$$
where $c$ is a constant.

Proof:
$$\begin{aligned}
P_2 &= \sum_{(x,y)\in A_\epsilon} p_{X,Y}(x,y)\,\Pr\left\{\exists\, y'\ne y \text{ s.t. } \widehat{Dy'}=\widehat{Dy},\ (x,y')\in A_\epsilon\right\}\\
&\le \sum_{(x,y)\in A_\epsilon} p_{X,Y}(x,y) \sum_{\substack{y'\ne y\\ (x,y')\in A_\epsilon}} \Pr\left\{\widehat{Dy'}=\widehat{Dy}\right\} \qquad (11)\\
&= \sum_{(x,y)\in A_\epsilon} p_{X,Y}(x,y) \sum_{t>0} \sum_{\substack{(x,y')\in A_\epsilon\\ d_H(y,y')=t}} \left(\Pr\{\widehat{D_1 y'}=\widehat{D_1 y}\}\right)^m \qquad (12)\\
&\le \sum_{(x,y)\in A_\epsilon} p_{X,Y}(x,y) \sum_{t>0} \sum_{\substack{(x,y')\in A_\epsilon\\ d_H(y,y')=t}} \left(\min\left\{\tilde p,\ \frac{c}{\sqrt{t}}\right\}\right)^m \qquad (13)\\
&= \sum_{(x,y)\in A_\epsilon} p_{X,Y}(x,y) \sum_{t>0} N_{x,y}(t)\left(\min\left\{\tilde p,\ \frac{c}{\sqrt{t}}\right\}\right)^m, \qquad (14)
\end{aligned}$$
where $N_{x,y}(t)$ is the number of $y'$ that are jointly typical with $x$ and at Hamming distance $t$ from $y$, i.e., $N_{x,y}(t) \triangleq |\{y'\in\mathcal{Y}^n : (x,y')\in A_\epsilon,\ d_H(y,y')=t\}|$. Eq. (11) follows by the union bound, Eq. (12) follows because the rows of $D$ are i.i.d., and Eq. (13) follows from Lemma 12. For $t>0$, let $N(t)$ denote the maximum of $N_{x,y}(t)$ over all possible typical $(x,y)$ pairs, i.e., $N(t) \triangleq \max_{(x,y)\in A_\epsilon} N_{x,y}(t)$.
Further, let $t_n$ denote the value of $t$ for which the expression inside the second summation in (14) takes its maximum value for some typical $(x,y)$, i.e.,
$$t_n \triangleq \arg\max_{t>0}\left\{N(t)\left(\min\left\{\tilde p,\ c/\sqrt{t}\right\}\right)^m\right\}.$$
The subscript in $t_n$ emphasizes that it is a function of $n$. Then, substituting in (14),
$$P_2 \le n\, N(t_n)\left(\min\left\{\tilde p,\ \frac{c}{\sqrt{t_n}}\right\}\right)^m.$$
We emphasize here that every appearance of "$c$" may denote a different constant in the following. For any $\delta \le \epsilon/(2(H(Y|X)+3\epsilon))$, we consider two regimes: (1) $t_n > n^{1-\delta}$ and (2) $t_n \le n^{1-\delta}$.

In the first regime, we use the bounds $N(t_n) \le 2^{n(H(Y|X)+2\epsilon)}$ [11, Theorem 14.2.2], $\Pr\{\widehat{D_i y} = \widehat{D_i y'}\} \le c/\sqrt{t}$, and $m = \lceil n(H(Y|X)+3\epsilon)/(0.5\log n)\rceil$ to get, for large enough $n$,
$$\begin{aligned}
\log(P_2) &\le \log n + \log N(t_n) - \frac{n(H(Y|X)+3\epsilon)}{0.5\log n}\left((0.5-0.5\delta)\log n - \log c\right)\\
&= n(H(Y|X)+2\epsilon) - n(H(Y|X)+3\epsilon)(1-\delta) + \frac{n(H(Y|X)+3\epsilon)}{0.5\log n}\,c + \log n \qquad (15)\\
&= -n\left(\epsilon-\delta(H(Y|X)+3\epsilon)\right) + n\left(\frac{H(Y|X)+3\epsilon}{0.5\log n}\,c + \frac{\log n}{n}\right).
\end{aligned}$$
Now, using $\delta \le \epsilon/(2(H(Y|X)+3\epsilon))$ and $\left(c(H(Y|X)+3\epsilon)/(0.5\log n) + (\log n)/n\right) < \epsilon/4$ for sufficiently large $n$, we get
$$\log(P_2) \le -\frac{n\epsilon}{2} + \frac{n\epsilon}{4} = -\frac{n\epsilon}{4}. \qquad (16)$$

In the regime $t_n \le n^{1-\delta}$, we use the bounds $N(t_n) < (|\mathcal{Y}|-1)^{t_n}\binom{n}{t_n} < (|\mathcal{Y}|n)^{t_n}$ and $\Pr\{\widehat{D_i y} = \widehat{D_i y'}\} \le \tilde p$ to get
$$\begin{aligned}
\log(P_2) &\le \log n + t_n\log n + t_n\log|\mathcal{Y}| - \frac{n(H(Y|X)+3\epsilon)}{\log n}\log\left(\frac{1}{\tilde p}\right)\\
&\le \log n + n^{1-\delta}\log n + n^{1-\delta}\log|\mathcal{Y}| - \frac{cn}{\log n}, \qquad (17)
\end{aligned}$$
where $c = (H(Y|X)+3\epsilon)\log(1/\tilde p)$. For large enough $n$, $(\log n)^2 < cn^{\delta}/3$, which gives $n^{1-\delta}\log n < cn/(3\log n)$. Also, for large enough $n$, $n^{-\delta}\log|\mathcal{Y}| < c/(3\log n)$ for some constant $c$, so that $n^{1-\delta}\log|\mathcal{Y}| < cn/(3\log n)$.
So, for some constant $c'$,
$$\log(P_2) \le \log n - \frac{c'n}{3\log n} \le -\frac{cn}{\log n} \qquad (18)$$
for large enough $n$ and for some constant $c$. Since $cn/\log n < n\epsilon/4$ for large enough $n$, the result follows by combining (16) and (18). ∎

From (2), (3), and (10) we have, for large enough $n$, $P^n_e \le P_1 + P_2 \le 2P_2 \le 2^{-cn/\log n}$ for a constant $c$, thus completing the proof of Theorem 3.

V. PROOF OF THEOREM 4 AND THEOREM 5

We first show that for $\mathcal{Y} = \{0, 1\}$ the typicality decoding of our scheme can be done via the solution of an IP. Recall that for a vector $y$ we defined, for any $y\in\mathcal{Y}$, $S_y = \{i : y_i = y\}$. Similarly, with abuse of notation, for the vector $x = (x_1, \ldots, x_n)$ decoded by Zorba and any $x\in\mathcal{X}$, define $S_x = \{i : x_i = x\}$. The constraint $(x,y)\in A_\epsilon$ can be written as the linear constraints
$$p(1,x) - \frac{\epsilon}{|\mathcal{X}||\mathcal{Y}|} \le \frac{1}{n}\sum_{i\in S_x} y_i \le p(1,x) + \frac{\epsilon}{|\mathcal{X}||\mathcal{Y}|}, \quad \forall x\in\mathcal{X}.$$
Moreover, the constraints $\widehat{Dy} = \hat u = (\hat u_1, \ldots, \hat u_m)$ can be written as
$$\hat u_i - \Delta_n/2 \le D_i y \le \hat u_i + \Delta_n/2, \quad \forall i = 1, \ldots, m.$$
Finally, we add the 'integrality' constraints, namely that $y \in \mathcal{Y}^n$.

For arbitrary finite alphabets $\mathcal{Y}$, Yvonne and Zorba perform $|\mathcal{Y}|-1$ encoding and decoding stages, each of which involves IP decoding of a binary vector. A sketch follows. Let $y^{(1)}, \ldots, y^{(|\mathcal{Y}|)}$ denote the distinct values of $\mathcal{Y}$. In the first stage, instead of encoding $y$ directly, Yvonne uses $C(\epsilon, n, p^1_{X,Y}, p_D)$ to encode the vector $f_1(y)$. Here the vector $f_1(y)$ equals 1 in the locations where $y$ equals $y^{(1)}$ and 0 otherwise, and $p^1_{X,Y}$ is the corresponding induced distribution $p_{X,f_1(Y)}$ defined on $\mathcal{X}\times\{0,1\}$. Since $f_1(y)$ is a binary vector, Zorba can use the IP decoding described above, and can therefore retrieve the locations where $y$ equals $y^{(1)}$.
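For intuition, the two constraint families above can be checked directly. The sketch below is our own toy illustration: it replaces an IP solver with exhaustive search over $\{0,1\}^n$ for a tiny block-length, keeping exactly the typicality and quantizer-consistency constraints (the sketch matrix here is a hand-picked 0/1 toy, not the zero-mean random $D$ of the construction; a real decoder would hand the same constraints to an IP solver).

```python
import itertools
from collections import Counter

def typical(x, y, p_xy, eps):
    """Strong eps-joint typicality check; p_xy maps every (a, b) to its
    probability, so len(p_xy) equals |X||Y|."""
    n = len(x)
    slack = eps / len(p_xy)
    c = Counter(zip(x, y))
    return all(abs(c[ab] / n - pr) < slack for ab, pr in p_xy.items())

def decode_binary(x, u_hat, D, delta_n, p_xy, eps):
    """Return all y in {0,1}^n meeting the typicality constraints and the
    quantizer-consistency constraints |D_i y - u_hat_i| <= delta_n / 2."""
    feasible = []
    for y in itertools.product((0, 1), repeat=len(x)):
        if not typical(x, y, p_xy, eps):
            continue
        if all(abs(sum(d * yj for d, yj in zip(row, y)) - uh) <= delta_n / 2
               for row, uh in zip(D, u_hat)):
            feasible.append(y)
    return feasible   # decoding succeeds iff exactly one candidate survives

# Toy run (all numbers ours): correlated x and y, a 3 x 10 sketch matrix.
p = {(0, 0): 0.45, (0, 1): 0.05, (1, 0): 0.05, (1, 1): 0.45}
x = (0, 0, 0, 0, 0, 1, 1, 1, 1, 1)
y = (0, 0, 0, 0, 1, 1, 1, 1, 1, 0)
D = [[1] * 5 + [0] * 5, [0] * 5 + [1] * 5, [1, 0] * 5]
u = [sum(d * yj for d, yj in zip(row, y)) for row in D]
candidates = decode_binary(x, u, D, delta_n=1.0, p_xy=p, eps=1.0)
print(len(candidates), y in candidates)
```

With such a tiny block-length several candidates satisfy all constraints, so decoding is not unique; uniqueness in Theorem 3 is an asymptotic, large-$n$ statement.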
Inductively, in the $i$-th stage, Yvonne uses $C(\epsilon, n(i), p^i_{X,Y}, p_D)$ to encode the vector $f_i(y)$. Here $n(i)$ equals the number of locations whose values are still undetermined before the $i$-th stage, i.e., $n(i) = |\{j : y_j \ge y^{(i)}\}|$. The length-$n(i)$ vector $f_i(y)$ is obtained by first throwing away the locations in $f_{i-1}(y)$ that equalled 1, and then marking the remaining locations 1 if and only if the corresponding locations in $y$ equal $y^{(i)}$. At each stage, Zorba can use the IP decoding described above, and can therefore retrieve the locations where $y$ equals $y^{(i)}$. Let $f_i(Y)$ denote the corresponding binary random variable, such that $(X, f_i(Y))$ has the joint distribution given by
$$p^i_{X,Y}(x,1) = \Pr\{X=x,\, Y=y^{(i)} \mid Y \ne y^{(1)}, y^{(2)}, \ldots, y^{(i-1)}\},$$
$$p^i_{X,Y}(x,0) = \Pr\{X=x,\, Y\ne y^{(i)} \mid Y \ne y^{(1)}, y^{(2)}, \ldots, y^{(i-1)}\}.$$
Then, by a direct extension of the grouping axiom [18, Page 8], we have
$$H(Y|X) = H(f_1(Y)|X) + \left(1-p_Y(y^{(1)})\right)H(f_2(Y)|X) + \left(1-p_Y(y^{(1)})-p_Y(y^{(2)})\right)H(f_3(Y)|X) + \cdots + \left(p_Y(y^{(|\mathcal{Y}|-1)})+p_Y(y^{(|\mathcal{Y}|)})\right)H(f_{|\mathcal{Y}|-1}(Y)|X). \qquad (19)$$
Clearly, for a single-stage encoding/decoding, the average codelength for Yvonne is bounded by $nH(Y|X)+c\epsilon n$. For the multi-stage encoding/decoding described above, for a typical $y$, the block-length at the $i$-th stage is bounded by $n(i) \le n\left(1-\Pr\{Y\in\{y^{(1)}, \ldots, y^{(i-1)}\}\}+\epsilon\right)$, and so the codelength is bounded as $L_i \le n\left(1-\Pr\{Y\in\{y^{(1)}, \ldots, y^{(i-1)}\}\}+\epsilon\right)H(f_i(Y)|X)+c_i\epsilon n$ for some constants $c_i$. The average codelength is thus bounded using (19) by
$$L \le \sum_{i=1}^{|\mathcal{Y}|-1} L_i \le nH(Y|X) + c\epsilon n \qquad (20)$$
for some constant $c$. If $\mathbf{y}$ is not typical, then in the worst case the codelength is $n(i) = n$ for each $i$, and the overall codelength is bounded by $L \le c'n$ for some constant $c'$. Since the probability of the non-typical set is exponentially small, the overall average codelength is still bounded by (20) for some constant $c$. Hence the overall rate of this multistage RSWC differs from $H(Y|X)$ by at most $c\epsilon$, where $c$ is some constant dependent only on $p_{X,Y}$.

The overall probability of error can be bounded as
$$P^n_e \le P_1 + \sum_{i=1}^{|\mathcal{Y}|} P_{2,i}, \qquad (21)$$
where $P_1$ is the probability that the vector $\mathbf{y}$ is not strongly typical, and $P_{2,i}$ is the conditional probability of error at the $i$-th stage of decoding given that the vector $\mathbf{y}$ is strongly $\epsilon$-typical and the decoding up to the $(i-1)$-th stage is correct. If $\mathbf{y}$ is strongly $\epsilon$-typical, then the codelength at the $i$-th stage is
$$n(i) \ge n\sum_{j=i}^{|\mathcal{Y}|}\left(P_Y(y(j)) - \epsilon/|\mathcal{Y}|\right) \ge n\left(P_Y(y(|\mathcal{Y}|)) - \epsilon/|\mathcal{Y}|\right).$$
So,
$$P_{2,i} \le \exp\left(-\frac{c'\,n(i)}{\log(n(i))}\right) \le \exp\left(-\frac{c'\,n(i)}{\log n}\right) \le \exp\left(-\frac{c'(P_Y(y(|\mathcal{Y}|)) - \epsilon/|\mathcal{Y}|)\,n}{\log n}\right).$$
Since $P_1$ is also exponentially small, the overall probability of error for the multistage encoding/decoding is bounded as $P^n_e \le \exp(-cn/\log n)$. 

VI. REAL SW CODING WITHOUT TIME-SHARING

Any rate-pair in the SW rate-region can also be directly achieved by RSWCs without time-sharing between the schemes achieving the rate-pairs $(H(X|Y), H(Y))$ and $(H(X), H(Y|X))$. Let $(R_1, R_2)$ be a rate-pair in the SW rate-region. Let $m_1 = \lceil n(R_1+3\epsilon)/(0.5\log n)\rceil$ and $m_2 = \lceil n(R_2+3\epsilon)/(0.5\log n)\rceil$. Similar to the encoding scheme of Yvonne described in Section III, Xavier chooses an $m_1 \times n$ encoder matrix $D_1$ over $\mathcal{D}$ according to a distribution $P_D$. Similarly, Yvonne chooses a random $m_2 \times n$ encoder matrix $D_2$ over $\mathcal{D}$ according to the distribution $P_D$.¹
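The stage-wise binary reduction used in the multistage scheme of Section V above (peel off one symbol value per stage, shrinking the block each time) can be sketched as follows; the function name `stage_vectors` and the list representation are illustrative assumptions.

```python
def stage_vectors(y, values):
    """Build the binary vectors f_1(y), ..., f_{|Y|-1}(y) of the
    multi-stage scheme: before stage i, positions resolved in earlier
    stages are discarded, and each remaining position is marked 1 iff
    it holds the i-th symbol value."""
    remaining = list(range(len(y)))
    stages = []
    for v in values[:-1]:                     # |Y| - 1 stages suffice
        stages.append([1 if y[j] == v else 0 for j in remaining])
        # discard the positions just resolved (those marked 1)
        remaining = [j for j in remaining if y[j] != v]
    return stages
```

After $|\mathcal{Y}|-1$ stages the last symbol value occupies whatever positions remain, which is why no further stage is needed.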
Xavier encodes the length-$n$ vector $\mathbf{X}$ by quantizing each component of $U_1 \triangleq D_1\mathbf{X}$ uniformly in the range $I_q$ with step-size $\Delta_n = 2n^{-\epsilon}$ to obtain the vector $\hat{U}_1$. Similarly, Yvonne encodes the length-$n$ vector $\mathbf{Y}$ by quantizing each component of $U_2 \triangleq D_2\mathbf{Y}$ uniformly in the range $I_q$ with step-size $\Delta_n = 2n^{-\epsilon}$ to obtain the vector $\hat{U}_2$.

¹Our arguments go through even if the elements of $D_1$ and $D_2$ are chosen from different sets $\mathcal{D}_1$ and $\mathcal{D}_2$ according to some distributions. We restrict to $\mathcal{D}_1 = \mathcal{D}_2$ and the same distribution for the elements of $D_1$ and $D_2$ for simplicity.

Zorba finds a unique jointly strongly $\epsilon$-typical pair $(\mathbf{x}, \mathbf{y})$ such that $\widehat{D_1\mathbf{x}} = \hat{U}_1$ and $\widehat{D_2\mathbf{y}} = \hat{U}_2$. If there is no such pair, or if there is more than one such pair, then the decoder declares an error. The probability of error can be bounded as
$$P^n_e \le P_1 + P_{21} + P_{22} + P_{23}, \qquad (22)$$
where $P_1$, as before, is the probability that $(\mathbf{X}, \mathbf{Y})$ is not jointly strongly $\epsilon$-typical, $P_{21}$ is the probability that there is an $\mathbf{x}' \ne \mathbf{X}$ which is also jointly strongly $\epsilon$-typical with $\mathbf{Y}$ and $\widehat{D_1\mathbf{x}'} = \hat{U}_1$, $P_{22}$ is the probability that there is a $\mathbf{y}' \ne \mathbf{Y}$ which is also jointly strongly $\epsilon$-typical with $\mathbf{X}$ and $\widehat{D_2\mathbf{y}'} = \hat{U}_2$, and $P_{23}$ is the probability that there is another jointly typical pair $(\mathbf{x}', \mathbf{y}')$ such that $\mathbf{x}' \ne \mathbf{X}$, $\mathbf{y}' \ne \mathbf{Y}$, $\widehat{D_1\mathbf{x}'} = \hat{U}_1$, and $\widehat{D_2\mathbf{y}'} = \hat{U}_2$.

We now investigate all the terms in (22). Let $D_{1,i}$ and $D_{2,i}$ denote the $i$-th rows of the matrices $D_1$ and $D_2$ respectively. As in Lemma 12, we have
$$\Pr\{\widehat{D_{1,i}\mathbf{x}} = \widehat{D_{1,i}\mathbf{x}'}\},\ \Pr\{\widehat{D_{2,i}\mathbf{y}} = \widehat{D_{2,i}\mathbf{y}'}\} \le \min\left(\tilde{p}, \frac{c_1}{\sqrt{t}}\right)$$
when the pairs $\mathbf{x}, \mathbf{x}' \in \mathcal{X}^n$ and $\mathbf{y}, \mathbf{y}' \in \mathcal{Y}^n$ each differ in $t$ positions. We define the following functions:
$$m(R) \triangleq \left\lceil \frac{n(R+3\epsilon)}{0.5\log n} \right\rceil, \qquad \phi_1(h, R, \delta) \triangleq \log\left(n\, 2^{n(h+2\epsilon)} \left(\frac{c}{\sqrt{n^{1-\delta}}}\right)^{m(R)}\right)
, and
$$\phi_2(L, R, \delta) \triangleq \log\left(n\,(Ln)^{n^{1-\delta}}\,\tilde{p}^{\,m(R)}\right).$$
Note that in this notation, $P_2$ in Lemma 13 satisfies
$$\log(P_2) \le \phi_1(H(Y|X), H(Y|X), \delta) \qquad (23)$$
for $t_n > n^{1-\delta}$ (see (15)). As shown in (16), this is at most $-n\epsilon/4$ for $\delta \le \epsilon/2(H(Y|X)+3\epsilon)$ for large enough $n$. It can be checked similarly that for $\delta \le \epsilon/2(R+3\epsilon)$, $R \ge h$, and large enough $n$,
$$\phi_1(h, R, \delta) \le -n((R-h) + \epsilon/4).$$
Likewise, for $t_n \le n^{1-\delta}$, it is shown (see (17)) that
$$\log(P_2) \le \phi_2(|\mathcal{Y}|, H(Y|X), \delta), \qquad (24)$$
which is at most $-cn/\log n$ (see (18)). More generally, it can be proved similarly that for any constants $L > 0$ and $\delta > 0$,
$$\phi_2(L, R, \delta) \le -\frac{c(R,\epsilon)\,n}{\log n}$$
for some constant $c(R,\epsilon) > 0$ and for large enough $n$. By definition,
$$P_{22} = \sum_{(\mathbf{x},\mathbf{y}) \in A_\epsilon} p_{X,Y}(\mathbf{x},\mathbf{y})\, \Pr\left\{\exists\, \mathbf{y}' \ne \mathbf{y} \text{ s.t. } \widehat{D_2\mathbf{y}'} = \widehat{D_2\mathbf{y}},\ (\mathbf{x}, \mathbf{y}') \in A_\epsilon\right\}.$$
By arguments similar to those in the proof of Lemma 13, we have $\log(P_{22}) \le \phi_2(|\mathcal{Y}|, R_2, \delta)$ for $t_n \le n^{1-\delta}$, and $\log(P_{22}) \le \phi_1(H(Y|X), R_2, \delta)$ for $t_n > n^{1-\delta}$. Since $R_2 \ge H(Y|X)$, it follows that for large enough $n$,
$$\log(P_{22}) \le -\frac{c(R_2,\epsilon)\,n}{\log n}. \qquad (25)$$
Similarly, for large enough $n$,
$$\log(P_{21}) \le -\frac{c(R_1,\epsilon)\,n}{\log n}. \qquad (26)$$
As in the proof of Lemma 13, $P_{23}$ can be simplified to (27) below:
$$P_{23} = \sum_{(\mathbf{x},\mathbf{y}) \in A_\epsilon} p_{X,Y}(\mathbf{x},\mathbf{y})\, \Pr\Big\{\exists\, (\mathbf{x}', \mathbf{y}') \text{ s.t. }
\mathbf{x}' \ne \mathbf{x},\ \mathbf{y}' \ne \mathbf{y},\ \widehat{D_1\mathbf{x}'} = \widehat{D_1\mathbf{x}},\ \widehat{D_2\mathbf{y}'} = \widehat{D_2\mathbf{y}},\ (\mathbf{x}',\mathbf{y}') \in A_\epsilon\Big\}$$
$$\le \sum_{(\mathbf{x},\mathbf{y}) \in A_\epsilon} p_{X,Y}(\mathbf{x},\mathbf{y}) \sum_{\substack{\mathbf{x}' \ne \mathbf{x},\ \mathbf{y}' \ne \mathbf{y} \\ (\mathbf{x}',\mathbf{y}') \in A_\epsilon}} \Pr\left\{\widehat{D_1\mathbf{x}'} = \widehat{D_1\mathbf{x}},\ \widehat{D_2\mathbf{y}'} = \widehat{D_2\mathbf{y}}\right\}$$
$$= \sum_{(\mathbf{x},\mathbf{y}) \in A_\epsilon} p_{X,Y}(\mathbf{x},\mathbf{y}) \sum_{t_1>0,\,t_2>0} \sum_{\substack{(\mathbf{x}',\mathbf{y}') \in A_\epsilon \\ d_H(\mathbf{x},\mathbf{x}')=t_1,\ d_H(\mathbf{y},\mathbf{y}')=t_2}} \Pr\left\{\widehat{D_1\mathbf{x}'} = \widehat{D_1\mathbf{x}},\ \widehat{D_2\mathbf{y}'} = \widehat{D_2\mathbf{y}}\right\}$$
$$= \sum_{(\mathbf{x},\mathbf{y}) \in A_\epsilon} p_{X,Y}(\mathbf{x},\mathbf{y}) \sum_{t_1>0,\,t_2>0} \sum_{\substack{(\mathbf{x}',\mathbf{y}') \in A_\epsilon \\ d_H(\mathbf{x},\mathbf{x}')=t_1,\ d_H(\mathbf{y},\mathbf{y}')=t_2}} \left(\Pr\{\widehat{D_{1,1}\mathbf{x}'} = \widehat{D_{1,1}\mathbf{x}}\}\right)^{m(R_1)} \left(\Pr\{\widehat{D_{2,1}\mathbf{y}'} = \widehat{D_{2,1}\mathbf{y}}\}\right)^{m(R_2)}$$
$$\le \sum_{(\mathbf{x},\mathbf{y}) \in A_\epsilon} p_{X,Y}(\mathbf{x},\mathbf{y}) \sum_{t_1>0,\,t_2>0} \sum_{\substack{(\mathbf{x}',\mathbf{y}') \in A_\epsilon \\ d_H(\mathbf{x},\mathbf{x}')=t_1,\ d_H(\mathbf{y},\mathbf{y}')=t_2}} \left(\min\left(\tilde{p}, \frac{c}{\sqrt{t_1}}\right)\right)^{m(R_1)} \left(\min\left(\tilde{p}, \frac{c}{\sqrt{t_2}}\right)\right)^{m(R_2)}$$
$$= \sum_{(\mathbf{x},\mathbf{y}) \in A_\epsilon} p_{X,Y}(\mathbf{x},\mathbf{y}) \sum_{t_1,t_2>0} N_{\mathbf{x},\mathbf{y}}(t_1,t_2)\, Q_1^{m(R_1)} Q_2^{m(R_2)}. \qquad (27)$$
In (27), $Q_1 = \min(\tilde{p}, c/\sqrt{t_1})$, $Q_2 = \min(\tilde{p}, c/\sqrt{t_2})$, and $N_{\mathbf{x},\mathbf{y}}(t_1,t_2)$ is the number of jointly typical pairs $(\mathbf{x}',\mathbf{y}')$ such that $\mathbf{x}'$ differs from $\mathbf{x}$ at $t_1$ locations and $\mathbf{y}'$ differs from $\mathbf{y}$ at $t_2$ locations, that is,
$$N_{\mathbf{x},\mathbf{y}}(t_1,t_2) \triangleq |\{(\mathbf{x}',\mathbf{y}') \in \mathcal{X}^n \times \mathcal{Y}^n \mid (\mathbf{x}',\mathbf{y}') \in A_\epsilon,\ d_H(\mathbf{x},\mathbf{x}') = t_1,\ d_H(\mathbf{y},\mathbf{y}') = t_2\}|.$$
We define $N(t_1,t_2) \triangleq \max_{(\mathbf{x},\mathbf{y}) \in A_\epsilon} N_{\mathbf{x},\mathbf{y}}(t_1,t_2)$, and $(t_{1,n}, t_{2,n})$ as the pair $(t_1,t_2)$ that maximizes $N(t_1,t_2)\,Q_1^{m_1} Q_2^{m_2}$, that is,
$$(t_{1,n}, t_{2,n}) \triangleq \arg\max_{t_1,t_2>0} \left(N(t_1,t_2)\,Q_1^{m_1} Q_2^{m_2}\right).$$
Then $P_{23} \le n^2 N(t_{1,n}, t_{2,n})\,Q_1^{m_1} Q_2^{m_2}$. For $\delta < \epsilon/2(R_1+R_2+3\epsilon)$, we consider four cases.

Case I: $t_{1,n} > n^{1-\delta}$, $t_{2,n} > n^{1-\delta}$. In this case, using the bounds $N(t_{1,n},t_{2,n}) \le 2^{n(H(X,Y)+\epsilon)}$, $Q_1 \le c/\sqrt{t_{1,n}}$, $Q_2 \le c/\sqrt{t_{2,n}}$, we have
$$\log(P_{23}) \le \phi_1(H(X,Y), R_1+R_2, \delta) \le -n(R_1+R_2-H(X,Y)+\epsilon/4) \le -n\epsilon/4.$$
(28)

Case II: $t_{1,n} \le n^{1-\delta}$, $t_{2,n} \le n^{1-\delta}$. In this case, using the bounds $N(t_{1,n},t_{2,n}) \le (|\mathcal{X}|n)^{t_{1,n}}(|\mathcal{Y}|n)^{t_{2,n}}$, $Q_1 \le \tilde{p}$, $Q_2 \le \tilde{p}$, we have
$$\log(P_{23}) \le \phi_2(|\mathcal{X}|, R_1, \delta) + \phi_2(|\mathcal{Y}|, R_2, \delta) \le -\frac{c(R_1,\epsilon)\,n}{\log n} - \frac{c(R_2,\epsilon)\,n}{\log n} \le -\frac{c(R_1,R_2,\epsilon)\,n}{\log n}, \qquad (29)$$
where $c(R_1,R_2,\epsilon) = c(R_1,\epsilon) + c(R_2,\epsilon)$.

Case III: $t_{1,n} > n^{1-\delta}$, $t_{2,n} \le n^{1-\delta}$. In this case, using the bounds $N(t_{1,n},t_{2,n}) \le 2^{n(H(X|Y)+2\epsilon)}(|\mathcal{Y}|n)^{t_{2,n}}$, $Q_1 \le c/\sqrt{t_{1,n}}$, $Q_2 \le \tilde{p}$, we have
$$\log(P_{23}) \le \phi_1(H(X|Y), R_1, \delta) + \phi_2(|\mathcal{Y}|, R_2, \delta) \le -n(R_1 - H(X|Y) + \epsilon/4) - \frac{c(R_2,\epsilon)\,n}{\log n} \le -\frac{c(R_1,R_2,\epsilon)\,n}{\log n}. \qquad (30)$$

Case IV: $t_{1,n} \le n^{1-\delta}$, $t_{2,n} > n^{1-\delta}$. As in Case III, we have
$$\log(P_{23}) \le -\frac{c(R_1,R_2,\epsilon)\,n}{\log n}. \qquad (31)$$

From (3), (22), (25), (26), (28), (29), (30), and (31), we have $P^n_e \le 2^{-cn/\log n}$ for some constant $c$.

VII. UNIVERSAL DECODING: PROOF OF THEOREM 7

An encoding or decoding operation is said to be universal over a class of sources if the encoding/decoding operation can be chosen without knowledge of the exact source statistics within the class. The encoding for RSWCs without time-sharing in Section VI is universal over the class of i.i.d. sources: the two encoders may choose to encode at rates $R_1$ and $R_2$ and choose their encoding matrices randomly without knowledge of the distribution of either source. The joint-typicality decoding discussed earlier recovers both sequences with exponentially small probability of error as long as the rate pair $(R_1, R_2)$ lies in the Slepian-Wolf rate-region of the sources.
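The Section-VI encoding just recalled (independent random real projections, followed by uniform quantization with step size $\Delta_n = 2n^{-\epsilon}$) can be sketched as below. The quantizer maps each component to the centre of its bin; the function name, matrix-entry distribution, and parameter values are illustrative assumptions, not the paper's.

```python
import numpy as np

def rswc_encode(v, D, eps):
    """Project the source vector by a random real matrix D and quantize
    each component uniformly with step size Delta_n = 2 * n^(-eps),
    mapping it to the centre of its quantization bin."""
    n = len(v)
    delta_n = 2.0 * n ** (-eps)
    u = D @ v                                   # real-valued projection
    return (np.floor(u / delta_n) + 0.5) * delta_n

# the two encoders act independently, with separately drawn matrices
rng = np.random.default_rng(0)
n, m1, m2, eps = 12, 6, 6, 0.1
x = rng.integers(0, 2, size=n).astype(float)    # Xavier's source block
y = rng.integers(0, 2, size=n).astype(float)    # Yvonne's source block
D1 = rng.integers(0, 2, size=(m1, n)).astype(float)
D2 = rng.integers(0, 2, size=(m2, n)).astype(float)
u1_hat = rswc_encode(x, D1, eps)
u2_hat = rswc_encode(y, D2, eps)
```

Each quantized component differs from the true projection by at most $\Delta_n/2$, which is exactly the interval constraint the IP decoder enforces.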
However, though the encoders are universal, the joint-typicality decoding is not universal, since it requires the decoder to know the joint distribution of the sources. In this section we show that the well-known universal minimum entropy decoding (MED) [9], which does not need the joint distribution of the sources, will also decode our code with exponentially small probability of error, provided $m_1 \ge \lceil n(R_1+4\epsilon)/(0.5\log n)\rceil$ and $m_2 \ge \lceil n(R_2+4\epsilon)/(0.5\log n)\rceil$ for some $(R_1, R_2)$ in the Slepian-Wolf rate-region of the sources. Here, the decoder finds the pair $(\mathbf{x}, \mathbf{y})$ with minimum empirical entropy which satisfies the conditions $\widehat{D_1\mathbf{x}} = \hat{U}_1$ and $\widehat{D_2\mathbf{y}} = \hat{U}_2$. If there is more than one such pair, then the decoder declares a decoding error.

Before investigating the probability of error under minimum entropy decoding, let us define a weakly $\epsilon$-typical pair $(\mathbf{x}, \mathbf{y})$ as one satisfying
$$|\log_2(p^n_{X,Y}(\mathbf{x},\mathbf{y})) + nH(X,Y)| \le n\epsilon, \quad |\log_2(p^n_X(\mathbf{x})) + nH(X)| \le n\epsilon, \quad \text{and} \quad |\log_2(p^n_Y(\mathbf{y})) + nH(Y)| \le n\epsilon.$$
The set of weakly $\epsilon$-typical pairs will be denoted by $A_{\epsilon,\mathrm{weak}}$. A weakly $\epsilon$-typical vector $\mathbf{x}$ (similarly $\mathbf{y}$) is defined as one satisfying $|\log_2(p^n_X(\mathbf{x})) + nH(X)| \le n\epsilon$. The properties of the weakly typical set may be found in [11]. Let us denote the joint entropy of the type of a pair of vectors $(\mathbf{x}, \mathbf{y})$ by $H(\mathbf{x}, \mathbf{y})$, the corresponding conditional entropies by $H(\mathbf{x}|\mathbf{y})$ and $H(\mathbf{y}|\mathbf{x})$, and the individual entropies of the vectors by $H(\mathbf{x})$ and $H(\mathbf{y})$.
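The decoding rule just described (pick the measurement-consistent pair of minimum empirical joint entropy, declaring an error on ties) depends on the source only through types. A small illustrative sketch follows; names such as `med_decode` are our own, and the predicate `matches` stands in for the checks $\widehat{D_1\mathbf{x}} = \hat{U}_1$ and $\widehat{D_2\mathbf{y}} = \hat{U}_2$.

```python
import math
from collections import Counter

def empirical_joint_entropy(x, y):
    """H(x, y): entropy in bits of the joint type of the pair (x, y)."""
    n = len(x)
    counts = Counter(zip(x, y))
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def med_decode(candidates, matches):
    """Minimum entropy decoding sketch: among all candidate pairs that
    satisfy the measurement predicate, return the one of minimum
    empirical joint entropy; return None (decoding error) on a tie."""
    feasible = [p for p in candidates if matches(*p)]
    if not feasible:
        return None
    feasible.sort(key=lambda p: empirical_joint_entropy(*p))
    if len(feasible) > 1 and math.isclose(
            empirical_joint_entropy(*feasible[0]),
            empirical_joint_entropy(*feasible[1])):
        return None
    return feasible[0]
```

Note that neither function uses $p_{X,Y}$: this is the sense in which MED is universal.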
The probability of error of the minimum entropy decoder is bounded as
$$P^n_e(MED) \le P'_1 + P'_{21} + P'_{22} + P'_{23}, \qquad (32)$$
where $P'_1$ is the probability that $(\mathbf{X}, \mathbf{Y})$ is not jointly weakly $\epsilon$-typical, $P'_{21}$ is the probability that there is an $\mathbf{x}' \ne \mathbf{X}$ such that $H(\mathbf{x}', \mathbf{Y}) \le H(\mathbf{X}, \mathbf{Y})$ and $\widehat{D_1\mathbf{x}'} = \hat{U}_1$, $P'_{22}$ is the probability that there is a $\mathbf{y}' \ne \mathbf{Y}$ such that $H(\mathbf{X}, \mathbf{y}') \le H(\mathbf{X}, \mathbf{Y})$ and $\widehat{D_2\mathbf{y}'} = \hat{U}_2$, and $P'_{23}$ is the probability that there is another pair $(\mathbf{x}', \mathbf{y}')$ such that $\mathbf{x}' \ne \mathbf{X}$, $\mathbf{y}' \ne \mathbf{Y}$, $\widehat{D_1\mathbf{x}'} = \hat{U}_1$, $\widehat{D_2\mathbf{y}'} = \hat{U}_2$, and $H(\mathbf{x}', \mathbf{y}') \le H(\mathbf{X}, \mathbf{Y})$.

We will briefly discuss all the terms in (32). By definition, $P'_1 = \Pr\{A^c_{\epsilon,\mathrm{weak}}\}$. Since the weakly $\epsilon$-typical set is a superset of the strongly $\epsilon'(\epsilon, p_{X,Y})$-typical set for some $\epsilon'(\epsilon, p_{X,Y})$ [19], $P'_1$ can be bounded, similarly to (3), as
$$P'_1 \le 2^{-cn}, \qquad (33)$$
where the constant $c$ depends on $p_{X,Y}$. Following steps similar to the proof of Lemma 13, we have
$$P'_{22} = \sum_{(\mathbf{x},\mathbf{y}) \in A_\epsilon} p_{X,Y}(\mathbf{x},\mathbf{y}) \sum_{t>0} N'_{\mathbf{x},\mathbf{y}}(t)\left(\min\left(\tilde{p}, \frac{c}{\sqrt{t}}\right)\right)^{m_2},$$
where $N'_{\mathbf{x},\mathbf{y}}(t) \triangleq |\{\mathbf{y}' \in \mathcal{Y}^n \mid H(\mathbf{y}'|\mathbf{x}) \le H(\mathbf{y}|\mathbf{x}),\ d_H(\mathbf{y},\mathbf{y}') = t\}|$. Now let us define $N'(t) \triangleq \max_{(\mathbf{x},\mathbf{y}) \in A_\epsilon} N'_{\mathbf{x},\mathbf{y}}(t)$ for $t > 0$, and $t_n \triangleq \arg\max_{t>0} N'(t)\left(\min(\tilde{p}, c/\sqrt{t})\right)^{m_2}$. Then clearly,
$$P'_{22} \le n N'(t_n)\left(\min\left(\tilde{p}, \frac{c}{\sqrt{t_n}}\right)\right)^{m_2}.$$
Note that for a given weakly typical $\mathbf{x}$, the condition $(\mathbf{x}, \mathbf{y}) \in A_{\epsilon,\mathrm{weak}}$ implies $H(\mathbf{y}|\mathbf{x}) \le H(Y|X) + 2\epsilon$. So
$$N'_{\mathbf{x},\mathbf{y}}(t) \le |\{\mathbf{y}' \in \mathcal{Y}^n \mid H(\mathbf{y}'|\mathbf{x}) \le H(Y|X) + 2\epsilon,\ d_H(\mathbf{y},\mathbf{y}') = t\}|.$$
So we can use both the bounds $N'(t_n) \le 2^{n(H(Y|X)+3\epsilon)}$ and $N'(t_n) \le (|\mathcal{Y}|n)^{t_n}$ for large enough $n$. It can then be shown, in the same way as in the proof of Lemma 13, that $P'_{22} \le \exp(-cn/\log n)$ for $m_2 \ge \lceil n(R_2+4\epsilon)/(0.5\log n)\rceil$.
Similarly, it can be shown that $P'_{21}, P'_{23} \le \exp(-cn/\log n)$ for large enough $n$ if $m_1 \ge \lceil n(R_1+4\epsilon)/(0.5\log n)\rceil$ and $m_2 \ge \lceil n(R_2+4\epsilon)/(0.5\log n)\rceil$ for a rate pair $(R_1, R_2)$ in the Slepian-Wolf rate-region. Since $P'_1$ goes to zero exponentially as in (33), it follows that $P^n_e(MED) \le \exp(-cn/\log n)$ for large enough $n$ for some constant $c$.

VIII. GENERALIZATION TO OTHER SOURCE NETWORKS: PROOF OF THEOREM 8

The simplest generalization of the Slepian-Wolf source network is to multiple sources, as shown in Fig. 2. The same proof technique can be used to show that the decoder can recover all the sources with exponentially small probability of error if the encoders perform random real encoding at rates satisfying
$$\sum_{i \in \mathcal{L}} R_i \ge H\left(X_\mathcal{L} \mid X_{\mathcal{L}^c}\right)$$
for each $\mathcal{L} \subseteq \{1, 2, \ldots, k\}$. Here $\mathcal{L}^c$ denotes the complement of $\mathcal{L}$. Using the same proof technique as outlined in Section VII, one can show that the decoder can also use minimum entropy decoding to attain vanishing probability of error.

Fig. 2. A simple multi-source network (sources $X_1, \ldots, X_k$ feed encoders $1, \ldots, k$ at rates $R_1, \ldots, R_k$, all of which feed a single decoder).

Csiszár and Körner [9] extended the result of Slepian and Wolf to more general source networks called normal source networks (NSNs) without helpers. In the following, we briefly describe their source network and argue that our coding technique can achieve the achievable rate-region of an NSN without helpers. Let $\mathcal{A}$, $\mathcal{B}$ and $\mathcal{C}$ denote the sets of sources, encoders and decoders, respectively, in the network. For any $c \in \mathcal{C}$, let $S_c$ denote the set of source nodes from which information is received at the decoder node $c$, and let $D_c$ denote the set of sources which are to be reproduced at $c$. An NSN, as defined in [9] and an example of which is shown in Fig.
3, is a source network where (i) there are no direct edges from the sources to the decoders,

Fig. 3. A normal source network (sources $X_1, \ldots, X_k$ feed encoders $1, \ldots, k$, which feed decoders $1, \ldots, l$).

(ii) $|\mathcal{A}| = |\mathcal{B}|$ and the edges from $\mathcal{A}$ to $\mathcal{B}$ define a one-to-one correspondence between the sources and encoders, (iii) all the sets $S_c$, $c \in \mathcal{C}$, are different, and (iv) for each pair of output vertices $c'$ and $c''$, the inclusion $S_{c'} \subseteq S_{c''}$ implies $D_{c'} \subseteq D_{c''}$.

For a source $a \in \mathcal{A}$, let $X_a$ denote the i.i.d. data generated by the source. Similarly, for a subset $\mathcal{L} \subseteq \mathcal{A}$, let $X_\mathcal{L}$ denote the vector $(X_a)_{a \in \mathcal{L}}$. A source $a$ in an NSN is called a helper if for some $c \in \mathcal{C}$, $a \in S_c \setminus D_c$. Clearly, a source network without helpers satisfies $S_c = D_c$ for all $c \in \mathcal{C}$. For any encoder $b \in \mathcal{B}$, let $R_b$ denote its encoding rate. For a source network without helpers, Csiszár and Körner characterized the rate-region.

Theorem 14 ([9]): The achievable rate-region of an NSN without helpers equals the set of those vectors $\bar{R} = \{R_b\}_{b \in \mathcal{B}}$ which satisfy the inequalities
$$\sum_{b \in \mathcal{L}} R_b \ge H\left(X_\mathcal{L} \mid X_{S_c \setminus \mathcal{L}}\right) \qquad (34)$$
for every output $c \in \mathcal{C}$ and set $\mathcal{L} \subset S_c$.

The achievability proof of this rate-region reduces to the achievability proof of the corresponding rate-region for each of the networks obtained by taking all the sources and one decoder. In other words, if the encoders encode at rates satisfying the conditions of Theorem 14, the probability of error for each decoder is negligible. So the proof reduces to the proof for the multiple-source network shown in Fig. 2. It thus follows that the rate-region of any NSN without helpers is achievable by random real encoding at each encoder. Moreover, the rate-region is also achievable with minimum entropy decoders.

IX.
CONCLUSION

The Real Slepian-Wolf Codes analyzed here provide a novel achievability proof of the Slepian-Wolf theorem. Perhaps just as importantly, they demonstrate the intriguing possibility of designing information-theoretic codes via convex optimization techniques. For instance, since decoding RSWCs is equivalent to solving an optimization problem, it is natural to consider similar "real" codes for problems where some function of the code simultaneously needs to be optimized. We are currently investigating the performance of RSWCs under more structured choices of encoding matrices, with the hope of obtaining codes for which IP decoding is equivalent to LP decoding, and is therefore computationally tractable.

APPENDIX
PROOF OF LEMMA 9

First consider $\Pr\{\sum_{i=1}^n W_i > A\}$. We define $E \triangleq \{(w_1, w_2, \ldots, w_n) \mid \sum_{i=1}^n w_i > A\}$. Let $p_w$ denote the probability mass distribution of $W_i$. Then
$$\Pr\left\{\sum_{i=1}^n W_i > A\right\} = \Pr\{E\} = \Pr\left\{p_n \,\Big|\, \mu_{p_n} > \frac{A}{n}\right\}.$$
Here $p_n$ denotes the type of $(w_1, w_2, \ldots, w_n)$ and $\mu_{p_n}$ denotes the mean of $p_n$. By Sanov's Theorem [11, Theorem 12.4.1], we have
$$\Pr\left\{\sum_{i=1}^n W_i > A\right\} = p^n_w(E) \le (n+1)^{|\mathcal{W}|}\, 2^{-nD(p^*_n \| p_w)},$$
where $p^*_n = \arg\min_{p_n : \mu_{p_n} > A/n} D(p_n \| p_w)$. Since $p_w$ has zero mean, the "nearest" distribution to $p_w$ that has mean greater than $A/n$ in absolute value must differ from $p_w$ in the largest absolute component by at least $A/(an)$. So $\mu_{p^*_n} > A/n$ implies $|p^*_n - p_w|_1 > A/(an)$. We then have
$$D(p^*_n \| p_w) \ge \frac{1}{2\ln 2}\,|p^*_n - p_w|_1^2 > \frac{A^2}{2(na)^2 \ln 2}$$
by [11, Lemma 12.6.1]. So,
$$\Pr\left\{\sum_{i=1}^n W_i > A\right\} \le (n+1)^{|\mathcal{W}|} \exp\left(-\frac{A^2}{2na^2}\right).$$
Similarly, one can show that $\Pr\{\sum_{i=1}^n W_i < -A\} \le (n+1)^{|\mathcal{W}|}\exp\left(-A^2/(2na^2)\right)$. So the result follows.
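Lemma 9's conclusion is a Hoeffding-type concentration bound. A quick numerical sanity check is sketched below, assuming a zero-mean alphabet $\mathcal{W} = \{-1, 0, +1\}$ (so $a = 1$ and $|\mathcal{W}| = 3$); the function name and parameter values are illustrative.

```python
import math
import random

def lemma9_bound(n, A, a, alphabet_size):
    """One-sided upper bound from Lemma 9:
    Pr{ sum W_i > A } <= (n+1)^{|W|} * exp(-A^2 / (2 n a^2))
    for zero-mean W_i supported on [-a, a]."""
    return (n + 1) ** alphabet_size * math.exp(-A ** 2 / (2 * n * a ** 2))

# Monte-Carlo comparison for W_i uniform on {-1, 0, +1}
random.seed(1)
n, A, trials = 200, 80.0, 20000
hits = sum(
    1 for _ in range(trials)
    if sum(random.choice([-1, 0, 1]) for _ in range(n)) > A
)
empirical = hits / trials
bound = lemma9_bound(n, A, a=1.0, alphabet_size=3)
```

The polynomial factor $(n+1)^{|\mathcal{W}|}$ from the method of types makes the bound loose at small $n$; it becomes nontrivial only once $A^2/(2na^2)$ dominates $|\mathcal{W}|\log(n+1)$.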


ACKNOWLEDGMENTS

The authors gratefully acknowledge support from the CUHK direct grant, the CU-MS-JL grant, and a grant from the Bharti Centre for Communication. We would like to thank S. Shenvi for his interest and involvement in several stages of this work. We would also like to thank D. Manjunath for fruitful discussions.

REFERENCES

[1] S. Shenvi, B. K. Dey, S. Jaggi, and M. Langberg, ""Real" Slepian-Wolf codes," in IEEE International Symposium on Information Theory (ISIT), Toronto, Canada, July 2008.
[2] D. Slepian and J. K. Wolf, "Noiseless coding of correlated information sources," IEEE Transactions on Information Theory, vol. 19, pp. 471-480, July 1973.
[3] S. Pradhan, J. Kusuma, and K. Ramchandran, "Distributed compression in a dense microsensor network," IEEE Signal Processing Magazine, vol. 19, pp. 51-60, March 2002.
[4] I. Csiszár and P. Narayan, "Common randomness and secret key generation with a helper," IEEE Transactions on Information Theory, vol. 46, pp. 344-366, March 2000.
[5] R. Puri and K. Ramchandran, "PRISM: A new robust video coding architecture based on distributed compression principles," in Proceedings of the Allerton Conference on Communications, Control, and Computing, October 2002.
[6] A. J. Hoffman, "The role of unimodularity in applying linear inequalities to combinatorial theorems," Annals of Discrete Mathematics, vol. 4, pp. 73-84, 1979.
[7] I. Csiszár, "Linear codes for sources and source networks: Error exponents, universal coding," IEEE Transactions on Information Theory, vol. 28, no. 4, pp. 585-592, 1982.
[8] E. Candès, J. Romberg, and T. Tao, "Robust uncertainty principles: Exact signal reconstruction from highly incomplete frequency information," IEEE Transactions on Information Theory, vol. 52, pp. 489-509, February 2006.
[9] I.
Csiszár and J. Körner, "Towards a general theory of source networks," IEEE Transactions on Information Theory, vol. 26, no. 2, pp. 155-165, 1980.
[10] C. E. Shannon, "A mathematical theory of communication," Bell System Technical Journal, vol. 27, pp. 379-423, 623-656, 1948.
[11] T. Cover and J. Thomas, Elements of Information Theory. John Wiley and Sons, 1991.
[12] J. Garcia-Frias and Y. Zhao, "Compression of correlated binary sources using turbo codes," IEEE Communications Letters, pp. 417-419, October 2001.
[13] T. P. Coleman, A. H. Lee, M. Médard, and M. Effros, "On some new approaches to practical Slepian-Wolf compression inspired by channel coding," in Proceedings of the Conference on Data Compression, p. 282, March 2004.
[14] D. Donoho, "Compressed sensing," IEEE Transactions on Information Theory, vol. 52, pp. 1289-1306, April 2006.
[15] E. Candès and T. Tao, "Decoding by linear programming," IEEE Transactions on Information Theory, vol. 51, pp. 4203-4215, December 2005.
[16] R. Urbanke and B. Rimoldi, "Lattice codes can achieve capacity on the AWGN channel," IEEE Transactions on Information Theory, vol. 44, no. 1, pp. 273-278, 1998.
[17] W. Feller, An Introduction to Probability Theory and Its Applications, Volume II, 2nd ed. New York: John Wiley & Sons, 1972.
[18] R. B. Ash, Information Theory. New York: Dover Publications, 1965.
[19] R. W. Yeung, Information Theory and Network Coding. Springer. Available at http://www.springerlink.com/content/978-0-387-79233-0.
