A Class of Transformations that Polarize Symmetric Binary-Input Memoryless Channels

Satish Babu Korada and Eren Şaşoğlu

October 24, 2018

Abstract

A generalization of Arıkan's polar code construction using transformations of the form $G^{\otimes n}$, where $G$ is an $\ell \times \ell$ matrix, is considered. Necessary and sufficient conditions are given for these transformations to ensure channel polarization. It is shown that a large class of such transformations polarize symmetric binary-input memoryless channels.

1 Introduction

Polar codes, introduced by Arıkan in [1], are the first provably capacity-achieving codes for arbitrary symmetric binary-input discrete memoryless channels (B-DMC) with low encoding and decoding complexity. Polar code construction is based on the following observation: Let
\[
G_2 = \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix}. \tag{1}
\]
Consider applying the transform $G_2^{\otimes n}$ (where "$\otimes n$" denotes the $n$th Kronecker power) to a block of $N = 2^n$ bits and transmitting the output through independent copies of a B-DMC $W$ (see Figure 1). As $n$ grows large, the channels seen by individual bits (suitably defined in [1]) start polarizing: they approach either a noiseless channel or a pure-noise channel, where the fraction of channels becoming noiseless is close to the symmetric mutual information $I(W)$.

It was conjectured in [1] that polarization is a general phenomenon, and is not restricted to the particular transformation $G_2^{\otimes n}$. In this note we give a partial affirmation to this conjecture. In particular, we consider transformations of the form $G^{\otimes n}$ where $G$ is an $\ell \times \ell$ matrix for $\ell \ge 3$ and provide necessary and sufficient conditions for such $G$s to polarize symmetric B-DMCs.

2 Preliminaries

Let $W : \{0,1\} \to \mathcal{Y}$ be a B-DMC. Let $I(W) \in [0,1]$ denote the mutual information between the input and output of $W$ with uniform distribution on the inputs.
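To make the transform from the Introduction concrete, here is a minimal Python sketch that builds the $n$th Kronecker power of a 0/1 matrix and applies it to a block of bits over GF(2). The helper names `kron_gf2`, `kron_power` and `encode` are hypothetical and chosen only for illustration; the row-vector convention $x = uG$ is the one this note uses, and the sketch makes no claim about how any existing polar-code implementation is organized.

```python
def kron_gf2(a, b):
    """Kronecker product of two 0/1 matrices given as lists of rows."""
    return [
        [a_ij * b_kl for a_ij in row_a for b_kl in row_b]
        for row_a in a
        for row_b in b
    ]


def kron_power(g, n):
    """n-th Kronecker power of g; the 0-th power is the 1x1 matrix [[1]]."""
    result = [[1]]
    for _ in range(n):
        result = kron_gf2(result, g)
    return result


def encode(u, g):
    """Row vector times matrix over GF(2): x = u g."""
    return [
        sum(u[i] * g[i][j] for i in range(len(u))) % 2
        for j in range(len(g[0]))
    ]


G2 = [[1, 0], [1, 1]]          # the 2x2 kernel of equation (1)
G4 = kron_power(G2, 2)         # acts on blocks of N = 2^2 = 4 bits
x = encode([1, 0, 1, 1], G4)   # [1, 1, 0, 1]
```

The same three functions work unchanged for an $\ell \times \ell$ kernel with $\ell \ge 3$, in which case `kron_power(g, n)` acts on blocks of $\ell^n$ bits.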
Also let $Z(W) \in [0,1]$ denote the Bhattacharyya parameter of $W$, i.e.,
\[
Z(W) = \sum_{y \in \mathcal{Y}} \sqrt{W(y \mid 0)\, W(y \mid 1)}.
\]

Fix an $\ell \ge 3$ and an invertible $\ell \times \ell$ $\{0,1\}$ matrix $G$. Consider a random $\ell$-vector $U_1^\ell$ that is uniformly distributed over $\{0,1\}^\ell$. Let $X_1^\ell = U_1^\ell G$, where the multiplication is performed over GF(2). Also let $Y_1^\ell$ be the output of $\ell$ uses of $W$ with the input $X_1^\ell$.

[Figure 1: bit 1, bit 2, ..., bit $N$ enter the transform $G^{\otimes n}$, and each output is transmitted through an independent copy of $W$.]

Observe now that the channel between $U_1^\ell$ and $Y_1^\ell$ is defined by the transition probabilities
\[
W_\ell(y_1^\ell \mid u_1^\ell) \triangleq \prod_{i=1}^{\ell} W(y_i \mid x_i) = \prod_{i=1}^{\ell} W\big(y_i \mid (u_1^\ell G)_i\big).
\]
Define $W^{(i)} : \{0,1\} \to \mathcal{Y}^\ell \times \{0,1\}^{i-1}$ as the channel with input $u_i$, output $(y_1^\ell, u_1^{i-1})$ and transition probabilities
\[
W^{(i)}(y_1^\ell, u_1^{i-1} \mid u_i) = \frac{1}{2^{\ell-1}} \sum_{u_{i+1}^{\ell}} W_\ell(y_1^\ell \mid u_1^\ell),
\]
and let $Z^{(i)}$ denote its Bhattacharyya parameter, i.e.,
\[
Z^{(i)} = \sum_{y_1^\ell,\, u_1^{i-1}} \sqrt{W^{(i)}(y_1^\ell, u_1^{i-1} \mid 0)\, W^{(i)}(y_1^\ell, u_1^{i-1} \mid 1)}.
\]
For $k \ge 1$, let $W^k : \{0,1\} \to \mathcal{Y}^k$ denote the B-DMC with transition probabilities
\[
W^k(y_1^k \mid x) = \prod_{j=1}^{k} W(y_j \mid x).
\]
Also let $\tilde{W}^{(i)} : \{0,1\} \to \mathcal{Y}^\ell$ denote the B-DMC with transition probabilities
\[
\tilde{W}^{(i)}(y_1^\ell \mid u_i) = \frac{1}{2^{\ell-i}} \sum_{u_{i+1}^{\ell}} W_\ell(y_1^\ell \mid 0_1^{i-1}, u_i^\ell). \tag{2}
\]

Observation 1. If $W$ is symmetric, then the channels $W^{(i)}$ and $\tilde{W}^{(i)}$ are equivalent in the sense that for any fixed $u_1^{i-1}$ there exists a permutation $\pi_{u_1^{i-1}} : \mathcal{Y}^\ell \to \mathcal{Y}^\ell$ such that
\[
W^{(i)}(y_1^\ell, u_1^{i-1} \mid u_i) = \frac{1}{2^{i-1}}\, \tilde{W}^{(i)}\big(\pi_{u_1^{i-1}}(y_1^\ell) \mid u_i\big).
\]

Finally, let $I^{(i)}$ denote the mutual information between the input and output of channel $W^{(i)}$. Since $G$ is invertible, it is easy to check that
\[
\sum_{i=1}^{\ell} I^{(i)} = \ell\, I(W).
\]

3 Polarization

We will say that $G$ is a polarizing matrix if there exists an $i \in \{1, \dots, \ell\}$ for which $\tilde{W}^{(i)}$ is equivalent to $W^k$ for some $k \ge 2$, in the sense that
\[
\tilde{W}^{(i)}(y_1^\ell \mid u_i) = c \prod_{j \in A} W(y_j \mid u_i) \tag{3}
\]
for some constant $c$ and $A \subseteq \{1, \dots, \ell\}$ with $|A| = k$. If $W$ is symmetric, then Observation 1 implies the equivalence of $W^{(i)}$ and $W^k$ (which we denote by $W^{(i)} \equiv W^k$) in the sense that
\[
W^{(i)}(y_1^\ell, u_1^{i-1} \mid u_i) = \frac{c}{2^{i-1}} \prod_{j \in A} W\big((\pi_{u_1^{i-1}}(y_1^\ell))_j \mid u_i\big). \tag{4}
\]
Note that the equivalence $W^{(i)} \equiv W^k$ implies $I^{(i)} = I(W^k)$ and $Z^{(i)} = Z(W^k)$.

It will be shown that channel transformations of the form $G^{\otimes n}$ polarize symmetric channels if and only if $G$ is polarizing. This statement is made precise in the following theorem:

Theorem 1. Fix a symmetric B-DMC $W$. Let $G^{\otimes n}$ denote the $n$th Kronecker power of $G$ and consider the transformation
\[
G^{\otimes n} : W \to \big(W^{(i)} : i = 1, \dots, \ell^n\big).
\]
i. If $G$ is polarizing, then for any $\delta > 0$
\[
\lim_{n \to \infty} \frac{\#\big\{ i \in \{1, \dots, \ell^n\} : I(W^{(i)}) \in (\delta, 1-\delta) \big\}}{\ell^n} = 0.
\]
ii. If $G$ is not polarizing, then $I(W^{(i)}) = I(W)$ for all $n$ and $i \in \{1, \dots, \ell^n\}$.

Theorem 1 is a direct consequence of Lemmas 1 and 2 below.

Note that any invertible $\{0,1\}$ matrix $G$ can be written as a (real) sum $G = P + P'$, where $P$ is a permutation matrix and $P'$ is a $\{0,1\}$ matrix. This fact can be inferred from Hall's Theorem [3, Theorem 16.4]. Therefore, for any such matrix $G$, there exists a column permutation that results in $G_{ii} = 1$ for all $i$. Since the transition probabilities defining $W^{(i)}$ are invariant (up to a permutation of the outputs $y_1^\ell$) under column permutations on $G$, we only consider matrices with 1s on the diagonal. The following lemma gives necessary and sufficient conditions for (3) to be satisfied:

Lemma 1. For any symmetric B-DMC $W$,
i. If $G$ is not upper triangular, then there exists an $i$ for which $W^{(i)} \equiv W^k$ for some $k \ge 2$.
ii. If $G$ is upper triangular, then $W^{(i)} \equiv W$ for all $1 \le i \le \ell$.

Proof. Let $G^{(\ell-i)}$ be the $(\ell-i) \times (\ell-i)$ matrix obtained from $G$ by removing its last $i$ rows and columns. Let the number of 1s in the last row of $G$ be $k$. Clearly $W^{(\ell)} \equiv W^k$. If $k \ge 2$ then $G$ is not upper triangular and the first claim of the lemma holds. If $k = 1$, then $W^{(\ell)} \equiv W$, and $(x_1, \dots, x_{\ell-1})$ is independent of $u_\ell$. One can then write
\begin{align*}
W^{(\ell-i)}(y_1^\ell, u_1^{\ell-i-1} \mid u_{\ell-i})
&= \frac{1}{2^{\ell-1}} \sum_{u_{\ell-i+1}^{\ell}} W_\ell(y_1^\ell \mid u_1^\ell) \\
&= \frac{1}{2^{\ell-1}} \sum_{u_{\ell-i+1}^{\ell-1},\, u_\ell} \Pr[Y_1^{\ell-1} = y_1^{\ell-1} \mid U_1^\ell = u_1^\ell]\, \Pr[Y_\ell = y_\ell \mid Y_1^{\ell-1} = y_1^{\ell-1}, U_1^\ell = u_1^\ell] \\
&\overset{(a)}{=} \frac{1}{2^{\ell-1}} \sum_{u_{\ell-i+1}^{\ell-1},\, u_\ell} W_{\ell-1}(y_1^{\ell-1} \mid u_1^{\ell-1})\, \Pr[Y_\ell = y_\ell \mid Y_1^{\ell-1} = y_1^{\ell-1}, U_1^\ell = u_1^\ell] \\
&= \frac{1}{2^{\ell-1}} \sum_{u_{\ell-i+1}^{\ell-1}} W_{\ell-1}(y_1^{\ell-1} \mid u_1^{\ell-1}) \sum_{u_\ell} \Pr[Y_\ell = y_\ell \mid Y_1^{\ell-1} = y_1^{\ell-1}, U_1^\ell = u_1^\ell] \\
&= \frac{1}{2^{\ell-1}} \big[ W(y_\ell \mid 0) + W(y_\ell \mid 1) \big] \sum_{u_{\ell-i+1}^{\ell-1}} W_{\ell-1}(y_1^{\ell-1} \mid u_1^{\ell-1}),
\end{align*}
where (a) follows from the fact that $G_{\ell k} = 0$ for all $k < \ell$. Therefore $y_\ell$ is independent of the inputs to the channels $W^{(\ell-i)}$ for $i = 1, \dots, \ell-1$. This is equivalent to saying that the channels $W^{(1)}, \dots, W^{(\ell-1)}$ are defined by the matrix $G^{(\ell-1)}$. Applying the same argument to $G^{(\ell-1)}$ and repeating, we see that if $G$ is upper triangular, then we have $W^{(i)} \equiv W$ for all $i$. On the other hand, if $G$ is not upper triangular, then there exists an $i$ for which $G^{(\ell-i)}$ has at least two 1s in its last row, which in turn implies $W^{(\ell-i)} \equiv W^k$ for some $k \ge 2$.

Remark 1. The above lemma says that all transformations that are not upper triangular are polarizing. Moreover, upper triangular transformations have no effect on the channel, i.e., each bit sees an independent copy of $W$ after an upper triangular transformation.

Corollary 1. For any polarizing transformation $G$, there exist an $i \in \{1, \dots, \ell\}$ and $k \ge 2$ for which
\begin{align}
I^{(i)} &= I(W^k), \tag{5} \\
Z^{(i)} &= Z(W)^k. \tag{6}
\end{align}

Proof. The first claim is trivial. The second claim follows from the fact that the Bhattacharyya parameter of any channel of the form $\prod_j W_j$ is given by $\prod_j Z(W_j)$.

4 Convergence

Consider recursively combining channels $W$ as in [1], using a polarizing transformation $G$. Following Arıkan, associate to this construction a tree process $\{W_n; n \ge 0\}$ with
\[
W_0 = W, \qquad W_{n+1} = W_n^{(B_{n+1})},
\]
where $\{B_n; n \ge 1\}$ is a sequence of i.i.d. random variables defined on a probability space $(\Omega, \mathcal{F}, \mu)$, $B_n$ being uniformly distributed over the set $\{1, \dots, \ell\}$. Define $\mathcal{F}_0 = \{\emptyset, \Omega\}$ and $\mathcal{F}_n = \sigma(B_1, \dots, B_n)$ for $n \ge 1$. Define the processes $\{I_n; n \ge 0\} = \{I(W_n); n \ge 0\}$ and $\{Z_n; n \ge 0\} = \{Z(W_n); n \ge 0\}$.

Observation 2. $\{(I_n, \mathcal{F}_n)\}$ is a bounded martingale and therefore converges a.s. and in $L^1$ to a random variable $I_\infty$.

Lemma 2. If $W$ is symmetric and $G$ is polarizing, then
\[
I_\infty = \begin{cases} 1 & \text{w.p. } I(W), \\ 0 & \text{w.p. } 1 - I(W). \end{cases}
\]

Proof. By the convergence in $L^1$ of $I_n$ we have $E[|I_{n+1} - I_n|] \to 0$ as $n \to \infty$. Since $G$ is a polarizing matrix, Lemma 1 implies $I_{n+1} = I(W_n^k)$ with probability at least $1/\ell$, for some $k \ge 2$. This in turn implies
\[
E[|I_{n+1} - I_n|] \ge \frac{1}{\ell}\, E\big[I(W_n^k) - I(W_n)\big] \to 0. \tag{7}
\]
It is shown in the Appendix that for any symmetric B-DMC $W_n$, if $I(W_n) \in (\delta, 1-\delta)$ for some $\delta > 0$, then there exists an $\eta(\delta) > 0$ such that $I(W_n^k) - I(W_n) > \eta(\delta)$. We therefore conclude that convergence in (7) implies $I_\infty \in \{0, 1\}$ w.p. 1. The claim on the probability distribution of $I_\infty$ follows from the fact that $\{I_n\}$ is a martingale, i.e., $E[I_\infty] = E[I_0] = I(W)$.

Corollary 2. If $W$ is symmetric and $G$ is polarizing, then $\{Z_n\}$ converges a.s. to a random variable $Z_\infty$ and
\[
Z_\infty = \begin{cases} 0 & \text{w.p. } I(W), \\ 1 & \text{w.p. } 1 - I(W). \end{cases}
\]

Proof. The proof follows from the fact that $I_n \to I_\infty$ a.s. and the inequalities [1]
\[
I(Q)^2 + Z(Q)^2 \le 1, \qquad I(Q) + Z(Q) \ge 1
\]
for any B-DMC $Q$.

Theorem 2. Given a symmetric B-DMC $W$, an $\ell \times \ell$ polarizing matrix $G$, and any $\beta < 1/\ell$,
\[
\lim_{n \to \infty} \Pr\big[Z_n \le 2^{-\ell^{n\beta}}\big] = I(W).
\]

Proof Idea. For any polarizing matrix it can be shown that $Z_{n+1} \le \ell Z_n$ with probability 1 and that $Z_{n+1} \le Z_n^2$ with probability at least $1/\ell$. The proof then follows by adapting the proof of [2, Theorem 3].

5 Discussion

Using Arıkan's rule for choosing the information bits, polar codes of blocklength $N = \ell^n$ can be constructed starting with any polarizing $\ell \times \ell$ matrix $G$. The encoding and successive cancellation decoding complexities of such codes are $O(N \log N)$. Using similar arguments, it is easy to show that polar codes of blocklength $N = \prod_{i=1}^{n} \ell_i$ can be constructed from generator matrices of the form $\otimes_i G_i$, where each $G_i$ is a polarizing matrix of size $\ell_i \times \ell_i$. The encoding and successive cancellation decoding complexities of these codes are also $O(N \log N)$.

Appendix

In this section we prove the following:

Lemma 3. Let $W$ be a symmetric B-DMC and let $W^k$ be defined as above. If $I(W) \in (\delta, 1-\delta)$ for some $\delta > 0$, then there exists an $\eta(\delta) > 0$ such that $I(W^k) - I(W) > \eta(\delta)$.

We will use the following theorem in proving Lemma 3:

Theorem 3 ([4, 5]). Let $W_1, \dots, W_k$ be $k$ symmetric B-DMCs with capacities $I_1, \dots, I_k$ respectively. Let $W^{[k]}$ denote the channel with transition probabilities
\[
W^{[k]}(y_1^k \mid x) = \prod_{i=1}^{k} W_i(y_i \mid x).
\]
Also let $W^{[k]}_{\mathrm{BSC}}$ denote the channel with transition probabilities
\[
W^{[k]}_{\mathrm{BSC}}(y_1^k \mid x) = \prod_{i=1}^{k} W_{\mathrm{BSC}(\epsilon_i)}(y_i \mid x),
\]
where $\mathrm{BSC}(\epsilon_i)$ denotes the binary symmetric channel with crossover probability $\epsilon_i \in [0, \tfrac{1}{2}]$, $\epsilon_i \triangleq h^{-1}(1 - I_i)$, where $h$ denotes the binary entropy function. Then, $I(W^{[k]}) \ge I(W^{[k]}_{\mathrm{BSC}})$.

Remark 2. Consider the transmission of a single bit $X$ using $k$ independent symmetric B-DMCs $W_1, \dots, W_k$ with capacities $I_1, \dots, I_k$. Theorem 3 states that over the class of all symmetric channels with given mutual informations, the mutual information between the input and the output vector is minimized when each of the individual channels is a BSC.

Proof of Lemma 3. Let $\epsilon \in [0, \tfrac{1}{2}]$ be the crossover probability of a BSC with capacity $I(W)$, i.e., $\epsilon = h^{-1}(1 - I(W))$. Note that for $k \ge 2$, $I(W^k) \ge I(W^2)$. By Theorem 3, we have $I(W^2) \ge I(W^2_{\mathrm{BSC}(\epsilon)})$. A simple computation shows that
\[
I(W^2_{\mathrm{BSC}(\epsilon)}) = 1 + h(2\epsilon\bar{\epsilon}) - 2h(\epsilon),
\]
where $\bar{\epsilon} \triangleq 1 - \epsilon$. We can then write
\begin{align}
I(W^k) - I(W) &\ge I(W^2_{\mathrm{BSC}(\epsilon)}) - I(W) \notag \\
&= I(W^2_{\mathrm{BSC}(\epsilon)}) - I(W_{\mathrm{BSC}(\epsilon)}) \notag \\
&= h(2\epsilon\bar{\epsilon}) - h(\epsilon). \tag{8}
\end{align}
Note that $I(W) \in (\delta, 1-\delta)$ implies $\epsilon \in (\phi(\delta), \tfrac{1}{2} - \phi(\delta))$ where $\phi(\delta) > 0$, which in turn implies $h(2\epsilon\bar{\epsilon}) - h(\epsilon) > \eta(\delta)$ for some $\eta(\delta) > 0$.

References

[1] E. Arıkan, "Channel polarization: A method for constructing capacity-achieving codes for symmetric binary-input memoryless channels," submitted to IEEE Trans. Inform. Theory. Available at: http://arxiv.org/pdf/0807.3917.
[2] E. Arıkan and E. Telatar, "On the rate of channel polarization." Available at: http://arxiv.org/pdf/0807.3806.
[3] J. A. Bondy and U. S. R. Murty, "Graph Theory," Springer, 2008.
[4] I. Sutskover, S. Shamai and J. Ziv, "Extremes of information combining," IEEE Trans. Inform. Theory, vol. 51, no. 4, pp. 1313–1325, Apr. 2005.
[5] I. Land, S. Huettinger, P. A. Hoeher and J. B. Huber, "Bounds on information combining," IEEE Trans. Inform. Theory, vol. 51, no. 2, pp. 612–619, Feb. 2005.
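The criterion of Lemma 1 can be checked mechanically by following the peeling argument in its proof. The sketch below uses a hypothetical helper name, `is_polarizing`, and assumes that the input is an invertible 0/1 matrix given as a list of rows (invertibility is assumed, not verified). Instead of first permuting columns to put 1s on the diagonal, it deletes the column that holds the single 1 of the last row, which should be an equivalent formulation under that assumption.

```python
def is_polarizing(g):
    """Return True if the invertible 0/1 matrix g is polarizing in the sense of Lemma 1.

    Peeling rule taken from the proof: if the last row has k >= 2 ones, the last
    channel is equivalent to W^k and g is polarizing. If it has exactly one 1,
    drop that row and the column containing the 1, then repeat on what is left.
    """
    rows = [list(r) for r in g]
    while rows:
        last = rows.pop()
        ones = [j for j, bit in enumerate(last) if bit]
        if not ones:
            # Cannot happen for an invertible matrix.
            raise ValueError("matrix is singular: found an all-zero row")
        if len(ones) >= 2:
            return True
        col = ones[0]
        rows = [r[:col] + r[col + 1:] for r in rows]
    # Every peeled row had weight one: g is upper triangular up to a column permutation.
    return False


lower = [[1, 0, 0], [1, 1, 0], [1, 1, 1]]   # last row has three 1s
upper = [[1, 1, 1], [0, 1, 1], [0, 0, 1]]   # upper triangular
print(is_polarizing(lower), is_polarizing(upper))   # True False
```

Because only one row is inspected per step, the check takes at most $\ell$ peeling steps, in contrast to a search over all column permutations.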
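For small $\ell$, the channels $W^{(i)}$ of Section 2 can be enumerated by brute force, which gives a numerical sanity check of the identity $\sum_i I^{(i)} = \ell I(W)$ and of Corollary 1. The sketch below is only an illustration under stated assumptions: the channel is a BSC with an arbitrarily chosen crossover probability 0.11, the kernel is an arbitrarily chosen $3 \times 3$ lower triangular matrix, indices are 0-based, and all function names are hypothetical.

```python
import itertools
import math


def bsc(p):
    """Transition matrix w[x][y] of a binary symmetric channel."""
    return [[1 - p, p], [p, 1 - p]]


def encode(u, g):
    """x = u g over GF(2)."""
    return tuple(sum(u[i] * g[i][j] for i in range(len(u))) % 2
                 for j in range(len(g[0])))


def split_channel(w, g, i):
    """W^(i) for 0-based i, as {(y, u_prefix): [P(. | u_i = 0), P(. | u_i = 1)]}."""
    ell = len(g)
    chan = {}
    for u in itertools.product((0, 1), repeat=ell):
        x = encode(u, g)
        for y in itertools.product(range(len(w[0])), repeat=ell):
            prob = math.prod(w[x[j]][y[j]] for j in range(ell))
            # Summing over all u with the same prefix and the same u_i
            # marginalizes the later inputs, as in the definition of W^(i).
            chan.setdefault((y, u[:i]), [0.0, 0.0])[u[i]] += prob / 2 ** (ell - 1)
    return chan


def product_channel(w, k):
    """W^k: the same input bit sent through k independent copies of w."""
    return {y: [math.prod(w[x][yj] for yj in y) for x in (0, 1)]
            for y in itertools.product(range(len(w[0])), repeat=k)}


def mutual_information(chan):
    """Mutual information in bits, with a uniform input."""
    total = 0.0
    for p0, p1 in chan.values():
        avg = (p0 + p1) / 2
        total += sum(0.5 * p * math.log2(p / avg) for p in (p0, p1) if p > 0)
    return total


def bhattacharyya(chan):
    return sum(math.sqrt(p0 * p1) for p0, p1 in chan.values())


w = bsc(0.11)                              # assumed example channel
g = [[1, 0, 0], [1, 1, 0], [1, 1, 1]]      # assumed example kernel
i_w = mutual_information(product_channel(w, 1))
i_split = [mutual_information(split_channel(w, g, i)) for i in range(3)]
print(sum(i_split), 3 * i_w)               # the two numbers should agree
```

The enumeration costs on the order of $2^\ell |\mathcal{Y}|^\ell$ operations per channel, so it is meant for checking small kernels rather than for code construction.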
