Optimal hash functions for approximate closest pairs on the n-cube
Authors: Daniel M. Gordon, Victor Miller, Peter Ostapenko
Abstract — One way to find near-matches in large datasets is to use hash functions [7], [16]. In recent years locality-sensitive hash functions for various metrics have been given; for the Hamming metric, projecting onto k bits is a simple hash function that performs well. In this paper we investigate alternatives to projection. For various parameters, hash functions given by complete decoding algorithms for error-correcting codes work better, and asymptotically random codes perform better than projection.

I. INTRODUCTION

Given a set of M n-bit vectors, a classical problem is to quickly identify ones which are close in Hamming distance. This problem has applications in numerous areas, such as information retrieval and DNA sequence comparison. The nearest-neighbor problem is to find a vector close to a given one, while the closest-pair problem is to find the pair in the set with the smallest Hamming distance. Approximate versions of these problems allow an answer where the distance may be a factor of (1 + ε) larger than the best possible.

One approach ([7], [12], [16]) is locality-sensitive hashing (LSH). A family of hash functions H is called (r, cr, p_1, p_2)-sensitive if for any two points x, y ∈ V,
• if d(x, y) ≤ r, then Prob(h(x) = h(y)) ≥ p_1,
• if d(x, y) ≥ cr, then Prob(h(x) = h(y)) ≤ p_2.
Let ρ = log(1/p_1)/log(1/p_2). An LSH scheme can be used to solve the approximate nearest neighbor problem for M points in time O(M^ρ). Indyk and Motwani [14] showed that projection has ρ = 1/c. The standard hash to use is projection onto k of the n coordinates [12]. An alternative family of hashes is based on minimum-weight decoding with error-correcting codes [5], [20].
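To make the (r, cr, p_1, p_2) definition concrete: for projection onto k coordinates drawn uniformly at random with replacement, two points at distance d collide with probability exactly (1 − d/n)^k, so closer pairs collide more often. The following small simulation (ours, not from the paper; the parameter values are arbitrary) illustrates this locality-sensitive behavior:

```python
import random

def sample_hash(v, coords):
    """Project an n-bit vector (stored as an int) onto the given coordinates."""
    return tuple((v >> c) & 1 for c in coords)

def collision_rate(n, k, d, trials=20000):
    """Estimate Prob(h(x) = h(y)) over random k-coordinate hashes,
    for a fixed pair x, y at Hamming distance d."""
    x = 0
    y = (1 << d) - 1          # differs from x in exactly d coordinates
    hits = 0
    for _ in range(trials):
        coords = [random.randrange(n) for _ in range(k)]   # with replacement
        hits += sample_hash(x, coords) == sample_hash(y, coords)
    return hits / trials

random.seed(0)
n, k = 64, 8
near = collision_rate(n, k, d=4)    # expect about (1 - 4/64)^8  ~ 0.60
far = collision_rate(n, k, d=24)    # expect about (1 - 24/64)^8 ~ 0.02
assert near > far
```

This is the p_1 > p_2 gap that the O(M^ρ) analysis exploits.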
An [n, k] code C with a complete decoding algorithm defines a hash h_C, where each v ∈ V := F_2^n is mapped to the codeword c ∈ C ⊂ V to which v decodes. Using linear codes for hashing schemes has been independently suggested many times; see [5], [10], and the patents [4] and [20]. In [5] the binary Golay code was suggested to find approximate matches in bit-vectors. Data is provided that suggests it is effective, but it is still not clear when the Golay or other codes work better than projection. In this paper we attempt to quantify this, using tools from coding theory.

Our model is somewhat different from the usual LSH literature. We are interested in the scenario where we have a collection of M random points of V, one of which, x, has been duplicated with errors. The error vector e has each bit nonzero with probability p. Let P_C(p) be the probability that h_C(x) = h_C(x + e). Then the probability of collision of two points x and y is
• if y = x + e, then Prob(h(x) = h(y)) = p̃_1 = P_C(p),
• if y ≠ x + e, then Prob(h(x) = h(y)) = p̃_2 = 2^{−k}.
Then the number of elements that hash to h(x) will be about M/2^k, and the probability that one of these will be y = x + e is P_C(p). If this fails, we may try again with a new hash, say the same one applied after shifting the M points by a fixed vector, and continue until y is found. Let ρ = log(1/p̃_1)/log(1/p̃_2) as for LSH. Taking 2^k ≈ M, we expect to find y in time

$$\frac{M}{2^k P_C(p)} = O(M^\rho).$$

As with LSH, we want to optimize this by minimizing ρ, i.e. finding a hash function maximizing P_C(p).

(D. Gordon and P. Ostapenko are with the IDA Center for Communications Research, 4320 Westerra Court, San Diego, 92121. V. Miller is with the IDA Center for Communications Research, 805 Bunn Drive, Princeton, New Jersey 08540.)
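The bucket-then-retry search just described can be sketched as follows. This is a toy simulation of ours, not the paper's implementation: for the per-round re-randomization we sample a fresh set of k coordinates (standard bit sampling) rather than the fixed-shift variant mentioned above, and all names are ours.

```python
import random

def find_duplicate(points, query, n, k, rounds=200):
    """Toy search for the point that `query` is a noisy copy of.

    Each round buckets all points by their values on k freshly sampled
    coordinates and scans the query's bucket, keeping the closest point seen.
    """
    best, best_dist = None, n + 1
    for _ in range(rounds):
        coords = random.sample(range(n), k)
        h = lambda v: tuple((v >> c) & 1 for c in coords)
        buckets = {}
        for pt in points:
            buckets.setdefault(h(pt), []).append(pt)
        for pt in buckets.get(h(query), []):
            d = bin(pt ^ query).count("1")    # Hamming distance
            if d < best_dist:
                best, best_dist = pt, d
    return best, best_dist

# M = 2^k random n-bit points; one is duplicated with bit-flip probability p.
random.seed(1)
n, k, p = 32, 10, 0.05
points = [random.getrandbits(n) for _ in range(2 ** k)]
x = points[123]
e = sum(1 << i for i in range(n) if random.random() < p)
found, dist = find_duplicate(points, x ^ e, n, k)
```

With 2^k ≈ M the buckets have expected size one, matching the time analysis above.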
For a linear code with a complete translation-invariant decoding algorithm (so that h(x) = c implies that h(x + c′) = c + c′), studying P_C is equivalent to studying the properties of the set S of all points in V that decode to 0. In Section III and the Appendix we systematically investigate sets of size ≤ 64. Suppose that we pick a random x ∈ S. Then the probability that y = x + e is in S is

$$P_S(p) = \frac{1}{|S|} \sum_{x,y \in S} p^{d(x,y)} (1-p)^{n-d(x,y)}. \qquad (1)$$

This function has been studied extensively in the setting of error-detecting codes [17]. In that literature, S is a code, P_S(p) is the probability of an undetected error, and the goal is to minimize this probability. Here, on the other hand, we will call a set optimal for p if no set in V of size |S| has greater probability. As the error rate p approaches 1/2, this coincides with the definition of distance-sum optimal sets, which were first studied by Ahlswede and Katona [1].

The error exponent of a code C is

$$E_C(p) = -\frac{1}{n} \lg P_C(p).$$

In this paper lg denotes log to base 2. We are interested in properties of the error exponent over codes of rate R = k/n as n → ∞. Note that ρ = E_C(p)/R, so minimizing the error exponent will give us the best code to use for finding closest pairs. In Section IV we will show that hash functions from random (nonlinear) codes have a better (smaller) error exponent than projection.

II. HASH FUNCTIONS FROM CODES

For a set S ⊂ V, let

$$A_i = \#\{(x, y) : x, y \in S \text{ and } d(x,y) = i\}$$

count the number of pairs of words in S at distance i. The distance distribution function is

$$A(S, \zeta) := \sum_{i=0}^{n} A_i \zeta^i. \qquad (2)$$

This function is directly connected to P_S(p) [17].
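The connection between (1) and the distance distribution (2) is easy to verify numerically. The following sketch (ours, with an arbitrary small set S) computes the collision probability both ways, exactly, using rational arithmetic:

```python
from fractions import Fraction

def hamming(x, y):
    return bin(x ^ y).count("1")

def P_S_direct(S, p, n):
    """Collision probability via the double sum (1)."""
    return sum(p ** hamming(x, y) * (1 - p) ** (n - hamming(x, y))
               for x in S for y in S) / len(S)

def distance_distribution(S, n):
    """Coefficients A_i of the distance distribution function (2)."""
    A = [0] * (n + 1)
    for x in S:
        for y in S:
            A[hamming(x, y)] += 1
    return A

def P_S_from_A(S, p, n):
    """Same probability via (1-p)^n / |S| * A(S, p/(1-p))."""
    A = distance_distribution(S, n)
    z = p / (1 - p)
    return (1 - p) ** n / len(S) * sum(a * z ** i for i, a in enumerate(A))

n = 5
S = [0b00000, 0b00001, 0b00011, 0b10001]   # an arbitrary small set
p = Fraction(1, 10)
assert P_S_direct(S, p, n) == P_S_from_A(S, p, n)
```

For the subcube {0, 1, 2, 3} in F_2^5 both routes give (1 − p)^3, in agreement with the projection formula derived next.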
If x is a random element of S, and y = x + e, where e is an error vector in which each bit is nonzero with probability p, then the probability that y ∈ S is

$$P_S(p) := \frac{1}{|S|} \sum_{x,y \in S} p^{d(x,y)} (1-p)^{n-d(x,y)} \qquad (3)$$
$$= \frac{1}{|S|} \sum_{i=0}^{n} A_i p^i (1-p)^{n-i} = \frac{(1-p)^n}{|S|} A\!\left(S, \frac{p}{1-p}\right).$$

In this section we will evaluate (3) for projection and for perfect codes, and then consider other linear codes.

A. Projection

The simplest hash is to project vectors in V onto k coordinates. Let k-projection denote the [n, k] code P_{n,k} corresponding to this hash. The associated set S of vectors mapped to 0 is a 2^{n−k}-subcube of V. The distance distribution function is

$$A(S, \zeta) = (2(1+\zeta))^{n-k}, \qquad (4)$$

so the probability of collision is

$$P_{P_{n,k}}(p) = \frac{(1-p)^n}{2^{n-k}} \left(\frac{2}{1-p}\right)^{n-k} = (1-p)^k. \qquad (5)$$

P_{n,k} is not a good error-correcting code, but for sufficiently small error probability its hash function is optimal.

Theorem 1: Let S be the 2^{n−k}-subcube of V. For any error probability p ∈ (0, 2^{−2(n−k)}), S is an optimal set, and so k-projection is an optimal hash.

Proof: The distance distribution function for S is A(S, ζ) = 2^{n−k}(1+ζ)^{n−k}. The edge isoperimetric inequality for the n-cube [13] states:

Lemma 2: Any subset S of the vertices of the n-dimensional cube Q_n has at most (1/2)|S| lg|S| edges between vertices in S, with equality if and only if S is a subcube.

Any set S′ with 2^{n−k} points has distance distribution function

$$A(S', \zeta) = \sum_{i=0}^{n} c_i \zeta^i,$$

where c_0 = 2^{n−k}, c_1 < (n−k)2^{n−k} by Lemma 2, and the sum of the c_i's is 2^{2(n−k)}. By (3) the probability of collision is (1−p)^n/2^{n−k} · A(S′, p/(1−p)).
Since ζ ∈ (0, 1), ζ^i ≤ ζ^2 for i ≥ 2, so

$$A(S', \zeta) \le 2^{n-k} + \zeta\left((n-k)2^{n-k} - 1\right) + \zeta^2\left(2^{2(n-k)} - (n-k+1)2^{n-k} + 1\right),$$

and

$$A(S, \zeta) - A(S', \zeta) \ge \zeta - \zeta^2\left(2^{2(n-k)} - 2^{n-k-1}\left((n-k)^2 + (n-k) + 2\right) + 1\right) > \zeta - \zeta^2\left(2^{2(n-k)} - 1\right).$$

This is positive if p < 1/2 and (1−p)/p > 2^{2(n−k)} − 1, i.e., for p < 2^{−2(n−k)}.

B. Concatenated Hashes

Here we show that if h and h′ are good hashes, then their concatenation is as well. First we identify C with F_2^k and treat h_C as a hash h from F_2^n → F_2^k. We denote P_C by P_h. From h : F_2^n → F_2^k and h′ : F_2^{n′} → F_2^{k′}, we get a concatenated hash (h, h′) : F_2^{n+n′} → F_2^{k+k′}.

Lemma 3: Fix p ∈ (0, 1/2). Let h and h′ be hashes. Then

min{E_h(p), E_{h′}(p)} ≤ E_{(h,h′)}(p) ≤ max{E_h(p), E_{h′}(p)},

with strict inequalities if E_h(p) ≠ E_{h′}(p).

Proof: Since p is fixed, we drop it from the notation. Suppose E_h ≤ E_{h′}. Then, since the middle term below is the mediant of the outer two,

$$\frac{-\lg P_h}{n} \le \frac{-\lg P_h - \lg P_{h'}}{n+n'} \le \frac{-\lg P_{h'}}{n'}.$$

Since P_{(h,h′)} = P_h P_{h′}, this says E_h ≤ E_{(h,h′)} ≤ E_{h′}.

C. Perfect Codes

An e-sphere around a vector x is the set of all vectors y with d(x, y) ≤ e. An [n, k, 2e+1] code is perfect if the e-spheres around codewords cover V. Minimum weight decoding with perfect codes is a reasonable starting point for hashing schemes, since all vectors are closest to a unique codeword. The only perfect binary codes are the trivial repetition codes, the Hamming codes, and the binary Golay code. Repetition codes do badly, but the other perfect codes give good hash functions.

1) Binary Golay Code: The [23, 12, 7] binary Golay code G is an important perfect code. The 3-spheres around the codewords cover F_2^23. The 3-sphere around 0 in the 23-cube has distance distribution function

$$2048 + 11684\zeta + 128524\zeta^2 + 226688\zeta^3 + 1133440\zeta^4 + 672980\zeta^5 + 2018940\zeta^6.$$

From this we find that the Golay hash beats projection, i.e. P_G(p) > P_{P_{23,12}}(p), for p ∈ (0.2555, 1/2).

2) Hamming Codes: Aside from the repetition codes and the Golay code, the only perfect binary codes are the Hamming codes. The [2^m − 1, 2^m − m − 1, 3] Hamming code H_m corrects one error. The distance distribution function for a 1-sphere is

$$2^m + 2(2^m - 1)\zeta + (2^m - 1)(2^m - 2)\zeta^2, \qquad (6)$$

so the probability of collision P_{H_m}(p) is

$$\frac{(1-p)^{2^m-1}}{2^m}\left(2^m + 2(2^m-1)\frac{p}{1-p} + (2^m-1)(2^m-2)\frac{p^2}{(1-p)^2}\right). \qquad (7)$$

Table I gives the crossover error probabilities where the first few Hamming codes become better than projection.

TABLE I
CROSSOVER ERROR PROBABILITIES p FOR HAMMING CODES H_m

m   k     p
4   11    0.2826
5   26    0.1518
6   57    0.0838
7   120   0.0468

Theorem 4: For any m > 4 and p > m/(2^m − m), the Hamming code H_m beats (2^m − m − 1)-projection.

Proof: The difference between the distribution functions of the cube and the 1-sphere in dimension 2^m − 1 is

$$f_m(\zeta) := A(S, \zeta) - A(H_m, \zeta) \qquad (8)$$
$$= 2^m(1+\zeta)^m - \left(2^m + 2(2^m-1)\zeta + (2^m-1)(2^m-2)\zeta^2\right).$$

We will show that, for m ≥ 4, f_m(ζ) has exactly one root in (0, 1), denoted by α_m, and that α_m ∈ ((m−2)/2^m, m/2^m). We calculate

$$f_m(\zeta) = \left((m-2)2^m + 2\right)\zeta - \left(2^{2m} - (m^2 - m + 6)2^{m-1} + 2\right)\zeta^2 + 2^m\sum_{i=3}^{m}\binom{m}{i}\zeta^i.$$

All the coefficients of f_m(ζ) are non-negative with the exception of the coefficient of ζ^2, which is negative for m ≥ 2. Thus, by Descartes' rule of signs, f_m(ζ) has 0 or 2 positive roots. However, it has a root at ζ = 1. Call the other positive root α_m. We have f_m(0) = f_m(1) = 0, and since f′_m(0) = (m−2)2^m + 2 > 0 and f′_m(1) = 2^{2m−1}(m−4) + 2^{m+2} − 2 > 0 for m ≥ 4, we must have α_m < 1 for m ≥ 4. For p > α_m the Hamming code H_m beats projection. Using (8) and Bernoulli's inequality, it is easy to show that f_m(ζ) > 0 for ζ < c(m−2)/2^m for any c < 1 and m ≥ 4.
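The crossover probabilities above are easy to reproduce numerically. The sketch below (our own check, not from the paper) evaluates the collision probability from a distribution polynomial via (3) and bisects for the p at which it overtakes the projection probability (1 − p)^k:

```python
def P_from_poly(coeffs, n, p):
    """Collision probability (1-p)^n / |S| * A(S, p/(1-p)); coeffs[i] = A_i."""
    z = p / (1 - p)
    size = coeffs[0]                       # A_0 counts pairs (x, x), so A_0 = |S|
    return (1 - p) ** n / size * sum(a * z ** i for i, a in enumerate(coeffs))

def crossover(coeffs, n, k, lo=1e-6, hi=0.499):
    """Bisect for the p where the code hash starts to beat k-projection."""
    f = lambda p: P_from_poly(coeffs, n, p) - (1 - p) ** k
    for _ in range(60):
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Golay [23,12,7]: distance distribution of the 3-sphere around 0 (from the text).
golay = [2048, 11684, 128524, 226688, 1133440, 672980, 2018940]
# Hamming H_4 = [15,11,3]: 1-sphere distribution (6) with m = 4.
m = 4
h4 = [2 ** m, 2 * (2 ** m - 1), (2 ** m - 1) * (2 ** m - 2)]

print(round(crossover(golay, 23, 12), 4))   # close to 0.2555, as in the text
print(round(crossover(h4, 15, 11), 4))      # close to 0.2826, as in Table I
```

The bisection relies on there being a single crossover in (0, 1/2), which is what Theorems 1 and 4 establish for these codes.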
For the other direction, we may use Taylor's theorem to show

$$2^m\left(1 + \frac{m}{2^m}\right)^m < 2^m + m^2 + \frac{m^4}{2^{m+1}}\left(1 + \frac{m}{2^m}\right)^{m-2}.$$

Plugging this into (8), we have that f_m(m/2^m) < 0 for m > 6.

[Fig. 1. Crossover error probabilities for minimum length linear codes. (Plot: crossover probability p, 0 to 0.5, against dimension k, 0 to 30, with curves for codes of minimum distance d = 3, 5, 7 and points for H_4, H_5, and G.)]

D. Other Linear Codes

The above codes give hashing strategies for a few values of n and k, but we would like hashes for a wider range. For a hashing strategy using error-correcting codes, we need a code with an efficient complete decoding algorithm; that is, a way to map every vector to a codeword. Given a translation-invariant decoder, we may determine S, the set of vectors that decode to 0, in order to compare strategies as the error probability changes.

Magma [6] has a built-in database of linear codes over F_2 of length up to 256. Most of these do not come with efficient complete decoding algorithms, but Magma does provide syndrome decoding. Using this database new hashing schemes were found. For each dimension k and minimum distance d, an [n, k, d] binary linear code with minimum length n was chosen for testing.¹ (This criterion excludes any codes formed by concatenating with a projection code.) For each code there is an error probability above which the code beats projection. Figure 1 shows these crossover probabilities. Not surprisingly, the [23, 12, 7] Golay code G and Hamming codes H_4 and H_5 all do well. The facts that concatenating the Golay code with projection beats the chosen code for 13 ≤ k ≤ 17 and concatenating H_m with projection beats the chosen codes for 27 ≤ k ≤ 30 show that factors other than minimum length are important in determining an optimal hashing code.

Just as linear codes are subspaces of F_2^n, lattices are discrete subgroups of R^n.
The 24-dimensional Leech lattice is closely related to the Golay code, and also has particularly nice properties. It was used in [2] to construct a good LSH for R^n.

III. OPTIMAL SETS

In the previous section we looked at the performance of sets associated with various good error-correcting codes. However, the problem of determining optimal sets S ⊂ F_2^n is of independent interest. The general question of finding an optimal set of size 2^t in V for an error probability p is quite hard. In this section we will find the answer for t ≤ 6, and look at what happens when p is near 1/2.

¹ The Magma call BLLC(GF(2),k,d) was used to choose a code.

A. Optimal Sets of Small Size

For a vector x = (x_1, ..., x_n) ∈ V, let r_i(x) be x with the i-th coordinate complemented, and let s_ij(x) be x with the i-th and j-th coordinates switched.

Definition 5: Two sets are isomorphic if one can be gotten from the other by a series of r_i and s_ij transformations.

Lemma 6: If S and S′ are isomorphic, then P_S(p) = P_{S′}(p) for all p ∈ [0, 1].

The corresponding non-invertible transformations are:

$$\rho_i(x) := (x_1, x_2, \ldots, x_{i-1}, 0, x_{i+1}, \ldots, x_n), \qquad (9)$$
$$\sigma_{ij}(x) := \begin{cases} x, & x_{\min(i,j)} = 0, \\ s_{ij}(x), & x_{\min(i,j)} = 1. \end{cases}$$

Definition 7: A set S ⊂ V is a down-set if ρ_i(S) ⊂ S for all i ≤ n.

Definition 8: A set S ⊂ V is right-shifted if σ_ij(S) ⊂ S for all i, j ≤ n.

Theorem 9: If a set S is optimal, then it is isomorphic to a right-shifted down-set.

Proof: We will show that any optimal set is isomorphic to a right-shifted set. The proof that it must be isomorphic to a down-set as well is similar. A similar proof for distance-sum optimal sets (see Section III-B) was given by Kündgen in [18]. Recall that

$$P_S(p) = \frac{(1-p)^n}{|S|}\sum_{x,y \in S}\zeta^{d(x,y)},$$

where ζ = p/(1−p) ∈ (0, 1).
If S is not right-shifted, there is some x ∈ S with x_i = 1, x_j = 0, and i < j. Let ϕ_ij(S) replace all such x with s_ij(x). We only need to show that this will not decrease P_S(p). Consider such an x and any y ∈ S. If y_i = y_j, then d(x, y) = d(s_ij(x), y), and P_S(p) will not change. If y_i = 0 and y_j = 1, then d(s_ij(x), y) = d(x, y) − 2, and since ζ^{l−2} ≥ ζ^l, that term's contribution to P_S(p) increases. Suppose y_i = 1 and y_j = 0. If s_ij(y) ∈ S, then d(x, y) + d(x, s_ij(y)) = d(s_ij(x), y) + d(s_ij(x), s_ij(y)), and P_S(p) is unchanged. Otherwise, ϕ_ij(S) will replace y by s_ij(y), and d(x, y) = d(s_ij(x), s_ij(y)) means that P_S(p) will again be unchanged.

Let R_{s,n} denote an optimal set of size s in F_2^n. By computing all right-shifted down-sets of size 2^t, for t ≤ 6, we have the following result:

Theorem 10: The optimal sets R_{2^t,n} for t ∈ {1, ..., 6} are given in Tables IV and V.

These tables, and details of the computations, are given in the Appendix. Some of the optimal sets for t = 6 do better than the sets corresponding to the codes in Figure 1.

B. Optimal Sets for Large Error Probabilities

Theorem 1 states that for any n and k, for a sufficiently small error probability p, a 2^{n−k}-subcube is an optimal set. One may also ask what an optimal set is at the other extreme, a large error probability. In this section we use existing results about minimum average distance subsets to list additional sets that are optimal as p → 1/2−. We have

$$P_S(p) := \frac{(1-p)^n}{|S|} A\!\left(S, \frac{p}{1-p}\right) = \frac{1}{|S|}\sum_i A_i p^i (1-p)^{n-i}.$$

Letting p = 1/2 − ε and s = |S|, P_S(1/2 − ε) becomes

$$\frac{1}{s}\sum_i A_i (1/2-\varepsilon)^i (1/2+\varepsilon)^{n-i} = \frac{1}{s2^n}\left(\sum_i A_i + \varepsilon\sum_i 2(n-2i)A_i\right) + O(\varepsilon^2) = \frac{s}{2^n}(1+2n\varepsilon) - \frac{4\varepsilon}{s2^n}\sum_i iA_i + O(\varepsilon^2).$$
Therefore, an optimal set for p → 1/2− must minimize the distance-sum of S,

$$d(S) := \frac{1}{2}\sum_{x,y \in S} d(x,y) = \frac{1}{2}\sum_i iA_i. \qquad (10)$$

Denote the minimal distance-sum by

$$f(s, n) := \min\{d(S) : S \subset F_2^n, |S| = s\}.$$

If d(S) = f(s, n) for a set S of size s, we say that S is distance-sum optimal. The question of which sets are distance-sum optimal was proposed by Ahlswede and Katona in 1977; see Kündgen [18] for references and recent results. This question is also difficult. Kündgen presents distance-sum optimal sets for small s and n, which include the ones of size 16 from Table IV. Jaeger et al. [15] found the distance-sum optimal set for n large.

Theorem 11 (Jaeger et al. [15], cf. [18, pg. 151]): For n ≥ s − 1, a generalized 1-sphere (with s points) is distance-sum optimal unless s ∈ {4, 8} (in which case the subcube is optimal).

From this we have:

Corollary 12: For n ≥ 2^t − 1, with t ≥ 4 and p sufficiently close to 1/2, a (2^t − 1)-dimensional 1-sphere is hashing optimal.

IV. HASHES FROM RANDOM CODES

In this section we will show that hashes from random codes under minimum weight decoding² perform better than projection. Let R = k/n be the rate of a code. The error exponent for k-projection is

$$E_{P_{n,k}}(p) = -\frac{1}{n}\lg P_{P_{n,k}}(p) = -\frac{1}{n}\lg (1-p)^k = -R\lg(1-p). \qquad (11)$$

Theorem 4 shows that for any p > 0 there are codes with rate R ≈ 1 which beat projection. For any fixed R, we will bound the expected error exponent for a random code R of rate R, and show that it beats (11). Let H be the binary entropy function

$$H(\delta) := -\delta\lg\delta - (1-\delta)\lg(1-\delta). \qquad (12)$$

Fix δ ∈ [0, 1/2). Let d := ⌊δn⌋, let S_d(x) denote the sphere of radius d around x, and let V(d) := |S_d(x)|.

² Ties arising in minimum weight decoding are broken in some unspecified manner.
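For intuition about these quantities, the sphere size V(d) with d = ⌊δn⌋ obeys the standard bound V(d) ≤ 2^{nH(δ)} for δ ≤ 1/2, which is where exponents of the form n(H(δ) − 1 + R), as in Lemma 13 below, come from. A quick numerical check of ours:

```python
from math import comb, log2

def H(delta):
    """Binary entropy function (12)."""
    if delta in (0.0, 1.0):
        return 0.0
    return -delta * log2(delta) - (1 - delta) * log2(1 - delta)

def sphere_size(n, d):
    """V(d): number of points within Hamming distance d of a fixed x."""
    return sum(comb(n, i) for i in range(d + 1))

# Standard bound: V(floor(delta*n)) <= 2^(n*H(delta)) for delta <= 1/2.
n = 100
for delta in (0.1, 0.25, 0.4):
    d = int(delta * n)
    assert sphere_size(n, d) <= 2 ** (n * H(delta))
```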
It is elementary to show (see [11], Exercise 5.9):

Lemma 13: Let R be a random code of length n and rate R, where n is sufficiently large. For c ∈ R, the probability that a given vector x ∈ S_d(c) is closer to another codeword than to c is at most 2^{n(H(δ)−1+R)}.

Lemma 13 implies that if H(δ) < 1 − R (the Gilbert-Varshamov bound), then with high probability any given x ∈ S_d(c) will be decoded to c. For the rest of this section we will assume this bound, so that Lemma 13 applies.

Let P_R(p) be the probability that a random point x and x + e both hash to c. This is greater than the probability that x and x + e both lie at distance exactly d from c, so

$$P_R(p) > \sum_{i=0}^{d}\binom{d}{i}\binom{n-d}{i} p^{2i}(1-p)^{n-2i}.$$

Theorem 4 of [3] gives a bound for this:

Theorem 14: For any ε ≤ 1/2 and δ such that H(δ) < 1 − R and ε ≤ 2δ,

$$-E_R(p) \ge \varepsilon\lg p + (1-\varepsilon)\lg(1-p) + \delta H\!\left(\frac{\varepsilon}{2\delta}\right) + (1-\delta)H\!\left(\frac{\varepsilon}{2(1-\delta)}\right).$$

The right-hand side is maximized at ε_max satisfying

$$\frac{(2\delta-\varepsilon_{\max})(2(1-\delta)-\varepsilon_{\max})}{\varepsilon_{\max}^2} = \frac{(1-p)^2}{p^2}.$$

Define

$$D(p,\delta,\varepsilon) := \varepsilon\lg p + (1-\varepsilon)\lg(1-p) + \delta H\!\left(\frac{\varepsilon}{2\delta}\right) + (1-\delta)H\!\left(\frac{\varepsilon}{2(1-\delta)}\right) - (1-H(\delta))\lg(1-p).$$

Then E_{P_{n,k}}(p) − E_R(p) ≥ D(p, δ, ε).

Theorem 15: D(p, δ, ε_max) > 0 for any δ, p ∈ (0, 1/2).

Proof: Fix δ ∈ (0, 1/2), and let f(p) := D(p, δ, ε_max). It is easy to check that

$$\lim_{p\to 0^+} f(p) = 0, \quad \lim_{p\to 1/2^-} f(p) = 0, \quad \lim_{p\to 0^+} f'(p) > 0, \quad \lim_{p\to 1/2^-} f'(p) < 0.$$

Therefore, it suffices to show that f′(p) has only one zero in (0, 1/2). Observe that ε_max is chosen so that ∂D/∂ε(p, δ, ε_max) = 0. Hence

$$f'(p) = \frac{\partial D}{\partial p}(p, \delta, \varepsilon_{\max}) = \frac{\varepsilon_{\max}}{p\ln 2} - \frac{1-\varepsilon_{\max}}{(1-p)\ln 2} + \frac{1-H(\delta)}{(1-p)\ln 2},$$

so

$$(\ln 2)\, f'(p) = \frac{\varepsilon_{\max}}{p} - \frac{1-\varepsilon_{\max}}{1-p} + \frac{1-H(\delta)}{1-p}.$$

Therefore f′(p) = 0 when ε_max = pH(δ).
Substituting ε_max = pH(δ) into the equation for ε_max in Theorem 14, we find that this happens only at

$$p = \frac{4\delta(1-\delta) - H(\delta)^2}{2\left(H(\delta) - H(\delta)^2\right)},$$

so f′ has a single zero in (0, 1/2). Thus we have E_{P_{n,k}}(p) > E_R(p), and so:

Corollary 16: For any p ∈ (0, 1/2), R ∈ (0, 1) and n sufficiently large, the expected probability of collision for a random code of rate R is higher than for projection.

ACKNOWLEDGEMENTS

The authors would like to thank William Bradley, David desJardins and David Moulton for stimulating discussions which helped initiate this work. Also, Tom Dorsey and Amit Khetan provided the simpler proof of Theorem 15 given here. The anonymous referees made a number of good suggestions that improved the paper, particularly the exposition in the introduction.

APPENDIX

By Theorem 9, we may find all optimal sets by examining all right-shifted down-sets. Right-shifted down-sets correspond to ideals in the poset whose elements are in F_2^n and with partial order x ⪯ y if x can be obtained from y by a series of ρ_i and σ_ij operations (9). It turns out that there are not too many such ideals, and they may be computed efficiently.

Our method for producing the ideals is not new, but since the main references are unpublished, we describe them briefly here. In Section 4.12.2 of [19], Ruskey describes a procedure GenIdeal for listing the ideals in a poset P. Let ↓x denote the set of all elements ⪯ x, and ↑x the set of all elements ⪰ x.

procedure GenIdeal(Q: Poset, I: Ideal)
  local x: PosetElement
begin
  if Q = ∅ then
    PrintIt(I);
  else
    x := some element in Q;
    GenIdeal(Q − ↓x, I ∪ ↓x);
    GenIdeal(Q − ↑x, I);
end

The idea is to start with I empty and Q = P. Then for each x, an ideal either contains x, in which case it will be found by the first call to GenIdeal, or it does not, in which case the second call will find it. Finding ↑x and ↓x may be done efficiently if we precompute two |P| × |P| incidence matrices representing these sets for each element of P.
This precomputation takes time O(|P|²), and then the time per ideal is O(|P|). This is independent of the choice of x. Squire (see [19] for details) realized that, by picking x to be the middle element of Q in some linear extension, the time per ideal can be shown to be O(lg |P|).

We are only interested in down-sets that are right-shifted and also are of fairly small size. The feasibility of our computations involves both issues. In particular, within GenIdeal we may restrict to x ∈ F_2^n with Size(↓x) no more than the target size of the set we are looking for. If we were using GenIdeal with the poset whose ideals correspond to down-sets of size 64 in F_2^63, there would be 83,278,001 such x to consider. However, for our situation with right-shifted down-sets, there are only 257 such x and the problem becomes quite manageable. Furthermore, instead of stopping when Q is empty, we stop when I is at or above the desired size. Table II gives the number of right-shifted down-sets of different sizes.

TABLE II
NUMBER OF RIGHT-SHIFTED DOWN-SETS

size  number    size  number    size  number
2     1         11    13        21    199
3     1         12    18        22    260
4     2         13    23        23    334
5     2         14    31        24    433
6     3         15    40        32    3140
7     4         16    54        48    130979
8     6         17    69        64    4384627
9     7         18    91
10    10        19    118
                20    155

TABLE III
OPTIMAL RIGHT-SHIFTED DOWN-SETS R_{64,n} BEATING KNOWN CODES.
(THERE ARE NO SUCH DOWN-SETS R_{2^t,n} FOR t ≤ 5.)

k   n   cross  R_{64,n}
6   12  0.487  ⟨2^11, 2^10 + 2^5, 3·2^8⟩
7   13  0.470  ⟨2^12, 2^10 + 2^4, 3·2^8⟩
8   14  0.439  ⟨2^13 + 2^2, 2^13 + 3, 2^3 + 2^2 + 1⟩
9   15  0.391  ⟨2^14 + 3, 2^10 + 2^2⟩
16  22  0.244  ⟨2^21 + 2⟩
17  23  0.242  ⟨2^22 + 1, 2^19 + 2⟩
18  24  0.238  ⟨2^23 + 1, 2^17 + 2⟩
19  25  0.231  ⟨2^24 + 1, 2^15 + 2⟩
20  26  0.222  ⟨2^25 + 1, 2^13 + 2⟩
21  27  0.212  ⟨2^26 + 1, 2^11 + 2⟩
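Ruskey's GenIdeal procedure is straightforward to implement. Here is a small Python rendering of ours that enumerates the ideals of a toy poset, the product of two 2-chains, which has exactly 6 ideals (it omits the incidence-matrix precomputation and size restriction used in the actual computation):

```python
def gen_ideals(elements, leq):
    """List all ideals (down-sets) of a poset, following Ruskey's GenIdeal.

    `elements` is a list of poset elements; `leq(a, b)` is the partial order.
    """
    down = {x: frozenset(y for y in elements if leq(y, x)) for x in elements}
    up = {x: frozenset(y for y in elements if leq(x, y)) for x in elements}
    ideals = []

    def gen(Q, I):
        if not Q:
            ideals.append(I)
            return
        x = next(iter(Q))
        gen(Q - down[x], I | down[x])   # ideals containing x (hence all of down-x)
        gen(Q - up[x], I)               # ideals avoiding x (hence all of up-x)

    gen(frozenset(elements), frozenset())
    return ideals

# Toy poset: {0,1} x {0,1} ordered componentwise.
elems = [(0, 0), (0, 1), (1, 0), (1, 1)]
leq = lambda a, b: a[0] <= b[0] and a[1] <= b[1]
ideals = gen_ideals(elems, leq)
print(len(ideals))   # 6 ideals, from the empty set to the whole poset
```

Each recursive call commits to either including or excluding x, so every ideal is produced exactly once, as in the argument above.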
The computation for size 32 sets took just over a second on one processor of an HP Superdome. Size 64 sets took 23 minutes.

Let R_{s,n} refer to an optimal set of size s in F_2^n. Tables IV and V list R_{2^t,n} for all t ≤ 6 and all n < 2^t. Several features of Tables IV and V require explanation. First we identify the binary expansion x = Σ_i