A law of large numbers for weighted plurality

Consider an election between k candidates in which each voter votes randomly (but not necessarily independently) and suppose that there is a single candidate that every voter prefers (in the sense that each voter is more likely to vote for this speci…

Authors: Joe Neeman

A law of large numbers for weighted plurality

Joe Neeman∗

November 14, 2018

Abstract

Consider an election between $k$ candidates in which each voter votes randomly (but not necessarily independently) and suppose that there is a single candidate that every voter prefers (in the sense that each voter is more likely to vote for this special candidate than any other candidate). Suppose we have a voting rule that takes all of the votes and produces a single outcome and suppose that each individual voter has little effect on the outcome of the voting rule. If the voting rule is a weighted plurality, then we show that with high probability, the preferred candidate will win the election. Conversely, we show that this statement fails for all other reasonable voting rules. This result is an extension of one by Häggström, Kalai and Mossel, who proved the above in the case $k = 2$.

1 Introduction

For elections between two candidates, it is well known that voting rules in which every voter has a small effect are good rules in the sense that they “aggregate information well”: if every voter has a small bias towards the same candidate then that candidate will win with overwhelming probability. When voters vote independently, this fact was noted by Margulis [4] and Russo [5], whose results were later strengthened by Kahn, Kalai and Linial [3] and by Talagrand [6].

When the voters are not independent, the situation is more complicated. It is no longer true, then, that every reasonable voting rule aggregates well. In fact, [2] show that if we want the aggregation to hold for every distribution of the voters, then weighted majority functions are the only option. We extend their result to the non-binary case.

The author would like to thank Elchanan Mossel for suggesting this problem and providing fruitful discussions.

∗ Department of Statistics, U.C. Berkeley.
joeneeman@gmail.com

2 Definitions and results

In the introduction, we made a few allusions to “reasonable” voting rules. Let us now say precisely what that means: we will require that our voting rules do not have a built-in preference for any alternative. This is a common assumption, and its definition is standard (see, e.g., [1]). In what follows, the notation $[k]$ stands for the set $\{0, \dots, k-1\}$.

Definition 2.1. A function $f \colon [k]^n \to [k]$ is neutral if $f(\sigma(x)) = \sigma(f(x))$ for all $x \in [k]^n$ and all permutations $\sigma$ on $[k]$, where $\sigma(x)_i = \sigma(x_i)$.

Note that in the case $k = 2$, a function is neutral if, and only if, it is anti-symmetric according to the definition in [2].

Example 2.2. When $k = 2$ and $n$ is odd, then the simple majority function (for which $f(x) = 1$ if $\#\{i : x_i = 1\} > \#\{i : x_i = 0\}$) is neutral. On the other hand, if $n$ is even then in order to fully specify the simple majority function, we need to say what happens in the case of a tie; the choice of tie-breaking rule will determine whether the resulting function is neutral. For example, if we define $f(x) = x_1$ for every tied configuration $x$, then $f$ is neutral. On the other hand, if $f(x) = 1$ for every tied configuration $x$, then $f$ is not neutral.

The example can be extended to $k \ge 3$. In this case, consider the tie-breaking rule $f(x) = x_i$ where $i$ is the smallest possible number for which $x_i$ is equal to one of the tied alternatives. This tie-breaking rule is neutral, and it is more natural than setting $f(x) = x_1$ because it guarantees that the output of $f$ is one of the tied alternatives.

2.1 Weighted plurality functions

Let us say precisely what we mean by a weighted plurality function. The definition that we take here generalizes the definition from [2] of a weighted majority function.

Definition 2.3.
A function $f \colon [k]^n \to [k]$ is a weighted plurality function if there exist weights $w_1, \dots, w_n \in \mathbb{R}_{\ge 0}$ such that $\sum_i w_i = 1$ and for all $a, b \in [k]$, $f(x) = a$ implies that
\[
  \sum_{i : x_i = a} w_i \;\ge\; \sum_{i : x_i = b} w_i .
\]

Note that the above definition does not prescribe a particular behavior if a tie occurs between two alternatives. If the weights are chosen so that ties never occur, then the weighted plurality function is clearly neutral. Moreover, for any set of weights we can construct a neutral weighted plurality function with those weights by following the tie-breaking rule outlined in Example 2.2.

2.2 The influence of a voter

The final notion that we need before stating our result is a way to quantify the power of a single voter. When $k = 2$, the notion of effect is well-established and can be found, for example, in [2]. However, there does not seem to be a well-established way of quantifying the effect of voters for non-binary social choice functions. Here, we propose a definition that closely resembles the one used in [2] for binary functions.

Definition 2.4. Let $f$ be a function $[k]^n \to [k]$ and fix a probability distribution $P$ on $[k]^n$. The effect of voter $i$ is
\[
  e_i(f, P) = \sum_{j=1}^{k} \Bigl( P(f(X) = j \mid X_i = j) - P(f(X) = j \mid X_i \ne j) \Bigr),
\]
where $X$ is a random variable distributed according to $P$.

Note that for the case $k = 2$, the preceding definition reduces to
\[
  e_i(f, P) = 2 \bigl( P(f(X) = 1 \mid X_i = 1) - P(f(X) = 1 \mid X_i = 0) \bigr),
\]
which is just twice the definition in [2] of a voter’s effect. Also, the effect is closely related to the correlation between the voters and the outcome:
\[
  P(f(X) = j \mid X_i = j) - P(f(X) = j \mid X_i \ne j)
  = \frac{\operatorname{Cov}\bigl(\mathbf{1}_{\{f = j\}}, \mathbf{1}_{\{X_i = j\}}\bigr)}{P(X_i = j)\, P(X_i \ne j)}
  \ge 4 \operatorname{Cov}\bigl(\mathbf{1}_{\{f = j\}}, \mathbf{1}_{\{X_i = j\}}\bigr)
\]
and so
\[
  e_i(f, P) \ge 4 \sum_j \operatorname{Cov}\bigl(\mathbf{1}_{\{f = j\}}, \mathbf{1}_{\{X_i = j\}}\bigr).
\]
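To make Definition 2.4 concrete, the following minimal Python sketch computes $e_i(f, P)$ by brute-force enumeration for small $n$ and $k$. It is an illustration under stated assumptions rather than anything taken from the paper: the names `weighted_plurality`, `effect`, `uniform`, `majority` and `dictator` are hypothetical, alternatives are labelled $0, \dots, k-1$ as in the definition of $[k]$, the sum runs over all $k$ alternatives, and ties are broken by the rule of Example 2.2.

```python
from fractions import Fraction
from itertools import product


def weighted_plurality(x, weights):
    """Weighted plurality winner for the profile x (a tuple of votes).

    Ties are broken as in Example 2.2: the first voter whose vote is
    among the tied leaders decides, which keeps the rule neutral.
    """
    totals = {}
    for vote, w in zip(x, weights):
        totals[vote] = totals.get(vote, 0) + w
    best = max(totals.values())
    for vote in x:
        if totals[vote] == best:
            return vote


def effect(f, dist, i, k):
    """Effect e_i(f, P) of voter i (0-based), following Definition 2.4.

    dist maps each profile (a tuple in {0..k-1}^n) to its probability.
    """
    total = 0
    for j in range(k):
        p_j = sum(p for x, p in dist.items() if x[i] == j)
        if p_j == 0 or p_j == 1:
            # One of the two conditional probabilities is undefined.
            raise ValueError("P(X_i = j) must lie strictly between 0 and 1")
        win_given_j = sum(p for x, p in dist.items()
                          if x[i] == j and f(x) == j) / p_j
        win_given_not_j = sum(p for x, p in dist.items()
                              if x[i] != j and f(x) == j) / (1 - p_j)
        total += win_given_j - win_given_not_j
    return total


# Three independent, uniformly distributed voters choosing between k = 2 alternatives.
uniform = {x: Fraction(1, 8) for x in product(range(2), repeat=3)}


def majority(x):
    return weighted_plurality(x, [Fraction(1, 3)] * 3)


def dictator(x):
    return weighted_plurality(x, [1, 0, 0])


print(effect(majority, uniform, 0, 2))
print(effect(dictator, uniform, 0, 2), effect(dictator, uniform, 1, 2))
```

In this small case simple majority gives each voter effect $1$, which agrees with the $k = 2$ formula $2(3/4 - 1/4)$, while the dictator rule gives the first voter effect $2$ and the other voters effect $0$. Exact rational arithmetic (`Fraction`) is used so that tie detection and the reported values are not affected by rounding.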
Example 2.5. The simplest example of $e_i(f, P)$ is when $P$ is a product measure (i.e., the $X_i$ are independent) and the function $f$ does not depend on its $i$th coordinate; in that case, $P(f(X) = j \mid X_i = j) = P(f(X) = j \mid X_i \ne j)$ for all $j$ and so $e_i(f, P) = 0$. On the other hand, if $P$ is a distribution such that $X_1 = X_2 = \cdots = X_n$ with probability 1, and if $f$ is a plurality function, then $P(f(X) = j \mid X_i = j) = 1$ for all $j$, while $P(f(X) = j \mid X_i \ne j) = 0$; hence each of the $k$ summands in Definition 2.4 equals 1, and $e_i(f, P) = k$ for all $i$.

For a less trivial example, suppose that the $X_i$ are independent and uniformly distributed on $[k]$. Let $f$ be an unweighted plurality function. Then the Central Limit Theorem implies that $e_i(f, P) = O(1/\sqrt{n})$ as $n \to \infty$. On the other hand, suppose that $f$ is still an unweighted plurality function and the $X_i$ are independent, but now $P(X_i = 1) > P(X_i = j) + \delta$ for some $\delta > 0$ and all $j \ne 1$. Then Hoeffding’s inequality implies that
\[
  P(f(X) = 1 \mid X_i) \ge 1 - 2 \exp(-\delta^2 n / 4)
\]
for sufficiently large $n$, regardless of the value of $X_i$. In particular, this implies that $e_i(f, P) = O(\exp(-\delta^2 n / 4))$. Compared to the case where the $X_i$ are uniformly distributed, this demonstrates that $e_i(f, P)$ can depend strongly on $P$, even when $P$ is restricted to being a product measure.

2.3 The main result

Our main theorem is the following:

Theorem 2.6. (a) For every $\delta > 0$ and $\epsilon > 0$, there is a $\tau > 0$ such that for every weighted plurality function $f$ with weights $w_i$ and every probability distribution $P$ on $[k]^n$, if $e_i(f, P) \le \tau$ for all $i \in [n]$ and there is a set $A \subset [k]$ such that
\[
  \sum_i w_i P(X_i = a) \ge \sum_i w_i P(X_i = b) + \delta
\]
for all $a \in A$ and all $b \notin A$, then $P(f(X) \in A) \ge 1 - \epsilon$.
(b) If $f$ is not a weighted plurality function then there exists a probability distribution $P$ on $[k]^n$ such that $P(X_i = 2) > P(X_i = 1)$ for all $i \in [n]$ but $P(f(X) = 1) = 1$ (and hence $e_i(f, P) = 0$ for all $i$).

We remark that the theorem is constructive in the sense that we can give an algorithm (based on solving a linear program) which either constructs some weights $w_i$ witnessing the fact that $f$ is a weighted plurality, or a probability distribution $P$ satisfying part (b).

Parts (a) and (b) of Theorem 2.6 are converse to one another in the following sense: under the hypothesis of small effects, part (a) says that if there is a gap between the popularity of the most popular alternatives $A$ and the less popular alternatives $A^c$ then a weighted plurality function will choose an alternative in $A$. Part (b) shows that this property fails for every function that is not a weighted plurality. Note that part (a) has an important special case, which is closer to the statement of [2]: if $P(X_i = a) \ge P(X_i = b) + \delta$ for all $i \in [n]$ and all $b \ne a$, then $f(X) = a$ with high probability if the effects are small enough.

The remainder of the paper is devoted to the proof of Theorem 2.6.

Proof of Theorem 2.6 (a). This part of the proof follows very closely the argument in [2]. Suppose that $f$ is a weighted plurality function with weights $w_i$. The first step is to show that $f$ is “correlated” in some sense with each voter: define $p_{ij} = P(X_i = j)$ and let $W_j$ be the (random) weight assigned to alternative $j$: $W_j = \sum_{i : X_i = j} w_i$. Then
\[
\begin{aligned}
  E \sum_{i=1}^{n} w_i \sum_{j=1}^{k} \mathbf{1}_{\{f(X) = j\}} \bigl( \mathbf{1}_{\{X_i = j\}} - p_{ij} \bigr)
  &= E \Bigl[ \sum_{i,j} w_i \mathbf{1}_{\{f(X) = j\}} \mathbf{1}_{\{X_i = j\}} - \sum_{i,j} \mathbf{1}_{\{f(X) = j\}} w_i p_{ij} \Bigr] \\
  &= E \sum_{i,j} w_i \mathbf{1}_{\{f(X) = j\}} \mathbf{1}_{\{X_i = j\}} - \sum_j P(f = j)\, E W_j .
\end{aligned}
\tag{1}
\]
Now, let $\alpha_j = P(f = j)$ and set $\tilde{\alpha}_j = \alpha_j / \bigl( \sum_{i \in A} \alpha_i \bigr)$ for $j \in A$ and $\tilde{\alpha}_j = 0$ otherwise.
The first term of (1) is just
\[
  E \sum_{i,j} w_i \mathbf{1}_{\{f(X) = j\}} \mathbf{1}_{\{X_i = j\}}
  = E \sum_j \mathbf{1}_{\{f(X) = j\}} W_j
  \ge E \sum_j \mathbf{1}_{\{f(X) = j\}} \sum_i \tilde{\alpha}_i W_i
  = \sum_i \tilde{\alpha}_i\, E W_i
  \tag{2}
\]
since the winning alternative always has at least as much weight as any convex combination of alternatives. Since $\min_{j \in A} E W_j \ge \max_{j \notin A} E W_j + \delta$, we can plug (2) into (1) to obtain
\[
  (1) \ge \sum_j \tilde{\alpha}_j\, E W_j - \sum_j \alpha_j\, E W_j
  \ge \sum_{j \in A} (\tilde{\alpha}_j - \alpha_j)\, \delta
  = \delta\, P(f \notin A).
\]
Recalling that $e_i(f, P) \ge 4 \sum_j \operatorname{Cov}\bigl(\mathbf{1}_{\{f = j\}}, \mathbf{1}_{\{X_i = j\}}\bigr)$, we have
\[
  \delta\, P(f \notin A)
  \le E \sum_{i=1}^{n} w_i \sum_{j=1}^{k} \mathbf{1}_{\{f(X) = j\}} \bigl( \mathbf{1}_{\{X_i = j\}} - p_{ij} \bigr)
  \le \frac{1}{4} \sum_i w_i\, e_i(f, P)
  \le \frac{\tau}{4}
\]
and so one direction of the theorem is proved once we take $\tau$ small enough that $\epsilon \ge \tau / (4\delta)$.

The proof of the second part of the theorem follows the idea of [2], in that we use linear programming duality to find a witness for $f$ being a weighted plurality function. However, the details of the proof are quite different, since [2] uses a well-known linear program (the fractional vertex cover of a hypergraph) which does not extend beyond $k = 2$.

The proof idea is this: we will write down a linear program and its dual. If the primal program has a large enough value it will turn out that $f$ is a weighted plurality function. Otherwise, the dual has a small value and the dual variables witness the claim of Theorem 2.6 (b). In particular, note that this proof provides the algorithm that we mentioned after the statement of Theorem 2.6.

First we make a trivial observation that will simplify our linear program considerably: if a function is neutral, it is easier to check whether it is a weighted plurality because it is not necessary to try all possible combinations of $a, b \in [k]$:

Proposition 2.7. Suppose $f \colon [k]^n \to [k]$ is neutral. Then $f$ is a weighted plurality if and only if there exist weights $w_1, \dots, w_n \in \mathbb{R}_{\ge 0}$ with $\sum_i w_i = 1$ (as in Definition 2.3) such that $f(x) = 1$ implies that
\[
  \sum_{i : x_i = 1} w_i \ge \sum_{i : x_i = 2} w_i .
\]

We can write a linear program for checking whether a given neutral function $f$ is a weighted plurality. The variables for this program are $t$; $w_i$ for each $i \in [n]$ and $g_x$ for each $x \in [k]^n$ for which $f(x) = 1$. In standard form (with $t$ written as $t^+ - t^-$), the primal program is the following:
\[
\begin{aligned}
  \text{maximize} \quad & t^+ - t^- \\
  \text{subject to} \quad & g_x \ge 0 && \text{for all } x \in [k]^n \text{ such that } f(x) = 1, \\
  & w_i \ge 0 && \text{for all } i \in [n], \\
  & t^+ \ge 0 \text{ and } t^- \ge 0, \\
  & \sum_i w_i = 1, \\
  & \sum_{i : x_i = 1} w_i - \sum_{i : x_i = 2} w_i - g_x - (t^+ - t^-) = 0 && \text{for all } x \in [k]^n \text{ with } f(x) = 1.
\end{aligned}
\]

Proposition 2.8. Let $t^*$ be the value of the above linear program. If $t^* \ge 0$ then $f$ is a weighted plurality function.

Proof. Let $w_i$, $g_x$, $t^+$ and $t^-$ be feasible points such that $t^+ - t^- \ge 0$. Then, for all $x$ with $f(x) = 1$,
\[
  \sum_{i : x_i = 1} w_i - \sum_{i : x_i = 2} w_i = g_x + (t^+ - t^-) \ge 0
\]
and so $f$ satisfies the conditions of Proposition 2.7.

Now consider the dual program; since the primal is in standard form, the dual is easy to write down. Let the dual variables be $a$ (written as $a^+ - a^-$) and $q_x$ for all $x$ such that $f(x) = 1$. Then the dual program is:
\[
\begin{aligned}
  \text{minimize} \quad & a^+ - a^- \\
  \text{subject to} \quad & \sum_{x : f(x) = 1} q_x \le -1, \\
  & \sum_{x : f(x) = 1} \bigl( \mathbf{1}_{\{x_i = 1\}} - \mathbf{1}_{\{x_i = 2\}} \bigr) q_x + (a^+ - a^-) \ge 0 && \text{for all } i \in [n], \\
  & q_x \le 0 && \text{for all } x \text{ such that } f(x) = 1, \\
  & a^+ \le 0 \text{ and } a^- \le 0.
\end{aligned}
\]

Proposition 2.9. Let $a^*$ be the value of the above dual program. If $a^* < 0$ then there exists a probability distribution on $[k]^n$ such that $P(X_i = 2) > P(X_i = 1)$ for all $i$ but $f(X) = 1$ almost surely.

Proof. Choose a feasible point with $a^+ - a^- < 0$ and define $p_x = q_x / \bigl( \sum_{y} q_y \bigr)$; the numerator is non-positive and the denominator is at most $-1$. Then $p_x \ge 0$ and $\sum_x p_x = 1$, so we can define a probability distribution by $P(X = x) = p_x$ when $f(x) = 1$ and $P(X = x) = 0$ otherwise. Under this distribution, $f(X) = 1$ with probability 1.
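The normalization step in this proof can be checked on a toy instance. The sketch below is only an illustration under stated assumptions: the rule `elect_unvoted`, the hand-picked dual point `q`, `a`, and the helper names `dual_feasible`, `normalize` and `marginal` are all hypothetical and do not come from the paper. The dual variable is treated as a single number $a = a^+ - a^-$, and alternatives are labelled 0, 1, 2 so that the alternatives 1 and 2 of Proposition 2.9 both exist.

```python
from fractions import Fraction
from itertools import product

K, N = 3, 2  # three alternatives {0, 1, 2}, two voters


def elect_unvoted(x):
    """Hypothetical neutral rule that is not a weighted plurality.

    If the two voters agree, their choice wins; otherwise the one
    alternative nobody voted for wins. The winner in the second case
    has total weight 0, so no weights summing to 1 can satisfy
    Definition 2.3.
    """
    if x[0] == x[1]:
        return x[0]
    return (set(range(K)) - set(x)).pop()


# Profiles on which alternative 1 wins: these index the dual variables q_x.
support = [x for x in product(range(K), repeat=N) if elect_unvoted(x) == 1]

# Hand-picked candidate dual point (an assumption, not solver output).
q = {(0, 2): Fraction(-1, 2), (1, 1): Fraction(0), (2, 0): Fraction(-1, 2)}
a = Fraction(-1, 2)


def dual_feasible(q, a):
    """Check the dual constraints with a standing for a^+ - a^-."""
    if sum(q.values()) > -1 or any(v > 0 for v in q.values()):
        return False
    for i in range(N):
        s = sum(((x[i] == 1) - (x[i] == 2)) * v for x, v in q.items())
        if s + a < 0:
            return False
    return True


def normalize(q):
    """p_x = q_x / sum_y q_y, a probability vector because every q_x <= 0."""
    total = sum(q.values())
    return {x: v / total for x, v in q.items()}


def marginal(p, i, j):
    """P(X_i = j) under the distribution p."""
    return sum(v for x, v in p.items() if x[i] == j)


p = normalize(q)
print(dual_feasible(q, a), p)
print([(marginal(p, i, 1), marginal(p, i, 2)) for i in range(N)])
```

For this choice the resulting distribution puts mass 1/2 on each of the profiles (0, 2) and (2, 0), so alternative 1 wins with probability 1 even though every voter is strictly more likely to vote for alternative 2 than for alternative 1, which is the behavior the proposition asserts.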
On the other hand, with $a^+ - a^- < 0$ the constraints of the dual program imply that
\[
  \sum_{x : f(x) = 1} \mathbf{1}_{\{x_i = 1\}}\, q_x > \sum_{x : f(x) = 1} \mathbf{1}_{\{x_i = 2\}}\, q_x
\]
for all $i$. Thus,
\[
  P(X_i = 1) = \sum_{x : f(x) = 1} \mathbf{1}_{\{x_i = 1\}}\, p_x < \sum_{x : f(x) = 1} \mathbf{1}_{\{x_i = 2\}}\, p_x = P(X_i = 2)
\]
for all $i$.

To conclude the proof of Theorem 2.6, note that both the primal and dual programs are feasible and bounded and so $a^* = t^*$. Hence if $f$ is not a weighted plurality function then $t^* < 0$ by Proposition 2.8, so $a^* < 0$ and Proposition 2.9 supplies the distribution required in part (b).

References

[1] S.J. Brams and P.C. Fishburn. Voting procedures. Handbook of social choice and welfare, 1:173–236, 2002.
[2] O. Häggström, G. Kalai, and E. Mossel. A law of large numbers for weighted majority. Advances in Applied Mathematics, 37(1):112–123, 2006.
[3] J. Kahn, G. Kalai, and N. Linial. The influence of variables on Boolean functions. In Proceedings of the 29th Annual Symposium on Foundations of Computer Science, pages 68–80. IEEE Computer Society, 1988.
[4] G. Margulis. Probabilistic characteristic of graphs with large connectivity. Problems Info. Transmission, 10:174–179, 1977.
[5] L. Russo. An approximate zero-one law. Probability Theory and Related Fields, 61(1):129–139, 1982.
[6] M. Talagrand. On Russo’s approximate zero-one law. The Annals of Probability, 22(3):1576–1587, 1994.
