When Is Amplification Necessary for Composition in Randomized Query Complexity?



Shalev Ben-David*, Mika Göös†, Robin Kothari‡, Thomas Watson§

June 22, 2020

Abstract

Suppose we have randomized decision trees for an outer function $f$ and an inner function $g$. The natural approach for obtaining a randomized decision tree for the composed function $(f \circ g^n)(x^1,\ldots,x^n) = f(g(x^1),\ldots,g(x^n))$ involves amplifying the success probability of the decision tree for $g$, so that a union bound can be used to bound the error probability over all the coordinates. The amplification introduces a logarithmic factor cost overhead. We study the question: when is this log factor necessary? We show that when the outer function is parity or majority, the log factor can be necessary, even for models that are more powerful than plain randomized decision trees. Our results are related to, but qualitatively strengthen in various ways, known results about decision trees with noisy inputs.

1 Introduction

A deterministic decision tree for computing a partial function $f\colon \{0,1\}^n \to Z$ is a binary tree where each internal node is labeled with an index from $[n]$ and each leaf is labeled with an output value from $Z$. On input $x \in \{0,1\}^n$, the computation follows a root-to-leaf path where at a node labeled with index $i$, the value of $x_i$ is queried and the path goes to the left child if $x_i = 0$ and to the right child if $x_i = 1$. The leaf reached on input $x$ must be labeled with the value $f(x)$ (if the latter is defined). The cost of the decision tree is its depth, i.e., the maximum number of queries it makes over all inputs. The deterministic query complexity of $f$ is the minimum cost of any deterministic decision tree that computes $f$.
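The definitions above can be made concrete with a small sketch (ours, not the paper's): a deterministic decision tree encoded as nested tuples, with `evaluate` following the root-to-leaf path and `cost` computing the depth. The tuple encoding and function names are assumptions chosen for illustration.

```python
# A leaf is ("leaf", value); an internal node is ("query", i, left, right),
# querying input bit x[i] and branching on its value.

def evaluate(tree, x, queries=0):
    """Follow the root-to-leaf path on input x; return (output, #queries made)."""
    if tree[0] == "leaf":
        return tree[1], queries
    _, i, left, right = tree
    child = left if x[i] == 0 else right
    return evaluate(child, x, queries + 1)

def cost(tree):
    """Cost = depth = maximum number of queries over all inputs."""
    if tree[0] == "leaf":
        return 0
    return 1 + max(cost(tree[2]), cost(tree[3]))

# A depth-2 tree computing Xor on two bits (optimal, since parity must
# query every bit).
xor_tree = ("query", 0,
            ("query", 1, ("leaf", 0), ("leaf", 1)),   # x[0] = 0 branch
            ("query", 1, ("leaf", 1), ("leaf", 0)))   # x[0] = 1 branch

assert cost(xor_tree) == 2
assert all(evaluate(xor_tree, (a, b))[0] == a ^ b for a in (0, 1) for b in (0, 1))
```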
We will consider several more general models of decision trees (randomized, etc.), so we repurpose traditional complexity class notation to refer to the various associated query complexity measures. Since $\mathrm{P}$ is the traditional complexity class corresponding to deterministic computation, we let $\mathrm{P}(f)$ denote the deterministic query complexity of $f$. (Some of the recent literature uses the notation $\mathrm{P}^{\mathrm{dt}}(f)$, but this paper deals exclusively with decision trees, so we drop the dt superscript.)

A randomized decision tree is a probability distribution over deterministic decision trees. Computing $f$ with error $\varepsilon$ means that for every input $x$ (for which $f(x)$ is defined), the probability that the output is not $f(x)$ is at most $\varepsilon$. The cost of a randomized decision tree is the maximum depth of all the deterministic trees in its support. The randomized query complexity $\mathrm{BPP}_\varepsilon(f)$ is the minimum cost of any randomized decision tree that computes $f$ with error $\varepsilon$. When we write $\mathrm{BPP}(f)$ with no $\varepsilon$ specified, we mean $\varepsilon = 1/3$. A basic fact about randomized computation is that the success probability can be amplified, with a multiplicative overhead in cost, by running several independent trials and taking the majority vote of the outputs: $\mathrm{BPP}_\varepsilon(f) \le O(\mathrm{BPP}(f) \cdot \log(1/\varepsilon))$. See [BdW02] for a survey of classic results on query complexity.

If $f\colon \{0,1\}^n \to Z$ and $g\colon \{0,1\}^m \to \{0,1\}$ are two partial functions, their composition is $f \circ g^n\colon (\{0,1\}^m)^n \to Z$ where $(f \circ g^n)(x^1,\ldots,x^n) := f(g(x^1),\ldots,g(x^n))$ (which is defined iff $g(x^i)$ is defined for all $i$ and $f(g(x^1),\ldots,g(x^n))$ is defined).

Footnotes:
* University of Waterloo. shalev.b@uwaterloo.ca
† Stanford University. goos@stanford.edu
‡ Microsoft Quantum and Microsoft Research. robin.kothari@microsoft.com
§ University of Memphis. Thomas.Watson@memphis.edu
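The amplification bound $\mathrm{BPP}_\varepsilon(f) \le O(\mathrm{BPP}(f) \cdot \log(1/\varepsilon))$ rests on the fact that the error of a $k$-wise majority vote decays exponentially in $k$. This can be checked with a little exact arithmetic (an illustration of ours; `majority_error` is a helper name we introduce, not notation from the paper):

```python
from math import comb

def majority_error(p, k):
    """Pr[the majority vote of k independent trials is wrong],
    where each trial errs independently with probability p; k odd."""
    return sum(comb(k, j) * p**j * (1 - p)**(k - j)
               for j in range((k + 1) // 2, k + 1))

# With per-trial error p = 1/3, the majority's error drops exponentially
# in the number of trials, so error eps costs only O(log(1/eps)) repetitions.
assert abs(majority_error(1/3, 1) - 1/3) < 1e-12
assert majority_error(1/3, 31) < majority_error(1/3, 15) < majority_error(1/3, 3)
assert majority_error(1/3, 15) < 0.1
```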
How does the randomized query complexity of $f \circ g^n$ depend on the randomized query complexities of $f$ and $g$? A simple observation is that to design a randomized decision tree for $f \circ g^n$, we can take a $1/6$-error randomized decision tree for $f$ and replace each query (say, to the $i$th input bit of $f$) with a $1/6n$-error randomized decision tree for evaluating $g(x^i)$. By a union bound, with probability at least $5/6$ all of the (at most $n$) evaluations of $g$ return the correct answer, and so with probability at least $2/3$ the final evaluation of $f$ is also correct. Since $\mathrm{BPP}_{1/6n}(g) \le O(\mathrm{BPP}_{1/n}(g))$, we can write this upper bound as

$$\mathrm{BPP}(f \circ g^n) \le O(\mathrm{BPP}(f) \cdot \mathrm{BPP}_{1/n}(g)) \le O(\mathrm{BPP}(f) \cdot \mathrm{BPP}(g) \cdot \log n). \qquad (1)$$

When is this tight? It will take some effort to suitably formulate this question. We begin by reviewing known related results.

1.1 When is amplification necessary?

As for general lower bounds (that hold for all $f$ and $g$), much work has gone into proving lower bounds on $\mathrm{BPP}(f \circ g^n)$ in terms of complexity measures of $f$ and $g$ that are defined using models more powerful than plain randomized query complexity [GJ16, AGJ+17, BK18, BDG+20, BB20]. In terms of just $\mathrm{BPP}(f)$ and $\mathrm{BPP}(g)$, the state of the art is that $\mathrm{BPP}(f \circ g^n) \ge \Omega(\mathrm{BPP}(f) \cdot \sqrt{\mathrm{BPP}(g)})$ for all $f$ and $g$ [GLSS19]. Furthermore, it is known that the latter bound is sometimes tight: there exist partial boolean functions $f$ and $g$ such that $\mathrm{BPP}(f \circ g^n) \le \widetilde{O}(\mathrm{BPP}(f) \cdot \sqrt{\mathrm{BPP}(g)})$ and $\mathrm{BPP}(f), \mathrm{BPP}(g) \ge \omega(1)$ [GLSS19, BB20]. Thus (1) is far from being always tight, even without worrying about the need for amplification. However, it remains plausible that $\mathrm{BPP}(f \circ g^n) \ge \Omega(\mathrm{BPP}(f) \cdot \mathrm{BPP}(g))$ holds for all total $f$ and all partial $g$. We take this as a working conjecture in this paper.
This conjecture has been confirmed for some specific outer functions $f$, such as the identity function $\mathrm{Id}\colon \{0,1\}^n \to \{0,1\}^n$ [JKS10] (this is called a "direct sum" result) and the boolean functions $\mathrm{Or}$, $\mathrm{Xor}$ (parity), and $\mathrm{Maj}$ (majority) [GJPW18]. These results, however, do not address the need for amplification in the upper bound (1). To formulate our question of whether (1) is tight, a first draft could be:

Question A, with respect to a particular $f$: Is (1) tight for all partial functions $g$?

This is not quite a fair question, for at least two reasons:

- Regarding the first inequality in (1): The simple upper bound actually shows $\mathrm{BPP}(f \circ g^n) \le O(\mathrm{BPP}(f) \cdot \mathrm{BPP}_{1/\mathrm{BPP}(f)}(g))$ (the union bound is only over queries that take place, not over all possible queries). So for simplicity, let us restrict our attention to $f$ satisfying $\mathrm{BPP}(f) \ge \Omega(n)$, which is the case for $\mathrm{Id}$, $\mathrm{Or}$, $\mathrm{Xor}$, and $\mathrm{Maj}$.

- Regarding the second inequality in (1): Some functions $g$ satisfy $\mathrm{BPP}_{1/n}(g) \le o(\mathrm{BPP}(g) \cdot \log n)$ (e.g., if $\mathrm{P}(g) \le O(\mathrm{BPP}(g))$). So for simplicity, let us restrict our attention to $g$ satisfying $\mathrm{BPP}_{1/n}(g) \ge \Omega(\mathrm{BPP}(g) \cdot \log n)$, which (as we show later) is the case for two partial functions $\mathrm{GapOr}$ and $\mathrm{GapMaj}$ defined as follows ($|x|$ denotes the Hamming weight of $x \in \{0,1\}^m$):

$$\mathrm{GapOr}(x) := \begin{cases} 0 & \text{if } |x| = 0 \\ 1 & \text{if } |x| = m/2 \end{cases} \qquad \text{and} \qquad \mathrm{GapMaj}(x) := \begin{cases} 0 & \text{if } |x| = m/3 \\ 1 & \text{if } |x| = 2m/3. \end{cases}$$

Thus, a better formulation of Question A would be: Assuming $\mathrm{BPP}(f) \ge \Omega(n)$, is (1) tight for all partial $g$ satisfying $\mathrm{BPP}_{1/n}(g) \ge \Omega(\mathrm{BPP}(g) \cdot \log n)$?

Even with these caveats, the answer is always "no." It will be instructive to examine a counterexample. Let $\mathrm{Which}\colon \{0,1\}^2 \to \{0,1\}$ be the partial function such that $\mathrm{Which}(y)$ indicates the location of the unique $1$ in $y$, under the promise that $|y| = 1$.
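For concreteness, here is a small sketch (ours, not the paper's) of the three promise functions just defined; `None` marks inputs outside the promise, and the function names are assumptions chosen for illustration:

```python
def gap_or(x):
    """GapOr on an m-bit input: 0 if all-zeros, 1 if Hamming weight m/2."""
    m, w = len(x), sum(x)
    if w == 0:
        return 0
    if w == m // 2:
        return 1
    return None  # outside the promise

def gap_maj(x):
    """GapMaj on an m-bit input: 0 if weight m/3, 1 if weight 2m/3."""
    m, w = len(x), sum(x)
    if w == m // 3:
        return 0
    if w == 2 * m // 3:
        return 1
    return None  # outside the promise

def which(y):
    """Which on a 2-bit input, promised to contain exactly one 1:
    output the index (0 or 1) of that 1."""
    return y.index(1)

assert gap_or([0] * 10) == 0 and gap_or([1] * 5 + [0] * 5) == 1
assert gap_or([1] + [0] * 9) is None
assert gap_maj([1] * 4 + [0] * 8) == 0 and gap_maj([0] * 4 + [1] * 8) == 1
assert which((1, 0)) == 0 and which((0, 1)) == 1
```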
Then $g = \mathrm{Which} \circ \mathrm{GapOr}^2$ takes an input of length $2m$ with the promise that there are exactly $m/2$ many $1$s, either all in the left half or all in the right half, and outputs which half has the $1$s. It turns out that $\mathrm{BPP}(g) \le O(1)$ and $\mathrm{BPP}_{1/n}(g) \ge \Omega(\log n)$ provided $m \ge \log n$ (for similar reasons as for $\mathrm{GapOr}$ itself), and yet $\mathrm{BPP}(f \circ g^n) \le O(\mathrm{BPP}(f))$ for all $f$: To compute $f \circ g^n$, we can run an optimal randomized decision tree for $f$ and whenever it queries $g(x^i)$, we repeatedly query uniformly random bit positions of $x^i$ until we find a $1$ (so the value of $g(x^i)$ is determined by which half we found a $1$ in). This has the same error probability as the randomized decision tree for $f$, and the total number of queries to the bits of $(x^1,\ldots,x^n)$ is $O(\mathrm{BPP}(f))$ in expectation, because for each $i$ it takes $O(1)$ queries in expectation to locate a $1$ in $x^i$. By Markov's inequality, with high constant probability this halts after only $O(\mathrm{BPP}(f))$ total queries. Thus by aborting the computation if it attempts to make too many queries, we obtain a randomized decision tree for $f \circ g^n$ that always makes $O(\mathrm{BPP}(f))$ queries, with only a small hit in the error probability.

Blais and Brody [BB19] adjust the statement of Question A so the answer becomes "yes" in the case $f = \mathrm{Id}$. Specifically, they weaken the right-hand side in such a way that the above counterexample is ruled out. Defining $\overline{\mathrm{BPP}}_\varepsilon(g)$ similarly to $\mathrm{BPP}_\varepsilon(g)$ but where the cost of a randomized decision tree is the maximum over all inputs (on which $g$ is defined) of the expected number of queries, we now have $\overline{\mathrm{BPP}}_{1/n}(g) \le \overline{\mathrm{BPP}}_0(g) \le O(1)$ for the $g$ from the counterexample. The theorem from [BB19] is $\mathrm{BPP}(f \circ g^n) \ge \Omega(\mathrm{BPP}(f) \cdot \overline{\mathrm{BPP}}_{1/n}(g))$ when $f = \mathrm{Id}$; in other words, $\mathrm{BPP}(g^n) \ge \Omega(n \cdot \overline{\mathrm{BPP}}_{1/n}(g))$ (a "strong direct sum" result).
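The constant-expected-cost subroutine from the counterexample above can be simulated directly. This sketch uses invented names and the arbitrary convention that output $0$ means "the $1$s are in the left half"; each query hits a $1$ with probability $(m/2)/(2m) = 1/4$, so the expected number of queries is $4$, independent of $m$:

```python
import random

def eval_which_gapor(x, rng):
    """Promise: the m/2 ones of the length-2m input x lie entirely in one
    half. Query uniformly random positions until a 1 is found; return
    (which half it was in, number of queries made)."""
    queries = 0
    while True:
        pos = rng.randrange(len(x))
        queries += 1
        if x[pos] == 1:
            return (0 if pos < len(x) // 2 else 1), queries

rng = random.Random(0)
m = 64
left_heavy = [1] * (m // 2) + [0] * (m // 2) + [0] * m   # answer 0
right_heavy = [0] * m + [0] * (m // 2) + [1] * (m // 2)  # answer 1

total = 0
for _ in range(2000):
    ans, q = eval_which_gapor(left_heavy, rng)
    assert ans == 0          # any 1 found is in the correct half
    total += q
assert 3.5 < total / 2000 < 4.5   # empirical mean close to 4
assert eval_which_gapor(right_heavy, rng)[0] == 1
```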
[BB19] also explicitly asked whether similar results hold for other functions $f$. The corresponding conjecture for $f = \mathrm{Or}$ is false (as we note below), while for $f = \mathrm{Xor}$ and $f = \mathrm{Maj}$ it remains open. To make progress, we step back and ask a seemingly more innocuous version of the question:

Question B, with respect to a particular $f$: Is (1) tight for some partial function $g$?

It turns out the answer is "no" for $f = \mathrm{Or}$ and is "yes" for both $f = \mathrm{Xor}$ and $f = \mathrm{Maj}$.

1.2 Decision trees with noisy inputs

Question B is related to "query complexity with noisy inputs" (introduced in [FRPU94]), so let us review the latter model: When input bit $y_i$ is queried, the wrong bit value is returned to the decision tree with some probability $\le 1/3$ (and the correct value of $y_i$ is returned with the remaining probability). The "noise events" are independent across all queries, including multiple queries to the same input bit. Now the adversary gets to pick not only the input, but also the "noise probabilities." [FRPU94] distinguishes between two extreme possibilities: a static adversary has a single common noise probability for all queries, while a dynamic adversary can choose a different noise probability for each node in the decision tree. In this paper we make a reasonable compromise: the adversary gets to choose a tuple of noise probabilities $(\nu_1,\ldots,\nu_n)$, and each query to $y_i$ returns $1 - y_i$ with probability exactly $\nu_i$. When a randomized decision tree computes $f$ with error probability $\varepsilon$, that means for every input $y \in \{0,1\}^n$ and every noise probability tuple $(\nu_1,\ldots,\nu_n)$ (with $\nu_i \le 1/3$ for each $i$), the output is $f(y)$ with probability $\ge 1 - \varepsilon$ over the random noise and the randomness of the decision tree. We invent the notation $\mathrm{BPP}^*(f)$ for the minimum cost of any randomized decision tree that computes $f$ on noisy inputs, with error probability $1/3$.
(Footnote: [BB19] used the notation $\mathrm{R}$ instead of $\mathrm{BPP}$.)

We have $\mathrm{BPP}^*(f) \le O(\mathrm{BPP}(f) \cdot \log n) \le O(n \log n)$, by repeating each query $O(\log n)$ times and taking the majority vote (to drive the noise probabilities down to $o(1/n)$), and using a union bound to absorb the noise probabilities into the error probability. The connection with composition is that $\mathrm{BPP}(f \circ g^n) \le \mathrm{BPP}^*(f) \cdot \mathrm{BPP}(g)$, because to design a randomized decision tree for $f \circ g^n$, we can take a $1/3$-error randomized decision tree for $f$ with noisy inputs, and replace each query (say, to $y_i$) with a $1/3$-error randomized decision tree for evaluating $g(x^i)$.

There is a similar connection for $1$-sided error and $1$-sided noise. When a randomized decision tree has $1$-sided error $\varepsilon$, that means on $0$-inputs the output is wrong with probability $0$, and on $1$-inputs the output is wrong with probability at most $\varepsilon$. We let $\mathrm{RP}(g)$ denote the minimum cost of any randomized decision tree that computes $g$ with $1$-sided error $1/2$. Similarly, $1$-sided noise means that when input bit $y_i$ is queried, if the actual value is $y_i = 0$ then $1$ is returned with probability $0$, and if the actual value is $y_i = 1$ then $0$ is returned with probability $\nu_i \le 1/2$. We invent the notation $\mathrm{BPP}^\dagger(f)$ for the minimum cost of any randomized decision tree that computes $f$ on $1$-sided noisy inputs, with $2$-sided error probability $1/3$. We have $\mathrm{BPP}(f) \le \mathrm{BPP}^\dagger(f) \le \mathrm{BPP}^*(f)$. The connection $\mathrm{BPP}(f \circ g^n) \le \mathrm{BPP}^\dagger(f) \cdot \mathrm{RP}(g)$ holds as in the $2$-sided noise setting. We officially record these observations:

Observation 1. For all $f$ and $g$, $\mathrm{BPP}(f \circ g^n) \le \mathrm{BPP}^*(f) \cdot \mathrm{BPP}(g)$ and $\mathrm{BPP}(f \circ g^n) \le \mathrm{BPP}^\dagger(f) \cdot \mathrm{RP}(g)$.

The upshot is that noisy upper bounds imply composition upper bounds, and composition lower bounds imply noisy lower bounds.
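The repeat-and-majority idea behind $\mathrm{BPP}^*(f) \le O(\mathrm{BPP}(f) \cdot \log n)$ can be sketched as follows (our illustration; `robust_read` and the particular repetition count are assumptions, not values from the paper):

```python
import random

def noisy_query(y, i, nu, rng):
    """One noisy read of bit y[i]: flipped with probability nu[i] <= 1/3."""
    return y[i] ^ (rng.random() < nu[i])

def robust_read(y, i, nu, reps, rng):
    """Majority vote over `reps` independent noisy reads of y[i];
    with reps = O(log n), each read is correct except with prob o(1/n)."""
    ones = sum(noisy_query(y, i, nu, rng) for _ in range(reps))
    return 1 if 2 * ones > reps else 0

rng = random.Random(1)
y = [0, 1, 1, 0, 1, 0, 0, 1]
nu = [1/3] * len(y)

# With 161 repeats per bit, all 8 bits are read correctly with
# overwhelming probability, so a union bound over the bits succeeds.
recovered = [robust_read(y, i, nu, 161, rng) for i in range(len(y))]
assert recovered == y
```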
There are many proofs of the result $\mathrm{BPP}^*(\mathrm{Or}) \le O(n)$ [FRPU94, KK94, New09, GS10]:

Theorem 1 (Or never necessitates amplification). $\mathrm{BPP}^*(\mathrm{Or}) \le O(n)$, and thus for every partial function $g$, $\mathrm{BPP}(\mathrm{Or} \circ g^n) \le O(n \cdot \mathrm{BPP}(g))$.

Theorem 1 is not new, but in Appendix A we provide a particularly clean and elementary proof (related to, but more streamlined than, the proof in [KK94]). We mention that the proof straightforwardly generalizes to some other functions $f$, such as "odd-max-bit": $\mathrm{Omb}(y) = 1$ iff the highest index of any $1$ in $y$ is odd.

We turn our attention to lower bounds. Various special-purpose techniques have been developed for proving query complexity lower bounds in the noisy setting [FRPU94, EP98, DR08, GS10]. However, a conceptual consequence of Observation 1 is that special-purpose techniques are not generally necessary: we can just use techniques for lower bounding plain (non-noisy) randomized query complexity, applied to composed functions.

1.3 Lower bound for parity

[FRPU94] proved that $\mathrm{BPP}^*(\mathrm{Xor})$ and $\mathrm{BPP}^*(\mathrm{Maj})$ are $\Omega(n \log n)$. Although apparently not recorded in the literature, it is possible to generalize this result to show that $\mathrm{BPP}^\dagger(\mathrm{Xor})$ and $\mathrm{BPP}^\dagger(\mathrm{Maj})$ are $\Omega(n \log n)$. However, we prove results even stronger than that, using the composition paradigm. Our results involve query complexity models that are more powerful than $\mathrm{BPP}$, and even more powerful than the $\overline{\mathrm{BPP}}$ model from [BB19].
This follows a theme from a lot of prior work: since $\mathrm{BPP}$ query complexity is rather subtle, we can make progress by studying related models that are somewhat more "well-behaved."

- As observed in [BB19], the $\overline{\mathrm{BPP}}$ model is equivalent to one where the cost is the worst-case (rather than expected) number of queries, and a randomized decision tree is allowed to abort (i.e., output a special symbol $\bot$) with at most a small constant probability, and the output should be correct with high probability conditioned on not aborting.

- If we strengthen the above model by allowing the non-abort probability to be arbitrarily close to $0$ (rather than close to $1$), but require that the non-abort probabilities are approximately the same for all inputs (within some factor close to $1$), the resulting model has been called $2\mathrm{WAPP}$ ("$2$-sided weak almost-wide PP") [GLM+16, GJPW18]. The "$1$-sided" version $\mathrm{WAPP}$, defined later, will be relevant to us.

- If we further strengthen the model by allowing the non-abort probabilities to be completely unrelated for different inputs (and still arbitrarily close to $0$), the resulting model has been called $\mathrm{PostBPP}$ ("$\mathrm{BPP}$ with post-selection") [GLM+16, Cad18].

We first consider the last of these models. $\mathrm{PostBPP}_\varepsilon(f)$ is the minimum cost of any randomized decision tree such that on every input $x$ (for which $f(x)$ is defined), the probability of outputting $\bot$ is $< 1$, and the probability of outputting $f(x)$ is $\ge 1 - \varepsilon$ conditioned on not outputting $\bot$. Trivially, $\mathrm{PostBPP}(f) \le \mathrm{BPP}(f)$. In fact, the $\mathrm{PostBPP}$ model is much more powerful than plain randomized query complexity; for example (as noted in [GLM+16]), it can efficiently compute the aforementioned odd-max-bit function: $\mathrm{PostBPP}(\mathrm{Omb}) \le 1$.
For the noisy input setting, $\mathrm{PostBPP}^*$ and $\mathrm{PostBPP}^\dagger$ are defined in the natural way, and $\mathrm{PostBPP}(f \circ g^n) \le \mathrm{PostBPP}^*(f) \cdot \mathrm{BPP}(g)$ and $\mathrm{PostBPP}(f \circ g^n) \le \mathrm{PostBPP}^\dagger(f) \cdot \mathrm{RP}(g)$ hold as in Observation 1. In Section 2 we prove something qualitatively much stronger than $\mathrm{BPP}^*(\mathrm{Xor}) \ge \Omega(n \log n)$:

Theorem 2 (Xor sometimes necessitates amplification). For some partial function $g$, namely $g = \mathrm{GapMaj}$ with $m \ge \log n$, $\mathrm{PostBPP}(\mathrm{Xor} \circ g^n) \ge \Omega(n \cdot \mathrm{BPP}_{1/n}(g)) \ge \Omega(n \log n \cdot \mathrm{BPP}(g))$. In particular, $\mathrm{PostBPP}^*(\mathrm{Xor}) \ge \Omega(n \log n)$.

Let us compare Theorem 2 to two previous results.

- [EP98] proved that $\mathrm{BPP}^*(\mathrm{Xor}) \ge \Omega(n \log n)$ and that this lower bound holds even in the average-case setting (i.e., $\Omega(n \log n)$ queries are needed in expectation to succeed with high probability over a uniformly random input, random noise, and the randomness of the decision tree). Our proof of Theorem 2 is simpler than the proof in [EP98] (though both proofs have a Fourier flavor), it also works in the average-case setting, and it yields a stronger result since the model is $\mathrm{PostBPP}$ instead of just $\mathrm{BPP}$ (and the lower bound holds for composition rather than just noisy inputs). [DR08] presented a different simplified proof of the result from [EP98], but that proof does not generalize to $\mathrm{PostBPP}^*$.

- Our proof of Theorem 2 shows something analogous, but incomparable, to the strong direct sum from [BB19]. As we explain in Section 2, our proof shows that $\mathrm{PostBPP}(\mathrm{Xor} \circ g^n) \ge \Omega(n \cdot \mathrm{PostBPP}_{1/n}(g))$ holds for all $g$ (thus addressing a version of our Question A).
Compared to the [BB19] result that $\mathrm{BPP}(\mathrm{Id} \circ g^n) \ge \Omega(n \cdot \overline{\mathrm{BPP}}_{1/n}(g))$ for all $g$, our result has the advantages of working for $f = \mathrm{Xor}$ rather than $f = \mathrm{Id}$ and yielding a qualitatively stronger lower bound ($\mathrm{PostBPP}$ rather than $\mathrm{BPP}$ on the left side), but the disadvantage of also requiring the qualitatively stronger type of lower bound on $g$. Our result shows that if amplifying $g$ requires a log factor in a very strong sense (even $\mathrm{PostBPP}$-type decision trees cannot avoid the log factor), then that log factor will be necessary when composing $\mathrm{Xor}$ with $g$.

1.4 Lower bound for majority

Our main result strengthens the bound $\mathrm{BPP}^*(\mathrm{Maj}) \ge \Omega(n \log n)$ from [FRPU94], mainly by holding for the stronger model $\mathrm{WAPP}$ (rather than just $\mathrm{BPP}$), but also by directly handling $1$-sided noise and by holding for composition rather than just noisy inputs.

$\mathrm{WAPP}_\varepsilon(f)$ is the minimum cost of any randomized decision tree such that for some $t > 0$, on input $x$ the probability of outputting $1$ is in the range $[(1-\varepsilon)t,\, t]$ if $f(x) = 1$, and in the range $[0,\, \varepsilon t]$ if $f(x) = 0$. The $\varepsilon$ subscript should always be specified because, unlike $\mathrm{BPP}$ and $\mathrm{PostBPP}$, $\mathrm{WAPP}$ is not amenable to efficient amplification of the error parameter $\varepsilon$ [GLM+16]. For every constant $0 < \varepsilon < 1/2$, we have $\mathrm{PostBPP}(f) \le O(\mathrm{WAPP}_\varepsilon(f)) \le O(\mathrm{BPP}(f))$. $\mathrm{WAPP}$-type query complexity has several aliases, such as "approximate conical junta degree" and "approximate query complexity in expectation," and it has recently played a central role in various randomized query (and communication) complexity lower bounds [KLdW15, GLM+16, GJ16, GJPW18]. One can think of $\mathrm{WAPP}$ as a nonnegative version of approximate polynomial degree (which corresponds to the class $\mathrm{AWPP}$); in other words, it is a classical analogue of the polynomial method used to lower bound quantum algorithms.
For the noisy input setting, $\mathrm{WAPP}^*$ and $\mathrm{WAPP}^\dagger$ are defined in the natural way, and $\mathrm{WAPP}_\varepsilon(f \circ g^n) \le \mathrm{WAPP}^*_\varepsilon(f) \cdot \mathrm{BPP}(g)$ and $\mathrm{WAPP}_\varepsilon(f \circ g^n) \le \mathrm{WAPP}^\dagger_\varepsilon(f) \cdot \mathrm{RP}(g)$ hold as in Observation 1. We prove the following theorem, which shows that $\mathrm{WAPP}$ sometimes requires amplification, even in the one-sided noise setting.

Theorem 3 (Maj sometimes necessitates amplification). For some partial function $g$, namely $g = \mathrm{GapOr}$ with $m \ge \log n$, and some constant $\varepsilon > 0$, $\mathrm{WAPP}_\varepsilon(\mathrm{Maj} \circ g^n) \ge \Omega(n \cdot \mathrm{BPP}_{1/n}(g)) \ge \Omega(n \log n \cdot \mathrm{RP}(g))$. In particular, $\mathrm{WAPP}^\dagger_\varepsilon(\mathrm{Maj}) \ge \Omega(n \log n)$.

This theorem should be contrasted with the work of Sherstov on making polynomials robust to noise [She13]. In that work, Sherstov showed that approximate polynomial degree never requires a log factor in the noisy input setting, nor in composition. That is to say, he improved the simple bound $\mathrm{AWPP}^*(f) \le O(\mathrm{AWPP}(f) \cdot \log n)$ to $\mathrm{AWPP}^*(f) \le O(\mathrm{AWPP}(f))$ for all boolean functions $f$, and showed $\mathrm{AWPP}(f \circ g^n) \le O(\mathrm{AWPP}(f) \cdot \mathrm{AWPP}(g))$. In contrast, for conical juntas (nonnegative linear combinations of conjunctions), Theorem 3 shows that in a strong sense, the simple bound $\mathrm{WAPP}^*_\varepsilon(f) \le O(\mathrm{WAPP}_\delta(f) \cdot \log n)$ (for all constants $0 < \delta < \varepsilon < 1/2$ and total boolean functions $f$) cannot be improved: $\mathrm{WAPP}^\dagger_\varepsilon(f) \ge \Omega(\mathrm{WAPP}_0(f) \cdot \log n)$ for some constant $\varepsilon$ and some total $f$, namely $f = \mathrm{Maj}$. Thus, unlike polynomials, conical juntas cannot be made robust to noise.

Our proof of Theorem 3 (in Section 3) introduces some technical ideas that may be useful for other randomized query complexity lower bounds. By a simple reduction, Theorem 3 for $g = \mathrm{GapOr}$ implies the same for $g = \mathrm{GapMaj}$ (with $\mathrm{BPP}(g) = 1$ instead of $\mathrm{RP}(g) = 1$ at the end of the statement), but we do not know of a simpler direct proof for the latter result.
Theorem 3 cannot be strengthened to have $\mathrm{PostBPP}$ in place of $\mathrm{WAPP}$, because $\mathrm{PostBPP}(\mathrm{Maj} \circ \mathrm{GapMaj}^n) \le O(n)$. However, Theorem 3 does hold with $\mathrm{Xor}$ in place of $\mathrm{Maj}$, by the same proof.

2 Proof of Theorem 2: Xor sometimes necessitates amplification

We first discuss a standard technique for proving randomized query complexity lower bounds, which will be useful in the proof of Theorem 2. For any conjunction $C\colon \{0,1\}^k \to \{0,1\}$ and distribution $\mathcal{D}$ over $\{0,1\}^k$, we write $C(\mathcal{D}) := \mathbb{E}_{x\sim\mathcal{D}}[C(x)] = \Pr_{x\sim\mathcal{D}}[C(x) = 1]$. The number of literals in a conjunction is called its width.

Fact 1. Let $h\colon \{0,1\}^k \to \{0,1\}$ be a partial function, and for each $z \in \{0,1\}$ let $\mathcal{D}_z$ be a distribution over $h^{-1}(z)$. Then for every $\varepsilon$ there exist a conjunction $C$ of width $\mathrm{PostBPP}_\varepsilon(h)$ and a $z \in \{0,1\}$ such that $\varepsilon \cdot C(\mathcal{D}_z) \ge (1-\varepsilon) \cdot C(\mathcal{D}_{1-z})$ and $C(\mathcal{D}_z) > 0$.

Proof. Abbreviate $\mathrm{PostBPP}_\varepsilon(h)$ as $r$. Fix a randomized decision tree of cost $r$ computing $h$ with error $\varepsilon$ conditioned on not aborting, and assume w.l.o.g. that for each outcome of the randomness, the corresponding deterministic tree is a perfect tree with $2^r$ leaves, all at depth $r$. Consider the probability space where we sample an input $x$ from the mixture $\frac{1}{2}\mathcal{D}_0 + \frac{1}{2}\mathcal{D}_1$, sample a deterministic decision tree $T$ as an outcome of the randomized decision tree, and sample a uniformly random leaf $\ell$ of $T$. Let $A$ be the indicator random variable for the event that $\ell$ is the leaf reached by $T(x)$ and its label is $h(x)$. Let $B$ be the indicator random variable for the event that $\ell$ is the leaf reached by $T(x)$ and its label is $1 - h(x)$. Conditioned on any particular $x$ and $T$, the probability that $\ell$ is the leaf reached by $T(x)$ is $2^{-r}$.
Thus, conditioned on any particular $x$, if the non-abort probability is $t_x > 0$ then $\mathbb{E}[A \mid x] \ge 2^{-r} t_x (1-\varepsilon)$ and $\mathbb{E}[B \mid x] \le 2^{-r} t_x \varepsilon$, and thus $\varepsilon \cdot \mathbb{E}[A \mid x] - (1-\varepsilon) \cdot \mathbb{E}[B \mid x] \ge 0$. Over the whole probability space, we have $\varepsilon \cdot \mathbb{E}[A] - (1-\varepsilon) \cdot \mathbb{E}[B] \ge 0$, so by linearity the same must hold conditioned on some particular $T$ and $\ell$ with $\mathbb{E}[A \mid T, \ell] > 0$. Let $C$ be the conjunction of width $r$ such that $C(x) = 1$ iff $T(x)$ reaches $\ell$, and let $z$ be the label of $\ell$. Then we have $C(\mathcal{D}_z) = \mathbb{E}[A \mid T, \ell,\, h(x) = z] = 2 \cdot \mathbb{E}[A \mid T, \ell] > 0$ and similarly $C(\mathcal{D}_{1-z}) = 2 \cdot \mathbb{E}[B \mid T, \ell]$. Thus $\varepsilon \cdot C(\mathcal{D}_z) - (1-\varepsilon) \cdot C(\mathcal{D}_{1-z}) = 2 \cdot \big(\varepsilon \cdot \mathbb{E}[A \mid T, \ell] - (1-\varepsilon) \cdot \mathbb{E}[B \mid T, \ell]\big) \ge 0$. □

Now we work toward proving Theorem 2. Throughout, $n$ is the input length of $\mathrm{Xor}$, and $m$ is the input length of $\mathrm{GapMaj}$. We have $\mathrm{BPP}(\mathrm{GapMaj}) \le 1$ by outputting the bit at a uniformly random position of the input. We describe one way of seeing that $\mathrm{BPP}_{1/n}(\mathrm{GapMaj}) \ge \mathrm{PostBPP}_{1/n}(\mathrm{GapMaj}) \ge \Omega(\log n)$ provided $m \ge \log n$. For $z \in \{0,1\}$, define $\mathcal{G}_z$ as the uniform distribution over $\mathrm{GapMaj}^{-1}(z)$.

Fact 2. For every conjunction $C\colon \{0,1\}^m \to \{0,1\}$ of width $w \le m/7$ and for each $z \in \{0,1\}$, $C(\mathcal{G}_z) \le 3^w \cdot C(\mathcal{G}_{1-z})$.

Proof. By symmetry we just consider $z = 0$. Suppose $C$ has $u$ positive literals and $v$ negative literals ($u + v = w$). Then

$$C(\mathcal{G}_0) = \binom{m-w}{m/3-u} \Big/ \binom{m}{m/3} \le \binom{m-w}{m/3} \Big/ \binom{m}{m/3} = \frac{(2m/3)(2m/3-1)\cdots(2m/3-w+1)}{m(m-1)\cdots(m-w+1)} \le (2/3)^w,$$

$$C(\mathcal{G}_1) = \binom{m-w}{m/3-v} \Big/ \binom{m}{m/3} \ge \binom{m-w}{m/3-w} \Big/ \binom{m}{m/3} = \frac{(m/3)(m/3-1)\cdots(m/3-w+1)}{m(m-1)\cdots(m-w+1)} \ge \left(\frac{m/3-w}{m-w}\right)^{\!w} \ge \left(\frac{m/3-m/7}{m-m/7}\right)^{\!w} = (2/9)^w.$$

Thus $C(\mathcal{G}_0)/C(\mathcal{G}_1) \le \left(\frac{2/3}{2/9}\right)^{\!w} = 3^w$. □
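The two bounds in the proof of Fact 2 can be sanity-checked numerically (an illustration of ours, not part of the proof); `accept_prob` is a hypothetical helper computing the acceptance probability of a conjunction with $u$ positive and $v$ negative literals under the uniform distribution on a fixed Hamming weight:

```python
from math import comb

def accept_prob(m, weight, u, v):
    """Pr over uniform x with |x| = weight that u fixed positions are
    all 1 and v other fixed positions are all 0."""
    w = u + v
    if weight - u < 0 or weight - u > m - w:
        return 0.0
    return comb(m - w, weight - u) / comb(m, weight)

m = 84  # divisible by 3 and 7, so the weights m/3, 2m/3 and cap m/7 are exact
for u in range(7):
    for v in range(7):
        w = u + v
        if w == 0 or w > m // 7:
            continue
        c_g0 = accept_prob(m, m // 3, u, v)       # weight m/3  (GapMaj = 0)
        c_g1 = accept_prob(m, 2 * m // 3, u, v)   # weight 2m/3 (GapMaj = 1)
        assert c_g0 <= (2/3)**w + 1e-12
        assert c_g1 >= (2/9)**w - 1e-12
        assert c_g0 <= 3**w * c_g1 + 1e-12
```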
Combining Fact 1 and Fact 2 (using $h = \mathrm{GapMaj}$, $k = m$, $\mathcal{D}_z = \mathcal{G}_z$, $\varepsilon = 1/n$, and $w = \mathrm{PostBPP}_\varepsilon(h)$) implies that $(1-\varepsilon)/\varepsilon \le 3^w$; in other words, we have $\mathrm{PostBPP}_{1/n}(\mathrm{GapMaj}) \ge \log_3(n(1-1/n)) \ge \Omega(\log n)$, provided $w \le m/7$. If $w > m/7$ then $\mathrm{PostBPP}_{1/n}(\mathrm{GapMaj}) \ge \Omega(\log n)$ holds anyway provided $m \ge \log n$. Hence, our result can be restated as follows.

Theorem 2 (Restated). $\mathrm{PostBPP}(\mathrm{Xor} \circ \mathrm{GapMaj}^n) \ge \Omega(n \log n)$ provided $m \ge \log n$.

Proof. We show $\mathrm{PostBPP}(\mathrm{Xor} \circ \mathrm{GapMaj}^n) > \frac{1}{14} n \log n$. By Fact 1 (using $h = \mathrm{Xor} \circ \mathrm{GapMaj}^n$, $k = nm$, and $\varepsilon = 1/3$) it suffices to exhibit for each $z \in \{0,1\}$ a distribution $\mathcal{D}_z$ over $(\mathrm{Xor} \circ \mathrm{GapMaj}^n)^{-1}(z)$ such that for every conjunction $C$ of width $\le \frac{1}{14} n \log n$ and for each $z \in \{0,1\}$, either $C(\mathcal{D}_z) < 2\,C(\mathcal{D}_{1-z})$ or $C(\mathcal{D}_z) = 0$. Letting $\mathcal{F}_z$ be the uniform distribution over $\mathrm{Xor}^{-1}(z)$, define $\mathcal{D}_z$ as the mixture over $y \sim \mathcal{F}_z$ of $\mathcal{G}^y := \mathcal{G}_{y_1} \times \cdots \times \mathcal{G}_{y_n}$ (i.e., $(x^1,\ldots,x^n) \sim \mathcal{G}^y$ is sampled by independently sampling $x^i \sim \mathcal{G}_{y_i}$ for all $i$). Put succinctly, $\mathcal{D}_z := \mathbb{E}_{y\sim\mathcal{F}_z}[\mathcal{G}^y]$.

Letting $\mathcal{G} := \frac{1}{2}\mathcal{G}_0 + \frac{1}{2}\mathcal{G}_1$ and $\mathcal{F} := \frac{1}{2}\mathcal{F}_0 + \frac{1}{2}\mathcal{F}_1$ and $\mathcal{D} := \frac{1}{2}\mathcal{D}_0 + \frac{1}{2}\mathcal{D}_1$, we have $\mathcal{D} = \mathcal{G}^n$ since $\mathcal{F}$ is uniform over $\{0,1\}^n$. Since $C(\mathcal{D}) = \frac{1}{2}C(\mathcal{D}_0) + \frac{1}{2}C(\mathcal{D}_1)$, our goal of showing "$\frac{1}{2}C(\mathcal{D}_0) < C(\mathcal{D}_1) < 2\,C(\mathcal{D}_0)$ or $C(\mathcal{D}_0) = C(\mathcal{D}_1) = 0$" is equivalent to showing "$\frac{2}{3}C(\mathcal{D}) < C(\mathcal{D}_1) < \frac{4}{3}C(\mathcal{D})$ or $C(\mathcal{D}) = 0$."

Now consider any conjunction $C$ of width $w \le \frac{1}{14} n \log n$ such that $C(\mathcal{D}) > 0$, and write $C(x^1,\ldots,x^n) = \prod_i C_i(x^i)$ where each $C_i$ is a conjunction. Since $C_i(\mathcal{G}) = \frac{1}{2}C_i(\mathcal{G}_0) + \frac{1}{2}C_i(\mathcal{G}_1)$, for each $y_i \in \{0,1\}$ we can write $C_i(\mathcal{G}_{y_i}) = (1 + a_i(-1)^{y_i})\, C_i(\mathcal{G})$ for some number $a_i$ with $|a_i| \le 1$ (so $a_i \ge 0$ iff $C_i(\mathcal{G}_0) \ge C_i(\mathcal{G}_1)$). Let $w_i$ be the width of $C_i$, so $\sum_i w_i = w \le \frac{1}{14} n \log n$.
Then $w_i \le \frac{1}{7}\log n \le m/7$ for at least $n/2$ many values of $i$, and for such $i$ note that by Fact 2, $C_i(\mathcal{G}_{y_i}) \le 3^{(\log n)/7} \cdot C_i(\mathcal{G}_{1-y_i}) \le n^{1/4} \cdot C_i(\mathcal{G}_{1-y_i})$ for each $y_i \in \{0,1\}$. The latter implies that $|a_i| \le 1 - 2/(n^{1/4}+1) \le 1 - n^{-1/4}$. Thus $\big|\prod_i a_i\big| = \prod_i |a_i| \le (1 - n^{-1/4})^{n/2} \le e^{-n^{3/4}/2} \le 1/4$.

For $S \subseteq [n]$, let $\chi_S\colon \{0,1\}^n \to \{1,-1\}$ be the character $\chi_S(y) := \prod_{i \in S}(-1)^{y_i} = (-1)^{\sum_{i \in S} y_i}$. Note that $\mathbb{E}_{y\sim\mathcal{F}_1}[\chi_S]$ is $1$ if $S = \emptyset$, is $-1$ if $S = [n]$, and is $0$ otherwise. Putting everything together,

$$C(\mathcal{D}_1) = \mathbb{E}_{y\sim\mathcal{F}_1}[C(\mathcal{G}^y)] = \mathbb{E}_{y\sim\mathcal{F}_1}\Big[\textstyle\prod_i C_i(\mathcal{G}_{y_i})\Big] = \mathbb{E}_{y\sim\mathcal{F}_1}\Big[\textstyle\prod_i (1 + a_i(-1)^{y_i})\, C_i(\mathcal{G})\Big] = \Big(\textstyle\prod_i C_i(\mathcal{G})\Big) \cdot \mathbb{E}_{y\sim\mathcal{F}_1}\Big[\textstyle\sum_{S\subseteq[n]} \prod_{i\in S} a_i(-1)^{y_i}\Big] = C(\mathcal{D}) \cdot \textstyle\sum_{S\subseteq[n]} \big(\prod_{i\in S} a_i\big) \cdot \mathbb{E}_{y\sim\mathcal{F}_1}[\chi_S(y)] = C(\mathcal{D}) \cdot \Big(1 - \textstyle\prod_{i\in[n]} a_i\Big) \in C(\mathcal{D}) \cdot (1 \pm 1/4),$$

which implies $\frac{2}{3}C(\mathcal{D}) < C(\mathcal{D}_1) < \frac{4}{3}C(\mathcal{D})$ since we are assuming $C(\mathcal{D}) > 0$. This concludes the proof of Theorem 2. □

Using strong LP duality (as in [GL14]), it can be seen that Fact 1 is a tight lower bound method up to constant factors: $\mathrm{PostBPP}_\varepsilon(h) \ge \Omega(c)$ iff it is possible to prove this via Fact 1 by exhibiting "hard input distributions" $\mathcal{D}_0$ and $\mathcal{D}_1$ (as we did for $\mathrm{GapMaj}$ in Fact 2). Since this was the only property of $g$ used in the proof of Theorem 2, this implies that $\mathrm{BPP}(\mathrm{Xor} \circ g^n) \ge \mathrm{PostBPP}(\mathrm{Xor} \circ g^n) \ge \Omega(n \cdot \mathrm{PostBPP}_{1/n}(g))$ holds for all $g$, as we mentioned in Section 1.3.

3 Proof of Theorem 3: Maj sometimes necessitates amplification

We first discuss a standard technique for proving randomized query complexity lower bounds, which will be useful in the proof of Theorem 3. For any conjunction $C\colon \{0,1\}^k \to \{0,1\}$ and distribution $\mathcal{D}$ over $\{0,1\}^k$, we write $C(\mathcal{D}) := \mathbb{E}_{x\sim\mathcal{D}}[C(x)] = \Pr_{x\sim\mathcal{D}}[C(x) = 1]$.
The number of literals in a conjunction is called its width.

Fact 3. Let $h\colon \{0,1\}^k \to \{0,1\}$ be a partial function, and let $\mathcal{D}_0, \mathcal{D}_1, \mathcal{D}_2$ be three distributions, over $h^{-1}(0)$, $h^{-1}(1)$, and $h^{-1}(0) \cup h^{-1}(1)$ respectively. Then for every $0 < \varepsilon \le 1/10$ there exists a conjunction $C$ of width $\mathrm{WAPP}_\varepsilon(h)$ such that $C(\mathcal{D}_0) \le \delta \cdot C(\mathcal{D}_1)$ and $C(\mathcal{D}_2) \le (1+\delta) \cdot C(\mathcal{D}_1)$ and $C(\mathcal{D}_1) > 0$, where $\delta := 2\sqrt{\varepsilon}$.

The key calculation underlying the proof of Fact 3 is encapsulated in the following:

Fact 4. Let $P_0, P_1, P_2$ be three jointly distributed nonnegative random variables with $\mathbb{E}[P_1] > 0$. For any $0 < \varepsilon \le 1/10$, if $\mathbb{E}[P_0] \le \varepsilon$ and $\mathbb{E}[P_1] \ge 1 - \varepsilon$ and $\mathbb{E}[P_2] \le 1$, then there exists an outcome $o$ such that $P_0(o) \le \delta \cdot P_1(o)$ and $P_2(o) \le (1+\delta) \cdot P_1(o)$ and $P_1(o) > 0$, where $\delta := 2\sqrt{\varepsilon}$.

Proof of Fact 4. Let $W := \{o : P_1(o) > 0\} \ne \emptyset$. Suppose for contradiction that for every outcome $o \in W$, either $P_0(o) > \delta \cdot P_1(o)$ or $P_2(o) > (1+\delta) \cdot P_1(o)$. Then $W$ can be partitioned into events $U$ and $V$ such that $P_0(o) > \delta \cdot P_1(o)$ for every $o \in U$ and $P_2(o) > (1+\delta) \cdot P_1(o)$ for every $o \in V$. Letting $I_U$ and $I_V$ be the indicator random variables for these events, we have $\mathbb{E}[P_1 \cdot I_U] + \mathbb{E}[P_1 \cdot I_V] = \mathbb{E}[P_1]$ and thus either:

- $\mathbb{E}[P_1 \cdot I_U] \ge \sqrt{\varepsilon} \cdot \mathbb{E}[P_1]$, in which case $\mathbb{E}[P_0] \ge \mathbb{E}[P_0 \cdot I_U] > \delta \cdot \mathbb{E}[P_1 \cdot I_U] \ge \delta \cdot \sqrt{\varepsilon} \cdot (1-\varepsilon) = 2\varepsilon(1-\varepsilon) > \varepsilon$, or

- $\mathbb{E}[P_1 \cdot I_V] \ge (1 - \sqrt{\varepsilon}) \cdot \mathbb{E}[P_1]$, in which case $\mathbb{E}[P_2] \ge \mathbb{E}[P_2 \cdot I_V] > (1+\delta) \cdot \mathbb{E}[P_1 \cdot I_V] \ge (1+\delta)(1-\sqrt{\varepsilon})(1-\varepsilon) > 1$, where the last inequality can be verified by a little calculus for $0 < \varepsilon \le 1/10$.

Both cases yield a contradiction. □

Proof of Fact 3. Abbreviate $\mathrm{WAPP}_\varepsilon(h)$ as $r$.
Fix a randomized decision tree of cost $r$ computing $h$ with error parameter $\varepsilon$ and threshold $t > 0$ (from the definition of $\mathsf{WAPP}$), and assume w.l.o.g. that for each outcome of the randomness, the corresponding deterministic tree is a perfect tree with $2^r$ leaves, all at depth $r$. Consider the probability space where we sample a deterministic decision tree $T$ as an outcome of the randomized decision tree, and sample a uniformly random leaf $\ell$ of $T$. For any outcome $T,\ell$, let $C_{T,\ell}$ be the conjunction of width $r$ such that $C_{T,\ell}(x) = 1$ iff $T(x)$ reaches $\ell$. Define three joint random variables $P_0, P_1, P_2$ as

$$P_j(T,\ell) := \begin{cases} C_{T,\ell}(D_j) & \text{if the label of } \ell \text{ is } 1\\ 0 & \text{if the label of } \ell \text{ is } 0.\end{cases}$$

Conditioned on any particular $x$ and $T$, the probability that $\ell$ is the leaf reached by $T(x)$ is $2^{-r}$. Thus

$$\mathbb{E}[P_j] = \mathbb{P}_{T,\ell,\,x\sim D_j}\bigl[\ell \text{ is the leaf reached by } T(x) \text{ and its label is } 1\bigr] = \mathbb{E}_{x\sim D_j}\bigl[2^{-r}\cdot \mathbb{P}_T[T(x) \text{ outputs } 1]\bigr]$$

which implies $\mathbb{E}[P_0] \le 2^{-r}t\varepsilon$ and $\mathbb{E}[P_1] \ge 2^{-r}t(1-\varepsilon)$ and $\mathbb{E}[P_2] \le 2^{-r}t$. Applying Fact 4 to the scaled random variables $(2^r/t)P_0$, $(2^r/t)P_1$, $(2^r/t)P_2$ yields an outcome $T,\ell$ such that $P_0(T,\ell) \le \delta\cdot P_1(T,\ell)$ and $P_2(T,\ell) \le (1+\delta)\cdot P_1(T,\ell)$ and $P_1(T,\ell) > 0$. Since $P_1(T,\ell) > 0$, the label of $\ell$ must be $1$, so we get $C_{T,\ell}(D_0) \le \delta\cdot C_{T,\ell}(D_1)$ and $C_{T,\ell}(D_2) \le (1+\delta)\cdot C_{T,\ell}(D_1)$ and $C_{T,\ell}(D_1) > 0$.

Now we work toward proving Theorem 3. Throughout, $n$ is the input length of Maj, and $m$ is the input length of GapOr. We have $\mathsf{RP}(\mathsf{GapOr}) \le 1$ by outputting the bit at a uniformly random position from the input. We describe one way of seeing that $\mathsf{BPP}_{1/n}(\mathsf{GapOr}) \ge \mathsf{WAPP}_{1/n}(\mathsf{GapOr}) \ge \Omega(\log n)$ provided $m \ge \log n$ (this cannot be shown via Fact 1). For $z\in\{0,1\}$, define $G_z$ as the uniform distribution over $\mathsf{GapOr}^{-1}(z)$.

Fact 5.
For every conjunction $C\colon\{0,1\}^m\to\{0,1\}$:

(i) $C(G_0) \in \{0,1\}$.

(ii) If $C(G_0) = 1$ and $C$ has width $w \le m/4$ then $C(G_1) \ge 3^{-w}$.

Proof. (i): Note that $G_0$ is supported entirely on the input $0^m$. If $C$ has a positive literal then $C(G_0) = 0$. If $C$ has only negative literals then $C(G_0) = 1$.

(ii): Suppose $C$ has $w$ negative literals and no positive literals. Then

$$C(G_1) = \binom{m-w}{m/2}\Big/\binom{m}{m/2} = \frac{(m/2)(m/2-1)\cdots(m/2-w+1)}{m(m-1)\cdots(m-w+1)} \ge \Bigl(\frac{m/2-w}{m-w}\Bigr)^{\!w} \ge \Bigl(\frac{m/2-m/4}{m-m/4}\Bigr)^{\!w} = 3^{-w}.$$

Combining Fact 3 and Fact 5 (using $h = \mathsf{GapOr}$, $k = m$, $D_0 = G_1$, $D_1 = G_0$, $D_2$ is not needed, $\varepsilon = 1/n$, and $w = \mathsf{WAPP}_\varepsilon(h)$) implies that $3^{-w} \le \delta$, in other words $\mathsf{WAPP}_{1/n}(\mathsf{GapOr}) \ge \log_3\bigl(1/(2\sqrt{1/n})\bigr) \ge \Omega(\log n)$, provided $w \le m/4$. If $w > m/4$ then $\mathsf{WAPP}_{1/n}(\mathsf{GapOr}) \ge \Omega(\log n)$ holds anyway provided $m \ge \log n$. Hence, our result can be restated as follows.

Theorem 3 (Restated). $\mathsf{WAPP}_\varepsilon(\mathsf{Maj}\circ\mathsf{GapOr}^n) \ge \Omega(n\log n)$ for some constant $\varepsilon > 0$ provided $m \ge \log n$.

We show $\mathsf{WAPP}_{1/36}(\mathsf{Maj}\circ\mathsf{GapOr}^n) > \frac{1}{16}n\log n$. By Fact 3 (using $h = \mathsf{Maj}\circ\mathsf{GapOr}^n$, $k = nm$, $\varepsilon = 1/36$, and $\delta = 1/3$) it suffices to exhibit distributions $D_0, D_1, D_2$ over $h^{-1}(0)$, $h^{-1}(1)$, and $h^{-1}(0)\cup h^{-1}(1)$ respectively, such that for every conjunction $C$ of width $\le \frac{1}{16}n\log n$, either $C(D_0) > \frac{1}{3}C(D_1)$ or $C(D_2) > \frac{4}{3}C(D_1)$ or $C(D_1) = 0$. Assume $n$ is even and, for the tiebreaker, $\mathsf{Maj}(y) = 1$ if $|y| = n/2$. For $\zeta\in\{0,1,2\}$, letting $F_\zeta$ be the uniform distribution over all $y\in\{0,1\}^n$ with $|y| = n/2 - 1 + \zeta$ (so $F_0, F_1, F_2$ are over $\mathsf{Maj}^{-1}(0)$, $\mathsf{Maj}^{-1}(1)$, $\mathsf{Maj}^{-1}(1)$ respectively), define $D_\zeta$ as the mixture over $y\sim F_\zeta$ of $G_y := G_{y_1}\times\cdots\times G_{y_n}$ (i.e., $(x^1,\ldots,x^n)\sim G_y$ is sampled by independently sampling $x^i\sim G_{y_i}$ for all $i$). Put succinctly, $D_\zeta := \mathbb{E}_{y\sim F_\zeta}[G_y]$.
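Fact 5(ii) is easy to confirm numerically. The following is a small sanity-check sketch (not part of the proof): it evaluates the exact hypergeometric expression $C(G_1) = \binom{m-w}{m/2}/\binom{m}{m/2}$ for a width-$w$ all-negative-literal conjunction and checks it against the bound $3^{-w}$.

```python
from math import comb

def accept_prob_G1(m, w):
    """C(G_1) for a conjunction of w negative literals (and no positive
    literals) on a uniformly random input of Hamming weight m/2."""
    return comb(m - w, m // 2) / comb(m, m // 2)

# Fact 5(ii): for width w <= m/4, the acceptance probability is >= 3^{-w}
for m in (8, 16, 64, 256):
    for w in range(m // 4 + 1):
        assert accept_prob_G1(m, w) >= 3.0 ** (-w)
```

The check passes comfortably: each of the $w$ factors $(m/2-j)/(m-j)$ in the product is at least $1/3$ once $w \le m/4$.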
Now consider any conjunction $C$ of width $w \le \frac{1}{16}n\log n$, and write $C(x^1,\ldots,x^n) = \prod_i C_i(x^i)$ where each $C_i$ is a conjunction. By Fact 5(i), $[n]$ can be partitioned into $A\cup B$ such that $C_i(G_0) = 1$ for all $i\in A$, and $C_i(G_0) = 0$ for all $i\in B$. Abbreviate $C_i(G_1)$ as $c_i$, and for $S\subseteq[n]$ write $c_S := \prod_{i\in S} c_i$. Identify $y\in\{0,1\}^n$ with $Y := \{i : y_i = 1\}$, so $|y| = |Y|$. Let the uniform distribution over all size-$s$ subsets of $S$ be denoted by $\binom{S}{s}$, so $y\sim F_\zeta$ corresponds to $Y\sim\binom{[n]}{n/2-1+\zeta}$. Let $I_{Y\supseteq B} := \prod_{i\notin Y} C_i(G_0)$ be the indicator random variable for the event $Y\supseteq B$. Now for $\zeta\in\{0,1,2\}$,

$$C(D_\zeta) = \mathbb{E}_{y\sim F_\zeta}[C(G_y)] = \mathbb{E}_{y\sim F_\zeta}\Bigl[\prod_i C_i(G_{y_i})\Bigr] = \mathbb{E}_{Y\sim\binom{[n]}{n/2-1+\zeta}}\bigl[c_Y\cdot I_{Y\supseteq B}\bigr] = \underbrace{\mathbb{P}_{Y\sim\binom{[n]}{n/2-1+\zeta}}[Y\supseteq B]}_{p_\zeta}\cdot\, c_B\cdot \underbrace{\mathbb{E}_{S\sim\binom{A}{n/2-1+\zeta-|B|}}[c_S]}_{q_\zeta}.$$

(Footnote: Properties (i) and (ii) from Fact 5 are somewhat stronger than necessary for the proof of Theorem 3 to go through. The proof works, with virtually no modification, for any $g$ satisfying the following for some distributions $G_z$ over $g^{-1}(z)$ ($z\in\{0,1\}$): for every conjunction $C\colon\{0,1\}^m\to\{0,1\}$ such that $C(G_0) > 0$, we have $C(G_1) \le C(G_0)$, and if furthermore $C$ has width $w \le m/4$ then $C(G_1) \ge 2^{-O(w)}\cdot C(G_0)$.)

If $c_B = 0$ then $C(D_1) = 0$, so assume $c_B > 0$. Factoring out $c_B$ and defining $p_\zeta$ and $q_\zeta$ as above (but $q_\zeta$ is undefined if $p_\zeta = 0$), our goal is to show that either $p_0q_0 > \frac{1}{3}p_1q_1$ or $p_2q_2 > \frac{4}{3}p_1q_1$ or $p_1q_1 = 0$. There are three cases depending on whether $|B|$ is greater than, equal to, or less than $n/2$. First we collect some generally useful properties:

Claim 1. (i) $p_0 = \frac{n/2-|B|}{n/2}\cdot p_1$ and $p_1 = \frac{n/2+1-|B|}{n/2+1}\cdot p_2$. (ii) $0 < q_1 \le \sqrt{n}\cdot q_2$ if $q_1$ is defined.

Proof. (i): We just consider $p_0$ vs. $p_1$ since $p_1$ vs. $p_2$ is similar.
Imagine sampling $Y_1\sim\binom{[n]}{n/2}$ and then obtaining the set $Y_0$ by removing a uniformly random $i\in Y_1$. If $Y_1\supseteq B$, then $Y_0\supseteq B$ when $i\in Y_1\smallsetminus B$, which happens with probability $\frac{n/2-|B|}{n/2}$ (assuming $|B| \le n/2$; if $|B| > n/2$ then $p_0 = p_1 = 0$). Thus $p_0 = \mathbb{P}[Y_0\supseteq B] = \mathbb{P}[Y_0\supseteq B \mid Y_1\supseteq B]\cdot\mathbb{P}[Y_1\supseteq B] = \frac{n/2-|B|}{n/2}\cdot p_1$.

(ii): Let $w_i$ be the width of $C_i$, so $\sum_i w_i = w \le \frac{1}{16}n\log n$. Then $w_i \le \frac{1}{4}\log n \le m/4$ for at least $3n/4$ many values of $i$, and for such $i$ note that by Fact 5(ii), $c_i \ge 3^{-(\log n)/4} \ge n^{-2/5}$ if $i\in A$. This implies that if we sample a uniformly random $i$ from any $A'\subseteq A$ with $|A'| = n/2$ (note that $|A| \ge n/2$ if $q_1$ is defined) then $\mathbb{E}_{i\in A'}[c_i] \ge \frac{1}{2}\cdot n^{-2/5} + \frac{1}{2}\cdot 0 \ge 1/\sqrt{n}$. Now to relate $q_2$ and $q_1$,

$$q_2 = \mathbb{E}_{S\sim\binom{A}{n/2-|B|}}\bigl[c_S\cdot\mathbb{E}_{i\in A\smallsetminus S}[c_i]\bigr] \ge \mathbb{E}_{S\sim\binom{A}{n/2-|B|}}\bigl[c_S/\sqrt{n}\bigr] = q_1/\sqrt{n}$$

where the inequality uses $|A\smallsetminus S| = (n-|B|) - (n/2-|B|) = n/2$. Furthermore, $q_1 > 0$ if $q_1$ is defined, because $n/2-|B| \le |A|-n/4$ and thus there exists an $S\subseteq A$ with $|S| = n/2-|B|$ and $c_i \ge n^{-2/5} > 0$ for all $i\in S$, hence $c_S > 0$. (A similar argument shows $0 < q_0 \le \sqrt{n}\cdot q_1$ if $q_0$ is defined, but we will not need that.)

Case $|B| > n/2$: In this case, $p_1 = 0$ so we are done.

Case $|B| = n/2$: By Claim 1, $p_2 = p_1\cdot(n/2+1)$ and $q_2 \ge q_1/\sqrt{n} > 0$, and thus $p_2q_2 \ge p_1q_1\cdot(n/2+1)/\sqrt{n} > \frac{4}{3}p_1q_1$.

Case $|B| < n/2$: We will show that $\frac{p_0}{p_1} \ge \frac{1}{2}\cdot\frac{p_1}{p_2}$ and $\frac{q_2}{q_1} \ge \frac{9}{10}\cdot\frac{q_1}{q_0}$, which yields the punchline: if $p_0q_0 \le \frac{1}{3}p_1q_1$ then

$$\frac{q_2}{q_1} \ge \frac{9}{10}\cdot\frac{q_1}{q_0} \ge \frac{9}{10}\cdot 3\cdot\frac{p_0}{p_1} \ge \frac{9}{10}\cdot 3\cdot\frac{1}{2}\cdot\frac{p_1}{p_2} > \frac{4}{3}\cdot\frac{p_1}{p_2}$$

and thus $p_2q_2 > \frac{4}{3}p_1q_1$. First, $\frac{p_0}{p_1} \ge \frac{1}{2}\cdot\frac{p_1}{p_2}$ follows from Claim 1
(i) using $|B| \le n/2 - 1$:

$$\frac{p_0}{p_1} = \frac{n/2+1}{n/2}\cdot\frac{n/2-|B|}{n/2+1-|B|}\cdot\frac{p_1}{p_2} \ge 1\cdot\frac{n/2-(n/2-1)}{n/2+1-(n/2-1)}\cdot\frac{p_1}{p_2} = \frac{1}{2}\cdot\frac{p_1}{p_2}.$$

It just remains to show $\frac{q_2}{q_1} \ge \frac{9}{10}\cdot\frac{q_1}{q_0}$. Henceforth let $s := n/2-1-|B| \ge 0$. The experiment $S\sim\binom{A}{s+2}$ in the definition of $q_2$ can alternatively be viewed as:

- Sample $S_0\sim\binom{A}{s}$.
- Sample $i\in A\smallsetminus S_0$ u.a.r. and let $S_1 := S_0\cup\{i\}$.
- Sample $j\in A\smallsetminus S_1$ u.a.r. and let $S = S_2 := S_1\cup\{j\}$.

That is, $i$ and $j$ are sampled without replacement. We consider an "ideal" (easier to analyze) version of this experiment that samples $i$ and $j$ with replacement; in other words, the third step becomes:

- Sample $j\in A\smallsetminus S_0$ u.a.r. and let $S_2^* := S_1\cup\{j\}$.

Now $S_2^*$ is a multiset, which may have two copies of $i$, in which case the product $c_{S_2^*}$ has two factors of $c_i$. Just as $q_2 := \mathbb{E}[c_{S_2}]$, we let $q_2^* := \mathbb{E}[c_{S_2^*}]$, and we next show how to derive $\frac{q_2^*}{q_1} \ge \frac{q_1}{q_0}$ from the following claim:

Claim 2. For all nonnegative numbers $\alpha_1,\ldots,\alpha_N$ and $\beta_1,\ldots,\beta_N$ such that $\alpha_k\beta_k > 0$ for some $k$,

$$\frac{\sum_k \alpha_k\beta_k^2}{\sum_k \alpha_k\beta_k} \ge \frac{\sum_k \alpha_k\beta_k}{\sum_k \alpha_k}.$$

Proof. By clearing denominators, this inequality is equivalent to $\bigl(\sum_k\alpha_k\bigr)\bigl(\sum_k\alpha_k\beta_k^2\bigr) \ge \bigl(\sum_k\alpha_k\beta_k\bigr)^2$, which can be rewritten as $\sum_{k,\ell}\alpha_k\alpha_\ell\beta_\ell^2 \ge \sum_{k,\ell}\alpha_k\beta_k\alpha_\ell\beta_\ell$. Subtracting $\sum_k\alpha_k^2\beta_k^2$ from both sides, this is equivalent to $\sum_{k<\ell}\bigl(\alpha_k\alpha_\ell\beta_\ell^2 + \alpha_\ell\alpha_k\beta_k^2\bigr) \ge \sum_{k<\ell}2\alpha_k\beta_k\alpha_\ell\beta_\ell$. We show that this inequality holds for each summand separately. Factoring out $\alpha_k\alpha_\ell$, this reduces to showing $\beta_\ell^2 + \beta_k^2 \ge 2\beta_k\beta_\ell$, which holds since $\beta_\ell^2 + \beta_k^2 - 2\beta_k\beta_\ell = (\beta_\ell-\beta_k)^2 \ge 0$.

In the statement of Claim 2, let the index $k$ correspond to $S_0$, let $N := \binom{|A|}{s}$, let $\alpha_k := c_{S_0}/N$, and let $\beta_k := \mathbb{E}_{i\in A\smallsetminus S_0}[c_i]$.
Then $q_0 = \sum_k\alpha_k$ and $q_1 = \sum_k\alpha_k\beta_k$ and $q_2^* = \sum_k\alpha_k\beta_k^2$, and $q_0 \ge q_1 > 0$ by Claim 1(ii) (i.e., $\alpha_k\beta_k > 0$ for some $k$), so by Claim 2 we indeed have $\frac{q_2^*}{q_1} \ge \frac{q_1}{q_0}$.

To conclude that $\frac{q_2}{q_1} \ge \frac{9}{10}\cdot\frac{q_1}{q_0}$, we just need to show $q_2 \ge \frac{9}{10}q_2^*$. The third step of the $S_2$ experiment is just the third step of the $S_2^*$ experiment conditioned on $j\ne i$, which happens with probability $1 - \frac{1}{|A|-s}$. With probability $\frac{1}{|A|-s}$, we get $j = i$ in the $S_2^*$ experiment. If we condition on the latter event, it yields another experiment, whose result we call $S_2^{\mathrm{err}}$, which is a multiset definitely containing two copies of $i$. Correspondingly we define $q_2^{\mathrm{err}} := \mathbb{E}[c_{S_2^{\mathrm{err}}}]$ (with two factors of $c_i$). Now we have

$$q_2^* = \mathbb{P}[j\ne i]\cdot\mathbb{E}\bigl[c_{S_2^*}\,\big|\, j\ne i\bigr] + \mathbb{P}[j=i]\cdot\mathbb{E}\bigl[c_{S_2^*}\,\big|\, j=i\bigr] = \Bigl(1-\tfrac{1}{|A|-s}\Bigr)\cdot q_2 + \tfrac{1}{|A|-s}\cdot q_2^{\mathrm{err}} \le q_2 + \tfrac{2}{n}\cdot q_2^{\mathrm{err}}$$

since $|A|-s = (n-|B|) - (n/2-1-|B|) = n/2+1 \ge n/2$. The $S_2^{\mathrm{err}}$ experiment can alternatively be viewed as:

- Sample $S_1\sim\binom{A}{s+1}$.
- Sample $i\in S_1$ u.a.r. and let $S_2^{\mathrm{err}} := S_1\cup\{i\}$.

This implies that $q_2^{\mathrm{err}} \le q_1$ because the extra factor of $c_i \le 1$ cannot increase the expectation. By Claim 1(ii) we get $q_2^{\mathrm{err}} \le q_1 \le \sqrt{n}\cdot q_2$. Combining, we have $q_2^* \le q_2 + \frac{2}{n}\cdot\sqrt{n}\cdot q_2 = \bigl(1+\frac{2}{\sqrt{n}}\bigr)q_2 \le \frac{10}{9}q_2$ and thus $q_2 \ge \frac{9}{10}q_2^*$ as desired. This concludes the proof of Theorem 3.

4 Open questions

Open Question 1. Is there a total function $g\colon\{0,1\}^m\to\{0,1\}$ such that $\mathsf{BPP}(\mathsf{Xor}\circ g^n) \ge \Omega(n\log n\cdot\mathsf{BPP}(g))$ or $\mathsf{BPP}(\mathsf{Maj}\circ g^n) \ge \Omega(n\log n\cdot\mathsf{BPP}(g))$?

Since Fact 5 captures the only properties of $g = \mathsf{GapOr}$ used in our proof of Theorem 3, this provides a possible roadmap for confirming Open Question 1: just find a total function $g$ satisfying properties similar to Fact 5, enabling our proof of Theorem 3 to go through.
However, such a $g$ would need to have certificate complexity $\omega(\mathsf{BPP}(g))$, and it remains a significant open problem to find any such total function $g$ (the "pointer function" [GPW18, ABB+17] and "cheat sheet" [ABK16] methods do not seem to work).

Another approach for confirming Open Question 1 would be to generalize the strong direct sum theorem from [BB19] to show that $\mathsf{BPP}(\mathsf{Xor}\circ g^n) \ge \Omega(n\cdot\mathsf{BPP}_{1/n}(g))$ or $\mathsf{BPP}(\mathsf{Maj}\circ g^n) \ge \Omega(n\cdot\mathsf{BPP}_{1/n}(g))$ holds for all $g$. This would answer Open Question 1 in the affirmative, since [BB19] designed a total function $g$ satisfying $\mathsf{BPP}_{1/n}(g) \ge \Omega(\mathsf{RP}(g)\cdot\log n)$ using the "pointer function" method. Compared to our approach from the previous paragraph, this approach involves less stringent requirements on $g$, which makes it easier to design $g$ but harder to prove the composition lower bound.

Open Question 2. Is there a total function $f\colon\{0,1\}^n\to\{0,1\}$ such that $\mathsf{BPP}^*(f) \ge \omega(\mathsf{BPP}^\dagger(f))$ (or similarly, $\mathsf{BPP}(f\circ\mathsf{GapMaj}^n) \ge \omega(\mathsf{BPP}(f\circ\mathsf{GapOr}^n))$)?

It is not difficult to find such a partial function $f$. Namely, take any function $f'\colon\{0,1\}^n\to\{0,1\}$ such that $\mathsf{BPP}^*(f') \ge \Omega(n\log n)$, such as $f' = \mathsf{Xor}$ or $f' = \mathsf{Maj}$. Then take $f = f'\circ\mathsf{Which}^n$, which has input length $2n$ (recall from Section 1.1 that given $y\in\{0,1\}^2$ with the promise that $y$ has Hamming weight $1$, $\mathsf{Which}(y)$ indicates the location of the unique $1$ in $y$). A simple reduction shows $\mathsf{BPP}^*(f) \ge \mathsf{BPP}^*(f')$. However, $\mathsf{BPP}^\dagger(f) \le O(n)$: for each block of $2$ bits, we can repeatedly query both until one of them returns $1$ (which takes $O(1)$ queries in expectation). After doing this for all $n$ blocks (which takes $O(n)$ queries in expectation), we know for sure what the entire actual input is.
By Markov's inequality, we can abort the execution after $O(n)$ queries while introducing only a small constant error probability. (Intuitively, composition with Which preserves hardness for $2$-sided noise but converts $1$-sided noise to "$0$-sided noise", and no partial function needs $\omega(n)$ queries in the setting of $0$-sided noise.)

In communication (rather than query) complexity, somewhat analogous questions have been studied in specific contexts [MWY13, BBG14, Sag18]. The proof of Theorem 1 also works for communication complexity. It would be interesting to develop analogues of Theorem 2 and Theorem 3 for communication complexity.

A Proof of Theorem 1: Or never necessitates amplification

For completeness, we provide a self-contained proof that $\mathsf{BPP}^*(\mathsf{Or}) \le O(n)$, using the following standard fact about random walks ("the drunkard at the cliff").

Lemma 1. Consider a random walk on the integers that begins at $0$ and in each step moves right ($+1$) with probability $p$ and moves left ($-1$) with probability $1-p$.

(i) If $p < 1/2$ then the expected time at which the walk first visits $-1$ is $1/(1-2p)$.

(ii) If $p > 1/2$ then the probability that the walk ever visits $-1$ is $(1-p)/p$.

Proof of Lemma 1. (i): If the random variable $X$ represents the time at which the walk first visits $-1$, then its expectation satisfies $\mathbb{E}[X] = 1 + p\cdot 2\mathbb{E}[X]$ since after the first step, the walk either is already at $-1$, or is at $+1$, in which case to reach $-1$ it must first get back to $0$ ($\mathbb{E}[X]$ expected time) and then from there get to $-1$ (another $\mathbb{E}[X]$ expected time). This equation has a unique solution $\mathbb{E}[X] = 1/(1-2p) < \infty$.
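Part (i) matches a quick Monte Carlo estimate. The following throwaway sketch (not part of the proof) simulates the walk with $p = 1/4$, for which the lemma predicts an expected first-visit time of $1/(1-2p) = 2$:

```python
import random

def first_visit_time(p, rng, cap=100_000):
    """Steps until a +1/-1 walk (moving right with probability p)
    first visits -1; `cap` guards against unbounded runs."""
    pos, steps = 0, 0
    while pos != -1 and steps < cap:
        pos += 1 if rng.random() < p else -1
        steps += 1
    return steps

rng = random.Random(0)
trials = 20_000
mean = sum(first_visit_time(0.25, rng) for _ in range(trials)) / trials
# Lemma 1(i): expected first-visit time is 1/(1 - 2*0.25) = 2
assert abs(mean - 2.0) < 0.1
```

The empirical mean concentrates near $2$ since the hitting time has finite variance for $p < 1/2$.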
(ii): If the event $E$ represents the walk ever visiting $-1$, then its probability satisfies $\mathbb{P}[E] = (1-p)\cdot 1 + p\cdot\mathbb{P}[E]^2$ since after the first step, the walk either is already at $-1$, or is at $+1$, in which case to reach $-1$ it must first get back to $0$ (probability $\mathbb{P}[E]$) and then from there get to $-1$ (again probability $\mathbb{P}[E]$). This equation has two solutions $\mathbb{P}[E]\in\{(1-p)/p,\ 1\}$. To rule out $\mathbb{P}[E] = 1$, we define $q_k$ as the probability that the walk visits $-1$ within the first $k$ steps, and we show by induction on $k$ that $q_k \le (1-p)/p$. The base case is trivial since $q_0 = 0$. Assuming $q_k \le (1-p)/p$ we show $q_{k+1} \le (1-p)/p$. After the first step, with probability $1-p$ the walk is already at $-1$, and with probability $p$ it is at $+1$. In the latter case, to get to $-1$ within a total of $k+1$ steps (including the first step), it must get from $+1$ to $0$ and then from there to $-1$, all within $k$ more steps; in particular, the walk must get from $+1$ to $0$ within $k$ steps (probability $\le q_k$) and then from $0$ to $-1$ within $k$ steps (probability $\le q_k$). Overall we can bound $q_{k+1} \le (1-p)\cdot 1 + p\cdot q_k^2 \le (1-p) + p\cdot(1-p)^2/p^2 = (1-p)/p$.

Proof of Theorem 1. We may assume the noise probabilities are $\le 1/4$ (rather than just $\le 1/3$), because whenever an input bit is queried, we can instead query it five times and pretend that the majority vote was the result of the single query. This would only affect the cost by a constant factor. With this assumption, here is our decision tree, on input $y\in\{0,1\}^n$:

    For $i = 1, 2, \ldots, n$:
        Repeat:
            Query $y_i$.
            If the queries to $y_i$ have resulted in more $0$s than $1$s so far, then break out of the inner loop.
            If a total of $6n$ queries have been made (across all input bits), then halt and output $1$.
    Halt and output $0$.

This decision tree's cost is $\le 6n$.
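The decision tree above is simple enough to run. The following is a minimal simulation sketch (our own illustration, not from the paper), assuming a noise model where each query independently returns the flipped bit with probability at most $1/4$:

```python
import random

def noisy_or(y, noise=0.25, rng=None):
    """Run the <= 6n-query decision tree for Or described above.

    Each query to y[i] independently returns the flipped bit with
    probability `noise` (assumed <= 1/4, per the proof's reduction).
    """
    rng = rng or random.Random(0)
    n = len(y)
    queries = 0
    for i in range(n):
        ones = zeros = 0
        while True:
            if queries == 6 * n:
                return 1          # query budget exhausted: guess Or(y) = 1
            bit = y[i] if rng.random() >= noise else 1 - y[i]
            queries += 1
            ones += bit
            zeros += 1 - bit
            if zeros > ones:      # the walk "number of 1s minus 0s" hit -1
                break             # conclude y[i] looks like 0; move on
    return 0                      # every coordinate looked like 0

# Noiseless sanity check: with noise = 0 the tree computes Or exactly
assert noisy_or([0, 0, 0, 0], noise=0.0) == 0
assert noisy_or([0, 0, 1, 0], noise=0.0) == 1
```

With noise up to $1/4$, the correctness analysis that follows shows each answer is right with probability at least $2/3$.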
To see the correctness, consider any input $y\in\{0,1\}^n$ and any tuple of noise probabilities $(\nu_1,\ldots,\nu_n)$ where each $\nu_i \le 1/4$. For each $i$, the random variable "number of $1$s minus number of $0$s, among the queries to $y_i$ so far" is a random walk with move-right probability $p_i = \nu_i \le 1/4$ if $y_i = 0$ and $p_i = 1-\nu_i \ge 3/4$ if $y_i = 1$, and which stops when it visits $-1$.

First assume $\mathsf{Or}(y) = 0$. Then for each $i$, $y_i = 0$ and so by Lemma 1(i), the expected number of queries until the inner loop is broken is $1/(1-2p_i) \le 2$. By linearity, the expected total number of queries until all $n$ inner loops have been broken is $\le 2n$, so by Markov's inequality this number of queries is $< 6n$ with probability $\ge 2/3$. Thus the decision tree outputs $0$ with probability $\ge 2/3$.

Now assume $\mathsf{Or}(y) = 1$. Then for some $i$, $y_i = 1$ and so by Lemma 1(ii), with probability $1 - (1-p_i)/p_i = 2 - 1/p_i \ge 2/3$ there would never be more $0$s than $1$s from the queries to $y_i$. In that case, the decision tree would never break out of the $i$th inner loop, even if it were allowed to run forever. Thus the decision tree outputs $1$ with probability $\ge 2/3$.

Acknowledgments

We thank Badih Ghazi for interesting discussions about this work, and we thank anonymous reviewers for their comments. T. Watson was supported by NSF grant CCF-1657377.

References

[ABB+17] Andris Ambainis, Kaspars Balodis, Aleksandrs Belovs, Troy Lee, Miklos Santha, and Juris Smotrovs. Separations in query complexity based on pointer functions. Journal of the ACM, 64(5):32:1–32:24, 2017. doi:10.1145/3106234.

[ABK16] Scott Aaronson, Shalev Ben-David, and Robin Kothari. Separations in query complexity using cheat sheets. In Proceedings of the 48th Symposium on Theory of Computing (STOC), pages 863–876. ACM, 2016. doi:10.1145/2897518.2897644.
[AGJ+17] Anurag Anshu, Dmitry Gavinsky, Rahul Jain, Srijita Kundu, Troy Lee, Priyanka Mukhopadhyay, Miklos Santha, and Swagato Sanyal. A composition theorem for randomized query complexity. In Proceedings of the 37th Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS), pages 10:1–10:13. Schloss Dagstuhl, 2017. doi:10.4230/LIPIcs.FSTTCS.2017.10.

[BB19] Eric Blais and Joshua Brody. Optimal separation and strong direct sum for randomized query complexity. In Proceedings of the 34th Computational Complexity Conference (CCC), pages 29:1–29:17. Schloss Dagstuhl, 2019. doi:10.4230/LIPIcs.CCC.2019.29.

[BB20] Shalev Ben-David and Eric Blais. A tight composition theorem for the randomized query complexity of partial functions. Technical Report 2002.10809, arXiv, 2020. URL: https://arxiv.org/abs/2002.10809.

[BBG14] Eric Blais, Joshua Brody, and Badih Ghazi. The information complexity of Hamming distance. In Proceedings of the 18th International Workshop on Randomization and Computation (RANDOM), pages 465–489. Schloss Dagstuhl, 2014. doi:10.4230/LIPIcs.APPROX-RANDOM.2014.465.

[BDG+20] Andrew Bassilakis, Andrew Drucker, Mika Göös, Lunjia Hu, Weiyun Ma, and Li-Yang Tan. The power of many samples in query complexity. In Proceedings of the 47th International Colloquium on Automata, Languages, and Programming (ICALP). Schloss Dagstuhl, 2020. To appear.

[BdW02] Harry Buhrman and Ronald de Wolf. Complexity measures and decision tree complexity: A survey. Theoretical Computer Science, 288(1):21–43, 2002. doi:10.1016/S0304-3975(01)00144-X.

[BK18] Shalev Ben-David and Robin Kothari. Randomized query complexity of sabotaged and composed functions. Theory of Computing, 14(1):1–27, 2018. doi:10.4086/toc.2018.v014a005.
[Cad18] Chris Cade. Post-selected classical query complexity. Technical Report 1804.10010, arXiv, 2018. URL: http://arxiv.org/abs/1804.10010.

[DR08] Chinmoy Dutta and Jaikumar Radhakrishnan. Lower bounds for noisy wireless networks using sampling algorithms. In Proceedings of the 49th Symposium on Foundations of Computer Science (FOCS), pages 394–402. IEEE, 2008. doi:10.1109/FOCS.2008.72.

[EP98] William Evans and Nicholas Pippenger. Average-case lower bounds for noisy Boolean decision trees. SIAM Journal on Computing, 28(2):433–446, 1998. doi:10.1137/S0097539796310102.

[FRPU94] Uriel Feige, Prabhakar Raghavan, David Peleg, and Eli Upfal. Computing with noisy information. SIAM Journal on Computing, 23(5):1001–1018, 1994. doi:10.1137/S0097539791195877.

[GJ16] Mika Göös and T. S. Jayram. A composition theorem for conical juntas. In Proceedings of the 31st Computational Complexity Conference (CCC), pages 5:1–5:16. Schloss Dagstuhl, 2016. doi:10.4230/LIPIcs.CCC.2016.5.

[GJPW18] Mika Göös, T. S. Jayram, Toniann Pitassi, and Thomas Watson. Randomized communication vs. partition number. ACM Transactions on Computation Theory, 10(1):4:1–4:20, 2018. doi:10.1145/3170711.

[GL14] Dmitry Gavinsky and Shachar Lovett. En route to the log-rank conjecture: New reductions and equivalent formulations. In Proceedings of the 41st International Colloquium on Automata, Languages, and Programming (ICALP), pages 514–524. Springer, 2014. doi:10.1007/978-3-662-43948-7_43.

[GLM+16] Mika Göös, Shachar Lovett, Raghu Meka, Thomas Watson, and David Zuckerman. Rectangles are nonnegative juntas. SIAM Journal on Computing, 45(5):1835–1869, 2016. doi:10.1137/15M103145X.

[GLSS19] Dmitry Gavinsky, Troy Lee, Miklos Santha, and Swagato Sanyal.
A composition theorem for randomized query complexity via max-conflict complexity. In Proceedings of the 46th International Colloquium on Automata, Languages, and Programming (ICALP), pages 64:1–64:13. Schloss Dagstuhl, 2019. doi:10.4230/LIPIcs.ICALP.2019.64.

[GPW18] Mika Göös, Toniann Pitassi, and Thomas Watson. Deterministic communication vs. partition number. SIAM Journal on Computing, 47(6):2435–2450, 2018. doi:10.1137/16M1059369.

[GS10] Navin Goyal and Michael Saks. Rounds vs. queries tradeoff in noisy computation. Theory of Computing, 6(1):113–134, 2010. doi:10.4086/toc.2010.v006a006.

[JKS10] Rahul Jain, Hartmut Klauck, and Miklos Santha. Optimal direct sum results for deterministic and randomized decision tree complexity. Information Processing Letters, 110(20):893–897, 2010. doi:10.1016/j.ipl.2010.07.020.

[KK94] Claire Kenyon and Valerie King. On Boolean decision trees with faulty nodes. Random Structures and Algorithms, 5(3):453–464, 1994. doi:10.1002/rsa.3240050306.

[KLdW15] Jedrzej Kaniewski, Troy Lee, and Ronald de Wolf. Query complexity in expectation. In Proceedings of the 42nd International Colloquium on Automata, Languages, and Programming (ICALP), pages 761–772. Springer, 2015. doi:10.1007/978-3-662-47672-7_62.

[MWY13] Marco Molinaro, David Woodruff, and Grigory Yaroslavtsev. Beating the direct sum theorem in communication complexity with implications for sketching. In Proceedings of the 24th Symposium on Discrete Algorithms (SODA), pages 1738–1756. ACM-SIAM, 2013. doi:10.1137/1.9781611973105.125.

[New09] Ilan Newman. Computing in fault tolerant broadcast networks and noisy decision trees. Random Structures and Algorithms, 34(4):478–501, 2009. doi:10.1002/rsa.20240.

[Sag18] Mert Saglam.
Near log-convexity of measured heat in (discrete) time and consequences. In Proceedings of the 59th Symposium on Foundations of Computer Science (FOCS), pages 967–978. IEEE, 2018. doi:10.1109/FOCS.2018.00095.

[She13] Alexander Sherstov. Making polynomials robust to noise. Theory of Computing, 9:593–615, 2013. doi:10.4086/toc.2013.v009a018.
