An optimal quantum algorithm to approximate the mean and its application for approximating the median of a set of points over an arbitrary distance


Gilles Brassard*, Frédéric Dupuis†, Sébastien Gambs‡, and Alain Tapp*

*Département d'informatique et de recherche opérationnelle, Université de Montréal, C.P. 6128, Succursale Centre-Ville, Montréal (QC), H3C 3J7 Canada
†Institut für Theoretische Physik, ETH Zürich, Wolfgang-Pauli-Straße 27, 8093 Zürich, Switzerland
‡IRISA, Campus de Beaulieu, Université de Rennes 1, Avenue du Général Leclerc, 35042 Rennes Cedex, France

25 May 2011

Abstract

We describe two quantum algorithms to approximate the mean value of a black-box function. The first algorithm is novel and asymptotically optimal while the second is a variation on an earlier algorithm due to Aharonov. Both algorithms have their own strengths and caveats and may be relevant in different contexts. We then propose a new algorithm for approximating the median of a set of points over an arbitrary distance function.

Keywords: Quantum computing, Mean, Median, Amplitude estimation.

1 Introduction

Let $F : \{0, \ldots, N-1\} \to [0,1]$ be a function and $m = \frac{1}{N}\sum_{i=0}^{N-1} F(i)$ be its mean. When $F$ is given as a black box (i.e. an oracle), the complexity of computing the mean can be measured by counting the number of queries made to this black box. The first quantum algorithm to approximate the mean was given by Grover; it outputs an estimate $\tilde{m}$ such that $|m - \tilde{m}| \le \varepsilon$ after $O(\frac{1}{\varepsilon}\log\log\frac{1}{\varepsilon})$ queries to the black box [7]. Later, Nayak and Wu [10] proved that to achieve such precision, $\Omega(1/\varepsilon)$ calls to $F$ are necessary, which still left a gap between the lower and upper bounds for this problem. In this paper, we close this gap by presenting an asymptotically optimal algorithm to approximate the mean.
We also describe a second algorithm that is a variation of Aharonov's algorithm [1], which may be more suitable than the first one in some contexts. Afterwards, these two algorithms to approximate the mean are used in combination with the quantum algorithm of Dürr and Høyer for finding the minimum [6] to obtain a quantum algorithm for approximating the median among a set of points with an arbitrary black-box distance function between these points. The median, which is defined as the point with minimum average (or total) distance to the other points, can be thought of as the point that is the most representative of all the other points. Note that this is very different from the simpler problem of finding the median of a set of values, which has already been solved by Nayak and Wu [10]. Our median-finding algorithm combines the amplitude estimation technique of Brassard, Høyer, Mosca and Tapp [3] with the minimum-finding algorithm of Dürr and Høyer [6].

The outline of the paper is as follows. In Section 2, we present all the tools that we need, including Grover's algorithm, the quantum algorithm for computing the minimum of a function, and the amplitude estimation technique. In Section 3, we describe two efficient algorithms to approximate the mean value of a function, which we use in Section 4 to develop our novel quantum algorithm for approximating the median of an ensemble of points whose pairwise distances are given by a black box. Finally, we conclude in Section 5 with open questions for future work.

2 Preliminaries

In this section, we briefly review the quantum information processing notions that are relevant for understanding our algorithms. A detailed account of the field can be found in the book of Nielsen and Chuang [11].
As is often the case in the analysis of quantum algorithms, we shall assume that the input to the algorithms is given in the form of a black box (or "oracle") that can be accessed in quantum superposition. In practice, the quantum black box will be implemented as a quantum circuit that can have classical inputs and outputs. We shall count as our main resource the number of calls (also called "evaluations") that are required to that black box.

Theorem 2.1 (Search [8, 2]). There exists a quantum algorithm that takes an arbitrary function $F : \{0, \ldots, N-1\} \to \{0, 1\}$ as input and finds some $x$ such that $F(x) = 1$ if one exists, or outputs "void" otherwise. Any such $x$ is called a "solution". The algorithm requires $O(\sqrt{N})$ evaluations of $F$ if there are no solutions. If there are $s > 0$ solutions, the algorithm finds one with probability at least $2/3$ after $O(\sqrt{N/s})$ expected evaluations of $F$. This is true even if the value of $s$ is not known ahead of time.

Following Grover's seminal work, Dürr and Høyer [6] proposed a quantum algorithm that can find the minimum of a function with a quadratic speed-up compared to the best possible classical algorithm.

Theorem 2.2 (Minimum Finding [6, 5]). There exists a quantum algorithm minimum that takes an arbitrary function $F : \{0, \ldots, N-1\} \to Y$ as input (for an arbitrary totally ordered range $Y$) and returns a pair $(i, F(i))$ such that $F(i)$ is the minimum value taken by $F$. The algorithm finds a correct answer with probability at least $3/4$ after $O(\sqrt{N})$ evaluations of $F$.

Another extension of Grover's algorithm makes it possible to approximately count the number of solutions to a search problem [4]. It was subsequently formulated as follows.

Theorem 2.3 (Counting [3]). There exists a quantum algorithm count that takes an arbitrary function $F : \{0, \ldots$
, N-1\} \to \{0, 1\}$ as input, as well as some positive integer $t$. If there are $s$ values of $x$ such that $F(x) = 1$, algorithm count$(F, t)$ outputs an integer estimate $\tilde{s}$ of $s$ such that
$$|s - \tilde{s}| < \frac{2\pi\sqrt{s(N-s)}}{t} + \frac{\pi^2 N}{t^2}$$
with probability at least $8/\pi^2$ after exactly $t$ evaluations of $F$. In the special case $s = 0$, count$(F, t)$ always outputs the perfect estimate $\tilde{s} = 0$.

The following theorem on amplitude estimation is also adapted from [3]. Its statement is rather more technical than that of the previous theorems.

Theorem 2.4 (Amplitude estimation [3]). There exists a quantum algorithm amplitude_estimation that takes as inputs two unitary transformations $A$ and $B$, as well as some positive integer $t$. If $A|0\rangle = \alpha|\psi_0\rangle + \beta|\psi_1\rangle$ (where $|\psi_0\rangle$ and $|\psi_1\rangle$ are orthogonal states and $|0\rangle$ is of arbitrary dimension) and
$$B|\psi_0\rangle|0\rangle = |\psi_0\rangle|0\rangle \quad\text{and}\quad B|\psi_1\rangle|0\rangle = |\psi_1\rangle|1\rangle,$$
then amplitude_estimation$(A, B, t)$ outputs $\tilde{a}$, an estimate of $a = \|\beta\|^2$, such that
$$|\tilde{a} - a| \le \frac{2\pi\sqrt{a(1-a)}}{t} + \frac{\pi^2}{t^2}$$
with probability at least $8/\pi^2$, at a cost of doing $t$ evaluations each of $A$, $A^{-1}$ and $B$.

We shall also need the following technical result, which we derive using standard Chernoff bound arguments.

Theorem 2.5 (Majority). Let $B$ be a quantum black box that approximates some function $F : \{0, \ldots, N-1\} \to \{0, \ldots, M-1\}$ such that its output is within $\Delta$ of the true value with probability at least $2/3$, i.e.
$$B|i\rangle|0\rangle = \sum_j \alpha_{ij}\,|i\rangle|x_{ij}\rangle \quad\text{and}\quad \sum_{\{j \,:\, |x_{ij} - F(i)| \le \Delta\}} |\alpha_{ij}|^2 \ge 2/3$$
for all $i$. Then, for all $n$ there exists a quantum black box $B_n$ that computes $F$ with its output within $2\Delta$ of the true value with probability at least $1 - 1/n$, i.e.
$$B_n|i\rangle|0\rangle = \sum_j \beta_{ij}\,|i\rangle|y_{ij}\rangle \quad\text{and}\quad \sum_{\{j \,:\, |y_{ij} - F(i)| \le 2\Delta\}} |\beta_{ij}|^2 \ge 1 - 1/n$$
for all $i$. Algorithm $B_n$ requires $O(\log n)$ calls to $B$.

Proof.
Given an input index $i$, $B_n$ calls the black box $B$ a total of $k$ times with input $i$, where $k = \lceil (\lg n)/D(\frac{3}{5} \,\|\, \frac{2}{3}) \rceil$ and $D(\cdot\|\cdot)$ denotes the standard Kullback–Leibler divergence [9] (sometimes called the relative entropy). If there exists an interval of size $2\Delta$ that contains at least $3/5$ of the outputs, then $B_n$ outputs the midpoint of that interval. If there is no such interval (a very unlikely occurrence), then $B_n$ outputs $0$. If at least $3/5$ of the outputs are within $\Delta$ of $F(i)$, then the output of $B_n$ cannot be further than $2\Delta$ from $F(i)$, since the interval selected by $B_n$ must contain at least one of those points. By the Chernoff bound, this happens with probability at least $1 - 2^{-k D(\frac{3}{5} \| \frac{2}{3})} \ge 1 - 1/n$. □

Hereinafter, we shall denote by majority$(B, n)$ the black box $B_n$ that results from using this algorithm on black box $B$ with parameter $n$. Note that $D(\frac{3}{5} \,\|\, \frac{2}{3}) > 1/100$, hence majority$(B, n)$ requires fewer than $100 \lg n$ calls to $B$. Note also that the number of calls to $B$ does not depend on $\Delta$.

3 Two Efficient Algorithms to Approximate the Mean

We present two different algorithms to approximate the mean value of a function. In both algorithms, let $F : \{0, \ldots, N-1\} \to [0,1]$ be a black-box function and let $m = \frac{1}{N}\sum_x F(x)$ be the mean value of $F$, which we seek to approximate. Without loss of generality, we assume throughout that $N$ is a power of 2. The first algorithm assumes that $F(x)$ can be obtained with arbitrary precision at unit cost, while the second considers that the output of function $F$ is given with $\ell$ bits of precision.
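The interval-selection rule in the proof of Theorem 2.5 above can be sketched classically as follows. This is our own illustration, not from the paper: given $k$ noisy outputs, find an interval of width $2\Delta$ containing at least $3/5$ of them and return its midpoint, or $0$ if no such interval exists.

```python
def majority_vote(outputs, delta):
    """Majority step of Theorem 2.5: return the midpoint of some interval of
    width 2*delta containing at least 3/5 of the noisy outputs, or 0 if no
    such interval exists (a very unlikely event in the intended regime)."""
    k = len(outputs)
    xs = sorted(outputs)
    for lo in xs:  # it suffices to try intervals whose left end is an output
        inside = [x for x in xs if lo <= x <= lo + 2 * delta]
        if 5 * len(inside) >= 3 * k:
            return lo + delta  # midpoint of [lo, lo + 2*delta]
    return 0

# If at least 3/5 of the outputs are within delta of the true value, the
# returned midpoint is guaranteed to be within 2*delta of it.
est = majority_vote([10.0, 10.1, 9.9, 10.05, 50.0], delta=0.5)
```

Sliding any qualifying interval right until its left endpoint hits the smallest output it contains cannot drop any contained output, which is why only intervals anchored at output points need to be tried.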
Algorithm 1 mean1(F, N, t)
  Let $A'|x\rangle|0\rangle = |x\rangle\left(\sqrt{1 - F(x)}\,|0\rangle + \sqrt{F(x)}\,|1\rangle\right)$ and $A = A'(H^{\otimes \lg N} \otimes \mathrm{Id})$, where $H$ is the Walsh–Hadamard transform and Id denotes the identity transformation on one qubit
  Let $B|x\rangle|0\rangle|0\rangle = |x\rangle|0\rangle|0\rangle$ and $B|x\rangle|1\rangle|0\rangle = |x\rangle|1\rangle|1\rangle$
  return amplitude_estimation(A, B, t)

Note that in Algorithm 1, it is easy to implement $A'$ (and therefore $A$ as well as $A^{-1}$) with only two evaluations of $F$. First, $F$ is computed in an ancillary register initialized to $|0\rangle$, then the appropriate controlled rotations are performed, and finally $F$ is computed again to reset the ancillary register back to $|0\rangle$. (In practice, this transformation will be approximated to a prescribed precision.) The following theorem formalizes the result obtained by this algorithm.

Theorem 3.1. Given a black-box function $F : \{0, \ldots, N-1\} \to [0,1]$ and its mean value $m = \frac{1}{N}\sum_x F(x)$, algorithm mean1 outputs $\tilde{m}$ such that $|\tilde{m} - m| \in O(\frac{1}{t})$ with probability at least $8/\pi^2$. The algorithm requires $4t$ evaluations of $F$.

Proof. Using the same definitions as in Theorem 2.4, we have that
$$|\psi_1\rangle = \sum_x \sqrt{\frac{F(x)}{\sum_y F(y)}}\;|x\rangle|1\rangle \quad\text{and}\quad \beta = \sqrt{\frac{\sum_x F(x)}{N}}.$$
The algorithm amplitude_estimation$(A, B, t)$ returns an estimate $\tilde{m} = \tilde{a}$ of $a = \|\beta\|^2 = \frac{1}{N}\sum_x F(x) = m$, and thus $\tilde{m}$ is directly an estimate of $m$. The error $|\tilde{m} - m|$ is at most
$$\frac{2\pi\sqrt{m(1-m)}}{t} + \frac{\pi^2}{t^2} \in O\!\left(\frac{1}{t}\right) \tag{1}$$
with probability at least $8/\pi^2$. This requires $4t$ evaluations of $F$ because each of the $t$ calls on $A$ and on $A^{-1}$ requires 2 evaluations of $F$. □

This theorem states that the error goes down asymptotically linearly with the number of evaluations of $F$.
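As a sanity check of the state preparation underlying Algorithm 1, the following sketch (our illustration in plain NumPy, not from the paper) builds the amplitudes of $A|0\rangle$ and confirms that the squared norm of the $|1\rangle$ component equals the mean, together with the error bound of Equation (1).

```python
import numpy as np

def mean_as_amplitude(F_vals):
    """Squared norm of the |1> component of A|0>: the state places amplitude
    sqrt(F(x)/N) on |x>|1>, so the total squared amplitude is the mean m."""
    N = len(F_vals)
    beta = np.sqrt(np.asarray(F_vals, dtype=float) / N)  # amplitudes on |x>|1>
    return float(np.sum(beta ** 2))

def error_bound(m, t):
    """Additive error bound of Theorem 3.1 / Equation (1): 2*pi*sqrt(m(1-m))/t + pi^2/t^2."""
    return 2 * np.pi * np.sqrt(m * (1 - m)) / t + np.pi ** 2 / t ** 2

F_vals = [0.2, 0.8, 0.5, 0.1]
m = mean_as_amplitude(F_vals)   # equals the mean of F_vals, here 0.4
bound = error_bound(m, t=100)   # shrinks as O(1/t)
```

Amplitude estimation then recovers this squared amplitude to within the stated bound; the sketch only verifies the encoding, not the quantum estimation itself.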
This is optimal according to Nayak and Wu [10], who have proven that in the general case, in which we have no a priori knowledge of the possible distribution of outputs, an additive error of $\varepsilon$ requires an amount of work proportional to $1/\varepsilon$ in the worst case when the function is given as a black box. Note that for $t < (1 + \sqrt{2})\pi/\sqrt{m}$, when our bound on the error exceeds the targeted mean (which is rather bad), the error goes down quadratically (which is good).

We now present a variation on an algorithm of Aharonov [1] and analyse its characteristics. This algorithm is also based on amplitude estimation, but it relies on the fact that points in the real interval $[0,1]$ can be represented in binary as $\ell$-bit strings, where $\ell$ is the precision with which we wish to consider the output of the black-box function $F$. The algorithm estimates the number of 1s in each binary position. The difference between our algorithm (mean2) and Aharonov's original algorithm is that we make sure that the estimates of the counts in every bit position are all simultaneously within the desired error bound. For each $i$ between 1 and $\ell$, let $F_i(x)$ represent the $i$th bit of the binary expansion of $F(x)$, so that $F(x) = \sum_i F_i(x)\,2^{-i}$, with the obvious convention that $F_i(x) = 1$ for all $i$ when $F(x) = 1$.

Algorithm 2 mean2(F, N)
  for $i = 1$ to $\ell$ do
    $\tilde{m}_i$ = majority(count($F_i$, $5\pi\sqrt{N}$), $n = \lceil \frac{3}{2}\ell \rceil$)
  end for
  return $\tilde{m} = \frac{1}{N}\sum_{i=1}^{\ell} \tilde{m}_i\,2^{-i}$

Theorem 3.2. Given a black-box function $F : \{0, \ldots, N-1\} \to [0,1]$ where the output of $F$ has $\ell$ bits of precision, algorithm mean2 outputs an estimate $\tilde{m}$ such that $|\tilde{m} - m| \le \frac{1}{N}\sum_i \sqrt{m_i}\,2^{-i}$, where $m_i = \sum_x F_i(x)$, with probability at least $2/3$. The algorithm requires $O(\sqrt{N}\,\ell \log \ell)$ evaluations of $F$.

Proof. The proof is a straightforward corollary of Theorems 2.3 and 2.5.
Using count on each column with $t = 5\pi\sqrt{N}$ yields an error of
$$|m_i - \hat{m}_i| \le \frac{2}{5}\sqrt{m_i} + \frac{1}{25}$$
with probability at least $8/\pi^2$, and hence with probability at least $2/3$, where $\hat{m}_i$ denotes count($F_i$, $5\pi\sqrt{N}$). Using majority with $n = \lceil \frac{3}{2}\ell \rceil$ on this, we obtain
$$|m_i - \tilde{m}_i| \le \frac{4}{5}\sqrt{m_i} + \frac{2}{25}$$
with probability at least $1 - \frac{2}{3\ell}$. When $m_i \ge 1$, this is bounded by $\sqrt{m_i}$. Furthermore, count makes no error when $m_i = 0$. Hence, the error in each column is bounded by $\sqrt{m_i}$ with probability at least $1 - \frac{2}{3\ell}$. By the union bound, all of our estimates for the columns are simultaneously within the above error bounds with probability at least $2/3$, and the error bound on our final estimate is $|\tilde{m} - m| \le \frac{1}{N}\sum_i \sqrt{m_i}\,2^{-i}$. It is straightforward to count the number of evaluations of $F$ from Theorems 2.3 and 2.5. □

The choice of which among algorithms mean1 and mean2 is more appropriate depends on the particular characteristics of the input function. For example, consider the situation in which $F(x) = 2^{-N}$ for all $x$, hence the mean is $m = 2^{-N}$ as well. In this case, if we choose $t = cN^{3/2} \lg N$ in mean1 and $\ell = N$ in mean2, where the constant $c$ is chosen so that both algorithms call function $F$ the same number of times, the first algorithm is in the regime $t \ll (1 + \sqrt{2})\pi/\sqrt{m}$, where it performs badly because the error on the estimated mean is expected to be much larger than the mean itself. On the other hand, the expected error produced by the second algorithm is bounded by $m/\sqrt{N}$, which is much smaller than the targeted mean. At the other end of the spectrum, if $F(x) = 1/2$ for all $x$, hence the mean is $m = 1/2$ as well, and if $t \gg 2\pi\sqrt{N}$, then the error produced by mean1 is much smaller than $m/\sqrt{N}$ according to Equation (1).
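The column-wise structure that mean2 exploits can be checked classically: split each $\ell$-bit value into its bits, count the 1s in each position, and recombine. The sketch below is ours; the quantum algorithm replaces the exact counts $m_i$ by count estimates.

```python
def mean_from_bit_counts(F_vals, ell):
    """Recombination step of mean2, done classically with exact counts:
    m = (1/N) * sum_{i=1..ell} m_i * 2^-i, where m_i counts the 1s in bit
    position i of the values F(x), each assumed an exact ell-bit fraction
    in [0, 1) (the all-ones convention for F(x) = 1 is not handled here)."""
    N = len(F_vals)

    def bit(v, i):  # i-th bit of the binary expansion of v, i = 1..ell
        return (int(round(v * (1 << ell))) >> (ell - i)) & 1

    m = [sum(bit(v, i) for v in F_vals) for i in range(1, ell + 1)]  # m_1..m_ell
    return sum(m_i * 2.0 ** -i for i, m_i in enumerate(m, start=1)) / N
```

With exact $\ell$-bit inputs the recombination reproduces the mean exactly; the theorem's error term $\frac{1}{N}\sum_i \sqrt{m_i}\,2^{-i}$ arises only from the noisy per-column counts.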
With the same parameters, the error produced by mean2, which is again bounded by $m/\sqrt{N}$, is strictly unaffected by the choice of $\ell$, so that the second algorithm can work arbitrarily harder than the first, yet produce a less precise estimate of the mean.

4 Approximate Median Algorithm

Let dist$: \{0, \ldots, N-1\} \times \{0, \ldots, N-1\} \to [0,1]$ be an arbitrary black-box distance function.

Definition 4.1 (Median). The median is the point within an ensemble of points whose average distance to the other points is minimum. Formally, the median of a set of points $Q = \{0, \ldots, N-1\}$ is
$$\mathrm{median}(Q) = \arg\min_{z \in Q} \sum_{j=0}^{N-1} \mathrm{dist}(z, j).$$

The median can be found classically by going through each point $z \in Q$, computing the average distance from $z$ to all the other points in $Q$, and then taking the minimum (ties are broken arbitrarily). This process requires a time in $O(N^2)$. In the general case, in which there are no restrictions on the distance function used and no structure among the ensemble of points that can be exploited, no technique can be more efficient than this naïve algorithm. Indeed, consider the case in which all the points are at the same distance from each other, except for two points that are closer than the rest. These two points are the medians of this ensemble. In this case, classically we would need to query the oracle for the distances between each and every pair of points before we could identify one of the two medians. (We expect to discover this special pair after querying about half the pairs on average, but we cannot know that there isn't some other even closer pair until all the pairs have been queried.) This results in a lower bound of $\Omega(N^2)$ calls to the oracle.
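The naïve classical procedure just described is, concretely (our illustration, not from the paper):

```python
def classical_median(dist, N):
    """Naive O(N^2) classical median: for each candidate z, compute its
    average distance to all points and return the index minimizing it
    (ties broken by keeping the first such index)."""
    best_z, best_avg = 0, float("inf")
    for z in range(N):
        avg = sum(dist(z, j) for j in range(N)) / N
        if avg < best_avg:
            best_z, best_avg = z, avg
    return best_z

# Example: four points on a line, with distances scaled into [0, 1].
pts = [0, 1, 2, 9]
dist = lambda i, j: abs(pts[i] - pts[j]) / 10.0
med = classical_median(dist, 4)  # one of the two middle points
```

This is exactly the computation that the quantum algorithm of the next paragraph accelerates: the inner sum by a mean-approximation subroutine and the outer minimization by minimum finding.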
In Algorithm 3, mean stands for either one of the two algorithms given in the previous section (when mean1 is used, its parameter $t$ must be supplied), but it is repeated $O(\log N)$ times in order to get all the means within the desired error bound with constant probability via our majority algorithm (Theorem 2.5). Here, $d_i = \frac{1}{N}\sum_j \mathrm{dist}(i, j)$ and $d_{\min} = d_k$ for any $k$ such that $d_k \le d_i$ for all $i$.

Algorithm 3 median(dist)
  For each $i$, define function $F_i(x) = \mathrm{dist}(i, x)$
  For each $i$, define $\tilde{d}_i$ = majority(mean($F_i$, $N$), $n = N^2$)
  return minimum($\tilde{d}_i$)

Theorem 4.2. For any black-box distance function dist$: \{0, \ldots, N-1\} \times \{0, \ldots, N-1\} \to [0,1]$, when mean1 is used with parameter $t$, algorithm median outputs an index $j$ such that $|d_j - d_{\min}| \in O(1/t)$ with probability at least $2/3$. The algorithm requires $O(t\sqrt{N}\log N)$ evaluations of dist.

Proof. This result is obtained by a straightforward combination of Theorems 2.2, 2.5 and 3.1. The procedure majority is used with parameter $n = N^2$ to ensure that all the $d_i$'s computed by the algorithm (in superposition) are simultaneously within the bound given by Theorem 3.1, except with probability $o(1)$. Note that with parameter $n = N^2$, the number of repetitions is still in $O(\log N)$. The success probability of the algorithm follows from the fact that $\frac{3}{4}(1 - o(1)) \ge 2/3$. In this case, the error is in $O(1/t)$ and the number of evaluations of dist is in $O(t\sqrt{N}\log N)$. □

By replacing mean1 with mean2 in the median algorithm, we obtain the following theorem.

Theorem 4.3. For any black-box distance function dist$: \{0, \ldots, N-1\} \times \{0, \ldots, N-1\} \to [0,1]$, when mean2 is used, algorithm median outputs an index $j$ such that
$$|d_j - d_{\min}| \le \frac{1}{N}\sum_{i=1}^{\ell} \sqrt{m_i}\,2^{-i}$$
with probability at least $2/3$. (See algorithm mean2 for the definitions of $m_i$ and $\ell$.)
The algorithm requires $O(N \log N)$ evaluations of dist.

5 Conclusion

We have described two quantum algorithms to approximate the mean and their applications to approximating the median of a set of points over an arbitrary distance function given by a black box. We leave open for future work an in-depth study of how the different behaviours of the two algorithms impact the quality of the median they return. For instance, we know that the behaviour of both algorithms for the mean depends on the distribution of data points and the distances between points, but we still have to investigate more precisely the exact contexts where it matters. Of course, understanding the behaviour of the algorithms in different contexts is important, but a more interesting question is to tailor the algorithm to obtain better results on different data distributions of interest.

References

[1] D. Aharonov, "Quantum computation – A review", Annual Review of Computational Physics, Dietrich Stauffer (editor), World Scientific, Vol. 6, 1998.

[2] M. Boyer, G. Brassard, P. Høyer and A. Tapp, "Tight bounds on quantum searching", Fortschritte der Physik, Vol. 46, nos. 4–5, pp. 493–505, 1998.

[3] G. Brassard, P. Høyer, M. Mosca and A. Tapp, "Quantum amplitude amplification and estimation", Contemporary Mathematics, Vol. 305, pp. 53–74, 2002.

[4] G. Brassard, P. Høyer and A. Tapp, "Quantum counting", In Proceedings of the International Conference on Automata, Languages and Programming: ICALP'98, pp. 820–831, 1998.

[5] C. Dürr, M. Heiligman, P. Høyer and M. Mhalla, "Quantum query complexity of some graph problems", In Proceedings of the International Conference on Automata, Languages and Programming: ICALP'04, pp. 481–493, 2004.

[6] C. Dürr and P. Høyer, "A quantum algorithm for finding the minimum", Available at arxiv.org/abs/quant-ph/9607014, 1996.

[7] L. K.
Grover, "A framework for fast quantum mechanical algorithms", In Proceedings of the 30th ACM Symposium on Theory of Computing: STOC'98, pp. 53–62, 1998.

[8] L. K. Grover, "Quantum mechanics helps in searching for a needle in a haystack", Physical Review Letters, Vol. 79, no. 2, pp. 325–328, 1997.

[9] S. Kullback and R. A. Leibler, "On information and sufficiency", Annals of Mathematical Statistics, Vol. 22, no. 1, pp. 79–86, 1951.

[10] A. Nayak and F. Wu, "The quantum query complexity of approximating the median and related statistics", In Proceedings of the 31st ACM Symposium on Theory of Computing: STOC'99, pp. 384–393, 1999.

[11] M. Nielsen and I. Chuang, Quantum Computation and Quantum Information, Cambridge University Press, 2000.
