Tight Cell-Probe Bounds for Online Integer Multiplication and Convolution

Raphaël Clifford† Markus Jalsenius†

Abstract

We show tight bounds for both online integer multiplication and convolution in the cell-probe model with word size $w$. For the multiplication problem, one pair of digits, each from one of two $n$-digit numbers that are to be multiplied, is given as input at step $i$. The online algorithm outputs a single new digit from the product of the numbers before step $i+1$. We give a $\Theta\left(\frac{\delta}{w}\log n\right)$ bound on average per output digit for this problem, where $2^{\delta}$ is the maximum value of a digit. In the convolution problem, we are given a fixed vector $V$ of length $n$ and we consider a stream in which numbers arrive one at a time. We output the inner product of $V$ and the vector that consists of the last $n$ numbers of the stream. We show a $\Theta\left(\frac{\delta}{w}\log n\right)$ bound for the number of probes required per new number in the stream. All the bounds presented hold under randomisation and amortisation. Multiplication and convolution are central problems in the study of algorithms which also have the widest range of practical applications.

1 Introduction

We consider two related and fundamental problems: multiplying two integers and computing the convolution or cross-correlation of two vectors. We study both these problems in an online or streaming context and provide matching upper and lower bounds in the cell-probe model. The importance of these problems is hard to overstate, with both the integer multiplication and convolution problems playing a central role in modern algorithms design and theory.

For notational brevity, we write $[q]$ to denote the set $\{0, \dots, q-1\}$, where $q$ is a positive integer.

Problem 1 (Online convolution).
For a fixed vector $V \in [q]^n$ of length $n$, we consider a stream in which numbers from $[q]$ arrive one at a time. For each arriving number, before the next number arrives, we output the inner product (modulo $q$) of $V$ and the vector that consists of the last $n$ numbers of the stream.

∗ A preliminary version of this paper appeared in ICALP '11.
† University of Bristol, Department of Computer Science, Bristol, U.K.

We show that there are instances of this problem such that any algorithm solving it will require $\Omega\left(\frac{\delta}{w}\log n\right)$ amortised time on average per output, where $\delta = \log_2 q$ and $w$ is the number of bits per cell in the cell-probe model. The result is formally stated in Theorem 3.

Problem 2 (Online multiplication). Given two numbers $X, Y \in [q^n]$, where $q$ is the base and $n$ is the number of digits per number, we want to output the $n$ least significant digits of the product of $X$ and $Y$, in base $q$. We must do this under the constraint that the $i$th digit of the product (starting from the lower-order end) is outputted before the $(i+1)$th digit, and when the $i$th digit is outputted, we only have access to the $i$ least significant digits of $X$ and $Y$, respectively. We can think of the digits of $X$ and $Y$ arriving online in pairs, one digit from each of $X$ and $Y$.

We show that there are instances of this problem such that any algorithm solving it takes $\Omega\left(\frac{\delta}{w}\log n\right)$ time on average per input pair, where $\delta = \log_2 q$ and $w$ is the number of bits per cell in the cell-probe model. The result is formally stated in Theorem 12.

Our main technical innovation is to extend recently developed methods designed to give lower bounds on dynamic data structures to the seemingly distinct field of online algorithms.
Where $\delta = w$, for example, we have $\Omega(\log n)$ lower bounds for both online multiplication and convolution, thereby matching the currently best known offline upper bounds in the RAM model. As we discuss in Section 1.1, this may be the highest lower bound that can be formally proved for these problems without a further significant theoretical breakthrough.

For the convolution problem, one consequence of our results is a new separation between the time complexity of exact and inexact string matching in a stream. The convolution has played a particularly important role in the field of combinatorial pattern matching, where many of the fastest algorithms rely crucially for their speed on the use of fast Fourier transforms (FFTs) to perform repeated convolutions. These methods have also been extended to allow searching for patterns in rapidly processed data streams [CEPP11, CS11]. The results we present here therefore give the first strict separation between the constant time complexity of online exact matching [Gal81] and any convolution based online pattern matching algorithm.

Although we show only the existence of probability distributions on the inputs for which we can prove lower bounds on the expected running time of any deterministic algorithm, by Yao's minimax principle [Yao77] this also immediately implies that for every (randomised) algorithm, there is a worst-case input such that the (expected) running time is equally high. Therefore our lower bounds hold equally for randomised algorithms as for deterministic ones.

The lower bounds we show for both online multiplication and convolution are also tight within the cell-probe model. This can be seen by application of reductions described in [FS73, CEPP11].
It was shown there that any offline algorithm for multiplication [FS73] or convolution [CEPP11] can be converted to an online one with at most an $O(\log n)$ factor overhead. For details of these reductions we refer the reader to the original papers. In our case, the same approach also allows us to directly convert any cell-probe algorithm from an offline to an online setting. An offline cell-probe algorithm for either multiplication or convolution could first read the whole input, then compute the answers and finally output them. This takes $O\left(\frac{\delta}{w} n\right)$ cell probes. We can therefore derive online cell-probe algorithms which take only $O\left(\frac{\delta}{w}\log n\right)$ probes per output, hence matching the new lower bound we give.

1.1 Previous results and upper bounds in the RAM model

The best time complexity lower bounds for online multiplication of two $n$-bit numbers were given in 1974 by Paterson, Fischer and Meyer. They presented an $\Omega(\log n)$ lower bound for multitape Turing machines [PFM74] and also gave an $\Omega(\log n / \log\log n)$ lower bound for the 'bounded activity machine' (BAM). The BAM, which is a strict generalisation of the Turing machine model but which has nonetheless largely fallen out of favour, attempts to capture the idea that future states can only depend on a limited part of the current configuration. To the authors' knowledge, there has been no progress on cell-probe lower bounds for online multiplication or convolution previous to the work we present here.

There have however been attempts to provide offline lower bounds for the related problem of computing the FFT. In [Mor73] Morgenstern gave an $\Omega(n \log n)$ lower bound conditional on the assumption that the underlying field of the transform is the complex numbers and that the modulus of any complex numbers involved in the computation is at most 1.
Papadimitriou gave the same $\Omega(n \log n)$ lower bound for FFTs of length a power of two, this time excluding certain classes of algorithms including those that rely on linear mathematical relations among the roots of unity [Pap79]. This work had the advantage of giving a conditional lower bound for FFTs over more general algebras than was previously possible, including for example finite fields. In 1986 Pan [Pan86] showed that another class of algorithms having a so-called synchronous structure must require $\Omega(n \log n)$ time for the computation of both the FFT and convolution.

The fastest known algorithms for both offline integer multiplication and convolution in the word-RAM model require $O(n \log n)$ time by a well known application of a constant number of FFTs. As a consequence our online lower bounds match the best known time upper bounds for the offline problem. As we discussed above, our lower bounds are also tight within the cell-probe model for the online problems.

The question now naturally arises as to whether one can find higher lower bounds in the RAM model. This appears as an interesting question as there remains a gap between the best known time upper bounds provided by existing algorithms and the lower bounds that we give within the cell-probe model. However, as we mention above, any offline algorithm for convolution or multiplication can be converted to an online one with at most an $O(\log n)$ factor overhead [FS73, CEPP11]. As a consequence, it is likely to be hard to prove a higher lower bound for the online problem than we have given, at least for the case where $\delta/w \in \Theta(1)$, as this would immediately imply a superlinear lower bound for offline convolution or multiplication.
Such superlinear lower bounds are not yet known for any problem in NP except in very restricted models of computation, such as for example a single tape Turing machine. Our only alternative route to find tight time bounds would be to find better upper bounds for the online problems. For the case of online multiplication at least, this has been an open problem since at least 1973 and has so far resisted our best attempts.

1.2 The cell-probe model

When stating lower bounds it is important to be precise about the model in which the bounds apply. Our bounds in this paper hold in perhaps the strongest model of them all, the cell-probe model, introduced originally by Minsky and Papert [MP69, MP88] in a different context and then subsequently by Fredman [Fre78] and Yao [Yao81]. In this model, there is a separation between the computing unit and the memory, which is external and consists of a set of cells of $w$ bits each. The computing unit cannot remember any information between operations. Computation is free and the cost is measured only in the number of cell reads or writes (cell-probes). This general view makes the model very strong, subsuming for instance the popular word-RAM model. In the word-RAM model certain operations on words, such as addition, subtraction and possibly multiplication take constant time (see for example [Hag98] for a detailed introduction). Here a word corresponds to a cell. Typically we think of the cell size $w$ as being at least $\log_2 n$ bits, where $n$ is the number of cells. This allows each cell to hold the address of any location in memory.

The generality of the cell-probe model makes it particularly attractive for establishing lower bounds for data structure problems and many such results have been given in the past couple of decades.
The approaches taken have until recently mainly been based on communication complexity arguments and the chronogram technique of Fredman and Saks [FS89]. There remain, however, a number of unsatisfying gaps between the lower bounds and known upper bounds. Only a few years ago, a breakthrough led by Demaine and Pǎtraşcu gave us the tools to seal the gaps for several data structure problems [PD06]. The new technique was based on information theoretic arguments. Demaine and Pǎtraşcu also presented ideas which allowed them to express more refined lower bounds such as trade-offs between updates and queries of dynamic data structures. For a list of data structure problems and their lower bounds using these and related techniques, see for example [Pǎt08].

1.3 Organisation

We present the new cell-probe lower bound for online convolution in Section 2 along with the main techniques that we will use throughout. In Section 3 we show how these can then be applied to the problem of online multiplication.

2 Online convolution

For a vector $V$ of length $n$ and $i \in [n]$, we write $V[i]$ to denote the elements of $V$. For positive integers $n$ and $q$, the inner product of two vectors $U, V \in [q]^n$, denoted $\langle U, V \rangle$, is defined as
\[ \langle U, V \rangle = \sum_{i \in [n]} \left( U[i] \cdot V[i] \right). \]
Parameterised by two positive integers $n$ and $q$, and a fixed vector $V \in [q]^n$, the online convolution problem asks to maintain a vector $U \in [q]^n$ subject to an operation next($\Delta$), which takes a parameter $\Delta \in [q]$, modifies $U$ to be the vector $(U[1], U[2], \dots, U[n-1], \Delta)$ and then returns the inner product $\langle U, V \rangle$. In other words, next($\Delta$) modifies $U$ by shifting all elements one step to the left, pushing the leftmost element out, and setting the new rightmost element to $\Delta$.
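The semantics of the next-operation can be pinned down with a naive reference implementation. The sketch below is only an illustration under stated assumptions: the class name `OnlineConvolution` is hypothetical, $U$ is assumed to start as the all-zero vector, and each call recomputes the inner product from scratch in $O(n)$ arithmetic operations, so it says nothing about the cell-probe bounds studied here.

```python
from collections import deque


class OnlineConvolution:
    """Naive reference for the next-operation (hypothetical helper class)."""

    def __init__(self, v, q):
        self.v = list(v)  # the fixed vector V in [q]^n
        self.q = q        # all arithmetic is modulo q
        # U starts as the all-zero vector; with maxlen set, appending on the
        # right automatically pushes the leftmost element out.
        self.u = deque([0] * len(self.v), maxlen=len(self.v))

    def next(self, delta):
        """Shift U one step left, set the rightmost entry to delta,
        and return the inner product <U, V> modulo q."""
        self.u.append(delta % self.q)
        return sum(a * b for a, b in zip(self.u, self.v)) % self.q


if __name__ == "__main__":
    conv = OnlineConvolution(v=[1, 2, 3], q=5)
    print([conv.next(d) for d in (4, 1, 2)])  # prints [2, 1, 2]
```

With $V = (1, 2, 3)$ and $q = 5$, the three calls see $U = (0,0,4)$, $(0,4,1)$ and $(4,1,2)$ in turn, which gives the outputs $2$, $1$ and $2$.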
We consider the online convolution problem over the ring $\mathbb{Z}/q\mathbb{Z}$, that is integer arithmetic modulo $q$. Let $\delta = \log_2 q$.

Theorem 3. For any positive integers $q$ and $n$, in the cell-probe model with $w$ bits per cell there exist instances of the online convolution problem such that the expected amortised time per next-operation is $\Omega\left(\frac{\delta}{w}\log n\right)$, where $\delta = \log_2 q$.

In order to prove Theorem 3 we will consider a random instance that is described by $n$ next-operations on the sequence $\Delta = (\Delta_0, \dots, \Delta_{n-1})$, where each $\Delta_i$ is chosen independently and uniformly at random from $[q]$. We defer the choice of the fixed vector $V$ until later. For $t$ from $0$ to $n-1$, we use $t$ to denote the time, and we say that the operation next($\Delta_t$) occurs at time $t$. We may assume that prior to the first update, the vector $U = \{0\}^n$, although any values are possible since they do not influence the analysis. To avoid technicalities we will from now on assume that $n$ is a power of two.

2.1 Information transfer

Following the overall approach of Demaine and Pǎtraşcu [PD04] we will consider adjacent time intervals and study the information that is transferred from the operations in one interval to the next interval. More precisely, let $t_0, t_1, t_2 \in [n]$ such that $t_0 \le t_1 < t_2$ and consider any algorithm solving the online convolution problem. We would like to keep track of the memory cells that are written to during the time interval $[t_0, t_1]$ and then read during the succeeding interval $[t_1+1, t_2]$. The information from the next-operations taking place in the interval $[t_0, t_1]$ that the algorithm passes on to the interval $[t_1+1, t_2]$ must be contained in these cells. Informally one can say that there is no other way for the algorithm to determine what occurred during the interval $[t_0, t_1]$ except through these cells.
Formally, the information transfer, denoted $IT(t_0, t_1, t_2)$, is defined to be the set of memory cells $c$ such that $c$ is written during $[t_0, t_1]$, read at a time $t_r \in [t_1+1, t_2]$ and not written during $[t_1+1, t_r]$. Hence a cell that is overwritten in $[t_1+1, t_2]$ before being read is not included in the information transfer. Observe that the information transfer depends on the algorithm, the vector $V$ and the sequence $\Delta$. The first aim is to show that for any choice of algorithm solving the online convolution problem, the number of cells in the information transfer is bounded from below by a sufficiently large number for some choice of the vector $V$.

For $0 \le t_0 \le t_1 < n$, we write $\Delta_{[t_0,t_1]}$ to denote the subsequence $(\Delta_{t_0}, \dots, \Delta_{t_1})$ of $\Delta$, and $\Delta_{[t_0,t_1]^c}$ to denote the sequence $(\Delta_0, \dots, \Delta_{t_0-1}, \Delta_{t_1+1}, \dots, \Delta_{n-1})$ which contains all the elements of $\Delta$ except for those in $\Delta_{[t_0,t_1]}$. For $t \in [n]$, we let $P_t \in [q]$ denote the inner product returned by next($\Delta_t$) at time $t$ (recall that we operate modulo $q$). We let $P_{[t_1+1,t_2]} = (P_{t_1+1}, \dots, P_{t_2})$.

Since $\Delta$ is a random variable, so is $P_{[t_1+1,t_2]}$. In particular, if we condition on a fixed choice of $\Delta_{[t_0,t_1]^c}$, call it $\Delta^{\mathrm{fix}}_{[t_0,t_1]^c}$, then $P_{[t_1+1,t_2]}$ is a random variable that depends on the random values in $\Delta_{[t_0,t_1]}$. The dependency on the next-operations in the interval $[t_0, t_1]$ is captured by the information transfer $IT(t_0, t_1, t_2)$, which must encode all the relevant information in order for the algorithm to correctly output the inner products in $[t_1+1, t_2]$. In other words, an encoding of the information supplied by cells in the information transfer is an upper bound on the conditional entropy of $P_{[t_1+1,t_2]}$.
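The definition of the information transfer can be made concrete on a recorded sequence of probes. The following sketch is an illustration only: the trace format (a list of `(time, op, cell)` tuples in execution order) and the function name `information_transfer` are assumptions introduced here, and probes within one time step are assumed to take effect in the order in which they appear in the trace.

```python
def information_transfer(trace, t0, t1, t2):
    """Return the set IT(t0, t1, t2) for a probe trace.

    `trace` is a list of (time, op, cell) tuples in execution order,
    where op is "r" for a read and "w" for a write (assumed format).
    """
    last_write = {}   # cell -> time of the most recent write seen so far
    transfer = set()
    for time, op, cell in trace:
        if op == "w":
            last_write[cell] = time
        elif op == "r" and t1 + 1 <= time <= t2:
            written_at = last_write.get(cell)
            # The read sees a value written during [t0, t1] that has not
            # been overwritten since, so the cell carries information from
            # the first interval into the second.
            if written_at is not None and t0 <= written_at <= t1:
                transfer.add(cell)
    return transfer


if __name__ == "__main__":
    trace = [
        (0, "w", "a"), (1, "w", "b"), (1, "w", "c"),
        (2, "w", "b"), (2, "r", "a"), (3, "r", "b"), (3, "r", "c"),
    ]
    print(sorted(information_transfer(trace, 0, 1, 3)))  # prints ['a', 'c']
```

In the example trace, cell `b` is overwritten at time 2 before it is read at time 3, so it is excluded from $IT(0, 1, 3)$, while it does belong to $IT(2, 2, 3)$.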
This fact is stated in Lemma 4 and was given in [Pǎt08] with small notational differences.

Lemma 4 (Lemma 3.2 of [Pǎt08]). The entropy
\[ H\left(P_{[t_1+1,t_2]} \mid \Delta_{[t_0,t_1]^c} = \Delta^{\mathrm{fix}}_{[t_0,t_1]^c}\right) \le w + 2w \cdot \mathbb{E}\left[\, |IT(t_0,t_1,t_2)| \;\middle|\; \Delta_{[t_0,t_1]^c} = \Delta^{\mathrm{fix}}_{[t_0,t_1]^c} \right]. \]

Proof. The average length of any encoding of $P_{[t_1+1,t_2]}$ (conditioned on $\Delta^{\mathrm{fix}}_{[t_0,t_1]^c}$) is an upper bound on its entropy. We use the information transfer as an encoding in the following way. For every cell $c$ in the information transfer $IT(t_0, t_1, t_2)$, we store the address of $c$, which takes at most $w$ bits under the assumption that the cell size can hold the address of every cell, and we store the contents of $c$, which is a cell of $w$ bits. In total this requires $2w \cdot |IT(t_0, t_1, t_2)|$ bits. In addition, we store the size of the information transfer, $|IT(t_0, t_1, t_2)|$, so that any algorithm decoding the stored information knows how many cells are stored and hence when to stop checking for stored cells. Storing the size of the information transfer requires $w$ bits, thus the average total length of the encoding is $w + 2w \cdot \mathbb{E}\left[\, |IT(t_0,t_1,t_2)| \mid \Delta_{[t_0,t_1]^c} = \Delta^{\mathrm{fix}}_{[t_0,t_1]^c} \right]$.

In order to prove that the described encoding is valid, we describe how to decode the stored information. We do this by simulating the algorithm. First we simulate the algorithm from time $0$ to $t_0 - 1$. We have no problem doing so since all necessary information is available in $\Delta^{\mathrm{fix}}_{[t_0,t_1]^c}$, which we know. We then skip from time $t_0$ to $t_1$ and resume simulating the algorithm from time $t_1 + 1$ to $t_2$. In this interval, the algorithm outputs the values in $P_{[t_1+1,t_2]}$. In order to correctly do so, the algorithm might need information from the next-operations in $[t_0, t_1]$.
This information is only available through the encoding described above. When simulating the algorithm, for each cell $c$ we read, we check if the address of $c$ is contained in the list of addresses that was stored. If so, we obtain the contents of $c$ by reading its stored value. Each time we write to a cell whose address is in the list of stored addresses, we remove it from the stored list, or blank it out. Note that every cell we read whose address is not in the stored list contains a value that was written last either before time $t_0$ or after time $t_1$. Hence its value is known to us.

2.2 Recovering information

In the previous section, we provided an upper bound for the entropy of the outputs from the next-operations in $[t_1+1, t_2]$. Next we will explore how much information needs to be communicated from $[t_0, t_1]$ to $[t_1+1, t_2]$. This will provide a lower bound on the entropy. As we will see, the lower bound can be expressed as a function of the length of the intervals and the vector $V$.

Suppose that $[t_0, t_1]$ and $[t_1+1, t_2]$ both have the same length $\ell$. That is, $t_1 - t_0 + 1 = t_2 - t_1 = \ell$. For $i \in [\ell]$, the output at time $t_1 + 1 + i$ can be broken into two sums $S_i$ and $S'_i$, such that $P_{t_1+1+i} = S_i + S'_i$, where
\[ S_i = \sum_{j \in [\ell]} \left( V[n-1-(\ell+i)+j] \cdot \Delta_{t_0+j} \right) \]
is the contribution from the alignment of $V$ with $\Delta_{[t_0,t_1]}$, and $S'_i$ is the contribution from the alignments that do not include $\Delta_{[t_0,t_1]}$. We define $M_{V,\ell}$ to be the $\ell \times \ell$ matrix with entries $M_{V,\ell}(i,j) = V[n-1-(\ell+i)+j]$. That is,
\[ M_{V,\ell} = \begin{pmatrix}
V[n-\ell-1] & V[n-\ell+0] & V[n-\ell+1] & \cdots & V[n-2] \\
V[n-\ell-2] & V[n-\ell-1] & V[n-\ell+0] & \cdots & V[n-3] \\
V[n-\ell-3] & V[n-\ell-2] & V[n-\ell-1] & \cdots & V[n-4] \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
V[n-2\ell] & V[n-2\ell+1] & V[n-2\ell+2] & \cdots & V[n-\ell-1]
\end{pmatrix}. \]
We observe that $M_{V,\ell}$ is a Toeplitz matrix (or "upside down" Hankel matrix) since it is constant on each descending diagonal from left to right. This property will be important later. From the definitions above it follows that
\[ M_{V,\ell} \times \begin{pmatrix} \Delta_{t_0} \\ \Delta_{t_0+1} \\ \vdots \\ \Delta_{t_1} \end{pmatrix} = \begin{pmatrix} S_0 \\ S_1 \\ \vdots \\ S_{\ell-1} \end{pmatrix}. \tag{1} \]

We define the recovery number $R_{V,\ell}$ to be the number of variables $x \in \{x_1, \dots, x_\ell\}$ such that $x$ can be determined uniquely by the system of linear equations
\[ M_{V,\ell} \times \begin{pmatrix} x_1 \\ \vdots \\ x_\ell \end{pmatrix} = \begin{pmatrix} y_1 \\ \vdots \\ y_\ell \end{pmatrix}, \]
where we operate in $\mathbb{Z}/q\mathbb{Z}$. The recovery number may be distinct from the rank of a matrix, even where we operate over a field. As an example, consider the all ones matrix. The matrix will have recovery number zero but rank one. The recovery number is however related to the conditional entropy of $P_{[t_1+1,t_2]}$ as described by the next lemma.

Lemma 5. If the intervals $[t_0, t_1]$ and $[t_1+1, t_2]$ both have the same length $\ell$, then the entropy
\[ H\left(P_{[t_1+1,t_2]} \mid \Delta_{[t_0,t_1]^c} = \Delta^{\mathrm{fix}}_{[t_0,t_1]^c}\right) \ge \delta R_{V,\ell}. \]

Proof. As described above, for $i \in [\ell]$, $P_{t_1+1+i} = S_i + S'_i$, where $S'_i$ is a constant that only depends on $V$ and $\Delta^{\mathrm{fix}}_{[t_0,t_1]^c}$. Hence we can compute the values $S_0, \dots, S_{\ell-1}$ from $P_{[t_1+1,t_2]}$. From Equation (1) it follows that $S_0, \dots, S_{\ell-1}$ uniquely specify $R_{V,\ell}$ of the parameters in $\Delta_{[t_0,t_1]}$. That is, we can recover $R_{V,\ell}$ of the parameters from the interval $[t_0, t_1]$. Each of these parameters is a random variable that is uniformly distributed in $[q]$, so it contributes $\delta$ bits of entropy.

We now combine Lemmas 4 and 5 in the following corollary.

Corollary 6.
For any fixed vector $V$, two intervals $[t_0, t_1]$ and $[t_1+1, t_2]$ of the same length $\ell$, and any algorithm solving the online convolution problem on $\Delta$ chosen uniformly at random from $[q]^n$,
\[ \mathbb{E}\left[\, |IT(t_0,t_1,t_2)| \,\right] \ge \frac{\delta R_{V,\ell}}{2w} - \frac{1}{2}. \]

Proof. For $\Delta_{[t_0,t_1]^c}$ fixed to $\Delta^{\mathrm{fix}}_{[t_0,t_1]^c}$, comparing Lemmas 4 and 5, we see that
\[ \mathbb{E}\left[\, |IT(t_0,t_1,t_2)| \;\middle|\; \Delta_{[t_0,t_1]^c} = \Delta^{\mathrm{fix}}_{[t_0,t_1]^c} \right] \ge \frac{\delta R_{V,\ell}}{2w} - \frac{1}{2}. \]
The result follows by taking expectation over $\Delta_{[t_0,t_1]^c}$ under the random sequence $\Delta$.

2.3 The lower bound for online convolution

We now show how a lower bound on the total number of cell reads over $n$ next-operations can be obtained by summing the information transfer between many pairs of time intervals. We again follow the approach of Demaine and Pǎtraşcu [PD04], which involves conceptually constructing a balanced tree over the time axis. This lower bound tree, denoted $\mathcal{T}$, is a balanced binary tree on $n$ leaves. Recall that we have assumed that $n$ is a power of two. The leaves, from left to right, represent the time $t$ from $0$ to $n-1$, respectively. An internal node $v$ is associated with the times $t_0$, $t_1$ and $t_2$ such that the two intervals $[t_0, t_1]$ and $[t_1+1, t_2]$ span the left subtree and the right subtree of $v$, respectively. For example, in Figure 1, the node labelled $v$ is associated with the intervals $[16, 23]$ and $[24, 31]$.

For an internal node $v$ of $\mathcal{T}$, we write $IT(v)$ to denote $IT(t_0, t_1, t_2)$, where $t_0, t_1, t_2$ are associated with $v$. We write $L(v)$ to denote the number of leaves in the left (same as the right) subtree of $v$. The key lemma, stated next, is a modified version of Theorem 3.6 in [Pǎt08]. The statement of the lemma is adapted to our online convolution problem and the proof relies on Corollary 6.
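The association between internal nodes of the lower bound tree and pairs of adjacent intervals can be sketched in a few lines. This is an illustration under stated assumptions: the function names `tree_nodes` and `covering_nodes` are hypothetical, and $n$ is assumed to be a power of two as in the text. The second function lists the nodes whose left interval contains a write time and whose right interval contains a later read time.

```python
def tree_nodes(n):
    """Yield (t0, t1, t2) for every internal node of the balanced binary
    tree on n leaves, where n is assumed to be a power of two.

    [t0, t1] spans the left subtree and [t1 + 1, t2] the right subtree.
    """
    size = n
    while size >= 2:
        half = size // 2
        for start in range(0, n, size):
            yield (start, start + half - 1, start + size - 1)
        size = half


def covering_nodes(n, t_w, t_r):
    """Return the nodes with t_w in the left interval and t_r in the right."""
    return [(t0, t1, t2) for (t0, t1, t2) in tree_nodes(n)
            if t0 <= t_w <= t1 < t_r <= t2]


if __name__ == "__main__":
    nodes = list(tree_nodes(32))
    print(len(nodes))                 # prints 31
    print((16, 23, 31) in nodes)      # prints True
    print(covering_nodes(32, 5, 20))  # prints [(0, 15, 31)]
```

For $n = 32$ the triple $(16, 23, 31)$ corresponds to the intervals $[16, 23]$ and $[24, 31]$ mentioned above. For any pair of times $t_w < t_r$, `covering_nodes` returns exactly one node, namely the lowest common ancestor of the two leaves.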
Figure 1: A lower bound tree $\mathcal{T}$ over $n = 32$ operations (leaves labelled $0$ to $31$, with the node $v$ marked).

Lemma 7. For any fixed vector $V$ and any algorithm solving the online convolution problem, the expected running time of the algorithm over a sequence $\Delta$ that is chosen uniformly at random from $[q]^n$ is at least
\[ \frac{\delta}{2w} \sum_{v \in \mathcal{T}} R_{V,L(v)} - \frac{n-1}{2}, \]
where the sum is over the internal nodes of $\mathcal{T}$.

Proof. We first consider a fixed sequence $\Delta$. We argue that the number of read instructions executed by the algorithm is at least $\sum_{v \in \mathcal{T}} |IT(v)|$. To see this, for any read instruction, let $t_r$ be the time it is executed. Let $t_w \le t_r$ be the time the cell was last written, ignoring $t_r = t_w$. Then this read instruction (the cell it acts upon), is contained in $IT(v)$, where $v$ is the lowest common ancestor of $t_w$ and $t_r$. Thus, $\sum_{v \in \mathcal{T}} |IT(v)|$ never double-counts a read instruction. For a random $\Delta$, an expected lower bound on the number of read instructions is therefore $\mathbb{E}\left[\sum_{v \in \mathcal{T}} |IT(v)|\right]$. Using linearity of expectation and Corollary 6, we obtain the lower bound in the statement of the lemma.

2.3.1 Lower bound with a random vector V

We have seen in Lemma 7 that a lower bound is highly dependent on the recovery numbers of the vector $V$. In the next lemma, we show that a random vector $V$ has recovery numbers that are large.

Lemma 8. Suppose that $q$ is a prime and the vector $V$ is chosen uniformly at random from $[q]^n$. Then $\mathbb{E}[R_{V,\ell}] \ge \ell/2$ for every length $\ell$.

Proof. Recall that $M_{V,\ell}$ is an $\ell \times \ell$ Toeplitz matrix. It has been shown in [KL96] that for any $\ell$, out of all the $\ell \times \ell$ Toeplitz matrices over a finite field of $q$ elements, a fraction of exactly $(1 - 1/q)$ is non-singular. This fact was actually already established in [Day60] almost 40 years earlier but incidentally reproved in [KL96].
Since we have assumed in the statement of the lemma that $q$ is a prime, the ring $\mathbb{Z}/q\mathbb{Z}$ we operate in is indeed a finite field. The diagonals of $M_{V,\ell}$ are independent and uniformly distributed in $[q]$, hence the probability that $M_{V,\ell}$ is invertible is $(1 - 1/q) \ge 1/2$. If $M_{V,\ell}$ is invertible then the recovery number $R_{V,\ell} = \ell$; there is a unique solution to the system of linear equations in Equation (1). On the other hand, if $M_{V,\ell}$ is not invertible then the recovery number will be lower. Thus, we can safely say that the expected recovery number $R_{V,\ell}$ is at least $\ell/2$, which proves the lemma.

Before we give a lower bound for a random choice of $V$ in Theorem 10 below, we state the following fact.

Fact 9. For a balanced binary tree with $n$ leaves, the sum of the number of leaves in the subtree rooted at $v$, taken over all internal nodes $v$, is $n \log_2 n$.

Theorem 10. Suppose that $q$ is a prime. In the cell-probe model with $w$ bits per cell, any algorithm solving the online convolution problem on a vector $V$ and $\Delta$, both chosen uniformly at random from $[q]^n$, will run in $\Omega\left(\frac{\delta}{w} n \log n\right)$ time in expectation, where $\delta = \log_2 q$.

Proof. For a random vector $V$, a lower bound is obtained by taking the expectation of the bound in the statement of Lemma 7. Using linearity of expectation and applying Lemma 8 and Fact 9 completes the proof.

Remark. Theorem 10 requires that $q$ is a prime but for an integer $\delta > 1$, $q = 2^{\delta}$ is not a prime. However, we know that there is always at least one prime $p$ such that $2^{\delta-1} < p < 2^{\delta}$. Thus, Theorem 10 is applicable for any integer $\delta$, only with an adjustment by at most one.

2.3.2 Lower bound with a fixed vector V

We demonstrate next that it is possible to design a fixed vector $V$ with guaranteed large recovery numbers. We will use this vector in the proof of Theorem 3.
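For small instances, recovery numbers can be checked by brute force. The sketch below is an illustration and not part of the original argument: the function names are hypothetical, and it relies on the observation that a variable $x_k$ is uniquely determined by $M x = y$ over $\mathbb{Z}/q\mathbb{Z}$ exactly when every solution $z$ of $M z = 0$ has $z_k = 0$. The enumeration takes $q^{\ell}$ steps, so it is only usable for tiny $q$ and $\ell$.

```python
from itertools import product


def toeplitz_from_vector(v, length):
    """Build M_{V,l} with entries M(i, j) = V[n - 1 - (l + i) + j]."""
    n = len(v)
    if n < 2 * length:
        raise ValueError("need n >= 2 * length so that all indices are valid")
    return [[v[n - 1 - (length + i) + j] for j in range(length)]
            for i in range(length)]


def recovery_number(matrix, q):
    """Count the variables uniquely determined by M x = y over Z/qZ.

    Brute force: enumerate every z in [q]^l with M z = 0 (mod q) and mark
    each coordinate that is non-zero in some such z as ambiguous.
    """
    size = len(matrix)
    ambiguous = [False] * size
    for z in product(range(q), repeat=size):
        if all(sum(a * b for a, b in zip(row, z)) % q == 0 for row in matrix):
            for k, z_k in enumerate(z):
                if z_k != 0:
                    ambiguous[k] = True
    return ambiguous.count(False)


if __name__ == "__main__":
    print(recovery_number([[1, 1], [1, 1]], 2))                   # prints 0
    print(recovery_number(toeplitz_from_vector([0, 1, 2, 3], 2), 3))  # prints 2
```

The first call reproduces the all ones example from Section 2.2 (rank one, recovery number zero). The second builds the matrix with rows $(1, 2)$ and $(0, 1)$, which is invertible modulo 3, so both variables are recovered.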
The idea is to let $V$ consist of stretches of 0s interspersed by 1s. The distance between two succeeding 1s is an increasing power of two, ensuring that for half of the alignments in the interval $[t_1+1, t_2]$, all but exactly one element of $\Delta_{[t_0,t_1]}$ are simultaneously aligned with a 0 in $V$. We define the binary vector $K_n \in [2]^n$ to be
\[ K_n = (\cdots 0000000000000\,1\,000000000000000\,1\,0000000\,1\,000\,1\,0\,1\,1\,0), \]
or formally,
\[ K_n[i] = \begin{cases} 1, & \text{if } n-1-i \text{ is a power of two;} \\ 0, & \text{otherwise.} \end{cases} \tag{2} \]

Lemma 11. Suppose $V = K_n$ and $\ell \ge 1$ is a power of two. The recovery number $R_{V,\ell} \ge \ell/2$.

Proof. Recall that entry $M_{V,\ell}(i,j) = V[n-1-(\ell+i)+j]$. Thus, $M_{V,\ell}(i,j) = 1$ if and only if $n-1-(n-1-(\ell+i)+j) = \ell + i - j$ is a power of two. It follows that for row $i = \ell/2, \dots, \ell-1$, $M_{V,\ell}(i,j) = 1$ for $j = i$ and $M_{V,\ell}(i,j) = 0$ for $j \ne i$. This implies that the recovery number $R_{V,\ell}$ is at least $\ell/2$.

We finally give the proof of Theorem 3.

Proof of Theorem 3. We assume that $n$ is a power of two. Let $V = K_n$. It follows from Lemma 11 and Fact 9 that $\sum_{v \in \mathcal{T}} R_{V,L(v)} \ge \sum_{v \in \mathcal{T}} L(v)/2 = \Omega(n \log n)$. Note that $L(v)$ is a power of two for every node $v$ in $\mathcal{T}$. For $\Delta$ chosen uniformly at random from $[q]^n$, apply Lemma 7 to obtain the expected running time $\Omega\left(\frac{\delta}{w} n \log n\right)$ over $n$ next-operations.

3 Online multiplication

In this section we consider online multiplication of two $n$-digit numbers in base $q \ge 2$. For a non-negative integer $X$, let $X[i]$ denote the $i$th digit of $X$ in base $q$, where the positions are numbered starting with 0 at the right (lower-order) end. We think of $X$ padded with zeros to make sure that $X[i]$ is defined for arbitrarily large $i$. For $j \ge i$, we write $X[i\,..\,j]$ to denote the integer that is written $X[j] \cdots X[i]$ in base $q$.
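The digit notation can be expressed with integer division and remainders. The two helpers below are a minimal sketch; the names `digit` and `digit_range` are hypothetical and chosen here only for illustration.

```python
def digit(x, i, q):
    """Return X[i], the i-th base-q digit of x (position 0 is least significant).

    Positions beyond the most significant digit yield 0, which matches the
    convention of padding X with zeros.
    """
    return (x // q ** i) % q


def digit_range(x, i, j, q):
    """Return X[i..j], the integer written X[j] ... X[i] in base q (j >= i)."""
    return (x // q ** i) % q ** (j - i + 1)


if __name__ == "__main__":
    print(digit(0b1011, 1, 2))            # prints 1
    print(digit_range(0b1011, 1, 3, 2))   # prints 5 (binary 101)
```

Dividing by $q^i$ discards the $i$ lowest digits, and reducing modulo $q^{j-i+1}$ keeps the next $j - i + 1$ digits.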
For example, let $X = 15949$ (decimal representation) and $q = 8$ (octal):
\[ \begin{aligned}
X &= 37115 \text{ (base 8)} \\
X[0] &= 5 \\
X[1\,..\,3] &= 711 \text{ (base 8)} = 457 \text{ (decimal)} \\
X[3] &= 7 \\
X[3\,..\,10] &= 37 \text{ (base 8)} = 31 \text{ (decimal)} \\
X[15] &= 0
\end{aligned} \]

The online multiplication problem is defined as follows. The input is two $n$-digit numbers $X, Y \in [q^n]$ in base $q$ (higher order digits may be zero). Let $Z = X \times Y$. We want to output the $n$ lower order digits of $Z$ in base $q$ (i.e. $Z[0], \dots, Z[n-1]$) under the constraint that $Z[i]$ must be outputted before $Z[i+1]$ and when $Z[i]$ is outputted, we are not allowed to use any knowledge of the digits $X[i+1], \dots, X[n-1]$ and $Y[i+1], \dots, Y[n-1]$. We can think of the digits of $X$ and $Y$ arriving one pair at a time, starting with the least significant pair of digits, and we output the corresponding digit of the product of the two numbers seen so far.

We also consider a variant of the online multiplication problem when one of the two input numbers, say $Y$, is known in advance. That is, all its digits are available at every stage of the algorithm and only the digits of $X$ arrive in an online fashion. In particular we will consider the case when $Y = K_{q,n}$ is fixed, where we define $K_{q,n}$ to be the largest number in $[q^n]$ such that the $i$th bit in the binary expansion of $K_{q,n}$ is 1 if and only if $i$ is a power of two (starting with $i = 0$ at the lower-order end). We can see that the binary expansion of $K_{q,n}$ is the reverse of $K_{(n \log_2 q)}$ in Equation (2). We will prove the following result.

Theorem 12. For any positive integers $\delta$ and $n$ in the cell-probe model with $w$ bits per cell, the expected running time of any algorithm solving the online multiplication problem on two $n$-digit random numbers $X, Y \in [q^n]$ is $\Omega\left(\frac{\delta}{w} n \log n\right)$, where $q = 2^{\delta}$ is the base.
The same bound holds even under full access to every digit of $Y$, and when $Y = K_{q,n}$ is fixed.

It suffices to prove the lower bound for the case when we have full access to every digit of $Y$; we could always ignore digits. We prove Theorem 12 using the same approach as for the online convolution problem. Here the next-operation delivers a new digit of $X$, which is chosen uniformly at random from $[q]$, and outputs the corresponding digit of the product of $X$ and $Y$.

[Figure 2: $X$, $Y$ and $Z = X \times Y$ in base $q$. Each number has $n$ digits. $X'$ is the segment of $\ell$ digits of $X$ directly above its $t_0$ lowest-order digits, $Y'$ consists of the $2\ell$ lowest-order digits of $Y$, and $Z'$ is the segment of $\ell$ digits of $Z$ directly above its $t_0 + \ell$ lowest-order digits.]

For $t_0, t_1, t_2 \in [n]$ such that $t_0 \leq t_1 < t_2$, we write $X[t_0, t_1]^c$ to denote every digit of $X$ (in base $q$) except for those at positions $t_0$ through $t_1$. It is helpful to think of $X[t_0, t_1]^c$ as a vector of digits rather than a single number. We write $X^{\mathrm{fix}}[t_0, t_1]^c$ to denote a fixed choice of $X[t_0, t_1]^c$. During the interval $[t_1+1, t_2]$, we output $Z[(t_1+1)\,..\,t_2]$. The information transfer is defined as before, and Lemma 4 is replaced with the following lemma.

Lemma 13. The entropy

$$H\bigl(Z[(t_1+1)\,..\,t_2] \;\big|\; X[t_0,t_1]^c = X^{\mathrm{fix}}[t_0,t_1]^c\bigr) \;\leq\; w + 2w \cdot E\Bigl[\,|IT(t_0,t_1,t_2)| \;\Big|\; X[t_0,t_1]^c = X^{\mathrm{fix}}[t_0,t_1]^c\Bigr].$$

3.1 Retrorse numbers and the lower bound

In Figure 2, the three numbers $X$, $Y$ and $Z = X \times Y$ are illustrated with some segments of their digits labelled $X'$, $Y'$ and $Z'$. Informally, we say that $Y'$ is retrorse if $Z'$ depends heavily on $X'$. We have borrowed the term from Paterson, Fischer and Meyer [PFM74]; however, we give it a more precise meaning, formalised below.

Suppose $[t_0, t_1]$ and $[t_1+1, t_2]$ both have the same length $\ell$.
For notational brevity, we write $X'$ to denote $X[t_0\,..\,t_1]$, $Y'$ to denote $Y[0\,..\,(2\ell-1)]$ and $Z'$ to denote $Z[(t_1+1)\,..\,t_2]$ (see Figure 2). We say that $Y'$ is retrorse if for any fixed values of $t_0$, $X[t_0,t_1]^c$ (the digits of $X$ outside $[t_0,t_1]$) and $Y[2\ell\,..\,(n-1)]$, each value of $Z'$ can arise from at most four different values of $X'$. That is to say, there is at most a four-to-one mapping from possible values of $X'$ to possible values of $Z'$.

We define $I_{Y,\ell} = \ell$ if $Y'$ is retrorse, otherwise $I_{Y,\ell} = 0$. Note that $I_{Y,\ell}$ only depends on $Y$ and $\ell$. We will use $I_{Y,\ell}$ similarly to the recovery number $R_{V,\ell}$ from Section 2.2 and replace Lemma 5 with Lemma 14 below, which combined with Lemma 13 gives us Corollary 15.

Lemma 14. If the intervals $[t_0,t_1]$ and $[t_1+1,t_2]$ both have the same length $\ell$, then the entropy

$$H\bigl(Z[(t_1+1)\,..\,t_2] \;\big|\; X[t_0,t_1]^c = X^{\mathrm{fix}}[t_0,t_1]^c\bigr) \;\geq\; \frac{\delta I_{Y,\ell}}{4} - \frac{1}{2}.$$

Proof. The lemma is trivially true when $I_{Y,\ell} = 0$, so suppose that $I_{Y,\ell} = \ell$. Then $Y[0\,..\,(2\ell-1)]$ is retrorse, which implies that at most four distinct values of $X[t_0\,..\,t_1]$ yield the same value of $Z[(t_1+1)\,..\,t_2]$. There are $q^{\ell}$ possible values of $X[t_0\,..\,t_1]$, each with the same probability, hence, from the definition of entropy,

$$H\bigl(Z[(t_1+1)\,..\,t_2] \;\big|\; X[t_0,t_1]^c = X^{\mathrm{fix}}[t_0,t_1]^c\bigr) \;\geq\; \frac{q^{\ell}}{4} \cdot \frac{1}{q^{\ell}} \cdot \log_2\!\left(\frac{1}{4/q^{\ell}}\right) \;=\; \frac{\delta \ell}{4} - \frac{1}{2}.$$

Corollary 15. For any fixed number $Y$, two intervals $[t_0,t_1]$ and $[t_1+1,t_2]$ of the same length $\ell$, and any algorithm solving the online multiplication problem on $X$ chosen uniformly at random from $[q^n]$,

$$E\bigl[\,|IT(t_0,t_1,t_2)|\,\bigr] \;\geq\; \frac{\delta I_{Y,\ell}}{8w} - 1.$$

We take the same approach as in Section 2.3 and use a lower-bound tree $\mathcal{T}$ with $n$ leaves to obtain the next lemma.
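As a brief aside before that lemma, the retrorse condition defined above can be explored by brute force for tiny parameters. The Python sketch below is illustrative only and is not part of the original argument. The helper names `k_number`, `digit_range` and `max_preimages` are hypothetical, and the sketch assumes that $q$ is a power of two. It fixes a single choice of $t_0$ and of the digits of $X$ outside $[t_0, t_1]$ rather than quantifying over all of them, so it can refute, but not establish, that a given $Y'$ is retrorse.

```python
from collections import Counter


def k_number(q: int, n: int) -> int:
    """K_{q,n}: bit i of the binary expansion is 1 iff i is a power of two.

    Assumption: q is a power of two, so an n-digit number has n * log2(q) bits.
    """
    total_bits = n * (q.bit_length() - 1)  # log2(q) when q is a power of two
    value, p = 0, 1
    while p < total_bits:
        value |= 1 << p
        p *= 2
    return value


def digit_range(x: int, i: int, j: int, q: int) -> int:
    """X[i..j]: the integer written X[j] ... X[i] in base q."""
    return (x // q**i) % q ** (j - i + 1)


def max_preimages(y: int, x_rest: int, t0: int, ell: int, q: int) -> int:
    """Largest number of values of X' = X[t0..t1] that give the same Z' = Z[(t1+1)..t2].

    x_rest holds the fixed digits of X outside [t0, t1]; its digits at
    positions t0..t1 are assumed to be zero.
    """
    t1 = t0 + ell - 1
    t2 = t1 + ell
    counts = Counter()
    for x_prime in range(q**ell):
        x = x_rest + x_prime * q**t0
        z = x * y
        counts[digit_range(z, t1 + 1, t2, q)] += 1
    return max(counts.values())


y = k_number(2, 8)                    # binary 10110
print(max_preimages(y, 0, 0, 4, 2))   # Y = K_{2,8}
print(max_preimages(1, 0, 0, 4, 2))   # Y = 1, for contrast
```

With $q = 2$, $n = 8$, $\ell = 4$, $t_0 = 0$ and all other digits of $X$ set to zero, no value of $Z'$ has more than two preimages when $Y = K_{2,8}$, whereas $Y = 1$ maps all 16 values of $X'$ to the same $Z'$ and therefore fails the four-to-one requirement.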
The proof of Lemma 16 below is identical to the proof of Lemma 7, except that we use Corollary 15 instead of Corollary 6. To avoid technicalities we will assume that $n$ and $\delta$ are both powers of two and we let the base $q = 2^{\delta}$.

Lemma 16. For any fixed number $Y$ and any algorithm solving the online multiplication problem, the expected running time of the algorithm with the number $X$ chosen uniformly at random from $[q^n]$ is at least

$$\frac{\delta}{8w} \sum_{v \in \mathcal{T}} I_{Y,L(v)} \;-\; (n-1).$$

Before giving the proof of Theorem 12, we bound the value of $I_{Y,\ell}$ for both a random number $Y$ and $Y = K_{q,n}$. In order to do so, we will use the following two results by Paterson, Fischer and Meyer [PFM74], which apply to binary numbers. The lemmas are stated in our notation, but the translation from the original notation of [PFM74] is straightforward.

Lemma 17 (Lemma 1 of [PFM74]). For the base $q = 2$ and fixed values of $t_0$, $\ell$, $n$ and $X[t_0,t_1]^c$ (where $t_1 = t_0 + \ell - 1$), such that $\ell$ is a power of two, each value of $Z'$ can arise from at most two values of $X'$ when $Y = K_{2,n}$.

Lemma 18 (Corollary of Lemma 5 in [PFM74]). For the base $q = 2$ and fixed values of $t_0$, $\ell$, $n$ and $X[t_0,t_1]^c$ (where $t_1 = t_0 + \ell - 1$), such that $\ell$ is a power of two, at least half of all possible values of $Y'$ have the property that each value of $Z'$ can arise from at most four different values of $X'$.

Lemma 19. If $\ell$ is a power of two, then for a random $Y \in [q^n]$, $E[I_{Y,\ell}] \geq \ell/2$, and for $Y = K_{q,n}$, $I_{Y,\ell} = \ell$.

Proof. Suppose first that $Y = K_{q,n}$. Let $\ell$ be a power of two and $t_0$ a non-negative integer. We define $X'$, $Y'$ and $Z'$ as before (see Figure 2). Instead of writing the numbers in base $q$, we consider their binary expansions, in which each digit is represented by $\delta = \log_2 q$ bits.
In binary, we can write $X$, $Y$ and $Z$ as in Figure 2 if we replace $n$, $t_0$ and $\ell$ with $\delta n$, $\delta t_0$ and $\delta \ell$, respectively. Note that $\delta \ell$ is a power of two. Since $K_{q,n} = K_{2,\delta n}$, it follows immediately from Lemma 17 that $Y'$ is retrorse and hence $I_{Y,\ell} = \ell$.

Suppose now that $Y$ is chosen uniformly at random from $[q^n]$, hence $Y'$ is a random number in $[q^{2\ell}]$. From Lemma 18 it follows that $Y'$ is retrorse with probability at least a half. Thus, $E[I_{Y,\ell}] \geq \ell/2$.

Proof of Theorem 12. We assume that $n$ is a power of two. Let $Y$ be a random number in $[q^n]$, either under the uniform distribution or under the distribution in which $K_{q,n}$ has probability one and every other number has probability zero. A lower bound on the running time is obtained by taking the expectation of the bound in the statement of Lemma 16. Using linearity of expectation and applying Lemma 19 and Fact 9 completes the proof. Note from Lemma 19 that the expected value $E[I_{Y,\ell}] = \ell$ when $Y = K_{q,n}$.

4 Acknowledgements

We are grateful to Mihai Pǎtraşcu for suggesting the connection between online lower bounds and the recent cell-probe results for dynamic data structures and for very helpful discussions on the topic. We would also like to thank Kasper Green Larsen for the observation that our lower bounds are in fact tight within the cell-probe model. MJ was supported by the EPSRC.

References

[CEPP11] R. Clifford, K. Efremenko, B. Porat, and E. Porat. “A Black Box for Online Approximate Pattern Matching”. In: Information and Computation 209.4 (2011), pp. 731–736.
[CS11] R. Clifford and B. Sach. “Pattern Matching in Pseudo Real-Time”. In: Journal of Discrete Algorithms 9.1 (2011), pp. 67–81.
[Day60] D. E. Daykin. “Distribution of bordered persymmetric matrices in a finite field”. In: Journal für die reine und angewandte Mathematik 203 (1960), pp. 47–54.
[Fre78] M.
Fredman. “Observations on the complexity of generating Quasi-Gray codes”. In: SIAM Journal on Computing 7.2 (1978), pp. 134–146.
[FS73] M. J. Fischer and L. J. Stockmeyer. “Fast On-Line Integer Multiplication”. In: STOC ’73: Proc. 5th Annual ACM Symp. Theory of Computing, pp. 67–72.
[FS89] M. Fredman and M. Saks. “The cell probe complexity of dynamic data structures”. In: STOC ’89: Proc. 21st Annual ACM Symp. Theory of Computing, pp. 345–354.
[Gal81] Z. Galil. “String Matching in Real Time”. In: Journal of the ACM 28.1 (1981), pp. 134–149.
[Hag98] T. Hagerup. “Sorting and searching on the word RAM”. In: STACS ’98: Proc. 15th Annual Symp. on Theoretical Aspects of Computer Science, pp. 366–398.
[KL96] E. Kaltofen and A. Lobo. “On rank properties of Toeplitz matrices over finite fields”. In: ISSAC ’96: 1996 International Symp. on Symbolic and Algebraic Computation, pp. 241–249.
[Mor73] J. Morgenstern. “Note on a Lower Bound on the Linear Complexity of the Fast Fourier Transform”. In: Journal of the ACM 20.2 (1973), pp. 305–306.
[MP69] M. Minsky and S. Papert. Perceptrons: An Introduction to Computational Geometry. MIT Press, 1969.
[MP88] M. Minsky and S. Papert. Perceptrons: An Introduction to Computational Geometry. MIT Press, 1988.
[Pan86] V. Y. Pan. “The trade-off between the additive complexity and the asynchronicity of linear and bilinear algorithms”. In: Information Processing Letters 22.1 (1986), pp. 11–14.
[Pap79] C. H. Papadimitriou. “Optimality of the Fast Fourier transform”. In: Journal of the ACM 26.1 (1979), pp. 95–102.
[PD04] M. Pǎtraşcu and E. D. Demaine. “Tight bounds for the partial-sums problem”. In: SODA ’04: Proc. 15th ACM-SIAM Symp. on Discrete Algorithms, pp. 20–29.
[PD06] M. Pǎtraşcu and E. D. Demaine. “Logarithmic Lower Bounds in the Cell-Probe Model”.
In: SIAM Journal on Computing 35.4 (2006), pp. 932–963.
[PFM74] M. S. Paterson, M. J. Fischer, and A. R. Meyer. “An Improved Overlap Argument for On-Line Multiplication”. In: SIAM-AMS Proceedings. Vol. 7. Amer. Math. Soc., pp. 97–111.
[Pǎt08] M. Pǎtraşcu. “Lower bound techniques for data structures”. PhD thesis. MIT, 2008.
[Yao77] A. C.-C. Yao. “Probabilistic computations: Toward a unified measure of complexity”. In: FOCS ’77: Proc. 18th Annual Symp. Foundations of Computer Science, pp. 222–227.
[Yao81] A. C.-C. Yao. “Should Tables Be Sorted?” In: Journal of the ACM 28.3 (1981), pp. 615–628.
