Faster exponentials of power series

We describe a new algorithm for computing exp(f) where f is a power series in C[[x]]. If M(n) denotes the cost of multiplying polynomials of degree n, the new algorithm costs (2.1666... + o(1)) M(n) to compute exp(f) to order n. This improves on the previous best result, namely (2.333... + o(1)) M(n).

Authors: David Harvey

The author recently gave new algorithms for computing the square root and reciprocal of power series in C[[x]], achieving better running time constants than those previously known [Har09]. In this paper we apply similar techniques to the problem of computing exp(f) for a power series f ∈ C[[x]]. Previously, the best known algorithm was that of van der Hoeven [vdH06, p. 6], computing g = exp(f) mod x^n in time (7/3 + o(1)) M(n), where M(n) denotes the cost of multiplying polynomials of degree n. We give a new algorithm that performs the same task in time (13/6 + o(1)) M(n).

Van der Hoeven's algorithm works by decomposing f into blocks, and solving g' = f'g by operating systematically with FFTs of blocks. Our starting point is the observation that his algorithm computes too much, in the sense that at the end of the algorithm, the FFT of every block of g is known. Our new algorithm uses van der Hoeven's algorithm to compute the first half of g, and then extends the approximation to the target precision using a Newton iteration due to Brent [Bre76] (see also [HZ04] or [Ber04] for other exponential algorithms based on a similar iteration). At the end of the algorithm, only the FFTs of the blocks of the first half of g are known. In fact, the reduction in running time relative to van der Hoeven's algorithm turns out to be equal to the cost of these 'missing' FFTs.

We freely use notation and complexity assumptions introduced in [Har09].
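For concreteness, Brent's iteration g ← g·(1 + f − log g), which the new algorithm uses to extend the first half of g, can be sketched with plain quadratic-time truncated-series arithmetic. This is only a minimal illustration of the iteration itself: there is no FFT blocking, so none of the constant-factor savings discussed in this paper, and the helper names are ours, not the paper's.

```python
def mul(a, b, n):
    """Product of two series (coefficient lists), truncated mod x^n."""
    c = [0.0] * n
    for i, ai in enumerate(a[:n]):
        if ai:
            for j, bj in enumerate(b[:n - i]):
                c[i + j] += ai * bj
    return c

def recip(g, n):
    """1/g mod x^n by Newton iteration r <- r*(2 - g*r); needs g[0] != 0."""
    r = [1.0 / g[0]]
    k = 1
    while k < n:
        k = min(2 * k, n)
        t = [-x for x in mul(g, r, k)]
        t[0] += 2.0
        r = mul(r, t, k)
    return r

def log(g, n):
    """log(g) mod x^n as the integral of g'/g; needs g[0] = 1."""
    d = [(i + 1) * g[i + 1] if i + 1 < len(g) else 0.0 for i in range(n - 1)]
    q = mul(d, recip(g, n - 1), n - 1)
    return [0.0] + [q[i] / (i + 1) for i in range(n - 1)]

def exp_newton(f, n):
    """exp(f) mod x^n via Brent's iteration g <- g*(1 + f - log g),
    doubling the precision each pass; needs f[0] = 0."""
    g = [1.0]
    k = 1
    while k < n:
        k = min(2 * k, n)
        g = g + [0.0] * (k - len(g))
        lg = log(g, k)
        h = [(f[i] if i < len(f) else 0.0) - lg[i] for i in range(k)]
        h[0] += 1.0
        g = mul(g, h, k)
    return g
```

With f = x this recovers the coefficients 1/k! of exp(x); each pass doubles the number of correct coefficients, which is the doubling structure the block algorithm below exploits.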
Briefly: 'running time' always means number of ring operations in C. The Fourier transform of length n is denoted by F_n(g), and its cost by T(n). We assume that T(2n) = (1/3 + o(1)) M(n) for a sufficiently dense set of integers n. For Proposition 1 below, we fix a block size m, and for any f ∈ C[[x]] we write f = f[0] + f[1] X + f[2] X^2 + ··· where X = x^m and deg f[i] < m. The key technical tool is [Har09, Lemma 1], which asserts that if f, g ∈ C[[x]], k ≥ 0, and if F_{2m}(f[i]) and F_{2m}(g[i]) are known for 0 ≤ i ≤ k, then (fg)[k] may be computed in time T(2m) + O(m(k+1)). We define a differential operator δ by δf = xf'(x), and we set δ_k f = X^{−k} δ(X^k f). In particular δ(f[0] + f[1] X + ···) = (δ_0 f[0]) + (δ_1 f[1]) X + ···.

Algorithm 1: Exponential

Input:  s ∈ Z, s ≥ 1
        f ∈ C[[x]], f = 0 mod x
        g[0] = exp(f[0]) mod X
        u = exp(−f[0]) mod X  (= g[0]^{−1} mod X)
Output: g = g[0] + ··· + g[2s−1] X^{2s−1} = exp(f) mod X^{2s}

 1  Compute F_{2m}(g[0]), F_{2m}(u)
 2  for 0 ≤ k < s do compute F_{2m}((δf)[k])
 3  for 1 ≤ k < s do
 4      ψ ← ((g[0] + ··· + g[k−1] X^{k−1}) ((δf)[0] + ··· + (δf)[k] X^k))[k]
 5      Compute F_{2m}(ψ)
 6      φ ← uψ mod X
 7      Compute F_{2m}(δ_k^{−1} φ)
 8      g[k] ← g[0] (δ_k^{−1} φ) mod X
 9      Compute F_{2m}(g[k])
10  for 0 ≤ k < s do q[k] ← (δf)[k]
11  for s ≤ k < 2s do
12      ψ ← ((q[0] + ··· + q[k−1] X^{k−1}) (g[0] + ··· + g[s−1] X^{s−1}))[k]
13      Compute F_{2m}(ψ)
14      q[k] ← −uψ mod X
15      Compute F_{2m}(q[k])
16  for 0 ≤ k < s do ε[k] ← δ_{k+s}^{−1} q[k+s] − f[k+s]
17  for 0 ≤ k < s do compute F_{2m}(ε[k])
18  for 0 ≤ k < s do g[k+s] ← −((g[0] + ··· + g[k−1] X^{k−1}) (ε[0] + ··· + ε[k] X^k))[k]

Proposition 1. Algorithm 1 is correct, and runs in time (13s − 4) T(2m) + O(s²m).
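The block splitting and the diagonal action of δ and δ_k are straightforward to implement on coefficient lists. The sketch below (helper names are ours) also shows the inverse δ_k^{−1} used in lines 8 and 16: the coefficient of x^j inside block k has global degree km + j, so δ_k multiplies it by km + j, which is nonzero, hence invertible, whenever k ≥ 1.

```python
def blocks(f, m, nblocks):
    """Split f into nblocks blocks of length m:
    f = f[0] + f[1]*X + ..., where X = x^m and deg f[i] < m."""
    f = f + [0.0] * (nblocks * m - len(f))
    return [f[i * m:(i + 1) * m] for i in range(nblocks)]

def delta(f):
    """delta(f) = x*f'(x): multiply the coefficient of x^j by j."""
    return [j * c for j, c in enumerate(f)]

def delta_k(block, k, m):
    """delta_k(f) = X^{-k} * delta(X^k * f): multiply the coefficient
    of x^j (global degree k*m + j) by k*m + j."""
    return [(k * m + j) * c for j, c in enumerate(block)]

def delta_k_inv(block, k, m):
    """Inverse of delta_k, valid for k >= 1 (all multipliers nonzero)."""
    return [c / (k * m + j) for j, c in enumerate(block)]
```

Applying delta_k to block k of f, for every k, reproduces delta applied to f as a whole, which is exactly the identity δ(f[0] + f[1]X + ···) = (δ_0 f[0]) + (δ_1 f[1])X + ··· stated above.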
Proof. We first show that the loop in lines 3–9 (essentially van der Hoeven's exponential algorithm) correctly computes g_0 = g mod X^s = g[0] + ··· + g[s−1] X^{s−1} = exp(f) mod X^s. By definition g[0] is correct. In the kth iteration, assume that g[0], ..., g[k−1] have been computed correctly. Since δg_0 = g_0 (δf) mod X^s we have

((g[0] + ··· + g[k] X^k) ((δf)[0] + ··· + (δf)[k] X^k))[k] = (δg)[k],

and by construction

((g[0] + ··· + g[k−1] X^{k−1}) ((δf)[0] + ··· + (δf)[k] X^k))[k] = ψ.

Subtracting yields (δg)[k] − ψ = g[k] (δf)[0] mod X, and on multiplying by u we obtain

φ = uψ mod X = (δg)[k] u − g[k] (δf)[0] u mod X = (δg)[k] u + g[k] (δu) mod X = δ_k (g[k] u mod X),

since δu = −(δf) u mod X. Therefore g[k] is computed correctly in line 8.

Next we show that lines 10–15 correctly compute q = q[0] + ··· + q[2s−1] X^{2s−1} = δg_0 / g_0 mod X^{2s}. Since δg_0 / g_0 mod X^s = δf mod X^s, line 10 correctly computes q[0], ..., q[s−1]. The loop in lines 11–15 computes q[s], ..., q[2s−1] using a similar strategy to the division algorithm in [vdH06, p. 6]. Namely, in the kth iteration, assume that q[0], ..., q[k−1] are correct. Then

((q[0] + ··· + q[k−1] X^{k−1}) (g[0] + ··· + g[s−1] X^{s−1}))[k] = ψ

and

((q[0] + ··· + q[k] X^k) (g[0] + ··· + g[s−1] X^{s−1}))[k] = (q g_0)[k] = (δg_0)[k] = 0,

since deg(δg_0) < sm and k ≥ s. Subtracting, we obtain g[0] q[k] = −ψ mod X, so q[k] is computed correctly in line 14. (Note that the transforms of q[0], ..., q[s−1] used in line 12 are already known, since they were computed in line 2.)
At this stage we have δg_0 / g_0 mod X^{2s} = q = δf + δ(εX^s) for some ε = ε[0] + ··· + ε[s−1] X^{s−1}. Line 16 computes the blocks of ε. Then by logarithmic integration, we have g_0 = exp(f) exp(εX^s) mod X^{2s}, so

exp(f) mod X^{2s} = g_0 exp(−εX^s) mod X^{2s} = g_0 (1 − εX^s) mod X^{2s}.

Line 18 multiplies out the latter product to compute the remaining blocks of g.

We now analyse the complexity. Each iteration of lines 4, 12 and 18 costs T(2m) + O(m(k+1)) according to [Har09, Lemma 1]; their total contribution is therefore (3s − 1) T(2m) + O(s²m). Lines 6, 8 and 14 each require a single inverse transform, contributing a total of (3s − 2) T(2m). The explicitly stated forward transforms contribute (7s − 1) T(2m). The various other operations, including applications of δ and δ^{−1}, contribute only O(sm). The total is (13s − 4) T(2m) + O(s²m). ∎

Theorem 2. Let f ∈ C[[x]] with f = 0 mod x. Then exp(f) may be computed to order n in time (13/6 + o(1)) M(n).

Proof. Apply the proof of [Har09, Theorem 3] to Proposition 1, with r = 2s. ∎

References

[Ber04] Daniel Bernstein, Removing redundancy in high-precision Newton iteration, unpublished, available at http://cr.yp.to/papers.html#fastnewton, 2004.
[Bre76] Richard P. Brent, Multiple-precision zero-finding methods and the complexity of elementary function evaluation, Analytic computational complexity (Proc. Sympos., Carnegie-Mellon Univ., Pittsburgh, Pa., 1975), Academic Press, New York, 1976, pp. 151–176.
[Har09] David Harvey, Faster algorithms for the square root and reciprocal of power series, preprint available at http://arxiv.org/abs/0910.1926, 2009.
[HZ04] Guillaume Hanrot and Paul Zimmermann, Newton iteration revisited, unpublished, available at http://www.loria.fr/~zimmerma/papers/fastnewton.ps.gz, 2004.
[vdH06] Joris van der Hoeven, Newton's method and FFT trading, preprint available at http://www.texmacs.org/joris/fnewton/fnewton-abs.html, 2006.
