A Linear-Time Approximation Algorithm for Rotation Distance

Rotation distance between rooted binary trees measures the number of simple operations it takes to transform one tree into another. There are no known polynomial-time algorithms for computing rotation distance. We give an efficient, linear-time appro…

Authors: Sean Cleary, Katherine St. John

Sean Cleary∗ and Katherine St. John†

September 2, 2021

∗ Department of Mathematics, City College of New York, City University of New York, New York, NY 10031, cleary@sci.ccny.cuny.edu. Partial funding provided by NSF #0811002.
† Department of Mathematics & Computer Science, Lehman College & the Graduate Center, City University of New York, Bronx, NY 10468, stjohn@lehman.cuny.edu. Partial funding provided by NSF #0513660.

Abstract

Rotation distance between rooted binary trees measures the number of simple operations it takes to transform one tree into another. There are no known polynomial-time algorithms for computing rotation distance. We give an efficient, linear-time approximation algorithm, which estimates the rotation distance, within a provable factor of 2, between ordered rooted binary trees.

1 Introduction

Binary search trees are a fundamental data structure for storing and retrieving information [4]. Roughly, a binary search tree is a rooted binary tree where the nodes are ordered "left to right." The potential efficiency of storing and retrieving information in binary search trees depends on their height and balance. Rotations provide a simple mechanism for "balancing" binary search trees while preserving their underlying order (see Figure 1).

There has been a great deal of work on estimating, bounding and computing rotation distances. By rotating to right caterpillar trees, Culik and Wood [5] gave an immediate upper bound of $2n - 2$ for the distance between two trees with $n$ interior nodes. In elegant work using methods of hyperbolic volume, Sleator, Tarjan, and Thurston [12] showed not only that $2n - 6$ is an upper bound for $n \geq 11$, but furthermore that for all very large $n$, that
bound is realized. In remarkable recent work, Dehornoy [7] gave concrete examples illustrating that the lower bound is at least $2n - O(\sqrt{n})$ for all $n$. There are no known polynomial-time algorithms for computing rotation distance, though there are polynomial-time estimation algorithms of Pallo [10], Pallo and Baril [1], and Rogers [11]. Baril and Pallo [1] use computational experimental evidence to show that a large fraction of their estimates are within a factor of 2 of the rotation distance. The problem has been recently shown to be fixed-parameter tractable in the parameter, $k$, the distance [3]. Li and Zhang [9] give a polynomial time approximation algorithm for the equivalent diagonal flip distance with approximation ratio of almost 1.97.¹

In this short note, we give a linear time approximation algorithm with an approximation ratio of 2, improving the running time at the very modest expense of approximation ratio. This is accomplished by showing the distance between the trees is bounded below by $n - e - 1$ and above by $2(n - e - 1)$, where $n$ is the number of internal nodes and $e$ is the number of edges in common in the reduced trees. The number of common edges is equivalent to Robinson-Foulds distance, widely used in phylogenetic settings, which Day [6] calculates in linear time.

¹ The exact ratio is bounded by the maximum number of diagonals, $d$, allowed at any vertex, and is $2 - \frac{2}{4(d-1)(d+6)+1}$.

Figure 1: A (right) rotation at a node consists of rotating the right child of the left child of the node to the right child of the node. A left rotation is defined similarly by moving the left child of the right child of the node to the left child of the node. The circled node in the middle tree has been rotated right to yield the tree on the right, and similarly rotated left to yield the tree on the left.
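As a concrete illustration of the rotations in Figure 1 and of the idea of rotating to a right caterpillar, here is a minimal Python sketch. It is not taken from the paper: the encoding (a leaf is the hypothetical marker `LEAF`, an interior node is a pair `(left, right)`) and the function names `rotate_right`, `rotate_left`, and `to_right_caterpillar` are assumptions made for the example. Rotations are applied at the root of the subtree passed in.

```python
LEAF = "."  # hypothetical leaf marker; an interior node is a pair (left, right)


def rotate_right(t):
    """Right rotation at the root of t: ((A, B), C) -> (A, (B, C))."""
    if t == LEAF or t[0] == LEAF:
        raise ValueError("right rotation needs an interior left child")
    (a, b), c = t
    return (a, (b, c))


def rotate_left(t):
    """Left rotation at the root of t: (A, (B, C)) -> ((A, B), C)."""
    if t == LEAF or t[1] == LEAF:
        raise ValueError("left rotation needs an interior right child")
    a, (b, c) = t
    return ((a, b), c)


def to_right_caterpillar(tree):
    """Return (caterpillar, rotations), using right rotations along the right spine."""
    rotations = 0
    nodes = 0
    t = tree
    while t != LEAF:
        # Pull every interior node hanging off the left onto the right spine.
        while t[0] != LEAF:
            t = rotate_right(t)
            rotations += 1
        nodes += 1
        t = t[1]  # the left child is now a leaf; continue down the spine
    # Rotations preserve the node count, so the result is the right
    # caterpillar with `nodes` interior nodes.
    caterpillar = LEAF
    for _ in range(nodes):
        caterpillar = (LEAF, caterpillar)
    return caterpillar, rotations
```

In this sketch each right rotation moves one interior node onto the right spine, so a tree with $n$ interior nodes needs at most $n - 1$ rotations to reach the right caterpillar. Going through the caterpillar and back out to a second tree then costs at most $2n - 2$ rotations, in line with the Culik and Wood bound quoted above.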
2 Background

We consider ordered, rooted binary trees with $n$ interior nodes and where each interior node has two children. Such trees are commonly called extended binary trees [8]. In the following, tree refers to such a tree with an ordering on the leaves, node refers to an interior node, and leaf refers to a non-interior node. Our trees will have $n + 1$ leaves numbered in left-to-right order from 1 to $n + 1$. The size of a tree will be the number of internal nodes it contains. Each internal edge in a tree separates the leaves into two connected sets upon removal, and a pair of edges $e_1$ in $S$ and $e_2$ in $T$ form a common edge pair if their removal in their respective trees gives the same partitions on the leaves. In that case, we say that $S$ and $T$ have a common edge.

Right rotation at a node of a rooted binary tree is defined as a simple change to $T$ as in Figure 1, taking the middle tree to the right-hand one. Left rotation at a node is the natural inverse operation. The rotation distance $d_R(S, T)$ between two rooted binary trees $S$ and $T$ with the same number of leaves is the minimum number of rotations needed to transform $S$ to $T$.

The specific instance of the rotation distance problem we address is:

Rotation Distance:
Input: Two rooted ordered trees, $S$ and $T$, on $n$ internal nodes.
Question: Calculate the rotation distance between them, $d_R(S, T)$.

Finding a sequence of rotations which accomplish the transformation gives only an upper bound. The general difficulty of computing rotation distance comes from the lower bound.

3 Approximation Algorithm

We first show that the rotation distance is bounded by the number of edges that differ between the trees. From this, the approximation result follows easily.

Theorem 1 Let $S$ and $T$ be two distinct ordered rooted trees with the same number of leaves.
Let $n$ be the number of internal nodes and $e$ the number of common edges for $S$ and $T$. Then,

$$n - e - 1 \leq d_R(S, T) \leq 2(n - e - 1).$$

Proof: The lower bound follows from two simple observations. First, if we use a single rotation to transform $T_1$ to $T_2$, all but one of the internal edges in each tree are common with the other tree. Second, every internal edge of $S$ that is not common with an internal edge of $T$ needs a rotation (possibly more than one) to transform it to an edge in common in $T$. The number of internal edges occurring only in $S$ is $n - e - 1$ and thus is also a simple lower bound.

For the upper bound, we use two facts from past work on rotation distance. We first let $(S_1, T_1), (S_2, T_2), \ldots, (S_{e+1}, T_{e+1})$ be the resulting tree pairs from removing the $e$ edges $S$ and $T$ have in common, where we insert placeholder leaves to preserve the extended binary tree property. Let $n_i$ be the size of tree $S_i$ for $i = 1, 2, \ldots, e + 1$. The first is the observation of Sleator et al. [12] used before: the rotation distance of the original tree pair $(S, T)$ with a common edge is the sum of the rotation distances of the two tree pairs "above" and "below" the common edge. Extending this to $e$ edges in common between $S$ and $T$, we have

$$d_R(S, T) = \sum_{i=1}^{e+1} d_R(S_i, T_i) \leq \sum_{i=1}^{e+1} (2n_i - 2) = 2n - 2(e + 1) = 2(n - e - 1).$$

The inequality follows by the initial bound of $2n - 2$ on rotation distance between trees with $n$ internal nodes of Culik and Wood [5], and the next equality uses $\sum_{i=1}^{e+1} n_i = n$, since every internal node of $S$ lies in exactly one piece. Thus, $n - e - 1 \leq d_R(S, T) \leq 2(n - e - 1)$. □

We note that using the sharper bound of $2n - 6$ for $n > 12$ from Sleator, Tarjan and Thurston [12] together with the table of distances for $n \leq 12$ can improve this slightly still further.
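The quantities in Theorem 1 can be checked by hand on small trees, and the following Python sketch does that mechanically. It is an illustration under stated assumptions, not the linear-time procedure of Day [6]: trees are encoded with a hypothetical leaf marker `LEAF` and pairs `(left, right)`, each internal edge is identified with the interval of leaf numbers lying below it (an assumed stand-in for the leaf partition the edge induces), and common edges are counted with a hash-set intersection. The names `internal_intervals` and `rotation_distance_bounds` are invented for the example.

```python
LEAF = "."  # hypothetical leaf marker; an interior node is a pair (left, right)


def internal_intervals(tree):
    """Return (n, intervals) for a tree with n interior nodes.

    `intervals` holds one (lo, hi) leaf interval per non-root interior
    node, i.e. one per internal edge (an assumption of this sketch).
    """
    intervals = set()

    def walk(t, lo, is_root):
        # Returns (number of leaves below t, number of interior nodes in t).
        if t == LEAF:
            return 1, 0
        left_leaves, left_nodes = walk(t[0], lo, False)
        right_leaves, right_nodes = walk(t[1], lo + left_leaves, False)
        leaves = left_leaves + right_leaves
        if not is_root:
            intervals.add((lo, lo + leaves - 1))
        return leaves, 1 + left_nodes + right_nodes

    _, n = walk(tree, 1, True)
    return n, intervals


def rotation_distance_bounds(s, t):
    """Return (lower, upper) = (n - e - 1, 2(n - e - 1)) as in Theorem 1."""
    n_s, edges_s = internal_intervals(s)
    n_t, edges_t = internal_intervals(t)
    if n_s != n_t or n_s == 0:
        raise ValueError("trees need the same positive number of interior nodes")
    e = len(edges_s & edges_t)  # number of common edges
    lower = n_s - e - 1
    return lower, 2 * lower
```

For example, the left and right caterpillars with three interior nodes share no internal edge, so $n = 3$, $e = 0$, and the sketch reports bounds 2 and 4; two rotations at the root do transform one into the other, so the lower bound is attained here. The recursion is only suitable for small inputs, since very deep trees would exceed Python's default recursion limit.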
These reduction rules and counting the number of common edges can be carried out in linear time [2, 6], yielding the corollary:

Corollary 2 Let $S$ and $T$ be ordered rooted trees with $n$ internal nodes. A 2-approximation of their rotation distance can be calculated in linear time.

Proof: Let $S$ and $T$ be two distinct ordered rooted trees with the same number of leaves. Let $n$ be the number of internal nodes and $e$ the number of edges in common for $S$ and $T$. Then, by Theorem 1, $n - e - 1 \leq d_R(S, T) \leq 2(n - e - 1)$. Since the rotation distance is within a factor of 2 of both bounds, we have the desired approximation. □

We note that this algorithm not only approximates rotation distance, it gives a sequence of rotations which realize the upper bound of the approximation, again in linear time. The approximation algorithm uses the Culik-Wood bound on potentially several pieces. On each piece, the $2n - 2$ bound comes from rotating each internal node which is not on the right side of the tree to obtain a right caterpillar, and then rotating the caterpillar to obtain the desired tree. This can be accomplished simply in linear time.

References

[1] Jean-Luc Baril and Jean-Marcel Pallo. Efficient lower and upper bounds of the diagonal-flip distance between triangulations. Information Processing Letters, 100(4):131–136, 2006.
[2] Maria Luisa Bonet, Katherine St. John, Ruchi Mahindru, and Nina Amenta. Approximating subtree distances between phylogenies. Journal of Computational Biology, 13(8):1419–1434 (electronic), 2006.
[3] Sean Cleary and Katherine St. John. Rotation distance is fixed parameter tractable. 109:918–922, 2009.
[4] T.H. Cormen, C.E. Leiserson, and R.L. Rivest. Introduction to Algorithms. McGraw-Hill, 1990.
[5] Karel Culik II and Derick Wood. A note on some tree similarity measures. Information Processing Letters, 15(1):39–42, 1982.
[6] W. H. E. Day. Optimal algorithms for comparing trees with labeled leaves. Journal of Classification, 2:7–28, 1985.
[7] Patrick Dehornoy. On the rotation distance between binary trees. Preprint, arXiv:math.CO/0901.2557.
[8] Donald E. Knuth. The Art of Computer Programming. Volume 3. Addison-Wesley, Reading, Mass, 1973. Sorting and searching.
[9] Ming Li and Louxin Zhang. Better approximation of diagonal-flip transformation and rotation transformation. In COCOON '98: Proceedings of the 4th Annual International Conference on Computing and Combinatorics, pages 85–94, London, UK, 1998. Springer-Verlag.
[10] Jean Pallo. An efficient upper bound of the rotation distance of binary trees. Information Processing Letters, 73(3-4):87–92, 2000.
[11] R. Rogers. On finding shortest paths in the rotation graph of binary trees. In Proceedings of the Southeastern International Conference on Combinatorics, Graph Theory, and Computing, volume 137, pages 77–95, 1999.
[12] Daniel D. Sleator, Robert E. Tarjan, and William P. Thurston. Rotation distance, triangulations, and hyperbolic geometry. Journal of the American Mathematical Society, 1(3):647–681, 1988.
