On the Complexity of Searching in Trees: Average-case Minimization

Tobias Jacobs, Albert-Ludwigs-University Freiburg, Germany, jacobs@informatik.uni-freiburg.de
Ferdinando Cicalese, University of Salerno, Italy, cicalese@dia.unisa.it
Eduardo Laber, PUC-Rio, Brazil, laber@inf.puc-rio.br
Marco Molinaro, Carnegie Mellon, USA, molinaro@cmu.edu

Abstract

We study the following tree search problem: in a given tree $T = (V, E)$ a node has been marked and we want to identify it. In order to locate the marked node, we can use edge queries. An edge query $e$ asks in which of the two connected components of $T \setminus e$ the marked node lies. The worst-case scenario, where one is interested in minimizing the maximum number of queries, is well understood, and linear time algorithms are known for finding an optimal search strategy [Onak et al. FOCS'06, Mozes et al. SODA'08]. Here we study the more involved average-case analysis: a function $w : V \to \mathbb{Z}^+$ is given which defines the likelihood for a node to be the one marked, and we want the strategy that minimizes the expected number of queries. Prior to this paper, very little was known about this natural question, and the complexity of the problem had remained an open question.

We close this question and prove that the above tree search problem is NP-complete even for the class of trees with diameter at most 4. This results in a complete characterization of the complexity of the problem with respect to the diameter size. In fact, for diameter not larger than 3 the problem can be shown to be polynomially solvable using a dynamic programming approach. In addition we prove that the problem is NP-complete even for the class of trees of maximum degree at most 16. To the best of our knowledge, the only known result in this direction is that the tree search problem is solvable in $O(|V| \log |V|)$ time for trees with degree at most 2 (paths).
We match the above complexity results with a tight algorithmic analysis. We first show that a natural greedy algorithm attains a 2-approximation. Furthermore, for the bounded degree instances, we show that any optimal strategy (i.e., one that minimizes the expected number of queries) performs at most $O(\Delta(T)(\log |V| + \log w(T)))$ queries in the worst case, where $w(T)$ is the sum of the likelihoods of the nodes of $T$ and $\Delta(T)$ is the maximum degree of $T$. We combine this result with a non-trivial exponential time algorithm to provide an FPTAS for trees with bounded degree.

1 Introduction

Searching is one of the fundamental problems in Computer Science and Discrete Mathematics. In his classical book [20], D. Knuth discusses many variants of the searching problem, most of them dealing with totally ordered sets. There has been some effort to extend the available techniques for searching and for other fundamental problems (e.g. sorting and selection) to handle more complex structures such as partially ordered sets [26, 11, 29, 28, 8]. Here, we focus on searching in structures that lie between totally ordered sets and the most general posets. We wish to efficiently locate a particular node in a tree.

More formally, as input we are given a tree $T = (V, E)$ which has a 'hidden' marked node and a function $w : V \to \mathbb{Z}^+$ that gives the likelihood of a node being the one marked. In order to discover which node of $T$ is marked, we can perform edge queries: after querying the edge $e \in E$ we receive an answer stating in which of the two connected components of $T \setminus e$ the marked node lies. To simplify our notation let us assume that our input tree $T$ is rooted at a node $r$ so that we can specify a query to an edge $e = uv$, with $u$ being the parent of $v$, by referring to $v$. A search strategy is a procedure that decides the next query to be posed based on the outcome of the previous queries.
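The edge-query model described above can be made concrete with a small sketch. It assumes the rooted tree is stored as a dictionary from each node to its list of children; the names `subtree_nodes`, `edge_query` and the toy tree are hypothetical and serve only as illustration.

```python
def subtree_nodes(children, v):
    """Return the set of nodes in the subtree rooted at v."""
    stack, seen = [v], set()
    while stack:
        x = stack.pop()
        seen.add(x)
        stack.extend(children.get(x, []))
    return seen


def edge_query(children, v, marked):
    """Answer the query on the edge (parent(v), v).

    True means the marked node lies in the component containing v,
    i.e. in the subtree rooted at v; False means it lies in the rest of the tree.
    """
    return marked in subtree_nodes(children, v)


# Toy rooted tree (assumed for illustration): r has children a and b; a has children c and d.
children = {"r": ["a", "b"], "a": ["c", "d"]}
answers = [
    edge_query(children, "a", "d"),  # d is below a
    edge_query(children, "b", "d"),  # d is not below b
    edge_query(children, "d", "d"),  # d is the marked node itself
]
```

Referring to an edge by its lower endpoint `v`, as the text suggests, keeps the oracle's interface down to a single node argument.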
Every search strategy for a tree $T = (V, E)$ (or for a forest) can be represented by a binary search (decision) tree $D$ such that a path from the root of $D$ to a leaf $\ell$ indicates which queries should be made at each step to discover that $\ell$ is the marked node. More precisely, a search tree for $T$ is a triple $D = (N, E', A)$, where $N$ and $E'$ are the nodes and edges of a binary tree and the assignment $A : N \to V$ satisfies the following properties: (a) for every node $v$ of $V$ there is exactly one leaf $\ell$ in $D$ such that $A(\ell) = v$; (b) [search property] if $v$ is in the right (left) subtree of $u$ in $D$ then $A(v)$ is (not) in the subtree of $T$ rooted at $A(u)$. For an example we refer to Figure 1.

Given a search tree $D$ for $T$, let $d(u, v)$ be the length (in number of edges) of the path from $u$ to $v$ in $D$. Then the cost of $D$, or alternatively the expected number of queries of $D$ (up to the normalizing factor $w(T)$), is given by

$$\mathrm{cost}(D) = \sum_{v \in \mathrm{leaves}(D)} d(\mathrm{root}(D), v)\, w(A(v)).$$

Therefore, our problem can be stated as follows: given a rooted tree $T = (V, E)$ with $|V| = n$ and a function $w : V \to \mathbb{Z}^+$, the goal is to compute a minimum cost search tree for $T$. This is a natural generalization of the problem of searching an element in a sorted list with non-uniform access probabilities.

The State of the Art. The variant of the problem in which the goal is to minimize the number of edge queries in the worst case, rather than minimizing the expected number of queries, has been studied in several recent papers [5, 29, 28]. It turns out that an optimal (worst-case) strategy can be found in linear time [28]. This is in great contrast with the state of the art (prior to this paper) about the average-case minimization we consider here. The known results amount to the $O(\log n)$-approximation obtained by Kosaraju et al.
[21], and Adler and Heeringa [2] for the much more general binary identification problem, and the constant factor approximation algorithm that two of the authors gave in [23]. However, the complexity of the average-case minimization of the tree search problem has so far remained unknown.

Our Results. We significantly narrow the gap of knowledge in the complexity landscape of the tree search problem under two different points of view. We prove that this problem is NP-complete even for the class of trees with diameter at most 4. This results in a complete characterization of the problem's complexity with respect to the parametrization in terms of the diameter. In fact, the problem can be shown to be polynomially solvable for the class of trees of diameter at most 3. We also show that the tree search problem under average minimization is NP-complete for trees of degree at most 16 (note that in any infinite class of trees either the diameter or the degree is non-constant). This substantially improves upon the state of the art, the only known result in this direction being an $O(n \log n)$ time solution [16, 14] for the class of trees with maximum degree 2. The hardness results are obtained by fairly involved reductions from the Exact 3-Set Cover (X3C) with multiplicity 3 [13].

In addition to the complexity results, we also significantly improve the previous known results from the algorithmic perspective. We first show that we can attain 2-approximation by a simple greedy approach that always seeks to divide the remaining tree as evenly as possible. For bounded-degree trees, we match the new hardness results with an FPTAS. In order to obtain the FPTAS, we first devise a non-trivial Dynamic Programming based algorithm that, roughly speaking, computes the best possible search tree, among the search trees with height at most $H$, in $O(n^2 2^H)$ time.
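As a rough illustration of the greedy approach mentioned above, the following sketch computes the cost of the search tree obtained by always cutting the edge that splits the remaining weight most evenly, and compares it with a brute-force optimum. This is only a sketch under stated assumptions: the tree is given as an undirected edge list, the cost is accumulated as the weight of the remaining component at every query (which equals the sum of leaf depth times weight), and all function names are hypothetical. The brute-force routine is exponential and meant for tiny instances only; it is not the dynamic program of the paper.

```python
def split(nodes, edges, cut):
    """Remove edge `cut` from the subtree induced by `nodes`; return both sides."""
    adj = {x: [] for x in nodes}
    for a, b in edges:
        if a in nodes and b in nodes and (a, b) != cut:
            adj[a].append(b)
            adj[b].append(a)
    side, stack = {cut[0]}, [cut[0]]
    while stack:
        for y in adj[stack.pop()]:
            if y not in side:
                side.add(y)
                stack.append(y)
    return frozenset(side), frozenset(nodes - side)


def weight(nodes, w):
    return sum(w[x] for x in nodes)


def all_splits(nodes, edges):
    """All ways to split the current component with a single edge query."""
    return [split(nodes, edges, e) for e in edges if e[0] in nodes and e[1] in nodes]


def greedy_cost(nodes, edges, w):
    """Cost of the strategy that always balances the two sides as evenly as possible."""
    if len(nodes) == 1:
        return 0
    a, b = min(all_splits(nodes, edges),
               key=lambda s: abs(weight(s[0], w) - weight(s[1], w)))
    # Every remaining candidate pays one query at this level.
    return weight(nodes, w) + greedy_cost(a, edges, w) + greedy_cost(b, edges, w)


def optimal_cost(nodes, edges, w):
    """Brute-force minimum cost over all search trees (tiny inputs only)."""
    if len(nodes) == 1:
        return 0
    return weight(nodes, w) + min(
        optimal_cost(a, edges, w) + optimal_cost(b, edges, w)
        for a, b in all_splits(nodes, edges))


# Toy instance (assumed for illustration).
edges = [(1, 2), (1, 3), (3, 4), (3, 5)]
w = {1: 1, 2: 5, 3: 1, 4: 2, 5: 3}
nodes = frozenset(w)
g, opt = greedy_cost(nodes, edges, w), optimal_cost(nodes, edges, w)
path_opt = optimal_cost(frozenset({1, 2, 3}), [(1, 2), (2, 3)], {1: 1, 2: 1, 3: 1})
```

Dividing either cost by the total weight gives the expected number of queries when the weights are read as unnormalized probabilities.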
Then, we show that every tree $T$ admits a minimum cost search tree whose height is $O(\Delta \cdot (\log n + \log w(T)))$, where $\Delta$ is the maximum degree of $T$ and $w(T)$ is the total weight of the nodes in $T$. This bound is of independent interest because the height of any search tree for a complete tree of degree $\Delta$ is $\Omega\left(\frac{\Delta}{\log \Delta} \log n\right)$. Furthermore, it allows us to execute the DP algorithm with $H = c \cdot \Delta \cdot (\log n + \log w(T))$, for a suitable constant $c$, obtaining a pseudo-polynomial time algorithm for trees with bounded degree. By scaling the weights $w$ in a fairly standard way we obtain the FPTAS.

The worst-case scenario has also been studied for the case where a question is posed to some node $u$ and the answer is either that $u$ is the marked node or in which connected component of the forest $T \setminus \{u\}$ the marked node lies [30, 29]. We remark that it is possible to adapt our techniques to prove that, for the average-case minimization, this "node query" variant of the tree search problem is also NP-complete; furthermore, we can provide for it a (degree independent) FPTAS. Due to the space constraints we have to defer these results to the full version of the paper.

Other Related Work. Besides the above mentioned papers, the worst-case version of searching in trees had already been studied and solved under a different name, one decade ago, as pointed out by Dereniowski [10]. That is because the problem of searching a node in a tree is equivalent to the problem of ranking the edges of a tree [19, 9, 25]. The problem studied here can also be seen as a particular case of the binary identification problem (BIP) [12]. Suppose we are given a set of elements $U = \{u_1, \ldots, u_n\}$, a set of tests $\{t_1, \ldots, t_m\}$, with $t_i \subseteq U$, a 'hidden' marked element and a likelihood function $w : U \to \mathbb{R}^+$. A test $t$ allows us to determine whether the marked element is in the set $t$ or in $U \setminus t$.
The BIP consists of defining a strategy (decision tree) that minimizes the (expected) number of tests to find the marked element. Both the average-case and the worst-case minimization are NP-complete [17], and neither of them admits an $o(\log n)$-approximation unless $P = NP$ [24, 7]. For both versions, simple greedy algorithms attain $O(\log n)$-approximation [21, 4, 2]. When we impose some structure on the set of tests we have interesting particular cases. If the set of tests consists of all the subsets of $U$ (i.e., $2^U$), then the strategy that minimizes the average cost is a Huffman tree. Let $G$ be a DAG with vertex set $U$. If the set of tests is $\{t_1, \ldots, t_n\}$, where $t_i = \{u_j \mid u_i \leadsto u_j \text{ in } G\}$ (that is, $u_j$ is reachable from $u_i$), then we have the problem of searching in a poset [27, 21, 6]. When $G$ is a directed path we have the alphabetic coding problem [16]. The problem we study here corresponds to the particular case where $G$ is a directed tree.

Applications. The problem of searching in posets (and in particular in trees) has practical applications in file system synchronization and software testing according to [5, 28]. Strategies for searching in trees also have potential application to asymmetric communication protocols [1, 3, 15, 22, 31]. In this scenario, a client has to send a binary string $x \in \{0, 1\}^t$ to the server. The string $x$ is drawn from a probability distribution $\mathcal{D}$ only available to the server. The asymmetry comes from the client having much larger bandwidth for downloading than for uploading. In order to benefit from this discrepancy, both parties agree on a protocol to exchange bits until the server learns the string $x$, trying to minimize the number of bits sent by the client (though other factors, e.g., the number of rounds, should also be taken into account). In one of the first protocols [3, 22], at each round the server sends a binary string $y$ and the client replies with a 0 or 1 depending on whether $y$ is a prefix of $x$ or not.
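The round structure of such a prefix-query protocol can be sketched as follows. This is a minimal simulation under stated assumptions: the client answers 1 exactly when $y$ is a prefix of $x$ (the text does not fix the encoding), and the server uses a naive strategy that ignores the distribution and extends the known prefix by one bit per round. The names `client_reply` and `naive_server` are hypothetical.

```python
def client_reply(x, y):
    """Client's one-bit answer: 1 if y is a prefix of x, else 0 (encoding assumed)."""
    return 1 if x.startswith(y) else 0


def naive_server(t, ask):
    """Learn a t-bit string by extending a known prefix one bit per round.

    `ask(y)` sends y to the client and returns the one-bit reply.
    A distribution-aware server could send longer guesses to save rounds.
    """
    prefix = ""
    while len(prefix) < t:
        candidate = prefix + "0"
        prefix = candidate if ask(candidate) == 1 else prefix + "1"
    return prefix


x = "1011"       # the client's hidden string (toy value)
rounds = []      # every string y the server sends


def ask(y):
    rounds.append(y)
    return client_reply(x, y)


learned = naive_server(len(x), ask)
```

In this naive version the client uploads exactly one bit per round, so $t$ bits in total; the point of a better strategy is to exploit the distribution to reduce that number in expectation.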
Based on the client's answer, the server updates his knowledge about $x$ and sends another string if he has not learned $x$ yet. This protocol corresponds to a strategy for searching a marked leaf in a complete binary tree of height $t$, where only the leaves have non-zero probability. In fact, the binary strings in $\{0, 1\}^t$ can be represented by a complete binary tree of height $t$ where every edge that connects a node to its left (right) child is labeled with 0 (1). This gives a 1-1 correspondence between nonempty binary strings of length at most $t$ and edges of the tree, and the message $y$ sent by the server naturally corresponds to an edge query.

2 Hardness

In this section we shall prove that the tree search problem defined above is NP-complete. We shall use a reduction from the Exact 3-Set Cover problem with multiplicity bounded by 3, i.e., each element of the ground set can appear in at most 3 sets. An instance of the 3-bounded Exact 3-Set Cover problem (X3C) is defined by: (a) a set $U = \{u_1, \ldots, u_n\}$, with $n = 3k$ for some $k \geq 1$; (b) a family $\mathcal{X} = \{X_1, \ldots, X_m\}$ of subsets of $U$, such that $|X_i| = 3$ for each $i = 1, \ldots, m$ and, for each $j = 1, \ldots, n$, the element $u_j$ appears in at most 3 sets of $\mathcal{X}$. Given an instance $I = (U, \mathcal{X})$, the X3C problem is to decide whether $\mathcal{X}$ contains a partition of $U$, i.e., whether there exists a family $\mathcal{C} \subseteq \mathcal{X}$ such that $|\mathcal{C}| = k$ and $\bigcup_{X \in \mathcal{C}} X = U$. This problem is well known to be NP-complete [13].

For our reduction it will be crucial to define an order among the sets of the family $\mathcal{X}$. Any total order $<$ on $U$, say $u_1 < u_2 < \cdots < u_n$, can be extended to a total order $\prec$ on $\mathcal{X} \cup U$ by stipulating that: (a) for any $X = \{x_1, x_2, x_3\}$, $Y = \{y_1, y_2, y_3\} \in \mathcal{X}$ (with $x_1 < x_2 < x_3$ and $y_1 < y_2 < y_3$) the relation $X \prec Y$ holds if and only if the sequence $x_3 x_2 x_1$ is lexicographically smaller than $y_3 y_2 y_1$; (b) for every $j = 1, \ldots, n$, the relation $u_j \prec X$ holds if and only if the sequence $u_j u_1 u_1$ is lexicographically smaller than $x_3 x_2 x_1$.

Assume an order $<$ on $U$ has been fixed and $\prec$ is its extension to $U \cup \mathcal{X}$, as defined above. We denote by $\Pi = (\pi_1, \ldots, \pi_{n+m})$ the sequence of elements of $U \cup \mathcal{X}$ sorted in increasing order according to $\prec$. From now on, w.l.o.g., we assume that according to $<$ and $\prec$, it holds that $u_1 < \cdots < u_n$ and $X_1 \prec \cdots \prec X_m$. For each $i = 1, \ldots, m$, we shall denote the elements of $X_i$ by $u_{i_1}, u_{i_2}, u_{i_3}$ so that $u_{i_1} < u_{i_2} < u_{i_3}$.

Example 1. Let $U = \{a, b, c, d, e, f\}$, and $\mathcal{X} = \{\{a, b, c\}, \{b, c, d\}, \{d, e, f\}, \{b, e, f\}\}$. Then, fixing the standard alphabetical order among the elements of $U$, we have that the sets of $\mathcal{X}$ are ordered as follows: $X_1 = \{a, b, c\}$, $X_2 = \{b, c, d\}$, $X_3 = \{b, e, f\}$, $X_4 = \{d, e, f\}$. Then, we have $\Pi = (\pi_1, \ldots, \pi_{10}) = (a, b, c, X_1, d, X_2, e, f, X_3, X_4)$.

Because of the orders we fixed and the fact that each element of $U$ appears in at most 3 sets of $\mathcal{X}$, it follows that we cannot have more than three sets of $\mathcal{X}$ appearing consecutively in $\Pi$. This will be important to prove the hardness for bounded degree instances.

We shall first show a polynomial time reduction that maps any instance $I = (U, \mathcal{X})$ of 3-bounded X3C to an instance $I' = (T, w)$ of the tree search problem, such that $T$ has diameter 4 but unbounded degree. We will then modify this reduction and show hardness for the bounded case too.

The structure of the tree $T$. The root of $T$ is denoted by $r$. For each $i = 1, \ldots, m$ the set $X_i \in \mathcal{X}$ is mapped to a tree $T_i$ of height 1, with root $r_i$ and leaves $t_i, s_{i1}, s_{i2}, s_{i3}$. In particular, for $j = 1, 2, 3$, we say that $s_{ij}$ is associated with the element $u_{i_j}$. We make each $r_i$ a child of $r$. For $i = 1, \ldots, m$, we also create four leaves $a_{i1}, a_{i2}, a_{i3}, a_{i4}$ and make them children of the root $r$. We also define $\tilde{X}_i = \{t_i, s_{i1}, s_{i2}, s_{i3}, a_{i1}, \ldots, a_{i4}\}$ to be the set of leaves of $T$ associated with $X_i$. For the example given above, the corresponding tree is given in Figure 2.

The weights of the nodes of $T$. Only the leaves of $T$ will have non-zero weight, i.e., we set $w(r) = w(r_1) = \cdots = w(r_m) = 0$. While defining the weight of the leaves of $T$ it will be useful to assign weight also to each $u \in U$. In particular, our weight assignment will be such that each leaf in $T$ which is associated with an element $u$ will be assigned the same weight we assign to $u$. Also, when we fix the weight of $u$ we shall understand that we are fixing the weight of all leaves in $T$ associated with $u$. We extend the function $w()$ to sets, so the weight of a set is the total weight of its elements. Also we define the weight of a tree as the total weight of its nodes.

The weights will be set in order to force any optimal search tree for $(T, w)$ to have a well-defined structure. The following notions of Configuration and Realization will be useful to describe such a structure of an optimal search tree. In describing the search tree we shall use $q_\nu$ to denote the node in the search tree under consideration that represents the question about the node $\nu$ of the input tree $T$. Moreover, we shall in general only be concerned with the part of the search tree meant to identify the nodes of $T$ of non-zero weight. It should be clear that the search tree can be easily completed by appending the remaining queries at the bottom.

Definition 1. Given leaves $\ell_1, \ldots, \ell_h$ of $T$, a sequential search tree for $\ell_1, \ldots, \ell_h$ is a search tree of height $h$ whose left path is $q_{\ell_1}, \ldots, q_{\ell_h}$. This is the strategy that asks about one leaf after another until they have all been considered.
See Figure 3 (a) for an example.

Configurations, and Realizations of $\Pi$. For each $i = 1, \ldots, m$, let $D^A_i$ be the search tree with root $q_{r_i}$, with right subtree being the sequential search tree for $t_i, s_{i3}, s_{i2}, s_{i1}$, and left subtree being a sequential search tree for (some permutation of) $a_{i1}, \ldots, a_{i4}$. We also refer to $D^A_i$ as the A-configuration for $\tilde{X}_i$. Moreover, let $D^B_i$ be the search tree with root $q_{t_i}$ and left subtree being a sequential search tree for (some permutation of) $a_{i1}, \ldots, a_{i4}$. We say that $D^B_i$ is the B-configuration for $\tilde{X}_i$. See Figure 3 (b)-(c).

Definition 2. Given two search trees $T_1, T_2$, the extension of $T_1$ with $T_2$ is the search tree obtained by appending the root of $T_2$ to the leftmost leaf of $T_1$. The extension of $T_1$ with $T_2$ is a new search tree that "acts" like $T_1$ and in case of all NO answers continues following the strategy represented by $T_2$.

Definition 3. A realization (of $\Pi$) with respect to $\mathcal{Y} \subseteq \mathcal{X}$ is a search tree for $(T, w)$ defined recursively as follows (see footnote 1): For each $i = 1, \ldots, n + m$, a realization of $\pi_i \pi_{i+1} \ldots \pi_{n+m}$ is an extension of the realization of $\pi_{i+1} \ldots \pi_{n+m}$ with another tree $T'$ chosen according to the following two cases:
Case 1. If $\pi_i = u_j$, for some $j = 1, \ldots, n$, then $T'$ is a (possibly empty) sequential search tree for the leaves of $T$ that are associated with $u_j$ and are not queried in the realization of $\pi_{i+1} \ldots \pi_{n+m}$.
Case 2. If $\pi_i = X_j$, for some $j = 1, \ldots, m$, then $T'$ is either $D^B_j$ or $D^A_j$ according as $X_j \in \mathcal{Y}$ or not.

We denote by $D^A$ the realization of $\Pi$ w.r.t. the empty family, i.e., $\mathcal{Y} = \emptyset$. Figure 4 shows some of the realizations for the Example 1 above. We are going to set the weights in such a way that every optimal solution is a realization of $\Pi$ w.r.t. some $\mathcal{Y} \subseteq \mathcal{X}$ (our Lemma 1).
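Since realizations are built along the sequence $\Pi$, it may help to see how $\Pi$ is obtained mechanically. The sketch below sorts the elements and sets of Example 1 by the keys used in the definition of $\prec$: an element $u_j$ gets the key $u_j u_1 u_1$, and a set gets its elements in decreasing order. The function name `build_pi` and the representation by ranks are assumptions made for illustration only.

```python
def build_pi(universe, family):
    """Sort U and the 3-element sets by the extended order described in the text.

    `universe` lists the elements in increasing order; `family` is a list of sets.
    """
    rank = {u: i for i, u in enumerate(universe, start=1)}
    # Element u_j is compared through the sequence (u_j, u_1, u_1).
    keyed = [((rank[u], 1, 1), u) for u in universe]
    for s in family:
        # A set {x1 < x2 < x3} is compared through the sequence (x3, x2, x1).
        key = tuple(sorted((rank[x] for x in s), reverse=True))
        keyed.append((key, frozenset(s)))
    keyed.sort(key=lambda pair: pair[0])  # keys are pairwise distinct
    return [obj for _, obj in keyed]


U = ["a", "b", "c", "d", "e", "f"]
X = [{"a", "b", "c"}, {"b", "c", "d"}, {"d", "e", "f"}, {"b", "e", "f"}]
pi = build_pi(U, X)
# Render every entry as a string: single letters for elements, e.g. "abc" for a set.
labels = ["".join(sorted(p)) for p in pi]
```

With the sets named as in Example 1, the resulting order reads $a, b, c, X_1, d, X_2, e, f, X_3, X_4$.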
Moreover, such weights will allow us to discriminate between the cost of solutions that are realizations w.r.t. an exact cover for the X3C instance and the cost of any other realization of $\Pi$.

Let $D^*$ be an optimal search tree and $\mathcal{Y}$ be such that $D^*$ is a realization of $\Pi$ w.r.t. $\mathcal{Y}$ (see footnote 2). In addition, for each $u \in U$ define $W_u = \sum_{\ell : X_\ell \prec u} w(\tilde{X}_\ell)$. It is not hard to see that the difference between the cost of $D^A$ and $D^*$ can be expressed as follows:

$$\mathrm{cost}(D^A) - \mathrm{cost}(D^*) = \sum_{X_i \in \mathcal{Y}} \left( w(t_i) - \left(W_{u_{i_1}} + W_{u_{i_2}} + W_{u_{i_3}}\right) - \sum_{j=1}^{3} d^B_A(q_{s_{ij}})\, w(u_{i_j}) \right), \tag{1}$$

where $d^B_A(q_{s_{ij}})$ is the difference between the level of the node $q_{s_{ij}}$ in $D^*$ and the level of $q_{s_{ij}}$ in a realization of $\Pi$ w.r.t. $\mathcal{Y} \setminus \{X_i\}$. To see this, imagine turning $D^A$ into $D^*$ one step at a time, each step being the change of configuration from A to B for a set of leaves $\tilde{X}_i$ such that $X_i \in \mathcal{Y}$. Such a step implies: (a) moving the question $q_{s_{ij}}$ exactly $d^B_A(q_{s_{ij}})$ levels down, so increasing the cost by $d^B_A(q_{s_{ij}})\, w(u_{i_j})$; (b) because of (a), all the questions that were below the level where $q_{s_{ij}}$ is moved are also moved down one level. This additional increase in cost is accounted for by the $W_{u_{i_j}}$'s; (c) moving one level up the question about $t_i$, so gaining cost $w(t_i)$. We will define the weight of $t_i$ in order to compensate the increase in cost (a)-(b) due to the relocation of $q_{s_{ij}}$, and to provide some additional gain only when $\mathcal{Y}$ is an exact cover.

In general, the value of $d^B_A(q_{s_{ij}})$ depends on the structure of the realization for $\mathcal{Y} \setminus \{X_i\}$; in particular, on the length of the sequential search trees for the leaves associated with the $u_\kappa$'s that appear in $\Pi$ between $X_i$ and $u_{i_j}$. However, when $\mathcal{Y}$ is an exact cover, each such sequential search tree has length one. A moment's reflection shows that in this case $d^B_A(q_{s_{ij}}) = \gamma(i, j)$, where, for each $i = 1, \ldots, m$ and $j = 1, 2, 3$, we define

$$\gamma(i, j) = j - 5 + |\{u_\kappa : u_{i_j} \prec u_\kappa \prec X_i\}| + 5 \cdot |\{X_\kappa : u_{i_j} \prec X_\kappa \preceq X_i\}|.$$

[Footnote 1: For sake of definiteness we set $\pi_{m+n+1} = \emptyset$ and the realization of $\pi_{n+m+1}$ w.r.t. $\mathcal{Y}$ to be the empty tree.]
[Footnote 2: The existence of such a $\mathcal{Y}$ will be guaranteed by Lemma 1.]

To see this, assume that $\mathcal{Y}$ is an exact cover. Let $D'$ be the realization for $\mathcal{Y} \setminus \{X_i\}$, and $\ell$ be the level of the root of the A-configuration for $\tilde{X}_i$ in $D'$. The node $q_{s_{ij}}$ is at level $\ell + (5 - j)$ in $D'$. In $D^*$, the root of the B-configuration for $\tilde{X}_i$ is also at level $\ell$. Also, in $D^*$, between level $\ell$ and the level of $q_{s_{ij}}$, there are only nodes associated with elements of some $\pi_\kappa$ such that $u_{i_j} \prec \pi_\kappa \preceq X_i$. Precisely, there is 1 level per each $u_\kappa$ such that $u_{i_j} \prec u_\kappa \prec X_i$ (corresponding to the sequential search tree for the only leaf associated with $u_\kappa$); and 5 levels per each $X_\kappa$ such that $u_{i_j} \prec X_\kappa \preceq X_i$ (corresponding to the left path of the A- or B-configuration for $\tilde{X}_\kappa$). In total, the difference between the levels of $q_{s_{ij}}$ in $D'$ and $D^*$ is exactly $\gamma(i, j)$. Note that $\gamma(i, j)$ is still well defined even if there is not an exact cover $\mathcal{Y} \subseteq \mathcal{X}$. This quantity will be used to define $w(t_i)$.

We are now ready to provide the precise definition of the weight function $w$. We start with $w(u_1) = 1$. Then, we fix the remaining weights inductively, using the sequence $\Pi$ in the following way: let $i > 1$ and assume that for each $i' < i$ the weights of all leaves associated with $\pi_{i'}$ have been fixed (see footnote 3). We now proceed according to the following two cases:
Case 1. $\pi_i = u_j$, for some $j \in \{1, \ldots, n\}$. Then, we set $w(u_j) = 1 + 6 \max\{|T|^3 w(u_{j-1}), W_{u_j}\}$, where $|T|$ denotes the number of nodes of $T$.
Case 2. $\pi_i = X_j$, for some $j \in \{1, \ldots, m\}$.
Note that in this case the weights of the leaves $s_{j1}, s_{j2}, s_{j3}$ have already been fixed, respectively to $w(u_{j_1})$, $w(u_{j_2})$, and $w(u_{j_3})$. This is because we fix the weights following the sequence $\Pi$ and we have $u_{j_1} \prec u_{j_2} \prec u_{j_3} \prec X_j$. In order to define the weights of the remaining elements in $\tilde{X}_j$ we set $w(a_{j1}) = \cdots = w(a_{j4}) = W_{u_{j_1}} + W_{u_{j_2}} + W_{u_{j_3}} + \sum_{\kappa=1}^{3} \gamma(j, \kappa)\, w(u_{j_\kappa})$. Finally, we set $w(t_j) = w(a_{j1}) + w(X_j)/2$.

Remark 1. For each $i = 1, \ldots, n + m$, let $w(\pi_i)$ denote the total weight of the leaves associated with $\pi_i$. It is not hard to see that $w(\pi_i) = O(|T|^{3i})$. Therefore we have that the maximum weight is not larger than $w(\pi_{m+n}) = O(|T|^{3(m+n)})$. It follows that we can encode all the weights using $O(3|T|(n + m) \log |T|)$ bits, hence the size of the instance $(T, w)$ is polynomial in the size of the X3C instance $I = (U, \mathcal{X})$.

Since $t_m$ is the heaviest leaf, one can show that in an optimal search tree $D^*$ the root can only be $q_{t_m}$ or $q_{r_m}$. For otherwise moving one of these questions closer to the root of $D^*$ results in a tree with smaller cost, violating the optimality of $D^*$. Moreover, by a similar "exchange" argument it follows that if $q_{r_m}$ is the root of $D^*$ then the right subtree must coincide with a sequential search tree for $t_m, s_{m3}, s_{m2}, s_{m1}$ and the left subtree of $q_{r_m}$ must be a sequential tree for $a_{m1}, \ldots, a_{m4}$. Therefore the top levels of $D^*$ coincide either with $D^A_m$ or with $D^B_m$, or equivalently they are a realization of $\pi_{m+n}$. Repeating the same argument on the remaining part of $D^*$ we have the following (the complete proof is in the appendix):

Lemma 1. Any optimal search tree for the instance $(T, w)$ is a realization of $\Pi$ w.r.t. some $\mathcal{Y} \subseteq \mathcal{X}$.

Recall now the definition of the search tree $D^A$. Let $D^*$ be an optimal search tree for $(T, w)$.
Let $\mathcal{Y} \subseteq \mathcal{X}$ be such that $D^*$ is a realization of $\Pi$ w.r.t. $\mathcal{Y}$. Equation (1) and the definition of $w(t_i)$ yield

$$\mathrm{cost}(D^A) - \mathrm{cost}(D^*) = \sum_{X_i \in \mathcal{Y}} \left( \frac{w(X_i)}{2} + \sum_{j=1}^{3} \left(\gamma(i, j) - d^B_A(q_{s_{ij}})\right) w(u_{i_j}) \right) = \sum_{j=1}^{n} \sum_{\substack{X_i \in \mathcal{Y} \\ u_j \in X_i}} \left( \frac{w(u_j)}{2} + \Gamma(i, j)\, w(u_j) \right), \tag{2}$$

where $\Gamma(i, j) = \gamma(i, \kappa) - d^B_A(q_{s_{i\kappa}})$, and $\kappa \in \{1, 2, 3\}$ is such that $u_{i_\kappa} = u_j$ (that is, $s_{i\kappa}$ is the leaf of $T_i$ associated with $u_j$). By definition, if for each $j = 1, \ldots, n$ there exists exactly one $X_i \in \mathcal{Y}$ such that $u_j \in X_i$, then we have $\Gamma(i, j) = 0$. Therefore, equation (2) evaluates exactly to $\sum_{j=1}^{n} \frac{w(u_j)}{2}$. Conversely, we can prove that this never happens when for some $1 \leq j \leq n$, $u_j$ appears in none or in more than one of the sets in $\mathcal{Y}$. For this we use the exponential (in $|T|$) growth of the weights $w(u_j)$ and the fact that in such case the inner sum of the last expression in (2) is non-positive. In conclusion we have the following result, whose complete proof is in the appendix.

[Footnote 3: By the leaves associated with $\pi_{i'}$ we mean the leaves in $\tilde{X}_j$, if $\pi_{i'} = X_j$ for some $X_j \in \mathcal{X}$, or the leaves associated with $u$ if $\pi_{i'} = u$ for some $u \in U$.]

Lemma 2. Let $D^*$ be an optimal search tree for $(T, w)$. Let $\mathcal{Y} \subseteq \mathcal{X}$ be such that $D^*$ is a realization of $\Pi$ w.r.t. $\mathcal{Y}$. We have that $\mathrm{cost}(D^*) \leq \mathrm{cost}(D^A) - \frac{1}{2} \sum_{u \in U} w(u)$ if and only if $\mathcal{Y}$ is a solution for the X3C instance $I = (U, \mathcal{X})$.

The NP-completeness of 3-bounded X3C [13], Remark 1, and Lemma 2 imply the following.

Theorem 1. The search tree problem is NP-complete in the class of trees of diameter at most 4.

Note that this result is tight. In fact, for trees of diameter at most 3 the problem is polynomially solvable, e.g., via dynamic programming (see Appendix).

NP-completeness for bounded-degree instances. We can adapt our proof to show that the search tree problem is NP-complete also for bounded-degree trees. For that, we modify the input tree as follows.
We partition the subsets of $\mathcal{X}$ so that sets that are adjacent in $\Pi$ are put together. For the instance in Example 1 the corresponding partition would be $\{\{X_1\}, \{X_2\}, \{X_3, X_4\}\}$. Let $\mathcal{Z} = \{Z_1, \ldots, Z_p\}$ be the partition obtained from the input instance $(U, \mathcal{X})$. Recall the definitions of the subtrees $T_j$ and the leaves $a_{j1}, \ldots, a_{j4}$ ($j = 1, \ldots, m$) given for the construction of the tree $T$. We now create a new tree $T^b$ as follows. For each $i = 1, \ldots, p$, in $T^b$ there is a subtree $H_i$ that corresponds to the element $Z_i \in \mathcal{Z}$. $H_i$ has root $h_i$. For each $j$ such that $X_j \in Z_i$ we make the root of $T_j$, i.e., $r_j$, and the leaves $a_{j1}, \ldots, a_{j4}$ children of $h_i$. Finally, we create nodes $z_1, \ldots, z_p$ and make $h_1$ a child of $z_1$, and for $i = 2, \ldots, p$ we make $z_{i-1}$ and $h_i$ children of $z_i$. See Fig. 5 for the tree $T^b$ corresponding to the instance in Example 1.

The fact that in $\Pi$ there are no more than three elements of $\mathcal{X}$ which appear consecutively implies that any $Z_i$ contains at most three elements. This gives that the maximum degree in $T^b$ is at most 16.

Regarding the weight function, we extend to $T^b$ the weight function defined for the tree $T$ by setting $w(h_i) = w(z_i) = 0$, for each $i = 1, \ldots, p$, and leaving the other weights as before. It turns out that Lemma 1 still holds for the new instance $(T^b, w)$. In fact, in each subtree $H_i$ the structure of the instance is exactly the same as in the tree $T$, so one can prove that any optimal solution for such subinstance is a realization of the corresponding subsequence of $\Pi$. Moreover, because of the way we partitioned $\mathcal{X}$, and the weight function $w$, it follows that the smallest weight of an $a_{jk}$ in $Z_i$ is bigger than the total weight of the leaves in $Z_1, \ldots, Z_{i-1}$. This is enough to enforce the order of a realization of $\Pi$, i.e., that the leaves $t_j, a_{j1}, \ldots, a_{j4}$ are queried before the leaves in $Z_1, \ldots, Z_{i-1}$. We have proved the following (a formal proof is in the appendix).

Lemma 3. Any optimal search tree for the instance $(T^b, w)$ is a realization of $\Pi$ w.r.t. some $\mathcal{Y} \subseteq \mathcal{X}$.

By using this lemma together with Lemma 2 we have that Theorem 1 holds also for bounded-degree instances of the tree search problem.

3 Approximation Algorithms

We need to introduce some notation. For any forest $F$ of rooted trees and node $j \in F$, we denote by $F_j$ the subtree of $F$ composed by $j$ and all of its descendants. We denote the root of a tree $T$ by $r(T)$, $\delta(u)$ denotes the number of children of $u$, and $c_i(u)$ is used to denote the $i$-th child of $u$ according to some arbitrarily fixed order.

The following operation will be useful for modifying search trees: Given a search tree $D$ and a node $u \in D$, a left deletion of $u$ is the operation that transforms $D$ into a new search tree by removing both $u$ and its left subtree from $D$ and, then, by connecting the right subtree of $u$ to the parent of $u$ (if it exists). A right deletion is analogously defined. Given a search tree $D$ for $T$, we use $l_u$ to denote the leaf of $D$ assigned to node $u$ of $T$.

3.1 The natural greedy algorithm attains 2-approximation

Consider a search tree $D$ for $T$. Notice that when we follow a path from the root of $D$ to one of its leaves, we reduce the search space (eliminate part of $T$) whenever we visit a new node. Therefore, we can associate with each node of $D$ the subtree of $T$ which may still contain the node we search for. Notice that the tree $T'$ associated with node $v \in D$ is exactly the one induced by the nodes of $T$ that correspond to the leaves of $D_v$, hence $w(T') = w(D_v)$. E.g., in Fig. 1 the node $\langle f \rangle$ in $D$ is associated with $T_d$. We can transform a search tree $D$ for $T$ into a search tree $D'$ for an arbitrary subtree $T'$ of $T$.
This search tree $D'$ is computed by taking each node $v \in D$ assigned to a node $A(v)$ in $T - T'$ and applying a left deletion if $A(v)$ is an ancestor of $r(T')$ or a right deletion otherwise. The important property of this construction is that the path from $r(D')$ to $l_x$, for every $x \in T'$, is exactly the subpath obtained by removing all queries to nodes in $T - T'$ from the path from $r(D)$ to $l_x$. The next lemma formalizes this discussion:

Lemma 4. Consider a tree $T$ and a search tree $D$ for it. Let $T'$ be a subtree of $T$. Then there is a search tree $D'$ for $T'$ such that $d(r(D'), l_x) = d(r(D), l_x) - n_x$, where $n_x$ is the number of nodes in the path from $r(D)$ to $l_x$ assigned to nodes in $T - T'$.

We show that the natural greedy algorithm guarantees an approximation factor of 2. The algorithm can be formulated in two sentences. (1) Let $x$ be a node such that $|w(T_x) - w(T \setminus T_x)|$ is minimized. Set $A(r(D)) = x$. (2) Construct the right and left subtree of $D$ by recursively applying the algorithm to $T_x$ and $T \setminus T_x$, respectively.

In order to prove that this algorithm results in a 2-approximation, we show that any search tree $D^*$ can be turned into the greedy search tree $D$ while the cost increases by at most $\mathrm{cost}(D^*)$. The proof is by induction on the number of nodes $n$ of the input tree $T$. For the base case $n = 1$ there is nothing to show. Assume that the claim holds for any tree with at most $n - 1$ nodes. In order to prove it true for $T$ we proceed in two steps. Let $x$ be the node queried at the root of $D$. Also let $D^*_0$ (resp. $D_0$) and $D^*_1$ (resp. $D_1$) be the search trees for $T_x$ and $T \setminus T_x$ obtained from $D^*$ (resp. $D$) via Lemma 4. (a) Construct a search tree $D'$ with $A(r(D')) = x$ and the left and right subtree being $D^*_1$ and $D^*_0$ respectively. It is not hard to see that $D'$ is a legal search tree.
(b) Use the induction hypothesis to turn $D^*_0$ and $D^*_1$ into $D_0$ and $D_1$, respectively. It is straightforward to see that the transformation results in the tree $D$.

Lemma 5. We have $\mathrm{cost}(D') \le \mathrm{cost}(D^*) + w(T)/2$.

Proof sketch. Let $x$ and $x^*$ be the nodes queried at the roots of $D'$ and $D^*$, respectively. W.l.o.g. we assume $x \neq x^*$, as otherwise the lemma trivially holds. We can also assume that $x^*$ is a node of $T_x$, because the opposite case is analyzed analogously. We first analyze the case $w(T_x) \le w(T - T_x)$, i.e., $w(T_x) \le w(T)/2$. As any path from $r(D^*)$ to a leaf in $D^*$ contains $r(D^*)$, and $T - T_x$ does not contain $x^*$, Lemma 4 implies that the depth of any leaf in $D^*_1$ is smaller by at least one than its depth in $D^*$. The lemma also implies that the depth of any leaf in $D^*_0$ is not greater than its depth in $D^*$. So we have

$$\mathrm{cost}(D') = w(T) + \mathrm{cost}(D^*_0) + \mathrm{cost}(D^*_1) \le w(T) + \mathrm{cost}(D^*) - w(T - T_x) \le \mathrm{cost}(D^*) + w(T)/2.$$

The case $w(T_x) > w(T - T_x)$ requires a more involved analysis, and we defer it to the appendix due to space limitations.

It follows that the cost of $D$ can be bounded from above by

$$\mathrm{cost}(D) = w(T) + \mathrm{cost}(D_0) + \mathrm{cost}(D_1) \le w(T) + 2\,\mathrm{cost}(D^*_0) + 2\,\mathrm{cost}(D^*_1) = 2\,\mathrm{cost}(D') - w(T) \le 2\,\mathrm{cost}(D^*).$$

The first inequality follows from the induction hypothesis and the second one is due to Lemma 5. We have proven the following result.

Theorem 2. The greedy strategy is a polynomial 2-approximation algorithm for the tree search problem.

3.2 An FPTAS for Searching in Bounded-Degree Trees

We now present an FPTAS for searching in trees with bounded degree. First, we devise a dynamic programming algorithm whose running time is exponential in the height of optimal search trees.
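
As an aside on Section 3.1, the greedy rule analyzed there can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the representation (a `children` dictionary for the rooted tree, a weight dictionary `w`, nested tuples for the search tree) and the name `greedy_search_tree` are introduced here for illustration, ties are broken by node name, and the root is excluded as a candidate because querying it would not split the tree. The cost is accumulated through the recurrence cost(D) = w(T) + cost(left subtree) + cost(right subtree).

```python
from typing import Dict, FrozenSet, List


def greedy_search_tree(children: Dict[str, List[str]], w: Dict[str, int], root: str):
    """Return (search_tree, cost) for the greedy strategy.

    Inner nodes of the search tree are tuples (x, left, right), where the
    right subtree searches T_x and the left subtree searches T \\ T_x;
    leaves are node names. All nodes of the tree must appear as keys of w.
    """

    def subtree(v: str, alive: FrozenSet[str]) -> FrozenSet[str]:
        # Nodes of T_v restricted to the part of T still under consideration.
        nodes = {v}
        for c in children.get(v, []):
            if c in alive:
                nodes |= subtree(c, alive)
        return frozenset(nodes)

    def solve(r: str, alive: FrozenSet[str]):
        if len(alive) == 1:
            return r, 0  # a single candidate: a leaf at relative depth 0
        total = sum(w[v] for v in alive)
        best_x, best_sub, best_gap = None, None, None
        for x in sorted(alive - {r}):
            sub = subtree(x, alive)
            # |w(T_x) - w(T \ T_x)| = |2 w(T_x) - w(T)|
            gap = abs(2 * sum(w[v] for v in sub) - total)
            if best_gap is None or gap < best_gap:
                best_x, best_sub, best_gap = x, sub, gap
        right, cost_right = solve(best_x, best_sub)
        left, cost_left = solve(r, alive - best_sub)
        # Every remaining candidate pays one more query at this level.
        return (best_x, left, right), total + cost_left + cost_right

    return solve(root, frozenset(w))


# Path a - b - c rooted at a, with most of the weight on c.
tree, cost = greedy_search_tree({"a": ["b"], "b": ["c"]}, {"a": 1, "b": 1, "c": 4}, "a")
print(tree, cost)  # ('c', ('b', 'a', 'b'), 'c') 8
```

On this small path the heavy node `c` is separated by the first query, so it sits at depth 1 while `a` and `b` sit at depth 2, giving cost 4 + 2 + 2 = 8. The sketch recomputes subtree weights at every level and is therefore quadratic in the worst case; it is meant to mirror the two-sentence description, not to be efficient.
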
Then we essentially argue that the height of optimal search trees is $O(\Delta(T) \cdot (\log w(T) + \log n))$; thus the previous algorithm has a pseudo-polynomial running time. Finally, we employ a standard scaling technique to obtain an FPTAS.

We often construct a search tree starting with its 'left part'. In order to formally describe such constructions, we define a left path as an ordered path where every node has only a left child. In addition, the left path of an ordered tree $T$ is defined as the ordered path we obtain when we traverse $T$ by only going to the left child, until we reach a node which does not have a left child.

A dynamic programming algorithm. In order to find an optimal search tree efficiently, we need to define a family of auxiliary problems denoted by $\mathcal{P}^B(F, P)$. In the following paragraphs we describe the essential structures needed in these subproblems and then we show how to use the subproblems to find an optimal search tree.

First we introduce the concept of an extended search tree, which is basically a search tree with some extra nodes that have not been associated with a query yet (unassigned nodes) and some other nodes that cannot be associated with a query (blocked nodes).

Definition 4. An extended search tree (EST) for a forest $F = (V, E)$ is a triple $D = (N, E', A)$, where $N$ and $E'$ are the nodes and edges of an ordered binary tree and the assignment $A : N \to V \cup \{\text{blocked}, \text{unassigned}\}$ simultaneously satisfies the following properties:
(a) For every node $v$ of $F$, $D$ contains both a leaf $\ell$ and an internal node $u$ such that $A(\ell) = A(u) = v$;
(b) For all $u, v \in D$ with $A(u), A(v) \in F$, the following holds: if $v$ is in the right subtree of $u$ then $A(v) \in F_{A(u)}$; if $v$ is in the left subtree of $u$ then $A(v) \notin F_{A(u)}$;
(c) If $u$ is a node in $D$ with $A(u) \in \{\text{blocked}, \text{unassigned}\}$, then $u$ does not have a right child.
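
Definition 4 can be checked mechanically on small examples. The following Python sketch is illustrative only: the `Node` class, the string labels `"blocked"` and `"unassigned"`, the `children` dictionary describing the forest, and the function names are assumptions introduced here and do not come from the paper. It tests properties (a), (b) and (c) directly, without any attempt at efficiency.

```python
from dataclasses import dataclass
from typing import Dict, Iterator, List, Optional, Set

BLOCKED, UNASSIGNED = "blocked", "unassigned"


@dataclass
class Node:
    label: str                       # a forest node name, BLOCKED or UNASSIGNED
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def walk(d: Optional[Node]) -> Iterator[Node]:
    """Yield every node of the binary tree rooted at d (nothing if d is None)."""
    if d is not None:
        yield d
        yield from walk(d.left)
        yield from walk(d.right)


def forest_subtree(children: Dict[str, List[str]], v: str) -> Set[str]:
    """Return the node set of F_v: v together with all of its descendants."""
    out, stack = set(), [v]
    while stack:
        x = stack.pop()
        out.add(x)
        stack.extend(children.get(x, []))
    return out


def is_est(root: Node, children: Dict[str, List[str]], forest_nodes: Set[str]) -> bool:
    """Check properties (a)-(c) of Definition 4 for the tree rooted at `root`."""
    special = {BLOCKED, UNASSIGNED}
    nodes = list(walk(root))
    if any(n.label not in forest_nodes | special for n in nodes):
        return False                 # A must map into V plus the two special labels
    leaves = {n.label for n in nodes if n.left is None and n.right is None}
    inner = {n.label for n in nodes if n.left is not None or n.right is not None}
    if not all(v in leaves and v in inner for v in forest_nodes):
        return False                 # property (a)
    for u in nodes:
        if u.label in special:
            if u.right is not None:
                return False         # property (c)
            continue
        sub = forest_subtree(children, u.label)
        for v in walk(u.right):      # property (b), right subtree
            if v.label not in special and v.label not in sub:
                return False
        for v in walk(u.left):       # property (b), left subtree
            if v.label not in special and v.label in sub:
                return False
    return True


# Forest consisting of a single tree: root r with one child c.
children = {"r": ["c"]}
good = Node("r", right=Node("c", left=Node("r"), right=Node("c")))
bad = Node("r", right=Node("c", left=Node("c"), right=Node("r")))
print(is_est(good, children, {"r", "c"}), is_est(bad, children, {"r", "c"}))  # True False
```

In `good`, the internal node assigned to `c` has the leaf `c` on its right and the leaf `r` on its left, as property (b) requires; `bad` swaps the two leaves and is rejected.
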
If we drop (c) and also the requirement regarding internal nodes in (a), we have the definition of a search tree for $F$. The cost of an EST $D$ for $F$ is analogous to the cost of a search tree and is given by $\mathrm{cost}(D) = \sum_u d(r(D), u)\, w(A(u))$, where the summation is taken over all leaves $u \in D$ for which $A(u) \in F$.

At this point we establish a correspondence between optimal ESTs and optimal search trees. Given an EST $D$ for a tree $T$, we can apply a left deletion to the internal node of $D$ assigned to $r(T)$ and right deletions to all nodes of $D$ that are blocked or unassigned, obtaining a search tree $D'$ of cost $\mathrm{cost}(D') \le \mathrm{cost}(D) - w(r(T))$. Conversely, we can add a node assigned to $r(T)$ to a search tree $D'$ and get an EST $D$ such that $\mathrm{cost}(D) \le \mathrm{cost}(D') + w(r(T))$. Employing these observations we can prove the following lemma:

Lemma 6. Any optimal EST for a tree $T$ can be converted into an optimal search tree for $T$ (in linear time). In addition, the existence of an optimal search tree of height $h$ implies the existence of an optimal EST of height $h + 1$.

So we can focus on obtaining optimal ESTs. First, we introduce concepts which serve as building blocks for ESTs. A partial left path (PLP) is a left path where every node is assigned (via a function $A$) to either blocked or unassigned. Now consider an EST $D$ and let $L = \{l_1, \dots, l_{|L|}\}$ be its left path. We say that $D$ is compatible with a PLP $P = \{p_1, \dots, p_{|P|}\}$ if $|P| = |L|$ and $A(p_i) = \text{blocked}$ implies $A(l_i) = \text{blocked}$. The tree in Figure 7.(c) is compatible with the path of Figure 7.(b). This definition of compatibility implies a natural one-to-one correspondence between the nodes of $L$ and $P$. Therefore, without ambiguity, we can use $p_i$ when referring to node $l_i$ and vice versa.

Now we can introduce our subproblem $\mathcal{P}^B$. First, fix a tree $T$ with $n$ nodes and a weight function $w$.
Given a forest $F = \{T_{c_1(u)}, T_{c_2(u)}, \dots, T_{c_f(u)}\}$, a PLP $P$ and an integer $B$, the problem $\mathcal{P}^B(F, P)$ consists of finding an EST for $F$ with minimum cost among those ESTs for $F$ that are compatible with $P$ and have height at most $B$. Note that $F$ is not a general subforest of $T$, but one consisting of the subtrees rooted at the first $f$ children of some node $u \in T$, for some $1 \le f \le \delta(u)$. Notice that if $P$ is a PLP where all nodes are unassigned and $|P|$ and $B$ are sufficiently large, then $\mathcal{P}^B(T, P)$ gives an optimal EST for $T$.

Algorithm for $\mathcal{P}^B(F, P)$. We have a base case and two other cases depending on the structure of $F$. In all these cases, although not explicitly stated, if $P$ does not contain unassigned nodes then the algorithm returns 'not feasible'. If during its execution the algorithm encounters a 'not feasible' subproblem, it ignores this choice in the enumeration.

Base case: $F$ has only one node $u$. In this case, the optimal solution for $\mathcal{P}^B(F, P)$ is obtained from $P$ by assigning its first unassigned node, say $p_i$, to $u$ and then adding a leaf assigned to $u$ as a right child of $p_i$. Its cost is $i \cdot w(u)$.

Case 1: $F$ is a forest $\{T_{c_1(u)}, \dots, T_{c_f(u)}\}$. The idea of the algorithm is to decompose the problem into subproblems for the forests $T_{c_f(u)}$ and $F \setminus T_{c_f(u)}$. For that, it needs to select which nodes of $P$ will be assigned to each of these forests. The algorithm considers all possible bipartitions of the unassigned nodes of $P$, and for each bipartition $U = (U_f, U_o)$ it computes an EST $D_U$ for $F$ compatible with $P$. At the end, the algorithm returns the tree $D_U$ with smallest cost. The EST $D_U$ is constructed as follows:

1. Let $P_f$ be the PLP constructed by starting with $P$ and then setting all the nodes in $U_o$ as blocked (Figure 6.b). Similarly, let $P_o$ be the PLP constructed by starting with $P$ and setting all nodes in $U_f$ as blocked.
Let $D_f$ and $D_o$ be optimal solutions for $\mathcal{P}^B(T_{c_f(u)}, P_f)$ and $\mathcal{P}^B(F \setminus T_{c_f(u)}, P_o)$, respectively (Figure 6.c).

2. The EST $D_U$ is computed by taking the 'union' of $D_f$ and $D_o$ (Figure 6.d). More formally, the 'union' operation consists of starting with the path $P$ and then replacing: (i) every node in $P \cap U_f$ by the corresponding node in the left path of $D_f$ and its right subtree; (ii) every node in $P \cap U_o$ by the corresponding node in the left path of $D_o$ and its right subtree.

Notice that the height of every EST $D_U$ is at most $B$; this implies that the algorithm returns a feasible solution for $\mathcal{P}^B(F, P)$. Also, the cost of $D_U$ is given by $\mathrm{OPT}(\mathcal{P}^B(T_{c_f(u)}, P_f)) + \mathrm{OPT}(\mathcal{P}^B(F \setminus T_{c_f(u)}, P_o))$.

The optimality of the above procedure relies on the fact that we can build an EST $\bar{D}_f$ for $T_{c_f(u)}$ by starting from an optimal solution $D^*$ for $\mathcal{P}^B(F, P)$ and performing the following operation at each node $v$ of its left path: (i) if $v$ is unassigned we assign it as blocked; (ii) if $v$ is assigned to a node in $F \setminus T_{c_f(u)}$ we assign it as blocked and remove its right subtree. We can construct an EST $\bar{D}_o$ for $F \setminus T_{c_f(u)}$ analogously. Notice that $\mathrm{cost}(\bar{D}_f) + \mathrm{cost}(\bar{D}_o) = \mathrm{cost}(D^*)$. The proof is then completed by noticing that, for a particular choice of $U$, $\bar{D}_f$ and $\bar{D}_o$ are feasible for $\mathcal{P}^B(T_{c_f(u)}, P_f)$ and $\mathcal{P}^B(F \setminus T_{c_f(u)}, P_o)$, so the solution returned by the above algorithm costs at most $\mathrm{OPT}(\mathcal{P}^B(T_{c_f(u)}, P_f)) + \mathrm{OPT}(\mathcal{P}^B(F \setminus T_{c_f(u)}, P_o)) \le \mathrm{cost}(D^*)$.

Case 2: $F$ is a tree $T_v$. Let $p_i$ be an unassigned node of $P$ and let $t$ be an integer in the interval $[i + 1, B]$.
The algorithm considers all possibilities for $p_i$ and $t$ and computes an EST $D_{i,t}$ for $T_v$ of smallest cost satisfying the following: (i) $D_{i,t}$ is compatible with $P$; (ii) its height is at most $B$; (iii) the node of the left path of $D_{i,t}$ corresponding to $p_i$ is assigned to $v$; (iv) the leaf of $D_{i,t}$ assigned to $v$ is located at level $t$. The algorithm then returns the tree $D_{i,t}$ with minimum cost. In order to compute $D_{i,t}$ the algorithm executes the following steps:

1. Let $P_i$ be the subpath of $P$ that starts at the first node of $P$ and ends at $p_i$. Let $P_{i,t}$ be a left path obtained by appending $t - i$ unassigned nodes to $P_i$ and assigning $p_i$ as blocked (Figure 7.b). Compute an optimal solution $D'$ for $\mathcal{P}^B(\{T_{c_1(v)}, T_{c_2(v)}, \dots, T_{c_{\delta(v)}(v)}\}, P_{i,t})$.

2. Let $p'_i$ be the node of $D'$ corresponding to $p_i$ and let $y'$ be the last node of the left path of $D'$ (Figure 7.c). The tree $D_{i,t}$ is constructed by modifying $D'$ as follows (Figure 7.d): make the left subtree of $p'_i$ become its right subtree; assign $p'_i$ to $v$; add a leaf assigned to $v$ as the left child of $y'$; finally, as a technical detail, add some blocked nodes to extend the left path of this structure until it has the same size as $P$.

It follows from properties (i) and (ii) of the trees $D_{i,t}$ that the above procedure returns a feasible solution for $\mathcal{P}^B(T_v, P)$. The proof of the optimality of this solution uses the same type of arguments as in Case 1 and is deferred to the appendix.

Computational complexity. Notice that it suffices to consider problems $\mathcal{P}^B(F, P)$ where $|P| \le B$, since all others are infeasible. We claim that, by employing a dynamic programming strategy, we can compute all these problems in $O(n^2 2^{2B})$ time.
First, there are $O(n 2^B)$ such problems; this follows from the fact that for each node $u$ in $T$ there are two possible forests $F$ considered in subproblems ($F = T_u$ or $F = \{T_{c_1(u')}, T_{c_2(u')}, \dots, T_{c_f(u')} = T_u\}$, where $u$ is the $f$-th child of $u'$) and the fact that there are $O(2^B)$ PLPs of size at most $B$. It is not difficult to see that each of these problems can be solved in $O(n + 2^B)$ time, so the claim holds, since $O(n 2^B) \cdot O(n + 2^B) = O(n^2 2^{2B})$.

An upper bound on the height of optimal search trees. We now argue that there is an optimal search tree for $(T, w)$ whose height is $O(\Delta(T) \cdot (\log w(T) + \log n))$. The following lemma is the core of our 'geometric decrease' argument. It essentially states that we can cut a constant factor of the total weight of an optimal search tree by going down a number of levels that depends only on the maximum degree of $T$.

Lemma 7. Consider an instance $(T, w)$ for our search problem and let $D^*$ be an optimal search tree for it. Fix $0 \le \alpha < 1$ and an integer $c > 3(\Delta(T) + 1)/\alpha$. Then, for every node $v^* \in D^*$ with $d(r(D^*), v^*) \ge c$ we have that $w(D^*_{v^*}) \le \alpha \cdot w(D^*)$.

Proof sketch. (The full proof is deferred to the appendix.) By means of contradiction, assume the lemma does not hold for some $v^*$ satisfying its conditions. Let $\tilde{T}$ be the tree associated with $v^*$, rooted at node $\tilde{r}$. Since by hypothesis $\tilde{T}$ contains a large portion of the total weight (greater than $\alpha \cdot w(D^*)$), we create the following search tree $D'$, which makes sure parts of $\tilde{T}$ are queried closer to $r(D')$: the root of $D'$ is assigned to $\tilde{r}$; the left subtree of $r(D')$ is a search tree for $T - T_{\tilde{r}}$ obtained via Lemma 4; in the right subtree of $r(D')$ we build a left path containing nodes corresponding to queries for $c_1(\tilde{r}), c_2(\tilde{r}), \dots, c_{\delta(\tilde{r})}(\tilde{r})$, each having as right subtree a search tree for the corresponding $T_{c_i(\tilde{r})}$ obtained via Lemma 4.
If $\bar{s}$ is the number of nodes of $T - T_{\tilde{r}}$ queried in the path from $r(D^*)$ to $v^*$, then Lemma 4 implies that $D'$ saves at least $\bar{s} - (\Delta(T) + 1)$ queries for each node in $\tilde{T}$ when compared to $D^*$; this gives the expression $\mathrm{cost}(D') \le \mathrm{cost}(D^*) - \bar{s} \cdot w(\tilde{T}) + (\Delta(T) + 1)\, w(T)$. Using the hypothesis on $c$ and $w(\tilde{T})$, this is enough to reach the contradiction $\mathrm{cost}(D') < \mathrm{cost}(D^*)$ when $\bar{s} \ge c/3$: in that case $\bar{s} \cdot w(\tilde{T}) > (c/3) \cdot \alpha \cdot w(T) > (\Delta(T) + 1)\, w(T)$. The case when $\bar{s} < c/3$ is a little more involved but uses a similar construction, only now the role of $\tilde{r}$ is taken by a node inside $T_{\tilde{r}}$ in order to obtain a more 'balanced' search tree.

Assume that the weight function $w$ is strictly positive (see Appendix E.3 for the general case). Since $w$ is integral, employing Lemma 7 repeatedly shows that $D^*$ has height at most $O(\Delta(T) \cdot (\log w(T) + \log n))$.

From the DP algorithm to an FPTAS. By Lemmas 6 and 7, we can obtain an optimal search tree for $(T, w)$ by finding an optimal EST of height $B = O(\Delta(T) \cdot (\log w(T) + \log n))$ (via $\mathcal{P}^B$) and then converting it into an optimal search tree. Since we can employ the algorithm presented in the previous section to achieve this in $O\big((n \cdot w(T))^{O(\Delta(T))}\big)$ time, we obtain a pseudo-polynomial time algorithm for trees with bounded degree. Furthermore, such an algorithm can be transformed into an FPTAS by scaling and rounding the weights $w$, just as in the well-known FPTAS for the knapsack problem [18] (see the appendix for details):

Theorem 3. Consider an instance $(T, w)$ to our search problem where $\Delta(T) = O(1)$. Then there is a $\mathrm{poly}(n \cdot w(T))$-time algorithm for computing an optimal search tree for $(T, w)$. In addition, there is a $\mathrm{poly}(n/\epsilon)$-time algorithm for computing a $(1 + \epsilon)$-approximate search tree for $(T, w)$.

References

[1] M. Adler, E. Demaine, N. Harvey, and M. Patrascu. Lower bounds for asymmetric communication channels and distributed source coding.
In SODA, pages 251–260, 2006.
[2] M. Adler and B. Heeringa. Approximating optimal binary decision trees. In APPROX-RANDOM, pages 1–9, 2008.
[3] M. Adler and B. Maggs. Protocols for asymmetric communication channels. Journal of Computer and System Sciences, 63(4):573–596, 2001.
[4] E. Arkin, H. Meijer, J. Mitchell, D. Rappaport, and S. Skiena. Decision trees for geometric models. International Journal of Computational Geometry and Applications, 8(3):343–364, 1998.
[5] Y. Ben-Asher, E. Farchi, and I. Newman. Optimal search in trees. SIAM Journal on Computing, 28(6):2090–2102, 1999.
[6] R. Carmo, J. Donadelli, Y. Kohayakawa, and E. Laber. Searching in random partially ordered sets. Theoretical Computer Science, 321(1):41–57, 2004.
[7] V. Chakaravarthy, V. Pandit, S. Roy, P. Awasthi, and M. Mohania. Decision trees for entity identification: Approximation algorithms and hardness results. In PODS, pages 53–62, 2007.
[8] C. Daskalakis, R. Karp, E. Mossel, S. Riesenfeld, and E. Verbin. Sorting and selection in posets. In SODA, pages 392–401, 2009.
[9] P. de la Torre, R. Greenlaw, and A. Schäffer. Optimal edge ranking of trees in polynomial time. Algorithmica, 13(6):592–618, 1995.
[10] D. Dereniowski. Edge ranking and searching in partial orders. Discrete Applied Mathematics, 156(13):2493–2500, 2008.
[11] U. Faigle, L. Lovász, R. Schrader, and Gy. Turán. Searching in trees, series-parallel and interval orders. SIAM Journal on Computing, 15, 1986.
[12] M. Garey. Optimal binary identification procedures. SIAM Journal on Applied Mathematics, 23(2):173–186, 1972.
[13] M. Garey and D. Johnson. Computers and Intractability: A Guide to the Theory of NP-Completeness. Freeman, New York, NY, 1979.
[14] A. Garsia and M. Wachs. A new algorithm for minimum cost binary trees. SIAM Journal on Computing, 6(4):622–642, 1977.
[15] S. Ghazizadeh, M. Ghodsi, and A. Saberi.
A new protocol for asymmetric communication channels: Reaching the lower bounds. Scientia Iranica, 8(4), 2001.
[16] T. Hu and A. Tucker. Optimal computer search trees and variable-length alphabetic codes. SIAM Journal on Applied Mathematics, 21(4), 1971.
[17] L. Hyafil and R. Rivest. Constructing optimal binary decision trees is NP-complete. Information Processing Letters, 5(1):15–17, 1976.
[18] O. Ibarra and C. Kim. Fast approximation algorithms for the knapsack and sum of subset problems. Journal of the ACM, 22(4):463–468, 1975.
[19] A. Iyer, H. Ratliff, and G. Vijayan. On an edge ranking problem of trees and graphs. Discrete Applied Mathematics, 30(1):43–52, 1991.
[20] D. Knuth. The Art of Computer Programming, Vol. 3: Sorting and Searching. Addison-Wesley, Reading, Massachusetts, 1973.
[21] R. Kosaraju, T. Przytycka, and R. Borgstrom. On an optimal split tree problem. In WADS, pages 157–168, 1999.
[22] E. Laber and L. Holanda. Improved bounds for asymmetric communication protocols. Information Processing Letters, 83(4):205–209, 2002.
[23] E. Laber and M. Molinaro. An approximation algorithm for binary searching in trees. In ICALP, pages 459–471, 2008.
[24] E. Laber and L. Nogueira. On the hardness of the minimum height decision tree problem. Discrete Applied Mathematics, 144(1-2):209–212, 2004.
[25] T. Lam and F. Yue. Optimal edge ranking of trees in linear time. In SODA, pages 436–445, 1998.
[26] N. Linial and M. Saks. Searching ordered structures. Journal of Algorithms, 6, 1985.
[27] M. Lipman and J. Abrahams. Minimum average cost testing for partially ordered components. IEEE Transactions on Information Theory, 41(1):287–291, 1995.
[28] S. Mozes, K. Onak, and O. Weimann. Finding an optimal tree searching strategy in linear time. In SODA, pages 1096–1105, 2008.
[29] K. Onak and P. Parys. Generalization of binary search: Searching in trees and forest-like partial orders.
In FOCS, pages 379–388, 2006.
[30] A. Schäffer. Optimal node ranking of trees in linear time. Information Processing Letters, 33(2):91–96, 1989.
[31] J. Watkinson, M. Adler, and F. Fich. New protocols for asymmetric communication channels. In SIROCCO, pages 337–350, 2001.

Appendix

A The proof of Lemma 1

We need two inequalities regarding the weights.

Fact 1. For each $1 \le i' < i \le m$ it holds that

$$w(t_i) > w(a_{i1}) > w(t_{i'}) + w(u_{i'1}) + w(u_{i'2}) + w(u_{i'3}). \qquad (3)$$

Proof of the fact. The first inequality follows by definition. In order to prove the second inequality let us consider the difference

$$\mathit{Diff} = w(a_{i1}) - \big(w(t_{i'}) + w(u_{i'1}) + w(u_{i'2}) + w(u_{i'3})\big).$$

By definition we have

$$\mathit{Diff} = \sum_{j=1}^{3} \Big( W_{u_{ij}} + \gamma(i, j)\, w(u_{ij}) \Big) - \sum_{j=1}^{3} \Big( W_{u_{i'j}} + (\gamma(i', j) + 3/2)\, w(u_{i'j}) \Big).$$

Case 1. $u_{i3} = u_{i'3}$. Note that $\gamma(i, 3) \ge 5 + \gamma(i', 3)$. Since $W_{u_{ij}}, W_{u_{i'j}} \ge 0$ and $0 < \gamma(i, j), \gamma(i', j) \le |T|$ we get that

$$\mathit{Diff} \ge 5\, w(u_{i3}) - 3\, W_{u_{i'3}} - (2|T| + 3)\, w(u_{i'2}).$$

Let $\kappa$ be such that $u_{i3} = u_\kappa$. It follows from the definition of the function $w(\cdot)$ that

$$w(u_{i3}) = w(u_\kappa) = 1 + 6 \max\{ W_{u_\kappa}, |T|^3 w(u_{\kappa - 1}) \} > 3\, W_{u_{i'3}} + (2|T| + 3)\, w(u_{i'2}).$$

Thus, $\mathit{Diff} > 0$.

Case 2. $u_{i'3} \prec u_{i3}$. Then, it must also hold that $X_{i'} \prec u_{i3}$. Therefore we have

$$w(a_i) \ge W_{u_{i3}} \ge w(\tilde{X}_{i'}) > w(t_{i'}) + w(u_{i'1}) + w(u_{i'2}) + w(u_{i'3}).$$

Fact 2. For each $1 \le i \le m$ and $\kappa = 1, \dots, 4$, it holds that

$$w(a_{i\kappa}) \ge 3\big(w(u_{i3}) + w(u_{i2}) + w(u_{i1})\big) + W_{u_{i3}}. \qquad (4)$$

This follows directly from the definition of $w(a_{i\kappa})$ and the fact that $\gamma(i, j) \ge 3$ ($j = 1, 2, 3$).

Proof of Lemma 1. Let $D$ be an optimal search tree for $(T, w)$. Let $\ell$ be the deepest node in the left path of $D$ such that $D - D_\ell$ is the realization of $\pi_{i+1} \dots \pi_{n+m}$ for some $i = 0, \dots, n + m$.
In particular, we take $i = n + m$ if $\ell$ is the root of $D$, i.e., no upper part of $D$ looks like a realization of a suffix of $\Pi$. By contradiction, assume that $D$ is not a realization of $\Pi$; in particular $i > 0$. We shall prove that by modifying $D_\ell$ in such a way that its top part becomes a realization of $\pi_i$ we obtain a new search tree with cost smaller than the cost of $D$. The desired result will follow by contradiction. We consider the following cases:

Case 1. $\pi_i = X_j$, for some $j = 1, 2, \dots, m$. First we argue that $\ell \in \{q_{t_j}, q_{r_j}\}$. Let $q_\nu$ (for some $\nu \in T$) be the parent of $q_{r_j}$. If $\nu \in T_j$ we swap $q_{t_j}$ with $q_\nu$; otherwise we swap $q_{r_j}$ with $q_\nu$ (see footnote 4). Let $D'$ be the new tree so obtained.

Footnote 4. By swapping we mean that the two nodes exchange positions and carry along their right subtrees. This is possible because $q_{r_j}$ is the left child of $q_\nu$.

If $\nu$ is a leaf in $T$, then we have $\mathrm{cost}(D') \le \mathrm{cost}(D) - w(t_j) + w(\nu) < \mathrm{cost}(D)$, since $t_j$ is the leaf of largest weight in $D_\ell$. Otherwise, it must be that $\nu = r_{j'}$ for some $j' < j$. In this case, by (3), we have $\mathrm{cost}(D') \le \mathrm{cost}(D) - w(t_j) + w(t_{j'}) + w(u_{j'3}) + w(u_{j'2}) + w(u_{j'1}) < \mathrm{cost}(D)$. In either case we obtain a tree of cost smaller than that of $D$, violating the optimality of $D$.

Alternatively, if $q_{t_j}$ is not the right child of $q_{r_j}$, then we swap $q_{t_j}$ with its parent. Note that $q_{t_j}$ must be the left child of its parent. By proceeding as above, we can prove that the resulting tree has cost smaller than $D$, again a violation of the optimality of $D$. Therefore, it must be that $\ell \in \{q_{t_j}, q_{r_j}\}$. We now split the analysis according to these two possible cases.

Subcase 1.1. $\ell = q_{r_j}$. Then, because of the assumption on $D - D_\ell$ and the search property, it follows that the right subtree of $q_{r_j}$ contains the nodes $q_{t_j}, q_{s_{j3}}, q_{s_{j2}}, q_{s_{j1}}$.
Also, it is not hard to see that they must appear in this order, for otherwise by reordering them we would decrease the average cost of $D$, since $w(t_j) > w(s_{j3}) > w(s_{j2}) > w(s_{j1})$. Therefore the right subtree of $\ell$ coincides with the right subtree of $D^A_j$.

Suppose now w.l.o.g. that for each $\kappa = 2, 3, 4$, it holds that $q_{a_{j\kappa-1}}$ is closer to the root of $D$ than $q_{a_{j\kappa}}$. For the sake of contradiction, assume that $q_{a_{j1}}$ is not a child of $q_{r_j}$. Let $q_\nu$ be the parent of $q_{a_{j1}}$. Note that $q_{a_{j1}}$ can only be the left child of $q_\nu$. By swapping $q_{a_{j1}}$ with $q_\nu$ the resulting tree has smaller expected cost than $D$, again in contradiction with the assumed optimality of $D$. In fact, if $\nu$ is a leaf in $T$ then it follows from inequality (4) that $w(a_{j1}) > w(u_{j3}) \ge w(\nu)$. Otherwise, $\nu = r_{j'}$ for some $j' < j$, and then, by (3), $w(a_{j1})$ is greater than the weight of the right subtree of $q_\nu$. The same arguments show that $q_{a_{j\kappa}}$ is the left child of $q_{a_{j\kappa-1}}$, for each $\kappa = 2, 3, 4$. We can conclude that in the left path of $D$, the nodes following $\ell$ are exactly $q_{a_{j1}}, \dots, q_{a_{j4}}$. Let $\ell'$ be the left child of $q_{a_{j4}}$. We have shown that in this subcase $D_\ell - D_{\ell'}$ coincides with $D^A_j$.

Subcase 1.2. $\ell = q_{t_j}$. There is nothing to prove about the right subtree of $\ell$. In order to prove that in the left path of $D$ the node $\ell$ is followed by $q_{a_{j1}}, \dots, q_{a_{j4}}$ (see footnote 5), we proceed as before. Assume (by contradiction) that $q_{a_{j1}}$ is not a child of $q_{r_j}$. Let $q_\nu$ be the parent of $q_{a_{j1}}$. Note that $q_{a_{j1}}$ can only be the left child of $q_\nu$. We swap $q_{a_{j1}}$ with $q_\nu$. Let $D'$ be the resulting search tree. If $\nu$ is $r_j$ or a leaf in $T_j \setminus \{t_j\}$, we have that $\mathrm{cost}(D') - \mathrm{cost}(D) = -w(a_{j1}) + w(X_j) < 0$, where $w(X_j)$ accounts for the weight of the right subtree of $q_\nu$ and the last inequality follows by (4).
On the other hand, if $\nu$ is either a leaf in $T$ or is equal to $r_{j'}$ for some $j' < j$, then we can apply the same argument as in Subcase 1.1 to reach the same conclusion, i.e., we violate the optimality of $D$. Therefore, we conclude that $q_{a_{j1}}$ is the left child of $\ell$. Repeating the same argument we can also show that $q_{a_{j\kappa}}$ is the left child of $q_{a_{j\kappa-1}}$, for each $\kappa = 2, 3, 4$. Let $\ell'$ be the left child of $q_{a_{j4}}$. We have shown that in this subcase, $D_\ell - D_{\ell'}$ coincides with $D^B_j$.

Footnote 5. We are again assuming, w.l.o.g., that for each $\kappa = 2, 3, 4$, it holds that $q_{a_{j\kappa-1}}$ is closer to the root of $D$ than $q_{a_{j\kappa}}$.

We can conclude that in both subcases of Case 1, the tree $D - D_{\ell'}$ is a realization of $\pi_i, \dots, \pi_{n+m}$, against the assumption that $\ell$ is the deepest node for which such a condition holds.

Case 2. $\pi_i = u_j$, for some $j = 1, 2, \dots, n$. Let us consider the set $L$ of leaves of $T^b$ which are associated with $u_j$ and are not queried in $D - D_\ell$. Since $D - D_\ell$ is a realization of $\pi_{i+1} \dots \pi_{n+m}$, the leaves of $T^b$ which are not in $L$ and are queried in $D_\ell$ are either in $\bigcup_{X \prec u_j} \tilde{X}$ or are associated with $u_{j'}$ for some $j' < j$. For the sake of contradiction we assume that one of the first $|L|$ nodes in the left path of $D_\ell$ does not correspond to a leaf in $L$. Let us construct a tree $D'$ from $D_\ell$ as follows: first we construct an auxiliary tree by removing from $D_\ell$ all the nodes corresponding to the leaves in $L$. Then, we add a left path with these nodes to the top of this auxiliary tree. Our assumption that one of the first $|L|$ nodes in the left path of $D_\ell$ does not correspond to a leaf in $L$ implies that

$$\mathrm{cost}(D') \le \mathrm{cost}(D_\ell) - w(u_j) + |L| \sum_{X \prec u_j} w(\tilde{X}) + 3 \cdot |L| \cdot \sum_{u \prec u_j} w(u).$$

The negative term in the inequality above arises because the sum of the levels of the nodes associated with $u_j$ in $D_\ell$ is at least 7 while this sum is exactly 6 in $D'$.
The other terms are due to the fact that the level of a node can increase by at most $|L|$ units in our construction. The definitions of $W_{u_j}$ and $w(u_j)$ imply that

$$\mathrm{cost}(D') \le \mathrm{cost}(D_\ell) - w(u_j) + |L|\, W_{u_j} + |L| \cdot |T| \cdot w(u_{j-1}).$$

Since $|L| \le 3$ and $w(u_j) > 6 \max\{ W_{u_j}, |T|^3 w(u_{j-1}) \}$ we get that $\mathrm{cost}(D') < \mathrm{cost}(D_\ell)$. This implies, however, that $D$ can be improved, a contradiction. Thus, the top $|L|$ levels of $D_\ell$ coincide with a sequential search tree for $L$. Let $\ell'$ be the leftmost query of such a sequential search. Therefore, $D - D_{\ell'}$ is a realization of $\pi_i, \dots, \pi_{n+m}$, which contradicts, also in Case 2, the hypothesis that $\ell$ is the deepest node for which such a condition holds. The proof is complete.

B The proof of Lemma 2

Lemma 2. Let $D^*$ be an optimal binary search tree for $(T, w)$. Let $\mathcal{Y} \subseteq \mathcal{X}$ be such that $D^*$ is a realization of $\Pi$ w.r.t. $\mathcal{Y}$. We have that $\mathrm{cost}(D^*) \le \mathrm{cost}(D^A) - \frac{1}{2} \sum_{u \in U} w(u)$ if and only if $\mathcal{Y}$ is a solution for the X3C instance $I = (U, \mathcal{X})$.

Proof. We start by proving the only if part. Assume that $\mathrm{cost}(D^A) - \mathrm{cost}(D^*) \ge \frac{1}{2} \sum_{u \in U} w(u)$. We shall use induction on $j$ to prove that for each $j = n, \dots, 1$ there exists exactly one $X \in \mathcal{Y}$ such that $u_j \in X$. Fix $j^* \le n$ and assume that for every $j > j^*$ there exists exactly one $X \in \mathcal{Y}$ such that $u_j \in X$. Suppose that there is no $i \in \{1, \dots, m\}$ such that $u_{j^*} \in X_i \in \mathcal{Y}$. We can rewrite (2) as follows:

$$\mathrm{cost}(D^A) - \mathrm{cost}(D^*) = \sum_{j=1}^{n} \sum_{\substack{X_i \in \mathcal{Y} \\ u_j \in X_i}} \left( \frac{w(u_j)}{2} + \Gamma(i, j)\, w(u_j) \right),$$

where $\Gamma(i, j) = \gamma(i, \kappa) - d^A_B(q_{s_{i\kappa}})$, and $\kappa \in \{1, 2, 3\}$ is such that $s_{i\kappa} = u_j$. Now, since we are assuming that for all $j > j^*$ there exists only one $i$ such that $X_i \in \mathcal{Y}$ and $u_j \in X_i$, by the definition of $d^A_B(\cdot)$ and $\gamma(i, \kappa)$, we have $\Gamma(i, j) = 0$.
So w e obtain cost ( D A ) − cost ( D ∗ ) = X j >j ∗ w ( u j ) 2 + X j j ∗ w ( u j ) 2 + 3( j ∗ − 1)( | T | + 1 / 2) w ( u j ∗ ) 6 | T | 3 < X j >j ∗ w ( u j ) 2 + w ( u j ∗ ) 2 ≤ 1 2 X u ∈ U w ( u ) . 15 Supp ose no w that there are κ > 1 subsets in Y that con tain u j ∗ . Rewriting (2) as b efore, we obtain: cost ( D A ) − cost ( D ∗ ) ≤ X j >j ∗ w ( j ) 2 + X X i ∈Y u j ∗ ∈ X i  w ( u j ∗ ) 2 + Γ( i, j ∗ ) w ( u j ∗ )  + X j j ∗ w ( j ) 2 − κ − 2 2 w ( u j ∗ ) + 3( j ∗ − 1)( | T | + 1 / 2) w ( u j ∗ ) 6 | T | 3 < X j ≥ j ∗ w ( u j ) 2 < 1 2 X u ∈ U w ( u ) . This concludes the inductiv e argument and the pro of of the only if part. In order to prov e the if part of the statement we notice that if Y is a solution for I then for each j = 1 , . . . , n there exists exactly one index i suc h that X i ∈ Y and u j ∈ X i . Then, the desired result follo ws directly by equation (2), and by the fact that in this case the deﬁnition of d A B ( · ) and γ ( · , · ) , yields Γ( i, j ) = 0 . C The pro of of Lemma 3 Pro of of Lemma 3. Let D b e an optimal searc h tree for ( T b , w ). Let ` b e the deep est no de in the left path of D such that D − D ` is the realization of π i +1 . . . π n + m for some i = 0 , . . . , n + m. In particular, we take i = n + m if ` is the ro ot of D , i.e., no upper part of D lo oks lik e a realization of some suﬃx of Π . By contradiction, assume that D is not a realization of Π, whence i > 0 . W e shall pro ve that by mo difying D ` in such a wa y that its top part b ecomes a realization of π i w e obtain a new search tree with cost smaller than the cost of D . The desired result will follows b y contradiction. W e consider the follo wing cases: Case 1. π i = X j , for some j = 1 , 2 , . . . , m. In this case, our assumption regarding ` implies that if a no de ν ∈ D ` is asso ciated with a leaf ` 0 in T b then ` 0 either corresp onds to an elemen t u ∈ U such that u ≺ X j or ` 0 ∈ ˜ X j 0 suc h that X j 0  X j . 
Let $\kappa$ be such that $X_j \in H_\kappa$. We need to prove the following claim.

Claim 1. $\ell \in \{q_{t_j}, q_{r_j}\}$.

Proof. We shall show it by contradiction. We split the proof into Cases I and II.

Case I. Suppose that the node $q_{t_j}$ is the right child of $q_{r_j}$. Let $q_\nu$ (for some $\nu \in T$) be the parent of $q_{r_j}$. We have two cases according to whether $q_{r_j}$ is a right or a left child of $q_\nu$.

Subcase I.a. $q_{r_j}$ is a right child of $q_\nu$. Note that because of the search tree property $\nu$ must be an ancestor of $h_\kappa$ in $T^b$. We perform a left rotation on $q_\nu$. Let $D'$ be the new tree obtained. We have that $\mathrm{cost}(D') \le \mathrm{cost}(D) - w(t_j) + w(\alpha)$, where $\alpha$ is the left subtree of $q_\nu$. We observe that if a node in $\alpha$ corresponds to a leaf $\ell'$ then $\ell'$ must be in $T^b \setminus H_\kappa$. Thus, the nodes of $\alpha$ can take care of:
(a) leaves that are associated with some $u \in U$ such that $u \prec u_{j3}$. The sum of the weights of these leaves is at most $|T| \cdot w(u_{j3}) / (6 |T|^3) < w(u_{j3})/2$;
(b) at most two leaves associated with $u \in U$ such that $u = u_{j3}$. The fact that every $u \in U$ appears in at most three sets of $\mathcal{X}$, together with the fact that $s_{j3} \in H_\kappa$, explains why there are at most two such leaves;
(c) leaves in $\tilde{X}_{j'}$ such that $X_{j'} \prec u_{j3}$. The weights of these leaves sum to at most $W_{u_{j3}}$.
Thus, we can conclude that $w(\alpha) \le 2.5\, w(u_{j3}) + W_{u_{j3}}$. Since $w(t_j) > 2.5\, w(u_{j3}) + W_{u_{j3}}$ we conclude that $\mathrm{cost}(D') < \mathrm{cost}(D)$, contradicting the optimality of $D$.

Subcase I.b. $q_{r_j}$ is a left child of $q_\nu$. This implies that $\nu$ is not an ancestor of $r_j$ in $T^b$. Let $D'$ be a tree obtained as follows: we swap $q_{r_j}$ with $q_\nu$ if $\nu$ is not in $T_j$; otherwise, we swap $q_{t_j}$ with $q_\nu$. Let $\alpha$ be the right subtree of $q_\nu$. Again, we have $\mathrm{cost}(D') \le \mathrm{cost}(D) - w(t_j) + w(\alpha)$.
If $\nu \notin H_\kappa$, then the analysis is identical to the one employed in Subcase I.a, because $\alpha$ can take care of the same leaves considered in that case. If $\nu$ is a leaf in $H_\kappa$, then $w(t_j) > w(\nu) = w(\alpha)$, because $t_j$ is the heaviest leaf among the leaves in $H_\kappa$ that correspond to a node in $D_\ell$. Finally, if $\nu$ is an internal node in $H_\kappa \setminus \{h_\kappa\}$, then $\nu = r_{j'}$ for some $j' < j$, and it follows from inequality (3) that $w(t_j) > w(t_{j'}) + w(u_{j'3}) + w(u_{j'2}) + w(u_{j'1}) = w(\alpha)$. In either subcase we obtain a tree of cost smaller than that of $D$, violating the optimality of $D$.

Case II. Alternatively, if $q_{t_j}$ is not the right child of $q_{r_j}$, then we can proceed as before. We consider the case where $q_{t_j}$ is the right child of its parent and also the case where it is the left child. In the former case we apply a left rotation and in the latter a simple swap. Again we can prove that the resulting tree has cost smaller than that of $D$, violating the optimality of $D$. The proof of the claim is complete.

Therefore, it must be that $\ell \in \{q_{t_j}, q_{r_j}\}$. We now split the analysis according to these two cases.

Subcase 1.1. $\ell = q_{r_j}$. Then, because of the assumption on $D - D_\ell$ and the search property, it follows that the right subtree of $q_{r_j}$ contains the nodes $q_{t_j}, q_{s_{j3}}, q_{s_{j2}}, q_{s_{j1}}$. Also, it is not hard to see that they must appear in this order, for otherwise, by reordering them we would decrease the average cost of $D$, since $w(t_j) > w(s_{j3}) > w(s_{j2}) > w(s_{j1})$. Therefore the right subtree of $\ell$ coincides with the right subtree of $D^A_j$. Let us assume w.l.o.g. that the level of $q_{a_{jk}}$ is smaller than or equal to the level of $q_{a_{jk'}}$ in $D$, for $k < k'$. First, we argue that the left child of $\ell$ must be $q_{a_{j1}}$. Assume that $q_{a_{j1}}$ is not the left child of $\ell$ and let $\nu$ be the parent of $q_{a_{j1}}$. We have two cases:

A. $q_{a_{j1}}$ is a right child of $\nu$. We perform a left rotation on $q_\nu$.
Let $D'$ be the new tree obtained. We have that $\mathrm{cost}(D') \le \mathrm{cost}(D) - w(q_{a_{j1}}) + w(\alpha)$, where $\alpha$ is the left subtree of $\nu$. Note that the search property ensures that $\nu$ is an ancestor of $h_\kappa$. Thus, the analysis of Subcase I.a in Claim 1 above shows that the sum of the weights of the leaves that $\alpha$ can take care of is upper bounded by $2.5\, w(u_{j3}) + W_{u_{j3}}$. Since $w(q_{a_{j1}}) > 3\, w(u_{j3}) + W_{u_{j3}}$, we conclude that $\mathrm{cost}(D') < \mathrm{cost}(D)$.

B. $q_{a_{j1}}$ is a left child of $\nu$. In this case, we swap $q_{a_{j1}}$ and $\nu$. Let $D'$ be the new tree obtained. We have that $\mathrm{cost}(D') \le \mathrm{cost}(D) - w(q_{a_{j1}}) + w(\alpha)$, where $\alpha$ is the right subtree of $\nu$. Note that $\nu$ is not an ancestor of $a_{j1}$ in $T^b$. If $\nu \notin H_\kappa$, the arguments employed in Subcase I.a show that $w(\alpha) \le 2.5\, w(u_{j3}) + W_{u_{j3}}$. Since $w(q_{a_{j1}}) > 3\, w(u_{j3}) + W_{u_{j3}}$, we conclude that $\mathrm{cost}(D') < \mathrm{cost}(D)$. If $\nu \in T_{j'}$, with $T_{j'} \in H_\kappa$ and $j' < j$, it follows from inequality (3) that $w(a_{j1}) > w(\alpha)$. If $\nu \in T_j$, it follows from inequality (4) that $w(a_{j1}) > w(\alpha)$. Finally, if $\nu = a_{j'k}$ with $j' < j$, we have that $w(a_{j1}) > w(a_{j'k}) = w(\alpha)$.

We can conclude that $q_{a_{j1}}$ is the left child of $\ell$. Since $w(a_{j1}) = w(a_{j2}) = w(a_{j3}) = w(a_{j4})$, the same arguments show that the nodes following $a_{j1}$ in the left path are $q_{a_{j2}}$, $q_{a_{j3}}$ and $q_{a_{j4}}$. Let $\ell'$ be the left child of $q_{a_{j4}}$. We have shown that in this subcase $D_\ell - D_{\ell'}$ coincides with $D^A_j$.

Subcase 1.2. $\ell = q_{t_j}$. There is nothing to prove about the right subtree of $\ell$. On the other hand, in order to prove that the nodes following $\ell$ in the left path of $D$ are exactly $q_{a_{j1}}$, $q_{a_{j2}}$, $q_{a_{j3}}$ and $q_{a_{j4}}$, we can proceed as in Subcase 1.1. The only additional case to be taken care of, in the argument by contradiction used there, is when the parent of $q_{a_{j1}}$ is $q_{r_j}$.
However, in this case we can employ the same argument we used for the analogous situation in Subcase 1.2 of the proof of Lemma 1. Let $\ell'$ be the left child of $q_{a_{j4}}$. We have shown that in this subcase $D_\ell - D_{\ell'}$ coincides with $D^B_j$.

We can conclude that in both Subcases 1.1 and 1.2 the tree $D - D_{\ell'}$ is a realization of $\pi_i, \ldots, \pi_{n+m}$, against the assumption that $\ell$ is the deepest node for which such a condition holds.

Case 2. $\pi_i = u_j$, for some $j = 1, 2, \ldots, n$. The proof is identical to that employed for Case 2 of Lemma 1.

D The proof of Lemma 5

Let $x$ and $x^*$ be the nodes queried at the root of $D'$ and $D^*$, respectively. W.l.o.g. we assume $x \ne x^*$, as otherwise the lemma trivially holds. We can also assume that $x^*$ is a node from $T_x$, because the opposite case is analyzed analogously.

Case 1: $w(T_x) \le w(T - T_x)$. In other words, $w(T_x) \le w(T)/2$. As any path from $r(D^*)$ to a leaf in $D^*$ contains $r(D^*)$ and $T - T_x$ does not contain $x^*$, Lemma 4 states that the depth of any leaf in $D^*_1$ is smaller by at least one than it is in $D^*$. The lemma also implies that the depth of any leaf in $D^*_0$ is not greater than it is in $D^*$. So we have
\[
\begin{aligned}
\mathrm{cost}(D') &= w(T) + \mathrm{cost}(D^*_0) + \mathrm{cost}(D^*_1) \\
&\le w(T) + \sum_{v \in T_x} w(v)\, d(r(D^*), l_v) + \sum_{v \in T - T_x} w(v) \bigl( d(r(D^*), l_v) - 1 \bigr) \\
&= w(T) + \mathrm{cost}(D^*) - w(T - T_x) \\
&\le \mathrm{cost}(D^*) + w(T)/2 .
\end{aligned}
\]

Case 2: $w(T_x) > w(T - T_x)$. Let $x_1, \ldots, x_n$ be the nodes successively queried when the path from $r(D^*)$ to $r(D')$ is traversed in $D^*$. In particular, $x_1 = x^*$ and $x_n = x$. Let $k < n$ be such that $x_i$ is a node from $T_x - \{x\}$ for $i = 1, \ldots, k$ and $x_{k+1} \notin T_x - \{x\}$. In this extended abstract we assume that $w(T_x - T_{x_i}) > 0$ for $i = 1, \ldots, k$.
The case of $w(T_x - T_{x_i}) = 0$ can only occur when there is a tie regarding the choice of node $x$ in step (1) of the algorithm, and then the above scenario can be avoided by employing a suitable tie-breaking rule. In the full paper we will show by a more intricate case analysis that the approximation factor holds regardless of the tie-breaking rule.

For $i = 1, \ldots, k$ we know that $w(T_{x_i}) < w(T - T_{x_i})$, because otherwise, using the assumption that $w(T_x - T_{x_i}) > 0$, we would have
\[
\begin{aligned}
w(T_{x_i}) - w(T - T_{x_i}) &= w(T_{x_i}) - w(T_x - T_{x_i}) - w(T - T_x) \\
&= w(T_x) - w(T - T_x) - 2\, w(T_x - T_{x_i}) \\
&< w(T_x) - w(T - T_x),
\end{aligned}
\]
and so $x_i$ would have been chosen instead of $x$ in step (1) of the algorithm.

From this fact, it follows that $w(T_{x_i}) \le w(T - T_x)$ for $i = 1, \ldots, k$. This is because otherwise
\[
\begin{aligned}
w(T_x) - w(T - T_x) &= w(T_{x_i}) + w(T_x - T_{x_i}) - w(T - T_x) \\
&> w(T - T_x) + w(T_x - T_{x_i}) - w(T_{x_i}) \\
&= w(T - T_{x_i}) - w(T_{x_i}) \ge 0,
\end{aligned}
\]
so $x_i$ would have been chosen instead of $x$ in step (1).

Let $T' := \bigcup_{i=1}^{k} T_{x_i}$ and let $T'' := T_x - T'$. Note that $T'$ is a forest in general and $T' \cup T'' = T_x$. We are going to reason about the search tree depths of the nodes in $T - T_x$, $T'$, and $T''$ separately. $D^*_0$ queries all nodes from $T'$, and Lemma 4 states that the depth of those nodes is not greater in $D^*_0$ than it is in $D^*$. The nodes from $T''$ are also all queried in $D^*_0$. For these nodes we know that in $D^*$ the node $x_{k+1}$ is queried before them. As $x_{k+1}$ is not queried by $D^*_0$, the depth of each node from $T''$ in $D^*_0$ is smaller by at least one than it is in $D^*$. Finally, the leaves in $D^*$ corresponding to the nodes from $T - T_x$ are descendants of the nodes in $D^*$ querying $x_1, \ldots, x_k$. These $k$ nodes are not contained in $D^*_1$, so the depth of each leaf in $D^*_1$ is smaller by at least $k$ than it is in $D^*$.
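As an illustration only: the argument above uses step (1) of the algorithm solely through the property that it selects a node $x$ minimizing $|w(T_x) - w(T - T_x)|$. The following Python sketch implements that selection rule under this assumption. The dictionary-based tree representation and the names `subtree_weights` and `most_balanced_query` are hypothetical and not taken from the paper, and ties are broken arbitrarily by traversal order rather than by any particular tie-breaking rule.

```python
def subtree_weights(children, weight, root):
    """Return a dict mapping every node v to w(T_v) in the tree rooted at `root`.

    children: dict node -> list of child nodes (leaves may be absent)
    weight:   dict node -> non-negative weight, one entry per node
    """
    order, stack = [], [root]
    while stack:                       # iterative DFS: parents precede children
        v = stack.pop()
        order.append(v)
        stack.extend(children.get(v, []))
    total = {v: weight[v] for v in order}
    for v in reversed(order):          # children are complete before their parent
        for c in children.get(v, []):
            total[v] += total[c]
    return total


def most_balanced_query(children, weight, root):
    """Pick a non-root node x minimizing |w(T_x) - w(T - T_x)| (assumed step (1))."""
    total = subtree_weights(children, weight, root)
    w_all = total[root]
    candidates = [v for v in total if v != root]
    return min(candidates, key=lambda v: abs(total[v] - (w_all - total[v])))


if __name__ == "__main__":
    children = {"r": ["a", "b"], "a": ["c"]}
    weight = {"r": 1, "a": 2, "b": 2, "c": 4}
    print(most_balanced_query(children, weight, "r"))
```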
Combining the findings, we obtain
\[
\begin{aligned}
\mathrm{cost}(D') &= w(T) + \sum_{v \in T'} w(v)\, d(r(D^*_0), l_v) + \sum_{v \in T''} w(v)\, d(r(D^*_0), l_v) + \sum_{v \in T - T_x} w(v)\, d(r(D^*_1), l_v) \\
&\le w(T) + \sum_{v \in T'} w(v)\, d(r(D^*), l_v) + \sum_{v \in T''} w(v) \bigl( d(r(D^*), l_v) - 1 \bigr) + \sum_{v \in T - T_x} w(v) \bigl( d(r(D^*), l_v) - k \bigr) \\
&= w(T) + \mathrm{cost}(D^*) - w(T'') - k\, w(T - T_x) .
\end{aligned}
\]
As $T' = T - ((T - T_x) \cup T'')$, we have $w(T') = w(T) - w(T - T_x) - w(T'')$, so
\[
\mathrm{cost}(D') \le \mathrm{cost}(D^*) + w(T') - (k - 1)\, w(T - T_x) .
\]
We have argued above that $w(T_{x_i}) \le w(T - T_x)$ for $i = 1, \ldots, k$. Therefore, $w(T') = w\bigl(\bigcup_{i=1}^{k} T_{x_i}\bigr) \le \sum_{i=1}^{k} w(T_{x_i}) \le k\, w(T - T_x)$, and
\[
\mathrm{cost}(D') \le \mathrm{cost}(D^*) + k\, w(T - T_x) - (k - 1)\, w(T - T_x) = \mathrm{cost}(D^*) + w(T - T_x) \le \mathrm{cost}(D^*) + w(T)/2 .
\]

E An FPTAS for Searching in Bounded-Degree Trees

E.1 Algorithm for $\mathcal{P}^B(F, P)$

In this section we complete the correctness proof of the proposed algorithm for solving $\mathcal{P}^B(F, P)$. It has already been argued in Section 3.2 that the algorithm always returns a feasible solution. In addition, in Case 1 of the algorithm, the returned solution is also optimal. Here we prove the optimality for the second case:

Case 2: $F$ is a tree $T_v$. Let $D^*$ be an optimal solution for $\mathcal{P}^B(T_v, P)$. Consider the internal node of $D^*$ assigned to $v$; since $D^*$ is compatible with $P$ and since this node belongs to the left path of $D^*$, it corresponds to a node $p_i$ of $P$. Thus, we denote this internal node of $D^*$ assigned to $v$ by $\bar{p}'_i$. Let $\bar{z}'$ be the leaf of $D^*$ assigned to $v$ and notice that $\bar{z}'$ lies in the left path of the right subtree of $\bar{p}'_i$. We construct $\bar{D}'$ from $D^*$ by essentially applying the inverse of Step 2 of the algorithm: remove from $D^*$ the right subtree of $\bar{p}'_i$; this removed subtree becomes the subtree of $\bar{p}'_i$; assign $\bar{p}'_i$ as blocked and remove $\bar{z}'$.
(One can use Figures 7.d and 7.c to better visualize this construction.) The tree $\bar{D}'$ is actually an EST for the forest $\{T_{c_1(v)}, \ldots, T_{c_{\delta(v)}(v)}\}$ and has height at most $B$. Now construct $\bar{P}'$ by taking the left path of $\bar{D}'$, setting all the non-blocked nodes as unassigned and also setting every node after $\bar{p}'_i$ as unassigned. Clearly $\bar{D}'$ is compatible with $\bar{P}'$ and thus feasible for $\mathcal{P}^B(\{T_{c_1(v)}, \ldots, T_{c_{\delta(v)}(v)}\}, \bar{P}')$. Notice, however, that $\bar{P}'$ starts with the prefix of $P$ until $p_i$ (in terms of its assignment), then it has a blocked node corresponding to $p_i$ and then some unassigned nodes. Let $\bar{t}$ be the number of nodes in $\bar{P}'$. Since the last node of $\bar{P}'$ comes from the parent of $\bar{z}'$ in $D^*$ and $D^*$ has height at most $B$, we have that $\bar{t} \le B$. Thus, the path $\bar{P}'$ coincides with the path $P_{i,t}$ constructed by the algorithm when $t = \bar{t}$. It is easy to see that the tree $D_{i,\bar{t}}$, as defined in the algorithm, has cost
\[
OPT\bigl(\mathcal{P}^B(\{T_{c_1(v)}, \ldots, T_{c_{\delta(v)}(v)}\}, P_{i,\bar{t}})\bigr) + \bar{t} \cdot w(v) = OPT\bigl(\mathcal{P}^B(\{T_{c_1(v)}, \ldots, T_{c_{\delta(v)}(v)}\}, \bar{P}')\bigr) + \bar{t} \cdot w(v),
\]
which is at most $\mathrm{cost}(\bar{D}') + \bar{t}\, w(v)$ due to the feasibility of $\bar{D}'$. Finally, notice that this last quantity is actually the cost of $D^*$, so $\mathrm{cost}(D_{i,\bar{t}}) \le \mathrm{cost}(D^*)$. Since the procedure returns a solution which is at least as good as $D_{i,\bar{t}}$, its optimality follows.

E.2 Proof of Lemma 7

By means of contradiction, suppose $v^* \in D^*$ with $d(r(D^*), v^*) \ge c$ but $w(D^*_{v^*}) > \alpha \cdot w(D^*)$. Let $\tilde{T}$ be the subtree of $T$ associated with $v^*$ and let $x$ be the root of $\tilde{T}$. Let $y$ be a node in $T_x$ to be specified later. Let $T^0 = T - T_y$ and $T^i = T_{c_i(y)}$, for $i = 1, \ldots, \delta(y)$. Moreover, let $D^i$ be the search tree for $T^i$ obtained from $D^*$ via Lemma 4.
We shall construct a new search tree $D'$ for $T$ as follows: the root of $D'$ is assigned to $y$; the left tree of $r(D')$ is the search tree $D^0$; in the right tree of $r(D')$ we build a left path containing nodes corresponding to queries for $c_1(y), c_2(y), \ldots, c_{\delta(y)}(y)$, and we make $D^i$ the right subtree of the node querying $c_i(y)$. It is easy to see that the cost of $D'$ is at most $\sum_{i=0}^{\delta(y)} \mathrm{cost}(D^i) + (\Delta(T) + 1) \cdot w(T)$. We claim that, for a suitable choice of $y$, $D'$ improves over $D^*$. For this, let $S$ be the set of nodes of $T_x$ which are queried in the path from $r(D^*)$ to $v^*$. We distinguish the following cases.

Case 1: $|S| \ge 2c/3$. Set $y$ as a node in $T_x$ such that $|T_y \cap S| \ge |S|/2$ and $|T_{c_i(y)} \cap S| \le |S|/2$ for every child $c_i(y)$ of $y$, and construct $D'$ as described previously. To find such a node $y$, traverse $T_x$ starting at its root and proceeding as follows: if $u$ is the current node then move to the child $v$ of $u$ with largest $|T_v \cap S|$; the traversal ends when $|T_u \cap S| \le |S|/2$. The parent of the node where the traversal ends is the desired $y$.

To bound the cost of $D'$ we first consider the cost of a particular tree $D^i$. From its construction we have that $d(r(D^i), l_u) \le d(r(D^*), l_u)$ for any node $u \in T^i$. Moreover, for any node $u \in T^i \cap \tilde{T}$ the path from $r(D^*)$ to $l_u$ contains $v^*$ and therefore it contains $|S \setminus T^i|$ queries to nodes in $T_x \setminus T^i$. Since these nodes were removed in the construction of $D^i$, we have that for every $u \in T^i \cap \tilde{T}$
\[
d(r(D^i), l_u) \le d(r(D^*), l_u) - |S \setminus T^i| \le d(r(D^*), l_u) - \frac{|S|}{2},
\]
where the last inequality follows from the definition of $y$. It follows that
\[
\mathrm{cost}(D^i) \le \sum_{u \in T^i} d(r(D^*), l_u) \cdot w(u) - \frac{|S| \cdot w(T^i \cap \tilde{T})}{2} .
\]
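The traversal used in Case 1 to locate $y$ can be sketched as follows. This is a minimal Python illustration, not code from the paper: the tree $T_x$ is assumed to be given as a dictionary from each node to its list of children, $S$ as a set of nodes contained in $T_x$, and the function name `find_split_node` is hypothetical. Instead of stepping one node too far and returning the parent, the loop stops at the first node whose heaviest child already satisfies $|T_v \cap S| \le |S|/2$, which yields the same node $y$.

```python
def find_split_node(children, S, x):
    """Return y in T_x with |T_y ∩ S| >= |S|/2 and |T_c ∩ S| <= |S|/2 for every child c of y.

    children: dict node -> list of children (describes the subtree T_x rooted at x)
    S:        set of nodes, assumed to be contained in T_x
    """
    order, stack = [], [x]
    while stack:                              # iterative DFS: parents precede children
        v = stack.pop()
        order.append(v)
        stack.extend(children.get(v, []))

    # count[v] = |T_v ∩ S|, accumulated bottom-up
    count = {v: (1 if v in S else 0) for v in order}
    for v in reversed(order):
        for c in children.get(v, []):
            count[v] += count[c]

    size = count[x]                           # equals |S| because S is contained in T_x
    y = x
    while True:
        kids = children.get(y, [])
        if not kids:
            return y                          # a leaf has no child to violate the bound
        heavy = max(kids, key=lambda c: count[c])
        if 2 * count[heavy] <= size:          # every child now has at most |S|/2 nodes of S
            return y
        y = heavy                             # follow the child with largest |T_v ∩ S|


if __name__ == "__main__":
    children = {"x": ["a", "d"], "a": ["b"], "b": ["c"]}
    print(find_split_node(children, {"a", "b", "c", "d"}, "x"))
```

Each step moves to a child whose subtree still contains more than $|S|/2$ nodes of $S$, so the node returned keeps at least $|S|/2$ of them while none of its children does.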
Combining this bound with our upper bound on the cost of $D'$ we get that
\[
\mathrm{cost}(D') \le \mathrm{cost}(D^*) - d(r(D^*), l_y) \cdot w(y) - \frac{|S|\, w(\tilde{T} - y)}{2} + (\Delta(T) + 1) \cdot w(T) .
\]
We claim that actually $\mathrm{cost}(D') \le \mathrm{cost}(D^*) - \frac{|S|\, w(\tilde{T})}{2} + (\Delta(T) + 1) \cdot w(T)$. To see this, first suppose $y \in \tilde{T}$; then $d(r(D^*), l_y) \cdot w(y) \ge |S| \cdot w(y)$ and the claim holds. In the other case, where $y \notin \tilde{T}$, the claim follows from the fact that $w(\tilde{T} - y) = w(\tilde{T})$. By making use of this claim, the hypothesis on $|S|$ and the facts that $w(\tilde{T}) = w(D^*_{v^*}) > \alpha \cdot w(D^*)$ and $c \cdot \alpha > 3(\Delta(T) + 1)$, we conclude that $D'$ improves over $D^*$, which is a contradiction.

Case 2: $|S| < 2c/3$. We set $y = x$ and construct $D'$ as described at the beginning of the proof. Again, we are trying to reach the contradiction $\mathrm{cost}(D') < \mathrm{cost}(D^*)$. Recall that $\mathrm{cost}(D') \le \sum_{i=0}^{\delta(y)} \mathrm{cost}(D^i) + (\Delta(T) + 1) \cdot w(T)$, so we bound the cost of the trees $D^i$. By construction we have that $\mathrm{cost}(D^0) \le \sum_{u \in T^0} d(r(D^*), l_u)\, w(u)$. Now consider some tree $D^i$ for $i \ne 0$. From its construction we have that $d(r(D^i), l_u) \le d(r(D^*), l_u)$ for any node $u \in T^i$. Moreover, for any node $u \in T^i \cap \tilde{T}$ the path from $r(D^*)$ to $l_u$ contains $v^*$ and therefore it contains at least $c - |S|$ queries to nodes in $T - T_x = T^0$. Then Lemma 4 guarantees that for every $u \in T^i \cap \tilde{T}$ we have $d(r(D^i), l_u) \le d(r(D^*), l_u) - (c - |S|)$.

Weighting these bounds over all nodes in $T$ we have:
\[
\begin{aligned}
\sum_{i=0}^{\delta(y)} \mathrm{cost}(D^i) &\le \sum_{i=0}^{\delta(y)} \sum_{u \in T^i} d(r(D^*), l_u)\, w(u) - \sum_{i=1}^{\delta(y)} \sum_{u \in T^i \cap \tilde{T}} (c - |S|) \cdot w(u) \\
&= \mathrm{cost}(D^*) - d(r(D^*), l_x)\, w(x) - (c - |S|) \cdot \bigl( w(\tilde{T}) - w(x) \bigr) \\
&\le \mathrm{cost}(D^*) - (c - |S|) \cdot w(\tilde{T}),
\end{aligned}
\]
where the last inequality is valid because $l_x$ is a descendant of $v^*$ in $D^*$, so that $d(r(D^*), l_x) \ge c$.
Thus, by combining the upper bound on $\mathrm{cost}(D')$ with the previous displayed inequality we get that $\mathrm{cost}(D') \le \mathrm{cost}(D^*) - (c - |S|) \cdot w(\tilde{T}) + (\Delta(T) + 1) \cdot w(T)$. By making use of the hypothesis $|S| < 2c/3$ and the facts that $w(\tilde{T}) = w(D^*_{v^*}) > \alpha \cdot w(D^*) = \alpha \cdot w(T)$ and $c \cdot \alpha > 3(\Delta(T) + 1)$, we conclude that $D'$ improves over $D^*$, which gives the desired contradiction.

E.3 Proof of Theorem 3

The following lemma shows that the bound on the height of the shortest optimal tree holds even when the weight function is not strictly positive.

Lemma 8. There is an optimal search tree for $(T, w)$ of height at most $O(\Delta(T) \cdot (\log w(T) + \log n))$.

Proof. Consider an optimal search tree $D^*$ for $(T, w)$. Notice that for any $v \in D^*$, $D^*_v$ is an optimal search tree for the subtree of $T$ associated with $v$. So we can employ Lemma 7 repeatedly and get that for every node $v$ of $D^*$ at a level $l = O(\Delta(T) \cdot \log w(T))$, $w(D^*_v) = 0$. Now let $L$ be the set of all nodes of $D^*$ at level $l$. For each $v \in L$ let $D_v$ be the shortest search tree for the subtree of $T$ associated with node $v$. It was proved in [5] that the height of $D_v$ can be upper bounded by $(\Delta(T) + 1) \cdot \log n$. Then we can construct the search tree $D'$ for $T$ as follows: start with $D^*$ and for each $v \in L$ replace $D^*_v$ by $D_v$. Clearly $D'$ has height at most $O(\Delta(T) \cdot (\log w(T) + \log n))$. Moreover, since $w(D_v) = w(D^*_v) = 0$ for all $v \in L$, it follows that $D'$ has the same cost as $D^*$ and hence is optimal.

Theorem 3. Consider an instance $(T, w)$ to our search problem where $\Delta(T) = O(1)$. Then there is an algorithm for computing an optimal search tree for $(T, w)$ that runs in $poly(n \cdot w(T))$ time. In addition, there is an algorithm for computing a $(1 + \epsilon)$-approximate search tree for $(T, w)$ that runs in $poly(n/\epsilon)$ time.

Proof.
The existence of an exact pseudo-polynomial algorithm which runs in $poly(n \cdot w(T))$ time follows from the discussion presented in Section 3.2 (see "From the DP algorithm to an FPTAS"). Thus, we only prove the second claim of the theorem, namely, that our search problem admits an FPTAS. We claim that the following procedure gives the desired FPTAS:

1. Let $W$ be the weight of the heaviest node of $T$, namely $W = \max_{u \in T} \{w(u)\}$. Define $K = \frac{\epsilon \cdot W}{n^2}$ and the weight function $w'$ such that $w'(u) = \lceil w(u)/K \rceil$ for every node $u \in T$.
2. Find an optimal search tree $D$ for $(T, w')$ using the pseudo-polynomial algorithm and return $D$.

First we analyze the running time of this procedure. Clearly Step 1 takes at most $O(n)$ time. In order to analyze Step 2, let $W' = \max_{u \in T} \{w'(u)\}$ and notice that $W' = \lceil W/K \rceil \le n^2/\epsilon + 1$. Thus, $w'(T) \le n W' \le n^3/\epsilon + n$. Then the pseudo-polynomial algorithm employed in Step 2 runs in $poly(n \cdot w'(T)) = poly(n/\epsilon)$ time. The running time of the whole procedure is then $poly(n/\epsilon)$, as desired.

Now we argue that the solution $D$ returned by the procedure is $(1 + \epsilon)$-approximate for the instance $(T, w)$. Let us make the weights explicit in the cost function, e.g. we denote by $\mathrm{cost}(D, w)$ and $\mathrm{cost}(D, w')$ the cost of $D$ with respect to the weights $w$ and $w'$. Thus we want to prove that $\mathrm{cost}(D, w) \le (1 + \epsilon)\, \mathrm{cost}(D^*, w)$, where $D^*$ is an optimal search tree for $(T, w)$.

Clearly for each node $u \in T$ we have $K \cdot w'(u) \le w(u) + K$ and hence
\[
K \cdot \mathrm{cost}(D^*, w') \le \mathrm{cost}(D^*, w) + \sum_{u \in T} d(r(D^*), l_u) \cdot K \le \mathrm{cost}(D^*, w) + n^2 \cdot K = \mathrm{cost}(D^*, w) + \epsilon \cdot W,
\]
where the last inequality follows from the fact that the distances are trivially upper bounded by $n$. Excluding the trivial case where $T$ is empty, notice that every path in $D^*$ from $r(D^*)$ to a leaf has length at least one.
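The rounding in Step 1 can be sketched in a few lines. The Python snippet below is only an illustration of the scaling $K = \epsilon W / n^2$, $w'(u) = \lceil w(u)/K \rceil$ described above; the function name `round_weights` and the dictionary representation of $w$ are assumptions made here, and the pseudo-polynomial algorithm of Step 2 is not reproduced. Exact rational arithmetic is used so that the ceilings are not affected by floating-point error, and the sketch assumes $W > 0$.

```python
from fractions import Fraction
from math import ceil


def round_weights(weights, eps):
    """Step 1 of the procedure: return (K, w') with w'(u) = ceil(w(u) / K).

    weights: dict node -> non-negative integer weight, with at least one positive entry
    eps:     approximation parameter, e.g. Fraction(1, 2)
    """
    n = len(weights)
    W = max(weights.values())                 # weight of the heaviest node
    K = Fraction(eps) * W / (n * n)           # K = eps * W / n^2 (requires W > 0)
    scaled = {u: ceil(Fraction(w) / K) for u, w in weights.items()}
    return K, scaled


if __name__ == "__main__":
    K, scaled = round_weights({"a": 10, "b": 3, "c": 1}, Fraction(1, 2))
    print(K, scaled)
```

By construction $w(u) \le K \cdot w'(u) \le w(u) + K$ for every node, which are exactly the two inequalities used in the approximation argument, and the largest scaled weight is at most $n^2/\epsilon + 1$.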
Thus, $\mathrm{cost}(D^*, w)$ can be lower bounded by $W$, and the previous displayed inequality gives $K \cdot \mathrm{cost}(D^*, w') \le (1 + \epsilon)\, \mathrm{cost}(D^*, w)$. But since $w(u) \le K \cdot w'(u)$ for all $u$, we have that
\[
\mathrm{cost}(D, w) \le K \cdot \mathrm{cost}(D, w') \le K \cdot \mathrm{cost}(D^*, w') \le (1 + \epsilon)\, \mathrm{cost}(D^*, w),
\]
where the second inequality follows from the optimality of $D$ for $(T, w')$. Therefore, $D$ is a $(1 + \epsilon)$-approximate search tree for the instance $(T, w)$, which concludes the proof of the theorem.

F Polynomiality of the tree search problem for instances of diameter at most 3

First consider an instance $(T, w)$ of our search problem where $T$ has diameter two, i.e., it is a star. Let us root the star in its center. Employing a simple exchange argument it is easy to show that the children of $r(T)$ must be queried according to their weights, in decreasing order. Thus, an optimal search tree for $(T, w)$ can be built based on any sorting algorithm in $O(n \log n)$ time.

Now assume $T$ has diameter 3. Notice that the only possible structure for $T$ is the following: there are two nodes $r$ and $r'$ joined by an edge and all other nodes are adjacent either to $r$ or to $r'$. In order to define the questions, let us take $r$ as the root. Let $l$ ($l'$) be the heaviest leaf among the children of $r$ ($r'$). It is not difficult to see that the root of any optimal search tree must query one of the nodes in the set $\{r', l, l'\}$. This can be proved using a simple exchange argument. If $r(D)$ is assigned to $r'$ then its right subtree is an optimal search tree for $T_{r'}$ and its left subtree is an optimal search tree for $T - T_{r'}$. If $r(D)$ is assigned to $l$ then its right subtree is a leaf assigned to $l$ and its left subtree is an optimal search tree for $T - l$. Analogously, when $r(D)$ is assigned to $l'$ its left subtree is an optimal search tree for $T - l'$. Finally, notice that in the first case, both $T_{r'}$ and $T - T_{r'}$ have diameter at most 2.
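The case analysis above translates directly into a small dynamic program. The Python sketch below is an illustration under stated assumptions rather than the authors' implementation: it computes only the optimal cost (not the search tree itself), it takes the cost of a search tree to be $\sum_v w(v)\, d(r(D), l_v)$, so that a tree whose root splits $T$ into two parts costs $w(T)$ plus the costs of the two subtrees, and the names `star_cost` and `diameter3_cost` are hypothetical. The state $(i, j)$ means that the $i$ heaviest leaf-children of $r$ and the $j$ heaviest leaf-children of $r'$ have already been separated.

```python
def star_cost(center_weight, leaf_weights):
    """Optimal cost for a star: query the leaves in order of decreasing weight.

    The i-th heaviest leaf ends at depth i and the center at depth k (k = number of leaves).
    """
    leaves = sorted(leaf_weights, reverse=True)
    k = len(leaves)
    return sum(i * w for i, w in enumerate(leaves, start=1)) + k * center_weight


def diameter3_cost(w_r, w_rp, leaves_r, leaves_rp):
    """Optimal cost when T consists of the edge r - r' plus leaf-children of r and of r'."""
    a = sorted(leaves_r, reverse=True)
    b = sorted(leaves_rp, reverse=True)
    p, q = len(a), len(b)

    def suffix_tables(center, leaves):
        k = len(leaves)
        weight = [center] * (k + 1)    # weight[i] = center + sum(leaves[i:])
        star = [0] * (k + 1)           # star[i]   = optimal cost of the star on leaves[i:]
        for i in range(k - 1, -1, -1):
            weight[i] = weight[i + 1] + leaves[i]
            star[i] = weight[i] + star[i + 1]
        return weight, star

    wa, sa = suffix_tables(w_r, a)
    wb, sb = suffix_tables(w_rp, b)

    opt = [[0] * (q + 1) for _ in range(p + 1)]
    for i in range(p, -1, -1):
        for j in range(q, -1, -1):
            best = sa[i] + sb[j]                    # root queries r': two stars remain
            if i < p:
                best = min(best, opt[i + 1][j])     # root queries heaviest remaining leaf of r
            if j < q:
                best = min(best, opt[i][j + 1])     # root queries heaviest remaining leaf of r'
            opt[i][j] = wa[i] + wb[j] + best        # every remaining node pays for this query
    return opt[0][0]


if __name__ == "__main__":
    print(diameter3_cost(1, 1, [5], [4]))
```

After the two sorts, the suffix tables give the star costs in constant time per entry, so each of the $(p + 1)(q + 1)$ table cells is filled with a constant number of operations.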
Consider the recursion tree of the above procedure; notice that every subproblem $(T', w)$ has a specific structure: $T'$ is the subtree of $T$ induced by the nodes $r$, $r'$, the leaf-children of $r$ other than its $i$ heaviest ones, and the children of $r'$ other than its $j$ heaviest ones (for some $i, j$). Employing a Dynamic Programming strategy together with an $O(n \log n)$ preprocessing for the two stars centered at $r$ and $r'$, it is not difficult to see that each of these $O(n^2)$ problems can be solved in $O(1)$ time. This gives an $O(n^2)$ algorithm for finding an optimal search tree for $(T, w)$.

Figures

Figure 1: (left) The input tree $T$; (right) a search tree $D$ for $T$.

Figure 2: The tree obtained from the instance $I = (\{a, b, c, d, e, f\}, \{X_1, X_2, X_3, X_4\})$ of 3-bounded X3C, where $X_1 = \{a, b, c\}$, $X_2 = \{b, c, d\}$, $X_3 = \{b, e, f\}$, $X_4 = \{d, e, f\}$.

Figure 3: The two possible configurations we use for the part of the search tree that concerns the subtree $T_i$ and the leaf $a_i$, and a sequential search tree for $T_i$. (a) Sequential search tree for $\{t_i, s_{i3}, s_{i2}, s_{i1}\}$. (b) Configuration $D^A_i$. (c) Configuration $D^B_i$.

Figure 4: Realization $D^A$ (left), and the (optimal) realization w.r.t. the exact cover $\{X_1, X_4\}$ (right); in bold are the questions involved in the configuration changes. Only the leaves associated with nodes of $T$ with non-zero weights are shown here.

Figure 5: The tree $T^b$ obtained from the instance $I = (\{a, b, c, d, e, f\}, \{X_1, X_2, X_3, X_4\})$ of 3-bounded X3C, where $X_1 = \{a, b, c\}$, $X_2 = \{b, c, d\}$, $X_3 = \{b, e, f\}$, $X_4 = \{d, e, f\}$. Here $\Pi = \langle \pi_1, \ldots, \pi_{m+n} \rangle = \langle a, b, c, X_1, d, X_2, e, f, X_3, X_4 \rangle$ and $Z_1 = \{X_1\}$, $Z_2 = \{X_2\}$, $Z_3 = \{X_3, X_4\}$.

Figure 6: (a) PLP $P$ with partition $U = \{U_f, U_o\}$ indicated. The blank nodes are unassigned and the black ones are blocked. (b) PLP's $P_f$ and $P_o$. (c) The optimal EST's $D_f$ and $D_o$ and (d) the resulting EST $D_U$ constructed by taking the 'union' of $D_f$ and $D_o$.

Figure 7: (a) PLP $P$. (b) PLP $P_{i,t}$. (c)-(d) Construction of $D_{i,t}$, given in picture (d), starting from an EST $D'$, given in picture (c), for $\mathcal{P}^B(\{T_{c_1(v)}, \ldots, T_{c_{\delta(v)}(v)}\}, P_{i,t})$.
