An Efficient Algorithm for Partial Order Production

We consider the problem of partial order production: arrange the elements of an unknown totally ordered set T into a target partially ordered set S, by comparing a minimum number of pairs in T. Special cases include sorting by comparisons, selection,…

Authors: Jean Cardinal, Samuel Fiorini, Gwenaël Joret

Jean Cardinal, Samuel Fiorini, Gwenaël Joret †, Raphaël M. Jungers ‡, J. Ian Munro §

Abstract

We consider the problem of partial order production: arrange the elements of an unknown totally ordered set $T$ into a target partially ordered set $S$, by comparing a minimum number of pairs in $T$. Special cases include sorting by comparisons, selection, multiple selection, and heap construction. We give an algorithm performing $ITLB + o(ITLB) + O(n)$ comparisons in the worst case. Here, $n$ denotes the size of the ground sets, and $ITLB$ denotes a natural information-theoretic lower bound on the number of comparisons needed to produce the target partial order. Our approach is to replace the target partial order by a weak order (that is, a partial order with a layered structure) extending it, without increasing the information-theoretic lower bound too much. We then solve the problem by applying an efficient multiple selection algorithm. The overall complexity of our algorithm is polynomial. This answers a question of Yao (SIAM J. Comput. 18, 1989). We base our analysis on the entropy of the target partial order, a quantity that can be efficiently computed and provides a good estimate of the information-theoretic lower bound.

Keywords: Partial order, graph entropy

1 Introduction

We consider the Partial Order Production problem:

Given a set $S = \{s_1, s_2, \ldots, s_n\}$ partially ordered by a known partial order $\preceq$ and a set $T = \{t_1, t_2, \ldots, t_n\}$ totally ordered by an unknown linear order $\leq$, find a permutation $\pi$ of $\{1, 2, \ldots, n\}$ such that $s_i \preceq s_j \Rightarrow t_{\pi(i)} \leq t_{\pi(j)}$, by asking questions of the form: "is $t_i \leq t_j$?".

The Partial Order Production problem generalizes many fundamental problems (see Figure 1), corresponding to specific families of posets $P := (S, \preceq)$.
It amounts to sorting by comparisons when $P$ is a chain. The selection [17] and multiple selection [8] problems are special cases in which $P$ is a weak order$^1$, that is, has a layered structure (with $a \preceq b$ iff $a$ is on a lower layer than $b$). When the Hasse diagram of $P$ is a complete binary tree, the problem boils down to heap construction [6].

Figure 1: Special cases of the Partial Order Production problem: (a) sorting, (b) selection, (c) multiple selection, (d) heap construction.

Funding: This work was supported by the "Actions de Recherche Concertées" (ARC) fund of the "Communauté française de Belgique", NSERC of Canada, and the Canada Research Chairs Programme. G.J. and R.J. are Postdoctoral Researchers of the "Fonds National de la Recherche Scientifique" (F.R.S.–FNRS). A preliminary version of the work appeared in [5].
† Université Libre de Bruxelles (ULB), Brussels, Belgium. E-mail: {jcardin,sfiorini,gjoret}@ulb.ac.be
‡ Université catholique de Louvain (UCL), Louvain-la-Neuve, Belgium. E-mail: raphael.jungers@uclouvain.be
§ University of Waterloo, Waterloo, Ontario, Canada. E-mail: imunro@uwaterloo.ca
Footnote 1: Most of the terms that are not defined in the introduction are defined in Sections 2 and 3.

We assume that the target poset $P$ is part of the input and represented by its Hasse diagram. Hence the size of the input can be $\Omega(n^2)$, whereas sorting the $n$ elements of $T$ takes $O(n \log n)$ time. In other words, reading the input could take more time than is necessary to solve the problem, provided a topological sorting of $P$ is known.
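To make the problem statement concrete, the following minimal Python sketch checks whether a permutation $\pi$ is feasible for a poset given by the arcs of its Hasse diagram. It is an illustration only and not part of the algorithm of this paper: the function names are hypothetical, indices are 0-based, and the values of $T$ are read directly, whereas an actual algorithm may only compare them. Checking the arcs of the Hasse diagram suffices because $\leq$ is transitive on $T$.

```python
from itertools import permutations

def is_feasible(hasse_arcs, pi, t):
    """Return True if s_i below s_j implies t[pi[i]] <= t[pi[j]].

    hasse_arcs: pairs (i, j) meaning s_i is covered by s_j (0-based indices).
    Checking cover pairs is enough, since <= on T is transitive.
    """
    return all(t[pi[i]] <= t[pi[j]] for i, j in hasse_arcs)

def feasible_permutations(n, hasse_arcs, t):
    """Brute-force list of all feasible permutations (tiny n only)."""
    return [pi for pi in permutations(range(n))
            if is_feasible(hasse_arcs, pi, t)]

# Heap construction on three elements: the root s_0 lies below s_1 and s_2.
heap_arcs = [(0, 1), (0, 2)]
t = [5, 2, 9]
print(is_feasible(heap_arcs, (1, 0, 2), t))        # root receives the minimum
print(len(feasible_permutations(3, heap_arcs, t)))  # number of feasible permutations
```

On this example the number of feasible permutations equals the number of linear extensions of the three-element heap poset, in line with the counting argument used for the lower bound.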
To cope with this paradoxical situation, we consider algorithms that proceed in two phases: a preprocessing phase during which an ordering strategy is determined (for instance, in the form of a decision tree, or any more efficient description, if possible), on the basis of the structure of $P$, and an ordering phase during which all comparisons between elements of $T$ are performed. Accordingly, we distinguish the preprocessing complexity and the ordering complexity of the algorithm, the latter being essentially proportional to the number of comparisons performed. As noted before, we expect the overall complexity of the algorithm to be dominated by its preprocessing complexity. Thus it is desirable to perform the preprocessing phase only once, and then use the resulting ordering strategy on several data sets.

Lower bound on the number of comparisons. We denote by $e(P)$ the number of linear extensions of the target poset $P$. Feasible permutations $\pi$ are in one-to-one correspondence with the linear extensions of $P$, thus the number of feasible permutations is exactly $e(P)$. On the other hand, the total number of permutations is $n!$. We thus have the following information-theoretic lower bound (logarithms are base 2):

Theorem 1 ([1, 28, 30]). Any algorithm solving the Partial Order Production problem for an $n$-element poset $P$ requires
\[ ITLB := \log n! - \log e(P) \]
comparisons between elements of $T$ in the worst case and on average.

Note that we can assume without loss of generality that $P$ is connected, hence we also have a lower bound of $n - 1$.

Problem history and contribution. The Partial Order Production problem was first proposed in 1976 by Schönhage [28]. It was studied five years later by Aigner [1]. Another four years passed and the problem simultaneously appeared in two survey papers: one by Saks [27] and the other by Bollobás and Hell [2].
In his survey, Saks conjectured that the Partial Order Production problem can be solved by performing $O(ITLB) + O(n)$ comparisons in the worst case. Four years later, in 1989, Yao proved Saks' conjecture [30]. He gave an algorithm solving the Partial Order Production problem in at most $c_1\, ITLB + c_2\, n$ comparisons, for some constants $c_1$ and $c_2$. However, the preprocessing phase of Yao's algorithm seems difficult to implement efficiently. In fact, in the last section of his paper [30], Yao asked whether, assuming $P$ is part of the input (as is the case here), there exists a polynomial-time algorithm for the problem that performs $O(ITLB) + O(n)$ comparisons.

Our main contribution is an algorithm that solves the Partial Order Production problem and performs at most $ITLB + o(ITLB) + O(n)$ comparisons in the worst case. The preprocessing complexity of our algorithm is $O(n^3)$. Hence we answer affirmatively the question of Yao [30] mentioned above. Moreover, we also significantly improve the ordering complexity, since Yao's constants $c_1$ and $c_2$ are quite large.

Further references, focusing mainly on lower bounds for the problem and generalizations of it, include Culberson and Rawlins [12], Chen [9] and Carlsson and Chen [7].

Main ideas underlying our approach. We reduce the Partial Order Production problem to the multiple selection problem. Instead of solving the problem for the given target poset $P$ we solve it for a larger (more constrained) poset that has a simpler structure, namely, a weak order $W$ extending $P$ (a weak order is a set of antichains with a total ordering between these antichains). This approach works because, as we show below, it is possible to find such a weak order $W$ whose corresponding information-theoretic lower bound $ITLB$ is not too large compared to that of $P$.
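On very small posets, the lower bound of Theorem 1 and the effect of passing to a weak order extension can be checked by brute force. The sketch below is an illustration under stated assumptions, not the method of this paper: the function names and the example poset are hypothetical, and the enumeration of all $n!$ orderings is only usable for tiny $n$.

```python
from itertools import permutations
from math import factorial, log2

def count_linear_extensions(n, relations):
    """Brute-force e(P): orderings of range(n) in which a precedes b
    for every pair (a, b) in `relations` (a below b in P)."""
    count = 0
    for order in permutations(range(n)):
        pos = {v: k for k, v in enumerate(order)}
        if all(pos[a] < pos[b] for a, b in relations):
            count += 1
    return count

def itlb(n, relations):
    """Information-theoretic lower bound log n! - log e(P), base 2."""
    return log2(factorial(n)) - log2(count_linear_extensions(n, relations))

def weak_order_relations(layers):
    """All pairs (a, b) with a in a strictly lower layer than b."""
    return [(a, b)
            for i, lower in enumerate(layers)
            for upper in layers[i + 1:]
            for a in lower for b in upper]

# Hypothetical example: the "N"-shaped poset 0 < 2, 1 < 2, 1 < 3,
# and the weak order with layers {0, 1} below {2, 3}, which extends it.
P = [(0, 2), (1, 2), (1, 3)]
W = weak_order_relations([[0, 1], [2, 3]])
print(count_linear_extensions(4, P), count_linear_extensions(4, W))
print(itlb(4, P), itlb(4, W))
```

Since $W$ extends $P$, every linear extension of $W$ is one of $P$, so the bound for $W$ is never smaller than the bound for $P$; the point of the approach is to keep the gap small.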
Unfortunately, computing $ITLB$ exactly is #P-hard, because computing the number of linear extensions of a poset is #P-complete, a result due to Brightwell and Winkler [3]. The analysis is made possible because there exists a quantity, depending on the structure of the target poset, that can be computed in polynomial time and provides a good estimate of $ITLB$. This quantity is $nH$, where $H$ denotes the entropy of the considered target poset. (The entropy of a graph is defined in the next section, and the entropy of a poset is defined as the entropy of its comparability graph.) It was Körner who introduced the notion of the entropy of a graph, in the context of source coding [21]. The idea of estimating an information-theoretic lower bound by means of the entropy of a poset was used before by Kahn and Kim in their inspiring work on sorting with partial information [18], see below.

Related problems. In 1971 Chambers [8] proposed an algorithm for the Partial Sorting problem, defined as follows: given a vector $V$ of $n$ numbers and a set $I \subseteq \{1, 2, \ldots, n\}$ of indices, rearrange the elements of $V$ so that for every $i \in I$, all elements with indices $j < i$ are smaller than or equal to $V_i$, and elements with indices $j > i$ are greater than or equal to $V_i$. For the indices $i \in I$, the elements $V_i$ in the rearranged vector have rank exactly $i$, hence this problem is also called multiple selection. The Partial Sorting problem is a special case of Partial Order Production in which the partial order is also a weak order. The algorithm proposed by Chambers is similar to Hoare's "find" algorithm [17], or QuickSelect. It has been refined and analyzed by Dobkin and Munro [13], Panholzer [25], Prodinger [26], and Kaligosi, Mehlhorn, Munro, and Sanders [20]. For our purposes, the key result is that of Kaligosi et al.
[20] in which it is shown that multiple selection can be done within a lower order term of the information-theoretic lower bound, plus a linear term.

Another generalization of the sorting problem, called Sorting with Partial Information, was studied by Kahn and Kim [18]:

Given an unknown linear order $\leq$ on a set $T = \{t_1, \ldots, t_n\}$, together with a subset $\preceq$ of the relations $t_i \leq t_j$ forming a partial order, determine the complete linear order $\leq$ by asking questions of the form: "is $t_i \leq t_j$?".

This problem is equivalent to sorting by comparisons if $\preceq$ is empty. The information-theoretic lower bound for that problem is $\log e(Q)$, where $Q := (T, \preceq)$. The problem is complementary to the Partial Order Production problem in the sense that sorting by comparisons can be achieved by first solving a Partial Order Production problem, then solving the Sorting with Partial Information problem on the output. A proof that there exists a decision tree achieving the lower bound up to a constant factor has been known for some time (see in particular Kahn and Saks [19]). This is related to the 1/3–2/3 conjecture of Fredman [14] and Linial [22]. Kahn and Kim [18] provided a polynomial-time algorithm that finds the actual comparisons. They show that choosing the comparison that causes the entropy of $Q$ to increase the most leads to a decision tree that is near-optimal in the above sense.

Overview. In Section 2, we study the entropy of perfect graphs. We show that it is possible to approximate the entropy of a perfect graph $G$ using a simple greedy coloring algorithm. More precisely, we prove that any such approximation is at most $H(G) + \log(H(G) + 1) + O(1)$, where $H(G)$ denotes the entropy of graph $G$. Section 3 explains how to apply this result to solve the Partial Order Production problem algorithmically.
We begin the section by remarking that entropy is bound to play a central role for the problem since $nH(P) - n \log e \leq ITLB \leq nH(P)$, where $H(P)$ denotes the entropy of poset $P$. The preprocessing phase of our algorithm starts by applying the greedy coloring algorithm studied in Section 2 to the comparability graph of $P$. We then modify this coloring (we "uncross" the colors) in order to obtain an extension of $P$ which is an interval order $I$. Another application of the greedy coloring algorithm, this time on the comparability graph of $I$, yields a weak order $W$ extending $I$. Using our result on perfect graphs, we prove that the entropy of $W$ is not much larger than that of $P$, that is, $H(W) \leq H(P) + 2 \log(H(P) + 1) + O(1)$. The ordering phase of the algorithm then simply runs a multiple selection algorithm based on the weak order $W$. We use a multiple selection algorithm from Kaligosi et al. [20] that performs a number of comparisons close to the information-theoretic lower bound. We conclude the section by proving that the preprocessing complexity of our algorithm is $O(n^3)$.

Finally, in Section 4, we discuss the number of comparisons and study the existence of an algorithm solving the Partial Order Production problem in $ITLB + O(n)$ comparisons. We give an example showing that such an algorithm cannot always reduce the problem to the case where the target poset is a weak order. More specifically, we exhibit a family of interval orders with entropy at most $\frac{1}{2} \log n$, any weak order extension of which has entropy at least $\frac{1}{2} \log n + \Omega(\log \log n)$.

2 Entropy of Perfect Graphs

We recall that a subset $S$ of vertices of a graph is a stable set (or independent set) if the vertices in $S$ are pairwise nonadjacent.
Also, a graph $G$ is perfect if $\omega(H) = \chi(H)$ holds for every induced subgraph $H$ of $G$, where $\omega(H)$ and $\chi(H)$ denote the clique and chromatic numbers of $H$, respectively. Let us recall similarly that the stable set polytope of an arbitrary graph $G$ with vertex set $V$ and order $n$ is the $n$-dimensional polytope
\[ \mathrm{STAB}(G) := \mathrm{conv}\{\chi^S \in \mathbb{R}^V : S \text{ stable set in } G\}, \]
where $\chi^S$ is the characteristic vector of the subset $S$, assigning the value 1 to every vertex in $S$, and 0 to the others. The entropy of $G$ is defined as (see [11, 21])
\[ H(G) := \min_{x \in \mathrm{STAB}(G)} -\frac{1}{n} \sum_{v \in V} \log x_v. \tag{1} \]
For example, if $G = (V, E)$ is the graph with $V := \{a, b, c\}$ and $E := \{bc\}$, then $H(G) = 2/3$ and the minimum in (1) is attained for $x = (x_a, x_b, x_c) = (1, 1/2, 1/2)$.

Note that graph entropy was originally defined with respect to a given probability distribution on $V$. However, for our purposes we can take the uniform distribution, as in [18]. In this case we obtain Equation (1).

An upper bound on $H(G)$ can be found as follows: First, use the greedy coloring algorithm that iteratively removes a maximum stable set from $G$, giving a sequence $S_1, S_2, \ldots, S_k$ of stable sets of $G$. If $G$ is perfect, this can be done in polynomial time (see, e.g., Grötschel, Lovász and Schrijver [16]). Next, let $\tilde{x} \in \mathbb{R}^V$ be defined as
\[ \tilde{x} := \sum_{i=1}^{k} \frac{|S_i|}{n} \cdot \chi^{S_i}. \]
By definition, $\tilde{x} \in \mathrm{STAB}(G)$. We call any such point $\tilde{x}$ a greedy point. The value of the objective function in the definition of $H(G)$ for $\tilde{x}$ is $\sum_{i=1}^{k} \frac{|S_i|}{n} \log \frac{n}{|S_i|}$. We refer to the latter quantity simply as the entropy of $\tilde{x}$. It turns out that this gives a good approximation of $H(G)$ when $G$ is a perfect graph.

Theorem 2. Let $G$ be a perfect graph on $n$ vertices and denote by $\tilde{g}$ the entropy of an arbitrary greedy point $\tilde{x} \in \mathrm{STAB}(G)$.
Then
\[ \tilde{g} \leq \frac{1}{1 - \delta} \left( H(G) + \log \frac{1}{\delta} \right) \]
for all $\delta > 0$, and in particular
\[ \tilde{g} \leq H(G) + \log(H(G) + 1) + O(1). \]

A key tool in our proof of Theorem 2 is a min-max relation of Csiszár, Körner, Lovász, Marton, and Simonyi [11] relating the entropy of a perfect graph $G$ to the entropy of its complement $\bar{G}$:

Theorem 3 ([11]). If $G$ is a perfect graph on $n$ vertices, then $H(G) + H(\bar{G}) = \log n$.

We now turn to the proof of Theorem 2.

Proof of Theorem 2. Let $S_1, S_2, \ldots, S_k$ be the sequence of stable sets of $G$ selected by the greedy algorithm (in the order the algorithm removes them). So $S_1$ is a maximum stable set in $G$, $S_2$ is a maximum stable set in $G - S_1$, and so on.

The outline of the proof is as follows: We first use the sets $S_1, S_2, \ldots, S_k$ to define a point $z \in \mathbb{R}^V$, where $V$ is the vertex set of $G$. We then show that $z$ belongs to the stable set polytope of the complement $\bar{G}$ of $G$, that is, $z \in \mathrm{STAB}(\bar{G})$. Finally, we derive the desired inequality by combining the upper bound on $H(\bar{G})$ implied by $z$ with Theorem 3.

Fix $\delta > 0$. For each vertex $v \in V$ we let $m = m(v)$ be the unique index in $\{1, \ldots, k\}$ such that $v \in S_m$. We define $z$ by letting, for each vertex $v$ of $G$,
\[ z_v := \frac{\delta}{n} \left( \frac{1}{\tilde{x}_v} \right)^{1-\delta} = \frac{\delta}{n} \left( \frac{n}{|S_{m(v)}|} \right)^{1-\delta} = \frac{\delta}{n^{\delta}} \left( \frac{1}{|S_{m(v)}|} \right)^{1-\delta}. \]
We claim that for every stable set $S$ of $G$:
\[ \sum_{v \in S} z_v \leq 1. \tag{2} \]
Write the stable set $S$ as $S = T_1 \cup T_2 \cup \cdots \cup T_\ell$, where $T_i$ is the $i$th subset of $S$ taken by the greedy algorithm during its execution. For every $v \in T_1$, we have $S_{m(v)} = S_1$, and $|S_{m(v)}| \geq |S|$, since the greedy algorithm could have selected the set $S$ when it took $S_{m(v)}$. More generally, for every $i \in \{1, 2, \ldots, \ell\}$ and $v \in T_i$, we have $|S_{m(v)}| \geq |S| - \sum_{j=1}^{i-1} |T_j|$. It follows in particular that we can enumerate the points of $S$ as $v_1, v_2, \ldots$
, $v_s$ in such a way that
\[ |S_{m(v_i)}| \geq |S| - i + 1 \quad \forall i \in \{1, 2, \ldots, s\}. \]
We thus have
\[ \sum_{v \in S} z_v \leq \frac{\delta}{n^{\delta}} \left( \left( \frac{1}{|S|} \right)^{1-\delta} + \left( \frac{1}{|S| - 1} \right)^{1-\delta} + \cdots + 1 \right) \leq \frac{\delta}{n^{\delta}} \left( \int_0^{|S|} \frac{1}{x^{1-\delta}} \, dx \right) \leq 1. \]
Equation (2) follows.

Two classical results on perfect graphs are that the stable set polytope is completely described by the non-negativity and clique inequalities, that is,
\[ \mathrm{STAB}(G) = \Big\{ x \in \mathbb{R}^V_+ : \sum_{v \in K} x_v \leq 1 \ \ \forall K \text{ clique in } G \Big\} \]
(see Chvátal [10]), and that the complement $\bar{G}$ of $G$ is also a perfect graph (Lovász [23]). Combining these two results with (2) shows that $z \in \mathrm{STAB}(\bar{G})$. Using Theorem 3, we then deduce
\[
\begin{aligned}
H(G) = \log n - H(\bar{G}) &\geq \log n + \frac{1}{n} \sum_{v \in V} \log z_v \\
&= \log n + \frac{1}{n} \sum_{v \in V} \log \left( \frac{\delta}{n} \left( \frac{1}{\tilde{x}_v} \right)^{1-\delta} \right) \\
&= -\frac{1-\delta}{n} \sum_{v \in V} \log \tilde{x}_v - \log \frac{1}{\delta} \\
&= (1 - \delta)\, \tilde{g} - \log \frac{1}{\delta}.
\end{aligned}
\]
Hence,
\[ \tilde{g} \leq \frac{1}{1 - \delta} \left( H(G) + \log \frac{1}{\delta} \right), \]
for all $\delta > 0$. By choosing $\delta = 1/2$ if $H(G) \leq 1$, and $\delta = 1/(H(G) + 1)$ otherwise, we obtain $\tilde{g} \leq H(G) + \log(H(G) + 1) + O(1)$.

3 An Algorithm for Partial Order Production

We denote by $G(P)$ the comparability graph of a poset $P = (V, \leq_P)$, and let $H(P) := H(G(P))$. Note that a stable set in $G(P)$ is an antichain in $P$, that is, a set of mutually incomparable elements. Note also that $G(P)$ is perfect, a basic result that is dual to Dilworth's theorem, see, e.g., [15].

The relevance of the notion of graph entropy in the context of sorting was first observed by Kahn and Kim [18]. Using the fact that the volume of $\mathrm{STAB}(G(P))$ equals $e(P)/n!$ (see Stanley [29]), they proved the following result.

Lemma 1 ([18]). For any poset $P$ of order $n$,
\[ -nH(P) \leq \log e(P) - \log n! \leq n \log n - \log n! - nH(P). \]

When written as
\[ 2^{-nH(P)} \leq \frac{e(P)}{n!} \leq 2^{-nH(P)} \cdot \frac{n^n}{n!}, \]
the above inequalities become intuitively clear, since $2^{-nH(P)}$ is the (maximum) volume of a box contained in $\mathrm{STAB}(G(P))$, $e(P)/n!$
is the volume of $\mathrm{STAB}(G(P))$, and $2^{-nH(P)} \cdot n^n/n!$ is the volume of a simplex containing $\mathrm{STAB}(G(P))$. The lemma directly implies the following equality for every poset $P$:
\[ ITLB = \log n! - \log e(P) = nH(P) + O(n). \tag{3} \]

We recall that a poset is said to be a weak order whenever its comparability graph is a complete $k$-partite graph, for some $k$. Such a poset $W = (V, \leq_W)$ can be partitioned into $k$ maximal antichains $A_1, \ldots, A_k$, the layers of $W$, such that $v <_W w$ whenever there exist indices $i$ and $j$ such that $v \in A_i$, $w \in A_j$ and $i < j$. When restricted to weak orders, the Partial Order Production problem resembles the Partial Sorting problem, with $I = \{\sum_{j=1}^{i} |A_j| : i = 1, \ldots, k-1\}$.

Our key idea is to show that, using (twice) the greedy coloring algorithm presented in the previous section, we can efficiently extend$^2$ the given poset $P$ to a weak order $W$ whose entropy is close to that of $P$. The reason why we have to use the greedy algorithm twice is that the obtained coloring might not be "ordered" (might not represent the stable sets of a weak order). However, we describe below how to uncross this coloring in order to extend $P$ to an interval order without increasing the entropy too much. We show that applying our greedy coloring to an interval order provides an "ordered" coloring, which allows us to run our greedy algorithm a second time, providing an extension which is a weak order. We then simply run an efficient multiple selection procedure, with $W$ as input. We show that, because replacing $P$ by $W$ does not increase the entropy too much, the resulting number of comparisons is close to $ITLB$.

The preprocessing phase is composed of three steps, each of which can be performed in polynomial time. In the first step, we apply the greedy coloring procedure to $G(P)$, to obtain a greedy point $\tilde{x}$.
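As an illustration of this first step on a toy input, the sketch below computes a greedy point by exhaustive search for a maximum antichain. This is a stand-in under stated assumptions and not the implementation analyzed in this paper, which relies on a flow in an auxiliary network: the function names are hypothetical, `comparable` is assumed to contain every comparable pair (the transitive closure of the order), and the search is exponential in $n$.

```python
from itertools import combinations
from math import log2

def maximum_antichain(elements, comparable):
    """Largest subset of `elements` with no two comparable members.
    Brute force over subsets, from large to small (tiny inputs only)."""
    elems = sorted(elements)
    for size in range(len(elems), 0, -1):
        for cand in combinations(elems, size):
            if all((a, b) not in comparable and (b, a) not in comparable
                   for a, b in combinations(cand, 2)):
                return set(cand)
    return set()

def greedy_antichains(n, comparable):
    """Repeatedly remove a maximum antichain until no element is left."""
    remaining, classes = set(range(n)), []
    while remaining:
        antichain = maximum_antichain(remaining, comparable)
        classes.append(antichain)
        remaining -= antichain
    return classes

def greedy_point(n, classes):
    """x_v = |S_i| / n for the class S_i containing v."""
    x = [0.0] * n
    for cls in classes:
        for v in cls:
            x[v] = len(cls) / n
    return x

def greedy_entropy(n, classes):
    """Entropy of the greedy point: sum of (|S_i|/n) * log2(n/|S_i|)."""
    return sum(len(c) / n * log2(n / len(c)) for c in classes)

# The three-vertex example of Section 2: a = 0, b = 1, c = 2 with b below c.
comparable = {(1, 2)}
classes = greedy_antichains(3, comparable)
print(classes, greedy_point(3, classes), greedy_entropy(3, classes))
```

On this example the entropy of the greedy point is larger than $H(G) = 2/3$, as expected from an upper bound.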
This step makes use of an auxiliary network defined from $P$. Then, in the second step, using again the auxiliary network, we extend $P$ to an interval order $I$ whose entropy is not larger than that of $\tilde{x}$. This allows us to "uncross" the antichains used in $\tilde{x}$. (An alternative way of obtaining the interval order $I$ is to apply Kahn and Kim's [18] laminar decomposition lemma to $\tilde{x}$.) Finally, in the third step, we apply the greedy coloring procedure again, this time on $G(I)$, to obtain the weak order $W$. See Figure 2 for an illustration of steps 1 and 2.

Figure 2: Obtaining an interval order extension of the poset $P$: (a) possible greedy point $\tilde{x}$; (b) network $D$ and potential $\tilde{y}$; (c) interval representation of $I$; (d) interval order $I$.

Footnote 2: A poset $Q$ extends a poset $P$ if they have the same ground set $V$ and $v \leq_P w$ implies $v \leq_Q w$, for all $v, w \in V$.

Auxiliary network. Let $P = (V, \leq_P)$ be any poset. We say that $v$ is covered by $w$ in $P$ if $v \leq_P w$, $v \neq w$ and $v \leq_P z \leq_P w$ implies $z = v$ or $z = w$. The Hasse diagram of $P$ is the network with node set $V$, and arc set $\{(v, w) : v \text{ is covered by } w \text{ in } P\}$. An element $v$ of $P$ is minimal (resp. maximal) if $z \leq_P v$ (resp. $v \leq_P z$) implies $z = v$.

We construct a network $D = D(P)$ from the Hasse diagram of $P$ by first uncontracting each element $v \in V$ to an arc $(v^-, v^+)$ and then adjoining a source node $s$ sending an arc to each minimal element, and a sink node $t$ receiving an arc from each maximal element. The resulting network has node set
\[ N(D) := \{s, t\} \cup \{v^- : v \in V\} \cup \{v^+ : v \in V\} \]
and arc set
\[
\begin{aligned}
A(D) := {} & \{(s, v^-) : v \in V,\ v \text{ minimal in } P\} \cup \{(v^-, v^+) : v \in V\} \\
& \cup \{(v^+, w^-) : v \text{ is covered by } w \text{ in } P\} \cup \{(v^+, t) : v \in V,\ v \text{ maximal in } P\}.
\end{aligned}
\]
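The construction of $D(P)$ can be sketched in a few lines of Python. The representation chosen here is an assumption made for illustration: the poset is given by its cover pairs, the nodes $v^-$ and $v^+$ are encoded as tuples `(v, "-")` and `(v, "+")`, and the function name `build_network` is hypothetical.

```python
def build_network(elements, covers):
    """Build the auxiliary network D(P) from the Hasse diagram of P.

    elements: the ground set V (hashable labels).
    covers:   pairs (v, w) meaning v is covered by w in P.
    Returns (nodes, arcs); v^- and v^+ are encoded as (v, "-") and (v, "+").
    """
    elements = list(elements)
    covers_something = {w for _, w in covers}   # elements that are not minimal
    is_covered = {v for v, _ in covers}         # elements that are not maximal
    nodes = ["s", "t"]
    nodes += [(v, "-") for v in elements] + [(v, "+") for v in elements]
    arcs = [("s", (v, "-")) for v in elements if v not in covers_something]
    arcs += [((v, "-"), (v, "+")) for v in elements]
    arcs += [((v, "+"), (w, "-")) for v, w in covers]
    arcs += [((v, "+"), "t") for v in elements if v not in is_covered]
    return nodes, arcs

# Hypothetical example: the "N"-shaped poset with covers 0<2, 1<2, 1<3.
nodes, arcs = build_network(range(4), [(0, 2), (1, 2), (1, 3)])
print(len(nodes), len(arcs))
```

An element is minimal exactly when it covers nothing and maximal exactly when nothing covers it, which is how the arcs leaving $s$ and entering $t$ are selected above.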
This network gives a useful characterization of points in the stable set polytope of the comparability graph of $P$, as is explained in the next lemma.

Lemma 2. Let $P$ be a poset with ground set $V$, let $G := G(P)$ and $D := D(P)$. A vector $x \in \mathbb{R}^V$ belongs to $\mathrm{STAB}(G)$ if and only if there exists a vector $y \in \mathbb{R}^{N(D)}$ (called a potential) such that $y_s = 0$, $y_t = 1$, $y$ is nondecreasing along arcs of $D$, and $y_{v^+} - y_{v^-} = x_v$ for all $v \in V$.

Proof. Again, we use (see Chvátal [10]):
\[ \mathrm{STAB}(G) = \Big\{ x \in \mathbb{R}^V_+ : \sum_{v \in K} x_v \leq 1 \ \ \forall K \text{ clique in } G \Big\}. \]
We first show sufficiency. Let $x \in \mathbb{R}^V$ be a vector that admits a potential $y \in \mathbb{R}^{N(D)}$. Consider any chain $C = \{v_1, v_2, \ldots, v_c\}$ in $P$ with $v_1 \leq_P v_2 \leq_P \cdots \leq_P v_c$ (cliques in $G$ correspond to chains in $P$). Then
\[
\begin{aligned}
\sum_{v \in C} x_v &= (y_{v_1^+} - y_{v_1^-}) + \cdots + (y_{v_c^+} - y_{v_c^-}) \\
&\leq (y_{v_1^-} - y_s) + (y_{v_1^+} - y_{v_1^-}) + (y_{v_2^-} - y_{v_1^+}) + \cdots + (y_{v_c^+} - y_{v_c^-}) + (y_t - y_{v_c^+}) \\
&= y_t - y_s = 1.
\end{aligned}
\]
It follows that $x \in \mathrm{STAB}(G)$.

For necessity, consider $x \in \mathrm{STAB}(G)$. For $v \in V$, we let $y_{v^+}$ be the maximum total weight of a chain of $P$ whose maximum with respect to $\leq_P$ is $v$, when each vertex $w$ is given the weight $x_w$, and $y_{v^-} := y_{v^+} - x_v$. Then we let $y_s := 0$ and $y_t := 1$. As is easily verified, $y$ is a potential for $x$.

It follows that $H(P)$ is the optimum value of the following convex minimization problem with a polynomial number of variables and constraints:
\[
\text{(H-potential)} \qquad
\begin{aligned}
\min \quad & -\frac{1}{n} \sum_{v \in V} \log x_v \\
\text{s.t.} \quad & x_v = y_{v^+} - y_{v^-} && \forall v \in V \\
& y_p \leq y_q && \forall (p, q) \in A(D) \\
& y_s = 0 \\
& y_t = 1.
\end{aligned}
\]
We remark that this formulation shows that $H(P)$ can be computed to within any fixed precision in strongly polynomial time, using interior point methods (see for instance [24]).
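The potential used in the necessity part of the proof of Lemma 2 can be computed by a simple recursion over the cover relation. The sketch below is one possible rendering under stated assumptions: the function names are hypothetical, the poset is given by its cover pairs, and $y_s = 0$ and $y_t = 1$ are kept implicit. Because the weights are nonnegative, a heaviest chain ending at $v$ can be taken to pass through an element covered by $v$.

```python
def potential(elements, covers, x):
    """Potential of Lemma 2 for a nonnegative weight vector x.

    y_plus[v]  = maximum total weight of a chain whose maximum is v,
    y_minus[v] = y_plus[v] - x[v].
    covers: pairs (v, w) meaning v is covered by w.
    """
    lower_covers = {v: [] for v in elements}
    for v, w in covers:
        lower_covers[w].append(v)
    y_plus = {}

    def heaviest_chain_ending_at(v):
        if v not in y_plus:
            below = max((heaviest_chain_ending_at(u) for u in lower_covers[v]),
                        default=0.0)
            y_plus[v] = x[v] + below
        return y_plus[v]

    for v in elements:
        heaviest_chain_ending_at(v)
    y_minus = {v: y_plus[v] - x[v] for v in elements}
    return y_minus, y_plus

def is_potential(covers, y_minus, y_plus):
    """Check y_s = 0 <= y_minus, y_plus <= y_t = 1, and monotonicity on arcs."""
    return (all(y >= 0.0 for y in y_minus.values())
            and all(y <= 1.0 for y in y_plus.values())
            and all(y_minus[v] <= y_plus[v] for v in y_plus)
            and all(y_plus[v] <= y_minus[w] for v, w in covers))

# Hypothetical example: "N"-shaped poset with covers 0<2, 1<2, 1<3.
covers = [(0, 2), (1, 2), (1, 3)]
x = {0: 0.25, 1: 0.5, 2: 0.5, 3: 0.25}
y_minus, y_plus = potential(range(4), covers, x)
print(y_minus, y_plus, is_potential(covers, y_minus, y_plus))
```

The check succeeds here because every chain of the example poset has total weight at most 1, that is, $x$ lies in the stable set polytope of the comparability graph.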
However, approximating $H(P)$ using a greedy point will be enough for our purposes, and will moreover give a better upper bound on the complexity of our algorithm.

Greedy extensions. Let $\tilde{x}$ be a greedy point in $\mathrm{STAB}(G)$, as defined in Section 2. Consider the potential $\tilde{y} \in \mathbb{R}^{N(D)}$ defined from $\tilde{x}$ as in the proof of Lemma 2: For $v \in V$, we let $\tilde{y}_{v^+}$ be the maximum (total) weight of a chain of $P$ ending in $v$, where each vertex $w$ has weight $\tilde{x}_w$, and $\tilde{y}_{v^-} := \tilde{y}_{v^+} - \tilde{x}_v$. Let also $\tilde{y}_s := 0$ and $\tilde{y}_t := 1$.

From this potential $\tilde{y}$, we compute an interval order $I$ extending $P$ whose entropy is not larger than that of $\tilde{x}$. The ground set of $I$ is $V$. We let $v \leq_I w$ whenever $\tilde{y}_{v^+} \leq \tilde{y}_{w^-}$. Thus the open intervals $(\tilde{y}_{v^-}, \tilde{y}_{v^+})$ (for $v \in V$) provide an interval representation of $I$. Because $v \leq_P w$ implies $\tilde{y}_{v^+} \leq \tilde{y}_{w^-}$, which in turn implies $v \leq_I w$, the interval order $I$ extends $P$. The entropy of $I$ is not larger than that of $\tilde{x}$ because $(\tilde{x}, \tilde{y})$ remains feasible for the minimization problem (H-potential) defined above, after $P$ is replaced by $I$.

Apply again the greedy coloring algorithm, but now on $G(I)$. Let $A_1, \ldots, A_k$ denote the antichains of $I$ produced by the greedy coloring algorithm. Because $I$ is an interval order, we can find a permutation $\sigma$ of $\{1, \ldots, k\}$ such that $v <_I w$, $v \in A_{\sigma(i)}$ and $w \in A_{\sigma(j)}$ imply $i < j$. Thus, the weak order $W$ with ground set $V$ obtained by setting $v <_W w$ whenever $v \in A_{\sigma(i)}$ and $w \in A_{\sigma(j)}$ with $i < j$ is an extension of $I$. Such a weak order $W$ is said to be a greedy extension of the original poset $P$.

Lemma 3. Let $P$ be a poset and $W$ one of its greedy extensions. Then
\[ H(W) \leq \frac{1}{1 - \delta} \left( H(P) + 2 \log \frac{1}{\delta} + 2 \right) \]
for all $\delta > 0$, and in particular
\[ H(W) \leq H(P) + 2 \log(H(P) + 1) + O(1). \]

Proof. Let $\delta' := \delta/2$. Let $I$ denote the intermediate interval order used to obtain $W$.
Theorem 2 implies
\[
\begin{aligned}
H(P) &\geq (1 - \delta') H(I) - \log(1/\delta') \\
&\geq (1 - \delta') \big( (1 - \delta') H(W) - \log(1/\delta') \big) - \log(1/\delta') \\
&\geq (1 - \delta) H(W) - 2 \log(1/\delta) - 2.
\end{aligned}
\]
In addition to Theorem 2, for the first inequality we used the fact that $H(I) \leq \tilde{g}$, and for the second one, we used the fact that the greedy coloring of $I$ directly gives the unique decomposition of $W$ in maximal stable sets. This shows the first part of the claim. For the second part, again take $\delta = 1/2$ if $H(P) \leq 1$, and $\delta = 1/(H(P) + 1)$ otherwise.

Algorithm and complexity. The above results directly suggest the following algorithm: compute a greedy extension $W$ of $P$, and run a multiple selection procedure on $T$ with respect to $W$. In terms of the number of comparisons between elements of $T$, we only incur a controlled penalty.

Theorem 4. The Partial Order Production problem can be solved in polynomial time using at most
\[ ITLB + o(ITLB) + O(n) \tag{4} \]
comparisons between elements of $T$ in the worst case.

Proof. The weak order extension $W$ can be computed in polynomial time. Let us denote by $A_1, \ldots, A_k$ its layers. We run the multiple selection algorithm on the elements of $T$, with the ranks $r_i := \sum_{j=1}^{i} |A_j|$ (for $i = 1, \ldots, k-1$). Kaligosi et al. [20] give a multiple selection algorithm that requires only $B + o(B) + O(n)$ comparisons in the worst case, where $B := \log n! - \log e(W)$ is the information-theoretic lower bound for $W$. Thus
\[
\begin{aligned}
B &= nH(W) + O(n) && \text{(from Eqn. (3))} \\
&\leq nH(P) + 2n \log(H(P) + 1) + O(n) && \text{(from Lemma 3)} \\
&= ITLB + 2n \log\left( \frac{ITLB}{n} + 1 \right) + O(n) && \text{(from Eqn. (3))} \\
&= ITLB + o(ITLB) + O(n).
\end{aligned}
\]
Hence $B + o(B) + O(n) = ITLB + o(ITLB) + O(n)$, and the theorem follows.

We conclude the section by discussing the preprocessing complexity of our algorithm.
The first execution of the greedy coloring algorithm can be done in time $O(mn)$, where $m$ is the number of arcs in the network $D := D(P)$ (notice $m \geq n$ and $m = O(n^2)$), as we now briefly explain. The algorithm finds maximum antichains in the graph by decrementing a flow on the auxiliary network. This flow has to satisfy lower bounds on the arcs.

Let $X := \emptyset$, $i := 1$, and put a lower bound of $\ell_a := 1$ on each arc $a$ of the form $(v^-, v^+)$ with $v \in V$, and of $\ell_a := 0$ on every other arc $a$ of $D$. Start with an arbitrary integer $s$–$t$ flow $\phi$ of value $n$ such that $\phi_a \geq \ell_a$ for every arc $a \in A(D)$. Let $Y$ be the set of nodes of $D$ that can be reached from $s$ following a decrementing path, namely, a path $v_0 v_1 \ldots v_k$ with $v_0 := s$ such that, for every $i \in \{1, 2, \ldots, k\}$, either $(v_{i-1}, v_i) \in A(D)$ and $\phi_{(v_{i-1}, v_i)} > \ell_{(v_{i-1}, v_i)}$, or $(v_i, v_{i-1}) \in A(D)$. Now, there are two cases:

(1) $t \in Y$. Thus there exists a decrementing $s$–$t$ path. We then decrement by 1 the flow value of $\phi$ using the latter path.

(2) $t \notin Y$. Observe that no arc of $D$ enters the set $Y$ and that the arcs $a$ going out of $Y$ satisfy $\phi_a = \ell_a$. It follows that $A_i := \{v \in V \mid (v^-, v^+) \in \delta^+(Y),\ \phi_{(v^-, v^+)} = 1\}$ is an antichain of $P - X$. (Here, $\delta^+(Y)$ denotes the set of arcs of $D$ going out of $Y$.) Moreover, since the flow value of $\phi$ equals $|A_i|$, the antichain $A_i$ is maximum among the antichains of $P - X$. This is because, by definition of our lower bounds, the flow value is at least $|A|$, for every antichain $A$ contained in $P - X$. We then let $\ell_{(v^-, v^+)} := 0$ for every $v \in A_i$, set $X := X \cup A_i$, increment $i$ by 1, and repeat the above steps, until $X = V$.

Computing the set $Y$, decrementing the flow, and finding the antichain $A_i$ are steps that can be done in time $O(m)$.
Since we go through the main loop at most $2n$ times, this implementation of the greedy algorithm runs in time $O(nm)$.

The greedy point $\tilde{x}$ can be computed in time $O(n)$. The corresponding potential $\tilde{y}$ can be found in $O(m)$ using a simple dynamic program. The second execution of the greedy coloring algorithm can be done in time $O(n^2)$, using the fact that the comparability graph of the interval order $I$ is a co-interval graph. Finally, a bound on the complexity of the multiple selection procedure is $O(n^2)$. So the whole algorithm runs in $O(nm) = O(n^3)$.

4 Tightness

A natural question is whether there exists an algorithm for Partial Order Production that does at most $ITLB + O(n)$ comparisons between elements of $T$. We show in this section that every algorithm that first extends the target poset to a weak order and then solves the problem on the weak order can be forced to make $ITLB + \Omega(n \log \log n)$ comparisons, both in the worst case and the average case. This is a consequence of the following theorem:

Theorem 5. There exists a constant $c > 0$ such that, for all $n \geq 1$, there is a poset $P$ on $n$ elements satisfying $H(W) \geq H(P) + c \log \log n$ for every weak order $W$ extending $P$.

In order to prove Theorem 5, we define a family $\{G_k\}$ ($k \geq 1$) of interval graphs inductively as follows:
• $G_1$ consists of a unique vertex, and
• for $k \geq 2$, the graph $G_k$ is obtained by first taking the disjoint union of $K_{2^{k-1}}$ (the "central clique") with two copies of $G_{k-1}$, and then making half of the vertices of the central clique adjacent to all vertices in the first copy, and the other half to all those in the second copy.

It is easily seen that $G_k$ is indeed an interval graph, as is suggested in Figure 3. The graph $G_k$ has $k 2^{k-1}$ vertices. The complement $\bar{G}_k$ of $G_k$ is the comparability graph of the interval order $I_k$ defined by an interval representation of $G_k$.

Lemma 4.
$H(I_k) \le (k+1)/2$.

Proof. By construction, the maximal stable sets of the graph $\bar{G}_k$ all have $2^{k-1}$ elements, and there are $2^k - 1$ such maximal stable sets. We define a point $x^{(k)}$ of the stable set polytope $\mathrm{STAB}(\bar{G}_k)$ as follows:

$$x^{(k)} := \sum_{i=1}^{2^k - 1} \frac{1}{2^k - 1}\, \chi^{S_i},$$

where $S_1, S_2, \ldots, S_{2^k - 1}$ are the maximal stable sets of $\bar{G}_k$. Observe that, for every $\ell \in \{0, \ldots, k-1\}$, there are $2^{k-1}$ vertices in $\bar{G}_k$ that belong to exactly $2^\ell$ different maximal stable sets (that is, there are $2^{k-1}$ intervals of each different length in the interval representation suggested in Figure 3). We thus obtain the following upper bound on the entropy of $I_k$:

$$H(I_k) \le -\frac{1}{k 2^{k-1}} \sum_{\ell=0}^{k-1} 2^{k-1} \log \frac{2^\ell}{2^k - 1} = \log(2^k - 1) - \frac{k-1}{2} \le (k+1)/2.$$

The lemma follows.

Figure 3: An interval representation of $G_4$ (colors highlight the different levels of the construction).

We proceed by showing that every weak order extension of $I_k$ has relatively large entropy compared to $I_k$. We first introduce some definitions. Consider an arbitrary graph $G$ and a coloring $C_1, \ldots, C_\ell$ of its vertices. Similarly to how greedy points are defined (see Section 2), one can associate an entropy to the latter coloring, namely, the entropy of the probability distribution $\{|C_i|/n\}_{i=1,\ldots,\ell}$:

$$-\sum_{i=1}^{\ell} \frac{|C_i|}{n} \log \frac{|C_i|}{n}.$$

The minimum entropy of a coloring is known as the chromatic entropy of $G$, and is denoted by $H_\chi(G)$. The chromatic entropy can be thought of as a constrained version of the graph entropy, in which the stable sets involved in the definition of $H(G)$ are required to form a partition of the vertices of $G$.

Lemma 5. Let $G$ be the comparability graph of a poset $P$. Then any weak order extension $W$ of $P$ has entropy $H(W) \ge H_\chi(G)$.

Proof. The maximal antichains of $W$ are pairwise disjoint, hence they correspond to a coloring of $G$.
The entropy of $W$ is equal to the entropy of the latter coloring, and thus is at least $H_\chi(G)$.

Lemma 5 suggests finding a (good) lower bound on $H_\chi(\bar{G}_k)$, the chromatic entropy of $\bar{G}_k$. To achieve that, we make use of the following result of [4] (see Corollary 1 in that paper).

Theorem 6 ([4]). Let $G$ be an arbitrary graph. Then the entropy of any coloring of $G$ produced by the greedy coloring algorithm is at most $H_\chi(G) + \log e$.

We can therefore restrict ourselves to analyzing the entropy of greedy colorings of $\bar{G}_k$. Recall that all maximal stable sets in $\bar{G}_k$ have the same cardinality $2^{k-1}$. Consider the greedy coloring of $\bar{G}_k$ defined recursively as follows: take first the stable set of cardinality $2^{k-1}$ that corresponds to the central clique in $G_k$, and then, if $k \ge 2$, recurse on the two copies of $\bar{G}_{k-1}$ that are left. Let $\tilde{g}_k$ denote the entropy of the resulting coloring of $\bar{G}_k$.

Lemma 6. $\tilde{g}_k = (k-1)/2 + \log k$.

Proof. The greedy coloring defined above consists of $2^{i-1}$ color classes of cardinality $2^{k-i}$, for $i = 1, 2, \ldots, k$. Hence, its entropy is

$$\tilde{g}_k = -\sum_{i=1}^{k} 2^{i-1} \cdot \frac{2^{k-i}}{k 2^{k-1}} \log \frac{2^{k-i}}{k 2^{k-1}} = \frac{1}{k} \sum_{i=1}^{k} \log \frac{k 2^{k-1}}{2^{k-i}} = \frac{1}{k} \sum_{i=1}^{k} \bigl(\log k + (i-1)\bigr) = \log k + \frac{k-1}{2},$$

as claimed.

We may now turn to the proof of Theorem 5.

Proof of Theorem 5. Let $k \ge 1$ and consider the interval order $I_k$ defined above, of order $n := k 2^{k-1}$. Let also $W$ be an arbitrary weak order extending $I_k$. Combining Lemmata 4, 5 and 6 with Theorem 6 gives

$$H(W) - H(I_k) \ge H_\chi(\bar{G}_k) - H(I_k) \ge \left(\frac{k-1}{2} + \log k - \log e\right) - \frac{k+1}{2} = \log k - \log e - 1 = \Omega(\log\log n),$$

as claimed.

Acknowledgments

The authors wish to thank Sébastien Collette, François Glineur and Stefan Langerman for useful discussions, and the anonymous referees for their comments on an earlier version of the paper.

References

[1] M. Aigner. Producing posets.
Discrete Math., 35:1–15, 1981.
[2] B. Bollobás and P. Hell. Sorting and graphs. In Graphs and order, Banff, Alta., 1984, volume 147 of NATO Adv. Sci. Inst. Ser. C Math. Phys. Sci., pages 169–184, Dordrecht, 1985. Reidel.
[3] G. Brightwell and P. Winkler. Counting linear extensions. Order, 8(3):225–242, 1991.
[4] J. Cardinal, S. Fiorini, and G. Joret. Tight results on minimum entropy set cover. Algorithmica, 51(1):49–60, 2008.
[5] J. Cardinal, S. Fiorini, G. Joret, R. M. Jungers, and J. I. Munro. An Efficient Algorithm for Partial Order Production. To appear in Proceedings of STOC 09, Bethesda (Maryland), United States, 2009.
[6] S. Carlsson and J. Chen. The complexity of heaps. In Proceedings of the third annual ACM-SIAM symposium on discrete algorithms (SODA '92), Orlando (Florida), United States, pages 393–402, Philadelphia, PA, USA, 1992. Society for Industrial and Applied Mathematics.
[7] S. Carlsson and J. Chen. Some lower bounds for comparison-based algorithms. In Proc. 2nd European Symposium on Algorithms (ESA '94), Utrecht, The Netherlands, volume 855 of Lecture Notes in Computer Science, pages 106–117. Springer-Verlag, 1994.
[8] J. M. Chambers. Partial sorting (algorithm 410). Commun. ACM, 14(5):357–358, 1971.
[9] J. Chen. Average cost to produce partial orders. In Proc. 5th International Symposium on Algorithms and Computation (ISAAC '94), Beijing, P. R. China, volume 834 of Lecture Notes in Computer Science, pages 155–163. Springer-Verlag, 1994.
[10] V. Chvátal. On certain polytopes associated with graphs. J. Combinatorial Theory Ser. B, 18:138–154, 1975.
[11] I. Csiszár, J. Körner, L. Lovász, K. Marton, and G. Simonyi. Entropy splitting for antiblocking corners and perfect graphs. Combinatorica, 10(1):27–40, 1990.
[12] J. C. Culberson and G. J. E. Rawlins. On the comparison cost of partial orders.
Technical Report TR88-01, Department of Computing Science, University of Alberta, Edmonton, Alberta, Canada T6G 2E8, 1988.
[13] D. P. Dobkin and J. I. Munro. Optimal time minimal space selection algorithms. J. ACM, 28(3):454–461, 1981.
[14] M. L. Fredman. How good is the information theory bound in sorting? Theor. Comput. Sci., 1(4):355–361, 1976.
[15] M. C. Golumbic. Algorithmic graph theory and perfect graphs. 2nd ed., volume 57 of the Annals of Discrete Mathematics. Elsevier, Amsterdam, 2004.
[16] M. Grötschel, L. Lovász and A. Schrijver. Geometric algorithms and combinatorial optimization. 2nd corr. ed., volume 2 of Algorithms and Combinatorics. Springer-Verlag, Berlin, 1993.
[17] C. A. R. Hoare. Find (algorithm 65). Commun. ACM, 4(7):321–322, 1961.
[18] J. Kahn and J. H. Kim. Entropy and sorting. J. Comput. Syst. Sci., 51(3):390–399, 1995.
[19] J. Kahn and M. E. Saks. Balancing poset extensions. Order, 1:113–126, 1984.
[20] K. Kaligosi, K. Mehlhorn, J. I. Munro, and P. Sanders. Towards optimal multiple selection. In Proc. International Conference on Automata, Languages, and Programming (ICALP'05), Lecture Notes in Computer Science, pages 103–114. Springer-Verlag, 2005.
[21] J. Körner. Coding of an information source having ambiguous alphabet and the entropy of graphs. In Transactions of the 6th Prague Conference on Information Theory, pages 411–425, 1973.
[22] N. Linial. The information-theoretic bound is good for merging. SIAM J. Comput., 13(4):795–801, 1984.
[23] L. Lovász. Normal hypergraphs and the perfect graph conjecture. Discrete Math., 2(3):253–267, 1972.
[24] Y. Nesterov and A. Nemirovskii. Interior-point polynomial algorithms in convex programming, volume 13 of SIAM Studies in Applied Mathematics. Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA, 1994.
[25] A. Panholzer.
Analysis of multiple quickselect variants. Theor. Comput. Sci., 302(1-3):45–91, 2003.
[26] H. Prodinger. Multiple quickselect – Hoare's find algorithm for several elements. Inf. Process. Lett., 56:123–129, 1995.
[27] M. E. Saks. The information theoretic bound for problems on ordered sets and graphs. In Graphs and order, Banff, Alta., 1984, volume 147 of NATO Adv. Sci. Inst. Ser. C Math. Phys. Sci., pages 137–168, Dordrecht, 1985. Reidel.
[28] A. Schönhage. The production of partial orders. In Journées algorithmiques, École Norm. Sup., Paris, 1975, pages 229–246. Astérisque, No. 38–39. Soc. Math. France, Paris, 1976.
[29] R. P. Stanley. Two poset polytopes. Discrete Comput. Geom., 1:9–23, 1986.
[30] A. C. Yao. On the complexity of partial order productions. SIAM J. Comput., 18(4):679–689, 1989.
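
Numerical note on Section 4. The closed forms appearing in Lemma 4, Lemma 6 and the proof of Theorem 5 can be sanity-checked numerically. The following Python sketch is not part of the original analysis: the function names are illustrative, and logarithms are assumed to be in base 2 (which is what the bound $\log(2^k - 1) \le k$ requires). It evaluates the entropy value attained by the point $x^{(k)}$, the entropy $\tilde{g}_k$ of the recursive greedy coloring, and the resulting gap $\tilde{g}_k - \log e - \bigl(\log(2^k - 1) - (k-1)/2\bigr)$.

```python
import math

def lemma4_upper_bound(k):
    """Entropy value attained by the point x^(k) of Lemma 4 (base-2 logs)."""
    n = k * 2 ** (k - 1)
    total = 0.0
    for l in range(k):
        # 2^(k-1) vertices have coordinate 2^l / (2^k - 1) in x^(k).
        total += 2 ** (k - 1) * math.log2(2 ** l / (2 ** k - 1))
    return -total / n

def greedy_coloring_entropy(k):
    """Entropy of the coloring with 2^(i-1) classes of size 2^(k-i), i = 1..k."""
    n = k * 2 ** (k - 1)
    h = 0.0
    for i in range(1, k + 1):
        count, size = 2 ** (i - 1), 2 ** (k - i)
        h -= count * (size / n) * math.log2(size / n)
    return h

for k in range(1, 11):
    upper = lemma4_upper_bound(k)
    greedy = greedy_coloring_entropy(k)
    gap = greedy - math.log2(math.e) - upper
    print(k, round(upper, 4), round(greedy, 4), round(gap, 4))
```

The printed gap is a lower bound on $H(W) - H(I_k)$ for each listed $k$; it only becomes positive once $k$ is large enough, in line with the asymptotic $\Omega(\log\log n)$ statement.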
