The update complexity of selection and related problems

Manoj Gupta, Deptt. of Comp. Sc., IIT Delhi, New Delhi.
Yogish Sabharwal, IBM Research - India, New Delhi.
Sandeep Sen, Deptt. of Comp. Sc., IIT Delhi, New Delhi.

September 28, 2018

Abstract

We present a framework for computing with input data specified by intervals, representing uncertainty in the values of the input parameters. To compute a solution, the algorithm can query the input parameters, which yields more refined estimates in the form of sub-intervals, and the objective is to minimize the number of queries. Previous approaches address the scenario where every query returns an exact value. Our framework is more general as it can deal with a wider variety of inputs and query responses, and we establish interesting relationships between them that have not been investigated previously. Although some of the approaches of the previous restricted models can be adapted to the more general model, we require more sophisticated techniques for the analysis, and we also obtain improved algorithms for the previous model. We address selection problems in the generalized model and show that there exist 2-update competitive algorithms that do not depend on the lengths or distribution of the sub-intervals and hold against the worst-case adversary. We also obtain similar bounds on the competitive ratio for the MST problem in graphs.

1 Introduction

A common scenario in many computational problems is uncertainty about the precise values of one or more parameters. Many different models have been considered in the database community for dealing with uncertain data. In one of the commonly used models, the uncertain parameters are represented by probability distributions (for a comprehensive survey, see [AY09]).
In another model, the uncertain parameters are represented by interval ranges, wherein the parameter may take on any value within the specified interval (see [KT01]). In this paper, we focus on the latter model.

More formally, we consider the model wherein we want to compute a function $f(x_1, x_2, \ldots, x_n)$ where some (or all) of the $x_i$'s are not fully known. The $x_i$'s are typically known to lie in some range (interval). Any assignment $x_i = x'_i$ consistent with the known range of $x_i$ is a feasible realization. The algorithm can make queries about $x_i$. This problem has been studied before [KT01, HEK+08]. A common assumption made in the existing literature is that the exact value of $x_i$ is returned by a single query. However, in many applications, a query about $x_i$ may only yield a more refined estimate of $x_i$. As a matter of fact, in many such applications, it is not even possible to obtain the exact value of the parameter.

As an example, consider the case of handling satellite data such as maps. Due to the large amount of data involved, the data is often stored hierarchically at different scales of resolution. Typically the data is presented at the highest level of resolution. Depending on the area of interest, data may be retrieved for the next level of resolution for a smaller area (zoom in) by performing a query. Now consider a query to find the closest hospital. Based on the highest scale of resolution, the distances to the hospitals can be determined within a certain range of uncertainty. If the closest hospital cannot be resolved at this level, then further queries are required for certain hospitals to determine which amongst them is the closest. These queries proceed down the hierarchical scales of resolution until it is resolved which is the closest hospital.
Let us illustrate this model using the problem of finding the minimum when the exact values are not known but each element is associated with a real interval $[\ell_i, r_i]$. Consider the three elements $x_1 = [3, 17]$, $x_2 = [14, 19]$, $x_3 = [15, 20]$. Clearly any of these can be the minimum element as these are mutually overlapping intervals. Suppose a query returns the exact value; then with three queries we obtain complete information and the problem is trivially solved. But the interesting question is: are three queries necessary? Suppose our first query yields $x_1 = 10$; then clearly we do not need to make any further queries. On the other hand, the query may yield $x_1 = 16$, so that we are forced to make further queries. In a more general situation, where a query may return a sub-interval, we may obtain $x_1 = [8, 16]$, which doesn't yield any useful information about the identity of the minimum element. On the other hand, if the query returns $[8, 10]$, then we can conclude $x_1$ to be the minimum even though we do not know its exact value.

It is natural to compare the number of queries made by the algorithm w.r.t. a hypothetical OPT, which can be thought of as a non-deterministic strategy that makes the minimum number of queries for any feasible realization of the input. Moreover, the algorithm must contain a certificate of correctness of the final answer, viz., that no more queries are necessary regardless of the number of unresolved parameters. This also brings up the related verification problem, i.e., given an incompletely specified problem, does it contain sufficient information for a solution to be computed (without further queries).

1.1 Related Previous Work

Kahan [Kah91] described a technique for maintaining data structures for online problems like flight-path collisions using predictive estimates to obtain higher efficiency.
The estimates could be used to prune objects that provably couldn't affect the solution, and only those critical objects that could affect the answer were updated. Kahan's work laid the foundations for later work on kinetic data structures, but in his paper he focused on describing a framework for minimizing updates of critical objects. Kahan compared the efficiency of his data structures with respect to a non-deterministic optimal algorithm, or more specifically, the competitive ratio in the online setting. If our algorithm makes $q_S(n)$ queries for an input $S$ of size $n$, then it has competitive ratio $c$ (footnote 1) iff for some constant $\alpha > 0$, $q_S(n) \le c \cdot OPT(S) + \alpha$, where OPT may be thought of as a non-deterministic algorithm (coined as lucky in [Kah91]).

Note that OPT has an unfair advantage in being able to guess the optimal sequence of queries and ensure that it can be verified in collusion with an adversary controlling the output of the queries. For instance, if the given intervals are $x_1 = [2, 6]$, $x_2 = [2, 6]$, $x_3 = [2, 6]$, i.e., all of them are identical, OPT may guess the answer to be $x_3$, and if the query yields $x_3 = 2$, then it is verified. On the other hand, an algorithm has no means of distinguishing between the $x_i$'s. Even the use of randomization does not appear to provide any significant advantage in this scenario. Kahan [Kah91] tackled this issue (without acknowledging as much) by changing the problem definition to that of reporting all values that are equal to the minimum.

Khanna and Tan [KT01] also used the competitive ratio as a measure of efficiency of their algorithms, but their parametrization didn't yield $O(1)$ bounds. Their algorithm for selection was related to the clique number (maximum clique size) of the input.
They compare with the non-deterministic optimal and show that no on-line algorithm can achieve a better competitive ratio than the clique number.

A somewhat different model was used by Erlebach et al. [HEK+08], who showed how to compute an exact minimum spanning tree for a graph with interval data using a minimal number of queries. The final answer is a combinatorial description (in this case a spanning tree) and not necessarily the weight of the spanning tree. Erlebach et al. [HEK+08] proved that their algorithm has competitive ratio 2 when the edge weights are initially specified as open intervals. One limitation of their result is the critical use of the property of open intervals, which is used to weaken the advantage of OPT in guessing and verifying the answer. Their results on constant competitive ratio do not hold for closed or semi-closed intervals.

A recent motivation for this line of work came from caching problems in distributed databases (Olston and Widom [OW00]), where local cached copies are used for faster query processing and the cached values are intervals that are guaranteed to contain the actual value, called the master value. Their work showed a trade-off between the number of queries and the precision $\Delta$ of the actual answer. This model was further explored in the work of [FMP+03, FMO+03], which tackled fundamental problems like median-finding and shortest-paths. They distinguished between offline (oblivious) and online (adaptive) queries, including weighted versions where queries could have varying costs for different intervals. Unlike the previous work, they compared their efficiency with respect to a worst-case optimal rather than a non-deterministic input-specific optimal. Therefore their results cannot be compared effectively with the previous work. Other approaches like [AH04, KZ06] minimize the worst-case deviation from actual values, or minimize queries to get improved estimates of the expected solution when the distribution is known [GGM06, GM07].

Footnote 1: Strictly speaking, the algorithm could take exponential time but may have a bounded competitive ratio.

| Input \ Query | O | C | OC | P | OP | CP | OCP |
|---|---|---|---|---|---|---|---|
| O | Category-1 | (Note α) | (Note α) | (Note α) | (Note α) | (Note α) | (Note α) |
| C | (Note α) | Category-1 | (Note α) | (Note α) | (Note α) | (Note α) | (Note α) |
| OC | (Note α) | (Note α) | Category-1 | (Note α) | (Note α) | (Note α) | (Note α) |
| P | trivial | - | - | - | - | - | - |
| OP | Category-2 | (Note α) | (Note α) | OP-P | OP-OP | (Note α) | (Note α) |
| CP | (Note α) | Category-2 | (Note α) | Category-3 | (Note α) | Category-3 | (Note α) |
| OCP | (Note α) | (Note α) | Category-2 | Category-3 | (Note α) | (Note α) | Category-3 |

Figure 1: Models for studying uncertain data problems (see note for α below). The allowed input types are listed along the rows and the query return types along the columns. (The pure input point model is trivial as no queries are required.)

2 Our contributions

In this paper, we generalize the query model in several directions. We classify models based on the types of the inputs allowed and the return type of the queries. The input may specify a combination of points (P), open intervals (O) and/or closed intervals (C). This leads to 7 variations, namely O, C, P, OC, OP, CP and OCP. Similarly, queries on intervals (open/closed) may yield points (P), open intervals (O) and/or closed intervals (C) (footnote 2). This also leads to seven variations. These models are specified in Figure 1. We denote the models by X-Y, where X denotes the types of input allowed in the input instance and Y denotes the query return types, and X and Y can take values from O, C, P, OC, OP, CP and OCP (here the literals O, C and P correspond to open intervals, closed intervals and points respectively).
Thus, for instance, OP-P denotes the model wherein the input can consist of open intervals as well as points and the queries can only return points.

(Note α): Although there are 49 models possible, many of them are unnatural as they can lead to a change of the input type after some initial queries. Such models can be covered under the framework of another suitable model. For instance, a problem under the O-P model would convert to the OP-P model after a single query and is thus better studied under the OP-P model. Similarly, the OC-C model can be covered under the OC-OC model.

We categorize the valid models into 5 different categories (see Figure 1). The competitive ratios are based on this categorization of the models. Category-1 corresponds to the models where the input and query return types are only intervals (O-O, C-C, OC-OC models). Category-2 corresponds to the models where the input may contain points but the queries only return intervals (OP-O, CP-C, OCP-OC models). Category-3 corresponds to the models where the input may contain closed intervals and the query may return points. The other two categories correspond to the OP-P and OP-OP models themselves.

Footnote 2: We can also handle semi-closed intervals but we have avoided further classification as they don't lead to any interesting results.

Our main results can be summarized as follows.

1. We first generalize the models to practical scenarios wherein queries may return sub-intervals as answers rather than exact values. The sub-intervals need not have any properties with respect to lengths or distributions. In other words, with further queries, we obtain increasingly refined estimates of the values until sufficient information has been obtained, i.e., the verification problem can be solved.
We show that the witness-based approach used in the previous models can be adapted to the models considered in this paper. More specifically, we establish interesting relationships between the various models (see Figure 2).

2. We study the selection problem of finding the $k$-th smallest value and present update competitive algorithms with different guarantees for the different models for this problem. We also study the update complexity of the minimum spanning tree problem under the different models, which is closely related to the extremal selection problem (finding the heaviest edge in a cycle, also called the Red rule).

3. We also show that by deviating from the witness-based approach studied in prior literature, we can actually obtain improved bounds for the selection problem. These algorithms attain an additive overhead from optimal, which is similar to a competitive ratio of unity for some cases, and are interesting in their own right.

4. Given that closed intervals have not been successfully handled in prior literature [HEK+08], leading to unbounded competitive ratios, is it possible to characterize the problem more precisely? For instance, do we run into the same issues if we allow queries to return intervals? One approach for addressing issues with closed intervals is to output all the optimal solutions [Kah91]. It can be quite expensive to output all the solutions. Is there an alternate framework that addresses the issues with closed intervals without determining all the solutions? We show that this problem is a characteristic of models that allow closed intervals in the input and points to be returned in the queries. We extend our models to handle closed intervals by using the notion of a lexicographically smallest solution (in case multiple solutions exist).
This is a natural version in many problems where the initial ordering is important, and we will show later that this has the desired effect of limiting the non-deterministic guessing powers of OPT.

Another interesting variation could be assigning a cost to a query depending on the precision of the answer given, but we have not addressed this version in this paper. There is a growing body of work that addresses the problem of computing an exact answer with minimal queries [BEE+06, BHKR05], and coping with more generalized queries is an important and fundamental direction of algorithmic research.

| Problem | Competitive ratio | Models | Comment | Source |
|---|---|---|---|---|
| Extremal selection | OPT + 1 | OCP-P | Report all solutions | Kahan [Kah91] |
| Extremal selection | OPT + 1 | OP-P | Value | this paper |
| Extremal selection | 2 · OPT | Category-1,2 & OP-OP | | this paper |
| Extremal selection | 2 · OPT | Category-3 | lex first | this paper |
| K-selection | OPT + 1 | OCP-P | Report all solutions | Kahan [Kah91] |
| K-selection | t · OPT | CP-P | t = clique no. | Khanna-Tan [KT01] |
| K-selection | OPT + k | OP-P | Value, ≤ k · OPT | this paper |
| K-selection | 2 · OPT | Category-1 | element | this paper |
| K-selection | 2 · (OPT + k) | OP-OP | | this paper |
| K-selection | 2 · OPT | Category-3 | Value, lex first | this paper |
| MST | 2 · OPT | OP-P | | Erlebach et al. [HEK+08] |
| MST | OPT + C | OP-P | C ≤ OPT, C = no. of red rule | this paper |
| MST | 2 · OPT | Category-1,2 & OP-OP | | this paper |
| MST | 2 · OPT | Category-3 | lex first | this paper |

Figure 2: Known results in prior literature and our new results

3 Problem Definition

We consider a problem $\mathcal{P}$ where we are given an instance $P = (C, A)$ that consists of

• an ordered set of data $C = \{c_1, c_2, \ldots, c_n\}$ called a configuration; and
• an ordered set of data $A = \{a_1, a_2, \ldots, a_n\}$ called areas of uncertainty such that $c_i \in a_i$ for all $i$.

The configuration $C$ is not known to us; only the areas of uncertainty, $A$, are known. As an example consider the problem, $\mathcal{P}$, of finding the index of the minimum element.
An example instance is given by $P_{ex} = (C, A)$ where $C$ is the ordered set of points $C = \{3, 7, 10\}$ and $A$ is the ordered set of intervals (areas of uncertainty) $A = \{(2, 6), (5, 8), (9, 11)\}$.

We focus our discussion on problems where the input is real data. Thus, the configuration consists of points on the real line $\Re$, and the areas of uncertainty may be intervals on the real line. The concepts can be extended to higher-dimensional problems.

Verifier: We are also given a verifier $V$ for the problem $\mathcal{P}$ that takes as input the areas of uncertainty, $A$, and returns whether a solution of the problem $\mathcal{P}$ can be determined from $A$ or not. For the example instance $P_{ex}$ described above, the verifier would return false as it cannot determine a solution from the given areas of uncertainty. However, if the intervals were $A = \{(2, 5), (6, 8), (9, 11)\}$, then the verifier would return true, as clearly the first interval has to contain the minimum.

Order-Invariance: An important characteristic of the problems we study is that the result of the verifier depends only on the ordering of the areas of uncertainty. More formally, consider two instances $P = (C, A)$ and $P' = (C', A')$, where $A = \{a_1, a_2, \ldots, a_n\}$ and $A' = \{a'_1, a'_2, \ldots, a'_n\}$, for the same problem $\mathcal{P}$. We say that $P$ and $P'$ are order-equivalent if for every pair of indices $i, j \in \{1, 2, \ldots, n\}$, it can be determined that $a_i \le a_j$ iff it can be determined that $a'_i \le a'_j$. We say that a problem $\mathcal{P}$ is order-invariant if the verifier returns the same value for any two order-equivalent instances. It is easy to verify that problems such as selection (finding the minimum, finding the $k$-th minimum) and minimum spanning tree are order-invariant.

Update operations: We are allowed to perform update operations on the areas.
Performing an update operation on area $a_i$ results in knowledge of the area to a greater degree of accuracy. More precisely, performing an update operation on $a_i$ in the instance $P = (C, A)$, where $A = \{a_1, a_2, \ldots, a_{i-1}, a_i, a_{i+1}, \ldots, a_n\}$, results in another instance $P' = (C, A')$, where $A' = \{a_1, a_2, \ldots, a_{i-1}, a'_i, a_{i+1}, \ldots, a_n\}$ such that $a'_i$ is completely contained in $a_i$. An important characteristic of the models that we consider is that the results of updates on an area are independent of updates on any other area. That is, given a multi-set $S = \{i_1, i_2, \ldots, i_k\}$ of indices of the areas, applying updates on the corresponding areas results in the same instance, irrespective of the sequence in which these updates are applied. We refer to this as the update independence property.

Solution: Our goal is to solve the problem $\mathcal{P}$ by performing the minimum number of updates, i.e., to perform the minimum number of updates that result in an instance for which the verifier returns true. For a problem instance $P = (C, A)$, a solution, $S$, is defined to be a multi-set of indices $\{i_1, i_2, \ldots, i_k\}$ such that performing updates on the areas $a_{i_1}, a_{i_2}, \ldots, a_{i_k}$ results in a problem instance $P' = (C, A')$ for which $V(A')$ returns true, i.e., a solution of the problem can be determined from $A'$ without performing any more updates. In this case, we say that $S$ solves the problem instance $P$. Let $\mathcal{S}(P)$ denote the set of all such solutions. An optimal solution is a solution $S \in \mathcal{S}(P)$ such that any other solution in $\mathcal{S}(P)$ has at least as many indices, i.e., $|S| \le |S'|$ for all solutions $S' \in \mathcal{S}(P)$. Therefore, an optimal solution corresponds to a smallest set of indices that need to be updated in order to solve the problem.

As mentioned before, the OP-P and the CP-P models have been studied before.
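To make the definitions of areas, verifier and update operation concrete, here is a minimal Python sketch for the minimum-index problem. The encoding of an area as a pair `(lo, hi)` (with `lo == hi` standing for a point) and the names `certainly_leq`, `verify_min` and `update` are assumptions made for this illustration; they are not notation from the paper.

```python
from typing import List, Optional, Tuple

# Assumed encoding (not from the paper): an area is (lo, hi);
# lo == hi encodes a point, lo < hi an interval.
Area = Tuple[float, float]


def certainly_leq(a: Area, b: Area) -> bool:
    """True if every value in a is <= every value in b."""
    return a[1] <= b[0]


def verify_min(areas: List[Area]) -> Optional[int]:
    """Return the index of an area known to hold the minimum, else None."""
    for i, a in enumerate(areas):
        if all(certainly_leq(a, b) for j, b in enumerate(areas) if j != i):
            return i
    return None


def update(areas: List[Area], i: int, refined: Area) -> List[Area]:
    """Return a new instance in which area i is replaced by a refinement.

    The containment check compares endpoints only; it does not
    distinguish open from closed endpoints.
    """
    lo, hi = areas[i]
    new_lo, new_hi = refined
    if not (lo <= new_lo <= new_hi <= hi):
        raise ValueError("refined area must be contained in the original area")
    return areas[:i] + [refined] + areas[i + 1:]
```

On the areas of $P_{ex}$, `verify_min([(2, 6), (5, 8), (9, 11)])` gives `None`, while `verify_min([(2, 5), (6, 8), (9, 11)])` gives index `0`, matching the two cases discussed for the verifier.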
We now show that the algorithms for the OP-P model can be generalized to many other models for problems that are order-invariant. These update competitive algorithms are based on the concept of witness sets. We discuss these concepts in Section 4; they are borrowed from [BHKR05] and presented here with modifications suitable to discuss all our models. Then we discuss how to extend these algorithms to other models.

4 The Witness Set Framework

For a problem instance $P = (C, A)$, a set $W$ is said to be a witness set of $P$ if for every solution $S \in \mathcal{S}(P)$, $W \cap S \ne \emptyset$. Thus, no algorithm can solve $P$ without querying at least one area from $W$.

Suppose that we have an algorithm, WALG, that given any instance $P = (C, A)$ of the problem, finds a witness set of size at most $k$. Then there exists a $k$-update competitive algorithm for the problem. The algorithm is presented in Figure 3. It simply keeps applying algorithm WALG to find a witness set of size at most $k$ and updates all the areas in the witness set. It keeps doing this until the problem is solved.

Algorithm SOLVE(Problem Instance P, Verifier V, Witness Algorithm WALG)
  Input:
    - problem instance P = (C, A),
    - a verifier algorithm V for the given problem,
    - a witness algorithm WALG for the given problem.
  Output: k-update competitive solution to problem instance P
  Initialize solution S = {};
  If (V(A) returns false)  /* problem instance is not yet solved */
    W = WALG(P);
    Update the areas in W to reduce the problem instance P to P';
    S = S ∪ W ∪ SOLVE(P', V, WALG);
  Endif;
  Output S;

Figure 3: Algorithm to determine a k-update competitive solution given a witness algorithm

We now formally show that the solution returned by this algorithm is $k$-update competitive. Note that this result is independent of the model under consideration.
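The recursion of Figure 3 can also be written as a loop. The sketch below is one possible iterative rendering, under stated assumptions: areas are encoded as `(lo, hi)` pairs, and `verifier`, `walg` and `query` are hypothetical callables supplied by the caller (the last one plays the role of the update operation, whose answers are controlled by the environment or an adversary).

```python
from typing import Callable, List, Sequence, Tuple

Area = Tuple[float, float]  # assumed encoding: (lo, hi); lo == hi is a point


def solve(
    areas: Sequence[Area],
    verifier: Callable[[List[Area]], bool],
    walg: Callable[[List[Area]], List[int]],
    query: Callable[[int, Area], Area],
) -> Tuple[List[int], List[Area]]:
    """Iterative form of SOLVE: update whole witness sets until verified.

    Returns the multi-set of updated indices (as a list) and the final areas.
    The loop terminates only if the answers returned by `query` eventually
    let the verifier succeed.
    """
    current = list(areas)
    updated: List[int] = []              # the multi-set S of Figure 3
    while not verifier(current):         # instance is not yet solved
        for i in walg(current):          # witness set of size at most k
            current[i] = query(i, current[i])   # update area i
            updated.append(i)
    return updated, current
```

Each pass of the loop updates a full witness set, and by definition every solution contains at least one index of that set, so a pass that spends at most $k$ updates is matched by at least one update of any solution; this is the intuition behind the $k$-update competitiveness claim.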
The witness algorithm and the verifier themselves do depend on the underlying model.

Theorem 4.1. The solution returned by the algorithm in Figure 3 is $k$-update competitive for the problem instance $P$.

Proof. See Appendix.

Witness Algorithms For Different Models. Witness algorithms have been proposed for several problems under the OP-P model. We now show that the same witness algorithms can be used for various other models as well.

Theorem 4.2. A witness algorithm for a problem under the OP-P model is also a witness algorithm for the same problem under the category-1, category-2 and OP-OP models (i.e., O-O, C-C, OC-OC, OP-O, CP-C, OCP-OC and OP-OP models).

Proof. See Appendix.

Corollary 4.3. The algorithm in Figure 3 is $k$-update competitive under the category-1, category-2 and OP-OP models with the same witness algorithms as those for the OP-P model.

Proof. See Appendix.

We make an important observation here. While the reduction might seem straightforward, it is important to note that many of these reductions are only one-way. For instance, we can reuse the witness algorithm for the OP-P model for the OP-O model but not vice-versa. We demonstrate this later for the $k$-Min selection problem, where we show that while it is possible to design a 2-update competitive algorithm under the OP-P model, it is not possible to design an algorithm that is better than $k$-update competitive under the OP-O model using witness sets.

Another important observation is that prior literature has shown that no algorithm can give bounded update complexity guarantees for the selection problem under the CP-P models. However, we have derived constant-factor update-competitive algorithms for models involving closed intervals (i.e., the CP-C, C-C, OC-OC and OCP-OC models).
This highlights the fact that the problem is not in dealing with closed intervals but rather with the combination of allowing closed intervals in the input and simultaneously allowing queries to return points for such closed intervals.

5 The selection problem

In an instance $P = (C, A)$ of the $k$-Min problem, $C = \{p_1, p_2, \cdots, p_n\}$ is an ordered set of points in $\Re$, and $A = \{a_1, a_2, \cdots, a_n\}$ is an ordered set of intervals on $\Re$. The nature of the intervals is determined by the model under consideration. The goal is to find the index of the $k$-th smallest element in $C$. We denote by $l_j$ and $u_j$ the lower and upper ends of the interval $a_j$ respectively. To avoid overloading of notation, we will assume that $l_j$ and $u_j$ always refer to the latest known values for the interval ranges, considering all the updates that have already been performed.

5.1 1-Min

In this section we look at the special case when $k = 1$, i.e., we are interested in finding the index of the interval containing the smallest value.

Witness Algorithm And Verifier. We first present the witness algorithm for the OP-P model. Consider an instance $P = (C, A)$. The witness algorithm chooses the interval with the smallest $l$-value along with the interval with the next smallest $l$-value and returns them as the witness set. The verifier simply determines if some interval can be determined to be smaller than all the other intervals.

Let $S = \{1, \ldots, n\}$ denote the set of indices of the intervals. For any subset $S' \subseteq S$, we define $order_l(S')$ to be a permutation of the indices in $S'$ in increasing order of the lower values of the corresponding intervals, i.e., $order_l(S') = \langle j_1, j_2, \cdots, j_m \rangle$ such that $l_{j_1} \le l_{j_2} \le \cdots \le l_{j_m}$. Similarly define $order_u(S') = \langle j_1, j_2, \cdots, j_m \rangle$ such that $u_{j_1} \le u_{j_2} \le \cdots \le u_{j_m}$.
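As an illustration, the orderings and the 1-Min witness algorithm and verifier just described might be coded as follows. This is a sketch under assumptions: areas are `(l_j, u_j)` pairs with `l_j == u_j` for a point, indices are 0-based, and ties in the orderings are broken by the other endpoint and then by index, a choice the text does not specify.

```python
from typing import Iterable, List, Optional, Tuple

Area = Tuple[float, float]  # assumed encoding: (l_j, u_j); l_j == u_j is a point


def order_l(areas: List[Area], subset: Iterable[int]) -> List[int]:
    """Indices of `subset` by increasing lower end (assumed tie-break: upper end, index)."""
    return sorted(subset, key=lambda j: (areas[j][0], areas[j][1], j))


def order_u(areas: List[Area], subset: Iterable[int]) -> List[int]:
    """Indices of `subset` by increasing upper end (assumed tie-break: lower end, index)."""
    return sorted(subset, key=lambda j: (areas[j][1], areas[j][0], j))


def witness_1min(areas: List[Area]) -> List[int]:
    """The two indices with the smallest lower ends (p_1 and p_2)."""
    return order_l(areas, range(len(areas)))[:2]


def verifier_1min(areas: List[Area]) -> Optional[int]:
    """Return p_1 if no other area can contain a smaller value, else None."""
    p = order_l(areas, range(len(areas)))
    p1 = p[0]
    if all(areas[p1][1] <= areas[j][0] for j in p[1:]):
        return p1
    return None
```

Reusing the ranges of the introductory example, `[(3, 17), (14, 19), (15, 20)]` is not yet verified and the witness set is `[0, 1]`; if the first area is refined to the point `(10, 10)` or to the sub-interval `(8, 10)`, `verifier_1min` returns index `0`, whereas refining it to `(8, 16)` still leaves the instance unresolved.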
The witness algorithm and the verifier are formally presented in Figure 4.

Witness Algorithm:
  1. Let $\langle p_1, p_2, \cdots, p_{|S|} \rangle = order_l(S)$
  2. Return $a_{p_1}$ and $a_{p_2}$ as the witness set
Verifier:
  1. Let $\langle p_1, p_2, \cdots, p_{|S|} \rangle = order_l(S)$
  2. If $x \le y$ for all $x \in a_{p_1}$ and $y \in a_{p_j}$, $j \ne 1$, return the interval with index $p_1$ as the solution
     Else return false

Figure 4: Witness Algorithm and Verifier for 1-Min under the OP-P model

Note that an interval is declared to be the smallest interval only when no other interval can contain a smaller value. Therefore the algorithm always outputs the correct interval.

Competitiveness. We now show that the algorithm is 2-update competitive under the OP-P model.

Lemma 5.1. The set $W = \{p_1, p_2\}$ returned by the algorithm of Figure 4 is a witness set for the 1-Min problem under the OP-P model.

Proof. See Appendix.

It follows from Theorem 4.2 and Corollary 4.3 that we can derive 2-update competitive algorithms for the category-1, category-2 and OP-OP models.

Tight Example. We now show that the update-competitive bound of 2 is tight for all the models that allow the queries to return intervals, i.e., for the category-1, category-2 and OP-OP models (but not the OP-P model). This is demonstrated by the following example. We are given intervals $A = \{a_0, a_1, a_2, \ldots, a_n\}$ where $a_0 = (1, 5)$ and $a_j = (3, 7)$ for all $1 \le j \le n$. We argue that any algorithm can be forced to perform $2n$ queries while OPT can determine the interval containing the minimum with only $n$ queries.

Let $S$ represent the set of intervals $A \setminus \{a_0\}$, i.e., $S = \{a_1, a_2, \ldots, a_n\}$. Suppose that the algorithm has already performed $2n - 1$ queries. The adversary behaves as follows. For the first $n - 1$ queries on $a_0$ it returns the interval $(1 + i\varepsilon, 5)$ in the $i$-th query, where $\varepsilon$ is a small value $< 1/(2n)$.
For the first $n - 1$ queries on intervals from the set $S$ it returns the interval $(6, 7)$. The remaining actions of the adversary are based on whether the algorithm performs $n$ queries on $a_0$ or whether it queries $n$ intervals from $S$. Note that in performing $2n - 1$ queries, the algorithm must encounter one of these cases. These are considered in the following 2 cases:

• Case 1: The algorithm makes $n$ queries to $a_0$. In this case the adversary continues to return the interval $(1 + i\varepsilon, 5)$ for the $i$-th query on $a_0$, where $i \le 2n - 1$, and it returns the interval $(6, 7)$ for each subsequent interval queried from $S$. Note that in this case, on performing $2n - 1$ queries, the algorithm could not have queried all the intervals from $S$. Therefore at the end of $2n - 1$ queries, as there is overlap between interval $a_0$ and the unqueried intervals from $S$, the algorithm is forced to make $2n$ queries. OPT, on the other hand, can just query all the intervals in $S$. The adversary will return the interval $(6, 7)$ for OPT on the remaining intervals. Thus, OPT is able to determine that $a_0$ contains the minimum element by just performing $n$ queries.

• Case 2: The algorithm makes $n$ queries to intervals in $S$. In this case, the adversary returns $(3, 4)$ for the last ($n$-th) interval queried in $S$. For any subsequent queries to $a_0$, the adversary continues to return $(1 + i\varepsilon, 5)$ for the $i$-th query. Note that in this case, the algorithm performs fewer than $n$ queries on $a_0$. Therefore at the end of $2n - 1$ queries, as there is overlap between interval $a_0$ and the last queried interval from $S$, the algorithm is forced to make $2n$ queries. OPT, on the other hand, can just query $a_0$ repeatedly. The adversary will return the interval $(2, 3)$ for OPT on its $n$-th query to $a_0$ (recall that in this case the algorithm did not perform $n$ queries on $a_0$).
Thus, OPT is able to determine that $a_0$ contains the minimum element by just performing $n$ queries.

It is surprising that though this tight example demonstrates that we cannot obtain better than 2-update competitive algorithms for these models, it is possible to obtain a 1-update competitive algorithm for the OP-P model; however, this is obtained by an approach different from the witness set framework. This is discussed in more detail in Section 6.

5.2 K-Min

We now generalize the 1-Min algorithm presented above to the $k$-th-Min problem, but under the O-O model. We later discuss issues related to handling points under the OP-P model.

Witness Algorithm And Verifier. We now present a witness algorithm and verifier for this problem under the O-O model.

Witness Algorithm:
  1. Let $\langle p_1, p_2, \cdots, p_{|S|} \rangle = order_l(S)$
  2. Let $S' = \{p_1, \ldots, p_{k-1}\}$
  3. If $x \le y$ $\forall x \in a_i$, $i \in S'$ and $\forall y \in a_j$, $j \in S \setminus S'$
     return the witness set of the 1-Min algorithm
  4. Else let $\langle q_1, q_2, \cdots, q_{|S'|} \rangle = order_u(S')$
     return $a_{p_k}$ and $a_{q_1}$ as the witness set
Verifier:
  1. Let $\langle p_1, p_2, \cdots, p_{|S|} \rangle = order_l(S)$
  2. Let $S' = \{p_1, \ldots, p_{k-1}\}$
  3. If ($x \le y$ $\forall x \in a_i$, $i \in S'$ and $\forall y \in a_{p_k}$) and ($x \ge y$ $\forall x \in a_i$, $i \in S \setminus (S' \cup \{p_k\})$ and $\forall y \in a_{p_k}$)
     return $a_{p_k}$
     else return false

Figure 5: Witness Algorithm and Verifier for K-Min under the O-O model

We say intervals $a_i$ and $a_j$ are disjoint if $x \le y$ for all $x \in a_i$, $y \in a_j$, or vice-versa. The witness algorithm checks if the first $k - 1$ intervals are disjoint from the last $n - k + 1$ intervals. If that is the case, it returns the witness set of the 1-Min algorithm. Else it chooses $a_{p_k}$ and an interval from $S'$ with the largest $u$ value ($a_{q_1}$) as the witness set. The verifier takes the first $k - 1$ intervals ($S'$) according to their $l$ values. The verifier checks if these $k - 1$ intervals are disjoint from $a_{p_k}$.
Then it takes the last $n-k$ intervals ($S \setminus (S' \cup a_{p_k})$) and checks if all of them are disjoint with $a_{p_k}$. If both conditions hold, it returns $a_{p_k}$; else it returns false.

Competitiveness. We now show that the algorithm is 2-update competitive for the O-O model. It follows, using proofs similar to Theorem 4.2 and Corollary 4.3, that we can derive 2-update competitive algorithms for the other category-1 models.

Lemma 5.2. The witness set $W$ returned by the algorithm of Figure 5 is a witness set for the $k$-Min problem under the O-O model.

Proof. See Appendix.

Tight Example. It is not difficult to construct examples similar to that discussed for the 1-Min algorithm to show that the update-competitive bound of 2 is tight under the category-1 models. It is interesting to note here that while a 2-update competitive algorithm can be designed for the $k$-min problem under the category-1 models, no algorithm can be better than $k$-update competitive for this problem under models that allow points, i.e., the category-2 and OP-P models. This is illustrated by the following example.³ Suppose we have $2k$ areas, of which $k$ are open intervals of the form $(0, 5)$ and $k$ are fixed points of value 3. For the first $k-1$ intervals queried by any algorithm, the adversary returns 1, and for the $k$-th interval, the adversary returns 4 (or the interval $(3.5, 4.5)$, as the case may be), thereby forcing $k$ queries. However, OPT only needs to update the interval with value 4 and can thereafter return any of the $k$ fixed points of value 3 as the $k$-th smallest. Nevertheless, in the next section we show that it is possible to design algorithms for the $k$-Min problem under these models that allow for points, obtaining update competitive bounds with additive factor $k$ (i.e., the algorithm performs $k$ more updates than OPT).
This, however, is achieved by bypassing the Witness Set framework.

6 Bypassing the Witness Set framework

While the witness set framework, studied in prior literature, provides a general method for solving problems with data uncertainty under the update complexity models, it has its limitations. We demonstrate this by presenting algorithms that need to perform only $k$ more queries than OPT for the $k$-th-Min selection problem. Note that for the 1-Min problem this implies a 1-update competitive algorithm, as only one query more than OPT is required.

6.1 1-Min

Consider the following algorithm. We note here that the set of intervals returned by the "witness" algorithm is not a true witness set. However, we stick to the terminology for the sake of consistency. The overall algorithm remains the same: it updates the intervals returned by the "witness" algorithm until we obtain a solution.

"Witness" Algorithm:
  1. Let $\langle p_1, p_2, \cdots, p_{|S|} \rangle = \mathrm{order}_l(S)$
  2. Let $A = \{a_{p_1}\}$ and $B = \{p_2, \cdots, p_{|S|}\}$
  3. Return the interval in $A$.

Verifier:
  1. Let $\langle p_1, p_2, \cdots, p_{|S|} \rangle = \mathrm{order}_l(S)$
  2. If $x \le y$ for all $x \in a_{p_1}$ and $y \in a_{p_j}$, $j \ne 1$, return the interval with index $p_1$ as the solution; else return false

Figure 6: "Witness" Algorithm and Verifier for 1-Min under the OP-P model

Lemma 6.1. Let $c_{OPT}$ be the total number of queries made by OPT to find 1-Min. Then the total number of queries made by the algorithm in Figure 6 is at most $c_{OPT} + 1$ in the OP-P model.

Proof. See Appendix.

Note that this simple algorithm for 1-Min in the OP-P model fails for the OP-O model. Consider the following example. Let there be two intervals $I_1 = (2, 20)$ and $I_2 = (19, 21)$. Suppose at the $i$-th query of $I_1$ we get a new interval $(d_i, 20)$, where $d_i < 19$, so $I_1$ and $I_2$ will always intersect if we just query $I_1$. The algorithm in Figure 6 always queries $I_1$, so it takes a huge number of queries to find 1-Min.
But if we just query $I_2$, it returns a subinterval $(20.5, 21)$. This is what OPT does, and it uses just one query to find the answer.

6.2 k-Min

Consider the algorithm in Figure 7 for $k$-selection in the OP-P model, which generalizes the result of the algorithm in Figure 6.

"Witness" Algorithm:
  1. Let $\langle p_1, p_2, \cdots, p_{|S|} \rangle = \mathrm{order}_l(S)$
  2. Let $S' = \{p_1, \ldots, p_k\}$
  3. Let $\langle q_1, q_2, \cdots, q_k \rangle = \mathrm{order}_u(S')$. Let $S'_{max} = a_{q_k}$. Query $S'_{max}$.
  4. If $x \le y$ $\forall x \in a_i, i \in S'$ and $\forall y \in S \setminus S'$, return the "witness set" of the 1-Max algorithm of $S'$ (of Figure 4).

Verifier:
  1. Let $\langle p_1, p_2, \cdots, p_{|S|} \rangle = \mathrm{order}_l(S)$
  2. Let $S' = \{p_1, \ldots, p_{k-1}\}$
  3. If ($x \le y$ $\forall x \in a_i, i \in S'$ and $\forall y \in a_{p_k}$) and ($x \ge y$ $\forall x \in a_i, i \in S \setminus (S' \cup a_{p_k})$ and $\forall y \in a_{p_k}$), return $a_{p_k}$; else return false

Figure 7: Witness and Verifier Algorithm for K-Min under the OP-P model

³ This was pointed out by an anonymous reviewer of a previous version.

Lemma 6.2. The algorithm of Figure 7 uses at most $c_{OPT} + \min\{k, n-k\}$ queries, where $c_{OPT}$ is the minimum number of queries required by OPT.

Proof. See Appendix.

Now let us consider the OP-OP model. Note that since we have $2 \cdot OPT$ algorithms for the OP-O model and an $OPT + k$ algorithm for the OP-P model, we can derive a $2 \cdot (OPT + k)$ algorithm for the OP-OP model by combining these two algorithms. This is done by alternating the witness algorithms of the two models. This ensures that we only need to perform at most twice the number of queries performed by the algorithms of either of the two models.

7 Closed intervals with point returning queries

As discussed above, the competitive ratio is unbounded for the special cases where the input allows for closed intervals and queries may return points (i.e., the category-3 models). For instance, consider the problem of finding the index of the minimum element.
Further, consider the problem instance $P = (C, A)$ where $a_i = [1, 3]$ for all $1 \le i \le n$. The adversary in this case acts as follows: for each of our queries except the last, it returns 2. Finally, for our last query, say on interval $a_k$, it returns 1. On the other hand, OPT directly queries interval $a_k$ and obtains the optimal solution. This results in an unbounded competitive ratio.

The primary reason for this anomaly is the possible existence of multiple optimal solutions. In such cases, the adversary is able to get away with few queries by querying just the necessary intervals that reveal one of the optimal solutions. An algorithm, on the other hand, is not able to distinguish, among the areas of uncertainty (as shown above), which are the necessary intervals to query to reveal the optimal solution.

One of the ways that has been suggested in prior literature to deal with this special case is to require all the optimal solutions to be output. However, it can be quite expensive to output all these solutions. This raises the question of whether other conditions can be imposed on the structure of the required output that are less expensive but still reasonable. We now consider such a condition, which we call the lexicographic condition, for which we show that this special case can be handled.

Recall that the sets $C$ and $A$ that define a problem instance are ordered sets. Thus, the set of indices that define a solution can be considered as a string (called the solution string) defined as follows: the length of the string is $n$, and the $i$-th element of the string is set to 1 if index $i$ is part of the solution and 0 otherwise. In the lexicographic setting, amongst all the optimal solutions, we are interested in finding the solution for which the solution string has the smallest lexicographic ordering. Now consider again the example above.
Note that, even though OPT queries $a_k$ and determines a solution with optimal solution value, it cannot terminate without making further queries, as it cannot decide whether or not there exists another solution with the same value but a smaller lexicographic ordering.

We note that new witness algorithms may need to be developed for the lexicographic variants of the problems. However, we show by way of examples that these are not very different from the corresponding witness algorithms for the original problems. It can be shown that once a witness algorithm is developed for a lexicographic variant of the problem under the CP-P model, the same witness algorithm can be extended to other models along the same lines as discussed in Section 4.

Now let us consider the lexicographic variant of the 1-Min problem. In order to obtain the witness algorithm for the lexicographic variant for the category-3 models, the notion of ordering of intervals, $\mathrm{order}_l(\cdot)$, needs to be extended to incorporate lexicographic ordering and closed intervals. As before, for any subset $S' \subseteq S$, we define $\mathrm{order}_l(S')$ to be a permutation of the indices in $S'$ in increasing order of the lower values of the corresponding intervals, i.e., $\mathrm{order}_l(S') = \langle j_1, j_2, \cdots, j_m \rangle$ such that $l_{j_1} \le l_{j_2} \le \cdots \le l_{j_m}$. When comparing two intervals with the same $l$-values, say $l_j$ and $l_{j'}$, ties are resolved as follows: if $a_j$ contains a point $x$ such that $x < y$ for all $y \in a_{j'}$, then $j$ precedes $j'$ in the ordering; similarly, if $a_{j'}$ contains such a point, then $j'$ precedes $j$; and if neither can be established, then the lexicographically smaller index precedes the larger one in the ordering.
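The extended ordering just described can be sketched in Python as follows. This is only an illustrative sketch: the `Area` class, its field names, and the helper `point` are hypothetical and do not appear in the paper. The sketch relies on the observation that, when two areas share the same lower value, one of them contains a point below every point of the other exactly when it is closed on the left and the other is open on the left.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Area:
    """Hypothetical representation of an area of uncertainty."""
    lo: float
    hi: float
    lo_closed: bool = False  # True if the lower end point belongs to the area
    hi_closed: bool = False  # True if the upper end point belongs to the area


def point(v: float) -> Area:
    """A point is modelled as a degenerate closed interval [v, v]."""
    return Area(v, v, True, True)


def order_l(indices, areas):
    """Sort indices by lower value; among equal lower values, left-closed
    areas (and points) precede left-open ones; remaining ties go to the
    smaller index."""
    return sorted(
        indices,
        key=lambda j: (areas[j].lo, not areas[j].lo_closed, j),
    )


if __name__ == "__main__":
    areas = {
        1: Area(1, 3),              # open interval (1, 3)
        2: Area(1, 3, True, True),  # closed interval [1, 3]
        3: point(1),                # the point 1
        4: Area(0.5, 2),            # open interval (0.5, 2)
    }
    print(order_l([1, 2, 3, 4], areas))
```

Encoding the tie rule as a sort key keeps the ordering total and deterministic, which is what the witness algorithm and verifier need when they pick $p_1$ and $p_2$.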
Thus, if one of the intervals, say $a_j$, is open from the left and another interval, say $a_{j'}$, is either closed from the left or a point, then $j'$ precedes $j$ in the ordering; in all other cases, the lexicographically smaller of $j$ and $j'$ precedes the other in the ordering. The witness algorithm and verifier are formally presented in Figure 8. Note that the verifier is also modified so that it can check whether or not the minimum interval can be determined based on the lexicographic ordering.

Witness Algorithm:
  1. Let $\langle p_1, p_2, \cdots, p_{|S|} \rangle = \mathrm{order}_l(S)$
  2. Return $a_{p_1}$ and $a_{p_2}$ as the witness set

Verifier:
  1. Let $\langle p_1, p_2, \cdots, p_{|S|} \rangle = \mathrm{order}_l(S)$
  2. If ($x \le y$ $\forall x \in a_{p_1}$ and $y \in a_{p_j}$, $p_j > p_1$) and ($x < y$ $\forall x \in a_{p_1}$ and $y \in a_{p_j}$, $p_j < p_1$), return the interval with index $p_1$ as the solution; else return false

Figure 8: Witness Algorithm for 1-Min under the CP-P model

The proof of update competitiveness is similar to the case for the original problem.

Lemma 7.1. The set $W = \{p_1, p_2\}$ returned by the algorithm of Figure 8 is a witness set for the lexicographic 1-Min problem under the CP-P model.

Proof. See Appendix.

The fact that no algorithm can be better than 2-update competitive for the 1-Min problem under the CP-P model follows from the same reasoning as for the OP-P model. We can extend this 2-update competitive algorithm to the other category-3 models using techniques similar to those in Section 4. Finally, we can design 2-update competitive algorithms for the $k$-min version as well under these models by using similar techniques.

8 Minimum Spanning Tree

In the Lexicographic MST problem, we are given a graph $G = (V, E)$. The edge lengths are specified with uncertainty. Let $E = \{e_1, e_2, \ldots, e_n\}$ be the ordered set of edges.
Then the ordered set $C = \{v_1, v_2, \cdots, v_n\}$ denotes the values of the edge lengths and the ordered set $A = \{a_1, a_2, \cdots, a_n\}$ denotes the intervals within which the edge lengths are known to lie. The goal is to find the lexicographically smallest MST under the category-3 models.

A 2-update competitive algorithm for the MST problem was given by [HEK+08] under the OP-P model. By applying Theorem 4.2 and Corollary 4.3, we conclude that it is 2-update competitive for the Category-1, Category-2 and OP-OP models as well. The Lexicographic MST problem can be solved under the Category-3 models with few changes to the algorithm described in [HEK+08] (these changes are outlined in Appendix A). This gives us the following result.

Theorem 8.1. There exists a 2-update competitive algorithm for the Lexicographic MST problem under the Category-3 models.

Remark: It may be noted that the algorithm described in [HEK+08], in conjunction with Lemma 6.1, can be used to derive an $OPT + C$ update competitive algorithm for the MST problem under the OP-OP model, where $C$ is the number of red-rules applied by the optimal algorithm. Note that $C$ can be much less than $OPT$.

9 Conclusion

We extended the one-shot query model to the more general situation where a query can return arbitrary sub-intervals as answers, and established strong relationships between these models. Many of the previous results in the restricted model can be generalized based on this relationship, which simplifies the task of designing algorithms for the more general model. This is far from obvious, as the sub-interval query model presents some obvious challenges because the uncertainty (in the values of any parameter) can take an arbitrary number of steps to be resolved and can be controlled by an adversary.
One drawback of this approach is that the actual algorithmic complexity is overlooked and we focus only on the competitive ratio, which is justified on the basis of the very high cost of a query. For future work, the algorithmic complexity needs to be incorporated in a meaningful way.

References

[AH04] Ionut D. Aron and Pascal Van Hentenryck. On the complexity of the robust spanning tree problem with interval data. Oper. Res. Lett., 32(1):36–40, 2004.
[AY09] Charu C. Aggarwal and Philip S. Yu. A survey of uncertain data algorithms and applications. IEEE Trans. Knowl. Data Eng., 21(5):609–623, 2009.
[BEE+06] Zuzana Beerliova, Felix Eberhard, Thomas Erlebach, Alexander Hall, Michael Hoffmann 0002, Matús Mihalák, and L. Shankar Ram. Network discovery and verification. IEEE Journal on Selected Areas in Communications, 24(12):2168–2181, 2006.
[BHKR05] Richard Bruce, Michael Hoffmann, Danny Krizanc, and Rajeev Raman. Efficient update strategies for geometric computing with uncertainty. Theory Comput. Syst., 38(4):411–423, 2005.
[FMO+03] Tomás Feder, Rajeev Motwani, Liadan O'Callaghan, Chris Olston, and Rina Panigrahy. Computing shortest paths with uncertainty. In STACS, pages 367–378, 2003.
[FMP+03] Tomás Feder, Rajeev Motwani, Rina Panigrahy, Chris Olston, and Jennifer Widom. Computing the median with uncertainty. SIAM J. Comput., 32(2):538–547, 2003.
[GGM06] Ashish Goel, Sudipto Guha, and Kamesh Munagala. Asking the right questions: model-driven optimization using probes. In PODS, pages 203–212, 2006.
[GM07] Sudipto Guha and Kamesh Munagala. Model-driven optimization using adaptive probes. In SODA, pages 308–317, 2007.
[HEK+08] Michael Hoffmann, Thomas Erlebach, Danny Krizanc, Matús Mihalák, and Rajeev Raman. Computing minimum spanning trees with uncertainty. In STACS, pages 277–288, 2008.
[Kah91] Simon Kahan. A model for data in motion.
In STOC, pages 267–277, 1991.
[KT01] Sanjeev Khanna and Wang Chiew Tan. On computing functions with uncertainty. In PODS, pages 171–182, 2001.
[KZ06] A. Kasperski and P. Zielenski. An approximation algorithm for interval data minmax regret combinatorial optimization problem. Information Processing Letters, 97(5):177–180, 2006.
[OW00] Chris Olston and Jennifer Widom. Offering a precision-performance tradeoff for aggregation queries over replicated data. In VLDB, pages 144–155, 2000.

Appendix A. Sketch of changes for Lexicographic MST

The following changes are required to the algorithm of [HEK+08]. Here we use the notation $U_x$ for $u_x$ and $L_x$ for $l_x$ to remain consistent with [HEK+08].

1. The main change involves modifying the comparison operator. We modify the comparison operator defined on the intervals as follows: let $x$ be the $l$-value or $u$-value of some interval, i.e., $x = l_e$ or $x = u_e$ for some interval $e$. Similarly, let $y$ be the $l$-value or $u$-value of some interval, i.e., $y = l_f$ or $y = u_f$ for some interval $f \ne e$. We say that $x \prec y$ if $x < y$, or $x = y$ and $e$ is lexicographically smaller than $f$.

2. An edge $e$ of a cycle $C$ is said to be always maximal if $U_c \prec L_e$ for all $c \in C - \{e\}$. Note that the only change introduced in this definition is in replacing the comparison operator.

3. We similarly modify the notion of comparing two edges $e$ and $f$ based on the comparison operator as follows. We say that $e \prec f$ if $L_e \prec L_f$. While indexing the edges in the algorithm, the edges are considered in the order defined by $\prec$ above.

4. The witness set is determined as follows. Once a cycle $C$ is detected, if it contains an always maximal edge, that edge is deleted. Otherwise let $f \in C$ be such that $U_f = \max\{U_c \mid c \in C\}$, where max is based on the new $\prec$ operator. Further, let $g \in C - \{f\}$ be such that $L_f \prec U_g$. Then $f$ and $g$ form the witness set.
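The modified comparison operator and the cycle step in items 1, 2 and 4 above can be sketched in Python as follows. This is a minimal illustration, not the algorithm of [HEK+08]: the `Edge` class, the function names, and the choice of the first qualifying edge as $g$ are assumptions made for this sketch (the text only requires some $g$ with $L_f \prec U_g$).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    """Hypothetical edge record: position in the ordered edge set plus bounds."""
    idx: int   # lexicographic index of the edge
    L: float   # lower bound on the edge length
    U: float   # upper bound on the edge length


def prec(x: float, e_idx: int, y: float, f_idx: int) -> bool:
    """x ≺ y: x < y, or x == y and edge e is lexicographically smaller than f."""
    return x < y or (x == y and e_idx < f_idx)


def always_maximal(e: Edge, cycle) -> bool:
    """e is always maximal if U_c ≺ L_e for every other edge c of the cycle."""
    return all(prec(c.U, c.idx, e.L, e.idx) for c in cycle if c != e)


def cycle_step(cycle):
    """Either report an always maximal edge to delete, or return a witness pair."""
    for e in cycle:
        if always_maximal(e, cycle):
            return ("delete", e)
    # f has the largest U value with respect to ≺ (ties go to the larger index).
    f = max(cycle, key=lambda c: (c.U, c.idx))
    # Assumption: take the first g with L_f ≺ U_g. Such a g exists here because
    # f is not always maximal.
    g = next(c for c in cycle if c != f and prec(f.L, f.idx, c.U, c.idx))
    return ("witness", f, g)


if __name__ == "__main__":
    tie_cycle = [Edge(1, 1, 2), Edge(2, 1, 2), Edge(3, 2, 4)]
    print(cycle_step(tie_cycle))      # edge 3 wins the tie U_c = L_e via its index
    overlap_cycle = [Edge(1, 1, 5), Edge(2, 2, 6), Edge(3, 3, 4)]
    print(cycle_step(overlap_cycle))  # no always maximal edge, so a witness pair
```

In the first cycle the bounds $U_1 = U_2 = 2$ equal $L_3 = 2$, and it is only the index comparison inside `prec` that lets edge 3 be declared always maximal, which is the situation the lexicographic variant is designed to resolve.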
Theorem 8.1 can be proved with these changes along the same lines as presented in [HEK+08].

Appendix B. Proofs for the Witness Set Framework

9.1 Proof of Theorem 4.1

We first prove a claim that will be required in the proof of the above result.

Claim 9.1. Suppose that we are given a problem instance $P = (C, A)$. Further, suppose that we know that an optimal solution $S_o$ for $P$ contains an index $i$, i.e., $S_o$ queries the area $a_i$. Let $P' = (C, A')$ be the problem instance reduced from $P$ on querying area $a_i$. Then $S'_o = S_o \setminus \{i\}$ is an optimal solution for $P'$. Here the operation $\setminus$ on the multiset $S_o$ removes only one instance of $i$ from it in case there are multiple instances.

Proof. Recall that the update independence property implies that irrespective of the order in which the updates are applied, applying all the updates in $S_o$ solves the problem $P$. Therefore, clearly $S'_o$ solves the problem instance $P'$. In order to argue that this is an optimal solution, all we need to show is that there does not exist a solution of smaller size. Suppose otherwise. Then there exists a solution $S$ of size smaller than $S'_o$ that solves $P'$. But then $S' = S \cup \{i\}$ solves $P$ and has size smaller than $S_o$, which contradicts the fact that $S_o$ is an optimal solution of $P$.

We now present the proof of Theorem 4.1.

Proof. The proof is by induction on the size of an optimal solution on instance $P$. For the base case, consider a problem instance $P$ for which any optimal solution has size 1. Let $W$ be a witness set returned by algorithm WALG. Clearly, $W$ is $k$-update competitive by definition. Now suppose that the claim holds for any problem instance $P$ having an optimum solution of size $i$ or less. Consider a problem instance $P$ for which any optimum solution has size $i + 1$. Let $W$ be a witness set of size $\le k$ returned by WALG.
Let the instance $P$ be reduced to instance $P'$ on applying updates on the areas in $W$. By Claim 9.1, any optimal solution on $P'$ has size $\le i$. By induction, the algorithm determines a $k$-update competitive solution $S'$ for $P'$. Hence $|S' \cup W| \le k(i + 1)$, and thus $S' \cup W$ is a $k$-update competitive solution for $P$.

9.2 Proof of Theorem 4.2

Proof. We formally prove this for the CP-C model. The proofs for the other models follow similarly; we point out the changes required. Let $P$ be any instance of the given problem under the CP-C model. Let $P'$ be obtained from $P$ by modifying the configuration and areas of uncertainty as follows: (i) all the closed intervals are replaced with open intervals; and (ii) the configuration is suitably modified in order to ensure that the configuration points are always contained in the corresponding areas of uncertainty (this is explained in more detail later). Let $W$ be any witness set for $P'$ under the OP-P model. We need to show that $W$ is also a witness set for $P$ under the CP-C model.

Suppose this is not so, i.e., $W$ is not a witness set for $P$ under the CP-C model. We will then argue that there exists a set of queries excluding $W$ that, when applied to $P'$ under the OP-P model, can result in an instance for which the verifier returns true; this implies that $W$ is not a witness set for $P'$ under the OP-P model, leading to a contradiction. Hence our supposition is incorrect and $W$ must be a witness set for $P$ as well.

It remains to find a possible set of queries excluding $W$, and query outcomes, that when applied to $P'$ under the OP-P model result in an instance for which the verifier returns true, under the assumption that $W$ is not a witness set for $P$ under the CP-C model. Considering this assumption, there exists a solution $S = \{i_1, i_2, \ldots, i_k\}$ for $P$ under the CP-C model that does not contain the index for any area in $W$.
Let $P_1, P_2, \ldots, P_k$ be the sequence of instances obtained on applying the updates in $S$, where $P_t$ is obtained from $P_{t-1}$ on applying the update on $a_{i_t}$ for $1 \le t \le k$. For any interval (not a point) $a_j$ in $P_k$ (the final configuration in the sequence above), let $l_j, u_j$ denote the interval end points. Let $\varepsilon = \min\{u_j - l_j \mid a_j \text{ is an interval in } P_k\}$, i.e., $\varepsilon$ is the minimum length of any interval in $P_k$. As mentioned earlier, the configuration points in $P'$ are also suitably modified in order to ensure that they are always contained in the corresponding areas of uncertainty. This is done by setting a configuration point $c_j$ to $c_j + \varepsilon/10$ if $c_j = l_j$ in $P_k$, and setting it to $c_j - \varepsilon/10$ if $c_j = u_j$ in $P_k$. This ensures that no configuration point coincides with the interval end points; this will allow us to replace closed intervals with open intervals. Moreover, the modified configuration points are consistent with all the query outputs.

Now, consider the case where the same sequence of updates in $S$ is applied to $P'$ under the OP-P model. A possible sequence of outcomes is $P'_1, P'_2, \ldots, P'_k$, wherein $P'_t$ is the same as $P_t$ with all closed intervals replaced by open intervals ($P'_t$ is obtained from $P'_{t-1}$ on applying the update on $a'_{i_t}$ for $1 \le t \le k$). Now note that since $S$ solves $P$, the verifier returns true for $P_k$. Moreover, $P_k$ and $P'_k$ are order-equivalent, and since the problem is order-invariant, the verifier must return true for $P'_k$ under the OP-P model as well. This implies that $W$ is not a witness set for $P'$ under the OP-P model, leading to the required contradiction.

We can similarly show that the witness set for the OP-P model can be reused for a variety of other models, thereby resulting in comparable update-competitive algorithms.
The proofs are similar to that of Theorem 4.2 above; the only difference is in the way the instances $P'_1, \ldots, P'_k$ for the OP-P model are constructed from the instances $P_1, \ldots, P_k$ for the new model. For the C-C model, the OC-OC model and the OCP-OC model, the instance $P'_t$ is obtained from instance $P_t$ by replacing the closed intervals in $P_t$ with the corresponding open intervals and modifying the configuration points as described in the proof above. For the O-O model and the OP-O model, the instance $P'_t$ is obtained from instance $P_t$ by replacing the intervals in $P_t$ corresponding to the areas of uncertainty having indices in the set $\{i_1, i_2, \ldots, i_t\}$ with the corresponding configuration points $c_{i_1}, c_{i_2}, \ldots, c_{i_t}$. Note that in this case, it does not make sense to query the interval on the same index more than once; therefore the number of queries can be reduced. This completes the proof of the Theorem.

Proof of Corollary 4.3

Proof. Consider the CP-C model. By Theorem 4.2, we know that the witness algorithm for the OP-P model is also a witness algorithm for the CP-C model. Moreover, the verifier for the OP-P model is also a verifier for the CP-C model, as the problem considered is order-invariant. The proof for the other models follows similarly.

Appendix C. Proofs for the Selection Problem

Proof of Lemma 5.1

Proof. The proof is by contradiction. Suppose OPT updates neither $a_{p_1}$ nor $a_{p_2}$. Let the interval returned by OPT as the answer be $a_q$. We consider the two cases:

• $a_q = a_{p_1}$: As the witness algorithm is invoked only when the verifier returns false, by examining the condition in Step 2 of the verifier (which must have failed for the current instance), we conclude that $\exists x \in a_{p_1}$ and $y \in a_{p_j}$, $j \ne 1$, such that $y < x$.
Thus OPT has not fully demonstrated that $a_{p_1}$ contains the point which is minimum, as $a_{p_2}$ could be made to contain the minimum point.

• $a_q \ne a_{p_1}$: By the definition of $\mathrm{order}_l(\cdot)$ applied in Step 1 of the witness algorithm and by examining the condition in Step 2 of the verifier, we conclude that $l_{p_1} \le l_{p_q}$. Thus, OPT has not demonstrated that $a_{p_q}$ contains the point which is minimum, as $a_{p_1}$ could be made to contain the minimum point.

Proof of Lemma 5.2

Proof. The proof is again by contradiction. There are two cases:

• The witness set returned is the witness set of the 1-Min algorithm, $W = \{a_{p_k}, a_{p_{k+1}}\}$: Since the first $k-1$ intervals are disjoint with the rest of the intervals, the problem of finding the $k$-th minimum interval becomes the problem of finding 1-Min in $S \setminus S'$. Using Lemma 5.1, $W$ is a valid witness set.

• $W = \{a_{p_k}, a_{q_1}\}$: Suppose OPT updates neither $a_{p_k}$ nor $a_{q_1}$. Let the interval returned by OPT be $a_j$. So $a_j$ has to be disjoint with all the other intervals. Since the witness algorithm was called only because the verifier returned false, by examining the condition of Step 3 of the verifier, we infer that $\exists a_{p_i}$ with $1 \le i \le k-1$ such that $\exists x \in a_{p_i}, y \in a_{p_k}$ for which $x > y$. So $a_{p_i}$ and $a_{p_k}$ are not disjoint. If such an $a_{p_i}$ exists, then by the definition of $a_{q_1}$, we see that $a_{q_1}$ and $a_{p_k}$ are also not disjoint. So the solution returned by OPT cannot be $a_{p_k}$ or $a_{q_1}$, as they are not disjoint. As $a_j$ must be disjoint, we consider the following cases:

  – $u_j \le l_{p_k}$ and $u_j \le l_{q_1}$: Initially there were less than $k-2$ intervals with $l$ values $\le l_{q_1}$. Since $a_{q_1}$ is not updated, any update of other intervals cannot increase the number of intervals with $l$ values $\le l_{q_1}$. Since $u_j \le l_{p_k}$, the number of intervals with $l$ values $\le l_j$ is less than $k-3$. So $a_j$ cannot be the $k$-th minimum interval.
  – $l_j \ge u_{p_k}$ and $l_j \ge u_{q_1}$: Initially there are $k-2$ intervals with $u$ values $\le u_{q_1}$. Since $a_{q_1}$ is not updated, any update of other intervals is not going to decrease the number of such intervals. These intervals together with $q_1$ and $p_k$ have $u$ values $\le l_j$. So there are $k$ intervals with $u$ values $\le l_j$. So $a_j$ cannot be the $k$-th minimum interval.

Appendix D. Proofs for Bypassing the Witness Set Framework

Proof of Lemma 6.1

Proof. Assume for contradiction that we have queried $c_{OPT} + j$ intervals, where $j \ge 2$. Let $a_1$ and $a_2$ be any two intervals that the algorithm in Figure 6 has queried but OPT has not queried, such that $l_1 \le l_2$. Since OPT did not query $a_1$, we conclude that $a_1$ is the interval which contains the minimum. Also, since the algorithm in Figure 6 queried $a_2$, $\exists x \in a_1$ and $y \in a_2$ such that $y < x$. But we have assumed that OPT does not query $a_2$, so OPT cannot demonstrate that $a_1$ contains the point which is minimum. So we get a contradiction.

Proof of Lemma 6.2

Proof. Let $1 < k \le n-k$; the other case can be argued similarly, and $k = 1$ is addressed by the algorithm in Figure 6. If $S'_{max}$ is not queried by OPT, then $S'_{max}$ has rank $\le k$. $S'_{max}$ cannot have rank $> k$ by definition of $S'$. Indeed, if $S'_{max}$ has rank $> k$, then there must be at least $k$ points to the left of $S'_{max}$, which violates the definition of $S'$. If $S'_{max}$ has rank $\le k$, then at most $k-1$ such intervals can remain unqueried; otherwise the rank of the element returned cannot provably be $k$. (If $S'_{max}$ has rank $= k$, then OPT must query all except one, which is $\le k-1$ for $k > 1$.) For the second phase, to find the maximum among $S'$, the algorithm of Figure 6 needs at most $c^{max}_{OPT} + 1$ queries. So, overall, our algorithm makes at most $k - 1 + 1 = k$ queries more than OPT.

Appendix E.
Proofs for Closed intervals with point returning queries

Proof of Lemma 7.1

Proof. The proof is again by contradiction. Suppose OPT updates neither $a_{p_1}$ nor $a_{p_2}$. Let the interval returned by OPT as the answer be $a_q$. We consider the two cases:

• $a_q = a_{p_1}$: As the witness algorithm is invoked only when the verifier returns false, by examining the condition in Step 2 of the verifier (which must have failed for the current instance), we conclude that either (i) $\exists x \in a_{p_1}$ and $y \in a_{p_j}$, $j \ne 1$, such that $y < x$; or (ii) $\exists x \in a_{p_1}$ and $y \in a_{p_j}$ such that $y = x$ and $p_2 < p_1$. In either case, we observe that OPT has not fully demonstrated that $a_{p_1}$ contains the point which is minimum, as $a_{p_2}$ could be made to contain the minimum point.

• $a_q \ne a_{p_1}$: By the definition of $\mathrm{order}_l(\cdot)$ applied in Step 1 of the witness algorithm and examining the condition in Step 2 of the verifier, we conclude that $l_{p_1} \le l_{p_q}$. Thus, OPT has not demonstrated that $a_{p_q}$ contains the point which is minimum, as $a_{p_1}$ could be made to contain the minimum point.
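To make the objects in this proof concrete, the witness algorithm and verifier of Figure 8 can be sketched in Python for the CP-P model. This is an illustrative sketch under stated assumptions: areas are stored in a dictionary mapping an index to a closed interval `(lo, hi)`, a point is the degenerate interval `(v, v)`, a query replaces an area by its true value, and already resolved points are not queried again. The names `order_l`, `verifier`, `witness` and `lex_one_min` are hypothetical.

```python
def order_l(areas):
    """Indices sorted by lower end point. Every area is left-closed in CP-P,
    so ties are broken by the smaller index only."""
    return sorted(areas, key=lambda j: (areas[j][0], j))


def verifier(areas):
    """Return index p1 if a_{p1} is provably the lexicographically smallest
    minimum, otherwise None (the 'false' outcome of Figure 8)."""
    p1 = order_l(areas)[0]
    hi = areas[p1][1]
    for j, (lo, _) in areas.items():
        if j == p1:
            continue
        # Larger indices may tie with p1; smaller indices must be strictly above.
        ok = hi <= lo if j > p1 else hi < lo
        if not ok:
            return None
    return p1


def witness(areas):
    """Witness set {p1, p2}: the first two indices in the ordering."""
    return order_l(areas)[:2]


def lex_one_min(areas, true_values):
    """Update witness sets until the verifier succeeds; count the queries."""
    areas = dict(areas)
    queries = 0
    while (answer := verifier(areas)) is None:
        for j in witness(areas):
            if areas[j][0] != areas[j][1]:  # skip areas that are already points
                areas[j] = (true_values[j], true_values[j])
                queries += 1
    return answer, queries


if __name__ == "__main__":
    start = {1: (1, 3), 2: (1, 3), 3: (1, 3)}
    print(lex_one_min(start, {1: 2, 2: 2, 3: 1}))
    print(lex_one_min(start, {1: 2, 2: 2, 3: 2}))
```

On the instance from Section 7 with three areas $[1, 3]$ and true values 2, 2, 1, the sketch answers index 3 after three queries. When all three true values equal 2, it still needs all three queries before it can answer index 1, which reflects why an optimal strategy cannot stop early under the lexicographic condition.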
