Space Efficient Multi-Dimensional Range Reporting

Marek Karpinski∗  Yakov Nekrich†
Dept. of Computer Science, University of Bonn.

Abstract

We present a data structure that supports three-dimensional range reporting queries in $O(\log\log U + (\log\log n)^3 + k)$ time and uses $O(n\log^{1+\varepsilon} n)$ space, where $U$ is the size of the universe, $k$ is the number of points in the answer, and $\varepsilon$ is an arbitrary constant. This result improves over the data structure of Alstrup, Brodal, and Rauhe (FOCS 2000) that uses $O(n\log^{1+\varepsilon} n)$ space and supports queries in $O(\log n + k)$ time, the data structure of Nekrich (SoCG'07) that uses $O(n\log^3 n)$ space and supports queries in $O(\log\log U + (\log\log n)^2 + k)$ time, and the data structure of Afshani (ESA'08) that uses $O(n\log^3 n)$ space and also supports queries in $O(\log\log U + (\log\log n)^2 + k)$ time but relies on randomization during the preprocessing stage. Our result allows us to significantly reduce the space usage of the fastest previously known static and incremental $d$-dimensional data structures, $d \ge 3$, at a cost of increasing the query time by a negligible $O(\log\log n)$ factor.

1 Introduction

The range reporting problem is to store a set of $d$-dimensional points $P$ in a data structure so that, for a query rectangle $Q$, all points in $Q \cap P$ can be reported. In this paper we significantly improve the space usage and preprocessing time of the fastest previously known static and semi-dynamic data structures for orthogonal range reporting, with only a negligible increase in the query time. Range reporting has been studied extensively since at least the 1970s; the history of this problem is rich with different trade-offs between query time and space usage.
Static range reporting queries can be answered in $O(\log^d n + k)$ time and $O(n\log^{d-1} n)$ space using range trees [4], known since 1980; here and further $n$ denotes the number of points in $P$ and $k$ denotes the number of points from $P$ in the query rectangle. The query time can be reduced to $O(\log^{d-1} n + k)$ by applying the fractional cascading technique of Chazelle and Guibas [8], designed in 1985. The space usage was further improved by Chazelle [6]. In the 1990s, Subramanian and Ramaswamy [13] and Bozanis, Kitsios, Makris, and Tsakalidis [5] showed that $d$-dimensional queries can be answered in $\tilde{O}(\log^{d-2} n + k)$ time¹ at a cost of higher space usage: their data structures use $O(n\log^{d-1} n)$ and $O(n\log^{d} n)$ space respectively. Alstrup, Brodal, and Rauhe [2] designed a data structure that answers queries in $\tilde{O}(\log^{d-2} n + k)$ time and uses $O(n\log^{d-2+\varepsilon} n)$ space for an arbitrary constant $\varepsilon > 0$. Nekrich [12] reduced the query time by an $\tilde{O}(\log n)$ factor and presented a data structure that answers queries in $O(\log^{d-3} n/(\log\log n)^{d-5} + k)$ time for $d > 3$. Unfortunately, the data structure of [12] uses $O(n\log^{d+1+\varepsilon} n)$ space. Recently, Afshani [1] reduced the space usage to $O(n\log^{d+\varepsilon} n)$; however, his data structure uses randomization (during the preprocessing stage). In this paper we present a data structure that matches the space efficiency of [2] at a cost of increasing the query time by a negligible $O(\log\log n)$ factor: our data structure supports queries in $O(\log^{d-3} n/(\log\log n)^{d-6} + k)$ time and uses $O(n\log^{d-2+\varepsilon} n)$ space for $d > 3$. See Table 1 for a more precise comparison of different results.

∗ Email marek@cs.uni-bonn.de.
† Email yasha@cs.uni-bonn.de.
¹ We define $\tilde{O}(f(n)) = O(f(n)\log^{c}(f(n)))$ for a constant $c$.

Source       Query Time                                         Space
[4]          $O(\log^{d} n + k)$                                $O(n\log^{d-1} n)$
[8]          $O(\log^{d-1} n + k)$                              $O(n\log^{d-1} n)$
[6]          $O(\log^{d-1} n + k)$                              $O(n\log^{d-2+\varepsilon} n)$
[13]         $O(\log^{d-2} n \log^{**} n + k)$                  $O(n\log^{d-1} n)$
[5]          $O(\log^{d-2} n + k)$                              $O(n\log^{d} n)$
[2]          $O(\log^{d-2} n/(\log\log n)^{d-3} + k)$           $O(n\log^{d-2+\varepsilon} n)$
[12]         $O(\log^{d-3} n/(\log\log n)^{d-5} + k)$           $O(n\log^{d+1+\varepsilon} n)$
[1]†         $O(\log^{d-3} n/(\log\log n)^{d-5} + k)$           $O(n\log^{d+\varepsilon} n)$
This paper   $O(\log^{d-3} n/(\log\log n)^{d-6} + k)$           $O(n\log^{d-2+\varepsilon} n)$

Table 1: Data structures in $d > 3$ dimensions; † indicates that a data structure is randomized. We define $\log^{*}(n) = \min\{t \mid \log^{(t)} n \le 1\}$ and $\log^{**} n = \min\{t \mid \log^{*(t)} n \le 1\}$, where $\log^{*(t)} n$ denotes computing $\log^{*}$ $t$ times.

Our result for $d$-dimensional range reporting is obtained as a corollary of a three-dimensional data structure that supports queries in $O(\log\log U + (\log\log n)^3 + k)$ time and uses $O(n\log^{1+\varepsilon} n)$ space, where $U$ is the size of the universe, i.e. all point coordinates are positive integers bounded by $U$. Our three-dimensional data structure is to be compared with the data structure of [2], which also uses $O(n\log^{1+\varepsilon} n)$ space but answers queries in $O(\log n + k)$ time, and the data structure of [12], which answers queries in $O(\log\log U + (\log\log n)^2 + k)$ time but needs $O(n\log^{4} n)$ space. See Table 2 for a more extensive comparison with previous results. A corollary of our result is an efficient semi-dynamic data structure that supports three-dimensional queries in $\tilde{O}(\log n + k)$ time and insertions in $O(\log^{5} n)$ time. Thus we improve the space usage and the update time of the fastest previously known semi-dynamic data structure [12], which supports insertions in $O(\log^{8} n)$ time.
If we are ready to pay penalties for each point in the answer, the space usage can be further reduced: we describe a data structure that uses $O(n\log^{d-2} n(\log\log n)^3)$ space and answers queries in $O(\log^{d-3} n(\log\log n)^3 + k\log\log n)$ time. We can also use this data structure to answer emptiness queries (to determine whether a query rectangle $Q$ contains points from $P$) and one-reporting queries (i.e. to report an arbitrary point from $P \cap Q$ if $P \cap Q \ne \emptyset$). This is an $\tilde{O}(\log n)$ factor improvement in query time over the data structure of Alstrup et al. [2]. Other similar data structures are either slower or require higher penalties for each point in the answer.

Throughout this paper, $\varepsilon$ denotes an arbitrarily small constant, and $U$ denotes the size of the universe. If each point in the answer can be output in constant time, we will sometimes say that the query time is $O(f(n))$ (instead of $O(f(n) + k)$). We let $[a, b]$ denote the set of integers $\{i \mid a \le i \le b\}$; the intervals $[a, b)$ and $(a, b]$ denote the same set as $[a, b]$ but without $b$ (resp. without $a$). We denote by $[b]$ the set $[1, b]$. In section 3 we describe a space efficient data structure for three-dimensional range reporting on a $[U] \times [U] \times [U]$ grid, i.e. in the case when all point coordinates belong to $[U]$.

Source       Query Time                                   Space
[6]          $O(\log^{2} n + k)$                          $O(n\log^{1+\varepsilon} n)$
[13]         $O(\log n \log^{**} n + k)$                  $O(n\log^{2} n)$
[5]          $O(\log n + k)$                              $O(n\log^{3} n)$
[2]          $O(\log n + k)$                              $O(n\log^{1+\varepsilon} n)$
[12]         $O(\log\log U + (\log\log n)^2 + k)$         $O(n\log^{4+\varepsilon} n)$
[1]†         $O(\log\log U + (\log\log n)^2 + k)$         $O(n\log^{3} n)$
This paper   $O(\log\log U + (\log\log n)^3 + k)$         $O(n\log^{1+\varepsilon} n)$

Table 2: Three-dimensional data structures; † indicates that a data structure is randomized.
In section 4 we describe a variant of our data structure that uses less space but needs $O(\log\log n)$ time to output each point in the answer. All results of this paper are valid in the word RAM computation model.

2 Preliminaries

We use the same notation as in [14] to denote the special cases of three-dimensional range reporting queries: a product of three half-open intervals will be called a (1,1,1)-sided query; a product of a closed interval and two half-open intervals will be called a (2,1,1)-sided query; a product of two closed intervals and one half-open interval (resp. three closed intervals) will be called a (2,2,1)-sided (resp. (2,2,2)-sided) query. Clearly, (1,1,1)-sided queries are equivalent to dominance reporting queries, and a (2,2,2)-sided query is the general three-dimensional query. The following transformation is described in e.g. [14] and [13].

Lemma 1 Let $1 \le a_i \le b_i \le 2$ for $i = 1, 2, 3$. A data structure that answers $(a_1, a_2, a_3)$ queries in $O(q(n))$ time, uses $O(s(n))$ space, and can be constructed in $O(c(n))$ time can be transformed into a data structure that answers $(b_1, b_2, b_3)$ queries in $O(q(n))$ time, uses $O(s(n)\log^{t} n)$ space, and can be constructed in $O(c(n)\log^{t} n)$ time for $t = (b_1 - a_1) + (b_2 - a_2) + (b_3 - a_3)$.

We say that a set $P$ is on a grid of size $n$ if all coordinates of all points in $P$ belong to an interval $[n]$. We will need the following folklore result:

Lemma 2 There exists an $O(n^{1+\varepsilon})$ space data structure that supports range reporting queries on a $d$-dimensional grid of size $n$ for any constant $d$ in $O(k)$ time.

Proof: One-dimensional range reporting queries on the $[n] \times [n] \times [n]$ grid can be answered in $O(k)$ time using a trie with node degree $n^{\varepsilon}$.
Using range trees [4] with node degree $\rho$ we can transform a $d$-dimensional $O(s(n))$ space data structure into a $(d+1)$-dimensional data structure that uses $O(s(n)h(n)\cdot\rho)$ space and answers range reporting queries in $O(q(n)h(n))$ time, where $h(n) = \log n/\log\rho$ is the height of the range tree. Since $\rho = n^{\varepsilon}$, $h(n) = O(1)$. Hence, the query time does not depend on the dimension and the space usage increases by a factor $O(n^{\varepsilon})$ with each dimension. □

We use Lemma 2 to obtain a data structure that supports queries that are a product of a $(d-1)$-dimensional query on a universe of size $n^{1-\varepsilon}$ and a half-open interval. We will show in the next lemma that such queries can be answered in $O(n)$ space and $O(1)$ time.

Lemma 3 There exists an $O(n)$ space data structure that supports range reporting queries of the form $Q' \times [-\infty, x)$ in $O(k)$ time, where $Q'$ is a $(d-1)$-dimensional query on $[U_1] \times [U_2] \times \dots \times [U_{d-1}]$ and $U_1 \cdot U_2 \cdot \ldots \cdot U_{d-1} = O(n^{1-\varepsilon})$.

Proof: There are $O(n^{1-\varepsilon})$ possible projections of points onto the first $d-1$ coordinates. Let $\min(p_1, \dots, p_{d-1})$ denote the point with minimal $d$-th coordinate among all points whose first $d-1$ coordinates equal $p_1, p_2, \dots, p_{d-1}$. We store the points $\min(p_1, \dots, p_{d-1})$ for all $p_1 \in [U_1]$, $p_2 \in [U_2]$, $\dots$, $p_{d-1} \in [U_{d-1}]$ in a data structure $M$. Since $M$ contains $O(n^{1-\varepsilon})$ points, we can use Lemma 2 and implement $M$ in $O(n)$ space. For all possible $p_1 \in [U_1]$, $p_2 \in [U_2]$, $\dots$, $p_{d-1} \in [U_{d-1}]$ we also store a list $L(p_1, \dots, p_{d-1})$ of points whose first $d-1$ coordinates are $p_1, \dots, p_{d-1}$; points in $L(p_1, \dots, p_{d-1})$ are sorted by their $d$-th coordinates. Given a query $Q = Q' \times [-\infty, x)$, we first answer $Q$ using the data structure $M$.
Since $M$ contains $O(n^{1-\varepsilon})$ points, we can find all points in $M \cap Q$ in $O(|M \cap Q|)$ time. Then, for every point $p = (p_1, \dots, p_{d-1}, p_d)$ found with the help of $M$, we traverse the corresponding list $L(p_1, \dots, p_{d-1})$ and report all points in this list whose last coordinate does not exceed $x$. □

In several places of our proofs we will use the reduction to rank space technique [10, 6]. This technique allows us to replace the coordinates of a point by their ranks. Let $P_x$, $P_y$, and $P_z$ be the sets of $x$-, $y$-, and $z$-coordinates of points from $P$. For a point $p = (p_x, p_y, p_z)$, let $p' = (\mathrm{rank}(p_x, P_x), \mathrm{rank}(p_y, P_y), \mathrm{rank}(p_z, P_z))$, where $\mathrm{rank}(e, S)$ is defined as the number of elements in $S$ that are smaller than or equal to $e$. A point $p$ belongs to an interval $[a, b] \times [c, d] \times [e, f]$ if and only if the point $p'$ belongs to the interval $[a', b'] \times [c', d'] \times [e', f']$, where $a' = \mathrm{succ}(a, P_x)$, $b' = \mathrm{pred}(b, P_x)$, $c' = \mathrm{succ}(c, P_y)$, $d' = \mathrm{pred}(d, P_y)$, $e' = \mathrm{succ}(e, P_z)$, $f' = \mathrm{pred}(f, P_z)$, and $\mathrm{succ}(e, S)$ ($\mathrm{pred}(e, S)$) denotes the smallest (largest) element in $S$ that is greater (smaller) than or equal to $e$.

Reduction to rank space can be used to reduce range reporting queries to range reporting on the $[n] \times [n] \times [n]$ grid. Suppose we can find $\mathrm{pred}(e, S)$ and $\mathrm{succ}(e, S)$ for any $e$, where $S$ is $P_x$, $P_y$, or $P_z$, in time $f(n)$. Suppose that range reporting queries on the $[n] \times [n] \times [n]$ grid can be answered in time $O(g(n) + k)$. Then we can answer range reporting queries in $O(f(n) + g(n) + k)$ time. Following [2], we can also use the reduction to rank space technique to reduce the space usage: if a data structure contains $m$ elements, reduction to rank space allows us to store each element in $O(\log m)$ bits.
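The reduction to rank space can be illustrated with a minimal Python sketch. It is not the paper's implementation: sorted lists with binary search stand in for the predecessor structure (so $f(n) = O(\log n)$ here rather than $O(\log\log U)$), a linear scan stands in for the grid data structure, and the names `build_rank_space`, `to_rank_interval`, and `query` are hypothetical. Query endpoints are mapped to the ranks of their successor and predecessor.

```python
from bisect import bisect_left, bisect_right


def rank(e, S):
    """Number of elements of the sorted list S that are <= e."""
    return bisect_right(S, e)


def build_rank_space(points):
    """Return the sorted distinct coordinates per axis and the ranked points."""
    axes = [sorted({p[d] for p in points}) for d in range(3)]
    ranked = [tuple(rank(p[d], axes[d]) for d in range(3)) for p in points]
    return axes, ranked


def to_rank_interval(lo, hi, S):
    """Map the closed interval [lo, hi] to rank space.

    The left end becomes the rank of succ(lo, S), the right end the rank of
    pred(hi, S). The result is empty (left > right) if S has no element in
    [lo, hi].
    """
    lo_r = bisect_left(S, lo) + 1   # rank of the smallest element >= lo
    hi_r = bisect_right(S, hi)      # rank of the largest element <= hi
    return lo_r, hi_r


def query(points, axes, ranked, box):
    """Answer a box query in rank space and translate the answer back.

    Assumption: the linear scan is only a placeholder for a data structure
    on the [n] x [n] x [n] grid.
    """
    rbox = [to_rank_interval(lo, hi, axes[d]) for d, (lo, hi) in enumerate(box)]
    return [points[i] for i, q in enumerate(ranked)
            if all(rbox[d][0] <= q[d] <= rbox[d][1] for d in range(3))]
```

After the mapping every stored coordinate is an integer in $[1, n]$, which is what allows each element to be stored in $O(\log m)$ bits when the structure holds $m$ elements.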
3 Space Efficient Three-Dimensional Data Structure

In this section we describe a data structure that supports three-dimensional range reporting queries in $O((\log\log n)^3 + \log\log U + k)$ time, where $U$ is the universe size, and uses $O(n\log^{1+\varepsilon} n)$ space. Our data structure combines the recursive divide-and-conquer approach introduced in [2], the result of Lemma 3, and the transformation of $(a_1, a_2, a_3)$-queries into $(b_1, b_2, b_3)$-queries described in Lemma 1. We start with a description of a space efficient modification of the data structure for (1,1,1)-sided queries on the $[n] \times [n] \times [n]$ grid. Then, we obtain data structures for (2,1,1)-sided and (2,2,1)-sided queries on the $[n] \times [n] \times [n]$ grid using the recursive divide-and-conquer and Lemma 3. Finally, we obtain the data structure that supports arbitrary orthogonal queries on the $[n] \times [n] \times [n]$ grid using Lemma 1. The reduction to rank space technique described in section 2 allows us to transform a data structure on the $[n] \times [n] \times [n]$ grid into a data structure on the $[U] \times [U] \times [U]$ grid, so that the query time increases by an additive term $O(\log\log U)$ and the space usage is not increased.

Lemma 4 [12] Given a set of three-dimensional points $P$ and a parameter $t$, we can construct in $O(n\log^{3} n)$ time an $O(n)$ space data structure $T$ that supports the following queries on a grid of size $n$:
(i) for a given query point $q$, $T$ determines in $O((\log\log n)^2)$ time whether $q$ is dominated by at most $t$ points of $P$;
(ii) if $q$ is dominated by at most $t$ points from $P$, $T$ outputs in $O(t + (\log\log n)^2)$ time a list $L$ of $O(t)$ points such that $L$ contains all points of $P$ that dominate $q$.

As described in [12], Lemma 4 allows us to answer (1,1,1)-sided queries in $O((\log\log n)^2)$ time and $O(n\log n)$ space.
We can reduce the space usage to $O(n\log\log n)$ using an idea that is also used in [1].

Lemma 5 There exists a data structure that answers (1,1,1)-sided queries on the $[n] \times [n] \times [n]$ grid in $O((\log\log n)^2 + k)$ time, uses $O(n\log\log n)$ space, and can be constructed in $O(n\log^{3} n\log\log n)$ time.

Proof: For each parameter $t = 2^{2i}$, $i = i_{\min}, i_{\min}+1, \dots, \log\log n/2$, $i_{\min} = 2\log\log\log n$, we construct a data structure $T_i$ of Lemma 4. Given a query point $q$, we examine data structures $T_i$, $i = i_{\min}, i_{\min}+1, \dots, \log\log n/2$, until $q$ is dominated by at most $2^{2i}$ points of $P$ or the last data structure $T_i$ is examined. Thus we identify the index $l$ such that $q$ is dominated by more than $2^{2l}$ and less than $2^{2l+2}$ points, or determine that $q$ is dominated by at least $\log n$ points.

If $l = i_{\min}$, then $q$ is dominated by $O((\log\log n)^2)$ points. We can generate in $O((\log\log n)^2)$ time a list $L$ of $O((\log\log n)^2)$ points that contains all points dominating $q$. Then, we examine all points in $L$ and output all points that dominate $q$ in $O((\log\log n)^2)$ time.

If $\log\log n/2 > l > i_{\min}$, we can examine data structures $T_{i_{\min}}, T_{i_{\min}+1}, \dots, T_l$ in $O((l - i_{\min})(\log\log n)^2)$ time. Then, we generate the list $L$ that contains all points that dominate $q$ in $O(2^{2l})$ time. We can process $L$ and output all $k$ points that dominate $q$ in $O(2^{2l})$ time. Since $k > 2^{2l-2}$, $k = \Omega(2^{2l})$ and $k = \Omega((l - i_{\min})\cdot(\log\log n)^2)$. Hence, the query is answered in $O(k)$ time.

If $l = \log\log n/2$, then $q$ is dominated by $\Omega(\log n)$ points. In this case we can use a linear space data structure with $O(\log n)$ query time, e.g. the data structure of Chazelle and Edelsbrunner [7], to answer the query in $O(\log n + k) = O(k)$ time.

Since each data structure $T_i$ uses linear space, the space usage of the described data structure is $O(n\log\log n)$. □

Lemma 6 There exists a data structure that answers (2,1,1)-sided queries on the $[n] \times [n] \times [n]$ grid in $O((\log\log n)^3 + k)$ time, uses $O(n\log^{\varepsilon} n)$ space, and can be constructed in $O(n\log^{3} n\log\log n)$ time.

Proof: We divide the grid into $x$-slices $X_i = [x_{i-1}, x_i] \times [n] \times [n]$ and $y$-slices $Y_j = [n] \times [y_{j-1}, y_j] \times [n]$, so that each $x$-slice contains $n^{1/2+\gamma}$ points and each $y$-slice contains $n^{1/2+\gamma}$ points; the value of the constant $\gamma$ will be specified below. The cell $C_{ij}$ is the intersection of the $i$-th $x$-slice and the $j$-th $y$-slice, $C_{ij} = X_i \cap Y_j$. The data structure $D_t$ contains a point $(i, j, z)$ for each point $(x, y, z) \in P \cap C_{ij}$. Since the first two coordinates of points in $D_t$ are bounded by $n^{1/2-\gamma}$, $D_t$ uses $O(n)$ space and supports (2,1,1)-sided queries in constant time by Lemma 3. For each $x$-slice $X_i$ there are two data structures that support two types of (1,1,1)-sided queries, open in the $+x$ and in the $-x$ directions. For each $y$-slice $Y_j$, there is a data structure that supports (1,1,1)-sided queries open in the $+y$ direction. For each $y$-slice $Y_j$ and for each $x$-slice $X_i$ there are recursively defined data structures. Recursive subdivision stops when the number of elements in a data structure is smaller than a predefined constant. Hence, the number of recursion levels is $v\log\log n$ for $v = \log_{\frac{2}{1+2\gamma}} 2$.

Figure 1: Example of a (2,1,1)-sided query projected onto the $xy$-plane. $i_1 = 1$, $i_2 = 5$, $j_1 = 4$ and $a_0 = x_1$, $b_0 = x_4$, $c_0 = y_3$.

Essentially we apply the idea of [2] to three-dimensional (2,1,1)-sided queries.
If a query spans more than one $x$-slab and more than one $y$-slab, then it can be answered by answering two (1,1,1)-sided queries, one special (2,1,1)-sided query that can be processed using the technique of Lemma 3, and one (2,1,1)-sided query to a data structure with $n^{1/2+\gamma}$ points. If a query is contained in a slab, then it can be answered by a data structure that contains $n^{1/2+\gamma}$ points. We will show below that the query time is $O((\log\log n)^3)$. Each point is stored in $O(2^{i})$ data structures on recursion level $i$, but the space usage can be reduced because the number of points in data structures quickly decreases with the recursion level. We will show below that every point in a data structure on recursion level $i$ can be stored with approximately $(\log n/2^{i})\log^{\varepsilon'} n$ bits for an arbitrarily small $\varepsilon'$.

Query Time. Given a query $Q = [a, b] \times (-\infty, c] \times (-\infty, d]$ we identify the indices $i_1$, $i_2$, and $j_1$ such that the projections of all cells $C_{ij}$, $i_1 < i < i_2$, $j < j_1$, are entirely contained in $[a, b] \times (-\infty, c]$. Let $a_0 = x_{i_1}$, $b_0 = x_{i_2-1}$, and $c_0 = y_{j_1-1}$. The query $Q$ can be represented as $Q = Q_1 \cup Q_2 \cup Q_3 \cup Q_4$, where
$Q_1 = [a_0, b_0] \times (-\infty, c_0] \times (-\infty, d]$,
$Q_2 = [a, a_0) \times (-\infty, c] \times (-\infty, d]$,
$Q_3 = (b_0, b] \times (-\infty, c] \times (-\infty, d]$, and
$Q_4 = [a_0, b_0] \times (c_0, c] \times (-\infty, d]$.
See Fig. 1 for an example. Query $Q_1$ can be answered using $D_t$. Queries $Q_2$ and $Q_3$ can be represented as $Q_2 = ([a, +\infty) \times (-\infty, c] \times (-\infty, d]) \cap X_{i_1}$ and $Q_3 = ((-\infty, b] \times (-\infty, c] \times (-\infty, d]) \cap X_{i_2}$; hence, $Q_2$ and $Q_3$ are equivalent to (1,1,1)-sided queries on the $x$-slices $X_{i_1}$ and $X_{i_2}$. The query $Q_4$ can be answered by a recursively defined data structure for the $y$-slice $Y_{j_1}$ because $Q_4 = ([a_0, b_0] \times (-\infty, c] \times (-\infty, d]) \cap Y_{j_1}$.
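The decomposition $Q = Q_1 \cup Q_2 \cup Q_3 \cup Q_4$ can be sketched in Python on the $xy$-projection (the $z$-range $(-\infty, d]$ is the same in every piece and is omitted). This is an illustration under stated assumptions, not the paper's code: coordinates are integers, slice $i$ is taken to be the half-open range `[xb[i], xb[i+1])` so that neighbouring slices do not share boundary points, indices are 0-based, and the names `slice_index`, `decompose_211`, and `contains` are hypothetical.

```python
from bisect import bisect_right

NEG_INF = float("-inf")


def slice_index(boundaries, v):
    """Index i of the slice [boundaries[i], boundaries[i+1]) that contains v."""
    return bisect_right(boundaries, v) - 1


def decompose_211(xb, yb, a, b, c):
    """Split [a, b] x (-inf, c] into rectangles (xlo, xhi, ylo, yhi).

    All bounds are inclusive. Returns None when the query lies inside a
    single x-slice; that case is passed to the recursive structure of the
    slice instead.
    """
    i1, i2 = slice_index(xb, a), slice_index(xb, b)
    if i1 == i2:
        return None
    j1 = slice_index(yb, c)
    a0 = xb[i1 + 1]      # first coordinate after the slice that contains a
    b0 = xb[i2] - 1      # last coordinate before the slice that contains b
    c0 = yb[j1] - 1      # last coordinate below the slice that contains c
    q1 = (a0, b0, NEG_INF, c0)   # whole cells: answered by D_t
    q2 = (a, a0 - 1, NEG_INF, c)  # (1,1,1)-sided query inside x-slice i1
    q3 = (b0 + 1, b, NEG_INF, c)  # (1,1,1)-sided query inside x-slice i2
    q4 = (a0, b0, c0 + 1, c)      # recursive (2,1,1)-sided query in y-slice j1
    return q1, q2, q3, q4


def contains(rect, x, y):
    """True if the point (x, y) lies in the inclusive rectangle rect."""
    xlo, xhi, ylo, yhi = rect
    return xlo <= x <= xhi and ylo <= y <= yhi
```

With these conventions the four pieces are pairwise disjoint, so no point is reported twice; some pieces may be empty, for example $Q_1$ and $Q_4$ when the query touches two adjacent $x$-slices only.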
If $i_1 = i_2$ and the query $Q$ is contained in one $x$-slice, then $Q$ is processed by a recursively defined data structure for the corresponding $x$-slice. Thus a query is reduced to one special case that can be processed in constant time, two (1,1,1)-sided queries, and one (2,1,1)-sided query answered by a data structure that contains $n^{1/2+\gamma}$ elements. Queries $Q_2$ and $Q_3$ can be answered in $O((\log\log n)^2)$ time, and the query $Q_1$ can be answered in constant time. The query $Q_4$ is answered by a recursively defined data structure that contains $O(n^{1/2+\gamma})$ elements. If $i_1 = i_2$ or $j_1 = 1$, i.e. if $Q$ is entirely contained in one $x$-slice or one $y$-slice, then the query is answered by a data structure for the corresponding slice that contains $O(n^{1/2+\gamma})$ elements. Hence, the query time satisfies $q(n) = O((\log\log n)^2) + q(n^{1/2+\gamma})$ and $q(n) = O((\log\log n)^3)$.

Space Usage. The data structure consists of $O(\log\log n)$ recursion levels. The total number of points in all data structures on the $i$-th recursion level is $2^{i}n$. Hence all data structures on the $i$-th recursion level require $O(2^{i}n\log n)$ bits of space. The space usage can be reduced by applying the reduction to rank space technique [10, 6]. As explained in section 2, reduction to rank space allows us to replace point coordinates by their ranks. Hence, if we use this technique with a data structure that contains $m$ elements, each point can be specified with $O(\log m)$ bits. Thus, we can reduce the space usage by replacing point coordinates by their ranks on certain recursion levels.

We apply reduction to rank space on every $\delta\log\log n$-th recursion level for $\delta = \varepsilon/3$. Let $V$ be an arbitrary data structure on recursion level $r = s\delta\log\log n - 1$ for $1 \le s \le (1/\delta)\log_{\frac{2}{1+2\gamma}} 2$. Let $W$ be the set of points that belong to an $x$-slice or a $y$-slice of $V$.
We store a dictionary that enables us to find for each point $p = (p_x, p_y, p_z)$ from $W$ a point $p' = (p'_x, p'_y, p'_z)$, where $p'_x = \mathrm{rank}(p_x, W_x)$, $p'_y = \mathrm{rank}(p_y, W_y)$, $p'_z = \mathrm{rank}(p_z, W_z)$, and $W_x$, $W_y$, and $W_z$ are the sets of $x$-, $y$-, and $z$-coordinates of all points in $W$. Let $W'$ be the set of all points $p'$. Conversely, there is also a dictionary that enables us to find for a point $p' \in W'$ the corresponding $p \in W$. The data structure that answers queries on $W$ stores points in the rank space of $W$. In general, all data structures on recursion levels $r, r+1, \dots, r + \delta\log\log n - 1$ obtained by subdivision of $W$ store points in the rank space of $W$. That is, point coordinates in all those data structures are integers bounded by $|W|$. If such a data structure $R$ is used to answer a query $Q$, then for each point $p_R \in R \cap Q$ we must find the corresponding point $p \in P$. Since range reduction was applied $O(1)$ times, we can find for any $p_R \in R$ the corresponding $p \in P$ in $O(1)$ time.

Each data structure on level $r = s\delta\log\log n$ for $0 \le s \le (1/\delta)v$ and $v = \frac{1}{\log(2/(1+2\gamma))}$ contains $O(n^{l})$ elements for $l = (1/2+\gamma)^{r}$. Hence an arbitrary element of a data structure on level $r$ can be specified with $l\cdot\log n$ bits. The total number of elements in all data structures on the $r$-th level is $n2^{r}$. Hence all elements in all data structures on the $r$-th recursion level need $O\left(n2^{r}\left(\frac{1+2\gamma}{2}\right)^{r}\log n\log\log n\right)$ bits. We choose $\gamma$ so that $(1+2\gamma) \le 2^{\delta/2}$. Then $v = \frac{1}{1-\log_2(1+2\gamma)} \le \frac{1}{1-\delta/2}$ and $(1+2\gamma) \le 2^{\delta/2} \le 2^{\delta-\delta^2/2} \le 2^{\delta/v} = 2^{\varepsilon/(3v)}$. Since $r \le v\log\log n$, $(1+2\gamma)^{r} \le 2^{(\varepsilon/3)\log\log n} \le \log^{\varepsilon/3} n$. Therefore all data structures on level $r$ use $\log^{\varepsilon/3} n\cdot O(n\log n\log\log n) = O(n\log^{1+2\varepsilon/3} n)$ bits of space, or $O(n\log^{2\varepsilon/3} n)$ words of $\log n$ bits.
The number of elements in all data structures on levels $r+1, r+2, \dots$ increases by a factor of two in each level. Hence, the total space (measured in words) needed for all data structures on all levels $q$, $r \le q < r + \delta\log\log n$, is $\left(\sum_{f=1}^{\delta\log\log n - 1} 2^{f}\right) O(n\log^{2\varepsilon/3} n) = O(2^{\delta\log\log n}\, n\log^{2\varepsilon/3} n) = O(n\log^{\varepsilon} n)$ because $\delta \le \varepsilon/3$ and $2^{\delta\log\log n} \le \log^{\varepsilon/3} n$. Thus all data structures in a group of $\delta\log\log n$ consecutive recursion levels use $O(n\log^{\varepsilon} n)$ words of space. Since there are $(1/\delta)v = O(1)$ such groups of levels, the total space usage is $O(n\log^{\varepsilon} n)$.

Construction Time. The data structure on level 0 (the topmost recursion level) can be constructed in $O(n\log^{3} n\log\log n)$ time. The total number of elements in all data structures on level $s$ is $2^{s}n\log\log n$. But each data structure on the $r$-th recursion level contains at most $n_r = n^{l}$ elements and can be constructed in $O(l^{3}\cdot n_r\log^{3} n\log\log n)$ time, where $l = (1+2\gamma)^{r}/2^{r}$. Hence, all data structures on the $r$-th recursion level can be constructed in $O((2^{r}l^{3})\,n\log^{3} n\log\log n) = O(((1+2\gamma)^{3r}/2^{2r})\,n\log^{3} n\log\log n)$ time. We can assume that $\varepsilon < 1$. Since we chose $\gamma$ so that $(1+2\gamma) \le 2^{\varepsilon/6}$, $(1+2\gamma)^{3} < 2$; hence, $(1+2\gamma)^{3r}/2^{2r} \le 1/2^{r}$. Then, all data structures on the $r$-th recursion level can be constructed in $O((1/2^{r})\,n\log^{3} n\log\log n)$ time. Summing over all $r$, we see that all recursive data structures can be constructed in $O(n\log^{3} n\log\log n)$ time. □

Figure 2: Example of a (2,2,1)-sided query projected onto the $xy$-plane. $i_1 = 2$, $i_2 = 7$, $j_1 = 2$, and $j_2 = 5$.
Lemma 7 There exists a data structure that answers (2,2,1)-sided queries on the $[n] \times [n] \times [n]$ grid in $O((\log\log n)^3 + k)$ time, uses $O(n\log^{\varepsilon} n)$ space, and can be constructed in $O(n\log^{3} n\log\log n)$ time.

Proof: The proof technique is the same as in Lemma 6. The grid is divided into $x$-slices $X_i = [x_{i-1}, x_i] \times [n] \times [n]$ and $y$-slices $Y_j = [n] \times [y_{j-1}, y_j] \times [n]$ in the same way as in the proof of Lemma 6. Each $x$-slice $X_i$ supports (2,1,1)-sided queries open in the $+x$ and $-x$ directions; each $y$-slice $Y_j$ supports (2,1,1)-sided queries open in the $+y$ and $-y$ directions. All points are also stored in a data structure $D_t$ that contains a point $(i, j, z)$ for each point $(x, y, z) \in P \cap C_{ij}$. For every $x$-slice and $y$-slice there is a recursively defined data structure. The reduction to rank space technique is applied on every $\delta\log\log n$-th level in the same way as in Lemma 6.

Given a query $Q = [a, b] \times [c, d] \times (-\infty, e]$ we identify indices $i_1$, $i_2$, $j_1$, $j_2$ such that all cells $C_{ij}$, $i_1 < i < i_2$ and $j_1 < j < j_2$, are entirely contained in $Q$. Then $Q$ can be represented as a union of a query $Q_1 = [a_0, b_0] \times [c_0, d_0] \times (-\infty, e]$ and four (2,1,1)-sided queries
$Q_2 = [a, a_0) \times [c, d] \times (-\infty, e]$,
$Q_3 = (b_0, b] \times [c, d] \times (-\infty, e]$,
$Q_4 = [a_0, b_0] \times [c, c_0) \times (-\infty, e]$, and
$Q_5 = [a_0, b_0] \times (d_0, d] \times (-\infty, e]$,
where $a_0 = x_{i_1}$, $b_0 = x_{i_2-1}$, $c_0 = y_{j_1}$, and $d_0 = y_{j_2-1}$. See Fig. 2 for an example. The query $Q_1$ can be answered in constant time, and queries $Q_i$, $1 < i \le 5$, can be answered using the corresponding $x$- and $y$-slices. Since queries $Q_i$, $1 < i \le 5$, are equivalent to (2,1,1)-sided queries, each of those queries can be answered in $O((\log\log n)^3 + k)$ time.
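The five-piece decomposition used for (2,2,1)-sided queries can be checked with a short Python sketch on the $xy$-projection. As a simplifying assumption, coordinates are integers and slice $i$ covers the half-open range `[xb[i], xb[i+1])`; the function name `decompose_221` and the 0-based indexing are hypothetical and not taken from the paper.

```python
from bisect import bisect_right


def decompose_221(xb, yb, a, b, c, d):
    """Split [a, b] x [c, d] into inclusive rectangles (xlo, xhi, ylo, yhi).

    Returns None when the query fits into one x-slice or one y-slice; such a
    query is handed to the recursive structure of that slice.
    """
    i1 = bisect_right(xb, a) - 1   # x-slice that contains a
    i2 = bisect_right(xb, b) - 1   # x-slice that contains b
    j1 = bisect_right(yb, c) - 1   # y-slice that contains c
    j2 = bisect_right(yb, d) - 1   # y-slice that contains d
    if i1 == i2 or j1 == j2:
        return None
    a0, b0 = xb[i1 + 1], xb[i2] - 1
    c0, d0 = yb[j1 + 1], yb[j2] - 1
    return [
        (a0, b0, c0, d0),       # Q1: whole cells, answered by D_t
        (a, a0 - 1, c, d),      # Q2: (2,1,1)-sided query in x-slice i1
        (b0 + 1, b, c, d),      # Q3: (2,1,1)-sided query in x-slice i2
        (a0, b0, c, c0 - 1),    # Q4: (2,1,1)-sided query in y-slice j1
        (a0, b0, d0 + 1, d),    # Q5: (2,1,1)-sided query in y-slice j2
    ]
```

Each of $Q_2, \dots, Q_5$ touches exactly one slice and is bounded by the slice border on one side, which is why a (2,1,1)-sided structure per slice is enough.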
If the query $Q$ is entirely contained in one $x$-slice or one $y$-slice, then $Q$ is processed by a data structure for the corresponding $x$-slice resp. $y$-slice. Since the data structure consists of at most $v\log\log n$ recursion levels, the query can be transferred to a data structure for an $x$- or $y$-slice at most $v\log\log n$ times for $v = \frac{1}{\log(2/(1+2\gamma))}$. Hence, the total query time is $O(\log\log n + (\log\log n)^3 + k) = O((\log\log n)^3 + k)$. The space usage and construction time are estimated in the same way as in Lemma 6. □

Theorem 1 There exists a data structure that answers three-dimensional orthogonal range reporting queries on the $[U] \times [U] \times [U]$ grid in $O(\log\log U + (\log\log n)^3 + k)$ time, uses $O(n\log^{1+\varepsilon} n)$ space, and can be constructed in $O(n\log^{4} n\log\log n)$ time.

Proof: The result for the $[n] \times [n] \times [n]$ grid directly follows from Lemma 7 and Lemma 1. We can obtain the result for the $[U] \times [U] \times [U]$ grid by applying the reduction to rank space technique [10, 6]: we can use the van Emde Boas data structure [9] to find $\mathrm{pred}(e, S)$ and $\mathrm{succ}(e, S)$ for any $e \in [U]$ in $O(\log\log U)$ time, where $S \subset [U]$ is $P_x$, $P_y$, or $P_z$. Hence, the query time is increased by an additive term $O(\log\log U)$ and the space usage remains unchanged. □

Furthermore, we also obtain the result for $d$-dimensional range reporting, $d \ge 3$.

Corollary 1 There exists a data structure that answers $d$-dimensional orthogonal range reporting queries in $O(\log^{d-3} n/(\log\log n)^{d-6} + k)$ time, uses $O(n\log^{d-2+\varepsilon} n)$ space, and can be constructed in $O(n\log^{d+1+\varepsilon} n)$ time.

Proof: We can obtain a $d$-dimensional data structure from a $(d-1)$-dimensional data structure using range trees with node degree $\log^{\varepsilon} n$. See e.g. [2], [12] for details. □

Using Theorem 1 we can reduce the space usage and update time of the semi-dynamic data structure for three-dimensional range reporting queries.

Corollary 2 There exists a data structure that uses $O(n\log^{1+\varepsilon} n)$ space, and supports three-dimensional orthogonal range reporting queries in $O(\log n(\log\log n)^2 + k)$ time and insertions in $O(\log^{5+\varepsilon} n)$ time.

Proof: We can obtain the semi-dynamic data structure from the static data structure using a variant of the logarithmic method [3]. A detailed description can be found in [12]. The space usage remains the same, the query time increases by an $O(\log n/\log\log n)$ factor, and the amortized insertion time is $O\left(\frac{c(n)}{n}\log^{1+\varepsilon} n\right)$, where $c(n)$ is the construction time of the static data structure. □

The result of Corollary 2 can also be extended to $d > 3$ dimensions using range trees.

4 Three-Dimensional Emptiness Queries

We can further reduce the space usage of the three-dimensional data structure if we allow $O(\log\log n)$ penalties for each point in the answer. Such a data structure can also be used to answer emptiness and one-reporting queries. As in the previous section, we design space efficient data structures for (2,1,1)-sided and (2,2,1)-sided queries. The proof is quite similar to that for the data structure of section 3, but some parameters must be chosen in a slightly different way.

Theorem 2 There exists a data structure that answers three-dimensional orthogonal range reporting queries on the $[U] \times [U] \times [U]$ grid in $O(\log\log U + (\log\log n)^3 + k\log\log n)$ time, uses $O(n\log n(\log\log n)^3)$ space, and can be constructed in time $O(n\log^{4} n\log\log n)$.

For completeness, we provide the proof of Theorem 2 in the Appendix.
Using the standard range trees and reduction to rank space techniques we can obtain a $d$-dimensional data structure for $d > 3$.

Corollary 3 There exists a data structure that answers $d$-dimensional orthogonal range reporting queries for $d > 3$ in $O(\log^{d-3} n(\log\log n)^3 + k\log\log n)$ time, uses $O(n\log^{d-2} n(\log\log n)^3)$ space, and can be constructed in $O(n\log^{d+1} n\log\log n)$ time.

References

[1] P. Afshani, On Dominance Reporting in 3D, Proc. ESA 2008, 41-51.
[2] S. Alstrup, G. S. Brodal, T. Rauhe, New Data Structures for Orthogonal Range Searching, Proc. FOCS 2000, 198-207.
[3] J. L. Bentley, Decomposable Searching Problems, Information Processing Letters 8(5), 244-251 (1979).
[4] J. L. Bentley, Multidimensional Divide-and-Conquer, Commun. ACM 23, 214-229 (1980).
[5] P. Bozanis, N. Kitsios, C. Makris, A. Tsakalidis, New Results on Intersection Query Problems, The Computer Journal 40(1), 22-29 (1997).
[6] B. Chazelle, A Functional Approach to Data Structures and its Use in Multidimensional Searching, SIAM J. on Computing 17, 427-462 (1988). See also FOCS '85.
[7] B. Chazelle, H. Edelsbrunner, Linear Space Data Structures for Two Types of Range Search, Discrete & Computational Geometry 2, 113-126 (1987).
[8] B. Chazelle, L. J. Guibas, Fractional Cascading: I. A Data Structuring Technique, Algorithmica 1, 133-162 (1986). See also ICALP '85.
[9] P. van Emde Boas, Preserving Order in a Forest in Less Than Logarithmic Time and Linear Space, Inf. Process. Lett. 6(3), 80-82 (1977).
[10] H. Gabow, J. L. Bentley, R. E. Tarjan, Scaling and Related Techniques for Geometry Problems, Proc. STOC 1984, 135-143.
[11] Y. Nekrich, Space Efficient Dynamic Orthogonal Range Reporting, Algorithmica 49(2), 94-108 (2007).
[12] Y. Nekrich, A Data Structure for Multi-Dimensional Range Reporting, Proc. SoCG 2007, 344-353.
[13] S. Subramanian, S. Ramaswamy, The P-range Tree: A New Data Structure for Range Searching in Secondary Memory, Proc. SODA 1995, 378-387.
[14] D. E. Vengroff, J. S. Vitter, Efficient 3-D Range Searching in External Memory, Proc. STOC 1996, 192-201.

Appendix. Proof of Theorem 2

Lemma 8 There exists a data structure that answers $(2,1,1)$-sided queries on the $[n]\times[n]\times[n]$ grid in $O((\log\log n)^3 + k\log\log n)$ time, uses $O(n(\log\log n)^2)$ space, and can be constructed in $O(n\log^3 n\log\log n)$ time.

Proof: The data structure consists of the same components as the data structure of Lemma 6. But the size of $x$-slices and $y$-slices is reduced, so that each $x$-slice and each $y$-slice contains $n^{1/2}\log^p n$ points for a constant $p \ge 2$. The data structure $D_t$ contains a point $(i, j, z_{\min})$ for each cell $C_{ij} = X_i \cap Y_j$, $C_{ij} \cap P \ne \emptyset$, such that $z_{\min}$ is the minimal $z$-coordinate of a point in $C_{ij} \cap P$. The data structure $D_t$ can contain up to $n/\log^{2p} n$ elements. Combining the results of Lemma 1 and Lemma 5, we can implement $D_t$ in $O((n/\log^{2p} n)\log n\log\log n) = O(n)$ space, so that queries are supported in $O((\log\log n)^2 + k)$ time. A list $L_{ij}$ contains all points in $C_{ij}$ sorted by their $z$-coordinates. For each $x$-slice $X_i$, there are two data structures that support $(1,1,1)$-sided queries open in $+x$ and $-x$ direction. For each $y$-slice $Y_j$ there is a data structure for $(1,1,1)$-sided queries open in $+y$ direction. For each $x$-slice and $y$-slice, there is a recursively defined data structure. As shown in Proposition 1 of [11], the total number of elements in a data structure on the $r$-th recursion level can be estimated as $s_r(n) = O(n^{1/2^r}\log^p n\sqrt{\log\log n})$. The recursive sub-division stops when a data structure contains no more than $\log n$ elements.
In this case, the data structure is implemented using e.g. the data structure of [2], so that queries are answered in $O(\log\log n)$ time and $O(\log n(\log\log n)^{1+\varepsilon})$ space. In the same way as in Lemma 6, the query $Q$ can be represented as a union of (at most) one $(2,1,1)$-sided query on $D_t$, two $(1,1,1)$-sided queries on $x$-slices, and one $(2,1,1)$-sided query on a recursively defined data structure for a $y$-slice. Hence, the query time is $O((\log\log n)^3)$ if we ignore the time we need to output points in the answer.

Unlike the data structure of Lemma 6, we apply range reduction on every recursion level. Since the number of elements in a data structure on level $r$ is $s_r(n) = O(n^{1/2^r}\log^p n\sqrt{\log\log n})$, every element in a data structure on level $r$ can be represented with $\log(s_r(n)) = O((1/2^r)\log n + \log\log n)$ bits. Each data structure on level $r$ uses $O(s_r(n)\log(s_r(n))\log\log(s_r(n))) = O(s_r(n)\log(s_r(n))\log\log n)$ bits. The total number of elements in all data structures on level $r$ is $O(n2^r)$. Hence, all level $r$ data structures need $O(n\log n\log\log n + n2^r(\log\log n)^2)$ bits. Summing up by all recursion levels, the total space usage is $O(n\log n(\log\log n)^2) + \sum_{r=1}^{r_{\max}-1} n2^r(\log\log n)^2$ bits. The maximum recursion level $r_{\max} = \log\log n + c_r$ for a constant $c_r$. Hence, the second term can be estimated as $\sum_{r=1}^{r_{\max}} n2^r(\log\log n)^2 = O(n\log n(\log\log n)^2)$. If a data structure on the recursion level $r_{\max}$ contains $m$ elements, then it uses $O(m(\log\log n)^{1+\varepsilon})$ words of space because $m \le \log n$. All data structures on level $r_{\max}$ contain $O(n\log n)$ elements and use $O(n\log n(\log\log n)^{1+\varepsilon})$ bits of space. Thus the data structure uses $O(n(\log\log n)^2)$ words of $\log n$ bits in total.
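The space bound above can be checked level by level. The following worked derivation is only a sketch that restates the quantities from the proof (element size $\log s_r(n)$, $O(n2^r)$ elements on level $r$, and $r_{\max} = \log\log n + c_r$); logarithms are taken to base 2 and the constants hidden by the $O$-notation are dropped.

```latex
\begin{align*}
\text{bits on level } r
  &= O\!\left(n2^r \cdot \left(\tfrac{1}{2^r}\log n + \log\log n\right)\cdot \log\log n\right) \\
  &= O\!\left(n\log n\log\log n + n2^r(\log\log n)^2\right), \\
\sum_{r=1}^{r_{\max}} n\log n\log\log n
  &= O\!\left(n\log n(\log\log n)^2\right)
  \quad\text{since } r_{\max} = O(\log\log n), \\
\sum_{r=1}^{r_{\max}} n2^r(\log\log n)^2
  &= n(\log\log n)^2\left(2^{r_{\max}+1}-2\right)
  \le 2^{c_r+1}\, n\log n(\log\log n)^2 .
\end{align*}
```

Dividing the resulting $O(n\log n(\log\log n)^2)$ bits by the word size of $\log n$ bits gives the stated $O(n(\log\log n)^2)$ words.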
The drawback of applying reduction to rank space on each recursion level is that we must pay a (higher than a constant) penalty for each point in the answer. Consider a data structure $D_r$ on the $r$-th level of recursion, and let $P_r$ be the set of points stored in $D_r$. Coordinates of any point stored in $D_r$ belong to the rank space of $P_r$. To obtain the point $p \in P$ that corresponds to a point $p_r \in P_r$ we need $O(r) = O(\log\log n)$ time. Hence, our data structure answers queries in $O((\log\log n)^3 + k\log\log n)$ time. The construction time can be estimated with the formula

$$c(n) = O(n\log^3 n\log\log n) + 2(n^{1/2}/\log^p n)\,c(n^{1/2}\log^p n).$$

Therefore, $c(n) = O(n\log^3 n\log\log n)$. $\Box$

Lemma 9 There exists a data structure that answers $(2,2,1)$-sided queries on the $[n]\times[n]\times[n]$ grid in $O((\log\log n)^3 + k\log\log n)$ time, uses $O(n(\log\log n)^3)$ space, and can be constructed in $O(n\log^3 n\log\log n)$ time.

Proof: The data structure is the same as in Lemma 8 but in each $x$-slice there are two data structures for $(2,1,1)$-sided queries open in $+x$ and $-x$ directions, and in each $y$-slice there are two data structures for $(2,1,1)$-sided queries open in $+y$ and $-y$ direction. The query is processed in the same way as in Lemma 7. The space usage can be analyzed in the same way as in Lemma 8. Construction time can be estimated with the formula

$$c(n) = O(n\log^3 n\log\log n) + 2(n^{1/2}/\log^p n)\,c(n^{1/2}\log^p n)$$

and $c(n) = O(n\log^3 n\log\log n)$. $\Box$

Finally, we can apply Lemma 1 and reduction to rank space and obtain the data structure for three-dimensional orthogonal range reporting queries on the $[U]\times[U]\times[U]$ grid.
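The step from the recurrence for $c(n)$ to its closed form in Lemmas 8 and 9 is stated without detail. One possible way to see it is the induction sketched below. The symbols $f$, $m$, $A$, and $B$ are introduced here for illustration and do not appear in the original: $B$ stands for the constant hidden in the $O(n\log^3 n\log\log n)$ term, and the sketch assumes that $n$ is large enough that $p\log\log n \le \frac{1}{5}\log n$ and that the base cases of the recursion are absorbed into $A$.

```latex
\begin{align*}
\text{Let } f(n) &= n\log^3 n\log\log n, \quad m = n^{1/2}\log^p n,
  \quad\text{and assume } c(m) \le A f(m). \\
2\,\frac{n^{1/2}}{\log^p n}\, c(m)
  &\le 2A\,\frac{n^{1/2}}{\log^p n}\cdot m\log^3 m\log\log m
   = 2A\, n\log^3 m\log\log m \\
  &\le 2A\, n\left(\tfrac12\log n + p\log\log n\right)^3\log\log n \\
  &\le 2A\left(\tfrac{7}{10}\right)^3 n\log^3 n\log\log n
   = \tfrac{343}{500}\,A f(n), \\
c(n) &\le B f(n) + \tfrac{343}{500}\,A f(n) \le A f(n)
   \quad\text{for } A \ge \tfrac{500}{157}\,B .
\end{align*}
```

Under these assumptions the recursive term contributes only a constant fraction of $f(n)$, so the cost of the top level dominates and $c(n) = O(n\log^3 n\log\log n)$.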
