A General Framework for Graph Sparsification
Given a weighted graph $G$ and an error parameter $\epsilon > 0$, the {\em graph sparsification} problem requires sampling edges in $G$ and giving the sampled edges appropriate weights to obtain a sparse graph $G_{\epsilon}$ (containing O(n\log n) ed…
Authors: Ramesh Hariharan, Debmalya Panigrahi
A General Framework for Graph Sparsification

Ramesh Hariharan (Strand Life Sciences)
Debmalya Panigrahi (CSAIL, MIT)

Abstract

Given a weighted graph $G$ and an error parameter $\epsilon > 0$, the graph sparsification problem requires sampling edges in $G$ and giving the sampled edges appropriate weights to obtain a sparse graph $G_\epsilon$ with the following property: the weight of every cut in $G_\epsilon$ is within a factor of $(1 \pm \epsilon)$ of the weight of the corresponding cut in $G$. Benczúr and Karger [2] showed how to obtain $G_\epsilon$ with $O(n \log n/\epsilon^2)$ edges in time $O(m \log^3 n)$ for weighted graphs and $O(m \log^2 n)$ for unweighted graphs using a combinatorial approach based on strong connectivity. Spielman et al [22] showed how to obtain $G_\epsilon$ with $O(n \log n/\epsilon^2)$ edges in time $O(m \log^c n)$ for some (large) constant $c$ using an algebraic approach based on effective resistances.

Our contributions are as below (all for weighted graphs $G$ with $n$ vertices and $m$ edges having polynomial-sized weights, unless otherwise stated):

• Benczúr and Karger [2] conjectured that using standard connectivity instead of strong connectivity for sampling would simplify the result substantially, and posed this as an open question. In this correspondence, we resolve this question by showing that sampling using standard connectivities also preserves cut weights and yields a $G_\epsilon$ with $O(n \log^2 n/\epsilon^2)$ edges.
• We provide a very simple strictly linear time algorithm (i.e. $O(m)$ time) for graph sparsification that yields a $G_\epsilon$ with $O(n \log^2 n/\epsilon^2)$ edges.
• We provide another algorithm for graph sparsification that yields a $G_\epsilon$ with $O(n \log n/\epsilon^2)$ edges in $O(m \log^2 n)$ time (for unweighted graphs, this reduces to $O(m \log n)$ time).
• Combining the above two results, we obtain the fastest known algorithm for obtaining a $G_\epsilon$ with $O(n \log n/\epsilon^2)$ edges; this algorithm runs in time $O(m + n \log^4 n/\epsilon^2)$ whereas the previous best bound is $O(m \log^3 n)$.
• If $G$ has arbitrary edge weights, we give an $O(m \log^2 n)$-time algorithm that yields a $G_\epsilon$ containing $O(n \log^2 n/\epsilon^2)$ edges. The previous best bound is $O(m \log^3 n)$ time for a $G_\epsilon$ with $O(n \log n/\epsilon^2)$ edges.
• Most importantly, we provide a generic framework that sets out sufficient conditions for any particular sampling scheme to result in good sparsifiers; all the above results can be obtained by simple instantiations of this framework, as can known results on sampling by strong connectivity and sampling by effective resistances (footnote 1).

Our algorithms are Monte-Carlo, i.e. work with high probability, as are all efficient algorithms for graph sparsification. A key ingredient of our proofs is a generalization of bounds on the number of small cuts in an undirected graph due to Karger [8]; this generalization might be of independent interest.

Footnote 1: With a $G_\epsilon$ that is slightly denser than the best-known result for the effective resistance case.

1 Introduction

A cut of an undirected graph is a partition of its vertices into two disjoint sets. The weight of a cut is the sum of weights of the edges crossing the cut, i.e. edges having one endpoint each in the two vertex subsets of the partition. For unweighted graphs, each edge is assumed to have unit weight. Cuts play an important role in many problems in graphs: e.g., the maximum flow between a pair of vertices is equal to the minimum weight cut separating them. A skeleton $G'$ of an undirected graph $G$ is a subgraph of $G$ on the same set of vertices where each edge in $G'$ can have an arbitrary weight.
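To make the definitions of a cut, its weight, and a skeleton concrete, here is a minimal sketch. The function name `cut_weight`, the edge-triple representation, and the example graph are illustrative assumptions, not part of the paper.

```python
def cut_weight(edges, side):
    """Return the weight of the cut that separates `side` from the other vertices.

    edges: iterable of (u, v, w) triples describing an undirected weighted graph.
    side:  set of vertices on one side of the partition.
    """
    # An edge crosses the cut exactly when its endpoints lie on different sides.
    return sum(w for u, v, w in edges if (u in side) != (v in side))


# Hypothetical example: a weighted triangle, and a skeleton on the same
# vertices that drops edge (a, c) and re-weights the two remaining edges.
graph = [("a", "b", 2), ("b", "c", 3), ("a", "c", 1)]
skeleton = [("a", "b", 3), ("b", "c", 4)]

for side in ({"a"}, {"b"}, {"c"}):
    print(sorted(side), cut_weight(graph, side), cut_weight(skeleton, side))
```

Comparing the two numbers printed for each cut shows how far this particular skeleton is from preserving the cut weights of the original graph.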
In a series of results, Karger [9, 10] showed that an appropriately weighted sparse skeleton generated by random sampling of edges approximately preserves the weight of every cut in an undirected graph. This series of results culminated in a seminal work by Benczúr and Karger [2] that showed the following theorem. Throughout this paper, for any undirected graph $G$ and any $\epsilon \in (0, 1]$, $(1 \pm \epsilon)G$ is the set of all appropriately weighted subgraphs of $G$ where the weight of every cut in the subgraph is within a factor of $(1 \pm \epsilon)$ of the weight of the corresponding cut in $G$.

Theorem 1 (Benczúr-Karger [2]). For any undirected graph $G$ with $m$ edges and $n$ vertices, and for any error parameter $\epsilon \in (0, 1]$, there exists a skeleton $G_\epsilon$ containing $O(n \log n/\epsilon^2)$ edges such that $G_\epsilon \in (1 \pm \epsilon)G$ with high probability (footnote 2). Further, such a skeleton can be found in $O(m \log^2 n)$ time if $G$ is unweighted and $O(m \log^3 n)$ time otherwise.

Besides its combinatorial ramifications, the importance of this result stems from its use as a pre-processing step in several graph algorithms, e.g. to obtain an $\tilde{O}(n^{3/2} + m)$-time algorithm for approximate maximum flow using the $\tilde{O}(m\sqrt{m})$-time algorithm for exact maxflow due to Goldberg and Rao [6]; and more recently, $\tilde{O}(n^{3/2} + m)$-time algorithms for approximate sparsest cut [12, 20].

Subsequent to Benczúr and Karger's work, Spielman and Teng [23, 24] extended their results to preserving all quadratic forms, of which cuts are a special case; however, the size of the skeleton constructed was $O(n \log^c n)$ for some large constant $c$. Spielman and Srivastava [22] improved this result by constructing skeletons of size $O(n \log n/\epsilon^2)$ in $O(m \log^{O(1)} n)$ time, while continuing to preserve all quadratic forms.
Recently, this result was further improved by Batson et al [1] who gave a deterministic algorithm for constructing skeletons of size $O(n/\epsilon^2)$. While their result is optimal in terms of the size of the skeleton constructed, the time complexity of their algorithm is $O(mn^3/\epsilon^2)$, rendering it somewhat useless in terms of applications.

Benczúr and Karger [2], and Spielman et al [23, 24, 22, 1] use contrasting techniques to obtain their respective results; the former use combinatorial graph techniques while the latter use algebraic graph techniques. In each case, the goal is to obtain a probability value $p_e$ for each edge $e$ so that sampling each edge $e$ independently with probability $p_e$ and giving each sampled edge $e$ a weight $1/p_e$ yields $G_\epsilon \in (1 \pm \epsilon)G$. Benczúr and Karger [2] choose $p_e$ inversely proportional to the strong connectivity of $e$ while Spielman et al [23, 24, 22, 1] choose $p_e$ proportional to the effective resistance of $e$ (both concepts are defined below).

Definition 1. The strong connectivity of an edge $(u, v)$ in an undirected graph $G$ is the maximum value of $k$ such that there is an induced subgraph $G'$ of $G$ containing both $u$ and $v$, and every cut in $G'$ has weight at least $k$.

Definition 2. The effective resistance of an edge $(u, v)$ in an undirected graph $G$ is the effective electrical resistance between $u$ and $v$ if each edge in $G$ is replaced by an electrical resistor between its endpoints whose electrical resistance is equal to the weight of the edge.

Footnote 2: We say that a property holds with high probability (or whp) for a graph on $n$ vertices if its failure probability can be bounded by the inverse of a fixed polynomial in $n$.

1.1 Our Results

We obtain the following results.

The Generic Framework. We provide a general proof framework as follows.
For any given sampling scheme (i.e., assignment to the $p_e$'s), we show that if this assignment satisfies two sufficient conditions, then the sampling scheme results in good sparsifiers. All of the results stated below are then simple instantiations of the above framework, i.e. we show that the sufficient conditions hold. The resulting algorithms are also much simpler than those in [2] or in [22, 1].

Faster Algorithms. Our first result is an efficient algorithm for constructing a sparse skeleton.

Theorem 2. Suppose $G$ is an undirected graph with $n$ vertices and $m$ edges. Then, for any fixed $\epsilon \in (0, 1]$, there is an efficient algorithm for finding a skeleton $G_\epsilon$ of $G$ having $O(n \log n/\epsilon^2)$ edges in expectation such that $G_\epsilon \in (1 \pm \epsilon)G$ whp. The time complexity of the algorithm is $O(m + n \log^4 n/\epsilon^2)$ if the weights of all edges are bounded by a fixed polynomial in $n$ (including all unweighted graphs).

This is the first sampling algorithm that runs in time strictly linear in $m$; all previous algorithms had a time bound of at least $O(m \log^2 n)$ for unweighted graphs, and $O(m \log^3 n)$ for weighted graphs. This algorithm improves the time complexity of algorithms for several problems in which the first step is to create a graph sparsifier. We mention some of these applications.

• This yields an $O(m) + \tilde{O}(n^{3/2}/\epsilon^3)$-time algorithm for finding the $\epsilon$-approximate maximum flow between two vertices of an undirected graph using the exact maxflow algorithm in [6]. The previous best algorithm had a running time of $O(m \log^3 n) + \tilde{O}(n^{3/2}/\epsilon^3)$.
• This yields an $O(m) + \tilde{O}(n^{3/2})$-time algorithm for finding an $O(\log n)$-approximate sparsest cut [12, 20], and an $O(m) + \tilde{O}(n^{3/2+\delta})$-time algorithm for finding an $O(\sqrt{\log n})$-approximate sparsest cut for any constant $\delta$ [20].
The previous best algorithms had running time of $O(m \log^3 n) + \tilde{O}(n^{3/2})$ and $O(m \log^3 n) + \tilde{O}(n^{3/2+\delta})$ respectively.

The sampling algorithm in Theorem 2 is obtained by composing two different algorithms described below. The first algorithm is fast but generates a slightly denser skeleton. The second (slower) algorithm then operates on this skeleton to obtain a smaller skeleton.

Theorem 3. Suppose $G$ is an undirected graph with $n$ vertices and $m$ edges. Then, for any fixed $\epsilon \in (0, 1]$, there is an efficient algorithm for finding a skeleton $G_\epsilon$ of $G$ having $O(n \log^2 n/\epsilon^2)$ edges in expectation such that $G_\epsilon \in (1 \pm \epsilon)G$ whp. The time complexity of the algorithm is $O(m)$ if the weights of all edges are bounded by a fixed polynomial in $n$ (including all unweighted graphs), and $O(m \log^2 n)$ if the edges have arbitrary weights.

Theorem 4. Suppose $G$ is an undirected graph with $n$ vertices and $m$ edges. Then, for any fixed $\epsilon \in (0, 1]$, there is an algorithm for finding a skeleton $G_\epsilon$ of $G$ having $O(n \log n/\epsilon^2)$ edges in expectation such that $G_\epsilon \in (1 \pm \epsilon)G$ whp. The time complexity of the algorithm is $O(m \log n)$ for unweighted graphs and $O(m \log^2 n)$ if the weights of all edges are bounded by a fixed polynomial in $n$ (including all unweighted graphs).

Sampling by Standard Connectivity, Effective Resistances and Strong Connectivity. In proving Theorem 1, the authors had to use strong connectivity because the more natural notion of standard connectivities seemed to pose complications.

Definition 3. The standard connectivity, or simply connectivity, of an edge $(u, v)$ in an undirected graph $G$ is the maximum flow between $u$ and $v$ in $G$.

The authors conjectured that using standard connectivity instead of strong connectivity for sampling would simplify the result substantially, and posed this as their main open question.
In this correspondence, we resolve this question by showing that sampling using standard connectivities also preserves cut weights.

Theorem 5. Suppose $G$ is an undirected graph on $n$ vertices. For any fixed $\epsilon \in (0, 1]$, let $G_\epsilon$ be a skeleton of $G$ formed by sampling edge $e$ in $G$ with probability (footnote 3)
\[ p_e = \min\left( \frac{96\,(3 + \lg n) \ln n}{0.38\, k_e\, \epsilon^2},\ 1 \right), \]
where $k_e$ is the standard connectivity of edge $e$ in $G$. If selected in the sample, edge $e$ is given a weight of $1/p_e$ in the skeleton. Then, $G_\epsilon$ has $O(n \log^2 n/\epsilon^2)$ edges in expectation and $G_\epsilon \in (1 \pm \epsilon)G$ whp.

Observe that the size of the skeleton constructed using standard connectivity has an extra $\log n$ factor compared to that constructed using strong connectivity. We conjecture that this factor can indeed be removed by more careful analysis.

We show that exactly the same proof as above holds if we replace standard connectivity with effective resistance of an edge. Thus, we show that sampling edges using effective resistances also produces a sparse skeleton that approximately preserves all cut weights, a result independently obtained by Spielman and Srivastava recently for the larger class of all quadratic forms (cuts are a special type of quadratic forms) with a tighter bound on the size of the skeleton [22]. Our result, though weaker, has a much simpler proof.

We also show that the results obtained in [2] using strong connectivity can be obtained as a simple instantiation of our general sampling framework.

Generalizations of Cut Counting. The edge connectivity of an undirected graph is the minimum weight of a cut in the graph. A key ingredient in the proof of Theorem 1 is a celebrated theorem due to Karger [8] that gives tight bounds on the number of distinct cuts of a fixed weight in an undirected graph in terms of the ratio of the weight of the cuts to the edge connectivity of the graph.
Theorem 6 (Karger [8]). For an undirected graph with edge connectivity $c$ and for any $\alpha \ge 1$, the number of cuts of weight at most $\alpha c$ is at most $O(n^{2\alpha})$.

While this theorem is extremely useful in bounding the number of small cuts in an undirected graph (e.g. in sampling [9, 10, 2], network reliability [11], etc.), it does not shed any light on the distribution of edges according to their connectivities in cuts. We generalize the above theorem and show that though there may be many distinct cuts of a fixed large weight in a graph, there are a small number of distinct sets of edges in these cuts if we restrict our attention to only edges with large (standard) connectivity. To state our theorem precisely, we need to introduce the notion of $k$-heavy and $k$-light edges, and that of the $k$-projection of a cut.

Definition 4. An edge is said to be $k$-heavy if it has connectivity at least $k$, and $k$-light otherwise. The $k$-projection of a cut is the set of $k$-heavy edges in the cut.

Since every edge has connectivity at least $c$, Theorem 6 can be interpreted as bounding the number of distinct $k$-projections of cuts of size $\alpha k$ by $O(n^{2\alpha})$ for $k = c$. We generalize this result to arbitrary values of $k$.

Footnote 3: $\ln n = \log_e n$; $\lg n = \log_2 n$.

Theorem 7. For any undirected graph with edge connectivity $c$ and for any $k \ge c$ and any $\alpha \ge 1$, the number of distinct $k$-projections of cuts of weight at most $\alpha k$ is at most $n^{2\alpha}$.

We believe this theorem will be of independent interest.

Roadmap. In section 2, we describe our generic sampling framework, and provide one example of instantiating this framework that proves Theorem 3 for the unweighted case. In section 3, we prove Theorem 7 and use it to prove Theorem 8, the main framework theorem stated in section 2.
In section 4, we give two sampling algorithms for graphs with polynomial edge weights: the first algorithm constructs skeletons containing $O(n \log^2 n/\epsilon^2)$ edges in expectation and has time complexity $O(m)$, thus proving Theorem 3 for the polynomial weights case; the second algorithm constructs skeletons containing $O(n \log n/\epsilon^2)$ edges in expectation and has time complexity $O(m \log n)$ for unweighted graphs, and $O(m \log^2 n)$ for graphs with polynomial edge weights, thus proving Theorem 4. Combining these two theorems proves Theorem 2. In section 5, we prove Theorem 5 and show that results on sampling by effective resistances and sampling by strong connectivities can also be derived from our framework. Finally, in section 6, we give a sampling algorithm for graphs with arbitrary edge weights that constructs skeletons containing $O(n \log^2 n/\epsilon^2)$ edges in expectation and has time complexity $O(m \log^2 n)$, thus proving Theorem 3 for the arbitrary weights case.

2 The Generic Framework

We describe a generic sampling framework; each of our individual sampling schemes is obtained by a particular setting of parameters of this generic framework.

Suppose $G = (V, E)$ is an undirected graph where edge $e \in E$ has weight $w_e$. We will assume throughout that $w_e$ is a positive integer. Let $G_M = (V, E_M)$ denote the multi-graph constructed by replacing each edge $e$ by $w_e$ unweighted parallel edges $e_1, e_2, \ldots, e_{w_e}$. Consider any $\epsilon \in (0, 1]$. We construct a skeleton $G_\epsilon$ where each edge $e_\ell \in E_M$ is present in graph $G_\epsilon$ independently with probability $p_e$, and if present, it is given a weight of $1/p_e$.
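The construction just described can be sketched in a few lines. This is a minimal illustration rather than the authors' implementation: the function name `sample_skeleton`, the dictionary representation, and the decision to take the probabilities $p_e$ as a given input are all assumptions made here for concreteness.

```python
import random


def sample_skeleton(edges, prob, rng=None):
    """Sample a skeleton from an integer-weighted graph (illustrative sketch).

    edges: dict mapping an edge (u, v) to its positive integer weight w_e.
    prob:  dict mapping the same edge to its sampling probability p_e in (0, 1].
    rng:   optional random.Random instance, for reproducible runs.
    Returns a dict mapping each surviving edge to its weight in the skeleton.
    """
    rng = rng or random.Random()
    skeleton = {}
    for e, w in edges.items():
        p = prob[e]
        # Each of the w_e parallel unit copies of e is kept independently
        # with probability p_e.
        kept = sum(1 for _ in range(w) if rng.random() < p)
        if kept:
            # Every kept copy contributes weight 1 / p_e.
            skeleton[e] = kept / p
    return skeleton


# Hypothetical usage: two edges with weights 4 and 10.
example = sample_skeleton({("a", "b"): 4, ("b", "c"): 10},
                          {("a", "b"): 1.0, ("b", "c"): 0.5},
                          random.Random(0))
print(example)
```

This direct version flips one coin per parallel copy, so it spends time proportional to $w_e$ on edge $e$; the expected weight of each edge in the skeleton equals its original weight $w_e$.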
(For algorithmic efficiency, observe that an identical skeleton can be created by assigning to edge $e$ a weight of $R_e/p_e$ where $R_e$ is generated from the binomial distribution $B(w_e, p_e)$; this can be done in time $O(w_e p_e)$ rather than time $O(w_e)$ (see e.g. [7]).)

What values of $p_e$ result in a sparse $G_\epsilon$ that satisfies $G_\epsilon \in (1 \pm \epsilon)G$? Let
\[ p_e = \min\left( \frac{96\, \alpha \ln n}{0.38\, \lambda_e\, \epsilon^2},\ 1 \right), \]
where $\alpha$ is independent of $e$ and $\lambda_e$ is some parameter of $e$ satisfying $\lambda_e \le 2^n - 1$. The exact choice of values for $\alpha$ and the $\lambda_e$'s will vary from application to application. However, we describe below a sufficient condition that characterizes a good choice of $\alpha$ and $\lambda_e$'s.

To describe this sufficient condition, partition the edges in $G_M$ according to the value of $\lambda_e$ into sets $F_0, F_1, \ldots, F_k$ where $k = \lfloor \lg \max_{e \in E} \{\lambda_e\} \rfloor \le n - 1$ and $e_i \in F_j$ iff $2^j \le \lambda_e \le 2^{j+1} - 1$. Now, let $\mathcal{G} = \{G_0, G_1, G_2, \ldots, G_i = (V, E_i), \ldots, G_k\}$ be a set of subgraphs of $G_M$ (we allow edges of $G_M$ to be replicated multiple times in the $G_i$'s) such that $F_i \subseteq E_i$ for every $i$. $\mathcal{G}$ is said to be a $(\pi, \alpha)$-certificate corresponding to the above choice of $\alpha$ and $\lambda_e$'s if the following properties are satisfied:

$\pi$-connectivity: For $i \ge 0$, any edge $e_\ell \in F_i$ is $\pi$-heavy in $G_i$.

$\alpha$-overlap: For any cut $C$ containing $c$ edges in $G_M$, let $e^{(C)}_i$ be the number of edges that cross $C$ in $G_i$. Then, for all cuts $C$,
\[ \sum_{i=0}^{k} \frac{e^{(C)}_i\, 2^{i-1}}{\pi} \le \alpha c. \]

Theorem 8 describes the sufficient condition; its proof appears later in section 3. The intuition for this proof is as follows. Consider all cuts $C$ in $G_M$; restrict each cut to just the edges in $F_i$ (we do this because edges in $F_i$ have roughly the same sampling probabilities, which enables an easy application of Chernoff bounds). How many such distinct $F_i$-restricted cuts are there?
Organize all cuts $C$ in $G_M$ into doubling categories, each comprising cuts with roughly equal values of $e^{(C)}_i$; now using Theorem 7 as applied to $G_i$ and the $\pi$-connectivity property above, we can conclude that this count is $n^{O(e^{(C)}_i/\pi)}$ per category. Next, for a particular cut $C$ and its $F_i$-restriction, we need to apply an appropriate Chernoff bound with a carefully chosen deviation-from-expectation parameter so that this deviation has probability at most $n^{-\Omega(e^{(C)}_i/\pi)}$; this probability offsets the above count, thereby allowing us to claim that this deviation holds for all cuts in one doubling category (and the number of categories is not too many, so the same fact extends across categories as well). The actual value of this deviation comes out to be $O(\epsilon) \cdot \frac{e^{(C)}_i}{\pi} \cdot \frac{2^{i-1}}{\alpha}$. The $\alpha$-overlap property now allows us to bound the sum of this deviation over all $i$, $0 \le i \le k$, by $\epsilon c$, as required.

Theorem 8. If there exists a $(\pi, \alpha)$-certificate for a particular choice of $\alpha$ and $\lambda_e$'s, then the skeleton $G_\epsilon \in (1 \pm \epsilon)G$ with probability at least $1 - 4/n$. Further, $G_\epsilon$ has $O\left( \frac{\alpha \log n}{\epsilon^2} \sum_{e \in E} \frac{w_e}{\lambda_e} \right)$ edges in expectation.

2.1 A Simple Algorithm for Unweighted Graphs

We show how we can instantiate the above framework with specific values of $\alpha$, $\lambda_e$'s to obtain a very simple sampling algorithm that runs in $O(m)$ time and obtains a skeleton of size $O(n \log^2 n/\epsilon^2)$. This proves Theorem 3 for the unweighted case.

In order to present our sampling algorithm, we need to define the notion of spanning forests. As earlier, $G$ denotes a graph with integer edge weights $w_e$ for edge $e$ and $G_M$ is the unweighted multi-graph where $e$ is replaced with $w_e$ parallel unweighted edges.

Definition 5.
A spanning forest $T$ of $G_M$ (or equivalently of $G$) is an (unweighted) acyclic subgraph of $G$ satisfying the property that any two vertices are connected in $T$ if and only if they are connected in $G$.

We partition the set of edges in $G_M$ into a set of forests $T_1, T_2, \ldots$ using the following rule: $T_i$ is a spanning forest of the graph formed by removing all edges in $T_1, T_2, \ldots, T_{i-1}$ from $G_M$ such that for any edge $e \in G$, all its copies in $G_M$ appear in a set of contiguous forests $T_{i_e}, T_{i_e+1}, \ldots, T_{i_e+w_e-1}$. This partitioning technique was introduced by Nagamochi and Ibaraki in [19], and these forests are known as Nagamochi-Ibaraki forests (or NI forests). The following is a basic property of NI forests.

Lemma 1 (Nagamochi-Ibaraki [19, 18]). For any pair of vertices $u, v$, they are connected in NI forests $T_1, T_2, \ldots, T_{k(u,v)}$ for some $k(u, v)$ and not connected in any forest $T_j$, for $j > k(u, v)$.

Nagamochi and Ibaraki also gave an algorithm for constructing NI forests that runs in $O(m + n)$ time if $G_M$ is a simple graph (i.e. $G$ is unweighted) and $O(m + n \log n)$ time otherwise [19, 18]. Note that our sampling schemes are relevant only when $m > n \log n$; therefore, the NI forests can be constructed in $O(m)$ time for all relevant input graphs.

We set $\lambda_e$ to the index of the NI forest that $e$ appears in, and set $\alpha = 2$ and $\pi = 2^{i-1}$. For any $i > 0$, let $G_i$ contain all edges in NI forests $T_{2^{i-1}}, T_{2^{i-1}+1}, \ldots, T_{2^{i+1}-1}$; let $G_0 = F_0 = T_1$. Each edge in $F_i$ appears exactly once in $G_i$, once in $G_{i+1}$, and does not appear at all in any of the other $G_j$'s, $j \ne i, i+1$. This proves $\alpha$-overlap. Further, for any edge $e \in F_i$, $i > 0$, Lemma 1 ensures that the endpoints of $e$ are connected in each of $T_{2^{i-1}}, T_{2^{i-1}+1}, \ldots, T_{2^i-1}$.
It follows that $e$ is $2^{i-1}$-heavy in $G_i$, thereby proving $\pi$-connectivity.

We can now invoke Theorem 8 and conclude that this sampling scheme results in $G_\epsilon \in (1 \pm \epsilon)G$ with probability at least $1 - 4/n$. It remains to bound the number of edges in $G_\epsilon$, as follows.

Since $w_e = 1$ for each edge $e$ and the total number of NI forests $K$ is at most $n^2$, we have
\[ \sum_{e \in E} \frac{w_e}{\lambda_e} = \sum_{e \in E} \frac{1}{\lambda_e} = \sum_{j=1}^{K} \sum_{e \in T_j} \frac{1}{\lambda_e} = \sum_{j=1}^{K} \sum_{e \in T_j} \frac{1}{j} \le (n-1) \sum_{j=1}^{K} \frac{1}{j} = O(n \log K) = O(n \log n). \]
It follows from Theorem 8 that $G_\epsilon$ has $O(n \log^2 n/\epsilon^2)$ edges in expectation. The time complexity for constructing the NI forests is $O(m)$ and that for sampling is $O(1)$ per edge, giving another $O(m)$; so overall, the algorithm takes $O(m)$ time.

3 Proofs of Main Theorems

In this section, we will first prove Theorem 7, and then use it to prove Theorem 8. Let us start by defining $k$-heavy and $k$-light vertices.

Definition 6. A vertex in an undirected graph is said to be $k$-heavy if at least one edge incident on the vertex is $k$-heavy; otherwise, the vertex is said to be $k$-light.

We need the following property of $k$-heavy vertices.

Lemma 2. The sum of weights of edges incident on a $k$-heavy vertex is at least $k$.

Proof. For any $k$-heavy vertex $v$, there exists some other vertex $u$ such that the maxflow between $u$ and $v$ is at least $k$. Thus, any cut separating $u$ and $v$ must have weight at least $k$; in particular, this holds for the cut containing only $v$ on one side.

Suppose $G$ is any weighted undirected graph. We scale up the weights of all edges in $G$ uniformly until the weight of every edge is an even integer; call this graph $G_S$. We replace each edge $e = (u, v)$ of weight $w_e$ in $G_S$ with $w_e$ parallel unweighted edges between $u$ and $v$ to form an unweighted multi-graph $G_M$. Clearly, any cut in $G_M$ has an even number of edges.
Theorem 7 holds for any value of $k$ in $G$ if and only if it holds for any even integer $k$ in $G_M$. Therefore, it suffices to prove Theorem 7 for all even integers $k$ on unweighted multigraphs where the weight of every cut is even. We also assume that $G_M$ is connected; if not, the theorem holds for the entire graph since it holds for each connected component.

We introduce two operations on undirected multigraphs: splitting-off and edge contraction. The splitting-off operation was introduced by Lovász in [13, 14] (ex. 6.53):

Definition 7. A pair of edges $(s, u)$ and $(u, t)$ are said to be split-off in an undirected multigraph if they are replaced by a single edge $(s, t)$.

Various properties of the splitting-off operation have been explored [15, 16, 5, 25]. We need the following property.

Definition 8. For any $k > 0$, a splitting-off operation is said to be $k$-preserving if all edges in the graph (except those being split-off) that were $k$-heavy before the splitting-off continue to be $k$-heavy after the splitting-off.

The following lemma is a corollary of a deep result of Mader [15] for splitting-off edges while maintaining the maxflows of pairs of vertices; however, we give a much simpler direct proof of this lemma here.

Lemma 3. Suppose $G_M$ is an undirected multigraph where every cut contains an even number of edges. Let $k > 0$ be any even integer. Then, for any $k$-light non-isolated vertex $u$ in $G_M$, there exists a pair of edges $(s, u)$ and $(u, t)$ such that splitting-off this pair is $k$-preserving.

Proof. We will prove that for every edge $(s, u)$, there exists an edge $(u, t)$ such that splitting-off this pair of edges retains the following property: any pair of vertices $x, y$ that were $k$-connected (i.e. had a maxflow of at least $k$) before the splitting-off continue to be so after the splitting-off.
We define a $k$-separator to be any cut that separates at least one pair of $k$-connected vertices, and call a $k$-separator with exactly $k$ edges a tight cut. Since all cuts have an even number of edges and the weight of a cut can decrease by at most 2 due to a splitting-off operation, we only need to ensure that we do not decrease the number of edges in any tight cut when we split-off a pair of edges.

Suppose there exists no edge $(u, t)$ such that splitting-off $(s, u)$ and $(u, t)$ retains the $k$-heavy property for all $k$-heavy edges. Then, for every neighbor $t$ (other than $s$) of $u$, there exists at least one tight cut having $s, t$ on one side and $u$ on the other. Consider a minimum-sized collection of tight cuts $X_1, X_2, \ldots, X_\ell$, where $X_i$ is the subset of vertices on the side of the cut not containing $u$. If $\ell = 1$, moving $u$ to the side of $X_1$ produces a $k$-separator containing less than $k$ edges, which is a contradiction. Thus $\ell \ge 2$. Now, let
\[ A = X_1 \cap X_2; \quad B = X_1 \setminus X_2; \quad C = X_2 \setminus X_1; \quad D = V \setminus (X_1 \cup X_2). \]
Then, $s \in A$ and $u \in D$. Since $X_1$ and $X_2$ are $k$-separators, either (1) $A$ and $D$ are $k$-separators, or (2) $B$ and $C$ are $k$-separators. In either case, this pair of $k$-separators must be tight cuts, since they contain at least $k$ edges each (being $k$-separators) and at most $k$ edges each (because their total number of edges is at most that of $X_1$ and $X_2$). If $A$ and $D$ are tight cuts, we can replace cuts $X_1$ and $X_2$ by $D$ in the collection of tight cuts, contradicting minimality of this collection. On the other hand, if $B$ and $C$ are tight cuts, the counting argument also shows that there is no edge between $A$ and $D$, contradicting the existence of edge $(s, u)$.

Let us now extend the notion of splitting-off to vertices.

Definition 9.
A vertex with even degree in an undirected graph is said to be split-off if a pair of edges incident on it is repeatedly split-off until the vertex becomes isolated. Splitting-off of a vertex is said to be $k$-preserving if each constituent edge splitting-off is $k$-preserving.

Note that the number of edges in a cut either stays unchanged or decreases by 2 after a splitting-off operation. Thus, if every cut in the graph had an even number of edges to start with, then each cut continues to have an even number of edges after a sequence of splitting-off operations. Therefore, the following lemma is obtained by repeatedly applying Lemma 3 to a $k$-light vertex.

Lemma 4. Suppose $G_M$ is an undirected multigraph where the number of edges in every cut is even. Let $k$ be an even integer. Then, there exists a $k$-preserving splitting-off of any non-isolated $k$-light vertex $u$ in $G_M$.

Our second operation is edge contraction.

Definition 10. Contraction of edge $e = (u, v)$ in an undirected multigraph $G$ is defined as merging $u$ and $v$ into a single vertex (i.e. all edges incident on either $u$ or $v$ are now incident on the new vertex instead). Any self-loops produced by edges between $u$ and $v$ are discarded.

We will now prove Theorem 7.

Proof of Theorem 7. We run the following randomized algorithm on multigraph $G_M$:

1. Split-off all $k$-light vertices ensuring the $k$-preserving property (Lemma 4).
2. Contract an edge chosen uniformly at random in the resulting graph.
3. If the contraction produces a $k$-light vertex, split it off (footnote 4).
4. If $\le 2\alpha$ vertices are left, output a random cut; otherwise, go to step 2.

Consider a cut $C$ that has at most $\alpha k$ edges; let its $k$-projection be $S$. In any of the splitting-off operations, no edge in $S$ can be split-off since these edges continue to be $k$-heavy throughout the execution of the algorithm.
So, if no edge crossing cut $C$ (either an edge in $G_M$ or one produced by the splitting-off operations) is contracted during the execution of the algorithm, then all edges in $S$ survive till the end. To estimate the probability that no edge crossing cut $C$ is contracted, let $h_j$ be the number of vertices left at the beginning of the $j$th iteration. Thus, $h_1$ is the number of $k$-heavy vertices in $G_M$ (note that all $k$-light vertices are split-off initially), and $h_{j+1}$ is either $h_j - 1$ or $h_j - 2$ depending on whether a vertex was split-off in step 3 of iteration $j$. Observe that the number of edges crossing $C$ cannot increase due to the splitting-off operations. Further, Lemma 2 asserts that at the beginning of iteration $j$, there are at least $h_j k/2$ edges in the graph. Thus, the probability that no edge in $C$ is selected for random contraction in step 2 of iteration $j$ is at least
\[ 1 - \frac{\alpha k}{h_j k/2} = 1 - \frac{2\alpha}{h_j}. \]
Then, the probability that no edge crossing $C$ is contracted in the entire execution of the algorithm is at least
\[ \prod_j \left( 1 - \frac{2\alpha}{h_j} \right) \ge \prod_{i=2\alpha+1}^{n} \left( 1 - \frac{2\alpha}{i} \right) = \binom{n}{2\alpha}^{-1}. \]
Since there are $2^{2\alpha-1}$ cuts in a graph with $2\alpha$ vertices, the probability that the random cut output by the algorithm contains only edges crossing cut $C$ (and therefore $S$ is exactly the set of $k$-heavy edges in $G_M$ output by the algorithm) is at least
\[ \binom{n}{2\alpha}^{-1} 2^{1-2\alpha} \ge n^{-2\alpha}. \]
This is true for every distinct $k$-projection of cuts having at most $\alpha k$ edges; hence, the total number of such $k$-projections is at most $n^{2\alpha}$.

In addition to the above theorem, we need the following non-uniform version of Chernoff bounds (for Chernoff bounds, see e.g. [17]) to prove Theorem 8. (A proof of this theorem is given in the appendix.)

Theorem 9.
Consider any subset $C$ of unweighted edges, where each edge $e \in C$ is sampled independently with probability $p_e$ for some $p_e \in [0, 1]$ and given weight $1/p_e$ if selected in the sample. Let the random variable $X_e$ denote the weight of edge $e$ in the sample; if $e$ is not selected in the sample, then $X_e = 0$. Then, for any $p$ such that $p \le p_e$ for all edges $e$, any $\epsilon \in (0, 1]$, and any $N \ge |C|$, the following bound holds:$^5$
$$P\left[\,\Big|\sum_{e \in C} X_e - |C|\Big| > \epsilon N\right] < 2e^{-0.38\,\epsilon^2 p N}.$$

We will now use Theorem 7 to prove Theorem 8. (We re-use the notation defined in section 2.) For any cut $C$ in $G_M$, let $F_i^{(C)} = F_i \cap C$ and $E_i^{(C)} = E_i \cap C$ for $0 \le i \le k$;$^6$ let $f_i^{(C)} = |F_i^{(C)}|$ and $e_i^{(C)} = |E_i^{(C)}|$. Also, let $\hat{f}_i^{(C)}$ be the expected weight of all edges in $F_i^{(C)}$ in the skeleton graph $G_\epsilon$. We first prove a key lemma.

$^4$ If an edge between $u$ and $v$ is contracted in step 2, all edges that were previously $k$-heavy continue to be so after the contraction, except the edges between $u$ and $v$. So, at most one vertex (the new vertex) becomes $k$-light as a result of this contraction.
$^5$ For any event $E$, $P[E]$ represents the probability of event $E$.
$^6$ For any cut $C$ and any set of edges $Z$, $Z \cap C$ denotes the set of edges in $Z$ that cross cut $C$.

Lemma 5. For any fixed $i$, with probability at least $1 - 4/n^2$,
$$\left|f_i^{(C)} - \hat{f}_i^{(C)}\right| \le \frac{\epsilon}{2} \max\left(\frac{e_i^{(C)} 2^{i-1}}{\pi\alpha},\; f_i^{(C)}\right)$$
for all cuts $C$ in $G_M$.

Proof. By the $\pi$-connectivity property, any edge $e \in F_i$ is $\pi$-heavy in $G_i$ for any $i \ge 0$. Therefore, $e_i^{(C)} \ge \pi$. Let $\mathcal{C}_{ij}$ be the set of all cuts $C$ such that $\pi 2^j \le e_i^{(C)} \le \pi 2^{j+1} - 1$, $j \ge 0$. We will prove that with probability at least $1 - 2n^{-2^{j+1}}$, all cuts in $\mathcal{C}_{ij}$ satisfy the property of the lemma. Then, the lemma follows by using the union bound over $j$ (keeping $i$ fixed) since $2n^{-2} + 2n^{-4} + \ldots + 2n^{-2j} + \ldots \le 4n^{-2}$.
We now prove the above claim for cuts $C \in \mathcal{C}_{ij}$. Let $X_i^{(C)}$ denote the set of edges in $F_i^{(C)}$ that are sampled with probability strictly less than 1; correspondingly, let $x_i^{(C)} = |X_i^{(C)}|$ and let $\hat{x}_i^{(C)}$ be the total weight of edges in $X_i^{(C)}$ in the skeleton graph $G_\epsilon$. Since edges in $F_i^{(C)} \setminus X_i^{(C)}$ have a weight of exactly 1 in $G_\epsilon$, it is sufficient to show that with probability at least $1 - 2n^{-2^{j+1}}$,
$$\left|x_i^{(C)} - \hat{x}_i^{(C)}\right| \le \frac{\epsilon}{2} \max\left(\frac{e_i^{(C)} 2^{i-1}}{\pi\alpha},\; x_i^{(C)}\right)$$
for all cuts $C \in \mathcal{C}_{ij}$. Since each edge $e \in X_i^{(C)}$ has $\lambda_e < 2^{i+1}$, we can use Theorem 9 with the lower bound on probabilities $p = \frac{96\alpha \ln n}{0.38 \cdot 2^{i+1} \epsilon^2}$. There are two cases.

In the first case, suppose $x_i^{(C)} \le \frac{e_i^{(C)} 2^{i-1}}{\pi\alpha}$. Then, for any $X_i^{(C)}$ where $C \in \mathcal{C}_{ij}$, by Theorem 9, we have
$$P\left[\,\left|x_i^{(C)} - \hat{x}_i^{(C)}\right| > \frac{\epsilon}{2} \cdot \frac{e_i^{(C)} 2^{i-1}}{\pi\alpha}\right] < 2\exp\left(-0.38 \cdot \frac{\epsilon^2}{4} \cdot \frac{96\alpha \ln n}{0.38 \cdot 2^{i+1} \epsilon^2} \cdot \frac{e_i^{(C)} 2^{i-1}}{\pi\alpha}\right) \le 2e^{-6 e_i^{(C)} \ln n / \pi} \le 2e^{-6 \cdot 2^j \ln n},$$
since $e_i^{(C)} \ge \pi 2^j$ for any $C \in \mathcal{C}_{ij}$.

In the second case, suppose $x_i^{(C)} > \frac{e_i^{(C)} 2^{i-1}}{\pi\alpha}$. Then, for any $X_i^{(C)}$ where $C \in \mathcal{C}_{ij}$, by Theorem 9, we have
$$P\left[\,\left|x_i^{(C)} - \hat{x}_i^{(C)}\right| > \frac{\epsilon}{2}\, x_i^{(C)}\right] < 2\exp\left(-0.38 \cdot \frac{\epsilon^2}{4} \cdot \frac{96\alpha \ln n}{0.38 \cdot 2^{i+1} \epsilon^2} \cdot x_i^{(C)}\right) < 2e^{-6 e_i^{(C)} \ln n / \pi} \le 2e^{-6 \cdot 2^j \ln n},$$
since $x_i^{(C)} > \frac{e_i^{(C)} 2^{i-1}}{\pi\alpha} \ge \frac{2^{i+j-1}}{\alpha}$ for any $C \in \mathcal{C}_{ij}$.

Thus, we have proved that
$$P\left[\,\left|x_i^{(C)} - \hat{x}_i^{(C)}\right| > \frac{\epsilon}{2} \max\left(\frac{e_i^{(C)} 2^{i-1}}{\pi\alpha},\; x_i^{(C)}\right)\right] < 2e^{-6 \cdot 2^j \ln n} = 2n^{-6 \cdot 2^j}$$
for any cut $C \in \mathcal{C}_{ij}$. Now, by the $\pi$-connectivity property, we know that edges in $F_i^{(C)}$, and therefore those in $X_i^{(C)}$, are $\pi$-heavy in $G_i$. Therefore, by Theorem 7, the number of distinct $X_i^{(C)}$ sets for cuts $C \in \mathcal{C}_{ij}$ is at most $n^{2 \cdot \frac{\pi 2^{j+1}}{\pi}} = n^{4 \cdot 2^j}$.
Using the union bound over these distinct $X_i^{(C)}$ edge sets, we conclude that with probability at least $1 - 2n^{-2^{j+1}}$, all cuts in $\mathcal{C}_{ij}$ satisfy the property of the lemma.

We now use the above lemma to prove Theorem 8.

Proof of Theorem 8. For any cut $C$ in $G_M$, let $c$ be the number of edges in $C$; correspondingly, let $\hat{c}$ be the total weight of the edges crossing cut $C$ in the skeleton graph $G_\epsilon$. Since $k \le n - 1$, we apply the union bound to the property from Lemma 5 over the different values of $i$ to conclude that with probability at least $1 - 4/n$, we have
$$\sum_{i=0}^{k} \left|\hat{f}_i^{(C)} - f_i^{(C)}\right| \le \sum_{i=0}^{k} \frac{\epsilon}{2} \max\left(\frac{e_i^{(C)} 2^{i-1}}{\pi\alpha},\; f_i^{(C)}\right)$$
for all cuts $C$ in $G_M$. Then, with probability at least $1 - 4/n$,
$$|\hat{c} - c| = \left|\sum_{i=0}^{k} \hat{f}_i^{(C)} - \sum_{i=0}^{k} f_i^{(C)}\right| \le \sum_{i=0}^{k} \left|\hat{f}_i^{(C)} - f_i^{(C)}\right| \le \frac{\epsilon}{2} \sum_{i=0}^{k} \max\left(\frac{e_i^{(C)} 2^{i-1}}{\pi\alpha},\; f_i^{(C)}\right) \le \frac{\epsilon}{2} \left(\sum_{i=0}^{k} \frac{e_i^{(C)} 2^{i-1}}{\pi\alpha} + \sum_{i=0}^{k} f_i^{(C)}\right) \le \epsilon c,$$
since $\sum_{i=0}^{k} \frac{e_i^{(C)} 2^{i-1}}{\pi\alpha} \le c$ by the $\alpha$-overlap property and $\sum_{i=0}^{k} f_i^{(C)} \le c$ since the $F_i^{(C)}$'s form a partition of the edges in $C$.

We now prove the size bound on $G_\epsilon$. The expected number of distinct edges in $G_\epsilon$ is
$$\sum_{e \in E} \left(1 - (1 - p_e)^{w_e}\right) \le \sum_e w_e p_e.$$
The bound follows by substituting the value of $p_e$.

4 Sampling in Graphs with Polynomial Edge Weights

In this section, we will give an algorithm for sampling in undirected weighted graphs, where the weight of every edge is an integer bounded by $n^d$ for a fixed constant $d > 0$. The algorithm constructs a skeleton graph containing $O(n \log n / \epsilon^2)$ edges in expectation and has time complexity $O(m + n \log^4 n / \epsilon^2)$.
Our strategy, as outlined in the introduction, has two steps: first we run an algorithm that constructs a skeleton graph with $O(n \log^2 n / \epsilon^2)$ edges in expectation and has time complexity $O(m)$; then, we run a different algorithm that constructs a sparser skeleton containing $O(n \log n / \epsilon^2)$ edges in expectation on the skeleton graph constructed in the first step. The second algorithm takes time $O(m \log^2 n)$ on a graph with $m$ edges and therefore $O(n \log^4 n / \epsilon^2)$ time on the skeleton graph produced in the first step. To ensure that the final skeleton graph is in $(1 \pm \epsilon)G$, we choose $\epsilon/3$ as the error parameter for each algorithm. As an additional observation, we show that the time complexity of the second algorithm improves to $O(m \log n)$ if its input graph is unweighted.

We will describe both these algorithms for an input graph $G$, where the weight $w_e$ of every edge $e$ is an integer bounded by $n^d$ for a fixed constant $d > 0$. Note that the input graph to the second algorithm in the above two-step sampling scheme may have fractional weights. However, we can scale up all weights uniformly until they are integral, and the scaled weights continue to be bounded by some fixed polynomial in $n$. Once the skeleton graph is obtained, we scale all weights down uniformly to obtain the final skeleton graph. The unweighted multigraph constructed by replacing each edge $e = (u, v)$ with $w_e$ parallel unweighted edges $e_1, e_2, \ldots, e_{w_e}$ between $u$ and $v$ is denoted by $G_M$. Also, $T_1, T_2, \ldots$ denotes a set of NI forests of $G_M$; edge $e_j$ appears in forest $T_{i_e + j - 1}$, where $1 \le j \le w_e$. Thus, the copies of edge $e$ appear in NI forests $T_{i_e}, T_{i_e + 1}, \ldots, T_{i_e + w_e - 1}$. For both algorithms, we will use the generic sampling scheme described in section 2.

Algorithm for Step 1. For any edge $e = (u, v)$, we choose $\lambda_e = i_e + w_e - 1$, i.e.
the index of the last NI forest where a copy of $e$ appears; also set $\alpha = 2$ and $\pi = 2^{i-1}$. For any $i \ge 1$, define $G_i$ to be the graph containing all edges in NI forests $T_{2^{i-1}}, T_{2^{i-1}+1}, \ldots, T_{2^i - 1}$ (call this set of edges $Y_i$) and all edges in $F_i$, i.e. all edges $e$ with $2^i \le \lambda_e \le 2^{i+1} - 1$. Let $G_0$ only contain edges in $F_0$. For any $i \ne j$, $F_i \cap F_j = Y_i \cap Y_j = \emptyset$; thus, each edge appears in $G_i$ for at most two different values of $i$, proving $\alpha$-overlap. Further, for any edge $e \in F_i$, Lemma 1 ensures that the endpoints of $e$ are connected in each of $T_{2^{i-1}}, T_{2^{i-1}+1}, \ldots, T_{2^i - 1}$. It follows that $e$ is $2^{i-1}$-heavy in $G_i$, thereby proving $\pi$-connectivity.

We now prove the size bound. For any edge $e' \in E_M$, let $t(e')$ be the index of the NI forest it appears in. Then,
$$\sum_{e \in E} \frac{w_e}{\lambda_e} = \sum_{e \in E} \sum_{j=1}^{w_e} \frac{1}{i_e + w_e - 1} \le \sum_{e \in E} \sum_{j=1}^{w_e} \frac{1}{i_e + j - 1} = \sum_{e' \in E_M} \frac{1}{t(e')} = \sum_{\ell=1}^{K} \sum_{e' \in T_\ell} \frac{1}{\ell} \le (n - 1) \sum_{\ell=1}^{K} \frac{1}{\ell} = O(n \log K) = O(n \log n),$$
where the last step follows from the observation that the total number of NI forests $K$ is at most $n^{d+2}$, where $d$ is a constant. Using Theorem 8, we conclude that the skeleton graph $G_\epsilon$ constructed by the above algorithm has $O(n \log^2 n / \epsilon^2)$ edges in expectation and is in $(1 \pm \epsilon)G$ whp.

Time Complexity. The time complexity for constructing the NI forests, and therefore for determining the $p_e$ values, is $O(m + n \log n)$. We sample each edge $e$ by setting its weight in the skeleton $G_\epsilon$ to $r_e / p_e$, where $r_e$ is drawn randomly from the Binomial distribution with parameters $w_e$ and $p_e$. This is clearly equivalent to the sampling scheme described above, and can be done in $w_e p_e$ expected time for each edge $e$ (see e.g. [7]), and therefore $O(n \log^2 n / \epsilon^2)$ time overall. Since $m > n \log^2 n / \epsilon^2$ for this algorithm to be invoked, the overall time complexity of the algorithm is $O(m)$.
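The Binomial sampling step above can be illustrated with a short sketch. Note that this is not the generator of [7]; it is a simpler, swapped-in technique (geometric skips between successes) whose expected running time is $O(1 + w_e p_e)$ per edge. The function names `binomial_by_skips` and `skeleton_weight` are hypothetical, and the sketch assumes $p_e$ is not so small that floating-point overflow occurs.

```python
import math
import random

def binomial_by_skips(w, p, rng):
    """Draw r ~ Binomial(w, p) by jumping between successes.

    Each jump length is a Geometric(p) variate (number of trials up to and
    including the next success), so the loop runs about w*p + 1 times.
    """
    if p <= 0.0:
        return 0
    if p >= 1.0:
        return w
    log_q = math.log1p(-p)           # log(1 - p), strictly negative here
    successes, pos = 0, 0
    while True:
        u = 1.0 - rng.random()       # uniform in (0, 1], avoids log(0)
        pos += int(math.log(u) / log_q) + 1
        if pos > w:                  # next success falls beyond the w trials
            return successes
        successes += 1

def skeleton_weight(w, p, rng):
    """Weight r_e / p_e given to an edge with w parallel copies (assumed helper)."""
    return binomial_by_skips(w, p, rng) / p
```

Since the expectation of $r_e$ is $w_e p_e$, the expected skeleton weight $r_e / p_e$ equals $w_e$, which matches the unbiasedness used in the analysis.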
Algorithm for Step 2. Before describing our second sampling algorithm, we define the following operation on graphs. (Recall the definition of edge contraction given in section 3.)

Definition 11. Let $G = (V, E)$ be an undirected graph, and let $V_1, V_2, \ldots, V_k$ be a partition of the vertices in $G$ such that for each $V_i$, the induced graph of $G$ on $V_i$ is connected. Then, shrinking $G$ with respect to $V_1, V_2, \ldots, V_k$ produces the graph formed by contracting all edges between vertices in the same $V_i$ for all $i$.

Our sampling algorithm uses our generic sampling scheme where $\lambda_e$ is determined using the following algorithm. Here $H_c = (V_c, E_c)$ is a graph variable representing a weighted graph. The algorithm is described recursively; we call SetLambda($G$, 0) to execute it.

SetLambda($H$, $i$)
1. Set $H_c = H$
2. If the total weight of edges in $E_c$ is at most $|V_c| \cdot 2^{i+1}$, then
   (a) Set $\lambda_e = 2^i$ for all edges $e \in E_c$
   (b) Remove all edges in $E_c$ from $H$; suppose $H$ splits into connected components $H_1, H_2, \ldots, H_k$
   (c) For each $H_j$ containing at least 2 vertices, call SetLambda($H_j$, $i + 1$)
   Else,
   (a) Construct $2^i + 1$ NI forests $T_1, T_2, \ldots, T_{2^i + 1}$ for $H_c$
   (b) Shrink $H_c$ wrt the connected components in $T_{2^i + 1}$; update $V_c$ and $E_c$ accordingly
   (c) Go to step 2

Also, set $\alpha = 4$ and $\pi = 2^k$ where $k = \lfloor \lg \max_{e \in E} \{\lambda_e\} \rfloor$. For any $r$, recall that $F_r$ contains all $w_e$ unweighted copies of edge $e$ from $G_M$, where $e$ satisfies $2^r \le \lambda_e \le 2^{r+1} - 1$. For any $i \ge 1$, let $G_i$ contain all edges in $F_r$ for all $r \ge i - 1$, where each edge in $F_r$ is replicated $2^{k-r+1}$ times in $G_i$; let $G_0$ contain edges of $F_0$ where each edge is replicated $2^k$ times. We need the following lemma to prove that $\pi$-connectivity is satisfied.

Lemma 6. For any $j \ge 1$, consider any edge $e \in F_j$, i.e. an edge $e$ for which the above algorithm sets $\lambda_e = 2^j$.
Then, $e$ is $2^{j-1}$-heavy in the graph $\cup_{r \ge j-1} F_r$.

Proof. For any edge $e$ in $F_j$, let $G_e = (V_e, E_e)$ be the component of $G$ containing $e$ such that SetLambda($G_e$, $j - 1$) was executed. We will show that $e$ is $2^{j-1}$-heavy in $G_e$; since $G_e$ is a subgraph of $G$, the lemma follows. In the execution of SetLambda($G_e$, $j - 1$), there are multiple shrinking operations, each of them comprising the contraction of a set of edges. We claim that any such contracted edge is $2^{j-1}$-heavy in $G_e$; it follows that any two vertices $u$ and $v$ that got shrunk into a single vertex are $2^{j-1}$-connected in $G_e$.

Let $G_e$ have $k$ shrinking phases; let the graph produced after shrinking phase $r$ be $G_{e,r}$. We now prove that all edges contracted in phase $r$ must be $2^{j-1}$-connected in $G_e$ by induction on $r$. For $r = 1$, since $e$ appears in the $(2^{j-1} + 1)$st NI forest of phase 1, $e$ is $2^{j-1}$-connected in $G_e$. For the inductive step, assume that the property holds for phases $1, 2, \ldots, r$. Any edge that is contracted in phase $r + 1$ appears in the $(2^{j-1} + 1)$st NI forest of phase $r + 1$; therefore, $e$ is $2^{j-1}$-connected in $G_{e,r}$. By the inductive hypothesis, all edges of $G_e$ contracted in previous phases are $2^{j-1}$-heavy in $G_e$; therefore, an edge that is $2^{j-1}$-heavy in $G_{e,r}$ must have been $2^{j-1}$-heavy in $G_e$.

Consider any cut $C$ in $G$ containing an edge $e \in F_i$ for any $i \ge 0$. Let the corresponding cut (i.e. with the same bipartition of vertices) in $G_i$ be $C_i$. We need to show that the number of edges in $C_i$ is at least $2^k$ to prove $\pi$-connectivity. If $i = 0$, $e$ is replicated $2^k$ times in $G_0$, thereby proving the property. For $i \ge 1$, let the maximum $\lambda_a$ of an edge $a$ in $C$ be $k_C$, where $2^j \le k_C \le 2^{j+1} - 1$ for some $j \ge i$. By the above lemma, $C_i$ contains at least $2^{j-1}$ distinct edges of $G$, each of which is replicated at least $2^{k-j+1}$ times.
Thus, $C_i$ contains at least $2^k$ edges.

We now prove $\alpha$-overlap. For any cut $C$, recall that $f_i^{(C)}$ and $e_i^{(C)}$ denote the number of edges in $F_i \cap C$ and in $C_i$ respectively (where $C_i$ is as defined in the previous paragraph). Then,
$$\sum_{i=0}^{k} \frac{e_i^{(C)} 2^{i-1}}{\pi} = \frac{e_0^{(C)}}{2\pi} + \sum_{i=1}^{k} \frac{e_i^{(C)} 2^{i-1}}{\pi} = \frac{f_0^{(C)} 2^k}{2^{k+1}} + \sum_{i=1}^{k} \sum_{r=i-1}^{k} \frac{f_r^{(C)} 2^{k-r+1} 2^{i-1}}{2^k} = \frac{f_0^{(C)}}{2} + \sum_{i=1}^{k} \sum_{r=i-1}^{k} \frac{f_r^{(C)}}{2^{r-i}}$$
$$\le f_0^{(C)} + \sum_{r=0}^{k} \sum_{i=1}^{r+1} \frac{f_r^{(C)}}{2^{r-i}} \le 3 f_0^{(C)} + \sum_{r=1}^{k} f_r^{(C)} \sum_{i=1}^{r+1} \frac{1}{2^{r-i}} \le 4 f_0^{(C)} + 4 \sum_{r=1}^{k} f_r^{(C)} \le 4c.$$

Define $D_i$ to be the set of connected components in the graph $G \setminus (F_0 \cup F_1 \cup \ldots \cup F_{i-1})$ for any $i \ge 1$; let $D_0$ be the single connected component in $G$. For any $i \ge 0$, if any connected component in $D_i$ remains intact in $D_{i+1}$, then there is no edge from that connected component in $F_i$. On the other hand, if a component in $D_i$ splits into $\eta$ components in $D_{i+1}$, then the algorithm explicitly ensures that the number of edges in $F_i$ from that connected component is at most $\eta 2^{i+1}$. Since each such edge has $\lambda_e = 2^i$, the contribution of these edges to the sum $\sum_{e \in E} \frac{w_e}{\lambda_e}$ is at most $2\eta \le 4(\eta - 1)$ (since $\eta \ge 2$). But $\eta - 1$ is the increase in the number of components arising from this single component. Therefore, if $d_i = |D_i|$, then
$$\sum_e \frac{w_e}{\lambda_e} \le \sum_{i=0}^{k} 4(d_{i+1} - d_i) \le 4n$$
since ultimately we have $n$ singleton components. Using Theorem 8, we conclude that the skeleton graph $G_\epsilon$ constructed by the above algorithm has $O(n \log n / \epsilon^2)$ edges in expectation and is in $(1 \pm \epsilon)G$ whp.

Time Complexity. We show below that the algorithm to find values of $\lambda_e$ can be implemented in $O(m \log n)$ time for unweighted graphs, and $O(m \log^2 n)$ time for graphs with polynomial edge weights.
Once we have obtained the sampling probabilities, we use the same trick as in the previous algorithm, i.e. sample from a Binomial distribution, to produce the skeleton in $O(n \log n / \epsilon^2)$ additional time. Since the algorithm is invoked only if $m > n \log n / \epsilon^2$, the total running time is $O(m \log n)$ if $G$ is unweighted and $O(m \log^2 n)$ otherwise.

We now determine the time complexity for finding the values of $\lambda_e$. Consider one call to SetLambda($H$, $i$) which begins with $H = (V, E)$ and let $H_c = (V_c, E_c)$ denote the graph $H$ as it evolves over the various iterations in this procedure. Each iteration of steps (a) and (b) in the else block takes $O(|V_c| \log n + |E_c|)$ time. We show that the number of vertices halves in each iteration (save the last) and therefore the total time over all iterations is $O(|V| \log n + |E| \log n)$. Since we are dealing with the case of polynomial edge weights, the depth of recursion is $O(\log n)$. Therefore, over all recursive calls, the time comes to $O(n \log^2 n + m \log^2 n) = O(m \log^2 n)$.

To see that the number of vertices halves from one iteration to the next, consider an iteration that begins with $E_c$ having weight at least $|V_c| \cdot 2^{i+1}$. $E_c$ for the next iteration (denoted by $E'_c$) comprises only edges in the first $2^i$ NI forests constructed in the current iteration. So the total weight of edges in $E'_c$ is at most $|V_c| \cdot 2^i$. If this is not the last iteration, then this weight exceeds $|V'_c| \cdot 2^{i+1}$. It follows that $|V'_c| \le |V_c| / 2$, as required.

From the above description, note that for the unweighted case, $|E'_c| \le |E_c| / 2$, and therefore the time taken over all iterations in one recursive call is $O(|V| + |E|)$. Over all recursive calls this comes to $O(m \log n)$.
5 Sampling Schemes using various Connectivity Parameters

In this section, we present several sampling schemes using various measures of connectivity. Some of these results were previously known; however, we will show that these results follow as simple corollaries of our generic sampling scheme whereas the original proofs were specific to each scheme and substantially more complicated. The algorithms for implementing these schemes are less efficient than the algorithms that we have previously presented; therefore we restrict ourselves to structural results in this section. As earlier, $G$ is the weighted input graph (with arbitrary integer weights); $G_M$ is the corresponding unweighted multigraph; $T_1, T_2, \ldots, T_K$ is a set of NI forests of $G_M$.

5.1 Sampling using Standard Connectivities

For any edge $e = (u, v)$, set $\lambda_e$ to the standard connectivity of the edge; also set $\alpha = 3 + \lg n$ and $\pi = 2^{i-1}$. $F_i$ is defined as the set of all edges $e$ with $2^i \le \lambda_e \le 2^{i+1} - 1$ for any $i \ge 0$. For any $i \ge 1 + \lg n$, let $G_i$ contain all edges in NI forests $T_{2^{i-1-\lg n}}, T_{2^{i-1-\lg n}+1}, \ldots, T_{2^{i+1}-1}$ and all edges in $F_i$. For $i \le \lg n$, $G_i$ contains all edges in $T_1, T_2, \ldots, T_i$ and all edges in $F_i$. For any $i \ge 0$, let $Y_i$ denote the set of edges in $G_i$ but not in $F_i$. For any $i \ne j$, $F_i \cap F_j = \emptyset$ and each edge appears in $Y_i$ for at most $2 + \lg n$ different values of $i$; this proves $\alpha$-overlap. To prove $\pi$-connectivity, we note that Lemma 1 ensures that for any pair of vertices $u, v$ with maximum flow $f(u, v)$ and for any $k \ge 1$, $u, v$ are at least $\min(f(u, v), k)$-connected in the union of the first $k$ NI forests, i.e. in $T_1 \cup T_2 \cup \ldots \cup T_k$. Thus, any edge $e \in F_i$ is at least $2^i$-heavy in the union of the NI forests $T_1, T_2, \ldots, T_{2^{i+1}-1}$. Since there are at most $2^{i-1}$ edges overall in $T_1, T_2, \ldots$
, $T_{2^{i-1-\lg n}-1}$, any edge $e \in F_i$ is $2^{i-1}$-heavy in $G_i$. This proves $\pi$-connectivity.

We now prove the size bound. The next lemma is similar to its corresponding lemma for strong connectivity in [2].

Lemma 7. Suppose $G$ is an undirected graph where edge $e$ has weight $w_e$ and standard connectivity $k_e$. Then, $\sum_e \frac{w_e}{k_e} \le n - 1$.

Proof. We use induction on the number of vertices in the graph. For a graph with a single vertex and no edge, the lemma holds vacuously. Now, suppose the lemma holds for all graphs with at most $n - 1$ vertices. Let $C$ be a minimum cut in $G$, and let $\lambda$ be its weight. For any edge $e \in C$, $k_e = \lambda$. Thus, $\sum_{e \in C} \frac{w_e}{k_e} = 1$. We remove all edges in $C$ from $G$; this splits $G$ into two connected components $G_1$ and $G_2$ with $n_1$ and $n_2$ vertices respectively, where $n_1, n_2 \le n - 1$. Further, the standard connectivity of each edge in $G_1, G_2$ is at most that in $G$, so the ratio $w_e / k_e$ measured in $G$ is at most the same ratio measured in $G_1$ or $G_2$. Using the inductive hypothesis, we conclude that $\sum_{e \in G_1} \frac{w_e}{k_e} \le n_1 - 1$ and $\sum_{e \in G_2} \frac{w_e}{k_e} \le n_2 - 1$. We conclude that
$$\sum_e \frac{w_e}{k_e} \le n_1 - 1 + n_2 - 1 + 1 = n - 1.$$

Using Theorem 8, we conclude that the expected number of edges in the skeleton graph $G_\epsilon$ is $O(n \log^2 n / \epsilon^2)$ and $G_\epsilon \in (1 \pm \epsilon)G$ whp.

5.2 Sampling using Effective Resistances

For any edge $e = (u, v)$, set $\lambda_e$ to the effective conductance of the edge, i.e. $\lambda_e = \frac{1}{R_e}$ where $R_e$ is the effective resistance of edge $e$. The next two lemmas imply that the skeleton $G_\epsilon \in (1 \pm \epsilon)G$ whp.

Lemma 8. Suppose that a sampling scheme (that uses the generic sampling scheme) has $\lambda_e \le k_e$ for each edge $e$ in graph $G$, where $k_e$ is the standard connectivity of $e$ in $G$. Then, the skeleton constructed is in $(1 \pm \epsilon)G$ whp.

Proof. We use the same definitions of $\alpha$, $\pi$ and the $G_i$'s as in the sampling scheme with standard connectivities, and verify that $\pi$-connectivity and $\alpha$-overlap continue to be satisfied.

Lemma 9.
Suppose edge $e$ in an undirected graph $G$ has standard connectivity $k_e$ and effective resistance $R_e$. Then, $\frac{1}{R_e} \le k_e$.

Proof. Consider a cut $C$ of weight $k_e$ separating the terminals of edge $e$. We contract each side of this cut into a single vertex. In other words, we reduce the resistance on each edge, other than those in $C$, to 0. By Rayleigh's monotonicity principle (e.g. [4]), the effective resistance of $e$ does not increase due to this transformation. Since the effective resistance of $e$ after the transformation is $1/k_e$, $R_e \ge 1/k_e$ in the original graph.

The size bound follows from the following well-known fact (see e.g. [22]).$^7$

Fact 1. If $R_e$ is the effective resistance of edge $e$ with weight $w_e$ in an undirected graph, then $\sum_e w_e R_e \le n - 1$.

It follows from Theorem 8 that the expected number of edges in skeleton $G_\epsilon$ is $O(n \log^2 n / \epsilon^2)$.

$^7$ There are many proofs of this fact, e.g. use linearity of expectation coupled with the fact that the effective resistance of an edge is the probability that the edge is in a random spanning tree of the graph [3].

5.3 Sampling using Strong Connectivities

For any edge $e$, set $\lambda_e$ to the strong connectivity of the edge; set $\alpha = 1$ and $\pi = 2^k$, where $k = \lfloor \lg \max_{e \in E} \{\lambda_e\} \rfloor$. Let $G_i$ contain all edges in $F_r$ for all $r \ge i$, where each edge in $F_r$ is replicated $2^{k-r}$ times. We use the following property of strong connectivities that also appears in [2].

Lemma 10. In any undirected graph $G$, if an edge $e$ has strong connectivity $k$, then $e$ continues to have strong connectivity $k$ even after all edges with strong connectivity strictly less than $k$ have been removed from $G$.

Consider any cut $C$ with an edge $e \in F_i$. Let the corresponding cut (i.e. with the same bipartition of vertices) in $G_i$ be $C_i$. We need to show that the number of edges in $C_i$ is at least $2^k$ to prove $\pi$-connectivity.
Let the maximum strong connectivity of an edge in $C$ be $k_C$, where $2^j \le k_C \le 2^{j+1} - 1$ for some $j \ge i$. By the above lemma, $C_i$ contains at least $2^j$ distinct edges of $G$, each of which is replicated at least $2^{k-j}$ times. Thus, $C_i$ contains at least $2^k$ edges.

We now prove $\alpha$-overlap. For any cut $C$, recall that $f_i^{(C)}$ and $e_i^{(C)}$ denote the number of edges in $F_i \cap C$ and in $C_i$ respectively (where $C_i$ is as defined in the previous paragraph). Then,
$$\sum_{i=0}^{k} \frac{e_i^{(C)} 2^{i-1}}{\pi} = \sum_{i=0}^{k} \sum_{r=i}^{k} \frac{f_r^{(C)} 2^{k-r} 2^{i-1}}{2^k} = \sum_{i=0}^{k} \sum_{r=i}^{k} \frac{f_r^{(C)}}{2^{r-i+1}} = \sum_{r=0}^{k} \sum_{i=0}^{r} \frac{f_r^{(C)}}{2^{r-i+1}} = \sum_{r=0}^{k} f_r^{(C)} \sum_{i=0}^{r} \frac{1}{2^{r-i+1}} < \sum_{r=0}^{k} f_r^{(C)} = c.$$

The size bound follows from the following lemma due to Benczúr and Karger.

Lemma 11 (Benczúr-Karger [2]). If $k_e$ is the strong connectivity of edge $e$ with weight $w_e$ in an undirected graph, then $\sum_e \frac{w_e}{k_e} \le n - 1$.

It follows from Theorem 8 that the expected number of edges in the skeleton graph $G_\epsilon$ is $O(n \log n / \epsilon^2)$ and that $G_\epsilon \in (1 \pm \epsilon)G$ whp.

6 Sampling in Graphs with Arbitrary Edge Weights

Unfortunately, the algorithms presented earlier for sampling in a graph with polynomial edge weights fail if the edge weights are arbitrary. In particular, we can no longer guarantee that the expected number of edges in a skeleton graph constructed by these algorithms is $\tilde{O}(n / \epsilon^2)$, even though it continues to approximately preserve the weight of all cuts whp. Therefore, we need to modify our techniques to restore the size bounds, as described below.

We sort the edges in decreasing order of their weight, breaking ties arbitrarily. We add edges to the NI forests in this sorted order, i.e. when edge $e$ is being added, the NI forests contain all edges of weight greater than that of $e$.
To insert $e = (u, v)$, we find the NI forest with the minimum index where $u$ and $v$ are not connected; call this index $i_e$. Then, $e$ is inserted in NI forests $T_{i_e}, T_{i_e+1}, \ldots, T_{i_e + w_e - 1}$. Note that this does not produce any cycle in the NI forests since Lemma 1 ensures that if $u, v$ are disconnected in $T_{i_e}$, then they are not connected in $T_k$ for any $k \ge i_e$.

For any edge $e = (u, v)$, set $\lambda_e$ to the index of the first NI forest where edge $e$ is inserted, i.e. $\lambda_e = i_e$; also set $\alpha = 2$ and $\pi = 2^{i-1}$. For any $i \ge 1$, let $G_i$ contain all edges in NI forests $T_{2^{i-1}}, T_{2^{i-1}+1}, \ldots, T_{2^i - 1}$ (call this set of edges $Y_i$) and all edges in $F_i$, i.e. all edges $e$ with $2^i \le \lambda_e \le 2^{i+1} - 1$. Let $G_0 = F_0$. For any $i \ne j$, $F_i \cap F_j = Y_i \cap Y_j = \emptyset$; thus, each edge appears in $G_i$ for at most two different values of $i$, proving $\alpha$-overlap. On the other hand, for any edge $e \in F_i$, Lemma 1 ensures that the endpoints of $e$ are connected in each of $T_{2^{i-1}}, T_{2^{i-1}+1}, \ldots, T_{2^i - 1}$. It follows that $e$ is $2^{i-1}$-heavy in $G_i$, thereby proving $\pi$-connectivity.

We now prove the size bound on the skeleton. Partition edges into subsets $S_0, S_1, \ldots$ where $S_j$ contains all edges $e$ with $j < \frac{i_e}{w_e} \le j + 1$. The following lemma states that none of these subsets is large.

Lemma 12. For any $j$, $|S_j| \le n - 1$.

Proof. We prove that the edges in any subset $S_j$ form an acyclic graph. Suppose not; let $C$ be a cycle formed by the edges in $S_j$, and $e = (u, v)$ be the edge that was inserted last in the NI forests among the edges in $C$. Let $e'$ be any other edge in $C$. Then, $w_{e'} \ge w_e$, and hence
$$i_{e'} + w_{e'} - 1 > w_{e'}(j + 1) - 1 \ge w_e(j + 1) - 1 \ge i_e - 1.$$
Since both the first and last terms are integers, $i_{e'} + w_{e'} - 1 \ge i_e$. Therefore, $u'$ and $v'$ were connected in $T_{i_e}$ for each $e' = (u', v')$ in $C$.
So, $u$ and $v$ were connected in $T_{i_e}$ since $C$ is a cycle, before $e$ was added to $T_{i_e}$. But then $e$ would not have been added to $T_{i_e}$, a contradiction.

Thus,
$$\sum_e \frac{w_e}{i_e} \le \sum_{j : S_j \ne \emptyset} \frac{|S_j|}{j} \le (n - 1) \sum_{j : S_j \ne \emptyset} \frac{1}{j} = O(n \log n)$$
since at most $m < n^2$ of the $S_j$'s are non-empty. Using Theorem 8, we conclude that the skeleton $G_\epsilon$ has $O(n \log^2 n / \epsilon^2)$ edges in expectation and that $G_\epsilon \in (1 \pm \epsilon)G$ whp.

Finally, we need to show that the construction of NI forests where edges are added in decreasing order of weight can be done in $O(m \log^2 n)$ time. We use a data structure (call it a {\em partition tree}) $P$ to succinctly encode the NI forests. The leaf nodes in $P$ exactly correspond to the vertices in graph $G$, i.e. there is a one-to-one mapping between these two sets. On the other hand, each non-leaf node $v$ of the partition tree has a number $n(v)$ associated with it that satisfies the following property: for any two vertices $x, y$ in the graph, if $z$ is the least common ancestor$^8$ of their corresponding leaf nodes in $P$, then $x$ and $y$ are connected in exactly the first $n(z)$ NI forests. Then, $n(z) + 1$ is the index of the first NI forest where edge $(x, y)$ is to be inserted. Initially, all the $n$ leaf nodes in $P$ representing the graph vertices are children of the root node $r$, and $n(r) = 0$. As edges are inserted in the NI forests, the partition tree evolves, but we make sure that the above property holds throughout the construction. Additionally, we also maintain the invariant that if $x$ is a child of $y$ in $P$, then $n(x) > n(y)$.

We need to show that we can maintain the above properties of the partition tree as it evolves, and also retrieve the lca of any pair of vertices efficiently for this evolving partition tree.
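Before turning to the efficient update, it may help to see the insertion rule of this section in its most naive form. The sketch below keeps one union-find structure per NI forest and scans for the first forest in which the endpoints are disconnected; it is a reference implementation under assumptions made here (the names `DSU` and `ni_forest_indices`, the `(u, v, w)` edge format, integer weights), not the partition-tree algorithm of the paper, and its time and memory grow with the number of forests.

```python
class DSU:
    """Union-find over vertices 0..n-1, tracking connectivity in one forest."""
    def __init__(self, n):
        self.parent = list(range(n))

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        self.parent[self.find(a)] = self.find(b)

def ni_forest_indices(n, edges):
    """Return {edge position: i_e} for edges given as (u, v, w) triples.

    Edges are processed in decreasing order of weight; edge e goes into
    forests T_{i_e}, ..., T_{i_e + w_e - 1}, where i_e is the smallest
    (1-based) index of a forest in which u and v are not yet connected.
    """
    forests = []  # forests[t] tracks connectivity in T_{t+1}

    def forest(t):
        while len(forests) <= t:
            forests.append(DSU(n))
        return forests[t]

    first_index = {}
    order = sorted(range(len(edges)), key=lambda k: -edges[k][2])
    for k in order:
        u, v, w = edges[k]
        t = 0
        while forest(t).find(u) == forest(t).find(v):
            t += 1
        first_index[k] = t + 1
        for s in range(t, t + w):
            forest(s).union(u, v)
    return first_index
```

On a triangle whose edges (0,1) and (1,2) have weight 2 and whose edge (0,2) has weight 1, the two heavy edges occupy forests 1 and 2, so the light edge is first inserted in forest 3. The partition tree described next answers the same "first disconnected forest" query without storing every forest explicitly.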
Let $(x, y)$ be the edge being inserted, let $z = lca(x, y)$ in the partition tree, and let $u$ and $v$ be the children of $z$ that are ancestors of $x$ and $y$ respectively. Observe that adding an edge $(x, y)$ to trees with indices from $n_s + 1$ to $n_s + \ell$ increases the connectivity of a pair of vertices $w_1, w_2$ iff they were previously connected in $n_s + i$ trees for some $0 \le i < \ell$, $w_1, x$ were connected in $n_s + j$ trees for some $j \ge i$ and $w_2, y$ were connected in $n_s + k$ trees for some $k \ge i$ (or vice-versa). In this case, $w_1, w_2$ are now connected in $n_s + \min(j, k, \ell)$ trees after adding the edge $(x, y)$. Further, if $n(u) - n(z) < w(x, y)$, then an edge of weight less than $w(x, y)$ must have been added to the trees according to the second invariant, which violates the fact that edges are added in decreasing order of weight. Thus, $n(u) - n(z) \ge w(x, y)$; similarly $n(v) - n(z) \ge w(x, y)$. There are three cases:

1. $n(u) - n(z) = n(v) - n(z) = w(x, y)$. We merge $u$ and $v$ into a single node $s$ that remains a child of $z$ and $n(s) = n(u)$. The first invariant is clearly maintained. For the second invariant, observe that the only pairs of vertices $w_1, w_2$ whose connectivity changed were those with $lca(w_1, w_2) = z$, where $w_1, w_2$ are descendants of $u, v$ respectively. Their connectivity increases to $n(u)$, which is reflected in the partition tree.

$^8$ The least common ancestor or lca of two nodes $x, y$ in a tree is the deepest node that is an ancestor of both $x$ and $y$.

2. $n(u) - n(z) = w(x, y)$ and $n(v) - n(z) > w(x, y)$ (symmetrically for $n(u) - n(z) > w(x, y)$ and $n(v) - n(z) = w(x, y)$). We make $v$ a child of $u$ (from being a child of $z$), and $n(u) = n(z) + w(x, y)$.
For notational convenience in the proofs later, we replace $u$ and $v$ by a pair of new nodes $s$ and $t$ where $n(s)$ and $n(t)$ are respectively equal to the updated values of $n(u)$ and $n(v)$. The first invariant is clearly maintained. For the second invariant, observe that the only pairs of vertices $w_1, w_2$ whose connectivity changed were those with $lca(w_1, w_2) = z$, where $w_1, w_2$ are descendants of $u, v$ respectively. Their connectivity increases to $n(z) + w(x, y)$, which is reflected in the partition tree.

3. $n(u) - n(z) > w(x, y)$ and $n(v) - n(z) > w(x, y)$. We introduce a new node $r$ as a child of $z$ and parent of $u$ and $v$, and $n(r) = n(z) + w(x, y)$. For notational convenience in the proofs later, we replace $u$ and $v$ by a pair of new nodes $s$ and $t$ where $n(s) = n(u)$ and $n(t) = n(v)$. The first invariant is clearly maintained. For the second invariant, observe that the only pairs of vertices $w_1, w_2$ whose connectivity changed were those with $lca(w_1, w_2) = z$, where $w_1, w_2$ are descendants of $u, v$ respectively. Their connectivity increases to $n(z) + w(x, y)$, which is reflected in the partition tree.

We use the dynamic tree data structure [21] for updating the partition tree. This data structure can be used to maintain a dynamically changing forest of $n$ nodes, while supporting the following operations$^9$ in $O(\log n)$ time per operation:

Cut($v$): Cut the subtree under node $v$ from the tree containing it, and make it a separate tree with root $v$.
Link($v$, $w$): ($w$ needs to be the root node of a tree not containing $v$.) Join the tree rooted at $w$ and that containing $v$ by making $w$ a child of $v$.
LCA($v$, $w$): ($v$ and $w$ need to be in the same tree.) Defined previously.

We maintain a dynamic tree data structure for the partition tree. Recall that the partition tree can be modified in three different ways.
The last two modifications require $O(1)$ cut and link operations each. Therefore, the overall time complexity of these modifications is $O(m \log n)$. On the other hand, the first modification requires $O(d)$ cut and link operations, where $d$ is the lesser number of children among $u$ and $v$. We will prove the following lemma bounding the total number of operations due to the first type of modification.

Lemma 13. The total number of cut and link operations due to modifications of the first type in the partition tree is $O(m \log n)$.

Theorem 10 follows immediately.

Theorem 10. The time complexity of constructing NI forests where edges are inserted in decreasing order of weight is $O(m \log^2 n)$ for graphs with arbitrary edge weights.

$^{9}$ The dynamic tree data structure supports other operations as well; we only define the operations that we require.

We now prove Lemma 13.

Proof of Lemma 13. We set up a charging argument for the cut and link operations due to the first type of modification. Define a function $f$ on the nodes of the partition tree where each node $v$ has $f(v) = 1$ initially. In the first type of modification, we assign $f(s) = f(u) + f(v)$; in the second type of modification, $f(s) = f(u) + f(v)$ and $f(t) = 1$; in the third type of modification, $f(r) = f(u) + f(v)$ and $f(s) = f(t) = 1$. Observe that the sum of $f(\cdot)$ over all nodes in the partition tree increases by at most 2 for any of the above modifications.

Let $C_u$ be the set of children of node $u$; then, let $F_C(u) = \sum_{v \in C_u} f(v)$. We charge the cut and link operations for the first type of modification to the children of $u$ (resp., $v$) if $F_C(u) \le F_C(v)$ (resp., $F_C(v) < F_C(u)$); each child of $u$ (resp., $v$) is charged $O(1)$ operations. Since $f(\cdot) \ge 1$ on every node, the charged node has at least $d$ children, so this covers all $O(d)$ operations. Now, let $S_u$ be the set of siblings of any node $u$ in the partition tree; correspondingly, let $F_S(u) = \sum_{v \in S_u} f(v)$.
Observe that whenever a node $u$ is charged due to the first type of modification, $F_S(u)$ at least doubles. Further, $F_S(u)$ never decreases for any node $u$ due to any of the three types of modifications. Since the sum of $f(\cdot)$ over all nodes in the partition tree increases by at most 2 for any of the modifications, and there are $m$ modifications overall, each node is charged at most $O(\log m) = O(\log n)$ times. Further, each modification introduces $O(1)$ new nodes; so the total number of operations due to modifications of the first type is $O(m \log n)$.

References

[1] Joshua D. Batson, Daniel A. Spielman, and Nikhil Srivastava. Twice-Ramanujan sparsifiers. In STOC, pages 255–262, 2009.
[2] András A. Benczúr and David R. Karger. Approximating s-t minimum cuts in $\tilde{O}(n^2)$ time. In STOC, pages 47–55, 1996.
[3] Bela Bollobas. Modern Graph Theory. Springer, 1998.
[4] Peter G. Doyle and Laurie J. Snell. Random Walks and Electric Networks. Carus Mathematical Monographs, 1984.
[5] András Frank. On a theorem of Mader. Discrete Math., 101(1-3):49–57, 1992.
[6] Andrew V. Goldberg and Satish Rao. Beyond the flow decomposition barrier. J. ACM, 45(5):783–797, 1998.
[7] Voratas Kachitvichyanukul and Bruce W. Schmeiser. Binomial random variate generation. Commun. ACM, 31(2):216–222, 1988.
[8] David R. Karger. Global min-cuts in RNC, and other ramifications of a simple min-cut algorithm. In SODA, pages 21–30, 1993.
[9] David R. Karger. Random sampling in cut, flow, and network design problems. In STOC, pages 648–657, 1994.
[10] David R. Karger. Using randomized sparsification to approximate minimum cuts. In SODA, pages 424–432, 1994.
[11] David R. Karger. A randomized fully polynomial time approximation scheme for the all-terminal network reliability problem. SIAM J. Comput., 29(2):492–514, 1999.
[12] Rohit Khandekar, Satish Rao, and Umesh V. Vazirani. Graph partitioning using single commodity flows. J. ACM, 56(4), 2009.
[13] László Lovász. Lecture. In Conference of Graph Theory, 1974.
[14] László Lovász. Combinatorial Problems and Exercises, 2nd ed. North Holland, 1993.
[15] Wolfgang Mader. A reduction method for edge-connectivity in graphs. Ann. Discrete Math., 3:145–164, 1978.
[16] Wolfgang Mader. Konstruktion aller n-fach kantenzusammenhangenden Digraphen. European J. Combin., 3:63–67, 1982.
[17] R. Motwani and P. Raghavan. Randomized Algorithms. Cambridge University Press, 1997.
[18] Hiroshi Nagamochi and Toshihide Ibaraki. Computing edge-connectivity in multigraphs and capacitated graphs. SIAM J. Discrete Math., 5(1):54–66, 1992.
[19] Hiroshi Nagamochi and Toshihide Ibaraki. A linear-time algorithm for finding a sparse k-connected spanning subgraph of a k-connected graph. Algorithmica, 7(5&6):583–596, 1992.
[20] Jonah Sherman. Breaking the multicommodity flow barrier for $O(\sqrt{\log n})$-approximations to sparsest cut. In FOCS, pages 363–372, 2009.
[21] Daniel Dominic Sleator and Robert Endre Tarjan. A data structure for dynamic trees. J. Comput. Syst. Sci., 26(3):362–391, 1983.
[22] Daniel A. Spielman and Nikhil Srivastava. Graph sparsification by effective resistances. In STOC, pages 563–568, 2008.
[23] Daniel A. Spielman and Shang-Hua Teng. Nearly-linear time algorithms for graph partitioning, graph sparsification, and solving linear systems. In STOC, pages 81–90, 2004.
[24] Daniel A. Spielman and Shang-Hua Teng. Nearly-linear time algorithms for preconditioning and solving symmetric, diagonally dominant linear systems. CoRR, abs/cs/0607105, 2006.
[25] Zoltán Szigeti. Edge-splittings preserving local edge-connectivity of graphs. Discrete Applied Mathematics, 156(7):1011–1018, 2008.

A Proof of Theorem 9

We need the following inequality.
Lemma 14. Let $f(x) = x - (1 + x)\ln(1 + x)$ and $\alpha = 1 - 2\ln 2$. Then,
\[
f(x) \le
\begin{cases}
\alpha x^2 & \text{if } x \in (0, 1) \\
\alpha x & \text{if } x \ge 1.
\end{cases}
\]

Proof. First, consider $x \in (0, 1)$. Define
\[
g(x) = \frac{f(x)}{x^2} = \frac{1}{x} - \left(\frac{1}{x} + \frac{1}{x^2}\right)\ln(1 + x).
\]
We can verify that $g(x)$ is an increasing function of $x$ for $x \in (0, 1]$. Further, at $x = 1$, $g(x) = \alpha$. Thus, $f(x) < \alpha x^2$ for $x \in (0, 1)$.

Now, consider $x \ge 1$. Define
\[
h(x) = \frac{f(x)}{x} = 1 - \left(1 + \frac{1}{x}\right)\ln(1 + x).
\]
We can verify that $h(x)$ is a decreasing function of $x$ for $x \ge 1$. Further, at $x = 1$, $h(x) = \alpha$. Thus, $f(x) \le \alpha x$ for $x \ge 1$.

We use the above inequality to prove the following lemmas.

Lemma 15. Suppose $X_1, X_2, \ldots, X_n$ is a set of independent random variables such that each $X_i$, $i \in \{1, 2, \ldots, n\}$, has value $1/p_i$ with probability $p_i$ for some fixed $0 < p_i \le 1$ and has value $0$ with probability $1 - p_i$. For any $p \le \min_i p_i$ and for any $\epsilon > 0$,
\[
\mathbb{P}\left[\sum_i X_i > (1 + \epsilon)n\right] <
\begin{cases}
e^{-0.38\epsilon^2 pn} & \text{if } 0 < \epsilon < 1 \\
e^{-0.38\epsilon pn} & \text{if } \epsilon \ge 1.
\end{cases}
\]

Proof. For any $t > 0$,$^{10}$
\begin{align*}
\mathbb{P}\left[\sum_i X_i > (1 + \epsilon)n\right]
&= \mathbb{P}\left[e^{t\sum_i X_i} > e^{t(1 + \epsilon)n}\right] \\
&< \frac{\mathbb{E}\left[e^{t\sum_i X_i}\right]}{e^{t(1 + \epsilon)n}} && \text{(by the Markov bound; see, e.g., [17])} \\
&= \frac{\prod_{i=1}^{n} \mathbb{E}\left[e^{tX_i}\right]}{e^{t(1 + \epsilon)n}} && \text{(by independence of } X_1, X_2, \ldots, X_n\text{)} \\
&= \frac{\prod_{i=1}^{n} \left(p_i e^{t/p_i} + 1 - p_i\right)}{e^{t(1 + \epsilon)n}} \\
&= \frac{\prod_{i=1}^{n} \left(1 + p_i\left(e^{t/p_i} - 1\right)\right)}{e^{t(1 + \epsilon)n}} \\
&\le \exp\left(\sum_{i=1}^{n} p_i\left(e^{t/p_i} - 1\right) - t(1 + \epsilon)n\right) && \text{(since } 1 + x \le e^x,\ \forall x \ge 0\text{)}.
\end{align*}

$^{10}$ For any random variable $X$, $\mathbb{E}[X]$ denotes the expectation of $X$.

Since $p_i \ge p$ for all $i \in \{1, 2, \ldots, n\}$,
\[
\sum_{i=1}^{n} p_i\left(e^{t/p_i} - 1\right) \le \sum_{i=1}^{n} p\left(e^{t/p} - 1\right) = np\left(e^{t/p} - 1\right).
\]
Thus,
\[
\mathbb{P}\left[\sum_i X_i > (1 + \epsilon)n\right] < \exp\left(np\left(e^{t/p} - 1\right) - t(1 + \epsilon)n\right).
\]
Setting $t = p\ln(1 + \epsilon)$, we get
\[
\mathbb{P}\left[\sum_i X_i > (1 + \epsilon)n\right] < \left(\frac{e^{\epsilon}}{(1 + \epsilon)^{1 + \epsilon}}\right)^{pn} = e^{f(\epsilon)\,pn},
\]
where $f$ is the function defined in Lemma 14. Since $1 - 2\ln 2 < -0.38$, we can use Lemma 14 to conclude that
\[
\mathbb{P}\left[\sum_i X_i > (1 + \epsilon)n\right] <
\begin{cases}
e^{-0.38\epsilon^2 pn} & \text{if } 0 < \epsilon < 1 \\
e^{-0.38\epsilon pn} & \text{if } \epsilon \ge 1.
\end{cases}
\]

Lemma 16. Suppose $X_1, X_2, \ldots, X_n$ is a set of independent random variables such that each $X_i$, $i \in \{1, 2, \ldots, n\}$, has value $1/p_i$ with probability $p_i$ for some fixed $0 < p_i \le 1$ and has value $0$ with probability $1 - p_i$. For any $p \le \min_i p_i$ and for any $\epsilon > 0$,
\[
\mathbb{P}\left[\sum_i X_i < (1 - \epsilon)n\right]
\begin{cases}
< e^{-0.5\epsilon^2 pn} & \text{if } 0 < \epsilon < 1 \\
= 0 & \text{if } \epsilon \ge 1.
\end{cases}
\]

Proof. For $\epsilon \ge 1$,
\[
\mathbb{P}\left[\sum_i X_i < (1 - \epsilon)n\right] \le \mathbb{P}\left[\sum_i X_i < 0\right] = 0.
\]
Now, suppose $\epsilon \in (0, 1)$. For any $t > 0$,
\begin{align*}
\mathbb{P}\left[\sum_i X_i < (1 - \epsilon)n\right]
&= \mathbb{P}\left[e^{-t\sum_i X_i} > e^{-t(1 - \epsilon)n}\right] \\
&< \frac{\mathbb{E}\left[e^{-t\sum_i X_i}\right]}{e^{-t(1 - \epsilon)n}} && \text{(by the Markov bound)} \\
&= \frac{\prod_{i=1}^{n} \mathbb{E}\left[e^{-tX_i}\right]}{e^{-t(1 - \epsilon)n}} && \text{(by independence of } X_1, X_2, \ldots, X_n\text{)} \\
&= \frac{\prod_{i=1}^{n} \left(p_i e^{-t/p_i} + 1 - p_i\right)}{e^{-t(1 - \epsilon)n}} \\
&= \frac{\prod_{i=1}^{n} \left(1 - p_i\left(1 - e^{-t/p_i}\right)\right)}{e^{-t(1 - \epsilon)n}} \\
&\le \exp\left(-\sum_{i=1}^{n} p_i\left(1 - e^{-t/p_i}\right) + t(1 - \epsilon)n\right) && \text{(since } 1 - x \le e^{-x},\ \forall x \ge 0\text{)}.
\end{align*}
Since $p_i \ge p$ for all $i \in \{1, 2, \ldots, n\}$ and $q\left(1 - e^{-t/q}\right)$ is non-decreasing in $q$,
\[
\sum_{i=1}^{n} p_i\left(1 - e^{-t/p_i}\right) \ge \sum_{i=1}^{n} p\left(1 - e^{-t/p}\right) = np\left(1 - e^{-t/p}\right).
\]
Thus,
\[
\mathbb{P}\left[\sum_i X_i < (1 - \epsilon)n\right] < \exp\left(-np\left(1 - e^{-t/p}\right) + t(1 - \epsilon)n\right).
\]
Setting $t = -p\ln(1 - \epsilon)$, we get
\[
\mathbb{P}\left[\sum_i X_i < (1 - \epsilon)n\right] < \left(\frac{e^{-\epsilon}}{(1 - \epsilon)^{1 - \epsilon}}\right)^{pn} \le e^{-0.5\epsilon^2 pn}.
\]

We now prove Theorem 9 using the above lemmas.

Proof of Theorem 9. Let $\delta = \epsilon N / |C|$. First, consider the case where $\delta \in (0, 1)$. From Lemmas 15 and 16, we conclude that
\begin{align*}
\mathbb{P}\left[\left|\sum_e X_e - |C|\right| > \epsilon N\right]
&= \mathbb{P}\left[\left|\sum_e X_e - |C|\right| > \delta |C|\right] \\
&< 2e^{-0.38\delta^2 p|C|} \\
&= 2e^{-0.38\epsilon^2 pN (N/|C|)} \\
&\le 2e^{-0.38\epsilon^2 pN} && \text{(since } N \ge |C|\text{)}.
\end{align*}
Now, consider the case where $\delta \ge 1$. From Lemmas 15 and 16, we conclude that
\begin{align*}
\mathbb{P}\left[\left|\sum_e X_e - |C|\right| > \epsilon N\right]
&= \mathbb{P}\left[\left|\sum_e X_e - |C|\right| > \delta |C|\right] \\
&< e^{-0.38\delta p|C|} \\
&= e^{-0.38\epsilon pN} \\
&\le e^{-0.38\epsilon^2 pN} && \text{(since } \epsilon \le 1\text{)}.
\end{align*}
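As a sanity check on Lemmas 14, 15, and 16, the following Python sketch evaluates the inequality of Lemma 14 on a grid and compares the two tail bounds against exact binomial tails. It covers only the special case where every $p_i$ equals $p$, so that $\sum_i X_i = K/p$ with $K$ binomially distributed; this restriction, the chosen parameters, and all function names are assumptions made for illustration and are not part of the paper.

```python
import math

ALPHA = 1 - 2 * math.log(2)  # alpha from Lemma 14, roughly -0.386


def f(x):
    """f(x) = x - (1 + x) ln(1 + x), as in Lemma 14."""
    return x - (1 + x) * math.log(1 + x)


def lemma14_holds(x, tol=1e-12):
    """Check f(x) <= alpha x^2 on (0, 1) and f(x) <= alpha x for x >= 1."""
    bound = ALPHA * x * x if x < 1 else ALPHA * x
    return f(x) <= bound + tol


def binom_pmf(n, k, p):
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def upper_tail(n, p, eps):
    """Exact P[sum X_i > (1 + eps) n] when all p_i = p (sum X_i = K / p)."""
    return sum(binom_pmf(n, k, p) for k in range(n + 1) if k / p > (1 + eps) * n)


def lower_tail(n, p, eps):
    """Exact P[sum X_i < (1 - eps) n] when all p_i = p."""
    return sum(binom_pmf(n, k, p) for k in range(n + 1) if k / p < (1 - eps) * n)


def lemma15_bound(n, p, eps):
    """Right-hand side of Lemma 15."""
    return math.exp(-0.38 * (eps * eps if eps < 1 else eps) * p * n)


def lemma16_bound(n, p, eps):
    """Right-hand side of Lemma 16 (the probability is exactly 0 for eps >= 1)."""
    return math.exp(-0.5 * eps * eps * p * n) if eps < 1 else 0.0


if __name__ == "__main__":
    n, p = 200, 0.25
    for eps in (0.5, 1.0):
        print(eps, upper_tail(n, p, eps), lemma15_bound(n, p, eps))
        print(eps, lower_tail(n, p, eps), lemma16_bound(n, p, eps))
```

The exact tails are computed by direct summation, which is adequate for small $n$; a larger experiment would need a numerically stabler evaluation of the binomial probabilities.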