BGP Stability is Precarious

P. Brighten Godfrey
University of Illinois at Urbana-Champaign
pbg@illinois.edu

ABSTRACT

We note a fact which is simple, but may be useful for the networking research community: essentially any change to BGP's decision process can cause divergence, or convergence when BGP would otherwise diverge.

1. INTRODUCTION

The Internet's interdomain routing protocol, BGP [3], uses a decision process to select a single best route to each destination when presented with multiple options. This decision process can be customized and modified at each router to select routes that achieve various objectives such as load balance, path quality, or security.

When proposing such a modification, we were asked a very natural question: Given the known problem that a distributed network of BGP routers might never converge to a stable state [5], might the proposed change make the problem worse? That is, do there exist cases in which the standard BGP protocol converges, but the proposed modification causes divergence?

The Hippocratic goal to do no harm is natural to desire of any modification to BGP's decision process, given its global importance. However, we observe here that any modification to the decision process can cause divergence in some case when standard BGP would converge, under very mild conditions. Specifically, (1) the modified BGP must actually differ, in that there is some case where the modified and standard BGP both converge, but to different outcomes; and (2) the modification either may be deployed at only some routers, or the modification preserves the expressiveness of standard BGP (for example by maintaining the initial operator-configurable LOCAL_PREF step). Even seemingly trivial changes, like changing a tiebreaking step from "lowest router ID" to "highest router ID", satisfy these conditions and therefore may cause divergence.
But this fact should not incite fear of modifying BGP. Indeed, any modification could also cause convergence when BGP would otherwise diverge. Thus, fear of modifying BGP can be equally matched with fear of not modifying BGP.

Instead, what the result points out is that the question of whether new cases of divergence could happen by switching from decision process A to B is uninformative, because the answer is always "yes" for any distinct values of A and B. A more valuable question is how convergence is affected in realistic cases. This, of course, is a much more difficult question to answer convincingly, not least because it requires assumptions about what is realistic.

2. MODEL

2.1 The standard model

We follow the model of [1].¹ An instance of the Stable Paths Problem (SPP) consists of a graph $G = (V, E)$ and a set $\lambda$ of ranking functions, one for each node $v \in V$. Node $v$'s ranking function $\lambda_v$ specifies which paths $v$ prefers; specifically, if $\lambda_v(P_1) > \lambda_v(P_2)$ then $v$ prefers $P_1$ over $P_2$. We require that $\lambda_v(P_1) \neq \lambda_v(P_2)$ unless $P_1$ and $P_2$ have the same first edge (since BGP learns only a single route from each neighbor, we will never need to compare two such routes). The "null path" $\varepsilon$ represents the absence of a path to the destination, and is considered a valid path. Since BGP's decision process may eliminate a path $P$ due to import or export filters, we may have $\lambda_v(\varepsilon) > \lambda_v(P)$. We write $P_1 P_2$ to denote the concatenation of two paths, or $vwP$ to concatenate the edge $(v, w)$ with path $P$.

There is a single distinguished node 0 to which all nodes are choosing paths. At any given time $t$, each node $v$ has a current path assignment $\pi_t(v)$. At all times $t$, we have $\pi_t(0) = 0$ (i.e., the destination always selects the trivial one-hop path to itself). The dynamics of the protocol are modeled by a sequence of "activations" of nodes $A = (v_{i_1}, v_{i_2}, \ldots)$ in which each node (other than 0) must appear infinitely often. At time $t$, only node $A_t$ updates its selected route $\pi_t(A_t)$ and all other nodes are unaffected. Specifically, if $v = A_t$, then $v$ chooses its new best route by setting

$$\pi_t(v) = \operatorname*{argmax}_{P \in \mathrm{choices}(v,t)} \lambda_v(P), \tag{1}$$

where $\mathrm{choices}(v, t)$ is the set of all simple (non-loopy) paths of the form $vw\,\pi_t(w)$ where $w$ is a neighbor of $v$ and $\pi_t(w)$ is $w$'s current path.

A node $v$ is stable in path assignment $\pi$ if executing (1) produces no change. A path assignment $\pi$ is stable if all nodes are stable in $\pi$. An instance $(G, \lambda)$ is safe if any activation sequence eventually produces a stable path assignment, regardless of the initial path assignment.

¹ We omit [1]'s FIFO queues and permitted path sets, which other features of the model can emulate.

2.2 Modeling a modified decision process

A ranking function $\lambda$ encapsulates the final result of the BGP decision process, whether that is due to an operator's assignment of the LOCAL_PREF attribute for a route, or minimizing the AS_PATH length, or any of the various other factors that affect the decision process. So a "modified BGP decision process" is simply a different ranking function $\lambda'$.

But two ranking functions might be effectively equivalent, in that they produce the same outcome in practice. The following definition rules out such degenerate modifications.

Definition 1. Two ranking functions $\lambda, \lambda'$ are safely distinct if there exists a network $N$ for which $(N, \lambda)$ and $(N, \lambda')$ are safe, but their stable states differ.

(Note that since the two instances are safe, they each have a single stable state [2, 4].) As mentioned in the introduction, this definition restricts our attention to the case that there is some network on which $\lambda$ and $\lambda'$ are safe and converge to different outcomes.
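The model of Section 2.1 and the comparison in Definition 1 can be sketched in a few lines of Python. This is only an illustration under assumptions that are not in the paper: paths are tuples of node names ending at the destination 0, a ranking function is a dict from path to a numeric rank (the null path is given rank 0 and unlisted paths rank -1), and the three-node network with its two rankings `lam` and `lam_prime` is made up for the example. The brute-force search finds stable states but does not prove safety; the toy instances happen to be safe because each node's top choice never depends on a cycle of preferences.

```python
from itertools import product

DEST = 0
EPS = ()  # the null path

def choices(v, pi, neighbors):
    """Null path plus all simple paths of the form v w pi(w)."""
    out = [EPS]
    for w in neighbors[v]:
        p = pi[w]
        if p != EPS and v not in p:
            out.append((v,) + p)
    return out

def rank(lam_v, path):
    """Null path ranks 0; paths the node did not list rank below it."""
    return 0 if path == EPS else lam_v.get(path, -1)

def best(v, pi, neighbors, lam):
    """Equation (1): the highest-ranked currently available choice."""
    return max(choices(v, pi, neighbors), key=lambda p: rank(lam[v], p))

def run(schedule, pi, neighbors, lam):
    """Apply a finite prefix of an activation sequence."""
    pi = dict(pi)
    for v in schedule:
        pi[v] = best(v, pi, neighbors, lam)
    return pi

def stable_states(neighbors, lam):
    """Enumerate assignments in which no node would change its path."""
    nodes = sorted(lam)
    options = [[EPS] + list(lam[v]) for v in nodes]
    found = []
    for combo in product(*options):
        pi = dict(zip(nodes, combo))
        pi[DEST] = (DEST,)
        if all(best(v, pi, neighbors, lam) == pi[v] for v in nodes):
            found.append(pi)
    return found

# Hypothetical network: nodes 1 and 2 are adjacent and both reach 0 directly.
neighbors = {1: [0, 2], 2: [0, 1]}
lam = {1: {(1, 0): 2, (1, 2, 0): 1}, 2: {(2, 0): 2, (2, 1, 0): 1}}
lam_prime = {1: {(1, 2, 0): 2, (1, 0): 1}, 2: {(2, 0): 2, (2, 1, 0): 1}}

s = stable_states(neighbors, lam)
s_prime = stable_states(neighbors, lam_prime)
start = {DEST: (DEST,), 1: EPS, 2: EPS}
print(s, s_prime, run([1, 2, 1, 2], start, neighbors, lam_prime))
```

On this toy network each ranking yields exactly one stable state, and the two differ at node 1, which is the situation Definition 1 asks for.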
While safe distinctness appears to be a very mild restriction, it is conceivable that $\lambda$ and $\lambda'$ always produce identical stable states except on networks where at least one of them may diverge. In that case, reasoning about differences in outcomes involves the system's dynamics, i.e., particular activation sequences. It would be possible to use our technique to make statements about particular activation sequences, but we choose to avoid that complication here.

3. PRECARIOUSNESS

3.1 Partial deployment

In this section we show that any safely distinct modification of the BGP decision process can cause divergence or convergence, when partially deployed.

But what exactly is a "partial deployment" of the modified decision process? Since a ranking function is defined for a specific network, how can we "deploy" it in a new environment where it may have to rank new paths? Fortunately we can sidestep this modeling complication since we will need to use the ranking functions in only a black-box manner in our theorem.

Specifically, suppose we have ranking functions $\lambda^N$ and $\lambda^G$ on networks $N$ and $G$, respectively, and a given subgraph $N' \subseteq G$ is identical to $N$. Then a partial deployment of $\lambda^N$ in $(G, \lambda^G)$ is an instance $(G, \lambda^*)$ where

$$\lambda^*_v(P) = \begin{cases} \lambda^G_v(P) & \text{if } v \in G \setminus N' \\ \lambda^N_v(P) & \text{if } v \in N' \text{ and } P \subseteq N' \\ -\infty & \text{if } v \in N' \text{ and } P \not\subseteq N'. \end{cases}$$

In other words, the new ranking function $\lambda^*$ mimics $\lambda^G$ except on $N'$ where it mimics $\lambda^N$. The third case causes $\lambda^*$ to rank any path outside $N'$ strictly less than $\varepsilon$, which ensures that $\lambda^N$ never is called upon to rank a path outside the network $N'$ on which it is well-defined. This models a scenario in which nodes outside $N'$ export no BGP route advertisements to nodes in $N'$.

We can now state and prove the theorem.

Theorem 1. If $\lambda$ and $\lambda'$ are safely distinct, then there exists an SPP instance $(G, \lambda^G)$ in which a partial deployment of $\lambda$ is safe, but a partial deployment of $\lambda'$ has no stable path assignment.

One can interpret the theorem as follows. If we let $\lambda$ be the behavior of standard BGP, then the partial deployment of $\lambda$ in $(G, \lambda^G)$ just means that the whole network runs standard BGP, and the modification $\lambda'$ causes divergence. Symmetrically, we can just as easily let $\lambda'$ be the behavior of standard BGP, in which case the modification causes convergence.

Proof. We construct $G$ as follows (Fig. 1). Since $\lambda$ and $\lambda'$ are safely distinct, there is a network $N$ on which their stable states differ. We include in $G$ two copies of $N$ which we call $N$ and $N'$, but with only one instance of the destination 0. By the condition of the theorem, there must exist a $w \in N$ which has differing path selections in the stable states of $(N, \lambda)$ and $(N, \lambda')$. Let $w'$ be the corresponding node in $N'$. We add a new node $x$ connected to $w$ and $w'$. Finally, we add an "oscillator gadget" (a triangle $a, b, c$ with each node connected to the destination 0) and connect $a$ to $x$.

We construct $\lambda^G$ as follows. First, $\lambda^G_v = \lambda_v$ for all $v \in N$. The behavior of $\lambda^G$ on $N'$ is irrelevant, since this is where we will place the partial deployment of $\lambda$ or $\lambda'$.

Second, $\lambda^G_x$ ranks paths as follows. Let $P_1, \ldots, P_k$ be a list of all $w \leadsto 0$ paths in $N$, and let $P'_1, \ldots, P'_k$ be the corresponding $w' \leadsto 0$ paths in $N'$. Without loss of generality, suppose that $P_1$ is $w$'s selected path in the stable state of $(N, \lambda')$, while $w$'s selected path in $(N, \lambda)$ is some other path $P_i$. Then we let

$$\lambda^G_x(xwP_1) > \lambda^G_x(xw'P'_1) > \lambda^G_x(xwP_2) > \lambda^G_x(xw'P'_2) > \cdots > \lambda^G_x(xwP_k) > \lambda^G_x(xw'P'_k) > \lambda^G_x(\varepsilon),$$

with all other paths ranked below $\varepsilon$.
Third and finally, on the oscillator gadget, $\lambda^G$ behaves like the classic Bad Gadget [1]: each of $a, b, c$ will accept one of two paths, the direct path (e.g. $a0$) and the path via its counterclockwise neighbor (e.g. $ab0$), with the latter preferred. However, to this structure we add the fact that $a$ most prefers the path $axwP_i$.

[Figure 1 (diagram not reproduced). Two copies of the network, labeled "Standard BGP network $N$" and "Modified BGP network $N'$", contain nodes $w$ and $w'$ with paths $P_1, \ldots, P_k$ and $P'_1, \ldots, P'_k$ to the destination 0. The "not-equal gadget" is node $x$, with ranking $xwP_1, xw'P'_1, \ldots, xwP_k, xw'P'_k$. The "oscillator gadget" consists of $a$ (ranking $axwP_i, ab0, a0$), $b$ (ranking $bc0, b0$), and $c$ (ranking $ca0, c0$).]

Figure 1: An SPP instance which converges if and only if $N$ and $N'$ have the same stable state. The ranking function of certain nodes is written in blue next to the node, listing paths from most to least preferred. Multiple copies of the destination 0 are drawn for clarity, but these are in fact the same node.

With a partial deployment of $\lambda$ on $N'$, the effect of the construction is as follows. Since $(N, \lambda)$ is safe, it must have a single stable state [2, 4], so $N$ and $N'$ will eventually stabilize with corresponding path selections $P_i$ and $P'_i$. Therefore, since $x$ always prefers a path in $N$ over the corresponding path in $N'$, it will eventually select $xwP_i$ permanently, causing $a$ to select the path $axwP_i$, causing $c$ to select $c0$, and $b$ to select $bc0$. Thus, a unique stable state is reached for any activation sequence.

On the other hand, consider a partial deployment of $\lambda'$ on $N'$. Since $(N', \lambda')$ is safe it must have a single stable state [2], which we know must differ from the stable state of $(N, \lambda)$. Thus, after $N$ and $N'$ converge, node $x$ is presented with two different paths, $xwP_i$ and $xw'P'_1$. It will prefer the path $xw'P'_1$ and remain with that selection thereafter. With $a$'s possibility of any more-preferred path via $x$ now eliminated, the nodes $a, b, c$ mimic the Bad Gadget, and have no stable state.
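The two cases of this argument can be checked by brute force on the oscillator gadget alone. The sketch below is an illustration, not part of the proof: it freezes $x$'s eventual selection and passes it in as `x_path`, uses placeholder tuples such as `("x", "w", "Pi", 0)` to stand for the paths $xwP_i$ and $xw'P'_1$, and takes the rankings of $a$, $b$, $c$ from Fig. 1. All identifiers are hypothetical. It only enumerates stable assignments; it does not simulate activation sequences.

```python
from itertools import product

EPS = ()  # the null path

# Rankings of the gadget nodes as listed in Fig. 1 (higher number = preferred).
RANK = {
    "a": {("a", "x", "w", "Pi", 0): 3, ("a", "b", 0): 2, ("a", 0): 1},
    "b": {("b", "c", 0): 2, ("b", 0): 1},
    "c": {("c", "a", 0): 2, ("c", 0): 1},
}
NEIGHBORS = {"a": ["x", "b", "c", 0], "b": ["a", "c", 0], "c": ["a", "b", 0]}

def best(v, pi):
    """Highest-ranked simple path v w pi(w); unlisted paths rank below EPS."""
    cands = [EPS]
    for w in NEIGHBORS[v]:
        p = pi[w]
        if p != EPS and v not in p:
            cands.append((v,) + p)
    return max(cands, key=lambda p: 0 if p == EPS else RANK[v].get(p, -1))

def stable_states(x_path):
    """All assignments of a, b, c that are stable given x's fixed path."""
    found = []
    options = [[EPS] + list(RANK[v]) for v in "abc"]
    for pa, pb, pc in product(*options):
        pi = {0: (0,), "x": x_path, "a": pa, "b": pb, "c": pc}
        if all(best(v, pi) == pi[v] for v in "abc"):
            found.append(pi)
    return found

# Case 1: x has settled on x w P_i, so a can use its most preferred path.
conv = stable_states(("x", "w", "Pi", 0))
# Case 2: x has settled on x w' P'_1, which a ranks below the null path.
div = stable_states(("x", "w'", "P1'", 0))
print(len(conv), len(div))
```

In the first case the enumeration finds the single assignment described in the text ($a$ on $axwP_i$, $b$ on $bc0$, $c$ on $c0$); in the second it finds none, matching the behavior of the Bad Gadget.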
Therefore, with a partial deployment of $\lambda'$, the constructed instance has no stable state.

3.2 Full deployment

If the requirement of partial deployment were removed, Theorem 1 would no longer hold. Consider, for example, a modified decision process which simply performs shortest path routing. The theorem shows that a partial deployment of shortest path routing can cause divergence; however, a full deployment will always converge.

But the theorem holds with full deployments if we add a constraint on the modification: it must preserve the expressive power of BGP. One way to formalize this is as follows. The operator of each node $v$ specifies a partial ranking function $\hat\lambda_v$ which may assign multiple paths the same value. A decision process is now a function $d$ which, given a partial ranking function $\hat\lambda_v$, returns a ranking function $d\hat\lambda_v$ consistent with $\hat\lambda_v$ (that is, $\hat\lambda_v(P_1) > \hat\lambda_v(P_2)$ implies $d\hat\lambda_v(P_1) > d\hat\lambda_v(P_2)$). Intuitively, $d$ breaks any "ties" in $\hat\lambda$'s ranking of paths.² We let $d\hat\lambda$ refer to the set of ranking functions produced by applying $d$ to the set of partial ranking functions $\hat\lambda_v$ over all nodes $v$.

Taking common ISP business relationships as an example, an operator might specify $\hat\lambda_v(P) = 100$ for paths through $v$'s providers, $\hat\lambda_v(P) = 200$ for paths through peers, and $\hat\lambda_v(P) = 300$ for paths through customers. If $v$ has multiple providers, peers, or customers, this will not always yield a unique best path; the decision process $d$ breaks those ties, perhaps by examining path length or other factors. However, the operator can choose to specify an arbitrary ranking function by giving $\hat\lambda_v(P)$ a distinct value for each $P$ (in which case $d$ does not affect the outcome). In this sense, any modified decision process preserves BGP's expressiveness.

We can now show a theorem analogous to Theorem 1 without the partial deployment requirement.
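To make the role of $d$ concrete, the following sketch treats a decision process as a tie-breaking rule applied on top of an operator's partial ranking. The values 100, 200, and 300 come from the example in the text; everything else is an assumption made for illustration: the node `"v"`, its three neighbors and their roles, and the two tie-breakers (prefer the lowest or the highest next-hop ID, echoing the router-ID example from the introduction).

```python
ROLE_VALUE = {"provider": 100, "peer": 200, "customer": 300}

def business_partial(role_of):
    """Partial ranking: a path's value depends only on its next hop's role."""
    return lambda path: ROLE_VALUE[role_of[path[1]]]

def apply_decision_process(paths, partial, tiebreak):
    """Return a total ranking (path -> rank) consistent with `partial`.

    Paths are ordered by partial value first, and by `tiebreak` only among
    paths with equal partial value, so a strictly higher partial value
    always yields a strictly higher rank.
    """
    ordered = sorted(paths, key=lambda p: (partial(p), tiebreak(p)))
    return {p: r for r, p in enumerate(ordered, start=1)}

def d_lowest_id(path):
    """Tie-breaker for d: a lower next-hop ID is preferred."""
    return -path[1]

def d_highest_id(path):
    """Tie-breaker for d': a higher next-hop ID is preferred."""
    return path[1]

# Hypothetical node v with two customers (1, 2) and one peer (3).
paths = [("v", 1, 0), ("v", 2, 0), ("v", 3, 0)]
partial = business_partial({1: "customer", 2: "customer", 3: "peer"})

rank_d = apply_decision_process(paths, partial, d_lowest_id)
rank_d_prime = apply_decision_process(paths, partial, d_highest_id)
best_d = max(rank_d, key=rank_d.get)
best_d_prime = max(rank_d_prime, key=rank_d_prime.get)

# An operator who assigns distinct values leaves nothing for d to decide.
distinct = {("v", 1, 0): 10, ("v", 2, 0): 30, ("v", 3, 0): 20}.get
same = (apply_decision_process(paths, distinct, d_lowest_id)
        == apply_decision_process(paths, distinct, d_highest_id))
print(best_d, best_d_prime, same)
```

With the tied customer routes the two tie-breakers pick different best paths, while with distinct operator values both produce the same total ranking, which is the sense in which the operator can force any total order.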
The proof of the full-deployment theorem is an easy adaptation of our earlier technique: since the partial ranking function can be used to force any total order, we can build the necessary ranking functions even though the modification is deployed at all nodes.

Definition 2. Two decision processes $d, d'$ are safely distinct if there exists a network $N$ and partial ranking functions $\hat\lambda$ for which $(N, d\hat\lambda)$ and $(N, d'\hat\lambda)$ are safe, but their stable states differ.

Theorem 2. If $d$ and $d'$ are safely distinct, then there exists a network $G$ and partial ranking functions $\hat\lambda^G$ in which $(G, d\hat\lambda^G)$ is safe, but $(G, d'\hat\lambda^G)$ has no stable path assignment.

Proof. Let $N$ and $\hat\lambda$ be such that $(N, d\hat\lambda)$ and $(N, d'\hat\lambda)$ are safe, but their stable states differ. We construct $G$ based on $N$ as in the proof of Theorem 1. Let $\lambda^G_v$ be the ranking functions constructed in the proof of Theorem 1 with $\lambda = d\hat\lambda$ and $\lambda' = d'\hat\lambda$. Define the partial ranking functions $\hat\lambda^G$ as

$$\hat\lambda^G_v = \begin{cases} \lambda^G_v & \text{if } v \in \{x, a, b, c\} \\ d\hat\lambda_v & \text{if } v \in N \\ \hat\lambda_v & \text{if } v \in N'. \end{cases}$$

Note that applying one of the decision processes to $\hat\lambda^G$ can only vary its behavior for $v \in N'$. With this construction, applying the decision process $d$ yields ranking functions $d\hat\lambda^G$ which are exactly equivalent to $\lambda^G$ with a partial deployment of $d\hat\lambda$ on $N'$. Likewise, $d'\hat\lambda^G$ is exactly equivalent to $\lambda^G$ with a partial deployment of $d'\hat\lambda$ on $N'$. Therefore, the result follows by the argument of the proof of Theorem 1.

² Recall from the definition of a ranking function that $d\hat\lambda_v$ might still have ties, but only between two routes that go through the same neighbor, which we never need to compare.

4. EXTENSIONS

Our results dealt with decision processes which differ in their final stable state.
One can also show that if there is any difference in path selections in two ranking functions $\lambda$ and $\lambda'$ at any moment during the dynamic convergence process, then for a particular activation sequence, a partial deployment of $\lambda'$ will diverge while a partial deployment of $\lambda$ will converge (or vice versa). This involves inserting a gadget (essentially Disagree [1]) between $x$ and $a$ in the construction of Fig. 1, to "remember" that some difference has occurred in the past.

In one sense, this is stronger than our previous results, as it applies even to transient differences between $\lambda$ and $\lambda'$. However, it is less satisfying since it needs an activation sequence that runs $N$ and $N'$ in lockstep. With any variation in timing, even two copies of $\lambda$ could be judged to be different at some moments in time.

One could consider more general models of the BGP decision process, perhaps treating it as a state machine with memory, in order to model features such as route flap damping [6] which change their preferences across time. Since our theorems use $\lambda$ and $\lambda'$ essentially as black boxes, such extensions may be straightforward.

5. ACKNOWLEDGEMENTS

We thank Alex Fabrikant, Michael Schapira, and Scott Shenker for helpful comments.

6. REFERENCES

[1] T. Griffin, F. Shepherd, and G. Wilfong. The stable paths problem and interdomain routing. IEEE/ACM Transactions on Networking, 10(2), April 2002.
[2] A. D. Jaggard, M. Schapira, and R. N. Wright. Distributed computing with adaptive heuristics. In Innovations in Computer Science, January 2011.
[3] Y. Rekhter and T. Li. A border gateway protocol 4 (BGP-4). RFC 1771, March 1995.
[4] R. Sami, M. Schapira, and A. Zohar. Searching for stability in interdomain routing. In IEEE INFOCOM, April 2009.
[5] K. Varadhan, R. Govindan, and D. Estrin. Persistent route oscillations in inter-domain routing. Computer Networks, 32(1):1–16, 2000.
[6] C. Villamizar, R. Chandra, and R. Govindan. BGP route flap damping. RFC 2439, November 1998.
