BGP Stability is Precarious

We note a fact which is simple, but may be useful for the networking research community: essentially any change to BGP's decision process can cause divergence --- or convergence when BGP would otherwise diverge.

Authors: P. Brighten Godfrey

P. Brighten Godfrey, University of Illinois at Urbana-Champaign, pbg@illinois.edu

1. INTRODUCTION

The Internet's interdomain routing protocol, BGP [3], uses a decision process to select a single best route to each destination when presented with multiple options. This decision process can be customized and modified at each router to select routes that achieve various objectives such as load balance, path quality, or security.

When proposing such a modification, we were asked a very natural question: Given the known problem that a distributed network of BGP routers might never converge to a stable state [5], might the proposed change make the problem worse? That is, do there exist cases in which the standard BGP protocol converges, but the proposed modification causes divergence?

The Hippocratic goal to do no harm is natural to desire of any modification to BGP's decision process, given its global importance. However, we observe here that any modification to the decision process can cause divergence in some case when standard BGP would converge, under very mild conditions. Specifically, (1) the modified BGP must actually differ, in that there is some case where the modified and standard BGP both converge, but to different outcomes; and (2) the modification either may be deployed at only some routers, or the modification preserves the expressiveness of standard BGP (for example by maintaining the initial operator-configurable LOCAL_PREF step). Even seemingly trivial changes, like changing a tiebreaking step from "lowest router ID" to "highest router ID", satisfy these conditions and therefore may cause divergence.
But this fact should not incite fear of modifying BGP. Indeed, any modification could also cause convergence when BGP would otherwise diverge. Thus, fear of modifying BGP can be equally matched with fear of not modifying BGP.

Instead, what the result points out is that the question of whether new cases of divergence could happen by switching from decision process A to B is uninformative, because the answer is always "yes" for any distinct values of A and B. A more valuable question is how convergence is affected in realistic cases. This, of course, is a much more difficult question to answer convincingly, not least because it requires assumptions about what is realistic.

2. MODEL

2.1 The standard model

We follow the model of [1].¹ An instance of the Stable Paths Problem (SPP) consists of a graph $G = (V, E)$ and a set $\lambda$ of ranking functions, one for each node $v \in V$. Node $v$'s ranking function $\lambda_v$ specifies which paths $v$ prefers; specifically, if $\lambda_v(P_1) > \lambda_v(P_2)$ then $v$ prefers $P_1$ over $P_2$. We require that $\lambda_v(P_1) \neq \lambda_v(P_2)$ unless $P_1$ and $P_2$ have the same first edge (since BGP learns only a single route from each neighbor, we will never need to compare two such routes). The "null path" $\varepsilon$ represents the absence of a path to the destination, and is considered a valid path. Since BGP's decision process may eliminate a path $P$ due to import or export filters, we may have $\lambda_v(\varepsilon) > \lambda_v(P)$. We write $P_1 P_2$ to denote the concatenation of two paths, or $vwP$ to concatenate the edge $(v, w)$ with path $P$.

There is a single distinguished node 0 to which all nodes are choosing paths. At any given time $t$, each node $v$ has a current path assignment $\pi_t(v)$. At all times $t$, we have $\pi_t(0) = 0$ (i.e., the destination always selects the trivial one-hop path to itself).
The dynamics of the protocol are modeled by a sequence of "activations" of nodes $A = (v_{i_1}, v_{i_2}, \ldots)$ in which each node (other than 0) must appear infinitely often. At time $t$, only node $A_t$ updates its selected route $\pi_t(A_t)$ and all other nodes are unaffected. Specifically, if $v = A_t$, then $v$ chooses its new best route by setting

$$\pi_t(v) = \arg\max_{P \in \mathrm{choices}(v,t)} \lambda_v(P), \tag{1}$$

where $\mathrm{choices}(v, t)$ is the set of all simple (non-loopy) paths of the form $vw\pi_t(w)$ where $w$ is a neighbor of $v$ and $\pi_t(w)$ is $w$'s current path.

A node $v$ is stable in path assignment $\pi$ if executing (1) produces no change. A path assignment $\pi$ is stable if all nodes are stable in $\pi$. An instance $(G, \lambda)$ is safe if any activation sequence eventually produces a stable path assignment, regardless of the initial path assignment.

¹ We omit [1]'s FIFO queues and permitted path sets, which other features of the model can emulate.

2.2 Modeling a modified decision process

A ranking function $\lambda$ encapsulates the final result of the BGP decision process, whether that is due to an operator's assignment of the LOCAL_PREF attribute for a route, or minimizing the AS_PATH length, or any of the various other factors that affect the decision process. So a "modified BGP decision process" is simply a different ranking function $\lambda'$.

But two ranking functions might be effectively equivalent, in that they produce the same outcome in practice. The following definition rules out such degenerate modifications.

Definition 1. Two ranking functions $\lambda, \lambda'$ are safely distinct if there exists a network $N$ for which $(N, \lambda)$ and $(N, \lambda')$ are safe, but their stable states differ. (Note that since the two instances are safe, they each have a single stable state [2, 4].)
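The model of Section 2.1 and the idea behind Definition 1 can be sketched in a few lines of Python. This is an illustrative sketch, not code from the paper: the names `best` and `run`, the three-node network, and the two rankings `lam` and `lam_prime` are all assumptions. Each ranking is written as a list of paths from most to least preferred, and any path not listed is treated as ranked below the null path. The run uses a single round-robin activation sequence, so it illustrates convergence but does not by itself establish safety.

```python
from itertools import cycle, islice

EPS = ()  # the null path: no route to the destination


def best(v, pi, neighbors, rank):
    """Equation (1): v's highest-ranked simple path of the form v w pi(w)."""
    cands = []
    for w in neighbors[v]:
        p = pi[w]
        # Skip absent routes, loops, and paths ranked below the null path.
        if p != EPS and v not in p and (v,) + p in rank[v]:
            cands.append((v,) + p)
    return min(cands, key=rank[v].index, default=EPS)


def run(neighbors, rank, max_steps=1000):
    """Activate nodes round-robin; return the stable assignment or None."""
    nodes = [v for v in neighbors if v != 0]
    pi = {v: EPS for v in nodes}
    pi[0] = (0,)  # the destination always selects the trivial path
    for v in islice(cycle(nodes), max_steps):
        pi[v] = best(v, pi, neighbors, rank)
        if all(pi[u] == best(u, pi, neighbors, rank) for u in nodes):
            return pi
    return None


# Hypothetical triangle: nodes 1 and 2 are adjacent to each other and to 0.
neighbors = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
lam = {1: [(1, 0), (1, 2, 0)], 2: [(2, 0), (2, 1, 0)]}
lam_prime = {1: [(1, 2, 0), (1, 0)], 2: [(2, 0)]}

state = run(neighbors, lam)
state_prime = run(neighbors, lam_prime)
print(state[1], state_prime[1])  # (1, 0) (1, 2, 0)
```

On this toy network both rankings reach a stable assignment, but node 1 ends on different paths, which is the kind of witness network that Definition 1 asks for.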
As mentioned in the introduction, this definition restricts our attention to the case that there is some network on which $\lambda$ and $\lambda'$ are safe and converge to different outcomes. While this appears to be a very mild restriction, it is conceivable that $\lambda$ and $\lambda'$ always produce identical stable states except on networks where at least one of them may diverge. In that case, reasoning about differences in outcomes involves the system's dynamics, i.e., particular activation sequences. It would be possible to use our technique to make statements about particular activation sequences, but we choose to avoid that complication here.

3. PRECARIOUSNESS

3.1 Partial deployment

In this section we show that any safely distinct modification of the BGP decision process can cause divergence or convergence, when partially deployed.

But what exactly is a "partial deployment" of the modified decision process? Since a ranking function is defined for a specific network, how can we "deploy" it in a new environment where it may have to rank new paths? Fortunately we can sidestep this modeling complication since we will need to use the ranking functions in only a black-box manner in our theorem.

Specifically, suppose we have ranking functions $\lambda^N$ and $\lambda^G$ on networks $N$ and $G$, respectively, and a given subgraph $N' \subseteq G$ is identical to $N$. Then a partial deployment of $\lambda^N$ in $(G, \lambda^G)$ is an instance $(G, \lambda^*)$ where

$$\lambda^*_v(P) = \begin{cases} \lambda^G_v(P) & \text{if } v \in G \setminus N' \\ \lambda^N_v(P) & \text{if } v \in N' \text{ and } P \subseteq N' \\ -\infty & \text{if } v \in N' \text{ and } P \not\subseteq N'. \end{cases}$$

In other words, the new ranking function $\lambda^*$ mimics $\lambda^G$ except on $N'$ where it mimics $\lambda^N$. The third case causes $\lambda^*$ to rank any path outside $N'$ strictly less than $\varepsilon$, which ensures that $\lambda^N$ never is called upon to rank a path outside the network $N'$ on which it is well-defined. This models a scenario in which nodes outside $N'$ export no BGP route advertisements to nodes in $N'$.
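The three-case definition of a partial deployment can be written as a small wrapper around two black-box ranking functions. The sketch below is only illustrative: `partial_deployment`, `lam_G`, and `lam_N` are hypothetical names, paths are tuples of node identifiers, and the node set of N' is assumed to include the destination 0 and to use the same labels as N.

```python
import math


def partial_deployment(lam_G, lam_N, n_prime):
    """Build lam_star(v, path) from the three-case definition above."""
    n_prime = frozenset(n_prime)

    def lam_star(v, path):
        if v not in n_prime:
            return lam_G(v, path)   # v in G \ N': keep G's ranking
        if n_prime.issuperset(path):
            return lam_N(v, path)   # path stays inside N': use the deployed ranking
        return -math.inf            # path leaves N': ranked below everything

    return lam_star


# Hypothetical rankings: lam_G prefers short paths, lam_N prefers long ones.
lam_G = lambda v, path: -len(path)
lam_N = lambda v, path: len(path)

lam_star = partial_deployment(lam_G, lam_N, n_prime={0, 1, 2})

print(lam_star(3, (3, 0)))     # node outside N': -2
print(lam_star(1, (1, 2, 0)))  # node and path inside N': 3
print(lam_star(1, (1, 3, 0)))  # path leaves N': -inf
```

Because the wrapper only ever calls `lam_N` on paths that stay inside N', it uses the deployed ranking function purely as a black box, which is all the theorem requires.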
We can now state and prove the theorem.

Theorem 1. If $\lambda$ and $\lambda'$ are safely distinct, then there exists an SPP instance $(G, \lambda^G)$ in which a partial deployment of $\lambda$ is safe, but a partial deployment of $\lambda'$ has no stable path assignment.

One can interpret the theorem as follows. If we let $\lambda$ be the behavior of standard BGP, then the partial deployment of $\lambda$ in $(G, \lambda^G)$ just means that the whole network runs standard BGP, and the modification $\lambda'$ causes divergence. Symmetrically, we can just as easily let $\lambda'$ be the behavior of standard BGP, in which case the modification causes convergence.

Proof. We construct $G$ as follows (Fig. 1). Since $\lambda$ and $\lambda'$ are safely distinct, there is a network $N$ on which their stable states differ. We include in $G$ two copies of $N$ which we call $N$ and $N'$, but with only one instance of the destination 0. By the condition of the theorem, there must exist a $w \in N$ which has differing path selections in the stable states of $(N, \lambda)$ and $(N, \lambda')$. Let $w'$ be the corresponding node in $N'$. We add a new node $x$ connected to $w$ and $w'$. Finally, we add an "oscillator gadget" (a triangle $a, b, c$ with each node connected to the destination 0) and connect $a$ to $x$.

We construct $\lambda^G$ as follows. First, $\lambda^G_v = \lambda_v$ for all $v \in N$. The behavior of $\lambda^G$ on $N'$ is irrelevant, since this is where we will place the partial deployment of $\lambda$ or $\lambda'$.

Second, $\lambda^G_x$ ranks paths as follows. Let $P_1, \ldots, P_k$ be a list of all $w \leadsto 0$ paths in $N$, and let $P'_1, \ldots, P'_k$ be the corresponding $w' \leadsto 0$ paths in $N'$. Without loss of generality, suppose that $P_1$ is $w$'s selected path in the stable state of $(N, \lambda')$, while $w$'s selected path in $(N, \lambda)$ is some other path $P_i$. Then we let

$$\lambda^G_x(xwP_1) > \lambda^G_x(xw'P'_1) > \lambda^G_x(xwP_2) > \lambda^G_x(xw'P'_2) > \cdots > \lambda^G_x(xwP_k) > \lambda^G_x(xw'P'_k) > \lambda^G_x(\varepsilon),$$

with all other paths ranked below $\varepsilon$.
Third and finally, on the oscillator gadget, $\lambda^G$ behaves like the classic Bad Gadget [1]: each of $a, b, c$ will accept one of two paths, the direct path (e.g. $a0$) and the path via its counterclockwise neighbor (e.g. $ab0$), with the latter preferred. However, to this structure we add the fact that $a$ most prefers the path $axwP_i$.

[Figure 1: An SPP instance which converges if and only if $N$ and $N'$ have the same stable state. The ranking function of certain nodes is written in blue next to the node, listing paths from most to least preferred. Multiple copies of the destination 0 are drawn for clarity, but these are in fact the same node. The figure shows the standard BGP network $N$ (node $w$ with paths $P_1, \ldots, P_k$ to 0), the modified BGP network $N'$ (node $w'$ with paths $P'_1, \ldots, P'_k$ to 0), the not-equal gadget (node $x$), and the oscillator gadget (nodes $a, b, c$). Rankings shown: $x$: $xwP_1, xw'P'_1, \ldots, xwP_k, xw'P'_k$; $a$: $axwP_i, ab0, a0$; $b$: $bc0, b0$; $c$: $ca0, c0$.]

With a partial deployment of $\lambda$ on $N'$, the effect of the construction is as follows. Since $(N, \lambda)$ is safe, it must have a single stable state [2, 4], so $N$ and $N'$ will eventually stabilize with corresponding path selections $P_i$ and $P'_i$. Therefore, since $x$ always prefers a path in $N$ over the corresponding path in $N'$, it will eventually select $xwP_i$ permanently, causing $a$ to select the path $axwP_i$, causing $c$ to select $c0$, and $b$ to select $bc0$. Thus, a unique stable state is reached for any activation sequence.

On the other hand, consider a partial deployment of $\lambda'$ on $N'$. Since $(N', \lambda')$ is safe it must have a single stable state [2], which we know must differ from the stable state of $(N, \lambda)$. Thus, after $N$ and $N'$ converge, node $x$ is presented with two different paths, $xwP_i$ and $xw'P'_1$. It will prefer the path $xw'P'_1$ and remain with that selection thereafter. With $a$'s possibility of any more-preferred path via $x$ now eliminated, the nodes $a, b, c$ mimic the Bad Gadget, and have no stable state.
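The behavior of the oscillator gadget in the two cases can be checked by brute force. The following sketch is an illustration under simplifying assumptions, not the paper's construction in full: the route selected by x is held fixed, the path $P_i$ is abbreviated to a single hop from w to 0, and the helper names `best` and `stable_states` are hypothetical. It enumerates every path assignment for a, b, c and keeps those in which each node already holds its best available path.

```python
from itertools import product

EPS = ()  # the null path


def best(v, pi, neighbors, rank):
    """Highest-ranked simple path of the form v w pi(w), or EPS if none is ranked."""
    cands = []
    for w in neighbors[v]:
        p = pi[w]
        if p != EPS and v not in p and (v,) + p in rank[v]:
            cands.append((v,) + p)
    return min(cands, key=rank[v].index, default=EPS)


def stable_states(neighbors, rank, fixed):
    """Enumerate all assignments of the ranked nodes; return the stable ones."""
    free = sorted(rank)
    options = [[EPS] + rank[v] for v in free]
    found = []
    for combo in product(*options):
        pi = {**fixed, **dict(zip(free, combo))}
        if all(pi[v] == best(v, pi, neighbors, rank) for v in free):
            found.append(dict(zip(free, combo)))
    return found


neighbors = {"a": ["0", "b", "c", "x"], "b": ["0", "a", "c"], "c": ["0", "a", "b"]}
rank = {  # most to least preferred, as listed in Figure 1
    "a": [("a", "x", "w", "0"), ("a", "b", "0"), ("a", "0")],
    "b": [("b", "c", "0"), ("b", "0")],
    "c": [("c", "a", "0"), ("c", "0")],
}
dest = {"0": ("0",)}

# x settles on a path through w (same stable state) or through w' (different).
same = stable_states(neighbors, rank, {**dest, "x": ("x", "w", "0")})
differ = stable_states(neighbors, rank, {**dest, "x": ("x", "w'", "0")})
print(len(same), len(differ))  # 1 0
```

When x offers the path through w, the only stable assignment is a on $axwP_i$, b on $bc0$, and c on $c0$; when x offers the path through w', no assignment of a, b, c is stable, matching the argument above.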
Therefore, with a partial deployment of $\lambda'$, this instance has no stable state.

3.2 Full deployment

If the requirement of partial deployment were removed, Theorem 1 would no longer hold. Consider, for example, a modified decision process which simply performs shortest path routing. The theorem shows that a partial deployment of shortest path routing can cause divergence; however, a full deployment will always converge.

But the theorem holds with full deployments if we add a constraint on the modification: it must preserve the expressive power of BGP. One way to formalize this is as follows. The operator of each node $v$ specifies a partial ranking function $\hat{\lambda}_v$ which may assign multiple paths the same value. A decision process is now a function $d$ which, given a partial ranking function $\hat{\lambda}_v$, returns a ranking function $d\hat{\lambda}_v$ consistent with $\hat{\lambda}_v$ (that is, $\hat{\lambda}_v(P_1) > \hat{\lambda}_v(P_2)$ implies $d\hat{\lambda}_v(P_1) > d\hat{\lambda}_v(P_2)$). Intuitively, $d$ breaks any "ties" in $\hat{\lambda}$'s ranking of paths.² We let $d\hat{\lambda}$ refer to the set of ranking functions produced by applying $d$ to the set of partial ranking functions $\hat{\lambda}_v$ over all nodes $v$.

Taking common ISP business relationships as an example, an operator might specify $\hat{\lambda}_v(P) = 100$ for paths through $v$'s providers, $\hat{\lambda}_v(P) = 200$ for paths through peers, and $\hat{\lambda}_v(P) = 300$ for paths through customers. If $v$ has multiple providers, peers, or customers, this will not always yield a unique best path; the decision process $d$ breaks those ties, perhaps by examining path length or other factors. However, the operator can choose to specify an arbitrary ranking function by giving $\hat{\lambda}_v(P)$ a distinct value for each $P$ (in which case $d$ does not affect the outcome). In this sense, any modified decision process preserves BGP's expressiveness.

We can now show a theorem analogous to Theorem 1 without the partial deployment requirement.
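As a concrete reading of the formalization above, a decision process can be modeled as a tie-breaker layered on the operator's partial ranking. The sketch below is a hypothetical illustration: `apply_decision`, the node IDs, and the use of the next-hop ID as a stand-in for a router ID are all assumptions made for the example. It contrasts a "lowest ID" and a "highest ID" tie-break over the same business-relationship values.

```python
def apply_decision(partial_rank, tie_break):
    """Return a function that orders paths from most to least preferred."""
    def order(paths):
        # Sorting on (operator value, tie-break) keeps the result consistent
        # with the operator's partial ranking: a higher value always wins.
        return sorted(paths, key=lambda p: (partial_rank(p), tie_break(p)),
                      reverse=True)
    return order


# Node 9 has customers 1 and 2 and provider 3 (hypothetical IDs).
relationship = {1: 300, 2: 300, 3: 100}
partial_rank = lambda path: relationship[path[1]]

d = apply_decision(partial_rank, tie_break=lambda p: -p[1])       # lowest ID wins
d_prime = apply_decision(partial_rank, tie_break=lambda p: p[1])  # highest ID wins

paths = [(9, 3, 0), (9, 1, 0), (9, 2, 0)]
print(d(paths))        # [(9, 1, 0), (9, 2, 0), (9, 3, 0)]
print(d_prime(paths))  # [(9, 2, 0), (9, 1, 0), (9, 3, 0)]
```

Both orderings keep the provider path last, as the partial ranking demands, and differ only in how the tie between the two customer paths is broken. If the operator assigned a distinct value to every path, the two decision processes would produce the same order.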
The proof is an easy adaptation of our earlier technique: since the partial ranking function can be used to force any total order, we can build the necessary ranking functions even though the modification is deployed at all nodes.

Definition 2. Two decision processes $d, d'$ are safely distinct if there exists a network $N$ and partial ranking functions $\hat{\lambda}$ for which $(N, d\hat{\lambda})$ and $(N, d'\hat{\lambda})$ are safe, but their stable states differ.

Theorem 2. If $d$ and $d'$ are safely distinct, then there exists a network $G$ and partial ranking functions $\hat{\lambda}^G$ in which $(G, d\hat{\lambda}^G)$ is safe, but $(G, d'\hat{\lambda}^G)$ has no stable path assignment.

Proof. Let $N$ and $\hat{\lambda}$ be such that $(N, d\hat{\lambda})$ and $(N, d'\hat{\lambda})$ are safe, but their stable states differ. We construct $G$ based on $N$ as in the proof of Theorem 1. Let $\lambda^G_v$ be the ranking functions constructed in the proof of Theorem 1 with $\lambda = d\hat{\lambda}$ and $\lambda' = d'\hat{\lambda}$. Define the partial ranking functions $\hat{\lambda}^G$ as

$$\hat{\lambda}^G_v = \begin{cases} \lambda^G_v & \text{if } v \in \{x, a, b, c\} \\ d\hat{\lambda}_v & \text{if } v \in N \\ \hat{\lambda}_v & \text{if } v \in N'. \end{cases}$$

Note that applying one of the decision processes to $\hat{\lambda}^G$ can only vary its behavior for $v \in N'$. With this construction, applying the decision process $d$ yields ranking functions $d\hat{\lambda}^G$ which are exactly equivalent to $\lambda^G$ with a partial deployment of $d\hat{\lambda}$ on $N'$. Likewise, $d'\hat{\lambda}^G$ is exactly equivalent to $\lambda^G$ with a partial deployment of $d'\hat{\lambda}$ on $N'$. Therefore, the result follows by the argument of the proof of Theorem 1.

² Recall from the definition of a ranking function that $d\hat{\lambda}_v$ might still have ties, but only between two routes that go through the same neighbor, which we never need to compare.

4. EXTENSIONS

Our results dealt with decision processes which differ in their final stable state.
One can also show that if there is any difference in path selections in two ranking functions $\lambda$ and $\lambda'$ at any moment during the dynamic convergence process, then for a particular activation sequence, a partial deployment of $\lambda'$ will diverge while a partial deployment of $\lambda$ will converge (or vice versa). This involves inserting a gadget (essentially Disagree [1]) between $x$ and $a$ in the construction of Fig. 1, to "remember" that some difference has occurred in the past.

In one sense, this is stronger than our previous results, as it applies even to transient differences between $\lambda$ and $\lambda'$. However, it is less satisfying since it needs an activation sequence that runs $N$ and $N'$ in lockstep. With any variation in timing, even two copies of $\lambda$ could be judged to be different at some moments in time.

One could consider more general models of the BGP decision process, perhaps treating it as a state machine with memory, in order to model features such as route flap damping [6] which change their preferences across time. Since our theorems use $\lambda$ and $\lambda'$ essentially as black boxes, such extensions may be straightforward.

5. ACKNOWLEDGEMENTS

We thank Alex Fabrikant, Michael Schapira, and Scott Shenker for helpful comments.

6. REFERENCES

[1] T. Griffin, F. Shepherd, and G. Wilfong. The stable paths problem and interdomain routing. IEEE/ACM Transactions on Networking, 10(2), April 2002.
[2] Aaron D. Jaggard, Michael Schapira, and Rebecca N. Wright. Distributed computing with adaptive heuristics. In Innovations in Computer Science, January 2011.
[3] Y. Rekhter and T. Li. A border gateway protocol 4 (BGP-4). RFC 1771, March 1995.
[4] R. Sami, M. Schapira, and A. Zohar. Searching for stability in interdomain routing. In IEEE INFOCOM, April 2009.
[5] K. Varadhan, R. Govindan, and D. Estrin. Persistent route oscillations in inter-domain routing. Computer Networks, 32(1):1–16, 2000.
[6] C. Villamizar, R. Chandra, and R. Govindan. BGP route flap damping. RFC 2439, November 1998.
