Locator/identifier split using the data link layer

Victor Grishchenko, Ural State University

October 26, 2021

1 Introduction

The locator/identifier split approach assumes separating the functions of a locator (i.e. a topology-dependent attachment point address) and an identifier (a topology-independent unique identifier), both of which are currently served by an IP address. This work is an attempt to redefine the semantics of the MAC address to make it a pure layer-2 locator instead of a pure globally-unique identifier. Such an exercise might be interesting from the standpoint of Ethernet scaling and Metro Ethernet technologies. From the global routing perspective, the introduction of multihoming, traffic engineering and failover at the 2nd layer may reduce pressure on the 3rd layer.

Historically, an Ethernet network was supposed to be a single wire with many devices attached. For reliable identification, each device has a factory-preset globally-unique 6-byte MAC address. Each Ethernet frame has destination and source MAC addresses. Currently, switched networks are prevalent, so one physical "wire" normally connects just two devices. Layer-2 switched networks are typically divided into flat logical segments (VLANs) where MAC addresses are used as pure identifiers to perform frame forwarding/routing using spanning trees and announce flooding. Interestingly, that behavior is more akin to what was traditionally considered "routing", although ultimately it still emulates shared copper.

The trick is to try to introduce advanced routing options and topologies to the data link layer of a given network, without touching anything at the 3rd layer or end hosts.

2 Redefining MAC

The objective is to ease layer-2 switching in large layer-2 networks by overloading MAC addresses to be pure locators. The IP address thus plays the role of a pure identifier.

As a matter of fact, mesh networks are not effective. The STP protocol, for example, starts by deactivating extra links to turn the topology into a tree. I will consider a tiered architecture where every switch is connected to some uplink switches and some downlink switches/devices. "Horizontally" connected switches are modeled as a single switch ("stacking"). Shortcutting, a weaker form of "horizontal" linking, is modeled as a fictive common uplink switch.

The proposed addressing scheme is a feature-cut of the prefix-bunch architecture. A device of the $i$-th tier has a number of "BigMAC" addresses consisting of $i$ meaningful bytes and $6 - i$ padding zeroes: $\{b_1 \dots b_i\, 0 \dots\}$. More precisely, each such address belongs to some uplink port. Further, the $c$-th downlink port is assigned the addresses $\{b_1 \dots b_i\, c\, 0 \dots\}$. Thus, every BigMAC address corresponds to a downward path from some top-tier switch to the target device. Differently from hierarchical addressing, all the network's addresses combined form not a tree, but a tree-resembling structure I will christen a "branch bunch". The average number of addresses an end host will have is estimated as $N^{\log_2 u / (\log_2 d - \log_2 u)}$, where $N$ is the number of end hosts; $u$ and $d$ are the uplink and downlink fanouts respectively. (So, $\sqrt[4]{N}$ for $u = 2$, $d = 32$.) It is not generally supposed that a device knows all of its addresses.

3 Switching

Obviously, forwarding a frame from an uplink to a downlink is as simple as checking the $(i + 1)$-th byte of the destination address, which contains the number of the egress port. To forward a frame from a downlink up, a switch has to direct the frame to the uplink port that owns the source BigMAC address (strategy α).
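
The addressing scheme and the two forwarding rules described so far can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the names `big_mac`, `Switch`, `uplink_of_prefix`, `forward_down` and `forward_up_alpha` are hypothetical, as are the example port numbers, and the sketch ignores everything except the byte manipulation itself.

```python
from dataclasses import dataclass, field

MAC_LEN = 6  # a MAC address is 6 bytes long


def big_mac(path):
    """Pad a downward path (a sequence of port numbers) with zeroes to 6 bytes."""
    if len(path) > MAC_LEN:
        raise ValueError("path is longer than a MAC address")
    return bytes(path) + bytes(MAC_LEN - len(path))


@dataclass
class Switch:
    # i: the number of meaningful bytes in this switch's own BigMACs
    tier: int
    # Hypothetical table: each own i-byte prefix -> the uplink port that owns it
    uplink_of_prefix: dict = field(default_factory=dict)

    def downlink_addresses(self, c):
        """Addresses {b_1..b_i, c, 0..} assigned to the c-th downlink port."""
        return [big_mac(prefix + bytes([c])) for prefix in self.uplink_of_prefix]

    def forward_down(self, dst):
        """Uplink to downlink: the (i+1)-th byte of dst is the egress port."""
        return dst[self.tier]

    def forward_up_alpha(self, src):
        """Strategy alpha: use the uplink port that owns the source prefix."""
        return self.uplink_of_prefix[src[:self.tier]]


# A tier-1 switch with two uplinks, owning the assumed prefixes {3} and {7}.
sw = Switch(tier=1, uplink_of_prefix={b"\x03": 0, b"\x07": 1})
print(sw.downlink_addresses(5))               # two addresses for downlink port 5
print(sw.forward_down(big_mac([3, 5])))       # egress downlink port 5
print(sw.forward_up_alpha(big_mac([7, 5])))   # egress uplink port 1
```

Note that downward forwarding reads a single byte and needs no table at all, while strategy α still requires a small lookup keyed by the switch's own prefixes.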
The switch might as well rewrite the first $i$ bytes of the source address to forward the frame to an arbitrary uplink, assuming that the top-tier switches are fully interconnected (strategy β). A more sophisticated upward-forwarding strategy is to check the destination address against the available uplink addresses to detect the longest prefix match (strategy γ). However, this functionality is not dirt-cheap to implement, so it is better to shift it away from the switch, see Sec. 4.

Another strategy, δ, may reduce upward forwarding to the same kind of byte-check employed by downward forwarding. Namely, if a typical switch has 24...32 ports and uses a byte of addressing space, then there are 3 spare bits to use. As the number of uplinks is supposedly less than 8, 3 bits of the byte may stand for the number of the uplink the original BigMAC came from, while the remaining 5 denote the downlink port, as before. So, 3 bits of the $(i + 1)$-th byte of the source address denote the egress uplink port.

One more issue is when to forward from downlink to downlink. One criterion is that the first $i$ bytes of the source address are equal to the first $i$ bytes of the destination address.

To preserve compatibility with the end hosts, some MAC address rewriting is needed. By using ARP, end hosts learn the BigMACs of peers (see Sec. 4). To fully control host-to-host traffic paths, we have to set both source and destination BigMACs. So, a customer-edge switch has not only to rewrite the genuine MAC of an end host to a BigMAC, but also to remember which particular source BigMAC to use for a given destination BigMAC. By sacrificing one byte of the BigMAC this is also reduced to a byte-check (left as an exercise).

4 ARP&DHCP

To remain backward compatible with Ethernet+IP end host stacks, a different functioning of ARP and DHCP is needed.
I suppose that all ARP and DHCP traffic is diverted to some dedicated ARP&DHCP server. Possible variants of distributed/tiered implementation are omitted.

All ARP/DHCP broadcasts of end hosts are upward-flooded, i.e. sent to every uplink. Finally, the ARP&DHCP server gets a copy of a request from every possible path, thus passively learning the topology. The total amount of requests is thus $O(N^{1.25})$ for the reference case of $u = 2$, $d = 32$ (that is, $N$ hosts times roughly $\sqrt[4]{N}$ paths per host). A reply travels by a single path to the end host.

As mentioned, the ARP server may do some traffic engineering by sending a reply containing a BigMAC of the target host to a particular BigMAC address of the requesting host, so the edge switch learns the association. This way we may "outsource" the aforementioned longest-prefix matches to a dedicated out-of-band entity and cache them later on (the benefits of γ for the price of δ). That also opens possibilities for load balancing.

Some failover and on-the-fly reconfiguration functionality might be achieved by means of ARP announcements.

5 Conclusion

So, if nothing important was overlooked, branch-bunch locators may bring many gains to the data link layer. Switching logic is dramatically simplified; it needs no routing tables, no associative memory lookups, no longest prefix matches. Scalability is high, as the forwarding-related computational load on a single switch generally does not depend on the size of the network. The network has simple tools for basic traffic engineering: on-the-fly load balancing and failover. Last but not least, the approach preserves backward compatibility.

Anyway, any questions, comments, criticisms and considerations are welcome.
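
As a numeric illustration of the estimates in Sec. 2 and Sec. 4, the following Python sketch evaluates the per-host address count and the resulting number of upward-flooded request copies. The function names and the sample network size are hypothetical, and the request count assumes that every end host sends exactly one broadcast, which is a simplification.

```python
import math


def addresses_per_host(n_hosts, u, d):
    """Average BigMAC count per end host: N ** (log2 u / (log2 d - log2 u))."""
    exponent = math.log2(u) / (math.log2(d) - math.log2(u))
    return n_hosts ** exponent


def flooded_request_copies(n_hosts, u, d):
    """Copies reaching the server if every host broadcasts once (assumption)."""
    return n_hosts * addresses_per_host(n_hosts, u, d)


# Reference case from the text: u = 2, d = 32; the network size is made up.
n = 65536
print(addresses_per_host(n, 2, 32))       # about 16, the fourth root of 65536
print(flooded_request_copies(n, 2, 32))   # about 65536 * 16, i.e. N ** 1.25
```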
