A Radar for the Internet

Matthieu Latapy (1,2), Clémence Magnien (1,2), Frédéric Ouédraogo (1,2,3)
1: UPMC Univ Paris 06, UMR 7606, LIP6, F-75016, Paris, France
2: CNRS, UMR 7606, LIP6, F-75016, Paris, France
3: University of Ouagadougou, LTIC, Ouagadougou, Burkina Faso
Firstname.Lastname@lip6.fr

Abstract

In contrast with most internet topology measurement research, our concern here is not to obtain a map as complete and precise as possible of the whole internet. Instead, we claim that each machine's view of this topology, which we call ego-centered view, is an object worthy of study in itself. We design and implement an ego-centered measurement tool, and perform radar-like measurements consisting of repeated measurements of such views of the internet topology. We conduct long-term (several weeks) and high-speed (one round every few minutes) measurements of this kind from more than one hundred monitors, and we provide the obtained data. We also show that these data may be used to detect events in the dynamics of internet topology.

1 Introduction.

Since the end of the nineties, constructing maps of the internet using traceroute-like measurements has received much attention, see for instance [13, 26, 18, 3, 20, 14, 21, 7, 29, 16, 27]. Such measurements are however partial and they may contain significant bias [19, 6, 8, 9, 17, 4]. As a consequence, much effort is nowadays devoted to the collection of more accurate data [26, 3, 5, 28], but this task is challenging.

In order to avoid these issues and obtain some insight on internet topology dynamics, we use here a radically different approach: we focus on what a given machine sees of the topology around itself, which we call an ego-centered view (it basically is a routing tree measured in a traceroute-like manner).
These ego-centered measurements may be performed very efficiently (typically in minutes, and inducing low network load); it is therefore possible to repeat them in periodic rounds, and obtain in this way information on the dynamics of the topology, at a time scale significantly finer than previous approaches (see for instance [23, 8]). Taking advantage of these strengths, we conduct massive radar-like measurements of the internet. We provide both the measurement tool and the collected data, and show that they reveal interesting features of the observed topology.

2 Measurement framework.

One may use traceroute directly to collect ego-centered views by probing a set of destinations. This approach however has serious drawbacks. First, as detailed in [11] and illustrated in Figure 1, the measurement load is highly unbalanced between nodes and there is much redundancy in the obtained data (intuitively, one probes links close to the monitor much more than others). Even worse, this implies that the obtained information is not homogeneous, and thus much more difficult to analyse rigorously (for instance, the dynamics may seem higher close to the monitor). Finally, though the measurement would intuitively produce a routing tree, the obtained view actually differs significantly from a tree (see for instance [28]). Again, this makes the analysis (visualisation of the data, for instance) more intricate.

In summary, the direct traceroute approach has multiple severe drawbacks. In this section we first design an ego-centered measurement tool that remedies them. We then include it in a radar measurement scheme.

2.1 Ego-centered measurements.
As already discussed in various contexts [12, 11, 10, 24, 22, 27], one may avoid the issues described above by performing tree-like measurements in a backward way: given a set of destinations to probe, one first discovers the last link on the path to each of them, then the previous link on each of these paths, and so on; when two (or more) paths reach the same node then the probing towards all corresponding destinations, except one, stops (see footnote 1). However, as illustrated in Figure 1, naive measurements of this kind encounter serious problems because of routing changes and other events. We provide a solution in the tracetree algorithm below: the tree nodes are not IP addresses anymore, but pairs composed of an IP address (or a star if a timeout occurred) and the TTL at which it was observed (see Figure 1 for an illustration). This is sufficient to ensure that the obtained view is a tree, while keeping the algorithm very simple. It sends only one packet for each link, and thus is optimal. Moreover, each link is discovered exactly once, which gives a homogeneous view of the topology and balances the measurement load.

Footnote 1: Such measurements require the distance towards each destination, which is not trivial [22]; we discuss this in Section 2.2.

Algorithm 1: tracetree algorithm.
Input: set D of destinations, with d ∈ D at distance ttl_d.
    to_probe ← empty queue, to_receive ← ∅, seen ← ∅
    foreach d ∈ D do add (d, ttl_d) to to_probe
    while to_probe not empty or to_receive ≠ ∅ do
α       if to_probe not empty then
            pop (d, ttl) from to_probe and send a probe to it
            add (d, ttl, current_time()) to to_receive
        // here necessarily to_receive ≠ ∅
β       if answer p to a probe to (d, ttl) received then
            // p sent by p.source, reply to a probe to (d, ttl)
            if (d, ttl, _) ∈ to_receive then    // else timeout
                remove (d, ttl, _) from to_receive; print p.source ttl d
                if (p.source, ttl) ∉ seen then
                    add (p.source, ttl) to seen
                    push (d, ttl − 1) in to_probe if ttl > 1
        for (d, ttl, t) ∈ to_receive if timeout exceeded do
            remove (d, ttl, t) from to_receive
            print * ttl d
            push (d, ttl − 1) in to_probe if ttl > 1

From such trees with (IP, TTL) nodes, one obtains a tree on IP addresses by applying the following filter (illustrated in Figure 1; see footnote 2): first merge all nodes of the tree which correspond to a same IP; remove loops (links from an IP to itself); iteratively remove the stars with no successor; merge all the stars which are successor of a same node into a unique star; construct a BFS tree of the obtained graph which leads to a tree on IP addresses (see footnote 3); iteratively remove the leaves which are not the last nodes encountered when probing any destination.

Footnote 2: The measurement would be slightly more efficient if the filter was included directly in tracetree; however, to keep things simple and modular, we preferred to separate the two.
Footnote 3: During the construction of the BFS tree, neighbours of a node are visited in lexicographic order, and stars are visited after IPs.

[Figure 1: seven panels numbered (1) to (7); see caption.]

Figure 1. Typical outputs of various measurement schemes. (1) – Real topology.
a is the monitor, n, o, and p are the destinations. We suppose that l does not answer to probes, that b is a per-destination load balancer, forwarding traffic for n to d, and traffic for o to f, and that e is a per-packet load balancer forwarding packets alternately to i and h. Such situations are frequent in practice. (2) – Measurement with traceroute. Three routes are collected, leading to a higher load on links close to the monitor (represented by thicker lines here). (3) – Naive tree measurement. Because of a route change due to per-packet load balancer e, one obtains a disconnected part. (4) – Measurement with tracetree. Nodes are pairs of IP addresses and TTL, with redundancy in the addresses; one necessarily obtains a tree. (5–7) – Main steps of the filtering process. (5) – Pairs with same IP address are merged and loops are removed; (6) – Appropriate stars are merged and a BFS tree is computed; (7) – Leaves which are not the last node on a path towards a destination are iteratively removed. This is the final output of the filter.

The key point is that the obtained tree is a possible IP routing tree from the monitor to the destinations (similar to a broadcast tree). The obtained tree contains almost as much information as the original tracetree output and has the advantage of being much simpler to analyse.

We evaluated the impact of this filtering on our observations, and found that it was negligible; detailing this is out of the scope of this paper.

Many non-trivial points would deserve more discussion.
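As a rough illustration of Algorithm 1, the following Python sketch replays its logic against a simulated network. It is a simplification under stated assumptions, not the authors' tool: probes are answered synchronously by a hypothetical `probe(dest, ttl)` callable, so the to_receive set and the timeout handling collapse into a `None` reply. The toy routes, the node names and the `silent` set are invented for the example.

```python
from collections import deque


def tracetree(distances, probe):
    """Backward tree probing with (IP, TTL) nodes (synchronous sketch).

    distances: dict destination -> estimated TTL distance.
    probe:     hypothetical callable (dest, ttl) -> answering IP, or None on timeout.
    Returns a list of (source, ttl, dest) records; source is "*" for a timeout.
    """
    to_probe = deque(distances.items())
    seen = set()
    records = []
    while to_probe:
        dest, ttl = to_probe.popleft()
        source = probe(dest, ttl)
        if source is None:
            # Timeout: record a star and always keep going towards the monitor.
            records.append(("*", ttl, dest))
            keep_going = True
        else:
            records.append((source, ttl, dest))
            # Stop probing this destination if the (IP, TTL) pair is already known.
            keep_going = (source, ttl) not in seen
            seen.add((source, ttl))
        if keep_going and ttl > 1:
            to_probe.append((dest, ttl - 1))
    return records


# Toy static routes (assumption): hop number t towards dest is routes[dest][t - 1].
routes = {
    "n": ["b", "d", "g", "n"],
    "o": ["b", "d", "h", "o"],
    "p": ["b", "e", "i", "p"],
}
silent = {"e"}  # routers that never answer probes (assumption)


def probe(dest, ttl):
    hop = routes[dest][ttl - 1]
    return None if hop in silent else hop


records = tracetree({dest: len(path) for dest, path in routes.items()}, probe)
for source, ttl, dest in records:
    print(source, ttl, dest)
```

In this toy run the probing towards o stops at TTL 2, because the pair (d, 2) was already seen on the path towards n, and the silent router e appears as a star at TTL 2.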
These points include the following: one may apply a greedy sending or receiving strategy (by replacing the if at line α or β of Algorithm 1 by a while, respectively); identifying reply packets is non-trivial, as well as extracting the relevant information from the read packets; introducing a delay may be necessary to stay below the maximal ICMP sending rate of the monitor; one may consider answers received after the timeout but before the end of the measurement (whereas we ignore them); one may use other protocols than ICMP (the classical traceroute uses UDP or ICMP packets); the initial order of the destinations may have an impact on the measurement; there may be many choices for the BFS tree in the filter; etc. However, entering into such details is far beyond the scope of this paper, and we refer to the code and its documentation [2] for full details.

2.2 Radar.

With the tracetree tool and its filtered version, we have the ground material to conduct radar measurements: given a monitor and a set of destinations, it suffices to run periodic ego-centered measurements, which we call measurement rounds. The measurement frequency must be high enough to capture interesting dynamics, but low enough to keep the network load reasonable. We will discuss this in the next section.

The only remaining issue is the estimation of distances towards destinations, which is a non-trivial task in general [22]. This plays a key role here, since over-estimated distances lead to several packets hitting destinations. Under-estimated distances, instead, miss the last links towards the destinations.

One may however suppose that the distance between the monitor and any destination generally is stable between consecutive rounds of radar measurement. Then, the distances at a given round are the ones observed during the previous round.
If the distance happens to be under-estimated (we do not see the destination at this distance), then we set it to a default maximal value (generally equal to 30) and start the measurement from there (and we update the corresponding distance for the next round).

3 Measurement and data.

First notice that many parameters (including the monitor and destination set) may have a deep impact on the obtained data. Estimating this impact is a challenging task since testing all combinations of parameters is totally out of reach. In addition, the continuous evolution of the measured object makes it difficult to compare several measurements: the observed changes may be due to parameter modifications or to actual changes in the topology.

To bypass these issues while keeping the study rigorous, we propose the following approach. We first choose a set of seemingly reasonable parameters, which we call base parameters (see Section 3.1). Then we conduct measurements with these parameters from several monitors in parallel. On some monitors, called control monitors, we keep these parameters constant; on others, called test monitors, we alternate periods with base parameters and periods where we change (generally one of) these parameters. Control monitors make it possible to check that the changes observed from test monitors are due to changes of parameters, not to events on the network. The alternation of periods with base parameters and modified ones also makes it possible to confirm this, and to observe the induced changes in the observations. In many cases, it is also possible to simulate what one would have seen in principle if the parameters had stayed unchanged, which gives further insight (we will illustrate this below).
We use a wide set of more than one hundred monitors scattered around the world, provided by PlanetLab [25] and other structures (small companies and individual DSL links) [2]. In order to be as general as possible, and to simplify the destination setup, we use destinations chosen by sampling random valid IP addresses and keeping those answering to ping at the time of the list construction. Other selection procedures would of course make sense (this raises interesting perspectives).

3.1 Our base parameters and data set.

Throughout the paper, the base parameters consist of a set of 3 000 destinations for each monitor, a maximal TTL of 30, a 2-second timeout and a 10-minute delay between rounds. All our measurements were conducted with variations of these parameters; wherever it is not explicitly specified, the parameters were the base ones. We ran measurements continuously during several weeks, with some interruptions due to monitors and/or local network shutdowns. The obtained data is available at [2].

3.2 Influence of parameters.

Using the methodology sketched above, we show here how to rigorously evaluate the influence of various parameters. We focus on a few representative ones only, the key conclusion being that the base parameters described above fit our needs very well.

Figure 2 (left) shows the impact of the inter-round delay: on the rightmost part the delay was significantly reduced, leading to an increase in the observation's time resolution (i.e. more points per unit of time). It is clear from the figure that this has no significant impact on the observed behavior. In particular, the variations in the number of IP addresses seen, though they have a higher resolution after the speed-up, are very similar before and after it.
Moreover, the control monitor shows that the base time scale is relevant, since improving it does not reveal significantly higher dynamics.

Figure 2 (center) shows the impact of the number of destinations. As expected, increasing this number leads to an increase in the number of observed IP addresses. The key point however is that increasing the number of destinations may lead to a relative loss of efficiency: simulations of what we would have seen with 3 000 or 1 000 destinations display a smaller number of IP addresses than direct measurements with these numbers of destinations (the control monitor proves that this is not due to a simultaneous topology change). This is due to the fact that probing towards 10 000 destinations induces too high a network load: since some routers answer to ICMP packets with a limited rate only [15], overloading them makes them invisible to our measurements. Importantly, this does not occur in simulations of 1 000 destination measurements from ones with 3 000, thus showing that the load induced with 3 000 destinations is reasonable in this regard.

[Figure 2: three plots. Left: x = hours, y = # ip (curves: speed-up measurement, control monitor). Center: x = hours, y = # ip (curves: 3000 d., 1000 d., 10000 d., 3000 d. (sim), 1000 d. (sim), control monitor). Right: x = hours, y = round duration (s) (timeouts: 4s, 2s, 1s).]

Figure 2. Impact of measurement parameters. The x axis of all plots represents the time (in hours) since the beginning of the measurement. Left: impact of inter-round delay. Number of distinct IP addresses viewed at each round. The bottom plot corresponds to a control monitor with the base parameters; the other monitor starts with the base parameters, and about 27 hours later we reduce the inter-round delay from 10 minutes to 1 (each ego-centered measurement takes around 4 minutes). Center: impact of the number of destinations. Number of distinct IP addresses viewed at each round. The plot close to y = 10 000 corresponds to a control monitor with the base parameters. The other plain-line plot is produced by a monitor which starts with the base parameters, thus with a destination set D of size 3 000, changes to a set D′ of 10 000 destinations containing D, goes back to D, and finally turns to a subset D′′ of size 1 000 of D. In addition, the dotted plots are simulations of what we would have seen from this monitor with D during the measurement using D′ (obtained by dropping all nodes and links which are only on paths towards destinations that are not in D), and what we would have seen with D′′ during the measurements using D or D′ (obtained similarly). Right: impact of timeout value. Round duration (in seconds). The monitor starts with a timeout value of 4 s, then we change it to 2 s, and finally to 1 s.

Figure 2 (right) shows the impact of the timeout value. As expected, decreasing the timeout leads to a decrease in the round duration. However, it also causes more replies to probe packets to be ignored because we receive them after the timeout. A good value for the timeout is a compromise between the two. We observe that the round duration is only slightly larger with a timeout of 2 s than with a timeout of 1 s (contrary to the change between a timeout of 4 s and 2 s). The base value of the timeout (2 s) therefore seems appropriate, because it is rather large and does not lead to a long round duration.
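The simulated curves of Figure 2 (center) rest on restricting a measured view to a subset of destinations. A minimal Python sketch of that restriction is given below; it assumes that the path from the monitor to each destination has already been reconstructed from the measurement output, which is a representation chosen for the example rather than the format of the released data. The function name `restrict_view` and the toy paths are hypothetical.

```python
def restrict_view(paths, kept_destinations):
    """Keep only the nodes and links lying on paths towards kept destinations.

    paths: dict destination -> list of nodes from the monitor to the destination.
    Returns (nodes, links), with links given as (node, next_node) pairs.
    """
    nodes, links = set(), set()
    for dest, path in paths.items():
        if dest not in kept_destinations:
            continue  # this path contributes nothing to the simulated view
        nodes.update(path)
        links.update(zip(path, path[1:]))
    return nodes, links


# Toy view measured with destination set D' = {n, o, p} (assumption).
paths = {
    "n": ["a", "b", "d", "n"],
    "o": ["a", "b", "f", "o"],
    "p": ["a", "c", "e", "p"],
}

# Simulate what would have been seen with D = {n, p} only.
nodes, links = restrict_view(paths, {"n", "p"})
print(len(nodes), len(links))
```

Nodes shared with a kept path (here a and b) survive the restriction, while f and o, which lie only on the path towards the dropped destination, disappear.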
We also considered other observables (like the number of stars seen at each round, and the number of packets received after the timeout), for measurements obtained from various monitors and towards various destinations; in all cases, the conclusion was the same: the base parameters proposed above meet our requirements.

3.3 Comparison with traceroute.

As explained in Section 2.1, a key goal of our tracetree measurement tool is to perform significantly better than direct use of traceroute in our context. To evaluate this, we compare the difference in the obtained information with traceroute and tracetree, as well as the load they induce on the network (Figure 3, left and center).

First notice that the plot as a function of the number of rounds with traceroute is higher than the one with tracetree, as expected: any traceroute round gathers slightly more data than the corresponding tracetree round (below 1 %, here). It is however much more interesting to compare them in terms of the number of packets sent (reflecting the load induced on the network and our ability to increase the measurement frequency). The plots show that, in this regard, tracetree is much more efficient than direct traceroute measurements: here, tracetree reaches 14 100 distinct IP addresses with around 3 million packets, while traceroute needs around 4.5 million packets.
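The redundancy of a traceroute-style round can be made concrete on a toy example. The Python sketch below counts how many times each link is discovered when every destination is traced independently, and derives the distribution "number of links discovered x times". The paths are invented, and representing a round as one path per destination is an assumption made for the illustration.

```python
from collections import Counter


def traceroute_link_load(paths):
    """Count how many times each link is discovered when tracing every path."""
    load = Counter()
    for path in paths.values():
        load.update(zip(path, path[1:]))
    return load


# Toy routes from monitor a towards four destinations (assumption).
paths = {
    "n": ["a", "b", "d", "n"],
    "o": ["a", "b", "f", "o"],
    "p": ["a", "c", "e", "p"],
    "q": ["a", "b", "d", "q"],
}

load = traceroute_link_load(paths)
total_discoveries = sum(load.values())  # link discoveries made by traceroute
distinct_links = len(load)              # what discovering each link once would yield
distribution = Counter(load.values())   # x -> number of links discovered x times

print(total_discoveries, distinct_links, dict(distribution))
```

Links close to the monitor, such as (a, b), are discovered once per destination routed through them, whereas a scheme that discovers each link exactly once has no such redundancy.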
Recall moreover that the load induced by tracetree is balanced among links, which is not the case for traceroute, see Figure 3 (right). We can see that some links are probed a very high number of times at each round (typically up to 3 000 times if we use 3 000 destinations). See [12, 11, 10] for detailed studies of such effects.

[Figure 3: three plots. Left: x = # rounds, y = # ip (curves: traceroute, tracetree). Center: x = # packets, y = # ip (curves: traceroute, tracetree). Right: link load distribution with traceroute, x = # times probed, y = # links.]

Figure 3. Comparison between traceroute and tracetree. Left and center: number of distinct IP addresses viewed since the beginning with a traceroute measurement (plain lines) and a tracetree measurement simulated from it (dotted lines); left: as a function of the number of rounds; center: as a function of the number of packets sent. To improve readability, we cut the part of the plots corresponding to the 20 first rounds and to the 10^6 first packets, respectively. Right: typical link load distribution with a traceroute ego-centered measurement. For each value x on the horizontal axis, we give the number of links which are discovered x times during a traceroute ego-centered measurement with 3 000 destinations (base value).

[Figure 4: two curves, labelled "each round" and "10 rounds".]

Figure 4. Bottom: number of distinct IP addresses observed during each round of measurement. Top: number of distinct IP addresses observed during series of ten consecutive rounds.
Finally, in addition to the key advantage of providing homogeneous tree ego-centered views of the topology, the tracetree tool also is much more efficient than traceroute in terms of the number of packets sent, thus making it possible to repeatedly run it in radar measurements with a reasonable network cost.

4 Towards event detection.

One key interest of our measurements is that they make it possible to observe the dynamics of the IP internet topology from an ego-centered perspective, at a time scale of a few minutes only. In particular, detecting events in these dynamics, i.e. major changes in the topology, is very appealing from a security and modeling point of view.

A most natural direction to try and detect events is to observe the number of distinct IP addresses seen at each round, as plotted in Figure 4. Clear events indeed appear in such plots, in the form of downward peaks. However, this provides little information, if any: these peaks may be caused by temporary partial or total connectivity losses at the monitor (or close to it), not by important events at the internet level. On the other hand, one may notice that no significant upward peak appears in this plot. Notice that this is a non-trivial fact: from a topological point of view, such peaks would be possible; the fact that they do not occur reflects non-trivial properties of the topology and its dynamics, which we leave for further study.

Interestingly, the plot of the number of distinct IP addresses seen during ten consecutive rounds, Figure 4, has very different characteristics. It exhibits upward peaks (the distribution of observed values, presented in Figure 5, left, confirms that these peaks are statistically significant outliers).
These peaks reveal important changes in the IP addresses observed in consecutive rounds, and thus important routing changes: though the number of observed IP addresses is roughly the same before and after these events, the ego-centered views have changed.

[Figure 5: three panels. Left: x = # ip, y = # series. Center: drawing of islands of new nodes. Right: x = components size, y = # components.]

Figure 5. Left: Distribution of the values of the upper plot in Figure 4. Center: typical islands of appearing nodes. Each node is an IP address; the black ones are the ones observed during the second half of the measurement only, the others being already present in the first half. The square nodes were present in all the (2 200) rounds of measurement. Links are directed from bottom to top, i.e. from the monitor to destinations. The number of rounds necessary to discover all 13 new nodes in the left drawing was 669 rounds (1 306 to 1 974), but only 2 rounds (2 021 and 2 022) were sufficient for the 9 right ones. Notice that 7 connected components of new nodes are displayed: 4 of size 1, 1 of size 4, 1 of size 5, and 1 of size 9. Right: distribution of new node component sizes. For each possible size x (horizontal axis), the number of connected components of new nodes of size x is given.

To illustrate this, we present in Figure 6 a graph obtained by merging ego-centered views measured before and after such an upward peak. We can clearly see that this peak corresponds to a large number of new edges appearing in a specific part of the network, confirming the occurrence of a significant event.
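The contrast between per-round counts and counts over series of ten rounds can be reproduced on synthetic data. The Python sketch below is only an illustration under explicit assumptions: the series are taken as disjoint blocks of ten rounds (the text does not say whether they are disjoint or sliding), and the outlier rule (median plus k times the median absolute deviation) is a choice made here, not the statistical test used for Figure 5. All names and data are hypothetical.

```python
from statistics import median


def distinct_ips_per_series(rounds, size=10):
    """Number of distinct IPs seen in each disjoint block of `size` rounds."""
    return [len(set().union(*rounds[i:i + size]))
            for i in range(0, len(rounds) - size + 1, size)]


def upward_peaks(values, k=5.0):
    """Indices of values far above the median (median + k * MAD rule, assumption)."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    return [i for i, v in enumerate(values) if v > med + k * max(mad, 1)]


# Synthetic radar data: 40 rounds of 100 IPs each; at round 25 a routing change
# replaces 30 of the addresses by new ones, leaving the per-round count unchanged.
stable = {f"ip{i}" for i in range(100)}
rerouted = {f"ip{i}" for i in range(30, 100)} | {f"new{i}" for i in range(30)}
rounds = [stable] * 25 + [rerouted] * 15

per_round = [len(r) for r in rounds]
per_series = distinct_ips_per_series(rounds)

print(per_round[:3], per_series, upward_peaks(per_series))
```

Every single round sees 100 addresses, so the per-round plot is flat, while the block containing the routing change sees the union of old and new addresses and stands out as an upward peak.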
Another approach consists in detecting events occurring during a measurement from round i to round j by comparing it to the measurement from round i − k to round i, which serves as a reference: we consider the IP addresses seen during the period of interest which were not observed in the reference period. We call these IP addresses the new addresses. Our observations show that it is natural to observe such new addresses during any measurement. However, one may expect that events of interest will lead to the appearance of connected groups of such addresses; we therefore propose to compute the connected components composed of new addresses (see footnote 4) as a way to observe these events.

Footnote 4: i.e. maximal sets of new addresses such that there exists a path between any two of them composed only of new addresses.

We display such components in Figure 5 (center), together with their neighborhood. This figure shows clearly that, in some cases, the observed components are non-trivial islands of newly observed nodes, revealing local events in the network. Figure 5 (right) however shows that such non-trivial islands are quite rare: most connected components of new nodes are very small, often reduced to a single node (949 over a total of 1 457 components, in our example). Despite this, some large components appear (the largest one in our example has size 17, and 15 components have size at least 10), thus revealing underlying events of interest.

Another important characteristic of connected components of new addresses is the number of rounds needed to discover all their nodes, defined as the round number at which their last node was discovered minus the round number at which their first node was, plus one. Indeed, short discovery times indicate that all the new nodes under concern probably appeared because of a same event.
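The computation of connected components of new addresses, together with their discovery time as defined above, can be sketched in Python as follows. This is an illustrative implementation under assumptions: each round of the period of interest is given as a set of links between IP addresses, rounds are listed in increasing order, and stars have already been filtered out. The function name and the toy data are hypothetical.

```python
def new_address_components(reference, period):
    """Connected components of new addresses and their discovery times.

    reference: set of IPs seen during the reference period.
    period:    list of (round_number, links) in increasing round order,
               links being (ip, ip) pairs seen during that round.
    Returns a list of (component, discovery_rounds), largest component first.
    """
    first_seen, adjacency = {}, {}
    for round_number, links in period:
        for u, v in links:
            for ip in (u, v):
                if ip not in reference:
                    first_seen.setdefault(ip, round_number)
                    adjacency.setdefault(ip, set())
            # Only links between two new addresses connect them in a component.
            if u not in reference and v not in reference:
                adjacency[u].add(v)
                adjacency[v].add(u)

    components, visited = [], set()
    for start in adjacency:
        if start in visited:
            continue
        stack, component = [start], set()
        visited.add(start)
        while stack:  # depth-first traversal restricted to new addresses
            node = stack.pop()
            component.add(node)
            for neighbour in adjacency[node] - visited:
                visited.add(neighbour)
                stack.append(neighbour)
        seen_at = [first_seen[ip] for ip in component]
        components.append((component, max(seen_at) - min(seen_at) + 1))
    return sorted(components, key=lambda c: len(c[0]), reverse=True)


reference = {"a", "b", "c"}
period = [
    (11, {("a", "b"), ("b", "x")}),
    (12, {("b", "x"), ("x", "y"), ("c", "z")}),
    (15, {("y", "w"), ("a", "c")}),
]
result = new_address_components(reference, period)
print(result)
```

In this toy period, x, y and w form one island discovered over rounds 11 to 15, hence 5 rounds, while z is an isolated new address discovered within a single round.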
Large discovery times, instead, suggest that several events (located close to each other in the network) occurred. The examples in Figure 5 (center) show that both cases occur.

The distribution of the number of rounds needed to discover each component of new nodes (not represented here) is very heterogeneous, with many components discovered very rapidly and others much more slowly. This gives little information, however, as the discovery time may depend strongly on the component size. Studying the correlations between the two (not represented here) confirms this, but it also shows that some large connected components are discovered very rapidly.

The two approaches we described point out specific moments at which events occurred; one may then observe the data more closely, in order to investigate the nature of these events. We leave this for further research.

Figure 6. Representation of the event at round 106231 in Figure 4: the graph is obtained by merging 100 rounds before the event together with a single round after the event. Edges in bold black are edges that were seen in the round after the event but not in the 100 rounds before.

5 Conclusion and perspectives.

In this paper, we propose, implement, and illustrate a new measurement approach which makes it possible to study the dynamics of IP-level internet topology at a time scale of a few minutes. We provide a rich dataset consisting of radar measurements from more than one hundred monitors towards thousands of destinations, conducted continuously for several weeks.

The most important direction for further research is of course the analysis of collected data. A particularly appealing goal is the detection of events in the dynamics of the observed topology; this raises difficult fundamental questions, such as the characterization of normal dynamics, or the identification of relevant time scales for the observation.
Other promising directions include visualizing the observed dynamics, and conducting more radar measurements to gain a deeper insight (for instance, one could conduct simultaneous measurements from several monitors to observe the dynamics from different viewpoints).

Acknowledgments.

We warmly thank the PhD students and other colleagues of the LIP6, in particular Guillaume Valadon, Renata Teixeira, and Brice Augustin who provided great insight during this work. Likewise, we thank Benoît Donnet, who helped much with the references and also provided useful comments. Many interesting discussions within the METROSEC project [1] also played a key role in our work. We also thank all the people who provided monitors to us, in particular the PlanetLab staff [25], Frédéric Aidouni, Julien Aussibal, Prof. Hiroshi Esaki (WIDE), Jean-Charles de Longueville (Hellea) and Sébastien Wacquiez (Enix); no such work would be possible without their help. This work was funded in part by the METROSEC and AGRI projects.

References

[1] Metrosec project. http://www2.laas.fr/METROSEC/.
[2] Supplementary material (programs and data). http://www-rp.lip6.fr/~latapy/Radar/.
[3] traceroute@home project. http://trhome.sourceforge.net/.
[4] D. Achlioptas, A. Clauset, D. Kempe, and C. Moore. On the bias of traceroute sampling. In Proc. ACM STOC, 2005.
[5] B. Augustin, T. Friedman, and R. Teixeira. Multipath Tracing with Paris Traceroute. In Proc. Workshop on End-to-End Monitoring, E2EMON, May 2007.
[6] P. Barford, A. Bestavros, J. Byers, and M. Crovella. On the marginal utility of network topology measurements. In Proc. ACM SIGCOMM Internet Measurement Workshop (IMW), Nov. 2001.
[7] S. Branigan, H. Burch, W. R. Cheswick, and F. Wojcik. What can you do with traceroute? IEEE Internet Computing, 5(5), Sept./Oct. 2001.
[8] Q. Chen, H. Chang, R. Govindan, S. Jamin, S. Shenker, and W. Willinger. The origin of power laws in internet topologies revisited. In Proc. IEEE INFOCOM, Jun. 2002.
[9] L. Dall'Asta, J. I. Alvarez-Hamelin, A. Barrat, A. Vázquez, and A. Vespignani. Exploring networks with traceroute-like probes: Theory and simulations. Theor. Comput. Sci., 355(1):6–24, 2006.
[10] B. Donnet, P. Raoult, and T. Friedman. Efficient route tracing from a single source. cs.NI 0605133, arXiv, May 2006.
[11] B. Donnet, P. Raoult, T. Friedman, and M. Crovella. Efficient algorithms for large-scale topology discovery. In Proc. ACM SIGMETRICS, Jun. 2005.
[12] B. Donnet, P. Raoult, T. Friedman, and M. Crovella. Deployment of an algorithm for large-scale topology discovery. IEEE Journal on Selected Areas in Communications, Sampling the Internet: Techniques and Applications, 24(12):2210–2220, Dec. 2006.
[13] M. Faloutsos, P. Faloutsos, and C. Faloutsos. On power-law relationships of the internet topology. In Proc. ACM SIGCOMM, 1999.
[14] F. Georgatos, F. Gruber, D. Karrenberg, M. Santcroos, A. Susanj, H. Uijterwaal, and R. Wilhelm. Providing active measurements as a regular service for ISPs. In Proc. of Passive and Active Measurement Workshop, 2001. See also the RIPE NCC TTM service: http://www.ripe.net/test-traffic/.
[15] R. Govindan and V. Paxson. Estimating router ICMP generation delays. In Proc. of Passive and Active Measurement Workshop, March 2002.
[16] R. Govindan and H. Tangmunarunkit. Heuristics for internet map discovery. In Proc. IEEE INFOCOM, Mar. 2000.
[17] J. L. Guillaume and M. Latapy. Relevance of massively distributed explorations of the internet topology: Simulation results. In Proc. IEEE INFOCOM, Mar. 2005.
[18] B. Huffaker, D. Plummer, D. Moore, and k. claffy. Topology discovery by active probing. In Proc. Symposium on Applications and the Internet, Jan. 2002.
[19] A. Lakhina, J. Byers, M. Crovella, and P. Xie. Sampling biases in IP topology measurements. In Proc. IEEE INFOCOM, Apr. 2003.
[20] M. Luckie. IPv6 scamper, 2005. WAND Network Research Group. See http://www.wand.net.nz/~mjl12/ipv6-scamper/.
[21] A. McGregor, H.-W. Braun, and J. Brown. The NLANR network analysis infrastructure. IEEE Communications Mag., 38(5):122–128, May 2000. See also the NLANR AMP project: http://watt.nlanr.net/.
[22] T. Moors. Streamlining traceroute by estimating path lengths. In Proc. IEEE International Workshop on IP Operations and Management (IPOM), Oct. 2004.
[23] R. Oliveira, B. Zhang, and L. Zhang. Observing the evolution of internet AS topology. In Proc. ACM SIGCOMM, 2007.
[24] J.-J. Pansiot. Local and dynamic analysis of internet multicast router topology. Annales des télécommunications, 62:408–425, 2007.
[25] PlanetLab Consortium. PlanetLab project, 2002. See http://www.planet-lab.org.
[26] Y. Shavitt and E. Shir. DIMES: Let the internet measure itself. ACM SIGCOMM Computer Communication Review, 35(5):71–74, October 2005.
[27] N. Spring, D. Wetherall, and T. Anderson. Scriptroute: A public internet measurement facility. In Proc. 4th USENIX Symposium on Internet Technologies and Systems, 2002. See also http://www.cs.washington.edu/research/networking/scriptroute/.
[28] F. Viger, B. Augustin, X. Cuvellier, C. Magnien, M. Latapy, T. Friedman, and R. Teixeira. Detection, understanding, and prevention of traceroute measurement artifacts. Computer Networks, 52, 2008.
[29] D. G. Waddington, F. Chang, R. Viswanathan, and B. Yao. Topology discovery for public IPv6 networks. ACM SIGCOMM Computer Communication Review, 33(3):59–68, Jul. 2003.
