Can User-Level Probing Detect and Diagnose Common Home-WLAN Pathologies?

Partha Kanuparthy †∗, Constantine Dovrolis †, Konstantina Papagiannaki ‡, Srinivasan Seshan §, Peter Steenkiste §
† Georgia Institute of Technology  ‡ Telefonica Research  § Carnegie Mellon University

Abstract

Common Wireless LAN (WLAN) pathologies include low signal-to-noise ratio, congestion, hidden terminals or interference from non-802.11 devices and phenomena. Prior work has focused on the detection and diagnosis of such problems using layer-2 information from 802.11 devices and special-purpose access points and monitors, which may not be generally available. Here, we investigate a user-level approach: is it possible to detect and diagnose 802.11 pathologies with strictly user-level active probing, without any cooperation from, and without any visibility in, layer-2 devices? In this paper, we present preliminary but promising results indicating that such diagnostics are feasible.

1 Introduction

Most home networks today use an 802.11 Wireless LAN (WLAN) with a single Access Point (AP), typically operating in Distributed Coordination Function (DCF) mode. Home WLANs often suffer from various performance pathologies, such as low signal strength, significant noise, interference from external non-802.11 devices and physical phenomena, various forms of fading, hidden terminals from devices in the same WLAN or in nearby WLANs, or congestion. These pathologies can result in throughput degradation, significant jitter and packet losses. To make things worse, due to the wireless nature of the medium, troubleshooting WLAN performance is hard even for experts, let alone home users.

User-level probing is a well-established research area in wired networks and it is used in practice to infer various properties and problems in such networks.
In the wireless domain, on the other hand, it is still unclear whether user-level probing can be nearly as effective. A main motivation behind this work is to answer the following "intellectual curiosity" question: is it possible to diagnose common WLAN performance problems using active probing, without any information from, or modifications to, the 802.11 devices or AP? The methods presented in this paper show promising (but preliminary) results, potentially opening a new research thread within the area of wireless networks.

∗ Contact author: partha@cc.gatech.edu

Specifically, our objective is to construct a user-level tool for any 802.11 DCF WLAN that can detect: a) low Signal-to-Noise Ratio (SNR), b) Hidden Terminals (HT), or c) congestion. The methodology we propose, referred to as WLAN-probe, is a simple, easy-to-use, client-server application that eliminates the need for vendor-specific network card (NIC), driver, AP, monitoring devices, or network modifications. It is also portable across platforms, since it only requires a user-level socket library (e.g., Berkeley sockets, POSIX, or Winsock APIs).

There are several reasons for a user-level probing tool:

Usability: We want to build a diagnostic tool that would not require the user to install a specific NIC, AP, or modify the kernel (moreover, it would not require administrative privileges). The user would just run a single instance of the WLAN-probe client at the wireless link that appears problematic.

Hardware-agnostic: Most wireless cards today export some form of signal strength; for example, the Received Signal Strength Indicator (RSSI). RSSI implementations are vendor-specific and they are not uniform across NICs. A user-level approach avoids the need to calibrate NIC statistics across devices and drivers on different OSes.

Software-agnostic: A user-level approach eliminates the need to write and maintain a hardware-compatibility layer for different OSes that would expose NIC statistics at user-space. An example of that approach is WRAPI [1], designed to work on Windows XP with NICs supporting NDIS 5.1 drivers.

Passive inference: Understanding active probing in the wireless domain may also enable methods for passive inference. For example, is it possible to troubleshoot client performance at a remote web or video server using strictly application traffic?

State-of-the-art diagnosis tools require (or modify) vendor-specific drivers and NICs, or they require special monitors at the home network; we cover these approaches in the related work section.

WLAN-probe is based on two fundamental effects: a) the fact that low-SNR and HTs cause a dependency between packet size and retransmission probability, while congestion does not do so, and b) the fact that low-SNR conditions differ significantly from HTs in the delay or loss temporal correlations they create. However, measuring layer-2 retransmissions and delays is not feasible without information from the link layer. In this short paper, we present the basic ideas and algorithms for user-level inference of link layer effects, with a limited testbed evaluation. In future work, we will conduct a more extensive evaluation, experiment with actual deployment at several home networks, and expand the set of diagnosed pathologies.

We consider the following architecture, which is typical for most home WLANs (see Figure 1). A single 802.11 AP is used to interconnect a number of wireless devices; we do not make any assumptions about the exact type of the 802.11 devices or AP. We assume that another computer, used as our WLAN-probe measurement server S, is connected to the AP through an Ethernet connection.
This is not difficult in practice given that most APs provide an Ethernet port, as long as the user has at least two computers at home. The key requirement for the server S and its connection to the WLAN AP is that it should not introduce significant jitter (say more than 1-3 msec). The server S allows us to probe the WLAN channel without demanding ping-like replies from the AP and without distorting the forward-path measurements with reverse-path responses. The measurements can be conducted either from the WLAN-probe client C to S or from S to C to allow diagnosis of both channel directions; we focus on the former. Note that some APs or terminals that are not a part of our WLAN may be nearby (e.g., in other home networks) creating hidden terminals and/or interference, while the user has no control over these networks.

We have conducted all experiments in this paper using a testbed that consists of 802.11g Soekris net4826 nodes with mini-PCI interfaces. The mini-PCI interfaces host either an Atheros chipset or an Intel 2915ABG chipset, with the MadWiFi and ipw2200 drivers respectively (on the Linux 2.6.21 kernel). The MadWiFi driver allows us to choose between four rate adaptation modules. We disable the optional MadWiFi features referred to as fast frames and bursting because they are specific to MadWiFi's Super-G implementation and they can interfere with the proposed rate inference process. The testbed is housed in the College of Computing at Georgia Tech, and the geography is shown in Figure 2.

Figure 1: System architecture. (Labels: wired network, AP, clients, server, WLAN-Probe client, WLAN-Probe server.)

Figure 2: Testbed layout. (Legend: wireless node, metal; scale: 12 m.)

Related work

There is significant prior work in the area of WLAN monitoring and diagnosis.
However, to the best of our knowledge, there is no earlier attempt to diagnose WLAN problems using exclusively user-level active probing, without any information from 802.11 devices and other layer-2 monitors. User-level active probing has been used to estimate conflict graphs and hidden terminals, assuming that the involved devices cooperate in the detection of hidden terminals [5, 18, 19]. Instead, with WLAN-probe, hidden terminals may not participate in the detection process (and they may be located in different WLANs). Passive measurements have also been used for the construction of conflict graphs [6, 11, 24]. Earlier systems require multiple 802.11 monitoring devices [8, 9, 17], NIC-specific or driver-level support for layer-2 information [7, 23], and network configuration data [4]. Model-based approaches use transmission observations from the NIC to predict interference [14, 16, 21, 22]. Signal processing-based approaches decode PHY signals to identify the type of interference [15]; some commercial spectrum analyzers [2, 3] deploy such monitoring devices at vantage points.

2 Wireless Access Delay

The proposed diagnostics are based on a certain component of a probing packet's One-Way Delay (OWD), referred to as wireless access delay or simply access delay. Intuitively, this term captures the following delay components that a packet encounters at an 802.11 link: a) waiting for the channel to become available, b) a (variable) backoff window before its transmission, c) the transmission delay of potential retransmissions, and d) certain constant delays (DIFS, SIFS, transmission of ACKs, etc). The access delay does not include the potential queueing delay at the sender due to the transmission of earlier packets, nor the latency of the first transmission of the packet.
The access delay captures important properties of the link layer delays which allow us to distinguish between pathologies; further, we can estimate it with user-level measurements.

Before we define the wireless access delay more precisely, let us group the various components of the OWD $d_i$ of a packet $i$ from C to S (see Figure 1) into four delay components. We assume that the link between the AP and S does not cause queueing delays.

First, packet $i$ may have to wait at the sender NIC's transmission queue for the successful transmission of packet $i-1$; this is due to the FCFS nature of that queue and it does not depend on the 802.11 protocol. If the time-distance ("gap") between the arrival of the two packets at the sender's queue is $g_i$, packet $i$ will have to wait for $w_i$ before it is available for transmission at the head of that queue, where:

$$w_i = \max\{d_{i-1} - g_i,\ 0\} \qquad (1)$$

We can estimate $w_i$ only if packet $i-1$ has not been lost; otherwise we cannot estimate the access delay for packet $i$.

The second delay component is the first (and potentially last) transmission delay of packet $i$. In 802.11, packets may be retransmitted several times and each transmission can be at a different layer-2 rate in general. The ratio $s_i / r_{i,1}$ represents the first transmission's delay, where $s_i$ is the size of the packet (including the 802.11 header and the frame-check sequence) and $r_{i,1}$ is the layer-2 rate of the first transmission; we focus on the estimation of $r_{i,1}$ in the next section.

The third delay component $c$ includes various constant latencies during the first transmission of a packet; without going into the details (which are available in longer descriptions of the 802.11 standard), these latencies include various DIFS/SIFS segments and the layer-2 ACK transmission delay (which is always at the same rate).

Finally, there is a variable delay component $\beta_i$.
When the packet is transmitted only once, $\beta_i$ consists of the waiting time ("busy-wait") for the 802.11 channel to become available as well as a random backoff window (uniformly distributed in a certain number of time slots). If the packet has to be transmitted more than once, $\beta_i$ also includes all the additional delays because of subsequent retransmission latencies, busy-wait, backoff times and constant latencies. These delay components are illustrated in Figure 4.

We define the wireless access delay $a_i$ as

$$a_i = c + \beta_i \qquad (2)$$

and so it can be estimated from the OWD as

$$a_i = d_i - w_i - \frac{s_i}{r_{i,1}} \qquad (3)$$

where $w_i$ is derived from Equation 1.

Another way to think about the wireless access delay is as follows. Suppose that we compare the OWD of a packet that traverses an 802.11 link with the OWD of an equal-sized packet that goes through a work-conserving FCFS queue with constant service rate $r$ (e.g., a DSL or a switched Ethernet port). The OWD of the latter would include the sender waiting time $w_i$ and the transmission latency $s_i / r$. In that case the term $a_i$ would only consist of the queueing delay due to cross traffic that arrived at the link before packet $i$. In the case of 802.11, the link is not work-conserving (packets may need to wait even if the channel is available), the transmission rate can change across packets, and there may be retransmissions of the same packet. Thus, the wireless access delay captures not only the delays due to cross traffic, but also all the additional delays due to the idiosyncrasies of the wireless channel and the 802.11 protocol. A significant increase in the access delay of a packet implies either long busy-waiting times due to cross traffic, or problematic wireless channel conditions due to low SNR, interference etc.
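
As a minimal sketch (not the WLAN-probe implementation), Equations 1 and 3 could be applied to one train as follows. The function name, the argument layout, the use of `None` for a lost packet, and the choice of a zero waiting time for the first packet of a train are all illustrative assumptions; delays and gaps are taken to be in seconds, sizes in bits and rates in bits per second.

```python
from typing import List, Optional


def access_delays(
    owd: List[Optional[float]],  # d_i in seconds; None if packet i was lost
    gap: List[float],            # g_i: gap between packets i-1 and i at the sender queue
    size_bits: List[int],        # s_i: frame size in bits
    rate_bps: List[float],       # r_{i,1}: first-transmission rate in bits/second
) -> List[Optional[float]]:
    """Return a_i = d_i - w_i - s_i / r_{i,1}, or None where it cannot be estimated."""
    result: List[Optional[float]] = []
    for i, d in enumerate(owd):
        if d is None:
            result.append(None)  # packet i itself was lost
            continue
        if i == 0:
            w = 0.0              # assumption: the first packet finds an empty queue
        elif owd[i - 1] is None:
            result.append(None)  # w_i is unknown because packet i-1 was lost
            continue
        else:
            w = max(owd[i - 1] - gap[i], 0.0)               # Equation 1
        result.append(d - w - size_bits[i] / rate_bps[i])   # Equation 3
    return result
```

Packets whose predecessor was lost are returned as `None`, mirroring the remark after Equation 1 that the access delay cannot be estimated in that case.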
In the following sections we examine the information that can be extracted from either temporal correlations in the access delay, or from the dependencies between access delay and packet size. It should be noted that the access delay can have additional applications in other wireless network inference problems (such as available bandwidth estimation), which we plan to investigate in future work.

Diagnosis tree and probing structure

Having defined the key metric in the proposed method, we now present an overview of the WLAN-probe diagnosis tree that allows us to distinguish between pathologies (see Figure 3). We start by analyzing each packet train separately, and use a novel dispersion-based method to infer the per-packet layer-2 transmission rate, when possible (Section 3). Based on the inferred rates, we can estimate the wireless access delay for each packet. We then examine whether the access delays increase with the packet size (Section 4). When this is not the case, the WLAN pathology is diagnosed as congestion. On the other hand, when the access delays increase with the packet size, the observed pathology is due to low SNR or hidden terminals. We distinguish between these two pathologies based on temporal correlation properties of packets that either encountered very large access delays or that were lost at layer-3 (Section 5).

Figure 3: WLAN-probe decision tree. (Estimate per-packet L2 transmission rate and access delay. Access delay increasing with packet size? N: congestion. Y: large access delay/loss after large access delay/loss? Y: low SNR. N: symmetric hidden terminals.)

To conduct the previous diagnosis tests, we need to probe the WLAN channel with multiple packet trains and with packets of different sizes. Each train provides a unique "sample"; we need multiple samples to make any statistical inference.
Each train consists of several back-to-back packets of different sizes. The packets have to be transmitted back-to-back so that we can use dispersion-based rate inference methods, and they have to be of different sizes so that we can examine the presence of an increasing trend between access delay and size.

Specifically, the probing phase consists of 100 trains of back-to-back UDP packets. These packet trains are sent from the WLAN-probe client C to the WLAN-probe server S. The packets are timestamped at C and S so that we can measure their relative One-Way Delay (OWD) variations. The two hosts do not need to have synchronized clocks, and we compensate for clock skew during each train by subtracting the minimum OWD in that train. The send/receive timestamps are obtained at user-level. There is an idle time of one second between successive packet trains. Each train consists of 50 packets of different sizes. About 10% of the packets, randomly chosen, are of the minimum-possible size (8 bytes for a sequence number and a send-timestamp, together with the UDP/IP headers) and they are referred to as tiny-probes; they play a special role in transmission rate inference (see Section 3). The size of the remaining packets is uniformly selected from the set of values $\{8 + 200 \times k,\ k = 1 \ldots 7\}$ bytes.

3 Transmission Rate Inference

The computation of the wireless access delay requires the estimation of the rate $r_{i,1}$ for the first transmission of each probing packet. Even though capacity estimation using packet-pair dispersion techniques in wired networks has been studied extensively [10, 13], the accuracy of those methods in the wireless context has been repeatedly questioned [20]. There are three reasons that capacity estimation is much harder in the wireless context and in 802.11 WLANs in particular.
First, different packets can be transmitted at different rates (i.e., time-varying capacity). Second, the channel is not work-conserving, i.e., there may be idle times even though one or more terminals have packets to send. Third, potential layer-2 retransmissions increase the dispersion between packet pairs, leading to underestimation errors.

On the other hand, there are two positive factors in the problem of 802.11 transmission rate inference. First, there are only a few standardized transmission rates, and so instead of estimating an arbitrary value we can select one out of eight possible rates. Second, most (but not all) 802.11 rate adaptation modules show strong temporal correlations in the transmission rate of back-to-back packets. In the following, we propose a transmission rate inference method for 802.11 WLANs. Even though the basic idea of the method is based on packet-pair probing, the method is novel because it addresses the previous three challenges, exploiting these two positive factors.

Approach: Recall that WLAN-probe sends many packet trains from C to S, and each train consists of 50 back-to-back probing packets (i.e., 49 packet-pairs). Consider the packets $i-1$ and $i$ for a certain train; we aim to estimate the rate $r_{i,1}$ for the first transmission of packet $i$ given the "dispersion" (or interarrival) $\Delta_i$ between the two packets at the receiver S. Of course this is possible only when neither of these two packets is lost (at layer-3). Further, we require that packet $i$ is not a "tiny-probe". Let us first assume that packet $i$ was transmitted only once. In the case of 802.11, and under the assumption of no retransmissions for packet $i$, the dispersion can be written as:

$$\Delta_i = \frac{s_i}{r_{i,1}} + c + \beta_i \qquad (4)$$

using the notation of the previous section.
To estimate $r_{i,1}$, we first need to subtract from $\Delta_i$ the constant latency term $c$ and the variable delay term $\beta_i$, which captures the waiting time for the channel to become available and a uniformly random backoff period. The sum of these two terms $c + \beta_i$ is estimated using the tiny-probes; recall that their IP-layer size is only 8 bytes and so their transmission latency is small compared to the transmission latency for the rest of the probing packets. On the other hand, the tiny-probes still experience the same constant latency $c$ as larger packets, and their variable delay $\beta$ follows the same distribution as that of larger probing packets (because the channel waiting time, or the backoff time, do not depend on the size of the transmitted packet). So, considering only those packet-pairs in which the second packet is a tiny-probe, we measure the median dispersion $\Delta_{tiny}$. This median is used as a rough estimate of the sum $c + \beta_i$, when packet $i$ is not a tiny-probe (see footnote 1). We then estimate the transmission rate $r_{i,1}$ as:

$$r_{i,1} = \frac{s_i}{\Delta_i - \Delta_{tiny}} \qquad (5)$$

Figure 4: Timeline of an 802.11 packet transmission showing access delays. (Panels: (a) busy channel, (b) channel with bit-errors.)

If the $i$'th packet was retransmitted one or more times, the dispersion $\Delta_i$ will be larger than $s_i / r_{i,1} + c + \beta_i$ and the rate will be underestimated. A first check is to examine whether the estimated $r_{i,1}$ is significantly smaller than the lowest possible 802.11 transmission rate (1 Mbps). In that case, we reject the estimate $r_{i,1}$ and flag that packet. Of course it is possible that some remaining packets have been retransmitted, but without being flagged at this point.
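
The estimate of Equation 5 and the first sanity check could be sketched as follows. This is only an illustration under stated assumptions: the function and constant names are hypothetical, `None` marks a dispersion that is unavailable because one packet of the pair was lost, and the text does not quantify "significantly smaller", so the factor of one half used here is a placeholder. The guard against a non-positive denominator is also our addition.

```python
from statistics import median
from typing import List, Optional

LOWEST_RATE_BPS = 1e6  # lowest 802.11 rate mentioned in the text (1 Mbps)
REJECT_FACTOR = 0.5    # assumption: "significantly smaller" means below half of 1 Mbps


def raw_rates(
    dispersion: List[Optional[float]],  # Delta_i in seconds; None if the pair is incomplete
    size_bits: List[int],               # s_i in bits
    is_tiny: List[bool],                # True where packet i is a tiny-probe
) -> List[Optional[float]]:
    """Return r_{i,1} from Equation 5 in bits/second, or None for unusable packets."""
    tiny = [d for d, t in zip(dispersion, is_tiny) if t and d is not None]
    if not tiny:
        return [None] * len(dispersion)  # no tiny-probe pairs to calibrate against
    delta_tiny = median(tiny)            # rough estimate of c + beta_i

    rates: List[Optional[float]] = []
    for d, s, t in zip(dispersion, size_bits, is_tiny):
        if t or d is None or d <= delta_tiny:
            rates.append(None)           # Equation 5 is not applicable to this packet
            continue
        r = s / (d - delta_tiny)         # Equation 5
        # Reject estimates far below the lowest rate (likely retransmissions).
        rates.append(r if r >= REJECT_FACTOR * LOWEST_RATE_BPS else None)
    return rates
```

A `None` entry plays the role of a flagged packet: no raw estimate is kept for it at this stage.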
We also flag all tiny-probes, as well as any packet $i$ if packet $i-1$ was lost.

The next step is to map each remaining estimate $r_{i,1}$ to the nearest standardized 802.11 transmission rate $\hat{r}_{i,1}$. For instance, if $r_{i,1}$ = 10.5 Mbps, the nearest 802.11 rate is 11 Mbps (note that this transmission rate applies to the 802.11 frame and so $s_i$ has to include the layer-2 headers).

We also exploit the temporal correlations between the transmission rate of successive packets (within the same train) to improve the existing estimates and to produce an estimate for all flagged packets. We have experimented with the four rate adaptation modules available in the MadWiFi driver used with the Atheros chipset (SampleRate, AMRR, Onoe and Minstrel). Figure 5 (top) shows the fraction of probing packets in a train that were transmitted at the most common transmission rate during that train, under three different channel conditions. These results were obtained from 100 experiments with 50-packet trains; we also show the Wilcoxon 95% confidence interval in each case. Note that all rate adaptation modules exhibit strong temporal correlations, while three of them (AMRR, Minstrel and Onoe) seem to use a single rate for all packets during a train (each train lasts for 5-250 msec, depending on the transmission rate).

Footnote 1: This estimate is revised in the last stage of the algorithm, after we have obtained a first estimate for the transmission rate during a train. We then estimate the transmission latency of each tiny-probe and subtract it from its measured dispersion.

Based on the previous strong temporal correlations, we compute the mode $\tilde{r}$ (most common value) of the discrete $\hat{r}_{i,1}$ estimates. If the mode includes less than a fraction (30%) of the measurements, we reject that packet train as too noisy.
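
The mapping to standard rates and the per-train mode could be sketched as below. Two points are assumptions rather than details taken from the text: the candidate rates are not listed in the paper, so the sketch uses the standard 802.11b/g rate set, and the 30% fraction is computed over the packets that still have a raw estimate. All names are illustrative.

```python
from collections import Counter
from typing import List, Optional

# Assumption: standard 802.11b/g rates in Mbps; the text does not list its candidates.
STANDARD_RATES_MBPS = [1, 2, 5.5, 6, 9, 11, 12, 18, 24, 36, 48, 54]
MIN_MODE_FRACTION = 0.30  # threshold quoted in the text


def nearest_rate(raw_mbps: float) -> float:
    """Snap a raw estimate to the closest standard rate."""
    return min(STANDARD_RATES_MBPS, key=lambda r: abs(r - raw_mbps))


def train_mode(raw_mbps: List[Optional[float]]) -> Optional[float]:
    """Return the modal snapped rate of a train, or None if the train is too noisy."""
    snapped = [nearest_rate(r) for r in raw_mbps if r is not None]  # None = flagged packet
    if not snapped:
        return None
    rate, count = Counter(snapped).most_common(1)[0]
    return rate if count / len(snapped) >= MIN_MODE_FRACTION else None
```

With this table, `nearest_rate(10.5)` yields 11, matching the example in the text; a train for which `train_mode` returns `None` corresponds to a train rejected as too noisy.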
When the mode is strong enough, we replace every estimate $\hat{r}_{i,1}$, and the estimate for every flagged packet, with $\tilde{r}$. If most trains show weak modes (i.e., a mode with less than 30% of the measurements), we abort the diagnosis process because the underlying rate adaptation module does not seem to exhibit strong temporal correlations between successive packets. In our experiments, this is sometimes the case with the SampleRate MadWiFi module. In the rest of this work, we only use that rate adaptation module (which is also the default in MadWiFi) because we want to examine whether the proposed diagnostics work reliably even under considerable rate estimation errors.

Evaluation: Figure 5 (bottom) shows the accuracy of the proposed rate estimation method under three quite different channel conditions. In particular, we show the average of the absolute relative error across all probing packets for which we know the ground-truth transmission rate. The "ground-truth" for each packet was obtained using an AirPcap monitor, positioned close to the sender, that captured most (but not all) probing packets. We detect the first transmission for each packet using the "Retry" flag in the 802.11 header. We see that the inference error is low in most cases; the SampleRate module gives a relatively higher error.

Figure 5: Rate inference: strong temporal correlations between the transmission rate of packets in the same train (top) and rate inference accuracy (bottom). Low-SNR conditions are created by separating C and its AP by several meters; congestion is caused by a UDP bulk-transfer over a second network that is in-range. (Panels: mode rate fraction and absolute inference error (%), for SampleRate, AMRR, Minstrel and Onoe under high SNR, low SNR and cross traffic.)

4 Detecting Size-dependent Pathologies

The first "branching point" in the decision tree of Figure 3 is to examine whether access delays increase with the size of probing packets. Recall that each probing train consists of packets with eight distinct sizes. The pathologies in an 802.11 WLAN can be grouped in two categories: a) pathologies that are more likely to increase the access delay of larger packets, because of increased waiting at the sender or increased retransmission likelihood, and b) pathologies that increase the access delay of all packets with the same likelihood, independent of size. We refer to the former as size-dependent pathologies and the latter as size-independent.

The first category includes a broad class of problems such as bit errors due to noise, fading, interference, low transmission signal strength, or hidden terminals. In the simplest (but unrealistic) case of independent bit errors, the probability that a frame of size $s$ bits will be received with bit errors when the bit-error rate is $p$ is $1 - (1 - p)^s$, which increases sharply with $s$. Of course, in practice bit errors are not independent and 802.11 frame transmissions are partially protected with FEC and rate adaptation techniques. We expect however that when the previously mentioned pathologies are severe enough to cause performance problems, larger packets have a higher probability of being retransmitted, causing an increasing trend between access delay and packet size.

The size-independent class includes pathologies that can also cause large access delays, due to increased waiting at the sender or retransmissions, but where the magnitude of the access delay is independent of the packet size. The best instance in this class is WLAN congestion. It is important, however, that the traffic that causes congestion is generated by WLAN terminals that can "carrier-sense" each other (otherwise we have hidden terminals).
In the case of congestion, the access delays will be larger than the case when there is no congestion (packets have to wait more for the channel to become available) but the access delays would not depend on the packet size.

Figure 6: Low signal strength and congestion: effect of packet size (SampleRate module). (Both panels: access delay (ms) versus payload size (B).)

Approach: We distinguish between the two pathology classes using statistical trend detection in the relation between access delay and packet size. Figure 6 shows the inferred access delays from experiments with 100 packet trains. In the first experiment (left), the client C and the AP are separated by a large distance of 5-6 m, so that C's bulk-transfer throughput drops to about 1 Mbps. In the second experiment, we attempt to saturate the WLAN with UDP traffic that originates from another terminal. All terminals and APs can carrier-sense each other (we test this based on throughput comparisons when one or more nodes are active). We use 802.11g channel 6 and SampleRate in both experiments.

The access delays in the case of low signal strength increase with the packet size, while this is not true in the case of congestion. A more thorough analysis of these measurements reveals that not all access delays increase with the packet size, under low signal strength. Instead, the increasing trend is clearly observed among those packets that have the larger access delays for each probing size. This is not surprising: the packets with the larger access delays among the set of packets of a certain size are typically those that are retransmitted, and the retransmission probability increases with the packet size under size-dependent pathologies.
For this reason, instead of examining the average or the median access delay for each packet size, we consider the 95th percentile $\tilde{a}_{95p}(s)$ of the access delays for each packet size $s$.

The trend detection is performed using the nonparametric Kendall one-sided hypothesis test [12]. The null hypothesis is that there is no trend in the bivariate sample $\{s, \tilde{a}_{95p}(s)\}$ for $s = \{8 + k \times 200,\ k = 1 \ldots 7\}$ (bytes), while the alternate hypothesis is that there is an increasing trend.

Evaluation: For the experiments of Figure 6 the test strongly rejects the null hypothesis under low signal strength with a p-value of 0 (the p-value is less than 0.01 across all MadWiFi rate modules), while the p-value in the case of congestion is 0.81 (0.7-1.0 across all MadWiFi rate modules). We have repeated similar experiments with all other MadWiFi rate adaptation modules and under different signal strengths and congestion levels. The p-values in all experiments show a clear difference between size-dependent and size-independent pathologies, as long as the received signal strength is less than about 8-10 dBm. For higher signal strengths, the user-level throughput is more than 5 Mbps, and so it is questionable whether there is a pathology that needs to be diagnosed in the first place.

5 Low SNR and Hidden Terminals

After the detection of a size-dependent pathology, WLAN-probe attempts to distinguish between low-SNR conditions and Symmetric Hidden Terminals (SHTs). The former represents a wide range of problems (low signal strength, interference from non-802.11 devices, significant fading, and others); a common characteristic is that they are all caused by exogenous factors that affect the wireless channel independent of the presence of traffic in the channel.
SHTs represent the case that at least two 802.11 senders (from the same or different WLANs) cannot carrier-sense each other and when they both transmit at the same time neither sender's traffic is correctly received. SHTs do not represent an exogenous pathology because the problem disappears if all but one of the colliding senders back off. The case of asymmetric HTs (or one-node HTs), where one sender's transmissions are corrupted while the conflicting sender's transmissions are correctly received, is no different than the exogenous factors we consider and WLAN-probe will diagnose them as low-SNR.

Approach: To distinguish between low-SNR and SHTs, we first introduce some additional terminology of events that probing packets may see. A probing packet may be lost at layer-3 (denoted by L3), after a number of unsuccessful retransmissions at layer-2. A probing packet may see an outlier delay (OD), if its access delay is significantly higher than the typical access delay in that probing experiment; we classify a packet as OD if its access delay is larger than the sample median plus three standard deviations (the sample includes all measured access delays in that probing experiment, across all trains). Finally, a probing packet may see a large delay (LD) if its access delay is higher than the typical access delay in that probing experiment; we classify a packet as LD if its access delay is higher than the 90th percentile of the empirical distribution of access delays (after we have excluded OD packets). Note that the access delays of OD packets are typically much larger than the access delays of LD packets.

Figure 7: Probability ratio ($p_c / p_u$) to distinguish between low-SNR and SHT conditions. (CDF of the probability ratio for low SNR (tx-power 6-10 dBm) and hidden terminal experiments.)

The probing and diagnosis process works as follows.
The probing packets in this WLAN-probe experiment are of the largest possible size that will not be fragmented. The reason is that larger packets are more likely to collide with other transmissions in the case of SHTs. We then identify all OD or L3 packets in the probing trains of the experiment, and estimate the unconditional probability $p_u$ that either event takes place:

$$p_u = \mathrm{Prob}[\mathrm{OD} \lor \mathrm{L3}] \tag{6}$$

We then focus on the successor of an OD or L3 event, i.e., the probing packet that follows an OD or L3 packet. Under low-SNR scenarios we expect that the channel conditions exhibit strong temporal correlations, and so if a packet $i$ experiences an OD or L3 event, its successor packet $i+1$ (denoted by successor(i)) will see a large delay (LD) or layer-3 loss (L3) event with high probability. On the other hand, if packet $i$ experiences an outlier delay (OD) or L3 event due to an SHT, the colliding senders will back off for a random time period and it is less likely that the successor packet will be LD or L3. To capture the previous temporal correlations between an L3 or OD packet and its successor, we consider the conditional probability:

$$p_c = \mathrm{Prob}[\,\mathrm{successor}(i): \mathrm{LD} \lor \mathrm{L3} \mid i: \mathrm{OD} \lor \mathrm{L3}\,] \tag{7}$$

The detection method focuses on the ratio $p_c / p_u$ of the previous conditional and unconditional probabilities. If there is a strong temporal correlation between a probing packet that experiences an OD or L3 event and its successor, this probability ratio will be much larger than one. We expect this to be the case under low-SNR conditions. Otherwise, under an SHT condition, the previous temporal correlation is much weaker and the probability ratio will be closer to one.

Evaluation: Figure 7 shows the distribution of the probability ratio $p_c / p_u$ for 100 low-SNR and 90 SHT experiments in our testbed.
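For concreteness, the computation of one such ratio from the access delays of a single probing experiment can be sketched in Python as follows. This is an illustrative sketch and not the WLAN-probe implementation: the function name `probability_ratio`, the input layout (a list of trains, each a list of access delays in send order, with `None` marking an L3 loss), the nearest-rank percentile, and the restriction of successor pairs to packets of the same train are all assumptions made for this example. The sketch also counts an OD successor as LD, since its delay exceeds the LD threshold.

```python
import statistics


def probability_ratio(trains):
    """Return p_c / p_u for one probing experiment, or None if undefined.

    trains: list of trains; each train lists per-packet access delays in
    send order, with None marking a layer-3 loss (hypothetical layout).
    """
    delays = [d for train in trains for d in train if d is not None]
    if len(delays) < 2:
        return None

    # OD threshold: sample median plus three standard deviations,
    # computed over all measured delays across all trains.
    od_threshold = statistics.median(delays) + 3 * statistics.stdev(delays)

    # LD threshold: 90th percentile of the delays that are not OD
    # (nearest-rank percentile; an assumption of this sketch).
    non_od = sorted(d for d in delays if d <= od_threshold)
    rank = (9 * len(non_od) + 9) // 10  # ceil(0.9 * n) in integer arithmetic
    ld_threshold = non_od[rank - 1]

    def od_or_l3(d):
        return d is None or d > od_threshold

    def ld_or_l3(d):
        return d is None or d > ld_threshold

    total = sum(len(train) for train in trains)
    events = sum(od_or_l3(d) for train in trains for d in train)
    if events == 0:
        return None  # no OD or L3 packets, so the ratio is undefined
    p_u = events / total  # Eq. (6)

    # Successors of OD/L3 packets, restricted to the same train (assumption).
    successors = [nxt for train in trains
                  for cur, nxt in zip(train, train[1:]) if od_or_l3(cur)]
    if not successors:
        return None
    p_c = sum(ld_or_l3(d) for d in successors) / len(successors)  # Eq. (7)
    return p_c / p_u
```

A returned ratio well above one would point to low-SNR conditions, while a ratio near or below one would point to SHTs; `None` signals that the experiment contained no OD or L3 events to condition on.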
We create low-SNR conditions by reducing the transmission power of the WLAN-probe client C to 6-10dBm; the access point is about 3m away. We create SHT conditions using two different networks on 802.11g channel 6, such that the two senders cannot carrier-sense each other. When only one sender is active, the throughput in the corresponding network is higher than 10Mbps; when both senders are always backlogged, the throughput drops to less than 1Mbps. The probability ratio is always less than 5 under SHTs, while it is higher than 5 in 80% of the experiments under low-SNR conditions. A probability ratio threshold between 3 and 5 should be sufficient to diagnose almost all SHTs accurately. Under low-SNR conditions, however, we should expect some diagnosis errors: in 10-20% of the cases, WLAN-probe will diagnose a low-SNR condition as SHT. We are investigating ways to further improve the accuracy of this diagnostic process.

6 Conclusions and future work

We proposed a home WLAN diagnosis process that only requires user-level active probing, and presented some preliminary but promising results that show the feasibility of such diagnostics. A design consideration for our methods is usability: we do not require administrative privileges, any form of support from the wireless card/driver/AP, or sensor nodes at vantage points in the home.

We are working on several extensions of WLAN-probe. First, it is possible that there is no real WLAN pathology; we are working on a method that can distinguish between normal operation and the previous pathologies. Second, some preliminary work shows that we can detect certain non-802.11 interference sources, such as microwave ovens. Third, we are working on improvements in the rate inference method and on testing these methods with additional rate adaptation mechanisms.
Finally, we will conduct a larger-scale evaluation of the WLAN-probe diagnostic accuracy with more testbed experiments as well as with actual home WLAN deployments.

References

[1] WRAPI: API for Real-time Monitoring and Control of an 802.11 Wireless LAN. http://sysnet.ucsd.edu/pawn/wrapi, 2002.
[2] AirMagnet WiFi Analyzer. http://www.airmagnet.com, 2010.
[3] Aruba Networks: RFProtect Spectrum Analyzer. http://www.arubanetworks.com, 2010.
[4] B. Aggarwal, R. Bhagwan, T. Das, S. Eswaran, V.N. Padmanabhan, and G.M. Voelker. NetPrints: Diagnosing home network misconfigurations using shared knowledge. In USENIX NSDI, 2009.
[5] N. Ahmed, U. Ismail, S. Keshav, and K. Papagiannaki. Online estimation of RF interference. In ACM CoNEXT, 2008.
[6] K. Cai, M. Blackstock, M.J. Feeley, and C. Krasic. Non-intrusive, dynamic interference detection for 802.11 networks. In ACM SIGCOMM IMC, 2009.
[7] R. Chandra, V.N. Padmanabhan, and M. Zhang. WiFiProfiler: cooperative diagnosis in wireless LANs. In ACM Mobisys, 2006.
[8] Y.C. Cheng, M. Afanasyev, P. Verkaik, P. Benko, J. Chiang, A.C. Snoeren, S. Savage, and G.M. Voelker. Automating cross-layer diagnosis of enterprise wireless networks. ACM SIGCOMM CCR, 37(4):25–36, 2007.
[9] Y.C. Cheng, J. Bellardo, P. Benko, A.C. Snoeren, G.M. Voelker, and S. Savage. Jigsaw: Solving the puzzle of enterprise 802.11 analysis. ACM SIGCOMM CCR, 36(4):39–50, 2006.
[10] C. Dovrolis, P. Ramanathan, and D. Moore. Packet-dispersion techniques and a capacity-estimation methodology. Networking, IEEE/ACM Transactions on, 12(6):963–977, 2004.
[11] D. Giustiniano, D. Malone, D.J. Leith, and K. Papagiannaki. Measuring transmission opportunities in 802.11 links. IEEE/ACM ToN, (99):1, 2010.
[12] M. Hollander and D.A. Wolfe. Nonparametric Statistical Methods. 1999.
[13] R. Kapoor, L.J. Chen, L. Lao, M. Gerla, and M.Y. Sanadidi. CapProbe: a simple and accurate capacity estimation technique. In ACM SIGCOMM, 2004.
[14] A. Kashyap, S. Ganguly, and S.R. Das. A measurement-based approach to modeling link capacity in 802.11-based wireless networks. In ACM MOBICOM, 2007.
[15] K. Lakshminarayanan, S. Sapra, S. Seshan, and P. Steenkiste. RFDump: an architecture for monitoring the wireless ether. In ACM CoNEXT, 2009.
[16] J. Lee, S.J. Lee, W. Kim, D. Jo, T. Kwon, and Y. Choi. RSS-based carrier sensing and interference estimation in 802.11 wireless networks. In IEEE SECON, 2007.
[17] R. Mahajan, M. Rodrig, D. Wetherall, and J. Zahorjan. Analyzing the MAC-level behavior of wireless networks in the wild. ACM SIGCOMM CCR, 36(4):75–86, 2006.
[18] D. Niculescu. Interference map for 802.11 networks. In ACM SIGCOMM IMC, 2007.
[19] J. Padhye, S. Agarwal, V.N. Padmanabhan, L. Qiu, A. Rao, and B. Zill. Estimation of link interference in static multi-hop wireless networks. In ACM SIGCOMM IMC, 2005.
[20] Marc Portoles-Comeras, Albert Cabellos-Aparicio, Josep Mangues-Bafalluy, Albert Banchs, and Jordi Domingo-Pascual. Impact of transient csma/ca access delays on active bandwidth measurements. In ACM SIGCOMM IMC, 2009.
[21] L. Qiu, Y. Zhang, F. Wang, M.K. Han, and R. Mahajan. A general model of wireless interference. In ACM MOBICOM, 2007.
[22] Charles Reis, Ratul Mahajan, Maya Rodrig, David Wetherall, and John Zahorjan. Measurement-based models of delivery and interference in static wireless networks. ACM SIGCOMM, 2006.
[23] A. Sheth, C. Doerr, D. Grunwald, R. Han, and D. Sicker. MOJO: A distributed physical layer anomaly detection system for 802.11 WLANs. In ACM Mobisys, 2006.
[24] M. Vutukuru, K. Jamieson, and H. Balakrishnan. Harnessing exposed terminals in wireless networks. In USENIX NSDI, 2008.
