Local Read-Write Operations in Sensor Networks

Designing protocols and formulating convenient programming units of abstraction for sensor networks is challenging due to communication errors and platform constraints. This paper investigates properties and implementation reliability for a \emph{loc…

Authors: Ted Herman, Morten Mjelde

Local Read-Write Operations in Sensor Networks ∗

Ted Herman, University of Iowa, herman@cs.uiowa.edu
Morten Mjelde, University of Bergen, mortenm@ii.uib.no

November 2, 2018

Abstract

Designing protocols and formulating convenient programming units of abstraction for sensor networks is challenging due to communication errors and platform constraints. This paper investigates properties and implementation reliability for a local read-write abstraction. Local read-write is inspired by the class of read-modify-write operations defined for shared-memory multiprocessor architectures. The class of read-modify-write operations is important in solving consensus and related synchronization problems for concurrency control. Local read-write is shown to be an atomic abstraction for synchronizing neighborhood states in sensor networks. The paper compares local read-write to similar lightweight operations in wireless sensor networks, such as read-all, write-all, and a transaction-based abstraction: for some optimistic scenarios, local read-write is a more efficient neighborhood operation. A partial implementation is described, which shows that three outcomes characterize operation response: success, failure, and cancel. A failure response indicates possible inconsistency for the operation result, which is the result of a timeout event at the operation's initiator. The paper presents experimental results on operation performance with different timeout values and situations of no contention, with some tests also on various neighborhood sizes.

1 Introduction

Wireless Sensor Network (WSN) platforms add a twist to traditional programming assumptions. Many resources can be quite constrained, including bandwidth, program memory, and platform computing power.
Not surprisingly, research on sensor network programming to date has sought abstractions and tools that can satisfy the resource constraints, yet enable productivity in software development cycles. Typically, these abstractions are not entirely new ideas, but adaptations of (perhaps less orthodox) techniques from areas of signal processing, database, and parallel or distributed computing. This paper follows the same research direction, exploring the adaptation of a read-modify-write abstraction as a unit of sensor network programs; we propose an operation called local read-write (LRW) for neighborhood communication in a sensor network.

∗ Research supported in part by NSF award 0519907.

The compare-and-swap (c&s) instruction, available on many multiprocessor architectures, is an example of read-modify-write. In one atomic step, a processor executing c&s conditionally swaps the content of a memory word with the content of a register; the condition for this swap is that the content of the memory word have a prescribed value given as a field of the instruction or given in another register. This idea, that a single instruction specifies a condition, a write value, and expects a response value, can be generalized and translated to the setting of nodes and packet-based communication.

A simple instance of an LRW operation is illustrated in Figure 1. Sensor node x initiates the operation by transmitting a packet to neighboring node y. Node y inspects the packet, and possibly schedules a tentative write to some local variables; then y transmits a response packet to x. Upon receipt of y's response, node x will either decide to confirm the operation or back out and void the operation. Voiding the operation will result in y discarding its scheduled, tentative write.
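The compare-and-swap semantics that motivate LRW can be sketched in a few lines. This is only an illustration: the names `Word` and `compare_and_swap` are hypothetical, and the sketch is single-threaded, so it models the condition, write value, and response value of the instruction but not the hardware atomicity.

```python
class Word:
    """A single memory word (a stand-in for shared memory; hypothetical name)."""

    def __init__(self, value):
        self.value = value


def compare_and_swap(word, expected, new):
    """Write `new` only if the word currently holds `expected`.

    Returns the value observed before the attempt, which plays the role
    of the response value: the caller learns whether the write happened
    by comparing the result with `expected`.
    """
    old = word.value
    if old == expected:
        word.value = new
    return old
```

In the LRW setting, the condition and the write value travel in the initiator's packet, and the observed value travels back in the neighbor's response.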
Figure 1: LRW with one neighbor (nodes x, y, z; the shaded area is the neighborhood of x).

Node z shown in Figure 1 lies outside x's neighborhood; there is the possibility that z could initiate an LRW operation concurrently with x, so that y first receives a packet from x, then a packet from z, and these requests conflict because they write to the same location. For these LRW operations to be atomic, the net effect of running both should be logically serial, that is, as though one operation completes before the other begins. The design choice for this paper is that y should reject z's request while x's operation is pending, that is, y should immediately send a negative response to z.

The single-node neighborhood of x, in Figure 1, can be generalized to larger neighborhoods, illustrated by Figure 2. Node x's LRW operates on a neighborhood of nodes $y_1$ through $y_k$. Although the figure suggests k messages would be transmitted by x, a single local broadcast suffices for many radio platforms. A typical WSN application for LRW is data aggregation. Suppose each $y_i$ has recorded some sensor value $d_i$, and node x's task is to compute some function of $\{ d_i \mid 1 \le i \le k \}$ and save the result to its flash memory. After x has completed this aggregation, each $y_i$ can discard its $d_i$ value and recycle local memory. Note that if $y_i$ were to asynchronously send $d_i$ to x, it could be that x does not have local buffers available for this data; putting x in control is a way to manage resources safely. In one LRW operation, x can collect all $d_i$ values and also schedule the $d_i$ variables at each $y_i$ for recycling. However, if x does not collect enough $d_i$ values, say fewer than k/2 neighbors respond to the request initiated within the LRW operation, then x could cancel the operation and retry it later.
Classical applications of read-modify-write, such as consensus or leader election (applied to a WSN neighborhood), can easily be expressed as an LRW operation.

Figure 2: LRW with k neighbors (node x and neighbors $y_1, y_2, \ldots, y_k$).

Contributions and Organization. Section 2 summarizes related work. Section 3 specifies LRW properties and exposes some design choices for implementation. Section 4 presents a theoretical result showing how a model based on this abstraction differs from other choices. Section 5 contains implementation results, which feature experiments to show design tradeoffs. Discussion of conclusions is in Section 6.

2 Motivation and Related Work

Several tracks of WSN research draw analogies to database and parallel computing methods. Early proposals for querying sensor networks motivated protocols for aggregation and routing to support query language operations [10, 8, 9]. The idea of programming a WSN as a whole (called macroprogramming) sometimes takes the position that programming sensors resembles the ensemble programming of parallel computing machines, using SIMD or MIMD instruction sequences [11, 12]. Inspired by distributed computing research, there are proposals to adapt such paradigms as snapshots, leader election, and wave computations in WSN systems [13, 15, 14]. This paper draws an analogy to instructions for atomic communication in multiprocessor, shared memory systems.

In contrast to high-level concepts for WSN software, there is also significant research adapting the techniques of ad hoc networks, peer-to-peer, and even internet protocols to the needs of WSN applications and the limitations of WSN platforms. Two priorities for such research are reliable communication and power conservation. A question emerging from this research is: what kind of communication abstractions will be convenient for programming (i.e., the interfaces are simple and hide low-level complexity and problems of heterogeneous platforms) while enabling efficient use of resource? This question has predominantly been investigated with respect to non-local communication, for instance, multi-hop protocols, routing structures, and middleware services for publish-subscribe abstractions. Our work looks at local communication, where "local" refers to single-hop communication, also called neighborhood communication.

On one hand, the literature of MAC protocols, specialized to WSN platforms, extensively explores the concerns of local communication [3]. Platform hardware may directly support unicast and neighborhood broadcast operations, and some radio chips provide low-level support for unicast packet acknowledgment in one programmable operation. On the other hand, there are several papers [6, 4, 2] suggesting higher-level programming units for local communication. A natural abstraction for local communication is atomic read-all, which is the operation of reading the local states of all nodes in a neighborhood. Using atomic read-all operations, programs elegantly express calculation of neighborhood statistics; Section 4 elaborates on a variant of read-all with stronger atomicity properties. Unfortunately, the read-all abstraction does not efficiently map to WSN platform abilities. An alternative abstraction is atomic write-all, which may be implemented by a single local message broadcast. A write-all operation writes (some part of) the states of every other node in a neighborhood. This operation is not so natural for programming as read-all; however, program transforms have been proposed that convert many programs using read-all operations into ones that employ only write-all operations [6, 5, 4].
Reliability is a concern with naïve implementation of write-all consisting of a single message broadcast; the broadcast can lose messages to a subset of neighbors due to noise or collision with other message traffic, say originating from other neighborhoods in the WSN (in [4], the basic operation is called "write-all with collision").

The concerns of reliability and atomicity are fundamental to database transaction theory, where ACID properties define correct transaction processing. The paper [2] suggests a local WSN operation motivated by database transactions: one atomic operation reads from a subset of neighbors and writes to a subset of neighbors. To improve reliability, the local transaction implementation consists of a sequence of messages: read-request, response, then write-commit or abort-transaction. Transactions may be aborted because of interference with contending transactions, and the aborted transactions need to be retried. The reliability of such local transactions is imperfect: final commit messages can be lost and the transaction initiator can crash. Standard techniques that add reliability to database transaction processing, such as stable storage and transaction journaling, are unrealistic for many WSN platforms.

To give some idea of the resources needed for the operations discussed above, Table 1 summarizes optimistic, best-case resource measures for a neighborhood of n nodes. The two measures are number of messages (including both unicast and local broadcast messages) and number of rounds, where a round is a time interval of sufficient length to allow all nodes in a neighborhood to send a message. The latter measure would allow for queuing, processing, and transmission delays as well as extra delays due to the medium access control layer for collision avoidance.
The first row of the table reflects that a read-all operation is initiated by one node, followed by each of its n − 1 neighbors sending a response. A write-all operation potentially has the least resource cost of any operation, consisting of just one broadcast message; however, to provide for reliability, an implementation of write-all may return acknowledgments from each recipient of the broadcast back to the operation's initiator. For this reason, the number of message primitives is reported as "1 or n" in Table 1.

Operation    Messages     Rounds
read-all     n            1
write-all    1 or n       1
transact     2 + r + w    2
LRW          n            1

Table 1: Operation comparison.

A local transaction, as defined in [2] and called "transact" in Table 1, has a read set of r nodes and a write set of w nodes. The transaction is initiated with a broadcast, followed by a response from each node in the read set. Then the transaction initiator transmits a broadcast to the write set containing values to be written, and each member of the write set unicasts an acknowledgment to the initiator; the acknowledgments are needed so that the initiator can decide whether to allow the transaction to commit or to broadcast a cancel message. The read and write sets may overlap, with the worst case being r = w = n − 1 (which would put the message cost of transact at 2n). The LRW operation begins with a broadcast, followed by each of the n − 1 other nodes responding. Since any value to be written is contained in the initial broadcast and responses are collected by the LRW initiator, no additional round is needed to complete the operation.

The measures of Table 1 are optimistic numbers in two senses.
First, the measures are for transactions that succeed, that is, they do not fail due to conflicts with concurrent transactions or negative responses (an LRW operation would need to include a cancellation message if any neighbor response indicated some unanticipated value). Secondly, the table does not include commit messages for transaction or LRW operations. This is because sensor node timing and clocking mechanisms enable commit to be time-triggered, that is, each node commits a transaction after sufficient time has passed without receiving a cancellation message.

3 LRW Design Issues

Local Read-Write (LRW) is an operation defined on variables of WSN nodes. We assume that each node has the same set of variables∗ that can be read and written by an LRW operation. For variable v and node q, let $v_q$ refer to q's instance of v. Each invocation of LRW specifies: (i) a function f defined on a subset of node variables, (ii) a subset of node variables to be written, and (iii) a boolean function g. Function f can be computed at any node, and either returns a negative response value ⊥ or returns a pair (r, B), where r is a value provided for computing g and B is a list of values to be written to the variables specified in (ii). A nonlossy LRW operation is defined with respect to an initiating node p and p's neighborhood N(p), consisting of three steps: (1) for each node q ∈ N(p), function f is computed; (2) function g is computed on the set of r-values $\{ r_q \mid q \in N(p) \}$; and (3) if the result of g is true, then $B_q$ is written to the write variables of q, for each node q. A lossy LRW operation would allow, in (1)–(3), proper subsets of N(p) to model the loss of messages. We do not formally specify lossy LRW instances in this paper.
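The three steps of a nonlossy LRW operation can be sketched as a single function over an in-memory model of the neighborhood. This is a minimal illustration, not the paper's implementation: the names `lrw`, `BOTTOM`, and `collect_and_recycle` are hypothetical, node state is modeled as a dictionary of variables, B is modeled as a mapping from variable name to value, and the rule that any negative response prevents all writes is an assumption made here because the definition above does not say how ⊥ interacts with g.

```python
BOTTOM = None  # stands in for the negative response value (hypothetical encoding)


def lrw(states, neighborhood, f, g):
    """Sketch of a nonlossy LRW operation.

    states:       dict mapping node id -> dict of that node's variables
    neighborhood: iterable of node ids, playing the role of N(p)
    f:            function(node_state) -> BOTTOM or (r, writes)
    g:            function(list of r-values) -> bool
    Returns (committed, r_values).
    """
    # Step 1: compute f at every neighbor.
    results = {q: f(states[q]) for q in neighborhood}
    # Assumption of this sketch: one negative response voids the operation.
    if any(res is BOTTOM for res in results.values()):
        return False, []
    # Step 2: compute g on the collected r-values.
    r_values = [r for (r, _writes) in results.values()]
    if not g(r_values):
        return False, r_values
    # Step 3: apply each neighbor's tentative writes.
    for q, (_r, writes) in results.items():
        states[q].update(writes)
    return True, r_values


def collect_and_recycle(state):
    """Example f for the aggregation scenario: report d, then schedule d for recycling."""
    if state["d"] is None:
        return BOTTOM
    return state["d"], {"d": None}
```

With `states = {"y1": {"d": 3}, "y2": {"d": 5}}`, calling `lrw(states, ["y1", "y2"], collect_and_recycle, lambda rs: len(rs) == 2)` collects both readings for the initiator to aggregate and clears `d` at each neighbor in the same operation.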
Viewed from the application perspective, an LRW operation begins when initiating node p invokes LRW and ends when p receives a response from the LRW. Between the invocation and response, we assume that p does not invoke another LRW instance. Thus the only source of concurrency in the system is contention among LRW operations of different LRW initiators. The behavior of a set of (possibly concurrent) LRW operations can be specified by a sequence, called an LRW history, which contains LRW invocations, results of f and g evaluations, assignments to variables, and responses to the LRW invocations. We omit details of the history formalization, which follow from standard techniques similar to the notation of transaction serializability. A well-formed LRW history is one in which every LRW invocation finds a matching response. A well-formed LRW history determines values for all variables. Implementations of LRW or similar operations result in refined histories, where between invocation and response, lower-level events (transmission, reception, message processing) occur. Analysis of such operation histories, for implementations of operations in Table 1, can verify their atomicity properties.

∗ This assumption is not essential, but simplifies the description.

The framework [2] uses terminology of transactions to describe local operations, including some ACID properties of transactions in databases. Atomicity of a transaction, which is the all-or-none property, is due to two properties of the protocol. First, in the WSN model, ordering transactions can be simple because message propagation latency is negligible.
If nodes p and p′ concurrently initiate a transaction using local broadcast, with x, y ∈ N(p) and x, y ∈ N(p′), then x and y cannot receive broadcasts from p and p′ in different order. Second, all the writes of a transaction are sandboxed and only actually written upon the event of transaction commit. Consistency of transactions is ensured by conflict resolution. If the transactions of p and p′ conflict, say because they write to the same variable in node x, then one of the two transactions will be aborted (and possibly reinitiated later). Properties corresponding to atomicity and consistency can similarly be shown for LRW operations (and proved using LRW histories). Transactions of p and p′ can be concurrent, even with neighbors {x, y} in common, provided that they operate on distinct sets of variables (and more generally, if it can be shown that the transactions have commutative semantics). This observation also holds for LRW operations.

The problematic aspects of ACID properties for WSNs arise from platform limitations and unreliable message transport. The possibility of message loss implies, for example, that a commit message or a cancellation message could be lost. It is well-known that no acknowledgment protocol can guarantee that all neighbors of a transaction initiator will receive a commit or cancellation message, even if it is retransmitted some number of times [17, 16]. However, the probability of a communication loss can be reduced if messages are retransmitted, and retransmission may be a practical strategy to improve reliability for WSNs (in effect, retransmission is an approximation to eventually correct message delivery). A protocol optimization for transaction or LRW operations is to replace a commit or cancellation message with timeout-driven activation.
The design choice of [2] and in this paper is to let commit be timeout-driven: if, after some fixed time period, a node does not receive any cancellation message from the LRW initiator, then variable writes are committed. An alternative design choice would be to let cancellation be the timeout-triggered default; however, this choice would shift the balance of power usage (because messages consume power) to commit, and for most applications and typical WSN workloads, one would expect most LRW operations to be committed.

A limitation of several current WSN message protocols is packet payload size. For the platform used in our experiments, the payload is 28 bytes, which limits how much can be specified in an LRW operation based on a single broadcast. Scaling LRW to larger data amounts would require fragmentation of LRW message fields over multiple broadcasts. Some WSN platforms may not support native local broadcast; there, ordering LRW operations by the instant of reception would not be reliable. However, LRW operations can also be ordered by timestamp, if the WSN has synchronized clocks. With synchronized clocks, LRW operations can be grouped by slotted time intervals. In a slotted time protocol, initiators wait until the beginning of a slot before transmitting an LRW operation message; when a neighbor receives an LRW message, it delays sending a response until the end of the current slot, in order to collect all LRW operations, order them, and sort out conflicts. A reason to consider using unicast, rather than broadcast of the initial LRW message, is to improve scheduling efficiency of responses from neighbors. The CC2420 radio chip has a feature for immediate acknowledgment of unicast messages, and this feature is not available for local broadcast.
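The timeout-driven commit rule, together with the earlier choice that a neighbor rejects a second initiator while one operation is pending, can be sketched from the neighbor's side. This is a hedged illustration only: the class `Neighbor` and its method names are hypothetical, time is a plain number passed in by the caller rather than a hardware timer, the neighbor holds at most one pending operation (so it rejects every other initiator, not only conflicting ones), and a repeated request from the same initiator is assumed to leave the original deadline unchanged.

```python
class Neighbor:
    """Neighbor-side sketch of timeout-driven commit (hypothetical names)."""

    def __init__(self, variables):
        self.vars = dict(variables)
        self.pending = None  # (initiator, tentative writes, commit deadline)

    def tick(self, now):
        """Commit the tentative writes once the deadline passes without a cancel."""
        if self.pending is not None and now >= self.pending[2]:
            self.vars.update(self.pending[1])
            self.pending = None

    def on_init(self, initiator, writes, now, t_commit):
        """Handle an initiating message; returns 'accept' or 'reject'."""
        self.tick(now)
        if self.pending is not None:
            if self.pending[0] == initiator:
                return "accept"  # retransmission: keep the original deadline
            return "reject"      # another initiator's operation is pending
        self.pending = (initiator, dict(writes), now + t_commit)
        return "accept"

    def on_cancel(self, initiator):
        """Discard the tentative writes if the cancel comes from their initiator."""
        if self.pending is not None and self.pending[0] == initiator:
            self.pending = None
```

Making commit the default means a successful operation costs no extra message; only the (presumably rarer) cancellation has to be transmitted.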
In the discussion above, we have treated N(p) as a constant, supposing the neighborhood of p to be fixed in the WSN. The experience of many researchers is that, even for a static WSN, radio properties are dynamic: the set of stable, bidirectional links defining neighborhoods evolves. Therefore the design of an LRW protocol should plan for dynamic neighborhoods. If an LRW operation fails because the initiator did not collect responses from every neighbor (this would depend on the definition of g), it could be that the neighborhood has changed. In this case, subsequently submitting the LRW operation would use the new neighborhood.

4 LRW Operation Comparison

Table 1 does not compare expressive, or computing, power of different neighborhood operations. In the table, transact consumes most resource, but transact is more powerful than any other: in one operation, a function of neighborhood values can be computed and written to several nodes. An LRW operation is strictly less powerful because any value written must be prescribed, before the operation is invoked, rather than computing the value to write during the operation.

One technique to compare operation power is to examine protocols that use only that operation to solve some classic problem, such as consensus. If one operation type enables consensus to be solved whereas another operation does not, then the former operation is more powerful (with respect to consensus) than the latter. Briefly, a consensus protocol begins with each node having an input value and a decision variable, which can be written at most once. The input is not in any variable, that is, input values cannot directly be viewed by any of the operations of Table 1; an early step in any consensus protocol is to share the input with other nodes.
The initial value of the decision is some constant ω not equal to any node's input. Consensus protocols must satisfy three properties: validity, agreement, and termination. The termination property is that every node eventually writes to its decision variable, regardless of the progress or failure of other nodes; agreement requires that no two nodes write different decision values; validity requires that any decision written be the input of some node. The difficulty of consensus lies in the timing of nodes participating in the protocol. If some node p is very slow to engage in the protocol, then other nodes will need to decide without knowing p's input. Although synchronous timing is implicit in the implementation of operations such as LRW, software at the application layer may be asynchronous, hence the timing of applications using LRW can be unpredictable.

For the following results, we assume communications are nonlossy and do not fail due to contention conflicts. Also, neighborhoods are static and definitions of neighborhood are consistent, that is, if q ∈ N(p) then p ∈ N(q). The following shows that read-all is insufficient to solve the consensus problem.

Lemma 1 Consensus using only read-all operations is impossible.

Proof: The proof repeats standard arguments [1] based on finding a contradiction in a constructed execution. Suppose consensus is possible, and that nodes p and q are neighbors with inputs 0 and 1 respectively. If p (or q) waits long enough to expose its input value, then the other node may take sufficiently many steps so that it is forced, by the termination property, to decide; because the other's input is unknown, it will decide in favor of its own input.
Thus the initial state for the consensus protocol is multivalent, that is, there exist two possible executions leading to different decisions. A state is univalent if all possible executions following that state can only lead to one decision (in effect, the decision has already been chosen, even if not presently in a decision variable). Executions consist of an interleaving of atomic steps from some node in the neighborhood, where a step is either a read-all operation, some local calculation, or writing to some variable(s). If p writes to a variable v, and the next step in the execution is a read-all for v by q, then q obtains the value p wrote to v.

Let σ be the last multivalent state in an execution (the termination property implies σ exists). There are at least two possible continuations from σ leading to different decisions, by definition. Such continuations necessarily begin with steps of different nodes. We consider different cases for the first step by p and q with respect to continuations. Note that if p steps first after σ, then the valency is different than would be if q steps first (otherwise σ is not multivalent). If the first step by p is a local calculation or a read-all operation, then the occurrence of that step is undetectable by q. This contradicts the assumption that p's first step after σ results in a univalent state. If the first step by p writes to a variable, then it cannot be that q's first step writes to a variable, because these two steps commute, which would contradict the differing valency of these two steps. Therefore, the essential case to examine is where p's first step writes to a variable and q's first step is a read-all operation.
If p steps first and then sleeps while q runs long enough to decide, the valency will follow from p's write of a variable; the same valency is obtained if q makes no steps while p runs long enough to decide. However, p cannot detect whether or not q has performed a read-all, hence if q steps first, then sleeps, with p running long enough to decide, p must decide as if q took no steps, which contradicts the supposed valency of q's read-all operation. Thus in any case, the transition from multivalency to univalency can be prevented in some possible execution. ❑

Although read-all doesn't provide a solution to consensus, an enhanced form of read-all, called read-all-write, does allow for a solution. In a read-all-write operation, a node atomically reads values from all neighbors and writes some function of the result to a variable. If p and q invoke read-all-write at nearly the same instant, then atomicity guarantees that one operation will precede the other. Thereby, if p's read-all-write occurs first, then the variable written by p will be visible in q's read-all-write. Thus q can detect that p's operation preceded q's, and the decision value for both nodes can be the input of p. A read-all operation is "lighter weight" compared to a read-all-write operation, which must constrain concurrency to guarantee atomicity. We are not aware of WSN research on neighborhood read-all-write. Presumably the transactional methods, say of [7] or [2], could be used to implement read-all-write.

Unlike read-all, the write-all operation can be used to solve consensus in particular cases. The following first identifies a negative case, where write-all is insufficient; afterward we discuss a case where a consensus protocol uses write-all.

Lemma 2 Consensus using only single-variable write-all operations is impossible.
Proof: The proof is similar to that for Lemma 1. Here, each node obtains values of other nodes only by locally reading variables that have been assigned by a write-all operation. Let σ be the last multivalent state in an execution, and suppose the next steps of p and q are write-all operations. If these steps write to different variables, then the steps commute and a valency contradiction is obtained. When N(p) = {q} the steps of p and q write to different variables because write-all operations assign to variables of other nodes. One case where two steps write to the same variable is that p's first step writes to its variable $v_p$ and q's first step is a write-all to $v_p$. Suppose the valency of p taking the first step is 1, and the valency of q taking the first step is 0. If q takes the first step and p sleeps long enough for q to decide, the decision is 0. If p takes the first step and then sleeps while q runs long enough to decide, the decision will still be 0, because q overwrote what p had written, and thus q is not influenced by p's initial step. This contradicts the assumption that p's first step results in a univalent state with valency 1. ❑

The write-all operation of Table 1 does not include any local variable as a write target (that is, p ∉ N(p)). An extension to write-all would be to include $v_p$ in p's write-all of variable v. This extension alone turns out not to help in solving consensus; however, the inclusion of $v_p$ together with allowing multiple variables to be written does enable a consensus protocol. Suppose N(p) = {q} and three variables u, v, w are initially ω. Let p's operation write its input value to u and to v; and let q's operation write its input value to v and to w. Whichever node has the first write-all operation forces the decision value to be its input.
In an execution with differing inputs such that p invokes the first write-all and q sleeps, p will detect that it has the first operation, because w = ω; and if q does not sleep, p may detect that w ≠ ω, however then v does not contain p's input, and so p detects that its write-all occurred first (q will get the decision from v in that case).

It is not difficult to show that LRW or transact suffice to solve consensus, because it is simple for a node to record a decision value that is not overwritten by any subsequent operation. The LRW operation is a lighter weight primitive than write-all, which deals with more variables than LRW when used for consensus.

When operations have equivalent solvability power, they may also be compared by the time required or the number of operations used in a solution. Intuitively an LRW operation does more work per operation than either read-all or write-all operations, and all of these have 1-round (optimistic) time complexity.

5 Implementation

The previous sections of the paper motivate LRW operations and sketch, at a high level, how such an operation could be implemented in a WSN. To confirm the feasibility of LRW on a current sensor network platform, this section reports results from simple experiments on some small mote networks. Section 3 exposes general design issues for an implementation, whereas the experimental implementation must contend with low-level design considerations. For example, Table 1 reports optimistic message counts for LRW, but our experiments consider failures in message delivery. Operation duration and throughput are affected by thresholds for message transit time and expected number of retransmissions; such factors are determined from experiments.
For instance, the duration of an LRW operation can be reduced by setting smaller time limits and lowering the number of retries for lost messages; this will allow more LRW operations to be executed, at the cost of reliability. Figures presented below show effects of such tuning decisions, for the case of an LRW operation run in isolation and also for the case of an LRW contending with other operations.

5.1 Experimental Platform

Our implementation and experiments were written in the NesC language for the TinyOS (version 2) operating system, running on TelosB [18] and MicaZ motes (both platforms use the same radio chip, CC2420). Rather than a full implementation of LRW, we used a simpler protocol that ignored the case of application-triggered operation failures (such as one neighbor having a value that cancels the LRW operation); thus all our experiments consider only cases of successful LRW operations, except where an operation is rejected due to concurrency.
LRW-initiate(p):
  mode ← active, S ← { }
  start T_Timeout, T_Commit
  broadcast(initMsg_p(T_Commit))
  start T_Response
  while (mode = active)
    receive(rejectMsg_q):
      mode ← cancel
    receive(m = acceptMsg_q):
      S ← S ∪ {sender(m)}
      if |S| = |N(p)|
        stop Timers; return success
    T_Response expires:
      broadcast(initMsg_p(T_Commit))
      restart T_Response
    T_Timeout expires:
      mode ← abort, S ← { }
  broadcast(abortMsg_p)
  start T_Response, T_Timeout
  while (mode = abort)
    receive(m = abortAck_q):
      S ← S ∪ {m}
      if |S| = |N(p)|
        cancel Timers, return canceled
    T_Response expires:
      broadcast(abortMsg_p)
      start T_Response
    T_Timeout expires:
      stop Timers; return failed

Figure 3: LRW for Initiator p

The implementation is built on several services: a MAC-layer radio stack transmits and delivers packets, also inserting random delays (typically between 3ms and 12ms) to avoid collision with other transmissions; a neighborhood service determines the effective set of a node's neighbors (for which there is currently bidirectional communication); a clock synchronization service aligns the timers of nodes, which facilitates experiments that induce concurrent LRW operations in a controlled way. Our largest experiments used 31 MicaZ motes, and due to proximity, we artificially constrained each node to have at most six neighbors (essentially, this is topology control). Our implementation employed three countdown timers, two for message delivery and acknowledgment, and a third to limit the total duration of the LRW operation. We used two timers for message delivery, to make the distinction between (i) time for message transmission and receiving a response message, and (ii) the time for successful transmission and response from all neighbors, including retries of (i).
A technical reason to use (ii) instead of a retry counter is that the MAC layer's timing is randomized, and our design goal was to implement a protocol with known thresholds for the LRW operation. Should the timer for (ii) expire, then the LRW operation's initiator aborts the operation and transmits abort commands to its neighborhood. The third timer is for operation commit: if a neighbor does not receive an abort message and the commit timer expires, then the result of the LRW is committed.

5.2 Three Outcomes for LRW

We say that an LRW operation has three possible outcomes: it is considered a Success if the LRW is accepted by all neighbors; the operation is considered Canceled if an abort message (which may have become necessary for a number of reasons) is responded to by all neighbors; and it is considered Failed if at least one neighbor does not respond to the abort message. Each LRW operation returns one of these three outcomes to the application that invoked it. The first two outcomes, Success and Canceled, are within the intended behavior of the protocol (in the terminology of transactions, the result satisfies Atomicity and Consistency criteria). The final outcome, Failed, represents a failure of communication, a sensor node crash, or (silent) neighborhood reconfiguration. For a failed outcome, it is uncertain whether or not all neighbors received and committed the LRW operation (possibly, the initiator attempted to cancel the operation, but not all neighbors acknowledged the cancel request within the allowed timeout period). For a set of LRW operations, reliability is the percentage of non-failed LRW operations, that is, operations which respond by success or cancel.
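
The reliability definition above can be illustrated with a minimal Python sketch. The names `Outcome` and `reliability` are hypothetical, chosen only for this illustration; they are not taken from the NesC implementation, and the sample data is invented.

```python
from enum import Enum


class Outcome(Enum):
    """The three possible results of an LRW operation (names are illustrative)."""
    SUCCESS = "success"
    CANCELED = "canceled"
    FAILED = "failed"


def reliability(outcomes):
    """Percentage of non-failed operations, i.e. success or canceled."""
    outcomes = list(outcomes)
    if not outcomes:
        raise ValueError("reliability is undefined for an empty set of operations")
    non_failed = sum(1 for o in outcomes if o is not Outcome.FAILED)
    return 100.0 * non_failed / len(outcomes)


# Invented sample: three of the four operations are non-failed.
sample = [Outcome.SUCCESS, Outcome.CANCELED, Outcome.FAILED, Outcome.SUCCESS]
print(reliability(sample))
```

Note that a canceled operation counts toward reliability under this definition, because its result is known to be consistent even though the operation did not take effect.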
LRW-neighbor(q):
  initially: mode = idle
  while (mode = engaged_p)
    receive(abortMsg_p):
      stop T_Commit
      send(p, abortAck_q)
      mode ← idle
    receive(m = initMsg_p(t)):
      send(p, acceptMsg_q)
    receive(initMsg_r ∧ r ≠ p):
      send(r, rejectMsg_q)
    T_Commit expires:
      q commits LRW
      mode ← idle
  while (mode = idle)
    receive(abortMsg_r):
      send(r, abortAck_q)
    receive(m = initMsg_p(t)):
      start T_Commit ← t
      mode ← engaged_p
      send(p, acceptMsg_q)

Figure 4: LRW for Neighbor q

5.3 Protocol

In the following, p refers to an initiator and {q_0, q_1, ..., q_k} = N(p) refers to the neighbors of p. Figures 3 and 4 contain a high-level description of the LRW implementation. Five message types are used in the protocol: initMsg, acceptMsg, rejectMsg, abortMsg, and abortAck. Three event types drive the protocol: invocation of an LRW operation, message arrival, and timer expiration. The protocol timers are denoted T_x where x ∈ {Response, Timeout, Commit}. Each timer starts with some positive value and decreases to zero, whereat an expiration event occurs. The initiator begins by starting three timers, however only T_Response and T_Timeout have expiration events shown in the initiator protocol; T_Commit provides a time stamp for the LRW operation, used by neighbors. (Note that T_Commit could be significant to the initiator as a safeguard, to reject any application invocation of LRW-initiate while a current invocation is already in progress; T_Commit could also trigger commit at the initiator itself, but to simplify the presentation we suppress such details.)

The first message broadcast by the initiator is the initMsg, containing the current T_Commit value and other fields relevant to the LRW operation.
The initiator waits for all neighbors to reply with acceptMsg (if T_Response expires, then the initiator again broadcasts an initMsg). To simplify the description, the case where T_Commit expires (which could occur if the initiator retries a broadcast too many times) is not shown. Also, the presentation omits the case where a collection of acceptMsg values might trigger some application-specific abort of the LRW operation.

For the mote implementation, we also included a list of neighbors in the initMsg: this list named the neighbors for which the initiator had not yet received an acceptMsg. This small optimization reduced the overhead after retransmitting an initMsg, by avoiding a needless resending of acceptMsg (usually for the majority of neighbors).

It is important to note that even if LRW-initiate returns Success, T_Commit may not have expired, and thus the LRW has not been committed. For this reason p must not be permitted to begin a new LRW operation until T_Commit expires. In our trials, the frequency of LRW operations was small enough to ensure that the previous LRW was committed or aborted before a new LRW was started, but any implementation of this protocol would have to address the possibility of such an occurrence. By contrast, if the LRW operation is aborted, a new LRW may be initiated immediately.

If the LRW fails due to a neighbor q_i, then since at least one attempt has been made by the initiator p to cancel the operation, we know that one of the following is true: (1) q_i did not receive the initiation message; (2) q_i received the initiation message, but not the abort message; or (3) p did not receive q_i's response to the abort message. In cases (1) and (3) the LRW is not committed, and thus consistency is maintained. In case (2) however, the LRW is committed, and consistency is lost.
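
The three failure cases can be traced with a small Python sketch of the neighbor role from Figure 4. This is an illustrative model only, not the NesC code: the class `Neighbor`, its method names, and the helper `run_case` are hypothetical, the timer is driven by an explicit call rather than by a clock, and the value 200 passed as the commit time is arbitrary.

```python
class Neighbor:
    """Sketch of the neighbor role of Figure 4; timers are driven by explicit calls."""

    def __init__(self, name):
        self.name = name
        self.engaged_with = None  # None models mode = idle
        self.commit_timer = None  # commit time t taken from the initMsg
        self.committed = []       # initiators whose LRW this node committed

    def receive(self, kind, sender, t=None):
        """Handle one message and return the reply as (kind, destination), or None."""
        if self.engaged_with is None:  # mode = idle
            if kind == "abortMsg":
                return ("abortAck", sender)
            if kind == "initMsg":
                self.commit_timer = t
                self.engaged_with = sender
                return ("acceptMsg", sender)
            return None
        # mode = engaged with one initiator
        if kind == "abortMsg" and sender == self.engaged_with:
            self.commit_timer = None
            self.engaged_with = None
            return ("abortAck", sender)
        if kind == "initMsg":
            if sender == self.engaged_with:
                return ("acceptMsg", sender)  # retransmitted initMsg
            return ("rejectMsg", sender)      # concurrent initiator is rejected
        return None

    def commit_timer_expired(self):
        """T_Commit expiry: an engaged neighbor commits and returns to idle."""
        if self.engaged_with is not None:
            self.committed.append(self.engaged_with)
            self.engaged_with = None
            self.commit_timer = None


def run_case(deliver_init, deliver_abort):
    """Return what neighbor q has committed once its commit timer would expire."""
    q = Neighbor("q")
    if deliver_init:
        q.receive("initMsg", "p", t=200)
    if deliver_abort:
        q.receive("abortMsg", "p")
    q.commit_timer_expired()
    return q.committed


case1 = run_case(deliver_init=False, deliver_abort=False)  # (1) initMsg lost
case2 = run_case(deliver_init=True, deliver_abort=False)   # (2) abortMsg lost
case3 = run_case(deliver_init=True, deliver_abort=True)    # (3) abortAck lost on the way back
print(case1, case2, case3)
```

Only the second case leaves a committed operation at q while the initiator reports a failure, which matches the observation that consistency is lost in case (2) alone.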
5.4 Metrics

We evaluated the LRW implementation with respect to time and reliability. For experiments, we considered different platforms, different topologies, different low-level choices for communication, whether operations are run in isolation or under contention, as well as different settings for T_Timeout and T_Commit. The total time allotted to an LRW operation is T_Commit, however under ideal circumstances, all messages are sent, delivered, and acknowledged in a time possibly much less than T_Commit: we refer to the time needed for one LRW operation to complete under these circumstances as the optimistic duration of an LRW operation. Measuring the optimistic duration is interesting: if the gap between the optimistic duration and T_Commit is large, and if close to ideal circumstances are common, then T_Commit may be decreased. A smaller value of T_Commit would allow an application to submit more LRW operations, that is, the throughput of LRW operations could be higher if the operation interval is shorter. However, decreasing T_Timeout and T_Commit may also decrease the frequency of success, because fewer message retries will be attempted and late-arriving messages could be discarded. A less reliable implementation impacts the application, which may or may not retry an unsuccessful LRW operation. In view of the two unsuccessful outcome possibilities for LRW-initiate, canceled and failed, the application should decide whether or not to retry a canceled operation. Reducing the frequency of failed cases can be handled within the LRW implementation.
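
One way an application might act on the canceled/failed distinction is sketched below in Python. The policy is an assumption made for illustration, not something the paper prescribes: a canceled operation is retried up to a fixed number of attempts, while a failed outcome is handed back to the caller untouched because the neighborhood state is uncertain. The names `invoke_with_retry`, `lrw_initiate`, and `scripted` are hypothetical.

```python
from enum import Enum


class Outcome(Enum):
    SUCCESS = 1
    CANCELED = 2
    FAILED = 3


def invoke_with_retry(lrw_initiate, max_attempts=3):
    """Call lrw_initiate() until it returns something other than CANCELED.

    lrw_initiate is a hypothetical zero-argument callable returning an Outcome.
    Returns a pair (final outcome, number of attempts made).
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        outcome = lrw_initiate()
        if outcome is not Outcome.CANCELED:
            # SUCCESS needs no retry; FAILED is left to the application (assumed policy).
            return outcome, attempt
    return Outcome.CANCELED, max_attempts


def scripted(outcomes):
    """Build a stand-in for LRW-initiate that replays a fixed list of outcomes."""
    it = iter(outcomes)
    return lambda: next(it)


first = invoke_with_retry(scripted([Outcome.CANCELED, Outcome.CANCELED, Outcome.SUCCESS]))
second = invoke_with_retry(scripted([Outcome.FAILED, Outcome.SUCCESS]))
print(first, second)
```

Retrying a canceled operation immediately is compatible with the protocol, since an aborted LRW leaves no commit pending at the neighbors; whether it is worthwhile depends on why the cancel occurred.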
[Figure 5: Broadcast vs Unicast with Ack. x-axis: neighborhood size (1 to 11); y-axis: milliseconds; series: Broadcast, Unicast.]

To measure the implementation under contention, we used the synchronized clock service [19] to arrange that a set of nodes simultaneously invoke LRW-initiate. Within an experiment, a set of LRW operations started simultaneously is called a series; the duration of that series is defined to be the maximum optimistic duration of any LRW operation in the set. We say that a series failed if at least one operation in it failed, and that it succeeded otherwise. We define the reliability of a set of series as the percentage of successful operations. Our experiments show factors that influence a series duration and reliability.

5.5 Experiments

Communication Primitive. Each initMsg from an initiator should be acknowledged, either with an acceptMsg or a rejectMsg, by each neighbor. The CC2420 radio chip offers a hardware-level acknowledgment feature, which enables the receiver of a unicast message to send an ack frame immediately, without the usual MAC delay (to avoid collision). Though this feature does not presently provide a payload area within the acknowledgment frame, and hence is not sufficient for LRW purposes, we nonetheless tested how well using unicast (with the hardware acknowledgments) compared to using neighborhood broadcast. Using unicast requires more transmission operations by the initiator (one per neighbor), but also saves time communicating acknowledgments. Figure 5 displays results of an experiment conducted on TelosB motes, where each data point in the graph is the mean of approximately 250 successful operations. Due to a generous timeout, no operations were aborted or failed.
In this experiment, there is only one initiator, and the number of neighbors varied, shown on the x-axis; the optimistic duration is shown on the y-axis. Because this experiment showed the superiority of using broadcast with denser networks (and the current unicast with hardware ack is deficient for our purposes), we used the neighborhood broadcast primitive in all subsequent experiments. Figure 5 also suggests a starting point for testing different values of T_Timeout at different neighborhood sizes. For example, given an initiator with six neighbors, 50ms might be a starting point for an experiment testing reliability.

Reliability and Timeout. Recall from Section 5.4 that the duration of the T_Commit timer is a significant factor in the throughput of the protocol. If the protocol is expected to rapidly execute several LRW operations, there is a significant incentive to keep the T_Commit timer as short as possible. This however carries with it another set of challenges: note that the execution of the LRW operation must be contained entirely within the timespan of T_Commit. Furthermore, observe from Figure 3 that T_Timeout performs a slightly different function depending on the current mode of the initiator p: if it is in active mode the timer is used to enter the initiator into abort mode, and if it is in abort mode, the timer signals when the operation is to be declared as failed. Since the T_Commit timer must be of sufficient duration to allow for both of these eventualities, it follows that the length of T_Commit must be at least twice that of T_Timeout.

Timeout    Reliability
50ms       46.64%
75ms       96.87%
100ms      100%

Table 2: Reliability without contention.
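
The constraint that T_Commit be at least twice T_Timeout can be turned into a small worked calculation. The Python sketch below is a back-of-the-envelope illustration under two stated assumptions that are not claims of the paper: T_Commit is set to exactly its minimum, and an initiator issues operations back to back, starting a new one only after T_Commit expires. The function names are hypothetical; the timeout values are those of Table 2.

```python
def min_commit_ms(timeout_ms):
    """Smallest T_Commit (ms) that covers both the active and abort phases."""
    return 2 * timeout_ms


def max_committed_ops_per_second(commit_ms):
    """Upper bound on committed operations per initiator per second (assumed back-to-back)."""
    return 1000.0 / commit_ms


# Timeout values tested in Table 2 (milliseconds).
for timeout_ms in (50, 75, 100):
    commit_ms = min_commit_ms(timeout_ms)
    rate = max_committed_ops_per_second(commit_ms)
    print(f"T_Timeout={timeout_ms}ms  T_Commit>={commit_ms}ms  at most {rate:.2f} ops/s")
```

The bound ignores canceled operations, which may be followed immediately by a new LRW, so it is meant only to show how quickly the available operation rate drops as the timeout grows.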
Due to the above reasoning, many of our experiments were focused on studying the effect that reducing the duration of T_Timeout would have on the duration and reliability of both operations and series. We performed two sets of trials. The first set was executed in a simple network containing one initiator with a static neighborhood of size six. In these experiments we used TelosB motes.

[Figure 6: Average LRW duration with 95% confidence interval for varying timeout. x-axis: timeout in milliseconds (50, 75, 100); y-axis: milliseconds.]

The second set introduced contention in the form of several initiators being present in the network at the same time. We also employed preexisting clock synchronization and neighborhood services. Note however that, as was mentioned in Section 3, even for a static WSN the neighborhood of a single mote is often dynamic. The network in this case consisted of 31 MicaZ motes, labeled as v_1, v_2, ..., v_31. A mote v_i would be considered an initiator if and only if i mod 6 = 1 (thus the motes v_1, v_7, v_13, v_19, v_25 and v_31 were initiators). In order to control the topology of the network, we limited the potential neighborhood of an initiator v_i such that N(v_i) ⊂ {v_{i-3}, v_{i-2}, v_{i-1}, v_{i+1}, v_{i+2}, v_{i+3}}. Observe that because of this, any two initiators v_i and v_{i+6} will have overlapping potential neighborhoods (due to using a dynamic neighborhood service, the actual neighborhood varies).

[Figure 7: Duration distribution for varying timeout. x-axis: time interval in milliseconds (20-40 through 140-160); y-axis: percentage of LRW operations; series: 50ms, 75ms, 100ms.]

Our initial experiments were intended to evaluate the optimistic duration of the protocol.
The implementation we used for these trials did not make use of clock synchronization or neighborhood services. Each trial consisted of approximately 250 LRW operations and the network contained only a single initiator with a neighborhood of size six.

As was previously mentioned, the duration of T_Timeout is a major factor in the throughput. Thus we ran several experiments where we varied T_Timeout, and in each case noted not only the duration of each operation, but also the overall reliability. As was suggested at the start of this section, we began with T_Timeout = 50ms, and incremented this in steps of 25ms until we achieved 100% reliability. Table 2 shows the reliability of each trial, and Figure 6 shows the average duration of the LRW operations with the 95% confidence interval.

[Figure 8: Distribution of operation duration. x-axis: time intervals in milliseconds (0-20 through 300-320); y-axis: percentage of LRW operations; series: 100ms, 150ms, 250ms, 350ms.]

As seen in Table 2, with a 50ms timeout value the reliability was less than 50%, but it increased to approximately 97% when we allowed a 75ms timeout, and with a 100ms timeout the reliability was 100%. We also see from Figure 6 that the average duration of the LRW operations increases as the timeout is reduced. This may at first seem surprising, until we consider the duration distribution shown in Figure 7. This figure shows the percentage of LRW operations that ended within a given time interval, as seen on the x-axis.

There are a few important observations we can make when we cross reference Figure 7 with Table 2. First, note that an approximately equal percentage of the LRW operations for each timeout value ended within the 20-40ms interval.
Obviously the timeout had no effect on these, as one would expect. The next large grouping occurs at the 60-80ms time interval, where we find most of the operations from the 100ms timeout trial, and many from the 75ms trial. However, the 50ms trial is virtually absent from this interval, while it is instead almost the sole occupant in the 100-120ms interval. Since this interval includes the double of the timeout value, it is natural to hypothesize that for the majority of the LRW operations from the 50ms trial that ended within this interval, T_Timeout expired twice. (To confirm this hypothesis, we investigated the behavior of the cycle of broadcast (or rebroadcast triggered by T_Response expiring) and acknowledgments, detailed in the next paragraph.) A similar effect is seen for the 75ms trial at the 140-160ms interval. The only difference is that in this case we note that while approximately 10% of the LRW operations ended within this interval, the reliability data implies only 3% of the operations failed. On closer inspection of the data we observed that most of the operations that ended within this interval were canceled, not failed. In either of the two trials, we see that reducing the timeout value had a negative effect not only on the reliability, but also on the duration of the LRW operations.

[Figure 9: Average LRW operation duration with 95% confidence interval. x-axis: neighbors (2 to 6); y-axis: milliseconds.]

In an experiment using seven TelosB motes, each mote acted as initiator approximately 1,300 times, however the experiment tested only the first part of the protocol of Figure 3: effectively T_Timeout = ∞ for this experiment. Each initMsg broadcast included the list of neighbors who had not yet responded to the LRW operation. The initiator's neighborhood was fixed to be the other six motes.
Thus, the initial broadcast of an initMsg contained an invitee list of length six; rebroadcasts contained smaller invitee lists. An operation terminated at the instant the initiator had received acknowledgments from all neighbors. The experiment allocated approximately four seconds to each LRW operation, to ensure that no contention between operations could occur. In a total of 9,258 operations, 4,228 of them (45.6%) were “lucky”, that is, acknowledgments from all six neighbors occurred immediately following the initial broadcast, so that no rebroadcast was necessary. The distribution of operation times for these cases is shown in Figure 10, indicated by invited = 6. Some 3,288 of the operations included a rebroadcast containing a singleton invitee list, shown as invited = 1 in the figure. The mean (standard deviation) operation duration in milliseconds for invited = 6 was 26.5 (3.49), while for invited = 1 the results are 19.6 (3.85). The curves in Figure 10 are Gaussian distributions fitted to the mean and standard deviations (we also obtained data for invited = k, 1 < k < 6, which follow similar patterns). We also measured the total duration (from start to termination) of each operation. This is shown in Figure 11, clearly reflecting T_Response expiration and rebroadcast. The mean number of (re-)broadcasts per operation was 1.55, and the bimodal distribution in the figure explains this (actually, the distribution has three modes, however the third does not contribute significantly).

These experiments with T_Timeout = ∞ explain the results of Figure 7 and Table 2.
One could even use distributions suggested by the fitted curves to build an analytical model, however this model would be impractical for situations where initiators may contend with network traffic and for nonoptimistic cases where an LRW operation should be canceled.

[Figure 10: Timings by Invitee List Size. x-axis: milliseconds (10 to 50); y-axis: frequency; series: invited=1, invited=6.]

[Figure 11: Duration for T_Timeout = ∞. x-axis: milliseconds (20 to 120); y-axis: frequency; N=9258.]

Our second set of experiments introduced contention by having six initiators present in the graph. Recall that we enforced a topology on the network such that every initiator shares at least one potential neighbor with another initiator. In these experiments, each trial consisted of approximately 1000 series.

Due to the dynamic neighborhood, the number of motes attached to a single initiator varied over the course of one trial. Figure 9 shows the average LRW operation duration with a 95% confidence interval, depending on the size of the neighborhood. As we see, the average duration has increased, especially for larger neighborhoods, compared to Figure 5.

[Figure 12: Distribution of series duration. x-axis: time intervals in milliseconds (0-20 through 300-320); y-axis: percentage of LRW series; series: 100ms, 150ms, 250ms, 350ms.]

Previously we examined the effect that reducing the timeout value would have on LRW duration and reliability in a non-contention network.
We repeated these experiments with contention, starting with a timeout of 100ms and increasing it by 50ms in each trial. Table 3 shows the operation and series reliability of each trial, and Figure 8 shows the distribution of the duration for LRW operations. Figure 8 omits the data from the 200 and 300ms trials, as these were similar to the 250 and 350ms trials.

As seen in Table 3, the operation reliability is very poor in the 100ms trial. But it increases rapidly, becoming approximately 99% in the 200ms trial and 100% at 350ms. If we cross reference this with Figure 8 we see that in the 100ms trial almost every LRW operation had completed within 220ms, while in the 350ms trial every operation had completed within 320ms.

Due to the inclusion of contention, clock synchronization, and dynamic neighborhoods, the behavior of the protocol is significantly more complex than for previous trials. However we still notice the same basic trend that was evident in Figure 7: the percentage of operations that ended within 160ms is largely similar for each trial, while at later intervals we notice first a drop, and then an increase in the number of operations from the 100ms trial.

Timeout    Operation Reliability    Series Reliability
100ms      87.91%                   55.11%
150ms      96.65%                   83.66%
200ms      99.37%                   96.64%
250ms      99.61%                   97.85%
300ms      99.85%                   98.90%
350ms      100.00%                  100.00%

Table 3: Reliability with contention

This becomes much more pronounced when we consider the series reliability in the third column in Table 3 and the duration distribution in Figure 12. Here we clearly see that a large number of the series in the 100ms trial ended between 160 and 220ms.

Recall that we observed from Figure 7 that reducing the timeout value has the effect of increasing the duration of the LRW operations.
Figure 13, which shows the average series duration and 95% confidence interval for each of the above trials, illustrates the same tendency. However, note that contrary to what we saw in Figure 7, where any reduction of the timeout value below 100ms resulted in an increased duration, we see in this figure that the series in the 100ms trial has approximately the same average duration as in the 350ms trial.

[Figure 13: Average series duration with 95% confidence interval for varying timeout. x-axis: timeout in milliseconds (100 to 350); y-axis: milliseconds.]

6 Conclusions

This paper's proposal and investigation of the LRW protocol is an attempt to answer the question: what basic, single communication-round primitive maximizes the work accomplished for a local neighborhood operation? In some sense, an LRW operation is a communication rendezvous, where members of a local neighborhood change states jointly. Not surprisingly, such an operation is more powerful than simpler primitives, such as neighborhood queries or unconditional broadcast-write commands. However, the advantages of LRW presume an optimistic, or speculative, programming approach: if contention is high or the semantics of the application do not favor success, then LRW could be less attractive.

Without notions of stable storage and journaling, which support ACID properties of database managers, consistency of LRW operations cannot be guaranteed; however, tuning the T_Timeout parameter appropriately does increase the probability of non-failed operations. The importance of consistency for abstractions like LRW, write-all, or local transactions is diminished for most wireless sensor network applications: lowering component cost and conserving power have high priority.
Applications can often be designed to tolerate some small probability of data inconsistency, using standard techniques of replication, filtering, outlier removal, and model-generated prediction.

Our experiments show that tuning T_Timeout impacts both average duration and reliability. Of the two most significant timers used, T_Timeout and T_Response, we limited ourselves to studying the former; T_Response was set to 40 milliseconds in all experiments. A possible area of future research would be to examine the length of T_Response, and its effect on the protocol. There are many factors that would need to be considered in this case, such as size of neighborhood, size of the network, radio activity, and so on. As with T_Timeout, we expect that there will be trade-offs between throughput and the number of unnecessary retransmissions.

One aspect of LRW we did not explore in this paper is the “convenience” of LRW for common applications of sensor network programming. The introduction explains how LRW can be used for local data collection; artificial examples of consensus or leader election can easily be shown, however practical case studies would be helpful to evaluate LRW as a programming primitive.

References

[1] MP Herlihy. Wait-free synchronization. ACM TOPLAS 13(1):124-149, 1991.
[2] M Demirbas. A transactional framework for programming wireless sensor/actor networks. In The 11th IEEE International Workshop on Future Trends of Distributed Computing Systems, pp. 123-129, 2007.
[3] M Ali, U Saif, A Dunkels, T Voigt, K Römer, K Langendoen, J Polastre, ZA Uzmi. Medium access control issues in sensor networks. Computer Communication Review 36(2):33-36, 2006.
[4] SS Kulkarni, M Arumugam. Transformations for write-all-with-collision model. Computer Communications 29(2):183-199, 2006.
[5] T Herman, S Tixeuil. A distributed TDMA slot assignment algorithm for wireless sensor networks. Algorithmic Aspects of Wireless Sensor Networks: First International Workshop, ALGOSENSORS 2004, pp. 45-58.
[6] T Herman. Models of self-stabilization and sensor networks. In Proceedings of the Fifth International Workshop on Distributed Computing IWDC03, Springer LNCS 2918, pp. 205-215, 2003.
[7] M Mizuno, M Nesterenko. A transformation of self-stabilizing serial model programs for asynchronous parallel computing environments. Information Processing Letters 66(6):285-290, 1998.
[8] J Gehrke, S Madden. Query processing in sensor networks. IEEE Pervasive Computing 3(1):46-55, 2004.
[9] S Nath, PB Gibbons, S Seshan, Z Anderson. Synopsis diffusion for robust aggregation in sensor networks. ACM Transactions on Sensor Networks 4(2), 2008.
[10] K Akkaya, M Younis. A survey on routing protocols for wireless sensor networks. Ad Hoc Networks 3(3):325-349, (May) 2008.
[11] R Newton, G Morrisett, M Welsh. The regiment macroprogramming system. In Proceedings of the Sixth International Conference on Information Processing in Sensor Networks (IPSN'07), pp. 489-498, 2007.
[12] R Sugihara, RK Gupta. Programming models for sensor networks: a survey. ACM Transactions on Sensor Networks 4(2), (March) 2008.
[13] Y Kotidis. Snapshot queries: towards data-centric sensor networks. In 21st International Conference on Data Engineering (ICDE'05), pp. 131-142, 2005.
[14] S Upadhyayula, SKS Gupta. Spanning tree based algorithms for low latency and energy efficient data aggregation enhanced convergecast (DAC) in wireless sensor networks. Ad Hoc Networks 5(5):626-648, (July) 2007.
[15] T He, P Vicaire, T Yan, L Luo, L Gu, G Zhou, R Stoleru, Q Cao, JA Stankovic, T Abdelzaher. Achieving real-time target tracking using wireless sensor networks. In 12th IEEE Real-Time and Embedded Technology and Applications Symposium (RTAS'06), pp. 37-48, 2006.
[16] H Attiya, R Rappoport. The level of handshake required for establishing a connection. In Distributed Algorithms, 8th International Workshop (WDAG'94), Springer-Verlag LNCS 857, pp. 179-193, 1994.
[17] J Gray. Notes on data base operating systems. Operating Systems, An Advanced Course, Springer-Verlag LNCS 60, pp. 393-481, 1978.
[18] J Polastre, R Szewczyk, D Culler. Telos: enabling ultra-low power wireless research. In Proceedings of the 4th International Symposium on Information Processing in Sensor Networks (IPSN'05), Poster Session, SPOTS track, Article 48, 2005.
[19] TinyOS 2.0 contributed code: rootless timesync.
