Local Read-Write Operations in Sensor Networks

Ted Herman
University of Iowa
herman@cs.uiowa.edu

Morten Mjelde
University of Bergen
mortenm@ii.uib.no

November 2, 2018

Abstract

Designing protocols and formulating convenient programming units of abstraction for sensor networks is challenging due to communication errors and platform constraints. This paper investigates properties and implementation reliability for a local read-write abstraction. Local read-write is inspired by the class of read-modify-write operations defined for shared-memory multiprocessor architectures. The class of read-modify-write operations is important in solving consensus and related synchronization problems for concurrency control. Local read-write is shown to be an atomic abstraction for synchronizing neighborhood states in sensor networks. The paper compares local read-write to similar lightweight operations in wireless sensor networks, such as read-all, write-all, and a transaction-based abstraction: for some optimistic scenarios, local read-write is a more efficient neighborhood operation. A partial implementation is described, which shows that three outcomes characterize operation response: success, failure, and cancel. A failure response indicates possible inconsistency for the operation result, which is the result of a timeout event at the operation's initiator. The paper presents experimental results on operation performance with different timeout values and situations of no contention, with some tests also on various neighborhood sizes.

1 Introduction

Wireless Sensor Network (WSN) platforms add a twist to traditional programming assumptions. Many resources can be quite constrained, including bandwidth, program memory, and platform computing power.
Not surprisingly, research on sensor network programming to date has sought abstractions and tools that can satisfy the resource constraints, yet enable productivity in software development cycles. Typically, these abstractions are not entirely new ideas, but adaptations of (perhaps less orthodox) techniques from areas of signal processing, database, and parallel or distributed computing. This paper follows the same research direction, exploring the adaptation of a read-modify-write abstraction as a unit of sensor network programs; we propose an operation called local read-write (LRW) for neighborhood communication in a sensor network.

∗ Research supported in part by NSF award 0519907.

The compare-and-swap (c&s) instruction, available on many multiprocessor architectures, is an example of read-modify-write. In one atomic step, a processor executing c&s conditionally swaps the content of a memory word with the content of a register; the condition for this swap is that the content of the memory word have a prescribed value given as a field of the instruction or given in another register. This idea, that a single instruction specifies a condition, a write value, and expects a response value, can be generalized and translated to the setting of nodes and packet-based communication. A simple instance of an LRW operation is illustrated in Figure 1. Sensor node x initiates the operation by transmitting a packet to neighboring node y. Node y inspects the packet, and possibly schedules a tentative write to some local variables; then y transmits a response packet to x. Upon receipt of y's response, node x will either decide to confirm the operation or back out and void the operation. Voiding the operation will result in y discarding its scheduled, tentative write.
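As a rough illustration of the exchange just described, the following Python sketch models compare-and-swap on a single memory word and then its packet-based analog between an initiator x and one neighbor y. The names used here (compare_and_swap, Neighbor, on_request, confirm, void, lrw_single) are hypothetical and are not taken from the paper's implementation; message transport is replaced by direct method calls, so loss and concurrency are not modeled.

```python
def compare_and_swap(memory, addr, expected, new):
    """Write `new` only if memory[addr] == expected; return the old value."""
    old = memory[addr]
    if old == expected:
        memory[addr] = new
    return old


class Neighbor:
    """Node y: holds local variables and at most one tentative write."""

    def __init__(self, variables):
        self.variables = dict(variables)
        self.tentative = None  # (name, value) scheduled but not yet written

    def on_request(self, name, expected, new):
        """Inspect x's packet; schedule a tentative write if the condition holds.

        Returns the response packet content: the current value of the variable.
        """
        current = self.variables[name]
        if current == expected:
            self.tentative = (name, new)
        return current

    def confirm(self):
        """x confirmed the operation: apply the scheduled write."""
        if self.tentative is not None:
            name, value = self.tentative
            self.variables[name] = value
            self.tentative = None

    def void(self):
        """x backed out: discard the scheduled write."""
        self.tentative = None


def lrw_single(y, name, expected, new):
    """Initiator x: one request, one response, then confirm or void."""
    response = y.on_request(name, expected, new)
    if response == expected:
        y.confirm()
        return True
    y.void()
    return False
```

For example, with y = Neighbor({"leader": None}), a first call lrw_single(y, "leader", None, "x") confirms the write, while a later call with the same condition is voided because the variable no longer holds the prescribed value.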
[Figure 1: LRW with one neighbor. Nodes x, y, and z; the shaded area is the neighborhood of x.]

Node z shown in Figure 1 lies outside x's neighborhood; there is the possibility that z could initiate an LRW operation concurrently with x, so that y first receives a packet from x, then a packet from z, and these requests conflict because they write to the same location. For these LRW operations to be atomic, the net effect of running both should be logically serial, that is, as though one operation completes before the other begins. The design choice for this paper is that y should reject z's request while x's operation is pending, that is, y should immediately send a negative response to z.

The single-node neighborhood of x, in Figure 1, can be generalized to larger neighborhoods, illustrated by Figure 2. Node x's LRW operates on a neighborhood of nodes y_1 through y_k. Although the figure suggests k messages would be transmitted by x, a single local broadcast suffices for many radio platforms. A typical WSN application for LRW is data aggregation. Suppose each y_i has recorded some sensor value d_i, and node x's task is to compute some function of {d_i | 1 ≤ i ≤ k} and save the result to its flash memory. After x has completed this aggregation, each y_i can discard its d_i value and recycle local memory. Note that if y_i were to asynchronously send d_i to x, it could be that x does not have local buffers available for this data; putting x in control is a way to manage resources safely. In one LRW operation, x can collect all d_i values and also schedule the d_i variables at each y_i for recycling. However, if x does not collect enough d_i values, say fewer than k/2 neighbors respond to the request initiated within the LRW operation, then x could cancel the operation and retry it later.
Classical applications of read-modify-write, such as consensus or leader election (applied to a WSN neighborhood), can easily be expressed as an LRW operation.

[Figure 2: LRW with k neighbors. Node x and neighbors y_1, y_2, ..., y_k.]

Contributions and Organization. Section 2 summarizes related work. Section 3 specifies LRW properties and exposes some design choices for implementation. Section 4 presents a theoretical result showing how a model based on this abstraction differs from other choices. Section 5 contains implementation results, which feature experiments to show design tradeoffs. Discussion of conclusions is in Section 6.

2 Motivation and Related Work

Several tracks of WSN research draw analogies to database and parallel computing methods. Early proposals for querying sensor networks motivated protocols for aggregation and routing to support query language operations [10, 8, 9]. The idea of programming a WSN as a whole (called macroprogramming) sometimes takes the position that programming sensors resembles the ensemble programming of parallel computing machines, using SIMD or MIMD instruction sequences [11, 12]. Inspired by distributed computing research, there are proposals to adapt such paradigms as snapshots, leader election, and wave computations in WSN systems [13, 15, 14]. This paper draws analogy to instructions for atomic communication in multiprocessor, shared memory systems.

In contrast to high-level concepts for WSN software, there is also significant research adapting the techniques of ad hoc networks, peer-to-peer, and even internet protocols to the needs of WSN applications and the limitations of WSN platforms. Two priorities for such research are reliable communication and power conservation. A question emerging from this research is: what kind of communication abstractions will be convenient for programming (i.e., the interfaces are simple and hide low-level complexity and problems of heterogeneous platforms) while enabling efficient use of resources? This question has predominantly been investigated with respect to non-local communication, for instance, multi-hop protocols, routing structures, and middleware services for publish-subscribe abstractions. Our work looks at local communication, where “local” refers to single-hop communication, also called neighborhood communication.

On one hand, the literature of MAC protocols, specialized to WSN platforms, extensively explores the concerns of local communication [3]. Platform hardware may directly support unicast and neighborhood broadcast operations, and some radio chips provide low-level support for unicast packet acknowledgment in one programmable operation. On the other hand, there are several papers [6, 4, 2] suggesting higher-level programming units for local communication. A natural abstraction for local communication is atomic read-all, which is the operation of reading the local states of all nodes in a neighborhood. Using atomic read-all operations, programs elegantly express calculation of neighborhood statistics; Section 4 elaborates on a variant of read-all with stronger atomicity properties. Unfortunately, the read-all abstraction does not efficiently map to WSN platform abilities. An alternative abstraction is atomic write-all, which may be implemented by a single local message broadcast. A write-all operation writes (some part of) the states of every other node in a neighborhood. This operation is not so natural for programming as read-all; however, program transforms have been proposed that convert many programs using read-all operations into ones that employ only write-all operations [6, 5, 4].
Reliability is a concern with naïve implementation of write-all consisting of a single message broadcast; the broadcast can lose messages to a subset of neighbors due to noise or collision with other message traffic, say originating from other neighborhoods in the WSN (in [4], the basic operation is called “write-all with collision”).

The concerns of reliability and atomicity are fundamental to database transaction theory, where ACID properties define correct transaction processing. The paper [2] suggests a local WSN operation motivated by database transactions: one atomic operation reads from a subset of neighbors and writes to a subset of neighbors. To improve reliability, the local transaction implementation consists of a sequence of messages: read-request, response, then write-commit or abort-transaction. Transactions may be aborted because of interference with contending transactions, and the aborted transactions need to be retried. The reliability of such local transactions is imperfect: final commit messages can be lost and the transaction initiator can crash. Standard techniques that add reliability to database transaction processing, such as stable storage and transaction journaling, are unrealistic for many WSN platforms.

To give some idea of the resources needed for the operations discussed above, Table 1 summarizes optimistic, best-case resource measures for a neighborhood of n nodes. The two measures are number of messages (including both unicast and local broadcast messages) and number of rounds, where a round is a time interval of sufficient length to allow all nodes in a neighborhood to send a message. The latter measure would allow for queuing, processing, and transmission delays as well as extra delays due to the medium access control layer for collision avoidance.
The first row of the table reflects that a read-all operation is initiated by one node, followed by each of its n − 1 neighbors sending a response. A write-all operation potentially has the least resource cost of any operation, consisting of just one broadcast message; however, to provide for reliability, an implementation of write-all may return acknowledgments from each recipient of the broadcast back to the operation's initiator. For this reason, the number of message primitives is reported as “1 or n” in Table 1.

    Operation    Messages     Rounds
    read-all     n            1
    write-all    1 or n       1
    transact     2 + r + w    2
    LRW          n            1

    Table 1: Operation comparison.

A local transaction, as defined in [2] and called “transact” in Table 1, has a read set of r nodes and a write set of w nodes. The transaction is initiated with a broadcast, followed by a response from each node in the read set. Then the transaction initiator transmits a broadcast to the write set containing values to be written, and each member of the write set unicasts an acknowledgment to the initiator; the acknowledgments are needed so that the initiator can decide whether to allow the transaction to commit or to broadcast a cancel message. The read and write sets may overlap, with the worst case being r = w = n − 1 (which would put the message cost of transact at 2n). The LRW operation begins with a broadcast, followed by each of the n − 1 other nodes responding. Since any value to be written is contained in the initial broadcast and responses are collected by the LRW initiator, no additional round is needed to complete the operation.

The measures of Table 1 are optimistic numbers in two senses.
First, the measures are for transactions that succeed, that is, they do not fail due to conflicts with concurrent transactions or negative responses (an LRW operation would need to include a cancellation message if any neighbor response indicated some unanticipated value). Secondly, the table does not include commit messages for transaction or LRW operations. This is because sensor node timing and clocking mechanisms enable commit to be time-triggered, that is, each node commits a transaction after sufficient time has passed without receiving a cancellation message.

3 LRW Design Issues

Local Read-Write (LRW) is an operation defined on variables of WSN nodes. We assume that each node has the same set of variables∗ that can be read and written by an LRW operation. For variable v and node q, let v_q refer to q's instance of v. Each invocation of LRW specifies: (i) a function f defined on a subset of node variables, (ii) a subset of node variables to be written, and (iii) a boolean function g. Function f can be computed at any node, and either returns a negative response value ⊥ or returns a pair (r, B), where r is a value provided for computing g and B is a list of values to be written to the variables specified in (ii). A nonlossy LRW operation is defined with respect to an initiating node p and p's neighborhood N(p), consisting of three steps: (1) for each node q ∈ N(p), function f is computed; (2) function g is computed on the set of r-values {r_q | q ∈ N(p)}; and (3) if the result of g is true, then B_q is written to the write variables of q, for each node q. A lossy LRW operation would allow, in (1)–(3), proper subsets of N(p) to model the loss of messages. We do not formally specify lossy LRW instances in this paper.
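The three steps of a nonlossy LRW operation can be sketched in Python as below. This is only an illustration under stated assumptions, not the paper's formal specification: ⊥ is modeled as None, B is represented as a dict from variable names to values rather than a list, nodes that answer ⊥ contribute no r-value and receive no write, and the names lrw, f_collect, and make_quorum are hypothetical. The example functions instantiate the data aggregation scenario from the introduction, where each neighbor offers its value d and schedules d for recycling.

```python
BOTTOM = None  # stands in for the negative response value ⊥


def lrw(neighborhood, f, g):
    """Apply one nonlossy LRW operation to `neighborhood`.

    neighborhood: dict mapping node id q -> dict of q's variables
    f: function(variables) -> BOTTOM or (r, B), with B a dict of writes
    g: function(dict q -> r) -> bool
    Returns (committed, r_values).
    """
    # Step 1: compute f at every node q in N(p).
    results = {q: f(vars_q) for q, vars_q in neighborhood.items()}
    # Assumption: nodes answering BOTTOM contribute no r-value.
    r_values = {q: res[0] for q, res in results.items() if res is not BOTTOM}
    # Step 2: compute g on the collected r-values.
    committed = bool(g(r_values))
    # Step 3: if g holds, write B_q at each node q that produced one.
    if committed:
        for q, res in results.items():
            if res is not BOTTOM:
                neighborhood[q].update(res[1])
    return committed, r_values


def f_collect(variables):
    """Aggregation example: report d and schedule it to be cleared."""
    d = variables.get("d")
    if d is None:
        return BOTTOM
    return d, {"d": None}


def make_quorum(k):
    """Build g: true when at least k/2 neighbors supplied a value."""
    return lambda r_values: len(r_values) >= k / 2
```

With three neighbors of which two hold a value, lrw(nodes, f_collect, make_quorum(3)) commits and clears d at the two responding nodes; with only one value present, g is false and no variable is changed.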
Viewed from the application perspective, an LRW operation begins when initiating node p invokes LRW and ends when p receives a response from the LRW. Between the invocation and response, we assume that p does not invoke another LRW instance. Thus the only source of concurrency in the system is contention among LRW operations of different LRW initiators. The behavior of a set of (possibly concurrent) LRW operations can be specified by a sequence, called an LRW history, which contains LRW invocations, results of f and g evaluations, assignments to variables, and responses to the LRW invocations. We omit details of the history formalization, which follow from standard techniques similar to the notation of transaction serializability. A well-formed LRW history is one in which every LRW invocation finds a matching response. A well-formed LRW history determines values for all variables. Implementations of LRW or similar operations result in refined histories, where between invocation and response, lower-level events (transmission, reception, message processing) occur. Analysis of such operation histories, for implementations of operations in Table 1, can verify their atomicity properties.

∗ This assumption is not essential, but simplifies the description.

The framework [2] uses terminology of transactions to describe local operations, including some ACID properties of transactions in databases. Atomicity of a transaction, which is the all-or-none property, is due to two properties of the protocol. First, in the WSN model, ordering transactions can be simple because message propagation latency is negligible.
If nodes p and p′ concurrently initiate a transaction using local broadcast, with x, y ∈ N(p) and x, y ∈ N(p′), then x and y cannot receive broadcasts from p and p′ in different order. Second, all the writes of a transaction are sandboxed and only actually written upon the event of transaction commit. Consistency of transactions is ensured by conflict resolution. If the transactions of p and p′ conflict, say because they write to the same variable in node x, then one of the two transactions will be aborted (and possibly reinitiated later). Properties corresponding to atomicity and consistency can similarly be shown for LRW operations (and proved using LRW histories). Transactions of p and p′ can be concurrent, even with neighbors {x, y} in common, provided that they operate on distinct sets of variables (and more generally, if it can be shown that the transactions have commutative semantics). This observation also holds for LRW operations.

The problematic aspects of ACID properties for WSNs arise from platform limitations and unreliable message transport. The possibility of message loss implies, for example, that a commit message or a cancellation message could be lost. It is well-known that no acknowledgment protocol can guarantee that all neighbors of a transaction initiator will receive a commit or cancellation message, even if it is retransmitted some number of times [17, 16]. However, the probability of a communication loss can be reduced if messages are retransmitted, and retransmission may be a practical strategy to improve reliability for WSNs (in effect, retransmission is an approximation to eventually correct message delivery). A protocol optimization for transaction or LRW operations is to replace a commit or cancellation message with timeout-driven activation.
The design choice of [2] and in this paper is to let commit be timeout-driven: if, after some fixed time period, a node does not receive any cancellation message from the LRW initiator, then variable writes are committed. An alternative design choice would be to let cancellation be the timeout-triggered default; however, this choice would shift the balance of power usage (because messages consume power) to commit, and for most applications and typical WSN workloads, one would expect most LRW operations to be committed.

A limitation of several current WSN message protocols is packet payload size. For the platform used in our experiments, the payload is 28 bytes, which limits how much can be specified in an LRW operation based on a single broadcast. Scaling LRW to larger data amounts would require fragmentation of LRW message fields over multiple broadcasts. Some WSN platforms may not support native local broadcast; there, ordering LRW operations by the instant of reception would not be reliable. However, LRW operations can also be ordered by timestamp, if the WSN has synchronized clocks. With synchronized clocks, LRW operations can be grouped by slotted time intervals. In a slotted time protocol, initiators wait until the beginning of a slot before transmitting an LRW operation message; when a neighbor receives an LRW message, it delays sending a response until the end of the current slot, in order to collect all LRW operations, order them, and sort out conflicts. A reason to consider using unicast, rather than broadcast of the initial LRW message, is to improve scheduling efficiency of responses from neighbors. The CC2420 radio chip has a feature for immediate acknowledgment of unicast messages, and this feature is not available for local broadcast.
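The timeout-driven commit chosen earlier in this section can be pictured from the neighbor's side with a small Python sketch using simulated time. It is a sketch under assumptions, not the authors' NesC code: the class LrwNeighbor and its methods are hypothetical, time is an integer supplied by the caller, a retransmitted request from the same initiator is simply accepted again, and a request from a different initiator is rejected while a write is pending (the design choice stated in the introduction).

```python
class LrwNeighbor:
    """Neighbor-side state for one pending LRW write (simulated time)."""

    def __init__(self, variables):
        self.variables = dict(variables)
        self.pending = None  # (initiator, writes, commit_deadline)

    def on_init(self, initiator, writes, now, t_commit):
        """Handle an initial LRW message; return 'accept' or 'reject'."""
        self.tick(now)
        if self.pending is not None:
            if self.pending[0] == initiator:
                return "accept"  # retransmission of the same request
            return "reject"  # another initiator's operation is pending
        self.pending = (initiator, dict(writes), now + t_commit)
        return "accept"

    def on_cancel(self, initiator, now):
        """Handle a cancellation; return True if a tentative write was dropped."""
        self.tick(now)
        if self.pending is not None and self.pending[0] == initiator:
            self.pending = None
            return True
        return False

    def tick(self, now):
        """Commit the tentative write once its deadline has passed."""
        if self.pending is not None and now >= self.pending[2]:
            self.variables.update(self.pending[1])
            self.pending = None
```

No commit message appears anywhere in the sketch: the write becomes permanent purely because the deadline passes without a cancellation, which is why a lost cancellation message can leave neighbors inconsistent.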
In the discussion above, we have treated N(p) as a constant, supposing the neighborhood of p to be fixed in the WSN. The experience of many researchers is that, even for a static WSN, radio properties are dynamic: the set of stable, bidirectional links defining neighborhoods evolves. Therefore the design of an LRW protocol should plan for dynamic neighborhoods. If an LRW operation fails because the initiator did not collect responses from every neighbor (this would depend on the definition of g), it could be that the neighborhood has changed. In this case, subsequently submitting the LRW operation would use the new neighborhood.

4 LRW Operation Comparison

Table 1 does not compare the expressive, or computing, power of different neighborhood operations. In the table, transact consumes the most resources, but transact is more powerful than any other: in one operation, a function of neighborhood values can be computed and written to several nodes. An LRW operation is strictly less powerful because any value written must be prescribed before the operation is invoked, rather than computed during the operation.

One technique to compare operation power is to examine protocols that use only that operation to solve some classic problem, such as consensus. If one operation type enables consensus to be solved whereas another operation does not, then the former operation is more powerful (with respect to consensus) than the latter. Briefly, a consensus protocol begins with each node having an input value and a decision variable, which can be written at most once. The input is not in any variable, that is, input values cannot directly be viewed by any of the operations of Table 1; an early step in any consensus protocol is to share the input with other nodes.
The initial value of the decision is some constant ω not equal to any node's input. Consensus protocols must satisfy three properties: validity, agreement, and termination. The termination property is that every node eventually writes to its decision variable, regardless of the progress or failure of other nodes; agreement requires that no two nodes write different decision values; validity requires that any decision written be the input of some node. The difficulty of consensus lies in the timing of nodes participating in the protocol. If some node p is very slow to engage in the protocol, then other nodes will need to decide without knowing p's input. Although synchronous timing is implicit in the implementation of operations such as LRW, software at the application layer may be asynchronous, hence the timing of applications using LRW can be unpredictable.

For the following results, we assume communications are nonlossy and do not fail due to contention conflicts. Also, neighborhoods are static and definitions of neighborhood are consistent, that is, if q ∈ N(p) then p ∈ N(q). The following shows that read-all is insufficient to solve the consensus problem.

Lemma 1 Consensus using only read-all operations is impossible.

Proof: The proof repeats standard arguments [1] based on finding a contradiction in a constructed execution. Suppose consensus is possible, and that nodes p and q are neighbors with inputs 0 and 1 respectively. If p (or q) waits long enough to expose its input value, then the other node may take sufficiently many steps so that it is forced, by the termination property, to decide; because the other's input is unknown, it will decide in favor of its own input.
Thus the initial state for the consensus protocol is multivalent, that is, there exist two possible executions leading to different decisions. A state is univalent if all possible executions following that state can only lead to one decision (in effect, the decision has already been chosen, even if not presently in a decision variable). Executions consist of an interleaving of atomic steps from some node in the neighborhood, where a step is either a read-all operation, some local calculation, or writing to some variable(s). If p writes to a variable v, and the next step in the execution is a read-all for v by q, then q obtains the value p wrote to v.

Let σ be the last multivalent state in an execution (the termination property implies σ exists). There are at least two possible continuations from σ leading to different decisions, by definition. Such continuations necessarily begin with steps of different nodes. We consider different cases for the first step by p and q with respect to continuations. Note that if p steps first after σ, then the valency is different than would be if q steps first (otherwise σ is not multivalent). If the first step by p is a local calculation or a read-all operation, then the occurrence of that step is undetectable by q. This contradicts the assumption that p's first step after σ results in a univalent state. If the first step by p writes to a variable, then it cannot be that q's first step writes to a variable, because these two steps commute, which would contradict the differing valency of these two steps. Therefore, the essential case to examine is where p's first step writes to a variable and q's first step is a read-all operation.
If p steps first and then sleeps while q runs long enough to decide, the valency will follow from p's write of a variable; the same valency is obtained if q makes no steps while p runs long enough to decide. However, p cannot detect whether or not q has performed a read-all, hence if q steps first, then sleeps, with p running long enough to decide, p must decide as if q took no steps, which contradicts the supposed valency of q's read-all operation. Thus in any case, the transition from multivalency to univalency can be prevented in some possible execution. ❑

Although read-all does not provide a solution to consensus, an enhanced form of read-all, called read-all-write, does allow for a solution. In a read-all-write operation, a node atomically reads values from all neighbors and writes some function of the result to a variable. If p and q invoke read-all-write at nearly the same instant, then atomicity guarantees that one operation will precede the other. Thereby, if p's read-all-write occurs first, then the variable written by p will be visible in q's read-all-write. Thus q can detect that p's operation preceded q's, and the decision value for both nodes can be the input of p. A read-all operation is “lighter weight” compared to a read-all-write operation, which must constrain concurrency to guarantee atomicity. We are not aware of WSN research on neighborhood read-all-write. Presumably the transactional methods, say of [7] or [2], could be used to implement read-all-write.

Unlike read-all, the write-all operation can be used to solve consensus in particular cases. The following first identifies a negative case, where write-all is insufficient; afterward we discuss a case where a consensus protocol uses write-all.

Lemma 2 Consensus using only single-variable write-all operations is impossible.
Proof: The proof is similar to that for Lemma 1. Here, each node obtains values of other nodes only by locally reading variables that have been assigned by a write-all operation. Let σ be the last multivalent state in an execution, and suppose the next steps of p and q are write-all operations. If these steps write to different variables, then the steps commute and a valency contradiction is obtained. When N(p) = {q} the steps of p and q write to different variables because write-all operations assign to variables of other nodes. One case where two steps write to the same variable is that p's first step writes to its variable v_p and q's first step is a write-all to v_p. Suppose the valency of p taking the first step is 1, and the valency of q taking the first step is 0. If q takes the first step and p sleeps long enough for q to decide, the decision is 0. If p takes the first step and then sleeps while q runs long enough to decide, the decision will still be 0, because q overwrote what p had written, and thus q is not influenced by p's initial step. This contradicts the assumption that p's first step results in a univalent state with valency 1. ❑

The write-all operation of Table 1 does not include any local variable as a write target (that is, p ∉ N(p)). An extension to write-all would be to include v_p in p's write-all of variable v. This extension alone turns out not to help in solving consensus; however, the inclusion of v_p together with allowing multiple variables to be written does enable a consensus protocol. Suppose N(p) = {q} and three variables u, v, w are initially ω. Let p's operation write its input value to u and to v; and let q's operation write its input value to v and to w. Whichever node has the first write-all operation forces the decision value to be its input.
In an execution with differing inputs such that p invokes the first write-all and q sleeps, p will detect that it has the first operation, because w = ω; and if q does not sleep, p may detect that w ≠ ω, however then v does not contain p's input, and so p detects that its write-all occurred first (q will get the decision from v in that case).

It is not difficult to show that LRW or transact suffice to solve consensus, because it is simple for a node to record a decision value that is not overwritten by any subsequent operation. The LRW operation is a lighter weight primitive than write-all, which deals with more variables than LRW when used for consensus.

When operations have equivalent solvability power, they may also be compared by the time required or the number of operations used in a solution. Intuitively an LRW operation does more work per operation than either read-all or write-all operations, and all of these have 1-round (optimistic) time complexity.

5 Implementation

The previous sections of the paper motivate LRW operations and sketch, at a high level, how such an operation could be implemented in a WSN. To confirm the feasibility of LRW on a current sensor network platform, this section reports results from simple experiments on some small mote networks. Section 3 exposes general design issues for an implementation, whereas the experimental implementation must contend with low-level design considerations. For example, Table 1 reports optimistic message counts for LRW, but our experiments consider failures in message delivery. Operation duration and throughput are affected by thresholds for message transit time and expected number of retransmissions; such factors are determined from experiments.
For instance, the duration of an LRW operation can be reduced by setting smaller time limits and lowering the number of retries for lost messages; this will allow more LRW operations to be executed, at the cost of reliability. Figures presented below show effects of such tuning decisions, for the case of an LRW operation run in isolation and also for the case of an LRW contending with other operations.

5.1 Experimental Platform

Our implementation and experiments were written in the NesC language for the TinyOS (version 2) operating system, running on TelosB [18] and MicaZ motes (both platforms use the same radio chip, CC2420). Rather than a full implementation of LRW, we used a simpler protocol that ignored the case of application-triggered operation failures (such as one neighbor having a value that cancels the LRW operation); thus all our experiments consider only cases of successful LRW operations, except where an operation is rejected due to concurrency.
LRW-initiate(p):
    mode ← active, S ← { }
    start T_Timeout, T_Commit
    broadcast(initMsg_p(T_Commit))
    start T_Response
    while (mode = active)
        receive(rejectMsg_q):
            mode ← cancel
        receive(m = acceptMsg_q):
            S ← S ∪ { sender(m) }
            if |S| = |N(p)|
                stop Timers; return success
        T_Response expires:
            broadcast(initMsg_p(T_Commit))
            restart T_Response
        T_Timeout expires:
            mode ← abort, S ← { }
    broadcast(abortMsg_p)
    start T_Response, T_Timeout
    while (mode = abort)
        receive(m = abortAck_q):
            S ← S ∪ { m }
            if |S| = |N(p)|
                cancel Timers, return canceled
        T_Response expires:
            broadcast(abortMsg_p)
            start T_Response
        T_Timeout expires:
            stop Timers; return failed

Figure 3: LRW for Initiator p

The implementation is built on several services: a MAC-layer radio stack transmits and delivers packets, also inserting random delays (typically between 3ms and 12ms) to avoid collision with other transmissions; a neighborhood service determines the effective set of a node's neighbors (for which there is currently bidirectional communication); a clock synchronization service aligns the timers of nodes, which facilitates experiments that induce concurrent LRW operations in a controlled way. Our largest experiments used 31 MicaZ motes, and due to proximity, we artificially constrained each node to have at most six neighbors (essentially, this is topology control). Our implementation employed three countdown timers, two for message delivery and acknowledgment, and a third to limit the total duration of the LRW operation. We used two timers for message delivery, to make the distinction between (i) time for message transmission and receiving a response message, and (ii) the time for successful transmission and response from all neighbors, including retries of (i).
A technical reason to use (ii) instead of a retry counter is that the MAC layer's timing is randomized, and our design goal was to implement a protocol with known thresholds for the LRW operation. Should the timer for (ii) expire, then the LRW operation's initiator aborts the operation and transmits abort commands to its neighborhood. The third timer is for operation commit: if a neighbor does not receive an abort message and the commit timer expires, then the result of the LRW is committed.

5.2 Three Outcomes for LRW

We say that an LRW operation has three possible outcomes: it is considered a Success if the LRW is accepted by all neighbors; the operation is considered Canceled if an abort message (which may have become necessary for a number of reasons) is responded to by all neighbors; and it is considered Failed if at least one neighbor does not respond to the abort message. Each LRW operation returns one of these three outcomes to the application that invoked it. The first two outcomes, Success and Canceled, are within the intended behavior of the protocol (in the terminology of transactions, the result satisfies Atomicity and Consistency criteria). The final outcome, Failed, represents a failure of communication, a sensor node crash, or (silent) neighborhood reconfiguration. For a failed outcome, it is uncertain whether or not all neighbors received and committed the LRW operation (possibly, the initiator attempted to cancel the operation, but not all neighbors acknowledged the cancel request within the allowed timeout period). For a set of LRW operations, reliability is the percentage of non-failed LRW operations, that is, operations which respond by success or cancel.
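The reliability measure just defined can be written down in a few lines. The following Python sketch is only an illustration: the outcome labels and the function name `reliability` are assumptions made here, not names from the NesC implementation.

```python
from collections import Counter

# Hypothetical labels for the three responses of Section 5.2.
SUCCESS, CANCELED, FAILED = "success", "canceled", "failed"


def reliability(outcomes):
    """Percentage of operations whose outcome is success or canceled."""
    if not outcomes:
        raise ValueError("reliability is undefined for an empty set of operations")
    counts = Counter(outcomes)
    non_failed = counts[SUCCESS] + counts[CANCELED]
    return 100.0 * non_failed / len(outcomes)


if __name__ == "__main__":
    # Made-up sample, not experimental data.
    sample = [SUCCESS] * 7 + [CANCELED] * 2 + [FAILED]
    print(f"{reliability(sample):.1f}%")
```

Note that a canceled operation counts toward reliability even though it accomplished no write; only the uncertain, failed outcome lowers the figure.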
LRW-neighbor(q):
    initially: mode = idle
    while (mode = engaged_p)
        receive(abortMsg_p):
            stop T_Commit
            send(p, abortAck_q)
            mode ← idle
        receive(m = initMsg_p(t)):
            send(p, acceptMsg_q)
        receive(initMsg_r ∧ r ≠ p):
            send(r, rejectMsg_q)
        T_Commit expires:
            q commits LRW
            mode ← idle
    while (mode = idle)
        receive(abortMsg_r):
            send(r, abortAck_q)
        receive(m = initMsg_p(t)):
            start T_Commit ← t
            mode ← engaged_p
            send(p, acceptMsg_q)

Figure 4: LRW for Neighbor q

5.3 Protocol

In the following, $p$ refers to an initiator and $\{q_0, q_1, \ldots, q_k\} = N(p)$ refers to the neighbors of $p$. Figures 3 and 4 contain a high-level description of the LRW implementation. Five message types are used in the protocol: initMsg, acceptMsg, rejectMsg, abortMsg, and abortAck. Three event types drive the protocol: invocation of an LRW operation, message arrival, and timer expiration. The protocol timers are denoted $T_x$ where $x \in \{Response, Timeout, Commit\}$. Each timer starts with some positive value and decreases to zero, whereat an expiration event occurs. The initiator begins by starting three timers, however only $T_{Response}$ and $T_{Timeout}$ have expiration events shown in the initiator protocol; $T_{Commit}$ provides a time stamp for the LRW operation, used by neighbors. (Note that $T_{Commit}$ could be significant to the initiator as a safeguard, to reject any application invocation of LRW-initiate while a current invocation is already in progress; $T_{Commit}$ could also trigger commit at the initiator itself, but to simplify the presentation we suppress such details.)

The first message broadcast by the initiator is the initMsg, containing the current $T_{Commit}$ value and other fields relevant to the LRW operation.
The initiator waits for all neighbors to reply with acceptMsg (if $T_{Response}$ expires, then the initiator again broadcasts an initMsg). To simplify the description, the case where $T_{Commit}$ expires (which could occur if the initiator retries a broadcast too many times) is not shown. Also, the presentation omits the case where a collection of acceptMsg values might trigger some application-specific abort of the LRW operation.

For the mote implementation, we also included a list of neighbors in the initMsg: this list named the neighbors for which the initiator had not yet received an acceptMsg. This small optimization reduced the overhead after retransmitting an initMsg, by avoiding a needless resending of acceptMsg (usually for the majority of neighbors).

It is important to note that even if LRW-initiate returns Success, $T_{Commit}$ may not have expired, and thus the LRW has not been committed. For this reason $p$ must not be permitted to begin a new LRW operation until $T_{Commit}$ expires. In our trials, the frequency of LRW operations was small enough to ensure that the previous LRW was committed or aborted before a new LRW was started, but any implementation of this protocol would have to address the possibility of such an occurrence. In contrast, if the LRW operation is aborted, a new LRW may be initiated immediately.

If the LRW fails due to a neighbor $q_i$, then since at least one attempt has been made by the initiator $p$ to cancel the operation, we know that one of the following is true: (1) $q_i$ did not receive the initiation message; (2) $q_i$ received the initiation message, but not the abort message; or (3) $p$ did not receive $q_i$'s response to the abort message. In cases (1) and (3) the LRW is not committed, and thus consistency is maintained. In case (2), however, the LRW is committed, and consistency is lost.
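The interplay of Figures 3 and 4 can be explored with a small round-based simulation. The Python sketch below is not the NesC implementation: it assumes that each loop iteration stands for one $T_{Response}$ period, that $T_{Timeout}$ is expressed as a number of such rounds, that a rejectMsg sends the initiator into its abort phase, and that message loss is decided by a caller-supplied predicate `delivered` (a hypothetical helper). Skipping neighbors that have already accepted mirrors the invitee-list optimization.

```python
class Neighbor:
    """Neighbor q of Figure 4; engaged_with is None while the mode is idle."""

    def __init__(self, name, engaged_with=None):
        self.name = name
        self.engaged_with = engaged_with
        self.committed = False

    def on_init(self, initiator):
        if self.engaged_with is None:
            self.engaged_with = initiator      # idle: start T_Commit, become engaged
            return "acceptMsg"
        if self.engaged_with == initiator:
            return "acceptMsg"                 # repeated initMsg: acknowledge again
        return "rejectMsg"                     # busy with a different initiator

    def on_abort(self, initiator):
        if self.engaged_with == initiator:
            self.engaged_with = None           # stop T_Commit, drop the tentative write
        # Assumption: also acknowledge when engaged elsewhere (not shown in Figure 4).
        return "abortAck"

    def on_commit_timer(self):
        if self.engaged_with is not None:
            self.committed = True              # T_Commit expired without an abort
            self.engaged_with = None


def lrw_initiate(p, neighbors, delivered, timeout_rounds):
    """Return 'success', 'canceled', or 'failed' for one operation by initiator p."""
    accepted, rejected = set(), False
    for rnd in range(timeout_rounds):          # one round per T_Response period
        for q in neighbors:
            if q.name in accepted or not delivered("initMsg", p, q.name, rnd):
                continue
            reply = q.on_init(p)
            if not delivered(reply, q.name, p, rnd):
                continue
            if reply == "rejectMsg":
                rejected = True
            else:
                accepted.add(q.name)
        if rejected:
            break
        if len(accepted) == len(neighbors):
            return "success"
    acked = set()
    for rnd in range(timeout_rounds):          # abort phase, again bounded by T_Timeout
        for q in neighbors:
            if q.name in acked or not delivered("abortMsg", p, q.name, rnd):
                continue
            if delivered(q.on_abort(p), q.name, p, rnd):
                acked.add(q.name)
        if len(acked) == len(neighbors):
            return "canceled"
    return "failed"
```

With idle neighbors and no loss the sketch returns success. If one neighbor is engaged with another initiator and every abortMsg to an already engaged neighbor is lost, it returns failed, and that neighbor commits when its commit timer fires, which is case (2) above.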
5.4 Metrics

We evaluated the LRW implementation with respect to time and reliability. For experiments, we considered different platforms, different topologies, different low-level choices for communication, whether operations are run in isolation or under contention, as well as different settings for $T_{Timeout}$ and $T_{Commit}$. The total time allotted to an LRW operation is $T_{Commit}$, however under ideal circumstances, all messages are sent, delivered, and acknowledged in a time possibly much less than $T_{Commit}$: we refer to the time needed for one LRW operation to complete under these circumstances as the optimistic duration of an LRW operation. Measuring the optimistic duration is interesting: if the gap between the optimistic duration and $T_{Commit}$ is large, and if close to ideal circumstances are common, then $T_{Commit}$ may be decreased. A smaller value of $T_{Commit}$ would allow an application to submit more LRW operations, that is, the throughput of LRW operations could be higher if the operation interval is shorter. However, decreasing $T_{Timeout}$ and $T_{Commit}$ may also decrease the frequency of success, because fewer message retries will be attempted and late-arriving messages could be discarded. A less reliable implementation impacts the application, which may or may not retry an unsuccessful LRW operation. In view of the two unsuccessful outcome possibilities for LRW-initiate, canceled and failed, the application should decide whether or not to retry a canceled operation. Reducing the frequency of failed cases can be handled within the LRW implementation.
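One way an application might act on this distinction is sketched below in Python. The policy is an assumption for illustration, not something the paper prescribes: a canceled operation is retried up to a bound, while a failed operation is returned to the caller at once because the neighborhood state is uncertain. The names `submit_with_retry` and `max_attempts` are hypothetical, and `lrw_initiate` stands for any zero-argument callable returning one of the three outcomes.

```python
def submit_with_retry(lrw_initiate, max_attempts=3):
    """Invoke lrw_initiate() repeatedly, retrying only canceled outcomes.

    Returns a pair (last outcome, number of attempts made).
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    attempts = 0
    outcome = None
    while attempts < max_attempts:
        attempts += 1
        outcome = lrw_initiate()
        if outcome != "canceled":
            # success: nothing more to do; failed: surface the uncertainty
            break
    return outcome, attempts


if __name__ == "__main__":
    scripted = iter(["canceled", "success"])   # made-up outcome sequence
    print(submit_with_retry(lambda: next(scripted)))
```

Retrying immediately after a cancel is consistent with the protocol, since an aborted LRW leaves no pending commit; an application could equally choose to back off or give up.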
Figure 5: Broadcast vs Unicast with Ack. [Plot not reproduced; x-axis: neighborhood size, 1 to 11; y-axis: milliseconds, 0 to 140; series: Broadcast, Unicast.]

To measure the implementation under contention, we used the synchronized clock service [19] to arrange that a set of nodes simultaneously invoke LRW-initiate. Within an experiment, a set of LRW operations started simultaneously is called a series; the duration of that series is defined to be the maximum optimistic duration of any LRW operation in the set. We say that a series failed if at least one operation in it failed, and that it succeeded otherwise. We define the reliability of a set of series as the percentage of successful series. Our experiments show factors that influence a series duration and reliability.

5.5 Experiments

Communication Primitive. Each initMsg from an initiator should be acknowledged, either with an acceptMsg or a rejectMsg, by each neighbor. The CC2420 radio chip offers a hardware-level acknowledgment feature, which enables the receiver of a unicast message to send an ack frame immediately, without the usual MAC delay (to avoid collision). Though this feature does not presently provide a payload area within the acknowledgment frame, and hence is not sufficient for LRW purposes, we nonetheless tested how well using unicast (with the hardware acknowledgments) compared to using neighborhood broadcast. Using unicast requires more transmission operations by the initiator (one per neighbor), but also saves time communicating acknowledgments. Figure 5 displays results of an experiment conducted on TelosB motes, where each data point in the graph is the mean of approximately 250 successful operations. Due to a generous timeout, no operations were aborted or failed.
In this experiment, there is only one initiator, and the number of neighbors varied, shown on the x-axis; the optimistic duration is shown on the y-axis. Because this experiment showed the superiority of using broadcast with denser networks (and the current unicast with hardware ack is deficient for our purposes), we used the neighborhood broadcast primitive in all subsequent experiments. Figure 5 also suggests a starting point for testing different values of $T_{Timeout}$ at different neighborhood sizes. For example, given an initiator with six neighbors, 50ms might be a starting point for an experiment testing reliability.

Reliability and Timeout. Recall from Section 5.4 that the duration of the $T_{Commit}$ timer is a significant factor in the throughput of the protocol. If the protocol is expected to rapidly execute several LRW operations, there is a significant incentive to keep the $T_{Commit}$ timer as short as possible. This however carries with it another set of challenges: note that the execution of the LRW operation must be contained entirely within the timespan of $T_{Commit}$. Furthermore, observe from Figure 3 that $T_{Timeout}$ performs a slightly different function depending on the current mode of the initiator $p$; if it is in active mode the timer is used to enter the initiator into abort mode, and if it is in abort mode, the timer signals when the operation is to be declared as failed. Since the $T_{Commit}$ timer must be of sufficient duration to allow for both of these eventualities, it follows that the length of $T_{Commit}$ must be at least twice that of $T_{Timeout}$.

    Timeout    Reliability
    50ms       46.64%
    75ms       96.87%
    100ms      100%

Table 2: Reliability without contention.
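The constraint that $T_{Commit}$ be at least twice $T_{Timeout}$ translates directly into a ceiling on how often one initiator can issue operations. The Python sketch below works this out for the timeout values of Table 2; the function names are hypothetical, and the throughput bound assumes, as Section 5.3 requires for successful operations, that a new LRW starts only after $T_{Commit}$ expires.

```python
def min_commit_ms(timeout_ms):
    """Smallest admissible T_Commit: the active phase and the abort phase
    may each run for up to T_Timeout."""
    return 2 * timeout_ms


def max_ops_per_second(commit_ms):
    """Upper bound on back-to-back operations per second for one initiator,
    assuming each operation occupies a full T_Commit interval."""
    return 1000.0 / commit_ms


# Timeout (ms) -> reliability (%) without contention, copied from Table 2.
TABLE_2 = {50: 46.64, 75: 96.87, 100: 100.0}

if __name__ == "__main__":
    for timeout_ms, reliability_pct in TABLE_2.items():
        commit_ms = min_commit_ms(timeout_ms)
        rate = max_ops_per_second(commit_ms)
        print(f"T_Timeout={timeout_ms}ms  T_Commit>={commit_ms}ms  "
              f"<= {rate:.2f} ops/s  reliability={reliability_pct}%")
```

Under these assumptions, halving the timeout doubles the ceiling on throughput, which is the incentive the text describes; Table 2 shows what that costs in reliability.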
Due to the above reasoning, many of our experiments were focused on studying the effect that reducing the duration of $T_{Timeout}$ would have on the duration and reliability of both operations and series. We performed two sets of trials. The first set was executed in a simple network containing one initiator with a static neighborhood of size six. In these experiments we used TelosB motes.

Figure 6: Average LRW duration with 95% confidence interval for varying timeout. [Plot not reproduced; x-axis: timeout in milliseconds (50, 75, 100); y-axis: milliseconds, 55 to 80.]

The second set introduced contention in the form of several initiators being present in the network at the same time. We also employed preexisting clock synchronization and neighborhood services. Note however that, as was mentioned in Section 3, even for a static WSN the neighborhood of a single mote is often dynamic. The network in this case consisted of 31 MicaZ motes, labeled as $v_1, v_2, \ldots, v_{31}$. A mote $v_i$ would be considered an initiator if and only if $i \bmod 6 = 1$ (thus the motes $v_1$, $v_7$, $v_{13}$, $v_{19}$, $v_{25}$ and $v_{31}$ were initiators). In order to control the topology of the network, we limited the potential neighborhood of an initiator $v_i$ such that $N(v_i) \subseteq \{v_{i-3}, v_{i-2}, v_{i-1}, v_{i+1}, v_{i+2}, v_{i+3}\}$. Observe that because of this, any two initiators $v_i$ and $v_{i+6}$ will have overlapping potential neighborhoods (due to using a dynamic neighborhood service, the actual neighborhood varies).

Figure 7: Duration distribution for varying timeout. [Plot not reproduced; x-axis: time interval in milliseconds, 20-40 through 140-160; y-axis: percentage of LRW operations, 0% to 60%; series: 50ms, 75ms, 100ms.]

Our initial experiments were intended to evaluate the optimistic duration of the protocol. The implementation we used for these trials did not make use of clock synchronization or neighborhood services. Each trial consisted of approximately 250 LRW operations and the network contained only a single initiator with a neighborhood of size six.

As was previously mentioned, the duration of $T_{Timeout}$ is a major factor in the throughput. Thus we ran several experiments where we varied $T_{Timeout}$, and in each case noted not only the duration of each operation, but also the overall reliability. As was suggested at the start of this section, we began with $T_{Timeout}$ = 50ms, and incremented this in steps of 25ms until we achieved 100% reliability. Table 2 shows the reliability of each trial, and Figure 6 shows the average duration of the LRW operations with the 95% confidence interval.

Figure 8: Distribution of operation duration. [Plot not reproduced; x-axis: time intervals in milliseconds, 0-20 through 300-320; y-axis: percentage of LRW operations, 0% to 30%; series: 100ms, 150ms, 250ms, 350ms.]

As seen in Table 2, with a 50ms timeout value the reliability was less than 50%, but it increased to approximately 97% when we allowed a 75ms timeout, and with a 100ms timeout the reliability was 100%. We also see from Figure 6 that the average duration of the LRW operations increases as the timeout is reduced. This may at first seem surprising, until we consider the duration distribution shown in Figure 7. This figure shows the percentage of LRW operations that ended within a given time interval, as seen on the x-axis.

There are a few important observations we can make when we cross-reference Figure 7 with Table 2. First, note that an approximately equal percentage of the LRW operations for each timeout value ended within the 20-40ms interval. Obviously the timeout had no effect on these, as one would expect. The next large grouping occurs at the 60-80ms time interval, where we find most of the operations from the 100ms timeout trial, and many from the 75ms trial. However, the 50ms trial is virtually absent from this interval, while it is instead almost the sole occupant in the 100-120ms interval. Since this interval includes double the timeout value, it is natural to hypothesize that for the majority of the LRW operations from the 50ms trial that ended within this interval, $T_{Timeout}$ expired twice. (To confirm this hypothesis, we investigated the behavior of the cycle of broadcast (or rebroadcast triggered by $T_{Response}$ expiring) and acknowledgments, detailed in the next paragraph.) A similar effect is seen for the 75ms trial at the 140-160ms interval. The only difference is that in this case we note that while approximately 10% of the LRW operations ended within this interval, the reliability data implies only 3% of the operations failed. On closer inspection of the data we observed that most of the operations that ended within this interval were canceled, not failed. In either of the two trials, we see that reducing the timeout value had a negative effect not only on the reliability, but also on the duration of the LRW operations.

Figure 9: Average LRW operation duration with 95% confidence interval. [Plot not reproduced; x-axis: neighbors, 2 to 6; y-axis: milliseconds, 20 to 140.]

In an experiment using seven TelosB motes, each mote acted as initiator approximately 1,300 times, however the experiment tested only the first part of the protocol of Figure 3: effectively $T_{Timeout} = \infty$ for this experiment. Each initMsg broadcast included the list of neighbors who had not yet responded to the LRW operation. The initiator's neighborhood was fixed to be the other six motes.
Thus, the initial broadcast of an initMsg contained an invitee list of length six; rebroadcasts contained smaller invitee lists. An operation terminated at the instant the initiator had received acknowledgments from all neighbors. The experiment allocated approximately four seconds to each LRW operation, to ensure that no contention between operations could occur. In a total of 9,258 operations, 4,228 of them (45.6%) were "lucky", that is, acknowledgments from all six neighbors occurred immediately following the initial broadcast, so that no rebroadcast was necessary. The distribution of operation times for these cases is shown in Figure 10, indicated by invited = 6. Some 3,288 of the operations included a rebroadcast containing a singleton invitee list, shown as invited = 1 in the figure. The mean (standard deviation) operation duration in milliseconds for invited = 6 was 26.5 (3.49), while for invited = 1 the results are 19.6 (3.85). The curves in Figure 10 are Gaussian distributions fitted to the mean and standard deviations (we also obtained data for invited = $k$, $1 < k < 6$, which follow similar patterns). We also measured the total duration (from start to termination) of each operation. This is shown in Figure 11, clearly reflecting $T_{Response}$ expiration and rebroadcast. The mean number of (re-)broadcasts per operation was 1.55, and the bimodal distribution in the figure explains this (actually, the distribution has three modes, however the third does not contribute significantly).

These experiments with $T_{Timeout} = \infty$ explain the results of Figure 7 and Table 2.
One could even use distributions suggested by the fitted curves to build an analytical model, however this model would be impractical for situations where initiators may contend with network traffic and for nonoptimistic cases where an LRW operation should be canceled.

Figure 10: Timings by Invitee List Size. [Plot not reproduced; x-axis: milliseconds, 10 to 50; y-axis: frequency, 0 to 0.14; series: invited=1, invited=6.]

Figure 11: Duration for $T_{Timeout} = \infty$. [Plot not reproduced; x-axis: milliseconds, 20 to 120; y-axis: frequency, 0 to 0.06; N=9258.]

Our second set of experiments introduced contention by having six initiators present in the graph. Recall that we enforced a topology on the network such that every initiator shares at least one potential neighbor with another initiator. In these experiments, each trial consisted of approximately 1000 series.

Due to the dynamic neighborhood, the number of motes attached to a single initiator varied over the course of one trial. Figure 9 shows the average LRW operation duration with a 95% confidence interval, depending on the size of the neighborhood. As we see, the average duration has increased, especially for larger neighborhoods, compared to Figure 5.

Figure 12: Distribution of series duration. [Plot not reproduced; x-axis: time intervals in milliseconds, 0-20 through 300-320; y-axis: percentage of LRW series, 0% to 30%; series: 100ms, 150ms, 250ms, 350ms.]

Previously we examined the effect that reducing the timeout value would have on LRW duration and reliability in a non-contention network.
We repeated these experiments with contention, starting with a timeout of 100ms and increasing it by 50ms in each trial. Table 3 shows the operation and series reliability of each trial, and Figure 8 shows the distribution of the duration for LRW operations. Figure 8 omits the data from the 200 and 300ms trials, as these were similar to the 250 and 350ms trials.

As seen in Table 3, the operation reliability is very poor in the 100ms trial. But it increases rapidly, becoming approximately 99% in the 200ms trial and 100% at 350ms. If we cross-reference this with Figure 8 we see that in the 100ms trial almost every LRW operation had completed within 220ms, while in the 350ms trial every operation had completed within 320ms.

Due to the inclusion of contention, clock synchronization, and dynamic neighborhoods, the behavior of the protocol is significantly more complex than for previous trials. However we still notice the same basic trend that was evident in Figure 7: the percentage of operations that ended within 160ms is largely similar for each trial, while at later intervals we notice first a drop, and then an increase in the number of operations from the 100ms trial.

    Timeout    Operation Reliability    Series Reliability
    100ms      87.91%                   55.11%
    150ms      96.65%                   83.66%
    200ms      99.37%                   96.64%
    250ms      99.61%                   97.85%
    300ms      99.85%                   98.90%
    350ms      100.00%                  100.00%

Table 3: Reliability with contention.

This becomes much more pronounced when we consider the series reliability in the third column in Table 3 and the duration distribution in Figure 12. Here we clearly see that a large number of the series in the 100ms trial ended between 160 and 220ms.

Recall that we observed from Figure 7 that reducing the timeout value has the effect of increasing the duration of the LRW operations.
Figure 13, which shows the average series duration and 95% confidence interval for each of the above trials, illustrates the same tendency. However, note that contrary to what we saw in Figure 7, where any reduction of the timeout value below 100ms resulted in an increased duration, we see in this figure that the series in the 100ms trial has approximately the same average duration as in the 350ms trial.

Figure 13: Average series duration with 95% confidence interval for varying timeout. [Plot not reproduced; x-axis: timeout in milliseconds, 100 to 350; y-axis: milliseconds, 140 to 180.]

6 Conclusions

This paper's proposal and investigation of the LRW protocol is an attempt to answer the question: what basic, single communication-round primitive maximizes the work accomplished for a local neighborhood operation? In some sense, an LRW operation is a communication rendezvous, where members of a local neighborhood change states jointly. Not surprisingly, such an operation is more powerful than simpler primitives, such as neighborhood queries or unconditional broadcast-write commands. However, the advantages of LRW presume an optimistic, or speculative programming approach: if contention is high or the semantics of the application do not favor success, then LRW could be less attractive.

Without notions of stable storage and journaling, which support ACID properties of database managers, consistency of LRW operations cannot be guaranteed; however tuning the $T_{Timeout}$ parameter appropriately does increase the probability of non-failed operations. The importance of consistency for abstractions like LRW, write-all, or local transactions is diminished for most wireless sensor network applications: lowering component cost and conserving power have high priority.
Applications can often be designed to tolerate some small probability of data inconsistency, using standard techniques of replication, filtering, outlier removal, and model-generated prediction.

Our experiments show that tuning $T_{Timeout}$ impacts both average duration and reliability. Of the two most significant timers used, $T_{Timeout}$ and $T_{Response}$, we limited ourselves to studying the former; $T_{Response}$ was set to 40 milliseconds in all experiments. A possible area of future research would be to examine the length of $T_{Response}$, and its effect on the protocol. There are many factors that would need to be considered in this case, such as size of neighborhood, size of the network, radio activity, and so on. As with $T_{Timeout}$, we expect that there will be trade-offs between throughput and the number of unnecessary retransmissions.

One aspect of LRW we did not explore in this paper is the "convenience" of LRW for common applications of sensor network programming. The introduction explains how LRW can be used for local data collection; artificial examples of consensus or leader election can easily be shown, however practical case studies would be helpful to evaluate LRW as a programming primitive.

References

[1] MP Herlihy. Wait-free synchronization. ACM TOPLAS 13(1):124-149, 1991.

[2] M Demirbas. A transactional framework for programming wireless sensor/actor networks. In The 11th IEEE International Workshop on Future Trends of Distributed Computing Systems, pp. 123-129, 2007.

[3] M Ali, U Saif, A Dunkels, T Voigt, K Römer, K Langendoen, J Polastre, ZA Uzmi. Medium access control issues in sensor networks. Computer Communication Review 36(2):33-36, 2006.

[4] SS Kulkarni, M Arumugam. Transformations for write-all-with-collision model. Computer Communications 29(2):183-199, 2006.
[5] T Herman, S Tixeuil. A distributed TDMA slot assignment algorithm for wireless sensor networks. Algorithmic Aspects of Wireless Sensor Networks: First International Workshop, ALGOSENSORS 2004, pp. 45-58.

[6] T Herman. Models of self-stabilization and sensor networks. In Proceedings of the Fifth International Workshop on Distributed Computing IWDC03, Springer LNCS 2918, pp. 205-215, 2003.

[7] M Mizuno, M Nesterenko. A transformation of self-stabilizing serial model programs for asynchronous parallel computing environments. Information Processing Letters 66(6):285-290, 1998.

[8] J Gehrke, S Madden. Query processing in sensor networks. IEEE Pervasive Computing 3(1):46-55, 2004.

[9] S Nath, PB Gibbons, S Seshan, Z Anderson. Synopsis diffusion for robust aggregation in sensor networks. ACM Transactions on Sensor Networks 4(2), 2008.

[10] K Akkaya, M Younis. A survey on routing protocols for wireless sensor networks. Ad Hoc Networks 3(3):325-349, (May) 2008.

[11] R Newton, G Morrisett, M Welsh. The regiment macroprogramming system. In Proceedings of the Sixth International Conference on Information Processing in Sensor Networks (IPSN'07), pp. 489-498, 2007.

[12] R Sugihara, RK Gupta. Programming models for sensor networks: a survey. ACM Transactions on Sensor Networks 4(2), (March) 2008.

[13] Y Kotidis. Snapshot queries: towards data-centric sensor networks. In 21st International Conference on Data Engineering (ICDE'05), pp. 131-142, 2005.

[14] S Upadhyayula, SKS Gupta. Spanning tree based algorithms for low latency and energy efficient data aggregation enhanced convergecast (DAC) in wireless sensor networks. Ad Hoc Networks 5(5):626-648, (July) 2007.

[15] T He, P Vicaire, T Yan, L Luo, L Gu, G Zhou, R Stoleru, Q Cao, JA Stankovic, T Abdelzaher. Achieving real-time target tracking using wireless sensor networks. In 12th IEEE Real-Time and Embedded Technology and Applications Symposium (RTAS'06), pp. 37-48, 2006.

[16] H Attiya, R Rappoport. The level of handshake required for establishing a connection. In Distributed Algorithms, 8th International Workshop (WDAG'94), Springer-Verlag LNCS 857, pp. 179-193, 1994.

[17] J Gray. Notes on data base operating systems. Operating Systems, An Advanced Course, Springer-Verlag LNCS 60, pp. 393-481, 1978.

[18] J Polastre, R Szewczyk, D Culler. Telos: enabling ultra-low power wireless research. In Proceedings of the 4th International Symposium on Information Processing in Sensor Networks (IPSN'05), Poster Session, SPOTS track, Article 48, 2005.

[19] TinyOS 2.0 contributed code: rootless timesync.
