Error-Free Multi-Valued Consensus with Byzantine Failures
In this paper, we present an efficient deterministic algorithm for consensus in presence of Byzantine failures. Our algorithm achieves consensus on an $L$-bit value with communication complexity $O(nL + n^4 L^{0.5} + n^6)$ bits, in a network consisti…
Authors: Guanfeng Liang, Nitin Vaidya
Error-Free Multi-Valued Consensus with Byzantine Failures ∗

Guanfeng Liang and Nitin Vaidya
Department of Electrical and Computer Engineering, and Coordinated Science Laboratory
University of Illinois at Urbana-Champaign
gliang2@illinois.edu, nhv@illinois.edu
November 9, 2018

Abstract

In this paper, we present an efficient deterministic algorithm for consensus in presence of Byzantine failures. Our algorithm achieves consensus on an $L$-bit value with communication complexity $O(nL + n^4 L^{0.5} + n^6)$ bits, in a network consisting of $n$ processors with up to $t$ Byzantine failures, such that $t < n/3$. For large enough $L$, communication complexity of the proposed algorithm approaches $O(nL)$ bits. In other words, for large $L$, the communication complexity is linear in the number of processors in the network. This is an improvement over the work of Fitzi and Hirt (from PODC 2006), who proposed a probabilistically correct multi-valued Byzantine consensus algorithm with a similar complexity for large $L$. In contrast to the algorithm by Fitzi and Hirt, our algorithm is guaranteed to be always error-free. Our algorithm requires no cryptographic technique, such as authentication, nor any secret sharing mechanism. To the best of our knowledge, we are the first to show that, for large $L$, error-free multi-valued Byzantine consensus on an $L$-bit value is achievable with $O(nL)$ bits of communication.

∗ This research is supported in part by Army Research Office grant W-911-NF-0710287 and National Science Foundation award 1059540. Any opinions, findings, and conclusions or recommendations expressed here are those of the authors and do not necessarily reflect the views of the funding agencies or the U.S. government.

1 Introduction

This paper considers the multi-valued Byzantine consensus problem.
The Byzantine consensus problem considers $n$ processors, namely $P_1, \ldots, P_n$, of which at most $t$ processors may be faulty and deviate from the algorithm in arbitrary fashion. Each processor $P_i$ is given an $L$-bit input value $v_i$, and they want to agree on a value $v$ such that the following properties are satisfied:
• Termination: every fault-free $P_i$ eventually decides on an output value $v'_i$,
• Consistency: the output values of all fault-free processors are equal, i.e., for every fault-free processor $P_i$, $v'_i = v'$ for some $v'$,
• Validity: if every fault-free $P_i$ holds the same input $v_i = v$ for some $v$, then $v' = v$.
Algorithms that satisfy the above properties in all executions are said to be error-free.

We are interested in the communication complexity of error-free consensus algorithms. Communication complexity of an algorithm is defined as the maximum (over all permissible executions) of the total number of bits transmitted by all the processors according to the specification of the algorithm. This measure of complexity was first introduced by Yao [11], and has been widely used by the distributed computing community [4, 5, 10].

System Model: We assume network and adversary models commonly used in other related work [7, 1, 2, 5, 6]. We assume a synchronous fully connected network of $n$ processors, wherein the processor identifiers are common knowledge. Every pair of processors are connected with a pair of directed point-to-point communication channels. Whenever a processor receives a message on such a directed channel, it can correctly assume that the message is sent by the processor at the other end of the channel.

We assume a Byzantine adversary that has complete knowledge of the state of the other processors, including the $L$-bit input values. No secret is hidden from the adversary.
The adversary can take over up to $t$ processors ($t < n/3$) at any point during the algorithm. These processors are said to be faulty. The faulty processors can engage in any “misbehavior”, i.e., deviations from the algorithm, including sending incorrect messages, and collusion. The remaining processors are fault-free and follow the algorithm. Finally, we make no assumption of any cryptographic technique, such as authentication and secret sharing.

It has been shown that error-free consensus is impossible if $t \ge n/3$ [9, 7]. $\Omega(n^2)$ has been shown to be a lower bound on the number of messages needed to achieve error-free consensus [3]. Since any message must be of at least 1 bit, this gives a lower bound of $\Omega(n^2)$ bits on the communication complexity of any binary (1-bit) consensus algorithm.

In practice, agreement is sometimes required for longer messages rather than just single bits. For instance, the “value” being agreed upon may be a large file in a fault-tolerant distributed storage system. As [5] suggests, in a voting protocol, the authorities must agree on the set of all ballots to be tallied (which can be gigabytes of data). Similarly, as also suggested in [5], multi-valued Byzantine agreement is relevant in secure multi-party computation, where many broadcast invocations can be parallelized and thereby optimized to a single invocation with a long message.

The problem of achieving consensus on a single $L$-bit value may be solved using $L$ instances of a 1-bit consensus algorithm. However, this approach will result in communication complexity of $\Omega(n^2 L)$, since $\Omega(n^2)$ is a lower bound on communication complexity of 1-bit consensus.
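To get a feel for the gap between the bit-by-bit approach and the bound stated in the abstract, the two expressions can be evaluated numerically. The sketch below is illustrative only: it sets every constant hidden by the $\Omega$ and $O$ notation to 1, which is an assumption and not something the analysis provides, and the function names are hypothetical.

```python
import math

def bitwise_baseline_bits(n, L):
    """n^2 * L: cost of L independent 1-bit consensus instances,
    taking the Omega(n^2) lower bound with constant 1 (assumption)."""
    return n * n * L

def proposed_bits(n, L):
    """n*L + n^4 * sqrt(L) + n^6, with all hidden constants set to 1
    (assumption); mirrors the O(nL + n^4 L^0.5 + n^6) expression."""
    return n * L + n**4 * math.sqrt(L) + n**6

if __name__ == "__main__":
    n = 10
    for L in (10**6, 10**9, 10**12):
        ratio = proposed_bits(n, L) / bitwise_baseline_bits(n, L)
        print(f"L = {L:.0e}: proposed / baseline = {ratio:.4f}")
```

Under these assumptions the ratio shrinks towards $1/n$ as $L$ grows, which is the sense in which the $nL$ term dominates for large $L$.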
In a PODC 2006 paper, Fitzi and Hirt [5] presented a probabilistically correct multi-valued consensus algorithm which improves the communication complexity to $O(nL)$ for sufficiently large $L$, at the cost of allowing a non-zero probability of error. Since $\Omega(nL)$ is a lower bound on the communication complexity of consensus on an $L$-bit value, this algorithm has optimal complexity for large $L$. In their algorithm, an $L$-bit value (or message) is first reduced to a much shorter message, using a universal hash function. Byzantine consensus is then performed for the shorter hashed values. Given the result of consensus on the hashed values, consensus on $L$ bits is then achieved by requiring the processors whose $L$-bit input value matches the agreed hashed value to jointly deliver the $L$ bits to the other processors. By performing initial consensus only for the smaller hashed values, this algorithm is able to reduce the communication complexity to $O(nL + n^3(n + \kappa))$, where $\kappa$ is a parameter of the algorithm. However, since the hash function is not collision-free, this algorithm is not error-free. Its probability of error is lower bounded by the collision probability of the hash function.

We improve on the work of Fitzi and Hirt [5], and present a deterministic error-free consensus algorithm with communication complexity of $O(nL)$ bits for sufficiently large $L$. Our algorithm always produces the correct result, unlike [5]. For smaller $L$, the communication complexity of our algorithm is $O(nL + n^4 L^{0.5} + n^6)$. To our knowledge, this is the first known error-free multi-valued Byzantine consensus algorithm that achieves, for large $L$, communication complexity linear in $n$.

2 Byzantine Consensus: Salient Features of the Algorithm

The goal of our consensus algorithm is to achieve consensus on an $L$-bit value (or message).
The algorithm is designed to perform efficiently for large $L$. Consequently, our discussion will assume that $L$ is “sufficiently large” (how large is “sufficiently large” will become clearer later in the paper). We now briefly describe the salient features of our consensus algorithm, with the detailed algorithm presented later in Section 3.
• Algorithm execution in multiple generations: To improve the communication complexity, consensus on the $L$-bit value is performed “in parts”. In particular, for a certain integer $D$, the $L$-bit value is divided into $L/D$ parts, each consisting of $D$ bits. For convenience of presentation, we will assume that $L/D$ is an integer. A sub-algorithm is used to perform consensus on each of these $D$-bit values, and we will refer to each execution of the sub-algorithm as a “generation”.
• Memory across generations: If during any one generation, misbehavior by some faulty processor is detected, then additional (and expensive) diagnostic steps are performed to gain information on the potential identity of the misbehaving processor(s). This information is captured by means of a diagnosis graph, as elaborated later. As the sub-algorithm is performed for each new generation, the diagnosis graph is updated to incorporate any new information that may be learnt regarding the location of the faulty processors. The execution of the sub-algorithm in each generation is adapted to the state of the diagnosis graph at the start of the generation.
• Bounded instances of misbehavior: With Byzantine failures, it is not always possible to immediately determine the identity of a misbehaving processor.
However, due to the manner in which the diagnosis graph is maintained, and the manner in which the sub-algorithm adapts to the diagnosis graph, the $t$ (or fewer) faulty processors can collectively misbehave in at most $t(t+1)$ generations, before all the faulty processors are exactly identified. Once a faulty processor is identified, it is effectively isolated from the network, and cannot tamper with future generations. Thus, $t(t+1)$ is also an upper bound on the number of generations in which the expensive diagnostic steps referred to above may need to be performed.
• Low-cost failure-free execution: Due to the bounded number of generations in which the faulty processors can misbehave, it turns out that the faulty processors do not tamper with the execution in a majority of the generations. We use a low-cost mechanism to achieve consensus in failure-free generations, which helps to achieve low communication complexity. In particular, we use an error detecting code-based strategy to reduce the amount of information the processors must exchange to be able to achieve consensus in the absence of any misbehavior (the strategy, in fact, also allows detection of potential misbehavior).
• Consistent diagnosis graph maintenance: A copy of the diagnosis graph is maintained locally by each fault-free processor. To ensure consistent maintenance of this graph, the diagnostic information (elaborated later) needs to be distributed consistently to all the processors in the network. This operation itself requires a Byzantine broadcast algorithm that solves the “Byzantine Generals Problem” [7]. With this algorithm, a “source” processor broadcasts its message to all other processors reliably, even if some processors (including the source) may be faulty.
For this operation we use an error-free 1-bit Byzantine broadcast algorithm that tolerates $t < n/3$ Byzantine failures with communication complexity of $O(n^2)$ bits [2, 1]. This 1-bit broadcast algorithm is referred to as Broadcast_Single_Bit in our discussion. While Broadcast_Single_Bit is expensive, the cumulative overhead of Broadcast_Single_Bit is kept low by invoking it a relatively small number of times, when compared to $L$.

We now elaborate on the error detecting code used in our algorithms, and also describe the diagnosis graph in some more detail.

Error detecting code: We will use Reed-Solomon codes in our algorithms (potentially other codes may be used instead). Consider an $(m, k)$ Reed-Solomon code in Galois Field GF($2^c$), where $c$ is chosen large enough (specifically, $m \le 2^c - 1$). This code encodes $k$ data symbols from GF($2^c$) into a codeword consisting of $m$ symbols from GF($2^c$). Each symbol from GF($2^c$) can be represented using $c$ bits. Thus, a data vector of $k$ symbols contains $kc$ bits, and the corresponding codeword contains $mc$ bits. Each symbol of the codeword is computed as a linear combination of the $k$ data symbols, such that every subset of $k$ coded symbols represents a set of linearly independent combinations of the $k$ data symbols. This property implies that any subset of $k$ symbols from the $m$ symbols of a given codeword can be used to determine the data vector corresponding to the codeword. Similarly, knowledge of any subset of $k$ symbols from a codeword suffices to determine the remaining symbols of the codeword. So $k$ is also called the dimension of the code.

For a code $C$, let us denote $C()$ as the encoding function, and $C^{-1}()$ as the decoding function. The decoding function can be applied so long as at least $k$ symbols of a codeword are available.
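The "any $k$ symbols determine the codeword" property can be illustrated with a small sketch. To keep it short, the code below works over the prime field GF(257) instead of GF($2^c$); this substitution, the systematic encoding (the first $k$ codeword symbols equal the data), and all function names are assumptions made for illustration, not details taken from the algorithm.

```python
P = 257  # prime modulus standing in for GF(2^c); requires m <= P - 1 (assumption)

def _interpolate(points, x):
    """Evaluate at x the unique polynomial of degree < len(points)
    passing through the given (position, symbol) pairs, modulo P."""
    total = 0
    for xi, yi in points:
        num, den = 1, 1
        for xj, _ in points:
            if xj != xi:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, -1, P)) % P  # Python 3.8+
    return total

def rs_encode(data, m):
    """C(): map k data symbols to m coded symbols (positions 1..m)."""
    pts = list(enumerate(data, start=1))
    return [_interpolate(pts, x) for x in range(1, m + 1)]

def rs_decode(known, k):
    """C^{-1}(): recover the k data symbols from a dict
    {position: symbol} holding at least k symbols of a codeword."""
    pts = sorted(known.items())[:k]
    return [_interpolate(pts, x) for x in range(1, k + 1)]

def is_consistent(known, k):
    """True iff the known symbols agree with some codeword."""
    pts = sorted(known.items())
    return all(_interpolate(pts[:k], x) == y for x, y in pts[k:])

if __name__ == "__main__":
    codeword = rs_encode([5, 17, 42], m=7)
    subset = {2: codeword[1], 5: codeword[4], 7: codeword[6]}
    print(rs_decode(subset, k=3))
```

The `is_consistent` helper corresponds to checking whether a partial vector can be completed to a codeword, which is the error-detection use of the code.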
Diagnosis Graph: The fault-free processors' (potentially partial) knowledge of the identity of the faulty processors is captured by a diagnosis graph. A diagnosis graph is an undirected graph with $n$ vertices, with vertex $i$ corresponding to processor $P_i$. A pair of processors are said to “trust” each other if the corresponding pair of vertices in the diagnosis graph is connected with an edge; otherwise they are said to “accuse” each other. Before the start of the very first generation, the diagnosis graph is initialized as a fully connected graph, which implies that all the $n$ processors initially trust each other. During the execution of the algorithm, whenever misbehavior by some faulty processor is detected, the diagnosis graph will be updated, and one or more edges will be removed from the graph, using the diagnostic information communicated using the Broadcast_Single_Bit algorithm. The use of Broadcast_Single_Bit ensures that the fault-free processors always have a consistent view of the diagnosis graph. As we will show later, the evolution of the diagnosis graph satisfies the following properties:
• If an edge is removed from the diagnosis graph, at least one of the processors corresponding to the two endpoints of the removed edge must be faulty.
• The fault-free processors always trust each other throughout the algorithm.
• If more than $t$ edges at a vertex in the diagnosis graph are removed, then the processor corresponding to that vertex must be faulty.
The last two properties above follow directly from the first property, and the assumption that there are at most $t$ faulty processors.

3 Multi-Valued Consensus

In this section, we describe our consensus algorithm and present a proof of correctness. The $L$-bit input value $v_i$ at each processor is divided into $L/D$ parts of size $D$ bits each, as noted earlier.
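As a small illustration of this division, the sketch below splits an input into its per-generation parts. Representing the $L$-bit value as a string of '0'/'1' characters and the name `split_into_parts` are assumptions made only for this example.

```python
def split_into_parts(value_bits, D):
    """Split an L-bit string into L/D consecutive parts of D bits each.
    Raises ValueError if L is not a multiple of D, mirroring the
    assumption that L/D is an integer."""
    if D <= 0 or len(value_bits) % D != 0:
        raise ValueError("L must be a positive multiple of D")
    return [value_bits[s:s + D] for s in range(0, len(value_bits), D)]

if __name__ == "__main__":
    # L = 12, D = 4: three generations, one part per generation.
    print(split_into_parts("101100111000", 4))
```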
These parts are denoted as $v_i(1), v_i(2), \cdots, v_i(L/D)$. Our algorithm for achieving $L$-bit consensus consists of $L/D$ sequential executions of Algorithm 1 presented in this section (we will discuss the algorithm in detail below). Algorithm 1 is executed once for each generation. For the $g$-th generation ($1 \le g \le L/D$), each processor $P_i$ uses $v_i(g)$ as its input in Algorithm 1. Each generation of the algorithm results in processor $P_i$ deciding on the $g$-th part (namely, $v'_i(g)$) of its final decision value $v'_i$.

The value $v_i(g)$ is represented by a vector of $n - 2t$ symbols, each symbol represented with $D/(n-2t)$ bits. For convenience of presentation, we assume that $D/(n-2t)$ is an integer. We will refer to these $n - 2t$ symbols as the data symbols.

An $(n, n-2t)$ distance-$(2t+1)$ Reed-Solomon code, denoted as $C_{2t}$, is used to encode the $n - 2t$ data symbols into $n$ coded symbols. We assume that $D/(n-2t)$ is large enough to allow the above Reed-Solomon code to exist, specifically, $n \le 2^{D/(n-2t)} - 1$. This condition is met only if $L$ is large enough (since $L > D$).

We now present some notations to be used in our discussion below. For an $m$-element vector $V$, we denote $V[j]$ as the $j$-th element of the vector, $1 \le j \le m$. Given a subset $A \subseteq \{1, \ldots, m\}$, denote $V/A$ as the ordered list of elements of $V$ at the locations corresponding to elements of $A$. For instance, if $m = 5$ and $A = \{2, 4\}$, then $V/A$ is equal to $(V[2], V[4])$. We will say that $V/A \in C_{2t}$ if there exists a codeword $Z \in C_{2t}$ such that $Z/A = V/A$. Otherwise, we will say that $V/A \notin C_{2t}$.

Suppose that $Z$ is the codeword corresponding to data $v$. This is denoted as $Z = C_{2t}(v)$, and $v = C_{2t}^{-1}(Z)$. We will extend the definition of the inverse function $C_{2t}^{-1}$ as follows.
When set $A$ contains at least $n - 2t$ elements, we will define $C_{2t}^{-1}(V/A) = v$, if there exists a codeword $Z \in C_{2t}$ such that $Z/A = V/A$ and $C_{2t}(v) = Z$.

Let the set of all the fault-free processors be denoted as $P_{good}$.

Algorithm 1 for each generation $g$ consists of three stages. We summarize the function of these three stages first, followed by a more detailed discussion:

1. Matching stage: Each processor $P_i$ encodes its $D$-bit input $v_i(g)$ for generation $g$ into $n$ coded symbols, as noted above. Each processor $P_i$ sends one of these $n$ coded symbols to the other processors that it trusts. Processor $P_i$ trusts processor $P_j$ if and only if the corresponding vertices in the diagnosis graph are connected by an edge. Using the symbols thus received from each other, the processors attempt to identify a “matching set” of processors (denoted $P_{match}$) of size $n - t$ such that the fault-free processors in $P_{match}$ are guaranteed to have an identical input value for the current generation. If such a $P_{match}$ is not found, it can be determined with certainty that all the fault-free processors do not have the same input value; in this case, the fault-free processors decide on a default output value and terminate the algorithm.

2. Checking stage: If a set of processors $P_{match}$ is identified in the above matching stage, each processor $P_j \notin P_{match}$ checks whether the symbols received from the processors in $P_{match}$ correspond to a valid codeword. If such a codeword exists, then the symbols received from $P_{match}$ are said to be “consistent”. If any processor finds that these symbols are not consistent, then misbehavior by some faulty processor is detected. Else all the processors are able to correctly compute the value to be agreed upon in the current generation.

3. Diagnosis stage: When misbehavior is detected in the checking stage, the processors in $P_{match}$ are required to broadcast the coded symbol they sent in the matching stage, using the Broadcast_Single_Bit algorithm. Using the information received during these broadcasts, the fault-free processors are able to learn new information regarding the potential identity of the faulty processor(s). The diagnosis graph (called Diag_Graph in Algorithm 1) is updated to incorporate this new information.

In the rest of this section, we discuss each of the three stages in more detail. Note that whenever algorithm Broadcast_Single_Bit is used, all the fault-free processors will receive the broadcasted information identically. One instance of Broadcast_Single_Bit is needed for each bit of information broadcasted using Broadcast_Single_Bit.

Algorithm 1 Multi-Valued Consensus (generation $g$)

1. Matching Stage:
Each processor $P_i$ performs the matching stage as follows:
(a) Compute $(S_i[1], \ldots, S_i[n]) = C_{2t}(v_i(g))$, and send $S_i[i]$ to every trusted processor $P_j$
(b) $R_i[j] \leftarrow$ symbol that $P_i$ receives from $P_j$, if $P_i$ trusts $P_j$; $\perp$, otherwise
(c) If $S_i[j] = R_i[j]$ then $M_i[j] \leftarrow$ true; else $M_i[j] \leftarrow$ false
(d) $P_i$ broadcasts the vector $M_i$ using Broadcast_Single_Bit
Using the received $M$ vectors:
(e) Find a set of processors $P_{match}$ of size $n - t$ such that $M_j[k] = M_k[j] =$ true for every pair of $P_j, P_k \in P_{match}$
(f) If $P_{match}$ does not exist then decide on a default value and terminate; else enter the Checking Stage

2. Checking Stage:
Each processor $P_j \notin P_{match}$ performs steps 2(a) and 2(b):
(a) If $R_j/P_{match} \in C_{2t}$ then $Detected_j \leftarrow$ false; else $Detected_j \leftarrow$ true.
(b) Broadcast $Detected_j$ using Broadcast_Single_Bit.
Each processor $P_i$ performs step 2(c):
(c) Receive $Detected_j$ from each processor $P_j \notin P_{match}$ (broadcasted in step 2(b)). If $Detected_j =$ false for all $P_j \notin P_{match}$, then decide on $v'_i(g) = C_{2t}^{-1}(R_i/P_{match})$; else enter the Diagnosis Stage

3. Diagnosis Stage:
Each processor $P_j \in P_{match}$ performs step 3(a):
(a) Broadcast $S_j[j]$ using Broadcast_Single_Bit (one instance of Broadcast_Single_Bit is needed for each bit of $S_j[j]$)
Each processor $P_i$ performs the following steps:
(b) $R^{\#}[j] \leftarrow$ symbol received from $P_j \in P_{match}$ as a result of the broadcast in step 3(a)
(c) For all $P_j \in P_{match}$, if $P_i$ trusts $P_j$ and $R_i[j] = R^{\#}[j]$ then $Trust_i[j] \leftarrow$ true; else $Trust_i[j] \leftarrow$ false
(d) Broadcast $Trust_i/P_{match}$ using Broadcast_Single_Bit
(e) For each edge $(j, k)$ in Diag_Graph, remove edge $(j, k)$ if $Trust_j[k] =$ false or $Trust_k[j] =$ false
(f) If $R^{\#}/P_{match} \in C_{2t}$ then: if for any $P_j \notin P_{match}$, $Detected_j =$ true, but no edge at vertex $j$ was removed in step 3(e), then remove all edges at vertex $j$ in Diag_Graph
(g) If at least $t + 1$ edges at any vertex $j$ have been removed so far, then processor $P_j$ must be faulty, and all edges at $j$ are removed.
(h) Find a set of processors $P_{decide} \subset P_{match}$ of size $n - 2t$ in the updated Diag_Graph, such that every pair of $P_j, P_k \in P_{decide}$ trust each other.
(i) Decide on $v'_i(g) = C_{2t}^{-1}(R^{\#}/P_{decide})$.

3.1 Matching Stage

The line numbers referred to below correspond to the line numbers for the pseudo-code in Algorithm 1.

Line 1(a): In generation $g$, each processor $P_i$ first encodes $v_i(g)$, represented by $n - 2t$ symbols, into a codeword $S_i$ from the code $C_{2t}$. The $j$-th symbol in the codeword is denoted as $S_i[j]$. Then processor $P_i$ sends $S_i[i]$, the $i$-th symbol of its codeword, to all the other processors that it trusts.
Recall that $P_i$ trusts $P_j$ if and only if there is an edge between the corresponding vertices in the diagnosis graph (referred to as Diag_Graph in the pseudo-code).

Line 1(b): Let us denote by $R_i[j]$ the symbol that $P_i$ receives from a trusted processor $P_j$. Processor $P_i$ ignores any messages received from untrusted processors, treating the message as a distinguished symbol $\perp$.

Line 1(c): Flag $M_i[j]$ is used to record whether processor $P_i$ finds processor $P_j$'s symbol consistent with its own local value. Specifically, the pseudo-code in line 1(c) is equivalent to the following:
• When $P_i$ trusts $P_j$: If $R_i[j] = S_i[j]$, then set $M_i[j] =$ true; else $M_i[j] =$ false.
• When $P_i$ does not trust $P_j$: $M_i[j] =$ false.

Line 1(d): As we will see later, if a fault-free processor $P_i$ does not trust another processor, then the other processor must be faulty. Thus entry $M_i[j]$ in vector $M_i$ is false if $P_i$ believes that processor $P_j$ is faulty, or that the value at processor $P_j$ differs from the value at $P_i$. Thus, entry $M_i[j]$ being true implies that, as of this time, $P_i$ believes that $P_j$ is fault-free, and that the value at $P_j$ is possibly identical to the value at $P_i$.

Processor $P_i$ uses Broadcast_Single_Bit to broadcast $M_i$ to all the processors. One instance of Broadcast_Single_Bit is needed for each bit of $M_i$.

Lines 1(e) and 1(f): Due to the use of Broadcast_Single_Bit, all fault-free processors receive an identical vector $M_j$ from each processor $P_j$. Using these $M$ vectors, each processor $P_i$ attempts to find a set $P_{match}$ containing exactly $n - t$ processors such that, for every pair $P_j, P_k \in P_{match}$, $M_j[k] = M_k[j] =$ true. Since the $M$ vectors are received identically by all the fault-free processors (using Broadcast_Single_Bit), they can compute identical $P_{match}$.
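Lines 1(c) and 1(e) can be sketched as follows. The text does not say how $P_{match}$ is searched for, so the exhaustive search over subsets in a fixed order below is an assumption chosen for clarity (it is exponential in $n$); the fixed order is what lets every fault-free processor arrive at the same set from identical $M$ vectors. Processors are indexed from 0, $\perp$ is modelled as `None`, and the function names are hypothetical.

```python
from itertools import combinations

def match_vector(S_i, R_i):
    """Line 1(c): M_i[j] is True iff P_i trusts P_j (R_i[j] is not None)
    and the received symbol equals P_i's own codeword symbol S_i[j]."""
    return [r is not None and r == s for s, r in zip(S_i, R_i)]

def find_p_match(M, n, t):
    """Line 1(e): return the first (n - t)-subset, in lexicographic
    order, whose members pairwise report a match; None if there is none."""
    for cand in combinations(range(n), n - t):
        if all(M[j][k] and M[k][j] for j, k in combinations(cand, 2)):
            return cand
    return None

if __name__ == "__main__":
    n, t = 4, 1
    M = [[True] * n for _ in range(n)]
    M[0][1] = False  # P_0 reports a mismatch with P_1
    print(find_p_match(M, n, t))
```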
However, if such a set $P_{match}$ does not exist, then the fault-free processors conclude that all the fault-free processors do not have identical input; in this case, they decide on some default value, and terminate the algorithm. In the following discussion, we will show the correctness of this step.

In the proofs of Lemmas 1 and 2, we assume that the fault-free processors (that is, the processors in set $P_{good}$) always trust each other; this assumption will be shown to be correct later in Lemma 4.

Lemma 1 If for each fault-free processor $P_i \in P_{good}$, $v_i(g) = v(g)$, for some value $v(g)$, then a set $P_{match}$ necessarily exists (assuming that the fault-free processors trust each other).

Proof: Since all the fault-free processors have identical input $v(g)$ in generation $g$, $S_i = C_{2t}(v(g))$ for all $P_i \in P_{good}$. Since these processors are fault-free, and trust each other, they send each other correct messages in the matching stage. Thus, $R_i[j] = S_j[j] = S_i[j]$ for all $P_i, P_j \in P_{good}$. This fact implies that $M_i[j] =$ true for all $P_i, P_j \in P_{good}$. Since there are at least $n - t$ fault-free processors, it follows that a set $P_{match}$ of size $n - t$ must exist. □

Observe that, although the above proof shows that there exists a set $P_{match}$ containing only fault-free processors, there may also be other such sets that contain some faulty processors as well. That is, all the processors in $P_{match}$ cannot be assumed to be fault-free.

The contrapositive of Lemma 1 implies that, if a set $P_{match}$ does not exist, it is certain that the fault-free processors do not all have the same input value. In this case, they can correctly agree on some default value and terminate the algorithm. This proves the correctness of Line 1(f).

In the case when a set $P_{match}$ is found, the following lemma is useful.
Lemma 2 The fault-free processors in $P_{match}$ (that is, all the processors in $P_{match} \cap P_{good}$) have the same input for generation $g$.

Proof: $|P_{match} \cap P_{good}| \ge n - 2t$ because $|P_{match}| = n - t$ and there are at most $t$ faulty processors. Consider any two processors $P_i, P_j \in P_{match} \cap P_{good}$. Since $M_i[j] = M_j[i] =$ true, it follows that $S_i[i] = S_j[i]$ and $S_j[j] = S_i[j]$. Since there are at least $n - 2t$ fault-free processors in $P_{match} \cap P_{good}$, this implies that the codewords computed by these fault-free processors (in Line 1(a)) contain at least $n - 2t$ identical symbols. Since the code $C_{2t}$ has dimension $(n - 2t)$, this implies that the fault-free processors in $P_{match} \cap P_{good}$ must have identical input in generation $g$. □

3.2 Checking Stage

When $P_{match}$ is found during the matching stage, the checking stage is entered.

Lines 2(a) and 2(b): Every fault-free processor $P_j \notin P_{match}$ checks whether the symbols received from the trusted processors in $P_{match}$ are consistent with a valid codeword: that is, it checks whether $R_j/P_{match} \in C_{2t}$. The result of this test is broadcasted as a 1-bit notification $Detected_j$, using Broadcast_Single_Bit. If $R_j/P_{match} \notin C_{2t}$, then processor $P_j$ is said to have detected an inconsistency.

Line 2(c): If no processor announces in Line 2(b) that it has detected an inconsistency, each fault-free processor $P_i$ chooses $C_{2t}^{-1}(R_i/P_{match})$ as its decision value for generation $g$.

The following lemma argues correctness of the decision made in Line 2(c).

Lemma 3 If no processor detects inconsistency in Line 2(a), all fault-free processors $P_i \in P_{good}$ decide on the identical output value $v'(g)$ such that $v'(g) = v_j(g)$ for all $P_j \in P_{match} \cap P_{good}$.
Proof: Observe that the size of set $P_{match} \cap P_{good}$ is at least $n - 2t$, and hence the inverse operations $C_{2t}^{-1}(R_i/P_{match})$ and $C_{2t}^{-1}(R_i/(P_{match} \cap P_{good}))$ are both defined.

Since fault-free processors send correct messages, $R_i/(P_{match} \cap P_{good})$ are identical for all fault-free processors $P_i \in P_{good}$. Since no inconsistency has been detected by any processor, every fault-free processor $P_i$ decides on $C_{2t}^{-1}(R_i/P_{match})$ as its output. Since $C_{2t}$ has dimension $(n - 2t)$, $C_{2t}^{-1}(R_i/P_{match}) = C_{2t}^{-1}(R_i/(P_{match} \cap P_{good}))$. It then follows that all the fault-free processors $P_i$ decide on the identical value $v'(g) = C_{2t}^{-1}(R_i/(P_{match} \cap P_{good}))$ in Line 2(c). Since $R_j/(P_{match} \cap P_{good}) = S_j/(P_{match} \cap P_{good})$ for all processors $P_j \in P_{match} \cap P_{good}$, $v'(g) = v_j(g)$ for all $P_j \in P_{match} \cap P_{good}$. □

3.3 Diagnosis Stage

When any processor that is not in $P_{match}$ announces that it has detected an inconsistency, the diagnosis stage is entered. The algorithm allows for the possibility that a faulty processor may erroneously announce that it has detected an inconsistency. The purpose of the diagnosis stage is to learn new information regarding the potential identity of a faulty processor. The new information is used to remove one or more edges from the diagnosis graph Diag_Graph; as we will soon show, when an edge $(j, k)$ is removed from the diagnosis graph, at least one of $P_j$ and $P_k$ must be faulty. We now describe the steps in the Diagnosis Stage.

Lines 3(a) and 3(b): Every fault-free processor $P_j \in P_{match}$ uses Broadcast_Single_Bit to broadcast $S_j[j]$ to all processors. Let us denote by $R^{\#}[j]$ the result of the broadcast from $P_j$. Due to the use of Broadcast_Single_Bit, all fault-free processors receive identical $R^{\#}[j]$ for each processor $P_j \in P_{match}$.
This information will be used for diagnostic purposes.

Lines 3(c) and 3(d): Every fault-free processor $P_i$ uses flag $Trust_i[j]$ to record whether it “believes”, as of this time, that each processor $P_j \in P_{match}$ is fault-free or not. Then $P_i$ broadcasts $Trust_i/P_{match}$ to all processors using Broadcast_Single_Bit. Specifically,
• If $P_i$ trusts $P_j$ and $R_i[j] = R^{\#}[j]$, then set $Trust_i[j] =$ true;
• If $P_i$ does not trust $P_j$ or $R_i[j] \ne R^{\#}[j]$, then set $Trust_i[j] =$ false.

Line 3(e): Using the $Trust$ vectors, each fault-free processor $P_i$ then removes any edge $(j, k)$ from the diagnosis graph such that $Trust_j[k] =$ false or $Trust_k[j] =$ false. Due to the use of Broadcast_Single_Bit, all fault-free processors receive identical $Trust$ vectors. Hence they will remove the same set of edges and maintain an identical view of the updated Diag_Graph.

Line 3(f): As we will soon show, in the case $R^{\#}/P_{match} \in C_{2t}$, a processor $P_j \notin P_{match}$ that announces that it has detected an inconsistency, i.e., $Detected_j =$ true, must be faulty if no edge attached to vertex $j$ was removed in Line 3(e). Such processors $P_j$ are “isolated”, by having all edges attached to vertex $j$ removed from Diag_Graph, and the fault-free processors will not communicate with them anymore in subsequent generations.

Line 3(g): As we will soon show, a processor $P_j$ must be faulty if at least $t + 1$ edges at vertex $j$ have been removed. The identified faulty processor $P_j$ is then isolated.

Lines 3(h) and 3(i): Since Diag_Graph is updated only with information broadcasted with Broadcast_Single_Bit ($Detected$, $R^{\#}$ and $Trust$), all fault-free processors maintain an identical view of the updated Diag_Graph.
The fault-free processors can then compute an identical set $P_{decide} \subset P_{match}$ containing exactly $n-2t$ processors such that every pair $P_j, P_k \in P_{decide}$ trust each other. Finally, every fault-free processor chooses $C_{2t}^{-1}(R^{\#}/P_{decide})$ as its decision value for generation $g$.

We first prove the following property of the evolution of Diag_Graph.

Lemma 4 Every time the diagnosis stage is performed, at least one edge attached to a vertex corresponding to a faulty processor will be removed from Diag_Graph, and only such edges will be removed.

Proof: We prove this lemma by induction. For convenience of discussion, let us say an edge $(j,k)$ is "bad" if at least one of $P_j$ and $P_k$ is faulty.

Consider a generation $g$ starting with any instance of Diag_Graph in which only bad edges have been removed. When the diagnosis stage is performed, there are two possibilities: (1) a fault-free processor $P_i \notin P_{match}$ detects an inconsistency; or (2) a faulty processor $P_j \notin P_{match}$ announces that it has detected an inconsistency. We consider the two possibilities separately:

1. A fault-free processor $P_i \notin P_{match}$ detects an inconsistency: In this case, $R_i/P_{match} \notin C_{2t}$. However, according to the definition of $P_{match}$, $R_k/P_{match} = S_k/P_{match} \in C_{2t}$ for every processor $P_k \in P_{match} \cap P_{good}$. This implies that there must be a faulty processor $P_j \in P_{match}$ that is trusted by both $P_i$ and $P_k$ and has sent different symbols to the fault-free processors $P_i$ and $P_k$ during the matching stage. Thus, $R^{\#}[j]$ must be different from at least one of $R_i[j]$ and $R_k[j]$. As a result, $Trust_i[j] = false$ or $Trust_k[j] = false$. Then at least one of the bad edges $(i,j)$ and $(j,k)$ will be removed in Line 3(e).

2. A faulty processor $P_j \notin P_{match}$ announces that it has detected an inconsistency: Denote by $X \subset P_{match}$ the set of processors in $P_{match}$ that $P_j$ trusts. According to the algorithm, either a bad edge $(j,k)$ for some $P_k \in X$ was removed in Line 3(e), or no such edge was removed. In the former case, the bad edge $(j,k)$ is removed. In the latter case, there are two possibilities:

(a) $R^{\#}/P_{match} \in C_{2t}$: Given that no edge $(j,k)$ with $P_k \in X$ was removed in Line 3(e), one can conclude that, if $P_j$ is fault-free, then $Trust_j[k] = true$ and $R_j[k] = R^{\#}[k]$ for all $P_k \in X$, so that $R_j/X = R^{\#}/X \in C_{2t}$. On the other hand, observe that $P_j$ computes $Detected_j$ by checking whether $R_j/X \in C_{2t}$, since any message from untrusted processors in $P_{match}$ should have been ignored by $P_j$ in Line 1(b). From $Detected_j = true$, one can conclude that, if $P_j$ is fault-free, $R_j/X \notin C_{2t}$. This is a contradiction if $P_j$ is fault-free. So processor $P_j$ must be faulty and all edges at vertex $j$ are bad. These bad edges are removed in Line 3(f).

(b) $R^{\#}/P_{match} \notin C_{2t}$: In this case, similar to the discussion in case 1, some bad edge connecting two vertices corresponding to processors in $P_{match}$ is removed in Line 3(e).

So by the end of Line 3(f), at least one new bad edge has been removed. Moreover, since $R_i[k] = R^{\#}[k]$ for all fault-free processors $P_k \in P_{match} \cap P_{good}$, $Trust_i[k]$ remains $true$ for every pair of processors $P_i, P_k \in P_{good}$, which implies that the vertices corresponding to the fault-free processors will remain fully connected, and each will always have at least $n-t-1$ edges. It follows that a processor $P_j$ must be faulty if at least $t+1$ edges at vertex $j$ have been removed. So all edges at $j$ are bad and will be removed in Line 3(g).
We have now proved that for every generation that begins with a Diag_Graph in which only bad edges have been removed, at least one new bad edge, and only bad edges, will be removed from Diag_Graph by the end of the diagnosis stage. Together with the fact that Diag_Graph is initialized as a complete graph, this completes the proof. □

The above proof of Lemma 4 shows that all fault-free processors will trust each other throughout the execution of the algorithm, which justifies the assumption made in the proofs of the previous lemmas. The following lemma shows the correctness of Lines 3(h) and 3(i).

Lemma 5 By the end of the diagnosis stage, all fault-free processors $P_i \in P_{good}$ decide on the same output value $v'(g)$, such that $v'(g) = v_j(g)$ for all $P_j \in P_{match} \cap P_{good}$.

Proof: First of all, the set $P_{decide}$ necessarily exists since there are at least $n-2t \geq t+1$ fault-free processors in $P_{match} \cap P_{good}$ that always trust each other. Secondly, since the size of $P_{decide}$ is $n-2t \geq t+1$, it must contain at least one fault-free processor $P_k \in P_{decide} \cap P_{good}$. Since $P_k$ still trusts all processors of $P_{decide}$ in the updated Diag_Graph, $R^{\#}/P_{decide} = R_k/P_{decide} = S_k/P_{decide}$. The second equality is due to the fact that $P_k \in P_{match}$. Finally, since the size of the set $P_{decide}$ is $n-2t$, the inverse operation $C_{2t}^{-1}(R^{\#}/P_{decide})$ is defined, and it equals $C_{2t}^{-1}(S_k/P_{decide}) = v_k(g) = v_j(g)$ for all $P_j \in P_{match} \cap P_{good}$, as per Lemma 2. □

We can now conclude the correctness of Algorithm 1.

Theorem 1 Given $n$ processors of which at most $t < n/3$ are faulty, each given an input value of $L$ bits, Algorithm 1 achieves consensus correctly in $L/D$ generations, with the diagnosis stage performed at most $t(t+1)$ times.
Proof: According to Lemmas 1 to 5, consensus is achieved correctly for each generation $g$ of $D$ bits. So the termination and consistency properties are satisfied for the $L$-bit outputs after $L/D$ generations. Moreover, in the case that all fault-free processors are given an identical $L$-bit input $v$, the $D$-bit output $v'(g)$ in each generation $g$ equals $v(g)$ as per Lemmas 1, 3 and 5. So the $L$-bit output $v' = v$ and the validity property is also satisfied.

According to Lemma 4, every instance of the diagnosis stage removes at least one bad edge, and a faulty processor $P_j$ is isolated once more than $t$ edges at vertex $j$ have been removed. Hence it takes at most $t(t+1)$ instances of the diagnosis stage before all faulty processors are identified. After that, the fault-free processors will not communicate with the faulty processors, and thus the diagnosis stage will not be performed any more. So it will be performed at most $t(t+1)$ times in all cases. □

3.4 Complexity

We have discussed the operations of the proposed multi-valued consensus algorithm above. Now let us study the communication complexity of this algorithm. Let us denote by $B$ the complexity of broadcasting 1 bit with one instance of Broadcast_Single_Bit. In every generation, the complexity of each stage is as follows:

• Matching stage: every processor $P_i$ sends at most $n-1$ symbols, each of $D/(n-2t)$ bits, to the processors that it trusts, and broadcasts $n-1$ bits for $M_i$. So at most $\frac{n(n-1)}{n-2t} D + n(n-1)B$ bits in total are transmitted by all $n$ processors.
• Checking stage: every processor $P_j \notin P_{match}$ broadcasts one bit $Detected_j$ with Broadcast_Single_Bit, and there are $t$ such processors. So $tB$ bits are transmitted.
• Diagnosis stage: every processor $P_j \in P_{match}$ broadcasts one symbol $S_j[j]$ of $D/(n-2t)$ bits with Broadcast_Single_Bit, and every processor $P_i$ broadcasts $n-t$ bits of $Trust_i/P_{match}$ with Broadcast_Single_Bit. So the complexity is $\frac{n-t}{n-2t} DB + n(n-t)B$ bits.

According to Theorem 1, there are $L/D$ generations in total. In the worst case, $P_{match}$ can be found in every generation, so the matching and checking stages will be performed $L/D$ times. In addition, the diagnosis stage will be performed at most $t(t+1)$ times. Hence the communication complexity of the proposed consensus algorithm, denoted as $C_{con}(L)$, is computed as

$$C_{con}(L) = \left( \frac{n(n-1)}{n-2t} D + n(n-1)B + tB \right) \frac{L}{D} + t(t+1) \left( \frac{n-t}{n-2t} DB + n(n-t)B \right) \tag{1}$$

For a large enough value of $L$, with a suitable choice of $D = \sqrt{\frac{(n^2-n+t)(n-2t)L}{t(t+1)(n-t)}}$, we have

$$C_{con}(L) = \frac{n(n-1)}{n-2t} L + 2BL^{0.5} \sqrt{\frac{(n^2-n+t)\,t(t+1)(n-t)}{n-2t}} + t(t+1)\,n(n-t)B \tag{2}$$

Error-free algorithms that broadcast 1 bit with communication complexity $\Theta(n^2)$ bits are known [1, 2]. So we assume $B = \Theta(n^2)$. Then the complexity of our algorithm for $t < n/3$ becomes

$$C_{con}(L) = \frac{n(n-1)}{n-2t} L + O(n^4 L^{0.5} + n^6) = O(nL + n^4 L^{0.5} + n^6). \tag{3}$$

So for sufficiently large $L$ (that is, $L = \Omega(n^6)$), the communication complexity approaches $O(nL)$.

4 Multi-Valued Broadcast and Tolerating $t \geq n/3$ Failures

Here we briefly discuss the Byzantine broadcast problem (also known as the "Byzantine Generals Problem" [7]). Similar to the consensus problem, the broadcast problem also considers achieving agreement among $n$ processors: a designated "source" processor tries to broadcast an $L$-bit value to the other processors, while $t < n/3$ processors (possibly including the source) may be faulty.
Using techniques introduced in this paper, we can achieve error-free multi-valued broadcast with communication complexity $C_{bro}(L) < 1.5(n-1)L + \Theta(n^4 L^{0.5})$ bits for $t < n/3$ and large $L$ [8]. Notice that the complexity of any broadcast algorithm, even one that allows a positive probability of error, is lower bounded by $(n-1)L$. So we can achieve error-free broadcast with complexity within a factor of $1.5 + \epsilon$ of the optimal for any constant $\epsilon > 0$ and sufficiently large $L$.

Most of our discussion in the previous section is independent of the number of faulty processors. The requirement $t < n/3$ is needed only for the correctness of the deterministic error-free 1-bit broadcast algorithm Broadcast_Single_Bit. In practice, it may be desirable to tolerate $t \geq n/3$ failures at the cost of a non-zero probability of error. This need can be met by our algorithm with a small modification: substitute Broadcast_Single_Bit with any probabilistically correct 1-bit broadcast algorithm that tolerates the desired number of failures (for example, the ones with authentication from [10, 4]). With this modification, our algorithm tolerates the same number of failures as the 1-bit broadcast algorithm does, and makes an error only if the 1-bit broadcast algorithm fails. The only difference in the communication complexity is in the term that is sub-linear in $L$. So for sufficiently large $L$, the complexity of the modified algorithm is also $O(nL)$.

5 Conclusion

In this paper, we present an efficient error-free Byzantine consensus algorithm for long messages. The algorithm requires $O(nL)$ total bits of communication for messages of $L$ bits, for sufficiently large $L$. Our algorithm makes no cryptographic assumption and is still able to always solve the Byzantine consensus problem correctly.

References

[1] Piotr Berman, Juan A. Garay, and Kenneth J. Perry.
Bit optimal distributed consensus. Computer Science: Research and Applications, 1992.
[2] Brian A. Coan and Jennifer L. Welch. Modular construction of a Byzantine agreement protocol with optimal message bit complexity. Inf. Comput., 97(1):61–85, 1992.
[3] Danny Dolev and Rüdiger Reischuk. Bounds on information exchange for Byzantine agreement. J. ACM, 32(1):191–204, 1985.
[4] Danny Dolev and H. Ray Strong. Authenticated algorithms for Byzantine agreement. SIAM Journal on Computing, 12(4):656–666, 1983.
[5] Matthias Fitzi and Martin Hirt. Optimally efficient multi-valued Byzantine agreement. In PODC '06, 2006.
[6] Valerie King and Jared Saia. Breaking the O(n^2) bit barrier: scalable Byzantine agreement with an adaptive adversary. In Proceedings of the 29th ACM SIGACT-SIGOPS Symposium on Principles of Distributed Computing, PODC '10, pages 420–429, New York, NY, USA, 2010. ACM.
[7] Leslie Lamport, Robert Shostak, and Marshall Pease. The Byzantine generals problem. ACM Trans. on Programming Languages and Systems, 1982.
[8] Guanfeng Liang and Nitin Vaidya. Complexity of multi-valued Byzantine agreement. Technical Report, CSL, UIUC (http://arxiv.org/abs/1006.2422), June 2010.
[9] M. Pease, R. Shostak, and L. Lamport. Reaching agreement in the presence of faults. Journal of the ACM, 1980.
[10] Birgit Pfitzmann and Michael Waidner. Information-theoretic pseudosignatures and Byzantine agreement for t ≥ n/3. Technical Report, IBM Research, 1996.
[11] Andrew Chi-Chih Yao. Some complexity questions related to distributive computing (preliminary report). In STOC '79, 1979.
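As a numerical sanity check of Equations (1) and (2) in Section 3.4, the following Python sketch evaluates the cost expression for a given generation size $D$ and compares it with the closed form obtained at the stated choice of $D$. The function names are hypothetical, and the sample parameters, including $B = n^2$ (taking the constant hidden in $\Theta(n^2)$ to be 1), are assumptions made only for illustration. $D$ is treated as a real number, ignoring any integrality constraints.

```python
import math


def c_con(n, t, L, D, B):
    """Equation (1): total bits for generation size D and 1-bit broadcast cost B."""
    per_generation = n * (n - 1) / (n - 2 * t) * D + n * (n - 1) * B + t * B
    per_diagnosis = (n - t) / (n - 2 * t) * D * B + n * (n - t) * B
    return per_generation * L / D + t * (t + 1) * per_diagnosis


def optimal_d(n, t, L):
    """The choice of D stated before Equation (2)."""
    return math.sqrt((n * n - n + t) * (n - 2 * t) * L / (t * (t + 1) * (n - t)))


def c_con_closed_form(n, t, L, B):
    """Equation (2): the value of Equation (1) at D = optimal_d(n, t, L)."""
    middle = 2 * B * math.sqrt(L) * math.sqrt(
        (n * n - n + t) * t * (t + 1) * (n - t) / (n - 2 * t)
    )
    return n * (n - 1) / (n - 2 * t) * L + middle + t * (t + 1) * n * (n - t) * B


# Illustrative parameters (assumed, not taken from the paper).
n, t, L = 10, 3, 10**9
B = n * n
d_star = optimal_d(n, t, L)
cost_at_d_star = c_con(n, t, L, d_star, B)
closed_form = c_con_closed_form(n, t, L, B)
```

The two values agree up to floating-point rounding, and evaluating `c_con` at a smaller or larger $D$ gives a higher cost, which is consistent with the stated $D$ minimizing the terms of Equation (1) that depend on $D$.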