On Counteracting Byzantine Attacks in Network Coded Peer-to-Peer Networks

MinJi Kim∗, Luísa Lima∗, Fang Zhao∗, João Barros, Muriel Médard, Ralf Koetter, Ton Kalker, Keesook J. Han

Abstract

Random linear network coding can be used in peer-to-peer networks to increase the efficiency of content distribution and distributed storage. However, these systems are particularly susceptible to Byzantine attacks. We quantify the impact of Byzantine attacks on the coded system by evaluating the probability that a receiver node fails to correctly recover a file. We show that even for a small probability of attack, the system fails with overwhelming probability. We then propose a novel signature scheme that allows packet-level Byzantine detection. This scheme allows one-hop containment of the contamination, and saves bandwidth by allowing nodes to detect and drop the contaminated packets. We compare the net cost of our signature scheme with various other Byzantine schemes, and show that when the probability of Byzantine attacks is high, our scheme is the most bandwidth efficient.

Index Terms

Network coding, Byzantine, security, peer to peer, distributed storage, content distribution.

∗ The first three authors contributed equally to this work. M. Kim, F. Zhao and M. Médard ({minjikim, zhaof, medard}@mit.edu) are with the Research Laboratory of Electronics at the Massachusetts Institute of Technology, MA USA. L. Lima (luisalima@dcc.fc.up.pt) is with the Instituto de Telecomunicações, Department of Computer Science, Faculdade de Ciências, Universidade do Porto, Portugal. J. Barros (jbarros@fe.up.pt) is with the Instituto de Telecomunicações, Departamento de Engenharia Electrotécnica e de Computadores, Faculdade de Engenharia da Universidade do Porto, Portugal. R. Koetter (ralf.koetter@tum.de) is with the Institute for Communications Engineering of the Technischen Universitaet Muenchen, Germany. T. Kalker (ton.kalker@hp.com) is with the Hewlett-Packard Laboratories, CA USA. K. Han (keesook.Han@rl.af.mil) is with Air Force Research Laboratory, NY USA.

I. INTRODUCTION

Network coding [1], an alternative to the traditional forwarding paradigm, allows algebraic mixing of packets in a network. It maximizes throughput for multicast transmissions [2], [3], [4], as well as robustness against failures [5] and erasures [6]. Random linear network coding (RLNC), in which nodes independently take random linear combinations of the packets, is sufficient for multicast networks [7], and is suitable for dynamic and unstable networks, such as peer-to-peer (P2P) networks [8], [9].

A P2P network is a cooperative network in which storage and bandwidth resources are shared in a distributed architecture. This is a cost-effective and scalable way to distribute content to a large number of receivers. One such architecture is the BitTorrent system [10], which splits large files into small blocks. After a node downloads a block, it acts as a source for that particular block. The main challenges in these systems are the scheduling and management of rare blocks. As an alternative to current strategies for these challenges, [8], [9] propose the use of RLNC to increase the efficiency of content distribution in a P2P solution. These schemes are completely distributed and eliminate the need for a scheduler, since each node independently forwards a random linear combination. In addition, there is a high probability that each packet a node receives is linearly independent of the previous ones, and thus, the problem of redundancy caused by the flooding approaches in traditional P2P networks is reduced.
RLNC based schemes significantly reduce the downloading time and improve the robustness of the system [8], [11].

Despite their desirable properties, network coded P2P systems are particularly susceptible to Byzantine attacks [12], [13], [14], that is, the injection of corrupted packets into the information flow. Since network coding relies on mixing of packets, a single corrupted packet may easily corrupt the entire information flow [15], [16]. Furthermore, in P2P networks, there is typically no security control over the nodes that join the network and the packets that they redistribute. The topologies of the overlay graphs that arise from traditional P2P networks are often modeled as scale-free and small-world networks [17], [18], which are prone to the dissemination of epidemics, such as worms and viruses [19], [20].

Several authors address these problems in coded P2P networks. We shall discuss these countermeasures in Section II. Most of these can be divided into two main categories: (i) end-to-end error correction and (ii) misbehavior detection.

Motivated by these observations, we address the issues of Byzantine adversaries in coded P2P networks. The main contributions of this paper are as follows:

• We propose a model for the evaluation of the impact of Byzantine attacks in coded P2P networks, and provide analytical results which show that, even for a small probability of attack, the information can become contaminated with overwhelming probability.

• We propose a new efficient, packet-based signature scheme, designed specifically for RLNC systems, to detect Byzantine attacks by checking the membership of a received packet in the valid vector space. This scheme allows one-hop containment of the contamination.

• We analyze the overhead in terms of bandwidth associated with our signature scheme, and compare it to that of various Byzantine detection schemes.
We also show that our scheme is the most bandwidth efficient if the probability of attack is high.

This paper is organized as follows. Section II gives an overview of network coding in P2P networks and existing Byzantine detection schemes. In Section III, we analyze the impact of Byzantine attacks on the system. We propose our signature scheme in Section IV, and compare its overhead with other schemes in Section V. Finally, we conclude in Section VI.

II. BACKGROUND

A. Network coding in P2P networks

References [6], [7] propose a random block linear network coding system: a simple, practical capacity-achieving code, in which every node independently constructs its linear code randomly. In such a system, a source generates information in batches of $G$ packets (called a generation). The source then multicasts them to its destination nodes using RLNC, where only the packets from the same generation are mixed. Note that RLNC is a distributed protocol that requires no state information, which makes it suitable for dynamic and unstable networks where state information may change rapidly or may be hard to obtain.

Several authors have evaluated the performance of network coding in P2P networks. Gkantsidis et al. [9] propose a scheme for content distribution of large files in which nodes make forwarding decisions solely based on local information. This scheme improves the expected file download time and the robustness of the system. Reference [8] compares the performance of network coding with traditional coding measures in a distributed storage setting with very limited storage space, with the goal of minimizing the number of storage locations a file-downloader connects to. They show that RLNC performs well without the need for a large amount of additional storage space.
Dimakis et al. [21] introduce a graph-theoretic framework for P2P distributed systems, and show that RLNC minimizes the bandwidth required to maintain distributed storage architectures.

B. Byzantine detection scheme for network coded systems

1) End-to-end error correction scheme: Reference [22] introduces network error correction for coded systems. They bound the maximum achievable rate in an adversarial setting, and generalize the Hamming, Gilbert-Varshamov, and Singleton bounds. Jaggi et al. [15] introduce the first distributed polynomial-time rate-optimal network codes that work in the presence of Byzantine nodes and are information-theoretically secure. The adversarial nodes are viewed as a secondary source. The source adds redundancy to help the receivers distill out the source information from the received mixtures. This work is generalized in [23], [24].

2) Generation-based Byzantine detection scheme: Ho et al. [25] introduce an information-theoretic approach for detecting Byzantine adversaries, which only assumes that the adversary did not see all linear combinations received by the receivers. Their detection probability varies with the length of the hash, the field size, and the amount of information unknown to the adversary. A polynomial hash is added to each packet in the generation. Once the destination node receives enough packets to decode a generation, it can probabilistically detect errors. The intuition behind this scheme is that if a packet is valid, then its data and hash are consistent with its coding vector; and a linear combination of valid packets is also valid.

3) Packet-based Byzantine detection scheme: Several signature schemes have been presented in the literature. For instance, [8], [26], [27] use homomorphic hash functions to detect contaminated packets.
Reference [16] suggests the use of a Secure Random Checksum (SRC), which requires less computation than the homomorphic hash function but requires a secure channel to transmit the SRCs. In addition, [28] proposes a signature scheme for network coding based on Weil pairing on elliptic curves.

III. IMPACT OF BYZANTINE ATTACKS ON P2P NETWORKS

In this section, we first introduce our model for evaluating the probability of a distributed denial of service (DDoS) attack caused by Byzantine nodes in a P2P network. We then present results for two distinct scenarios.

A. Model

We consider a directed graph with a set of nodes $\mathcal{N}$. A source node has a large file to be sent to receiver nodes. The file is divided into $m$ packets. To distribute it, the source connects to a subset of nodes, $\mathcal{N}_s \subseteq \mathcal{N}$, chosen uniformly at random, and sends each of them a different random linear combination of the original file packets. To ensure that enough degrees of freedom exist in the network, we require $|\mathcal{N}_s| \geq m$. We refer to the nodes in $\mathcal{N}_s$ as level-s nodes. A tracker node keeps track of the list of informed nodes, $\mathcal{N}(t)$, i.e., nodes that keep an information packet.

For a receiver to retrieve the file, it connects to a subset of nodes $\mathcal{N}_r \subseteq \mathcal{N}$, chosen uniformly at random, with $|\mathcal{N}_r| \geq |\mathcal{N}_s|$. We refer to the nodes in $\mathcal{N}_r$ as level-r nodes. Note that there may be an overlap between level-s and level-r. In each time slot, one of the uninformed level-r nodes, $n \in \mathcal{N}_r \setminus \mathcal{N}_s$, contacts the tracker to retrieve a random list of $d$ informed nodes, where $d < |\mathcal{N}_s|$. The node $n$ then connects to these informed nodes through a secure overlay connection, retrieves their packets, and stores a single random linear combination of these packets. During the same time slot, the tracker updates its list of informed nodes to $\mathcal{N}(t) \cup \{n\}$. This process is repeated for all nodes in $\mathcal{N}_r \setminus \mathcal{N}_s$, and then all level-r nodes forward their stored packets to the receiver. In order to maximize the probability of storing linearly independent combinations in level-r nodes and ensure decodability at the receiver, we set $d \geq 2$. Although we assume that each node in level-s and level-r stores only one packet, the model can be easily generalized to account for higher numbers. An example of this network model is shown in Figure 1.

Fig. 1. Network model. The source is connected to the level-s nodes, and the receiver is connected to the level-r nodes. The dark nodes are the informed nodes. The level-r nodes take turns to contact the tracker, and each connects to $d = 2$ level-s nodes based on the list returned by the tracker. Here, nodes $n_{r1}$ and $n_{r2}$ have completed this process, and the other level-r nodes have not.

Note that the tracker is considered to be a trusted party in our model. In fact, as in the case of most P2P protocols, a dishonest tracker would yield a protocol failure with overwhelming probability.

We define an Information Contact Graph $G(t) = \{\mathcal{N}(t), A(t)\}$ to denote the evolving graph formed in the above process, where $\mathcal{N}(t)$ is the list of informed nodes and $A(t)$ is the set of overlay links that connect the level-s and level-r nodes. The probability that a node becomes a Byzantine attacker is $p_b$. An attacker corrupts the packet it stores by generating arbitrary content while complying with the standard packet format. A node independently decides whether it becomes Byzantine at the start of the file dissemination process according to $p_b$, and stays that way throughout the process. We define an indicator variable $I_b(n)$ which is 1 if node $n$ is Byzantine and 0 otherwise. The tracker has no information about which nodes are Byzantine.
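The dissemination process of this model can be explored numerically. The following is a minimal Monte Carlo sketch, not part of the original analysis: the function names (`simulate_once`, `estimate_blocking`) and the `evolving` flag are illustrative assumptions. It treats a stored packet as corrupted if its holder is Byzantine or if it mixes at least one corrupted packet, and it estimates how often at least one level-r node ends up holding a corrupted packet.

```python
import random


def simulate_once(n_total, n_s, n_r, d, p_b, evolving, rng):
    """Run one dissemination round; return True if any level-r node holds a corrupted packet."""
    nodes = list(range(n_total))
    # Each node independently becomes Byzantine with probability p_b.
    byzantine = [rng.random() < p_b for _ in nodes]
    level_s = rng.sample(nodes, n_s)   # nodes served directly by the source
    level_r = rng.sample(nodes, n_r)   # nodes the receiver will contact
    # corrupted[n] is defined only for informed nodes.
    corrupted = {n: byzantine[n] for n in level_s}
    informed = list(level_s)           # tracker list N(t)
    for n in level_r:
        if n in corrupted:             # already informed (overlap with level-s)
            continue
        # Static list: always draw from level-s. Evolving list: draw from N(t).
        pool = informed if evolving else level_s
        contacts = rng.sample(pool, d)
        corrupted[n] = byzantine[n] or any(corrupted[c] for c in contacts)
        informed.append(n)
    return any(corrupted[n] for n in level_r)


def estimate_blocking(trials=2000, seed=0, **params):
    """Fraction of simulated rounds in which the receiver collects a corrupted packet."""
    rng = random.Random(seed)
    hits = sum(simulate_once(rng=rng, **params) for _ in range(trials))
    return hits / trials


# Illustrative parameters (an assumption, chosen to be small).
params = dict(n_total=30, n_s=5, n_r=6, d=3, p_b=0.1)
static_estimate = estimate_blocking(evolving=False, **params)
evolving_estimate = estimate_blocking(evolving=True, **params)
```

The simulation samples Byzantine nodes independently per node, so its estimates need not coincide exactly with any closed-form expression that relies on additional simplifying assumptions.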
A contaminated packet is a packet that is either directly corrupted by an attacker, or is a linear combination that involves at least one contaminated packet. A contaminated node is a node that stores a contaminated packet. The blocking probability $\Psi$ is the probability that the receiver collects at least one contaminated packet, and thus, is unable to decode the file. This is equivalent to the probability that the attacker successfully carries out a DDoS attack.

B. Analysis of Impact of Byzantine Attacks

We now evaluate the blocking probability at the receiver. We then consider the expected number of contaminated nodes at any given time. First, we introduce the necessary definitions, as follows. We define an indicator variable $I_c(t, n)$ which is equal to 1 if node $n$ is contaminated at time $t$ and 0 otherwise. $C(t)$ is a random variable for the number of contaminated nodes in $\mathcal{N}(t)$, and $\bar{C}(t) = |\mathcal{N}(t)| - C(t)$ is the number of uncontaminated nodes. The function $h(k; N, m, n)$ denotes the hypergeometric distribution,

$$h(k; N, m, n) = \binom{m}{k}\binom{N-m}{n-k} \Big/ \binom{N}{n}.$$

Let $N_b$ denote the number of informed Byzantine nodes at time $t = 0$, that is, the number of Byzantine nodes in $\mathcal{N}_s$. $N_b$ has a binomial distribution with parameters $(|\mathcal{N}_s|, p_b)$.

We consider two scenarios. In Theorem 1, for simplicity, we consider a static informed nodes list, in which the list kept by the tracker is fixed to $\mathcal{N}_s$. In this case, level-r nodes only connect to level-s nodes. Second, in Theorem 2, we generalize to the case in which the tracker updates its list of informed nodes to $\mathcal{N}(t)$, as stated in Section III-A.

Theorem 1 (Static Informed Nodes List): Let $G(t)$ be an information contact graph in which nodes in $\mathcal{N}_r$ only connect to nodes in $\mathcal{N}_s$. Then its blocking probability $\Psi$ is given by:

$$\Psi = 1 - \sum_{y=0}^{|\mathcal{N}_s|} h(y; |\mathcal{N}|, |\mathcal{N}_s|, |\mathcal{N}_r|) \left[ \sum_{i=0}^{|\mathcal{N}_s|} \binom{|\mathcal{N}_s|}{i} p_b^{i} (1-p_b)^{|\mathcal{N}_s|-i} f(i, y) \right],$$

where

$$f(i, y) = \left(1 - \frac{i}{|\mathcal{N}_s|}\right)^{y} \Big[ (1-p_b)\, h(0; |\mathcal{N}_s|, i, d) \Big]^{|\mathcal{N}_r|-y}.$$

Proof: We consider two disjoint subsets of $\mathcal{N}_r$: the set of informed nodes at $t = 0$, that is, $\mathcal{N}_r \cap \mathcal{N}_s$, and the uninformed nodes, that is, $\mathcal{N}_r \setminus \mathcal{N}_s$. Let $Y$ be a random variable for the number of nodes in $\mathcal{N}_r \cap \mathcal{N}_s$. $Y$ has a hypergeometric distribution, $P(Y = y) = h(y; |\mathcal{N}|, |\mathcal{N}_s|, |\mathcal{N}_r|)$.

We first consider $n \in \mathcal{N}_r \cap \mathcal{N}_s$. Given $N_b = i$ and $Y = y$, the probability that $n$ is uncontaminated is equal to the probability that it is not initially Byzantine, which is equal to $1 - i/|\mathcal{N}_s|$. Then, the probability that all nodes in $\mathcal{N}_r \cap \mathcal{N}_s$ are uncontaminated is:

$$P(I_c(0, n) = 0,\ \forall n \in \mathcal{N}_r \cap \mathcal{N}_s \mid N_b = i, Y = y) = \left(1 - \frac{i}{|\mathcal{N}_s|}\right)^{y}.$$

Now, at each time slot $t > 0$, a node $n \in \mathcal{N}_r \setminus \mathcal{N}_s$ becomes informed. For $n$ to be uncontaminated, it must not be Byzantine and it must connect to $d$ uncontaminated nodes. Then,

$$P(I_c(t, n) = 0 \mid N_b = i, Y = y) = (1-p_b)\, h(0; |\mathcal{N}_s|, i, d).$$

It follows that the probability that all nodes in $\mathcal{N}_r \setminus \mathcal{N}_s$ informed so far are uncontaminated at time $t$ is:

$$P(I_c(t, n) = 0,\ \forall n \in \mathcal{N}_r \setminus \mathcal{N}_s \mid N_b = i, Y = y) = \Big[ (1-p_b)\, h(0; |\mathcal{N}_s|, i, d) \Big]^{t},$$

for $0 \leq t \leq |\mathcal{N}_r| - y$. Note that since $|\mathcal{N}_r \setminus \mathcal{N}_s|$ nodes are added, the information dissemination process ends at $t = |\mathcal{N}_r| - y$. Now, the probability that only uncontaminated nodes exist in $\mathcal{N}_r$ at time $t = |\mathcal{N}_r| - y$, conditioned on $Y = y$ and $N_b = i$, is:

$$f(i, y) = \left(1 - \frac{i}{|\mathcal{N}_s|}\right)^{y} \Big[ (1-p_b)\, h(0; |\mathcal{N}_s|, i, d) \Big]^{|\mathcal{N}_r|-y}.$$

$N_b$ has a binomial distribution, $Y$ has a hypergeometric distribution, and they are independent of each other.
Removing the conditioning on these two variables, the probability that all nodes in $\mathcal{N}_r$ are uncontaminated is:

$$\gamma = \sum_{y=0}^{|\mathcal{N}_s|} h(y; |\mathcal{N}|, |\mathcal{N}_s|, |\mathcal{N}_r|) \left[ \sum_{i=0}^{|\mathcal{N}_s|} \binom{|\mathcal{N}_s|}{i} p_b^{i} (1-p_b)^{|\mathcal{N}_s|-i} f(i, y) \right].$$

It follows that the blocking probability is $\Psi = 1 - \gamma$.

We now consider that the list of informed nodes at the tracker is $\mathcal{N}(t)$, that is, it is updated with each new informed level-r node.

Theorem 2 (Evolving informed nodes list): Let $G(t)$ be an information contact graph in which $|\mathcal{N}_r \setminus \mathcal{N}_s|$ nodes are to be added to the graph by connecting to nodes in $\mathcal{N}(t)$. Then its blocking probability $\Psi$ is:

$$\Psi = 1 - \sum_{y=0}^{|\mathcal{N}_s|} h(y; |\mathcal{N}|, |\mathcal{N}_s|, |\mathcal{N}_r|) \left[ \sum_{i=0}^{|\mathcal{N}_s|} \binom{|\mathcal{N}_s|}{i} p_b^{i} (1-p_b)^{|\mathcal{N}_s|-i} f(i, y) \right],$$

where

$$f(i, y) = \left(1 - \frac{i}{|\mathcal{N}_s|}\right)^{y} \left[ \prod_{t=1}^{|\mathcal{N}_r|-y} (1-p_b)\, h(0; |\mathcal{N}_s| + t - 1, i, d) \right].$$

Proof: Recall from Theorem 1 that we consider two disjoint subsets of $\mathcal{N}_r$, that is, $\mathcal{N}_r \cap \mathcal{N}_s$ and $\mathcal{N}_r \setminus \mathcal{N}_s$. As before, $Y$ is the number of nodes in $\mathcal{N}_r \cap \mathcal{N}_s$. Again, at time $t = 0$, the probability that all nodes in $\mathcal{N}_r \cap \mathcal{N}_s$ are uncontaminated given $N_b = i$ and $Y = y$ is $(1 - i/|\mathcal{N}_s|)^{y}$.

We now consider the nodes in $\mathcal{N}_r \setminus \mathcal{N}_s$ and assume $N_b = i$, $Y = y$. At each time step, there are $C(t)$ contaminated nodes and $\bar{C}(t) = |\mathcal{N}_s| + t - C(t)$ uncontaminated nodes in $\mathcal{N}(t)$. The probability of obtaining a contaminated node at time $t + 1$ depends only on $C(t)$ and $\bar{C}(t)$, and thus, we can model these probabilities by Markov chains $\Xi|_{N_b, Y} = \{S, P\}$, in which $S$ represents the set of states and $P$ represents the matrix of transition probabilities. A state in $S$ is represented by $s = (C(t), \bar{C}(t))$. Transitions from $s$ are only possible to $s' = (C(t) + 1, \bar{C}(t))$ and to $s'' = (C(t), \bar{C}(t) + 1)$. It is also important to note that the depth of the Markov chain is equal to $|\mathcal{N}_r \setminus \mathcal{N}_s| = |\mathcal{N}_r| - y$.
The transition probabilities from $s$ when adding a node $n$ are $P(s \to s') = P(I_c(t+1, n) = 1 \mid C(t), \bar{C}(t), N_b, Y)$ and $P(s \to s'') = P(I_c(t+1, n) = 0 \mid C(t), \bar{C}(t), N_b, Y)$. $\Xi|_{N_b, Y}$ is illustrated in Figure 2 for $|\mathcal{N}_r \setminus \mathcal{N}_s| = 2$. Let us denote $C(t)$ as $x$ and let $t' = |\mathcal{N}_s| + t$; it follows that $\bar{C}(t) = t' - x$. Now let $p^{t}_{\{s\}}$ denote the probability of being in state $s$ at time $t$. Then $p^{t}_{\{s\}} = p^{t}_{\{x,\, t'-x\}}$ can be defined recursively as:

$$
\begin{aligned}
p^{t}_{\{x,\,t'-x\}} &= p^{t-1}_{\{x-1,\,t'-x\}}\; p(\{x-1,\,t'-x\} \to \{x,\,t'-x\}) + p^{t-1}_{\{x,\,t'-x-1\}}\; p(\{x,\,t'-x-1\} \to \{x,\,t'-x\}), \\
p(\{x,\,t'-x\} \to \{x+1,\,t'-x\}) &= 1 - P(I_c(t, n) = 0 \mid x,\, t'-x,\, N_b = i,\, Y = y), \\
p(\{x,\,t'-x\} \to \{x,\,t'-x+1\}) &= P(I_c(t, n) = 0 \mid x,\, t'-x,\, N_b = i,\, Y = y), \\
p^{0}_{\{i,\,|\mathcal{N}_s|-i\}} &= 1.
\end{aligned}
$$

Fig. 2. Markov diagram for the dissemination process, $|\mathcal{N}_r| - Y = 2$. The transitions to the left (dotted arrows) represent the addition of an uncontaminated node, and the transitions to the right (filled arrows) represent the addition of a contaminated node. The grey states are considered in computing $\Psi$, that is, the states in which no contaminated nodes are added.

Now, consider that node $n$ is active at time $t$. The probability of $n$ being uncontaminated is the probability that it is not Byzantine and does not connect to contaminated nodes. Thus,

$$P(I_c(t, n) = 0 \mid C(t-1), \bar{C}(t-1), N_b = i, Y = y) = (1-p_b)\, h(0; |\mathcal{N}_s| + t - 1, C(t-1), d).$$

Now, notice that the probability of only having uncontaminated nodes at time $t = |\mathcal{N}_r| - y$ is the probability of starting in state $(C(0), \bar{C}(0)) = (i, |\mathcal{N}_s| - i)$ and ending in state $(i, |\mathcal{N}_s| - i + |\mathcal{N}_r| - y)$ after $|\mathcal{N}_r| - y$ steps: in that case, no contaminated node is added to the network.
The probability of this event, conditioned on $N_b = i$ and $Y = y$, is

$$\prod_{t=1}^{|\mathcal{N}_r|-y} P(I_c(t, n) = 0 \mid C(t-1), \bar{C}(t-1), N_b = i, Y = y) = \prod_{t=1}^{|\mathcal{N}_r|-y} (1-p_b)\, h(0; |\mathcal{N}_s| + t - 1, i, d).$$

Combining the results for sets $\mathcal{N}_r \cap \mathcal{N}_s$ and $\mathcal{N}_r \setminus \mathcal{N}_s$, the probability that no contaminated nodes exist in $\mathcal{N}_r$ given that $N_b = i$ and $Y = y$ is

$$f(i, y) = \left(1 - \frac{i}{|\mathcal{N}_s|}\right)^{y} \left[ \prod_{t=1}^{|\mathcal{N}_r|-y} (1-p_b)\, h(0; |\mathcal{N}_s| + t - 1, i, d) \right].$$

Finally, it follows that the blocking probability at time $|\mathcal{N}_r \setminus \mathcal{N}_s|$ is

$$\Psi = 1 - \sum_{y=0}^{|\mathcal{N}_s|} h(y; |\mathcal{N}|, |\mathcal{N}_s|, |\mathcal{N}_r|) \left[ \sum_{i=0}^{|\mathcal{N}_s|} \binom{|\mathcal{N}_s|}{i} p_b^{i} (1-p_b)^{|\mathcal{N}_s|-i} f(i, y) \right].$$

The results from Theorems 1 and 2 are illustrated in Figure 3. Note that even for a small $p_b$, the blocking probability $\Psi$ is very high. Even for the case in Theorem 1, $\Psi$ grows exponentially. This is because it is sufficient for a single level-r node to connect to a Byzantine node in level-s to contaminate the receiver. Figure 3 indicates that $\Psi$ grows faster for the evolving informed node list than for the static informed node list. This is due to the fact that as more nodes are added to the network, the presence of contaminated nodes becomes more likely, and thus, the probability that a level-r node connects to at least one contaminated node increases. The probability $\Psi$ also increases with other parameters such as $d$, $|\mathcal{N}_s|$, and $|\mathcal{N}_r|$, since they increase the probability of level-r nodes connecting to contaminated nodes.

From the above proofs, it follows that the number of contaminated nodes in $\mathcal{N}(t)$, $t > 0$, is dependent on the random variable $Y = |\mathcal{N}_r \cap \mathcal{N}_s|$. We now analyze the expected number of contaminated nodes in the network, $E[C(t)]$, conditioned on $Y = y$.

First, we consider the case of the static informed nodes list, conditioned on $N_b = i$, $Y = y$. It is clear that $E[C(0) \mid N_b = i] = i$.
Now, at each time step $t$, one contaminated node is added to $\mathcal{N}(t)$ with probability $1 - P(I_c(t, n) = 0 \mid N_b = i, Y = y)$, and thus $E[C(t) \mid N_b = i, Y = y] = i + t\,(1 - (1-p_b)\, h(0; |\mathcal{N}_s|, i, d))$. It follows that

$$E[C(t) \mid Y = y] = \sum_{i=0}^{|\mathcal{N}_s|} \binom{|\mathcal{N}_s|}{i} p_b^{i} (1-p_b)^{|\mathcal{N}_s|-i} \Big[ i + t\,\big(1 - (1-p_b)\, h(0; |\mathcal{N}_s|, i, d)\big) \Big].$$

In the case of the evolving informed nodes list, since the states of $\Xi|_{N_b, Y}$ are representative of the number of contaminated nodes in the network, $E[C(t) \mid N_b, Y]$ has a direct correspondence to the expected state the Markov chain is in after $t$ time steps; therefore:

$$E[C(t) \mid Y = y] = \sum_{i=0}^{|\mathcal{N}_s|} \binom{|\mathcal{N}_s|}{i} p_b^{i} (1-p_b)^{|\mathcal{N}_s|-i} \left[ i + \sum_{x=i}^{t} x\, p^{t}_{\{x,\,t'-x\}} \right].$$

Fig. 3. Blocking probability $\Psi$ as a function of the Byzantine probability $p_b$ (plotted for $0 \leq p_b \leq 0.5$) for $|\mathcal{N}| = 30$, $|\mathcal{N}_s| = 5$, $|\mathcal{N}_r| = 6$ and $d = 3$. The results for the static and evolving informed nodes list are shown in full and dashed, respectively.

In order to visualize these results, we take the expected value of $Y$ for the set of parameters chosen in Figure 3, which is equal to 1. Then, we plot $E[C(t) \mid Y = 1]$ for the static and evolving informed node lists. Figure 4 shows that the expected number of contaminated nodes in the static case is linear with time. For small probabilities $p_b$, $E[C(t) \mid Y = 1]$ is higher for the evolving case; as $p_b$ increases, the values for both cases become similar.

Fig. 4. Expected number of contaminated nodes $E(C(t) \mid Y = 1)$ as a function of the time step $t$, for $|\mathcal{N}| = 30$, $|\mathcal{N}_s| = 5$, $|\mathcal{N}_r| = 6$, $d = 3$ and $Y = 1$, with curves for $p_b = 0.0, 0.2, 0.4, 0.6, 0.8, 1.0$. The results for the static and evolving informed nodes list are shown in full and dashed, respectively.

IV. SIGNATURE SCHEME FOR BYZANTINE DETECTION

From the previous Section, we can see that coded P2P networks are highly vulnerable to Byzantine attacks, and the contamination can quickly spread throughout the network. Although we only consider a particular network model in Section III for the purpose of analysis, such problems exist in all network coded systems. Therefore, it is desirable to have a signature scheme that checks the validity of each received packet without decoding the whole file. Then the contamination can be contained in one hop, and we can avoid the decoding delay.

In uncoded systems, the source knows all the packets being transmitted in the network, and therefore, can sign each one of them. However, in a coded system, each node produces "new" packets, and standard digital signature schemes do not apply. Previous work that attempts to solve this problem is based on homomorphic hash functions [8], [26], [27], Secure Random Checksum [16], or Weil pairing on elliptic curves [28]. In this section, we introduce a novel signature scheme for the coded system based on the Discrete Logarithm problem.

We consider a directed graph with a set of nodes $\mathcal{N}$. A source node has a large file to be sent to receiver nodes. The file is divided into $m$ packets. A node in the network receives linear combinations of the packets from the source or from other nodes. In this framework, a node is also a server to packets it has downloaded, and always sends out random linear combinations of all the packets it has obtained so far to other nodes. When a receiver has received $m$ linearly independent packets, it can reconstruct the whole file.

We denote the $m$ original packets as $\bar{v}_1, ..., \bar{v}_m$, and view them as elements in the $l$-dimensional vector space $\mathbb{F}_p^{l}$, where $p$ is a prime.
The source node adds coding vectors to create $v_1, ..., v_m$,

$$v_i = (0, ..., 1, ..., 0, \bar{v}_{i1}, ..., \bar{v}_{il}),$$

where the first $m$ elements are zero except the $i$th element, which is 1, and $\bar{v}_{ij} \in \mathbb{F}_p$ is the $j$th element in $\bar{v}_i$. A packet $w$ received by a node is a linear combination of these vectors,

$$w = \sum_{i=1}^{m} \beta_i v_i,$$

where $(\beta_1, ..., \beta_m)$ is the global coding vector.

The key observation for our signature scheme is that the vectors $v_1, ..., v_m$ span a subspace $V$ of $\mathbb{F}_p^{m+l}$, and a received vector $w$ is a valid linear combination of vectors $v_1, ..., v_m$ if and only if it belongs to $V$. Our scheme is based on standard modular arithmetic (in particular the hardness of the Discrete Logarithm problem) and on an invariant signature for the linear span $V$. Each node verifies the integrity of a received vector $w$ by checking the membership of $w$ in $V$ based on the signature.

Our signature scheme is defined by the following ingredients:

• $q$: a large prime number such that $p$ is a divisor of $q - 1$. Note that standard techniques, such as those used in the Digital Signature Algorithm (DSA) [29], apply to find such $q$.

• $g$: a generator of the group $G$ of order $p$ in $\mathbb{F}_q$. Since the order of the multiplicative group $\mathbb{F}_q^{*}$ is $q - 1$ (a multiple of $p$), we can always find a subgroup, $G$, with order $p$ in $\mathbb{F}_q^{*}$.

• Private key: $K_s = \{\alpha_i\}_{i=1,...,m+l}$, a random set of elements in $\mathbb{F}_p^{*}$, only known to the source.

• Public key: $K_p = \{h_i = g^{\alpha_i}\}_{i=1,...,m+l}$, signed by some standard signature scheme, e.g., DSA, and published by the source.

To distribute a file in a secure manner, the signature scheme works as follows.

1) Using the vectors $v_1, ..., v_m$ from the file, the source finds a vector $u = (u_1, ..., u_{m+l}) \in \mathbb{F}_p^{m+l}$ orthogonal to all vectors in $V$. Specifically, the source finds a non-zero solution, $u$, to the set of equations $v_i \cdot u = 0$ for $i = 1, ..., m$.

2) The source computes the vector $x = (u_1/\alpha_1, u_2/\alpha_2, ..., u_{m+l}/\alpha_{m+l})$.

3) The source signs $x$ with some standard signature scheme and publishes $x$. We refer to the vector $x$ as the signature of the file being distributed.

4) The client node verifies that $x$ is signed by the source.

5) When a node receives a vector $w$ and wants to verify that $w$ is in $V$, it computes

$$d = \prod_{i=1}^{m+l} h_i^{x_i w_i},$$

and verifies that $d = 1$.

To see that $d$ is equal to 1 for any valid $w$, we have

$$d = \prod_{i=1}^{m+l} h_i^{x_i w_i} = \prod_{i=1}^{m+l} (g^{\alpha_i})^{u_i w_i / \alpha_i} = \prod_{i=1}^{m+l} g^{u_i w_i} = g^{\sum_{i=1}^{m+l} u_i w_i} = 1,$$

where the last equality comes from the fact that $u$ is orthogonal to all vectors in $V$.

Next, we show that the system described above is secure. In essence, the theorem below shows that given a set of vectors that satisfy the signature verification criterion, it is provably as hard as the Discrete Logarithm problem to find new vectors that also satisfy the verification criterion, other than those that are in the linear span of the vectors already known.

Definition 1: Let $p$ be a prime number and $G$ be a multiplicative cyclic group of order $p$. Let $k$ and $n$ be two integers such that $k < n$, and $\Gamma = \{h_1, ..., h_n\}$ be a set of generators of $G$. Given a linear subspace, $V$, of rank $k$ in $\mathbb{F}_p^{n}$ such that for every $v \in V$, the equality $\Gamma^{v} \triangleq \prod_{i=1}^{n} h_i^{v_i} = 1$ holds, we define the $(p, k, n)$-Diffie-Hellman problem as the problem of finding a vector $w \in \mathbb{F}_p^{n}$ with $\Gamma^{w} = 1$ but $w \notin V$.

By this definition, the problem of finding an invalid vector that satisfies our signature verification criterion is a $(p, m, m+l)$-Diffie-Hellman problem. Note that in general, the $(p, n-1, n)$-Diffie-Hellman problem has no solution. This is because if $V$ has rank $n - 1$ and a $w'$ exists such that $\Gamma^{w'} = 1$ and $w' \notin V$, then $w' + V$ spans the whole space, and any vector $w \in \mathbb{F}_p^{n}$ would satisfy $\Gamma^{w} = 1$.
This is clearly not true; therefore, no such $w'$ exists.

Theorem 3: For any $k < n - 1$, the $(p, k, n)$-Diffie-Hellman problem is as hard as the Discrete Logarithm problem.

Proof: Assume there exists an efficient algorithm to solve the $(p, k, n)$-Diffie-Hellman problem, and we wish to compute the discrete logarithm $\log_g(z)$ for some $z = g^{x}$, where $g$ is a generator of a cyclic group $G$ with order $p$. We can choose two random vectors $r = (r_1, ..., r_n)$ and $s = (s_1, ..., s_n)$ in $\mathbb{F}_p^{n}$, and construct $\Gamma = \{h_1, ..., h_n\}$, where $h_i = z^{r_i} g^{s_i}$ for $i = 1, ..., n$. We then find $k$ linearly independent (and otherwise random) solutions $v_1, ..., v_k$ to the equations $v \cdot r = 0$ and $v \cdot s = 0$. Note that there exist $n - 2$ linearly independent vector solutions to the above equations. Let $V$ be the linear span of $\{v_1, ..., v_k\}$; then any vector $v \in V$ satisfies $\Gamma^{v} = 1$. Now, if we have an algorithm for the $(p, k, n)$-Diffie-Hellman problem, we can find a vector $w \notin V$ such that $\Gamma^{w} = 1$. Since $\Gamma^{w} = g^{\,w \cdot (x r + s)}$, this vector would satisfy $w \cdot (x r + s) = 0$. Since $r$ is statistically independent from $(x r + s)$, with probability greater than $1 - 1/p$, we have $w \cdot r \neq 0$. In this case, we can compute

$$\log_g(z) = x = -\frac{w \cdot s}{w \cdot r} \pmod{p}.$$

This means the ability to solve the $(p, k, n)$-Diffie-Hellman problem implies the ability to solve the Discrete Logarithm problem.

This proof is an adaptation of a proof in an earlier publication by Boneh et al. [30].

Our signature scheme makes use of the linearity property of RLNC, and enables the nodes to check the integrity of packets without a secure channel, unlike the homomorphic hash function or SRC schemes [16], [26]. In addition, our scheme does not require the nodes to decode coded packets to check their validity, and is thus efficient in terms of delay. The computation involved in the signature generation and verification processes is very simple.
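To make the signing and verification steps of this section concrete, the following is a toy sketch under stated assumptions. The parameters $p = 101$, $q = 607$, $g = 64$ are illustrative choices, far too small to be secure. The function names are hypothetical, and the step that signs $x$ and $K_p$ with a standard scheme such as DSA is omitted. The sketch also draws the last $l$ entries of $u$ as non-zero values, which is one convenient way to obtain a non-zero orthogonal vector and is an assumption of this example rather than a requirement of the scheme. It needs Python 3.8 or later for modular inverses via `pow`.

```python
import random

# Toy parameters (assumption): P divides Q - 1 and G has order P modulo Q.
P, Q, G = 101, 607, 64
assert (Q - 1) % P == 0 and G != 1 and pow(G, P, Q) == 1


def augment(file_packets):
    """Prepend the unit coding vector e_i to each original packet."""
    m = len(file_packets)
    return [[1 if j == i else 0 for j in range(m)] + list(pkt)
            for i, pkt in enumerate(file_packets)]


def keygen(n, rng):
    """Private key alpha_i in F_P^*, public key h_i = G^alpha_i mod Q."""
    alpha = [rng.randrange(1, P) for _ in range(n)]
    h = [pow(G, a, Q) for a in alpha]
    return alpha, h


def sign(file_packets, alpha, rng):
    """Find u orthogonal to every augmented packet, then return x_i = u_i / alpha_i mod P."""
    l = len(file_packets[0])
    tail = [rng.randrange(1, P) for _ in range(l)]  # free coordinates of u
    # Because v_i starts with e_i, v_i . u = u_i + sum_j vbar_ij * tail_j = 0 fixes u_i.
    head = [(-sum(a * b for a, b in zip(pkt, tail))) % P for pkt in file_packets]
    u = head + tail
    return [(ui * pow(ai, -1, P)) % P for ui, ai in zip(u, alpha)]


def verify(w, h, x):
    """Check that the product of h_i^(x_i * w_i) equals 1 modulo Q."""
    d = 1
    for hi, xi, wi in zip(h, x, w):
        d = (d * pow(hi, (xi * wi) % P, Q)) % Q
    return d == 1


def combine(vectors, betas):
    """Linear combination of augmented packets over F_P."""
    return [sum(b * v[k] for b, v in zip(betas, vectors)) % P
            for k in range(len(vectors[0]))]


rng = random.Random(1)
packets = [[3, 14, 15, 92], [65, 35, 89, 79], [32, 38, 46, 26]]  # m = 3, l = 4
alpha, h = keygen(len(packets) + len(packets[0]), rng)
x = sign(packets, alpha, rng)
w = combine(augment(packets), [5, 7, 11])      # a valid coded packet
tampered = w[:-1] + [(w[-1] + 1) % P]          # one payload symbol changed
results = (verify(w, h, x), verify(tampered, h, x))
```

With this choice of $u$, changing a single payload symbol always shifts the exponent $u \cdot w$ by a non-zero amount modulo $p$, so the tampered packet is rejected; for a general $u$ or a general corruption, detection depends on the corrupted vector leaving the set of vectors orthogonal to $u$.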
Furthermore, our scheme uses the Discrete Logarithm problem, which is more standardized and widely used than the recently developed Weil pairing problem used in [28]. Lastly, we note that our signature scheme is rateless, which is not the case in end-to-end or generation-based detection schemes.

V. OVERHEAD ANALYSIS

In the previous sections, we showed that our signature scheme is beneficial, as even a small amount of attack can have a devastating effect in coded networks. However, we have not shown that this scheme is efficient in terms of bandwidth (i.e., the overhead of adding the signature scheme), and indeed, it is not always the case that our signature scheme is desirable. We now study the cost and benefit of the following three Byzantine schemes: 1) our signature scheme proposed in Section IV, 2) the end-to-end error correction scheme [15], and 3) the generation-based Byzantine detection scheme [25].

If we implement Byzantine detection schemes, we can detect contaminated data, drop them, and therefore only transmit valid data; however, this benefit comes with the overhead of the schemes in the form of hashes and signatures. It is important to note that, for the dropped data, the receivers perform erasure correction, which is computationally lighter than error correction; thus, there is no need for retransmissions.

We consider a node $n \in N$ in the network as in Section IV. Node $n$ wishes to check the validity of the data it forwards. Assume that node $n$ receives $M$ packets per time slot. Recall that $m$ is the number of packets in a file and $l$ is the length of each packet; therefore, each packet consists of $(m + l)$ symbols. If $n$ detects an error, then it discards that data; otherwise, it forwards the data. The probability that $n$ receives a contaminated packet is $p_n$, as shown in Figure 5. Note that the probability $p_n$ of an attack is topology dependent.
However, in order to compare the performance of various schemes, we use a generic per-node model to examine the overhead incurred at a node. We assume that there is an external model of vulnerability which gives an estimate of $p_n$. Note that the blocking probability $\Psi$ analyzed in Section III provides such an estimate.

Fig. 5. Diagram of a node $n$ in a network.

A. Overhead analysis of our packet-based signature scheme

We examine the overhead incurred by our signature scheme. Recall from Section IV that the file size is $ml \log p$ bits. The file is divided into $m$ packets, each of which is a vector in $\mathbb{F}_p^l$. Thus, the overhead of the RLNC scheme is $m/l$ times the file size, and in practical networks $m \ll l$.

The initial setup of our signature scheme involves the publishing of the public key, $K_p$, which is $(m + l) \log(q)$ bits. In typical cryptographic applications, the sizes of $p$ and $q$ are 20 bytes (160 bits) and 128 bytes (1024 bits), respectively; thus, the size of $K_p$ is approximately $6(m + l)/ml$ times the file size. This overhead is negligible as long as $6 \ll m \ll l$. For example, if we have a file of size 10 MB, divided into $m = 100$ packets, then the overhead is approximately 6%.

We note that the public key $K_p$ cannot be fully reused for multiple files, as it is possible for a malicious node to use information obtained from previously downloaded files to generate a vector which is not a valid linear combination of the original vectors yet satisfies the check $d = 1$. We do not provide the details of this for want of space. To prevent this from happening, we can redistribute keys for each additional file using one of the two methods below.

The first method consists of publishing a new public key $K_p$ for each file, which would incur an overhead of $6(m + l)/ml$ times the file size. Note that if we republish $K_p$ for every file, we can reuse the signature $x$.
The second method is to update $K_p$ partially and generate a new $x$ for each file. This incurs less overhead than the previous method; however, it requires a high variability in $w$ for it to be secure. This update incurs a negligible amount of overhead as well. For example, for a 10 MB file, the overhead is less than 0.1%.

The initial $K_p$ distribution costs approximately 6% of our file size, and the incremental update of $K_p$ and $x$ is much less than 6% if we use the second method. Therefore, we shall denote the overhead associated with our signature by $o_p \triangleq \frac{6}{100}(m + l)$ symbols per packet, i.e. 6% overhead. If $n$ detects an error in a packet, then it discards it; by doing so, $n$ can filter out all the contaminated packets and use its bandwidth to transmit only valid packets. Therefore, $n$ only forwards on average a $1 - \frac{o_p}{m + l}$ fraction of the data received.

Our signature scheme costs $o_p M$ symbols per time slot. However, by discarding the contaminated packets, node $n$ can on average save its bandwidth by $M(m + l)p_n$ symbols per time slot. Therefore, the net cost of the signature scheme as a fraction of the total data received is:
$$\frac{\max\{0,\; M o_p - M(m + l)p_n\}}{M(m + l)} = \frac{\max\{0,\; o_p - (m + l)p_n\}}{m + l}. \tag{1}$$
When $p_n$ is high, checking each packet for error saves on bandwidth, i.e. $(o_p - (m + l)p_n) < 0$, which shows that the cost of the signature scheme is canceled by the bandwidth gained from dropping the corrupted packets. Therefore, this approach is the most sensible when the network is unreliable or under heavy attack.

B. Overhead analysis of end-to-end error correction

In this subsection, we shall use the rate-optimal error correction codes from Jaggi et al. [15]. As long as the attack is within the network capacity, this scheme allows the intermediate nodes to transmit at the remaining network capacity, i.e.
the end-to-end network capacity minus the capacity the adversary can contaminate. In this scenario, node $n$ just naively performs RLNC and forwards the data it has received. Therefore, node $n$ transmits on average $M(m + l)p_n$ contaminated symbols. Thus, the net cost as a fraction of the total data received is:
$$\frac{M(m + l)p_n}{M(m + l)} = p_n. \tag{2}$$

C. Overhead analysis of generation-based Byzantine detection scheme

We now analyze the performance of the algorithm proposed by Ho et al. [25], which uses random block linear network coding with generation size $G$ (although we have focused on RLNC so far, it is possible to extend these results by considering $m$ as the generation size $G$). This scheme is very cheap: with 2% overhead, the detection probability is at least 98.9%. We denote the overhead associated with this scheme by $o_g \triangleq \frac{2}{100}(m + l)G$ symbols per generation. After collecting enough packets from the generation, node $n$ checks for possible errors in the generation, which can incur a large delay. If $n$ detects an error, it discards the entire generation of $G$ packets; otherwise, it forwards the data. This scheme requires only one hash for the entire generation, saving bits on the hashes compared to our signature scheme. However, it can be inefficient, as one contaminated packet can cause $n$ to discard an entire generation. The probability $p_g$ of dropping a generation of $G$ packets is given by:
$$p_g = 1 - \Pr(\text{all } G \text{ packets are valid}) = 1 - (1 - p_n)^G.$$

The cost and benefit of this scheme include three components: (i) the hash of $o_g$ symbols per generation, (ii) valid packets which are discarded if the generation is deemed contaminated, and (iii) bandwidth saved by dropping contaminated packets. The expected number of valid symbols dropped per generation is $p_g(1 - p_n)(m + l)G$. The expected number of contaminated symbols per generation is $p_n(m + l)G$.
Thus, the net cost as a fraction of the total data received is:
$$\frac{\max\{0,\; o_g + p_g(1 - p_n)(m + l)G - p_n(m + l)G\}}{(m + l)G}. \tag{3}$$

For this scheme to work, $n$ needs to receive at least $G$ packets from each generation to decode and detect errors. This may seem to indicate that this scheme is only applicable as an end-to-end scheme, but it can be extended to a local Byzantine detection scheme as shown in Figure 6.

Fig. 6. Network with non-malicious nodes $A$, $B$, $C$, $D$, $E$, and $F$, where node $A$ is transmitting at a total rate of $r$ to node $F$; however, $A$ sends half of its data through $B$ and the other half through $C$. Therefore, $B$ and $C$ can check the validity of the sub-generation they receive, where by sub-generation we mean a collection of $G/2$ encoded packets from $A$. By a similar argument, $D$, $E$, and $F$ can check the validity of a sub-generation of $G/4$, $G/4$, and $G$ packets from $A$, respectively.

The cost of the generation-based scheme increases dramatically with $G$. If $G$ is large enough, the probability of at least one corrupted packet in a generation is high even for small $p_n$. Thus, a large $G$ is undesirable, as almost every generation is found faulty and dropped, making the throughput go to zero. This can be verified with an asymptotic analysis of Equation (3). For any $p_n > 0$, $p_g \to 1$ as $G \to \infty$, while $o_g/((m + l)G) = 0.02$ stays constant, so
$$\lim_{G \to \infty} \frac{\max\{0,\; o_g + p_g(1 - p_n)(m + l)G - p_n(m + l)G\}}{(m + l)G} = \max\{0,\; 1.02 - 2p_n\} \approx \max\{0,\; 1 - 2p_n\}.$$

Note in Figure 7 that the cost peaks at $p_n \approx 0.2$. At $p_n \approx 0.2$, the scheme drops many generations for a few corrupted packets. Thus, at a moderate rate of attack, the generation-based scheme suffers. When $p_n < 0.2$, the generation-based scheme does well, since $p_n$ is low and the cost of the hash is distributed across $G$ packets. As $p_n$ increases from 0.2 to 0.5, the throughput to the receiver decreases as more generations are dropped. When $p_n > 0.5$, this scheme discards almost all generations; thus, the expected throughput is near zero.

Fig. 7. Ratio between the expected overhead and the total data received by a node for generation-based detection with generation size $G$, packet size 1000 bits, and hash size $o_g = \frac{2}{100}(m + l)G$ symbols per generation. (Horizontal axis: probability of error/attack $p_n$; vertical axis: ratio between the overhead and the total data received; curves for $G = 2, 4, 10, 20, 100$.)

D. Trade-offs and comparisons

In Figures 8 and 9, we compare the three schemes. As mentioned in Section V-B, the expected cost of the error correction scheme is linearly proportional to $p_n$. Therefore, for large $p_n$, this scheme performs badly. However, this simple scheme, where a node naively forwards all data it receives, outperforms the detection schemes when $p_n$ is low ($p_n < 0.03$). When $p_n$ is small, the overhead of detection exceeds the cost introduced by the attackers.

When $p_n$ is low, the overhead of our signature is costly, since we are devoting $o_p$ symbols per packet to detect an unlikely attack. In such a setting, the generation-based scheme performs well, as it distributes the cost of the hash ($o_g$ symbols) over $G$ packets. However, as $p_n$ increases, the cost of our signature becomes negligible since the bandwidth wasted by contaminated packets increases; thus, our signature scheme outperforms the generation-based scheme. It is important to note, however, that we underestimate the overhead associated with our signature scheme in this paper, as we do not take into account the public key distribution cost, which the generation-based scheme does not require.
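The net-cost expressions (1) to (3) can be evaluated directly, which may help when reading the comparison. The sketch below normalizes each expression by the total data received, so that $o_p = \frac{6}{100}(m + l)$ and $o_g = \frac{2}{100}(m + l)G$ appear as the fractions 0.06 and 0.02. The function names and the sampled values of $p_n$ and $G$ are assumptions chosen for illustration, not part of the paper.

```python
def cost_signature(p_n, o_p_frac=0.06):
    """Equation (1) with o_p = o_p_frac * (m + l): max{0, o_p_frac - p_n}."""
    return max(0.0, o_p_frac - p_n)


def cost_end_to_end(p_n):
    """Equation (2): every contaminated symbol is forwarded, so the cost is p_n."""
    return p_n


def cost_generation(p_n, G, o_g_frac=0.02):
    """Equation (3) with o_g = o_g_frac * (m + l) * G."""
    p_g = 1.0 - (1.0 - p_n) ** G  # probability that a generation is dropped
    return max(0.0, o_g_frac + p_g * (1.0 - p_n) - p_n)


if __name__ == "__main__":
    # Print one row per attack probability: end-to-end, signature, then
    # generation-based cost for a few generation sizes.
    for p_n in (0.01, 0.05, 0.2, 0.4):
        row = [cost_end_to_end(p_n), cost_signature(p_n)]
        row += [cost_generation(p_n, G) for G in (2, 10, 100)]
        print(p_n, [round(c, 3) for c in row])
```

Under these normalizations, the end-to-end cost $p_n$ and the signature cost $\max\{0, 0.06 - p_n\}$ cross at $p_n = 0.03$, which matches the threshold quoted in the text.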
Depending on the public key distribution infrastructure used and the frequency of key renewal, our scheme will therefore incur a higher overhead, resulting in an outward shift in the overhead in Figure 8.

Fig. 8. Ratio between the expected overhead and the total data received by a node with $o_p = \frac{6}{100}(m + l)$, $o_g = \frac{2}{100}(m + l)G$. (Horizontal axis: probability of error/attack $p_n$; vertical axis: ratio between the overhead and the total data received; curves: end-to-end error correction, packet-based with $o_p = 6\%$, and generation-based with $o_g = 2\%$ for $G = 2, 4, 10, 20, 100$.)

We briefly note the computational cost of implementing these schemes. When using our signature scheme or the generation-based detection scheme, node $n$ does not waste its bandwidth transmitting contaminated data, since it drops a single packet or an entire generation. Furthermore, there is no need for retransmission of the dropped data, as the receivers can perform erasure correction on the packets or the generations that have been dropped. It is important to note that for the end-to-end error correction scheme, the receivers need to perform error correction, which is computationally more expensive than erasure correction.

Fig. 9. Ratio between the expected overhead and the total data received by a node with $o_p = \frac{6}{100}(m + l)$, $o_g = \frac{2}{100}(m + l)G$ for $p_n \in [0, 0.1]$. (Same axes and curves as Fig. 8.)

VI. CONCLUSIONS

In this paper, we studied the problem of Byzantine attacks in network coded P2P networks. We used randomly evolving graphs to characterize the impact of Byzantine attackers on the receiver's ability to recover a file. As shown by our analysis, even a small number of attackers can contaminate most of the flow to the receivers. Motivated by this result, we proposed a novel signature scheme for any network using RLNC. The scheme makes use of the linearity of the code, and it can be used to easily check the validity of all received packets. Using this scheme, we can prevent the intermediate nodes from spreading the contamination by allowing nodes to detect contaminated data, drop them, and therefore only transmit valid data. We emphasize that there is no need for retransmission of the dropped data, since the receivers can perform erasure correction, which is computationally cheaper than error correction.

We analyzed the cost and benefit of the signature scheme, and compared it with the end-to-end error correction scheme and the generation-based detection scheme. We showed that the overhead associated with our scheme is low. Furthermore, when the probability of Byzantine attack is high, it is the most bandwidth efficient. However, if the probability of attack is low, generation-based Byzantine detection schemes are more appropriate.

REFERENCES

[1] R. Ahlswede, N. Cai, S.-Y. R. Li, and R. W. Yeung, "Network information flow," IEEE Transactions on Information Theory, vol. 46, pp. 1204–1216, 2000.
[2] T. Ho, M. Médard, M. Effros, and D. Karger, "The benefits of coding over routing in a randomized setting," in Proceedings of IEEE ISIT, Kanagawa, Japan, July 2003.
[3] Z. Li and B. Li, "Network coding: the case of multiple unicast sessions," in Proceedings of 42nd Annual Allerton Conference on Communication Control and Computing, September 2004.
[4] D. Lun, M. Médard, and R. Koetter, "Network coding for efficient wireless unicast," in Proceedings of International Zurich Seminar on Communications, Zurich, Switzerland, February 2006.
[5] R. Koetter and M. Médard, "An algebraic approach to network coding," IEEE/ACM Transactions on Networking, vol. 11, pp. 782–795, 2003.
[6] D. Lun, M. Médard, R. Koetter, and M. Effros, "On coding for reliable communication over packet networks," Physical Communication, vol. 1, no. 1, pp. 3–20, 2008.
[7] T. Ho, M. Médard, R. Koetter, M. Effros, J. Shi, and D. R. Karger, "A random linear coding approach to multicast," IEEE Transactions on Information Theory, vol. 52, pp. 4413–4430, 2006.
[8] S. Acedański, S. Deb, M. Médard, and R. Koetter, "How good is random linear coding based distributed network storage?" in Proceedings of 1st Netcod, Riva del Garda, Italy, April 2005.
[9] C. Gkantsidis and P. Rodriguez, "Network coding for large scale content distribution," in Proceedings of IEEE INFOCOM, Miami, FL, March 2005.
[10] "Bittorrent file sharing protocol," http://www.BitTorrent.com.
[11] C. Gkantsidis, J. Miller, and P. Rodriguez, "Comprehensive view of a live network coding p2p system," in Proceedings of ACM SIGCOMM/USENIX Internet Measurement Conference, Rio de Janeiro, Brazil, October 2006.
[12] R. Perlman, "Network layer protocols with byzantine robustness," Ph.D. dissertation, Massachusetts Institute of Technology, Cambridge, MA, October 1988.
[13] M. Castro and B. Liskov, "Practical byzantine fault tolerance," in Symposium on Operating Systems Design and Implementation (OSDI), February 1999.
[14] L. Lamport, R. Shostak, and M. Pease, "The byzantine generals problem," ACM Transactions on Programming Languages and Systems, vol. 4, pp. 382–401, 1982.
[15] S. Jaggi, M. Langberg, S. Katti, T. Ho, D. Katabi, and M. Médard, "Resilient network coding in the presence of byzantine adversaries," in Proceedings of IEEE INFOCOM, March 2007, pp. 616–624.
[16] C. Gkantsidis and P. Rodriguez, "Cooperative security for network coding file distribution," in Proceedings of IEEE INFOCOM, April 2006.
[17] A. Barabási and R. Albert, "Emergence of Scaling in Random Networks," Science, vol. 286, no. 5439, p. 509, 1999.
[18] L. A. Adamic, R. M. Lukose, A. R. Puniyani, and B. A. Huberman, "Search in power-law networks," Phys. Rev. E, vol. 64, no. 4, p. 046135, September 2001.
[19] R. Pastor-Satorras and A. Vespignani, "Epidemic spreading in scale-free networks," Phys. Rev. Lett., vol. 86, no. 14, pp. 3200–3203, Apr 2001.
[20] R. M. May and A. L. Lloyd, "Infection dynamics on scale-free networks," Phys. Rev. E, vol. 64, no. 6, p. 066112, Nov 2001.
[21] A. G. Dimakis, P. B. Godfrey, M. J. Wainwright, and K. Ramchandran, "Network coding for distributed storage systems," in Proceedings of IEEE INFOCOM, Anchorage, Alaska, May 2007.
[22] R. W. Yeung and N. Cai, "Network error correction," Communications in Information and Systems, no. 1, pp. 19–54, 2006.
[23] R. Koetter and F. Kschischang, "Coding for errors and erasures in random network coding," IEEE Transactions on Information Theory, vol. 54, pp. 3579–3591, 2008.
[24] D. Silva and F. Kschischang, "Adversarial error correction for network coding: Models and metrics," in Proceedings of Annual Allerton Conf. on Commun., Control, and Computing, Monticello, IL, September 2008.
[25] T. Ho, B. Leong, R. Koetter, M. Médard, and M. Effros, "Byzantine modification detection in multicast networks using randomized network coding," in Proceedings of IEEE ISIT, June 2004.
[26] M. Krohn, M. Freedman, and D. Mazières, "On-the-fly verification of rateless erasure codes for efficient content distribution," in Proceedings of IEEE Symposium on Security and Privacy, May 2004.
[27] Z. Yu, Y. Wei, B. Ramkumar, and Y. Guan, "An efficient signature-based scheme for securing network coding against pollution attacks," in Proceedings of IEEE INFOCOM, Phoenix, AZ, April 2008.
[28] D. Charles, K. Jain, and K. Lauter, "Signatures for network coding," in Proceedings of Conference on Information Sciences and Systems, March 2006.
[29] National Institute of Standards and Technology, "Digital signature standard (DSS)," FIPS PUB 186-2, 2000.
[30] D. Boneh and M. Franklin, "An efficient public key traitor tracing scheme," in Lecture Notes in Computer Science, vol. 1666. Springer-Verlag, 1999, pp. 338–353.
