PERSEUS Technology: New Trends in Information and Communication Security

Using cryptography to protect information and communication has basically two major drawbacks. First, the specific entropy profile of encrypted data makes their detection very easy. Second, the use of cryptography can be more or less regulated, not t…

Authors: Eric Filiol

Eric Filiol
Laboratoire de virologie et de cryptologie opérationnelles, France
http://sites.google.com/site/ericfiliol
ffiliol@gmail.com
November 11, 2018

Abstract

Using cryptography to protect information and communication has basically two major drawbacks. First, the specific entropy profile of encrypted data makes their detection very easy. Second, the use of cryptography can be more or less regulated, not to say forbidden, depending on the country. If the right to freely protect our personal and private data is a fundamental right, it must not hinder the action of Nation States with respect to national security. Allowing encryption to citizens holds for bad guys as well.

In this paper we propose a new approach in information and communication security that may solve all these issues, thus representing a rather interesting trade-off between apparently opposite security needs. We introduce the concept of scalable security, based on a computationally hard problem of coding theory, with the Perseus technology. The core idea is to encode data with variable punctured convolutional codes in such a way that any cryptanalytic attempt will require a time-consuming encoder reconstruction in order to decode. By adding noise in a suitable way, that reconstruction becomes intractable in practice except for intelligence services, which however must use supercomputers during a significant, scalable amount of time. Hence it naturally limits any will to unduly perform such attacks (e.g. against citizens' privacy). On the users' side, encoder and noise parameters are first exchanged through an initial, short https session. The principles behind that approach were mathematically validated in 1997 and 2007. We present the Perseus library we have developed under the triple GPL/LGPL/MPL licences.
This library can be used to protect any kind of data.

Keywords: Communication security - Coding theory - Code reconstruction - Traffic eavesdropping - Encryption.

Note: This work has been presented at iAWACS 2010.

1 Introduction

A necessary (but not sufficient) condition for cryptographic security lies in the secret key size. Cryptography is itself defined as the use of a secret quantity, the key, while coding uses open, widely known mathematical objects without any secret quantity. The main issue is then: can cryptography be characterized by the presence of a secret quantity only? While it is a necessary condition, it is not a sufficient one. The deep and careful analysis of the cryptographic laws of most countries (and international organizations) shows that the "legal" definition of what is crypto and what is not relates directly to the following (noise) probability

$$P[c_t = m_t \oplus e_t] = P[e_t = 1]$$

where $c_t$ and $m_t$ are the ciphertext and plaintext bits respectively and where $e_t$ can be defined as the noise bit produced by the key of the cryptosystem (footnote 1) at time instant $t$. Then, if $P[e_t = 1] = \frac{1}{2} \pm \epsilon$ with $\epsilon$ very close to zero, it is cryptography; otherwise ($\epsilon$ significantly different from 0) it is coding theory.

But are the differences between cryptography and coding theory so easy to define? Known cryptanalysis techniques intend to deal with the first case more or less efficiently. On the other side, there are a lot of decoding problems that are computationally hard. In this paper we consider such a computationally hard problem in order to provide a new information and communication protection scheme whose security level is scalable. We have called it Perseus (footnote 2) technology and we present here the open source library we have developed to protect any kind of data and protocols.

Perseus technology's core idea is to encode data with punctured convolutional codes.
Those codes are commonly used in telecommunications (GSM, satellite...) due to their very high encoding speed and their high correcting power.

After this encoding layer and right before transmission, an artificial noise is applied to the data flow (as any channel would do). The noise is generated according to the noise parameter $p = P[e_t = 1]$ where $e_t$ is the noise bit at time instant $t$. The value of $p$ is around 0.3. Since the convolutional encoder changes very frequently, the attacker always has to reconstruct the encoder first in order to be able to decode. This reconstruction has been proven to be a computationally hard problem [3, 6, 7, 8]. By scalable we mean that while it is always possible to break Perseus-protected data, the difficulty can be tuned in order to require more or less computational effort: from a few days to a few months on a supercomputer. In addition, only an equivalent, non-punctured encoder can be recovered [7]. However this problem may still remain tractable for any intelligence agency with suitable computing power.

The different parameters of the variable encoders are randomly generated: polynomial size constraint, encoding rate, puncturing matrix, noise parameter $p$, encoder polynomials... Then a short initial https session communicates those parameters to the recipient (about 256 bytes). The recipient, and only the recipient, is able first to get rid of the artificial deterministic noise and then to set up the suitable Viterbi algorithm for data decoding.

Footnote 1: This holds also for block ciphers where the "effect" of the key on the plaintext block can be formalized in this way.

Footnote 2: Perseus is the mythic hero of Greek mythology who killed the Gorgon Medusa [20]. The botnets, against which Perseus technology was initially designed [5], are themselves often compared to Medusa and its long tentacles.
What is the interest of using scalable security while, generally, only strong, unbreakable cryptography offers real security? Why would users prefer Perseus technology instead of strong cryptography? On the other side, why would existing national or international regulations tolerate the use of this technology? There are, on the contrary, many reasons to favour the Perseus approach over strong encryption.

• The use of encryption, besides the fact that it leads to severe constraints (encryption overhead, key management...), poses problems in terms of legal regulations, especially in the context of transnational streams subject to different national regulations. Then a critical issue arises: how can we protect our personal and private data while still allowing the necessary action of States (for national security for instance) in the field of communication surveillance, and without lessening the transmission rate significantly? The scalable security offered by Perseus provides such a trade-off very efficiently. Any Perseus-protected data can be broken provided that a significant amount of supercomputer time is spent. This limits any State's intent to spy on innocent people not involved in terrorism, mafia activities, child pornography..., making it focus on really bad guys. Moreover the generalization of encryption is not a good thing, as pointed out by the US National Security Agency [16, 14] and the British MI-5 [11] about the questionable French HADOPI initiative. Favouring the use of encryption to protect illegal downloading can severely hinder the cryptanalysis activities of States for national security purposes.

• Why use noisy encoded data instead of encrypted data? Encrypted data by nature exhibit a maximal entropy profile. It is then easy to detect encrypted data.
On the contrary, noisy encoded data can exhibit a lower entropy profile which remains closer to that of plain, unencoded data. This lower statistical profile makes it possible to bypass detection by entropy tests or any other statistical detection, which encrypted data cannot do.

To summarize these two strong points of Perseus technology, let us consider an illustrative example. John Doe is a US journalist in China. He wants to send a series of papers about China's Human Rights infringements (and about the 2010 Chinese Nobel Peace Prize). Sending his papers to his agency in the USA would be blocked by Chinese authorities whenever encrypted. On the contrary, Perseus-protected data will require a significant time to detect (due to the low entropy profile) and to break. The journalist will have time to go back safely to the USA.

This paper is organized as follows. Section 2 recalls basic facts about convolutional codes and their reconstruction. Section 3 presents the Perseus library structure while Section 4 deals with its detailed implementation. Section 5 presents the different experimental results we have obtained with respect to final data entropy and performance while Section 7 concludes by considering future evolution of this library.

2 Theoretical Background

In this section, we recall what a (punctured or not) convolutional code is, as well as the main results with respect to their reconstruction. The aim is just to provide the reader with the required background to understand the interest of those codes and why they are particularly suitable for our approach. The interested reader will refer to [13] for a more detailed presentation on convolutional codes.
2.1 Convolutional Codes

A convolutional encoder can be seen as an encoding system (based on a set of $k$ shift registers without feedback) such that, at each time instant, $k$ information digits (typically the bits of data) enter the encoder (one per register). Each information digit remains in the encoder for $K$ time units and may affect each output during that time. The constant $K$ is the constraint length or the memory of the encoder.

At each time instant, $n$ digits are output, each of them resulting from the xor of $k$ digits produced by the action of $n$ polynomials on each register. The encoder is thus said to be of rate $\frac{k}{n}$. The action of the $kn$ polynomials and the shift are easily described by polynomial multiplications [8]. So the polynomial representation will be used to represent the different streams.

A message is composed of $k$ interlaced input streams, each of them represented as a polynomial of degree $N + t$ denoted $a_i(x)$, $i = 1, \ldots, k$. The $kn$ polynomials are of degree $N$ (hence $N = K - 1$) and are denoted $f_{i,j}(x)$. Then the encoder produces $n$ output streams (of length $t$) represented as polynomials of degree $t$, $c_j(x)$, $j = 1, \ldots, n$, and we then have:

$$\sum_{i=1}^{k} a_i(x) f_{i,j}(x) = u_{j,1}(x) + x^{N} c_j(x) + x^{N+t} u_{j,2}(x) \qquad (1)$$

The polynomials $u_{j,1}(x)$ (resp. $u_{j,2}(x)$), which correspond to the filling (resp. the emptying) of the registers, are of degree at most $N - 1$. Then the coded sequence is composed of the $n$ interlaced output streams.

Thus the parameters of a convolutional encoder are:

• $k$ and $n$, defining the rate and the number of polynomials,
• $K$, the constraint length (in fact it is related to the internal memory of the encoder),
• the $kn$ polynomials $f_{i,j}(x)$ of degree $N = K - 1$.

The convolutional encoder then describes a $(n, k, N)$-code. Generally, $n$ and $k$ are small integers with $k < n$. The most frequent case is $k = n - 1$.
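To make the polynomial description concrete, here is a minimal sketch of a rate 1/2 feedforward encoder ($k = 1$, $n = 2$). It is not taken from the Perseus library: the function name `conv_encode_rate_half`, the one-bit-per-byte layout and the representation of each polynomial as a bit mask (bit $i$ is the coefficient of $x^i$) are assumptions made for illustration only.

```c
#include <stddef.h>
#include <stdint.h>

/* Parity (XOR of all bits) of a 32-bit word. */
static unsigned parity32(uint32_t v)
{
    v ^= v >> 16;
    v ^= v >> 8;
    v ^= v >> 4;
    v ^= v >> 2;
    v ^= v >> 1;
    return v & 1u;
}

/*
 * Hypothetical helper (not part of libperseus).
 * in  : len input bits, one bit per byte (values 0 or 1)
 * g1, g2 : the two encoder polynomials as bit masks,
 *          bit i = coefficient of x^i
 * out : receives 2*len bits, interlaced as v1_0, v2_0, v1_1, v2_1, ...
 */
void conv_encode_rate_half(const unsigned char *in, size_t len,
                           uint32_t g1, uint32_t g2, unsigned char *out)
{
    uint32_t reg = 0; /* bit i holds the input bit seen i time units ago */

    for (size_t t = 0; t < len; t++) {
        reg = (reg << 1) | (in[t] & 1u);
        /* Each output bit is the xor of the register bits selected
           by the corresponding polynomial. */
        out[2 * t]     = (unsigned char)parity32(reg & g1);
        out[2 * t + 1] = (unsigned char)parity32(reg & g2);
    }
}
```

With the masks `0x5` ($1 + x^2$) and `0x7` ($1 + x + x^2$), a single 1 followed by zeros produces the coefficients of the two polynomials, interlaced, which is the impulse response of the encoder.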
In contrast, $N$ must be made large enough to achieve low residual decoding error probabilities. The symbols are usually elements of $GF(2)$ but the generalization to $GF(q)$, where $q$ is some prime power ($q = p^m$ for some positive integer $m$), can be easily done. We will only consider the case $q = 2$ but all the implementation and results can be generalized to any other such $q$. This could be interesting for increasing the encoding speed. Figure 1 describes a convolutional encoder of rate $\frac{1}{2}$.

[Figure 1: Convolutional encoder of rate 1/2, with input stream $u_0, u_1, \ldots$ and output streams $v_0^{(1)}, v_1^{(1)}, \ldots$ and $v_0^{(2)}, v_1^{(2)}, \ldots$ interlaced as $v_0^{(1)}, v_0^{(2)}, v_1^{(1)}, v_1^{(2)}, \ldots$]

In the context of Perseus, we will add an artificial noise of parameter $p$ to the (encoded) output sequence

$$v = v_0^{(1)}, v_0^{(2)}, v_1^{(1)}, v_1^{(2)}, \ldots$$

The decoding step is performed through the classical Viterbi algorithm, whose complexity is exponential in $k \cdot N$. Hence, generally, its use is limited to codes of short lengths and to reduced encoding rates $\frac{k}{n}$. However in our case, since we completely master the noise (we know exactly where the noise bits are applied while any botnet agent does not), we can work with far higher values.

2.2 Punctured Convolutional Codes

Punctured convolutional codes were introduced by Cain et al. [4] as a means of greatly simplifying both Viterbi and sequential decoding of high rate convolutional codes at the expense of a relatively small performance penalty.

A punctured convolutional code $C$ is obtained by periodically deleting output symbols from a (base) $(n, k, N)$-convolutional code $C_b$. Output symbols from $C_b$ are deleted according to a periodic puncturing pattern (or perforation pattern) which can be described by its puncturing matrix:

$$P = \begin{pmatrix} p_{1,1} & \cdots & p_{1,M} \\ \vdots & & \vdots \\ p_{n,1} & \cdots & p_{n,M} \end{pmatrix}$$

A very important problem is that of the reconstruction of such codes (punctured or not).
In an attack context, a monitor wants to have access to the transmitted information (the message) without any knowledge of the encoder which produces the intercepted stream (the coded sequence). The only way is to reconstruct the encoder, that is to say to recover all its parameters. A simple decoding then gives access to the message provided that the channel noise is not too high (less than a very few percent).

Let us consider a $(n, k, N)$-(base) convolutional code $C_b$. A given puncturing pattern $P$ is an $n \times M$ 0-1 matrix with a total of $I$ 1's and $nM - I$ 0's, where $p_{i,j} = 0$ indicates that the $i$-th symbol of every branch in the $j$-th trellis section (of the trellis diagram of $C_b$) is to be deleted. Then the original code $C_b$, after being punctured with pattern $P$, has become a $(I, kM, m)$-(punctured) code $C$ (footnote 3) [15]. Let us consider an illustrative, simple example.

Example 1. Let us take the $(2, 1, 3)$ code with polynomials $(1 + x^2, 1 + x + x^2)$. The two output streams can be denoted as follows:

$$\begin{pmatrix} x_0 & x_1 & x_2 & x_3 & x_4 & x_5 & \cdots \\ y_0 & y_1 & y_2 & y_3 & y_4 & y_5 & \cdots \end{pmatrix}$$

By using the following puncturing pattern:

$$P = \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}$$

we then obtain the two following output streams:

$$\begin{pmatrix} x_0 & & x_2 & & x_4 & & \cdots \\ y_0 & y_1 & y_2 & y_3 & y_4 & y_5 & \cdots \end{pmatrix}$$

that we can rearrange as follows:

$$\begin{pmatrix} x_0 & x_2 & x_4 & \cdots \\ y_0 & y_2 & y_4 & \cdots \\ y_1 & y_3 & y_5 & \cdots \end{pmatrix}$$

It then becomes obvious that this puncturing produces a new encoder producing three output streams. By use of polycyclic pseudo-circulant matrices [7], the new parameters are easily defined and we have the following 6 polynomials

$$f_{1,1}(x) = 1 + x, \quad f_{1,2}(x) = 1 + x, \quad f_{1,3}(x) = 1,$$
$$f_{2,1}(x) = 0, \quad f_{2,2}(x) = x, \quad f_{2,3}(x) = 1 + x,$$

where $f_{i,j}$ denotes the $j$-th parity-check polynomial applied on input message stream $i$.

As far as Perseus is concerned, the puncturing pattern $P$ is the last parameter to exchange during the initial https session.
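The deletion step of Example 1 can be sketched in a few lines of C. This is only an illustration under stated assumptions: the function `puncture`, its argument layout (coded symbols ordered by time instant, then by stream; pattern stored row by row) and the one-symbol-per-byte representation are hypothetical and do not come from the Perseus library.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not part of libperseus).
 * coded   : n*T symbols, ordered by time instant then by stream,
 *           i.e. coded[t*n + i] is the symbol of stream i at time t
 * pattern : n*M entries stored row by row, pattern[i*M + j] = p_{i+1,j+1};
 *           a 0 entry means "delete this symbol"
 * out     : receives the kept symbols, in transmission order
 * Returns the number of symbols kept.
 */
size_t puncture(const unsigned char *coded, size_t n, size_t T,
                const unsigned char *pattern, size_t M,
                unsigned char *out)
{
    size_t kept = 0;

    for (size_t t = 0; t < T; t++) {
        size_t col = t % M; /* the pattern is applied periodically */
        for (size_t i = 0; i < n; i++) {
            if (pattern[i * M + col])
                out[kept++] = coded[t * n + i];
        }
    }
    return kept;
}
```

With $n = 2$, $M = 2$ and the pattern of Example 1 stored as `{1, 0, 1, 1}`, the sequence $x_0, y_0, x_1, y_1, x_2, y_2, x_3, y_3$ becomes $x_0, y_0, y_1, x_2, y_2, y_3$: three symbols are kept for every two time instants, which matches the three output streams of the equivalent encoder.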
Footnote 3: In fact, the degree of the punctured code may be less than $N$, but for most interesting punctured codes no degree reduction will take place.

2.3 Reconstruction of Convolutional Codes

Since any punctured convolutional code is equivalent to a non-punctured convolutional encoder, we will thus focus on the reconstruction of the latter codes. As far as code reconstruction is concerned, it is worth mentioning that the use of punctured codes makes it more complex since the equivalent non-punctured codes have parameters with higher values, for suitable values of $I$, $k$ and $M$.

It is always possible to reconstruct convolutional codes in offline mode. This is basically not a problem since in most real cases, convolutional encoders do not change very often because they are hardwired (as an example, two convolutional encoders of constraint length 9 are embedded in the UMTS standard [1]). Consequently we can spend a lot of time to reconstruct them since the work is done just once.

However, there are only a very few known cases (most of them for tactical, military communications, as in the Czech army at least during the 90s) where the encoders are randomly generated right before the transmission. The aim is clearly to hinder the code reconstruction strongly, which therefore cannot be performed online. In this latter case, except for very small values of parameters and noise probability, the reconstruction is far too time-consuming.

The reconstruction of convolutional codes is a very mathematical matter and consequently we will not present it here (see [6, 3] for an exhaustive study). For our purposes, it is just necessary to recall the most significant results with respect to convolutional code reconstruction.

While it is always possible to make the probability of false alarm (i.e. of reconstructing a wrong encoder) tend towards zero, the probability of success depends on many factors, of which the noise parameter has the most significant impact. Beyond 2-3 % the reconstruction will fail unless one has a large amount of encoded sequence and/or accepts to spend a lot of time and machine resources. In most practical cases, the Viterbi decoding itself is likely to fail for a few percent of noise (less than 0.05) long before the reconstruction process does. Expressing the reconstruction probability of success is not easy from a mathematical point of view and we advise the reader to refer to [6, 3]. Experiments have confirmed that the reconstruction is bound to fail as soon as $p > 3\%$ unless spending a lot of time and computing power.

As for the computational complexity of the reconstruction, the general result [6, 3] states that for a $(n, k, N)$-convolutional code, the lower bound is equal to

$$O(\alpha \times n^5 \times N^4)$$

where $\alpha(p)$ is a quantity which grows exponentially with the noise probability $p$ [3, Section 2.3.2]. To illustrate that general result, Table 1 gives a few experimental results [6, 3] for a few encoders in the case of a noise level of $10^{-2}$ and $2 \cdot 10^{-2}$ (Additive White Gaussian noise).

Encoder      Reconstruction time (p = 10^-2)   Reconstruction time (p = 2·10^-2)
(4, 3, 8)    7 min 12 sec                      Not detected
(4, 3, 9)    6 min 16 sec                      Not detected

Table 1: Example of reconstruction time (on a Pentium IV 2.0 GHz) for two noise levels

As a consequence, considering a rather high level of noise prevents the reconstruction from succeeding unless we devote a huge computing time (several hours at least). We will then choose a noise level ranging from 0.15 to 0.35.
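The noise layer itself is simple to picture. The following minimal sketch flips each coded bit independently with probability $p$ and relies on the fact that whoever knows the noise sequence removes it with a second xor. It is an illustration only: `make_noise` and `xor_noise` are hypothetical names, and the C library `rand()` merely stands in for the keyed noise generator of the Perseus library.

```c
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not part of libperseus): fill e[0..len-1] with
 * independent bits such that P[e_t = 1] is approximately p.
 * rand() is used for illustration only.
 */
void make_noise(unsigned char *e, size_t len, double p)
{
    for (size_t t = 0; t < len; t++) {
        /* u is uniform in [0, 1) */
        double u = (double)rand() / ((double)RAND_MAX + 1.0);
        e[t] = (unsigned char)(u < p);
    }
}

/*
 * Hypothetical helper: v_t <- v_t xor e_t. Applying the same noise
 * sequence a second time restores the original bits.
 */
void xor_noise(unsigned char *v, const unsigned char *e, size_t len)
{
    for (size_t t = 0; t < len; t++)
        v[t] ^= e[t];
}
```

An eavesdropper sees the coded sequence with roughly 15 to 35 % of its bits flipped, far above the few percent that the reconstruction tolerates, whereas the legitimate recipient regenerates the same noise sequence and cancels it exactly before decoding.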
Let us mention that Perseus technology considers (and implements) the worst case of communication channel model with respect to the reconstruction problem: the Additive White Gaussian model in which the noise is applied uniformly (in other words the noise variable is an independent, identically distributed random variable). In real communications (for instance satellite communications) the noise occurs in bursts and different channel models must be considered (e.g. the Gilbert-Elliott model [12]).

3 Presentation of the Perseus Library

The library includes two main files:

• A header file perseus.h which contains the parameter settings, new type definitions and function prototypes.
• A function file perseus.c which contains the C code of the different functions: random encoder generation, encoding procedure, decoding procedure...

Additionally, different files are also provided with the library:

• A test program perseus_test.c which shows how to implement and use the Perseus library.
• A makefile to compile the previous test file.
• A documentation file howto_libperseus.pdf and a comprehensive description of the library structure and functionalities produced from the source code by means of the doxygen utility.

The official code repository is located at code.google.com/p/libperseus. The current stable version is 1.0.0.

3.1 Setting Perseus Parameters

Perseus parameters are optimally defined in the perseus.h file to provide the best trade-off between security and performance. The reader who wishes to modify those parameters must keep in mind that some of them have an impact on the decoding residual error. So any modification should be considered only by programmers having a rather good knowledge of convolutional encoding and Viterbi decoding theory [13].

The main parameters are generated randomly during the encoder generation.
So only lower bounds X_MIN and upper bounds X_MAX are set in order to define a value interval [X_MIN; X_MAX + X_MIN].

3.1.1 Encoder inputs

The number of encoder inputs is given by (default values [1; 6]).

    #define KMIN_GEN 1
    #define KMAX_GEN 5

3.1.2 Encoder outputs

The number of encoder outputs is defined by (default values [5; 11]).

    #define NMIN_GEN 5
    #define NMAX_GEN 6

3.1.3 Constraint length (encoder memory)

The size of the encoder memory (which also determines the degree of the encoder polynomials) is defined by (default [20; 30]).

    #define MIN_CONT 10
    #define MAX_CONT 20

3.1.4 Puncturing matrix width

The width of the puncturing matrix, whose height is defined by the value N ∈ [NMIN_GEN; NMIN_GEN + NMAX_GEN], is defined by (default [6; 21]).

    #define MIN_MATWIDTH 6
    #define MAX_MATWIDTH 16

The puncturing level is defined by the number of null entries of that matrix. This number is defined as follows

    /* Random generation of the puncturing matrix weight */
    /* (code->mN * code->mMatWidth - nbzero) */
    /* where nbzero = (code->mN * code->mMatWidth/8) */
    nbzero = (int)((float)((code->mN * code->mMatWidth) >> 3));
    code->mMatDepth = (code->mN * code->mMatWidth) - nbzero;

Let us notice that it is possible to adapt the weight of the puncturing matrix according to the values of N and mMatWidth. For rather large values of their product it is possible to divide by 16 or even 32 to avoid decoding errors arising on low memory computers. Perseus library 2.x will implement such optimizations along with combinatorial puncturing patterns.

3.2 Perseus Security Parameters

There is only one parameter which has a direct impact on the Perseus security with respect to the encoder reconstruction problem from noisy sequences.
This parameter ensures that this problem remains hard in practice, requiring a huge supercomputing power during several days or even weeks for a single encoder. This parameter is defined in the Gen_Noise_Generator function located in the perseus.c file.

    /* Noise probability generation ([0.15, 0.35]) */
    aNGen->proba = 15 + (int)(20.0 * alea());

A noise probability close to 0.15 will on average require a reconstruction time in days while a probability close to 0.35 will require weeks or even months of computing time.

The reader must be aware that a noise probability close to 0.50 is not possible in the context of Perseus. Such a probability relates to cryptography, not to noisy communications.

3.3 Perseus Noise Generator

In Perseus library 1.0.0, the random generator is fixed (it will be random from versions 2.x). This generator is a biased stream cipher (combining generator [8] class). It is initialized by a random 102-bit key which fills up the four linear feedback shift registers (LFSR) at time instant $t = 0$. It is worth noticing that the size of the key only prevents exhaustive search (by which the attacker would remove the noise). Hence the only possible approach is to reconstruct the encoder in the context of a noisy communication.
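A combining generator of this class can be sketched as follows: four LFSRs are clocked together and their four output bits select one entry of a 16-entry truth table, whose proportion of 1's fixes the bias of the noise. The sketch rests on assumptions that the text does not settle: the type `lfsr_t`, the functions `lfsr_clock`, `noise_bit` and `table_bias`, the Fibonacci-style feedback and the tap convention are all hypothetical and may differ from what the Perseus library actually does.

```c
#include <stdint.h>

/* Hypothetical LFSR description (not the libperseus NOISE_GEN type). */
typedef struct {
    uint32_t state; /* current register content */
    uint32_t taps;  /* feedback mask: tapped positions (assumed convention) */
    unsigned len;   /* register length in bits, 1..32 */
} lfsr_t;

/* Parity (XOR of all bits) of a 32-bit word. */
static unsigned parity32(uint32_t v)
{
    v ^= v >> 16;
    v ^= v >> 8;
    v ^= v >> 4;
    v ^= v >> 2;
    v ^= v >> 1;
    return v & 1u;
}

/* Clock the register once and return the bit that is shifted out. */
unsigned lfsr_clock(lfsr_t *r)
{
    unsigned out = r->state & 1u;
    unsigned fb = parity32(r->state & r->taps);

    r->state = (r->state >> 1) | ((uint32_t)fb << (r->len - 1));
    return out;
}

/* One noise bit: the four LFSR outputs index the 16-entry truth table. */
unsigned noise_bit(lfsr_t regs[4], const unsigned char bf[16])
{
    unsigned idx = 0;

    for (int i = 0; i < 4; i++)
        idx |= lfsr_clock(&regs[i]) << i;
    return bf[idx] & 1u;
}

/* Fraction of 1's in the table: P[e_t = 1] if the four inputs are balanced. */
double table_bias(const unsigned char bf[16])
{
    unsigned weight = 0;

    for (int i = 0; i < 16; i++)
        weight += bf[i] & 1u;
    return weight / 16.0;
}
```

Under this model the achievable noise probabilities are multiples of 1/16: a table holding five 1's, for instance, yields a bias of 0.3125.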
The four LFSR polynomials are defined in the perseus.h file as follows:

    /* Noise generator feedback polynomial 1 */
    #define POLY1 0x47E07L
    #define MASK1 0x7FFFFL
    #define LR1 19

    /* Noise generator feedback polynomial 2 */
    #define POLY2 0x1772AFL
    #define MASK2 0x7FFFFFL
    #define LR2 23

    /* Noise generator feedback polynomial 3 */
    #define POLY3 0x1C95269L
    #define MASK3 0x1FFFFFFFL
    #define LR3 29

    /* Noise generator feedback polynomial 4 */
    #define POLY4 0x43E98841L
    #define MASK4 0x7FFFFFFFL
    #define LR4 31

The biased filtering Boolean function which outputs the additive noise to combine with the encoded sequence is then defined by

    /* Noise probability generation ([0.15, 0.35]) */
    aNGen->proba = 15 + (int)(20.0 * alea());

    /* Boolean filtering function generation */
    w = 0;
    aNGen->Bf = (unsigned char *)calloc(16, sizeof(unsigned char));
    for (w = 0; w < 16; w++)
    {
        val = (int)(99.0 * alea());
        if (val < aNGen->proba) aNGen->Bf[w] = 1;
    }

4 Implementation of the Perseus Library

Using and implementing the Perseus library is almost straightforward (Figure 2). In order to illustrate things, a sample test file perseus_test.c is provided with the library [18]. We are going to detail the whole process as it is in the library howto file.

Let us mention that since the library uses dynamic Viterbi decoding (which may be memory-consuming depending on the instances of Perseus parameters), the decoding may fail if you choose to process a large amount of data on a computer with limited memory. We strongly advise splitting data into chunks of less than 2 Kb. The next version of the library (from versions 2.x) will consider polynomial time decoding and therefore this limitation will no longer exist.

Let us suppose that the data to protect are stored in the array data.
John Doe, in the USA, wants to send them to Jean Martin in France in a secure way.

[Figure 2: Implementation structure of the Perseus library. Labels: John Doe (parameter generation, encode data), Jean Martin (get parameters, decode data), communication setup over HTTPS, processing data.]

On John Doe's side, the main steps are (in the following order):

1. First generating the encoder, the noise generator and the noise generator secret key randomly.

    /* Generate the PCC encoder */
    Pcc = generateCode();
    ...
    /* Noise generator secret key generation */
    aKey = (INIT_NOISE_GEN *)calloc(1, sizeof(INIT_NOISE_GEN));
    ...
    aKey->INIT1 = (unsigned long int)((float)(0xFFFFFFFFL) * alea());
    aKey->INIT2 = (unsigned long int)((float)(0xFFFFFFFFL) * alea());
    aKey->INIT3 = (unsigned long int)((float)(0xFFFFFFFFL) * alea());
    aKey->INIT4 = (unsigned long int)((float)(0xFFFFFFFFL) * alea());

    /* Noise generator variable allocation */
    NGen = (NOISE_GEN *)calloc(1, sizeof(NOISE_GEN));
    ...
    /* Noise generator init */
    if (!Gen_Noise_Generator(NGen, aKey))
    {
        perror("Noise encoder generation on error !");
        free(NGen);
        exit(0);
    }
    ...

2. Sending the secret elements to Jean Martin through a HTTPS session (or any equivalent secure channel). This part is not played in the perseus_test.c file (obvious to implement). The secret elements are the PCC encoder and the noise generator secret key.
They consist of three structures (defined in the file perseus.h)

    /* Generic type for Punctured Convolutional Code */
    typedef struct
    {
        unsigned int mN;          /* Number of output bits */
        unsigned int mK;          /* Number of input bits */
        unsigned int mM;          /* Encoder memory size */
        unsigned long **mPoly;    /* Encoder polynomials */
        unsigned int mMatWidth;   /* Puncturing matrix width */
        unsigned char *mMatrix;   /* Encoder puncturing matrix */
        unsigned int mMatDepth;   /* Puncturing matrix weight */
    } PUNCT_CONC_CODE;

    /* Generic type for a noise generator */
    typedef struct
    {
        unsigned long int Reg1;   /* Linear Feedback Shift Register 1 */
        unsigned long int Reg2;   /* Linear Feedback Shift Register 2 */
        unsigned long int Reg3;   /* Linear Feedback Shift Register 3 */
        unsigned long int Reg4;   /* Linear Feedback Shift Register 4 */
        unsigned int L1;          /* Length of LFSR 1 */
        unsigned int L2;          /* Length of LFSR 2 */
        unsigned int L3;          /* Length of LFSR 3 */
        unsigned int L4;          /* Length of LFSR 4 */
        unsigned char *Bf;        /* Combining Boolean function */
        unsigned int proba;       /* Noise probability */
    } NOISE_GEN;

    /* Generic type for noise generator secret key */
    typedef struct
    {
        unsigned long int INIT1;
        unsigned long int INIT2;
        unsigned long int INIT3;
        unsigned long int INIT4;
    } INIT_NOISE_GEN;

3. Encoding the data

    encoded_data_size = 0L;
    if (!pcc_Code(Pcc, data, data_size, &encoded_data,
                  &encoded_data_size, NGen, aKey))
    {
        perror("Encoding error \n");
        exit(0);
    }

    printf("Data after encoding = %s \n", encoded_data);

The PCC encoding includes all basic steps (character to binary encoding, the PCC coding itself, data puncturing right after the encoding, the binary to hex nibbles encoding, the addition of deterministic noise).
The final result of the PCC encoding is contained in the array encoded_data.

4. John Doe sends the encoded data to Jean Martin.

On Jean Martin's side, the steps are:

1. Reception of the secret elements (PCC encoder and noise generator secret key) from John Doe through a HTTPS session. The three corresponding data structures (see John Doe's step 2 above) are then initialized. This part is not played in the perseus_test.c file (obvious to implement).

2. Decode data

    dataLength = 0L;
    if (!pcc_decode(Pcc, NGen, aKey, encoded_data,
                    encoded_data_size, &dataDecoded, &dataLength))
    {
        perror("Decoding error \n");
        exit(1);
    }

The PCC decoding step includes all basic processing steps (removal of the deterministic noise, hex nibble to binary transcoding, data unpuncturing and Viterbi decoding). Encoded data are in the array dataCoded while decoded data are contained in the array dataDecoded.

5 Experimental Results

We have tested our implementation of the Perseus library on an Intel Core2 Duo CPU P8400 (2.26 GHz) with 2 Gb of RAM. Data have been processed by chunks of 1 or 2 Kb. Of course the performance depends on the random instances of encoders. The main bottleneck remains the dynamic Viterbi decoding, which takes most of the processing time (more than 70 % of the total time) and of the available memory. However average performance is rather good. Let us notice that the current release (1.0.0) has not been optimized, in order to preserve the code readability. The next version of the Perseus library will consider a polynomial time decoding while requiring a negligible amount of memory.

5.1 Perseus Entropy Profile

In order to illustrate the fact that Perseus-protected data may exhibit an entropy profile which is close to that of plain (unprotected) data, we have computed the average entropy per byte on several files (in different Indo-European languages). Table 2 summarizes the results.
| Noise probability | Plain data | Perseus-protected data | Encrypted data |
|-------------------|------------|------------------------|----------------|
| 5 %               | 4.21       | 4.96                   | 8.00           |
| 10 %              | 4.21       | 6.19                   | 8.00           |
| 15 %              | 4.21       | 6.46                   | 8.00           |
| 20 %              | 4.21       | 7.11                   | 8.00           |
| 25 %              | 4.21       | 7.39                   | 8.00           |
| 30 %              | 4.21       | 7.45                   | 8.00           |
| 35 %              | 4.21       | 7.71                   | 8.00           |

Table 2: Average entropy profile (average entropy per byte) for plain, Perseus-protected and AES-encrypted data

These results clearly show that the entropy profile depends on the noise level, as expected. Our tests have also confirmed that the more complex the encoder is (in terms of added redundancy), the lower the entropy profile is. Let us recall that the convolutional code reconstruction is intractable (in a reasonable amount of time) as soon as the noise probability is higher than a few percent (in practice, > 0.02). So if we want to lower the entropy profile, we can consider a noise probability of 5% while preserving the scalable security provided by the Perseus approach.

5.2 Secure Programming

Throughout the programming process, code security was a priority, and we have paid the utmost attention to this point. Once the Perseus library was completed, we audited the code with respect to security. We first applied the Flawfinder utility [10], which tracks insecure programming practices. It helps prevent buffer overflows, heap overflows and similar flaws by checking the nature and use of common functions. In a second step, we analyzed how efficiently and correctly the Perseus library uses memory. For that purpose, the Valgrind utility [19] was used. As a result, the C code of the Perseus library complies with the existing rules of secure programming and hence does not introduce weaknesses or flaws that could be exploited for attack purposes.

6 Applications and Implementations

At the present time, a few implementations and applications of the Perseus technology are known. We hope that new contributors will volunteer to give birth to new ones.
The DFT Technologie company (http://www.dft-techno.com) has decided to provide industry support to the Perseus technology and to help promote the research and development effort around it.

6.1 Firefox Plug-in

This project is managed by Eddy Deligne [17]. He has applied the Perseus technology to protect the HTTP protocol (GET and POST methods) when using Firefox [5]. This solution takes the form of a C++ Firefox plug-in developed under the triple GPL/LGPL/MPL licences and complying with the Mozilla development specifications, thus allowing the code to be merged directly into the Firefox engine code. This plug-in is available with the corresponding server (Linux, Windows), thus providing a complete solution (client/server architecture).

At the present time, all Firefox 3.x versions are covered (Windows, Linux, Apple). The new Firefox 4.x should also be supported very soon (many structural changes have occurred with this new version, thus requiring significant changes in the Perseus plug-in).

6.2 Andromeda Library: Protecting the Torrent Protocol

Fabien Jobin [2] has developed the Andromede library (footnote 4), which implements the BitTorrent protocol in its original version (i.e. without any additional third-party functionality except one devoted to extension management). In the Andromede library, the BitTorrent traffic is protected by the Perseus technology.

7 Conclusion and Future Works

The Perseus technology intends to propose a new trend in information and communication security. The concept of scalable security should help reconcile the needs of national security with citizens' natural right to privacy. This technology preserves the ability of state intelligence agencies to access Perseus-protected data. Indeed, the noisy encoding layer can always be processed at the price of an offline, time-consuming computing step.
Only national security agencies and specialized police departments have such computing power. But since breaking this technology requires a lot of time, the number of attempts will be limited to processing the communications of really bad guys only, and not those of ordinary citizens.

Current research and development activities around the Perseus technology consider the protection of voice and phone communications as well as file protection:

• development and implementation of VoIP platforms;
• development of Android modules and apps to provide communication protection for various kinds of data: voice, SMS, MMS...;
• development of Linux/Windows applications to protect files on hard disk.

The main difficulty here lies in the Viterbi decoding, which is the most time-consuming part. However, our recent research results towards a new decoding algorithm with polynomial complexity are very promising. This should speed up the decoding step significantly, thus opening up many opportunities for the Perseus technology.

Finally, our current work focuses on additional plug-ins that aim first to lower the entropy profile of Perseus-protected data in order to bring it far closer to that of plain data, and second to make their entropy profile and statistical features look like those of arbitrary data (image files, PDF files...).

Acknowledgement

I would like to thank Olivier Ferrand for his guru skills with Valgrind and Flawfinder, as well as Eddy Deligne for his help in reviewing the library code.

Footnote 4: In Greek mythology, Andromede is Perseus' wife.

References

[1] 3rd Generation Partnership Project (2003). Technical Specification Group Radio Access Network Group - Multiplexing and Channel Coding (FDD), release 5, TS 25.212, v5.0.0, http://www.mumor.org/public/background/25212-500.pdf

[2] Andromede library website http://code.google.com/p/andromede (soon available).

[3] J. Barbier (2007). Analyse de canaux de communication dans un contexte non coopératif - Application aux codes correcteurs d'erreurs et à la stéganalyse (Communication Channel Analysis in a Non-cooperative Context - Application to Error-correcting Codes and to Steganalysis). Thèse de Doctorat (PhD Thesis), Ecole Polytechnique.

[4] J.B. Cain, G.C. Clark Jr., J.M. Geist (1979). Punctured convolutional codes of rate (n-1)/n and simplified maximum likelihood decoding. IEEE Transactions on Information Theory, vol. IT-25, No. 1, pp. 97-100, January 1979.

[5] Eddy Deligne and Eric Filiol (2009). Perseus: A Coding Theory-based Firefox Plug-in to Counter Botnet Activity. Hack.lu 2009 Conference, Luxembourg. Paper and slides are available at http://archive.hack.lu/2009.

[6] Eric Filiol (1997). Reconstruction of Convolutional Encoders over GF(q). In: Proceedings of the 6th IMA Conference on Cryptography and Coding, Lecture Notes in Computer Science, #1355, Springer Verlag, 1997. All results can also be found in [8].

[7] Eric Filiol (2000). Reconstruction of Punctured Convolutional Encoders. In: Proceedings of the 2000 International Symposium on Information Theory and Applications (ISITA), IEICE Publishing, 2000.

[8] Eric Filiol (2001). Techniques de reconstruction en cryptologie et théorie des codes (Reconstruction Techniques in Coding Theory and in Cryptology). Thèse de Doctorat (PhD Thesis), Ecole Polytechnique.

[9] Eric Filiol and Eddy Deligne (2010). The Perseus lib: Open Source Library for TRANSEC and COMSEC Security. In: iAWACS 2010, http://www.esiea-recherche.eu/iawacs2010.html

[10] Flawfinder website http://www.dwheeler.com/flawfinder

[11] P. Foster (2009). MI-5 comes out against cutting off internet pirates. The Times, October 23rd, 2009. Retrieved from http://www.timesonline.co.uk/tol/news/crime/article6885923.ece

[12] E. N. Gilbert (1960). Capacity of a Burst-noise Channel. Bell System Technical Journal, 39, pp. 1253-1265.

[13] R. Johannesson and K. Sh. Zigangirov (1999). Fundamentals of Convolutional Coding. IEEE Press.

[14] J. M. Manach (2010). La NSA n'aime pas Hadopi (NSA does not like Hadopi). http://bugbrother.blog.lemonde.fr/2010/10/02/frenchechelon-la-dgse-en-1ere-division/#more-819

[15] R. J. McEliece (1998). The Algebraic Theory of Convolutional Codes. In: Handbook of Coding Theory, V.S. Pless and W.C. Huffman editors, North-Holland, 1998.

[16] Ownilive website (2010). Hadopi vs Crypto. http://ownilive.com/2010/10/03/hadopi-et-crypto

[17] Perseus Firefox websites http://code.google.com/p/perseus-firefox and http://www.mozdev.org/source/browse/perseus

[18] Perseus Library official website (2010) http://code.google.com/p/libperseus

[19] Valgrind website http://valgrind.org

[20] Wikipedia, the free encyclopedia. Perseus. http://en.wikipedia.org/w/index.php?title=Perseus&oldid=298156617
