AGNOSCO - Identification of Infected Nodes with artificial Ant Colonies

If a computer node is infected by a virus, worm or a backdoor, then this is a security risk for the complete network structure where the node is associated. Existing Network Intrusion Detection Systems (NIDS) provide a certain amount of support for t…

Authors: Michael Hilker, Christoph Schommer

AGNOSCO - Identification of Infected Nodes with artificial Ant Colonies
Michael Hilker and Christoph Schommer
University of Luxembourg, Campus Kirchberg
Dept. of Computer Science and Communication
6, Rue Richard Coudenhove-Kalergi, L-1359 Luxembourg
Email: {michael.hilker, christoph.schommer}@uni.lu
Phone: +352-466644-5-{311,228}

Abstract: If a computer node is infected by a virus, worm or a backdoor, then this is a security risk for the complete network structure where the node is associated. Existing Network Intrusion Detection Systems (NIDS) provide a certain amount of support for the identification of such infected nodes but suffer from the need of plenty of communication and computational power. In this article, we present a novel approach called AGNOSCO to support the identification of infected nodes through the usage of artificial ant colonies. It is shown that AGNOSCO overcomes the communication and computational power problem while identifying infected nodes properly.

Keywords: Network Protection, Intrusion Detection, Bio-inspired Computing, Ant Colonies.

1 Motivation

In the current working and life environment, connected nodes - computers, servers, etc. - are essential. These nodes are under constant assault from attacks such as worms, trojans, and hackers. Nowadays, there exist several approaches to protect a computer node or a network against criminal attacks, such as virus and malware guards, symbolic NIDS solutions like SNORT [9, 2, 10], and bio-inspired NIDS solutions (Artificial Immune Systems, [6, 7, 11]). These protection systems check each packet that traverses a network node and evaluate whether this packet intends to attack or not.
However, many NIDS solutions struggle to identify (new) attacks and need plenty of computational power; furthermore, there exist techniques to camouflage attacks in a way that NIDS are not able to identify the attack at all. Hence, there are situations in which an attack infects a node and the computer network risks being infected by this node. This is much more critical than it seems, since an infection can open a backdoor for other attacks and an infected node can send packets containing an attack to infect healthy nodes. The identification of such an infected node - sometimes also called a zombie node - is a well-known problem. In the current research community, only a few approaches for identifying infected nodes are known, for example:

• Anomaly Detection: A system knows how to identify normal network traffic and tries to identify abnormal network traffic using this information. A node that transmits a lot of abnormal traffic is infected with high probability.
• Statistical Analysis of Network Traffic: A system observes the network traffic and, if certain statistical parameters are met, the node is probably infected.
• Inference from Network Traffic Analysis: If a network node is infected, it releases several packets containing an attack in order to infect other nodes of the network as well. This behaviour can be recognized using intrusion detection, and an intelligent inference system is used to trace the packets back to the infected node.
• Trust or the Byzantine Generals Problem [8]: If a node runs a service, a watchdog can use this service regularly in order to check for incorrect answers, and the watchdog can observe whether the node sends packets when it should not, in order to detect abnormal behaviour.

Unfortunately, all these approaches have significant disadvantages.
First, they need information from the computer network that must be collected, fused, and further processed. Consequently, this results in high communication costs, and the centralised evaluation requires plenty of computational power. Second, the last approach has several other disadvantages, e.g., defining what an incorrect answer is and deciding when a node should not send any packets. Following this, our motivation is that novel (bio-inspired) systems can significantly contribute to a higher identification rate of infected nodes.

2 Description of the idea

In this article, we introduce a novel approach called AGNOSCO, which is an acronym for AGents for the ideNtification of infected nOdes uSing artificial ant COlonies. The advantages of AGNOSCO are that it works autonomously, in a distributed manner, and more efficiently. AGNOSCO needs neither much additional computational power nor much additional communication time while identifying infected nodes properly.

AGNOSCO is a part of our implemented network intrusion detection system called SANA (= Security anAlysis in iNternet trAffic), which is an artificial immune system that uses lightweight, autonomous and adaptive artificial cells for the protection of networks. SANA is a library of non-standard approaches for network security and combines these approaches in an artificial immune system.

To understand AGNOSCO, we briefly have to introduce the concept of ant colonies. Generally, ants have two states: if an ant carries a prey, it will release a lot of pheromones while travelling back to the ant-hill, so that other ants find this prey as well. However, if an ant does not carry a prey, it releases only a few or no pheromones. When an ant navigates, it uses the pheromones found on the ground. Ant colonies are used in this way for many computer science problems, e.g. optimization [4, 1, 3].

For AGNOSCO, each connection of the network contains a pheromone-value that is increased if a packet with an attack travels over the connection; it is decreased if a packet without an attack travels over the connection. The pheromone-value is calculated by the application of an affinity-function. This pheromone-value represents the rate of attacking packets over the connection. AGNOSCO flows through the network, interprets the pheromone-values and identifies the infected nodes. Thereafter, a disinfection-process can be started for this infected node; this disinfection-process is not part of AGNOSCO, which just identifies the infected nodes.

3 Implementation Details

So far, AGNOSCO is implemented in SANA as one of the lightweight, adaptive and autonomous artificial Cells that flow through the network to protect the computer network. In SANA, there exist artificial Cells and other components that evaluate whether packets contain an attack or not [5]. Furthermore, if a packet arrives at the destination, the node confirms this event using a small confirmation-packet which is sent from the destination to the source. If an artificial Cell or another component identifies a packet as malicious, the node will send a confirmation-packet as well. However, this confirmation-packet does not inform the source; it increases the pheromone-level on each connection on its way back to the source node. The confirmation-packet for a packet without an attack behaves like an ant without a prey, and the confirmation-packet for a packet with an attack behaves like an ant carrying a prey.
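The confirmation mechanism above can be sketched as follows. This is a minimal illustration, not SANA's implementation: the function name `send_confirmation`, the dictionary `events`, and the choice to key each connection by the direction in which the original packet travelled are assumptions made for the example. The sketch only records the verdict on each connection; turning these verdicts into a pheromone-value is the task of the affinity-function.

```python
from collections import defaultdict


def send_confirmation(events, path, malicious):
    """Walk the packet's route backwards and mark every connection on it.

    events maps a connection (from_node, to_node) to the list of verdicts
    reported so far: True for a bad packet, False for a good one.
    path is the route the packet travelled up to the confirming node.
    """
    hops = list(zip(path, path[1:]))   # connections in travel order
    for hop in reversed(hops):         # the confirmation-packet travels back
        events[hop].append(malicious)


# Hypothetical three-node route A -> B -> C.
events = defaultdict(list)
send_confirmation(events, ["A", "B", "C"], malicious=False)  # delivered at C
send_confirmation(events, ["A", "B"], malicious=True)        # flagged at B
print(dict(events))
```

In this example the connection from A to B has seen one good and one bad packet, while the connection from B to C has seen only the good one, because the malicious packet was stopped at B.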
For setting the right value of pheromones, the following affinity-function af is chosen:

$$af_{connection} = \sum_{i=1}^{b} inc \cdot dec^{\,\#good\text{-}packets_i} \qquad \text{(for each connection in the network)}$$

where b is the number of infected (bad) packets over this connection and the parameter inc is the increasing-factor of the system. The parameter dec is the decreasing-factor of the system and #good-packets_i is the number of good packets which travelled over the connection after the i-th bad packet. In this test simulation, we set inc to 20 and dec permanently to 0.95. The workflow of the affinity-function is then as follows:

• If a packet containing no attack travels over a connection, all #good-packets_i are increased by 1, because #good-packets_i counts how many good packets travelled over the connection after the i-th bad packet.
• If a packet containing an attack travels over a connection, a new summand is added to the sum. Hence, the parameter b is increased by 1 and #good-packets_b - the counter of good packets for the new bad packet - is set to 0.

Consequently, this affinity-function for a connection increases heavily if a bad packet is found and decreases slightly if a good packet travels over the connection. We refer to Section 5 for an analysis of this function.

If a node is infected, it will normally send a high number of packets containing the attack in order to infect other nodes in the computer network. Therefore, pheromone-tracks will emerge in the system which point towards the infected nodes; as already mentioned, AGNOSCO identifies the infected nodes.

AGNOSCO evaluates the pheromone-level of connections as an artificial cell of the artificial immune system SANA while flowing through the computer network: if a pheromone-level is higher than the threshold of AGNOSCO, it follows the track.
However, if AGNOSCO follows a track and no connection at a node has a pheromone-value higher than the threshold of AGNOSCO, AGNOSCO stops and ends the track. AGNOSCO then knows that this node is infected - if and only if the parameters of AGNOSCO are set properly; it then informs other components of SANA for disinfection or isolation of this node. AGNOSCO behaves like an ant which leaves the ant-hill and tries to find a prey using the pheromones. Consequently, the system simulates an artificial ant colony where the preys are infected nodes.

Summary of the Changes in the Network: In the network infrastructure, some storage space must be added for each connection in order to store the pheromone-value, at most 10 kB per connection. The network protocols need not be changed, and AGNOSCO is compatible with existing protocols. Essentially, the NIDS behaviour concerning identified malicious packets must be changed: if the NIDS identifies a packet as malicious, it must additionally send a confirmation-packet for this bad packet in order to update the pheromone-values on the path from source to destination.

4 Simulation Results

Using the implementation of AGNOSCO / SANA for the identification of infected nodes, we tested several scenarios in which different types of attacks had infected network nodes. In all scenarios, AGNOSCO tried to identify the infected nodes in order to start the disinfection. Generally, AGNOSCO identified all infected nodes efficiently and performed well for different attack types. Additionally, we determined an appropriate parameter setting. For example, in a scenario with 75 nodes and 3 artificial Cells of type AGNOSCO, it took about 30-50 processing steps to identify all infected nodes. If a new infection of a node occurred, AGNOSCO identified this infected node using at most 20 processing time-steps.
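To make the mechanism of Section 3 concrete, the following sketch combines the affinity-function with the track-following rule. It is an illustration under stated assumptions rather than the SANA implementation: the names `Pheromone` and `follow_track` are hypothetical, and the text does not say how the agent determines the direction of a track, so the sketch assumes that the pheromone-value is stored per direction of travel and that the agent walks upstream along the strongest connection above the threshold. The values inc = 20, dec = 0.95 and threshold = 10 are those reported for the simulations.

```python
from collections import defaultdict

INC, DEC, THRESHOLD = 20.0, 0.95, 10.0  # values reported for the simulations


class Pheromone:
    """Pheromone-value of one connection, computed with the affinity-function."""

    def __init__(self):
        self.good_after = []  # good_after[i] holds #good-packets_i

    def record(self, bad):
        if bad:
            self.good_after.append(0)  # new summand: b grows by 1
        else:
            self.good_after = [g + 1 for g in self.good_after]

    def value(self):
        return sum(INC * DEC ** g for g in self.good_after)


def follow_track(pheromones, start):
    """Walk upstream over connections above THRESHOLD; return the end node.

    Assumption: pheromones maps (from_node, to_node) to a Pheromone, where
    the key gives the direction in which the packets travelled.
    """
    node, seen = start, {start}
    while True:
        upstream = [(p.value(), src) for (src, dst), p in pheromones.items()
                    if dst == node and src not in seen and p.value() > THRESHOLD]
        if not upstream:
            return node  # the track ends here: candidate infected node
        node = max(upstream)[1]  # follow the strongest connection
        seen.add(node)


# Hypothetical line topology in which node D attacks node A via C and B.
pheromones = defaultdict(Pheromone)
for hop in [("D", "C"), ("C", "B"), ("B", "A")]:
    for _ in range(3):
        pheromones[hop].record(bad=True)

print(follow_track(pheromones, "A"))  # prints D
```

Each connection on the attack route carries a value of 60 after three bad packets, so the agent starting at A crosses B and C and stops at D, where no incoming connection exceeds the threshold. Storing one counter per bad packet mirrors the sum in the affinity-function directly; an equivalent running value (multiply by dec for a good packet, add inc for a bad one) would need less storage.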
In the simulation, we measured the computational power and the communication bandwidth required by AGNOSCO. The computational power is hardly noticeable, since AGNOSCO flows through the network and just reads and rates the pheromone-values. For storing the pheromone-value in each connection, only storage space limited to 10 kB was needed. The communication bandwidth depended on how many artificial Cells of this type were flowing through the network. A simulation of 75 nodes and 3 artificial Cells of type AGNOSCO required only little storage and one or two IP packets per time-step. Consequently, the hardware requirements can be met by standard computer networks, while the additional communication is negligible in commonly used high-speed networks.

5 A closer look at the Affinity-Function

The affinity-function is biologically inspired. In human affinity-functions, an event increases the affinity heavily and, over time, if no new event occurs, the value of the affinity-function decreases heavily at first and slowly afterwards. This means that the gradient of the function is high at first and decreases afterwards. Thus, the human body uses the affinity-function to react heavily to an event; thereafter, with the high gradient, the human body tries to compensate for an error; and afterwards, with the low gradient, it tries to reach a stable value.

Fig. 1: A short-term plot of the affinity-function used in our simulations.

Fig. 2: A long-term plot of the affinity-function used in our simulations.

Consequently, the affinity-function tries to model the behaviour that the value increases by leaps and bounds in order to mark when a lot of bad packets are found for a connection within a short time.
On the other hand, the affinity-function tries at first to decrease the value fast in order to eliminate a possible error. Afterwards, the value of the affinity-function decreases slowly in order to reach a stable value, so that pheromone-tracks appear in the network. The affinity-function also forgets old events of bad packets in order to model the behaviour of the human affinity-function and in order to reach a stable value.

Figure 1 visualises a short-term plot of the affinity-function. The parameters are inc = 20 and dec = 0.95, the number of packets is 100, and the packets number 3, 10 and 15 are identified as bad while all other packets are good.

Figure 2 visualises the long-term behaviour of the affinity-function if an infected node is near the connection. Every fifth packet is an identified bad packet and all other packets are good. The parameters inc = 20 and dec = 0.95 are equal to those of the last plot of the affinity-function. The figure shows that the value of the affinity-function moves quickly towards a stable value and then alternates around it.

With this affinity-function, the pheromone-tracks appear in the network properly and the artificial Cells can follow these tracks as well as identify the infected nodes; in our simulations the threshold of AGNOSCO was 10.

6 Conclusion

In this article, we described a novel approach called AGNOSCO for the identification of infected nodes in a computer network. The idea is biologically inspired and follows the behaviour of ant colonies. AGNOSCO is implemented, simulated and tested; AGNOSCO efficiently identifies the infected network nodes without requiring much additional computational power or additional communication bandwidth. We are sure that AGNOSCO can enhance commonly used NIDS as well as SANA.
Future enhancements of SANA, especially the communication and collaboration of the artificial Cells in SANA, will be our next challenges.

Acknowledgments

SANA and AGNOSCO are part of the project INTRA (= INternet TRAffic management and analysis), which is financially supported by the University of Luxembourg. We would like to thank the Ministre Luxembourgeois de l’education et de la recherche for additional financial support and Jacob Zimmermann (Queensland University of Technology) for valuable discussions.

References

[1] A. Colorni, M. Dorigo, and V. Maniezzo. Distributed optimization by ant colonies. Proceedings of ECAL91 - European Conference on Artificial Life, pages 134–142, 1991.
[2] H. Debar, M. Dacier, and A. Wespi. Towards a taxonomy of intrusion-detection systems. Computer Networks, 31(9):805–822, 1998.
[3] M. Dorigo, V. Maniezzo, and A. Colorni. The Ant System: Optimization by a colony of cooperating agents. IEEE Transactions on Systems, Man, and Cybernetics Part B: Cybernetics, 26(1):29–41, 1996.
[4] S. Gross, R. Beckers, J.L. Deuneubourg, S. Aron, and J.M. Pastells. How trail laying and trail following can solve foraging problems for ant colonies. In Behavioural Mechanisms of Food Selection, Springer-Verlag, Berlin, G 20:661–678, 1990.
[5] M. Hilker and C. Schommer. Description of bad-signatures for network intrusion detection. Australasian Information Security Workshop - Network Security during the Australasian Computer Science Week, 29, 2006.
[6] S. A. Hofmeyr and S. Forrest. Immunity by design: An artificial immune system. Proceedings of the Genetic and Evolutionary Computation Conference, 2:1289–1296, 1999.
[7] S. A. Hofmeyr and S. Forrest. Architecture for an artificial immune system. Evolutionary Computation, 8(4):443–473, 2000.
[8] R. Shostak, L. Lamport, and M. Pease. The byzantine generals problem. ACM Transactions on Programming Languages and Systems, 5(4):382–402, 1982.
[9] M. Roesch. Snort - lightweight intrusion detection for networks. LISA, 13:229–238, 1999.
[10] S. R. Snapp, J. Brentano, G. V. Dias, T. L. Goan, L. T. Heberlein, Che-Lin Ho, K. N. Levitt, B. Mukherjee, S. E. Smaha, T. Grance, D. M. Teal, and D. Mansur. DIDS (distributed intrusion detection system) - motivation, architecture, and an early prototype. National Computer Security Conference, 14:167–176, 1991.
[11] E. H. Spafford and D. Zamboni. Intrusion detection using autonomous agents. Computer Networks, 34:547–570, 2000.
