STORM - A Novel Information Fusion and Cluster Interpretation Technique
Authors: Jan Feyereisl, Uwe Aickelin
School of Computer Science, The University of Nottingham, NG8 1BB, UK
{jqf,uxa}@cs.nott.ac.uk

Abstract. Analysis of data without labels is commonly subject to scrutiny by unsupervised machine learning techniques. Such techniques provide more meaningful representations, useful for better understanding of a problem at hand, than by looking only at the data itself. Although abundant expert knowledge exists in many areas where unlabelled data is examined, such knowledge is rarely incorporated into automatic analysis. Incorporation of expert knowledge is frequently a matter of combining multiple data sources from disparate hypothetical spaces. In cases where such spaces belong to different data types, this task becomes even more challenging. In this paper we present a novel immune-inspired method that enables the fusion of such disparate types of data for a specific set of problems. We show that our method provides a better visual understanding of one hypothetical space with the help of data from another hypothetical space. We believe that our model has implications for the field of exploratory data analysis and knowledge discovery.

1 Introduction

The machine learning community embraces two types of learning that encompass the majority of algorithms present within this field: supervised learning, where examples of data of interest exist, and unsupervised learning, where no explicit examples are available. When examples are present, a decision function can be found by exploiting the knowledge of such examples.
On the other hand, without such knowledge, only similarity between data can be exploited in order to find groups of data that share some common attributes [1]. The human immune system has inspired a number of algorithms that fall into these two categories [3], yet it does not simply operate only within these two realms. Knowledge is embedded within DNA passed down from generation to generation, eventually transforming into biological entities or functionalities that provide additional knowledge to what is learned during the lifetime of a living being. One example of such inherited knowledge can be found within Toll-like receptors (TLRs), present on several types of immune cells [4]. In this work we will show that an analogy of TLRs provides an insight into a third class of learning that encodes knowledge that is not within the same hypothetical space as knowledge encoded within a training or testing dataset. Such a type of learning becomes especially useful where no labelled examples exist, but some knowledge about classes of interest is acknowledged. We believe that such incorporation provides for better understanding of underlying data based on more than blind function approximation.

In the remaining sections of this paper we first outline the functionality of TLRs, followed by our hypothesis. A description of the underlying machine learning algorithm is then presented. A theoretical specification of our StOrM model is then described, outlining a cluster interpretation technique stemming from our model. This is followed by experimental evidence confirming our hypothesis.

2 Toll-Like Receptors

TLRs are a set of receptors on the surface of immune cells which act as sensors to foreign microbial products.
One interesting aspect of these receptors is that they act like piano keys: a different sound is played when a different key or combination of keys is pressed at once. In a similar way, when a single receptor senses a chemical, it results in a different action performed than when a number of receptors sense various chemicals within a specific time period [7]. A simple definition of TLRs is that they are the initial detectors of pathogens attacking a system. They sound an alarm when they encounter certain virus- or bacteria-specific chemicals, which triggers a cascade of events potentially resulting in an immune response. Unlike in many other parts of the immune system, all this is possible due to evolved knowledge passed down from parents to offspring over many generations. For a detailed description of research in the area of TLRs the reader is directed to [4].

2.1 TLRs and Learning

The idea of TLRs for the purpose of learning is a very simple one. It is a direct translation of the receptors' functionality within the human body. A TLR is a signature or a function, encoding some known truth. A set of TLRs, on the other hand, encodes a class of interest. Our hypothesis is that by formulating data from one hypothetical space as a set of TLRs, we will be able to combine information from two disparate spaces in a way that will give us a better understanding of the problem. This fusion of information is especially useful in cases where some knowledge about data of interest is known, yet few or no examples exist. Vapnik [10] also realised this missing area between supervised and unsupervised learning and proposed a related idea, which he terms "master-class learning".
In his work, however, he proposes an extension of a supervised learning setting, where a training dataset belonging to space χ, with labels, is supplemented with an additional description of this data in another space χ*. This description of data is called "hidden information", which can exist in the form of expert knowledge describing the underlying problem. Vapnik combines his model with his support vector machine algorithm and shows that a poetic description [10] of a set of images of numbers provides more useful knowledge for learning than a higher resolution image, which holds more "technical" information about the underlying digits. In Vapnik's work a poetic description is a poet's textual depiction of the underlying image. Vapnik's aim is to improve the classification performance of supervised function estimation based on this "hidden information". In contrast, we propose a method to fuse expert knowledge (one hypothetical space) with "technical" information (another hypothetical space) for the purpose of unsupervised analysis and visualisation for better exploratory data analysis.

2.2 TLR Model

Using Vapnik's notation we can formalise our TLR analogy. In supervised learning a pair (x_i, c_i) is given, where x_i denotes a vector of some dimensionality and c_i denotes a class label. In unsupervised learning only (x_i) is given. In our model a tuple (x_i, s_i) is given, where s_i denotes a data structure which encodes some additional knowledge or side information about a data instance. As our knowledge might be limited, s_i might be empty when no knowledge exists. Data in s_i belongs to a space χ*, which is related to χ. By this we mean that there exists a meaningful correlation between χ* and χ.
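As a minimal illustration, the tuple representation just described might be sketched in Python as follows; the dataclass layout, feature values and API-call strings are all invented for the example and are not taken from the paper:

```python
# Hypothetical sketch of the (x_i, s_i) tuples described above: each instance
# carries a feature vector x from space chi and optional side information s
# from the related space chi*. All concrete values here are made up.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Instance:
    x: List[float]                               # "technical" data from chi
    s: List[str] = field(default_factory=list)   # side information from chi*; may be empty

data = [
    Instance(x=[0.12, 0.80, 0.05], s=["WSAStartup", "InternetOpenA"]),
    Instance(x=[0.40, 0.10, 0.33]),              # no expert knowledge for this instance
]

# s_i may be empty, so any learning step must tolerate missing side information
print([len(inst.s) > 0 for inst in data])
```

The important design point, reflected in the optional `s` field, is that side information is per-instance and may be absent entirely.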
Without such correlation, we can state that the knowledge represented in χ* is not descriptive of χ. In order to incorporate expert knowledge as part of a learning mechanism, we propose the use of an established machine learning technique, which provides numerous features that are beneficial to our model. A description of this technique follows.

3 The Self-Organising Map

Self-organising network algorithms provide a number of mechanisms desirable for many computational tasks. Features such as manifold learning, dimensionality reduction, multidimensional scaling as well as clustering and visualisation through self-organisation are all mechanisms that in combination provide a suitable basis for the incorporation of our TLR analogy and for its better understanding. One type of self-organising network is the Self-Organising Map (SOM) algorithm developed by Teuvo Kohonen [5]. For a detailed description of the algorithm the reader is referred to Kohonen's extensive book on this topic [6]. It is important to note, however, that the SOM is only one of many types of algorithms that we believe our model can be applied to. In general, any topology-preserving and manifold learning algorithm could possibly be extended in order to achieve a comparable outcome. SOM was chosen due to its simplicity, speed and visualisation capabilities.

4 StOrM - TLR Enhanced SOM

4.1 Model Overview

In order to fuse knowledge from disparate hypothetical spaces with the unsupervised learning outcome of the SOM, we extend the original algorithm in a number of ways. These can be divided into two categories. Firstly, data fusion and correlation is performed through an extension of the original SOM.
Secondly, a cluster interpretation algorithm is devised, exploiting the extended SOM in order to provide a better visual representation of underlying data.

Fusion of Hypothetical Spaces - In the original SOM, the algorithm is presented with an input

    x = [ξ_1, ξ_2, ..., ξ_n]^T ∈ R^n    (1)

where ξ is an attribute of the input vector x. This is the "technical" information on which the SOM is trained. In our experiments this vector, for example, comprises normalised real-valued data describing a time-based snapshot of behaviour of one running process according to a number of host-based measures. In our model two additional inputs exist. First, an input for the separate hypothetical space, which is in the form

    s = [ς_1, ς_2, ..., ς_m]^T    (2)

where s is a vector, however this time comprising an arbitrary number, m, of variables ς that encode instance-specific information related to x, from space χ*. In our experiments this vector comprises all API calls that the process, whose snapshot is encoded by x, imports. Second, a vector e encoding our expert knowledge exists,

    e = [ε_1, ε_2, ..., ε_m]^T    (3)

Here vector e can, however, comprise not only fixed data, ε, but also functions, ε, that express the expert's set of knowledge that is desired to be observed and identified within s or x. In contrast to s, vector e is a global, rather than a per-instance vector. Returning to our immunological analogy, e is our repertoire of TLR receptors, where each ε is an individual receptor. In our experiments e is a vector of strings that is representative of the majority of API call names associated with networking functionality in the Windows OS. Each element ε is representative of a subclass of networking functions (e.g.
ε_1 = 'http'). Once the original SOM is presented with input x, it finds the most similar prototype vector and its associated node, also called the "winner node" c,

    ||x − m_c|| = min_i { ||x − m_i|| }    (4)

This node, along with all nodes in its immediate neighbourhood, is subsequently subject to a learning process, over a predefined time period, with a discrete time-coordinate t,

    m_i(t + 1) = m_i(t) + h_ci(t)[x(t) − m_i(t)]    (5)

Here the function h_ci denotes the neighbourhood function which determines the amount by which a prototype vector m_i is affected during the learning process. This depends on node i's distance from the "winner node" c. Generally, the following smoothing kernel, written in terms of the Gaussian function, is used,

    h_ci(t) = α(t) · exp(−||r_c − r_i||² / 2σ²(t))    (6)

where α denotes the learning rate and σ defines the width of the kernel. Variables r_c and r_i are location vectors of the winner node c and the currently observed node i in the output grid. For a more detailed explanation see Kohonen's book [6]. Once this learning terminates, the algorithm presents a discrete regular grid containing a lower dimensional representation of the input which preserves the topology of the learned data. This grid comprises nodes i, with which reference vectors m_i are associated, that hold the learned information. In our model an additional reference vector, l_i, exists. This vector learns information according to the following additional computation step,

    l_c(t + 1) = l_c(t) ∨ Λ(s(t), e(t), x(t))    (7)

where Λ is a matching function that evaluates which elements of s and x satisfy conditions specified within e. In other words, this function evaluates which TLRs have been activated for the currently observed winner node c.
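As a concrete illustration, the update steps of Eqs. (4)-(7) can be sketched in a few lines of Python. The grid size, learning rate, kernel width, receptor strings and input values below are all illustrative choices, not the paper's actual configuration, and the substring test used for Λ is merely one plausible matching function:

```python
# Illustrative single StOrM update, following Eqs. (4)-(7). The matching
# function below (substring test on imported API names) is one plausible
# choice of Lambda; the paper leaves its exact form to the expert.
import math
import random

random.seed(0)
GRID = [(i, j) for i in range(3) for j in range(3)]        # 3x3 output grid
m = {n: [random.random(), random.random()] for n in GRID}  # prototype vectors m_i
e = ["http", "wsa"]                                        # TLR repertoire (assumed)
l = {n: [False] * len(e) for n in GRID}                    # learned TLR vectors l_i

def dist(a, b):
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))

def matching(s):
    # Lambda in Eq. (7): a receptor fires if its string occurs in an import name
    return [any(r in call.lower() for call in s) for r in e]

def som_step(x, s, alpha=0.5, sigma=1.0):
    c = min(GRID, key=lambda n: dist(x, m[n]))             # Eq. (4): winner node c
    for n in GRID:
        h = alpha * math.exp(-dist(c, n) ** 2 / (2 * sigma ** 2))   # Eq. (6)
        m[n] = [mi + h * (xi - mi) for mi, xi in zip(m[n], x)]      # Eq. (5)
    l[c] = [old or new for old, new in zip(l[c], matching(s))]      # Eq. (7)
    return c

winner = som_step(x=[0.9, 0.1], s=["HttpOpenRequestA", "CreateFileW"])
print(l[winner])
```

Note that only the winner's l_c is OR-updated, while every prototype m_i moves under the neighbourhood kernel, mirroring the asymmetry between Eqs. (5) and (7).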
Operator ∨ is a Boolean OR on elements of l and the output of Λ, i.e. l_c learns which known truth has been observed by a winner node c over the duration of the SOM learning process. The result of this additional step is that desired knowledge from the separate hypothetical space is correlated with the produced map. This correlation is exhibited by the enhanced SOM output containing nodes i which have two associated reference vectors: one "technical", m_i, from χ, and one of correlated expert knowledge, l_i, extracted from the separate hypothetical space χ*. In our experiments l_i learns whether node i has ever been deemed a winner for some input x associated with a process that uses Windows networking API functions. The encoding of information from χ* can now be used, for example, for cluster interpretation and labelling. In our experiments it is exploited to delineate a cluster of nodes responsible for networking behaviour. It is important to note that this information can, however, be exploited in other ways to enhance the output of the SOM, for example by affecting the actual SOM learning function in order to include information from χ* in the actual map generation process.

Cluster Interpretation - Due to the topology-preserving nature of the SOM, newly introduced expert knowledge can now be used to identify clusters of interest within the output map. Our proposed cluster interpretation technique comprises two steps. Firstly, an established algorithm called the Unified Distance Matrix (U-Matrix) [9] is exploited in order to find nodes which possibly lie on cluster boundaries.
This information is subsequently used by a step which connects nodes with similar TLR information, l_i, in order to delineate clusters of interest.

Cluster Boundary Search - The cluster boundary search algorithm exploits an idea incorporated within the U-Matrix visualisation technique. This method shows dissimilarities between neighbouring nodes in order to highlight where possible cluster boundaries lie. In order to find nodes which lie on such boundaries, we propose to collect information about all inter-node distances along both the i and j dimensions. Once this information is obtained, it is subjected to a variability function which identifies distances between nodes that significantly differ from distances between the majority of nodes on the map. This function is subject to further future research; however, here we provide some pointers on how it might operate and how it is employed in our experiments. A well trained SOM map can be thought of as comprising the clusters existing within the dataset on which it is trained. A careful examination of the proportion of clusters versus inter-cluster nodes needs to be performed in order to determine such a proportion correctly. An example of a quantitative measure that could be used to represent such a ratio is as follows. Assuming 25% of nodes within a generated map are inter-cluster nodes, we define any inter-node distance above the 3rd quartile of all inter-node distances as lying on a cluster boundary. Thus we can label all nodes whose dissimilarity is above the 3rd quartile threshold as being cluster boundary nodes.
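Under the 3rd-quartile rule just described, a boundary check can be sketched as follows over a toy row of prototype vectors; the prototype values and the one-dimensional layout are purely illustrative (the paper works over both grid dimensions):

```python
# Sketch of the boundary search described above: collect neighbour distances
# and flag gaps whose dissimilarity exceeds the 3rd quartile of all gaps.
# The prototype values are invented; a deliberate jump sits between
# indices 3 and 4, standing in for a cluster boundary.
import statistics

protos = [[0.0, 0.0], [0.1, 0.0], [0.2, 0.1], [0.2, 0.2], [1.0, 1.0], [1.1, 1.0]]

def d(a, b):
    return sum((u - v) ** 2 for u, v in zip(a, b)) ** 0.5

gaps = [d(protos[i], protos[i + 1]) for i in range(len(protos) - 1)]
q3 = statistics.quantiles(gaps, n=4)[2]          # 3rd quartile of all gaps
boundary = [i for i, g in enumerate(gaps) if g > q3]
print(boundary)
```

Only the large jump between the two groups of prototypes exceeds the quartile threshold, so a single boundary position is reported.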
Labelling using Node Connectivity - Once we obtain cluster boundary information, we can use our learned TLR knowledge in order to label clusters that exist within the SOM map. In order to achieve this, the preservation of topology within a SOM map is exploited by exploring neighbouring nodes within a segment of the map delineated by boundary nodes. The labelling algorithm traverses all nodes m_i within our map and connects nodes which lie within a neighbouring region. A region is usually bounded by the previously found boundary nodes. Once all possible nodes are connected, a connected region is evaluated for the most frequently occurring activated TLR type. This information then provides a label for all nodes within such a connected region. An example map where the result of these steps can be seen is in Fig. 1(c).

In the literature other SOM cluster interpretation techniques exist. These techniques exploit various additional machine learning methods to achieve their goal. For example, a two-stage procedure, where SOM output is fed into a traditional clustering technique, such as k-means [11] or hierarchical clustering [12], evaluated with the help of numerous cluster validity indices. Similarly to our work, Brugger et al. [2] also exploit the topographic surface of the SOM, however with the help of an algorithm called Clusot, rather than the U-Matrix method used in our work. It is important to note here that our model is not only a cluster interpretation or labelling technique. Our model provides a method for correlating data from disparate sources, which in this paper is used to identify and subsequently label clusters of interest, without traditionally labelled data.
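The connectivity-based labelling step described above can be sketched as a flood fill over a toy grid; the grid size, the boundary wall and the per-node TLR tags below are invented for illustration:

```python
# Sketch of connectivity-based labelling: flood-fill a grid segment bounded
# by boundary nodes, then label the whole region by its most frequently
# occurring activated TLR type. All concrete values here are made up.
from collections import Counter, deque

W, H = 4, 3
boundary = {(2, 0), (2, 1), (2, 2)}                    # a vertical boundary wall
tlr = {(0, 0): "http", (1, 1): "http", (0, 2): "dns",  # activated TLR per node
       (3, 0): "ftp", (3, 2): "ftp"}

def region(start):
    # connect all neighbouring nodes reachable without crossing a boundary node
    seen, todo = {start}, deque([start])
    while todo:
        i, j = todo.popleft()
        for n in ((i + 1, j), (i - 1, j), (i, j + 1), (i, j - 1)):
            if 0 <= n[0] < W and 0 <= n[1] < H and n not in seen and n not in boundary:
                seen.add(n)
                todo.append(n)
    return seen

left = region((0, 0))
label = Counter(tlr[n] for n in left if n in tlr).most_common(1)[0][0]
print(label)
```

The region grown from (0, 0) stops at the boundary wall, so the "ftp"-tagged nodes on the far side do not influence its majority label.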
The model can, however, be used for other purposes which could benefit from the exploitation of the data fusion that results from our TLR functionality. As mentioned before, for example, the SOM learning function can be affected to take into account data from the separate hypothetical space. This is one of our future research goals.

5 Experiments

Two experiments were performed in order to validate our proposed model: one to validate the StOrM model and one to present it with a more complex dataset. Datasets comprising χ and χ* needed to be chosen. As χ* comprises expert knowledge which correlates with χ, such expert knowledge had to be found. Behavioural analysis of running processes was chosen as the target domain and discrimination of networking applications as the problem area. Abundant expert knowledge exists in this domain.

Technical Data - Behaviour of Running Processes: In order to collect "technical" data, χ, the Microsoft Performance Counters API [8] was used. The following seven process-specific attributes and one system-wide attribute were selected to be monitored: IO Write Operations/sec, IO Read Operations/sec, IO Other Operations/sec, IO Data Operations/sec, % Privileged Time, % Processor Time, % User Time, Datagrams Sent/sec. This set of eight features yields the ability to observe the behaviour of running processes based on their I/O activity, CPU and network usage on the Windows OS. For a detailed explanation of each attribute, the reader is referred to [8]. The data was normalised and transformed into an 8-dimensional input feature vector, x.
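The construction of x from these eight counters can be sketched as below. The counter readings and the min-max normalisation ranges are invented for the example, since the paper does not specify its normalisation scheme; in practice the readings would come from the Performance Counters API:

```python
# Hypothetical sketch of turning one snapshot of the eight monitored counters
# into the normalised feature vector x. Readings and ranges are made up.
counters = {
    "IO Write Operations/sec": 120.0, "IO Read Operations/sec": 300.0,
    "IO Other Operations/sec": 40.0,  "IO Data Operations/sec": 420.0,
    "% Privileged Time": 12.0,        "% Processor Time": 35.0,
    "% User Time": 23.0,              "Datagrams Sent/sec": 15.0,
}
ranges = {k: (0.0, 500.0) for k in counters}      # per-attribute min/max (assumed)
for k in ("% Privileged Time", "% Processor Time", "% User Time"):
    ranges[k] = (0.0, 100.0)                      # percentages already bounded

# min-max normalisation into [0, 1], one component per monitored attribute
x = [(counters[k] - lo) / (hi - lo) for k, (lo, hi) in ranges.items()]
print(len(x))
```

Each snapshot of one running process thus yields one 8-dimensional input vector for the SOM.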
TLR Knowledge - Static Analysis of Executables: As we desire to discriminate between processes that perform networking activity and non-networking applications, suitable expert knowledge from χ* had to be chosen. A set of Windows API calls used for network communication in the Windows OS was selected from the MSDN library [8]. This library is a resource where expert knowledge on various Windows-specific libraries is presented and categorised according to various system functions. The following set of strings, representative of numerous API calls, was chosen from a set of libraries that are used for networking within the operating system: Internet, Ftp, Http, WinHttp, WSA, Rpc, Uuid, Dns, Dhcp, Netbios, Net, Snmp, WNet. These strings, which represent more than 90% of Windows networking functions, were selected as our TLR knowledge and encoded in e. Static binary analysis of running processes was then performed in order to evaluate which API calls each process imports. This information was subsequently transformed into the input feature vector s, representing the API calls present in an executable.

5.1 Results

Experiment I - In the first experiment two running processes were observed for a duration of approximately 150 seconds.
Namely, the editor Notepad and the messaging client MSN Live Messenger.

Fig. 1. SOM and StOrM results for Experiment I ((a) Components, (b) U-Matrix, (c) StOrM, (d) StOrM + labels) - In (c) and (d) the "@" sign denotes nodes on cluster boundaries, lines connect nodes belonging to the cluster of interest and numbers show the amount of TLRs flagged during training.

These two applications were chosen due to their difference in terms of networking functionality. In this experiment we want to show that by incorporating extra knowledge as part of the SOM algorithm, we can identify which cluster within the SOM output corresponds to networking activity and thus identify the Messenger application.
If we provide any machine learning algorithm with our "technical" data then we can generate a set of clusters, but without labels, and therefore we are unable to determine which cluster belongs to which activity/process. Using our encoded TLR information we can provide enough information to help us distinguish between clusters that denote the activity of interest and clusters that are irrelevant to our problem. Results of this experiment can be seen in Figure 1. Figure 1(a) shows component planes [6] of the SOM. This is a standard method for visualising the reference vectors of each node in the map. Each component (attribute) is represented as a pie slice, where the size shows the magnitude of this attribute in a given node.

Table 1. Experimental Results

                Nodes (i)   Winner Nodes (m_c)    Mean Net. Level
  Experiment    Total       TLR.on    TLR.off     TLR.on    TLR.off
  I             100         35        14          0.9873    0.0029
  II            400         137       131         0.3151    0.1830

Figure 1(b) shows the map using the U-Matrix [9] method. These two visualisations are standard methods for presenting the output of a SOM. Even though we can see that possibly two clusters exist in our data, it is difficult to distinguish which one corresponds to networking activity and thus to Messenger. Without labels and much understanding of the data, such discrimination is difficult. On the other hand, with the help of our StOrM model and the incorporation of some expert knowledge, we can automatically delineate the cluster of interest, seen in Figure 1(c). This connected set of nodes highlights a cluster representative of the behaviour of the Messenger application. This result can be validated against a map with labels, showing the labelled true cluster region in Figure 1(d).
The experiment was run ten times, yielding the results seen in Table 1. Out of a total of 100 nodes, on average 35% of nodes were winner nodes correlated with given expert knowledge from χ*. Fourteen percent of all nodes were winner nodes that had no correlation at all. In order to confirm the correctness of the correlation we look at the mean networking level for correlated versus uncorrelated winner nodes. We perform this calculation as we are interested in finding groups or clusters of nodes which represent applications denoting networking behaviour. In this experiment correlated nodes, all identified correctly as belonging to Messenger, have an average networking level of almost 99% of the total networking activity present during the experiment, whereas uncorrelated nodes, all belonging to Notepad, have on average below 1%. From the above analysis and the four figures it can be seen that our StOrM model provides a way of correlating expert knowledge with standard "technical" information for the purpose of cluster identification and labelling. This correlation helps with the identification of data or clusters of interest which, without labels, would otherwise be difficult to identify.

Experiment II - In order to assess our StOrM model on a more complex dataset, a larger number of running processes were monitored and subsequently analysed. In total 33 running applications were monitored during a session of standard use of the host machine for approximately 200 seconds. Figure 2 and Table 1 show the results of this experiment. With a more complex dataset it is more difficult to interpret results using the standard techniques seen in Figures 2(a) and 2(b).
It is possible to deduce that a number of clusters exist within the underlying data; however, which of those are the clusters of interest is very difficult to discern. With the help of our StOrM model, however, the understanding of the SOM output becomes much easier; this can be seen in Figure 2(c). Connected regions of the map clearly highlight a cluster of nodes denoting applications that exhibit networking activity due to their implementation of networking functions present in the Windows OS. Labels in Figure 2(d) show that applications that one would intuitively regard as having networking functionality are grouped within the connected region and non-networking applications lie outside of this cluster. This is true for most cases, with a handful of exceptions. These are attributed to the fact that such applications either do not use Windows networking functions or have not been active during the session. To validate the correct delineation of the networking cluster, quantitative analysis was again performed on correlated versus uncorrelated node mean networking activity. In this case the discrimination between the two groups is not as apparent as in experiment one. This is due to two reasons. Firstly, the networking attribute used in our "technical" data is a global measure, rather than a per-process signal. Secondly, not all applications use Windows networking functions for communication. Again 10 runs were performed and analysed. The produced SOM output contains 400 nodes, out of which 137 winner nodes have correlated expert knowledge and 131 winner nodes have no correlation.
The average networking activity for correlated nodes (32% of total networking activity) is approximately 14% higher than that of uncorrelated nodes (18% of total networking activity). This result again confirms that the cluster highlighted by the StOrM model delineates nodes which are representative of the class of interest.

6 Conclusions

In this work we have proposed a novel immune-inspired idea that provides new possibilities for knowledge discovery and exploratory data analysis. The proposed StOrM model incorporates an analogy of the so-called Toll-like receptors from the human immune system. This model provides an insight into a new class of learning where additional knowledge can be fused with traditional "technical" data. This additional information does not need to belong to the same hypothetical space as knowledge encoded within a training or testing dataset. The proposed model is explored with the help of two experiments, grounded within behavioural analysis of running processes on a host system.
This experimental evidence confirms our hypothesis that formulating data from one hypothetical space as a set of TLRs allows information from two disparate spaces to be combined for a better understanding of the problem.

Fig. 2. SOM and StOrM results for Experiment II ((a) Components, (b) U-Matrix, (c) StOrM, (d) StOrM + labels).
a c c s t s t t t i i c t t t l 2 2 2 f f f e e m m m m e e a U s v v d o o o r s e p c h x x x n n n p a h d x x x q o r f a t i o c r l t t a h e l e s h y p r v q h r o t o r t a s o s s t y p s c e o o o o o o a o o o h r l l o s s c s G v v s h e t f o o o a i l o o t r e c g g r G h l l h e e f o i U l t s e p p g g g t r d l h e a a o U t t a s e e p t t G i d 2 a o o e t o o o o o o o o o v e x g x l e U p d a t e p y t h o p p n y t h o p p n y t h o n C i i n l a f o o m c T a W r r d d d d y i n d w w o v u u w m a a s u u s w S e c c W a s e G l l a t t e r a i r e o v n a c r s − m o d c a h r j s s e u c c c a h v h g o t p p p p s s n n t i a s c h m u q l w 2 f c r m m m a o o e e c m m m r f o t t e h a l i o o c s n h t U r h g l s s t d d d d t v a o o h r S l l o s a a d e e s s e s s p c h h x y s n p p s c t t e d r v v a h d x c q t h e r o o o a r f a o p p t o a i l o c r l h t t t t c s h r e l o s s s s s e o s c h t y y y p v p p v p p s c r v s h h c c c e r a o o o o t f o t t t h h h o a l i l o o o r l h t t a o o o s r l l l o s e o s s s c h t y s s s p s c r v v v s h q t t t h e o o t f t o a i l o r l h t a s r l s e o s c h t y p r v s h q h o t f t o o i r l s t a s l s e e s t y p r v a h o r o o c s s s h l e s h t p v a a q r r o t c c r s t a h h o s e h y p p s c a q e r o o r o t a l o c r h t a h r l o s o s c h y p p c c v s h q r o o t f o t i l o o o r l h t t a l o e o s y c r v s h o t t o l h s o t s t p y t h o p p n y t t t h o p p n y t h o p p n y t h o n s e a r M M c s h e O O i n a M M d r c e h x i e n C r d l a e m x e C T i r n l r a v a f o m m y c T w a W r a d d d r y i e n s − d e a o s a u w e r t a c s h s h r S s d e c h p p e a h q r o a r f o t i o c r r l t t a c h l o e s h y p c r v h r o o o l h s t s o e o p t c a s y o t t t r h l c h s o h e o p n p a s y r t t r o h c s t o h o e n p c a r o r o l c h t h o o p c s r o t o l h t o o c s o t l h o s t W i n S C P p y t h o p n y t t t t t 
t t h o p p n y t h o n 4 4 0 0 1 2 0 2 2 0 0 0 0 0 0 2 0 1 1 1 0 @ 0 R T H D C P s s e L a r M M c h O i n M i n d f e o x c e C a i r n l r a f d o m c T a W r W d d d y i n L d L W o o W w g i n L i s n d S L P W o o e r w g a o i n i s r x n d c S y P W h o e r w a o i n s r x d c S y W h o e w a i n s r d c S h o e w a s r c S i e h e x a p r l o c h r v v e m w a r e − a u t h s s d e r v i c e s p y t h o n 0 0 0 @ 2 0 0 0 0 0 0 0 0 0 2 0 1 1 1 1 1 C C C R T H s e D a C r M c P h O L i n M d M e x O e C M M i r n l a f o m c C T a i n l r a f d d d o m P y c a T a i r n d d d t y D o t N e t v m w a r e − a u t h s s d e r v i c s e r r r r v v i c s e e s s r v i c e s 0 0 @ 0 @ 1 2 2 2 0 0 0 0 0 0 2 0 @ 1 @ 0 0 0 1 C s s e C R a C T r r c H h D i n C d M P e x O L e M r C l a m C T l r a a m y C T l r a a a m y y C T l r a a m y T r v v a m y w a r e − a u t h s s d e r v i c s s e r v i c s s e e e r r r r v v v i i i c c c c s e e e s s s r v i c s s e r v i c e s 3 0 0 @ 1 1 0 2 2 0 0 0 0 0 0 0 0 @ 0 @ 0 @ 1 1 1 i e x p l o i i e r e x p l o r C C e C R C T H s s e e D a a C r r c c P s h h e L i i n a d r c e e s h x x e i e n a r d r c e h x i e n r d e x e r v m w a r e − a u t h s s d e r r r v i c e s s e r v i c e s 2 0 @ 1 1 0 2 2 0 0 0 0 0 0 @ 0 @ 0 @ 0 @ 0 @ 1 @ 0 @ 0 @ R T H D R R C T P H L D C P s s e L a a a a a a a a a a r c s h e i n a d r c e h x i e n r d e x e r M O M M P O a M i n M t D O o M t M N O e t M P a i n t D o t N e t s e r v i c s e e s r r r r r v i c s s e s r v i c e s 2 0 @ 0 @ 1 0 0 0 0 0 0 0 @ 0 @ 0 @ 0 @ 0 @ 0 @ 0 @ 1 @ 0 @ 0 @ S k y p e R i T e H x p D R l o C T r P H e L D C i e P x s p e L l a o r r c e s h e i n a d r c e h x i e n r d e P x e a r i n t D o t M N O e t M M O M M O M P a i n t P D a o i t n N t P D D e a t o o i t n N t D e e t t o t N e t s e r v i c e s 2 2 2 2 @ 0 @ 0 0 0 0 0 0 0 @ 0 @ 0 0 0 0 @ 0 @ 0 @ 0 @ 0 @ 0 w i n l o g o n i e x p R l o T T r H e D C P L w i n l o w w g o i n n l o g o n P 
a i n t D o t N e t C C C C C C P a i n t D o t N e t s k y p e s P k y M p e s P k y y M p e s P k y M p e s s P k y M p e P M s e r v i c e s p y t h o n 2 0 2 2 0 @ 0 @ 0 0 0 0 0 @ 0 0 0 0 @ 0 @ 0 @ 0 @ 0 0 w i n l o g o n S k y p e S k y p e i e x p l o r C C e C C C C C C C C s k y p e s s P k y y M p e s s P k y M p e s s P k y M p e s s P k y M p e P M n o t e p a d + + e x p l o r e r 2 0 2 2 0 @ 0 @ 0 @ 0 @ 0 @ 0 @ 0 @ 0 0 0 @ 0 @ 0 @ 0 @ 0 @ 0 @ 0 w i n l o g o n S k y p e S k y p e C C C C C C s k y p e s P k y y y y y y y y y M p e P M n o t e p n n a o d t e + p + a d + + e x p l o r e r 2 0 @ 2 2 2 @ 0 @ 0 0 @ 0 @ 0 @ 0 @ 0 @ 0 @ 0 @ 0 @ 0 0 0 @ 0 @ 0 1 1 w i n l o g o n S k y p e S k y p e S k y p e c s r s s n o t e p n n a o d t t t t t t t e + p + n a a o d t e + p + + n a o d t e + p + a d + + e x p l o r e r 0 0 @ 0 @ 0 @ 0 @ 0 @ 0 0 0 @ 0 @ 0 @ 0 @ 0 @ 0 @ 0 @ 0 0 0 0 0 c s r s s c s r s s l s a s s l s a s s l s a s s n o t e p n n a o d t t t t t t t t t t t t t t t e + p + a d + + e x p l o r e r 1 1 1 @ 1 @ 1 @ 0 @ 0 @ 0 0 @ 0 @ 0 @ 0 0 0 @ 0 @ 0 0 0 0 0 j q s j q s j q s j q s j q s c s r s s c s r s s c s r s s l s a s s l s a s s l s a s s n o t e p a d + + e x p l o e e r e x p r l o e e r e x p p p p p p p p p p p p p p p p p p p p p p p p p p p p p p p p p r l o r e r 1 1 1 1 0 0 0 0 0 @ 0 @ 0 @ 0 0 0 @ 0 @ 0 0 0 0 0 j q s j q s j q s j q s c s r s s c s r s s c s r s s l s a s s l s a s s l s a s s l s a s s n o t e p a d + + e x p l o r e r e x p l o r e r V 2 V 6 V 1 0 V 3 V 7 V 1 1 V 4 V 8 V 5 V 9 ( b ) U - M a t r i x ( c ) S t O r M ( d ) L a b e l s F i g . 2 . S O M a n d S t O r M r e s u l t s f o r E x p e r i m e n t I I domai n provides for meanin gful selec tion of e xp ert kn owledge e nco d ed with in our mo del. Such know le dge is i n t he fo rm o f exp e rt infor mat ion as provided b y existin g functiona l ca tegor isat ion of pro grammi ng metho d s imp lem ente d wi thi n the W indows O S. 
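The U-Matrix panel of Fig. 2 visualises inter-unit distances on a trained SOM, where high values mark cluster borders. As a rough illustrative sketch (not the authors' implementation), a basic U-Matrix can be computed from a SOM's codebook vectors as follows; the `weights` array layout and the 4-connected neighbourhood are assumptions made for this example:

```python
import numpy as np

def u_matrix(weights):
    """Compute a simple U-Matrix for a rectangular SOM.

    weights: array of shape (rows, cols, dim) holding each unit's
    codebook vector. Returns an array of shape (rows, cols) where each
    entry is the mean Euclidean distance from that unit to its
    4-connected grid neighbours. High values indicate cluster borders.
    """
    rows, cols, _ = weights.shape
    umat = np.zeros((rows, cols))
    for r in range(rows):
        for c in range(cols):
            dists = []
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:
                    # Euclidean distance between neighbouring codebook vectors
                    dists.append(np.linalg.norm(weights[r, c] - weights[nr, nc]))
            umat[r, c] = np.mean(dists)
    return umat
```

Plotting the resulting matrix as a heat map yields the kind of border visualisation shown in panel (b), which StOrM then augments with expert information.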
Performed experiments show encouraging results in terms of improved visualisation for exploratory data analysis and knowledge discovery, owing to the proposed automatic cluster interpretation algorithm. Our model highlights a unique type of learning which becomes especially useful where no labelled examples exist, but some knowledge about classes of interest is acknowledged. From the field of information security all the way to the medical sciences, there exist many domains where expert knowledge is abundant, yet such knowledge is difficult to incorporate within traditional knowledge discovery techniques. We believe that our technique is a step forward towards combining such disparate knowledge with traditional sources of information, and that such fusion can greatly improve understanding of the problem at hand.