A Recommender System based on the Immune Network
The immune system is a complex biological system with a highly distributed, adaptive and self-organising nature. This paper presents an artificial immune system (AIS) that exploits some of these characteristics and is applied to the task of film reco…
Authors: Steve Cayzer, Uwe Aickelin
Proceedings CEC2002, pp 807-813, Honolulu, USA, 2002.
Steve Cayzer (1) and Uwe Aickelin (2)
(1) Hewlett-Packard Laboratories, Filton Road, BS12 6QZ Bristol, UK, steve_cayzer@hp.com
(2) School of Computer Science, University of Nottingham, NG8 1BB UK, uxa@cs.nott.ac.uk

Abstract - The immune system is a complex biological system with a highly distributed, adaptive and self-organising nature. This paper presents an artificial immune system (AIS) that exploits some of these characteristics and is applied to the task of film recommendation by collaborative filtering (CF). Natural evolution and in particular the immune system have not been designed for classical optimisation. However, for this problem, we are not interested in finding a single optimum. Rather we intend to identify a sub-set of good matches on which recommendations can be based. It is our hypothesis that an AIS built on two central aspects of the biological immune system will be an ideal candidate to achieve this: antigen-antibody interaction for matching and antibody-antibody interaction for diversity. Computational results are presented in support of this conjecture and compared to those found by other CF techniques.

1. INTRODUCTION

Over the last few years, a novel computational intelligence technique, inspired by biology, has emerged: the artificial immune system (AIS). This section introduces the AIS and shows how it can be used for solving computational problems. In essence, the immune system is used here as inspiration to create an unsupervised machine-learning algorithm. The immune system metaphor will be explored, involving a brief overview of the basic immunological theories that are relevant to our work. We also introduce the basic concepts of collaborative filtering (CF).

Overview of the Immune System

A detailed overview of the immune system can be found in many textbooks [14]. Briefly, the purpose of the immune system is to protect the body against infection. It includes a set of mechanisms collectively termed humoral immunity. This refers to a population of circulating white blood cells called B-lymphocytes, and the antibodies they create.

The features that are particularly relevant to our research are matching, diversity and distributed control. Matching refers to the binding between antibodies and antigens. Diversity refers to the fact that, in order to achieve optimal antigen space coverage, antibody diversity must be encouraged [11]. Distributed control means that there is no central controller; rather, the immune system is governed by local interactions between cells and antibodies.

The idiotypic network hypothesis [13] (disputed by some immunologists) builds on the recognition that antibodies can match other antibodies as well as antigens. Hence, an antibody may be matched by other antibodies, which in turn may be matched by yet other antibodies. This activation can continue to spread through the population and potentially has much explanatory power. The idiotypic network has been formalised by a number of theoretical immunologists [15].

There are many more features of the immune system, including adaptation, immunological memory and protection against auto-immune attack. Since these are not directly relevant to this work, they will not be reviewed here.

Overview of Collaborative Filtering

In this paper, we are using an AIS as a CF technique. CF is the term for a broad range of algorithms that use similarity measures to obtain recommendations. The best-known example is probably the “people who bought this also bought” feature of the internet company Amazon [2].
However, any problem domain where users are required to rate items is amenable to CF techniques. Commercial applications are usually called recommender systems [16]. A canonical example is movie recommendation.

In traditional CF, the items to be recommended are treated as ‘black boxes’. That is, your recommendations are based purely on the votes of your neighbours, and not on the content of the item. The preferences of a user, usually a set of votes on an item, comprise a user profile, and these profiles are compared to build a neighbourhood. The key decisions to be made are:

Data encoding: Perhaps the most obvious representation for a user profile is a string of numbers, where the length is the number of items, and the position is the item identifier. Each number represents the 'vote' for an item. Votes are sometimes binary (e.g. did you visit this web page?) but can also be integers in a range (say [0,5]) or rational numbers.

Similarity Measure: The most common method to compare two users is a correlation-based measure like Pearson or Spearman, which gives two neighbours a matching score between -1 and 1. Vector based methods (e.g. the cosine of the angle between vectors) and probabilistic methods are alternative approaches.

The canonical example is the k Nearest Neighbour algorithm, which uses a matching method to select k reviewers with high similarity measures. The votes from these reviewers, suitably weighted, are used to make predictions and recommendations.

Many improvements on this method are possible [10]. For example, the user profiles are usually extremely sparse because many items are not rated. This means that similarity measurements are both inefficient (the so-called ‘curse of dimensionality’) and difficult to calculate due to the small overlap.
Default votes are sometimes used for items a user has not explicitly voted on, and these can increase the overlap size [4]. Dimensionality reduction methods, such as Singular Value Decomposition, both improve efficiency and increase overlap [3]. Other pre-processing methods are often used, e.g. clustering [1]. Content-based information can be used to enhance the pure CF approach [10], [6]. Finally, the weighting of each neighbour can be adjusted by training, and there are many learning algorithms available for this [7]. All these improvements could in principle be applied to our AIS but in the interests of a clear and uncluttered comparison we have kept the CF algorithm as simple as possible.

The evaluation of a CF algorithm usually centres on its accuracy. There is a difference between prediction (given a movie, predict a given user’s rating of that movie) and recommendation (given a user, suggest movies that are likely to attract a high rating). Prediction is easier to assess quantitatively but recommendation is a more natural fit to the movie domain. We present results evaluating both these behaviours.

Using an AIS for Collaborative Filtering

To us, the attraction of the immune system is this: if an adaptive pool of antibodies can produce 'intelligent' behaviour, can we harness the power of this computation to tackle the problem of preference matching and recommendation? Thus, in the first instance we intend to build a model where known user preferences are our pool of antibodies and the new preferences to be matched are the antigen in question. Our conjecture is that if the concentrations of those antibodies that provide a better match are allowed to increase over time, we should end up with a subset of good matches. However, we are not interested in optimising, i.e. in finding the one best match.
Instead, we require a set of antibodies that are a close match but which are at the same time distinct from each other for successful recommendation. This is where we propose to harness the idiotypic effects of binding antibodies to similar antibodies to encourage diversity.

The next section presents more details of our problem and explains the AIS model we intend to use. We then describe the experimental set-up and present some initial results. Finally we review the results and discuss some possibilities for future work.

2. ALGORITHMS

Application of the AIS to the EachMovie Tasks

The EachMovie database [5] is a public database, which records explicit votes of users for movies. It holds 2,811,983 votes taken from 72,916 users on 1,628 films. The task is to use this data to make predictions and recommendations. In the former case, we provide an estimated vote for a previously unseen movie. In the latter case, we present a ranked list of movies that the user might like.

The basic approach of CF is to use information from a neighbourhood to make useful predictions and recommendations. The central task we set ourselves is to identify a suitable neighbourhood.

The SWAMI (Shared Wisdom through the Amalgamation of Many Interpretations) framework [9] is a publicly accessible software for CF experiments. Its central algorithm is as follows:

    Select a set of test users randomly from the database
    FOR each test user t
        Reserve a vote of this user (i.e. hide from predictor)
        From remaining votes create a new training user t’
        Select neighbourhood of k reviewers based on t’
        Use neighbourhood to predict vote
        Compare this with actual vote and collect statistics
    NEXT t

The code shown in italics indicates a place where SWAMI allows an implementation-dependent choice of algorithm. We use an AIS to perform selection and prediction as below.
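
The evaluation loop above can be sketched in Python. This is only an illustrative outline, not SWAMI's actual code: the names `evaluate` and `mean_vote_predictor` are hypothetical, the predictor shown is a trivial placeholder standing in for the neighbourhood selection and prediction steps, and every user is assumed to have at least two votes.

```python
import random
from typing import Callable, Dict

Profile = Dict[int, float]  # movie id -> vote
Predictor = Callable[[Profile, Dict[int, Profile], int], float]


def mean_vote_predictor(training_user: Profile,
                        reviewers: Dict[int, Profile],
                        movie: int) -> float:
    """Placeholder predictor (an assumption, not the paper's method):
    the plain average of all reviewers' votes on the movie."""
    votes = [p[movie] for p in reviewers.values() if movie in p]
    if votes:
        return sum(votes) / len(votes)
    # Nobody else rated the movie: fall back to the user's own mean vote.
    return sum(training_user.values()) / len(training_user)


def evaluate(users: Dict[int, Profile], predictor: Predictor,
             n_test: int, seed: int = 0) -> float:
    """Hide one vote per test user, predict it, and return the mean
    absolute error over all test users."""
    rng = random.Random(seed)
    test_ids = rng.sample(sorted(users), n_test)
    errors = []
    for t in test_ids:
        profile = users[t]
        hidden = rng.choice(sorted(profile))  # reserve one vote
        # Training user t': the same profile without the hidden vote.
        training_user = {m: s for m, s in profile.items() if m != hidden}
        reviewers = {uid: p for uid, p in users.items() if uid != t}
        predicted = predictor(training_user, reviewers, hidden)
        errors.append(abs(profile[hidden] - predicted))
    return sum(errors) / len(errors)
```

Passing the predictor as a function mirrors the point made in the text: the selection and prediction steps are the implementation-dependent part, while the surrounding hold-out loop stays the same.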

Algorithm Choices

We use the SWAMI data encoding:

$$
\mathrm{User} = \{\{id_1, score_1\}, \{id_2, score_2\}, \ldots, \{id_n, score_n\}\}
$$

Where id corresponds to the unique identifier of the movie being rated and score to this user’s score for that movie. This captures the essential features of the data available.

EachMovie vote data links a person with a movie and assigns a score (taken from the set {0, 0.2, 0.4, 0.6, 0.8, 1.0} where 0 is the worst). User demographic information (e.g. age and gender) is provided but this is not used in our encoding. Content information about movies (e.g. category) is similarly not used.

Similarity Measure

The Pearson measure is used to compare two users u and v:

$$
r = \frac{\sum_{i=1}^{n} (u_i - \bar{u})(v_i - \bar{v})}{\sqrt{\sum_{i=1}^{n} (u_i - \bar{u})^2 \sum_{i=1}^{n} (v_i - \bar{v})^2}} \qquad (1)
$$

Where u and v are users, n is the number of overlapping votes (i.e. movies for which both u and v have voted), u_i is the vote of user u for movie i and ū is the average vote of user u over all films (not just the overlapping votes). The measure is amended as follows:

$$
r = \begin{cases}
\text{NoOverlapDefault} & \text{if } n = 0 \\
\text{ZeroVarianceDefault} & \text{if } \sum_{i=1}^{n} (u_i - \bar{u})^2 \sum_{i=1}^{n} (v_i - \bar{v})^2 = 0 \\
\dfrac{n}{P}\, r & \text{if } n < P
\end{cases}
\qquad (2)
$$

where P is the overlap penalty.

The two default values are required because it is impossible to calculate a Pearson measure in such cases. Both were set to 0. Some experimentation showed that an overlap penalty P was beneficial (this lowers the absolute correlation for users with only a small overlap) but that the exact value was not critical. We choose a value of 100 because this is the maximum overlap expected.

Neighbourhood Selection

For a Simple Pearson predictor, neighbourhood selection means simply choosing the best k (absolute) correlation scores, where k is the neighbourhood size. Not every potential neighbour will have rated the film to be predicted.
Reviewers who did not vote on the film are not added to the neighbourhood.

For the AIS predictor, a more involved procedure is required:

    Initialise AIS
    Encode user for whom to make predictions as antigen Ag
    WHILE (AIS not stabilised) & (Reviewers available) DO
        Add next user as an antibody Ab
        Calculate matching scores between Ab and Ag
        Calculate matching scores between Ab and other antibodies
        WHILE (AIS at full size) & (AIS not stable) DO
            Iterate AIS
        OD
    OD

Our AIS behaves as follows: At each step (iteration) an antibody’s concentration is increased by an amount dependent on its matching to the antigen and decreased by an amount which depends on its matching to other antibodies. In absence of either, an antibody’s concentration will slowly decrease over time. Antibodies with a sufficiently low concentration are removed from the system, whereas antibodies with a high concentration may saturate. An AIS iteration is governed by the following equation:

$$
\frac{dx_i}{dt} = \underbrace{k_1 m_i x_i y}_{\text{antigen stimulation}} - \underbrace{\frac{k_2}{N} \sum_{j=1}^{N} m_{ij} x_i x_j}_{\text{antibody suppression}} - \underbrace{k_3 x_i}_{\text{death rate}} \qquad (3)
$$

where k_1 is the stimulation rate, k_2 the suppression rate and k_3 the death rate; m_i and m_ij are the matching scores (correlation r) between antibody i and the antigen, and between antibodies i and j, respectively; N is the number of antibodies; x_i (or x_j) is the concentration of an antibody and y is the concentration of the antigen.

This is a slightly modified version of Farmer et al’s equation [8]. In particular, the first term is simplified as we only have one antigen, and we normalise the suppression term to allow a ‘like for like’ comparison between the different rate constants.

k_1 and k_2 were varied as described in the next section. k_3 was fixed at 0.1, while the concentration range was set at 0–100 (initially 10). We fixed N at 100. The matching function is the absolute value of the Pearson correlation measure. This allows us to have both positively and negatively correlated users in our neighbourhood, which increases the pool of neighbours available to us.
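
As a rough illustration of the matching function and the update rule, the following Python sketch implements the amended Pearson measure of equations (1) and (2) and a single explicit Euler step of equation (3). Several details are assumptions made here rather than taken from the text: the function names, the antigen concentration `y = 1.0`, the step size `dt = 1.0`, and the choice to clamp concentrations to the 0–100 range instead of removing weak antibodies. In the AIS the matching scores would be the absolute values of the Pearson scores.

```python
import math
from typing import Dict, List

Profile = Dict[int, float]  # movie id -> vote


def pearson_with_penalty(u: Profile, v: Profile,
                         penalty: int = 100, default: float = 0.0) -> float:
    """Amended Pearson measure, equations (1) and (2)."""
    overlap = [m for m in u if m in v]
    n = len(overlap)
    if n == 0:
        return default  # NoOverlapDefault
    # Means are taken over all of a user's votes, not just the overlap.
    u_mean = sum(u.values()) / len(u)
    v_mean = sum(v.values()) / len(v)
    du = [u[m] - u_mean for m in overlap]
    dv = [v[m] - v_mean for m in overlap]
    denom = sum(a * a for a in du) * sum(b * b for b in dv)
    if denom == 0:
        return default  # ZeroVarianceDefault
    r = sum(a * b for a, b in zip(du, dv)) / math.sqrt(denom)
    # Overlap penalty: scale down scores based on fewer than P shared votes.
    return r * n / penalty if n < penalty else r


def ais_step(conc: List[float], m_antigen: List[float],
             m_pair: List[List[float]], k1: float, k2: float,
             k3: float = 0.1, pool_size: int = 100, y: float = 1.0,
             dt: float = 1.0, max_conc: float = 100.0) -> List[float]:
    """One explicit Euler step of equation (3) for every antibody.

    y, dt and the clamping are assumptions of this sketch. The sum runs
    over all j as written in (3); the caller decides via the diagonal of
    m_pair whether an antibody suppresses itself.
    """
    updated = []
    for i in range(len(conc)):
        stimulation = k1 * m_antigen[i] * conc[i] * y
        suppression = (k2 / pool_size) * sum(
            m_pair[i][j] * conc[i] * conc[j] for j in range(len(conc)))
        death = k3 * conc[i]
        x = conc[i] + dt * (stimulation - suppression - death)
        updated.append(min(max_conc, max(0.0, x)))  # keep within 0..100
    return updated
```

With suppression switched off (`k2 = 0`), each antibody in this sketch grows or decays on its own according to its antigen match, which corresponds to the simple AIS without idiotypic effects.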

The AIS is considered stable after iterating for ten iterations without changing in size. Stabilisation thus means that a sufficient number of ‘good’ neighbours have been identified and therefore a prediction can be made. ‘Poor’ neighbours would be expected to drop out of the AIS after a few iterations.

Once the AIS has stabilised using the above algorithm, we use the antibody concentration to weigh the neighbours. However, early experiments showed that the most recently added antibodies were at a disadvantage compared to earlier antibodies. This is because they have had no time to mature (i.e. increase in concentration). Likewise, the earliest antibodies had saturated. To overcome this, we reset the concentrations and allow a limited run of the AIS to differentiate the concentrations:

    Reset AIS (set all antibodies to initial concentrations)
    WHILE (No antibody at maximum concentration) DO
        Iterate AIS
    OD

Prediction

We predict a rating p_i by using a weighted average over N, the neighbourhood of u, which was taken as the entire AIS.

$$
p_i = \bar{u} + \frac{\sum_{v \in N} w_{uv} (v_i - \bar{v})}{\sum_{v \in N} w_{uv}}, \qquad w_{uv} = r_{uv}\, x_v \ \text{(NB: relative, not absolute)} \qquad (4)
$$

Where w_uv is the weight between users u and v, r_uv is the correlation score between u and v, and x_v is the concentration of the antibody corresponding to user v.

Evaluation

Prediction Accuracy: We take the mean absolute error, where n_p is the number of predictions:

$$
MAE = \frac{\sum |\mathit{actual} - \mathit{predicted}|}{n_p} \qquad (5)
$$

Mean number of recommendations: This is the total number of unique films rated by the neighbours.

Mean overlap size: This is the number of recommendations that the user has also seen.

Mean accuracy of recommendations: Each overlapped film has an actual vote (from the antigen) and a predicted vote (from the neighbours). The overlapped films were ranked on both actual and predicted vote, breaking ties by movie ID.
The two ranked lists were compared using Kendall’s Tau τ. This measure reflects the level of concordance in the lists by counting the number of discordant pairs. To do this we order the films by vote and apply the following formulae:

$$
\tau = 1 - \frac{4 N_D}{n(n-1)}, \qquad
N_D = \sum_{i=1}^{n} \sum_{j=i+1}^{n} D(r_i, r_j), \qquad
D(r_i, r_j) = \begin{cases} 1 & \text{if } r_i > r_j \\ 0 & \text{otherwise} \end{cases}
\qquad (6)
$$

Where n is the overlap size and r_i is the rank of film i as recommended by the neighbourhood. Note that i here refers to the antigen rank of the film, not the film ID. N_D is the number of discordant pairs, or, equivalently, the expected cost of a bubble sort to reconcile the two lists. D is set to one if the rankings are discordant.

Mean number of reviewers: This is the number of reviewers looked at before the AIS stabilised.

Mean number of neighbours: This is the final number of neighbours in the stabilised AIS.

3. EXPERIMENTS

Experiments were carried out on a Pentium 700 with 256MB RAM, running Windows 2000. The AIS was coded in Java™ JDK1.3. Each run involved looking at up to 15,000 reviewers (20% of the EachMovie data set, randomly chosen) to provide predictions and recommendations for 100 users. Averaged statistics are then taken for each run. Runtimes ranged from 5 to 60 minutes, largely dependent on the number of reviewers.

Experiments on Simple AIS

Initial experiments concentrated on a simple AIS, with no idiotypic effects. The goal was to find a good stimulation rate, but also to ensure that the ‘baseline’ system operates similarly to a Simple Pearson predictor (SP). Therefore, we set the suppression rate to zero, and varied only the stimulation rate, i.e. the weighting given to antigen binding. Other parameters had been fixed by preliminary experiments.

[Figure 1 panels: "Effect of stimulation on neighbourhood size" (neighbourhood size vs. stimulation rate); "Effect of stimulation on number of users looked at" (number of users looked at vs. stimulation rate).]

Figure 1: Effect of stimulation rate on neighbourhood and reviewers.

The graphs show averaged results over five runs at each stimulation rate. The bars show standard deviations. In order to have a fair comparison, the Simple Pearson parameters (neighbourhood and number of reviewers looked at) match the AIS values for each rate.

In figure 2, we show the prediction error, number of recommendations, number of overlaps and recommendation accuracy for each algorithm. Note that low prediction error values are better, whereas for the other measures we are looking for high values.

[Figure 2 panels, each plotting AIS (av) and SP (av) against stimulation rate: "Effect of stimulation on prediction error" (mean absolute error); "Effect of stimulation on recommendation accuracy" (Kendall's Tau); "Effect of stimulation on number of recommendations"; "Effect of stimulation on number of overlaps".]

Figure 2: Effect of stimulation rate on prediction and recommendation.

It can be seen that the simple AIS gives broadly similar prediction performance to the Simple Pearson. The MAE measurements from different runs are not normally distributed, so a non-parametric statistic is appropriate.
We performed a Wilcoxon analysis, which showed that the difference between prediction errors of SP and AIS is zero with 95% confidence. In addition, the choice of an appropriate stimulation rate did make a significant difference (a rate of 0.2 compared with 0.02 at the 95% level).

For recommendation, the AIS performs better than the SP at stimulation rates above 0.1. Again, we performed a positive 95% Wilcoxon analysis to assess significance. We excluded cases where a recommendation score was unavailable (due to an insufficient number of overlaps). The number of recommendations and overlaps show similar trends though the AIS gives a more constant value. Again, some stimulation was beneficial.

In later experiments, the stimulation rate was fixed at one of the better values (0.2, 0.3 or 0.5), in order to give us a good base to work on. These values give us generally good performance, while keeping a good neighbourhood size and still evaluating a reasonable number of reviewers.

Experiments on the Idiotypic AIS

Having fixed all the simple parameters, we tested the effect of suppression for stimulation rates of 0.2, 0.3 and 0.5. Not surprisingly we found that suppression changed the number of reviewers looked at and the number of neighbours:

[Figure 3 panels, each plotted against suppression rate for stimulation rates 0.2, 0.3 and 0.5: "Effect of suppression on neighbourhood size"; "Effect of suppression on number of reviewers looked at".]

Figure 3: Effect of suppression rate on neighbourhood size and reviewers.

We then tested the effect of suppression on the AIS performance.
Here we fixed the baseline rate at stimulation only (no suppression), and took measurements relative to this baseline. Again, it should be noted that the first graph shows prediction error (hence, a good result is low).

[Figure 4 panels, each plotted relative to baseline against suppression rate for stimulation rates 0.2, 0.3 and 0.5: "Effect of suppression on prediction error" (mean absolute error); "Effect of suppression on recommendation accuracy" (Kendall); "Effect of suppression on number of overlaps"; "Effect of suppression on number of recommendations".]

Figure 4: Effect of suppression rate on prediction and recommendation.

Again, the graphs show averaged results over five runs at each suppression rate. The bars show standard deviations (similar size bars for rates 0.2 and 0.5 have been omitted in the interests of clarity).

At low levels of stimulation, prediction accuracy is not significantly affected. However, recommendation accuracy is improved significantly (95% Wilcoxon). For instance, for 0.3 stimulation, suppression rates from 0.05 to 0.2 gave a significantly improved performance. In actual terms, the Kendall measure rises from 0.5 to nearly 0.6.
This means that the chance of a randomly sampled pair being correctly ranked has risen from 60% to 80%. Too much suppression had a detrimental effect on all measures.

4. CONCLUSIONS

It is not particularly surprising that the simple AIS performs similarly to the SP predictor. This is because they are, at their core, based around the same algorithm. The stimulation rate (in absence of any idiotypic effect) is effectively setting a threshold for correlation. This has both strengths and weaknesses. It has been shown that a threshold is useful in discarding the potentially misleading predictions of poorly correlated reviewers [10]. On the other hand, a rigid threshold means that one has to ‘prejudge’ the appropriate level to avoid both premature convergence and empty communities. Indeed, detailed examination of the individual runs showed that the AIS had a tendency to fill its neighbourhood either early or not at all.

The setting of a threshold also means that sufficiently good antibodies are taken on a first come, first served basis. It is interesting to observe that such a strategy nevertheless seems (in these experiments) to provide a more constant level of overlaps, and better recommendation quality.

The richness of our AIS model comes when we allow interactions between antibodies. Early, qualitative experimentation with the idiotypic network showed antibody concentration rising and falling dynamically as the population varied. By contrast, in the simple AIS, the concentration of an antibody will monotonically increase to saturation, or decrease to elimination, unaffected by the other antibodies. However, there is a delicate balance to be struck between stimulation and suppression. An imbalance may lead to a loss in population size or diversity.

The graphs show that a small amount of suppression may indeed be beneficial to AIS performance, in particular recommendation. It is interesting to note that the increase in recommendation quality occurs with a relatively constant overlap size. At too high levels of suppression, it is harder to fill the neighbourhood, with consequent lack of diversity and hence recommendation accuracy.

We believe that these initial results show two things. Firstly, population effects can be beneficial for CF algorithms, particularly for recommendation; secondly, that CF is a promising new application area for artificial immune systems. In fact, we can widen the context, since the process of neighbourhood selection described in this paper can easily be generalized to the task of ad-hoc community formation.

REFERENCES

[1] Aggarwal C and Yu P, On Text Mining Techniques for Personalization, Lecture Notes in Artificial Intelligence, vol. 1711, pp. 12-18, 1999.
[2] Amazon.com Recommendations (http://www.amazon.com/).
[3] Billsus D and Pazzani MJ, Learning Collaborative Information Filters, Proceedings of the Fifteenth International Conference on Machine Learning, pp. 46-54, 1998.
[4] Breese JS, Heckerman D and Kadie C, Empirical Analysis of predictive algorithms for collaborative filtering, Proceedings of the 14th Conference on Uncertainty in Reasoning, pp. 43-52, 1998.
[5] Compaq Systems Research Centre, EachMovie collaborative filtering data set (http://www.research.compaq.com/SRC/eachmovie/).
[6] Delgado J, Ishii N and Tomoki U, Content-based Collaborative Information Filtering: Actively Learning to Classify and Recommend Documents. In: Cooperative Information Agents II. Learning, Mobility and Electronic Commerce for Information Discovery on the Internet, ed. M. Klusch, G. W. E. Springer-Verlag, 1998.
[7] Delgado J and Ishii, Multi-agent Learning in Recommender Systems For Information Filtering on the Internet, Journal of Co-operative Information Systems, vol. 10, pp. 81-100, 2001.
[8] Farmer JD, Packard NH and Perelson AS, The immune system, adaptation, and machine learning, Physica, vol. 22, pp. 187-204, 1986.
[9] Fisher D, Hildrum K, Hong J, Newman M and Vuduc R, SWAMI: a framework for collaborative filtering algorithm development and evaluation, 1999. http://guir.berkeley.edu/projects/swami/.
[10] Gokhale A, Improvements to Collaborative Filtering Algorithms, 1999. Worcester Polytechnic Institute. http://www.cs.wpi.edu/~claypool/ms/cf-improve/.
[11] Hightower RR, Forrest S and Perelson AS, The evolution of emergent organization in immune system gene libraries, Proceedings of the 6th International Conference on Genetic Algorithms, pp. 344-350, 1995.
[12] Hunt J, King C and Cooke D, Immunizing against fraud, IEEE Colloquium on Knowledge Discovery and Data Mining, vol. 4, pp. 1-4, 1996.
[13] Jerne NK, Towards a network theory of the immune system, Annals of Immunology, vol. 125, no. C, pp. 373-389, 1973.
[14] Kirkwood E and Lewis C, Understanding Medical Immunology, John Wiley & Sons, Chichester, 1989.
[15] Perelson AS and Weisbuch G, Immunology for physicists, Reviews of Modern Physics, vol. 69, pp. 1219-1267, 1997.
[16] Resnick P and Varian HR, Recommender systems, Communications of the ACM, vol. 40, pp. 56-58, 1997.