Cross-situational and supervised learning in the emergence of communication

Scenarios for the emergence or bootstrap of a lexicon involve the repeated interaction between at least two agents who must reach a consensus on how to name N objects using H words. Here we consider minimal models of two types of learning algorithms: cross-situational learning, in which the individuals determine the meaning of a word by looking for something in common across all observed uses of that word, and supervised operant conditioning learning, in which there is strong feedback between individuals about the intended meaning of the words. Despite the stark differences between these learning schemes, we show that they yield the same communication accuracy in the realistic limits of large N and H, which coincides with the result of the classical occupancy problem of randomly assigning N objects to H words.

Authors: José F. Fontanari (Instituto de Física de São Carlos, Universidade de São Paulo, Caixa Postal 369, 13560-970 São Carlos, São Paulo, Brazil; fontanari@ifsc.usp.br) and Angelo Cangelosi (Centre for Robotics and Neural Systems, University of Plymouth, Plymouth PL4 8AA, United Kingdom; A.cangelosi@plymouth.ac.uk)

I. INTRODUCTION

How a coherent lexicon can emerge in a group of interacting agents is a major open issue in the language evolution and acquisition research area (Hurford, 1989; Nowak & Krakauer, 1999; Steels, 2002; Kirby, 2002; Smith, Kirby, & Brighton, 2003). In addition, the dynamics of the self-organization of shared lexicons is one of the issues to which computational and mathematical modeling can contribute the most, as the emergence of a lexicon from scratch implies some type of self-organization and, possibly, a threshold phenomenon. This cannot be completely understood without a thorough exploration of the parameter space of the models (Baronchelli, Felici, Loreto, Caglioli, & Steels, 2006).

There are two main research avenues to investigate the emergence or bootstrapping of a lexicon. The first approach, inspired by the seminal work of Pinker and Bloom (1990), who argued that natural selection is the main design principle to explain the emergence and complex structure of language, resorts to evolutionary algorithms to evolve the shared lexicon. The key element here is that an improvement in the communication ability of an individual results, on average, in an increase in the number of offspring it produces (Hurford, 1989; Nowak & Krakauer, 1999; Cangelosi, 2001; Fontanari & Perlovsky, 2007, 2008). The second research avenue, which we follow in this paper, argues for a culturally based view of language evolution and so assumes that lexicons are acquired and modified solely through learning during the individual's lifetime (Steels, 2002; Smith, Kirby, & Brighton, 2003).

Of course, if there is a fact about language which is uncontroversial, it is that the lexicon must be learned from the active or passive interaction between children and language-proficient adults. The issue of whether this ability to learn the lexicon is due to some domain-general learning mechanism, or is an innate ability unique to humans, is still on the table (Bates & Elman, 1996).
In the problem we address here there are simply no language-proficient individuals, so it is not so far-fetched to put forward a biological rather than a cultural explanation for the emergence of a self-organized lexicon. Nevertheless, in this contribution we will use many insights produced by research on language acquisition by children (see, e.g., Gleitman, 1990; Bloom, 2000) to study different learning strategies.

From a developmental perspective, there are basically two competing schemes for lexicon acquisition by children (Rosenthal & Zimmerman, 1978). The first scheme, termed cross-situational or observational learning, is based on the intuitive idea that one way a learner can determine the meaning of a word is to find something in common across all observed uses of that word (Pinker, 1984; Gleitman, 1990; Siskind, 1996). Hence learning takes place through the statistical sampling of the contexts in which a word appears. Since the learner receives no feedback about its inferences, we refer to this scheme as unsupervised learning. The second scheme, known generally as operant conditioning, involves the active participation of the agents in the learning process, with exchange of non-linguistic cues to provide feedback on the hearer's inferences. This supervised learning scheme has been applied to the design of a system for communication by autonomous robots, the so-called language game of the Talking Heads experiments (Steels, 2003). Despite the technological appeal, the empirical evidence is that most of the lexicon is acquired by children as a product of unsupervised learning (Pinker, 1984; Gleitman, 1990; Bloom, 2000).

Interestingly, from the perspective of evolving or bootstrapping a lexicon, the unsupervised scheme is very attractive too, since it eliminates altogether the issue of honest signaling (Dawkins & Krebs, 1978): no signaling is involved in the learning process, which requires only observation and some elements of intuitive psychology (e.g., Theory of Mind).

Many different computational implementations and variants of these two schemes for bootstrapping a lexicon have been proposed in the literature. For example, Smith (2003a, 2003b), Smith, Smith, Blythe, & Vogt (2006), and De Beule, De Vylder, & Belpaeme (2006) have addressed the unsupervised learning scheme, whereas Steels & Kaplan (1999), Ke, Minett, Au, & Wang (2002), Smith, Kirby, & Brighton (2003), and Lenaerts, Jansen, Tuyls, & De Vylder (2005) have addressed the supervised scheme. However, except for the extensive statistical analysis of a variant of the supervised learning algorithm which reduces the problem to that of naming a single object (Baronchelli, Felici, Loreto, Caglioli, & Steels, 2006), the study of the effects of changing the parameters of those models has usually been limited to the display of the time evolution of some measure of the communication accuracy of the population. Although at first sight the supervised learning scheme may seem clearly superior to the unsupervised one (albeit less realistic in the context of language acquisition by children), we are not aware of any thorough comparison between the performances of these two learning scenarios.
In fact, in this contribution we show that in the realistic limit of very large lexicon sizes the supervised and unsupervised learning performances are essentially identical.

In this paper we study minimal models of the supervised and unsupervised learning schemes which preserve the main ingredients of these two classical language acquisition paradigms. For the sake of simplicity, here we interpret the lexicon as a mapping between objects and words (or sounds) rather than as a mapping between meanings (conceptual structures) and sounds. A more complete scenario would involve first the creation of meanings, i.e., the bootstrapping of an object-meaning mapping (Steels, 1996; Fontanari, 2006), and then the emergence of a meaning-sound mapping (see, e.g., Smith, 2003a, 2003b; Fontanari & Perlovsky, 2006).

II. MODEL

Following a common assumption in lexicon bootstrapping models, such as the popular iterated learning model (Smith, Kirby, & Brighton, 2003; Brighton, Smith, & Kirby, 2005), we consider here only two agents who play in turns the roles of speaker and hearer. The agents live in a fixed environment composed of N objects and have H words available to name these objects. As we are interested in the limit where N and H are very large with the ratio α ≡ H/N finite, we do not need to account for the possibility of creation of new words, as in some variants of the supervised learning scheme (Baronchelli, Felici, Loreto, Caglioli, & Steels, 2006).

We assume that each agent is characterized by an N × H verbalization matrix P whose entries p_nh ∈ [0, 1] are interpreted as the probability that object n is associated with word h, with the rows normalized so that Σ_h p_nh = 1 for all n = 1, ..., N. This assumption rules out the existence of objects without names, but it allows for words which are never used to name objects. To describe the communicative behavior of the agents through the verbalization matrix (i.e., the associations between objects and words for use both in production and interpretation) we need to specify how the speaker chooses a word for any given object as well as how the hearer infers the object the speaker intended to name by that word.

To name an object, say object n, the speaker simply chooses the word h* that corresponds to the largest entry of row n of the matrix P, i.e., h* = argmax_{h = 1, ..., H} p_nh. In addition, to guess which object the speaker named by word h, the hearer selects the object that corresponds to the largest of the N entries p_nh, n = 1, ..., N. In other words, the hearer chooses the object that it itself would be most likely to associate with word h (Smith, 2003a, 2003b). This amounts to assuming that the agents are endowed with a 'Theory of Mind' (ToM), i.e., that the hearer is somehow able to understand that the speaker thinks similarly to itself and hence would behave likewise when facing the same situation (Donald, 1991). We note that the original inference scheme, termed "obverter" (Oliphant & Batali, 1997), assumed that the hearer has access to the verbalization matrix of the speaker (through mind reading, as the critics were ready to point out). Here we follow the more reasonable scheme, dubbed "introspective obverter" (Smith, 2003a), which requires endowing the agents with a Theory of Mind rather than with telepathic abilities.
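For concreteness, a minimal sketch of these production and interpretation rules (in Python with NumPy; the function names and the array representation are our own and are not part of the paper) could read:

```python
import numpy as np

def speak(P, n):
    """Speaker: name object n with the word of largest probability in row n of P."""
    return int(np.argmax(P[n]))

def interpret(P, h, context=None):
    """Hearer: guess the object named by word h as the object it would itself most
    likely associate with h (introspective obverter).  During the language game the
    choice is restricted to the objects in the context; during the measurement of
    the communication accuracy all N objects are candidates."""
    candidates = np.arange(P.shape[0]) if context is None else np.asarray(context)
    return int(candidates[np.argmax(P[candidates, h])])
```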
Effective communication takes place when the two agents reach a consensus on which word must be assigned to each object. To achieve this, we must provide a prescription to modify their initially random verbalization matrices. Here we will consider two learning procedures that differ basically in whether the agents receive feedback (supervised learning) or not (unsupervised learning) about the success of a communication episode. But before doing this we need to set up the language game scenario in which the agents interact.

From the list of N objects, the agent who plays the speaker role chooses randomly C objects without replacement. This set of C objects forms the context. Then the speaker chooses randomly one object in the context and produces the word associated with that object, according to the procedure sketched before. The hearer has access to that word as well as to the C objects that comprise the context. Its task is to guess which object in the context is named by that word. This is then an ambiguous language acquisition scenario in which there are multiple object candidates for any word. Once the verbalization matrices are updated, the two agents interchange the roles of speaker and hearer and a new context is generated following the same procedure.

To control the convergence properties of the learning algorithms described next, we assume that the entries p_nh are discrete variables that can take on the values 0, 1/M, 2/M, ..., 1 − 1/M, 1. In our simulations we choose M = 10^4. The reciprocal of M can be interpreted as the learning rate of the algorithm. In addition, as there are two agents who alternate in the roles of speaker and hearer, henceforth we will add the superscripts I or J to the verbalization matrix in order to identify the agent it corresponds to. At the beginning of the language game each agent has a different, randomly generated verbalization matrix. More pointedly, to generate row n of P^I we distribute with equal probability M balls among H slots and set the value of entry p^I_nh as the ratio between the number of balls in slot h and the total number of balls M. An analogous procedure is used to set the initial value of P^J.

A. Unsupervised learning

In this scheme, the list of objects in the context, n_1, ..., n_C, and the accompanying word h* is the only information fed to the learning algorithm. Hence, in the unsupervised scheme, only the hearer's verbalization matrix is updated. Of course, since the agents change roles at each learning episode, the verbalization matrices of both agents are updated during the learning stage. For concreteness, let us assume that agent I is the speaker and agent J is the hearer in a particular learning episode. As pointed out before, the idea here is to model the cross-situational learning scenario (Siskind, 1996) in which the agents infer the meaning of a given word by monitoring its occurrence in a variety of contexts. Accordingly, the learning procedure increases the entries p^J_{n_1 h*}, ..., p^J_{n_C h*} by the amount 1/M. In addition, for each object in the context, say n_1, a word, say h, is chosen randomly and the entry p^J_{n_1 h} is decreased by the same amount 1/M, thus keeping the correct normalization of the rows of the verbalization matrix. (The possibility that h = h* is not ruled out.) This procedure, which is inspired by Moran's model of population genetics (Ewens, 2004), guarantees a minimum disturbance of the verbalization matrix and can be interpreted as the lateral inhibition of the competing word-object associations. We note that during the learning stage the agent playing the hearer role does not need to guess which object in the context is named by word h*.

An extra rule is needed to keep the entries p^J_nh within the unit interval [0, 1]: we assume that once an entry reaches the value p^J_nh = 1 or p^J_nh = 0 it becomes fixed, so the extremes of the unit interval act as absorbing barriers for the stochastic dynamics of the learning algorithm.
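A minimal sketch of the initialization and of the cross-situational update just described follows (our own code and naming; the paper specifies the rules only in prose, and the exact handling of frozen entries is our reading of the absorbing-barrier rule). Storing each row as integer counts out of M keeps the discretization exact, and the speak/interpret sketch above works unchanged on counts since only the argmax is used.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_counts(N, H, M):
    """Random initial verbalization matrix, stored as integer counts out of M:
    for each object, M balls are dropped uniformly into H word slots, and the
    row of probabilities is counts / M."""
    counts = np.zeros((N, H), dtype=int)
    for n in range(N):
        counts[n] = np.bincount(rng.integers(0, H, size=M), minlength=H)
    return counts

def cross_situational_update(hearer, context, h_star, M):
    """Unsupervised update of the hearer's matrix: for each object n in the
    context, one quantum 1/M is moved from a randomly chosen word (possibly
    h_star itself) to the heard word h_star, so row n stays normalized.
    Entries that have reached 0 or M (i.e. p = 0 or 1) are frozen."""
    H = hearer.shape[1]
    for n in context:
        h_rand = rng.integers(0, H)          # may coincide with h_star
        if h_rand == h_star:
            continue                         # the move would cancel out
        if hearer[n, h_star] in (0, M) or hearer[n, h_rand] in (0, M):
            continue                         # absorbing barriers
        hearer[n, h_star] += 1
        hearer[n, h_rand] -= 1
```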
B. Supervised learning

The setting is identical to that described before, except that now the hearer must guess which object in the context the speaker named by h* and then communicate its choice to the speaker (using some nonlinguistic means, such as pointing to the chosen object). In turn, the speaker must provide another nonlinguistic hint to indicate which object in the context it named by word h*. Let us assume that the speaker associates word h* with object n_1. If the hearer's guess happens to be the correct one, then both entries p^I_{n_1 h*} and p^J_{n_1 h*} are incremented by the amount 1/M. Furthermore, two words, say h_s and h_h, are chosen randomly and the entries p^I_{n_1 h_s} and p^J_{n_1 h_h} are decreased by 1/M so that the normalization of row n_1 is preserved in both verbalization matrices. Suppose now the hearer's guess is wrong, say, object n_2 instead of n_1. Then both entries p^I_{n_1 h*} and p^J_{n_2 h*} are decreased by the amount 1/M and, as before, two words h_s and h_h are chosen randomly and the entries p^I_{n_1 h_s} and p^J_{n_2 h_h} are increased by 1/M. As in the unsupervised case, the extremes p^{I,J}_nh = 1 and p^{I,J}_nh = 0 are absorbing barriers.

The weak point of this learning scheme is the need for nonlinguistic hints to communicate the success or failure of the communication episode. This implies that, prior to learning, the agents are already capable of communicating (and understanding) sophisticated meanings such as success and failure, and of behaving (by updating their verbalization matrices) accordingly. In fact, feedback about the outcome of the communication episode may be seen as a form of telepathic meaning transfer.
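Continuing the conventions of the previous sketch (integer counts out of M, the same rng), one possible reading of the supervised update is:

```python
def supervised_update(speaker, hearer, n_named, n_guess, h_star, M):
    """Supervised update of both matrices after the hearer guessed n_guess for a
    word h_star that the speaker used for n_named.  On success both agents
    reinforce (n_named, h_star); on failure the speaker weakens (n_named, h_star)
    and the hearer weakens (n_guess, h_star).  A randomly chosen word in the
    affected row gives or takes the quantum 1/M so the row stays normalized;
    entries at 0 or M are frozen (absorbing barriers)."""
    H = speaker.shape[1]

    def move(counts, n, delta):
        h_rand = rng.integers(0, H)          # plays the role of h_s or h_h
        if h_rand == h_star:
            return
        if counts[n, h_star] in (0, M) or counts[n, h_rand] in (0, M):
            return
        counts[n, h_star] += delta
        counts[n, h_rand] -= delta

    if n_guess == n_named:                   # successful communication episode
        move(speaker, n_named, +1)
        move(hearer, n_named, +1)
    else:                                    # failure: weaken the used associations
        move(speaker, n_named, -1)
        move(hearer, n_guess, -1)
```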
III. RESULTS

Simulation experiments with the two learning algorithms described above show, not surprisingly, that after a transient the two agents become identical, in the sense that they are described by the same verbalization matrix. In addition, in the case of unsupervised learning the stochastic dynamics always leads to binary verbalization matrices, i.e., matrices whose entries p_nh can take on the values 1 or 0 only. Of course, once the dynamics produces a binary matrix it becomes frozen. This same outcome characterizes the supervised case as well, except in the cases where the lexicon size H is of the same order as the context size C. However, as we focus on the regime where C is finite and N and H are large, we can guarantee that the stochastic dynamics leads to binary verbalization matrices regardless of the learning procedure.

Once the dynamics becomes frozen (and so the learning stage is over) we measure the average communication error ε as follows. The speaker chooses object n from the list of N objects and emits the corresponding word (there is a unique word assigned to any given object, i.e., there is a single entry 1 in any row of the verbalization matrix). The hearer must then infer which object is named by that word. Since the same word can name many objects (i.e., there may be many entries 1 in a given column), the probability φ_n that the hearer's guess is correct is simply the reciprocal of the number of objects named by that word. This probability is the communication accuracy regarding object n. The procedure is repeated for the N objects, so the average communication error is defined as ε = 1 − φ, where φ = Σ_n φ_n / N is the average communication accuracy of the algorithm.

FIG. 1: Communication error ε as a function of the ratio α = H/N between the number of words H and the number of objects N for N = 16 (▽), 24 (△) and 96 (□). The open (filled) symbols represent the data for the unsupervised (supervised) algorithm. The error bars are smaller than the symbol sizes. The solid line is the result of the extrapolation for N → ∞ (see Fig. 2), whereas the dashed line represents the optimal performance 1 − α. The parameters are C = 2 and M = 10^4.

As already pointed out, the normalization condition on the rows of the verbalization matrix P allows for the possibility that a certain number of words are not used by the lexicon acquisition algorithms. Let H_u ≤ H stand for the actual number of words used by those algorithms. Then we can easily convince ourselves that H_u = Σ_{n=1}^N φ_n, simply by noting that Σ'_n φ_n = 1 when the sum is restricted to objects that are associated with the same word. Finally, we note that in the definitions of these communication measures the context plays no role at all; indeed the context is relevant only during the learning stage.
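As an illustration, the error measurement described above reduces to a few lines once the matrices are frozen (again our own sketch, reusing the integer-count representation and NumPy conventions of the previous blocks):

```python
def communication_error(counts):
    """Average communication error of a frozen (binary) verbalization matrix.
    Each object has a unique word; the hearer's chance of guessing object n is
    1 over the number of objects sharing n's word, so error = 1 - mean(phi_n).
    Note that phi.sum() equals H_u, the number of words actually used."""
    words = counts.argmax(axis=1)                    # the single word of each object
    shared = np.bincount(words, minlength=counts.shape[1])
    phi = 1.0 / shared[words]                        # phi_n for every object n
    return 1.0 - float(phi.mean())
```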
It is important to estimate the optimal (minimum) communication error ε_m in our learning scenario since, in addition to being a lower bound on the communication error produced by the learning algorithms, it allows us to rate their absolute performances. For H ≤ N the optimal communication error is obtained by making a one-to-one assignment between H − 1 words and H − 1 objects, and then assigning the single remaining word to the remaining N − H + 1 objects. This procedure yields ε_m = 1 − H/N = 1 − α. For H > N we can obtain ε_m = 0 simply by discarding H − N words and making a one-to-one word-object assignment with the other N words. In fact, using our finding that φ = H_u/N we see that, as expected, the optimal performance is obtained by setting H_u = H if H ≤ N and H_u = N if H > N.

Figure 1 shows the comparison between the optimal performance and the actual performances of the two learning algorithms as a function of the ratio α. In this, as well as in the other figures of this paper, each symbol stands for the average over 10^4 independent samples or language games. The performance of the supervised algorithm deteriorates as the number of objects N increases, in contrast to that of the unsupervised algorithm, which actually shows a slight improvement in this case. For N → ∞, both algorithms produce the same communication error (see Fig. 2), which is shown by the solid line in Fig. 1. We note that a preliminary comparative analysis of these algorithms for N = 8 led to an incorrect claim about the general superiority of the supervised learning scheme (Fontanari & Perlovsky, 2006). For small values of α the performances of the two learning algorithms are practically indistinguishable from the optimal performance, but as we will argue below the algorithms actually never achieve that performance, except for α = 0.

FIG. 2: Dependence of the communication error ε on the reciprocal of the number of objects 1/N for α = 0.5 for the unsupervised (◦) and supervised (•) learning algorithms. The error bars are smaller than the symbol sizes. The linear fittings (solid straight lines) yield ε = 0.5690 ± 0.0003 (unsupervised) and ε = 0.5677 ± 0.0004 (supervised) for N → ∞. The Monte Carlo estimate of the error for the random assignment of objects to words is given by the symbols × and the dashed horizontal line corresponds to the estimate of Eq. (3), ε_r = 0.5677. The parameters are C = 2 and M = 3 × 10^4.

It is instructive to calculate the communication error in the case where the N objects are assigned randomly to the H words. This is a classical occupancy problem discussed at length in the celebrated book by Feller (1968). In this occupancy problem, the probability P_m that the number of words not used in the assignment of the N objects to the H words equals m (i.e., m = H − H_u) is

  P_m = \binom{H}{m} \sum_{ν=0}^{H−m} \binom{H−m}{ν} (−1)^ν \left(1 − \frac{m+ν}{H}\right)^N ,   (1)

which in the limits N → ∞ and H → ∞ reduces to the Poisson distribution

  p(m; λ) = e^{−λ} λ^m / m! ,   (2)

where λ = H exp(−N/H) remains bounded (Feller, 1968). Hence the average communication accuracy resulting from the random assignment of objects to words is simply (H − ⟨m⟩)/N, which yields the communication error

  ε_r = 1 − α + α e^{−1/α} .   (3)

Surprisingly, this equation describes perfectly the communication error of the two learning algorithms in the limit N → ∞ (solid line in Fig. 1). We note that the (small) discrepancy observed in Fig. 2 between the extrapolated data of the unsupervised algorithm and the analytical prediction can be reduced to zero by decreasing the learning rate 1/M. Equation (3) also explains why the performances of the algorithms are practically indistinguishable from the optimal performance for small α, since the difference between them vanishes as exp(−1/α). In addition, Eq. (3) shows that in the limit of large α the communication error vanishes as 1/α.
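Equation (3) is easy to check numerically: assigning each object a word uniformly at random and measuring the error as above reproduces ε_r. A small Monte Carlo sketch (ours, with hypothetical parameter values, reusing rng and NumPy from the previous blocks) would be:

```python
import math

def random_assignment_error(N, H, samples=1000):
    """Monte Carlo estimate of the communication error when each of the N objects
    is assigned one of the H words uniformly at random (classical occupancy)."""
    errors = []
    for _ in range(samples):
        words = rng.integers(0, H, size=N)
        shared = np.bincount(words, minlength=H)
        errors.append(1.0 - float((1.0 / shared[words]).mean()))
    return float(np.mean(errors))

# For alpha = H/N = 0.5, Eq. (3) gives eps_r = 1 - 0.5 + 0.5 * math.exp(-2) ~ 0.5677,
# which random_assignment_error(2000, 1000) approaches for large N.
```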
A word is in order about the effect of the context size C on the performance of the two learning algorithms, since Figs. 1 and 2 exhibit the results for C = 2 only. Simulations for larger values of C show that this parameter is completely irrelevant for the performance of the supervised algorithm. Of course, this is expected since, regardless of the context size, at most two rows (object labels) of the verbalization matrices are updated. But the situation is far from obvious for the unsupervised algorithm, since C determines the number of rows to be updated in each round of the game. However, the results summarized in Fig. 3 for C = 4 indicate that, despite strong finite-size effects, particularly for small α, the communication error ultimately tends to ε_r in the limit of large N.

FIG. 3: Communication error ε of the unsupervised lexicon acquisition algorithm for context size C = 4 and N = 24 (▽), 36 (△), 48 (×), and 96 (□). The error bars are smaller than the symbol sizes. The learning rate is 1/M = 10^{-4} and the solid line is the result of Eq. (3).

IV. CONCLUSION

In this paper we have unveiled two remarkable results. First, the supervised and unsupervised schemes for bootstrapping a lexicon yield the same communication accuracy in the limit of very large lexicon sizes. For finite lexicon sizes the supervised scheme always outperforms the unsupervised one, but its performance degrades as the lexicon size increases, whereas the performance of the unsupervised learning algorithm improves slightly with increasing lexicon size (see Fig. 1). Second, those performances tend to the communication accuracy obtained in a random occupancy problem in which the N objects are assigned randomly to the H words. These findings reveal a surprising inefficiency of traditional lexicon bootstrapping scenarios when evaluated in the realistic regime of very large lexicon sizes. It would be most interesting to devise sensible scenarios that reproduce the optimal communication performance or, at least, that exhibit a communication error that decays faster than the random occupancy result, 1/α = N/H, in the case where the number of available words is much greater than the number of objects (H ≫ N).

The scenarios studied here are easily adapted to model the problem of lexicon acquisition (rather than bootstrapping): we have only to assume that one of the agents, named the master in this case, knows the correct lexicon and so its verbalization matrix is kept fixed during the entire learning procedure; the verbalization matrix of the other agent (the pupil) is allowed to change following the update algorithms described before (see, e.g., Fontanari, Tikhanoff, Cangelosi, Ilin, & Perlovsky, 2009). Most interestingly, in this context, statistical word learning has been observed in controlled experiments involving infants (Smith & Yu, 2008) and adults (Yu & Smith, 2007). Similar experiments, but now aiming at bootstrapping a lexicon, could easily be carried out by replacing our virtual agents by two adults, who would then resort to some conscious or unconscious mechanism to track the co-occurrence of words and objects. Of course, the very emergence of pidgin, a means of communication between two or more groups which lack a common language (Thomason & Kaufman, 1988), can be seen as a realization of such an experiment and serves as additional justification for the study of lexicon bootstrapping.

Acknowledgments

The research at São Carlos was supported in part by CNPq, FAPESP and SOARD grant FA9550-10-1-0006. J.F.F. thanks the Adaptive Behaviour & Cognition Research Group, University of Plymouth, where this research was initiated, for its hospitality.
The visit was supported by euCognition.org travel grant NA-097-6. Cangelosi also acknowledges the contribution of the ITALK project from the European Commission (FP7 ICT Cognitive Systems and Robotics).

References

Baronchelli, A., Felici, M., Loreto, V., Caglioli, E., & Steels, L. (2006). Sharp transition towards shared vocabularies in multi-agent systems. Journal of Statistical Mechanics, P06014.

Bates, E., & Elman, J. (1996). Learning rediscovered. Science, 274, 1849-1850.

Bloom, P. (2000). How children learn the meaning of words. Cambridge, MA: MIT Press.

Brighton, H., Smith, K., & Kirby, S. (2005). Language as an evolutionary system. Physics of Life Reviews, 2, 177-226.

De Beule, J., De Vylder, B., & Belpaeme, T. (2006). A cross-situational learning algorithm for damping homonymy in the guessing game. In L. M. Rocha, M. Bedau, D. Floreano, R. Goldstone, A. Vespignani, & L. Yaeger (Eds.), Proceedings of the Xth Conference on Artificial Life (pp. 466-472). Cambridge, MA: MIT Press.

Cangelosi, A. (2001). Evolution of communication and language using signals, symbols and words. IEEE Transactions on Evolutionary Computation, 5, 93-101.

Dawkins, R., & Krebs, J. R. (1978). Animal signals: information or manipulation? In J. R. Krebs & N. B. Davies (Eds.), Behavioural ecology: an evolutionary approach (pp. 282-309). Oxford, UK: Blackwell Scientific Publications.

Donald, M. (1991). Origins of the Modern Mind. Cambridge, MA: Harvard University Press.

Ewens, W. J. (2004). Mathematical Population Genetics. New York: Springer-Verlag.

Feller, W. (1968). An Introduction to Probability Theory and Its Applications, Vol. I, 3rd Edition. New York: Wiley.

Fontanari, J. F. (2006). Statistical analysis of discrimination games. European Physical Journal B, 54, 127-130.

Fontanari, J. F., & Perlovsky, L. I. (2006). Meaning creation and communication in a community of agents. In Proceedings of the 2006 International Joint Conference on Neural Networks (pp. 2892-2897). Piscataway, NJ: IEEE Press.

Fontanari, J. F., & Perlovsky, L. I. (2007). Evolving compositionality in evolutionary language games. IEEE Transactions on Evolutionary Computation, 11, 758-769.

Fontanari, J. F., & Perlovsky, L. I. (2008). A game theoretical approach to the evolution of structured communication codes. Theory in Biosciences, 127, 205-214.

Fontanari, J. F., Tikhanoff, V., Cangelosi, A., Ilin, R., & Perlovsky, L. I. (2009). Cross-situational learning of object-word mapping using Neural Modeling Fields. Neural Networks, 22, 579-585.

Gleitman, L. (1990). The structural sources of verb meanings. Language Acquisition, 1, 1-55.

Hurford, J. R. (1989). Biological evolution of the Saussurean sign as a component of the language acquisition device. Lingua, 77, 187-222.

Ke, J., Minett, J. W., Au, C.-P., & Wang, W. S.-Y. (2002). Self-organization and selection in the emergence of vocabulary. Complexity, 7, 41-54.

Kirby, S. (2002). Natural language from artificial life. Artificial Life, 8, 185-215.

Lenaerts, T., Jansen, B., Tuyls, K., & De Vylder, B. (2005). The evolutionary language game: An orthogonal approach. Journal of Theoretical Biology, 235, 566-582.

Nowak, M. A., & Krakauer, D. C. (1999). The evolution of language. Proceedings of the National Academy of Sciences USA, 96, 8028-8033.
Oliphant, M., & Batali, J. (1997). Learning and the emergence of coordinated communication. Center for Research on Language Newsletter, 11.

Pinker, S. (1984). Language learnability and language development. Cambridge, MA: Harvard University Press.

Pinker, S., & Bloom, P. (1990). Natural language and natural selection. Behavioral and Brain Sciences, 13, 707-784.

Rosenthal, T., & Zimmerman, B. (1978). Social Learning and Cognition. New York: Academic Press.

Siskind, J. M. (1996). A computational study of cross-situational techniques for learning word-to-meaning mappings. Cognition, 61, 39-91.

Smith, A. D. M. (2003a). Semantic generalization and the inference of meaning. Lecture Notes in Artificial Intelligence, 2801, 499-506.

Smith, A. D. M. (2003b). Intelligent meaning creation in a clumpy world helps communication. Artificial Life, 9, 557-574.

Smith, K., Kirby, S., & Brighton, H. (2003). Iterated learning: a framework for the emergence of language. Artificial Life, 9, 371-386.

Smith, K., Smith, A. D. M., Blythe, R. A., & Vogt, P. (2006). Cross-situational learning: a mathematical approach. Lecture Notes in Computer Science, 4211, 31-44.

Smith, L. B., & Yu, C. (2008). Infants rapidly learn word-referent mappings via cross-situational statistics. Cognition, 106, 1558-1568.

Steels, L. (1996). Perceptually grounded meaning creation. In M. Tokoro (Ed.), Proceedings of the Second International Conference on Multi-Agent Systems (pp. 338-344). Menlo Park, CA: AAAI Press.

Steels, L., & Kaplan, F. (1999). Situated grounded word semantics. In Proceedings of the Sixteenth International Joint Conference on Artificial Intelligence (pp. 862-867). San Francisco, CA: Morgan Kaufmann.

Steels, L. (2002). Grounding symbols through evolutionary language games. In A. Cangelosi & D. Parisi (Eds.), Simulating the Evolution of Language (pp. 211-226). London: Springer-Verlag.

Steels, L. (2003). Evolving grounded communication for robots. Trends in Cognitive Sciences, 7, 308-312.

Thomason, S. G., & Kaufman, T. (1988). Language contact, creolization, and genetic linguistics. Berkeley: University of California Press.

Yu, C., & Smith, L. B. (2007). Rapid word learning under uncertainty via cross-situational statistics. Psychological Science, 18, 414-420.
