Adapting Heuristic Mastermind Strategies to Evolutionary Algorithms
Thomas Philip Runarsson (University of Iceland, tpr@hi.is)
Juan J. Merelo-Guervós (Department of Architecture and Computer Technology, ETSIIT, University of Granada, Spain, jmerelo@geneura.ugr.es)

August 14, 2018

Abstract

The art of solving the Mastermind puzzle was initiated by Donald Knuth and is already more than 30 years old; despite that, it still receives much attention in operational research and computer games journals, not to mention the nature-inspired stochastic algorithm literature. In this paper we try to suggest a strategy that will allow nature-inspired algorithms to obtain results as good as those based on exhaustive search strategies; in order to do that, we first review, compare and improve current approaches to solving the puzzle; then we test one of these strategies with an estimation of distribution algorithm. Finally, we try to find a strategy that falls short of being exhaustive, and is thus amenable for inclusion in nature-inspired algorithms (such as evolutionary or particle swarm algorithms). This paper proves that by incorporating local entropy into the fitness function of the evolutionary algorithm it becomes a better player than a random one, and gives a rule of thumb on how to incorporate the best heuristic strategies into evolutionary algorithms without incurring an excessive computational cost.

Keywords: games, Mastermind, bulls and cows, search strategies, oracle games

1 Introduction

Mastermind in its current version is a board game that was introduced by the telecommunications expert Mordecai Meirowitz [15] and sold to the company Invicta Plastics, who renamed it to its actual name; in fact, Mastermind is a version of a traditional puzzle called bulls and cows that dates back to the Middle Ages. In any case, Mastermind is a puzzle (rather than a game) in which two persons, the codemaker and the codebreaker, try to outsmart each other in the following way:

• The codemaker sets a length-ℓ combination of κ symbols. In the classical version, ℓ = 4 and κ = 6, and colour pegs are used as symbols over a board with rows of ℓ = 4 holes; however, in this paper we will use uppercase letters starting with A instead of colours.

• The codebreaker then tries to guess this secret code by producing a combination.

• The codemaker gives a response consisting of the number of symbols guessed in the right position (usually represented as black pegs) and the number of symbols in an incorrect position (usually represented as white pegs).

• The codebreaker then, using that information as a hint, produces a new combination, until the secret code is found.

For instance, a game could go like this: the codemaker sets the secret code ABBC. The rest of the game is shown in Table 1.

Combination   Response
AABB          2 black, 1 white
ACDE          1 black, 1 white
FFDA          1 white
ABBE          3 black
ABBC          4 black

Table 1: Progress in a Mastermind game that tries to guess the secret combination ABBC. The player here is not particularly clueful, playing a third combination that is not consistent with the first one, not coinciding in two positions and one colour (corresponding to the 2 black / 1 white response given by the codemaker) with it.

Different variations of the game include giving information on which position has been guessed correctly, avoiding repeated symbols in the secret combination (bulls and cows is actually this way), or allowing the codemaker to change the code during the game (but only if this does not make the responses given so far false).
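The codemaker's response can be computed mechanically: count the exact position matches (black pegs), then the remaining colour-only matches (white pegs). The following minimal Python sketch (the function name is ours, not from the paper) reproduces the responses of Table 1:

```python
from collections import Counter

def response(guess, secret):
    """Return (black, white): exact matches and colour-only matches."""
    black = sum(g == s for g, s in zip(guess, secret))
    gc, sc = Counter(guess), Counter(secret)
    # Total colour matches regardless of position, minus the exact matches.
    white = sum(min(gc[c], sc[c]) for c in gc) - black
    return black, white

# The game of Table 1, against the secret code ABBC:
for guess in ["AABB", "ACDE", "FFDA", "ABBE", "ABBC"]:
    print(guess, response(guess, "ABBC"))
# AABB → (2, 1), ACDE → (1, 1), FFDA → (0, 1), ABBE → (3, 0), ABBC → (4, 0)
```

Note the subtraction trick: counting per-colour multiset overlap first and subtracting the blacks avoids double-counting pegs that match in both colour and position.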
In any case, the codebreaker is allowed to make a maximum number of combinations (usually fifteen, or more for larger values of κ and ℓ), and the score corresponds to the number of combinations needed to find the secret code; after repeating the game a number of times with codemaker and codebreaker changing sides, the one with the lower score wins.

Since Mastermind is asymmetric, in the sense that the position of one of the players after setting the secret code is almost completely passive, limited to giving hints in response to the guesses of the codebreaker, it is rather a puzzle than a game: the codebreaker is not really matching his skills against the codemaker, but facing a problem that must be solved with the help of hints. The implication is that playing Mastermind is more similar to solving a Sudoku than to playing a game of chess; thus, the solution to Mastermind, except in very particular situations (always playing against an opponent who has a particular bias for choosing codes, or maybe playing the dynamic-code version), is a search problem with constraints.

What makes this problem interesting is its relation to other, so-called oracle problems such as circuit and program testing, differential cryptanalysis and other puzzle games (these similarities were reviewed in our previous paper [10]), the fact that it has been proved to be NP-complete [14, 5], and the several open issues that remain, namely: what is the lowest average number of guesses that can be achieved, how to minimize the number of evaluations needed to find them (and thus the run-time of the algorithm), and, obviously, how it scales when increasing κ and ℓ. This paper will concentrate on the first issue.
This NP-completeness implies that it is difficult to find algorithms that solve the problem in a reasonable amount of time, and that is why in our previous work [10, 9, 2] we introduced stochastic evolutionary and simulated annealing algorithms that solved the Mastermind puzzle in the general case, finding solutions in a reasonable amount of time that scaled roughly logarithmically with problem size. The strategy followed to play the game was optimal in the sense that it was guaranteed to find a solution after a finite number of combinations; however, there was no additional selection on the combination played other than the fact that it was consistent with the responses given so far.

In this paper, after reviewing how the state of the art in solving this puzzle has evolved in the last few years, we examine how we could improve the code-breaking skills of an evolutionary algorithm by using different techniques, and how these techniques can be further optimized. In order to do that we examine different ways of scoring combinations in the search space, how to choose one combination out of a set of combinations that have exactly the same score, and how all that can be applied to a simple estimation of distribution algorithm to improve results over a standard one. This paper presents for the first time an evolutionary algorithm that biases search so that combinations played have a better chance of reducing the size of the remaining search space, and adapts to a stochastic environment deterministic techniques that had been previously published; all techniques, unlike in our former papers, have been tested over the whole code space, instead of a random sample, so that they can be compared and yield significant results.
The rest of the paper is organized as follows: next we establish terminology and examine the state of the art; then heuristic strategies for Mastermind are examined in Section 3; the way they could be adapted to an evolutionary algorithm is presented in Section 5, and finally, conclusions are drawn in the closing Section 6.

2 State of the art

Before presenting the state of the art, a few definitions are needed. We will use the term response for the return code of the codemaker to a played combination, c_played. A response is therefore a function of the combination c_played and the secret combination c_secret; let the response be denoted by h(c_played, c_secret). A combination c is consistent with c_played iff

h(c_played, c_secret) = h(c_played, c)    (1)

that is, if the combination has as many black and white pins with respect to the played combination as the played combination has with respect to the secret combination. Furthermore, a combination c is consistent iff

h(c_i, c) = h(c_i, c_secret)  for i = 1..n    (2)

where n is the number of combinations c_i played so far; that is, c is consistent with all guesses made so far. A combination that is consistent is a candidate solution. The concept of consistent combination will be important for characterizing different approaches to the game of Mastermind.

One of the earliest strategies, by Knuth [6], is perhaps the most intuitive for Mastermind. In this strategy the player selects the guess that reduces the number of remaining consistent guesses, and the opponent the return code leading to the maximum number of guesses. Using a complete minimax search, Knuth shows that a maximum of 5 guesses is needed to solve the game using this strategy. This type of strategy is still the most widely used today: most algorithms for Mastermind start by searching for a consistent combination to play.
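Equations (1) and (2) translate directly into a filter over the code space. A minimal sketch (helper names are ours): given the history of played combinations and their recorded responses, the candidate solutions are exactly the codes that would have produced the same responses.

```python
from collections import Counter
from itertools import product

def response(guess, secret):
    """h(guess, secret): (black, white) peg counts."""
    black = sum(g == s for g, s in zip(guess, secret))
    gc, sc = Counter(guess), Counter(secret)
    return black, sum(min(gc[c], sc[c]) for c in gc) - black

def consistent(c, history):
    """Eq. (2): c reproduces the recorded response of every played combination."""
    return all(response(ci, c) == resp for ci, resp in history)

# All 6^4 = 1296 codes; after playing AABB against secret ABBC (response 2b, 1w),
# the candidate solutions are the codes giving that same response to AABB.
codes = ["".join(p) for p in product("ABCDEF", repeat=4)]
history = [("AABB", response("AABB", "ABBC"))]
candidates = [c for c in codes if consistent(c, history)]
print(len(candidates))  # size of the remaining consistent set
```

The secret itself always survives this filter, which is why playing only consistent combinations is guaranteed to terminate.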
In some cases, once a single consistent guess is found it is immediately played, in which case the object is to find a consistent guess as fast as possible. For example, in [10] an evolutionary algorithm is described for this purpose. These strategies are fast and do not need to examine a big part of the space. Playing consistent combinations eventually produces a number of guesses that uniquely determines the code. However, the maximum and average number of combinations needed is usually high. Hence, some bias must be introduced in the way combinations are searched; if not, the guesses will be no better than a purely random approach, as the solutions found (and played) are a random sample of the space of consistent guesses.

The alternative to discovering a single consistent guess is to collect a set of consistent guesses and select among them the best alternative. For this, a number of heuristics have been developed over the years. Typically these heuristics require all consistent guesses to be found first. The algorithms then use some kind of search over the space of consistent combinations, so that only the guess that extracts the most information from the secret code is issued, or else the one that reduces as much as possible the set of remaining consistent combinations. However, this is obviously not known in advance. To each combination corresponds a partition of the rest of the space, according to their match (the number of black and white pegs that would be the response when matched with each other). Let us consider the first combination: if the combination considered is AABB, there will be 256 combinations whose response will be 0b, 0w (those with other colours), 256 with 0b, 1w (those with either an A or a B), etc. Some partitions may also be empty, or contain a single element (4b, 0w will contain just AABB, obviously).
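The partition induced by a combination can be tabulated by grouping the whole code space by response. A sketch (names ours) that confirms the counts quoted above for AABB, the 256-element 0b/0w class and the singleton 4b/0w class:

```python
from collections import Counter
from itertools import product

def response(guess, secret):
    black = sum(g == s for g, s in zip(guess, secret))
    gc, sc = Counter(guess), Counter(secret)
    return black, sum(min(gc[c], sc[c]) for c in gc) - black

def partition(guess, candidates):
    """Map each possible response to the number of candidates producing it."""
    return Counter(response(guess, c) for c in candidates)

codes = ["".join(p) for p in product("ABCDEF", repeat=4)]
parts = partition("AABB", codes)
print(parts[(0, 0)])  # codes using neither A nor B: 4^4 = 256
print(parts[(4, 0)])  # only AABB itself: 1
```

Since the classes are disjoint and exhaustive, the sizes always sum to the size of the candidate set, 1296 here.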
For a more exhaustive explanation see [7]. Each combination is thus characterized by the features of these partitions: the number of non-empty ones, the average number of combinations in them, the maximum, and other characteristics one may think of. The path leading to the most successful strategies to date includes using the worst case, expected case, entropy [13, 3] and most parts [7] strategies.

The entropy strategy selects the guess with the highest entropy. The entropy is computed as follows: for each possible response i for a particular consistent guess, the number of remaining consistent guesses is found. The ratio of reduction in the number of guesses is also the a priori probability, p_i, of the secret code being in the corresponding partition. The entropy is then computed as ∑_{i=1}^{n} p_i log2(1/p_i), where log2(1/p_i) is the information in bits per partition, and can be used to select the next combination to play in Mastermind [13].

The worst case is a one-ply version of Knuth's approach, but Irving [4] suggested using the expected case rather than the worst case. Kooi [7] noted, however, that the size of the partitions is irrelevant and that rather the number of non-empty partitions created, n, was important. This strategy is called most parts.

The strategies above require one-ply look-ahead and either determining the size of the resulting partitions and/or the number of them. Computing the number of them is, however, faster than determining their size. For this reason the most parts strategy has a computational advantage.

The heuristic strategies described above use some form of look-ahead, which is computationally expensive. If no look-ahead is used to guide the search, a guess is selected purely at random. However, it may be possible to discriminate by using local information.
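As the preceding paragraphs show, all four one-ply heuristics are different statistics of the same partition. A hedged sketch (function names ours; the worst-case and expected-size values are negated so that "higher is better" holds uniformly across the four scores):

```python
import math
from collections import Counter
from itertools import product

def response(guess, secret):
    black = sum(g == s for g, s in zip(guess, secret))
    gc, sc = Counter(guess), Counter(secret)
    return black, sum(min(gc[c], sc[c]) for c in gc) - black

def scores(guess, candidates):
    """Entropy, most-parts, worst-case and expected-size statistics of the
    partition that `guess` induces on `candidates`."""
    sizes = Counter(response(guess, c) for c in candidates).values()
    total = sum(sizes)
    probs = [n / total for n in sizes]
    return {
        "entropy": sum(p * math.log2(1 / p) for p in probs),
        "most_parts": len(sizes),                 # number of non-empty partitions
        "worst_case": -max(sizes),                # smaller worst partition is better
        "expected": -sum(n * n for n in sizes) / total,  # expected partition size
    }

codes = ["".join(p) for p in product("ABCDEF", repeat=4)]
print(scores("AABB", codes))
```

The computational remark in the text is visible here: most_parts only needs the set of distinct responses, while the other three need the partition sizes.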
If this were possible one could even dismiss searching for all consistent guesses and search for a single consistent guess with the bias. In Section 3 these heuristic strategies are compared. In Section 4 an EDA using only local information is compared with those that need to examine all consistent guesses in order to select the best one.

3 Comparison of heuristic strategies

As has been mentioned before, there have been a number of different strategies proposed over the years for selecting among consistent guesses in Mastermind. These heuristics do not consider an exhaustive minimax search, but rather a one-ply search. What is, however, not clear in these research papers is how ties are broken, which probably implies that a first-come, first-served approach is taken, using the first combination in lexicographical order out of all tied combinations. For this reason we propose to perform a comparison of the heuristic methods where ties are broken randomly. Each strategy is, therefore, used on all possible secret combinations (there are 6^4 = 1296) using ten independent runs.

The heuristics compared are the entropy, most parts and worst case strategies, as performed by Bestavros and Belal [3]. The worst case refers to the fact that for each possible return code for a particular guess, the smallest reduction is assumed, i.e. the worst case. The actual consistent guess chosen is the one which maximizes the worst case. Finally, the expected size strategy [4] is also tested; in this strategy the expected case is used instead of the worst case. These strategies are compared with the random strategy. The results of the experiments are given in Table 2. The first combination played is always AABC, as proposed by [4].
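The evaluation protocol just described (play a fixed first guess, then repeatedly choose among the remaining consistent guesses until the secret is found) can be sketched as follows; `choose` stands in for any tie-breaking heuristic, here plain random choice, i.e. the random strategy:

```python
import random
from collections import Counter
from itertools import product

def response(guess, secret):
    black = sum(g == s for g, s in zip(guess, secret))
    gc, sc = Counter(guess), Counter(secret)
    return black, sum(min(gc[c], sc[c]) for c in gc) - black

def play(secret, choose=random.choice, first="AABC"):
    """Return the number of guesses needed to find `secret`."""
    candidates = ["".join(p) for p in product("ABCDEF", repeat=4)]
    guess, turns = first, 0
    while True:
        turns += 1
        resp = response(guess, secret)
        if resp == (4, 0):
            return turns
        # Keep only the combinations consistent with every response so far.
        candidates = [c for c in candidates if response(guess, c) == resp]
        guess = choose(candidates)

random.seed(1)
games = [play(s) for s in ("ABBC", "FFFF", "ABCD")]
print(games)
```

A full experiment in the paper's sense would loop `play` over all 1296 secrets and repeat it ten times, averaging the returned guess counts.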
The Wilcoxon rank sum test (used instead of the t-test since the variable does not follow a normal distribution) with a 0.05 significance level is used to determine which results are statistically different from one another. The horizontal lines are used to group together heuristics that are not statistically different from each other. From these results we can gather that there is no statistical difference between the entropy and most parts strategies. However, out of all games played, the maximum number of guesses needed by the entropy strategy was only 6, while for most parts it was 7. These strategies are also better than the worst case and expected size, which are statistically equivalent. The worst case strategy needed, nevertheless, only a maximum of 6 guesses, unlike the expected case with 7. The worst performer is the random strategy, which also required a maximum of 8 guesses. Finally, note that the optimal expected result on playing all secrets is 4.340 [8].

Strategy        min    mean   median  max    st.dev.  max guesses
Entropy         4.383  4.408  4.408   4.424  0.012    6
Most parts      4.383  4.410  4.412   4.430  0.013    7
Expected size   4.447  4.470  4.468   4.490  0.015    7
Worst case      4.461  4.479  4.473   4.506  0.016    6
Random          4.566  4.608  4.608   4.646  0.026    8

Table 2: A comparison of the mean number of games played using all 6^4 colour combinations and breaking ties randomly, ranked from best to worst average number of guesses needed. Statistics are given for 10 independent experiments. The maximum number of moves used for the 10 × 6^4 games is also presented in the final column. Horizontal separators are given for statistically independent results.

4 Estimation of distribution algorithm using local entropy

The common approach to using evolutionary algorithms for Mastermind is simply to search for a single consistent guess, which is then immediately played.
This is especially true for the generalized version of the game, for κ > 6 and ℓ > 4, where the task of just finding a consistent guess can be difficult. The result of such an approach is likely to do as well as the random strategy discussed in the previous section. For steady-state evolutionary algorithms it may, however, be the case that consecutive consistent guesses found are similar to others played before; that is, the strategy of play may not necessarily be purely random. In any case it is highly likely that evolutionary algorithms of this type will not do better than the random strategy, as seen above, since the consistent combinations found are a random sample of the set of consistent combinations.

In this section we investigate the performance of strategies that find a single consistent guess and play it immediately. In this case we use an estimation of distribution algorithm [12] (EDA) included with the Algorithm::Evolutionary Perl module [11], with the whole EDA-solving algorithm available as Algorithm::MasterMind::EDA from CPAN (the Comprehensive Perl Archive Network). This is a standard EDA that uses a population of 200 individuals and a replacement rate of 0.5; each generation, half the population is generated from the previously estimated distribution. The first combination played was AABB, since it was not found to be significantly different from using AABC, as before. The fitness function used previously [10] to find consistent guesses is as follows:

f(c_guess) = ∑_{i=1}^{n} |h(c_i, c_guess) − h(c_i, c_secret)|

that is, the sum of the absolute differences in the number of white and black pegs needed to make the guess consistent. However, this approach is likely to perform as well as the random strategy discussed in the previous section. When finding a single consistent guess we cannot apply the heuristic strategies from the previous section. For this reason we introduce now a local entropy measure, which can be applied to non-consistent guesses and so bias our search. The local entropy assumes that the fact that some combinations are better than others depends on their informational content, and that in turn depends on the entropy of the combination along with the rest of the combinations played so far. To compute the local entropy, the combination is concatenated with the n combinations played so far and its Shannon entropy computed:

s(c_guess) = ∑_{g ∈ {A,...,F}} (#g / ((n+1)ℓ)) log(((n+1)ℓ) / #g)    (3)

with g being a symbol in the alphabet and #g denoting the number of occurrences of that symbol in the concatenation. Thus, the fitness function which includes the local entropy is defined as:

f_ℓ(c_guess) = s(c_guess) / (1 + f(c_guess))

In this way a bias is introduced into the fitness so as to select the guess with the highest local entropy. When a consistent combination is found, the combination with the highest entropy found in the generation is played (which might be the only one or one among several; however, no special provision is made to generate several). The results of ten independent runs of the EDA over the whole search space are now compared with the results of the previous section. These results may be seen in Table 3. Two EDA experiments are shown, one using the fitness function designed to find a consistent guess only (f) and one using local entropy (f_ℓ). The EDA using local entropy is statistically better than playing purely at random, whereas the other EDA is not.

In order to confirm the usefulness of the local entropy, an additional experiment was performed. This time, as in the previous sections, all consistent guesses are found and the one with the highest local entropy is played. This result is labelled LocalEntropy in Table 3. The results are not statistically different from the EDA results using fitness function f_ℓ.
As a local conclusion, the entropy method seemed to perform the best on average, but the estimation of distribution algorithm is not statistically different from (admittedly naive) exhaustive search strategies such as LocalEntropy, and performs significantly better than the random algorithm on average. We should remark that the objective of this paper is not to show which strategy is the best runtime-wise, or which one offers the best algorithmic performance/runtime trade-off; but in any case we should note that the algorithm with the least number of evaluations and lowest runtime is the EDA. However, its average performance as a player is not as good as the rest, so some improvement might be obtained by creating a set of possible solutions. It remains to be seen how many solutions would be needed, but that will be investigated in the next section.

Strategy        min    mean   median  max    st.dev.  max guesses
Entropy         4.383  4.408  4.408   4.424  0.012    6
Most parts      4.383  4.410  4.412   4.430  0.013    7
Expected size   4.447  4.470  4.468   4.490  0.015    7
Worst case      4.461  4.479  4.473   4.506  0.016    6
LocalEntropy    4.529  4.569  4.568   4.613  0.021    7
EDA+f_ℓ         4.524  4.571  4.580   4.600  0.026    7
EDA+f           4.562  4.616  4.619   4.665  0.032    7
Random          4.566  4.608  4.608   4.646  0.026    8

Table 3: A comparison of the mean number of games played using all 6^4 colour combinations and breaking ties randomly, ranked from best to worst mean number of combinations. Statistics are given for 10 independent experiments. The maximum number of moves used for the 10 × 6^4 games is also presented in the final column. Horizontal separators are given for statistically independent results.

5 Heuristics based on a subset of consistent guesses

Following a tip in one of our former papers, recently Berghman et al.
[1] proposed an evolutionary algorithm which finds a number of consistent guesses and then uses a strategy to select which one of these should be played. The strategy they apply is not unlike the expected size strategy; however, it differs in some fundamental ways. In their approach each consistent guess is assumed to be the secret in turn, and each guess is played against every different secret. The return codes are then used to compute the size of the set of remaining consistent guesses in the set. An average is then taken over the sizes of these sets. Here, the key difference from the expected size method is that only a subset of all possible consistent guesses is used, and some return codes may not be considered, or considered more frequently than once, which might lead to a bias in the result. Indeed, they remark that their approach is computationally intensive, which leads them to reduce the size of this subset further. Note that Berghman et al. only present the result of a single evolutionary run, and so their results cannot be compared with those here.

Their approach is, however, interesting, and leads us to consider the case where an evolutionary algorithm has been designed to find a maximum of µ consistent guesses within some finite time. It will be assumed that this subset is sampled uniformly at random from all possible consistent guesses. The question is: how do the heuristic strategies discussed above work on a randomly sampled subset of consistent guesses? The experiments performed in the previous sections are now repeated, but this time only using the four best one-ply look-ahead heuristic strategies on a random subset of guesses, bounded by size µ. If there are many guesses that give the same number of partitions or similar entropy, then perhaps taking a random subset would be a good representation for all guesses.
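The µ-subset idea can be sketched as follows: draw at most µ consistent guesses uniformly at random and apply a one-ply heuristic (entropy here) to that sample only, scoring each sampled guess against the sample itself. This is our own minimal reading of the scheme, not Berghman et al.'s exact procedure, which differs in details:

```python
import math
import random
from collections import Counter
from itertools import product

def response(guess, secret):
    black = sum(g == s for g, s in zip(guess, secret))
    gc, sc = Counter(guess), Counter(secret)
    return black, sum(min(gc[c], sc[c]) for c in gc) - black

def entropy_score(guess, candidates):
    sizes = Counter(response(guess, c) for c in candidates).values()
    total = sum(sizes)
    return sum((n / total) * math.log2(total / n) for n in sizes)

def best_of_subset(consistent, mu=20, score=entropy_score, rng=random):
    """Sample at most mu consistent guesses and return the best-scoring one."""
    subset = rng.sample(consistent, min(mu, len(consistent)))
    return max(subset, key=lambda g: score(g, subset))

random.seed(1)
codes = ["".join(p) for p in product("ABCDEF", repeat=4)]
# A hypothetical game state: all codes answering (2b, 1w) to a first guess AABC.
consistent = [c for c in codes if response("AABC", c) == (2, 1)]
print(best_of_subset(consistent))
```

Scoring O(µ) guesses against O(µ) candidates costs O(µ²) response evaluations per move, instead of the O(N²) of the full one-ply heuristics, which is the computational point of the whole section.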
This has implications not only with respect to the application of EAs but also to the common strategies discussed here.

The sizes of the subsets are fixed at 10, 20, 30, 40, and 50, in order to investigate the influence of the subset size. The results for these experiments and their statistics are presented in Table 4. The results are, as expected, better as the subset size µ gets bigger. Noticeable is the fact that the entropy and most parts strategies perform the best, as before; however, at µ = 40 and 50 the entropy strategy is better.

Strategy        min    mean   median  max    st.dev.  max guesses
µ = 10
Most parts      4.429  4.454  4.454   4.477  0.016    7
Entropy         4.438  4.468  4.476   4.483  0.016    7
Expected size   4.450  4.472  4.474   4.493  0.014    7
Worst case      4.447  4.486  4.487   4.519  0.020    7
µ = 20
Entropy         4.394  4.423  4.426   4.455  0.021    7
Most parts      4.424  4.431  4.427   4.451  0.009    7
Expected size   4.427  4.454  4.455   4.481  0.017    7
Worst case      4.429  4.453  4.451   4.486  0.017    7
µ = 30
Entropy         4.380  4.413  4.410   4.443  0.020    6
Most parts      4.393  4.416  4.416   4.435  0.015    7
Expected size   4.426  4.453  4.456   4.491  0.019    7
Worst case      4.434  4.459  4.461   4.477  0.013    7
µ = 40
Entropy         4.372  4.398  4.399   4.426  0.017    7
Most parts      4.383  4.424  4.427   4.448  0.020    7
Expected size   4.418  4.457  4.455   4.491  0.023    7
Worst case      4.424  4.458  4.457   4.490  0.022    7
µ = 50
Entropy         4.365  4.397  4.393   4.438  0.020    6
Most parts      4.400  4.424  4.422   4.454  0.017    7
Expected size   4.419  4.453  4.453   4.495  0.022    7
Worst case      4.431  4.456  4.457   4.474  0.012    6

Table 4: Statistics for the average number of guesses for different maximum sizes µ of subsets of consistent guesses. The horizontal lines are used as before to indicate statistical independence, with the exception of one case: for µ = 10 the expected size and worst case are not independent.

Is there a statistical difference between the different subset sizes?
To answer this we look at only the two best strategies in more detail, entropy and most parts, and compare their performances for the different subset sizes µ, and using the complete set, the case when µ = ∞, as presented in Table 3. These results are given in Tables 5 and 6. From this analysis it may be concluded that a set size of µ = 20 is sufficiently large and not statistically different from using the entire set of consistent guesses. This is actually quite a large reduction in the set size, which is about 250 on average after the first guess, then 55, followed by 12 [1].

µ       min    mean   median  max    st.dev.
10      4.438  4.468  4.476   4.483  0.016
20      4.394  4.423  4.426   4.455  0.021
30      4.380  4.413  4.410   4.443  0.020
40      4.372  4.398  4.399   4.426  0.017
50      4.365  4.397  4.393   4.438  0.020
∞       4.383  4.408  4.408   4.424  0.012

Table 5: No statistical advantage is gained when using a set size larger than µ = 30 when using the entropy strategy. However, there is also no statistical difference between µ = 20 and both µ = 30 and µ = ∞ (the only cases not indicated by the horizontal lines).

µ       min    mean   median  max    st.dev.
10      4.429  4.454  4.454   4.477  0.016
20      4.424  4.431  4.427   4.451  0.009
30      4.393  4.416  4.416   4.435  0.015
40      4.383  4.424  4.427   4.448  0.020
50      4.400  4.424  4.422   4.454  0.017
∞       4.383  4.410  4.412   4.430  0.013

Table 6: No statistical advantage is gained when using a set size larger than µ = 20 for the most parts strategy. However, there is a statistical difference between µ = 20 and µ = ∞ (the only case not indicated by the horizontal lines).
This implies that, at least in this case, using a subset of the combination pool that is around 1/10th of the total size potentially yields a result that is as good as using the whole set; even if algorithmically finding 20 tentative solutions is harder than finding a single one, using this in stochastic search algorithms such as the EDA mentioned above or an evolutionary algorithm holds the promise of combining the accuracy of exhaustive search algorithms with the speed of an EDA or an EA. In any case, for spaces bigger than κ = 6, ℓ = 4 there is no other option, and this 1/10 gives at least a rule of thumb. How this proportion grows with search space size is still an open question.

6 Discussion and Conclusion

In this paper we have tried to study and compare the different heuristic strategies for the simplest version of Mastermind, in order to come up with a nature-inspired algorithm that is able to beat them in terms of running time and scalability. The main problem with heuristic strategies is that they need to have the whole search space in memory; even the most advanced ones that run over it only once will become unwieldy as soon as ℓ or κ increase. However, evolutionary algorithms have already been proved [10] to scale much better, the only problem being that their performance as players is no better than a random player.

In this paper, after improving (or maybe just clarifying) heuristic and deterministic algorithms with a random choice of the combination to play, we have incorporated the simplest of those strategies into an estimation of distribution algorithm (the so-called local entropy, which takes into account the amount of surprise the new combination implies); results are promising, but still fall short of the best heuristic strategies, which take into account the partition of the search space created by each combination.
That is why we have tried to compute the subset size that would be able to obtain results that are indistinguishable, in the statistical sense, from those obtained with the whole set, coming up with a subset whose size is around 10% of the whole one, being thus less computationally intensive and easily incorporated into an evolutionary algorithm.

However, how this is incorporated within the evolutionary algorithm remains to be seen, and will be one of our future lines of work. So far, distance to consistency and entropy are combined in an aggregative fitness function; the quality of the partitions induced will also have to be taken into account. However, there are several ways of doing this: putting consistent solutions in an archive, in the same fashion that multiobjective optimization algorithms do, or leaving them in the population and taking the quality of partitions as another objective, not to mention the evolutionary parameter issues themselves: population size, operator rates. Our objective, in this sense, will be not only to try to minimize the average/median number of games played, but also to minimize the proportion of the search space examined to find the final solution.

All the tests and algorithms have been implemented using the Matlab package, and are available as open source software with a GPL licence from the authors. The evolutionary algorithm and several Mastermind strategies are also available from CPAN; most results and configuration files needed to compute them are available from the group's CVS server.

Acknowledgements

This paper has been funded in part by the Spanish MICYT projects NoHNES (Spanish Ministerio de Educación y Ciencia - TIN2007-68083) and TIN2008-06491-C04-01, and the Junta de Andalucía projects P06-TIC-02025 and P07-TIC-03044.
The authors are also very grateful to the traffic jams in Granada, which allowed limitless moments of discussion and interaction over this problem.

References

[1] Berghman, L., Goossens, D., Leus, R.: Efficient solutions for Mastermind using genetic algorithms. Computers and Operations Research 36(6), 1880–1885 (2009). URL http://www.scopus.com/inward/record.url?eid=2-s2.0-56549123376&partnerID=40

[2] Bernier, J.L., Herráiz, C.I., Merelo-Guervós, J.J., Olmeda, S., Prieto, A.: Solving Mastermind using GAs and simulated annealing: a case of dynamic constraint optimization. In: Proceedings PPSN, Parallel Problem Solving from Nature IV, no. 1141 in Lecture Notes in Computer Science, pp. 554–563. Springer-Verlag (1996). http://citeseer.nj.nec.com/context/1245314/0

[3] Bestavros, A., Belal, A.: Mastermind, a game of diagnosis strategies. Bulletin of the Faculty of Engineering, Alexandria University (1986). URL citeseer.ist.psu.edu/bestavros86mastermind.html. Available from http://www.cs.bu.edu/fac/best/res/papers/alybull86.ps

[4] Irving, R.W.: Towards an optimum Mastermind strategy. Journal of Recreational Mathematics 11(2), 81–87 (1978–79)

[5] Kendall, G., Parkes, A., Spoerer, K.: A survey of NP-complete puzzles. ICGA Journal 31(1), 13–34 (2008). URL http://www.scopus.com/inward/record.url?eid=2-s2.0-42949163946&partnerID=40

[6] Knuth, D.E.: The computer as Master Mind. J. Recreational Mathematics 9(1), 1–6 (1976–77)

[7] Kooi, B.: Yet another Mastermind strategy. ICGA Journal 28(1), 13–20 (2005). URL http://www.scopus.com/inward/record.url?eid=2-s2.0-33646756877&partnerID=40

[8] Koyama, K., Lai, T.W.: An optimal Mastermind strategy. J.
Recreational Mathematics 25(4) (1993/1994)

[9] Merelo-Guervós, J.J., Carpio, J., Castillo, P., Rivas, V.M., Romero, G.: Finding a needle in a haystack using hints and evolutionary computation: the case of genetic Mastermind. In: A.S.W. Scott Brave (ed.) Late breaking papers at the GECCO99, pp. 184–192 (1999)

[10] Merelo-Guervós, J.J., Castillo, P., Rivas, V.: Finding a needle in a haystack using hints and evolutionary computation: the case of evolutionary MasterMind. Applied Soft Computing 6(2), 170–179 (2006). http://www.sciencedirect.com/science/article/B6W86-4FH0D6P-1/2/40a99afa8e9c7734baae340abecc113a; http://dx.doi.org/10.1016/j.asoc.2004.09.003

[11] Merelo-Guervós, J.J., Castillo, P.A., Alba, E.: Algorithm::Evolutionary, a flexible Perl module for evolutionary computation. Soft Computing (2009). DOI 10.1007/s00500-009-0504-3. To be published, accessible at http://sl.ugr.es/000K

[12] Mühlenbein, H., Paass, G.: From recombination of genes to the estimation of distributions: I. Binary parameters. Lecture Notes in Computer Science 1141, 178–187 (1996)

[13] Neuwirth, E.: Some strategies for Mastermind. Zeitschrift für Operations Research. Serie B 26(8), B257–B278 (1982)

[14] Stuckman, J., Zhang, G.Q.: Mastermind is NP-complete. CoRR abs/cs/0512049 (2005)

[15] Wikipedia: Mastermind (board game) — Wikipedia, The Free Encyclopedia (2009). URL http://en.wikipedia.org/w/index.php?title=Mastermind_(board_game)&oldid=317686771. [Online; accessed 9-October-2009]