Using Logistic Regression to Analyze the Balance of a Game: The Case of StarCraft II

Hyokun Yun
yun3@purdue.edu
Department of Statistics, Purdue University
October 25, 2018
Technical Report (2011)

1 Introduction

1.1 Motivation

Recently, the market for online games has been growing remarkably fast, and so has the importance of good game design. In online games, a human user usually competes with other users, so the fairness of the game system to all users is essential to keeping users interested in the game. Furthermore, the emergence and success of electronic sports (e-sports) and professional gaming, in which specially talented gamers compete with each other, draws more attention to whether they are competing in a fair environment.

No matter how fierce the debates are in the game-design community, statistical analysis is rarely employed to answer this question seriously. But given that large amounts of user behavior data can easily be gathered from games, it seems potentially beneficial to use this data to aid decisions on game design problems. In fact, modern games do not aim to design the game perfectly at once: rather, they first release the game, and then monitor users' behavior to better balance it. In such a scenario, statistical analysis can be particularly helpful.

Specifically, we chose to analyze the balance of StarCraft II™, a very successful, recently released real-time strategy (RTS) game. It is a central icon of the current e-sports and professional gaming community: from April 1st to 15th, there were 18 StarCraft II™ tournaments. However, there is endless debate on whether the winner of a tournament is actually superior to the others, or whether the win is largely due to certain design flaws of the game. In this paper, we aim to answer this question using a traditional statistical tool, logistic regression.

1.2 Problem Setting

In a 1 vs. 1 match of this game, each gamer chooses a race of army to play. There are three races: Terran, Protoss, and Zerg. Two gamers are allowed to choose the same race. The map¹ is also chosen, usually according to the rules of the tournament. Once the races of the two players and the map are chosen, the two gamers wage a war until one gamer gives up and admits defeat, or until certain end-of-game conditions are met.

¹ It is more accurate to call it the battleground of the war, but it is conventionally called a map.

In traditional games like chess or go, the two players are in exactly the same condition except for the right of the first move. In games like StarCraft II, each gamer chooses very important characteristics of the army he or she plays, so whether the game is fair is an important issue in the community. In particular, people are extremely interested in whether playing a certain race is especially advantageous against another. For example, many people argue that it is difficult for a Zerg player to defeat a Terran player.

However, note that the balance between two races also depends heavily on the map they are playing on. For example, there are maps on which the bases of the two players are located close together. It is generally believed that such a placement of bases favors the Terran race, because the Terran army is powerful but immobile relative to the others. But there are numerous other factors of map design that map designers can use to make the game balanced, and we are usually interested in the overall effect of such factors.

1.3 Data Description

In this research, we used the results of 852 games in the Global Starcraft League™ (GSL)² from October 2010 to March 2011. In this most prestigious StarCraft II™ league, 64 professional gamers compete against each other, and the winner of the league gets $87,000.
Each record consists of the identifiers of the two players who played against each other, their corresponding choices of race, the map on which the game was played, the date of the match, and the duration of the match: see Table 1. Players rarely change their race between games, since it takes a lot of practice to be good at even a single race. Thus, one may view the two variables as two levels of hierarchical information: race provides higher-level information and player provides lower-level information. Only one gamer played a randomly chosen race, for just two games until being eliminated from the tournament, and his games were omitted from the data. There are 136 users and 14 maps in this data. The data was gathered from the official website of GSL³, using a Python-based web crawler we created on our own.

| Variable | Type | Description | Example |
|---|---|---|---|
| Winner | Binary (0 or 1) | 1 if Player 1 won the game, 0 otherwise | 0, 1 |
| Player 1 | Nominal | Identity of one player of the game | Jonathan Walsh |
| Race 1 | Nominal | The race of Player 1 | Protoss |
| Player 2 | Nominal | Identity of the other player of the game | Greg Fields |
| Race 2 | Nominal | The race of Player 2 | Zerg |
| Map | Nominal | The map on which the game was played | Xel'Naga Caverns |
| Date | Interval | The date of the match | Jan. 01, 2011 |
| Duration | Interval | The duration of the match | 21 min 35 sec |

Table 1: List of Variables in the Data

² http://wiki.teamliquid.net/starcraft2/GOMTV Global Starcraft II League
³ http://esports.gomtv.com/gsl/

2 Methods

2.1 Specification of Model

Recall from Table 1 that $\mathit{Winner}_i = 1$ if Player 1 wins game $i$. We model each $\mathit{Winner}_i$ as an independent Bernoulli random variable with $\pi_i := P(\mathit{Winner}_i = 1)$, where

$$\mathrm{logit}(\pi_i) := \log\frac{\pi_i}{1-\pi_i} = \beta_{\mathit{Player1}_i} - \beta_{\mathit{Player2}_i} + \beta_{(\mathit{Map},\mathit{Race1},\mathit{Race2})_i}. \tag{1}$$

Let us try to understand the intuition behind the model, using an example with a figure.
Suppose that the famous gamers Greg Fields and Jonathan Walsh are playing on the map Xel'Naga Caverns. Please refer to equation (1) and Figure 1.

$$\underbrace{2.098}_{\beta_{\text{Greg Fields}}} \;\underbrace{-\,1.797}_{-\beta_{\text{Jonathan Walsh}}} \;+\; \underbrace{0.0632}_{\beta_{(\text{Xel'Naga Caverns},\,\text{Zerg},\,\text{Terran})}} \;\approx\; 0.36336$$

Figure 1: Illustration of the Model

If Player 1 is a great gamer, then $\beta_{\mathit{Player1}}$ will be high, and it will increase $\pi$, consequently increasing the probability that he will win the game. Greg Fields is one of the greatest Zerg players in the world, so his estimated $\beta$ is very high: $\hat\beta_{\text{Greg Fields}} = 2.098$.

On the other hand, no matter how good Player 1 is, if his opponent, Player 2, is also strong, then the probability that Player 1 wins should certainly decrease. This is accounted for by $\beta_{\mathit{Player2}}$. The opponent, Jonathan Walsh, is also one of the greatest Terran players, and his estimated parameter is $\hat\beta_{\text{Jonathan Walsh}} = 1.797$. This value is subtracted from $\hat\beta_{\text{Greg Fields}}$, decreasing Greg's probability of winning the game.

However, we should also consider the map the two are playing on. It is Xel'Naga Caverns, and it turns out that this map slightly favors Zerg over Terran. Thus $\hat\beta_{(\text{Xel'Naga Caverns},\,\text{Zerg},\,\text{Terran})} = 0.0632$, increasing the probability that Greg wins on this map.

Finally, since $\mathrm{logit}(\hat\pi_i) = 0.36336$, we may invert the logit function to get

$$\hat P(\text{Greg Wins}) := \hat\pi_i = \frac{1}{1+\exp(-0.36336)} \approx 0.590, \tag{2}$$

using the standard operation frequently used in logistic regression. Since both are very good players, it is hard to predict who will win the game. This is sensible, since the game is by its nature not very predictable. If we could easily predict the result of a game, nobody would want to spend time watching it. Using statistical analysis, however, we can capture the overall tendency in games, even though the signal may not be very strong.
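The arithmetic of this example can be reproduced with a few lines of Python. This is a minimal sketch rather than the authors' code: the function name `win_probability` and its argument names are hypothetical, and the three numbers are the estimates quoted in the text. The second call assumes that the map-race term changes sign when the two players trade places.

```python
import math

def win_probability(beta_player1, beta_player2, beta_map_races):
    """Apply equation (1), then invert the logit to get P(Player 1 wins)."""
    logit = beta_player1 - beta_player2 + beta_map_races
    return 1.0 / (1.0 + math.exp(-logit))

# Greg Fields (Zerg) as Player 1 against Jonathan Walsh (Terran) on Xel'Naga Caverns.
p_greg = win_probability(2.098, 1.797, 0.0632)

# Same game with the players swapped; the sign flip of the map-race term
# is an assumption made for this illustration.
p_jonathan = win_probability(1.797, 2.098, -0.0632)

print(f"P(Greg wins)     = {p_greg:.3f}")
print(f"P(Jonathan wins) = {p_jonathan:.3f}")
```

Under that assumption the two printed probabilities sum to one, because swapping the players only negates the logit.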
In this case, the fitted model says that Greg is more likely to win over Jonathan.

Note that we are assuming the games to be conditionally independent, given the players and the map of each game. The possible problems with this assumption are discussed, together with other limits of the model, in Section 2.3.2.

2.2 Restrictions on Parameter Space

When the positions of Player 1 and Player 2 are switched in equation (1), the model should still give an equivalent result. In the earlier example of Greg and Jonathan, $\hat P(\text{Jonathan Wins})$ should equal $1 - \hat P(\text{Greg Wins})$. To impose this restriction, we need the following conditions: for each map,

$$\beta_{\mathit{Map},\text{Terran},\text{Zerg}} = -\beta_{\mathit{Map},\text{Zerg},\text{Terran}}, \tag{3}$$

$$\beta_{\mathit{Map},\text{Terran},\text{Protoss}} = -\beta_{\mathit{Map},\text{Protoss},\text{Terran}}, \tag{4}$$

$$\beta_{\mathit{Map},\text{Protoss},\text{Zerg}} = -\beta_{\mathit{Map},\text{Zerg},\text{Protoss}}. \tag{5}$$

Such a restriction can be imposed naturally within the framework of standard logistic regression, without making the inference step any harder. One has to use a carefully designed data matrix, but the details are omitted since they are fairly straightforward.

2.3 Characteristics of the Model

2.3.1 Advantages over Traditional Approach

Let us discuss why we need such a statistical model to analyze this data. To compare the performance of players, the traditional way of analyzing game results is to calculate the win rate of individual players. For example, one may calculate the fraction of games Greg won and the fraction of games Jonathan won, and compare the two numbers. But this method is obviously problematic, since such a calculation does not take into account how strong one's opponents were. If a gamer has by chance encountered only weak gamers, and has not yet been challenged by strong ones, would you still think he is a good gamer only because he has a good win rate? Certainly not.
The beauty of having a statistical model is that it can take care of this. For example, Greg won only 41.67% of his games, while Jonathan won 50%. But the fitted model does not say that Jonathan is the better player, because of their respective match histories: instead, it says that Greg is generally the better player, with parameter values $\hat\beta_{\text{Greg Fields}} = 2.098$ and $\hat\beta_{\text{Jonathan Walsh}} = 1.797$.

On the other hand, whether a certain map is balanced is always hotly debated. But this is a hard question to answer: one cannot simply say that the map Xel'Naga Caverns favors Zerg just because many Zerg players are beating Terran players on this map. Maybe we have not seen good enough Terran players on this map. Or maybe we have not seen enough observations on this map. When the fraction of games Zerg has won over Terran is calculated (for example, 0.6), what is its standard error? It is hard to answer, since the games are not marginally independent of each other: we know that good gamers are more likely to win, while bad gamers are less likely to do so. However, it is much more reasonable to assume that the games are conditionally independent, which is our assumption, and in this case we can estimate the variability of our estimates.

Finally, it is hard to combine estimates in the traditional approach. When Jonathan is playing Greg on Xel'Naga Caverns, how would you combine both gamers' win rates and the fraction of games Zerg won over Terran on Xel'Naga Caverns? Usually, people stop being quantitative and fall back on a qualitative approach. In our model, we can quantitatively combine them to calculate the overall effect.

2.3.2 Limits

However attractive the model is compared to traditional approaches, it is by no means a perfect model. The reasons are:

• Constancy of Parameter Values over Time: The strength of each gamer is not constant over time. As a gamer accumulates experience, he generally gets better and better. On the other hand, it has frequently been observed that once-legendary gamers become ordinary ones as they grow older and can no longer react as quickly as younger gamers. Thus, using a single β parameter for every game is actually problematic. However, in this data such an assumption was inevitable, since we have only seven months of observations. When a study on a larger longitudinal scale is done, we may even attempt to model this time-series effect.

• Conditional Independence: The games may not be even conditionally independent given both players and the map. For example, when two gamers are playing a best-of-five match, the result of the first game certainly affects the second. A player who loses the first game may get discouraged, or, having already seen which style his opponent played, he may adjust his own style well and win the following game. But since most matches were played as league matches or best-of-three matches, we assume that such dependence between games is not very strong. Also, since we have a lot of games (852), such dependence among three or five games may not play a very significant role.

• Interaction Between Players: Everyone also knows that there should be a certain interaction effect between players. For example, a player with an aggressive style finds it hard to win over a very defensive gamer. However, a defensive gamer may have a hard time fighting against a gamer who exploits the fact that his opponent is defensive and expands very quickly. Since we have only 852 games and there are 136 gamers in the data, it is not possible to estimate 136 × 136 = 18496 parameters from 852 games. However, we may partially take this into account using mixed-membership or latent feature models. We leave these interesting possibilities for future work.
• Interaction between Player and Map: Sometimes it is clear that a certain player is very good on certain kinds of maps. Although this kind of interaction is more tractable than player-player interaction, consideration of such factors is left for future work.

2.4 Model Parsimony

Although we have tried hard to keep our model simple, we still have too many parameters, since we give each player one parameter. Some gamers played only one or two games, lost, and were eliminated from the tournament; modeling even such gamers results in over-fitting and numerical instability. Such a problem can generally be handled with a regularized estimation approach, but that is slightly outside the scope of the course, and we lose the notion of probability there.

Instead, one may do the variable selection oneself, without relying on indirect regularized estimation. In general data analysis problems this is hard to do, since it is hard to consider every combination of variables. However, it is not that difficult when one has a good idea of which variables are not really necessary, and that turns out to be the case here: we know that the players with a small number of games are problematic. Thus, we fix the β's of such players to 0. That is, the model gives up estimating the performance of gamers who have not yet played enough, since we do not have enough data. However, those gamers are collectively taken into account, since they do affect the estimates of the performance of other players and of the balance of the maps. This also has a very natural interpretation: when two players without enough data are playing against each other, the reasonable prediction is that it is a 50-50 game.
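The rule of fixing β to 0 for players with too few games can be sketched as follows. This is only an illustration in Python, not the authors' code: the cutoff `min_games`, the helper names, the player labels, and the toy β values are all hypothetical, since the text does not state the threshold that was used.

```python
import math
from collections import Counter

def players_with_own_parameter(games, min_games):
    """Players who appear in at least min_games games (hypothetical cutoff)."""
    counts = Counter()
    for player1, player2 in games:
        counts[player1] += 1
        counts[player2] += 1
    return {player for player, n in counts.items() if n >= min_games}

def player_strength(player, fitted_betas):
    """Fitted beta for a modeled player; 0 for players left out of the model."""
    return fitted_betas.get(player, 0.0)

def win_probability(player1, player2, fitted_betas, beta_map_races=0.0):
    """P(player1 wins) from equation (1) with unmodeled players fixed at 0."""
    logit = (player_strength(player1, fitted_betas)
             - player_strength(player2, fitted_betas)
             + beta_map_races)
    return 1.0 / (1.0 + math.exp(-logit))

# Toy data with made-up player labels, only to exercise the functions.
toy_games = [("A", "B"), ("A", "C"), ("B", "A"), ("C", "D"), ("E", "A")]
modeled = players_with_own_parameter(toy_games, min_games=2)

# Made-up "fitted" values for the modeled players.
toy_betas = {"A": 1.0, "B": 0.5, "C": -0.2}

# Two players without their own parameter: the logit is 0 when no map term is used.
p_unknown_pair = win_probability("D", "E", toy_betas)
print(sorted(modeled), p_unknown_pair)
```

Passing a nonzero `beta_map_races` to the same call moves the prediction away from one half using the map-race term alone.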
When we know which map they are playing on, we may use the overall trend on that map to predict the result, without being able to use any further information. This approach turns out to give results similar to those of L1 regularization (lasso), which is discussed in the next section⁴.

3 Results

3.1 Adequacy of the Model

First, the lack of fit was tested. As a first step, the model was tested against the constant model ($\beta_j = 0$ for all $j$); the p-value was $10^{-6}$, rejecting the null hypothesis that the constant model suffices at almost any significance level. Since this is almost always the case for data of considerable size, we also conducted the Hosmer-Lemeshow test⁵ using 10 groups: the p-value was 0.153, again favoring our model⁶.

Second, the more modern method of cross-validation was used to evaluate the quality of the fit. We conducted 10-fold cross-validation and evaluated the accuracy of our predictor on both training and test data. Note that in our case the losses from type I and type II prediction errors are the same (symmetric loss), so it suffices to check accuracy, unlike in general cases. The average accuracy on training data was 0.727 ± 0.00797 (mean ± standard deviation), while that on test data was 0.706 ± 0.0632. Our model seems to generalize quite well (just a 2% drop in accuracy), but the standard deviation is a bit high. The likely reason is that we do not have enough data about every player: for players with a small number of games, the training data may not contain enough information about them, making the estimates for them inaccurate, although on average the model works quite well.

To see whether overfitting is problematic here, we also used an L1 penalty (lasso) to estimate the parameters and evaluated its accuracy on exactly the same partitions of the data. The accuracy was 0.708 ± 0.0676, which is not significantly different from that of the model using no penalty.
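Evaluating two estimators on exactly the same partitions can be sketched in Python as below. This is an illustration under stated assumptions, not the authors' procedure: the helper names are hypothetical, the sample standard deviation is an assumed choice for the "±" figure, and the majority-class `fit_majority` and `predict_majority` functions are placeholders standing in for the unpenalized and lasso logistic regressions, which are not implemented here.

```python
import random
import statistics

def make_folds(n_games, k=10, seed=0):
    """Shuffle the game indices once and split them into k near-equal folds."""
    indices = list(range(n_games))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def cross_validated_accuracy(fit, predict, games, outcomes, folds):
    """Return (mean, sample standard deviation) of test accuracy over the folds."""
    accuracies = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in range(len(games)) if i not in held_out]
        model = fit([games[i] for i in train_idx],
                    [outcomes[i] for i in train_idx])
        hits = sum(predict(model, games[i]) == outcomes[i] for i in test_idx)
        accuracies.append(hits / len(test_idx))
    return statistics.mean(accuracies), statistics.stdev(accuracies)

# Placeholder model: always predict the majority outcome seen in training.
def fit_majority(train_games, train_outcomes):
    return int(2 * sum(train_outcomes) >= len(train_outcomes))

def predict_majority(model, game):
    return model

# Build the partition once and reuse the same `folds` object for every
# estimator being compared, so that accuracies are computed on identical splits.
folds = make_folds(852, k=10, seed=0)
```

Calling `cross_validated_accuracy` once per estimator with the same `folds` argument keeps the comparison paired fold by fold.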
We did not set any parameter to 0 for the lasso, but the set of nonzero parameters chosen by the lasso, using another 10-fold cross-validation on the training set, was very similar to what we obtained by using the number of games each gamer played: the lasso also removed 94% of the players we removed. In conclusion, overfitting was not a big problem, and our selection of variables was not as ad hoc as it might have sounded.

⁴ Note: the model is unidentifiable by itself, but by setting the parameters of some players to 0, it becomes identifiable. It is identifiable without such a treatment in the lasso case.
⁵ R code in http://www.stat.purdue.edu/~ovitek/STAT526-Spring11 files/4-logistic.R was used.
⁶ In this case, the null hypothesis is our model.

[Figure 2: Residual vs. Fitted Plot. Residuals against predicted values for glm(fmla); observations 118, 736, and 501 are labeled.]

Third, we visually checked the quality of the fit; see Figure 2. The plot is not as flat as desired. This is because our fit is not perfect: sometimes we predict a player to win with high probability, which does not turn out to be the case. Although we did not suffer from overfitting in terms of accuracy, the lack of regularizers may result in overfitting of the estimated probabilities, with the model sometimes being overconfident when it should not be. Such overfitting may naturally occur when dealing with mid-sized data like this. The use of regularization does not really help, since in that case we lose the notion of probability.

Lastly, we checked whether our conditional independence assumption was adequate. When a quasi-binomial model was fit, the dispersion parameter was only 1.158738. To estimate the distribution of the estimated dispersion parameter, we bootstrap-sampled 1000 datasets. The mean and standard deviation of the estimates were 1.275 ± 0.284, indicating that over-dispersion does exist, but that its magnitude is not very serious.

3.2 Interpretation

Recall that each player is given one parameter, which evaluates performance relative to others. Figure 3 (1) plots the estimated parameter for each player: the distribution is naturally centered at a point greater than zero, implying that players with enough information are better than those players whose parameters were set to zero because they did not play enough games. Readers interested in the ranks of the parameters may refer to Table 6.

Figure 3 (2) to (4) display the parameters that estimate the balance between two races for each map. Since there are only 14 maps, the histograms are very spiky. The mean and standard deviation of the estimates for the Terran vs. Protoss, Terran vs. Zerg, and Protoss vs. Zerg balance of maps were respectively 1.064 ± 0.821, 0.749 ± 0.566, and −0.369 ± 0.596. The balance seems to depend on the map, but most maps favor Terran over Protoss and Zerg, while the balance between Protoss and Zerg seems more adequate than the others.

[Figure 3: Estimated Parameters. From left: (1) histogram of estimated parameters for each player, (2) estimated parameters for Terran vs. Protoss in each map, (3) same plot for Terran vs. Zerg, (4) same plot for Protoss vs. Zerg. Horizontal axis: parameter estimate; vertical axis: frequency.]

To answer the higher-level question, "So, is the game well balanced?", we need to average over maps, since maps already take balance into account individually. Note that this is similar to testing a hypothesis about the overall mean in the cell-means model of one-way ANOVA. The parameter we test is

$$\beta_{\mathit{Race1},\mathit{Race2}} = \frac{1}{m}\sum_{i=1}^{m} \beta_{\mathit{Map}_i,\mathit{Race1},\mathit{Race2}}, \tag{6}$$

for each pair Race1, Race2, where $m$ is the number of maps. Since logistic regression does not have closed-form solutions for the distributions of parameter estimates, we bootstrap-sampled 10000 datasets and estimated (6) for each race combination.

[Figure 4: Bootstrap distributions of parameter estimates (bootstrap sample number: 10000). Panels: Terran vs. Protoss, Terran vs. Zerg, Protoss vs. Zerg. Horizontal axis: parameter estimates; vertical axis: frequency.]

As a result, $P(\beta_{\text{Terran},\text{Protoss}} > 0) \approx 0.839$, $P(\beta_{\text{Terran},\text{Zerg}} > 0) \approx 0.948$, and $P(\beta_{\text{Protoss},\text{Zerg}} > 0) \approx 0.290$, which implies that the imbalance between races is not significant at the 5% significance level, even when each hypothesis that the parameter value is exactly zero is tested individually (when multiple hypotheses are checked simultaneously, the significance level of the test drops). However, there were certainly indications that there may be some balance problems, especially in the case of Terran vs. Zerg. Interested readers may refer to the histograms of estimates in Figure 4.

4 Discussion

To the authors' knowledge, this is the first time a standard statistical technique more complex than mere summary statistics has been used to analyze user behavior data in online games. Using our technique, game designers may make more careful use of the results they gain from beta testers to reduce the cost of testing. In the past, especially with StarCraft I, very unbalanced maps were sometimes used in tournaments, causing some strong players to be eliminated early. Our model would be very helpful in preventing such a disaster. Since we have already discussed many of the technical problems in the sections above, we conclude this section briefly.

References

[1] Agresti, A. (2002). Categorical Data Analysis, 2nd Edition. Chapter 4: Binary Response.
[2] Faraway, J. J. (2006). Extending the Linear Model with R. Chapter 2: Binomial Data.
[3] Kutner, M., Nachtsheim, C., Neter, J., Li, W. (2005). Applied Linear Statistical Models. Chapter 14: Logistic Regression.
[4] http://wiki.teamliquid.net/starcraft2/GOMTV Global Starcraft II League
[5] http://esports.gomtv.com/gsl/

5 Appendix

5.1 Appendix A: Descriptive Statistics

5.1.1 Number of races

There are three races in StarCraft II: Protoss, Terran, and Zerg. In the raw data, we found three observations whose race is r. Only one player (ID: GuMihofOu) used the option random (rare), which randomly assigns one of the three races. Since he had played only three games in total, we eliminated these observations. After this elimination, we have 136 players. The bar graph of the races of these 136 players is shown in Figure 5.

[Figure 5: Number of Races. Bar graph of the number of players by race: Protoss (39), Terran (58), Zerg (41).]

There is a marked preference for Terran, which accounts for 42.6% of all players. This raises the question of balance between races, which we want to analyze using a statistical approach.

5.1.2 Number of observations (games) per player

There are 852 observations in the data set. The average number of games per player is 6.26. However, it is well known that better players play more games than others. In other words, the number of games should vary widely across players. We observe this in the histogram in Figure 6 and in Table 2.

| Games | Players | Games | Players |
|---|---|---|---|
| 1-5 | 64 | 31-35 | 2 |
| 6-10 | 21 | 36-40 | 5 |
| 11-15 | 9 | 41-45 | 3 |
| 16-20 | 7 | 46-50 | 2 |
| 21-25 | 10 | 51-55 | 0 |
| 26-30 | 12 | 56-60 | 1 |

Table 2: Number of games of players

64 players (47%) played only 1-5 games. In particular, 38 players (27.9%) played at most 2 games. Statistical results based on such players may not be reliable. Hence we need to consider data reduction. For example, a bar graph of players who played more than 5 games is shown in Figure 7.

[Figure 6: Game frequencies. Histogram; horizontal axis: number of games, vertical axis: number of players.]

[Figure 7: Number of players vs. number of games. Bar graph over the bins 6~10 through 56~60.]

5.1.3 Game frequencies between races

Table 3 shows how many games were played for each race combination. Note that the combinations are unordered. For example, the frequency of Protoss vs. Terran covers both (player1, player2) = (Protoss, Terran) and (Terran, Protoss). For the purpose of balance analysis, we narrow our focus to battles between different races.

| Races | Frequency |
|---|---|
| Protoss vs. Protoss | 45 |
| Protoss vs. Terran | 233 |
| Protoss vs. Zerg | 131 |
| Terran vs. Terran | 134 |
| Terran vs. Zerg | 265 |
| Zerg vs. Zerg | 44 |

Table 3: Number of games of race combination

5.1.4 Game frequencies between different races

Table 4 shows the number of games between different races. As an intuitive check, we can compare the win ratios 121/(121+112) = 0.5193133, 132/(132+133) = 0.4981132, and 67/(67+64) = 0.5114504. At a glance, these ratios do not look considerably different from 0.5.

| Race vs. Race | Frequency | Number of Wins |
|---|---|---|
| Terran vs. Protoss | 233 | Terran: 121, Protoss: 112 |
| Terran vs. Zerg | 265 | Terran: 132, Zerg: 133 |
| Protoss vs. Zerg | 131 | Protoss: 67, Zerg: 64 |

Table 4: Game frequencies between different races

5.1.5 Time trend of the number of players for each race

Observing the trend in race proportions helps in understanding the balance problem. We divide the data into 7 subsets by month and count the number of players of each race. Each cell shows the frequency, with the proportion of each race within that period (row) in parentheses. Refer to Table 5 and Figure 8.

| Period | Protoss Players | Terran Players | Zerg Players |
|---|---|---|---|
| September, 2010 | 16 (0.37209) | 17 (0.39534) | 10 (0.23255) |
| October, 2010 | 20 (0.31746) | 28 (0.44444) | 15 (0.23809) |
| November, 2010 | 12 (0.19047) | 25 (0.39682) | 26 (0.41269) |
| December, 2010 | 4 (0.20000) | 9 (0.45000) | 7 (0.35000) |
| January, 2011 | 17 (0.26984) | 28 (0.44444) | 18 (0.28571) |
| February, 2011 | 17 (0.24285) | 32 (0.45714) | 21 (0.30000) |
| March, 2011 | 12 (0.28571) | 19 (0.45238) | 11 (0.26190) |

Table 5: Time trend of the number of players for each race

[Figure 8: Proportions of races according to time. Horizontal axis: time (month, 1 to 7); vertical axis: proportion; series: Protoss, Terran, Zerg.]

5.2 Appendix B: Validation check with ranks

Table 6 shows the prize ranks up to March 19th, 2011. The first column is the rank estimated by our model, and the third column is the rank in prize money. Prize ranks beyond 20 were not publicized and are thus not displayed.

| Rank | Name | Rank in Prize Money (Korean Won) |
|---|---|---|
| 1 | Min-Chul Chang | 1 |
| 2 | Yong-Hwa Choi | |
| 3 | Kang-Ho Hwang | |
| 4 | Jae-Duk Lim | 2 |
| 5 | Young-Jin Kim | |
| 6 | Jun-Sik Yang | |
| 7 | Sung-Jun Park | 8 |
| 8 | Jun-hyuk Song | 15 |
| 9 | Hyun-Woo Park | |
| 10 | Won-Ki Kim | 3 |

Table 6: Parameter Estimate Rank and Prize Money Rank (up to March 19, 2011)

Although the estimated rank does not agree closely with the prize rank, as experts on this problem we find this ranking very convincing. Some of the gamers ranked high here joined the tournament only recently and have not had enough opportunities to win large prize money.

Acknowledgement

We thank Kyungmin Ahn, Jinhak Kim, and Professor Olga Vitek for helpful comments and contributions to the descriptive analysis of the data.
