Rapid Prediction of Player Retention in Free-to-Play Mobile Games
Authors: Anders Drachen, Eric Thurston Lundquist, Yungjen Kung, Pranav Simha Rao, Diego Klabjan, Rafet Sifa, Julian Runge
Abstract

Predicting and improving player retention is crucial to the success of mobile Free-to-Play games. This paper explores the problem of rapid retention prediction in this context. Heuristic modeling approaches are introduced as a way of building simple rules for predicting short-term retention. Compared to common classification algorithms, our heuristic-based approach achieves reasonable and comparable performance using information from the first session, day, and week of player activity.

Introduction

Predictive modeling in Free-to-Play (F2P) games has become a regular occurrence in the mobile gaming industry as well as within the associated academic fields investigating player behavior at large scales. Previous work has seen the development of a variety of machine learning-based models (Runge et al. 2014; Sifa et al. 2015; Hadiji et al. 2014; El-Nasr et al. 2013; Pittman and GauthierDickey 2010; Thawonmas et al. 2011; Mahlmann et al. 2010; Yang and Roberts 2014; Xie et al. 2015), and has focused on situations where there is at least a week or even more data available about the players (Hadiji et al. 2014; Sifa et al. 2015; Runge 2014). However, in a commercial context there is a direct interest in being able to predict player retention as fast as possible. There are multiple reasons for this, but one of the primary ones is that F2P games generally lose a majority of their players to churn within the first few days after an install (Nozhnin 2013; Runge et al. 2014; Rothenbuehler et al. 2015). Predictions are also important for appropriate incentivization of players to remain in the game (Runge et al. 2014).
Essentially, there are two steps in solving the problem of players leaving a game: 1) predicting if a player will churn or not, and when; 2) identifying how to prevent this from happening or, if not deemed possible, recommending a different suitable game to the player. The earlier a correct prediction can be made after a player starts playing a new game, the more valuable that knowledge will be. Fast predictions enable companies to build well-tailored customer relationship management and respond to user behavior proactively (Runge 2014; Sifa et al. 2015; Rothenbuehler et al. 2015; Xie et al. 2015).

As many companies in the mobile gaming industry have rather small operations, they cannot afford their own in-house analytics. It is thus imperative to identify simple, frugal, but effective prediction models to make the benefits of predictive analytics accessible to them. But heuristic models also bear value regardless of company size and cash balance. Especially when a game is freshly launched and there is an overly full pipeline of features to be built, reducing a complex predictive effort to an easily implementable decision rule is of value to large and small companies alike. To address this challenge we introduce the idea of heuristic modeling and forecasting (Goldstein and Gigerenzer 2009; Gigerenzer and Brighton 2009; Artinger et al. 2015). Heuristics are simple, computationally fast and robust rule systems that are often derived from intuition or a combination of intuition and data-driven modeling.

Copyright © 2016, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
They are potentially beneficial along several dimensions: a) they are easy to deploy, as they can often be implemented as simple rule systems on the client device; b) they tend to have lower computational cost than machine learning-based models; c) they are more straightforward to communicate to non-analytics decision makers, making it easier to obtain organizational acceptance for them. However, heuristic-based rules eschew detailed predictions at the individual level. This often makes them more robust for predictions in vastly different environments, but leads to a loss of granular predictive ability in stable environments (Chintagunta and Nair 2011; Goldstein and Gigerenzer 2009). Here we benchmark simple predictive heuristics against machine-learning models for rapid prediction of retaining players in a stable environment.

Contribution

Here the feasibility of predicting retention in F2P mobile games based on very short-term user behavior (i.e. as soon as possible after game download by the player) is evaluated. Retention prediction models are developed using a number of machine-learning models covering different windows of observation. These are benchmarked against a heuristic model developed using Decision Trees. Models are built based on a dataset of 130,000 players of the large mobile F2P game Jelly Splash. The dataset covers over 15 million sessions from the first 90 days of activity for a single cohort of users who installed the game over a seven-day period. Accuracy varies across the observation window: gameplay data covering a single session have minimal predictive power. Extending the observational period to the first day of gameplay slightly improves predictive accuracy, and finally, using a one-week window substantially improves the predictive ability of the models (accuracy 0.785-0.792).
All three models exhibit similar accuracies across the feature windows, suggesting that the advantage of modeling nonlinear relationships is limited. The accuracies of the models exceed those of a simple heuristic-based Decision Tree model, but not substantially. This indicates that there is potential in using heuristic models for rapid, affordable and robust client-side predictions in F2P games.

Related work

Due to space constraints the focus in this section will be on work directly related to the approaches used here. Churn models have been developed across a number of ICT sectors such as wireless communication, banking and insurance. In games, previous work on forecasting player behavior has focused on either Massively Multi-Player Online Games (MMOGs) or F2P mobile games. There have been very few cross-games studies, with exceptions including (Pittman and GauthierDickey 2010), who examined two MMOGs, and (Sifa, Bauckhage, and Drachen 2014), who examined playtime patterns across more than 3,000 titles. The methods that have been utilized vary from historical analysis, simple forecasting and multiple regression to machine learning techniques. The latter notably includes Decision Trees, Random Forests, Support Vector Machines, Neural Networks and Hidden Markov Models (Sifa et al. 2015; Runge et al. 2014; Hadiji et al. 2014; Thawonmas et al. 2011; Yang and Roberts 2014; Xie et al. 2015). In this context, previous work has mainly concentrated on churn prediction (Runge et al. 2014; Hadiji et al. 2014) or predicting purchase decisions (Sifa et al. 2015; Xie et al. 2015). (Hadiji et al. 2014) introduced different viewpoints for studying churn and trained classifiers to detect churn, defined as a binary classification problem.
The authors defined the concepts of hard and soft churn, provided two different data generation methods to train any classification model, and showed important factors for churning behavior in five different mobile free-to-play games. Similarly, (Runge et al. 2014) predicted the departure of high-value players in two casual social F2P games by comparing the performance of different classifiers and feature sets. Together with a supervised model for engagement modeling, (Xie et al. 2015) concentrate on predicting first purchase in two social games using different classifiers. Finally, (Sifa et al. 2015) particularly concentrate on predicting future purchase activities of players by formulating the process as a combination of a classification and a regression problem. The authors also emphasize the presence of rarity when analyzing premium players and provide a synthetic oversampling solution to predict rare purchase decisions. Across related work on F2P-based churn prediction, the importance of temporal features has been highlighted, e.g. features associated with the number of sessions per time period, the time between sessions, and the average duration of sessions. Features related to specific game design were generally reported to be less important.

Unlike previous work, the focus here is on the problem of rapid prediction of retaining players by considering heuristic approaches, owing to their ease of implementation and interpretation. Heuristics are strategies derived from experience with similar problems, using readily accessible information to control problem solving. They can be likened to rules of thumb. They are often associated with the concept of satisficing from economic decision-making (Gigerenzer and Brighton 2009). When finding an optimal solution is impossible or impractical, heuristic methods can be used to obtain a satisfactory solution.
They are used in a similar fashion in computer science, when the computational burden of complex methods is excessive. (Goldstein and Gigerenzer 2009) present a comprehensive review of their use in forecasting and prediction. (Wubben and Wangenheim 2008) empirically investigate their viability for use in database marketing. (Artinger et al. 2015) detail their application in management more broadly. The work presented here can be viewed as a special case and extension of the previous authors' contributions.

Definitions: retention and associated terms

This paper operationalizes short-term retention prediction as a binary classification task: each player will be classified as either retained (1) or churned (0) by both our heuristic-based decision rules and the comparison machine learning models. We define retention as having any game activity during the second week of game exposure. More specifically, a player will be labeled as retained if and only if he/she registers at least one game round in the period 7-14 days following installation. Examining the player's second week of game exposure has several benefits: it facilitates the identification of engaged players while taking into account possible seasonal patterns in play (e.g. weekday vs. weekend); it minimizes confounding instances of disengaged players registering a single round long after they have stopped playing regularly; and it enables training models and generating initial predictions shortly after launch, when the number of new players is highest and retention predictions most useful.

With respect to the single response defined above, we examine several different prediction periods and classification strategies. Each of our classifiers generates retention predictions using a player's game activity from his/her installation date up until the end of one of three feature windows.
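The retention label defined above can be sketched as a small labeling function. This is an illustrative reconstruction, not the paper's code; the record layout (install timestamp plus a list of round timestamps) is an assumption.

```python
# Sketch of the paper's retention label: retained (1) iff the player
# registers at least one game round 7-14 days after installation.
# Field names and record layout are illustrative assumptions.
from datetime import datetime, timedelta

def retention_label(install_time, round_times):
    """Return 1 if any round falls in days 7-14 after install, else 0."""
    lo = install_time + timedelta(days=7)
    hi = install_time + timedelta(days=14)
    return int(any(lo <= t <= hi for t in round_times))

install = datetime(2014, 1, 1)
print(retention_label(install, [install + timedelta(days=2)]))  # churned -> 0
print(retention_label(install, [install + timedelta(days=9)]))  # retained -> 1
```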
A feature window is defined as an interval of time between the player's installation date and one of three cutoff points: 1) end of the player's first session; 2) end of the player's first day; or 3) end of the player's first week. These feature windows represent periods of increasing game exposure and informational content.

For each of these feature windows, three classification strategies are examined: 1) heuristic-based decision rules; 2) several classifiers previously utilized for churn prediction; 3) an ensemble strategy combining the results of multiple classifiers.

Figure 1: Feature and evaluation windows used for model construction.

The goal here is to investigate the relationship between accuracy and actionability: observing more game activity yields more accurate predictions, but lowers the overall business value of these predictions, as players who might have been incentivized to remain engaged will have already churned (Runge et al. 2014; Sifa et al. 2015; Hadiji et al. 2014; Rothenbuehler et al. 2015). In addition, training traditional classifiers requires staff with specialized skills/knowledge, the transmission of user data to and from a central location, and an initial data collection period. In contrast, simple heuristic-based approaches can be deployed immediately after launch on the client devices themselves, and require little to no upkeep/monitoring after deployment. However, they are only useful if sufficiently accurate (Wubben and Wangenheim 2008).

Method and approach

Data and pre-processing

Data for this analysis were provided by Wooga, are fully anonymized, and contain installation, session, and rounds-played data for a single cohort of users who installed the game over a seven-day period in 2014. The data are from the game Jelly Splash on Apple's iOS platform.
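The three cutoff points can be expressed as a small helper. This is a sketch under assumptions: the exact cutoff semantics (e.g. whether "first day" means 24 hours from install) are illustrative, as the paper does not spell them out.

```python
# Sketch of the three feature-window cutoffs described in the text.
# Assumes "day" and "week" mean fixed offsets from the install timestamp,
# which is an illustrative reading, not confirmed by the paper.
from datetime import datetime, timedelta

def feature_window_cutoff(install_time, session_end_times, window):
    """Cutoff timestamp for the 'session', 'day', or 'week' feature window."""
    if window == "session":
        return min(session_end_times)            # end of the first session
    if window == "day":
        return install_time + timedelta(days=1)  # end of the first day
    if window == "week":
        return install_time + timedelta(days=7)  # end of the first week
    raise ValueError(f"unknown window: {window}")

install = datetime(2014, 1, 1, 12, 0)
session_ends = [install + timedelta(hours=1), install + timedelta(days=3)]
print(feature_window_cutoff(install, session_ends, "week"))
```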
We observe all game sessions within the first year of exposure as well as all game rounds within the first 90 days for this single cohort of users. It is important to note that a session corresponds to a unique instance of a player opening the application on his or her device, while all actual gameplay occurs within rounds. It is possible for a player to record a session with no rounds, but all rounds must occur within sessions. In the dataset, 137,397 players installed the game, and 137,244 (or 99.9%) of these players recorded a session (i.e. opened the game on their device) at some point. Of these players, only 94.5% recorded at least one round (i.e. actually played the game).

We restrict the analysis to users registering a game session within the first seven days after installation and playing at least one game round during that first session. These sample restrictions preclude the confounding effects of individuals who install the game but never play, while also allowing for a common sample across our three feature windows. These restrictions reduce our sample size down to roughly 112,000 users. A small number of records with illogical timestamps and/or data values were further excluded prior to feature creation and analysis.

Figure 2: Number of active players by relative day since date of game installation.

Feature definition and engineering

The creation of features that adequately capture user characteristics and behavior is one of the most important aspects of any classification task. We did not have access to in-app purchase or player spending data, so our 18 created features primarily represent installation information and gameplay patterns. Many commonly used measures in the churn prediction literature are represented, as well as several game-specific metrics relevant to our data set. Installation measures include the user's device type (e.g.
phone, tablet), geographic location, and whether or not he/she was referred from a marketing effort (acquired). Gameplay measures focus on play time (total days, total sessions, total rounds, average session duration, average round duration, total elapsed play time), intersession measures (current absence time relative to the end of the feature window, average time between sessions), social interaction (connected friends, player interaction), and round-specific statistics (average moves, average stars, maximum level). Installation-based measures are common across all three feature windows, whereas separate versions of each gameplay measure were created using only the sessions and rounds falling within each feature window.

Heuristic model development

We explore short-term heuristics based on simple decision trees to rapidly predict whether a player will be retained days after game installation. 10-fold cross-validation was used to examine the performance of heuristics based on gameplay data from the first session, day and week. The size of the tree was limited to keep the number of decision rules in each heuristic to 3 or 4. Results show that a day's worth of player information can determine playing behavior a week or more into the future with decent accuracy. Multiple combinations of feature and evaluation windows were tested to investigate the trade-off between data collection times and heuristic performance.

Figure 3: Average number of rounds played per player by relative day since date of game installation.

The key variables used in the 1-day heuristic decisions are number of rounds, current absence time, and maximum level reached. The splits from the tree intuitively show that absence time of more than 20 hours after installation is a reliable determinant of player churn.
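A heuristic of this kind reduces to a handful of nested comparisons. The sketch below is illustrative: the 20-hour absence split comes from the text, but the remaining thresholds and the rule ordering are invented stand-ins, not the tree in Figure 4.

```python
# Illustrative 1-day heuristic with 3 decision rules, in the spirit of
# the Decision Tree heuristic described in the text. Only the 20-hour
# absence split is from the paper; the other thresholds are assumptions.
def one_day_heuristic(rounds_played, absence_hours, max_level):
    """Predict 1 (retained) or 0 (churned) from first-day activity."""
    if absence_hours > 20:        # long absence after install -> churn (paper)
        return 0
    if rounds_played < 10:        # illustrative threshold
        return 0
    return 1 if max_level >= 5 else 0  # illustrative threshold

print(one_day_heuristic(rounds_played=25, absence_hours=3, max_level=8))   # -> 1
print(one_day_heuristic(rounds_played=25, absence_hours=30, max_level=8))  # -> 0
```

A rule set this small can run entirely client-side with no model file, which is the deployment advantage the paper emphasizes.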
To evaluate the robustness of the heuristics, we used an empirical approach to investigate the sensitivity of the heuristics' accuracy with respect to different training data. First, the entire data set is split into ten separate chunks (i.e. mutually exclusive random samples), with one reserved as a test set and the remaining nine used as training samples. Decision trees are then trained separately on each of the nine chunks and tested on the hold-out sample.

We were also interested in whether our decision trees can correctly classify users with playing behaviors similar to those in the hold-out sample. To this end, we introduced "perturbations" to the hold-out sample by mapping each of its users to his/her nearest neighbor (in our feature space) outside of the hold-out sample with the same class label. (We chose this "perturbation" method in order not to make any assumptions about the smoothness of the underlying probability distributions of the classes, churned or retained.)

As an extension, we also investigated whether our decision trees can correctly classify users with playing behavior increasingly dissimilar to those of players in the hold-out sample. To this end, we mapped the players not just to their nearest neighbor as before, but also to the player's i-th (with 0 < i < 10, referred to as the perturbation level below) nearest neighbor in the feature space.

Comparing the results from each training chunk indicated that the performance of the decision trees is not sensitive to changes in the training data; our evidence from the second part of this empirical investigation also suggests that the performance of our decision trees is not particularly sensitive to perturbations in our hold-out sample. The range, mean, and standard deviation of misclassification rates at each perturbation level for a sample run are given in Table 2 below.

Figure 4: One-day heuristic based on Decision Tree model.
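The perturbation scheme can be sketched as follows: each hold-out row is replaced by its i-th nearest same-label neighbor drawn from outside the hold-out sample. This is an illustrative reconstruction on synthetic data; distance metric (Euclidean) and data layout are assumptions.

```python
# Sketch of the paper's "perturbation" of a hold-out sample: map each
# hold-out user to their i-th nearest neighbor (same class label) among
# the non-hold-out users. Euclidean distance is an assumption.
import numpy as np

def perturb_holdout(X_hold, y_hold, X_train, y_train, i=1):
    """Replace each hold-out row with its i-th nearest same-label training row."""
    X_out = np.empty_like(X_hold)
    for k, (x, label) in enumerate(zip(X_hold, y_hold)):
        cand = X_train[y_train == label]          # same class label only
        dist = np.linalg.norm(cand - x, axis=1)
        X_out[k] = cand[np.argsort(dist)[i - 1]]  # i-th nearest (1-indexed)
    return X_out

rng = np.random.default_rng(0)
X_tr = rng.normal(size=(50, 3)); y_tr = rng.integers(0, 2, 50)
X_ho = rng.normal(size=(5, 3));  y_ho = rng.integers(0, 2, 5)
print(perturb_holdout(X_ho, y_ho, X_tr, y_tr, i=1).shape)  # (5, 3)
```

Raising i produces hold-out samples increasingly dissimilar from the originals, which is how the perturbation levels in Table 2 are generated.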
Short-term prediction model development

In this section we present an experimental evaluation of our churn prediction method using three popular machine learning classifiers for each feature window under study. The features relevant to each time period discussed above are used to train each classifier and predict whether or not a player will be retained in the second week after he/she installs the game. We compare the results of Logistic Regression (LR), Support Vector Machines (SVM), and Random Forest (RF) to assess the relative strengths and weaknesses of the different algorithms with respect to our specific data set, three feature windows, and prediction task. We report only key methodological steps and findings, as an in-depth discussion of the classifiers themselves is beyond the scope of this work.

Included raw predictors, two-way interactions, and functional forms for all LR models were initially derived via an AIC-based stepwise search procedure and then fine-tuned by hand using 10-fold cross-validation error to compare candidate models. Hyperparameters for SVM (kernel, cost, gamma) and RF (variables per split, number of trees) models were tuned using a grid search method assessing candidate models with 10-fold cross-validation error. The data were randomly subsampled down to 10,000 observations for tuning to accommodate larger grid sizes and additional candidate comparisons under a reasonable amount of time and resources.

We evaluated the relative and absolute performance of each classifier using 10-fold cross-validation on the full data set. The same cross-validation partitions were used for each of our three models to facilitate fair comparisons between the different classifiers. In addition, we also examined the performance of a simple majority-vote ensemble of the three models to assess the extent to which weaknesses in a single model could be overcome by strengths in the other two.
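The benchmarking setup (three classifiers, shared 10-fold partitions) can be sketched with scikit-learn. Data here are synthetic and the paper's features, tuning grids, and sample sizes are not reproduced; this only illustrates the shared-fold comparison.

```python
# Sketch of comparing LR, SVM, and RF with 10-fold cross-validation on
# identical partitions, as in the paper's evaluation. Synthetic data.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.svm import SVC

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)  # shared folds

models = {
    "LR": LogisticRegression(max_iter=1000),
    "SVM": SVC(),
    "RF": RandomForestClassifier(n_estimators=100, random_state=0),
}
for name, model in models.items():
    acc = cross_val_score(model, X, y, cv=cv, scoring="accuracy").mean()
    print(f"{name}: {acc:.3f}")
```

Fixing the fold object (rather than re-splitting per model) is what makes the per-model accuracies directly comparable.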
As the raw class distribution is 40.5% retained, 59.5% represents the naive baseline accuracy of class-weighted random predictions. With that baseline in mind, we see the models using only a single session of gameplay have little predictive power. Model accuracy improves slightly when using the first day of game activity, and substantially when taking into account the first week.

Feature Window   Evaluation Window   Accuracy   Precision   Recall   F1
1 session        8-14 days           0.613      0.555       0.228    0.323
1 day            8-14 days           0.686      0.639       0.509    0.567
1 day            2-8 days            0.703      0.756       0.738    0.747
1-3 days         4-10 days           0.747      0.787       0.681    0.730
1-7 days         8-14 days           0.786      0.785       0.651    0.712

Table 1: Overview of feature and evaluation windows and the prediction results for each.

                    Holdout  1-NN   2-NN   3-NN   4-NN   5-NN   6-NN   7-NN   8-NN   9-NN
Minimum             0.317    0.310  0.308  0.312  0.310  0.308  0.307  0.309  0.310  0.308
Maximum             0.324    0.316  0.314  0.318  0.314  0.314  0.315  0.316  0.315  0.315
Mean                0.320    0.313  0.312  0.314  0.312  0.311  0.310  0.312  0.312  0.312
Standard Deviation  0.002    0.002  0.002  0.002  0.001  0.002  0.002  0.003  0.002  0.002

Table 2: Minimum, maximum, mean, and standard deviation of misclassification rates for the 1-day heuristic (evaluation window 8-14 days) at each perturbation level.

It is interesting to see that all three models exhibit similar overall accuracy with respect to each feature window, suggesting there may not be a large advantage to modeling nonlinear relationships in our data. However, important differences in the precision/recall trade-off exist across different model types, with the LR models typically exhibiting lower precision and higher recall than the SVM models. The majority-vote ensemble is the best overall performer, but adds little value over any one component model due to the similarity of all three.
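The majority-vote ensemble mentioned above amounts to an element-wise vote over the three models' 0/1 predictions; a minimal sketch:

```python
# Sketch of a simple majority-vote ensemble over per-model prediction
# vectors (0 = churned, 1 = retained), as used in the paper's comparison.
def majority_vote(*prediction_vectors):
    """Element-wise majority vote over equal-length 0/1 prediction lists."""
    return [1 if sum(votes) * 2 > len(votes) else 0
            for votes in zip(*prediction_vectors)]

lr  = [1, 0, 1, 1]
svm = [1, 0, 0, 1]
rf  = [0, 0, 1, 1]
print(majority_vote(lr, svm, rf))  # -> [1, 0, 1, 1]
```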
Analysis and discussion

Model comparison

While the accuracies of the three machine learning algorithms generally exceed those of the simple heuristic-based decision trees, the performance difference between the two approaches is not substantial. With respect to the single-session feature window, the best machine learning algorithm outperformed the simple heuristic tree by only 1.2 percentage points of accuracy and had an F1 score only 0.009 higher. For the single-day window the difference was even smaller: 0.3 percentage points of accuracy and an F1 difference of 0.001. Lastly, using a full week of information, the best machine learning algorithm improved accuracy over the heuristic by 0.6 percentage points and yielded an F1 difference of 0.015. These results indicate that simpler decision rules implemented client-side can be viable for short-term retention prediction in mobile games.

The predictive power of our models falls generally within the range reported by the relevant literature. Looking at the results pertaining to the retention and feature window definitions most closely resembling those used in our experiments, (Hadiji et al. 2014) arrive at retention F1 scores ranging from 0.682 to 0.880 for five different F2P games. The authors use similar machine learning algorithms, but importantly have access to player purchase behavior to augment feature engineering. (Rothenbuehler et al. 2015) examine a 7-day moving-average feature window with a similar retention definition and arrive at AUC values ranging from 79.1 to 79.6 for Neural Net and SVM models. These authors restrict features to generic session data (i.e. do not look at game-specific measures). Calculating the AUC of our 7-day feature window supervised learning ensemble model yields a value of 77.4, very close to the above-mentioned results.
Some caution should be taken in comparing these results directly: each paper defines churn/retention uniquely, uses slightly different feature windows, and analyzes a different set of mobile games.

Feature importance

Understanding the relationships between specific predictors and retention likelihoods helps inform intervention targeting. Toward this, we examine which player characteristics are most strongly related to retention overall and in each feature window. We evaluate these relationships using pairwise predictor-response correlations, logistic regression coefficient and standard error values, and random forest variable importance plots to assess the strength, size, and direction of each relationship.

Total Rounds and Total Playtime have the strongest overall effect on retention for the single-session feature window. Additionally, Average Stars surprisingly has a significant negative relationship with retention. We see a positive relationship for Average Duration and Average Moves, and retention rates also vary by Install Device Type: users installing the game on tablets generally exhibit lower retention relative to phone installations. Despite information from only the player's first session not having much predictive power, the relationships above appear mostly intuitive: those who play longer immediately after installing the game are less likely to churn.

Looking at the single-day and seven-day feature windows, overall playtime and consistent playtime are the main determinants of retention. Total Rounds, Total Sessions, and Average Duration are the strongest positive predictors, whereas Current Absence Time, Average Stars, and Average Time Between Sessions are the strongest negative factors. For the seven-day feature window, Current Absence Time becomes by far the strongest predictor, dominating regression models and random forest variable importance plots.
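Variable importance plots like the one in Figure 5 come from a fitted random forest's impurity-based importances. A minimal sketch on synthetic data, with feature names as illustrative stand-ins for the paper's 18 engineered features:

```python
# Sketch of extracting random-forest variable importances, as used in
# the paper's feature analysis. Data and feature names are illustrative.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

names = ["total_rounds", "current_absence_time", "avg_stars", "avg_duration"]
X, y = make_classification(n_samples=400, n_features=4, n_informative=3,
                           n_redundant=1, random_state=0)
rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# Impurity-based importances sum to 1; rank features by their share.
ranked = sorted(zip(names, rf.feature_importances_),
                key=lambda pair: pair[1], reverse=True)
for name, importance in ranked:
    print(f"{name}: {importance:.3f}")
```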
These results seem to suggest a large number of players churn very soon after installing the game, whereas those who play for longer and on a more consistent basis in the first week are much more likely to be retained in the second. These findings are largely consistent with the wider literature.

Figure 5: Example of feature importance results, here for the single-day random forest model. Note current absence time and number of rounds played as the most important features.

Another interesting finding is that measures related to skill (lower Average Moves, higher Average Stars) are actually inversely related to retention likelihood. This could represent certain players finding the game's initial levels too easy and quickly losing interest. However, the fact that later levels are more difficult and require more moves on average may confound this relationship, as players who immediately lose interest for any reason are unlikely to ever attempt these higher levels.

Ability to identify long-term users

In addition to identifying those users likely to churn rapidly after installing the game, these modeling techniques can also be used to identify long-term, potentially high-value customers. Identifying these customers and delivering targeted monetization strategies may be as or more important than knowing which users are likely to leave soon after installing the game, as an overwhelming proportion of F2P in-app purchases are generated by a very small proportion of players (Sifa et al. 2015; Runge et al. 2014). To approximately identify these long-term and potentially high-value players we look at 60-day retention, i.e. whether the user registers a game round in the period 60-67 days after he/she installs the game.
Although we cannot observe player spending directly with our available data, this long-term retention measure provides a simple definition of those players who are consistently engaged and likely to yield the highest ROI with respect to any targeted interventions.

In our analysis sample, 15.2% of players are categorized as long-term retained using the above definition. When looking at the results from our single-day models, 27.1% of those users predicted as short-term retained continue to play regularly past 60 days of game exposure. For the seven-day models, 31.2% of those players predicted to be short-term retained meet the definition of long-term retention. While these percentages may seem low in an absolute sense, it is useful to benchmark them against the percentage of actual short-term retained players who continue on to be long-term retained. Of players categorized as short-term retained, only 30.9% are additionally categorized as long-term retained, implying the predictions from the short-term models are actually slightly more accurate at identifying long-term players than the short-term class labels themselves. In essence, identifying long-term and potentially high-value players using only the first week of game exposure is a difficult problem.

Feature Window   Modeling Method   Accuracy   Precision   Recall   F1
Single Session   LR                0.623      0.580       0.210    0.308
                 SVM               0.621      0.589       0.173    0.267
                 RF                0.625      0.577       0.233    0.332
                 ENSEMBLE          0.625      0.596       0.197    0.296
First Day        LR                0.684      0.641       0.505    0.565
                 SVM               0.688      0.659       0.480    0.555
                 RF                0.683      0.634       0.515    0.568
                 ENSEMBLE          0.689      0.655       0.492    0.562
First Week       LR                0.785      0.741       0.713    0.727
                 RF                0.789      0.776       0.666    0.717
                 SVM               0.791      0.789       0.655    0.716
                 ENSEMBLE          0.792      0.755       0.677    0.714

Table 3: Results of the evaluation of the relative and absolute performance of each classifier using 10-fold cross-validation, across the three models as well as the majority-vote ensemble of the three models.
Conclusion

Previous work on churn prediction in games has generally focused on mid-length observation and prediction windows, e.g. 3-14 days of observation with prediction windows 7-14 days into the future (Sifa et al. 2015; Runge et al. 2014; Hadiji et al. 2014; Xie et al. 2015). However, in many F2P games substantial churn happens at the very beginning of the gameplay, meaning that the sooner prediction models can be built, the more designers (and educators) can proactively incentivize players to remain active. Prediction is equally interesting in a commercial context as well as from the perspective of human motivational and attentional research.

Here, the feasibility of rapid prediction of player retention in mobile F2P games is investigated, with multiple machine learning models applied across different windows of observation for comparison. The models exhibit similar accuracies within observation windows. This suggests that modeling of non-linear relationships only yields limited benefits. The accuracies of the models vary as a function of the observation window, increasing with the length of the observation period. A further focus of the work presented here has been the introduction of the concept of heuristic models to the prediction of player behavior. It can be concluded that the accuracies of the three advanced classifiers exceed those of a simple heuristic derived from a Decision Tree model, but not substantially. This indicates that retaining players can be successfully identified with a short history of behavioral information and using heuristic prediction approaches. Finally, it suggests that a large part of the value of advanced analytics in games can potentially be accessed by relying on static heuristic models. They are beneficial in being robust, understandable, and easy to deploy and scale.

References

[Artinger et al. 2015] Artinger, F.; Petersen, M.; Gigerenzer, G.; and Weibler, J. 2015.
Heuristics as adaptive decision strategies in management. Journal of Organizational Behavior 36:33-52.

[Chintagunta and Nair 2011] Chintagunta, P. K., and Nair, H. S. 2011. Discrete-choice models of consumer demand in marketing. Marketing Science 25:977-996.

[El-Nasr et al. 2013] El-Nasr et al. 2013. Game Analytics: Maximizing the Value of Player Data. Springer.

[Gigerenzer and Brighton 2009] Gigerenzer, G., and Brighton, H. 2009. Homo heuristicus: Why biased minds make better inferences. Topics in Cognitive Science 1:107-143.

[Goldstein and Gigerenzer 2009] Goldstein, D. G., and Gigerenzer, G. 2009. Fast and frugal forecasting. International Journal of Forecasting 25:760-772.

[Hadiji et al. 2014] Hadiji, F.; Sifa, R.; Drachen, A.; Thurau, C.; Kersting, K.; and Bauckhage, C. 2014. Predicting Player Churn in the Wild. In Proc. of IEEE CIG.

[Mahlmann et al. 2010] Mahlmann, T.; Drachen, A.; Togelius, J.; Canossa, A.; and Yannakakis, G. N. 2010. Predicting Player Behavior in Tomb Raider: Underworld. In Proc. of IEEE CIG.

[Nozhnin 2013] Nozhnin, D. 2013. Predicting Churn: When Do Veterans Quit? Gamasutra.

[Pittman and GauthierDickey 2010] Pittman, D., and GauthierDickey, C. 2010. Characterizing Virtual Populations in Massively Multiplayer Online Role-playing Games. In Proc. of MMM.

[Rothenbuehler et al. 2015] Rothenbuehler, P.; Runge, J.; Garcin, F.; and Faltings, B. 2015. Hidden Markov models for churn prediction. In Proc. of SAI IntelliSys.

[Runge et al. 2014] Runge, J.; Gao, P.; Garcin, F.; and Faltings, B. 2014. Churn Prediction for High-value Players in Casual Social Games. In Proc. of IEEE CIG.

[Runge 2014] Runge, J. 2014. Predictive analytics set to become more valuable in light of rising CPIs. http://www.gamasutra.com/blogs/.

[Sifa, Bauckhage, and Drachen 2014] Sifa, R.; Bauckhage, C.; and Drachen, A. 2014. The Playtime Principle: Large-scale Cross-games Interest Modeling. In Proc. of IEEE CIG.

[Sifa et al.
2015] Sifa, R.; Hadiji, F.; Runge, J.; Drachen, A.; Kersting, K.; and Bauckhage, C. 2015. Predicting Purchase Decisions in Mobile Free-to-Play Games. In Proc. of AAAI AIIDE.

[Thawonmas et al. 2011] Thawonmas, R.; Yoshida, K.; Lou, J.-K.; and Chen, K.-T. 2011. Analysis of revisitations in online games. Entertainment Computing 2(4):215-221.

[Wubben and Wangenheim 2008] Wubben, M., and Wangenheim, F. 2008. Instant customer base analysis: Managerial heuristics often "get it right". Journal of Marketing 72:82-93.

[Xie et al. 2015] Xie, H.; Devlin, S.; Kudenko, D.; and Cowling, P. 2015. Predicting Player Disengagement and First Purchase with Event-frequency Based Data Representation. In Proc. of CIG.

[Yang and Roberts 2014] Yang, P.; Harrison, B.; and Roberts, D. L. 2014. Identifying patterns in combat that are predictive of success in MOBA games. In Proc. of FDG.