Personality Trait Inference Via Mobile Phone Sensors: A Machine Learning Approach
This study provides evidence that personality can be reliably predicted from activity data collected through mobile phone sensors. Employing a set of well informed indicators calculable from accelerometer records and movement patterns, we were able t…
Authors: Wun Yung Shaney Sze, Maryglen Pearl Herrero, Roger Garriga
Wun Yung Shaney Sze
Barcelona School of Economics, Barcelona, Spain
wun.sze@bse.eu

Maryglen Pearl Herrero
Barcelona School of Economics, Barcelona, Spain
maryglen.herrero@bse.eu

Roger Garriga
Koa Health, Barcelona, Spain
roger.garrigacalleja@koahealth.com

January 18, 2024

Abstract

This study provides evidence that personality can be reliably predicted from activity data collected through mobile phone sensors. Employing a set of well-informed indicators calculable from accelerometer records and movement patterns, we were able to predict users’ personality up to a 0.78 F1 score on a two-class problem. Given the fast-growing volume of data collected from mobile phones, our novel personality indicators open the door to exciting avenues for future research in social sciences. Our results reveal distinct behavioral patterns that proved to be differentially predictive of the Big Five personality traits. They potentially enable cost-effective, questionnaire-free investigation of personality-related questions at an unprecedented scale. Overall, this paper shows how a combination of rich behavioral data obtained with smartphone sensing and the use of machine learning techniques can help to advance personality research and can inform both practitioners and researchers about the different behavioral patterns of personality. These findings have practical implications for organizations harnessing mobile sensor data for personality assessment, guiding the refinement of more precise and efficient prediction models in the future.

Introduction

Is there a way for us to predict the personality of a person without them having to take surveys? How much can one know about your personality type just by looking at the way you use your phone?
Psychologists have widely adopted the Big Five Model, a framework encompassing the fundamental personality dimensions of Extraversion, Agreeableness, Conscientiousness, Neuroticism, and Openness. These dimensions capture traits such as gregariousness, trust, competence, emotional stability, and curiosity, respectively (Goldberg, 1992). Traditionally, personality assessments relied on survey questionnaires for quantifying these traits based on individuals’ responses. However, recent research demonstrates that digital behavior traces, including those generated through social media and smartphone usage, can be used to infer an individual’s standing on these personality dimensions (Kosinski et al., 2013; Azucar et al., 2018; Park et al., 2018). The emerging field extends beyond personality trait prediction, encompassing the prediction of mental health states through mobile phone sensors, underscoring the utility of such data in enhancing our understanding and support of individuals’ psychological well-being (Saeb et al., 2016; Stachl et al., 2019).

Traditional self-reported personality assessment has its limitations and can be time-consuming and inaccurate. Technological advances in mobile phones and sensing technology have now created the possibility to automatically record large amounts of data about humans’ natural behavior (Chittaranjan et al., 2013; Farrahi and Gatica-Perez, 2010; Khwaja et al., 2019; Montag et al., 2016; Quercia et al., 2011). The collection and analysis of these records makes it possible to analyze and quantify behavioral differences at unprecedented scale and efficiency. The idea of predicting people’s personalities from their mobile phone data stems from recent advances in data collection, machine learning, and computational social science showing that it is possible to infer various psychological states and traits from the way people use everyday digital technologies.
Exploration of smartphone data’s potential in predicting the Big Five personality traits has been somewhat limited (de Montjoye et al., 2013; Montag et al., 2016; Schoedel et al., 2018). Earlier studies reported relatively high predictive accuracy for these traits but with the constraint of small sample sizes (Chittaranjan et al., 2013; de Montjoye et al., 2013). However, subsequent research, conducted with more extensive participant pools, unveiled diminished predictive performance, exposing prior over-optimism attributable to model overfitting (Schoedel et al., 2018). It is noteworthy that past investigations predominantly focused on communication-related behavior as predictors. Yet smartphones encompass a multitude of functions, and personality traits can manifest in a broader array of behaviors (Funder, 2001; Stachl et al., 2019). Therefore, exploring behavioral patterns across various activities may extend predictive capabilities beyond just Extraversion, aligning with insights gleaned from social media data research (Schwartz et al., 2013; Azucar et al., 2018).

Our objective is twofold: (1) to explore the relationship between personality and human behaviors sensed passively by smartphone sensors; (2) to build predictive models based on previous literature and investigate how they perform over time and in different populations. In our study, we employed a machine learning approach to assess the predictability of self-assessments for the Big Five personality traits using smartphone-derived variables, building upon previous research in this emerging field (Chittaranjan et al., 2013; de Montjoye et al., 2013; Schoedel et al., 2018). Leveraging data collected from 144 participants through smartphone sensors, we derived a comprehensive set of behavioral features, encompassing aspects such as physical activity and daily behavioral patterns.
Our research aims to expand the current understanding of the practical applications of smartphone data in personality trait prediction, while also considering the potential influence of cultural variations (Markus and Kitayama, 1991; Terracciano et al., 2005).

Methods

In this study, we first extract behavioral predictors from a diverse set of daily activities. Second, we use these variables in a machine learning approach for predicting individuals’ self-reported Big Five personality scores, encompassing both overarching factor levels and more specific facets. Third, we analyze our results by looking at which features play a significant role in predicting each distinct dimension of an individual’s personality traits. Finally, we discuss the impact of these variables within the context of prior research findings and outline potential avenues for further confirmatory investigations.

Dataset. The research dataset encompasses 3,282 recorded events spanning from March 2021 to May 2021, involving 144 distinct users who are all students at the London School of Economics (LSE). The data was exclusively gathered from iPhone devices. This dataset contains information on the activities in which users were involved and the duration of their engagement in these activities. The activities encompass five categories: walking, running, cycling, driving, and stationary periods. Additionally, the dataset includes details about the distance covered by the users each day during the recorded time frame, the number of floors they ascended and descended, and the longest period their phones remained untouched.

Furthermore, the dataset encompasses the participants’ responses to the Big Five Inventory (BFI), a comprehensive questionnaire that assessed their Big Five personality traits (John, 1991). This self-report tool comprises 50 items, with responses rated on a 5-point Likert-type scale.
The BFI is a well-established instrument widely employed in personality research, known for its strong psychometric properties (McCrae and Costa Jr, 1997; Sutin et al., 2016). Figure 1 provides a visual representation of the distributions of the five personality traits (Extraversion, Agreeableness, Conscientiousness, Neuroticism, and Openness) across our study cohort, using kernel density estimate plots.

Prediction Target. Our primary objective was to build predictive models for assessing the Big Five personality traits in users. We divided the classification into two parts: a 2-class classification and a 3-class classification. The former aimed to distinguish between two levels within each trait, while the latter categorized individuals into three levels. We determined percentiles individually for each of the five personality target variables.

To create the 2-class, or binary, classification, we set the threshold at the 50th percentile for each variable. Users with values below the 50th percentile (the lower half) are labeled 0, and users with values above it (the upper half) are labeled 1.

For the 3-class, or multiclass, problem, we segment the values within each target variable into three distinct categories, with thresholds at the 33rd and 67th percentiles. Values below the 33rd percentile are assigned the label 0, representing traits in the lower third; values between the 33rd and 67th percentiles receive the label 1, indicating traits in the middle third; and values above the 67th percentile are labeled 2, signifying traits in the upper third.
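The percentile-based labeling described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' code: the function name `percentile_labels`, the example scores, and the treatment of values that fall exactly on a threshold (assigned to the lower class here) are assumptions, since the text does not specify them.

```python
from statistics import quantiles


def percentile_labels(scores, n_classes):
    """Map each trait score to a class in 0..n_classes-1 by percentile bin.

    n_classes=2 uses the 50th percentile as the single threshold;
    n_classes=3 uses (approximately) the 33rd and 67th percentiles.
    """
    # quantiles(..., n=k) returns the k-1 cut points that split the data
    # into k equally populated bins.
    cuts = quantiles(scores, n=n_classes, method="inclusive")
    # The label is the number of thresholds a score strictly exceeds.
    # Assumption: a score exactly equal to a threshold goes to the lower class.
    return [sum(score > cut for cut in cuts) for score in scores]


# Hypothetical trait scores for six users (not data from the study).
example_scores = [10, 20, 30, 40, 50, 60]
binary_labels = percentile_labels(example_scores, 2)
three_class_labels = percentile_labels(example_scores, 3)
```

Applied once per trait, a function of this kind would yield five binary targets and five 3-class targets, one pair for each Big Five dimension.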
A similar approach has been utilized in various other personality prediction work using machine learning (Lima and de Castro, 2014; Teli and Chachoo, 2023).

Figure 1: Distribution of the Big Five Personality Traits in the study population.

Feature Extraction. From the initial dataset, we derived features from physical activity data. This data is obtained from accelerometer records and processed on the phone. In particular, it records the specific activity a participant was engaged in at any given time, such as being stationary, walking, running, driving (automotive), or cycling, as well as the distance covered and floors ascended. This information allows us to infer various patterns, such as identifying extended periods of stationary time, which can be indicative of sleep duration if they happen at night.

The features were built following these steps:

• First, we built two synthetic activities: a physical activity label, which accounts for those instances where the user was either running or cycling, and a non-physical label for the rest of the instances.
• Then, we aggregated the data on a daily basis. This aggregation included the total amount of time a user was doing each activity (including the synthetic physical and non-physical activities), the number of occurrences of each activity, the total distance covered, the floors ascended and the inferred hours of sleep.
• The features were subsequently categorized into two distinct groups: weekdays and weekends. This separation enables us to differentiate and capture behavioral variations that occur during typical working days and leisurely weekends, providing a more comprehensive understanding of participants’ activity across different contexts and routines.
• Finally, we computed the average over weekdays, weekends and overall for each of the variables to summarize the entire history of the user.

Feature Selection.
After the feature extraction step, we obtained 83 features, a relatively high-dimensional dataset given the size of our data. To reduce the dimension, we used recursive feature elimination with cross-validation (RFE-CV) to select the features (Darst et al., 2018). The RFE-CV approach systematically eliminates less important features from a high-dimensional dataset by training a model, ranking feature importance, and removing the lowest-ranked features iteratively. The feature selection process is carried out separately for the Random Forest and XGBoost models and for each of the traits, resulting in distinct sets of features optimized for each of these models. For each model, RFE-CV automatically selects the number of features that results in the best mean score.

Machine Learning Models. To characterize the user and predict personality traits, we employed two main types of classification approaches depending on the prediction target: binary and multiclass. For each trait and modeling approach, we built different machine learning models. Given the relatively high interpretability and the high performance that tree-based models and boosting algorithms show on tabular data (Grinsztajn et al., 2022), we chose two tree-based models, Random Forest and XGBoost (Chen and Guestrin, 2016).

To assess the models’ performance, we employed stratified k-fold cross-validation and computed F1 scores, which combine precision and recall. This involved dividing the dataset into k subsets, or folds, while ensuring that each fold maintained the same distribution of class labels as the original dataset. During each iteration, the model was trained on k-1 folds and evaluated on the remaining fold. This process was repeated k times, with each fold serving as the validation set exactly once. Stratified k-fold is particularly useful for addressing potential issues related to class imbalance.
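The evaluation procedure described above can be illustrated with a small, dependency-free sketch. It is not the authors' pipeline: the helper names (`stratified_folds`, `f1_score`, `cross_validate`), the round-robin fold assignment, the fixed seed, and the `fit`/`predict` callables standing in for Random Forest or XGBoost are all assumptions made for illustration. The paper does not state the value of k or how F1 was averaged in the 3-class setting, so the sketch scores a single positive class only.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k, seed=0):
    """Split sample indices into k folds with similar class proportions."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal each class round-robin so every fold gets its share.
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0  # Convention: no true positives means F1 = 0.
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def cross_validate(features, labels, fit, predict, k=5):
    """Train on k-1 folds, score on the held-out fold, average over folds."""
    scores = []
    for held_out in stratified_folds(labels, k):
        held = set(held_out)
        train = [i for i in range(len(labels)) if i not in held]
        model = fit([features[i] for i in train], [labels[i] for i in train])
        predictions = [predict(model, features[i]) for i in held_out]
        scores.append(f1_score([labels[i] for i in held_out], predictions))
    return sum(scores) / len(scores)
```

Any classifier can be plugged in through `fit` and `predict`; in practice a library implementation of stratified splitting and F1 scoring would normally be preferred over hand-written helpers like these.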
Cross-validation of this kind helps guard against overfitting and ensures that the evaluation results are not skewed by the peculiarities of a single train-test split, especially when dealing with datasets where certain classes may be underrepresented.

We integrated Bayesian search techniques into our modeling pipeline for hyperparameter tuning (Perrone et al., 2019; Lei et al., 2021; Yang and Shami, 2020; Stachl et al., 2019). Bayesian optimization operates on the principle of sequentially exploring and exploiting the hyperparameter space to identify the configuration that optimizes a chosen evaluation metric, the F1 score in our case. This process begins with an initial set of hyperparameter configurations, often determined through random or grid search. Subsequently, Bayesian optimization refines this initial set by iteratively selecting new configurations based on their predicted performance under a probabilistic surrogate model. The surrogate models the relationship between hyperparameters and the chosen evaluation metric, guiding the search towards the most promising regions of the hyperparameter space (Snoek et al., 2012).

Figure 2: Comparison of Personality Traits against BBC Test.

Results

Population Comparison. Our study cohort differs from the general population. Figure 1 shows that the distributions of all personality traits except Extraversion are skewed. We compared the average trait scores of our study cohort to the Big Personality Test dataset collected by the British Broadcasting Corporation (BBC). The BBC test uses a sample of N = 386,375 British residents, mapping the distributions of the Big Five personality traits (Rentfrow et al., 2015).
When juxtaposed with the general British population, shown in Figure 2, LSE students exhibited quantifiable differences. Specifically, LSE students demonstrated significantly higher levels of Openness, with a mean score of 41.51 compared to the BBC dataset’s mean of 36.7. This suggests a propensity for greater receptivity to diverse academic and cultural influences within the LSE academic environment. Additionally, our analysis revealed that LSE students tend to score lower in Neuroticism, recording an average of 21.27 compared to the BBC dataset’s mean of 29.7. This finding implies a potential inclination toward resilience and adaptability fostered by their academic pursuits. Moreover, our cohort displayed higher levels of Agreeableness, registering an average score of 43.74, in contrast to the BBC dataset’s mean of 37.4. This observation implies a greater propensity for cooperative and harmonious social interactions, possibly influenced by their academic and communal experiences. These empirically observed distinctions underscore the unique character of our study cohort and provide crucial context for our ongoing research into predicting Big Five personality traits from mobile phone sensor data within the specific academic setting of the cohort.

Model performance. We present the results of the binary and multiclass classification in Table 1. Considerable differences in the F1 scores across personality traits are observed. Specifically, the binary models exhibit F1 scores ranging between 0.56 and 0.78 across all traits. The highest performance is achieved when predicting Extraversion, with an F1 score of 0.78. Interestingly, Random Forest showed higher performance for Extraversion, Agreeableness and Neuroticism, while XGBoost performed better for Conscientiousness and Openness.

The multiclass models exhibit lower and more varied performance, ranging between 0.25 and 0.47 across all traits.
The highest multiclass F1 score, 0.47, is achieved when predicting Openness (XGBoost) and Neuroticism (Random Forest). For multiclass models, Random Forest showed higher performance in predicting Neuroticism, XGBoost performed better for Agreeableness, Conscientiousness, and Openness, and both models scored the same for predicting Extraversion.

Table 1: Cross-validated F1 scores for out-of-sample classifications.

                     Binary                      Multiclass
Personality trait    Random Forest   XGBoost     Random Forest   XGBoost
EXT                  0.78            0.61        0.39            0.39
AGR                  0.58            0.56        0.36            0.42
CON                  0.64            0.75        0.25            0.33
NEU                  0.61            0.58        0.47            0.36
OPE                  0.58            0.61        0.42            0.47

Feature Analysis. To obtain the top predictive features for each trait, individual models were trained and tested using the dataset split by binary or multiclass levels. Table A1 in the appendix presents the top three features for each model associated with a particular personality trait.

Extraversion was predicted by the time spent stationary and doing outdoor activities, such as automotive (car), running or cycling. Extraverts are known for their sociable and outgoing nature, and this result aligns with the idea that they may engage in more activities outside their home or workplace, leading to reduced stationary time.

Agreeableness was predicted by the ascended and descended floors, average active pace and number of accumulated steps. This may imply that agreeable individuals engage more in physical activities like walking and stair climbing due to their health-conscious nature, preference for routine, and tendency towards social interactions.

Conscientiousness was predicted by sleep-related features, such as time of waking up, and by cycling or accumulated steps during the weekend. This may imply that conscientious individuals are more likely to wake up at certain hours and maintain their physical activity routines even during weekends, reflecting their self-discipline, organization, and commitment to personal goals and health.
Neuroticism was predicted by the number of floors ascended and descended during weekdays, the distance travelled and the duration of physical activities. This may suggest that individuals with higher levels of Neuroticism might use physical activities, like ascending and descending stairs during weekdays, as a way to manage stress and anxiety, or it could reflect their varied response to daily routines and stressors.

Openness was predicted by the amount of sleep that a user gets and by cycling during the weekends. This may indicate that individuals high in Openness, who are often curious and open to new experiences, might prioritize sufficient sleep for cognitive and creative functioning and engage in exploratory activities, leading to traveling by bicycle during the weekend.

Discussion. Our study culminates in several noteworthy findings. Primarily, our results demonstrate the potential of leveraging activity data collected through mobile phone sensors for classifying the Big Five personality traits of students, with binary-classification F1 scores between 0.56 and 0.78. This underscores the viability of utilizing pervasive technology as a conduit to gain insights into individuals’ psychological dispositions (Wu et al., 2015). In an era where smartphones have become ubiquitous companions, the ability to discern personality traits through unobtrusive data collection is a paradigm-shifting advancement in the field of personality psychology.

Each personality trait’s prediction was tied to distinct behavioral patterns observed through the sensors. The finding that Extraversion is best predicted by the time spent in outdoor activities supports the theory that extraverts tend to be more sociable and outgoing, possibly engaging in more external activities, thereby reducing the time spent in stationary states (Srivastava et al., 2008; Lochbaum et al., 2013).
Meanwhile, the linkage between Agreeableness and the number of steps and floors taken highlights the physical aspect of Agreeableness. This suggests that individuals who exhibit more physical activity, such as stair climbing, tend to possess higher levels of Agreeableness, a facet of their personality possibly reflected in their willingness to engage in physical cooperation.

The relationship between Conscientiousness and the number of steps and cycling duration on the weekends may imply that individuals high in Conscientiousness engage more actively in physical activities during their leisure time, reflecting their self-discipline and commitment to personal health and goals, even outside of structured weekday routines. Neuroticism was best predicted by floors descended or ascended on weekdays, and Openness was best predicted by sleep duration and cycling during the weekends.

Our research, confined to analyzing activity data from smartphone sensors within a student population, opens up novel avenues for understanding personality traits. These insights are pivotal in shaping interventions and recommendations for personal development and well-being, tailored to individual lifestyle choices and personality profiles.

Further Research. Despite the valuable insights offered by our study, it is imperative to acknowledge inherent limitations. Our dataset consists of students from LSE, which raises questions about the generalizability of our findings to a more diverse and representative population. As a plausible remedy, future research should prioritize expanding the sample to encompass a broader and more varied demographic spectrum, thereby ensuring that the derived insights retain their applicability across heterogeneous populations.
In light of the observed differences between our university student cohort and the general population, as discerned through comparison with data compiled by the BBC, an in-depth exploration of the demographic biases embedded within our dataset becomes imperative. Further investigations should strive to unravel how the distinctive characteristics of our cohort potentially influence the predictive accuracy of personality traits. Consequently, there is a need for a comprehensive exploration of this domain, elucidating the boundary conditions and extending the generalizability of our findings.

It is also worth considering the range of analogous methodologies and techniques that have been deployed within the landscape of personality prediction. These encompass the utilization of artificial neural network models for classification (Başaran and Ejimogu, 2021), the employment of textual data and their alignment with prevailing personality models (Kunte and Panicker, 2019; Christian et al., 2021), as well as the innovative approach of harnessing graphology and digital handwriting analysis to discern personality traits (Samsuryadi et al., 2023). The integration of these diverse methodologies promises an enhanced understanding of human personality traits.

Our study not only underscores the potential of mobile phone sensor data for personality trait prediction but also highlights the intricate nature of the relationship between personality and lifestyle. This, in turn, lays the groundwork for further research aimed at enriching our understanding of personality prediction.

Acknowledgements

The authors would like to thank Koa Health for providing the dataset and laying the groundwork for this line of inquiry.
The authors would also like to thank Hannes Mueller, Jesús Cerquides, Christian Brownless, and Elliot Motte for their encouragement in this intellectual pursuit.

References

Azucar, D., Marengo, D., and Settanni, M. (2018). Predicting the big 5 personality traits from digital footprints on social media: A meta-analysis. Personality and Individual Differences, 124:150–159.

Başaran, S. and Ejimogu, O. H. (2021). A neural network approach for predicting personality from facebook data. SAGE Open, 11(3):21582440211032156.

Chen, T. and Guestrin, C. (2016). Xgboost: A scalable tree boosting system. In Proceedings of the 22nd acm sigkdd international conference on knowledge discovery and data mining, pages 785–794.

Chittaranjan, G., Blom, J., and Gatica-Perez, D. (2013). Mining large-scale smartphone data for personality studies. Personal and Ubiquitous Computing, 17:433–450.

Christian, H., Suhartono, D., Chowanda, A., and Zamli, K. (2021). Text based personality prediction from multiple social media data sources using pre-trained language model and model averaging. Journal of Big Data, 8.

Darst, B. F., Malecki, K. C., and Engelman, C. D. (2018). Using recursive feature elimination in random forest to account for correlated variables in high dimensional data. BMC Genetics, 19(Suppl 1):65.

de Montjoye, Y.-A., Quoidbach, J., Robic, F., and Pentland, A. (2013). Predicting personality using novel mobile phone-based metrics. In Social Computing, Behavioral-Cultural Modeling and Prediction: 6th International Conference, SBP 2013, Washington, DC, USA, April 2-5, 2013. Proceedings 6, pages 48–55. Springer.

Farrahi, K. and Gatica-Perez, D. (2010). Probabilistic mining of socio-geographic routines from mobile phone data. IEEE Journal of Selected Topics in Signal Processing, 4(4):746–755.

Funder, D. C. (2001). Personality. Annual Review of Psychology, 52(1):197–221. PMID: 11148304.

Goldberg, L. R. (1992). The development of markers for the big-five factor structure. Psychological assessment, 4(1):26.

Grinsztajn, L., Oyallon, E., and Varoquaux, G. (2022). Why do tree-based models still outperform deep learning on typical tabular data? Advances in Neural Information Processing Systems, 35:507–520.

John, O. P. (1991). The big five inventory: versions 4a and 54.

Khwaja, M., Vaid, S. S., Zannone, S., Harari, G. M., Faisal, A. A., and Matic, A. (2019). Modeling personality vs. modeling personalidad: In-the-wild mobile data analysis in five countries suggests cultural impact on personality models. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, 3(3):1–24.

Kosinski, M., Stillwell, D., and Graepel, T. (2013). Private traits and attributes are predictable from digital records of human behavior. Proceedings of the national academy of sciences, 110(15):5802–5805.

Kunte, A. and Panicker, S. (2019). Using textual data for personality prediction: a machine learning approach.

Lee, K., Lee, T. C., Yefimova, M., Kumar, S., Puga, F., Azuero, A., Kamal, A., Bakitas, M. A., Wright, A. A., Demiris, G., et al. (2023). Using digital phenotyping to understand health-related outcomes: A scoping review. International Journal of Medical Informatics, page 105061.

Lei, B., Kirk, T. Q., Bhattacharya, A., et al. (2021). Bayesian optimization with adaptive surrogate models for automated experimental design. npj Computational Materials, 7:194.

Lima, A. C. E. and de Castro, L. N. (2014). A multi-label, semi-supervised classification approach applied to personality prediction in social media. Neural Networks, 58:122–130. Special Issue on “Affective Neural Networks and Cognitive Learning Systems for Big Data Analysis”.

Lochbaum, M., Litchfield, K., Podlog, L., and Lutz, R. (2013). Extraversion, emotional instability, and self-reported exercise: The mediating effects of approach-avoidance achievement goals. Journal of Sport and Health Science, 2(3):176–183.

Markus, H. R. and Kitayama, S. (1991). Culture and the self: Implications for cognition, emotion, and motivation. Psychological review, 98(2):224.

McCrae, R. R. and Costa Jr, P. T. (1997). Personality trait structure as a human universal. American psychologist, 52(5):509.

Montag, C., Duke, É., Markowetz, A., et al. (2016). Toward psychoinformatics: Computer science meets psychology. Computational and mathematical methods in medicine, 2016.

Park, S., Matic, A., Garg, K., and Oliver, N. (2018). When simpler data does not imply less information: a study of user profiling scenarios with constrained view of mobile http(s) traffic. ACM Transactions on the Web (TWEB), 12(2):1–23.

Perrone, V., Shen, H., Seeger, M., Archambeau, C., and Jenatton, R. (2019). Learning search spaces for bayesian optimization: Another view of hyperparameter transfer learning.

Quercia, D., Kosinski, M., Stillwell, D., and Crowcroft, J. (2011). Our twitter profiles, our selves: Predicting personality with twitter. In 2011 IEEE third international conference on privacy, security, risk and trust and 2011 IEEE third international conference on social computing, pages 180–185. IEEE.

Rentfrow, P. J., Jokela, M., and Lamb, M. E. (2015). Regional personality differences in great britain. PLoS One, 10(3):e0122245.

Saeb, S., Lattie, E. G., Schueller, S. M., Kording, K. P., and Mohr, D. C. (2016). The relationship between mobile phone location sensor data and depressive symptom severity. PeerJ, 4:e2537.

Samsuryadi, Kurniawan, R., Supardi, J., Sukemi, Mohamad, F. S., and Yang, G. (2023). A framework for determining the big five personality traits using machine learning classification through graphology. JECE, 2023.

Schoedel, K. A., Szeto, I., Setnik, B., Sellers, E. M., Levy-Cooperman, N., Mills, C., Etges, T., and Sommerville, K. (2018). Abuse potential assessment of cannabidiol (cbd) in recreational polydrug users: A randomized, double-blind, controlled trial. Epilepsy and Behavior, 88:162–171.

Schwartz, H. A., Eichstaedt, J. C., Kern, M. L., Dziurzynski, L., Ramones, S. M., Agrawal, M., Shah, A., Kosinski, M., Stillwell, D., Seligman, M. E., et al. (2013). Personality, gender, and age in the language of social media: The open-vocabulary approach. PloS one, 8(9):e73791.

Snoek, J., Larochelle, H., and Adams, R. P. (2012). Practical bayesian optimization of machine learning algorithms. In Pereira, F., Burges, C., Bottou, L., and Weinberger, K., editors, Advances in Neural Information Processing Systems, volume 25. Curran Associates, Inc.

Srivastava, S., Livingstone, K., and Vallereux, S. (2008). Extraversion and positive affect: A day reconstruction study of person–environment transactions. Journal of Research in Personality - J RES PERSONAL, 42:1613–1618.

Stachl, C., Au, Q., Schoedel, R., Buschek, D., Völkel, S., Schuwerk, T., Oldemeier, M., Ullmann, T., Hussmann, H., Bischl, B., and Buehner, M. (2019). Behavioral patterns in smartphone usage predict big five personality traits.

Suman, C., Gupta, A., Saha, S., and Bhattacharyya, P. (2020). A multi-modal personality prediction system. In Proceedings of the 17th International Conference on Natural Language Processing (ICON), pages 317–322, Indian Institute of Technology Patna, Patna, India. NLP Association of India (NLPAI).

Sutin, A. R., Stephan, Y., Luchetti, M., Artese, A., Oshio, A., and Terracciano, A. (2016). The five-factor model of personality and physical inactivity: A meta-analysis of 16 samples. Journal of Research in Personality, 63:22–28.

Teli, M. A. and Chachoo, M. A. (2023). Pre-trained word embeddings in deep multi-label personality classification of youtube transliterations. In 2023 International Conference on Intelligent Systems, Advanced Computing and Communication (ISACC), pages 1–6.

Terracciano, A., Abdel-Khalek, A. M., Adam, N., Adamovová, L., Ahn, C.-k., Ahn, H.-n., Alansari, B. M., Alcalay, L., Allik, J., Angleitner, A., et al. (2005). National character does not reflect mean personality trait levels in 49 cultures. Science, 310(5745):96–100.

Wang, Y. and Ni, X. S. (2019). A xgboost risk model via feature selection and bayesian hyper-parameter optimization.

Wu, Y., Kosinski, M., and Stillwell, D. (2015). Computer-based personality judgments are more accurate than those made by humans. Proceedings of the National Academy of Sciences, 112:1036–1040.

Yang, L. and Shami, A. (2020). On hyperparameter optimization of machine learning algorithms: Theory and practice. CoRR, abs/2007.15745.
A Appendix

| Task | Model | Trait | 1st most important feature | 2nd most important feature | 3rd most important feature |
|---|---|---|---|---|---|
| Binary | RF | EXT | Stationary Duration (weekday) | Automotive Count (weekday) | Automotive Duration (weekday) |
| Binary | RF | AGR | Floors Ascended (weekend) | Floors Descended (weekend) | Physical Activity Count (weekday) |
| Binary | RF | CON | Hour of Waking Up (weekday) | Automotive Count (weekday) | Physical Activity Count (weekend) |
| Binary | RF | NEU | Floors Ascended (weekend) | Stationary Duration (weekday) | Distance Travelled (weekday) |
| Binary | RF | OPE | Sleep Duration (weekend) | Sleep Duration (weekday) | Stationary Duration (weekday) |
| Binary | XGB | EXT | Accumulated Steps (weekend) | Floors Ascended (weekday) | Automotive Duration (weekday) |
| Binary | XGB | AGR | Floors Ascended (weekend) | Stationary Count (weekday) | Sleep Duration (weekday) |
| Binary | XGB | CON | Cycling Duration (weekday) | Stationary Count (weekend) | Running Duration (weekday) |
| Binary | XGB | NEU | Physical Activity Duration (weekend) | Stationary Duration (weekday) | Cycling Duration (weekday) |
| Binary | XGB | OPE | Automotive Count (weekday) | Cycling Duration (weekday) | Running Duration (weekend) |
| Multiclass | RF | EXT | Stationary Duration (weekday) | Automotive Count (weekday) | Distance Travelled (weekday) |
| Multiclass | RF | AGR | Activity Count for 24h (weekday) | Sleep Duration (weekend) | Floors Ascended (weekend) |
| Multiclass | RF | CON | Hour of Waking Up (weekday) | Hour of Asleep (weekend) | Sleep Duration (weekend) |
| Multiclass | RF | NEU | Distance Travelled (weekday) | Floors Descended (weekday) | Stationary Duration (weekday) |
| Multiclass | RF | OPE | Stationary Duration (weekday) | Sleep Duration (weekday) | Sleep Duration (weekday) |
| Multiclass | XGB | EXT | Running Duration (weekend) | Stationary Duration (weekday) | Cycling Duration (weekday) |
| Multiclass | XGB | AGR | Floors Descended (weekend) | Running Count (weekday) | Floors Ascended (weekday) |
| Multiclass | XGB | CON | Cycling Count (weekend) | Sleep Duration (weekend) | Accumulated Steps (weekend) |
| Multiclass | XGB | NEU | Cycling Duration (weekend) | Walking Count (weekend) | Walking Duration (weekday) |
| Multiclass | XGB | OPE | Cycling Duration Pct (weekend) | Cycling Duration (weekend) | Cycling Count (weekend) |

Table A1: Most important features for binary and multiclass classifications.