Figure - available from: PLOS One
Two churners and two active users. Each row shows the play log records of one user. A user with no play log record in the churn prediction period (CP) is defined as a churner. When desired, a larger CP can be chosen for defining churn.
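The labelling rule in the caption can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `label_churners`, the half-open date intervals, the observation period (OP) placed directly before the CP, and the four sample users are all assumptions made for the example.

```python
from datetime import date, timedelta


def label_churners(play_logs, op_start, op_days, cp_days):
    """Map each user seen in the OP to True (churner) or False (active)."""
    op_end = op_start + timedelta(days=op_days)  # OP is [op_start, op_end)
    cp_end = op_end + timedelta(days=cp_days)    # CP is [op_end, cp_end)
    labels = {}
    for user, play_dates in play_logs.items():
        if not any(op_start <= d < op_end for d in play_dates):
            continue  # no play in the OP, so the user is not labelled
        played_in_cp = any(op_end <= d < cp_end for d in play_dates)
        labels[user] = not played_in_cp  # churner = no play log record in the CP
    return labels


# Hypothetical play dates for four users (not taken from the paper).
play_logs = {
    "u1": [date(2024, 1, 1), date(2024, 1, 3)],
    "u2": [date(2024, 1, 2), date(2024, 1, 20)],
    "u3": [date(2024, 1, 1), date(2024, 1, 7)],
    "u4": [date(2024, 1, 4), date(2024, 1, 5), date(2024, 1, 15)],
}

short_cp = label_churners(play_logs, date(2024, 1, 1), op_days=5, cp_days=10)
long_cp = label_churners(play_logs, date(2024, 1, 1), op_days=5, cp_days=20)
print(short_cp)
print(long_cp)
```

With the 10-day CP, `u1` and `u2` are labelled churners and `u3` and `u4` active. Widening the CP to 20 days turns `u2` into an active user, which is the effect of choosing a larger CP.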
Source publication
Internet-connected devices, especially mobile devices such as smartphones, have become widely accessible in the past decade. Interaction with such devices has evolved into frequent and short-duration usage, and this phenomenon has resulted in a pervasive popularity of casual games in the game sector. On the other hand, development of casual games h...
Similar publications
Nowadays, to obtain information covering urban land, the city is one of the most important and widely used management tools in the study of Earth changes. Classification of images is one of the most common methods of extracting information from remote sensing data. Complex and dense urban areas are one of the problems in the analysis of remote sens...
The measurement of various sources of risk for financial institutions is in a favorable position, owing both to computational advances, and consequently to the wide variety of available models, and to the large number of available variables. Especially within the insurance sector, the classification of customers prone to ...
Loan status prediction is an effective tool for investment decisions in peer-to-peer (P2P) lending market. In P2P lending market, most borrowers fulfill the repayment plan; however, some of them fail to pay back their loans. Therefore, an imbalanced classification method can be utilized to discriminate such default borrowers. In this context, the a...
As one of the most challenging and attractive issues in pattern recognition and machine learning, the imbalanced problem has attracted increasing attention. For two-class data, imbalanced data are characterized by the size of one class (majority class) being much larger than that of the other class (minority class), which makes the constructed mode...
Numerous studies have been carried out to measure wind pressures around circular cylinders since the early 20th century due to its engineering significance. Consequently, a large amount of wind pressure data sets have accumulated, which presents an excellent opportunity for using machine learning (ML) techniques to train models to predict wind pres...
Citations
... Their universal churn model improved player retention strategies through data-driven insights in a competitive gaming market. Kim et al. (2017) focused on churn prediction in mobile and online casual games, using play log data to analyze churn among new players. Their research employed logistic regression, gradient boosting, random forests, CNN, and LSTM models, emphasizing feature selection and the definitions of the observation period (OP) and churn prediction period (CP) for effective churn management. ...
This research presents a prediction of gaming player churn along with a thorough analysis. It employs predictive modeling techniques utilizing machine learning approaches to predict player churn (customer attrition) on gaming platforms. Using real-world gaming data from player demographics, in-game purchases, social interactions, and historical gaming behavior, this study proposes a new framework that integrates data preprocessing, segmentation, and predictive modeling to determine which players will churn. Additionally, it uses Logistic Regression and Random Forest, a powerful ensemble learning algorithm, to estimate player churn within a limited time horizon. We found that this approach accurately identified potential churners through a thorough exploration and understanding of the dataset. This predictive model provides insight into the key factors influencing player attrition, allowing game developers to take countermeasures to prevent churn risks and improve player retention strategies. In addition, Power BI insights highlight the key factors influencing player churn. These findings provide actionable recommendations for game developers to mitigate churn risks and enhance player retention strategies. This study contributes to predicting player turnover in the gaming industry, providing a valuable tool for fostering sustainable growth and profitability.
... Monetizers, who consistently make in-game purchases, are primary revenue drivers for SCG companies, thereby contributing to their financial success (Choi et al., 2023; Lelonek-Kuleta et al., 2021; Zendle et al., 2020). Non-monetizers, although not direct revenue sources, play an essential role by engaging with rewarded advertisements and promoting word-of-mouth marketing (Guo et al., 2022; Teng et al., 2024), which helps expand the player base and creates future opportunities for monetization (Kim et al., 2017b). Therefore, distinguishing and analyzing these groups is both beneficial and necessary. ...
... In addition, Syahrivar et al. (2022) explored compensatory factors influencing the intention to purchase in-game virtual goods, while Balakrishnan and Griffiths (2018) reported that online mobile game addiction was positively correlated with the intention to purchase in-app features. Although some studies have incorporated monetary features (Hadiji et al., 2014; Jeon et al., 2017; Kim et al., 2017b) into analyses of general player behavior, most research treats monetizers and non-monetizers as a homogeneous group (Coussement and De Bock, 2013; Milošević et al., 2017), leaving non-monetizer dynamics underexplored. ...
... Commonly analyzed player profile variables include character growth, demographics, experience, level, location, scores and stars (Coussement and De Bock, 2013; Loria and Marconi, 2021; Perišić and Pahor, 2020). Other studies focus on economic factors such as purchases and transactions (Kim et al., 2017b; Milošević et al., 2017) or installation details (Drachen et al., 2016). Furthermore, social interactions, including friendships, guild memberships, invitations, multiplayer activities and player visits, have also been studied (Zhao et al., 2023). ...
Purpose
Although prior research has employed various variables to predict player churn, the dynamic evolution of the behavioral patterns of players has received limited attention. In this study, churn prediction models are developed by incorporating the progress level, in-game purchase, social interaction, behavioral pattern and behavioral variability (BV) of players in social casino games (SCGs). The study distinguishes churn prediction between two player groups: monetizers and non-monetizers.
Design/methodology/approach
This study employs three machine learning techniques (logistic regression, decision trees and random forests) using real-world player data from an SCG company to construct churn prediction models. Two experiments were conducted. In Experiment 1, BV was combined with four other variable categories to effectively predict churn behaviors across all players (n = 52,246). In Experiment 2, churn prediction models were developed separately for monetizers (n = 16,628) and non-monetizers (n = 35,618).
Findings
The findings from Experiment 1 indicate that incorporating BV significantly improves the overall performance of churn prediction models. Experiment 2 demonstrates that churn prediction models achieve better performance and predictive accuracy for monetizers and non-monetizers when BV is calculated over the 3-day to 7-day and 7-day to 14-day windows, respectively.
Originality/value
This study introduces BV as a novel variable category for churn prediction, emphasizing within-person variability and demonstrating its effectiveness in enhancing model performance. Churn prediction models were independently constructed for monetizers and non-monetizers, utilizing different time windows for variable extraction. This approach improves predictive performance and highlights key differences in critical variables influencing churn across the two player groups. The findings provide valuable insights into churn management strategies tailored for monetizers and non-monetizers.
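One simple way to picture within-person variability is the standard deviation of a single player's daily activity over a trailing window. The sketch below is only an illustration under that assumption: the abstract does not state how BV was computed, and the function name `behavioral_variability`, the daily activity counts, and the choice of the population standard deviation are all hypothetical.

```python
from statistics import pstdev


def behavioral_variability(daily_values, window_days):
    """Population standard deviation of the last `window_days` daily values."""
    if window_days < 1:
        raise ValueError("window_days must be at least 1")
    window = daily_values[-window_days:]  # keep only the trailing window
    if len(window) < 2:
        return 0.0  # a single day carries no variability
    return pstdev(window)


# Hypothetical daily activity counts for two players over eight days.
steady_player = [10, 10, 10, 10, 10, 10, 10, 10]
erratic_player = [2, 4, 4, 4, 5, 5, 7, 9]

bv_steady = behavioral_variability(steady_player, window_days=7)
bv_erratic = behavioral_variability(erratic_player, window_days=8)
print(bv_steady, bv_erratic)
```

Changing `window_days` (for example 3, 7 or 14) gives one BV feature per window, which is one way the different time windows mentioned for monetizers and non-monetizers could be compared.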
... Kim et al. [20] utilized game log data and employed multiple algorithms to predict user churn in mobile and online casual games, achieving an average accuracy of 88%. They found that active duration and the number of games played are effective predictive indicators, emphasizing the importance of in-game behavior features. ...
Although some machine learning methods have been widely applied to customer churn prediction across various fields, few studies have focused on customer churn in card and board mobile games. Moreover, the massive volume of game log data presents new challenges for customer churn prediction. To address this, this paper proposes a customized high-performance solution based on the big data technology, Spark. Initially, Spark is utilized to preprocess the game log data and compute features; subsequently, the neighborhood cleaning rule (NCL) method is employed to handle the issue of class imbalance; finally, an ensemble learning model based on the stacking method is constructed. We validated our approach on a large-scale real dataset (260 GB). The results show that the method proposed in this paper performs optimally, achieving an accuracy of 96.33%, and the entire solution takes only 11 min to execute, meeting the practical requirements for prediction accuracy and data processing capability.
... This metric is essential for understanding the loyalty and retention of a company's customer base. Analyzing CCR can provide valuable insight into the reasons behind customer loss, enabling companies to address the root causes and implement measures to reduce the rate of customer loss (Eardhana, 2023; Wardhana, 2023; Arai et al., 2023; Matuszelański & Kopczewska, 2022; Rudd et al., 2021; Ewieda et al., 2021; Mathai, 2020; Mohammed et al., 2020; Ahn, 2020; Mohammed et al., 2020; Ahn et al., 2020; Kassem et al., 2020; Tsai et al., 2019; Agrawal et al., 2018; Ge et al., 2017; Kim et al., 2017; Abedzadeh & Nematbakhsh, 2012; Saradhi & Palshikar, 2011; Glady et al., 2009; Ghorbani & Taghiyareh, 2009). ...
The customer churn rate (CCR) is also known as lost customers analysis (LCA). CCR is an important metric that companies use to measure how quickly customers end their relationship with the company, or how many customers stop using the company's products or services within a given period. The lower the CCR, the higher the customer retention rate and the more satisfied customers are with the company's products or services. This metric is essential for understanding the loyalty and retention of a company's customer base. Analyzing CCR can provide valuable insight into the reasons behind customer loss, enabling companies to address the root causes and implement measures to reduce the rate of customer loss (Eardhana, 2023; Wardhana, 2023; Arai et al., 2023; Matuszelański & Kopczewska, 2022; Rudd et al., 2021; Ewieda et al., 2021; Mathai, 2020; Mohammed et al., 2020; Ahn, 2020; Mohammed et al., 2020; Ahn et al., 2020; Kassem et al., 2020; Tsai et al., 2019; Agrawal et al., 2018; Ge et al., 2017; Kim et al., 2017; Abedzadeh & Nematbakhsh, 2012; Saradhi & Palshikar, 2011; Glady et al., 2009; Ghorbani & Taghiyareh, 2009).
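As a small worked example, the churn rate for a period is commonly computed as the number of customers lost during the period divided by the number of customers at its start. The sketch below assumes that common definition, which the text above does not spell out. The function names and the sample figures are hypothetical.

```python
def customer_churn_rate(customers_at_start, customers_lost):
    """Share of the starting customer base that left during the period."""
    if customers_at_start <= 0:
        raise ValueError("customers_at_start must be positive")
    if not 0 <= customers_lost <= customers_at_start:
        raise ValueError("customers_lost must be between 0 and customers_at_start")
    return customers_lost / customers_at_start


def customer_retention_rate(customers_at_start, customers_lost):
    """Complement of the churn rate over the same period."""
    return 1.0 - customer_churn_rate(customers_at_start, customers_lost)


# Hypothetical month: 200 customers at the start, 30 of them left.
ccr = customer_churn_rate(200, 30)
crr = customer_retention_rate(200, 30)
print(f"CCR = {ccr:.1%}, CRR = {crr:.1%}")
```

Under this definition a lower CCR corresponds directly to a higher retention rate, since the two sum to one.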
... Analyzing the Customer Churn Rate (CCR) can provide valuable insight into the reasons behind customer loss, enabling businesses to address the root causes and implement measures to reduce the rate of customer loss (Eardhana, 2023; Arai et al., 2023; Matuszelański & Kopczewska, 2022; Rudd et al., 2021; Ewieda et al., 2021; Mathai, 2020; Mohammed et al., 2020; Ahn, 2020; Mohammed et al., 2020; Ahn et al., 2020; Kassem et al., 2020; Tsai et al., 2019; Agrawal et al., 2018; Ge et al., 2017; Kim et al., 2017; Abedzadeh & Nematbakhsh, 2012; Saradhi & Palshikar, 2011; Glady et al., 2009; Ghorbani & Taghiyareh, 2009). ...
In a rapidly evolving business landscape, understanding and meeting consumers' needs and preferences has become a crucial obligation for organizations that want to maintain a competitive edge and ensure long-term success. Measuring consumer satisfaction has emerged as an important strategy, enabling companies to gain valuable insights, refine their offerings, and ultimately build lasting relationships with their target consumers (Wardhana, 2023; Shin & Elliott, 2023; Anh et al., 2022; Mustafidah et al., 2020; Malik & Bhargaw, 2019; Gimpel et al., 2018; Singh, 2018; Ghoumrassi & Ţigu, 2017; Pizam et al., 2016; Yang, 2011; Evanschitzky et al., 2008; Gilbert & Veloutsou, 2006; Das & Canel, 2006; Finn, 2005; Stein & Bowen, 2003; Wilson, 2002; Dubrovski, 2001; Fornell et al., 1996; Halstead, 1993; Pacheco, 1989).
Commonly used consumer satisfaction measurement methods include: Customer Satisfaction Survey (CSS), Focus Group Discussion (FGD), Net Promoter Score (NPS), Mystery Shopping (MS)/Ghost Shopping (GS), Social Media Analytics (SMA), Customer Effort Score (CES), Customer Journey Mapping (CJM), Online Reviews, Loyalty Program Metrics (LPM), Customer Satisfaction Index (CSI), Customer Churn Rate (CCR)/Lost Customers Analysis (LCA), Customer Lifetime Value (CLV), Complaints and Compliments Analysis (CCA), Social Media Listening (SML), Complaint Resolution Index (CRI), Social Media Monitoring (SMM), Customer Retention Rate (CRR), Complaint Analysis (CA), Customer Loyalty Index (CLI), Sentiment Analysis (SA), Customer Observation (CO), Business Performance Matrix (BPM), and Importance Performance Analysis (IPA).
... Machine learning algorithms have been used to predict attrition of new users in several industries (e.g., telecommunication, banking, insurance), including the casual gaming industry (see Kim et al., 2017), whose relatively brief video game sessions on free computer or mobile apps have various parallels with digital mental health interventions (e.g., users can simply not return to the app without notice). In casual gaming, algorithms (including ensemble models, of which random forest is a type) using passive features have predicted early attrition with high accuracy (e.g., 85%-93%; area under receiver operating characteristic curve [AUC] = .72-.82; Kim et al., 2017). Machine learning has also recently been used to predict attrition from outpatient psychotherapy (e.g., Bennemann et al., 2022) and from pharmacotherapy (see Chekroud et al., 2021). ...
Objective: Web-based cognitive bias modification for interpretation (CBM-I) can improve interpretation biases and anxiety symptoms but faces high rates of dropout. This study tested the effectiveness of web-based CBM-I relative to an active psychoeducation condition and the addition of low-intensity telecoaching for a subset of CBM-I participants. Method: 1,234 anxious community adults (Mage = 35.09 years, 81.2% female, 72.1% white, 82.6% not Hispanic) were randomly assigned at Stage 1 of a sequential, multiple-assignment randomized trial to complete five weekly sessions of CBM-I or psychoeducation on our team’s public research website. After the first session, for Stage 2, an algorithm attempted to classify CBM-I participants as higher (vs. lower) risk for dropping out; those classified as higher risk were then randomly assigned to complete four brief weekly telecoaching check-ins (vs. no coaching). Results: As hypothesized (https://doi.org/j2xr; Daniel, Eberle, & Teachman, 2020), CBM-I significantly outperformed psychoeducation at improving positive and negative interpretation biases (Recognition Ratings, Brief Body Sensations Interpretation Questionnaire) and anxiety symptoms (Overall Anxiety Severity and Impairment Scale, Anxiety Scale from Depression Anxiety Stress Scales–Short Form), with smaller treatment gains remaining significant at 2-month follow-up. Unexpectedly, CBM-I had significantly worse treatment dropout outcomes than psychoeducation, and adding coaching (vs. no coaching) did not significantly improve efficacy or dropout outcomes (notably, many participants chose not to interact with their coach). Conclusions: Web-based CBM-I appears effective, but supplemental coaching may not mitigate the challenge of dropout.
... Churn prediction is a specialized domain within data analytics that uses ML algorithms to detect users likely to discontinue a product or service [21]. It has found application across numerous industries, notably telecommunications [50], financial services [73], gaming [36], and e-commerce [25]. Recently, there's been a surge in interest in applying churn prediction to mHealth apps. ...
Digital health interventions (DHIs) offer promising solutions to the rising global challenges of noncommunicable diseases by promoting behavior change, improving health outcomes, and reducing healthcare costs. However, high churn rates are a concern with DHIs, with many users disengaging before achieving desired outcomes. Churn prediction can help DHI providers identify and retain at-risk users, enhancing the efficacy of DHIs. We analyzed churn prediction models for a weight loss app using various machine learning algorithms on data from 1,283 users and 310,845 event logs. The best-performing model, a random forest model that only used daily login counts, achieved an F1 score of 0.87 on day 7 and identified an average of 93% of churned users during the week-long trial. Notably, higher-dimensional models performed better at low false positive rate thresholds. Our findings suggest that user churn can be forecasted using engagement data, aiding in timely personalized strategies and better health results.
... Specifically, average account balance, customer relationship duration, and transaction frequency were identified as the most relevant financial variables. Likewise, behavioural features have also been recognised as important factors for churn predictions in various industries, including online gaming [59], [63], [64], telecommunication [65], [66] and the financial sector [67]. ...
... L. Kim et al. observed that social relationships among game players and irregularities in time-spending have a direct relationship with customer churn, and particularly, the latter increases the probability of churn [70]. Kim et al. [63] observed that active duration, play count, win ratio and purchase count are also highly relevant features for churn prediction. Additionally, RFM-based behavioural features have been recognised as important churn predictors [71], [72] and an optimal method for analysing customer behaviour [45]. ...
... QoS [84], [106], [116], [117]: QoS variables, although uncommon among various industries, can enhance predictability when combined with other features. Activities, Logs and Time Spending Behaviour [59], [60], [63]-[65], [68]-[70], [80], [118], [119]: Predominantly used in the online gaming industry, these variables include total time spent in-game, playing frequency, and game log data. ...
Due to market deregulation and globalisation, competitive environments in various sectors continuously evolve, leading to increased customer churn. Effectively anticipating and mitigating customer churn is vital for businesses to retain their customer base and sustain business growth. This research scrutinizes 214 published articles from 2015 to 2023, delving into customer churn prediction using machine learning methods. Distinctive in its scope, this work covers the key stages of churn prediction models comprehensively, in contrast to published reviews, which focus on only some aspects of churn prediction, such as model development, feature engineering and model evaluation using traditional machine learning-based evaluation metrics. The review emphasises the incorporation of features such as demographic, usage-related, and behavioural characteristics, as well as features capturing customer social interaction, communication graphs and customer feedback, while focusing on popular sectors such as telecommunication, finance, and online gaming when producing newer datasets or developing a predictive model. Findings suggest that the profitability aspect of churn prediction models is under-researched, and the review advocates using profit-based evaluation metrics to support decision-making, improve customer retention, and increase profitability. Finally, this research concludes with recommendations that advocate the use of ensembles and deep learning techniques, as well as the adoption of explainable methods to drive further advancements.
... In addition to these baseline user characteristics, user clinical functioning (ie, current symptoms and psychological processes that lead to the maintenance of these symptoms), self-reported user context and reactions to interventions (eg, perceived credibility of DMHIs, which is associated with increased engagement and reduced dropout [10]), and passively detected user behavior influence attrition rates in digital platforms [15,31]. This behavior includes time spent using an intervention [38,49,50], the passively detected context (eg, time of the day and day of the week) [49], and type of technology (eg, web, smartphone, computer based, or wearable) [20,51]. ...
... Researchers in computer and data science and the mobile gaming industry more commonly leverage passively collected behavioral data from users and have found success in predicting attrition ("churn") using more advanced techniques, such as linear mixed modeling [37], survival analysis [38], and probabilistic latent variable modeling [36]. More recently, advanced machine learning models, such as deep neural networks, have also been useful for modeling and predicting attrition in mobile gaming [38,50,56,57] and in digital health care applications [20,58]. Our approach builds on work predicting attrition in DMHIs [37,45,47,54,58-60] and incorporates both passively collected behavioral data and self-reported data [1,17,31,60-63]. ...
Background
Digital mental health is a promising paradigm for individualized, patient-driven health care. For example, cognitive bias modification programs that target interpretation biases (cognitive bias modification for interpretation [CBM-I]) can provide practice thinking about ambiguous situations in less threatening ways on the web without requiring a therapist. However, digital mental health interventions, including CBM-I, are often plagued with lack of sustained engagement and high attrition rates. New attrition detection and mitigation strategies are needed to improve these interventions.
Objective
This paper aims to identify participants at a high risk of dropout during the early stages of 3 web-based trials of multisession CBM-I and to investigate which self-reported and passively detected feature sets computed from the participants interacting with the intervention and assessments were most informative in making this prediction.
Methods
The participants analyzed in this paper were community adults with traits such as anxiety or negative thinking about the future (Study 1: n=252, Study 2: n=326, Study 3: n=699) who had been assigned to CBM-I conditions in 3 efficacy-effectiveness trials on our team’s public research website. To identify participants at a high risk of dropout, we created 4 unique feature sets: self-reported baseline user characteristics (eg, demographics), self-reported user context and reactions to the program (eg, state affect), self-reported user clinical functioning (eg, mental health symptoms), and passively detected user behavior on the website (eg, time spent on a web page of CBM-I training exercises, time of day during which the exercises were completed, latency of completing the assessments, and type of device used). Then, we investigated the feature sets as potential predictors of which participants were at high risk of not starting the second training session of a given program using well-known machine learning algorithms.
Results
The extreme gradient boosting algorithm performed the best and identified participants at high risk with macro–F1-scores of .832 (Study 1 with 146 features), .770 (Study 2 with 87 features), and .917 (Study 3 with 127 features). Features involving passive detection of user behavior contributed the most to the prediction relative to other features. The mean Gini importance scores for the passive features were as follows: .033 (95% CI .019-.047) in Study 1; .029 (95% CI .023-.035) in Study 2; and .045 (95% CI .039-.051) in Study 3. However, using all features extracted from a given study led to the best predictive performance.
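For readers unfamiliar with the metric, the macro F1-score is the unweighted mean of the per-class F1-scores, so the minority class (here, participants at high risk of dropout) counts as much as the majority class. The sketch below computes it from scratch. The labels are made-up toy data, not results from the studies, and the function names are hypothetical.

```python
def f1_for_class(y_true, y_pred, positive):
    """F1-score when `positive` is treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1-scores."""
    classes = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_class(y_true, y_pred, c) for c in classes) / len(classes)


# Toy labels: 1 = high risk of dropout, 0 = lower risk (illustrative only).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]

f1_high_risk = f1_for_class(y_true, y_pred, positive=1)
f1_low_risk = f1_for_class(y_true, y_pred, positive=0)
score = macro_f1(y_true, y_pred)
print(round(f1_high_risk, 3), round(f1_low_risk, 3), round(score, 3))
```

In this toy case the per-class scores are 2/3 and 0.8, and the macro F1-score is their plain average.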
Conclusions
These results suggest that using passive indicators of user behavior, alongside self-reported measures, can improve the accuracy of prediction of participants at a high risk of dropout early during multisession CBM-I programs. Furthermore, our analyses highlight the challenge of generalizability in digital health intervention studies and the need for more personalized attrition prevention strategies.
... age), subscription-describing variables (e.g. the length of the current subscription), and so on. Kim et al. (2017) conducted customer churn prediction for three casual games by analyzing the behavioral play log records. Li et al. (2021) forecast churn behavior using variables such as spending amount and spending habits in the traditional broadcasting industry. ...
Churn prediction on imbalanced data is a challenging task. Ensemble solutions exhibit good performance in dealing with class imbalance but fail to improve the profit-oriented goal in churn prediction. This paper attempts to develop a new bagging-based selective ensemble paradigm for profit-oriented churn prediction in class imbalance scenarios. The proposed approach exploits an over-produce and choose strategy, which uses a cost-weighted negative binomial distribution to generate training subsets and a cost-sensitive logistic regression with a lasso penalty to combine base classifiers selectively. Extensive experiments were carried out on ten real-world data sets exhibiting a high level of imbalance from the telecommunication industry. The experimental results show that our proposed method obtains better performance than the other twelve state-of-the-art ensemble solutions for class imbalance in both accuracy-based and profit-based measures. Our research provides a new ensemble tool for imbalanced churn prediction for both academicians and practitioners.