Article

A Mathematics Model of Customer Churn Based on PCA Analysis


Abstract

Customer churn analysis and prediction is an important part of customer relationship management (CRM). Because of discrepancies in collection channels and data-gathering procedures, raw customer data are imprecise, unbalanced and high-dimensional, which degrades model performance. Customer retention and customer acquisition are two pillars with great influence on the bottom line, compared with increasing market share, reducing unit costs, and other competitive tools. To address this problem, the paper proposes a prediction model based on principal component analysis (abbr. PCA) and the least squares support vector machine (abbr. LS-SVM). The procedure comprises two steps. In the first step, PCA compresses the raw input data and extracts features, which achieves de-correlation. In the second step, the samples are used to train the LS-SVM and establish a customer churn forecasting model. In this way the two algorithms are combined and their advantages fully exploited. Case studies are applied to test the proposed model.
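The two-step PCA + LS-SVM pipeline the abstract describes can be sketched in Python/NumPy. This is a minimal illustration on synthetic data: the RBF kernel, its width, the regularization value, and the number of retained components are assumptions, and the LS-SVM solver follows the standard linear-system formulation rather than the authors' exact implementation.

```python
import numpy as np

def rbf_kernel(A, B, sigma=1.0):
    # Pairwise squared distances, then a Gaussian kernel.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def lssvm_train(X, y, gamma=10.0, sigma=1.0):
    # Equality constraints turn training into one linear system:
    # [ 0   y^T           ] [b]     [0]
    # [ y   Omega + I/g   ] [alpha] = [1]
    n = len(y)
    Omega = np.outer(y, y) * rbf_kernel(X, X, sigma)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = y
    A[1:, 0] = y
    A[1:, 1:] = Omega + np.eye(n) / gamma
    rhs = np.concatenate(([0.0], np.ones(n)))
    sol = np.linalg.solve(A, rhs)
    return sol[0], sol[1:]                      # bias b, coefficients alpha

def lssvm_predict(X, y, b, alpha, Xnew, sigma=1.0):
    K = rbf_kernel(Xnew, X, sigma)
    return np.sign(K @ (alpha * y) + b)

# Toy churn-like data: 2 informative dimensions hidden among 8 noisy ones.
rng = np.random.default_rng(0)
n = 200
latent = rng.normal(size=(n, 2))
y = np.where(latent[:, 0] + latent[:, 1] > 0, 1.0, -1.0)
X = np.hstack([latent, rng.normal(scale=0.1, size=(n, 8))])

# Step 1: PCA compresses the 10-dim input and de-correlates the features.
Xc = X - X.mean(0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
Z = Xc @ Vt[:2].T                               # keep 2 principal components

# Step 2: train the LS-SVM on the compressed features.
b, alpha = lssvm_train(Z, y)
acc = (lssvm_predict(Z, y, b, alpha, Z) == y).mean()
print(f"training accuracy: {acc:.2f}")
```

Because the LS-SVM optimality conditions are linear, training reduces to a single call to `np.linalg.solve` instead of the quadratic program a classical SVM requires.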


Chapter
This article presents research in progress on customer-focused churn prevention. The research aims to provide a customer-centric methodology for churn prevention based on customer social data (social CRM), as a significant complement to an approach based on customer data derived from calling activities (e.g. discovering calling patterns) in Orange systems such as its CRM. Within the scope of the presented research, a customer's churn tendency is inferred as knowledge from messages posted on a company portal (such as Orange's) by discovering the emotions, opinions and sentiment of those messages. At the present stage of the research, a literature review has been completed, together with a first conceptual model derived from it to underpin the role of social CRM in churn prediction, and a research agenda has been elaborated. The article presents a concept for measuring customer churn tendency based on customer experience, in particular customer satisfaction. The work considers a business case in the telecom industry, an example of an industry for which reducing customer churn is a fundamental requirement. It is a work in progress concerned with analysing data from a social media channel of one telecom company (Orange), aiming to discover signals hidden in textual messages that can be signs of potential churn.
Conference Paper
With the development of Web 2.0, traditional customers have increasingly moved to online purchasing and created a large volume of User Generated Content (UGC) on the Internet, which poses great challenges to traditional customer relationship management and has attracted many scholars' attention to customer reviews. Most previous research focuses on the influence of customers' review behavior on their purchase behavior, but few studies explore the impact in the reverse direction. In this paper, our study seeks insight into the impact of a customer's purchase behavior on their review behavior, and into how this effect can be exploited to predict customer review churn in the next stage. Based on data from Dianping.com, a well-known website that combines review and purchase platforms, we build a logit regression model that considers the customer's own characteristics, review behavior and purchase behavior, and we identify the impact of purchase behavior on review behavior. Finally, we use ten-fold cross-validation to establish the stability of our model. Our study can provide a theoretical basis for research on User Generated Content.
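The logit-plus-ten-fold-validation step described above can be sketched with scikit-learn. The data are purely synthetic and the feature names (`purchase_count`, `avg_spend`, `past_reviews`) are hypothetical stand-ins for the Dianping.com variables, as is the assumed relationship between purchasing and review churn.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for Dianping-style data: purchase-behavior features
# (hypothetical names) predicting next-stage review churn.
rng = np.random.default_rng(1)
n = 500
purchase_count = rng.poisson(5, n)
avg_spend = rng.gamma(2.0, 30.0, n)
past_reviews = rng.poisson(3, n)
# Assumed (illustrative) relationship for generating labels.
logit = 1.5 - 0.3 * purchase_count - 0.01 * avg_spend - 0.2 * past_reviews
churn = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

X = np.column_stack([purchase_count, avg_spend, past_reviews])
model = LogisticRegression(max_iter=1000)

# Ten-fold cross-validation, as in the paper, to check model stability.
scores = cross_val_score(model, X, churn, cv=10)
print(f"10-fold accuracy: {scores.mean():.2f} +/- {scores.std():.2f}")
```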
Conference Paper
Full-text available
As markets become increasingly saturated, companies have acknowledged that their business strategies should focus on identifying customers who are likely to churn. Customer churn analysis and prediction is an important part of Customer Relationship Management (CRM). With the purpose of retaining customers, academics as well as practitioners find it crucial to build a churn prediction model that is as accurate as possible. At the same time, the development of automated data collection tools and the imperative need to interpret and exploit massive data volumes have driven the development and flourishing of data mining techniques. The Support Vector Machine (SVM) is one of the most effective machine learning algorithms and successfully solves classification problems in many domains. In this paper, a hybrid model combining genetic algorithms with an SVM classifier is employed to predict customer churn in the Mondrian food mart department store. The procedure comprises two steps. In the first step, a genetic algorithm determines the SVM parameter values while discovering a subset of features; in the second step, the customer churn prediction model is established with the determined parameters. Empirical results are compared with classifiers such as a plain SVM, decision trees, case-based reasoning and artificial neural networks, and indicate that the predictive performance of the hybrid model is more effective in this specific case.
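A toy sketch of the hybrid GA + SVM scheme, assuming synthetic data and an intentionally small genetic algorithm (population 12, 8 generations). The chromosome layout (12 feature bits plus two grid indices for `C` and `gamma`), the parameter grids, and the mutation rate are illustrative choices, not the paper's settings.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(5)
X, y = make_classification(n_samples=300, n_features=12, n_informative=4,
                           n_redundant=4, random_state=5)

def fitness(chrom):
    # chromosome = 12 feature bits + indices into the C and gamma grids
    mask = chrom[:12].astype(bool)
    if not mask.any():
        return 0.0
    C = [0.1, 1, 10, 100][chrom[12]]
    gamma = [0.01, 0.1, 1][chrom[13]]
    svm = SVC(C=C, gamma=gamma)
    return cross_val_score(svm, X[:, mask], y, cv=3).mean()

def random_chrom():
    return np.concatenate([rng.integers(0, 2, 12),
                           [rng.integers(0, 4)], [rng.integers(0, 3)]])

pop = [random_chrom() for _ in range(12)]
for gen in range(8):
    ranked = sorted(pop, key=fitness, reverse=True)
    parents = ranked[:6]                         # elitist selection
    children = []
    for _ in range(6):
        a, b = rng.choice(6, 2, replace=False)
        cut = rng.integers(1, 13)                # single-point crossover
        child = np.concatenate([parents[a][:cut], parents[b][cut:]])
        if rng.random() < 0.3:                   # bit-flip mutation
            i = rng.integers(0, 12)
            child[i] = 1 - child[i]
        children.append(child)
    pop = parents + children

best = max(pop, key=fitness)
print("selected features:", np.flatnonzero(best[:12]))
print("best CV accuracy: %.2f" % fitness(best))
```

The GA thus searches the feature subset and the SVM parameters jointly, which is the point of the hybrid model: a feature mask and a (C, gamma) pair are evaluated together by cross-validated accuracy.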
Conference Paper
Full-text available
Identifying customers with a higher probability of leaving a merchant (churn customers) is a challenging task for sellers. In this paper, we propose a system able to detect churner behavior and to assist merchants in delivering special offers to their churn customers. Two main goals lead our work: on the one hand, the definition of a classifier to perform churn analysis and, on the other, the definition of a framework that can be enriched with social information, supporting the merchant in marketing actions that can reduce the probability of losing those customers. Experimental results on an artificial and a real dataset show increased classification accuracy when random forests or decision trees are used.
Article
Full-text available
Principal component analysis (PCA) is one of the statistical techniques frequently used in signal processing for data dimension reduction or data decorrelation. The presented paper deals with two distinct applications of PCA in image processing. The first application consists in image colour reduction, where the three colour components are reduced to one containing the major part of the information. The second use of PCA takes advantage of eigenvector properties to determine the orientation of a selected object; various methods can be used for the preceding object detection, and the quality of the image segmentation likewise affects the results of the subsequent PCA-based orientation evaluation. The paper first briefly introduces PCA theory and then presents the applications mentioned above, with results documented for selected real pictures.
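The second application, object orientation from eigenvectors, can be illustrated with a synthetic point set standing in for a segmented object; the blob shape and the 30-degree rotation are assumptions made for the example.

```python
import numpy as np

# Orientation of a point set from the leading PCA eigenvector:
# an elongated synthetic "object" rotated by a known angle.
rng = np.random.default_rng(2)
pts = rng.normal(size=(400, 2)) * np.array([5.0, 0.5])   # long axis along x
theta = np.deg2rad(30)
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
pts = pts @ R.T                                          # rotate by 30 degrees

cov = np.cov(pts.T)
eigvals, eigvecs = np.linalg.eigh(cov)                   # ascending eigenvalues
major = eigvecs[:, -1]                                   # largest-variance axis
angle = np.degrees(np.arctan2(major[1], major[0])) % 180
print(f"estimated orientation: {angle:.1f} degrees")
```

The eigenvector belonging to the largest eigenvalue of the point-coordinate covariance matrix points along the object's major axis, so its angle recovers the applied rotation.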
Article
Full-text available
In this letter we discuss a least squares version of support vector machine (SVM) classifiers. Due to equality-type constraints in the formulation, the solution follows from solving a set of linear equations, instead of the quadratic programming required for classical SVMs. The approach is illustrated on a two-spiral benchmark classification problem.
Article
Full-text available
The support vector machine (SVM) is a method for classification and for function approximation. It commonly uses an ε-insensitive cost function, meaning that errors smaller than ε go unpenalized. As an alternative, a least squares support vector machine (LSSVM) uses a quadratic cost function. When the LSSVM method is used for function approximation, a non-sparse solution is obtained. Sparseness is imposed by pruning, i.e., recursively solving the approximation problem and then omitting data that had a small error in the previous pass. However, a small approximation error in the previous pass does not reliably predict what the error will be after the sample has been omitted. In this paper, a procedure is introduced that selects from a data set the training sample that will introduce the smallest approximation error when it is omitted. It is shown that this pruning scheme outperforms the standard one.
Article
Customer churn problems concern telecom operators as the market becomes more competitive. Based on data mining technology, a Bayesian network classifier is used to analyse these problems. During Bayesian network modeling, the K2 and MCMC algorithms are used together. Effective variables are distilled through the network topology, churn rules are drawn from the CPT (conditional probability table), and high-probability churn customer groups are then obtained. Considering the loss function of the classifier, different criteria and their classification effects are provided. In contrast with other algorithms, such as decision trees and ANNs (artificial neural networks), Bayesian networks can be modeled without over-sampling when the churn rate is relatively low.
Article
Despite their diverse applications in many domains, neural networks are difficult to interpret owing to the lack of mathematical models expressing the training result. When adopting rule extraction methods to develop algorithms, many researchers first simplify a network's structure and then extract rules from the simplified network; this limits such conventional approaches when attempting to remove unnecessary connections. In addition to developing network pruning and extraction algorithms, this work attempts to determine the important input nodes. In the proposed algorithms, the input data are not limited to binary, discrete or continuous types. Two numerical examples are analyzed, and comparing the results of the proposed algorithms with those from See5 demonstrates their effectiveness.
Chapter
In this chapter we consider bounds on the rate of uniform convergence. We consider upper bounds (there exist lower bounds as well (Vapnik and Chervonenkis, 1974); however, they are not as important for controlling the learning processes as the upper bounds).
Book
Setting of the learning problem; consistency of learning processes; bounds on the rate of convergence of learning processes; controlling the generalization ability of learning processes; constructing learning algorithms; what is important in learning theory?
Article
By using a binomial logit model based on a survey of 973 mobile users in Korea, the determinants of subscriber churn and customer loyalty are identified in the Korean mobile telephony market. The probability that a subscriber will switch carrier is dependent on the level of satisfaction with alternative-specific service attributes including call quality, tariff level, handsets, brand image, as well as income, and subscription duration. However, only factors such as call quality, handset type, and brand image affect customer loyalty as measured by the intention/non-intention to recommend the service provider to other people. The insignificance of subscription duration in affecting the loyalty-induced action indicates that lock-in effects are likely to be concentrated among the "spuriously loyal" customers who are not willing to churn just because of switching costs. These findings provide implications for the mobile business as well as competition policies for the mobile telephony market.
Conference Paper
The quantitative method known as credit scoring has been developed for the credit assessment problem. Credit scoring is essentially an application of classification techniques, which sort credit customers into different risk groups. Financial institutions are increasingly obliged to build credit scoring models that assess the risk of default of their clients. The support vector machine is a promising technique that has recently emerged and become popular for data classification. Least squares support vector machines (LS-SVM) are reformulations of standard SVMs: the cost function is a regularized least squares function with equality constraints, and the solution can be found efficiently by an iterative method such as the conjugate gradient algorithm. In this paper, the least squares support vector machine is applied to credit scoring with data from Thai financial institutions. The optimal model is able to divide customers into four groups, very good, rather good, suspiciously bad and very bad, with high accuracy.
Article
Taiwan deregulated its wireless telecommunication services in 1997. Fierce competition followed, and churn management becomes a major focus of mobile operators to retain subscribers via satisfying their needs under resource constraints. One of the challenges is churner prediction. Through empirical evaluation, this study compares various data mining techniques that can assign a ‘propensity-to-churn’ score periodically to each subscriber of a mobile operator. The results indicate that both decision tree and neural network techniques can deliver accurate churn prediction models by using customer demographics, billing information, contract/service status, call detail records, and service change log.
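The 'propensity-to-churn' score mentioned above can be sketched with a decision tree: the predicted churn probability serves as the score, and ranking subscribers by it targets retention actions. The data here are synthetic stand-ins for the demographic, billing and usage attributes listed, and the tree depth and class balance are illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for subscriber records (billing, usage, contract).
X, y = make_classification(n_samples=600, n_features=8, n_informative=5,
                           weights=[0.85], random_state=6)   # ~15% churners
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=6)

tree = DecisionTreeClassifier(max_depth=4, random_state=6).fit(Xtr, ytr)

# The 'propensity-to-churn' score is the predicted churn probability.
scores = tree.predict_proba(Xte)[:, 1]
top10 = np.argsort(scores)[::-1][:10]        # ten highest-risk subscribers
print("mean score of top ten picks: %.2f" % scores[top10].mean())
```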
Article
CRM gains increasing importance due to intensive competition and saturated markets. With the purpose of retaining customers, academics as well as practitioners find it crucial to build a churn prediction model that is as accurate as possible. This study applies support vector machines in a newspaper subscription context in order to construct a churn model with higher predictive performance. Moreover, a comparison is made between two parameter-selection techniques needed to implement support vector machines, both based on grid search and cross-validation. Afterwards, the predictive performance of both kinds of support vector machine models is benchmarked against logistic regression and random forests. Our study shows that support vector machines generalize well when applied to noisy marketing data. Nevertheless, the parameter optimization procedure plays an important role in the predictive performance: only when the optimal parameter-selection procedure is applied do support vector machines outperform traditional logistic regression, whereas random forests outperform both kinds of support vector machines. As a substantive contribution, an overview of the most important churn drivers is given. Unlike in much prior research, monetary value and frequency do not play an important role in explaining churn in this subscription-services application. Even though most of the important churn predictors belong to the category of variables describing the subscription, the influence of several client/company-interaction variables cannot be neglected.
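The parameter-selection and benchmarking procedure described above can be sketched with scikit-learn. The data are synthetic and the parameter grid is illustrative, not the paper's; the point is the pattern: grid search with cross-validation to tune the SVM, then cross-validated comparison against logistic regression and a random forest.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.svm import SVC

# Noisy synthetic stand-in for subscription churn data.
X, y = make_classification(n_samples=400, n_features=10, n_informative=4,
                           flip_y=0.1, random_state=3)

# Parameter selection for the SVM via grid search + cross-validation.
grid = GridSearchCV(SVC(), {"C": [0.1, 1, 10], "gamma": ["scale", 0.1]}, cv=5)
grid.fit(X, y)

# Benchmark the tuned SVM against logistic regression and random forests.
models = {"svm": grid.best_estimator_,
          "logit": LogisticRegression(max_iter=1000),
          "rf": RandomForestClassifier(random_state=3)}
for name, m in models.items():
    acc = cross_val_score(m, X, y, cv=5).mean()
    print(f"{name}: {acc:.2f}")
```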
Conference Paper
To address the prediction of customer churn, this paper proposes a new algorithm. Based on rough set theory, the algorithm uses the consistency of condition attributes and decision attributes in the information table, together with the concepts of super-cube and scan vector, to discretize continuous attributes and reduce redundant attributes. It then uses a BP neural network as the calculation tool to predict customer churn. Experimental results show that the data refined by rough sets are more concise and more convenient to apply in the BP neural network, whose prediction results are more accurate. The algorithm combining a BP neural network with rough set theory is therefore efficient and effective.
Article
The aim of this paper is to learn a linear principal component using the machinery of support vector machines (SVMs). To this end, a complete SVM-like framework for linear PCA (SVPCA), used to decide the projection direction, is constructed, in which a new expected risk and margin are introduced. Within this framework, a new semi-definite programming problem for maximizing the margin is formulated and a new definition of support vectors is established. As a weighted case of regular PCA, SVPCA coincides with regular PCA if all samples play the same part in data compression. Theoretical analysis indicates that SVPCA rests on a margin-based generalization bound, which ensures good predictive ability. Furthermore, a robust form of SVPCA with an interpretable parameter is obtained using the soft-margin idea from SVMs. A great advantage is that SVPCA is a learning algorithm without local minima, owing to the convexity of the semi-definite optimization problems. To validate the performance of SVPCA, several experiments are conducted, and the numerical results demonstrate that its generalization ability is better than that of regular PCA. Finally, some remaining problems are also discussed.
Article
Excessive secretion of glucagon is a major contributor to the development of diabetic hyperglycemia. Secretion of glucagon is regulated by various nutrients, with glucose being a primary determinant of the rate of alpha cell glucagon secretion. The intra-islet action of insulin is essential to exert the effect of glucose on the alpha cells since, in the absence of insulin, glucose is not able to suppress glucagon release in vivo. However, the precise mechanism by which insulin suppresses glucagon secretion from alpha cells is unknown. In this study, we show that insulin induces activation of GABAA receptors in the alpha cells by receptor translocation via an Akt kinase-dependent pathway. This leads to membrane hyperpolarization in the alpha cells and, ultimately, suppression of glucagon secretion. We propose that defects in this pathway(s) contribute to diabetic hyperglycemia.
Conference Paper
Nowadays, churn prediction and management is critical for more and more companies in the fast-changing and strongly competitive telecommunication market. To improve customer retention, telecommunication companies must be able to predict which customers are at risk of switching service provider. In this study, to overcome the lack of information on customers of the Personal Handyphone System Service (PHSS) and to build an effective and accurate customer churn model, three experiments (changing the sub-periods for the training data sets, changing the misclassification cost in the churn model, and changing the sampling methods for the training data sets) are conducted to improve the prediction performance of a churn model based on the widely used decision tree. With the help of these experiments, optimal model parameters are found: a sub-period of 10 days, a misclassification cost of 1:5, and random sampling for the training set. The empirical evaluation suggests that the resulting churn models perform well after this model selection, and shows that the proposed methods and techniques are effective and feasible when customer information is scarce and the class distribution is skewed. This study benefits not only churn prediction research and practice but also other data mining applications with similar characteristics.
Conference Paper
This paper focuses on radar target identification using support vector machines (SVM). The radar features used in this study are impulse response features representing the down-range profile of the target as seen by stepped-frequency radar. The purpose of this paper is to shed additional light on the benefits of SVM in radar target identification (RTI) under various scenarios of adversity that are commonly addressed in the RTI literature. The paper attempts to maximize the performance of SVM in RTI but does not introduce new SVM kernels or SVM training methods. The focus is on quantifying the rewards of SVM in target identification, assuming a classifier that is presented with time-domain signatures representing the target impulse response at a certain azimuth angle. In particular, the paper assesses the SVM classifier's performance in the different scenarios discussed herein.
Article
Principal component analysis (PCA) is one of the most general-purpose feature extraction methods, and a variety of learning algorithms for PCA have been proposed. Many conventional algorithms, however, will either diverge or converge very slowly if the learning rate parameters are not chosen properly. In this paper, an adaptive learning algorithm (ALA) for PCA is proposed. By adaptively selecting the learning rate parameters, we show that the m weight vectors in the ALA converge to the first m principal component vectors at almost the same rate. Compared with Sanger's generalized Hebbian algorithm (GHA), the ALA can quickly find the desired principal component vectors where the GHA fails to do so. Finally, simulation results are included to illustrate the effectiveness of the ALA.
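For context on the baseline the ALA is compared against, Sanger's GHA can be written in a few lines of NumPy. The data distribution, the number of components, and the 1/(100+t) decaying learning-rate schedule are illustrative assumptions; this is the plain GHA, not the paper's adaptive algorithm.

```python
import numpy as np

# Sanger's generalized Hebbian algorithm (GHA): learns the leading
# principal components online, one sample at a time.
rng = np.random.default_rng(4)
var = np.array([4.0, 1.0, 0.1])                  # variances along 'basis'
basis = np.array([[1.0, 1.0, 0.0],
                  [1.0, -1.0, 0.0],
                  [0.0, 0.0, 1.0]])
basis /= np.linalg.norm(basis, axis=1, keepdims=True)
samples = (rng.normal(size=(5000, 3)) * np.sqrt(var)) @ basis

W = rng.normal(scale=0.1, size=(2, 3))           # seek 2 components
for t, x in enumerate(samples):
    lr = 1.0 / (100 + t)                         # decaying learning rate
    y = W @ x
    # GHA update: Hebbian term minus lower-triangular deflation term.
    W += lr * (np.outer(y, x) - np.tril(np.outer(y, y)) @ W)

# The first learned row should align with the true top eigenvector.
top = np.linalg.eigh(np.cov(samples.T))[1][:, -1]
align = abs(W[0] @ top / np.linalg.norm(W[0]))
print(f"alignment with top eigenvector: {align:.3f}")
```

The lower-triangular term deflates each component against the ones above it, which is what lets the rows of W converge to successive principal components; the ALA's contribution, per the abstract, is choosing the learning rates adaptively so all m components converge at nearly the same rate.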
An adaptive learning algorithm for PCA
  • Chen Liang-Hua
Chen Liang-Hua, et al. An adaptive learning algorithm for PCA. IEEE Transactions on Neural Networks, pp. 1255-1263, 1996.