Article

Bi-level clustering in telecommunication fraud

Abstract

In this paper we describe a clustering algorithm for fraud detection in the telecom industry. This is ongoing work, developed in collaboration with a leading telecom operator. The choice of clustering algorithms is justified by the need to identify clients' abnormal behavior through the analysis of huge amounts of data. We propose a novel bi-level clustering methodology: the first level clusters transactional data, and the second level combines the output of the first level with other information to build high-level clusters.
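The bi-level idea can be illustrated with a minimal sketch. The abstract does not say which clustering algorithm, features, or "other information" are used, so everything below is an assumption made for illustration: plain k-means at both levels, hypothetical transaction fields (call duration and cost), and a hypothetical per-client attribute (`other_info`, e.g. a normalised account age). The first level clusters individual transactions; the second level clusters clients on the share of their transactions that fell into each first-level cluster, together with the extra attribute.

```python
from collections import Counter

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(p, centroids):
    """Index of the centroid closest to point p."""
    return min(range(len(centroids)), key=lambda c: sq_dist(p, centroids[c]))

def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means; returns one cluster label per point."""
    # Farthest-point initialisation keeps this sketch deterministic.
    centroids = [list(points[0])]
    while len(centroids) < k:
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(far))
    for _ in range(iters):
        labels = [nearest(p, centroids) for p in points]
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # leave an empty cluster's centroid where it is
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return [nearest(p, centroids) for p in points]

def bi_level_cluster(transactions, other_info, k1=2, k2=2):
    """Level 1 clusters transactions; level 2 clusters client profiles."""
    # Level 1: cluster raw transactions on (duration, cost).
    tx_labels = kmeans([(d, c) for _, d, c in transactions], k1)
    counts = {client: Counter() for client in other_info}
    for (client, _, _), label in zip(transactions, tx_labels):
        counts[client][label] += 1
    # Level 2: per-client share of each level-1 cluster, plus other info.
    clients = sorted(other_info)
    profiles = []
    for client in clients:
        total = sum(counts[client].values()) or 1
        shares = [counts[client][j] / total for j in range(k1)]
        profiles.append(shares + [other_info[client]])
    return dict(zip(clients, kmeans(profiles, k2)))

# Hypothetical data: (client, call duration in minutes, call cost).
transactions = [
    ("A", 2.0, 0.5), ("A", 3.0, 0.7), ("A", 2.5, 0.6),
    ("B", 1.5, 0.4), ("B", 3.5, 0.8), ("B", 2.0, 0.5),
    ("C", 55.0, 30.0), ("C", 60.0, 35.0), ("C", 2.0, 0.5),
    ("D", 58.0, 32.0), ("D", 62.0, 36.0), ("D", 57.0, 31.0),
]
# Hypothetical extra attribute per client, scaled to [0, 1].
other_info = {"A": 0.9, "B": 0.8, "C": 0.1, "D": 0.05}

high_level = bi_level_cluster(transactions, other_info)
print(high_level)
```

On this toy data, clients A and B end up in one high-level cluster and C and D in the other. In practice the features would need scaling, and a high-level cluster is only a candidate for review, not a fraud verdict.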

Article
We apply the weight of evidence reformulation of AdaBoosted naive Bayes scoring due to Ridgeway et al. (1998) to the problem of diagnosing insurance claim fraud. The method effectively combines the advantages of boosting and the explanatory power of the weight of evidence scoring framework. We present the results of an experimental evaluation with an emphasis on discriminatory power, ranking ability, and calibration of probability estimates. The data to which we apply the method consists of closed personal injury protection (PIP) automobile insurance claims from accidents that occurred in Massachusetts (USA) during 1993 and were previously investigated for suspicion of fraud by domain experts. The data mimic the most commonly occurring data configuration, that is, claim records consisting of information pertaining to several binary fraud indicators. The findings of the study reveal the method to be a valuable contribution to the design of intelligible, accountable, and efficient fraud detection support.
Article
We propose a system for mobile-phone fraud detection based on a bidirectional artificial neural network (bi-ANN). The key advantage of such a system is the ability to detect fraud not only by offline processing of call detail records (CDR), but also in real time. The core of the system is a bi-ANN that predicts the behavior of individual mobile-phone users. We determined that the bi-ANN is capable of predicting complex time series (Call_Duration parameter) that are stored in the CDR.
Article
A system to prevent subscription fraud in fixed telecommunications with high impact on long-distance carriers is proposed. The system consists of a classification module and a prediction module. The classification module classifies subscribers according to their historical behavior into four categories: subscription fraudulent, otherwise fraudulent, insolvent and normal. The prediction module allows us to identify potentially fraudulent customers at the time of subscription. The classification module was implemented using fuzzy rules. It was applied to a database containing information on over 10,000 real subscribers of a major telecom company in Chile. In this database, a subscription fraud prevalence of 2.2% was found. The prediction module was implemented as a multilayer perceptron neural network. It was able to identify 56.2% of the true fraudsters while screening only 3.5% of all the subscribers in the test set. This study shows the feasibility of significantly preventing subscription fraud in telecommunications by analyzing the application information and the customer antecedents at the time of application.
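To put the reported figures in perspective, the short calculation below derives the lift over random screening and an approximate precision. The lift follows directly from the two reported percentages. The precision rests on an assumption the abstract does not confirm: that the 2.2% prevalence found in the database also holds in the test set.

```python
# Figures reported in the abstract.
recall = 0.562    # share of true fraudsters identified
screened = 0.035  # share of test-set subscribers flagged for screening

# ASSUMPTION: test-set prevalence equals the 2.2% found in the database.
prevalence = 0.022

# Lift: how many times more fraudsters are caught than by screening
# the same share of subscribers at random.
lift = recall / screened

# Precision: share of flagged subscribers who are actual fraudsters.
precision = recall * prevalence / screened

print(f"lift = {lift:.1f}, precision = {precision:.1%}")
```

Under that assumption, roughly one in three flagged subscribers would be a true fraudster, about 16 times the hit rate of random screening.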
Article
We have been developing signature-based methods in the telecommunications industry for the past 5 years. In this paper, we describe our work as it evolved due to improvements in technology and our aggressive attitude toward scale. We discuss the types of features that our signatures contain, nuances of how these are updated through time, our treatment of outliers, and the trade-off between time-driven and event-driven processing. We provide a number of examples, all drawn from the application of signatures to toll fraud detection.
Conference Paper
We report an experiment aimed at generating synthetic test data for fraud detection in an IP-based video-on-demand service. The data generation verifies a methodology previously developed by the present authors [E. Lundin et al. (2002)] that preserves important statistical properties of the authentic data by using authentic normal data and fraud as a seed for generating synthetic data. This enables us to create realistic behavior profiles for users and attackers. The data are used to train the fraud detection system itself, thus adapting the system to a specific environment. Here we aim to verify the usability and applicability of the synthetic data by using them to train a fraud detection system. The system is then exposed to a set of authentic data and to a corresponding set of synthetic data to measure parameters such as detection capability and false alarm rate, and the results are compared.