Conference Paper

Study to use NEO4J to analysis and detection SIM-BOX fraud

Authors:
  • Wadi Alshatti University

Abstract

The high price of incoming international calls is a common method of subsidizing telephony infrastructure in the developing world. Accordingly, international telephone system interconnects are regulated to ensure call quality and accurate billing. High call tariffs create a strong incentive to evade such interconnects and deliver costly international calls illicitly. Specifically, adversaries use VoIP-GSM gateways, informally known as “simboxes”, to receive incoming calls over wired data connections and deliver them into a cellular voice network through a local call that appears to originate from a customer’s phone. In this paper, we analyze and compare known methods of SIM-box fraud detection, explain the advantages and drawbacks of each, and propose a new method. The proposed system analyzes call detail record (CDR) files using a data mining technology (Neo4j) and then applies the known test call generation (TCG) method to increase efficiency and confidence in the results.
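The CDR-analysis step described in the abstract can be illustrated with a small graph sketch. This is plain Python rather than Neo4j, and the record layout, the names `cdrs`, `build_call_graph` and `suspicious_sims`, and the heuristic itself (a number that calls many distinct numbers but is never called) are assumptions made for illustration, not the rule used in the paper. Numbers flagged this way would still need confirmation, for example by test call generation.

```python
from collections import defaultdict

# Hypothetical CDR rows reduced to (caller, callee); real CDRs carry many more fields.
cdrs = [
    ("sim_A", "u1"), ("sim_A", "u2"), ("sim_A", "u3"), ("sim_A", "u4"),
    ("u1", "u2"), ("u2", "u1"), ("u3", "u1"),
]

def build_call_graph(records):
    """Build a directed call graph as two adjacency maps (out-edges, in-edges)."""
    out_edges = defaultdict(set)
    in_edges = defaultdict(set)
    for caller, callee in records:
        out_edges[caller].add(callee)
        in_edges[callee].add(caller)
    return out_edges, in_edges

def suspicious_sims(records, min_out_degree=3):
    """Flag numbers that call many distinct numbers but never receive a call.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    out_edges, in_edges = build_call_graph(records)
    return sorted(
        number
        for number, callees in out_edges.items()
        if len(callees) >= min_out_degree and not in_edges.get(number)
    )

print(suspicious_sims(cdrs))  # ['sim_A']
```

In a graph database the same idea would be expressed as a query over caller and callee nodes; the adjacency maps here only stand in for that structure.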
... SIMboxing also known as bypass fraud is a form of fraud where international calls are diverted to a cellular device through the internet and the connections are routed back into the network as local calls resulting in a revenue leakage (Kala, 2021). SIMBoxing fraud occurs when the legal interconnect gateways are bypassed by fraudsters resulting in the diversion of traffic to illegal interconnect gateways (Nassir, 2020). This leads to a loss of revenue by the service provider and also poses a security threat as terrorists and other fraudsters can utilize this bypass to commit crimes as the call path is hard to trace. ...
Article
Background: Mobile network technology has advanced exponentially in the last decade, and with this development fraud activities have risen in equal measure, resulting in companies and customers losing huge amounts of money, especially in developing economies whose regulatory frameworks lag when it comes to mobile network fraud. The purpose of the study was to explore mobile network fraud in Kenya, identifying the most common types of fraud, the ways in which service providers and regulators are working to prevent or reduce fraud, the methods currently used to detect fraud, and the gaps therein. Finally, the study examined the effect of concept drift on the automated fraud detection process. Method: A qualitative research method was adopted for the study and, using a semi-structured question guide, four focus group discussions composed of 23 participants were conducted. The criteria used for selecting and placing participants into focus groups considered the following: the participant's area of expertise, the participant's years of experience in the fraud ecosystem, and the organization to which the participant is attached. The availability and willingness of the participants were also considered in the selection process. The focus group approach was selected as it facilitated balanced discussion amongst all the players in the Kenyan fraud ecosystem, harnessing the power of group dynamics as it involved the regulators and the service providers who were drawn from different organizations. Results: The mobile network fraud ecosystem was stratified into three dimensions, namely fraud prevention, which looked at the policies and methods used by both regulators and service providers to reduce fraud; fraud categorization, which aimed at categorizing different types of mobile fraud; and fraud detection, which looked at the current tools being used to detect fraud. 
From the study, it emerged that although the regulators have provided strict guidelines on the customer onboarding process, not all service providers are currently using biometric approaches while onboarding new customers as this was highlighted as the entry point of most fraud cases. The study established five major types of Mobile network fraud in Kenya: SIM swap, SIM boxing, Wangiri, Commission arbitrage, and Hoax SMS and scams. Most of this fraud is committed using either SMS or voice channels; in some cases, both channels are used. Different matrixes derived from multiple factors are used by service providers while evaluating the criticality of fraud cases though not enforced by the regulators. The study also revealed that most of the fraud detection processes amongst the service providers still use manual tools that constantly require human input. While some of the detection processes are automated, concept drift is a major challenge for automated classification models due to the constant evolution of fraud patterns. Conclusion: The study revealed gaps in Mobile Network fraud prevention processes in Kenya as service providers still use non-biometric customer validation processes that are open to forgery and exploitation. A strict customer onboarding process that is fully automated and integrated should be used to address this gap. In the fraud categorization, there is no clear universal categorization matrix to guide the service providers while assessing the criticality of fraud and in this regard, a qualitative scientific model should be developed and used by all the stakeholders as a reference point. When it comes to fraud detection, concept drift is a major challenge, and service providers in Kenya still rely on manual processes due to the dynamic nature of mobile fraud. 
This exposes a huge gap in the detection process and there is a need to address this by developing systems and processes that will automatically detect and react to concept drift while automating the detection processes.
Article
The high asymmetry of international termination rates is fertile ground for the appearance of fraud in telecom companies. International calls have higher values when compared with national ones, which raises the attention of fraudsters. In this paper, we present a solution for a real problem called interconnect bypass fraud, more specifically, a newly identified distributed pattern that crosses different countries and keeps fraudsters from being tracked by almost all fraud detection techniques. This problem is one of the most expressive in the telecommunication domain, and it has some abnormal behaviours like the occurrence of a burst of calls from specific numbers. Based on this assumption, we propose the adoption of a new fast forgetting technique that works together with the Lossy Counting algorithm. We apply frequent set mining to capture distributed patterns from different countries. Our goal is to detect as soon as possible items with abnormal behaviours, e.g., bursts of calls, repetitions, mirrors, distributed behaviours and a small number of calls spread by a vast set of destination numbers. The results show that the application of different techniques improves the detection ratio and not only complements the techniques used by the telecom company but also improves the performance of the Lossy Counting algorithm in terms of run-time, memory used and sensibility to detect the abnormal behaviours. Additionally, the application of frequent set mining allows us to capture distributed fraud patterns.
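For readers unfamiliar with Lossy Counting, the following is a minimal sketch of the standard algorithm applied to a stream of calling numbers. It does not include the fast forgetting technique or the frequent set mining proposed in the abstract above, whose details are not given here; the class name `LossyCounter`, the synthetic stream, and the parameter values are assumptions chosen for illustration.

```python
import math

class LossyCounter:
    """Standard Lossy Counting: approximate frequency counts over a stream."""

    def __init__(self, epsilon):
        self.epsilon = epsilon
        self.width = math.ceil(1 / epsilon)  # number of items per bucket
        self.n = 0                           # items seen so far
        self.entries = {}                    # item -> [count, max possible undercount]

    def add(self, item):
        self.n += 1
        bucket = math.ceil(self.n / self.width)  # id of the current bucket
        if item in self.entries:
            self.entries[item][0] += 1
        else:
            # A new item may have been seen and pruned in earlier buckets.
            self.entries[item] = [1, bucket - 1]
        if self.n % self.width == 0:
            # Bucket boundary: drop items whose count cannot exceed the bucket id.
            self.entries = {
                key: value for key, value in self.entries.items()
                if value[0] + value[1] > bucket
            }

    def frequent(self, support):
        """Return items whose stored count is at least (support - epsilon) * n."""
        threshold = (support - self.epsilon) * self.n
        return {k: v[0] for k, v in self.entries.items() if v[0] >= threshold}

# Synthetic stream: one number produces a burst of calls among one-off callers.
counter = LossyCounter(epsilon=0.1)
for i in range(100):
    counter.add("burst" if i % 5 < 2 else f"n{i}")

print(counter.frequent(support=0.3))  # {'burst': 40}
```

Memory stays bounded because one-off callers are pruned at every bucket boundary, while a number producing a burst of calls keeps its entry.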
Conference Paper
Voice traffic termination fraud, often referred to as Subscriber Identity Module box (SIMbox) fraud, is a common illegal practice on mobile networks. As a result, cellular operators around the globe lose billions annually. Moreover, SIMboxes compromise the cellular network infrastructure by overloading local base stations serving these devices. This paper analyzes the fraudulent traffic from SIMboxes operating with a large number of SIM cards. It processes hundreds of millions of anonymized voice call detail records (CDRs) from one of the main cellular operators in the United States. In addition to overloading voice traffic, fraudulent SIMboxes are observed to have static physical locations and to generate disproportionately large volume of outgoing calls. Based on these observations, novel classifiers for fraudulent SIMbox detection in mobility networks are proposed. Their outputs are optimally fused to increase the detection rate. The operator's fraud department confirmed that the algorithm succeeds in detecting new fraudulent SIMboxes.
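The two observations reported above (static physical location and a disproportionately large volume of outgoing calls) can be turned into a simple rule-based indicator. The sketch below is only an illustration under assumed inputs: the row layout `(sim_id, direction, cell_id)`, the function names, and all thresholds are hypothetical, and the simple conjunction of the two indicators is not the optimal fusion of classifiers the abstract refers to.

```python
from collections import defaultdict

# Hypothetical anonymized CDR rows: (sim_id, direction, cell_id),
# where direction is either "outgoing" or "incoming".
records = (
    [("box1", "outgoing", "c1")] * 6
    + [("user1", "outgoing", "c1"), ("user1", "incoming", "c2"),
       ("user1", "outgoing", "c3"), ("user1", "incoming", "c1")]
    + [("user2", "outgoing", c) for c in ("c1", "c2", "c3", "c4", "c5", "c6")]
)

def sim_profiles(rows):
    """Aggregate per-SIM call counts and the set of cells the SIM was seen in."""
    profiles = defaultdict(lambda: {"outgoing": 0, "incoming": 0, "cells": set()})
    for sim, direction, cell in rows:
        profile = profiles[sim]
        profile[direction] += 1
        profile["cells"].add(cell)
    return profiles

def flag_simboxes(rows, min_outgoing=5, max_cells=1, min_out_ratio=0.9):
    """Flag SIMs that are both static and dominated by outgoing calls.

    All thresholds are illustrative assumptions, not values from the paper.
    """
    flagged = []
    for sim, p in sim_profiles(rows).items():
        total = p["outgoing"] + p["incoming"]
        is_static = len(p["cells"]) <= max_cells
        is_heavy = (p["outgoing"] >= min_outgoing
                    and p["outgoing"] / total >= min_out_ratio)
        if is_static and is_heavy:
            flagged.append(sim)
    return sorted(flagged)

print(flag_simboxes(records))  # ['box1']
```

Here `user2` places as many calls as `box1` but moves between cells, so only the static SIM is flagged.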
Conference Paper
The Short Messaging Service (SMS), one of the most successful cellular services, generates millions of dollars in revenue for mobile operators. Estimates indicate that billions of text messages are traveling the airwaves daily. Nevertheless, text messaging is becoming a source of customer dissatisfaction due to the rapid surge of messaging abuse activities. Although spam is a well tackled problem in the email world, SMS spam experiences a yearly growth larger than 500%. In this paper we present, to the best of our knowledge, the first analysis of SMS spam traffic from a tier-1 cellular operator. Communication patterns of spammers are compared to those of legitimate cell-phone users and Machine to Machine (M2M) connected appliances. The results indicate that M2M systems exhibit communication profiles similar to spammers, which could mislead spam filters. Beyond the expected results, such as a large load of text messages sent out to a wide target list, other interesting findings are made. For example, the results indicate that the great majority of the spammers connect to the network with just a handful of different hardware models. We find the main geographical sources of messaging abuse in the US. We also find evidence of spammer mobility, voice and data traffic resembling the behavior of legitimate customers.
Chapter
One of the most severe threats to revenue and quality of service for telecom providers is fraud. The advent of new technologies has given fraudsters new techniques to commit fraud. SIM box fraud is one such fraud that has emerged with the use of VoIP technologies. In this work, a total of nine features found to be useful in identifying SIM box fraud subscribers are derived from the attributes of the Customer Database Record (CDR). Artificial Neural Networks (ANNs) have shown promising results in classification problems due to their generalization capabilities. Therefore, a supervised learning method was applied using a multilayer perceptron (MLP) as the classifier. A dataset obtained from a real mobile communication company was used for the experiments. The ANN achieved a classification accuracy of 98.71%.
Article
In a recent comprehensive global survey of 150 telecommunications network operators, two issues were identified as the most significant threats to operators' revenues. One of these has already cost operators an average of 20% of their termination revenues this year. The other has been a risk for many years but continues to threaten revenues on 80% of the networks surveyed. So what are these threats and what can we do about them? Mobile network operators have long been targets for fraud and revenue risk. The nature of these companies' businesses mean these organisations generate significant revenues – and this raises significant risk of fraud. With more OTT players entering the market, this threat will continue to increase, leaving mobile operators increasingly exposed in the future. Andy Gent of Revector explains that telecommunications providers need to have an in-depth understanding of the current fraud landscape as well as investing in new and reliable technologies to detect and prevent fraud.
Conference Paper
Data mining is the process of inferring knowledge from large volumes of data. It has three major components: clustering or classification, association rules, and sequence analysis. In classification or clustering, a set of data is analyzed to generate a set of grouping rules that can be used to classify future data. More broadly, data mining extracts information from a data set and transforms it into an understandable structure. It is the computational process of discovering patterns in large data sets involving methods at the intersection of artificial intelligence, machine learning, statistics, and database systems. The actual data mining task is the automatic or semi-automatic analysis of large quantities of data to extract previously unknown interesting patterns. Data mining involves six common classes of tasks: anomaly detection, association rule learning, clustering, classification, regression, and summarization. Classification is a major technique in data mining and is widely used in various fields; it is a data mining (machine learning) technique used to predict group membership for data instances. In this paper, we present the basic classification techniques, including decision tree induction, Bayesian networks, and the k-nearest neighbor classifier. The goal of this study is to provide a comprehensive review of different classification techniques in data mining.
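As a concrete instance of one technique named above, the sketch below shows a k-nearest neighbor classifier predicting group membership by majority vote among the closest training points. The function name `knn_predict`, the two-dimensional toy features, and the labels are assumptions made for illustration and do not come from the reviewed paper.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    `train` is a list of (feature_vector, label) pairs; distance is Euclidean.
    """
    neighbours = sorted(train, key=lambda pair: math.dist(pair[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

# Toy training set with two hypothetical numeric features per instance.
train = [
    ((1.0, 1.0), "normal"), ((1.2, 0.8), "normal"), ((0.9, 1.1), "normal"),
    ((5.0, 5.2), "fraud"), ((5.1, 4.9), "fraud"), ((4.8, 5.0), "fraud"),
]

print(knn_predict(train, (5.0, 5.0)))  # fraud
print(knn_predict(train, (1.0, 0.9)))  # normal
```

An odd `k` avoids ties in a two-class problem; in practice features would usually be scaled before distances are computed.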
Article
Most people think about fraud and security in the mobile industry as having their phone stolen or hacked. However, there is an underground industry that Juniper Research believes is worth $58bn a year in revenues that are being lost to fraud and lack of effective revenue protection. This dwarfs issues around personal security and outlines an immense problem from which mobile network operators suffer, but often struggle to recognise. Fraudsters exploiting weaknesses in mobile networks operate as businesses, often providing services to other fraudsters in a chain of fraud. The combining of multiple fraud practices makes detection difficult and prevention harder. Mark Yelland of Revector details how these frauds work and what can be done about them.