Article

# Managing a pool of rules for credit card fraud detection by a Game Theory based approach


## Abstract

Automatic credit card transaction classification has two phases: in the Real-Time (RT) phase the system decides quickly, based on the bare transaction information, whether to authorize the transaction; in the subsequent Near-Real-Time (NRT) phase, the system enacts a slower ex-post evaluation, based on a larger information context. The classification rules in the NRT phase trigger alerts on suspicious transactions, which are transferred to human investigators for final assessment. The management criteria used to select the rules to be kept operational in the NRT pool are traditionally based mostly on the performance of individual rules, considered in isolation; this approach disregards the non-additivity of the rules (aggregating rules with high individual precision does not necessarily make a high-precision pool). In this work, we propose to apply to the rule selection for the NRT phase an approach which assigns a normalized score to the individual rule, quantifying the rule's influence on the overall performance of the pool. As a score we propose to use a power index developed within Coalitional Game Theory, the Shapley Value (SV), which summarizes the performance in collaboration. Such a score has two main applications: (1) it can be used, within the periodic rule assessment process, to support the decision of whether to keep or drop the rule from the pool; (2) it can be used to select the k top-ranked rules, so as to work with a more compact rule set. Using real-world credit card fraud data containing approximately 300 rules and 3×10^5 transaction records, we show that: (1) this score fares better, in guaranteeing the performance of the pool, than the one assessing the rules in isolation; (2) the same performance as the whole pool can be achieved keeping only one tenth of the rules, namely the top-k SV-ranked rules.
We observe that the latter application can be reframed as a Feature Selection (FS) task for a classifier: we show that our approach is comparable to benchmark FS algorithms, but argue that it presents an advantage for management, consisting in the assignment of a normalized score to the individual rule. This is not the case for most FS algorithms, which focus only on yielding a high-performance feature-set solution.
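
To make the scoring idea in the abstract concrete, the following is a minimal sketch of an exact Shapley Value computation for a tiny rule pool. It is not the authors' implementation: the toy data (`RULES`, `FRAUDS`), the choice of F1 as the pool performance measure, and the function names `pool_f1` and `shapley_values` are all assumptions made for illustration. Exact enumeration is exponential in the number of rules, so it is only practical for very small pools.

```python
from itertools import combinations
from math import factorial

# Hypothetical toy data: each rule maps to the set of transaction ids it alerts on.
RULES = {
    "r1": {1, 2, 5},
    "r2": {1, 2, 6},
    "r3": {3, 4, 7, 8},
}
FRAUDS = {1, 2, 3, 4}  # ids of the truly fraudulent transactions (assumed labels)


def pool_f1(pool):
    """F1 score of a pool that alerts when at least one of its rules fires.

    F1 is an assumed stand-in for the pool performance measure.
    """
    alerted = set().union(*(RULES[r] for r in pool))
    if not alerted:
        return 0.0
    true_positives = len(alerted & FRAUDS)
    return 2 * true_positives / (len(alerted) + len(FRAUDS))


def shapley_values(players, value):
    """Exact Shapley Value of each player for a set function `value`."""
    n = len(players)
    phi = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for size in range(n):
            # Probability weight of a coalition of this size preceding p.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for coalition in combinations(others, size):
                s = frozenset(coalition)
                # Marginal contribution of p when joining coalition s.
                total += weight * (value(s | {p}) - value(s))
        phi[p] = total
    return phi


phi = shapley_values(list(RULES), pool_f1)
# Normalized shares: each rule's fraction of the whole pool's performance.
shares = {rule: score / sum(phi.values()) for rule, score in phi.items()}
```

In this toy pool `r1` and `r2` are interchangeable, so they receive the same score, and the scores add up to the performance of the full pool, which is what makes a per-rule normalized share meaningful.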

## No full-text available

... The feasibility of the overall approach relies on the assumption that one can compute the power indices α_p with little effort. Indeed, although the definition (2) suggests that the computation of the power indices has exponential complexity, in practice, a sufficiently useful estimate of the index can be obtained in polynomial time by sampling a reasonable number of coalitions [10]: in fact one does not have to discover the exact value of the indices, but for the proposed approach one just needs to find out which ones are the highest k indices. ...
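
The sampling argument in this excerpt can be illustrated with a short sketch. It uses random permutation sampling, a common Monte Carlo estimator for the Shapley Value; the excerpt does not say which sampling scheme reference [10] uses, so this is a stand-in. The toy coverage game (`COVERAGE`, `coverage_value`) and the helper names `sampled_shapley` and `top_k` are hypothetical.

```python
import random


def sampled_shapley(players, value, n_permutations=200, seed=0):
    """Estimate Shapley Values by averaging marginal contributions
    over randomly sampled orderings of the players."""
    rng = random.Random(seed)
    totals = {p: 0.0 for p in players}
    order = list(players)
    for _ in range(n_permutations):
        rng.shuffle(order)
        coalition = frozenset()
        previous = value(coalition)
        for p in order:
            coalition = coalition | {p}
            current = value(coalition)
            totals[p] += current - previous  # marginal contribution of p
            previous = current
    return {p: total / n_permutations for p, total in totals.items()}


def top_k(scores, k):
    """Names of the k highest-scoring players."""
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical non-additive game: a coalition is worth the number of
# distinct items its members cover.
COVERAGE = {"a": {1, 2, 3, 4, 5}, "b": {1}, "c": {6, 7}}


def coverage_value(coalition):
    return float(len(set().union(*(COVERAGE[p] for p in coalition))))


estimates = sampled_shapley(list(COVERAGE), coverage_value)
best_two = top_k(estimates, 2)
```

The cost is `n_permutations` times the number of players evaluations of the value function, rather than one evaluation per subset. Since only the ranking of the top k indices is needed, a coarse estimate is often enough to separate clearly dominant players from the rest.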
... The approach of the present work bears some analogy with the problems of classification rule selection and rule pool management in fraudulent credit card transaction detection as it is addressed in the work [10] (that work uses the Shapley Value as a scoring index). However the very nature of the involved objects (classification rules on one side and information streams on the other) makes the actual details of the methods rather different. ...
... In short, an event, or more in general, the truth to be discovered, are richly structured phenomena and their description is very different from the one of credit card transactions. Even so, should one manage to formulate the performance metric in terms of a single real number, the formalism could be the same used in reference [10]. However the two settings are distinguished also by a structural difference, which forces the adoption of different modeling tools. ...
... While various verification methods have been implemented, the number of fraud cases involving credit cards has not been significantly decreased [6]. The potential for substantial monetary gains, combined with the ever-changing nature of financial services, creates a wide range of opportunities for fraudsters [7]. Funds from payment card fraud are often used in criminal activities that are hard to prevent, e.g., to support terrorist acts [8]. ...
... This can be done through two main steps: training and testing. AI is employed to build systems for fraud detection, such as classification-based systems [19,6,7,8], clustering-based systems [17,20,21], neural network-based systems [18,22,23], and support vector machine-based systems [9]. Although AI-based systems can perform well, they suffer from some critical issues. ...
... In this context, data mining tasks, such as classification, clustering, applying association rules, and using neural networks, are employed [2]. In addition, AI is employed to build systems for fraud detection, such as classification-based systems [19,6,7,8], clustering-based systems [17,20,21], neural network-based systems [18,22,23] and support vector machine-based systems [9]. The techniques employed to construct credit card fraud detection systems using AI can be categorized into four main groups. ...
... The significance of rule-based methods is the superior model interpretability that is important in commercial-grade applications. Sanchez et al. [8], Duman and Ozcelik [21], Gianini et al. [26], and Duman and Elikucuk [50] have exploited rule-based methods to solve the payment fraud detection problem. ...
... Later, members of the same research group used migrating birds optimization to improve the same classifier further (Duman and Elikucuk [50]). Gianini et al. [26] proposed a rule-pool based method relying on the Shapley value (the average marginal contribution of different rules to the final decision), which is a concept in game theory. Further, the researchers tried to minimize the rule pool to reduce computational complexity, by using three algorithms. ...
Preprint
Card payment fraud is a serious problem, and a roadblock for an optimally functioning digital economy, with cards (debit and credit) being the most popular digital payment method across the globe. Although the occurrence of fraud may be relatively rare, its impact can be significant, especially on the cardholder. In the research literature, there have been many attempts to develop methods of detecting potentially fraudulent transactions based on data mining techniques, predominantly exploiting the developments in machine learning over the last decade. This survey proposes a taxonomy based on a review of existing research attempts and experiments, which mainly elaborates the approaches taken by researchers to incorporate (i) the business impact of fraud (and fraud detection) into their work, (ii) the feature engineering techniques that focus on cardholder behavioural profiling to separate fraudulent activities happening with the same card, and (iii) the adaptive efforts taken to address the changing nature of fraud. Further, there is a comparative performance evaluation of the classification algorithms used and of the efforts to address the class imbalance problem. Forty-five peer-reviewed papers published in the domain of card fraud detection between 2009 and 2020 were intensively reviewed to develop this paper.
... These rules are then tested for every new transaction in order to trigger a signal indicating that a SIF has been detected. This process is similar to the Near-Real Time (NRT) fraud detection phase classically used in credit card fraud detection systems, as described in [Gia+20]. In the context of the SiS platform, as of today 224 such rules have been declared in the rule engine. ...
... However rule engines as fraud detection system suffer from several drawbacks. First of all, as discussed in [Gia+20] they are difficult to maintain. For example, due to the dynamic nature of fraud, new rules have to be added when fraudsters discover new ways to cheat the system. ...
Thesis
Supplier Impersonation Fraud (SIF) is a kind of fraud occurring in a Business-To-Business (B2B) context, where a fraudster impersonates a supplier in order to trigger an illegitimate payment from a company. Most of the existing systems focus solely on a single, "intra-company" approach to detect this kind of fraud. However, companies are part of an ecosystem where multiple agents interact, and such interactions have yet to be integrated into the existing detection techniques. In this thesis we propose to use state-of-the-art Machine Learning techniques to build a detection system for such frauds, based on the elaboration of a model using historical transactions from both the targeted companies and the other relevant companies in the ecosystem (contextual data). We perform detection of anomalous transactions when a significant change in the payment behavior of a company is detected. Two ML-based systems are proposed in this work: ProbaSIF and GraphSIF. ProbaSIF uses a probabilistic approach (urn model) to estimate the probability of occurrence of the account used in the transaction and thereby assess its legitimacy. We use this approach to assess the differences yielded by the integration of contextual data into the analysis. GraphSIF uses a graph-based approach to model the interactions between client and supplier companies as graphs, and then uses these graphs as training data in a Self-Organizing Map-Clustering model. The distance between a new transaction and the center of the cluster is used to detect changes in the behavior of a client company. These two systems are compared with a real-life fraud detection system in order to assess their performance.
... Likewise, AI and ML models are extensively used in the financial-service industry for fraud detection, credit scoring, and enhancing cybersecurity (Prasad & Rohokale, 2020; Leo et al., 2019, p. 29). ML models effectively safeguard clients' credit cards and other confidential information and alert the authority if any suspicious activity occurs (Sarker et al., 2020; Gianini et al., 2019; Wei et al., 2013). ...
Preprint
This study explores the importance of artificial intelligence and the future of work and identifies issues and challenges of artificial intelligence and the future of work in Asia. This study also proposes a policy framework and provides recommendations for artificial intelligence and automation in organisational processes. The findings of this study are relevant for regulatory authorities, standard-setting bodies, and corporations.
... The experimental implementation of the presented system depicted enhanced results in comparison to other decision-modeling techniques. Gianini et al. (2020) applied an intelligent approach for assigning the quantified score in credit card transactions. Specifically, the authors proposed a Coalition Game theory model for summarizing the performance of the users in the context of online fraud. ...
Article
Innovations in the Internet of Things (IoT) technology have revolutionized several industrial domains for smart decision-modeling. The capacity to perceive data about ubiquitous instances has resulted in numerous innovations in sensitive sectors like national security, and police departments. In this paper, an extensive IoT-based framework is introduced for assessing the integrity of police personnel based on his/her performance. The work introduced in this research is centered around analyzing several activities of police personnel to assess his/her integral behavior. In particular, the Probabilistic Measure of Integrity (PMI) is formalized based on professional data analysis for classification based on Bayesian Model. Moreover, the 2-player game model has been presented to assess the performance of police personnel for efficient decision-making. For validation purposes, the presented framework is deployed over challenging datasets acquired from the online repository of UCI. Based on the comparative analysis with the state-of-the-art decision-making models, the presented approach has registered enhanced performance in terms of Temporal Delay, Classification, Prediction, Reliability, and Stability.
... The dataset contains a very large number of features. A large number of techniques is available for feature selection [21,39]. We selected the features by applying a nature-inspired evolutionary optimization algorithm, the modified crow search algorithm (MCSA) developed by Gupta et al. [25]. ...
Article
Healthcare organizations and Health Monitoring Systems generate large volumes of complex data, which offer the opportunity for innovative investigations in medical decision making. In this paper, we propose a beetle swarm optimization and adaptive neuro-fuzzy inference system (BSO-ANFIS) model for heart disease and multi-disease diagnosis. The main components of our analytics pipeline are the modified crow search algorithm, used for feature extraction, and an ANFIS classification model whose parameters are optimized by means of a BSO algorithm. The accuracy achieved in heart disease detection is 99.1% with 99.37% precision. In multi-disease classification, the accuracy achieved is 96.08% with 98.63% precision. The results from both tasks prove the comparative advantage of the proposed BSO-ANFIS algorithm over the competitor models.
... Fraud detection mechanisms lie on a broad spectrum of approaches (Kou et al., 2004; Bhattacharyya et al., 2011; Le Borgne et al., 2022). On one end, we have rule-based approaches, devised and updated by domain experts (Gianini et al., 2020). This requires human time, effort and maintenance, and cannot model very complex patterns. ...
Preprint
Fraud detection systems (FDS) mainly perform two tasks: (i) real-time detection while the payment is being processed and (ii) posterior detection to block the card retrospectively and avoid further frauds. Since human verification is often necessary and the payment processing time is limited, the second task manages the largest volume of transactions. In the literature, fraud detection challenges and algorithms performance are widely studied but the very formulation of the problem is never disrupted: it aims at predicting if a transaction is fraudulent based on its characteristics and the past transactions of the cardholder. Yet, in posterior detection, verification often takes days, so new payments on the card become available before a decision is taken. This is our motivation to propose a new paradigm: posterior fraud detection with "future" information. We start by providing evidence of the on-time availability of subsequent transactions, usable as extra context to improve detection. We then design a Bidirectional LSTM to make use of these transactions. On a real-world dataset with over 30 million transactions, it achieves higher performance than a regular LSTM, which is the state-of-the-art classifier for fraud detection that only uses the past context. We also introduce new metrics to show that the proposal catches more frauds, more compromised cards, and based on their earliest frauds. We believe that future works on this new paradigm will have a significant impact on the detection of compromised cards.
... Gianini et al. optimize a system of 51 rules using a game theory approach [6]. They measure rule importance using Shapley values [13] as a measure of contribution to the system. ...
Preprint
Fraud detection is essential in financial services, with the potential of greatly reducing criminal activities and saving considerable resources for businesses and customers. We address online fraud detection, which consists of classifying incoming transactions as either legitimate or fraudulent in real-time. Modern fraud detection systems consist of a machine learning model and rules defined by human experts. Often, the rules performance degrades over time due to concept drift, especially of adversarial nature. Furthermore, they can be costly to maintain, either because they are computationally expensive or because they send transactions for manual review. We propose ARMS, an automated rules management system that evaluates the contribution of individual rules and optimizes the set of active rules using heuristic search and a user-defined loss-function. It complies with critical domain-specific requirements, such as handling different actions (e.g., accept, alert, and decline), priorities, blacklists, and large datasets (i.e., hundreds of rules and millions of transactions). We use ARMS to optimize the rule-based systems of two real-world clients. Results show that it can maintain the original systems' performance (e.g., recall, or false-positive rate) using only a fraction of the original rules (~ 50% in one case, and ~ 20% in the other).
... The transaction volumes are massive, reflecting a variety of different transaction types, and so are the outlier detection queries for detecting anomalous transaction patterns in contexts changing over time (e.g., sudden above-market payments as it enters pandemic lockdown, indicating hoarding activities; recurring micro-payments for items typically paid at once, indicating a possible tax evasion attempt). A lot of queries meant to detect different fraudulent activities are run at the same time, and the queries keep changing as the need for information and the accuracy changes [1,9]. ...
... Many other approaches have been used recently in the identification of credit card fraud. Gianini et al. [36] proposed a method of rule pool management based on game theory in which the system distributes suspicious transactions for manual investigation while avoiding the need to isolate the individual rules. Based on generative adversarial networks, Fiore et al. [37] proposed a method to generate simulated fraudulent transaction samples to improve the effectiveness of classification models. ...
Article
Credit card fraud detection (CCFD) is important for protecting the cardholder’s property and the reputation of banks. Class imbalance in credit card transaction data is a primary factor affecting the classification performance of current detection models. However, prior approaches are aimed at improving the prediction accuracy of the minority class samples (fraudulent transactions), but this usually leads to a significant drop in the model’s predictive performance for the majority class samples (legal transactions), which greatly increases the investigation cost for banks. In this paper, we propose a heterogeneous ensemble learning model based on data distribution (HELMDD) to deal with imbalanced data in CCFD. We validate the effectiveness of HELMDD on two real credit card datasets. The experimental results demonstrate that compared with current state-of-the-art models, HELMDD has the best comprehensive performance. HELMDD not only achieves good recall rates for both the minority class and the majority class but also increases the savings rate for banks to 0.8623 and 0.6696, respectively.
... The paper describes (which is consistent with our industry partners' experience) that as the ruleset increases, the effort to maintain a transaction monitoring system also increases, and consequently, the accuracy of fraud detection decreases. An interesting approach that assigns a normalized score to the individual rule, quantifying the rule influence on the pool's overall performance, is described in [39]. ...
Article
... The DEAL is highly adaptive and robust towards data imbalance and latent transaction patterns. Gianini et al. (2020) have managed a set of rules using game theory, and Zhu et al. (2020) have proposed the WELM algorithm to achieve high fraud detection performance. ...
Chapter
Machine learning (ML) has proven to be an emerging technology in industries from small to large scale. One important industry is banking, where ML is being adopted all over the world through online banking. Online banking uses ML techniques to detect fraudulent transactions, for example in credit card fraud detection. Hence, in this chapter, a Credit card Fraud Detection (CFD) system is devised using Luhn's algorithm and k-means clustering. Moreover, a CFD system is also developed using Fuzzy C-Means (FCM) clustering instead of k-means clustering. The performance of CFD using both clustering techniques is compared using precision, recall and f-measure. FCM gives better results than k-means clustering. Further, other evaluation metrics such as fraud catching rate, false alarm rate, balanced classification rate, and Matthews correlation coefficient are also calculated to show how well the CFD system works in the presence of skewed data.
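
For readers unfamiliar with Luhn's algorithm mentioned in this chapter abstract, here is a minimal sketch of the standard checksum test for card numbers. The abstract does not say how the chapter combines this check with clustering, so the function name `luhn_is_valid` and the choice to ignore spaces and other non-digit characters are assumptions for illustration only.

```python
def luhn_is_valid(card_number: str) -> bool:
    """Return True if the digits of `card_number` pass the Luhn checksum."""
    # Assumption: separators such as spaces or dashes are simply ignored.
    digits = [int(ch) for ch in card_number if ch.isdigit()]
    if not digits:
        return False
    checksum = 0
    # Walk from the rightmost digit; double every second digit.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9  # same as summing the two digits of the product
        checksum += digit
    return checksum % 10 == 0


is_valid = luhn_is_valid("79927398713")
is_invalid = luhn_is_valid("79927398710")
```

The check only catches malformed numbers (typos, random digit strings); it says nothing about whether a well-formed card number is being used fraudulently.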
... A system was proposed [30] in which the authors ignored the non-additivity of the composition of rules in the pool. The authors suggested utilizing a method for predicting every rule's contribution to the pool's performance by using the Shapley value (SV). ...
Article
Online sales and purchases are increasing daily, and they generally involve credit card transactions. This not only provides convenience to the end-user but also increases the frequency of online credit card fraud. In recent years, in some countries, this increase in fraud has led to an exponential increase in credit card fraud detection, which has become increasingly important for addressing this security issue. Recent studies have proposed machine learning (ML)-based solutions for detecting fraudulent credit card transactions, but their detection scores still need improvement due to the imbalance of classes in any given dataset. Few approaches have achieved exceptional results on different datasets. In this study, the Kaggle dataset was used to develop a deep learning (DL)-based approach to solve the text data problem. A novel text2IMG conversion technique is proposed that generates small images. The images are fed into a CNN architecture with class weights using the inverse frequency method to resolve the class imbalance issue. DL and ML approaches were applied to verify the robustness and validity of the proposed system. An accuracy of 99.87% was achieved by Coarse-KNN using deep features of the proposed CNN.
Article
Internet of Things (IoT) technology backed by Artificial Intelligence (AI) techniques has been increasingly utilized for the realization of the Industry 4.0 vision. Conspicuously, this work provides a novel notion of the smart sports industry for provisioning efficient services in the sports arena. Specifically, an IoT-inspired framework has been proposed for real-time analysis of athlete performance. IoT data is utilized to quantify athlete performance in the terms of probability parameters of Probabilistic Measure of Performance (PMP) and Level of Performance Measure (LoPM). Moreover, a two-player game-theory-based mathematical framework has been presented for efficient decision modeling by the monitoring officials. The presented model is validated experimentally by deployment in District Sports Academy (DSA) for 60 days over four players. Based on the comparative analysis with state-of-the-art decision-modeling approaches, the proposed model acquired enhanced performance values in terms of Temporal Delay, Classification Efficiency, Statistical Efficacy, Correlation Analysis, and Reliability.
Article
This paper proposes a majority vote ensemble classifier for accurate detection of credit card fraud. In this technique, the behavioural, operational and transactional features of users are combined into a single feature. User behaviours over a banking website are collected, and normal and abnormal behaviours are classified using the Web Markov Skeleton Process (WMSP) model. The operational and transactional features of users are collected and classified using the Random Forest (RF) classifier and the Support Vector Machine (SVM), respectively. Finally, the classification results of WMSP, RF and SVM are passed on to the majority voting based ensemble (MVE) classifier, which accurately predicts fraudulent users. Experimental results show that the MVE classifier achieves a higher detection rate with good accuracy.
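
The final voting step described in this abstract can be sketched as follows. This is an illustrative assumption about hard majority voting over per-user labels, not the paper's code; the function name `majority_vote` and the example label lists are hypothetical, with 1 standing for "fraud" and 0 for "legitimate".

```python
from collections import Counter


def majority_vote(predictions):
    """Combine label lists from several classifiers by per-sample majority.

    `predictions` holds one equal-length list of labels per classifier.
    """
    return [
        Counter(votes).most_common(1)[0][0]  # most frequent label for this sample
        for votes in zip(*predictions)
    ]


# Hypothetical outputs of the three base classifiers for four users.
wmsp_labels = [1, 0, 1, 0]
rf_labels = [1, 1, 0, 0]
svm_labels = [0, 1, 1, 0]

ensemble_labels = majority_vote([wmsp_labels, rf_labels, svm_labels])
```

With three binary classifiers a tie cannot occur, so each user is flagged exactly when at least two of the three base models agree on fraud.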
Article
Purpose: The best-performing algorithm previously applied to this Brazilian dataset was the artificial immune system (AIS) algorithm, but its time and cost are high. Using the asexual reproduction optimization (ARO) algorithm, the authors achieved better results in less time and therefore at lower cost. Their framework addresses problems such as high costs and long training time in credit card fraud detection. This simple and effective approach has achieved better results than the best techniques implemented on this dataset so far. The purpose of this paper is to detect credit card fraud using ARO.
Design/methodology/approach: In this paper, the authors used the ARO algorithm to classify bank transactions as fraudulent or legitimate. ARO is inspired by asexual reproduction, a kind of reproduction in which one parent produces offspring identical to itself. In the ARO algorithm, an individual is represented by a vector of variables. Each variable is considered a chromosome, and a binary string represents a chromosome consisting of genes. It is supposed that every generated answer exists in the environment and, because of limited resources, only the best solution can remain alive. The algorithm starts with a random individual in the answer scope. This parent produces an offspring named a bud. Either the parent or the offspring can survive: the one with the better fitness function value remains alive. If the offspring has suitable performance, it becomes the next parent and the current parent becomes obsolete. Otherwise, the offspring perishes and the present parent survives. The algorithm repeats until the stop condition occurs.
Findings: Results showed that ARO increased the AUC (i.e. the area under the receiver operating characteristic (ROC) curve), sensitivity, precision, specificity and accuracy by 13%, 25%, 56%, 3% and 3%, respectively, in comparison with AIS. The authors achieved a high precision value, indicating that if ARO flags a record as fraud, it is fraudulent with high probability. Supporting a real-time fraud detection system is another vital issue. ARO not only outperforms AIS on the mentioned criteria but also decreases the training time by 75% in comparison with AIS, which is a significant figure.
Originality/value: In this paper, the authors implemented ARO in credit card fraud detection. The authors compared the results with those of AIS, which was one of the best methods ever implemented on the benchmark dataset. The chief focus of fraud detection studies is finding algorithms that can distinguish legitimate transactions from fraudulent ones with high detection accuracy, in the shortest time and at a low cost. ARO meets all these demands.
Article
Using an efficient and scientific planning tool for project planning and scheduling can be considered a crucial process. Two approaches have been applied to find the total completion time of a project, namely the program evaluation and review technique (probabilistic PERT network) and the binomial distribution cumulative density function (CDF). The CDF approach assumes that time is a random variable following a discrete (binomial) distribution. The coefficient of variation (c.v.), which depends on (S, X̄), has been calculated to determine the uncertainty of activity completion at each stage of the project; it lies in the range (0, 0.103), a very low value, which illustrates that most activities proceed as planned. The final results show that the cumulative function method is more accurate than the traditional method (PERT), with wasted time reduced by around 4 days: the total project completion time is 33 days using PERT and 29 days using the cumulative function method.
Preprint
With the advent of the Internet of things (IoT) era, more and more devices are connected to the IoT. Under the traditional cloud-thing centralized management mode, the transmission of massive data is facing many difficulties, and the reliability of data is difficult to be guaranteed. As emerging technologies, blockchain technology and edge computing (EC) technology have attracted the attention of academia in improving the reliability, privacy and invariability of IoT technology. In this paper, we combine the characteristics of the EC and blockchain to ensure the reliability of data transmission in the IoT. First of all, we propose a data transmission mechanism based on blockchain, which uses the distributed architecture of blockchain to ensure that the data is not tampered with; secondly, we introduce the three-tier structure in the architecture in turn; finally, we introduce the four working steps of the mechanism, which are similar to the working mechanism of blockchain. In the end, the simulation results show that the proposed scheme can ensure the reliability of data transmission in the Internet of things to a great extent.
Article
Historically, fraudulent episodes result in high losses for investors, layoff of executives, and the erosion of confidence in the stock market, among other negative consequences. In this paper, we established a framework that connects Game Theory and Detection Software to estimate the probability of defrauding by manipulation of stock prices in terms of the effort-damage ratio of the audit team. Furthermore, the method allows to define optimum thresholds values for the score random variable in the detection software's alarm structure. The proposed approach is illustrated by analyzing the financial episode called the plier bubble that occurred in Brazil in 2011. The results reveal a suitable way to quantify the fraud risk probability, thus being a valuable tool to risk management in the stock market.
Chapter
The arrival of communication methods, along with online imbursement dealings, is growing every day. Together with this, fiscal scams linked with these dealings are also escalating. Amid several fiscal scams, credit card scam is the utmost common and hazardous one owing to its extensive practice. To perceive these illegal bases of communications, a credit card scam recognition structure is obligatory. In this paper, we propose to design a user behaviour-based accurate detection for credit card frauds. In this technique, user behaviours over a banking website are collected and analysed. The main features related to user behaviour are frequently visited pages, time spent on each page, clicking, etc. From these details, the page reachability and page utility are determined. From this, the normal and abnormal behaviours of users are classified using web Markov skeleton process (WMSP) model.
Article
Supervised learning techniques are widely employed in credit card fraud detection, as they make use of the assumption that fraudulent patterns can be learned from an analysis of past transactions. The task becomes challenging, however, when it has to take account of changes in customer behavior and fraudsters’ ability to invent novel fraud patterns. In this context, unsupervised learning techniques can help the fraud detection systems to find anomalies. In this paper we present a hybrid technique that combines supervised and unsupervised techniques to improve the fraud detection accuracy. Unsupervised outlier scores, computed at different levels of granularity, are compared and tested on a real, annotated, credit card fraud detection dataset. Experimental results show that the combination is efficient and does indeed improve the accuracy of the detection.
Conference Paper
Machine learning and data mining techniques have been used extensively in order to detect credit card frauds. However, most studies consider credit card transactions as isolated events and not as a sequence of transactions. In this article, we model a sequence of credit card transactions from three different perspectives, namely (i) Does the sequence contain a fraud? (ii) Is the sequence obtained by fixing the card-holder or the payment terminal? (iii) Is it a sequence of spent amounts or of elapsed times between the current and previous transactions? Combinations of the three binary perspectives give eight sets of sequences from the (training) set of transactions. Each one of these sets is modelled with a Hidden Markov Model (HMM). Each HMM associates a likelihood to a transaction given its sequence of previous transactions. These likelihoods are used as additional features in a Random Forest classifier for fraud detection. This multiple-perspective HMM-based approach enables automatic feature engineering in order to model the sequential properties of the dataset with respect to the classification task. This strategy allows for a 15% increase in the precision-recall AUC compared to the state-of-the-art feature engineering strategy for credit card fraud detection.
Patent
The invention relates to a system and method for managing financial transactions, the system including at least one database containing the transaction information, at least one fraud detection device comprising at least one rule evaluation module and a rule selection module, connected to each other and comprising means for implementing the method, the evaluation module making it possible to calculate an estimate of the contribution of each rule of the set of rules relative to a parameter representing the overall performance of a set of rules stored in a first memory of the detection device, and an evaluation report file analysed by the selection module to select a subset of rules from among the evaluated rules of the set of rules, the selected rules being stored in a second memory of the fraud detection device to be used for transaction control. https://worldwide.espacenet.com/publicationDetails/biblio?II=0&ND=3&adjacent=true&locale=en_EP&FT=D&date=20181025&CC=WO&NR=2018193085A1&KC=A1#
Article
Credit card fraud detection is a very challenging problem because of the specific nature of transaction data and the labeling process. The transaction data are peculiar because they are obtained in a streaming fashion, and they are strongly imbalanced and prone to non-stationarity. The labeling is the outcome of an active learning process, as every day human investigators contact only a small number of cardholders (associated with the riskiest transactions) and obtain the class (fraud or genuine) of the related transactions. An adequate selection of the set of cardholders is therefore crucial for an efficient fraud detection process. In this paper, we present a number of active learning strategies and we investigate their fraud detection accuracies. We compare different criteria (supervised, semi-supervised and unsupervised) to query unlabeled transactions. Finally, we highlight the existence of an exploitation/exploration trade-off for active learning in the context of fraud detection, which has so far been overlooked in the literature.
Article
Detecting frauds in credit card transactions is perhaps one of the best testbeds for computational intelligence algorithms. In fact, this problem involves a number of relevant challenges, namely: concept drift (customers' habits evolve and fraudsters change their strategies over time), class imbalance (genuine transactions far outnumber frauds), and verification latency (only a small set of transactions are timely checked by investigators). However, the vast majority of learning algorithms that have been proposed for fraud detection rely on assumptions that hardly hold in a real-world fraud-detection system (FDS). This lack of realism concerns two main aspects: 1) the way and timing with which supervised information is provided and 2) the measures used to assess fraud-detection performance. This paper has three major contributions. First, we propose, with the help of our industrial partner, a formalization of the fraud-detection problem that realistically describes the operating conditions of FDSs that every day analyze massive streams of credit card transactions. We also illustrate the most appropriate performance measures to be used for fraud-detection purposes. Second, we design and assess a novel learning strategy that effectively addresses class imbalance, concept drift, and verification latency. Third, in our experiments, we demonstrate the impact of class imbalance and concept drift in a real-world data stream containing more than 75 million transactions, authorized over a time window of three years.
Conference Paper
Understanding why a model makes a certain prediction can be as crucial as the prediction's accuracy in many applications. However, the highest accuracy for large modern datasets is often achieved by complex models that even experts struggle to interpret, such as ensemble or deep learning models, creating a tension between accuracy and interpretability. In response, various methods have recently been proposed to help users interpret the predictions of complex models, but it is often unclear how these methods are related and when one method is preferable over another. To address this problem, we present a unified framework for interpreting predictions, SHAP (SHapley Additive exPlanations). SHAP assigns each feature an importance value for a particular prediction. Its novel components include: (1) the identification of a new class of additive feature importance measures, and (2) theoretical results showing there is a unique solution in this class with a set of desirable properties. The new class unifies six existing methods, notable because several recent methods in the class lack the proposed desirable properties. Based on insights from this unification, we present new methods that show improved computational performance and/or better consistency with human intuition than previous approaches.
Conference Paper
Global card fraud losses amounted to 16.31 billion US dollars in 2014 [18]. To recover this huge amount, automated Fraud Detection Systems (FDS) are used to deny a transaction before it is granted. In this paper, we start from a graph-based FDS named APATE [28]: this algorithm uses a collective inference algorithm to spread fraudulent influence through a network by using a limited set of confirmed fraudulent transactions. We propose several improvements from the network data analysis literature [16] and semi-supervised learning [9] to this approach. Furthermore, we redesigned APATE to fit the reality of the e-commerce field. Those improvements have a high impact on performance, multiplying Precision@100 by three, both on fraudulent card and transaction prediction. This new method is assessed on a three-month real-life e-commerce credit card transaction data set obtained from a large credit card issuer.
Conference Paper
Fraud detection is a critical problem affecting large financial companies that has increased due to the growth in credit card transactions. This paper presents a new method for automatic detection of frauds in credit card transactions based on non-linear signal processing. The proposed method consists of the following stages: feature extraction, training and classification, decision fusion, and result presentation. Discriminant-based classifiers and an advanced non-Gaussian mixture classification method are employed to distinguish between legitimate and fraudulent transactions. The posterior probabilities produced by classifiers are fused by means of order statistical digital filters. Results from data mining of a large database of real transactions are presented. The feasibility of the proposed method is demonstrated for several datasets using parameters derived from receiver operating characteristic analysis and key performance indicators of the business.
Article
The Shapley value is arguably the most central normative solution concept in cooperative game theory. It specifies a unique way in which the reward from cooperation can be "fairly" divided among players. While it has a wide range of real world applications, its use is in many cases hampered by the hardness of its computation. A number of researchers have tackled this problem by (1) focusing on classes of games where the Shapley value can be computed efficiently, or (2) proposing representation formalisms that facilitate such efficient computation, or (3) approximating the Shapley value in certain classes of games. However, given the classical characteristic function representation, the only attempt to approximate the Shapley value for the general class of games is due to Castro et al. While this algorithm provides a bound on the approximation error, this bound is asymptotic, meaning that it only holds when the number of samples increases to infinity. On the other hand, when a finite number of samples is drawn, an unquantifiable error is introduced, meaning that the bound no longer holds. With this in mind, we provide non-asymptotic bounds on the estimation error for two cases: where (1) the variance, and (2) the range, of the players' marginal contributions is known. Furthermore, for the second case, we show that when the range is significantly large relative to the Shapley value, the bound can be improved (from $O(r\sqrt{1/m})$ to $O(\sqrt{r}\sqrt{1/m})$). Finally, we propose, and demonstrate the effectiveness of, using stratified sampling to improve the bounds.
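As a rough illustration of the sampling-based approximation this abstract refers to, the following Python sketch estimates Shapley values by averaging marginal contributions over random player orderings. The function name `sample_shapley`, the toy "rule coverage" game and its rule names are assumptions made for illustration only; they are not taken from the cited paper, and the sketch implements neither its error bounds nor its stratified sampling.

```python
import random
from typing import Callable, Dict, FrozenSet, Sequence


def sample_shapley(
    players: Sequence[str],
    value: Callable[[FrozenSet[str]], float],
    num_samples: int,
    seed: int = 0,
) -> Dict[str, float]:
    """Estimate Shapley values from `num_samples` random orderings of the players."""
    rng = random.Random(seed)
    totals = {p: 0.0 for p in players}
    for _ in range(num_samples):
        order = list(players)
        rng.shuffle(order)
        coalition = set()
        previous = value(frozenset(coalition))
        for p in order:
            # Marginal contribution of p when it joins the players ahead of it.
            coalition.add(p)
            current = value(frozenset(coalition))
            totals[p] += current - previous
            previous = current
    return {p: total / num_samples for p, total in totals.items()}


# Hypothetical toy game (an assumption, not data from any cited work):
# each rule flags a set of fraud IDs, and a pool is worth the number of
# distinct frauds flagged by at least one of its rules.
COVERAGE = {"rule_a": {1, 2}, "rule_b": {2, 3}, "rule_c": {3}}


def covered_frauds(coalition: FrozenSet[str]) -> float:
    flagged = set()
    for rule in coalition:
        flagged |= COVERAGE[rule]
    return float(len(flagged))


estimates = sample_shapley(list(COVERAGE), covered_frauds, num_samples=5000)
```

In this toy game the exact Shapley values are 1.5, 1.0 and 0.5 for `rule_a`, `rule_b` and `rule_c`, because each fraud's unit of value is split equally among the rules that flag it, so the estimates can be compared against them. Within every sampled ordering the marginal contributions telescope, so the estimates always sum to the value of the full pool.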
Article
Working with multiple regression analysis, a researcher usually wants to know the comparative importance of the predictors in the model. However, the analysis can be made difficult because of multicollinearity among regressors, which produces biased coefficients and negative inputs to multiple determination from presumably useful regressors. To solve this problem we apply a tool from cooperative game theory, the Shapley Value imputation. We demonstrate the theoretical and practical advantages of the Shapley Value and show that it provides consistent results in the presence of multicollinearity. Copyright © 2001 John Wiley & Sons, Ltd.
Conference Paper
We present and study the Contribution-Selection algorithm (CSA), a novel algorithm for feature selection. The algorithm is based on the Multiperturbation Shapley Analysis, a framework which relies on game theory to estimate usefulness. The algorithm iteratively estimates the usefulness of features and selects them accordingly, using either forward selection or backward elimination. Empirical comparison with several other existing feature selection methods shows that the backward elimination variant of CSA leads to the most accurate classification results on an array of datasets.
Article
Many multiagent domains where cooperation among agents is crucial to achieving a common goal can be modeled as coalitional games. However, in many of these domains, agents are unequal in their power to affect the outcome of the game. Prior research on weighted voting games has explored power indices, which reflect how much “real power” a voter has. Although primarily used for voting games, these indices can be applied to any simple coalitional game. Computing these indices is known to be computationally hard in various domains, so one must sometimes resort to approximate methods for calculating them. We suggest and analyze randomized methods to approximate power indices such as the Banzhaf power index and the Shapley–Shubik power index. Our approximation algorithms do not depend on a specific representation of the game, so they can be used in any simple coalitional game. Our methods are based on testing the game’s value for several sample coalitions. We show that no approximation algorithm can do much better for general coalitional games, by providing lower bounds for both deterministic and randomized algorithms for calculating power indices. We also provide empirical results regarding our method, and show that it typically achieves much better accuracy and confidence than those required.
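The idea of approximating a power index by testing the game's value on sampled coalitions can be sketched as follows for the raw (non-normalized) Banzhaf index of a weighted voting game. The function names, the weights and the quota are hypothetical, and this is only a minimal sketch of coalition sampling in general, not the specific algorithms or confidence bounds analyzed in the cited article.

```python
import random
from typing import Dict, List, Sequence


def wins(weights: Sequence[int], coalition: List[int], quota: int) -> bool:
    """A coalition wins a weighted voting game if its total weight reaches the quota."""
    return sum(weights[i] for i in coalition) >= quota


def sample_banzhaf(
    weights: Sequence[int], quota: int, num_samples: int, seed: int = 0
) -> Dict[int, float]:
    """Estimate each player's raw Banzhaf index from uniformly sampled coalitions."""
    rng = random.Random(seed)
    n = len(weights)
    estimates = {}
    for i in range(n):
        swings = 0
        for _ in range(num_samples):
            # Including every other player with probability 1/2 draws a
            # coalition uniformly at random among those that exclude i.
            coalition = [j for j in range(n) if j != i and rng.random() < 0.5]
            if not wins(weights, coalition, quota) and wins(weights, coalition + [i], quota):
                swings += 1
        estimates[i] = swings / num_samples
    return estimates


# Hypothetical three-player game: weights 4, 2, 1 and quota 5.
banzhaf = sample_banzhaf([4, 2, 1], quota=5, num_samples=4000)
```

For this small game the exact raw Banzhaf indices can be enumerated by hand as 0.75, 0.25 and 0.25, which gives a reference point for judging the sampling error.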
Article
A few applications of the Shapley value are described. The main choice criterion is to look at quite diversified fields, to appreciate how wide is the terrain that has been explored and colonized using this and related tools.
Article
One of the fundamental research challenges in network science is centrality analysis, i.e., identifying the nodes that play the most important roles in the network. In this article, we focus on the game-theoretic approach to centrality analysis. While various centrality indices have been recently proposed based on this approach, it is still unknown how general is the game-theoretic approach to centrality and what distinguishes some game-theoretic centralities from others. In this article, we attempt to answer this question by providing the first axiomatic characterization of game-theoretic centralities. Specifically, we show that every possible centrality measure can be obtained following the game-theoretic approach. Furthermore, we study three natural classes of game-theoretic centrality, and prove that they can be characterized by certain intuitive properties pertaining to the well-known notion of Fairness due to Myerson.
Article
Due to the growing volume of electronic payments, the monetary strain of credit-card fraud is turning into a substantial challenge for financial institutions and service providers, thus forcing them to continuously improve their fraud detection systems. However, modern data-driven and learning-based methods, despite their popularity in other domains, only slowly find their way into business applications. In this paper, we phrase the fraud detection problem as a sequence classification task and employ Long Short-Term Memory (LSTM) networks to incorporate transaction sequences. We also integrate state-of-the-art feature aggregation strategies and report our results by means of traditional retrieval metrics. A comparison to a baseline random forest (RF) classifier showed that the LSTM improves detection accuracy on offline transactions where the card-holder is physically present at a merchant. Both the sequential and non-sequential learning approaches benefit strongly from manual feature aggregation strategies. A subsequent analysis of true positives revealed that both approaches tend to detect different frauds, which suggests a combination of the two. We conclude our study with a discussion on both practical and scientific challenges that remain unsolved.
Chapter
With the advancements in various data mining and social network-related approaches, datasets with a very high feature dimensionality are often used. Various information theoretic approaches have been tried to select the most relevant set of features, and hence bring down the size of the data. Most of the time these approaches try to find a way to rank the features, so as to select or remove a fixed number of features. These principles usually assume some probability distribution for the data. These approaches also fail to capture the individual contribution of every feature in a given set of features. In this paper, we propose an approach which uses the Relief algorithm and cooperative game theory to solve the problems mentioned above. The approach was tested on NIPS 2003 and UCI datasets using different classifiers and the results were comparable to the state-of-the-art methods.
Article
Independence between detectors is normally assumed in order to simplify the algorithms and techniques used in decision fusion. In this paper, we derive the optimum fusion rule of N non-independent detectors in terms of the individual probabilities of detection and false alarm and defined dependence factors. This is of interest for the implementation of the optimum detector, the incorporation of specific dependence models and for gaining insights into the implications of dependence. The latter is illustrated with a detailed analysis of the case of two equally-operated non-independent detectors. We show, for example, that not every dependence model is compatible with an arbitrary point of operation of the detectors, and that optimality of the counting rule is preserved in the presence of dependence if the individual detectors are “good enough”. We have also derived the expressions of the probability of detection and false alarm after fusion of dependent detectors. Theoretical results are verified in a real data experiment with acoustic signals.
Article
Every year billions of Euros are lost worldwide due to credit card fraud, forcing financial institutions to continuously improve their fraud detection systems. In recent years, several studies have proposed the use of machine learning and data mining techniques to address this problem. However, most studies used some sort of misclassification measure to evaluate the different solutions, and do not take into account the actual financial costs associated with the fraud detection process. Moreover, when constructing a credit card fraud detection model, it is very important to extract the right features from the transactional data. This is usually done by aggregating the transactions in order to observe the spending behavioral patterns of the customers. In this paper we expand the transaction aggregation strategy, and propose to create a new set of features based on analyzing the periodic behavior of the time of a transaction using the von Mises distribution. Then, using a real credit card fraud dataset provided by a large European card processing company, we compare state-of-the-art credit card fraud detection models, and evaluate how the different sets of features have an impact on the results. By including the proposed periodic features into the methods, the results show an average increase in savings of 13%.
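To see why a circular model such as the von Mises distribution suits time-of-day data, the short Python sketch below computes the circular mean and the mean resultant length of a set of transaction hours. The function names and the sample hours are hypothetical; this is only an illustration of the standard circular statistics that underlie a von Mises fit, not the feature construction of the cited article.

```python
import math
from typing import Sequence, Tuple


def circular_summary(hours: Sequence[float], period: float = 24.0) -> Tuple[float, float]:
    """Return (circular mean hour, mean resultant length) for times of day."""
    angles = [2.0 * math.pi * h / period for h in hours]
    mean_sin = sum(math.sin(a) for a in angles) / len(angles)
    mean_cos = sum(math.cos(a) for a in angles) / len(angles)
    # The mean direction is the angle of the average unit vector.
    mean_angle = math.atan2(mean_sin, mean_cos) % (2.0 * math.pi)
    # The resultant length lies in [0, 1]; values near 1 mean tightly clustered times.
    resultant_length = math.hypot(mean_sin, mean_cos)
    return mean_angle * period / (2.0 * math.pi), resultant_length


def circular_distance(a: float, b: float, period: float = 24.0) -> float:
    """Shortest distance between two times of day on the clock face."""
    d = abs(a - b) % period
    return min(d, period - d)


# Hypothetical cardholder who transacts at 23:00 and at 01:00.
mean_hour, resultant_length = circular_summary([23.0, 1.0])
```

The arithmetic mean of 23 and 1 is 12 (noon), whereas the circular mean falls at midnight, which is the behaviour a periodic feature needs. The mean direction is the usual estimate of the von Mises location parameter, and the resultant length increases with its concentration.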
Article
We present a sensitivity analysis-based method for explaining prediction models that can be applied to any type of classification or regression model. Its advantage over existing general methods is that all subsets of input features are perturbed, so interactions and redundancies between features are taken into account. Furthermore, when explaining an additive model, the method is equivalent to commonly used additive model-specific methods. We illustrate the method's usefulness with examples from artificial and real-world data sets and an empirical analysis of running times. Results from a controlled experiment with 122 participants suggest that the method's explanations improved the participants' understanding of the model.
Article
The transferable belief model is a model to represent quantified beliefs based on the use of belief functions, as initially proposed by Shafer. It is developed independently from any underlying related probability model. We summarize our interpretation of the model and present several recent results that characterize the model. We show how rational decisions must be made when beliefs are represented by belief functions. We explain the origin of the two Dempster's rules that underlie the dynamic of the model through the concept of specialization and least commitment. We present the canonical decomposition of any belief function, and discover the concept of 'debt of beliefs'. We also present the generalization of the Bayesian Theorem to belief functions.
Article
Credit card fraud is a serious and growing problem. While predictive models for credit card fraud detection are in active use in practice, reported studies on the use of data mining approaches for credit card fraud detection are relatively few, possibly due to the lack of available data for research. This paper evaluates two advanced data mining approaches, support vector machines and random forests, together with the well-known logistic regression, as part of an attempt to better detect (and thus control and prosecute) credit card fraud. The study is based on real-life data of transactions from an international credit card operation.
Article
The Shapley value is a key solution concept for coalitional games in general and voting games in particular. Its main advantage is that it provides a unique and fair solution, but its main drawback is the complexity of computing it (e.g., for voting games this complexity is #P-complete). However, given the importance of the Shapley value and voting games, a number of approximation methods have been developed to overcome this complexity. Among these, Owen's multi-linear extension method is the most time efficient, being linear in the number of players. Now, in addition to speed, the other key criterion for an approximation algorithm is its approximation error. On this dimension, the multi-linear extension method is less impressive. Against this background, this paper presents a new approximation algorithm, based on randomization, for computing the Shapley value of voting games. This method has time complexity linear in the number of players, but has an approximation error that is, on average, lower than Owen's. In addition to this comparative study, we empirically evaluate the error for our method and show how the different parameters of the voting game affect it. Specifically, we show the following effects. First, as the number of players in a voting game increases, the average percentage error decreases. Second, as the quota increases, the average percentage error decreases. Third, the error is different for players with different weights; players with weight closer to the mean weight have a lower error than those with weight further away. We then extend our approximation to the more general k-majority voting games and show that, for n players, the method has time complexity O(k²n) and the upper bound on its approximation error is .
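For reference, the quantity that such approximation methods target can be computed exactly for very small voting games by brute force. The Python sketch below enumerates all orderings of the players and counts how often each one is pivotal; the function name and the example weights and quota are hypothetical. This is the exponential-time baseline, not the linear-time approximation proposed in the cited article.

```python
from fractions import Fraction
from itertools import permutations
from typing import List, Sequence


def shapley_shubik(weights: Sequence[int], quota: int) -> List[Fraction]:
    """Exact Shapley value of a weighted voting game by enumerating all orderings.

    Assumes a positive quota that the grand coalition can reach, and is only
    practical for a handful of players because it visits n! orderings.
    """
    n = len(weights)
    pivots = [0] * n
    orderings = 0
    for order in permutations(range(n)):
        orderings += 1
        running = 0
        for player in order:
            running += weights[player]
            if running >= quota:
                # This player turns the losing coalition into a winning one.
                pivots[player] += 1
                break
    return [Fraction(count, orderings) for count in pivots]


# Hypothetical three-player game: weights 4, 2, 1 and quota 5.
index = shapley_shubik([4, 2, 1], quota=5)
```

With these weights the heaviest player is pivotal in four of the six orderings and each of the other two in one, giving values of 2/3, 1/6 and 1/6. The factorial growth in the number of orderings is what motivates approximation methods for larger games.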
Article
We present and study the contribution-selection algorithm (CSA), a novel algorithm for feature selection. The algorithm is based on the multiperturbation Shapley analysis (MSA), a framework that relies on game theory to estimate usefulness. The algorithm iteratively estimates the usefulness of features and selects them accordingly, using either forward selection or backward elimination. It can optimize various performance measures over unseen data such as accuracy, balanced error rate, and area under receiver-operator-characteristic curve. Empirical comparison with several other existing feature selection methods shows that the backward elimination variant of CSA leads to the most accurate classification results on an array of data sets.
Lloyd S Shapley and Martin Shubik. A method for evaluating the distribution of power in a committee system. American political science review, 48(03):787-792, 1954.
Alexandre Fréchette, Lars Kotthoff, Tomasz Michalak, Talal Rahwan, Holger H Hoos, and Kevin Leyton-Brown. Using the shapley value to analyze algorithm portfolios. In Thirtieth AAAI Conference on Artificial Intelligence, 2016.
Julian Stier, Gabriele Gianini, Michael Granitzer, and Konstantin Ziegler. Analysing neural network topologies: a game theoretic approach. Procedia Computer Science, 126:234 -243, 2018. Knowledge-Based and Intelligent Information and Engineering Systems: Proceedings of the 22nd International Conference, KES-2018, Belgrade, Serbia.
Erik Štrumbelj and Igor Kononenko. Explaining prediction models and individual predictions with feature contributions. Knowledge and information systems, 41(3):647-665, 2014.
Sasan Maleki, Long Tran-Thanh, Greg Hines, Talal Rahwan, Alex Rogers, Bounding the estimation error of sampling-based shapley value approximation, arXiv preprint arXiv:1306.4265, 2013.
Olivier Caelen, Gabriele Gianini, Ernesto Damiani, FR3065558A1, System and method to manage the detection of fraud in a system of financial transactions.