Chapter

Secure Multiparty PageRank Algorithm for Collaborative Fraud Detection

Abstract

Collaboration between financial institutions helps to improve fraud detection. However, exchanging relevant data between these institutions is often not possible due to privacy constraints and data confidentiality. An important example of relevant data for fraud detection is a transaction graph, where the nodes represent bank accounts and the links represent the transactions between these accounts. Previous work shows that features derived from such graphs, like PageRank, can be used to improve fraud detection. However, each institution can only see a part of the whole transaction graph, corresponding to the accounts of its own customers. This research describes a new method that uses secure multiparty computation (MPC) techniques to allow multiple parties to jointly compute the PageRank values of their combined transaction graphs securely, while guaranteeing that each party only learns the PageRank values of its own accounts and nothing about the other transaction graphs. In our experiments this method is applied to graphs containing up to tens of thousands of nodes. The execution time scales linearly with the number of nodes, and the method is highly parallelizable. Extrapolating the results from our experiments suggests that secure multiparty PageRank is feasible in a realistic setting with millions of nodes per party.
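As a point of reference for what is being computed, the following is a minimal plaintext sketch of PageRank by power iteration on a combined transaction graph. It involves no cryptography and is not the secure protocol of the chapter. The damping factor 0.85, the iteration count, the unweighted edges, and the account names are all assumptions made for illustration.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Plain power iteration on a directed edge list; no cryptography involved."""
    nodes = sorted({n for edge in edges for n in edge})
    out_degree = {n: 0 for n in nodes}
    for src, _ in edges:
        out_degree[src] += 1
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # Rank held by accounts without outgoing transactions is spread evenly.
        dangling = sum(rank[n] for n in nodes if out_degree[n] == 0)
        base = (1.0 - damping) / len(nodes) + damping * dangling / len(nodes)
        new_rank = {n: base for n in nodes}
        for src, dst in edges:
            new_rank[dst] += damping * rank[src] / out_degree[src]
        rank = new_rank
    return rank

# Hypothetical toy data: bank A owns accounts a1 and a2, bank B owns b1 and b2.
edges_bank_a = [("a1", "a2"), ("a1", "b1"), ("a2", "b1")]
edges_bank_b = [("b1", "b2"), ("b2", "a1"), ("a2", "b1")]  # a2 -> b1 is seen by both banks

# In the clear, the graphs are simply merged; the secure method avoids this step.
combined = sorted(set(edges_bank_a) | set(edges_bank_b))
ranks = pagerank(combined)

# Each party should only learn the values of its own accounts.
ranks_for_a = {n: r for n, r in ranks.items() if n.startswith("a")}
```

In this sketch the merge of the two edge lists happens in the clear, which is exactly what the privacy constraints rule out; the final filtering step only mimics the output restriction described in the abstract.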
... Their survey of standards development organizations identified several active working groups specifically addressing homomorphic encryption in financial contexts, though coordination between these efforts remains limited. The researchers estimated that initial implementation standards could emerge within a couple of years, though comprehensive standards suitable for regulatory compliance would likely require several more years of development and validation before achieving industry-wide acceptance. Banerjee identified interoperability as a primary concern among financial institutions, with the vast majority of survey respondents ranking it among their top implementation challenges [8]. Their technical analysis revealed that the global payment ecosystem comprises a complex network of interconnected systems with varying capabilities and update cycles, creating significant coordination challenges for implementing advanced cryptographic techniques. The researchers documented that the average international payment transaction interacts with multiple distinct systems during processing, each potentially requiring modification to support homomorphic operations. ...
... While seemingly minor, these inconsistencies can propagate through complex computational chains, potentially resulting in transaction discrepancies that would be unacceptable in financial environments. Ensuring computational consistency across heterogeneous environments requires additional validation layers that further ... Despite these challenges, gateway architectures offer promising approaches for enhancing interoperability. Rahman and Banerjee documented successful implementations using specialized translation layers that encapsulate homomorphic operations within standardized interfaces, allowing gradual integration with existing systems [8]. Their case study of a regional payment processor's implementation revealed that this approach allowed the organization to achieve a substantial portion of the security benefits of full homomorphic implementation while requiring modifications to only a limited portion of existing systems. ...
Article
with legacy systems, performance overhead, standardization requirements, and interoperability concerns across heterogeneous payment ecosystems. This article analyzes these challenges alongside emerging optimization techniques and implementation strategies that show promise for overcoming current adoption barriers. As computational efficiency continues to improve through hardware acceleration and algorithmic innovations, FHE stands poised to revolutionize payment security by fundamentally altering how sensitive financial data is processed in distributed environments.
... Since then, SMPC has found applications in diverse domains, from secure statistical analysis [63], to more specific applications. The best known are those used in the financial sphere [67], in biomedical fields [47,74], and in the detection of satellite collisions [30]. ...
Article
In the era of Big Data and the advancement of the Internet of Things, there is an increasing amount of valuable information. This data is usually sensitive or confidential, so security and privacy are two of the highest priorities for organizations when performing Data Mining. Researchers have explored techniques such as secure multi-party computation (SMPC) in the last decades. Nevertheless, there is still a significant gap between the theory of SMPC and its applicability, especially when working with resource-constrained devices or massive data. This work is a systematic literature review that analyzes the open issues of adapting SMPC to those scenarios, classifying the studies to answer two research questions: (1) how have researchers attempted to adapt SMPC to constrained devices? and (2) how have traditional techniques been fitted to Big Data? At the end of the process, after analyzing a total of 637 studies, 19 papers were selected. Regarding constrained devices, solutions are grouped into three main techniques: secure outsourcing, hardware-based trusted execution, and intermediate representations. As for Big Data, the selected studies use mixed protocols to switch between cleartext and ciphertext, combine different types of SMPC protocols, or modify existing protocols through optimizations.
... The detection of financial crime is a typical example of a setting in which multiple parties share a common interest, but confidentiality and privacy regulations prevent collaboration [18]. A financial institution typically knows the details of a transaction only if it was involved in it. ...
Thesis
The aim of the thesis was to demonstrate how graph algorithms are efficient in the field of fraud detection.
Article
Privacy-preserving collaborative data analysis has been a popular research direction in recent years. Among all such analysis tasks, privacy-preserving SQL queries on multi-party databases are of particular industrial interest. Although the privacy concern can be addressed by many cryptographic tools, such as secure multi-party computation (MPC), the efficiency of executing such SQL queries is far from satisfactory, especially for high-volume databases. In particular, existing MPC-based solutions treat each SQL query as an isolated task and launch it from scratch, despite the fact that many SQL queries are run regularly and overlap somewhat in their functionality. In this work, we exploit this observation to improve the efficiency of MPC-based, privacy-preserving SQL queries. We introduce a cache-like optimization mechanism. To ensure a higher cache hit rate and reduce redundant MPC operators, we present a cache structure different from that of plain databases and design a set of cache strategies. Our optimization mechanism, SMPCache, can be built upon secret-sharing-based MPC frameworks, which attract much attention from the industry. To demonstrate the utility of SMPCache, we implement it on Rosetta, an open-source MPC library, and use real-world datasets to launch extensive experiments on some basic SQL operators (e.g., Filter, Order-by, Aggregation, and Inner-Join) and some representative composite SQL queries. To give a data point, we note that SMPCache can achieve up to a 3536× efficiency improvement on the TPC-DS dataset and 562× on the TPC-H dataset at a moderate storage cost. We also apply SMPCache to the basic SQL operators (Filter, Order-by, Group-by, Aggregation, and Inner-join) of the Secrecy framework, achieving up to a 127.3× efficiency improvement.
Article
In the business scenarios at Ant Group, there is a rising demand for collaborative data analysis among multiple institutions, which can promote health insurance, financial services, risk control, and others. However, the increasing concern about privacy issues has led to data silos. Secure Multi-Party Computation (MPC) provides an effective solution for collaborative data analysis, which can utilize data value while ensuring data security. Nevertheless, the performance bottlenecks of MPC and the strong demand for scalability pose great challenges to secure collaborative data analysis frameworks. In this paper, we build a secure collaborative data analysis system SCQL with a general purpose. We design more efficient MPC protocols and relational operators to meet the demand for scalability. In terms of system design, we aim to implement a system with security, usability, and efficiency. We conduct extensive experiments on SCQL to validate our optimization improvements: (1) Our optimized secure sort protocol sorts one million 64-bit data in only 4.5 minutes, 126× faster than EMP (9.4 hours). (2) The end-to-end execution time of the typical vertical scenario query is reduced by 1991× from the state-of-the-art semi-honest collaborative analysis framework Secrecy (rewritten with Additive Secret Sharing protocol), with appropriate security tradeoffs. (3) We test the system in the WAN setting with input size = 10⁷ to demonstrate the scalability. We have successfully deployed SCQL to address problems in real-world business scenarios at Ant Group.
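The additive secret sharing mentioned above can be illustrated with a short sketch. This is a generic textbook construction, not SCQL's or Secrecy's implementation; the ring size `2**64`, the three-party default, and all function names are assumptions chosen for the example.

```python
import secrets

MODULUS = 2**64  # assumed ring size, chosen to fit 64-bit values

def share(value, parties=3):
    """Split value into `parties` random shares that sum to value mod MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(parties - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares

def reconstruct(shares):
    """Recombine all shares; any strict subset of them looks uniformly random."""
    return sum(shares) % MODULUS

def add_shared(x_shares, y_shares):
    """Each party adds its own two shares locally; no communication is needed."""
    return [(x + y) % MODULUS for x, y in zip(x_shares, y_shares)]

x_shares = share(1200)
y_shares = share(345)
total = reconstruct(add_shared(x_shares, y_shares))
```

Addition is free in this representation, which is one reason secret-sharing-based frameworks suit aggregation queries; multiplication and comparison need interaction between the parties and are not shown here.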
Chapter
Existing risk control techniques have primarily been developed from the perspectives of de-anonymizing address clustering and illicit account classification. However, these techniques cannot be used to ascertain the potential risks for all accounts and are limited by specific heuristic strategies or insufficient label information. These constraints motivate us to seek an effective rating method for quantifying the spread of risk in a transaction network. To the best of our knowledge, we are the first to address the problem of account risk rating on Ethereum by proposing a novel model called RiskProp, which includes a de-anonymous score to measure transaction anonymity and a network propagation mechanism to formulate the relationships between accounts and transactions. We demonstrate the effectiveness of RiskProp in overcoming the limitations of existing models by conducting experiments on real-world datasets from Ethereum. Through case studies on the detected high-risk accounts, we demonstrate that the risk assessment by RiskProp can be used to provide warnings for investors and protect them from possible financial losses, and the superior performance of risk score-based account classification experiments further verifies the effectiveness of our rating method.
Article
Despite much progress, general-purpose secure multi-party computation (MPC) with active security may still be prohibitively expensive in settings with large input datasets. This particularly applies to the secure evaluation of graph algorithms, where each party holds a subset of a large graph. Recently, Araki et al. (ACM CCS '21) showed that dedicated solutions may provide significantly better efficiency if the input graph is sparse. In particular, they provide an efficient protocol for the secure evaluation of “message passing” algorithms, such as the PageRank algorithm. Their protocol's computation and communication complexity are both Õ(M·B) instead of the O(M²) complexity achieved by general-purpose MPC protocols, where M denotes the number of nodes and B the (average) number of incoming edges per node. On the downside, their approach achieves only a relatively weak security notion: 1-out-of-3 malicious security with selective abort. In this work, we show that PageRank can instead be captured efficiently as a restricted multiplication straight-line (RMS) program, and present a new actively secure MPC protocol tailored to handle RMS programs. In particular, we show that the local knowledge of the participants can be leveraged towards the first maliciously-secure protocol with communication complexity linear in M, independently of the sparsity of the graph. We present two variants of our protocol. In our communication-optimized protocol, going from semi-honest to malicious security only introduces a small communication overhead, but results in quadratic computation complexity O(M²). In our balanced protocol, we still achieve a linear communication complexity O(M), although with worse constants, but a significantly better computational complexity scaling with O(M·B). Additionally, our protocols achieve security with identifiable abort and can tolerate up to n − 1 corruptions.
Chapter
The rapid development of internet finance has caused increasing concern in online payment fraud due to its great threat. Online payment fraud detection, a challenge faced by online service, plays an important role in rapidly evolving e-commerce. At present, most platforms use rule systems or machine learning based technologies to detect fraud. It is usually taken for granted that the occurrence of unauthorized behaviors is necessary for fraud detection in online payment services. Behavior-based methods are recognized as promising methods for online payment fraud detection. However, building high-performance behavior models for fraud detection faces several huge challenges, e.g., ex-ante anti-fraud and new fraud attacks. To this end, we have designed two horizontal association modeling solutions. (1) We strive to design an ex-ante anti-fraud method that can work before unauthorized behaviors occur. The feasibility of our solution is supported by the cooperation of a characteristic and a finding in online payment fraud scenarios: The well-recognized characteristic is that online payment frauds are mostly caused by account compromise. Our finding is that account theft is indeed predictable based on users’ high-risk behaviors, without relying on the behaviors of thieves. Accordingly, we propose an account risk prediction scheme to realize the ex-ante fraud detection. It takes in an account’s historical transaction sequence, and outputs its risk score. The risk score is then used as an early evidence of whether a new transaction is fraudulent or not, before the occurrence of the new transaction. We examine our method on a real-world B2C transaction dataset from a commercial bank. Experimental results show that the ex-ante detection method can prevent more than 80% of the fraudulent transactions before they actually occur.
When the proposed method is combined with an interim detection to form a real-time anti-fraud system, it can detect more than 94% of fraudulent transactions while maintaining a very low false alarm rate (less than 0.1%). (2) We pursue an adaptive learning approach to detect fraudulent online payment transactions with automatic sliding time windows. Accordingly, we make efforts on optimizing the setting of windows and improving the adaptability. We design an intelligent window, called learning automatic window (LAW). It utilizes the learning automata to learn the proper parameters of time windows and adjust them dynamically and regularly according to the variation and oscillation of fraudulent transaction patterns. By the experiments over a real-world dataset of the online payment service from a commercial bank, we validate the gain of LAW in terms of detection effectiveness and robustness. To the best of our knowledge, this is the first work to make a sliding time window for fraud detection capable of learning its proper size in changing situations.
Conference Paper
We report on the design and implementation of a system that uses multiparty computation to enable banks to benchmark their customers’ confidential performance data against a large representative set of confidential performance data from a consultancy house. The system ensures that both the banks’ and the consultancy house’s data stay confidential: the banks, as clients, learn nothing but the computed benchmarking score. In the concrete business application, the developed prototype helps Danish banks to find the most efficient customers among a large and challenging group of agricultural customers with too much debt. We propose a model based on linear programming for doing the benchmarking and implement it using the SPDZ protocol by Damgård et al., which we modify using a new idea that allows clients to supply data and get output without having to participate in the preprocessing phase and without keeping state during the computation. We ran the system with two servers doing the secure computation using a database with information on about 2500 users. Answers arrived in about 25 s.
Conference Paper
In this paper we present a parallel approach to compute interleaved Montgomery multiplication. This approach is particularly suitable to be computed on 2-way single instruction, multiple data platforms as can be found on most modern computer architectures in the form of vector instruction set extensions. We have implemented this approach for tablet devices which run the x86 architecture (Intel Atom Z2760) using SSE2 instructions as well as devices which run on the ARM platform (Qualcomm MSM8960, NVIDIA Tegra 3 and 4) using NEON instructions. When instantiating modular exponentiation with this parallel version of Montgomery multiplication we observed a performance increase of more than a factor of 1.5 compared to the sequential implementation in OpenSSL for the classical arithmetic logic unit on the Atom platform for 2048-bit moduli.
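For readers unfamiliar with the underlying primitive, here is a minimal sketch of ordinary, sequential Montgomery multiplication on whole integers. It is not the interleaved 2-way SIMD variant the paper describes; the function names and the small modulus are assumptions for illustration, and `pow(x, -1, m)` requires Python 3.8 or later.

```python
def montgomery_setup(modulus, bits):
    """Precompute n_prime for an odd modulus, with R = 2**bits > modulus."""
    r = 1 << bits
    # n_prime satisfies modulus * n_prime ≡ -1 (mod R).
    return (-pow(modulus, -1, r)) % r

def montgomery_multiply(a_bar, b_bar, modulus, bits, n_prime):
    """Return a_bar * b_bar * R^-1 mod modulus using only masks and shifts by R."""
    mask = (1 << bits) - 1
    t = a_bar * b_bar
    m = ((t & mask) * n_prime) & mask   # makes t + m * modulus divisible by R
    u = (t + m * modulus) >> bits       # exact division by R
    return u - modulus if u >= modulus else u

def to_montgomery(x, modulus, bits):
    """Map x to its Montgomery form x * R mod modulus."""
    return (x << bits) % modulus

def from_montgomery(x_bar, modulus, bits, n_prime):
    """Multiplying by 1 strips one factor of R and leaves x mod modulus."""
    return montgomery_multiply(x_bar, 1, modulus, bits, n_prime)

modulus, bits = 1000003, 20             # hypothetical small odd modulus, R = 2**20
n_prime = montgomery_setup(modulus, bits)
a, b = 123456, 654321
product_bar = montgomery_multiply(
    to_montgomery(a, modulus, bits), to_montgomery(b, modulus, bits),
    modulus, bits, n_prime,
)
product = from_montgomery(product_bar, modulus, bits, n_prime)
```

The benefit is that the reduction step replaces division by the modulus with a mask and a shift, which is what makes the operation attractive for repeated use inside modular exponentiation.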
Article
When processing data in the encrypted domain, homomorphic encryption can be used to enable linear operations on encrypted data. Integer division of encrypted data however requires an additional protocol between the client and the server and will be relatively expensive. We present new solutions for dividing encrypted data in the semi-honest model using homomorphic encryption and additive blinding, having low computational and communication complexity. In most of our protocols we assume the divisor is publicly known. The division result is not only computed exactly, but may also be approximated leading to further improved performance. The idea of approximating the result of an integer division is extended to similar results for secure comparison, secure minimum, and secure maximum in the client-server model, yielding new efficient protocols with demonstrated application in biometrics. The exact minimum protocol is shown to outperform existing approaches.
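The arithmetic behind dividing an additively blinded value by a public divisor can be sketched in the clear. The code below only demonstrates the identity that makes such a protocol possible; it performs no encryption, is not one of the paper's protocols, and the function name, the blinding size, and the assumption of a non-negative input are illustrative choices.

```python
import secrets

def blinded_division(x, divisor, blinding_bits=40):
    """Compute x // divisor (x >= 0) while the 'client' only ever sees x + r."""
    # Server side: pick a random blind r. In a real protocol it would be added
    # to the ciphertext of x homomorphically.
    r = secrets.randbits(blinding_bits)
    z = x + r                              # the only value the client decrypts

    # Client side: divide the blinded value by the public divisor.
    z_quot, z_rem = divmod(z, divisor)

    # Server side: subtract the blind's quotient. A borrow of 1 is needed when
    # the blinded remainder is smaller than the blind's remainder; in a real
    # protocol this check would be a secure comparison on encrypted values.
    r_quot, r_rem = divmod(r, divisor)
    borrow = 1 if z_rem < r_rem else 0
    return z_quot - r_quot - borrow

result = blinded_division(1000, 7)
```

The comparison in the borrow step is the expensive part in the encrypted setting, which is consistent with the abstract's remark that approximating the result can improve performance.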
Article
We present a polynomial-time algorithm that, given as input the description of a game with incomplete information and any number of players, produces a protocol for playing the game that leaks no partial information, provided the majority of the players is honest. Our algorithm automatically solves all the multi-party protocol problems addressed in complexity-based cryptography during the last 10 years. It actually is a completeness theorem for the class of distributed protocols with honest majority. Such a completeness theorem is optimal in the sense that, if the majority of the players is not honest, some protocol problems have no efficient solution [C].
Article
We propose a new approach to practical two-party computation secure against an active adversary. All prior practical protocols were based on Yao's garbled circuits. We use an OT-based approach and get efficiency via OT extension in the random oracle model. To get a practical protocol we introduce a number of novel techniques for relating the outputs and inputs of OTs in a larger construction. We also report on an implementation of this approach, that shows that our protocol is more efficient than any previous one: For big enough circuits, we can evaluate more than 20000 Boolean gates per second. As an example, evaluating one oblivious AES encryption (~34000 gates) takes 64 seconds, but when repeating the task 27 times it only takes less than 3 seconds per instance.
Conference Paper
We present a new approach to cross channel fraud detection: build graphs representing transactions from all channels and use analytics on features extracted from these graphs. Our underlying hypothesis is community based fraud detection: an account (holder) performs normal or trusted transactions within a community that is “local” to the account. We explore several notions of community based on graph properties. Our results show that properties such as shortest distance between transaction endpoints, whether they are in the same strongly connected component, whether the destination has high page rank, etc., provide excellent discriminators of fraudulent and normal transactions whereas traditional social network analysis yields poor results. Evaluation on a large dataset from a European bank shows that such methods can substantially reduce false positives in traditional fraud scoring. We show that classifiers built purely out of graph properties are very promising, with high AUC, and can complement existing fraud detection approaches.
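Two of the graph properties named above, shortest distance between transaction endpoints and membership in the same strongly connected component, can be sketched with a plain breadth-first search. This is an illustrative sketch on a hypothetical toy graph, not the authors' feature pipeline; the account names and function names are assumptions.

```python
from collections import deque

def shortest_distance(adjacency, source, target):
    """Breadth-first hop count from source to target, or None if unreachable."""
    seen, queue = {source}, deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == target:
            return dist
        for nxt in adjacency.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None

def same_strongly_connected_component(adjacency, a, b):
    """Two nodes share an SCC exactly when each can reach the other."""
    return (shortest_distance(adjacency, a, b) is not None
            and shortest_distance(adjacency, b, a) is not None)

# Hypothetical transaction graph: acct1 -> acct2 -> acct3 -> acct1 form a cycle,
# and acct4 pays into the cycle but never receives from it.
graph = {
    "acct1": ["acct2"],
    "acct2": ["acct3"],
    "acct3": ["acct1"],
    "acct4": ["acct1"],
}
features = {
    "distance": shortest_distance(graph, "acct4", "acct3"),
    "same_scc": same_strongly_connected_component(graph, "acct4", "acct3"),
}
```

For large graphs a dedicated SCC algorithm such as Tarjan's would be preferable to two reachability searches per pair; the pairwise check is used here only because it is short.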
Conference Paper
We investigate the problem of solving traditional combinatorial graph problems using secure multi-party computation techniques, focusing on the shortest path and the maximum flow problems. To the best of our knowledge, this is the first time these problems have been addressed in a general multi-party computation setting. Our study highlights several complexity gaps and suggests the exploration of various trade-offs, while also offering protocols that are efficient enough to solve real-world problems.
Conference Paper
Secure two-party computation allows two untrusting parties to jointly compute an arbitrary function on their respective private inputs while revealing no information beyond the outcome. Existing cryptographic compilers can automatically generate secure computation protocols from high-level specifications, but are often limited in their use and in the efficiency of the generated protocols, as they are based on either garbled circuits or (additively) homomorphic encryption only. In this paper we present TASTY, a novel tool for automating, i.e., describing, generating, executing, benchmarking, and comparing, efficient secure two-party computation protocols. TASTY is a new compiler that can generate protocols based on homomorphic encryption and efficient garbled circuits as well as combinations of both, which often yields the most efficient protocols available today. The user provides a high-level description of the computations to be performed on encrypted data in a domain-specific language. This is automatically transformed into a protocol. TASTY provides the most recent techniques and optimizations for practical secure two-party computation with low online latency. Moreover, it allows efficient evaluation of circuits generated by the well-known Fairplay compiler. We use TASTY to compare protocols for secure multiplication based on homomorphic encryption with those based on garbled circuits and highly efficient Karatsuba multiplication. Further, we show how TASTY improves the online latency for securely evaluating the AES functionality by an order of magnitude compared to previous software implementations. TASTY can automatically generate efficient secure protocols for many privacy-preserving applications; we consider private set intersection and face recognition as use cases.