"The dataset contains transactions made by credit cards in September 2013 by European cardholders. This dataset presents transactions that occurred over two days, with 492 frauds out of 284,807 transactions. The dataset is highly unbalanced: the positive class (frauds) accounts for 0.172% of all transactions.
It contains only numerical input variables, which are the result of a PCA transformation. Unfortunately, due to confidentiality issues, we cannot provide the original features or more background information about the data. Features V1, V2, ... V28 are the principal components obtained with PCA; the only features that have not been transformed with PCA are 'Time' and 'Amount'. Feature 'Time' contains the seconds elapsed between each transaction and the first transaction in the dataset. The feature 'Amount' is the transaction amount; this feature can be used for example-dependent cost-sensitive learning. Feature 'Class' is the response variable, and it takes value 1 in case of fraud and 0 otherwise."
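To make the imbalance concrete, here is a minimal Python sketch that uses only the counts quoted above. The function names, the inverse-frequency weighting, and the cost matrix are my own illustrative assumptions, not something specified in the dataset description. In that assumed cost matrix a missed fraud costs the transaction amount and any alert costs a fixed administrative fee; the 5.0 default is a made-up value.

```python
# Counts quoted in the dataset description above.
N_TOTAL = 284_807
N_FRAUD = 492


def fraud_rate(n_fraud, n_total):
    """Fraction of transactions labelled as fraud (Class == 1)."""
    return n_fraud / n_total


def balanced_class_weights(n_fraud, n_total):
    """Inverse-frequency weights: n_total / (n_classes * class_count).

    Assumption: this is one common heuristic for unbalanced data,
    not a method prescribed by the dataset authors.
    """
    n_legit = n_total - n_fraud
    return {0: n_total / (2 * n_legit), 1: n_total / (2 * n_fraud)}


def example_cost(y_true, y_pred, amount, admin_cost=5.0):
    """Cost of one decision under a hypothetical example-dependent cost matrix.

    - predicted fraud (alert raised): fixed admin_cost, fraud or not
    - missed fraud: the transaction 'Amount' is lost
    - legitimate transaction, no alert: no cost
    """
    if y_pred == 1:
        return admin_cost
    if y_true == 1:
        return amount
    return 0.0


if __name__ == "__main__":
    print(f"fraud rate: {fraud_rate(N_FRAUD, N_TOTAL):.4%}")  # fraud rate: 0.1727%
    print("class weights:", balanced_class_weights(N_FRAUD, N_TOTAL))
    print("cost of missing a 120.00 fraud:", example_cost(1, 0, 120.0))
```

With weights like these, a mistake on a fraud case counts a few hundred times more than a mistake on a legitimate transaction. The per-example cost goes one step further and makes the penalty depend on the 'Amount' of each transaction.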
Another, older dataset that is available is the "German Credit fraud data", which is in the ARFF format used by the Weka machine learning toolkit.
Recent publications:
GOTCHA! Network-based fraud detection for social security fraud Van Vlasselaer, V., Eliassi-Rad, T., Akoglu, L., Snoeck, M., Baesens, B., Management Science, accepted 2017.
A graph-based, semi-supervised, credit card fraud detection system B. Lebichot, F. Braun, O. Caelen, and M. Saerens, International Workshop on Complex Networks and their Applications, 721-733, 2016. Springer
Feature engineering strategies for credit card fraud detection AC Bahnsen, D Aouada, A Stojanovic, B Ottersten, Expert Systems with Applications 51, 134-142, 2016
Ensemble of Example-Dependent Cost-Sensitive Decision Trees AC Bahnsen, D Aouada, B Ottersten, arXiv preprint arXiv:1505.04637, 2015
Example-dependent cost-sensitive decision trees AC Bahnsen, D Aouada, B Ottersten, Expert Systems with Applications 42 (19), 6609-6619, 2015
APATE: A novel approach for automated credit card transaction fraud detection using network-based extensions. Van Vlasselaer V., Bravo C., Caelen O., Eliassi-Rad T., Akoglu L., Snoeck M., Baesens B., Decision Support Systems. 2015. Elsevier
Detecting Credit Card Fraud using Periodic Features AC Bahnsen, D Aouada, A Stojanovic, 2015 IEEE 14th International Conference on Machine Learning and Applications.
Credit Card Fraud Detection and Concept-Drift Adaptation with Delayed Supervised Information A. Dal Pozzolo, G. Boracchi, O. Caelen, C. Alippi and G. Bontempi, International Joint Conference on Neural Networks (IJCNN), Killarney, Ireland, 2015.
Learned lessons in credit card fraud detection from a practitioner perspective A. Dal Pozzolo, O. Caelen, Y. Le Borgne, S. Waterschoot, and G. Bontempi, Expert Systems with Applications, vol. 41, no. 10, pp. 4915–4928, 2014.
Using HDDT to avoid instances propagation in unbalanced and evolving data streams A. Dal Pozzolo, R. A Johnson, O. Caelen, S. Waterschoot, N. V Chawla, and G. Bontempi, International Joint Conference on Neural Networks (IJCNN), Beijing, China, 2014.
Cost sensitive credit card fraud detection using Bayes minimum risk AC Bahnsen, A Stojanovic, D Aouada, B Ottersten, Machine Learning and Applications (ICMLA), 2013
The only entities that have the data on Credit Card Fraud Detection are the credit card companies. I am not sure that credit card companies can release this type of data to outsiders. But you never know. Just contact them and tell them that you are an academic who needs the data to support your research.
I agree with Ako's response. I have used credit card data for fraud analysis, but only as a representative auditor or consultant of the company whose data was being analysed - i.e. as a customer of the bank. The data is readily available when you go through the customer who owns the data.
I know it's been two years since your question and you're probably working on other things. Maybe it's not too late. If you read this, I hope it helps.
Neither the German nor the Australian credit dataset is a fraud dataset. Both are risk-scoring datasets for predicting repayment or default probability. Neither is severely imbalanced, and both can be used without resampling.