Asked 1st Nov, 2021

Where can I find activity-based datasets?

I am studying Computer Science and am currently working on my Bachelor's thesis, for which I am looking for suitable datasets. My goal is to apply Process Mining to these datasets to identify and analyze interesting processes. The problem is that the datasets need to be in a specific format to be suitable for Process Mining: the data needs a Case ID, an Activity, and a Timestamp column. In other words, the data needs to be activity-based, so that processes with different activity sequences can be found.
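To make the required format concrete, here is a minimal sketch in Python (standard library only) of what I mean by an activity-based log. The column names `case_id`, `activity`, and `timestamp`, the timestamp format, the helper `load_variants`, and the sample rows are all made up for illustration and do not come from a real dataset. The sketch checks that the three columns exist, orders each case's events by time, and counts how often each activity sequence occurs.

```python
import csv
import io
from collections import Counter, defaultdict
from datetime import datetime

# Hypothetical event log: column names and rows are illustrative only.
SAMPLE_LOG = """case_id,activity,timestamp
1,Register request,2021-01-04 09:00:00
1,Inspect site,2021-01-05 10:30:00
1,Close request,2021-01-06 08:15:00
2,Register request,2021-01-04 11:00:00
2,Close request,2021-01-04 16:45:00
3,Register request,2021-01-07 09:20:00
3,Inspect site,2021-01-08 13:00:00
3,Close request,2021-01-09 12:00:00
"""

# The three columns a dataset needs before it can be mined (assumed names).
REQUIRED = {"case_id", "activity", "timestamp"}


def load_variants(csv_text):
    """Return a Counter mapping each activity sequence to its number of cases."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = REQUIRED - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")

    # Collect (timestamp, activity) pairs per case.
    events = defaultdict(list)
    for row in reader:
        ts = datetime.strptime(row["timestamp"], "%Y-%m-%d %H:%M:%S")
        events[row["case_id"]].append((ts, row["activity"]))

    # Order each case by time and count identical activity sequences.
    variants = Counter()
    for case_events in events.values():
        case_events.sort(key=lambda event: event[0])
        variants[tuple(activity for _, activity in case_events)] += 1
    return variants


variants = load_variants(SAMPLE_LOG)
for trace, count in variants.most_common():
    print(count, " -> ".join(trace))
```

A dataset that passes this kind of check and yields more than one distinct sequence is the sort of data I am looking for.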
Does anyone know where I could find such datasets? I'd be most interested in datasets from sectors such as energy, waste management, and public works, but other input would be helpful as well. So far, I have mainly found the datasets from previous years' BPI Challenges.
Here is a short page with more information about Process Mining and the desired format (including a brief example):
Any feedback would be highly appreciated.
Thanks in advance,

All Answers (5)

2nd Nov, 2021
Faiyaz Fahim
Bangladesh University of Engineering and Technology
Kaggle can be helpful.
3rd Nov, 2021
Semeh Ben Salem
Ecole Polytechnique de Tunisie
You can find something interesting at this link from Google, which contains thousands of free datasets.
3rd Nov, 2021
Md Mahmudur Rahman
Jahangirnagar University

