I am studying Computer Science and am currently working on my bachelor's thesis, for which I am looking for suitable datasets. My goal is to apply Process Mining to these datasets to identify and analyze interesting processes. The problem is that the data must be in a specific format to be suitable for Process Mining: it needs at least a Case ID, an Activity, and a Timestamp column. In other words, the data needs to be activity-based (one row per executed activity) so that processes with different activity sequences can be found.
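To make the required format concrete, here is a minimal sketch in plain Python. The column names (`case_id`, `activity`, `timestamp`), the sample rows, and the helper names `load_traces` and `count_variants` are all made up for illustration and do not come from any real dataset or tool. It groups events by case, orders them by timestamp, and counts how often each activity sequence occurs.

```python
import csv
import io
from collections import Counter, defaultdict
from datetime import datetime

# Hypothetical event log: column names and rows are invented for illustration.
RAW_LOG = """case_id,activity,timestamp
C1,Register request,2023-01-02 09:00
C1,Inspect site,2023-01-03 10:30
C1,Close request,2023-01-05 16:00
C2,Register request,2023-01-02 11:15
C2,Close request,2023-01-04 08:45
C3,Inspect site,2023-01-07 13:00
C3,Register request,2023-01-06 09:20
C3,Close request,2023-01-09 12:10
"""


def load_traces(raw):
    """Return {case_id: tuple of activities ordered by timestamp}."""
    events = defaultdict(list)
    for row in csv.DictReader(io.StringIO(raw)):
        ts = datetime.strptime(row["timestamp"], "%Y-%m-%d %H:%M")
        events[row["case_id"]].append((ts, row["activity"]))
    # Sort each case's events by time; rows need not be ordered in the file.
    return {
        case: tuple(activity for _, activity in sorted(evts))
        for case, evts in events.items()
    }


def count_variants(traces):
    """Count how many cases follow each distinct activity sequence."""
    return Counter(traces.values())


traces = load_traces(RAW_LOG)
variants = count_variants(traces)
for variant, n in variants.most_common():
    print(n, " -> ".join(variant))
```

A dataset is usable for my purposes if it can be brought into this shape, i.e. if every record can be assigned to a case, names an activity, and carries a timestamp.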
Does anyone know where I could find such datasets? I am most interested in sectors such as energy, waste management, and public works, but other suggestions would be helpful as well. So far, I have mainly found the datasets from previous years' BPI Challenges.
Here is a short page with more information about Process Mining and the desired format (including a brief example):