
Tomasz Miksa
- PhD
- SBA Research
51
Publications
9,778
Reads
454
Citations
Publications (51)
This paper discusses possible changes to the machine-actionable Data Management Plans specification and ecosystem of services to enable compatibility with the FDO architecture, as defined by the DONA foundation.
Seamless Research Data Management for Researchers aims to cover a complete scientific workflow from planning a research project to registration and publication of results in repositories by connecting existing components, services, and tools using FDOs. This approach combines widely used components, so large data volumes can increasingly be FAIRifi...
Researchers of all disciplines produce, share, and reuse data as part of everyday research. Most funders require them to manage and document their data using data management plans (DMPs). DMPs are often static documents that researchers create by answering questions in predefined templates at the beginning of the research and, therefore, may become...
Artificial Intelligence (AI) Auditability is a core requirement for achieving responsible AI system design. However, it is not yet a prominent design feature in current applications. Existing AI auditing tools typically lack integration features and remain as isolated approaches. This results in manual, high-effort, and mostly one-off AI audits, ne...
Within the FAIR Data Austria project, supported by the Federal Ministry for Education, Science, and Research (BMBWF), a national strategy has been established to advance the creation of tailored Data Stewardship solutions for the Austrian context. The strategy, formalized as a toolbox, delineates various Data Steward models, corresponding competenc...
Most research funders require Data Management Plans (DMPs). The review process can be time consuming, since reviewers read text documents submitted by researchers and provide their feedback. Moreover, it requires specific expert knowledge in data stewardship, which is scarce. Machine-actionable Data Management Plans (maDMPs) and semantic technologi...
Based on a real world use case, we developed and evaluated a hybrid AI system that aims to extract key elements from legal permits by combining methods from the Semantic Web and Machine Learning. Specifically, we modelled the available background knowledge in a custom Knowledge Graph, which we exploited together with the usage of different language...
Semantic Web Machine Learning Systems (SWeMLS) characterise applications, which combine symbolic and subsymbolic components in innovative ways. Such hybrid systems are expected to benefit from both domains and reach new performance levels for complex tasks. While existing taxonomies in this field focus on building blocks and patterns for describing...
The concept of a Data Management Plan (DMP) has emerged as a fundamental tool to help researchers through the systematic management of data. The Research Data Alliance DMP Common Standard (DCS) working group developed a set of universal concepts characterising a DMP so it can be represented as a machine-actionable artefact, i.e., machine-actionable...
Many research funders mandate researchers to create and maintain data management plans (DMPs) for research projects that describe how research data is managed to ensure its reusability. A DMP, being a static textual document, is difficult to act upon and can quickly become obsolete and impractical to maintain. A new generation of machine-actionable...
The concept of a Data Management Plan (DMP) has emerged as a fundamental tool to help researchers through the systematic management of data. The Research Data Alliance DMP Common Standard (DCS) working group developed a core set of universal concepts characterising a DMP in the pursuit of producing a DMP as a machine-actionable information artefact...
Small and medium-sized organisations face challenges in acquiring, storing and analysing personal data, particularly sensitive data (e.g., data of medical nature), due to data protection regulations, such as the GDPR in the EU, which stipulates high standards in data protection. Consequently, these organisations often refrain from collecting data c...
Recordings of musical practices are kept in various public institutions and private depositories around the world. They constitute valuable data for ethnomusicological research and are substantial for the world's musical heritage. At the moment, there are no commonly used systems and standards for organizing, describing or categorizing these data,...
The common standard for machine-actionable Data Management Plans (DMPs) allows for automatic exchange, integration, and validation of information provided in DMPs. In this paper, we report on the hackathon organised by the Research Data Alliance in which a group of 89 participants from 21 countries worked collaboratively on use cases exploring the...
This article gives an overview of the FAIR Data Austria project objectives and current results. In collaboration with our project partners, we work on the development and establishment of tools for managing the lifecycle of research data, including machine-actionable Data Management Plans (maDMPs), repositories for long-term archiving of research r...
This paper presents the application profile for machine-actionable data management plans that allows information from traditional data management plans to be expressed in a machine-actionable way. We describe the methodology and research conducted to define the application profile. We also discuss design decisions made during its development and pr...
At present, accessing and processing Earth Observation (EO) data on different cloud platforms requires users to exercise distinct communication strategies as each backend platform is designed differently. The openEO API (Application Programming Interface) standardises EO-related contracts between local clients (R, Python, and JavaScript) and cloud...
Effective stewardship of data is a critical precursor to making data FAIR. The goal of this paper is to provide an overview of the current state of the art in data management and data stewardship planning solutions (DMP). We begin by arguing why data management is an important vehicle supporting adoption and implementation of the FAIR principles, we desc...
Earth observation researchers use specialised computing services for satellite image processing offered by various data backends. The source of data is often the same, for example Sentinel-2 satellites operated by Copernicus, but the way data is pre-processed, corrected, updated, and later analysed may differ among the backends. Backends often...
Research Data Alliance Austria (RDA-AT) is a national RDA node dedicated to representing emerging research and data management communities throughout Austria. RDA-AT will operate as a formal participant of RDA Europe and RDA Global, linking Austrian data management initiatives and RDA Working and Interest Groups, providing assistance in adoption of...
Data management plans (DMPs) are documents accompanying research proposals and project outputs. DMPs are created as free-form text and describe the data and tools employed in scientific investigations. They are often seen as an administrative exercise and not as an integral part of research practice.
There is now widespread recognition that the DMP...
Scientific experiments in various domains nowadays require collecting, processing, and reusing data. Researchers have to comply with funder policies that prescribe how data should be managed, shared and preserved. In most cases this has to be documented in data management plans. When data is selected and moved into a repository when the project ends, i...
In November 2017, the RDA Europe workshop "From Planning to Action. Towards the Establishment of an Austrian Research Infrastructure" took place at Universität Wien, with the support of RDA Europe, Technische Universität Wien, and Universität Wien. It was the first workshop held in Austria in a series of Europe-wide RDA V...
Data management plans are free-form text documents describing the data used and produced in scientific experiments. The complexity of data-driven experiments requires precise descriptions of tools and datasets used in computations to enable their reproducibility and reuse. Data management plans fall short of these requirements. In this paper, we pr...
This report presents outputs of the International Digital Curation Conference 2017 workshop on machine-actionable data management plans. It contains community-generated use cases covering eight broad topics that reflect the needs of various stakeholders. It also articulates a consensus about the need for a common standard for machine-actionable dat...
Scientific experiments performed in the eScience domain require special tooling, software, and workflows that allow researchers to link, transform, visualize and interpret data. Recent studies report that such experiments often cannot be replicated due to differences in the underlying infrastructure. The provenance collection mechanisms were built...
Complex data-driven experiments form the basis of biomedical research. Recent findings warn that the context in which the software is run, that is, the infrastructure and the third-party dependencies, can have a crucial impact on the final results delivered by a computational experiment. This implies that in order to replicate the same result, not o...
Purpose: This paper aims to address the long-term stability of services and systems that depend on service-oriented architecture, which has become a popular architecture in systems development and is often implemented using Web services. However, the dependency, especially on externally provided services, can impact the reliability of a syste...
High dependence on web services and service-oriented architecture affects not only business solutions, but also scientific research. Web services may be delivered by third parties, and thus are candidates for outsourcing. However, they represent a source of risks, which can jeopardise the robustness of processes. Hence, there is a need for actions...
Many business and scientific processes make extensive use of service-oriented architectures, using distributed services. These are often provided by third parties and are thus not under direct control of process owners. In this paper we discuss the issues of ensuring continuous and faithful execution of processes in distributed environments, focusi...
The reproducibility of modern research depends on the possibility to faithfully rerun the complex and distributed data transformation processes which were executed by scientists in order to make new scientific breakthroughs. New methods and frameworks try to address this problem by collecting evidence used for verification of such experiments. Howe...
The re-usability and repeatability of e-Science experiments is widely understood as a requirement of validating and reusing previous work in data-intensive domains. Experiments are, however, often complex chains of processing, involving a number of data sources, computing infrastructure, software tools, or external and third-party services, renderi...
In the era of research infrastructures and big data, sophisticated data management practices are becoming essential building blocks of successful science. Most practices follow a data-centric approach, which does not take into account the processes that created, analysed and presented the data. This fact limits the possibilities for reliable verifi...
This paper presents the results of an evaluation carried out by the EU 4C project to assess how well current digital curation cost and benefit models meet a range of stakeholders' needs. This work aims to elicit a means of modelling that enables comparing financial information across organisations, to support decision-making and for selecting the m...
Preserving processes requires not only the identification of all process components, but also the interception of all interactions of the process with the external influencers. In order to verify if the collected data is sufficient for the purpose of redeployment, as well as to verify that the redeployed process performs according to expectations,...
This paper aims to establish engineering processes and methods for the assessment and deployment of digitally preservable systems by identifying a method for assessing the preservability capabilities of systems. The work done on this was based on the hypothesis that preservability consists of a set of systems capabilities that originates from a com...
The reproducibility of research and possibility of reliable verification of results are key assumptions of credible data-driven science. In this position paper we present how the preservability of research can be improved by Process Management Plans, which are a novel process-centric approach to data management.
This paper considers the problem of cost optimization of a Wide Area Network (WAN). To solve this problem, two algorithms have been designed and implemented: a heuristic algorithm called Marian and a meta-heuristic algorithm called Kaflok. The paper presents the results of evaluation of these algorithms based on complex simulation experiments mad...
Finding a suitable repository to deposit research data is a difficult task for researchers since the landscape consists of thousands of repositories and automated tool support is limited. Machine-actionable DMPs can improve the situation since they contain relevant context information in a structured and machine-friendly way and therefore enable au...