About
164 Publications · 33,851 Reads · 1,328 Citations
Publications (164)
We find ourselves on the ever-shifting cusp of an AI revolution -- with potentially metamorphic implications for the future practice of healthcare. For many, such innovations cannot come quickly enough, as healthcare systems worldwide struggle to keep up with the ever-changing needs of our populations. And yet, the potential of AI tools and systems...
Background/Objectives: Predicting patient readmission is an important task for healthcare risk management, as it can help prevent adverse events, reduce costs, and improve patient outcomes. In this paper, we compare various conventional machine learning models and deep learning models on a multimodal dataset of electronic discharge records from an...
Predicting patient readmission is an important task for healthcare risk management, as it can help prevent adverse events, reduce costs, and improve patient outcomes. In this paper, we compare various conventional machine learning models on a multimodal dataset of electronic discharge records from an Irish acute hospital. We evaluate the effe...
Innovative approaches are needed for managing risk and system change in healthcare. This paper presents a case study of a project that took place over two years, taking a systems approach to managing the risk of healthcare acquired infection in an acute hospital setting, supported by an Access Risk Knowledge Platform which brings together Human Fac...
In this paper, we propose a novel data valuation method for a Dataset Retrieval (DR) use case in Ireland's National mapping agency. To the best of our knowledge, data valuation has not yet been applied to Dataset Retrieval. By leveraging metadata and a user's preferences, we estimate the personal value of each dataset to facilitate dataset retrieva...
Deep learning algorithms have exhibited impressive performance across various computer vision tasks; however, the challenge of overfitting persists, especially when dealing with limited labeled data. This survey explores the mitigation of the overfitting issue through a comprehensive examination of image data augmentation techniques, which aim to e...
Data breaches and other security incidents are an emerging challenge in the digital era. The General Data Protection Regulation (GDPR) requires conducting an impact assessment to understand the effects of the breach, and to then notify authorities and affected individuals in certain cases. Communication of this information typically takes place via...
In this paper, the transparency of a knowledge graph (KG) generated by a governance, risk, and compliance tool is automatically evaluated for a real-world clinical use case, using a generalisable evaluation method. KGs are increasingly used in AI systems and their transparency has a prominent impact on the transparency of the systems that create an...
This paper investigates the impact of data valuation metrics (variability and coefficient of variation) on the feature importance in classification models. Data valuation is an emerging topic in the fields of data science, accounting, data quality, and information economics concerned with methods to calculate the value of data. Feature importance o...
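The two metrics named in this abstract are standard dispersion statistics. A minimal sketch of the per-feature coefficient of variation (the function name and example data are illustrative, not from the paper):

```python
import numpy as np

def feature_cv(X):
    """Coefficient of variation (std / |mean|) per feature column of a
    (n_samples, n_features) array -- a simple dispersion-based data
    valuation signal that can be compared against model feature importances."""
    means = X.mean(axis=0)
    stds = X.std(axis=0)
    return stds / np.abs(means)

# Hypothetical two-feature example: the first feature varies, the second is constant.
X = np.array([[1.0, 10.0],
              [3.0, 10.0]])
cv = feature_cv(X)  # first feature: 1/2 = 0.5; second: 0.0
```

Note this divides by the mean, so it is undefined for zero-mean features; how the paper handles that case is not stated in the snippet above.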
Effective governance necessitates going beyond compliance with rules, regulations and procedures; particularly as adverse events are generally the result of a combination of human, organisational, technological, and economic factors. This study explores the use of socio-technical systems analysis (STSA) in an Artificial Intelligence (AI) platform c...
This work describes the application of semantic web standards to data quality governance of data production pipelines in the architectural, engineering, and construction (AEC) domain for Ordnance Survey Ireland (OSi). It illustrates a new approach to data quality governance based on establishing a unified knowledge graph for data quality measuremen...
Background: This paper proposes Cyrus, a new transparency evaluation framework for Open Knowledge Extraction (OKE) systems. Cyrus is based on state-of-the-art transparency models and linked data quality assessment dimensions. It brings together a comprehensive view of transparency dimensions for OKE systems. The Cyrus framework is used to evalu...
This paper describes prototyping of the draft Solid application interoperability specification (INTEROP). We developed and evaluated a dynamic user interface (UI) for the new Solid application access request and authorization extended with the Data Privacy Vocabulary. Solid places responsibility on users to control their data. INTEROP adds new decl...
Advanced data augmentation techniques have demonstrated great success in deep learning algorithms. Among these techniques, single-image-based data augmentation (SIBDA), in which a single image’s regions are randomly erased in different ways, has shown promising results. However, randomly erasing image regions in SIBDA can cause a loss of the key di...
Deep learning (DL) algorithms have shown significant performance in various computer vision tasks. However, limited labelled data leads to a network overfitting problem, where network performance on unseen data is poor compared to training data. Consequently, this limits performance improvement. To cope with this problem, various techniques h...
Data is central to modern decision making and value creation. Society creates, consumes and collects data at an increasing pace. Despite advances in processing power, data is expensive to maintain and curate. So, it is imperative to have methods and tools to distinguish between data based on its value. Yet, there is no consensus on what characteris...
Background: In this paper a new transparency evaluation framework, Cyrus, is proposed for Open Knowledge Extraction (OKE) systems. Cyrus is based on the state-of-the-art AI and data transparency methods and a well-accepted framework for linked data quality evaluation. Then, the transparency of the outputs of three state-of-the-art OKE systems are a...
This paper proposes a framework for developing a trustworthy artificial intelligence (AI) supported knowledge management system (KMS) by integrating existing approaches to trustworthy AI, trust in data, and trust in organisations. We argue that improvement in three core dimensions (data governance, validation of evidence, and reciprocal obligation...
Over the years, the paradigm of medical image analysis has shifted from manual expertise to automated systems, often using deep learning (DL) systems. The performance of deep learning algorithms is highly dependent on data quality. Particularly for the medical domain, it is an important aspect as medical data is very sensitive to quality and poor q...
The GDPR requires Data Controllers and Data Protection Officers (DPO) to maintain a Register of Processing Activities (ROPA) as part of overseeing the organisation’s compliance processes. The ROPA must include information from heterogeneous sources such as (internal) departments with varying IT systems and (external) data processors. Current practi...
The creation and maintenance of Registers of Processing Activities (ROPA) are essential to meeting the General Data Protection Regulation (GDPR) and thus to demonstrate compliance based on the GDPR concept of accountability. To establish its effectiveness in meeting this obligation, we evaluate an ROPA semantic model, the Common Semantic Model–ROPA...
This paper proposes a new accountability and transparency evaluation framework (AccTEF) for ontology-based systems (OSysts). AccTEF is based on an analysis of the relation between a set of widely accepted data governance principles, i.e. findable, accessible, interoperable, reusable (FAIR) and accountability and transparency concepts. The evaluatio...
Forged documents, specifically passports, driving licences and visa stickers, are used for fraud purposes including robbery, theft and more. Detecting forged characters in documents is therefore an important and challenging task in digital forensic imaging. Forged character detection has two big challenges. The first is that data for fo...
This chapter presents the Trusted Integrated Knowledge Dataspace (TIKD)—a trusted data sharing approach, based on Linked Data technologies, that supports compliance with the General Data Privacy Regulation (GDPR) for personal data handling as part of data security infrastructure for sensitive application environments such as healthcare. State-of-th...
Forged character detection in personal documents, including a passport or a driving licence, is an extremely important and challenging task in digital image forensics, as forged information on personal documents can be used for fraud purposes including theft, robbery, etc. For any detection task, i.e. forged character detection, deep learning models...
This paper presents a new method for data augmentation called Stride Random Erasing Augmentation (SREA) to improve classification performance. In SREA, probability based strides of one image are pasted onto another image and also labels of both images are mixed with the same probability as the image mixing, to generate a new augmented image and aug...
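The mixing step this abstract describes (strides of one image pasted onto another, labels mixed with the same probability) can be sketched roughly as follows. This is an illustrative approximation under assumed details: the function name, the column-band interpretation of "strides", and the label-mixing-by-replaced-fraction rule are guesses, and the paper's exact SREA scheme may differ.

```python
import numpy as np

def stride_mix(img_a, label_a, img_b, label_b, p=0.5, stride=4, rng=None):
    """Sketch of stride-based mixing: with probability p, each stride-wide
    column band of img_a is replaced by the same band from img_b; the
    labels are then mixed in proportion to the fraction of columns replaced."""
    rng = rng or np.random.default_rng()
    out = img_a.copy()
    w = img_a.shape[1]
    replaced = 0
    for x0 in range(0, w, stride):
        if rng.random() < p:
            out[:, x0:x0 + stride] = img_b[:, x0:x0 + stride]
            replaced += min(stride, w - x0)
    lam = replaced / w  # fraction of the image taken from img_b
    label = (1 - lam) * label_a + lam * label_b
    return out, label
```

At p=1.0 this degenerates to returning img_b with label_b, and at p=0.0 to the untouched img_a with label_a, which is a quick sanity check on the mixing rule.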
This paper analyses the requirements of a blockchain-based data governance model for COVID-19 digital health certificates. Recognizing a gap in the existing literature, this paper aims to answer the research question “To what extent does a blockchain-based governance model for COVID-19 digital health certificates in the EU meet the relevant legal,...
This paper describes a tool using an extended Data Privacy Vocabulary (the DPV) to audit and monitor GDPR compliance of international transfers of personal data. New terms were identified which have been proposed as extensions to the DPV W3C Working Group. A prototype software tool was built based on the model plus a set of validation rules, and sy...
Three key challenges to a whole-system approach to process improvement in health systems are the complexity of socio-technical activity, the capacity to change purposefully, and the consequent capacity to proactively manage and govern the system. The literature on healthcare improvement demonstrates the persistence of these problems. In this projec...
The current Covid-19 global pandemic led to a proliferation of contact-tracing applications meant to help control and suppress the spread of the virus. However, the success of these contact-tracing apps relies on obtaining access to sensitive data stored on citizens' mobile devices. The approaches taken are different around the world. While the cou...
This paper describes a new semantic metadata-based approach to describing and integrating diverse data processing activity descriptions gathered from heterogeneous organisational sources such as departments, divisions, and external processors. This information must be collated to assess and document GDPR legal compliance, such as creating a Registe...
Contact tracing apps used in tracing and mitigating the spread of COVID-19 have sparked discussions and controversies worldwide. The major concerns in relation to these apps are around privacy. Ireland was in general praised for the design of its COVID tracker app, and the transparency through which privacy issues were addressed. However, the "voi...
We introduce a study examining people's privacy concerns during COVID-19 and reflect on people's willingness to share their personal data in the interest of controlling the spread of the virus and saving lives.
This research focuses on designing methods aimed at assessing Irish public attitudes regarding privacy in COVID-19 times and their influence on the adoption of COVID-19 spread control technology such as the COVID tracker app. The success of such technologies is dependent on their adoption rate and privacy concerns may be a factor delaying or preven...
Organisations can be complex entities, performing heterogeneous processing on large volumes of diverse personal data, potentially using outsourced partners or subsidiaries in distributed geographical locations and jurisdictions. Many organisations appoint a Data Protection Officer (DPO) to assist them with their demonstration of compliance with the...
Building Information Modelling (BIM) is a key enabler to support integration of building data within the buildings life cycle (BLC) and is an important aspect to support a wide range of use cases, related to intelligent automation, navigation, energy efficiency, sustainability and so forth. Open building data faces several challenges related to sta...
The creation and maintenance of a Register of Processing Activities (ROPA) is an essential process for the demonstration of GDPR compliance. We analyse ROPA templates from six EU Data Protection Regulators and show that template scope and granularity vary widely between jurisdictions. We then propose a flexible, consolidated data model for consiste...
This paper describes the Data Quality Index (DQI), a new data quality governance method to improve data quality in both paper and electronic healthcare records. This is an important use case as digital transformation is a slow process in healthcare and hybrid systems exist in many countries such as Ireland. First a baseline study of the nature and...
This paper describes a set of new Geospatial Linked Data (GLD) quality metrics based on ISO and W3C spatial standards for monitoring geospatial data production. The Luzzu quality assessment framework was employed to implement the metrics and evaluate a set of five public geospatial datasets. Despite the availability of metrics-based quality assessm...
A core requirement for GDPR compliance is the maintenance of a register of processing activities (ROPA). Our analysis of six ROPA templates from EU data protection regulators shows the scope and granularity of a ROPA is subject to widely varying guidance in different jurisdictions. We present a consolidated data model based on common concepts and r...
The Accountability Principle of the GDPR requires that an organisation can demonstrate compliance with the regulations. A survey of GDPR compliance software solutions shows significant gaps in their ability to demonstrate compliance. In contrast, RegTech has recently brought great success to financial compliance, resulting in reduced risk, cost sav...
Data is becoming one of the world's most valuable resources and it is suggested that those who own the data will own the future. However, despite data being an important asset, data owners struggle to assess its value. Some recent pioneering works have led to an increased awareness of the necessity for measuring data value. They have also put forward...
Managing privacy and understanding handling of personal data has turned into a fundamental right, at least within the European Union, with the General Data Protection Regulation (GDPR) being enforced since May 25th 2018. This has led to tools and services that promise compliance to GDPR in terms of consent management and keeping track of personal d...
While data value and value creation are highly relevant in today’s society, there is as yet no consensus on data value models, dynamics, measurement techniques, or even methods of categorising and comparing them. In this paper we analyse and categorise existing aspects of data that are used in the literature to characterise and/or quantify data value. Base...
This paper presents DELTA-LD, an approach that detects and classifies the changes between two versions of a linked dataset. It contributes to the state-of-art: firstly, by proposing a classification to distinctly identify the resources that have had both their IRIs and representation changed and the resources that have had only their IRI changed; s...
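The basic diff underlying change detection between two dataset versions can be sketched as below. This is an assumed simplification, not DELTA-LD itself: it only separates added, removed, and representation-changed resources, whereas the paper's key contribution (distinguishing IRI-only renames from combined IRI-and-representation changes) requires matching resources across versions and is omitted here.

```python
def classify_changes(v1, v2):
    """v1, v2: dicts mapping a resource IRI to a frozenset of
    (predicate, object) pairs describing that resource in each version.
    Returns (added, removed, changed) sets of IRIs: present only in v2,
    present only in v1, or present in both with a different description."""
    added = set(v2) - set(v1)
    removed = set(v1) - set(v2)
    changed = {iri for iri in set(v1) & set(v2) if v1[iri] != v2[iri]}
    return added, removed, changed
```

A rename detector would additionally compare descriptions in `removed` against those in `added` to pair up resources whose IRI changed but whose representation stayed (mostly) the same.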
The Linked Open Data (LOD) cloud has been around since 2007. Throughout the years, this prominent depiction served as the epitome for Linked Data and acted as a starting point for many. In this article we perform a number of experiments on the dataset metadata provided by the LOD cloud, in order to understand better whether the current visualised d...
In this paper we present a collection of ontologies specifically designed to model the information exchange needs of combined software and data engineering. Effective, collaborative integration of software and big data engineering for Web-scale systems, is now a crucial technical and economic challenge. This requires new combined data and software...
This paper describes a data governance knowledge extraction prototype for Slack channels based on an OWL ontology abstracted from the Collibra data governance operating model and the application of statistical techniques for named entity recognition. This addresses the need to convert unstructured information flows about data assets in an organisat...
In this paper we provide an in-depth analysis of a survey related to Information Professionals (IPs) experiences with Linked Data quality. We discuss and highlight shortcomings in linked data sources following a survey related to the quality issues IPs find when using such sources for their daily tasks such as metadata creation.
Data is popularly considered to be the new oil, since it has become a valuable commodity. This has resulted in many entities and businesses hoarding data with the aim of exploiting it. Yet the ‘simple’ exploitation of data results in entities who are not obtaining the highest benefits from the data, which as yet is not considered to be a fu...
New advances in computer science address problems historical scientists face in gathering and evaluating the now vast data sources available through the Internet. As an example we introduce Dacura, a dataset curation platform designed to assist historical researchers in harvesting, evaluating, and curating high-quality information sets from the Int...
Linked Data consists of many structured data knowledge bases that have been interlinked, often using equivalence statements. These equivalences usually take the form of owl:sameAs statements linking individuals; links between classes are far less common. Often, the lack of class links is because their relationships cannot be described as one to one...