Mathieu d'Aquin

National University of Ireland, Galway | NUI Galway · Data Science Institute

PhD

About

209
Publications
26,172
Reads
3,208
Citations
Additional affiliations
April 2017 - present
National University of Ireland, Galway
Position
  • Professor
June 2006 - March 2017
The Open University (UK)
Position
  • Fellow
September 2005 - May 2006
Laboratoire Lorrain de Recherche en Informatique et ses Applications (LORIA)
Position
  • ATER (temporary lecturer)


Publications (209)
Article
Full-text available
A key property of Linked Data is the representation and publication of data as interconnected labelled graphs where different resources linked to each other form a network of meaningful information. Searching these important relationships between resources, within single or distributed graphs, can be reduced to a pathfinding or navigation problem, i....
Article
Full-text available
In real-world machine learning applications, unlabeled training data are readily available, but labeled data are expensive and hard to obtain. Therefore, semi-supervised learning algorithms have gathered much attention. Previous studies in this area mainly focused on a semi-supervised classification problem, whereas semi-supervised regression has r...
Article
Semi-Supervised Learning (SSL) is an approach to machine learning that makes use of unlabeled data for training with a small amount of labeled data. In the context of molecular biology and pharmacology, one can take advantage of unlabeled data. For instance, to identify drugs and targets where a few genes are known to be associated with a specific...
Preprint
Full-text available
Prediction of metastatic sites from the primary site of origin is a challenging task in breast cancer (BRCA). Multi-dimensionality of such metastatic sites - bone, lung, kidney, and brain, using large-scale multi-dimensional Poly-Omics (Transcriptomics, Proteomics and Metabolomics) data of various types, for example, CNV (Copy number variation), GE (Gene...
Chapter
Linked Data are based on a set of principles and technologies to exploit the architecture of the Web in order to represent and provide access to machine-readable, globally integrated information. Those principles and technologies have many advantages when applied in the context of implementing data lakes, both generally and in particular domains....
Preprint
Full-text available
Beyond sharing datasets or simulations, we believe the Recommender Systems (RS) community should share Task Environments. In this work, we propose a high-level logical architecture that will help to reason about the most important components of a RS Task Environment, identify the differences between Environments, datasets and simulations, and most...
Article
Full-text available
Identifying the unintended effects of drugs (side effects) is a very important issue in pharmacological studies. The laboratory verification of associations between drugs and side effects requires costly, time-intensive research. Thus, an approach to predicting drug side effects based on known side effects, using a computational model, is highly de...
Conference Paper
Full-text available
The evaluation of text complexity is an important topic in education. While this objective has been addressed by approaches using lexical and syntactic analysis for decades, semantic complexity is less common, and the recent research works that tackle this question rely on machine learning algorithms that are hardly explainable and are not specific...
Conference Paper
Full-text available
The purpose of the LILE2019 workshop is to provide an interdisciplinary forum for researchers and practitioners who make innovative use of Web data for educational purposes, spanning areas such as learning analytics, Web mining, data and Web science, psychology and the social sciences. The previous editions of the LILE workshop were successfully he...
Conference Paper
One of the existing query recommendation strategies for unknown datasets is "by example", i.e. based on a query that the user already knows how to formulate on another dataset within a similar domain. In this paper we measure what contribution a structural analysis of the query and the datasets can bring to a recommendation strategy, to go alongsid...
Article
Full-text available
The advent of social media has enabled us to explore the impact of academic entities beyond the conventional bibliometric community. The traditional bibliometric indicators such as citation count, h-index and SNIP metrics aim to represent the propagation of knowledge in the academic world rather than the impact of the research on the wider world. I...
Article
Full-text available
The goal of this work is to describe how robots interact with complex city environments, and to identify the main characteristics of an emerging field that we call Robot–City Interaction (RCI). Given the central role recently gained by modern cities as use cases for the deployment of advanced technologies, and the advancements achieved in the robot...
Article
Full-text available
Research has approached the practice of musical reception in a multitude of ways, such as the analysis of professional critique, sales figures and psychological processes activated by the act of listening. Studies in the Humanities, on the other hand, have been hindered by the lack of structured evidence of actual experiences of listening as report...
Article
Full-text available
Data retrieval systems are facing a paradigm shift due to the proliferation of specialised data storage engines (SQL, NoSQL, Column Stores, MapReduce, Data Stream, Graph) supported by varied data models (CSV, JSON, RDB, RDF, XML). One immediate consequence of this paradigm shift is a data bottleneck over the Web; that is, Web applicatio...
Conference Paper
Addressing ethical issues arising from AI research, and by extension from most areas of Data Science, is a core challenge in both the academic and industry worlds. The nature of research and the specific set of technical skills involved imply that AI and Data Science researchers are not equipped to identify and anticipate such issues arising, or to...
Article
Full-text available
HSE EPA Data Policy Workshop: From Open Data to GDPR, Data Sharing Challenges – Environment, Health & Wellbeing. HSE EPA Environment, Health & Wellbeing Conference, 2-4pm, Wednesday 7th November 2018, Radisson Blu Hotel, Golden Lane, Dublin. The workshop comprised two panels lasting one hour each. Each panel comprised 4 people who made 8-10...
Book
Full-text available
Distance teaching and the use of openly available educational resources on the Web are becoming common practices at public higher education institutions as well as private training organisations. In addition, informal learning and knowledge exchange are inherent to our daily online interactions, such as searching the Web [1], and using learning and...
Conference Paper
It is our great pleasure to welcome you to the WWW 2018 Challenges Track. It is the first time that the WWW conference includes such a track, whose aim was to showcase the maturity of the state of the art on tasks common to the Web community and adjacent academic communities, in a controlled setting of rigorous evaluation. Through our call for chal...
Conference Paper
This volume of proceedings presents the papers from the 2nd edition of the interdisciplinary workshop Re-coding Black Mirror, held on April 24, 2018 in Lyon, France and co-located with The WEB Conference (WWW2018). Participating in the topical debate of data ethics and algorithmic governance, Re-coding Black Mirror offers the research community too...
Conference Paper
Full-text available
Whether for using online services or dealing with legal issues, citizens are often requested to sign/accept policy documents that are intended to commit them to specific rights and duties. Usually such documents are difficult to read due to their nature, the length of sentences, complex terms used, etc. Since understanding is a prerequisite to maki...
Conference Paper
Full-text available
The goal of AFEL is to develop, pilot and evaluate methods and applications which advance informal/collective learning as it surfaces implicitly in online social environments. The project is following a multidisciplinary, industry-driven approach to the analysis and understanding of learner data in order to personalize, accelerate and improve in...
Article
Semantic Web technologies aim to simplify the distribution, sharing and exploitation of information and knowledge, across multiple distributed actors on the Web. As with all technologies that manipulate information, there are privacy and security implications, and data policies (e.g., licenses and regulations) that may apply to both data and softwa...
Conference Paper
When publishing data, data licences are used to specify the actions that are permitted or prohibited, and the duties that target data consumers must comply with. However, in complex environments such as a smart city data portal, multiple data sources are constantly being combined, processed and redistributed. In such a scenario, deciding which poli...
Conference Paper
Full-text available
The focus of this work is to exploit ontologies to make robotic systems more accessible to non-expert users, therefore supporting the deployment of robot-integrated applications. Due to the increasing number of robotic platforms available for commercial use, robotic systems are nowadays being approached by users with different backgrounds, who are...
Conference Paper
Full-text available
In this paper, we discuss how Learning Analytics, as the activity to capture and analyze people's learning behaviors in order to improve their learning experiences, could be used as a way for bots to "learn how to learn" and how this might have a greater impact than the apparent improvement it would enable for Artificial Intelligence. Through explo...
Conference Paper
Full-text available
More and more learning activities take place online in a self-directed manner. Therefore, just as the idea of self-tracking activities for fitness purposes has gained momentum in the past few years, tools and methods for awareness and self-reflection on one's own online learning behavior appear as an emerging need for both formal and informal learn...
Conference Paper
Virtual data integration takes place at query execution time and relies on transformations of the original query to many target endpoints, where the data reside. In systems that integrate many data sources, this means maintaining many mappings, queries and query templates, as well as possibly issuing separate queries for linking entities in the dat...
Conference Paper
An increasing number of large-scale knowledge graphs have been constructed in recent years. Those graphs are often created from text-based extraction, which can be very noisy. So far, cleaning knowledge graphs has often been carried out by human experts and is thus very inefficient. It is necessary to explore automatic methods for identifying and elimina...
Conference Paper
In this paper we present the DKA-robo framework, where a mobile robot is used to update the statements of a knowledge base that have lost validity in time. Managing the dynamic information of knowledge bases constitutes a key issue in many real-world scenarios, because constantly reevaluating data requires efforts in terms of knowledge acquisition...
Conference Paper
Web-scale reuse and interoperability of learning resources have been major concerns for the technology-enhanced learning community. While work in this area traditionally focused on learning resource metadata, provided through learning resource repositories, the recent emergence of structured entity markup on the Web through standards such as RDFa a...
Book
This book contains the best selected papers of two Satellite Events held at the 20th International Conference on Knowledge Engineering and Knowledge Management, EKAW 2016, in November 2016 in Bologna, Italy: The Second International Workshop on Educational Knowledge Management, EKM 2016, and the First Workshop: Detection, Representation and Managem...
Article
The Learning Analytics and Knowledge (LAK) Dataset represents an unprecedented corpus which exposes a near complete collection of bibliographic resources for a specific research discipline, namely the connected areas of Learning Analytics and Educational Data Mining. Covering over five years of scientific literature from the most relevant conferenc...
Conference Paper
Workflow formalisations are often focused on the representation of a process with the primary objective to support execution. However, there are scenarios where what needs to be represented is the effect of the process on the data artefacts involved, for example when reasoning over the corresponding data policies. This can be achieved by annotating...
Conference Paper
The goal of this work is to learn a measure supporting the detection of strong relationships between Linked Data entities. Such relationships can be represented as paths of entities and properties, and can be obtained through a blind graph search process traversing Linked Data. The challenge here is therefore the design of a cost-function that is a...
Conference Paper
In this demo paper, a SPARQL Query Recommendation Tool (called SQUIRE) based on query reformulation is presented. Based on three steps, Generalization, Specialization and Evaluation, SQUIRE implements the logic of reformulating a SPARQL query that is satisfiable w.r.t a source RDF dataset, into others that are satisfiable w.r.t a target RDF dataset...
Article
The Semantic Web is a young discipline, even if only in comparison to other areas of computer science. Nonetheless, it already exhibits an interesting history and evolution. This book is a reflection on this evolution, aiming to take a snapshot of where we are at this specific point in time, and also showing what might be the focus of future resear...
Article
This paper describes the construction of the LOTED2 ontology for the representation of European public procurement notices. LOTED2 follows initiatives around the creation of linked data-compliant representations of information regarding tender notices in Europe, but focusing on placing such representations within their legal context. It is therefor...
Chapter
Education has often been a keen adopter of new information and communication technologies. This is not surprising given that education is all about informing and communicating. Traditionally, educational institutions produce large volumes of data, much of which is publicly available, either because it is useful to communicate (e.g. the course catal...
Article
Full-text available
The article reports on the evolution of data.open.ac.uk, the Linked Open Data platform of the Open University, from a research experiment to a data hub for the open content of the University. Entirely based on Semantic Web technologies (RDF and the Linked Data principles), data.open.ac.uk is used to curate, publish and access data about academic de...
Book
The 47 revised full papers presented together with three invited talks were carefully reviewed and selected from 204 submissions. This program was completed by a demonstration and poster session, in which researchers had the chance to present their latest results and advances in the form of live demos. In addition, the PhD Symposium program include...
Chapter
The Semantic Web, and by extension semantic web technologies, is very young in comparison to other computing disciplines, such as databases and artificial intelligence, and indeed even the Web is very young in comparison with these disciplines. As a result, as is usually the case with new phenomena, it will probably take time to develop a comprehen...
Chapter
As shown in the previous chapter, examples of intelligent semantic web systems are becoming more and more common, and have evolved from academic prototypes demonstrating advanced concepts, to being adopted by commercial organizations producing tools used on a daily basis by millions of people. Hence, the vision of the Semantic Web and of what could...
Chapter
The semantic web, which we have characterized in the previous chapter as a conceptual network two levels of abstractions above the web, requires new approaches and tools to enable users and applications to interact with and exploit something that is akin to a knowledge network, rather than being simply a network of documents. Knowledge-based system...
Chapter
In the previous chapters we provided a conceptualization of the Semantic Web and explained how intelligent applications can be built to rely on its contents and knowledge sharing infrastructure. At the same time we also pointed out the specificities of the semantic web, compared to other types of information systems. At this point, we want to shift...
Article
Full-text available
How can we innovate smart systems for smart cities, to make data available homogeneously, inexpensively, and flexibly while supporting an array of applications that have yet to exist or be specified?
Conference Paper
Full-text available
Governing the life cycle of data on the web is a challenging issue for organisations and users. Data is distributed under certain policies that determine what actions are allowed and in which circumstances. Assessing what policies propagate to the output of a process is one crucial problem. Having a description of policies and data flow steps impli...
Conference Paper
In this paper, we propose an ontology design pattern for the concept of "explanation". The motivation behind this work comes from our research, which focuses on automatically identifying explanations for data patterns. If we want to produce explanations from data independently of the application domain, we first need a formal definition of what an...
Conference Paper
In this paper we present the system Dedalo, whose aim is to generate explanations for data patterns using background knowledge retrieved from Linked Data. In many real-world scenarios, patterns are generally manually interpreted by the experts that have to use their own background knowledge to explain and refine them, while their workload could be...
Conference Paper
Licences are a crucial aspect of the information publishing process in the web of (linked) data. Recent work on modeling of policies with semantic web languages (RDF, ODRL) gives the opportunity to formally describe licences and reason upon them. However, choosing the right licence is still challenging. Particularly, understanding the number of fea...
Conference Paper
In this paper we exploit knowledge from Linked Data to ease the process of analysing scholarly data. In recent years, many techniques have been presented with the aim of analysing such data and revealing new, previously hidden knowledge, generally presented in the form of "patterns". However, the discovered patterns often still require human interpreta...
Conference Paper
The LAK Data Challenge 2015 continues the research efforts of the previous data competitions in 2013 and 2014 by stimulating research on the evolving fields Learning Analytics (LA) and Educational Data Mining (EDM). Building on a series of activities of the LinkedUp project, the challenge aims to generate new insights and analysis on the LA & EDM d...
Article
Full-text available
Ontology evolution aims at maintaining an ontology up to date with respect to changes in the domain that it models or novel requirements of information systems that it enables. The recent industrial adoption of Semantic Web techniques, which rely on ontologies, has led to the increased importance of the ontology evolution research. Typical approach...
Article
In the context of linked open data, different datasets can be interlinked, thereby providing rich background knowledge for a dataset under examination. We believe that knowledge from interlinked datasets can be used to validate the accuracy of a linked data fact. In this paper, we present a novel approach for linked data fact validation u...
Article
Data integration problems are commonly viewed as inter-operability issues, where the burden of reaching a common ground for exchanging data is distributed across the peers involved in the process. While apparently an effective approach towards standardization and interoperability, it poses a constraint to data providers who, for a variety of reason...
Book
The two-volume set LNCS 9366 and 9367 constitutes the refereed proceedings of the 14th International Semantic Web Conference, ISWC 2015, held in Bethlehem, PA, USA, in October 2015. The International Semantic Web Conference is the premier forum for Semantic Web research, where cutting edge scientific results and technological innovations are presen...
Article
The Listening Experience Database (http://www.open.ac.uk/Arts/LED) is the first project to collate and interrogate a mass of historical personal experiences of listening to music. Such accounts have previously received only isolated attention because they are challenging to locate and gather en masse. An extensive body of data about the responses o...
Conference Paper
The main idea behind Linked Data is to connect data from different sources together, in order to develop a hub of shared and publicly accessible knowledge. While the benefit of sharing knowledge is universally recognised, what is less visible is how much results can be affected when the knowledge in one dataset and in the connected ones are not equ...
Conference Paper
Two typical problems are encountered after obtaining a set of rules from a data mining process: (i) their number can be extremely large and (ii) not all of them are interesting to be considered. Both manual and automatic strategies trying to overcome those problems have to deal with technical issues such as time costs and computational complexity....
Conference Paper
This paper presents a study describing the development of an Evaluation Framework (EF) for data competitions in TEL. The study applies the Group Concept Method (GCM) to empirically depict criteria and their indicators for evaluating software applications in TEL. A statistical analysis including multidimensional scaling and hierarchical clustering o...
Conference Paper
The Listening Experience Database (LED) is a project that gathers documented evidence of listening to music across cultural and historical contexts. Its underlying information system relies on the principles and practices of Linked Data, including a knowledge base that is itself a linked dataset structured according to common vocabularies for media...
Conference Paper
We present Dedalo, a framework which is able to exploit Linked Data to generate explanations for clusters. In general, any result of a Knowledge Discovery process, including clusters, is interpreted by human experts who use their background knowledge to explain them. However, for someone without such expert knowledge, those results may be difficult...
Article
The European Union is increasingly committed to pushing forward open approaches as indicated by the G8 Open Data Charter, the Opening Up Education initiative, the launch of the Open Education Europa Portal for OER resources and other similar initiatives. The EU-funded LinkedUp Project (Linking Web data for education) aims to gather successful exemp...
Conference Paper
The LAK Data Challenge 2014 continues the research efforts of the second edition by stimulating research on the evolving fields Learning Analytics (LA) and Educational Data Mining (EDM). Building on a series of activities of the LinkedUp project, the challenge aims to generate new insights and analysis on the LA & EDM disciplines and is supported t...
Article
Full-text available
With a projected six-figure skills gap looming in the US alone, here the authors share strategies and lessons learned regarding how to bridge the gap in training competent data scientists in the near future.