Mathieu d'Aquin

University of Lorraine | UdL · LORIA - Laboratoire Lorrain de Recherche en Informatique et Applications

PhD


233
Publications
35,357
Reads
3,762
Citations
Additional affiliations
April 2017 - present
Ollscoil na Gaillimhe – University of Galway
Position
  • Professor
June 2006 - March 2017
The Open University
Position
  • Fellow
September 2005 - May 2006
Laboratoire Lorrain de Recherche en Informatique et ses Applications
Position
  • ATER (temporary lecturer)


Publications (233)
Preprint
Full-text available
Machine learning promises to accelerate the material discovery by enabling high-throughput prediction of desirable macro-properties from atomic-level descriptors or structures. However, the limited data available about precise values of these properties have been a barrier, leading to predictive models with limited precision or the ability to gener...
Article
Full-text available
Objective: The Collaborate, Analyse, Research and Audit (CARA) project set out to provide an infrastructure to enable Irish general practitioners (GPs) to use their routinely collected patient management software (PMS) data to better understand their patient population, disease management and prescribing through data dashboards. This paper explains the...
Preprint
Full-text available
It has been reliably shown that the similarity of word embeddings obtained from popular neural models such as BERT approximates effectively a form of semantic similarity of the meaning of those words. It is therefore natural to wonder if those embeddings contain enough information to be able to connect those meanings through ontological relationshi...
Chapter
Full-text available
Building taxonomies is often a significant part of building an ontology, and many attempts have been made to automate the creation of such taxonomies from relevant data. The idea in such approaches is either that relevant definitions of the intension of concepts can be extracted as patterns in the data (e.g. in formal concept analysis) or that thei...
Preprint
Full-text available
Knowledge graphs (KGs) have emerged as a prominent data representation and management paradigm. Being usually underpinned by a schema (e.g. an ontology), KGs capture not only factual information but also contextual knowledge. In some tasks, a few KGs established themselves as standard benchmarks. However, recent works outline that relying on a limi...
Preprint
Full-text available
Many of us make quick decisions that affect our data privacy on our smartphones without due consideration of our values. One such decision point is establishing whether to download a smartphone app or not. In this work, we aim to better understand the relationship between our values, our privacy preferences, and our app choices, as well as explore...
Chapter
This paper focuses on lazy adaptation knowledge learning (LAKL) using frequent closed itemset extraction. This approach differs from eager adaptation knowledge learning (EAKL) by the number of cases used in the learning process and by the moment at which the process is triggered. Where EAKL aims to compute adaptation knowledge once on the whole cas...
Preprint
Full-text available
Automatic Emotion Detection (ED) aims to build systems to identify users' emotions automatically. This field has the potential to enhance HCI, creating an individualised experience for the user. However, ED systems tend to perform poorly on people with Autism Spectrum Disorder (ASD). Hence, the need to create ED systems tailored to how people with...
Chapter
Full-text available
Automatic Emotion Detection (ED) aims to build systems to identify users’ emotions automatically. This field has the potential to enhance HCI, creating an individualised experience for the user. However, ED systems tend to perform poorly on people with Autism Spectrum Disorder (ASD). Hence, the need to create ED systems tailored to how people with...
Article
Introduction: CARA is a five-year Health Research Board (HRB) project. Superbugs cause resistant infections that are difficult to treat and pose a serious threat to human health. Providing tools to explore the prescription of antibiotics by GPs may help identify gaps where improvements can be made. CARA's aim is to combine, link and visualise data...
Article
The scarcity of high-quality annotations in many application scenarios has recently led to an increasing interest in devising learning techniques that combine unlabeled data with labeled data in a network. In this work, we focus on the label propagation problem in multilayer networks. Our approach is inspired by the heat diffusion model, which show...
Article
Full-text available
The open science movement has gained significant momentum within the last few years. This comes along with the need to store and share research artefacts, such as publications and research data. For this purpose, research repositories need to be established. A variety of solutions exist for implementing such repositories, covering diverse features,...
Chapter
For regression tasks, using neural networks in a supervised way typically requires repeatedly presenting (over several iterations called epochs) a set of items described by a number of features and the expected value to the network, so that it can learn to predict those values from those features. Inspired by case-based reasoning, several previous...
Article
Full-text available
A key property of Linked Data is the representation and publication of data as interconnected labelled graphs where different resources linked to each other form a network of meaningful information. Searching these important relationships between resources, within single or distributed graphs, can be reduced to a pathfinding or navigation problem, i....
Article
Knowledge Graphs have emerged as a core technology to aggregate and publish knowledge on the Web. However, integrating knowledge from different sources, not specifically designed to be interoperable, is not a trivial task. Finding the right ontologies to model a dataset is a challenge since several valid data models exist and there is no clear agre...
Article
Full-text available
In real-world machine learning applications, unlabeled training data are readily available, but labeled data are expensive and hard to obtain. Therefore, semi-supervised learning algorithms have gathered much attention. Previous studies in this area mainly focused on a semi-supervised classification problem, whereas semi-supervised regression has r...
Article
Background: Worldwide, many people have been affected by COVID-19, a novel respiratory illness, caused by a new type of coronavirus SARS-CoV2. The COVID-19 outbreak is considered a pandemic and has created a number of challenges for the general population, patients, and healthcare professionals. Lockdowns have been implemented to slow down the sprea...
Article
The 29th ACM International Conference on Information and Knowledge Management (CIKM) was held online from the 19th to the 23rd of October 2020. CIKM is an annual computer science conference, focused on research at the intersection of information retrieval, machine learning, databases as well as semantic and knowledge-based technologies. Since it...
Article
Semi-Supervised Learning (SSL) is an approach to machine learning that makes use of unlabeled data for training with a small amount of labeled data. In the context of molecular biology and pharmacology, one can take advantage of unlabeled data. For instance, to identify drugs and targets where a few genes are known to be associated with a specific...
Preprint
Full-text available
Prediction of metastatic sites from the primary site of origin is a challenging task in breast cancer (BRCA). Multi-dimensionality of such metastatic sites - bone, lung, kidney, and brain, using large-scale multi-dimensional Poly-Omics (Transcriptomics, Proteomics and Metabolomics) data of various type, for example, CNV (Copy number variation), GE (Gene...
Chapter
Linked Data are based on a set of principles and technologies to exploit the architecture of the Web in order to represent and provide access to machine-readable, globally integrated information. Those principles and technologies have many advantages when applied in the context of implementing data lakes, both generally and in particular domains....
Preprint
Full-text available
Beyond sharing datasets or simulations, we believe the Recommender Systems (RS) community should share Task Environments. In this work, we propose a high-level logical architecture that will help to reason about the most important components of a RS Task Environment, identify the differences between Environments, datasets and simulations, and most...
Article
Full-text available
Identifying the unintended effects of drugs (side effects) is a very important issue in pharmacological studies. The laboratory verification of associations between drugs and side effects requires costly, time-intensive research. Thus, an approach to predicting drug side effects based on known side effects, using a computational model, is highly de...
Conference Paper
Full-text available
The evaluation of text complexity is an important topic in education. While this objective has been addressed by approaches using lexical and syntactic analysis for decades, semantic complexity is less common, and the recent research works that tackle this question rely on machine learning algorithms that are hardly explainable and are not specific...
Conference Paper
Full-text available
The purpose of the LILE2019 workshop is to provide an interdisciplinary forum for researchers and practitioners who make innovative use of Web data for educational purposes, spanning areas such as learning analytics, Web mining, data and Web science, psychology and the social sciences. The previous editions of the LILE workshop were successfully he...
Conference Paper
One of the existing query recommendation strategies for unknown datasets is "by example", i.e. based on a query that the user already knows how to formulate on another dataset within a similar domain. In this paper we measure what contribution a structural analysis of the query and the datasets can bring to a recommendation strategy, to go alongsid...
Article
Full-text available
The advent of social media has enabled us to explore the impact of academic entities beyond the conventional bibliometric community. The traditional bibliometric indicators such as citation count, h-index and SNIP metrics aim to represent the propagation of knowledge in the academic world rather than the impact of the research on the wider world. I...
Article
Full-text available
The goal of this work is to describe how robots interact with complex city environments, and to identify the main characteristics of an emerging field that we call Robot–City Interaction (RCI). Given the central role recently gained by modern cities as use cases for the deployment of advanced technologies, and the advancements achieved in the robot...
Article
Full-text available
Research has approached the practice of musical reception in a multitude of ways, such as the analysis of professional critique, sales figures and psychological processes activated by the act of listening. Studies in the Humanities, on the other hand, have been hindered by the lack of structured evidence of actual experiences of listening as report...
Article
Full-text available
Data retrieval systems are facing a paradigm shift due to the proliferation of specialised data storage engines (SQL, NoSQL, Column Stores, MapReduce, Data Stream, Graph) supported by varied data models (CSV, JSON, RDB, RDF, XML). One immediate consequence of this paradigm shift is a data bottleneck over the Web, which means Web applicatio...
Conference Paper
Addressing ethical issues arising from AI research, and by extension from most areas of Data Science, is a core challenge in both the academic and industry worlds. The nature of research and the specific set of technical skills involved imply that AI and Data Science researchers are not equipped to identify and anticipate such issues arising, or to...
Article
Full-text available
HSE EPA Data Policy Workshop: From Open Data to GDPR Data Sharing Challenges – Environment, Health & Wellbeing. HSE EPA Environment, Health & Wellbeing Conference, 2-4pm, Wednesday 7th November 2018, Radisson Blu Hotel, Golden Lane, Dublin. The workshop was comprised of two panels lasting one hour each. Each panel was comprised of 4 people who made 8-10...
Book
Full-text available
Distance teaching and the use of openly available educational resources on the Web are becoming common practices at public higher education institutions as well as private training organisations. In addition, informal learning and knowledge exchange are inherent to our daily online interactions, such as searching the Web [1], and using learning and...
Conference Paper
It is our great pleasure to welcome you to the WWW 2018 Challenges Track. It is the first time that the WWW conference includes such a track, whose aim was to showcase the maturity of the state of the art on tasks common to the Web community and adjacent academic communities, in a controlled setting of rigorous evaluation. Through our call for chal...
Conference Paper
This volume of proceedings presents the papers from the 2nd edition of the interdisciplinary workshop Re-coding Black Mirror, held on April 24, 2018 in Lyon, France and co-located with The WEB Conference (WWW2018). Participating in the topical debate of data ethics and algorithmic governance, Re-coding Black Mirror offers the research community too...
Conference Paper
Full-text available
Whether for using online services or dealing with legal issues, citizens are often requested to sign/accept policy documents that are intended to commit them to specific rights and duties. Usually such documents are difficult to read due to their nature, the length of sentences, complex terms used, etc. Since understanding is a prerequisite to maki...
Conference Paper
Full-text available
The goal of AFEL is to develop, pilot and evaluate methods and applications, which advance informal/collective learning as it surfaces implicitly in online social environments. The project is following a multidisciplinary, industry-driven approach to the analysis and understanding of learner data in order to personalize, accelerate and improve in...
Article
Semantic Web technologies aim to simplify the distribution, sharing and exploitation of information and knowledge, across multiple distributed actors on the Web. As with all technologies that manipulate information, there are privacy and security implications, and data policies (e.g., licenses and regulations) that may apply to both data and softwa...
Conference Paper
When publishing data, data licences are used to specify the actions that are permitted or prohibited, and the duties that target data consumers must comply with. However, in complex environments such as a smart city data portal, multiple data sources are constantly being combined, processed and redistributed. In such a scenario, deciding which poli...
Conference Paper
Full-text available
The focus of this work is to exploit ontologies to make robotic systems more accessible to non-expert users, therefore supporting the deployment of robot-integrated applications. Due to the increasing number of robotic platforms available for commercial use, robotic systems are nowadays being approached by users with different backgrounds, who are...
Conference Paper
Full-text available
In this paper, we discuss how Learning Analytics, as the activity to capture and analyze people's learning behaviors in order to improve their learning experiences, could be used as a way for bots to "learn how to learn" and how this might have a greater impact than the apparent improvement it would enable for Artificial Intelligence. Through explo...
Conference Paper
Full-text available
More and more learning activities take place online in a self-directed manner. Therefore, just as the idea of self-tracking activities for fitness purposes has gained momentum in the past few years, tools and methods for awareness and self-reflection on one's own online learning behavior appear as an emerging need for both formal and informal learn...
Conference Paper
Virtual data integration takes place at query execution time and relies on transformations of the original query to many target endpoints, where the data reside. In systems that integrate many data sources, this means maintaining many mappings, queries and query templates, as well as possibly issuing separate queries for linking entities in the dat...
Conference Paper
An increasing number of large-scale knowledge graphs have been constructed in recent years. Those graphs are often created from text-based extraction, which could be very noisy. So far, cleaning knowledge graphs is often carried out by human experts and is thus very inefficient. It is necessary to explore automatic methods for identifying and elimina...
Conference Paper
In this paper we present the DKA-robo framework, where a mobile robot is used to update the statements of a knowledge base that have lost validity in time. Managing the dynamic information of knowledge bases constitutes a key issue in many real-world scenarios, because constantly reevaluating data requires efforts in terms of knowledge acquisition...
Conference Paper
Web-scale reuse and interoperability of learning resources have been major concerns for the technology-enhanced learning community. While work in this area traditionally focused on learning resource metadata, provided through learning resource repositories, the recent emergence of structured entity markup on the Web through standards such as RDFa a...
Book
This book contains the best selected papers of two Satellite Events held at the 20th International Conference on Knowledge Engineering and Knowledge Management, EKAW 2016, in November 2016 in Bologna, Italy: The Second International Workshop on Educational Knowledge Management, EKM 2016, and the First Workshop: Detection, Representation and Managem...
Article
The Learning Analytics and Knowledge (LAK) Dataset represents an unprecedented corpus which exposes a near complete collection of bibliographic resources for a specific research discipline, namely the connected areas of Learning Analytics and Educational Data Mining. Covering over five years of scientific literature from the most relevant conferenc...
Conference Paper
Workflow formalisations are often focused on the representation of a process with the primary objective to support execution. However, there are scenarios where what needs to be represented is the effect of the process on the data artefacts involved, for example when reasoning over the corresponding data policies. This can be achieved by annotating...
Conference Paper
The goal of this work is to learn a measure supporting the detection of strong relationships between Linked Data entities. Such relationships can be represented as paths of entities and properties, and can be obtained through a blind graph search process traversing Linked Data. The challenge here is therefore the design of a cost-function that is a...
Conference Paper
In this demo paper, a SPARQL Query Recommendation Tool (called SQUIRE) based on query reformulation is presented. Based on three steps, Generalization, Specialization and Evaluation, SQUIRE implements the logic of reformulating a SPARQL query that is satisfiable w.r.t a source RDF dataset, into others that are satisfiable w.r.t a target RDF dataset...
Article
The Semantic Web is a young discipline, even if only in comparison to other areas of computer science. Nonetheless, it already exhibits an interesting history and evolution. This book is a reflection on this evolution, aiming to take a snapshot of where we are at this specific point in time, and also showing what might be the focus of future resear...
Article
This paper describes the construction of the LOTED2 ontology for the representation of European public procurement notices. LOTED2 follows initiatives around the creation of linked data-compliant representations of information regarding tender notices in Europe, but focusing on placing such representations within their legal context. It is therefor...
Chapter
Education has often been a keen adopter of new information and communication technologies. This is not surprising given that education is all about informing and communicating. Traditionally, educational institutions produce large volumes of data, much of which is publicly available, either because it is useful to communicate (e.g. the course catal...
Article
Full-text available
The article reports on the evolution of data.open.ac.uk, the Linked Open Data platform of the Open University, from a research experiment to a data hub for the open content of the University. Entirely based on Semantic Web technologies (RDF and the Linked Data principles), data.open.ac.uk is used to curate, publish and access data about academic de...
Book
The 47 revised full papers presented together with three invited talks were carefully reviewed and selected from 204 submissions. This program was completed by a demonstration and poster session, in which researchers had the chance to present their latest results and advances in the form of live demos. In addition, the PhD Symposium program include...
Chapter
The Semantic Web, and by extension semantic web technologies, is very young in comparison to other computing disciplines, such as databases and artificial intelligence, and indeed even the Web is very young in comparison with these disciplines. As a result, as is usually the case with new phenomena, it will probably take time to develop a comprehen...
Chapter
As shown in the previous chapter, examples of intelligent semantic web systems are becoming more and more common, and have evolved from academic prototypes demonstrating advanced concepts, to being adopted by commercial organizations producing tools used on a daily basis by millions of people. Hence, the vision of the Semantic Web and of what could...
Chapter
The semantic web, which we have characterized in the previous chapter as a conceptual network two levels of abstractions above the web, requires new approaches and tools to enable users and applications to interact with and exploit something that is akin to a knowledge network, rather than being simply a network of documents. Knowledge-based system...
Chapter
In the previous chapters we provided a conceptualization of the Semantic Web and explained how intelligent applications can be built to rely on its contents and knowledge sharing infrastructure. At the same time we also pointed out the specificities of the semantic web, compared to other types of information systems. At this point, we want to shift...
Article
Full-text available
How can we innovate smart systems for smart cities, to make data available homogeneously, inexpensively, and flexibly while supporting an array of applications that have yet to exist or be specified?
Conference Paper
Full-text available
Governing the life cycle of data on the web is a challenging issue for organisations and users. Data is distributed under certain policies that determine what actions are allowed and in which circumstances. Assessing what policies propagate to the output of a process is one crucial problem. Having a description of policies and data flow steps impli...
Conference Paper
In this paper, we propose an ontology design pattern for the concept of "explanation". The motivation behind this work comes from our research, which focuses on automatically identifying explanations for data patterns. If we want to produce explanations from data agnostically from the application domain, we first need a formal definition of what an...
Conference Paper
In this paper we present the system Dedalo, whose aim is to generate explanations for data patterns using background knowledge retrieved from Linked Data. In many real-world scenarios, patterns are generally manually interpreted by the experts that have to use their own background knowledge to explain and refine them, while their workload could be...
Conference Paper
Full-text available
Licences are a crucial aspect of the information publishing process in the web of (linked) data. Recent work on modeling of policies with semantic web languages (RDF, ODRL) gives the opportunity to formally describe licences and reason upon them. However, choosing the right licence is still challenging. Particularly, understanding the number of fea...
Conference Paper
In this paper we exploit knowledge from Linked Data to ease the process of analysing scholarly data. In recent years, many techniques have been presented with the aim of analysing such data and revealing new, unrevealed knowledge, generally presented in the form of "patterns". However, the discovered patterns often still require human interpreta...
Conference Paper
The LAK Data Challenge 2015 continues the research efforts of the previous data competitions in 2013 and 2014 by stimulating research on the evolving fields Learning Analytics (LA) and Educational Data Mining (EDM). Building on a series of activities of the LinkedUp project, the challenge aims to generate new insights and analysis on the LA & EDM d...