Femke Ongenae

Universiteit Gent - imec · IDLab

Professor

About

170
Publications
30,442
Reads
1,253
Citations
Citations since 2017
97 Research Items
1021 Citations
Introduction
I am a part-time assistant professor at the IDLab group of Ghent University and a full-time research manager at the imec research hub for nano and digital technologies. I lead the Knowledge Management team. This team performs research into a) expressive semantic stream and distributed reasoning, b) the incorporation of expert knowledge in data analytics algorithms, c) hybrid AI, fusing semantic models and machine learning, and d) explainable AI by leveraging Knowledge Graphs.
Additional affiliations
August 2007 - present
Universiteit Gent - imec
Position
  • Ghent University
Education
September 2003 - June 2007
Ghent University
Field of study
  • Computer Science


Publications (170)
Article
Full-text available
In this paper, a hybrid leak localization approach in WDNs is proposed, combining both model-based and data-driven modeling. Pressure heads of leak scenarios are simulated using a hydraulic model, and then used to train a machine-learning based leak localization model. A key element of the methodology is that discrepancies between simulated and mea...
Article
Full-text available
Background: The diagnosis of headache disorders relies on the correct classification of individual headache attacks. Currently, this is mainly done by clinicians in a clinical setting, which is dependent on subjective self-reported input from patients. Existing classification apps also rely on self-reported information and lack validation. Therefore...
Preprint
Full-text available
A paper by Alsinglawi et al. was recently accepted and published in Scientific Reports. In this paper, the authors aim to predict length of stay (LOS), discretized into either long (> 7 days) or short stays (< 7 days), of lung cancer patients in an ICU department using various machine learning techniques. The authors claim to achieve perfect results...
Article
Full-text available
Background: Insomnia, eating disorders, heart problems and even strokes are just some of the illnesses that reveal the negative impact of stress overload on health and well-being. Early detection of stress is therefore of utmost importance. Whereas the gold standard for detecting stress is by means of questionnaires, more recent work uses wearable s...
Article
Full-text available
As companies rely on an ever-increasing number of connected devices for their day-to-day operations, a need arises for automated anomaly detectors to constantly observe crucial device metrics in real time to prevent downtime and data loss. As production environments tend to monitor a huge number of these metrics, this prevents current state-of-the-ar...
Article
Full-text available
Background: Beta-lactam antimicrobial concentrations are frequently suboptimal in critically ill patients. Population pharmacokinetic (PopPK) modeling is the gold standard for predicting drug concentrations. However, currently available PopPK models often lack predictive accuracy, making them less suited to guide dosing regimen adaptations. Furthermor...
Article
Full-text available
Background: Anxiety disorders are highly prevalent among mental health problems. The lives of people suffering from an anxiety disorder can be severely impaired. Virtual Reality Exposure Therapy (VRET) is an effective treatment, which immerses patients in a controlled Virtual Environment (VE). This creates the opportunity to confront feared stimuli a...
Preprint
Full-text available
Feature selection is a crucial step in developing robust and powerful machine learning models. Feature selection techniques can be divided into two categories: filter and wrapper methods. While wrapper methods commonly result in strong predictive performances, they suffer from a large computational complexity and therefore take a significant amount...
Article
Full-text available
In today’s data landscape, data streams are well represented. This is mainly due to the rise of data-intensive domains such as the Internet of Things (IoT), Smart Industries, Pervasive Health, and Social Media. To extract meaningful insights from these streams, they should be processed in real time, while solving an integration problem as these str...
Preprint
Full-text available
This paper introduces pyRDF2Vec, a Python software package that reimplements the well-known RDF2Vec algorithm along with several of its extensions. By making the algorithm available in the most popular data science language, and by bundling all extensions into a single place, the use of RDF2Vec is simplified for data scientists. The package is rele...
Conference Paper
Full-text available
A large portion of structured data does not yet reap the benefits of the Semantic Web. Therefore, the "Tabular Data to Knowledge Graph Matching" competition at ISWC tries to bridge this gap by evaluating and promoting the creation of such semantic annotation tools. Besides annotating data semantically, the system should also be able to further aug...
Preprint
Full-text available
The inception of Relational Graph Convolutional Networks (R-GCNs) marked a milestone in the Semantic Web domain as it allows for end-to-end training of machine learning models that operate on Knowledge Graphs (KGs). R-GCNs generate a representation for a node of interest by repeatedly aggregating parametrised, relation-specific transformations of i...
Article
Full-text available
Deep learning techniques are increasingly being applied to solve various machine learning tasks that use Knowledge Graphs as input data. However, these techniques typically learn a latent representation for the entities of interest internally, which is then used to make decisions. This latent representation is often not comprehensible to humans, wh...
Article
Full-text available
Traditionally, neural networks are viewed from the perspective of connected neuron layers represented as matrix multiplications. We propose to compose these weight matrices from a set of orthogonal basis matrices by approaching them as elements of the real matrices vector space under addition and multiplication. Making use of the Kronecker product...
Article
Full-text available
Companies are increasingly gathering and analyzing time-series data, driven by the rising number of IoT devices. Many works in the literature describe analysis systems built using either data-driven or semantic (knowledge-driven) techniques. However, few to no works describe hybrid combinations of these two. Dyversify, a collaborative project betwee...
Article
Full-text available
Manufacturers can plan predictive maintenance by remotely monitoring their assets. However, to extract the necessary insights from monitoring data, they often lack sufficiently large datasets that are labeled by human experts. We suggest combining knowledge-driven and unsupervised data-driven approaches to tackle this issue. Additionally, we presen...
Chapter
As Knowledge Graphs are symbolic constructs, specialized techniques have to be applied in order to make them compatible with data mining techniques. RDF2Vec is an unsupervised technique that can create task-agnostic numerical representations of the nodes in a KG by extending successful language modeling techniques. The original work proposed the We...
Chapter
Stream Reasoning, and more particularly RDF Stream Processing (RSP), has focused on processing data streams in a timely manner, while expressive reasoning techniques, such as OWL2 DL, make it possible to fully model and interpret domain knowledge. However, expressive reasoning techniques have thus far mostly focused on static data, as it tends to become...
Book
Full-text available
This book constitutes the proceedings of the satellite events held at the 18th Extended Semantic Web Conference, ESWC 2021, in June 2021. The conference was held online, due to the COVID-19 pandemic. During ESWC 2021, the following six workshops took place: 1) the Second International Workshop on Deep Learning meets Ontologies and Natural Language...
Chapter
Full-text available
The RDF Stream Processing (RSP) community has proposed several models and languages for continuously querying and reasoning over RDF streams over the last decade. They each have their semantics, making them hard to compare. The variety of approaches has fostered both empirical and theoretical research and led to the design of RSPQL, i.e., a unifyin...
Article
Full-text available
Anomalies and faults can be detected, and their causes verified, using both data-driven and knowledge-driven techniques. Data-driven techniques can adapt their internal functioning based on the raw input data but fail to explain the manifestation of any detection. Knowledge-driven techniques inherently deliver the cause of the faults that were dete...
Article
Full-text available
In the time series classification domain, shapelets are subsequences that are discriminative of a certain class. It has been shown that classifiers are able to achieve state-of-the-art results by taking the distances from the input time series to different discriminative shapelets as the input. Additionally, these shapelets can be visualized and th...
Article
Information extracted from electrohysterography recordings could potentially prove to be an interesting additional source of information to estimate the risk of preterm birth. Recently, a large number of studies have reported near-perfect results to distinguish between recordings of patients that will deliver term or preterm using a public resource...
Article
Full-text available
Background: Leveraging graphs for machine learning tasks can result in more expressive power as extra information is added to the data by explicitly encoding relations between entities. Knowledge graphs are multi-relational, directed graph representations of domain knowledge. Recently, deep learning-based techniques have been gaining a lot of popula...
Chapter
At the end of 2019, Chinese authorities alerted the World Health Organization (WHO) to the outbreak of a new strain of the coronavirus, called SARS-CoV-2, which caused an unprecedented global disaster a few months later. In response to this pandemic, a publicly available dataset was released on Kaggle which contained information on over 63,000...
Preprint
Full-text available
As KGs are symbolic constructs, specialized techniques have to be applied in order to make them compatible with data mining techniques. RDF2Vec is an unsupervised technique that can create task-agnostic numerical representations of the nodes in a KG by extending successful language modelling techniques. The original work proposed the Weisfeiler-Leh...
Article
Full-text available
This paper contributes to the pursuit of leveraging unstructured medical notes for structured clinical decision making. In particular, we present a pipeline for clinical information extraction from medical notes related to preterm birth, and discuss the main challenges as well as its potential for clinical practice. A large collection of medical not...
Article
Full-text available
The Matrix Profile is a state-of-the-art time series analysis technique that can be used for motif discovery, anomaly detection, segmentation and others, in various domains such as healthcare, robotics, and audio. Where recent techniques use the Matrix Profile as a preprocessing or modeling step, we believe there is unexplored potential in generali...
Article
Full-text available
In industry, dashboards are often used to monitor fleets of assets, such as trains, machines or buildings. In such industrial fleets, the vast number of sensors evolves continuously, new sensor data exchange protocols and data formats are introduced, new visualization types may need to be introduced and existing dashboard visualizations may need to...
Article
Full-text available
Autism Spectrum Disorder (ASD) is characterized by social interaction difficulties and communication difficulties. Moreover, children with ASD often suffer from other co-morbidities, such as anxiety and depression. Finding appropriate treatment can be difficult as symptoms of ASD and co-morbidities often overlap. Due to these challenges, parents of...
Preprint
Full-text available
Information extracted from electrohysterography recordings could potentially prove to be an interesting additional source of information to estimate the risk of preterm birth. Recently, a large number of studies have reported near-perfect results to distinguish between recordings of patients that will deliver term or preterm using a public resource...
Chapter
Full-text available
Communication networks are complex systems consisting of many components, each producing a multitude of system metrics that can be monitored in real time. Anomaly Detection (AD) makes it possible to detect deviant behavior in these system metrics. However, in communication networks, large amounts of domain knowledge and huge manual efforts are required to effi...
Conference Paper
Full-text available
A large portion of structured data does not yet reap the benefits of the Semantic Web, or Web 2.0, as it is not semantically annotated. In this paper, we propose a system to generate semantic knowledge, available on DBPedia, from common CSV files. The "Tabular Data to Knowledge Graph Matching" competition, consisting of three different subchalleng...
Chapter
During amateur cycling training, analyzing sensor data in real-time would allow riders to receive immediate feedback on how they are performing, and adapt their training accordingly. In this paper, a solution with Semantic Web technologies is presented that gives such real-time personalized feedback, by integrating the data streams with domain know...
Article
Full-text available
IoT-based solutions for sport analytics aim to improve performance, coaching and strategic insights. These factors are especially relevant in cycling, where real-time data should be available anytime, anywhere, even in remote areas where there are no infrastructure-based communication technologies (e.g. LTE, Wi-Fi). In this paper, we present an exp...
Conference Paper
Full-text available
Deep-learning based techniques are increasingly being used for different machine learning tasks on knowledge graphs. While it has been shown empirically that these techniques often achieve better predictive performances than their classical counterparts, where features are extracted from the graph, they lack interpretability. Interpretability is a...
Preprint
Full-text available
In the time series classification domain, shapelets are small time series that are discriminative for a certain class. It has been shown that classifiers are able to achieve state-of-the-art results on a plethora of datasets by taking as input distances from the input time series to different discriminative shapelets. Additionally, these shapelets...
Article
Full-text available
In highly dynamic domains such as the Internet of Things (IoT), Smart Industries, Smart Manufacturing, Pervasive Health or Social Media, data is being continuously generated. By combining this generated data with background knowledge and performing expressive reasoning upon this combination, meaningful decisions can be made. Furthermore, this conti...
Conference Paper
Full-text available
Many domains, such as the Internet of Things and Social Media, require combining data streams with background knowledge to enable meaningful analysis in real time. When background knowledge takes the form of taxonomies and class hierarchies, Semantic Web technologies are valuable tools and their extension to data streams, namely RDF Stream processi...
Article
Full-text available
Background: Mobile apps generate vast amounts of user data. In the mobile health (mHealth) domain, researchers are increasingly discovering the opportunities of log data to assess the usage of their mobile apps. To date, however, the analysis of these data is often limited to descriptive statistics. Using data mining techniques, log data can offe...
Article
Full-text available
The continuous financial pressure on hospitals forces them to rethink various workflows. We focus on optimizing hospital transports, within the hospital, as they account for up to 30% of the overall hospital cost. In this paper, we discuss a self-learning platform that learns the causes of transport delays, in order to avoid these kinds of delays in the...
Article
Behavioral disturbances of persons with dementia residing in a nursing home impose a significant burden on other residents and on the care staff. A social robot can provide an adequate technological support tool for the caregivers by approaching a resident that exhibits a behavioral disturbance. In this paper, we focus on how to position the robot...
Chapter
Machine learning techniques are increasingly applied in Decision Support Systems. The selection processes underlying a conclusion often become black-boxed. Thus, the decision flow is not always comprehensible by developers or end users. It is unclear what the priorities are and whether all of the relevant information is used. In order to achieve hu...
Article
Full-text available
Purpose: This study aimed to predict the session Rate of Perceived Exertion (sRPE) in soccer and determine the main predictive indicators of the sRPE. Methods: A total of 70 External Load Indicators (ELIs), Internal Load Indicators (ILIs), Individual Characteristics (ICs) and Supplementary Variables (SVs) were used to build a predictive model. Re...
Article
Full-text available
Background: Headache disorders are an important health burden, having a large health-economic impact worldwide. Current treatment and follow-up processes are often archaic, creating opportunities for computer-aided and decision support systems to increase their efficiency. Existing systems are mostly completely data-driven, and the underlying models a...
Article
Full-text available
In the Internet of Things (IoT), multiple sensors and devices are generating heterogeneous streams of data. To perform meaningful analysis over multiple of these streams, stream processing needs to support expressive reasoning capabilities to infer implicit facts and temporal reasoning to capture temporal dependencies. However, current approaches c...
Article
Introduction: Blood cultures are often performed in the intensive care unit (ICU) to detect bloodstream infections and identify pathogen type, further guiding treatment. Early detection is essential, as a bloodstream infection can give rise to sepsis, a severe immune response associated with an increased risk of organ failure and death. Problem s...
Article
Robots are moving from well-controlled lab environments to the real world, where an increasing number of environments have been transformed into smart sensorized IoT spaces. Users will expect these robots to adapt to their preferences and needs, and even more so for social robots that engage in personal interactions. In this paper, we present declar...
Article
Full-text available
In hospitals and smart nursing homes, ambient-intelligent care rooms are equipped with many sensors. They can monitor environmental and body parameters, and detect wearable devices of patients and nurses. Hence, they continuously produce data streams. This offers the opportunity to collect, integrate and interpret this data in a context-aware manne...
Conference Paper
Full-text available
Assessing upfront the causes and effects of failures is an important aspect of system manufacturing. Nowadays, these analyses are performed by a large number of experts. To enable semantic unification and easy operationalization of these risk analyses, this paper demonstrates an approach to automatically map the captured information into an ontolog...
Conference Paper
Full-text available
Sensors, inside internet-connected devices, analyse the environment and monitor possible unwanted behaviour or the malfunctioning of the system. Current risk analysis tools, such as Fault Tree Analysis (FTA) and Failure Mode and Effect Analysis (FMEA), provide prior information on these faults together with expert-driven insights of the system. Many peo...