Semantic Web 00 (20xx) 1–49
DOI 10.3233/SW-223281
IOS Press
Context-aware query derivation for IoT data
streams with DIVIDE enabling privacy by
design
Mathias De Brouwer *, Bram Steenwinckel, Ziye Fang, Marija Stojchevska, Pieter Bonte,
Filip De Turck, Sofie Van Hoecke and Femke Ongenae
IDLab, Ghent University – imec, Belgium
E-mails: mrdbrouw.DeBrouwer@UGent.be, Bram.Steenwinckel@UGent.be, Ziye.Fang@UGent.be, Marija.Stojchevska@UGent.be, Pieter.Bonte@UGent.be, Filip.DeTurck@UGent.be, Sofie.VanHoecke@UGent.be, Femke.Ongenae@UGent.be
Editors: Haridimos Kondylakis, FORTH-ICS, Greece; Praveen Rao, University of Missouri, USA; Kostas Stefanidis, Tampere University,
Finland
Solicited reviews: Fotis Aisopos, National Centre of Scientific Research Demokritos, Greece; three anonymous reviewers
Abstract. Integrating Internet of Things (IoT) sensor data from heterogeneous sources with domain knowledge and context
information in real-time is a challenging task in IoT healthcare data management applications that can be solved with seman-
tics. Existing IoT platforms often have issues with preserving the privacy of patient data. Moreover, configuring and managing
context-aware stream processing queries in semantic IoT platforms requires much manual, labor-intensive effort. Generic queries
can deal with context changes but often lead to performance issues caused by the need for expressive real-time semantic reason-
ing. In addition, query window parameters are part of the manual configuration and cannot be made context-dependent. To tackle
these problems, this paper presents DIVIDE, a component for a semantic IoT platform that adaptively derives and manages the
queries of the platform’s stream processing components in a context-aware and scalable manner, and that enables privacy by
design. By performing semantic reasoning to derive the queries when context changes are observed, their real-time evaluation
does not require any reasoning. The results of an evaluation on a homecare monitoring use case demonstrate how activity detection
queries derived with DIVIDE can be evaluated in on average less than 3.7 seconds and can therefore successfully run on low-end
IoT devices.
Keywords: Context-aware query derivation, Internet of Things, cascading reasoning, semantic reasoning, homecare monitoring
1. Introduction
1.1. Background
In the healthcare domain, many applications involve a large collection of Internet of Things (IoT) devices and
sensors [40]. Many of those systems typically focus on the real-time monitoring of patients in hospitals, nursing
homes, homecare or elsewhere. In such systems, patients and their environment are being equipped with different
devices and sensors for following up on the patients’ conditions, diseases and treatments in a personalized, context-
aware way. This is achieved by integrating the data collected by the IoT devices with existing domain knowledge
and context information. As such, analyzing this combination of data sources jointly allows a system to extract
meaningful insights and actuate on them [66].
Integrating and analyzing the IoT data with domain knowledge and context information in a real-time context is
a challenging task. This is due to the typically high volume, variety and velocity of the different data sources [2]. To
deal with these challenges, semantic IoT platforms can be deployed [11]. They generally contain stream processing
components that integrate and analyze the different data sources by continuously evaluating semantic queries. To
deploy this, Semantic Web technologies are typically employed: ontologies are designed to integrate and model the
data from different heterogeneous sources and its relationships and properties in a common, machine-interpretable
format, and existing stream reasoning techniques are used by the data stream processing components [30].
Currently, the configuration and management of queries that run on the stream processing components of a se-
mantic IoT platform are manual tasks that require a lot of effort from the end user. In the typical IoT applications
in healthcare, those queries should be context-aware: the context information determines which sensors and devices
should be monitored by the query, for example to filter specific events to send to other components for further anal-
ysis. For example, a patient’s diagnosis in the Electronic Health Record (EHR) determines the monitoring tasks that
should be performed in the patient’s hospital room, while the indoor location (room) of the patient in a homecare
monitoring environment determines which in-home activities can be monitored. Changes in this context informa-
tion regularly occur. For example, the profile information of patients in their EHR can be updated, or the in-home
location of the patient can evolve over time. Hence, the management of the queries should be able to deal with such
context changes. Currently, no semantic IoT platform component exists that allows configuring, deriving and managing the platform’s queries in an automated, adaptive way. Therefore, platforms typically apply one of two existing
approaches to achieve this.
The first approach to introduce context-awareness into semantic queries is by defining them in a generic fashion.
A generic query uses generic ontology concepts in its definitions to perform multiple contextually relevant tasks.
This way, semantic reasoners will reason in real-time on all available streaming, context and domain knowledge data
to determine the contextually relevant sensors and devices to which the query is applicable. The advantage of this
approach is that such queries are prepared to deal with contextual changes: due to their generic nature, they should
not be updated often. However, the disadvantage of highly generic queries is the high computational complexity
of the semantic reasoning during their evaluation. This is caused by complex ontologies in IoT domains such as
healthcare that require expressive reasoning [16]. In healthcare applications that involve a large number of sensors,
it is practically challenging to do this in real-time: queries take longer to evaluate, causing lower performance and
difficulty to keep up with the required query execution frequencies. Typically, central components in an IoT platform
have more resources and are therefore more likely to overcome this challenge. However, running all queries on
central components would require all generated IoT streaming data to be sent over the network, causing the network
to be highly congested all the time. In addition, the central server resources would be constantly in use, and local
decision making would no longer be possible. Importantly, this would also imply no flexibility in preserving the
patient’s privacy by keeping sensitive data locally. Looking at local and edge IoT devices to run those generic queries
instead, resources are typically lower, making the performance challenges an even bigger issue of the generic query
approach.
An alternative approach that can be adopted is installing multiple specific queries on the stream processing com-
ponents that filter the contextually relevant sensors for one specific task. Evaluating such non-generic queries re-
duces the required semantic reasoning effort, solving the performance issues of the generic approach. However, this
approach even further increases the required manual query configuration and management effort for the end user:
whenever the context changes, the queries should be manually updated. This is infeasible to do in practice. As a
consequence, current platforms do not apply this approach often and mostly work with generic queries instead.
Moreover, the definition of generic stream processing queries does not contain any means to make the window
parameters of the query dependent on the application context and domain knowledge. Currently, an end user should
configure these query parameters, and cannot let the system define them based on the data. This can be a problem
in some specific monitoring cases. For example, the size of the data window on which a monitoring task such as in-
home activity detection should be executed, may depend on the type of task, and therefore be defined in the domain
knowledge. Another example is when the execution frequency of a monitoring task depends on certain contextual
events happening in the patient’s environment.
In addition, preserving the privacy of the patients is of utmost importance in healthcare systems [1]. In IoT
platforms, lots of the data generated by the IoT devices can contain privacy-sensitive information. Depending on
where the data processing components are being hosted, this privacy-sensitive data may have to be sent over the IoT
network, potentially exposing it to the outside world. Therefore, the IoT data is ideally processed close to where it
is generated to reduce the amount of information sent over the network as much as possible. With regards to this, a
semantic IoT platform should enable privacy by design [21]: it should allow an end user to build privacy by design
into an application by precisely controlling which data is kept locally, and which data is sent over the network.
Finally, a semantic IoT platform component that would solve the aforementioned issues, should also be practically
usable. Currently, existing semantic IoT healthcare platforms use semantic reasoners or stream reasoners that are
configured with existing sets of generic semantic queries [66]. Defining such queries and ensuring their correctness
is a delicate and time-consuming task. Hence, a new component should not introduce a completely different means
of defining generic queries, but instead reduce the required changes to these definitions to a minimum. This implies
that it should start from the generic definition of stream processing queries. Moreover, the other configuration tasks
of the component should also be as minimal as possible to increase overall usability.
1.2. Research objectives and paper contribution
In summary, there is a need for a semantic IoT platform component that fulfills the different requirements tackled
in the previous subsection, so that it can be applied in a healthcare data management system. Hence, we set the
following research objectives for the design of such an additional semantic IoT platform component:
1. The component should reduce the manual, labor-intensive query configuration effort by managing the queries
on the platform’s stream processing components in an automated, adaptive and context-aware way.
2. The evaluation of queries managed by the component should be performant, also on low-end IoT edge devices
with fewer resources. Network congestion and overuse of central resources should be avoided.
3. The component should allow for the query window parameters to be context-dependent.
4. The component should enable privacy by design: it should allow end users to integrate privacy by design into
an application by defining, on different levels of abstraction, which data is kept locally and which parts of the
data can be sent over the network.
5. The component should be practically usable, minimizing the effort to integrate it into an existing system.
This paper presents DIVIDE, a semantic IoT platform component that we have designed to achieve the presented
research objectives. DIVIDE automatically and adaptively derives and manages the contextually relevant specific
queries for the platform’s stream processing components, by performing semantic reasoning with a generic query
definition whenever contextual changes occur. As a result, the derived queries will efficiently monitor the relevant
IoT sensors and devices in real-time, while still not requiring any real-time reasoning during their evaluation.
The contribution of this paper is the methodological design and proof-of-concept of the DIVIDE component,
fulfilling the requirements associated with the above research objectives. In the paper, DIVIDE is applied and eval-
uated on a realistic homecare monitoring use case, to demonstrate how it can be used in a practical IoT application
context that works with privacy-sensitive information.
1.3. Paper organization
The remainder of this paper is structured as follows. Section 2 discusses some related work. In Section 3, the eHealth use case scenario is further explained, translated into the technical system set-up, and semantically described with an ontology. Section 4 presents a general overview of the DIVIDE system. Further functional and algorithmic details of DIVIDE are provided in Section 5 and Section 6 using the running use case example, while Section 7 zooms in on the technical implementation of DIVIDE. Section 8 describes the evaluation set-up with the different evaluation scenarios and hardware set-up. Results of the evaluations are presented in Section 9, and further discussed in Section 10. Finally, Section 11 concludes the main findings of the paper and highlights future work.
2. Related work
2.1. Semantic Web, stream processing and stream reasoning
Using Semantic Web technologies such as the Resource Description Framework (RDF) and the Web Ontol-
ogy Language (OWL), heterogeneous data sources can be consolidated and semantically enriched into a machine-
interpretable representation using ontologies [11]. An ontology is a model that semantically describes all domain-
specific knowledge by defining domain concepts and their relations and attributes. Within RDF, an Internationalized
Resource Identifier (IRI) is used to refer to every resource defined in an ontology [31]. Semantic reasoners can
interpret semantic data to derive new knowledge based on the definitions in the ontologies. The complexity of the
semantic reasoning depends on the expressivity of the underlying ontology [53]. Different ontology languages exist.
They range from RDFS, which has the lowest expressivity, to OWL 2 DL, which has the highest expressivity.
RDFox [55] and VLog [72] are state-of-the-art OWL 2 RL reasoners. OWL 2 RL contains all constructs that can
be evaluated by a rule engine. These constructs can be expressed by simple Datalog rules. By design, these engines
are not able to handle streaming data. However, RDFox can also run on a Raspberry Pi, and any ARM-based IoT
edge device in general. In addition, previous research has shown it can also successfully run on a smartphone [48].
Notation3 Logic (N3) [14] is a rule-based logic that is often used to write down RDF. N3 is a superset of RDF/Turtle [25], which implies that any valid RDF/Turtle definitions are valid N3 as well.
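As a brief illustration (a minimal sketch with hypothetical example IRIs, not taken from the paper’s ontology), the OWL 2 RL subclass construct mentioned above can be written as a Datalog-style N3 rule that a rule engine can evaluate over plain RDF/Turtle facts:

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix ex: <http://example.org/> .

# Rule: if ?x is an instance of ?c1 and ?c1 is a subclass of ?c2, then ?x is an instance of ?c2.
{ ?x rdf:type ?c1 . ?c1 rdfs:subClassOf ?c2 . } => { ?x rdf:type ?c2 . } .

# Facts (any valid RDF/Turtle is also valid N3):
ex:BathRoom rdfs:subClassOf ex:Room .
ex:bathroom1 rdf:type ex:BathRoom .
# A forward-chaining N3 reasoner would derive: ex:bathroom1 rdf:type ex:Room .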
The Stream Reasoning (SR) state-of-the-art [30] contains three main approaches: Continuous Processing (CP) engines, Reasoning Over Time (ROT) frameworks and Reasoning About Time (RAT) frameworks. CP engines have
continuous semantics, high throughput, and low latency but do not perform reasoning. ROT frameworks solve rea-
soning tasks continuously with high throughput and low latency, but do not consider time. RAT frameworks do
consider time in the reasoning task, but may lack reactivity due to the high latency. These various approaches each
investigate the trade-off between the expressiveness of reasoning and the efficiency of processing [30].
RDF Stream Processing (RSP) identifies a family of CP engines that solve information needs over heterogeneous
streaming data, which is typical in IoT applications. It addresses data variety by adopting RDF streams as data
model, and solves data velocity by extending SPARQL with continuous semantics [65]. Different RSP engines exist, such as C-SPARQL [9], CQELS [46], Yasper [70] and RSP4J [69]. Queries can be registered to these engines to continuously filter the defined data streams. A data window is placed on top of the data stream.
Parameters of the window definition include the size of the data window that is added to the query’s data model,
and the window’s sliding step which directly influences the query’s evaluation frequency.
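As an illustration of these window parameters, the sketch below shows a C-SPARQL-style continuous query over a hypothetical humidity stream (the stream IRI, query name and prefix are assumptions for the example); the RANGE clause defines the window size and the STEP clause the sliding step, which together determine how much data is considered and how often the query is evaluated:

REGISTER QUERY HumidityMonitor AS
PREFIX saref: <https://saref.etsi.org/core/>
SELECT ?sensor ?value
FROM STREAM <http://example.org/streams/humidity> [RANGE 30s STEP 10s]
WHERE {
  # match any measurement with its value inside the current 30-second window
  ?sensor saref:makesMeasurement [ saref:hasValue ?value ] .
}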
RSP-QL [29] is a reference model that unifies the semantics of the existing RSP approaches. RSP has been
extended to support ROT in various ways: (i) solutions incorporating efficient incremental maintenance of materi-
alizations of the windowed ontology streams [10,44,54,73], (ii) solutions for expressive Description Logics (DL)
[47,68], and (iii) a solution for Answer Set Programming (ASP) [52]. More central to ROT is the logic-based frame-
work for analyzing reasoning over streams (LARS) [13] that extends ASP for analytical reasoning over data streams.
LASER [12] is a system, based on LARS, that employs a tractable fragment of LARS that ensures uniqueness of
models. BigSR [59] employs Big Data technologies (e.g., Apache Spark and Flink) to evaluate the positive fragment
of LARS. C-Sprite [17] focuses on efficient hierarchical reasoning to improve the throughput and application on
edge devices by efficiently filtering out unnecessary data in the stream. A similar approach to filter out unnecessary
streaming data in ASP exists, by investigating the dependency graph of the input data [57]. RDF Stream Event Processing (RSEP) identifies a family of approaches that extend CP over RDF Streams with event pattern matching [4].
RSEP extends RSP with a reactive RAT formalism with limited expressiveness [49]. RSEP-QL [28] is an exten-
sion of RSP-QL that incorporates the language features from Complex Event Processing (CEP) [50]. StreamQR
[20] rewrites continuous RSP queries to multiple parallel queries, allowing for the support of ontologies that are
expressed in the ELHIO logic. The CityPulse project [58] presents the combination of RSP, CEP and expressive
reasoning through ASP.
The most advanced attempts to develop expressive Stream Reasoning increased the reasoning expressiveness,
but at the cost of limited efficiency. DyKnow [38] and ETALIS [5] combine RAT and ROT reasoning, but perform
CP at an extremely slow speed. STARQL [56] is a first step in the right direction because it mixes RAT and ROT reasoning, utilizing a Virtual Knowledge Graph (VKG) approach [75] to obtain CP. Cascading Reasoning [64] was
proposed to solve the problem of expressive reasoning over high-frequency streams by introducing a hierarchical
approach consisting of multiple layers. Although several of the presented approaches adopt a hierarchical approach
[5,52,56], only a recent attempt has laid the first fundamentals on realizing the full vision of cascading reasoning
with Streaming MASSIF [18].
2.2. Semantic IoT platforms and privacy preservation
Today, different IoT platforms exist that extend big data platforms with IoT integrators [22,43]. FIWARE [24] is a platform that offers different APIs that can be used to deploy IoT applications. Sofia2 [61] is a semantic middleware
platform that allows different systems and devices to become interoperable for smart IoT applications. SymbIoTe
[62] goes a step further and abstracts existing IoT platforms by providing a virtual IoT environment provisioned
over various cloud-based IoT platforms. The Agile [35] and BIG IoT [19] platforms focus on flexible IoT APIs and gateway architectures, as do VICINITY [23] and INTER-IoT [36], which also provide an interoperability platform.
bIoTope [42] addresses the requirement for open platforms within IoT systems development.
Zooming in on IoT-based healthcare systems, a large number of solutions have risen in the last few years
[15,40,41,51]. Jaiswal et al. surveyed 146 IoT solutions for healthcare in recent years, and classified them into five
categories: sensor-based, resource-based, communication-based, application-based, and security-based approaches.
They identified scalability and interoperability as two big challenges that are yet to be solved by many systems.
Especially the latter is a challenge, given the heterogeneity of data originating from different sources. This challenge
can be solved with Semantic Web technologies.
Focusing on IoT healthcare systems that involve semantic technologies, multiple solutions already exist. For ex-
ample, on the topic of homecare monitoring, Zgheib et al. [76] proposed a scalable semantic framework to monitor activities of daily living in the elderly, to detect diseases and epidemics. The proposed framework is based on several
semantic reasoning techniques that are distributed over a semantic middleware layer. It makes use of CEP to extract
symptom indicators, which are fed to a SPARQL engine that detects individual diseases. C-SPARQL is then em-
ployed on a stream of diseases to detect possible epidemics. While this approach zooms in largely on scalability for
this specific use case, it does not offer any flexibility in making the SPARQL and C-SPARQL queries context-aware
in a fully automated and adaptive way.
Moreover, Jabbar et al. [39] and Ullah et al. [71] both presented an IoT-based Semantic Interoperability Model
that provides interoperability among heterogeneous IoT devices in the healthcare domain. These models add seman-
tic annotations to the IoT data, allowing SPARQL queries to easily extract concepts of interest. However, these illus-
trative SPARQL queries require manual configuration effort and do not automatically ensure context-awareness
in a dynamic environment. In addition, Ali et al. [3] present an ontology-aided recommendation system to efficiently
monitor the patient’s physiology based on wearable sensor data while recommending specific, personalized diets.
Similarly, Subramaniyaswamy et al. [67] present a personalized travel and food recommendation system based on
real-time IoT data about the patient’s physical conditions and activities. Again, these systems only work with static
SPARQL queries, not achieving context-awareness in an adaptive, dynamic environment.
In summary, many of the presented platforms are adopting a wide range of existing Semantic Web technologies
to deal with the challenges associated with real-time IoT applications in complex IoT domains such as healthcare.
These platforms typically combine different technologies that involve both stream processing and semantic rea-
soning components. They all have in common that the queries for the stream processing components are not yet
configured and managed in a fully automated, adaptive and context-aware way.
Privacy by design is an approach that states that privacy must be incorporated into networked data systems and
technologies, by default [21,45,60]. It approaches privacy from the design-thinking perspective, stating that the
data controller of a system must implement technical measures for data regulation by default, within the applicable
context. Privacy by design is a broad concept that is more concretely defined through seven principles that can
be applied to the design of a system. One of these principles is that the privacy-preserving capabilities should be
embedded into the design and architecture of IT systems. Another principle focuses on the importance of keeping
privacy user-centric, ensuring that the design always considers the needs and interests of the users. Other principles
focus on visibility and transparency, privacy as the default setting, proactive instead of reactive measures, avoiding
unnecessary privacy-related trade-offs, and end-to-end security through the lifecycle of the data. Privacy by design
is a key principle of the General Data Protection Regulation (GDPR) of the European Union [45].
3. Use case description and set-up
To demonstrate how DIVIDE can be employed in a semantic IoT network to perform context-aware homecare
monitoring, a detailed use case is presented in this section.
3.1. Use case description
The homecare monitoring use case scenario presented in this paper focuses on a rule-based service that recognizes
the activities of elderly people in their homes.
Use case background More and more people live with chronic illnesses and are followed up at home by various
healthcare actors such as their General Practitioner (GP), nursing organization, and volunteers. Patients in homecare
are increasingly equipped with monitoring devices such as lifestyle monitoring devices, medical sensors, localiza-
tion tags, etc. The shift to homecare makes it important to continuously assess whether an alarming situation occurs
at the patient. If an alarm is generated, either automatically or initiated by the patient, a call operator at an alarm
center should decide which intervention strategy is required. By reasoning on the measured parameters in combi-
nation with the medical domain knowledge, a system could help a human operator with choosing the optimal
intervention strategy.
A core building block of a homecare monitoring solution is an autonomous activity recognition (AR) service
that detects and recognizes different in-home activities performed by the patient. Moreover, it should also monitor
whether ongoing activities belong to a known regular routine of the patient, so that anomalies in the patient’s daily
activity pattern can be detected. Such a service could make use of the data collected by the different sensors and
devices installed in the patient’s home environment, as well as knowledge about AR rules and known routines of
the patient. Given the heterogeneous nature of these different data sources, Semantic Web technologies are ideally
suited to create this autonomous AR service.
Details of the activity recognition service The use case of routine and non-routine AR has been designed together
with the home monitoring company Z-Plus. To properly perform knowledge-driven AR, AR rules should be known
by the system. Z-Plus helped us with designing the rules.
An AR rule can be defined as a set of one or more value conditions defined on certain observable properties that
are being analyzed for a certain entity type. An observable property is any property that can be measured by a sensor
in the patient’s environment, e.g., temperature, relative humidity, power consumption, door status (open vs. closed),
indoor location, etc. Every sensor analyzes its property for a specific entity. Examples of analyzed entities are a
room (e.g., for a humidity sensor), an electrical appliance such as a cooking stove (e.g., for a power consumption
sensor), a cupboard (e.g., for a door contact sensor), or even the patient (e.g., for a wearable sensor).
In a realistic home environment with a wide range of sensors installed, many different AR rules will be defined.
This makes it highly inefficient to continuously monitor all possible activities that can be recognized in the home,
since this would require the continuous monitoring of all sensors that observe a certain property for an entity type
associated with at least one rule. Hence, the AR service performs location-dependent activity monitoring: it only
observes activities that are relevant to the room that the patient is currently located in. To enable this, an indoor
location system should be installed that unambiguously knows the current room of the patient at every point in time.
The activities relevant to the current room can be derived by considering all sensors that analyze this room or an
entity in the room: all activity rules should be evaluated that have conditions (i) on observable properties that are
measured by these sensors, and (ii) that are defined for the same entity type as analyzed by those sensors.
Activities recognized by the AR service should be labeled as belonging to the regular routine of this patient
or not. If an ongoing activity in the patient’s routine is recognized, the situation is normal and requires no stricter follow-up. Ideally, as long as an activity is going on, location changes in the home are less probable and
should therefore be monitored less frequently. However, if an activity outside the routine of the patient is being
detected, more strict location monitoring is required since the situation is abnormal. If necessary, an alarm should
automatically be generated by the system. To implement such a system, knowledge on the existing routines of the
patient at different times of the day should exist.
Finally, an important requirement of the AR service is that it reduces the information that leaves the patient’s
home environment to a minimum, as a first step in preserving the patient’s privacy. This implies that no actual raw
sensor data should be sent over the network. To enable this, the AR service should largely run in-home, so that only
the actual outputs such as detected activities are being sent. Obviously, any data that does leave the patient’s home should always be sent over a secure, encrypted connection.
Running example To facilitate the methodological description of DIVIDE in Sections 4, 5 and 6, consider the
following illustrative running example derived from the presented homecare monitoring use case.
Consider a smart home with an indoor location system detecting in which room the patient is present, and
an environmental sensor system measuring the relative humidity in every room of the home. The smart home
consists of multiple rooms including one bathroom. The patient living in the home has a morning routine that
includes showering. To keep it simple, the AR service of the running example consists of a single rule. This
rule detects when a person is showering, and is formulated as follows:
A person is showering if the person is present in a bathroom with a relative humidity of at least 57%.
This is a rule with a single condition, defined on a crossed lower threshold for the relative humidity observ-
able property, for the bathroom entity type. Hence, given the presence of a humidity sensor in the patient’s
bathroom, the showering activity will be monitored by the AR service if the patient is located in the bathroom.
3.2. Activity recognition ontology
An Activity Recognition ontology has been designed to support the described use case scenario. This Activ-
ity Recognition ontology is linked to the DAHCC (Data Analytics for Healthcare and Connected Care) ontology
[63], which is an in-house designed ontology that includes different modules connecting data analytics to health-
care knowledge. Specifically for the purpose of this semantic use case, it is extended with a module KBActivi-
tyRecognition supporting the knowledge-driven recognition of in-home activities.
The DAHCC ontology contains five main modules. The SensorsAndActuators and SensorsAndWear-
ables modules describe the concepts that allow defining the observed properties, location, observations and/or
actions of different sensors, wearables and actuators in a monitored environment such as a smart patient home. The
MonitoredPerson and CareGiver modules contain concepts for the definition of a patient monitored inside
a residence and the patient’s caregivers. The ActivityRecognition module allows describing the activities
performed by a monitored person that are predicted by an AR model.
The DAHCC ontology bridges the concepts of multiple existing ontologies in the data analytics and health-
care domains. These ontologies include SAREF (the Smart Applications REFerence ontology) [26] and its exten-
sions SAREF4EHAW (SAREF extended with concepts of the eHealth Ageing Well domain) [37], SAREF4BLDG
(an extension for buildings and building spaces) and SAREF4WEAR (an extension for wearables), as well as the
Execution–Executor–Procedure (EEP) ontology [33].
Listing 1 shows how a knowledge-based AR model can be defined and configured. In the example, it is configured
according to the use case’s running example, i.e., with one activity rule for showering. Lines 13–17 of this listing
contain the definition of the single condition of this rule.
In Section A.1 of Appendix A, additional listings detail multiple other definitions within the Activity Recognition
ontology that support the knowledge-driven AR use case and its running example. This includes the ontological
definitions that can be used by a semantic reasoner to define whether an activity prediction corresponds to a person’s
routine, as well as the semantic description of the example patient and home in the running use case example.
1 # define knowledge-based activity recognition model and its config with a specific rule
2 :KBActivityRecognitionModel rdf:type ActivityRecognition:ActivityRecognitionModel ;
3 eep:implements :KBActivityRecognitionModelConfig1 .
4 :KBActivityRecognitionModelConfig1 rdf:type ActivityRecognition:Configuration ;
5 :containsRule :showering_rule .
6
7 # define showering activity rule: detected by one specific condition
8 :showering_rule rdf:type :ActivityRule ;
9 ActivityRecognition:forActivity [ rdf:type ActivityRecognition:Showering ] ;
10 :hasCondition :showering_condition .
11
12 # define showering condition: relative humidity in the bathroom should be at least 57%
13 :showering_condition rdf:type :RegularThreshold ;
14 :forProperty [ rdf:type SensorsAndActuators:RelativeHumidity ] ;
15 Sensors:analyseStateOf [ rdf:type SensorsAndActuators:BathRoom ] ;
16 :isMinimumThreshold "true"^^xsd:boolean ;
17 saref-core:hasValue "57"^^xsd:float .
18
19 # define in system that conditions can be defined on relative humidity in a room
20 SensorsAndActuators:RelativeHumidity rdfs:subClassOf :ConditionableProperty .
21 SensorsAndActuators:Room rdfs:subClassOf :AnalyzableForCondition .
Listing 1. Example of how a knowledge-based AR model with an activity rule for showering can be described through triples in the KBActivityRecognition ontology module. All definitions are listed in RDF/Turtle syntax. To improve readability, the KBActivityRecognition: prefix is replaced by the : prefix.
Fig. 1. Architectural set-up of the eHealth use case scenario.
3.3. Architectural use case set-up
To implement the use case scenario of a knowledge-driven routine and non-routine AR service, a cascading
reasoning architecture is used [27]. An overview of the architectural cascading reasoning set-up for this use case
is shown in Fig. 1. This architecture is generic and can be applied to different use case scenarios in the healthcare
domain with similar requirements.
The architecture of the system is split up in a local and a central part. The local part consists of a set of components
that are running on a local device in the patient’s environment. This device could be any existing gateway that is
already installed in the patient’s home, such as the device for a deployed nurse call system. The local components
are the Semantic Mapper and an RSP Engine. The components of the central part are deployed on a back-end server
of an associated nursing home or hospital. They consist of a Central Reasoner, DIVIDE, and a Knowledge Base.
Knowledge Base The Knowledge Base contains the semantic representation of all domain knowledge and context
data in the system, in an RDF-based knowledge graph. In the given use case scenario, this domain knowledge con-
sists of the Activity Recognition ontology that is discussed in Section 3.2. It includes the AR model with its activity
rules. The contextual information describes the different smart homes and their installed sensors, and patients.
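For the running example, such context could for instance be described with triples of the following form (an illustrative sketch using the vocabulary that appears in Listing 1 and later in Listing 3; the instance IRIs are hypothetical):

# Hypothetical context triples for the running example (sketch).
:patient1 MonitoredPerson:hasIndoorLocation :bathroom1 .
:bathroom1 rdf:type SensorsAndActuators:BathRoom .
:humiditySensor1 rdf:type saref-core:Device ;
    saref-core:measuresProperty :relativeHumidityBathroom1 ;
    Sensors:isRelevantTo :bathroom1 ;
    Sensors:analyseStateOf :bathroom1 .
:relativeHumidityBathroom1 rdf:type SensorsAndActuators:RelativeHumidity .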
Semantic Mapper The Semantic Mapper semantically annotates all raw observations generated by the sensors in
the patient’s environment. These semantic sensor observations are forwarded to the data streams of the RSP Engine.
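For instance, a raw relative humidity reading from the bathroom sensor could be annotated as an observation of the following form (an illustrative sketch following the measurement pattern that the derived RSP queries filter on, e.g., in Listing 3; the instance IRIs and values are hypothetical):

# Hypothetical semantically annotated observation produced by the Semantic Mapper (sketch).
:humiditySensor1 saref-core:makesMeasurement [
    saref-core:relatesToProperty :relativeHumidityBathroom1 ;
    saref-core:hasValue "61.2"^^xsd:float ;
    saref-core:hasTimestamp "2023-03-01T08:15:00"^^xsd:dateTime ] .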
RSP Engine The RSP engine continuously evaluates the registered queries on the RDF data streams, to filter
relevant events. In this use case scenario, the filtered events are in-home locations and recognized activities both in
and not in the patient’s routine. Only these filtered events are encrypted and sent over the network to the Central
Reasoner. By applying the cascading reasoning principles and installing the RSP Engine locally in the patient’s
environment, a first step in preserving the patient’s privacy can be taken.
Central Reasoner The Central Reasoner is responsible for further processing the events received from the RSP
Engine, and acting upon them. For example, it can aggregate the filtered events and save them to use for future
call enrichment, or send an alarm to the patient’s caregivers when necessary. In general, any action is possible,
depending on what additional components are deployed and implemented on the central node.
Importantly, the Central Reasoner will also update relevant contextual information in the Knowledge Base, such
as events occurring in the patients’ environment. This information can then trigger a re-evaluation of the queries
deployed on the local RSP engines. In the given use case scenario, relevant context changes that trigger a possible
change in the deployed RSP queries are location updates and detected activities. When the in-home location of the
patient changes, the set of activities that need to be monitored changes as well, since the AR service is location-
dependent. Moreover, context information about recognized ongoing routine and non-routine activities directly
defines the execution frequency of the location monitoring RSP query.
DIVIDE DIVIDE is the component that manages the queries executed by the local RSP Engine components. It
updates the queries whenever triggered by context updates in the Knowledge Base. By aggregating contextual infor-
mation with medical domain knowledge through semantic reasoning during the query derivation, the resulting RSP
queries only involve filtering and do not require any real-time reasoning anymore. Moreover, it allows dynamically managing the window parameters of the queries (i.e., the size of the data window and its sliding step) based on the
current context. It is fully automated and adaptive, so that at all times, relevant queries are being executed given the
context information about the patients in the Knowledge Base.
In the running example, DIVIDE will ensure that there is always a location monitoring query running on the RSP
Engine component installed in the patient’s home. The window parameters of this query will depend on whether
or not an activity is currently going on, and whether or not this activity belongs to the current patient’s routine. In
addition, when the patient is located in the bathroom, an additional RSP query will be derived and installed that
monitors when the patient is showering. When the query detects this activity, this would be considered a recognized
routine activity as showering is included in the patient’s morning routine.
4. Overview of the DIVIDE system
In Section 3, the general cascading reasoning architecture of the semantic system in the eHealth use case scenario
is explained. This section zooms in on DIVIDE, the architectural component responsible for managing the queries
running on the local RSP Engine components. It is the task of the DIVIDE system to ensure that these queries
perform the relevant filtering given the current context, at any given time, for every RSP Engine known to DIVIDE.
The methodological design of DIVIDE consists of two main pillars: (i) the initialization of DIVIDE, involving the DIVIDE query parsing and ontology preprocessing steps, and (ii) the core of DIVIDE, which is the query derivation. Figure 2 shows a schematic overview of the action steps, inputs and internal assets of DIVIDE, in which the two main pillars can be distinguished. The following two sections, Section 5 and Section 6, provide more information on
this initialization and query derivation, respectively. Throughout the descriptions of DIVIDE in these sections, the
running eHealth use case example described in Section 3.1 is considered.
In terms of logic, DIVIDE works with the rule-based Notation3 Logic (N3) [14]. The semantic reasoner used
within DIVIDE should thus be a reasoner supporting N3. Such a reasoner can reason within the OWL 2 RL profile
[53], which implies that a semantic system that uses DIVIDE in combination with an RSP engine is equivalent to a
set-up involving a semantic OWL 2 RL reasoner. The reasoner should support the generation of all triples based on
a set of input triples and rules, as well as generating a proof towards a certain goal rule. Such a proof should contain
the chain of all rules used by the reasoner to infer new triples based on its inputs, described in N3 logic.
Fig. 2. Schematic overview of the DIVIDE system. It shows all actions that can be performed by DIVIDE with their inputs and outputs.
A distinction is made between internal assets and external inputs to the system. The overview is split up in the two major parts: the inputs, steps
and assets of the DIVIDE initialization, and those of the DIVIDE query derivation.
5. Initialization of the DIVIDE system
The core task of DIVIDE is the derivation and management of the queries running on the RSP engines of the
semantic components in the system that are known to DIVIDE. To allow DIVIDE to effectively and efficiently
perform the query derivation for one or more components upon context changes, different initialization steps are
required. Three main steps can be distinguished from the upper part of the DIVIDE system overview in Fig. 2: (i)
parsing and initializing the DIVIDE queries, (ii) preprocessing the system ontology, and (iii) initializing the DIVIDE
components. This section zooms in on each of these three initialization tasks.
5.1. Initialization of the DIVIDE queries
A DIVIDE query is a generic definition of an RSP query that should perform a real-time processing task on the
RDF data streams generated by the different local components in the system. The goal of DIVIDE is to instantiate
this query in such a way that it can perform this task in a single query that simply filters the RDF data streams. To
this end, the internal representation of a DIVIDE query contains a goal, a sensor query rule with a generic query
pattern, and a context enrichment. These three items are essential for correctly deriving the relevant queries during
the query derivation process. They will each be explained in detail in the first three subsections of this section.
In the running example, there is one RSP query that actively monitors the location of the patient in the home,
and one query that detects a showering activity when the patient is located in the bathroom. This subsection
will focus on the latter, which is an example of an actual AR query. Within DIVIDE, a generic DIVIDE query
will be defined for each type of activity rule present in the system. This means that no dedicated DIVIDE
query per activity should be defined, which would be too cumbersome and highly impractical in a real-world
deployment. A rule type is a specific combination of conditions and the type of value they are defined on. For
the showering rule, this means that the type is defined as follows: a rule with a single condition on a lower
regular threshold that should be crossed. This means that the detailed specific RSP queries corresponding to
activity rules of the same type will all be derived from the same generic DIVIDE query. The generic DIVIDE
query corresponding to the type of the showering activity rule will be used as the running example DIVIDE
query in this section. Note that the running example will only focus on the detection of this activity in the
patient’s routine.
{
?p rdf:type KBActivityRecognition:RoutineActivityPrediction ;
ActivityRecognition:forActivity [ rdf:type ?activityType ] ;
ActivityRecognition:activityPredictionMadeFor ?patient ;
ActivityRecognition:predictedBy ?model ; saref-core:hasTimestamp ?t .
?activityType rdfs:subClassOf KBActivityRecognition:DetectableActivity .
}=>{
_:p rdf:type KBActivityRecognition:RoutineActivityPrediction ;
ActivityRecognition:forActivity [ rdf:type ?activityType ] ;
ActivityRecognition:activityPredictionMadeFor ?patient ;
ActivityRecognition:predictedBy ?model ; saref-core:hasTimestamp ?t .
}.
Listing 2. Goal of the generic DIVIDE query detecting an ongoing activity in a patient’s routine.
5.1.1. Goal
The goal of a DIVIDE query defines the semantic output that should be filtered by the resulting RSP query. This
required query output is translated to a valid N3 rule. This rule is used in the DIVIDE query derivation to ensure
that the resulting RSP query is filtering this required RSP query output.
For the generic query definition corresponding to the RSP query that detects the showering activity in the
running example, the goal is specified in Listing 2. It is looking for any instance of a RoutineActivityPrediction.
5.1.2. Sensor query rule with generic query pattern
The sensor query rule is the core of the DIVIDE query definition. It is a complex N3 rule that defines the generic
pattern of the RSP query, together with semantic information on when and how to instantiate it. Its usage by the
semantic rule reasoner during the DIVIDE query generation defines whether or not this generic query should be
instantiated given the involved context.
The formalism of the sensor query rule builds further on SENSdesc, which is the result of previous research [6].
This theoretical work was the first step in designing a format that describes an RSP query in a generic way that can
be combined with formal reasoning to obtain the relevant queries that filter patterns of interest. By generalizing this
format and integrating it into DIVIDE, it has become practically usable.
Each sensor query rule consists of three main parts: the relevant context in the rule’s antecedence, and the generic
query and ontology consequences defined in the rule’s consequence.
Relevant context In the antecedence of the sensor query rule, the context in which the generic RSP query might become
relevant is generically described. For each set of query variables for which the antecedence is valid, there is a chance
that the rule, instantiated with these query variables, will appear in the proof constructed by the semantic reasoner
during the query derivation. If this is the case, the query will be instantiated for this set of variables.
To explain the different parts, consider the DIVIDE query corresponding to the running example detecting the
showering activity. Listing 3 defines the sensor query rule for the corresponding type of activity rule. The rule’s
antecedence with the relevant context of the sensor query rule is described in lines 2–24. In short, it looks for
AR rules relevant to the current room of the patient, following the definition of location-dependent activity
monitoring in Section 3.1.
1{
2 ?model rdf:type ActivityRecognition:ActivityRecognitionModel ;
3 <https://w3id.org/eep#implements> [ rdf:type ActivityRecognition:Configuration ;
4 KBActivityRecognition:containsRule ?a ] .
5 ?a rdf:type KBActivityRecognition:ActivityRule ;
6 ActivityRecognition:forActivity [ rdf:type ?activityType ] ;
7 KBActivityRecognition:hasCondition [
8 rdf:type KBActivityRecognition:RegularThreshold ;
9 KBActivityRecognition:isMinimumThreshold "true"^^xsd:boolean ;
10 saref-core:hasValue ?threshold ;
11 Sensors:analyseStateOf [ rdf:type ?analyzed ] ;
12 KBActivityRecognition:forProperty [ rdf:type ?prop ]
13 ] .
14
15 ?activityType rdfs:subClassOf KBActivityRecognition:DetectableActivity .
16
17 ?sensor rdf:type saref-core:Device ; saref-core:measuresProperty ?prop_o ;
18 Sensors:isRelevantTo ?room ; Sensors:analyseStateOf [ rdf:type ?analyzed ] .
19 ?prop_o rdf:type ?prop .
20
21 ?prop rdfs:subClassOf KBActivityRecognition:ConditionableProperty .
22 ?analyzed rdfs:subClassOf KBActivityRecognition:AnalyzableForCondition .
23
24 ?patient MonitoredPerson:hasIndoorLocation ?room .
25 }
26 =>
27 {
28 _:q rdf:type sd:Query ;
29 sd:pattern sd-query:pattern ;
30 sd:inputVariables (("?sensor" ?sensor) ("?threshold" ?threshold)
31 ("?activityType" ?activityType)
32 ("?patient" ?patient) ("?model" ?model) ("?prop_o" ?prop_o)) ;
33 sd:windowParameters (("?range" 30 time:seconds) ("?slide" 10 time:seconds)) .
34
35 _:p rdf:type ActivityRecognition:ActivityPrediction ;
36 ActivityRecognition:forActivity [ rdf:type ?activityType ] ;
37 ActivityRecognition:activityPredictionMadeFor ?patient ;
38 ActivityRecognition:predictedBy ?model ; saref-core:hasTimestamp _:t .
39 } .
40
41 sd-query:pattern rdf:type sd:QueryPattern ;
42 sh:prefixes sd-query:prefixes-activity-showering ;
43 sh:construct """
44 CONSTRUCT {
45 _:p a KBActivityRecognition:RoutineActivityPrediction ;
46 ActivityRecognition:forActivity [ a ?activityType ] ;
47 ActivityRecognition:activityPredictionMadeFor ?patient ;
48 ActivityRecognition:predictedBy ?model ; saref-core:hasTimestamp ?now .
49 }
50 FROM NAMED WINDOW :win ON <http://protego.ilabt.imec.be/idlab.homelab>
51 [RANGE ?{range} STEP ?{slide}]
52 WHERE {
53 BIND (NOW() as ?now)
54 WINDOW :win { ?sensor saref-core:makesMeasurement [
55 saref-core:hasValue ?v ; saref-core:hasTimestamp ?t ;
56 saref-core:relatesToProperty ?prop_o ] .
57 FILTER (xsd:float(?v) > xsd:float(?threshold)) }
58 }
59 ORDER BY DESC(?t) LIMIT 1""" .
60
61 sd-query:prefixes-activity-showering rdf:type owl:Ontology ;
62 sh:declare [ sh:prefix "xsd" ;
63 sh:namespace "http://www.w3.org/2001/XMLSchema#"^^xsd:anyURI ] ;
64 sh:declare [ sh:prefix "saref-core" ;
65 sh:namespace "https://saref.etsi.org/core/"^^xsd:anyURI ] ;
66 sh:declare [ sh:prefix "ActivityRecognition" ; sh:namespace
67 "https://dahcc.idlab.ugent.be/Ontology/ActivityRecognition/"^^xsd:anyURI ] ;
68 sh:declare [ sh:prefix "KBActivityRecognition" ;
69 sh:namespace "https://dahcc.idlab.ugent.be/Ontology/ActivityRecognition/
KBActivityRecognition/"^^xsd:anyURI ] .
Listing 3. Sensor query rule of the generic DIVIDE query detecting an ongoing activity in a patient’s routine, for an activity rule type with a
single condition on a certain property of which a value should cross a lower threshold.
Generic query The generic query definition is contained inside the consequence of the sensor query rule. It consists
of three main aspects: the generic query pattern, its input variables, and its static window parameters.
The generic query pattern is a string representation of the actual RSP-QL query that will be the result of the
DIVIDE query derivation. This pattern is however still generic: some of its query variables still need to be substituted
by actual values to obtain the correct and valid RSP-QL query. Similarly, the window parameters of the input stream
windows of the RSP-QL query also need to be substituted.
The input variables that need to be substituted by the semantic reasoner in the generic query pattern are defined as an N3 list. Every item in this list represents one input variable and is itself a list: the first item is the string literal of the variable in the generic query pattern to be substituted, and the second item is the query variable that should occur in the sensor query rule's antecedence so that it is instantiated by the semantic reasoner if the rule is applied in the proof during the query derivation.
Similarly, the definition of the static window parameters is also a list of lists. Static window parameters are
variables that should also be substituted by the semantic reasoner during the query derivation, but in the stream
window definition instead of the query body or output. They are static as their value is directly defined by the value
of the corresponding variable. Every item of the outer list is an inner list of three items. The first item represents
the string literal of the variable in a window definition of the generic query pattern. The second item can be either a query variable or a literal defining the value of the window parameter. If this is a query variable, it will be substituted
during the rule evaluation based on the matching value in the rule’s antecedence, similarly to the input variables.
The third item defines the unit of the value.
In Listing 3, the generic query definition is described in lines 28–33 and lines 41–69. More specifically,
lines 28–29 and lines 41–69 define the generic query pattern, whereas lines 30–32 and line 33 define the
input variables and static window parameters of the generic query, respectively.
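To summarise this structure, the relevant definitions of Listing 3 (lines 30–33) are repeated below with annotations; the comments are added here for illustration only and are not part of the original listing.

sd:inputVariables (("?sensor" ?sensor) ("?threshold" ?threshold)
                   ("?activityType" ?activityType) ("?patient" ?patient)
                   ("?model" ?model) ("?prop_o" ?prop_o)) ;
    # each inner list: (string literal in the generic query pattern, query variable from the rule's antecedence)
sd:windowParameters (("?range" 30 time:seconds) ("?slide" 10 time:seconds)) .
    # each inner list: (string literal in a window definition, value or query variable, unit of the value)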
Inspecting the example in Listing 3 in further detail, the generic RSP-QL query pattern string is defined in lines 44–59. The query filters observations on the defined stream data window :win of a certain sensor ?sensor with a value for the observed property ?prop_o that is higher than a certain threshold ?threshold (WHERE clause in lines 52–58). For every match of this pattern, output triples are constructed that represent an ongoing activity of type ?activityType in the routine of a patient ?patient, predicted by the activity recognition model ?model (CONSTRUCT clause in lines 44–49). These six variables are exactly the six input variables as defined in lines 30–32: their values will be instantiated during the query derivation. Note that the window parameter definitions specified in line 33 of Listing 3 define a window size of 30 seconds and a window sliding step of 10 seconds.
Ontology consequences The ontology consequences are the second main part of the sensor query rule’s conse-
quence. This part describes the direct effect of a query result in a real-time reasoning context. This effect is obtained when a stream window of the generic RSP query fulfills the pattern of the WHERE clause, but no additional reasoning has been performed (yet) to derive the indirect consequences of this matching pattern. This is an essential aspect to understand: the purpose of DIVIDE is to derive queries that draw conclusions that are valid within the given context, through a single RSP-QL query without any reasoning involved. In a context without DIVIDE, these same
indirect conclusions could only be made by performing an additional semantic reasoning step, based on the direct
conclusions that are directly known from the matching query pattern. In other words, the triples defining the ontol-
ogy consequences can be the same as the output of the generic RSP-QL query and thus the consequence of the rule
representing the DIVIDE query’s goal. However, in practice, it will often require an additional semantic reasoning
step to see whether the ontology consequences actually imply the output of the generic RSP-QL query.
In the running example, the direct consequences of a sensor observation matching the WHERE clause in
lines 52–58 of Listing 3 would be the fact that an ongoing activity of the given type is detected for the given
patient (lines 35–38). The indirect consequences represented by the definitions in the RSP-QL query output
(lines 45–48) state that this is an activity in the patient’s routine.
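As a minimal sketch of such an additional reasoning step, the hypothetical N3 rule below derives the indirect conclusion from the direct ontology consequences; the routine-related property KBActivityRecognition:hasRoutineActivity is purely illustrative and does not necessarily exist in the ontology.

# Hypothetical rule (illustration only): an activity prediction for an activity type
# that is part of the patient's routine is also a routine activity prediction.
{
    ?p rdf:type ActivityRecognition:ActivityPrediction ;
       ActivityRecognition:forActivity [ rdf:type ?activityType ] ;
       ActivityRecognition:activityPredictionMadeFor ?patient .
    ?patient KBActivityRecognition:hasRoutineActivity [ rdf:type ?activityType ] .
}
=>
{
    ?p rdf:type KBActivityRecognition:RoutineActivityPrediction .
} .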
5.1.3. Context enrichment
Prior to the start of the query derivation with the semantic reasoner, the current context can still be enriched by
executing one or more SPARQL queries on this context. The context enrichment of a DIVIDE query consists of this
ordered set of valid SPARQL queries.
It is important to note that context-enriching queries are not only used to add general context to the model, but
also for the dynamic window parameter substitution as will be explained in Section 6.5.
For the running example, no context-enriching queries are part of the DIVIDE query definition. However,
Appendix A discusses the definition of a related DIVIDE query that does include a context enrichment.
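To nevertheless give an idea of what a context-enriching query could look like, the sketch below marks sensors as relevant to the room they are located in (prefix declarations omitted); the property Sensors:isLocatedIn is a hypothetical placeholder, whereas Sensors:isRelevantTo occurs in the antecedence of the sensor query rule in Listing 3.

CONSTRUCT {
    ?sensor Sensors:isRelevantTo ?room .
}
WHERE {
    # hypothetical location property linking a sensor to the room it is installed in
    ?sensor Sensors:isLocatedIn ?room .
}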
5.1.4. DIVIDE query parser
To properly initialize DIVIDE, an end user is not required to define a DIVIDE query according to its internal representation. Instead, the recommended way to define a DIVIDE query is by specifying an ordered
collection of existing SPARQL queries that are applied in an existing rule-based stream reasoning system, or through
an already existing RSP-QL query. Through DIVIDE, this set of ordered queries will be replaced in the semantic
platform by a single RSP query that performs a semantically equivalent task. To enable this, DIVIDE contains a
query parser, which converts such an external DIVIDE query definition into its internal representation. The goal of this
approach is to make it easy for an end user to integrate DIVIDE into an existing semantic (stream) reasoning system,
without having to know the details of how DIVIDE works.
DIVIDE is applied in a cascading system architecture. It considers its equivalent regular (stream) reasoning sys-
tem as a semantic reasoning engine in which the set of SPARQL queries is executed sequentially on a data model
containing the ontology (TBox) triples and rules, context (ABox) triples, and triples representing the sensor obser-
vations in the data stream. Each query in the ordered collection, except for the final one, should be a CONSTRUCT
query, and its outputs are added to the data model on which (incremental) rule reasoning is applied before the next
query in the chain is executed.
The definition of a DIVIDE query as an ordered set of SPARQL queries includes a context enrichment with zero
or more context-enriching queries, exactly one stream query, zero or more intermediate queries, and at most one final query. Besides these queries, such a DIVIDE query definition also includes a set of stream windows
(required), a solution modifier (optional), and a variable mapping from stream query to final query (optional). The
remainder of this subsection will discuss these different inputs in this DIVIDE query definition.
Stream query and context enrichment In the ordered set of SPARQL queries, it is important that there is exactly
one query that reads from the stream(s) of sensor observations. This query is called the stream query. In some cases,
this query will be the first in the chain. If this is not the case, any preceding queries are defined as context-enriching
queries in the DIVIDE query definition. Importantly, the WHERE clause conditions of the stream query should be
part of named graphs defined as data inputs with a FROM clause, except for special SPARQL constructs such as a
FILTER or BIND clause. The IRIs of the named graphs are used to distinguish which data is considered as part of
the context, and which data will be put on the data stream. For the data streams, the named graph IRI should reflect
the stream IRI. This stream IRI should also be defined as a stream window.
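The sketch below outlines the shape of such a stream query for the running example (prefix declarations omitted); the context graph IRI is hypothetical, while the stream graph IRI matches the stream window of Listing 3 and all properties are taken from that listing.

CONSTRUCT {
    _:p a ActivityRecognition:ActivityPrediction ;
        ActivityRecognition:activityPredictionMadeFor ?patient .
}
FROM NAMED <http://example.org/context/patient>          # context named graph (hypothetical IRI)
FROM NAMED <http://protego.ilabt.imec.be/idlab.homelab>  # named graph reflecting the stream IRI
WHERE {
    GRAPH <http://example.org/context/patient> {          # context patterns
        ?patient MonitoredPerson:hasIndoorLocation ?room .
        ?sensor Sensors:isRelevantTo ?room ;
                saref-core:measuresProperty ?prop_o .
        ?rule KBActivityRecognition:hasCondition [ saref-core:hasValue ?threshold ] .
    }
    GRAPH <http://protego.ilabt.imec.be/idlab.homelab> {   # stream patterns
        ?sensor saref-core:makesMeasurement [
            saref-core:hasValue ?v ;
            saref-core:relatesToProperty ?prop_o ] .
    }
    FILTER (xsd:float(?v) > xsd:float(?threshold))          # special constructs outside the named graphs
}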
Final query The last query in the ordered set of SPARQL queries is called the final query in DIVIDE. A final query is optional: if it does not exist, the stream query is considered the final query.
Intermediate queries The intermediate queries are an ordered list of zero or more SPARQL queries. This list
contains those queries in the original set of SPARQL queries that are executed between the stream and final query.
Stream windows Each data stream window that should be included as input in the resulting RSP-QL query must be explicitly defined. It consists of a stream IRI, a window definition, and a set of default window parameter values.
The stream IRI represents the IRI of the data stream. This IRI should exactly match the name of a named graph
defined in the stream query. The window definition defines the specification of how the windows are created on
the stream. If the user wants to define variable window parameters, named variables should be inserted into the
places that will be instantiated during the query derivation. In DIVIDE, two types of variable window parameters
exist: static and dynamic window parameters. Static window parameters are substituted similarly to an input
variable during the DIVIDE query derivation. Hence, the variable name of this window parameter should appear in
the WHERE clause of the stream query, in a named graph that does not correspond to a stream window. This will
ensure that the variable name can be substituted as a regular input variable. During the DIVIDE query derivation,
dynamic window parameters are substituted before static parameters. A dynamic window parameter can be defined
in the output of a context-enriching query. In case no context-enriching query yields a value for the dynamic window
parameter variable, the value of the static window parameter with the same variable name will be substituted. If no
such static window parameter is defined, a default value will be used. Hence, for each such variable in the window
definition that is not defined as a static window parameter, this default value should be defined by the end user.
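For the running example, a stream window definition could be sketched as follows; the Turtle vocabulary used below is purely illustrative, since the concrete syntax of this part of the DIVIDE query definition is not shown in this section, but the stream IRI, window definition and default values correspond to Listing 3.

# Hypothetical sketch of a stream window definition:
[] sd:streamIri <http://protego.ilabt.imec.be/idlab.homelab> ;
   sd:windowDefinition "[RANGE ?{range} STEP ?{slide}]" ;
   sd:defaultWindowParameterValues (("?range" 30 time:seconds) ("?slide" 10 time:seconds)) .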
Solution modifier If the resulting RSP-QL query should have a SPARQL solution modifier, this can be included in
the DIVIDE query definition. Any unbound variable names in the solution modifier should be defined in a named
graph of the stream query’s WHERE clause that represents a stream window.
Variable mapping of stream to final query If a final query is specified, it often occurs that certain query variables in
both the stream and final query actually refer to the same individuals. To make sure that DIVIDE parses the DIVIDE
query input correctly, the mapping of these variable names should be explicitly defined. This is a required manual step. Often, the corresponding variables will have the same names, making this mapping trivial.
Parsing the end user definition of a DIVIDE query to its internal representation The DIVIDE query parser can
construct the goal, sensor query rule and context enrichment of a DIVIDE query from its end user definition. The
context enrichment requires no parsing, while the goal and sensor query rule are composed from the different inputs.
The goal of the DIVIDE query is directly constructed from the final query. If it is a CONSTRUCT query, the
content of the WHERE clause is put in the antecedence of the goal, while the content of the CONSTRUCT clause
represents the goal’s consequence. For any other query form, the WHERE clause of the final query is used for both
the goal’s antecedence and consequence. If no final query is available, the antecedence and consequence of the goal
are copied from the result of the stream query. If the stream query is not a CONSTRUCT query, the SELECT, ASK or
DESCRIBE result clause is first converted to a triple pattern containing all its unbound variables.
The sensor query rule is the most complex part to construct. In the standard case, disregarding any exceptions, the
antecedence of the rule is composed from all named graph patterns in the WHERE clause of the stream query that
do not represent a stream graph. The ontology consequences in the consequence of the sensor query rule are copied
from the stream query’s output. The generic RSP-QL query pattern is constructed from different parts. Its resulting
CONSTRUCT, SELECT, ASK or WHERE clause is directly copied from the result clause of the final query, or the
stream query if no final query is present. Its input stream window definitions are constructed using the defined stream
windows. The WHERE clause contains the content of the stream graphs in the stream query’s WHERE clause, and
the special SPARQL patterns that are not put inside a named graph pattern. If a solution modifier is specified, it is
appended to the generic RSP-QL query pattern. The input variables and window parameters of the sensor query rule
are derived by analyzing the stream query, final query and the variable mapping between both. Any intermediate
queries are converted to additional semantic rules that are appended to the main sensor query rule.
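As a minimal illustration of the latter conversion, assuming a hypothetical intermediate CONSTRUCT query (the triples used here are illustrative and not taken from the running example), the CONSTRUCT template becomes the consequence and the WHERE clause the antecedence of an additional rule:

# hypothetical intermediate CONSTRUCT query
CONSTRUCT { ?p a ActivityRecognition:ActivityPrediction . }
WHERE { ?p ActivityRecognition:predictedBy ?model . }

# corresponding semantic rule appended to the main sensor query rule (sketch)
{ ?p a ActivityRecognition:ActivityPrediction . } <= { ?p ActivityRecognition:predictedBy ?model . } is sketched as:
{ ?p ActivityRecognition:predictedBy ?model . } => { ?p a ActivityRecognition:ActivityPrediction . } .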
Finally, it is worth noting that a DIVIDE query can alternatively also be defined through an existing RSP-QL
query. Such a definition is quite similar to the definition described above, with a few differences. The main difference
is that by definition, no intermediate and final queries will be present since the original system already uses RDF
stream processing and individual RSP-QL queries. This means no variable mapping should be defined either. Hence,
this definition is typically simpler than the definition of a DIVIDE query as a set of SPARQL queries.
For the running use case example, the DIVIDE query that performs the monitoring of the showering activity
rule can be defined as an ordered set of SPARQL queries. The DIVIDE query parser will translate this definition
into the internal representation of this DIVIDE query, exactly as discussed in the previous subsections. This
end user definition is discussed in detail in Section A.2 of Appendix A.
5.2. Initialization of the DIVIDE ontology
To properly perform the query derivation, an ontology should be specified as input to DIVIDE by the end user.
During initialization, this ontology will be loaded into the system. By definition, this ontology is considered not
to change often during the system’s lifetime, in contrast with the context data. Therefore, the ontology should be
preprocessed by the semantic reasoner wherever possible. This will speed up the actual query derivation process,
since it avoids loading and processing the full ontology every time the DIVIDE query derivation is triggered.
To what extent the ontology can be preprocessed depends on the semantic reasoner used.
For the running example, the triples and axioms in the KBActivityRecognition module of the Activity
Recognition ontology are preprocessed, including the definitions in all its imported ontologies.
5.3. Initialization of the DIVIDE components
To properly initialize DIVIDE, it should have knowledge about the components it is responsible for. A component
is defined as an entity in the IoT network on which a single RSP engine runs. For each DIVIDE component, the
following information should be specified by an end user for the correct initialization of DIVIDE:
– The name of the graph (ABox) pattern in the knowledge base that contains the context specific to the entity that this component's RSP engine is responsible for. A typical example in the eHealth scenario is a graph pattern of a specific patient, containing all patient information.
– A list of any additional graph patterns in the knowledge base that contain context relevant to the entity that this component's RSP engine is responsible for. An example is generic information on the layout of the environment in which the patient's smart home is situated. Such context information is relevant to multiple components, and is therefore stored in separate graphs in the knowledge base.
– The type of the RSP engine of this component (e.g., C-SPARQL).
– The base URL of the RSP engine's server API. This API should support registering and unregistering RSP queries, and pausing and restarting an RSP stream. It will be used during the DIVIDE query derivation.
Upon initialization, all component information is processed and saved by DIVIDE. For every graph pattern asso-
ciated with at least one component, DIVIDE should actively monitor for any updates to this ABox in the knowledge
base, to trigger the query derivation for the relevant components when updates occur.
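As an illustration, the registration information for a single component could be summarised as follows; the Turtle vocabulary and the graph and API IRIs below are purely hypothetical, since the concrete configuration format of DIVIDE is not detailed here.

# Hypothetical sketch of the information registered for one DIVIDE component.
<component/patient157> a :Component ;
    :mainContextGraph <http://example.org/context/patient157> ;     # ABox graph with the patient-specific context
    :additionalContextGraph <http://example.org/context/homelab> ;  # shared context, e.g. the home environment
    :rspEngineType "C-SPARQL" ;                                     # type of the component's RSP engine
    :rspEngineServerApi <http://example.org/rsp-engine/api> .       # base URL of the RSP engine's server API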
6. DIVIDE query derivation
Whenever DIVIDE is alerted of a context change in the knowledge base, the DIVIDE query derivation is triggered
for every DIVIDE query. Based on the name of the updated ABox graph and the components known by the system,
DIVIDE knows for which components the query derivation process should be started. This process can be executed
independently, i.e., in parallel, for each combination of component and DIVIDE query. Hence, this section will
focus on the query derivation task for a single component and a single DIVIDE query.
The DIVIDE query of the running example, which performs the monitoring of the showering activity rule, will
be further used in this section to illustrate the query derivation process. The query derivation is triggered if any
relevant context for a given component is updated. For this example, this context consists of all information
about the patient and the smart home. Moreover, it also contains the output of the RSP queries: the in-home
patient location and the detected ongoing activities.
The DIVIDE query derivation task for one RSP engine and one DIVIDE query consists of several steps, which
are executed sequentially: (i) enriching the context, (ii) semantic reasoning on the enriched context to construct a
proof containing the details of derived queries and how to instantiate them, (iii) extracting these derived queries from
the proof, (iv) substituting the instantiated input variables in the generic RSP-QL query pattern for every derived
query, (v) substituting the window parameters in a similar way, and (vi) updating the active RSP queries on the
corresponding RSP engine. The input of the query derivation is the updated context, which consists of the set of
triples in the context graph(s) of the knowledge base that are associated with the given component’s RSP engine. In
the following subsections, the DIVIDE query derivation steps are further detailed. Figure 2 shows a schematic overview of these steps in its bottom part. For every step, the inputs and outputs are detailed in the figure.
6.1. Context enrichment
Prior to actually deriving the RSP queries for the given DIVIDE query, the context data model can still be enriched. This is done by executing the ordered set of context-enriching queries corresponding to the DIVIDE query, if there are any, with a SPARQL query engine, possibly after performing rule-based reasoning with the ontology axioms. The result of this step is a data model containing the original context triples and all triples in the output of the context-enriching queries. Note that the output of the context-enriching queries can also
contain dynamic window parameters to be used in the window parameter substitution step of the query derivation.
The generic DIVIDE query corresponding to the running example of detecting the showering activity does not
contain any context-enriching query. Hence, the updated context will directly be sent to the input of the next
step. In Section A.3 of Appendix A, two additional examples are discussed of DIVIDE queries related to the
running example that do contain context-enriching queries.
6.2. Semantic reasoning to derive queries
Starting from the enriched context data model, the semantic reasoner used within DIVIDE is run to perform the
actual query derivation. This way, the reasoner determines whether the DIVIDE query should be instantiated for the
given context. If so, it specifies with what values the input variables and static window parameters, as defined in the
query’s sensor query rule consequence, should be substituted in the generic query pattern of the DIVIDE query.
The inputs of the semantic reasoner in this step consist of the preprocessed ontology (i.e., all triples and rules
extracted from its axioms), the enriched context triples, the sensor query rule and the goal of the DIVIDE query.
Given these inputs, the reasoner performs semantic reasoning to construct and output a proof with all possible rule
chains in which the goal of the DIVIDE query is the final rule applied. Every such rule chain will be (partially)
different and correspond to a different set of instantiated query variables appearing in the goal’s rule.
To allow the semantic reasoner to construct a rule chain that starts from the context and ontology triples and ends
with the goal rule, the sensor query rule is crucial. If the inputs allow the reasoner to derive the set of triples in the
antecedence of the sensor query rule for a certain set of query variables, the rule can be evaluated for this set of
variables. However, the semantic reasoner will only actually evaluate the rule for this set and include it in the rule
chain, if the triples in the consequence of the sensor query rule (and more specifically, the part with the ontology
consequences) allow the semantic reasoner to derive the antecedence of the goal rule. This can be either directly
(i.e., without semantic reasoning) or indirectly (i.e., after rule-based semantic reasoning). If this is not the case, the
sensor query rule will not help the semantic reasoner in constructing a rule chain where the goal is the last rule
applied, for the given set of sensor query rule variables. Hence, if the proof contains an instantiation of the sensor
query rule for a given set of query variables, this implies that the generic RSP-QL query of this DIVIDE query
should be instantiated for this set. This should be done with those query variables of this set that are present in the
list of input variables or window parameters of the sensor query rule’s consequence.
To see why this process works, consider the DIVIDE query parser's translation of the ordered set of SPARQL queries in the end user DIVIDE query definition into its internal representation. If the original stream query in the SPARQL input yields a query result, the final query's WHERE clause might have a matching pattern, and thus
an output. This is equivalent to the potential evaluation of the sensor query rule in the proof, depending on whether
the sensor query rule’s consequence directly or indirectly leads to a matching antecedence of the goal rule.
When the query derivation is executed for the DIVIDE query of the running example, the inputs will include
the showering AR rule in Listing 1 that is defined in the preprocessed ontology. In the proof constructed by the semantic reasoner, the DIVIDE query's sensor query rule of Listing 3 would be instantiated once for the showering activity, if the current location of the patient is the bathroom. The step in the rule chain of the reasoner's proof in which this happens is shown in Listing 4. This proof shows that the relative humidity sensor with the given ID can detect the showering activity for the patient with ID 157 if its value is 57 or higher.
@prefix r: <http://www.w3.org/2000/10/swap/reason#>.
<#lemma3> a r:Inference;
r:gives {
_:sk_0 a sd:Query.
_:sk_0 sd:pattern sd-query:pattern.
_:sk_0 sd:inputVariables (
("?sensor" <https://dahcc.idlab.ugent.be/Homelab/SensorsAndActuators/70:ee:50:67:3e:78>)
("?threshold" "57"^^xsd:float)
("?activityType" ActivityRecognition:Showering)
("?patient" patients:patient157)
("?model" :KBActivityRecognitionModel)
("?prop_o" <https://dahcc.idlab.ugent.be/Homelab/SensorsAndActuators/org.dyamand.types.common.
RelativeHumidity>)
).
_:sk_0 sd:windowParameters (("?range" 30 time:seconds) ("?slide" 10 time:seconds)).
_:sk_1 a ActivityRecognition:ActivityPrediction.
_:sk_1 ActivityRecognition:forActivity _:sk_2.
_:sk_2 a ActivityRecognition:Showering.
_:sk_1 ActivityRecognition:activityPredictionMadeFor patients:patient157.
_:sk_1 ActivityRecognition:predictedBy :KBActivityRecognitionModel.
_:sk_1 saref-core:hasTimestamp _:sk_3.
};
r:evidence ( <#lemma8> [...] <#lemma31> );
[...]
r:rule <#lemma32>.
Listing 4. One step of the proof constructed by the semantic reasoner used in DIVIDE during the DIVIDE query derivation for the generic
DIVIDE query of the running use case example. It shows how the sensor query rule in Listing 3 is instantiated in the proof's rule chain. [...]
is a placeholder for omitted parts that are not of interest.
If the current context would describe a patient location other than the bathroom, or would not define showering as part of the routine of the patient with ID 157, the proof would not contain this sensor query rule instantiation.
6.3. Query extraction
The proof in the output of the semantic reasoning step can contain instantiations of the sensor query rule. If not,
the proof will be empty, since this means that the semantic reasoner has not found any rule chain that leads to an
instantiation of the goal rule. Every sensor query rule instantiation in the proof contains the list of input variables
and window parameters that need to be substituted into the generic RSP-QL query of the considered DIVIDE query.
In the query extraction step, DIVIDE will extract these definitions from every sensor query rule instantiation in the
proof. Hence, the output of this step is a set of zero, one or more extracted queries.
The query extraction happens through two forward reasoning steps with the semantic reasoner used in DIVIDE.
The outputs of both steps are combined to construct the output of the query extraction. The first reasoning step
extracts the relevant content from the sensor query rule instantiations in the proof. For each instantiation, this content
includes the instantiated input variables and window parameters, as well as a reference to the query pattern in which
they need to be substituted. The second forward reasoning step of the query extraction retrieves any defined window
parameters from the enriched context that are associated with the instantiated RSP-QL query pattern. Such window
parameters may have been added to the enriched context during the context enrichment step. They will be used as
dynamic window parameters during the window parameter substitution, while the window parameters occurring in
the sensor query rule instantiations are considered as static window parameters.
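A minimal sketch of what the first of these extraction rules could look like is given below; it is hypothetical and only illustrates the idea of copying the query instantiation details out of every inference step in the proof, with the lemma as subject (cf. the output in Listing 5).

@prefix r:   <http://www.w3.org/2000/10/swap/reason#> .
@prefix log: <http://www.w3.org/2000/10/swap/log#> .

# Hypothetical extraction rule (illustration only, not DIVIDE's actual rule set).
{
    ?lemma a r:Inference ;
           r:gives ?conclusion .
    ?conclusion log:includes {
        ?q sd:pattern ?pattern ;
           sd:inputVariables ?inputVariables ;
           sd:windowParameters ?windowParameters .
    } .
}
=>
{
    ?lemma a sd:Query ;
           sd:pattern ?pattern ;
           sd:inputVariables ?inputVariables ;
           sd:staticWindowParameters ?windowParameters .
} .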
For the running example, the output of the extraction for the proof step in Listing 4 is presented in lines 4–12 of Listing 5. Line 15 of this listing presents the output of the second step. For this query example, there are no dynamic window parameters, so the output of this second query extraction step defaults to an empty list.
In Section A.3 of Appendix A, a related example is presented that does include dynamic window parameters.
1 @prefix : <file:///home/divide/.divide/query-derivation/10-10-129-31-8175-/activity-ongoing/20211220
_194006/proof.n3#>.
2
3 # output of the first reasoning step of the query extraction
4 :lemma3 a sd:Query.
5 :lemma3 sd:inputVariables (
6 ("?sensor" <https://dahcc.idlab.ugent.be/Homelab/SensorsAndActuators/70:ee:50:67:3e:78>)
7 ("?threshold" "57"^^xsd:float) ("?activityType" ActivityRecognition:Showering)
8 ("?patient" patients:patient157) ("?model" :KBActivityRecognitionModel)
9 ("?prop_o" <https://dahcc.idlab.ugent.be/Homelab/SensorsAndActuators/org.dyamand.types.common.
RelativeHumidity>)
10 ).
11 :lemma3 sd:staticWindowParameters (("?range" 30 time:seconds) ("?slide" 10 time:seconds)).
12 :lemma3 sd:pattern sd-query:pattern.
13
14 # output of the second reasoning step of the query extraction
15 :lemma3 sd:dynamicWindowParameters ().
Listing 5. Output of the query extraction step of the DIVIDE query derivation, performed for the running example on the proof with a single
sensor query rule instantiation presented in the proof step of Listing 4. The extraction of the dynamic window parameters (line 15) is done on
the enriched context produced by the context enrichment step.
6.4. Input variable substitution
In this step, DIVIDE substitutes the input variables of each query from the query extraction output into the
associated RSP-QL query pattern. To achieve this, a collection of N3 rules has been defined that allows substituting the input variables into the query body in a deterministic way. Moreover, these rules ensure that the substitution is
correct for IRIs and literals of any data type. To perform the substitution, the semantic reasoner used in DIVIDE
performs a forward reasoning step. The input of this reasoning step consists of the substitution rules, the output of
the query extraction step and the query pattern of the considered DIVIDE query. For each query in the query ex-
traction output, the output of this step consists of a set of triples that define the partially substituted RSP-QL query
body.
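Conceptually, each substitution boils down to replacing the string literal of an input variable in the generic query pattern by the serialisation of its instantiated value. The SPARQL snippet below illustrates this idea for the ?sensor input variable of the running example; it is a conceptual illustration only, not one of the N3 substitution rules actually used by DIVIDE, and the pattern string is truncated for brevity.

SELECT ?substitutedPattern
WHERE {
    VALUES (?pattern ?sensorIri) {
        ("?sensor saref-core:makesMeasurement [ saref-core:hasValue ?v ] ."
         <https://dahcc.idlab.ugent.be/Homelab/SensorsAndActuators/70:ee:50:67:3e:78>)
    }
    # replace the string literal "?sensor" by the instantiated IRI, wrapped in angle brackets
    BIND (REPLACE(?pattern, "\\?sensor", CONCAT("<", STR(?sensorIri), ">")) AS ?substitutedPattern)
}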
The output of the input variable substitution step for the running example is presented in Listing 6. This
substitution is performed using the generic RSP-QL query body referenced in the output of the query extraction
in Listing 5. This query body is shown in Listing 3. In the output, lines 1–13 redefine the prefixes, which will be
required in a further step to construct the full RSP-QL query. Line 16 shows the current state of the instantiated
RSP-QL query body: input variables have already been substituted, but the window parameters still need to be
substituted. The static and dynamic window parameters that will be used for substitution in the following step are propagated from the output of the query extraction step (lines 19–20).
6.5. Window parameter substitution
In this step, the window parameters are also substituted in the partially instantiated queries to obtain the resulting
RSP-QL query bodies. This is the final step that is performed by the semantic reasoner used in DIVIDE.
In general, DIVIDE offers the possibility to define the window parameters of derived RSP queries using se-
mantic definitions. Currently, context-aware window parameters can be defined by an end user via the definition
of a DIVIDE query. By separating the window parameter substitution from the other query derivation steps, DI-
VIDE offers the flexibility to trigger this substitution for reasons other than a context change. An example of this
could be a device monitor observing that the resources of the device cannot handle the current query execution
frequency.
Currently, to enable the substitution of use case dependent window parameters, DIVIDE makes the distinction
between static and dynamic window parameters. For a static window parameter, the variable behaves as a regular
input variable. This means that it should be defined in the consequence of a DIVIDE query’s sensor query rule with
a triple similar to the following one: