Description Logics and Rules for Multimodal
Situational Awareness in Healthcare
Georgios Meditskos, Stefanos Vrochidis, and Ioannis Kompatsiaris
Information Technologies Institute
Centre for Research and Technology - Hellas, Thessaloniki, Greece
{gmeditsk, stefanos, ikom}
Abstract. We present a framework for semantic situation understand-
ing and interpretation of multimodal data using Description Logics (DL)
and rules. More precisely, we use DL models to formally describe contex-
tualised dependencies among verbal and non-verbal descriptors in multi-
modal natural language interfaces, while context aggregation, fusion and
interpretation is supported by SPARQL rules. Both background knowl-
edge and multimodal data, e.g. language analysis results, facial expres-
sions and gestures recognized from multimedia streams, are captured in
terms of OWL 2 ontology axioms, the de facto standard formalism of DL
models on the Web, fostering reusability, adaptability and interoperabil-
ity of the framework. The framework has been applied in the eminent
field of healthcare, providing the models for the semantic enrichment and
fusion of verbal and non-verbal descriptors in dialogue-based systems.
Keywords: multimodal data, ontologies, rules, situation awareness
1 Introduction
A key requirement in multimodal domains is the ability to integrate the different
pieces of information (modalities), so as to derive high-level interpretations. More
precisely, in such environments, information is typically collected from multiple
sources and complementary modalities, such as from multimedia streams (e.g.
using video analysis and speech recognition), lifestyle and environmental sensors
[15]. Though each modality is informative on specific aspects of interest, the
individual pieces of information themselves are not capable of delineating com-
plex situations. Combined pieces of information on the other hand can plausibly
describe the semantics of situations, facilitating intelligent situation awareness.
In parallel, the demand for context-aware user task support has proliferated
in the recent years across a multitude of application domains, ranging from
healthcare and smart spaces to transportation and energy control. A key chal-
lenge in such applications is to abstract and fuse the captured context in order
to elicit an adequate understanding of user actions [5]. In healthcare, for exam-
ple, wearable and ambient sensors, coupled with profile information and clinical
knowledge can be used to improve the quality of life of care recipients and pro-
vide useful insights to clinical experts for personalized interventions and care
solutions [23].
Given the inherent requirement in multimodal environments to aggregate
low-level information and integrate domain knowledge, it comes as no surprise
that Semantic Web technologies have been acknowledged as affording a num-
ber of highly desirable features. More precisely, the OWL 2 ontology language
[10] has been extensively used to capture context elements (e.g. profiles, events,
activities, locations, postures and emotions) and their pertinent relations, map-
ping observations and domain knowledge to class and property assertions in
the Description Logics (DL) [3] theory, fostering integration of information at
various levels of abstraction and completeness [14]. The generated models encap-
sulate formal and expressive semantics, harvesting several benefits brought by
ontologies, e.g. modelling of complex logical relations, sharing information from
heterogeneous sources, sound and complete reasoning engines.
This paper describes an ontology-based framework for context awareness
and conversation understanding in multimodal natural language interfaces. The
framework allows the semantic enrichment of verbal and non-verbal information
coming from multiple devices and acquisition methods, e.g. from multimedia
analysis, following a knowledge-driven methodology for observation aggregation,
linking and situation interpretation. The contributions of our work can be sum-
marized in the following:
- We alleviate the lack of inherent temporal reasoning support in DL and OWL 2 by adopting an a-temporal approach for subsumption reasoning and multimodal data fusion based on time-windows.
- We propose an iterative combination of DL reasoning and rules to enhance the reasoning capabilities of the framework.
- We use SPARQL queries (CONSTRUCT graph patterns) as the underlying rule language of the framework, overcoming the lack of a standard rule language that runs directly on top of RDF and OWL ontologies.
- Due to the dynamic and open nature of ontologies, the framework is modality-agnostic, in the sense that it is not tied to specific domains and data sources but can be extended, adapted and used in a variety of situations.
We illustrate the capabilities of the framework through its integration into
a dialogue-based agent for conversational assistance in healthcare. More specifically, elderly users interact with the dialogue system (usually at home) to acquire information and suggestions related to basic care and healthcare (e.g. symptoms, treatments, etc.). A key challenge in this domain is the effective fusion of verbal and non-verbal
communication modalities, e.g. deictic gestures and spoken utterances, in order
to disambiguate and interpret user input during the interaction with the agent.
The rest of the paper is structured as follows: Section 2 begins with a basic
background on the DL theory and OWL ontologies. It continues with a discus-
sion on ontology-based context-aware solutions, explaining basic concepts and
challenges. Section 3 describes the proposed framework, providing details on
the representation and interpretation layers, as well as on the hybrid reasoning
scheme and the role of SPARQL. Section 4 explicates through an example use
case from the ongoing simulated evaluation of the framework in the healthcare
domain and Section 5 concludes the paper and outlines next steps.
2 Background and Related Work
2.1 Description Logics
Description Logics (DL [3]) is a family of knowledge representation formalisms
characterised by logically grounded semantics and well-defined reasoning tasks.
The main building blocks are concepts (or classes), representing sets of objects,
roles (or properties), representing relationships between objects, and individuals
(or instances) representing specific objects. Starting from atomic concepts, arbi-
trary complex concepts can be described through a rich set of constructors that
define the conditions of concept membership. DL provides, among others, constructs for concept inclusion (C ⊑ D), equality (C ≡ D) and assertion (C(a)), as well as role inclusion (R ⊑ S) and assertion (R(a, b)).
The semantics of a DL language is formally defined through an interpretation I that consists of a nonempty set Δ^I (the domain of interpretation) and an interpretation function ·^I, which assigns to every atomic concept A a set A^I ⊆ Δ^I and to every atomic role R a binary relation R^I ⊆ Δ^I × Δ^I. Table 1 shows the syntax and semantics of some of the most common DL constructors. For example, the class of all deictic gesture observations that point to the head can be defined as PointsToHeadGesture ≡ DeicticGesture ⊓ ∃hasBodyPart.{head}.
Besides formal semantics, DL comes with a set of powerful reasoning ser-
vices, for which efficient, sound and complete reasoning algorithms are available.
For example, through subsumption, one can derive implicit taxonomic relations
among concepts. Satisfiability and consistency checking are useful to determine
whether a knowledge base is meaningful at all. Instance realization returns all
concepts from the knowledge base that a given individual is an instance of.
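As a toy illustration of subsumption and realization (plain Python over asserted concept inclusions, not a real DL reasoner; the concept names are illustrative), subsumption can be computed as a reflexive-transitive closure, and realization then returns all concepts an individual belongs to:

```python
# Toy illustration of subsumption and realization over asserted
# concept inclusions (C ⊑ D); not a real DL reasoner.

def subsumers(concept, inclusions):
    """All concepts subsuming `concept` (reflexive-transitive closure)."""
    result, frontier = {concept}, [concept]
    while frontier:
        c = frontier.pop()
        for sub, sup in inclusions:
            if sub == c and sup not in result:
                result.add(sup)
                frontier.append(sup)
    return result

def realize(individual, assertions, inclusions):
    """All concepts the individual is an instance of."""
    asserted = {c for (i, c) in assertions if i == individual}
    return set().union(*(subsumers(c, inclusions) for c in asserted))

inclusions = [("DeicticGesture", "HandGesture"), ("HandGesture", "Observation")]
assertions = [("g1", "DeicticGesture")]
print(sorted(realize("g1", assertions, inclusions)))
# ['DeicticGesture', 'HandGesture', 'Observation']
```

Real DL reasoners derive subsumption from the full semantics of the constructors rather than from asserted inclusions alone; the closure above only conveys the shape of the reasoning task.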
2.2 OWL 2 Ontologies
An ontology is a set of precise descriptive statements about some part of the
world (usually referred to as the domain of interest). Precise descriptions satisfy
several purposes: most notably, they prevent misunderstandings in human com-
munication and they ensure that software behaves in a uniform, predictable way
and works well with other software.
Table 1. Examples of concept and role constructors in DL.

Name                         Syntax   Semantics
Intersection                 C ⊓ D    C^I ∩ D^I
Union                        C ⊔ D    C^I ∪ D^I
Universal Quantification     ∀R.C     {a ∈ Δ^I | ∀b. (a, b) ∈ R^I → b ∈ C^I}
Existential Quantification   ∃R.C     {a ∈ Δ^I | ∃b. (a, b) ∈ R^I ∧ b ∈ C^I}
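The set-theoretic semantics in Table 1 can be made concrete with a small sketch over a finite interpretation (plain Python; the concept and role names are illustrative, mirroring the deictic-gesture example above):

```python
# Set-theoretic semantics of DL constructors over a finite interpretation.
# Concepts are sets of individuals; roles are sets of (subject, object) pairs.

def intersection(c, d):          # C ⊓ D
    return c & d

def union(c, d):                 # C ⊔ D
    return c | d

def exists(role, c, domain):     # ∃R.C
    return {a for a in domain
            if any((a, b) in role and b in c for b in domain)}

def forall(role, c, domain):     # ∀R.C
    return {a for a in domain
            if all(b in c for b in domain if (a, b) in role)}

# Toy interpretation mirroring PointsToHeadGesture ≡
#   DeicticGesture ⊓ ∃hasBodyPart.{head}
domain = {"g1", "g2", "head", "arm"}
deictic_gesture = {"g1", "g2"}
has_body_part = {("g1", "head"), ("g2", "arm")}

points_to_head = intersection(deictic_gesture,
                              exists(has_body_part, {"head"}, domain))
print(points_to_head)  # {'g1'}
```

Only g1 points to the head, so only g1 falls under the defined concept; g2 is excluded because its hasBodyPart filler is not head.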
The Web Ontology language (OWL/OWL 2) [10] is a knowledge represen-
tation language widely used within the Semantic Web community for creating
ontologies. The design and semantics of OWL 2 have been strongly influenced
by DL2. Some basic notions are: a) axioms, the basic statements that an OWL
ontology expresses, b) entities, elements used to refer to real-world objects, and
c) expressions, combinations of entities to form complex descriptions.
In principle, every OWL 2 ontology is essentially a collection of such basic
“pieces of knowledge”. Statements that are made in an ontology are called axioms, and the ontology asserts that its axioms are true. However, despite its rich primitives, OWL 2 has certain limitations that stem from the DL-style model theory used to formalise its semantics, and in particular from the tree model property [16] that conditions DL decidability. For example, OWL 2 can model only domains where objects are connected in a tree-like manner. To overcome OWL's limited relational expressiveness, research has been devoted to the integration of OWL with rules (e.g. SWRL [12], SPIN [13]). User-defined rules on top of
the ontology allow expressing richer semantic relations that lie beyond OWL’s
expressive capabilities and couple ontological and rule knowledge [9].
2.3 Ontology-based Context Awareness and Fusion
Congruous with the open nature of context-awareness, where information at var-
ious levels of abstraction and completeness has to be integrated, ontologies have
attracted growing interest as means for modelling and reasoning over contex-
tual information in various domains [2][14]. For example, BeAware! [4] provides
a framework for context awareness in road traffic management; [26] proposes
an ontology-based framework for context-aware activity recognition in smart
homes. A survey on context awareness from an IoT perspective is presented in
[17], whereas challenges and opportunities in applying Semantic Web technolo-
gies in context-aware pervasive applications are discussed in [27].
A common characteristic in all cases above is the use of ontologies for domain
modelling. Ontology languages, such as OWL 2, share a common understanding
of the structure and semantics of information, enabling knowledge reuse and
inferencing. Capitalizing on the expressivity of the models, several approaches
define one or more interpretation layers in order to elicit an understanding of
the situation. For example, in the domain of natural language interfaces and
dialogue-based systems [24], ontologies provide the vocabulary and semantics
for content disambiguation [6][7], such as WordNet and BabelNet. Ontologies have been also used in NLP information extraction contexts for coreference
resolution in textual input [22][19]. In the domain of multimodal fusion, ontolo-
gies are used to fuse multi-level contextual information [8]. For example, [18]
presents a framework for coupling audio-visual cues with multimedia ontologies.
(In this paper, OWL 2 is used to refer to OWL 2 DL ontologies interpreted using the Direct Semantics [20].)
Relevant approaches are also described in [1] for various multimedia analysis
tasks. SmartKom [25] partially uses ontologies to fuse information in multi-
modal dialogue systems, combining speech, gesture and facial expressions.
Similar to the aforementioned approaches, we use OWL 2 ontologies for modelling context types and their relationships in terms of DL class constructors. However, we argue that the constructors provided by DL, and hence by OWL 2, are sometimes inadequate to facilitate effective multimodal fusion. Certain modelling and reasoning limitations, such as the tree-model property mentioned above or the lack of temporal reasoning, render OWL 2 insufficient for practical fusion requirements, such as the assertion of property fillers for unconnected instances, as we demonstrate in Section 4. Our framework compensates for OWL 2's limited expressivity through a multi-tier hybrid scheme that interleaves context-aware DL reasoning for fusion and interpretation with SPARQL CONSTRUCT graph patterns [11] as the underlying rule language of the framework.
3 Semantic Fusion Framework
The aim of the Semantic Fusion Framework (SFF) is to aggregate context types
and couple them with background knowledge. SFF does not impose any restric-
tion on the modalities that can be fused, provided that the underlying ontologies
support their representation. As such, SFF consists of two core tiers:
- Representation tier: provides the knowledge structures needed to capture the semantics and structure of the various modalities, as well as the semantics of the domain model that drives the fusion task.
- Interpretation tier: implements the fusion logic, capitalizing on OWL 2 DL reasoning and custom interpretation rules that combine the available input and generate additional inferences.
The conceptual architecture of SFF is depicted in Fig. 1. In the following
sections, we further elaborate on the specifics of each tier.
Fig. 1. Conceptual architecture of the Semantic Fusion Framework (SFF): the representation layer (1st tier: vocabularies and patterns, current context) feeds the interpretation layer (2nd tier: DL reasoning and SPARQL).
3.1 Domain and Context Descriptors
As mentioned above, SFF is modality-agnostic since it is not tied to specific context types. In that sense, contextual information may be collected from a variety
of sources, such as ambient and wearable sensors (e.g. temperature and prox-
imity observations), multimedia analysis, such as text analysis (named entities
and concepts), video analysis (e.g. location, gestures), etc. All this information
needs to be mapped on domain entities to enable the derivation of contextual
descriptors that best satisfy and interpret the context.
We use the term “observation” to abstractly refer to the root of the context
type hierarchy. Fig. 2 depicts a lightweight vocabulary for modelling context
types. The ontology extends the leo:Event concept of LODE [21] to benefit
from existing vocabularies to describe events and observations. Property asser-
tions about the temporal extension of the observations and the agent (actor) are
allowed, reusing core properties of LODE. Fig. 2 also depicts the relationship between the upper-level domain and context models. More precisely, a Context class is provided, allowing one or more contains property assertions that refer to observations. In terms of DL semantics, the Context class is defined as:
Context ≡ ∃contains.Observation (1)
classifying instances with contains property assertions in Context. As we demon-
strate in Section 4, the adaptation of the framework in different domains involves
the extension of the Context concept, specifying the observation types that des-
ignate complex situations of interest that need to be recognized. Intuitively,
instances of the Context concept define sets of observations, designating the current context that needs to be classified and interpreted.
3.2 DL Reasoning and SPARQL
The interpretation tier defines the way atomic observations can lead to the
derivation of high-level interpretations. For this task, we group observations into
a single Context instance, creating the current context, which is then fed into
the DL reasoner for subsumption reasoning and context classification. In prin-
ciple, the current context is built taking into account the temporal extension of
observations, along with background information pertinent to the domain. We
present an example current context definition in Section 4.
Fig. 2. Upper-level domain and context structures in SFF (Observation, Context and dul:Agent).
However, apart from context classification, an important reasoning require-
ment in multimodal fusion is the propagation of property fillers among incoming
observations, e.g. the injection of the body part where a deictic gesture points
to, which is derived after fusion with spoken utterances. Due to the tree-model
property, DL reasoning is not able to update property fillers for unconnected
instances (observations). SFF uses SPARQL CONSTRUCT graph patterns to en-
rich the reasoning capabilities of the framework, implementing certain fusion
requirements, according to the entities and relations involved.
The hybrid reasoning algorithm is depicted in Fig. 3. Assuming that G is the RDF/OWL graph with context observations, Q is the set of all SPARQL CONSTRUCT graph patterns, R_DL is the OWL 2 DL reasoning module and R_SPARQL is the SPARQL query engine, the algorithm enriches G with additional interpretations. More specifically, the algorithm implements an iterative combination of DL reasoning and SPARQL query execution. Initially, the DL reasoning module is applied over G for subsumption reasoning and realization (line 2). The derivations are added back to G, which is then used as the underlying graph for the SPARQL reasoning module. When all SPARQL queries have been executed (lines 3 to 5), a reasoning iteration has been completed. The algorithm terminates when no new SPARQL inferences are derived after an iteration.
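A minimal Python analogue of this iterative scheme may look as follows; the DL step and the rule are simplified stand-ins (toy closures over triples represented as tuples), not real OWL or SPARQL engines:

```python
# Minimal sketch of the hybrid reasoning loop: alternate a DL-style
# materialisation step with rule (CONSTRUCT-like) steps until no rule
# derives anything new. r_dl and rule_context are illustrative stand-ins.

def r_dl(graph):
    """Toy subsumption step: every HandGesture is also an Observation."""
    return {(s, "a", "Observation")
            for (s, p, o) in graph if p == "a" and o == "HandGesture"}

def rule_context(graph):
    """Toy rule: anything containing an Observation is a Context."""
    observations = {s for (s, p, o) in graph if p == "a" and o == "Observation"}
    return {(s, "a", "Context")
            for (s, p, o) in graph if p == "contains" and o in observations}

def interpret(graph, rules):
    graph = set(graph)
    while True:
        graph |= r_dl(graph)              # line 2: DL materialisation
        new = set()
        for rule in rules:                # lines 3-5: rule execution
            new |= rule(graph) - graph
        if not new:                       # line 6: fixpoint reached
            return graph
        graph |= new

g = {("g1", "a", "HandGesture"), ("ctx", "contains", "g1")}
result = interpret(g, [rule_context])
print(("ctx", "a", "Context") in result)  # True
```

Note how the rule only fires after the DL step has materialised the Observation type for g1, illustrating why the two modules must be iterated rather than run once each.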
In effect, the reasoning algorithm consists of successive steps of OWL reasoning, materialisation and SPARQL query execution. SFF can be characterised as a loosely-coupled bidirectional hybrid scheme, in the sense that the two reasoning modules are separate: the materialisation results of OWL reasoning are sent to the SPARQL module, which executes the rules on top of this information, and the results of SPARQL are then sent back to the OWL reasoner. In principle, decidability in frameworks that combine ontologies and rules is ensured by allowing rule variables to bind only to explicitly named individuals. SPARQL follows the Closed-World Assumption (CWA), i.e. the available knowledge is considered a complete encoding of the domain of interest, and hence inherently satisfies this condition. Therefore, even though OWL reasoning follows the OWA, the SPARQL-based interpretation layer queries the underlying knowledge in a CWA-like manner. Also note that, when new individuals are generated during the execution of the SPARQL queries, unique URIs are enforced, ensuring that an individual generated in a previous reasoning cycle cannot be reintroduced in a subsequent one (through NOT EXISTS triple patterns). In Section 4 we present examples of such rules.

Require: G ≠ ∅, Q ≠ ∅
1: repeat
2:   G ← G ∪ R_DL(G)
3:   for all q ∈ Q do
4:     G ← G ∪ R_SPARQL(q, G)
5:   end for
6: until no new inferences are derived in the iteration
Fig. 3. The hybrid context interpretation algorithm.

4 Use Case: Reference Resolution

We describe the simulated evaluation of SFF that involves the conversation of users with a dialogue-based agent at a home or nursing environment in order to acquire treatment suggestions about problems they have. We describe the ontologies and rules needed to disambiguate referring expressions, taking into account non-verbal modalities, e.g. deictic gestures. In the simulated example, the user touches his head and says “It hurts here!”. By fusing the pointing gesture with the spoken utterance, the agent can conclude that the user has a headache and can provide relevant treatment suggestions.

4.1 Domain Ontologies

Fig. 4. Excerpt of the gesture and body part ontologies (concepts such as HandDiagonally and HeadDown, connected through pointsTo, is-a and isPartOf relations).

It is assumed that SFF acquires contextual information about body gestures (e.g. deictic gestures to the head) and verbal events (e.g. entities and concepts extracted through language analysis) through respective multimedia analysis modules (in this case, from video and audio data). Fig. 4 depicts the specialization of the Observation hierarchy (Fig. 2) for modelling gestures. The emphasis
is placed on the HandGesture concept that allows pointsTo property asser-
tions about the body part where the hand points to. Additional complex con-
cepts are defined (not visualised in Fig. 4) by composing existing contexts, e.g.
HeadReference ≡ HandGesture ⊓ ∃pointsTo.Head. As far as language analysis
is concerned, the current deployment capitalizes on the results of a frame-based
formalisation of natural language utterances using DOLCE-DnS Ultralite patterns. Fig. 5 depicts the relevant ontology. For example, a gesture event pointing
to the head can be represented as:
:g1 a :HandGesture;
:pointsTo [rdf:type :Head] ;
:atTime [...] .
which is further classified as HeadReference, based on the axiom defined above.
Likewise, the verbal event corresponding to the example can be represented as:
:fs1 a :InformSpeechAct, :PerceptionBodyFrameSituation ;
    dul:isSettingFor :h1 .
:h1 a :Hurt ;                     # :Hurt rdfs:subClassOf dul:Event
    dul:hasParticipant :d1 .
:d1 a :Here ;
    dul:isClassifiedBy :bp1, :sd1 .
:bp1 a :BodyPart .
:sd1 a :SpatialDeictic .
As illustrated, the example utterance is an InformSpeechAct about a physi-
cal experience (i.e. a PerceptionBodyFrameSituation), where the affected body
part, i.e. the object classified as BodyPart, is not named explicitly but instead
implied by a deictic referring expression.
Fig. 5. The upper level ontology for representing verbal analysis results.
4.2 Context Models and Fusion
As already discussed, SFF needs to build the current context. In our example,
whenever a FrameSituation is sent to the SFF framework, SPARQL queries
retrieve neighbouring events that overlap a fixed time interval around it, e.g.
[-2s, +2s]. The overlapped observations form the current context, which is fed
into the ontology reasoner to interpret it. The current context is defined as
CurrentContext ≡ Context ⊓ ∃contains.FrameSituation (2)
where Context is given by (1). In order to model the situation when the user
feels pain, CurrentContext is further specialised as:
PainContext ≡ CurrentContext ⊓ ∃contains.(FrameSituation ⊓ ∃isSettingFor.Hurt) (3)
According to (3), if the current context contains a FrameSituation that is
associated with a Hurt conceptualisation, it is classified in the PainContext
class. Assuming that fs1 is part of the current context, SFF interprets it as a
PainContext situation, since fs1 satisfies the complex class description in (3).
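The time-window grouping used above to build the current context can be sketched as follows (a hypothetical helper; the field names and the ±2s window are illustrative):

```python
# Build the current context: collect observations whose time interval
# overlaps a fixed window around an anchor FrameSituation.

def overlaps(a_start, a_end, b_start, b_end):
    return a_start <= b_end and b_start <= a_end

def current_context(anchor, observations, window=2.0):
    """Observations overlapping [anchor.start - window, anchor.end + window]."""
    lo, hi = anchor["start"] - window, anchor["end"] + window
    return [o for o in observations if overlaps(lo, hi, o["start"], o["end"])]

frame = {"id": "fs1", "start": 10.0, "end": 11.5}
obs = [
    {"id": "g1", "start": 9.2, "end": 9.6},    # deictic gesture, inside window
    {"id": "g2", "start": 20.0, "end": 21.0},  # too late, excluded
]
ctx = current_context(frame, obs)
print([o["id"] for o in ctx])  # ['g1']
```

The selected observations would then be attached to a Context instance via contains assertions before being handed to the DL reasoner.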
In addition, provided that the Hurt instantiation of the FrameSituation is
also associated with a body part, the current context can be further classified in
the Headache class, defined as:
Headache ≡ PainContext ⊓ ∃contains.
    (FrameSituation ⊓ ∃isSettingFor.
        (Hurt ⊓ ∃isAssociatedWith.
            (Head ⊓ ∃isClassifiedBy.BodyPart)))    (4)
As such, if the user explicitly mentions the body part, then the FrameSituation
can be directly classified by the underlying ontology reasoner as a Headache. In
our example, however, the user does not explicitly refer to the head, but instead
points to it, while using the deictic referring expression “here”. As a result, the
inferred PainContext is associated with a non-body part entity, which moreover
is classified as SpatialDeictic. In this case, SFF needs to take into account
the fact that there is an underspecified body part in the FrameSituation that
requires additional contextual information, and in particular non-verbal one, in
order to resolve the ambiguity and provide an appropriate feedback. The logic
to derive such inferences is beyond the expressivity provided by OWL 2. In this
case, SFF uses a fusion rule to resolve this ambiguity. The following SPARQL
rule implements the fusion of language analysis results with hand gestures pointing to body parts, so as to fill the missing body part fillers:

CONSTRUCT {
  ?p :isAssociatedWith ?bodypart .
}
WHERE {
  ?c a :PainContext ;
     :contains [ :isSettingFor ?p ] .
  ?p a :Hurt ;
     :isAssociatedWith ?bp .
  ?bp a :SpatialDeictic .
  ?c :contains [ a :HandGesture ; :pointsTo ?bodypart ] .
}
Having updated the context with the inferred body part, the DL reasoner
can now classify the current context in the Headache class, based on (4). As
such, through the combination of the DL and SPARQL modules, SFF interprets
the current situation as a headache, propagating it to subsequent modules to
retrieve suggestions and provide feedback to the end user.
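The effect of this fusion rule can be emulated in plain Python, with triples as tuples and illustrative names (a sketch, not the actual SPARQL engine):

```python
# Sketch of the body-part propagation rule: if a PainContext contains a
# Hurt whose associated filler is only a spatial deictic ("here"), copy
# over the body part that a co-occurring hand gesture points to.

def fuse_body_part(g):
    """Derive (hurt, isAssociatedWith, bodypart) triples."""
    derived = set()
    contexts = {s for (s, p, o) in g if (p, o) == ("a", "PainContext")}
    for c in contexts:
        members = {o for (s, p, o) in g if (s, p) == (c, "contains")}
        # body parts pointed to by hand gestures in the same context
        targets = {o for (s, p, o) in g
                   if p == "pointsTo" and s in members
                   and (s, "a", "HandGesture") in g}
        # Hurt events whose associated filler is a spatial deictic
        for m in members:
            for (s, p, h) in g:
                if (s, p) != (m, "isSettingFor") or (h, "a", "Hurt") not in g:
                    continue
                for (s2, p2, d) in g:
                    if ((s2, p2) == (h, "isAssociatedWith")
                            and (d, "a", "SpatialDeictic") in g):
                        derived |= {(h, "isAssociatedWith", bp)
                                    for bp in targets}
    return derived

g = {
    ("ctx", "a", "PainContext"),
    ("ctx", "contains", "fs1"), ("ctx", "contains", "g1"),
    ("fs1", "isSettingFor", "h1"),
    ("h1", "a", "Hurt"),
    ("h1", "isAssociatedWith", "d1"),
    ("d1", "a", "SpatialDeictic"),
    ("g1", "a", "HandGesture"),
    ("g1", "pointsTo", "head1"),
}
print(fuse_body_part(g))  # {('h1', 'isAssociatedWith', 'head1')}
```

Once the derived triple is added back to the graph, the DL reasoner can classify the context as a Headache, mirroring the iterative DL/SPARQL loop of Section 3.2.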
5 Conclusions
In this work, we presented SFF, an ontology-driven framework that couples DL
reasoning and rules for multimodal fusion. The focus has been given on the
interpretation of conversational contexts in dialogue-based systems, fusing non-
verbal (i.e. gestures) and verbal features extracted from multimedia data for
situation awareness. Ontologies are used to formally capture context types and
background knowledge, while fusion and interpretation are reduced to the efficient combination of DL reasoning and SPARQL query execution.
We also described the simulated evaluation of SFF for reference resolution.
We are currently collecting data for evaluating the framework using real-world
conversations. In parallel, we are working towards further enrichment of the
fusion and interpretation capabilities of the framework, so as to support additional use cases, e.g. taking into account emotions and facial expressions.
It is also important to mention that the identification of the current conversa-
tional context does not take into account uncertainty. Our plan is to investigate
lightweight probabilistic and non-monotonic reasoning schemes to enhance the
interpretation capabilities of SFF.
Acknowledgments. This work has been partially supported by the H2020-
645012 project “KRISTINA: A Knowledge-Based Information Agent with Social
Competence and Human Interaction Capabilities”.
1. Atrey, P.K., Hossain, M.A., El Saddik, A., Kankanhalli, M.S.: Multimodal fusion
for multimedia analysis: A survey. Multimedia Syst. 16(6), 345–379 (Nov 2010)
2. Attard, J., Scerri, S., Rivera, I., Handschuh, S.: Ontology-based situation recogni-
tion for context-aware systems. In: Proceedings of the 9th International Conference
on Semantic Systems. pp. 113–120. I-SEMANTICS ’13, ACM (2013)
3. Baader, F., Calvanese, D., McGuinness, D.L., Nardi, D., Patel-Schneider, P.F.
(eds.): The Description Logic Handbook: Theory, Implementation, and Applica-
tions. Cambridge University Press (2003)
4. Baumgartner, N., Gottesheim, W., Mitsch, S., Retschitzegger, W., Schwinger, W.:
BeAware!: situation awareness, the ontology-driven way. Data & Knowledge Engineering 69(11), 1181–1193 (2010)
5. Bettini, C., Brdiczka, O., Henricksen, K., Indulska, J., Nicklas, D., Ranganathan,
A., Riboni, D.: A survey of context modelling and reasoning techniques. Pervasive
Mob. Comput. 6(2), 161–180 (Apr 2010)
6. Damljanović, D., Agatonović, M., Cunningham, H., Bontcheva, K.: Improving habitability of natural language interfaces for querying ontologies with feedback and
clarification dialogues. Web Semantics: Science, Services and Agents on the World
Wide Web 19, 1–21 (2013)
7. Denaux, R., Dimitrova, V., Cohn, A.G.: Interacting with ontologies and linked
data through controlled natural languages and dialogues. In: Do-Form: Enabling
Domain Experts to Use Formalised Reasoning-AISB Convention 2013. pp. 18–20.
Society for the study of artificial intelligence (2013)
8. Dourlens, S., Ramdane-Cherif, A., Monacelli, E.: Multi levels semantic architecture
for multimodal interaction. Applied intelligence 38(4), 586–599 (2013)
9. Eiter, T., Ianni, G., Krennwallner, T., Polleres, A.: Rules and ontologies for the
semantic web. In: Baroglio, C., Bonatti, P., Mauszyski, J., Marchiori, M., Polleres,
A., Schaffert, S. (eds.) Reasoning Web, Lecture Notes in Computer Science, vol.
5224, pp. 1–53. Springer Berlin Heidelberg (2008)
10. Grau, B.C., Horrocks, I., Motik, B., Parsia, B., Patel-Schneider, P., Sattler., U.:
OWL 2: The Next Step for OWL. Web Semantics: Science, Services and Agents
on the World Wide Web 6(4), 309–322 (October 2008)
11. Harris, S., Seaborne, A.: SPARQL 1.1 Query Language. W3C Recommendation (21 March 2013)
12. Horrocks, I., Patel-Schneider, P.F., Boley, H., Tabet, S., Grosof, B., Dean, M.:
SWRL: A semantic web rule language combining OWL and RuleML. Tech. rep.,
National Research Council of Canada and Stanford University (May 2004)
13. Knublauch, H., Hendler, J.A., Idehen, K.: SPIN - overview and motivation. W3C
member submission, World Wide Web Consortium (Feb 2011)
14. Kokar, M.M., Matheus, C.J., Baclawski, K.: Ontology-based situation awareness.
Information Fusion, Special Issue on High-level Information Fusion and Situation
Awareness 10(1), 83 – 98 (2009)
15. Lahat, D., Adali, T., Jutten, C.: Multimodal data fusion: An overview of methods,
challenges, and prospects. Proceedings of the IEEE 103(9), 1449–1477 (Sept 2015)
16. Motik, B., Cuenca Grau, B., Sattler, U.: Structured objects in OWL: representation
and reasoning. In: Proceedings of the 17th international conference on World Wide
Web (WWW ’08). pp. 555–564. ACM, New York, NY, USA (2008)
17. Perera, C., Zaslavsky, A., Christen, P., Georgakopoulos, D.: Context aware comput-
ing for the internet of things: A survey. IEEE Communications Surveys & Tutorials
16(1), 414–454 (2014)
18. Perperis, T., Giannakopoulos, T., Makris, A., Kosmopoulos, D.I., Tsekeridou,
S., Perantonis, S.J., Theodoridis, S.: Multimodal and ontology-based fusion ap-
proaches of audio and visual processing for violence detection in movies. Expert
Systems with Applications 38(11), 14102 – 14116 (2011)
19. Prokofyev, R., Tonon, A., Luggen, M., Vouilloz, L., Difallah, D.E., Cudré-Mauroux,
P.: Sanaphor: Ontology-based coreference resolution. In: International Semantic
Web Conference. pp. 458–473. Springer (2015)
20. Schneider, M., Rudolph, S., Sutcliffe, G.: Modeling in OWL 2 without restrictions.
In: Proceedings of the 10th International Workshop on OWL: Experiences and
Directions, co-located with 10th Extended Semantic Web Conference (2013)
21. Shaw, R., Troncy, R., Hardman, L.: Lode: Linking open descriptions of events. In:
4th Asian Conference on The Semantic Web. pp. 153–167. Shanghai, China (2009)
22. Sleeman, J., Finin, T.: Type prediction for efficient coreference resolution in het-
erogeneous semantic graphs. In: Semantic Computing (ICSC), 2013 IEEE Seventh
International Conference on. pp. 78–85. IEEE (2013)
23. Solanas, A., Patsakis, C., Conti, M., Vlachos, I.S., Ramos, V., Falcone, F., Postolache, O., Pérez-Martínez, P.A., Di Pietro, R., Perrea, D.N., et al.: Smart health: a
context-aware health paradigm within smart cities. IEEE Communications Maga-
zine 52(8), 74–81 (2014)
24. Sonntag, D.: Ontologies and Adaptivity in Dialogue for Question Answering, vol. 4.
IOS Press (2010)
25. Wahlster, W.: Dialogue Systems Go Multimodal: The SmartKom Experience, pp.
3–27. Springer Berlin Heidelberg, Berlin, Heidelberg (2006)
26. Wongpatikaseree, K., Ikeda, M., Buranarach, M., Supnithi, T., Lim, A.O., Tan,
Y.: Activity recognition using context-aware infrastructure ontology in smart home
domain. In: Knowledge, Information and Creativity Support. pp. 50–57 (2012)
27. Ye, J., Dasiopoulou, S., Stevenson, G., Meditskos, G., Kontopoulos, E., Kompat-
siaris, I., Dobson, S.: Semantic web technologies in pervasive computing: A survey
and research roadmap. Pervasive and Mobile Computing 23, 1–25 (2015)