
Semantic Interoperability for the Web of Things

Research · August 2016
DOI: 10.13140/RG.2.2.25758.13122
Paul Murdock at Association for Computing Machinery
Louay Bassbouss at Fraunhofer Institute for Open Communication Systems FOKUS
Abstract
This paper is co-authored by an informal group of experts from a broad range of backgrounds all of whom are active in standards groups, consortia and/or alliances in the Internet of Things (IoT) space. The ambition is to create mindshare on approaches to semantic interoperability and to actively encourage consensus building on what the co-authors regard as a key technical issue.
29-Aug-2016
Semantic Interoperability for the Web of Things¹
Editor
Murdock, Paul
Landis+Gyr
Contributors
Bassbouss, Louay
Fraunhofer FOKUS
Deutsche Telekom
Bauer, Martin
NEC
IoTecha
Ben Alaya, Mahdi
Sensinov
Comcast
Bhowmik, Rajdeep
Binghamton University
Orange
Brett, Patrica
Honeywell
InterDigital
Chakraborty, Rabindra
Senslytics
LAAS CNRS
Dadas, Mohammed
Orange
Landis+Gyr
Davies, John
BT plc
Schneider Electric
Diab, Wael
Huawei
W3C
Drira, Khalil
LAAS CNRS
TNO
Eastham, Bryant
OpenDOF
Insight Centre for Data Analytics
El Kaed, Charbel
Schneider Electric
LAAS CNRS
Elloumi, Omar
Nokia
NIST
Girod-Genet, Marc
TELECOM SudParis
Intel
Hernandez, Nathalie
IRIT
NEC
Hoffmeister, Michael
FESTO
Krypton Brothers
Jiménez, Jaime
Ericsson
InterDigital
Kanti Datta, Soumya
Eurecom
Rockwell Automation
Khan, Imran
Schneider Electric
Huawei
Kim, Dongjoo
LG
1 This document is available under a Creative Commons Attribution 4.0 International License.
Background
This paper is co-authored by an informal group of experts from a broad range of backgrounds all of
whom are active in standards groups, consortia and/or alliances in the Internet of Things (IoT) space.
The ambition is to create mindshare on approaches to semantic interoperability and to actively
encourage consensus building on what the co-authors regard as a key technical issue.
The paper:
• Considers the value associated with interoperability in the IoT context and suggests that building
  mindshare across the industry on semantic approaches is one of the keys to unlocking that
  potential
• Introduces foundational interoperability concepts and provides a discussion of metadata, the
  rationale for sharing metadata, and the requirements for ontologies
• Introduces ontologies and discusses their specific relevance to interoperability and significance
  in the IoT context
• Provides examples of modular ontologies, overviews semantic annotation and tagging, and
  highlights strategies for scaling
• Draws conclusions and makes a number of recommendations
The document is made available under a Creative Commons Attribution 4.0 International License.
Introduction
The Internet of Things (IoT) generates expectations that smart devices can discover their context and
build collaborations with other smart devices and services to create value. For example, smart
devices in the home should be able to discover each other and to work together both to enhance the
comfort and security of the homeowner and to improve the efficiency of the home. When driving
into the city, a smart car should be able to interact with city services to identify and reserve a
parking place and should be able to collaborate with a personal smartphone to facilitate payment.
The expectation that ad-hoc networks of smart devices and services can be constantly formed and
re-formed to manifest transient value systems is driving the need for broad agreement on how such
devices interoperate and understand each other.
Discovery, understanding and collaboration at this level require more than just an ability to
interface and to exchange data. Whereas interoperability is “the ability of two or more systems or
components to exchange data and use information” [1], semantic interoperability “means enabling
different agents, services, and applications to exchange information, data and knowledge in a
meaningful way, on and off the Web” [2].
Semantic interoperability is achieved when interacting systems attribute the same meaning to an
exchanged piece of data, ensuring consistency of the data across systems regardless of individual
data format. This consistency of meaning can be derived from pre-existing standards or agreements
on the format and meaning of data or it can be derived in a dynamic way using shared vocabularies,
in schema form and/or through an ontology-driven approach. In this paper we will use the term
"data-model based semantic interoperability" to refer to the former, and "ontology based semantic
interoperability" to refer to the latter.
This paper considers semantic interoperability in the context of the Internet of Things (IoT). Note
that we use “IoT” as an umbrella term for the range of emerging technologies which may differ in
scope and reach², but which enable cross-domain innovation and drive the need for interoperability
at a dynamic level.
2 See [3] for a view on the range of IoT technologies and their scope
Semantic Interoperability as a Value Enabler
There are many analyst studies describing IoT as a broad concept spanning all application domains.
Beecham Research provided some early insight into the scope of domains covered with their "M2M
Sector Map" [4]. A more recent study by McKinsey [5] considers the value potential of IoT in the
context of domains broadly similar to those identified in the Beecham Research study.
The McKinsey study goes on to provide an estimation of the value that could be unlocked given
interoperability across those domains; see Figure 1.
According to McKinsey:
Interoperability between IoT systems is critically important to capturing maximum value; on average,
interoperability is required for 40 percent of potential value across IoT applications and by nearly 60
percent in some settings.
Figure 1³ Value Potential Requiring Interoperability
We suggest that this McKinsey finding can be qualified further. Specifically, the full value potential
can only be unlocked if interoperability is implemented in such a way that the dynamic nature of IoT
is fully supported. A clear prerequisite is that ad-hoc, cross-domain systems of IoT elements need to
be able to establish conversations and build understanding. Several examples of cross-domain use
cases and a discussion on the need for and the value of semantics in sensor applications are
described in [6].
3 “The internet of things: Mapping the value beyond the hype”, June 2015, McKinsey Global Institute,
www.mckinsey.com. Copyright (c) 2015 McKinsey & Company. All rights reserved. Reprinted by permission.
Hence, the position of this paper is that semantic interoperability is a key value enabler for IoT and
that establishing a shared ontology based approach is critical for the development and exploitation
of the technology.
Foundations of Semantic Interoperability
Metadata, an essential data reusability provider
What is the issue we want to solve?
Interoperability is commonly driven by the respective parties sharing a priori knowledge of some
kind; for example, shared knowledge of an application programming interface (API) or shared
knowledge of a set of database tables and related access rules. Key to these approaches is
conformance with prior agreements and understandings.
These data-model based semantic interoperability approaches are the cornerstone of
interoperability in many enterprise and industrial contexts. One of the challenges, however, is that
when new applications are introduced into the context, they also need prior knowledge of the
interoperability schemes, the API specifications, and the meaning and use of database tables.
In the context of the IoT, however, there has to be a way to create an interoperability context which
does not rely on prior knowledge. This is the issue we are trying to solve and metadata is a core part
of the solution.
What is metadata?
Metadata is about describing the contents and context of data to facilitate discovery, understanding
and (re)usability of that data. Hence the usual statement that metadata is data about data.
It is often the case, however, that the actual meaning of data can only be discovered by examining
the software that generates and processes the data. This bundling obfuscates the semantics of the
data with the result that third-party processors receiving data have no guidance on how the values
should be interpreted and understood.
Figure 2 Meaningfulness of the data, increased with metadata
Metadata is about reducing the separation between semantics and values by ensuring that data is
provided with context and description. This enables interpretation and understanding by subsequent
processors and provides foundational support for interoperability and reusability.
Figure 2 [7] provides a view on the metadata associated with a temperature sensor. As can be seen,
multiple levels of meaning can be inferred; this gives transparency to the context and supports
subsequent design-time abstraction and modeling views.
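The contrast between a bare value and a value delivered with its metadata can be sketched in a few lines of Python. This is only an illustration: the field names and values below are assumptions chosen for the example and are not taken from Figure 2 or from any published vocabulary.

```python
# A bare reading: its meaning lives only in the software that produced it.
raw_reading = 21.5

# The same reading with metadata attached. All keys and values are
# illustrative assumptions, not terms from a standard vocabulary.
annotated_reading = {
    "value": 21.5,
    "quantity": "temperature",
    "unit": "degree Celsius",
    "sensor_id": "sensor-42",
    "location": "meeting room 1",
    "observed_at": "2016-08-29T10:15:00Z",
}


def describe(reading: dict) -> str:
    """Render a reading as a sentence using only its attached metadata."""
    return (
        f"{reading['quantity']} of {reading['value']} {reading['unit']} "
        f"at {reading['location']} ({reading['observed_at']})"
    )


print(describe(annotated_reading))
```

A third-party processor can interpret `annotated_reading` without access to the producing software, whereas `raw_reading` gives it nothing to work with.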
Sharing metadata
The Linked Open Vocabularies⁴ (LOV) community is driving a conversation around the creation and
use of shared metadata (shared vocabularies).
While locally defined and shared metadata will deliver value within a given domain, metadata which
is published more widely will necessarily drive interoperability and reusability to a greater extent, as
shown in Figure 3.
Programs such as the H2020 Large Scale Pilots [8] which address domain specific and cross-domain
concerns, as shown in Figure 4, provide multiple opportunities for exercising and proving the
strategic value of sharing metadata across a significant scope at scale.
Ontologies
What is an Ontology?
The LOV community is focusing on the curation of quality vocabularies across all domains.
Taxonomies often build on such controlled vocabularies using parent-child relationships to describe
the organization of terms within a specific domain. Ontologies extend this concept further to capture
relationships capable of supporting richer operations and more advanced levels of reasoning.
Ontologies build on metadata to provide a representation of knowledge about a given domain and
to provide a core resource for reasoning about a domain and a context. The Semantic Sensor
Network (SSN) Ontology [9] is an example of an existing ontology which describes the capabilities
and properties of sensors, the act of sensing and the resulting observations. Another example is the
oneM2M Base Ontology [10] that constitutes a framework for specifying the semantics of data that
are handled in oneM2M and to which domain specific ontologies can be mapped.
Figure 5 shows the key concepts and relations of the SSN ontology split by conceptual modules
(dotted lines).
Figure 6 shows the core concepts of the oneM2M Base Ontology.
4 http://lov.okfn.org/
Figure 4 Metadata and Data Reusability (Horizontal and Vertical Metadata: core metadata used across application domains such as Smart Living, Smart Cities, Smart Industry and Smart Homes; industry specific groups are in the best position to define metadata for each vertical)
Figure 3 Shared Metadata (no metadata; locally defined metadata; metadata based on shared vocabularies; scale from hardly reusable to very reusable)
Figure 5 Semantic Sensor Network Ontology
Figure 6 oneM2M Base Ontology
Ontologies and the IoT
Given the cross-domain nature of the IoT, there is a need both to capture and express knowledge
shared across the verticals and to leverage linkages between domains.
As can be seen in Figure 5, the SSN ontology comprises ten conceptual modules relating to sensors.
This modularity supports reuse of SSN concepts in other ontologies and, similarly, concepts from
other ontologies can be included into solutions using the SSN ontology as required.
In Figure 6 the core aspects of the oneM2M Base Ontology are shown. The ontology provides a
common, domain-independent basis to which existing domain-specific ontologies, e.g. SAREF [11],
can be mapped. IoT devices described according to the concepts of the Base Ontology, or derived
from concepts in domain-specific ontologies, can be automatically mapped to a REST resource
structure in oneM2M.
Modularity, reuse and linkage are key strategies for supporting the use of ontologies in building
cross-domain IoT applications. These strategies, plus the need to educate the community about their
existence and usage, are discussed in [12] [13].
The diversity of IoT domains will drive ontology development through vocabulary creation, extension,
reuse, and retargeting; this motivates requirements for ontology management capabilities. The
richness of semantic data models can be leveraged to automate management capabilities - such
automation will be crucial for the continued operation of IoT ecosystems in which human
intervention is expected to be minimal, ineffective and/or unavailable.
Ontologies and Modularity
Requirements for modularization are commonly driven by use-cases in which only parts of an
existing ontology are needed, or in which constrained devices are unable to perform inference and
reasoning on a full ontology. Modularization also eases some of the complexities around semantic
data modelling and ontology design, integration, maintenance, and reuse [14].
Modularization requires the partitioning of ontologies into independent sub-modules [15] [16].
Sub-modules are self-contained knowledge components that:
• Are loosely coupled
• Define their own set of core concepts and relations
• Are reusable
• Are linked to other module(s) with explicit relationship(s).
As a consequence of the loose coupling, modules can be designed, used, managed and updated in a
stand-alone manner, with no impact on other modules. When modularizing ontologies, however, it is
also important to avoid generating reasoning or querying complexities for future (module) unions.
Good examples of modular ontologies are the Smart BANs (Body Area Networks) and MyOntoSens
ontologies [17] [18]. Within MyOntoSens, a Wireless Sensor Network (WSN) module is formed of
clusters (Cluster module; BAN module for Smart BANs) that are composed of nodes (Node Module).
A node is used for processing (Process Module) and takes measurements (Measurements Module). The
‘Measurements Module’ is sufficiently light to be instantiated and stored within sensors, while full
instantiation of the Process and Measurements modules, together with inference/reasoning over them,
can only be performed within a more capable node, the cluster sink (or BAN hub). The full BAN ontology
(including service level modules) is instantiated, inferred and processed within remote and
distributed monitoring and control servers (e.g. hospital servers).
Ontology modularity can also be handled using a layered approach as shown in Figure 7. Although
heterogeneity characterizes the landscape of devices and systems across domains, there are
commonalities which can be abstracted out. Thus, ontologies can often be modularized into at least
two layers: cross-domain ontologies and domain ontologies.
The cross-domain ontologies consist of concepts shared across domains and silos. For instance, a
general protocol ontology can be used to classify the communication protocols along with
information regarding the supported communication medium and range. Such general information
can be used during diagnostic and maintenance operations. Similarly, there can be multiple cross-
domain ontologies covering shared concepts related to quantities, units, topological relations,
location, and usage.
The cross-domain ontologies capture the shared concepts across domains and constitute the
building blocks of future extensions.
The domain ontologies relate to specific silos or verticals and often reference the cross-domain
ontologies. For example, in Figure 7 the Buildings Ontology relies on both the Physical Quantities
Ontology to express the measurements, and on the Localization Ontology to reference a site or a
floor. Moreover, both cross-domain and domain ontologies can also rely on existing dictionaries
(such as HayStack⁵ from the building domain).
Figure 7 Multi-layered Ontologies
Consider the concepts of Current and Phase in the Buildings context:
• The Quantities Ontology (cross-domain) includes definitions for Current and Phase and also
  defines the hasPhase relation. Now assume that A is an instance of Phase.
• The Buildings Ontology (vertical domain) has a similar model and represents a current of
  phase A as IA. Thus, using the cross-domain ontology, the IA definition can be expressed as
  follows: IA ≡ Current and hasPhase A.
• Similarly, if the Energy Ontology refers to a current of phase A as CA, then its definition (by
  reuse of the cross-domain concepts) becomes CA ≡ Current and hasPhase A.
A multi-layer approach enables great flexibility when querying since the ontologies are
interconnected and queries can exploit both high level and specific concepts to explore a domain.
Applications operating at a high-level of abstraction can use more general concepts to retrieve
information such as (Current and hasPhase A), while applications operating at a more granular level
can rely on the vertical ontologies to formulate queries and extract specific information such as CA
and IA.
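The two query styles described above can be sketched in plain Python. This is a simplified illustration rather than an OWL reasoner: class definitions are reduced to (base class, phase) pairs, and the instance names `building_meter_reading` and `grid_meter_reading` are hypothetical.

```python
from typing import NamedTuple


class Definition(NamedTuple):
    """A domain class defined as: base class AND hasPhase <phase>."""
    base: str       # cross-domain class, e.g. "Current"
    has_phase: str  # cross-domain Phase instance, e.g. "A"


# Domain classes defined by reuse of the cross-domain concepts.
DEFINITIONS = {
    "IA": Definition("Current", "A"),  # Buildings Ontology
    "CA": Definition("Current", "A"),  # Energy Ontology
}

# Hypothetical instances published by two different verticals.
INSTANCES = {
    "building_meter_reading": "IA",
    "grid_meter_reading": "CA",
}


def query_by_definition(base: str, has_phase: str) -> list:
    """High-level query: every instance whose class matches the definition."""
    wanted = Definition(base, has_phase)
    return sorted(n for n, cls in INSTANCES.items() if DEFINITIONS[cls] == wanted)


def query_by_class(cls: str) -> list:
    """Granular query: instances of one vertical-specific class only."""
    return sorted(n for n, c in INSTANCES.items() if c == cls)


print(query_by_definition("Current", "A"))
print(query_by_class("IA"))
```

The general query reaches readings from both verticals because IA and CA share one cross-domain definition, while the class-specific query stays within a single vertical.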
Taking a further example, oneM2M provides a set of rules to map the conceptual model of the
oneM2M Base Ontology to the underlying oneM2M resource structure. Then for systems using
other ontologies for which a mapping to the oneM2M Base Ontology can be defined, this provides a
mechanism for those (other) ontologies to be instantiated within a oneM2M system as resources
with associated semantic annotation. This enables different IoT systems to interwork with each
other via a common upper ontology (Base Ontology) and a common architecture.
5 http://project-haystack.org
Ontologies and Semantically Augmented Things
Designing ontologies is the first step towards the interoperability vision; the second step consists of
enabling the sensors, devices and systems to express their contextual information and data by
applying the ontologies.
The connected world is a diverse ecosystem comprising elements which range from small,
constrained devices and sensors, to larger more complex modules and machinery. This diversity is
reflected in the processing, storage and communication capabilities provided by the respective
elements and it follows that the degree to which ontologies and semantic capabilities can be
embedded will also vary.
These considerations impact constrained elements for which embedding semantics is not an option.
Sparsely resourced sensors, for example, will often provide little in terms of processing and may only
support very lightweight, near binary format communications. In these cases, metadata can be
(externally) attached to the sensor’s data in a process referred to as semantic annotation [19] [20].
Semantic annotation is usually performed by the agent receiving the sensor’s data, for example a
gateway, a system, or a cloud agent [20] [21].
We distinguish between two semantic annotation mechanisms: automatic tagging and
commissioning [19].
Automatic Tagging can be handled by a software agent running on a gateway, on a system or in the
cloud. The agent decodes the sensor data stream (for example) and then augments the data using an
appropriate semantic representation; see Figure 8. Other approaches, such as in [22], suggest a
heuristic based inference to harmonize the tags based on previously existing unstructured data.
Figure 8 Semantic Annotation
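As a minimal sketch of automatic tagging, the following Python fragment shows a gateway-side agent decoding a compact binary frame and attaching tags to it. The 3-byte frame layout, the sensor identifier and the tag table are all assumptions made for this illustration; they do not correspond to a particular protocol or to the approaches cited above.

```python
import struct

# Hypothetical lookup held by the gateway: sensor id -> semantic tags.
TAGS_BY_SENSOR = {
    7: {"quantity": "temperature", "unit": "degree Celsius"},
}


def annotate(frame: bytes) -> dict:
    """Decode an assumed 3-byte frame and augment it with semantic tags.

    Assumed layout: 1 unsigned byte (sensor id) followed by a big-endian
    signed 16-bit integer holding the value in tenths of a unit.
    """
    sensor_id, tenths = struct.unpack(">Bh", frame)
    tags = TAGS_BY_SENSOR.get(sensor_id, {})  # unknown sensors stay untagged
    return {"sensor_id": sensor_id, "value": tenths / 10, **tags}


frame = struct.pack(">Bh", 7, 215)  # what a constrained sensor might send
print(annotate(frame))
```

The sensor itself only ever sends three bytes; all descriptive context is added by the receiving agent.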
Commissioning is usually handled through a user interface during the installation phase of a gateway
or a system. For example, the use and location of a given sensor are only known during the
commissioning phase and at that time the installer uses a commissioning tool to set the data from
the ontologies. Commissioning tools should evolve to take into account such tagging.
Depending on the resources of the gateway or system, such annotation can remain as tags which
can be processed by query engines to answer specific queries, as in [19]. Other approaches can rely
on such tags to generate a complete ontology, as in [20] [21].
Ontologies and Context
IoT technology is driving new opportunities for context-aware systems and applications. These
classes of sentient system and application are able to adapt their behavior to the current context
without explicit intervention.
Context awareness often means that a system combines physical awareness (time, location, sound,
movement, touch, temperature) with application awareness (tasks, goals, processes, compliance,
compatibility, approval, user disposition) to modify its own behavior.
Metadata alone has proven insufficient to address interoperability, and in some institutions it is not
part of software engineering curricula. Because ontologies are formally expressive and open up the
possibility of applying ontology reasoning techniques, various existing and emerging context-aware
frameworks use ontologies in their implementation [23]. Context sensitive machine learning techniques
are also finding a role in deriving interoperability contexts [24].
Ontologies and Scalability
Figure 9 Scalability of ontology-based integration
A growing number of devices and applications are delivering data-streams and events on a
continuous basis. This growth in data volume and velocity is accompanied by a growth in variety,
driven by the heterogeneity of device and data formats.
Data integration programs based on traditional approaches such as relational databases are
efficient in small, static and closed environments. In fact, with a low number of data sources, the
cost of using and maintaining data remains low compared to an ontology-driven approach, which
requires a larger initial effort and therefore a higher up-front cost.
The benefit of semantic standards stands out, however, when it comes to large, dynamic and open
systems with critical requirements in terms of scalability and interoperability. Semantic models
enable the integration of a huge number of heterogeneous and mobile sources in a short period of time,
at reasonable cost compared to traditional approaches. Semantic integration is performed once at the
beginning, paving the way for advanced querying and reasoning, and enabling data to be integrated
in a collaborative, standard and reusable way.
Current and Emerging Practices
Technologies and Strategies for Linked Data
The key to overcoming the fragmentation of the IoT and catalyzing exponential growth in services
will be enabling end-to-end interoperability across different platforms. This requires open standards
for metadata that define the data and interaction models exposed to applications, the protocols
involved, and the communication patterns that can be used. In other words, this requires standards
for a web of things.
Figure 9 source: PricewaterhouseCoopers, 2009 [25]
What are the technologies and strategies for handling such metadata?
The Resource Description Framework (RDF) [26] provides globally unique identifiers for metadata.
These identifiers in many cases serve as links to further information for a web of linked data. RDF
allows data and metadata to be described in terms of triples, i.e. named relationships that connect a
subject to an object. There are multiple serialization formats for RDF, e.g. RDF/XML [27], Turtle [28],
N3 [29], comma separated values [30], and JSON-LD [31]. Further techniques address how to
include metadata within web pages.
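The triple structure described above can be illustrated with a small Python sketch that builds two triples and writes them in an N-Triples style, one statement per line. The `http://example.org/` namespace and the terms `sensor42`, `observes` and `locatedIn` are placeholders invented for the example; a real deployment would use identifiers from a shared vocabulary and, most likely, an RDF library rather than hand-written serialization.

```python
EX = "http://example.org/"  # placeholder namespace, for illustration only


def iri(local: str) -> str:
    """Wrap a local name as a globally unique identifier."""
    return f"<{EX}{local}>"


def literal(value: str) -> str:
    """Wrap a plain string as a quoted literal, escaping backslash and quote."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


# Each triple is (subject, predicate, object).
triples = [
    (iri("sensor42"), iri("observes"), iri("Temperature")),
    (iri("sensor42"), iri("locatedIn"), literal("meeting room 1")),
]


def to_ntriples(rows) -> str:
    """Serialize triples as one 'subject predicate object .' line each."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in rows)


print(to_ntriples(triples))
```

Because subjects and predicates are full identifiers, the same statements can be merged with triples from other sources without renaming.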
Semantic models can be expressed with RDF Schema (RDF-S) [32] or the Web Ontology Language
(OWL) [33]. SPARQL [34] is a query language for accessing and updating RDF triples. The Linked
Data Platform (LDP) [35] defines how to use HTTP for read-write linked data on the web. DCAT [36]
is an RDF vocabulary designed to facilitate interoperability between data catalogs published on the
Web. The Linked Open Vocabularies (LOV) community, introduced in the “Sharing metadata” section,
maintains descriptions of RDF-S vocabularies and OWL ontologies used for datasets in the Linked
Data Cloud; see [37].
Semantics in Support of Cross-Domain IoT
Semantic technologies provide a common means to describe domain knowledge whilst enabling
heterogeneity and multimodality through interoperable data formats and various semantic models
[38] .
Beyond the representational aspects, semantic computing supports reasoning on raw sensor data
enabling the derivation of higher level abstractions; such abstractions form the basis of domain- and
cross-domain knowledge. The Machine-to-Machine Measure (M3) Framework discusses these
concepts [41] and provides an implementation which explores the creation of cross-domain
applications.
Figure 10, which is discussed in [42], outlines how M3 applies a semantic approach to enable
cross-domain reasoning. The figure shows two sensors in different domains; sensor A is in the health
domain and sensor B is in the weather domain.
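The general idea of deriving higher level abstractions from raw readings and then combining them across domains can be sketched as follows. This is not the M3 implementation: the thresholds, the abstraction labels and the combining rule are hypothetical and serve only to make the two reasoning steps concrete.

```python
# Step 1: per-domain rules turn raw values into higher level abstractions.
# Thresholds and labels are illustrative assumptions.

def body_temperature_state(celsius: float) -> str:
    """Health domain (sensor A): abstract a body temperature reading."""
    return "fever" if celsius >= 38.0 else "normal"


def outdoor_temperature_state(celsius: float) -> str:
    """Weather domain (sensor B): abstract an outdoor temperature reading."""
    return "cold" if celsius <= 5.0 else "mild"


# Step 2: a cross-domain rule works on abstractions, not on raw numbers.

def cross_domain_suggestion(body_c: float, outdoor_c: float) -> str:
    """Combine the two domain abstractions into one suggestion."""
    facts = {body_temperature_state(body_c), outdoor_temperature_state(outdoor_c)}
    if {"fever", "cold"} <= facts:
        return "stay indoors and keep warm"
    return "no suggestion"


print(cross_domain_suggestion(38.5, 2.0))
```

Keeping the cross-domain rule at the level of abstractions means that it does not need to know the units or formats used by either sensor.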
Designing Semantic Models
Semantic data model design can be mainly split into two phases:
1. Specification, i.e. the conceptual/logical/abstract definition of the model. This is mainly
mapped out in the form of objects, materialized as classes that can be linked together
(relationships) and that are described by attributes
2. Formalization, i.e. its physical model in the form of semantic metadata or ontology.
Conceptual models are generally structured through the use of Entity Relationship (ER) or Unified
Modeling Language (UML) diagrams. In line with practices commonly followed in Semantic Web [43]
and RDF [44] development, Java conventions [45] are generally observed. For example:
• Adopt naming conventions which impose minimum changes
• Replace spaces in strings with underscores
• Use lower-case for metadata and ontology namespaces
• Use camel-case for class and object names
• Use mixed-case for property names
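The conventions listed above can be captured in a few helper functions. The sketch below reflects one reading of them, which is an assumption: camel-case is taken to mean an initial capital (as for Java class names) and mixed-case an initial lower-case letter (as for Java method names). The function names are hypothetical.

```python
import re


def _words(label: str) -> list:
    """Split a human-readable label into alphanumeric words."""
    return re.findall(r"[A-Za-z0-9]+", label)


def class_name(label: str) -> str:
    """Camel-case with an initial capital, e.g. 'TemperatureSensor'."""
    return "".join(w[:1].upper() + w[1:].lower() for w in _words(label))


def property_name(label: str) -> str:
    """Mixed-case with an initial lower-case letter, e.g. 'hasPhase'."""
    name = class_name(label)
    return name[:1].lower() + name[1:]


def namespace_name(label: str) -> str:
    """Lower-case namespace with spaces replaced by underscores."""
    return "_".join(w.lower() for w in _words(label))


print(class_name("temperature sensor"))
print(property_name("has phase"))
print(namespace_name("Smart BAN"))
```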
The formalization of a semantic data model is achieved through the use of description languages.
Lightweight description languages such as JSON-LD (JavaScript Object Notation for Linked Data) [46]
are generally preferred when dealing with low-power, low-energy, constrained embedded devices
such as sensors and actuators. Less constrained environments commonly use XML based description