PIOTRe: Personal Internet of Things Repository
Eugene Siow, Thanassis Tiropanis, and Wendy Hall
Electronics & Computer Science, University of Southampton
{eugene.siow,t.tiropanis,wh}@soton.ac.uk
Abstract. Resource-constrained Internet of Things (IoT) devices like Raspberry Pis, with specific performance optimisation, can serve as interoperable personal Linked Data repositories for IoT applications. In this demo paper we describe PIOTRe, a personal datastore that utilises our sparql2sql query translation technology on Pis to process, store and publish IoT time-series historical data and streams. We demonstrate PIOTRe in a smart home scenario: a real-time dashboard that utilises RDF stream processing, a set of descriptive analytics visualisations on historical data, a framework for registering stream queries within a local network and a means of sharing metadata globally with HyperCat and Web Observatories.
Keywords: SPARQL, SQL, RSP, Query Translation, Internet of Things,
Analytics, Web Observatory
1 Introduction
Internet of Things (IoT) time-series data that is flat and wide can be efficiently stored in relational databases and queried with SPARQL using mappings and query translation engines, as shown in our previous work [1]. sparql2sql, which translates SPARQL to SQL, and sparql2stream, which translates SPARQL to Event Processing Language (EPL), are two such engines for historical data and streams respectively. Both engines show query latency improvements for IoT scenarios on Raspberry Pis that range from 2 times to 3 orders of magnitude.
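As a rough illustration of why flat, wide time-series data suits relational storage, the sketch below stores readings in a single wide table and answers a typical query with plain SQL, the kind of statement a translation engine like sparql2sql might produce (the schema, rooms and values here are invented, not taken from the paper's datasets):

```python
import sqlite3

# A flat, wide time-series table: one row per observation,
# one column per sensor field (schema invented for illustration).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE readings (ts INTEGER, room TEXT, temperature REAL, power REAL)"
)
conn.executemany(
    "INSERT INTO readings VALUES (?, ?, ?, ?)",
    [(0, "kitchen", 21.5, 120.0),
     (60, "kitchen", 22.0, 130.0),
     (60, "hall", 19.0, 15.0)],
)

# The kind of SQL a translated SPARQL query might reduce to:
row = conn.execute(
    "SELECT AVG(power) FROM readings WHERE room = 'kitchen'"
).fetchone()
print(row[0])  # 125.0
```

Because the query runs directly over a narrow row layout with no triple joins, this is where the latency advantage over native RDF stores comes from.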
In this demonstration paper we present PIOTRe¹, a personal repository that utilises sparql2sql and sparql2stream to provide efficient Linked Data access through SPARQL endpoints for IoT historical data and streams on Pis. We then demonstrate how PIOTRe supports applications such as:
1. A real-time smart home dashboard that utilises sparql2stream’s RDF stream
processing to update widgets with push results via web sockets.
2. A smart home visualisation application that uses a set of SPARQL queries
with space-time aggregations, translated by sparql2sql, on historical data.
3. A lightweight client-broker-server architecture that supports registering stream
queries and delivering results on an offline local network.
4. A means of publishing and sharing metadata and mappings online as HyperCat or with decentralised catalogues known as Web Observatories.
1 https://github.com/eugenesiow/piotre
2 PIOTRe Design and Architecture
Fig. 1. Architecture of PIOTRe
PIOTRe is designed to run on a compact, mobile and lightweight computer such as a Raspberry Pi (it has also been tested on an x86 Gizmo2²). PIOTRe consists of a number of components, as shown in Fig. 1. Sensors produce time-series data that forms a set of input data streams to the system.
Data streams enter an event stream for processing and are stored in a historical relational store; each stream forms a table in the store. PIOTRe uses H2³ as the default relational store and Esper⁴ as the event stream processing engine. H2 can be replaced by any relational database or column store that supports SQL.
sparql2sql and sparql2stream work on top of the historical store and event stream respectively. Mappings are shared across the engines and represent how RDF maps to columns in relational tables or to fields of events in a stream. A SPARQL endpoint uses the sparql2sql engine to translate incoming SPARQL queries to SQL, execute the query on the relational store and return the result set in the appropriate format. The UNified IoT Environment (UNIoTE) Server is part of a client-broker-server architecture that allows streaming SPARQL queries to be registered with the sparql2stream engine. The engine in turn translates queries to EPL, registers them with the underlying event stream and sends results to requesting clients. UNIoTE is described in Section 4.
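For illustration, a mapping can be thought of as a table linking RDF terms to relational columns, which both engines consult during translation. The minimal sketch below is hypothetical: the property URIs, table and column names are invented and do not reflect PIOTRe's actual mapping format.

```python
# Hypothetical mapping from RDF predicates to (table, column) pairs;
# names are invented for illustration, not PIOTRe's mapping format.
MAPPING = {
    "http://example.org/prop/temperature": ("readings", "temperature"),
    "http://example.org/prop/timestamp": ("readings", "ts"),
}

def triple_pattern_to_sql(predicate: str) -> str:
    """Translate a single `?s <predicate> ?o` triple pattern into a
    SELECT over the mapped table and column."""
    table, column = MAPPING[predicate]
    return f"SELECT {column} FROM {table}"

sql = triple_pattern_to_sql("http://example.org/prop/temperature")
print(sql)  # SELECT temperature FROM readings
```

Because the same mapping drives both engines, the SQL and EPL translations stay consistent with one RDF view of the data.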
Apps in PIOTRe have the flexibility of being written in any language and framework, and communicate with the SPARQL Endpoint and UNIoTE Server through HTTP, although underlying the UNIoTE server are ZeroMQ sockets⁵.
The metadata publishing component publishes metadata from the mappings, such as sensor descriptions, locations, data fields and formats, to support global discovery and interoperability. It is described in more detail in Section 5.
2 http://www.gizmosphere.org/products/gizmo-2/
3 http://www.h2database.com
4 http://www.espertech.com/
5 http://zeromq.org/
3 Example Applications using PIOTRe
IoT Freeboard⁶ is a real-time dashboard that consists of a simulator, which replays a stream of smart home data at a variable rate to PIOTRe, and a web application dashboard that receives the output from the set of registered push-based streaming SPARQL queries as a web socket event stream. Each result in the stream consists of the name of the query that generated it and the fields of the result set, e.g. {"queryName":"averagePower","averagePower":"10.5","uom":"W"}. A host of widgets can be added to the dashboard to visualise these events. A demo video is available⁷.
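Each push result is ordinary JSON and can be consumed directly by a dashboard client. A minimal sketch, using the sample event above (the field handling is illustrative, not the dashboard's actual code):

```python
import json

# A push result as delivered over the web socket, taken from the
# paper's sample event; parsing logic here is illustrative only.
event = '{"queryName":"averagePower","averagePower":"10.5","uom":"W"}'

result = json.loads(event)
widget = result["queryName"]           # names the widget to update
value = float(result["averagePower"])  # numeric value for display
unit = result["uom"]                   # unit of measure
print(widget, value, unit)             # averagePower 10.5 W
```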
Fig. 2. Apps using streaming and historical data with PIOTRe
Pi-SmartHome⁸ is an application that uses SPARQL queries on historical smart home data to provide visualisations across time and space. The queries are as follows: 1) hourly aggregation of temperature, 2) daily aggregation of temperature, 3) hourly and room-based aggregation of energy usage and 4) diagnosis of unattended devices through energy usage and motion, aggregating by hour and room. The application allows the user to tweak the days and months as parameters to the SPARQL queries and to generate and compare graph visualisations. A demo video⁹ and online demo are available.
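After translation by sparql2sql, such space-time aggregations reduce to grouped SQL over the historical store. A hedged sketch of the shape of query 3 (hourly, room-based energy aggregation) on an invented schema; the actual translated SQL will differ:

```python
import sqlite3

# Invented schema: unix timestamp, room and instantaneous power.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (ts INTEGER, room TEXT, power REAL)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?, ?)",
    [(0, "kitchen", 100.0), (1800, "kitchen", 140.0), (3600, "kitchen", 80.0)],
)

# Hourly, room-based aggregation: integer-divide the timestamp into
# hour buckets and group by bucket and room.
rows = conn.execute(
    "SELECT ts / 3600 AS hour, room, AVG(power) "
    "FROM readings GROUP BY hour, room ORDER BY hour"
).fetchall()
print(rows)  # [(0, 'kitchen', 120.0), (1, 'kitchen', 80.0)]
```

Queries 1, 2 and 4 follow the same pattern with different grouping keys and measures.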
4 Lightweight Local, Offline Client-Broker-Server
In some IoT scenarios, like disaster management or environmental monitoring in remote locations, local, offline networks of devices are necessary. The UNified IoT Environment's¹⁰ (UNIoTE) client-broker-server architecture enables stream queries to be registered and results to be delivered to devices. A publish-subscribe mechanism based on lightweight ZeroMQ sockets is used. Servers subscribe to URIs and clients publish queries (with URIs in the FROM clause), facilitated by brokers with known addresses. As shown in Fig. 1, a UNIoTE server uses a sparql2stream engine to translate and register queries. Results are then delivered directly to all requesting clients through push-pull sockets.
6 https://github.com/eugenesiow/iotwo
7 https://youtu.be/oH0iSWTmKUg
8 https://github.com/eugenesiow/ldanalytics-PiSmartHome
9 https://youtu.be/g8FLr974v9o
10 https://github.com/eugenesiow/uniote-broker
Fig. 3. Sequence Diagram of Issuing a Query with UNIoTE
Fig. 3 describes, using a sequence diagram, how the publish-subscribe and push-pull mechanisms work together in UNIoTE. A client publishes a streaming SPARQL query, with uri1 and uri3 in the FROM clause, together with its address, to each of the URI topics. Servers 1 and 2 have each subscribed to URI topics based on their streams; they receive the query, register it using sparql2stream and push results directly to the client's address.
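The routing above can be mimicked in-process: servers subscribe to URI topics, a broker forwards a published query to every subscriber, and each server pushes results back to the client's address. The sketch below substitutes plain Python structures for the ZeroMQ pub-sub and push-pull sockets; all names are illustrative, not UNIoTE's API.

```python
from collections import defaultdict

# Broker state: URI topic -> subscribed servers (pub-sub stand-in).
subscriptions = defaultdict(list)

class Server:
    """Stand-in for a UNIoTE server holding a sparql2stream engine."""
    def __init__(self, name):
        self.name = name
        self.registered = []          # queries registered with the engine

    def subscribe(self, uri):
        subscriptions[uri].append(self)

    def on_query(self, query, reply_to):
        self.registered.append(query)           # register the stream query
        reply_to.append((self.name, "result"))  # push result to the client

def publish(uris, query, reply_to):
    """Client publishes a query to each URI topic in its FROM clause."""
    for uri in uris:
        for server in subscriptions[uri]:
            server.on_query(query, reply_to)

s1, s2 = Server("server1"), Server("server2")
s1.subscribe("uri1")
s2.subscribe("uri3")

inbox = []  # the client's address (push-pull socket stand-in)
publish(["uri1", "uri3"], "SELECT ... FROM <uri1> <uri3>", inbox)
print(inbox)  # [('server1', 'result'), ('server2', 'result')]
```

In the real system the broker, servers and client run on separate devices, with ZeroMQ providing the transport.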
5 Observatories and Online Metadata Sharing
The Web Observatory Project is developing a global decentralised, distributed
infrastructure that allows the use and exchange of datasets, analytical apps and
visualisations [2]. Web Observatory instances exist as collections of datasets and
analytical tools protected by access controls. The principles espoused by the Web
Observatory as a distributed infrastructure are a possible solution for managing
datasets and apps in the IoT. By sharing metadata of IoT datasets and providing
the Observatory access through the SPARQL endpoint, PIOTRe systems, when
online, are able to support apps which use globally distributed data sources
across Observatory instances securely.
HyperCat¹¹ is a complementary catalogue for exposing collections of uniform resource identifiers (URIs) that refer to IoT assets over the web. PIOTRe systems connected to various sensors, devices and things publish metadata about these, derived from the mappings, to a HyperCat server when online, increasing interoperability.
References
1. Siow, E., Tiropanis, T., Hall, W.: SPARQL-to-SQL on Internet of Things Databases and Streams. In: Proceedings of the 15th International Semantic Web Conference (ISWC 2016) (2016)
2. Tiropanis, T., Hall, W., Hendler, J., de Larrinaga, C.: The Web Observatory: A
Middle Layer for Broad Data. Big Data 2(3), 129–133 (2014)
11 http://www.hypercat.io/