Software Impacts 8 (2021) 100071
Original software publication
Software for the GeoSPARQL compliance benchmark
Milos Jovanovik a,d,*, Timo Homburg b, Mirko Spasić c,d
a Ss. Cyril and Methodius University in Skopje, North Macedonia
b Mainz University of Applied Sciences, Germany
c University of Belgrade, Serbia
d OpenLink Software, London, UK
ARTICLE INFO
Keywords:
GeoSPARQL
Benchmarking
Compliance
RDF triplestores
ABSTRACT
Checking the compliance of geospatial triplestores with the GeoSPARQL standard represents a crucial step
for many users when selecting the appropriate storage solution. This publication presents the software
which comprises the GeoSPARQL compliance benchmark — a benchmark which checks RDF triplestores for
compliance with the requirements of the GeoSPARQL standard. Users can execute this benchmark within the
HOBBIT benchmarking platform to quantify the extent to which the GeoSPARQL standard is implemented
within the triplestore of interest. This enables users to make an informed decision when choosing an RDF
storage solution and helps assess the general state of adoption of geospatial technologies on the Semantic
Web.
Code metadata
Current code version 2.1
Permanent link to code/repository used for this code version https://github.com/SoftwareImpacts/SIMPAC-2021-29
Permanent link to Reproducible Capsule
Legal Code License GNU General Public License
Code versioning system used Git
Software code languages, tools, and services used Java, Docker
Compilation requirements, operating environments & dependencies Maven
If available, link to developer documentation/manual https://github.com/OpenLinkSoftware/GeoSPARQLBenchmark
Support email for questions geosparql-benchmark@openlinksw.com
Software metadata
Current software version 2.1
Permanent link to executables of this version https://github.com/OpenLinkSoftware/GeoSPARQLBenchmark/releases/tag/v2.1
Permanent link to Reproducible Capsule
Legal Software License GNU General Public License
Computing platforms/Operating Systems Online via the HOBBIT platform
Installation requirements & dependencies
If available, link to user manual https://arxiv.org/abs/2102.06139
Support email for questions geosparql-benchmark@openlinksw.com
1. Introduction
The GeoSPARQL standard [1] defines a representation format for
geospatial data expressed in RDF [2] as part of the Semantic Web [3],
along with a set of geometry functions which work with the same
geospatial data. Since its introduction in 2012, the standard has been
* Corresponding author at: Ss. Cyril and Methodius University in Skopje, North Macedonia.
E-mail address: milos.jovanovik@finki.ukim.mk (M. Jovanovik).
https://doi.org/10.1016/j.simpa.2021.100071
Received 24 March 2021; Received in revised form 25 March 2021; Accepted 26 March 2021
2665-9638/©2021 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY license
(http://creativecommons.org/licenses/by/4.0/).
adopted by RDF triplestores to various degrees which are not necessarily apparent to the user of the individual storage system. For example, a
user may want to find out if a triplestore claiming GeoSPARQL support
provides support for various coordinate reference systems, or just for
the default world geodetic system WGS84; or whether it supports
geometries expressed in both WKT [4] and GML [5], or just one of
them, etc. To assess the state of GeoSPARQL compliance in any given
triplestore, we developed the GeoSPARQL compliance benchmark, and
this paper explains the software behind it.
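As an illustration of such a check, the following Python sketch builds a minimal probe query a user could send to a triplestore's SPARQL endpoint to see whether WKT geometry literals are actually evaluated. The geo: and geof: IRIs are those defined by the GeoSPARQL standard; the geof:sfContains probe is our own illustrative choice, not one of the benchmark's 206 queries, and the endpoint interaction itself is omitted.

```python
# Namespaces defined by the GeoSPARQL standard.
GEO = "http://www.opengis.net/ont/geosparql#"
GEOF = "http://www.opengis.net/def/function/geosparql/"

def wkt_probe() -> str:
    """Build an ASK query that checks whether WKT literals are parsed and
    evaluated: a 2x2 square trivially contains the point (1 1), so a store
    that supports geof:sfContains on geo:wktLiteral should answer true."""
    return f"""
    PREFIX geo: <{GEO}>
    PREFIX geof: <{GEOF}>
    ASK {{
      FILTER(geof:sfContains(
        "POLYGON((0 0, 0 2, 2 2, 2 0, 0 0))"^^geo:wktLiteral,
        "POINT(1 1)"^^geo:wktLiteral))
    }}"""

q = wkt_probe()
```

An analogous probe with a geo:gmlLiteral geometry would reveal whether GML serialization is supported as well; the benchmark automates exactly this kind of check across all requirements.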
2. Description
The software for the GeoSPARQL compliance benchmark1 consists
of three components: the benchmark dataset, the benchmark queries,
and the expected query answers. The benchmark dataset consists of
a set of various geometries expressed in RDF. The benchmark queries
are a set of 206 SPARQL queries which test the compliance of the
evaluated system with the 30 requirements defined in the GeoSPARQL
standard. Correctly evaluated queries contribute to a benchmark score
which states the GeoSPARQL compliance, as elaborated in [6].
To evaluate a given triplestore, the benchmark software first loads
the GeoSPARQL benchmark dataset into the triplestore. Next, the
benchmarking software reads the 206 test SPARQL queries and their
expected answers. The queries are then executed on the tested triplestore,
and their results are compared to the expected answers. After all
queries have been executed and the correctness of all retrieved results
has been assessed, the benchmark software starts an evaluation process
which generates two results:
- Correct answers: the number of correct answers out of all SPARQL queries.
- GeoSPARQL compliance percentage: the percentage of compliance with the requirements of the GeoSPARQL standard.
The former is the number of correct answers the system provided
out of the 206 test queries. The latter is calculated from the perspective
of the 30 requirements and measures the overall compliance of the
benchmarked system with the GeoSPARQL standard: it measures how
many of the 30 specified requirements the system supports, with each
requirement weighted uniformly [6].
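The two scores above can be sketched as follows, assuming (as described in [6]) that each of the 30 requirements carries uniform weight and that a requirement's contribution is the fraction of its queries answered correctly. The data layout is illustrative, not the benchmark's internal format.

```python
def benchmark_scores(results: dict[str, list[bool]]) -> tuple[int, float]:
    """results maps a requirement id to one boolean per test query
    (True = the triplestore returned the expected answer).
    Returns (number of correct answers, compliance percentage)."""
    correct = sum(sum(queries) for queries in results.values())
    # Each requirement is weighted uniformly; within a requirement, the
    # score is the fraction of its queries answered correctly.
    per_requirement = [sum(qs) / len(qs) for qs in results.values()]
    compliance = 100.0 * sum(per_requirement) / len(results)
    return correct, compliance

# Toy example with 2 requirements and 6 queries overall:
correct, compliance = benchmark_scores({
    "req-1": [True, True],                 # fully supported
    "req-2": [True, False, False, False],  # partially supported
})
# correct == 3; compliance == 62.5, i.e. (1.0 + 0.25) / 2 * 100
```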
The HOBBIT platform2 [7,8] is an open and modular platform for
benchmarking triplestores (Fig. 1): users can define and execute
benchmarks, and can also add new triplestores to be tested with the
platform benchmarks. Our GeoSPARQL compliance benchmark has been
developed for the HOBBIT platform, so it can be executed on various
RDF triplestores via the publicly available platform instance, which
ensures the reproducibility of the results.
3. Impact
The software of our benchmark allows for a precise assessment
of the GeoSPARQL compliance of RDF triplestores. It achieves this
by calculating a unified percentage score (the compliance score),
representing a relative measure of the GeoSPARQL compliance of a given
triplestore implementation. In addition, the benchmark software exposes
in detail which parts of the GeoSPARQL standard are supported,
and to what extent. In daily practice, this enables users to make
educated decisions about the choice of an RDF storage solution based on
their needs. In addition, it provides the much-needed foundation for the
1 GeoSPARQL compliance benchmark: https://github.com/OpenLinkSoftware/GeoSPARQLBenchmark.
2 Public instance of the HOBBIT platform: http://master.project-hobbit.eu.
creation of a standardized test for GeoSPARQL compliance, which may
be established by the OGC.3 For implementers, the benchmark software
provides the opportunity to improve and verify their implementations
against a standardized test framework. Hence, the very existence of the
software may lead to better support of GeoSPARQL in the future.
Our initial benchmark results show that the most commonly used
triplestores, most of which claim GeoSPARQL support, actually have
significantly varying levels of support for the standard. More strikingly,
none is fully compliant4 [6] (Fig. 2). These results serve both goals
outlined above: they provide helpful insight to potential users and to
triplestore implementers alike.
A further impact of the benchmark software is the ability to
verify support for upcoming versions of the GeoSPARQL standard. The
modular design of the software allows for the definition of new query-answer
pairs, which would test new features or requirements outlined
in future versions of the GeoSPARQL standard.
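A new query-answer pair might be represented along the following lines; this is a hypothetical structure for illustration, not the benchmark's actual serialization format, and the names are our own.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class QueryAnswerPair:
    requirement: str    # id of the (possibly future) requirement tested
    query: str          # SPARQL query executed against the tested store
    expected: frozenset # expected result rows

    def passed(self, actual_rows) -> bool:
        # SELECT results are unordered unless ORDER BY is used, so the
        # comparison is done on sets of rows rather than on lists.
        return frozenset(actual_rows) == self.expected

pair = QueryAnswerPair(
    requirement="req-future-1",
    query="SELECT ?g WHERE { ?f "
          "<http://www.opengis.net/ont/geosparql#hasGeometry> ?g }",
    expected=frozenset({("geom1",), ("geom2",)}),
)
ok = pair.passed([("geom2",), ("geom1",)])  # True: row order does not matter
```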
The software is currently used by the development team of Virtuoso5
and the development team of GeoSPARQL Fuseki6 to extend their support
for GeoSPARQL. However, as the benchmark results have already
shown, the benchmark is also of interest to a variety of companies
in the business of semantic storage solutions, such as the
companies behind GraphDB,7 AllegroGraph,8 AnzoGraph,9 etc., and to
a variety of open source projects striving for compliance with the
GeoSPARQL standard, e.g. RDF4J,10 Blazegraph,11 etc. They can use the
benchmark to track the progress of their GeoSPARQL support and
compare their systems to competitors. In the long run, with the volume
of geospatial data on the Semantic Web increasing and the ongoing
work to extend the GeoSPARQL standard [9,10], we expect additional
companies developing triplestores to adopt our test for measuring the
extent of their GeoSPARQL compliance. To make vendors aware of the
upcoming updates of the GeoSPARQL standard and to raise awareness
of GeoSPARQL in general, the GeoSPARQL working group has already
compiled a list of potential future implementers.12
The benchmark software has also been used for the research and
analysis in our publication which compares the GeoSPARQL compliance
scores of the most commonly used RDF triplestores [6].
4. Conclusions and future work
This publication introduced the software of the GeoSPARQL compliance
benchmark, which is used to test triplestores for GeoSPARQL
compliance. It fills an important gap in the geospatial domain of
the Semantic Web by allowing both triplestore users and triplestore
developers to gain an overview of, and insight into, the GeoSPARQL
support of the available RDF storage solutions. Given that a successor
to the GeoSPARQL standard, GeoSPARQL 2.0, is currently in
development [9,10], the modular design of the benchmark will allow for a
future extension which will also cover tests for the new capabilities
introduced by the next version.
3 OGC Compliance Program: https://www.ogc.org/compliance.
4 Results from the GeoSPARQL compliance benchmark: https://master.project-hobbit.eu/experiments/1612476122572, 1612477003063, 1612476116049, 1612477500164, 1612661614510, 1612637531673, 1612828110551, 1612477849872.
5 Virtuoso: https://virtuoso.openlinksw.com.
6 GeoSPARQL Fuseki: https://jena.apache.org/documentation/geosparql/geosparql-fuseki.
7 GraphDB: https://www.ontotext.com/products/graphdb/.
8 AllegroGraph: https://allegrograph.com.
9 AnzoGraph: https://www.cambridgesemantics.com/anzograph/.
10 RDF4J: https://rdf4j.org.
11 Blazegraph: https://blazegraph.com.
12 GeoSPARQL implementers: https://github.com/opengeospatial/ogc-geosparql/issues/59.
Fig. 1. The HOBBIT benchmarking platform.
Fig. 2. Results from the GeoSPARQL compliance benchmark, from the public instance of the HOBBIT platform.
Declaration of competing interest
The authors declare the following financial interests/personal rela-
tionships which may be considered as potential competing interests:
Milos Jovanovik and Mirko Spasić work for OpenLink Software which
is the vendor of Virtuoso – one of the benchmarked triplestores featured
in the results shown in Fig. 2.
Acknowledgment
This work has been partially supported by Eurostars Project SAGE
(GA no. E!10882).
References
[1] Open Geospatial Consortium, et al., OGC GeoSPARQL - A geographic query language for
RDF data, OGC Candidate Implementation Standard (2012).
[2] R. Cyganiak, D. Wood, M. Lanthaler, G. Klyne, J.J. Carroll, B. McBride, RDF
1.1 concepts and abstract syntax, 2014, W3C Recommendation. https://www.w3.org/TR/rdf11-concepts.
[3] T. Berners-Lee, J. Hendler, O. Lassila, The Semantic Web, Sci. Am. 284 (5) (2001)
34–43.
[4] J. Herring, et al., Simple feature access - Part 1: Common architecture, Open
Geospatial Consortium, 2011.
[5] C. Portele, OGC implementation specification 07-036: OpenGIS geography
markup language (GML) encoding standard, Open Geospatial Consortium, 2007.
[6] M. Jovanovik, T. Homburg, M. Spasić, A GeoSPARQL compliance benchmark,
2021, arXiv:2102.06139.
[7] A.-C.N. Ngomo, M. Röder, HOBBIT: Holistic benchmarking for Big Linked Data,
ERCIM News (105) (2016).
[8] M. Röder, D. Kuchelev, A.-C. Ngonga Ngomo, HOBBIT: A platform for
benchmarking Big Linked Data, Data Science 3 (1) (2020) 15–35.
[9] J. Abhayaratna, L. van den Brink, N. Car, R. Atkinson, T. Homburg, F. Knibbe,
K. McGlinn, A. Wagner, M. Bonduel, M. Holten Rasmussen, F. Thiery, OGC
Benefits of representing spatial data using semantic and graph technologies, Open
Geospatial Consortium, 2020, http://docs.ogc.org/wp/19-078r1/19-078r1.html.
[10] J. Abhayaratna, L. van den Brink, N. Car, T. Homburg, F. Knibbe, OGC
GeoSPARQL 2.0 SWG charter, Open Geospatial Consortium, 2020, https://github.com/opengeospatial/geosemantics-dwg/tree/master/geosparql_2.0_swg_charter.