Software Impacts 8 (2021) 100071
Contents lists available at ScienceDirect
Software Impacts
journal homepage: www.journals.elsevier.com/software-impacts
Original software publication
Software for the GeoSPARQL compliance benchmark
Milos Jovanovik a,d,∗, Timo Homburg b, Mirko Spasić c,d
a Ss. Cyril and Methodius University in Skopje, North Macedonia
b Mainz University of Applied Sciences, Germany
c University of Belgrade, Serbia
d OpenLink Software, London, UK
ARTICLE INFO
Keywords:
GeoSPARQL
Benchmarking
Compliance
RDF triplestores
ABSTRACT
Checking the compliance of geospatial triplestores with the GeoSPARQL standard represents a crucial step
for many users when selecting the appropriate storage solution. This publication presents the software
which comprises the GeoSPARQL compliance benchmark — a benchmark which checks RDF triplestores for
compliance with the requirements of the GeoSPARQL standard. Users can execute this benchmark within the
HOBBIT benchmarking platform to quantify the extent to which the GeoSPARQL standard is implemented
within the triplestore of interest. This enables users to make an informed decision when choosing an RDF
storage solution and helps assess the general state of adoption of geospatial technologies on the Semantic
Web.
Code metadata
Current code version: 2.1
Permanent link to code/repository used for this code version: https://github.com/SoftwareImpacts/SIMPAC-2021-29
Permanent link to Reproducible Capsule: (none given)
Legal Code License: GNU General Public License
Code versioning system used: Git
Software code languages, tools, and services used: Java, Docker
Compilation requirements, operating environments & dependencies: Maven
If available, link to developer documentation/manual: https://github.com/OpenLinkSoftware/GeoSPARQLBenchmark
Support email for questions: geosparql-benchmark@openlinksw.com
Software metadata
Current software version: 2.1
Permanent link to executables of this version: https://github.com/OpenLinkSoftware/GeoSPARQLBenchmark/releases/tag/v2.1
Permanent link to Reproducible Capsule: (none given)
Legal Software License: GNU General Public License
Computing platforms/Operating Systems: Online via the HOBBIT platform
Installation requirements & dependencies: (none given)
If available, link to user manual (if formally published, include a reference to the publication in the reference list): https://arxiv.org/abs/2102.06139
Support email for questions: geosparql-benchmark@openlinksw.com
∗ Corresponding author at: Ss. Cyril and Methodius University in Skopje, North Macedonia.
E-mail address: milos.jovanovik@finki.ukim.mk (M. Jovanovik).
https://doi.org/10.1016/j.simpa.2021.100071
Received 24 March 2021; Received in revised form 25 March 2021; Accepted 26 March 2021
2665-9638/© 2021 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY license
(http://creativecommons.org/licenses/by/4.0/).
1. Introduction
The GeoSPARQL standard [1] defines a representation format for
geospatial data expressed in RDF [2] as part of the Semantic Web [3],
along with a set of geometry functions which work with the same
geospatial data. Since its introduction in 2012, the standard has been
adopted by RDF triplestores to varying degrees, which are not necessarily
apparent to the users of an individual storage system. For example, a
user may want to find out whether a triplestore claiming GeoSPARQL support
provides support for various coordinate reference systems, or just for
the default world geodetic system WGS84; or whether it supports
geometries expressed in both WKT [4] and GML [5], or in just one of
them. To assess the state of GeoSPARQL compliance in any given
triplestore, we developed the GeoSPARQL compliance benchmark, and
this paper explains the software behind it.
2. Description
The software for the GeoSPARQL compliance benchmark¹ consists
of three components: the benchmark dataset, the benchmark queries,
and the expected query answers. The benchmark dataset consists of
a set of various geometries expressed in RDF. The benchmark queries
represent a set of 206 SPARQL queries which aim to test and check for
the compliance of the tested system with the 30 requirements defined
in the GeoSPARQL standard. Correctly evaluated queries contribute
to a benchmark score which states the GeoSPARQL compliance as
elaborated in [6].
To evaluate a given triplestore, the benchmark software first loads
the GeoSPARQL benchmark dataset into the triplestore. Next, the
benchmarking software reads the 206 test SPARQL queries and their
expected answers. The queries are then executed on the tested triplestore,
and the returned results are compared to the expected answers. After all
queries have been executed and the correctness of all retrieved results
has been assessed, the benchmark software starts an evaluation process
which generates two results:
• Correct answers: The number of correct answers out of all
SPARQL queries.
• GeoSPARQL compliance percentage: The percentage of compliance
with the requirements of the GeoSPARQL standard.
The former is the number of correct answers the system provided
out of the 206 test queries. The latter is calculated from the perspective
of the 30 requirements and measures the overall compliance of the
benchmarked system with the GeoSPARQL standard. It reports the share
of the 30 specified requirements that the system supports, with every
requirement carrying the same weight [6].
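The evaluation step described above can be sketched roughly as follows. This is a minimal illustration and not the benchmark's actual code: the class, method and field names are hypothetical, the triplestore is stood in for by a plain function from query text to a set of result rows, and rows are modelled as strings in a set so that the comparison ignores row order. The sketch also assumes that a requirement tested by several queries earns partial credit in proportion to the fraction of those queries answered correctly; the text above states only that the requirements are weighted uniformly.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

// Illustrative sketch only: every name here is hypothetical and does not
// mirror the benchmark's actual code.
final class ComplianceScoreSketch {

    /** One test query: the requirement it checks, its SPARQL text, its expected rows. */
    static final class TestQuery {
        final int requirement;
        final String sparql;
        final Set<String> expectedRows;

        TestQuery(int requirement, String sparql, Set<String> expectedRows) {
            this.requirement = requirement;
            this.sparql = sparql;
            this.expectedRows = expectedRows;
        }
    }

    /** The two reported results: correct answers and compliance percentage. */
    static final class Result {
        final int correctAnswers;
        final double compliancePercentage;

        Result(int correctAnswers, double compliancePercentage) {
            this.correctAnswers = correctAnswers;
            this.compliancePercentage = compliancePercentage;
        }
    }

    /**
     * Runs every query through the executor (a stand-in for the tested
     * triplestore) and compares the returned rows with the expected rows.
     */
    static Result evaluate(List<TestQuery> queries,
                           Function<String, Set<String>> executor,
                           int totalRequirements) {
        int correct = 0;
        // Per requirement: index 0 = queries passed, index 1 = queries run.
        Map<Integer, int[]> perRequirement = new LinkedHashMap<>();
        for (TestQuery query : queries) {
            boolean ok;
            try {
                ok = query.expectedRows.equals(executor.apply(query.sparql));
            } catch (RuntimeException e) {
                ok = false; // a query that fails to execute counts as incorrect
            }
            int[] counts = perRequirement.computeIfAbsent(query.requirement, k -> new int[2]);
            if (ok) {
                correct++;
                counts[0]++;
            }
            counts[1]++;
        }
        // Uniform weighting: each requirement is worth 100 / totalRequirements.
        // Assumption: partial credit is proportional to the queries passed.
        double percentage = 0.0;
        for (int[] counts : perRequirement.values()) {
            percentage += (100.0 / totalRequirements) * counts[0] / counts[1];
        }
        return new Result(correct, percentage);
    }
}
```

With 30 requirements, each one is worth 100/30 ≈ 3.33 percentage points under this weighting, so a system that fully supports 15 of them and none of the others would score 50%.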
The HOBBIT platform² [7,8] is an open and modular platform for
benchmarking triplestores (Fig. 1). Users can define and execute
benchmarks, and they can also add new triplestores to be tested with the
platform's benchmarks. Our GeoSPARQL compliance benchmark has been
developed for the HOBBIT platform, which means that the benchmark
can be executed via the publicly available platform on various RDF
triplestores, which makes the results reproducible.
¹ GeoSPARQL compliance benchmark: https://github.com/OpenLinkSoftware/GeoSPARQLBenchmark.
² Public instance of the HOBBIT platform: http://master.project-hobbit.eu.
3. Impact
The software of our benchmark allows for a precise assessment
of the GeoSPARQL compliance of RDF triplestores. It achieves this
by calculating a unified percentage score, the compliance score,
which represents a relative measure of the GeoSPARQL compliance of a
given triplestore implementation. In addition, the benchmark software
exposes in detail which parts of the GeoSPARQL standard are supported
and to what extent. In daily practice, this enables users to make
educated decisions about the choice of an RDF storage solution based on
their needs. It also provides the much-needed foundation for the
creation of a standardized test for GeoSPARQL compliance, which may
be established by the OGC.³ For implementers, the benchmark software
provides the opportunity to improve and verify their implementations
against a standardized test framework. Hence, the very existence of the
software may lead to better support of GeoSPARQL in the future.
Our initial benchmark results show that the most commonly used
triplestores, most of which claim GeoSPARQL support, actually have
significantly varying levels of support for the standard. More strikingly,
none has full compliance⁴ [6] (Fig. 2). These results work toward both
goals outlined above: they provide helpful insight to potential
users as well as to triplestore implementers.
The benchmark software can also be used to verify support for
upcoming versions of the GeoSPARQL standard. The modular design of
the software allows for the definition of new query-answer pairs,
which would test new features or requirements outlined
in future versions of the GeoSPARQL standard.
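To illustrate why a design built on query-answer pairs is easy to extend, the following minimal sketch pairs query files with answer files purely by name, so that covering a new requirement would only mean adding two files. The naming scheme (`query-r<requirement>-<variant>.rq` for queries, `.srx` for expected answers) and all class and method names are assumptions made for this example and are not taken from the benchmark's repository.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch only: the file naming scheme and all names below are
// hypothetical, chosen to show the idea of pluggable query-answer pairs.
final class QueryAnswerPairsSketch {

    // Assumed scheme: query-r<requirement>-<variant>.rq (query) / .srx (answer).
    private static final Pattern NAME =
            Pattern.compile("query-r(\\d{1,3})-(\\d{1,3})\\.(rq|srx)");

    /** A query file matched with its expected-answer file. */
    static final class Pair {
        final int requirement;
        final String queryFile;
        final String answerFile;

        Pair(int requirement, String queryFile, String answerFile) {
            this.requirement = requirement;
            this.queryFile = queryFile;
            this.answerFile = answerFile;
        }
    }

    /** Matches query and answer files by name; incomplete pairs are skipped. */
    static List<Pair> pair(List<String> fileNames) {
        // Zero-padded keys keep the pairs sorted by requirement, then variant.
        Map<String, String[]> byKey = new TreeMap<>();
        for (String name : fileNames) {
            Matcher m = NAME.matcher(name);
            if (!m.matches()) {
                continue; // ignore files outside the assumed naming scheme
            }
            String key = String.format("%03d-%03d",
                    Integer.parseInt(m.group(1)), Integer.parseInt(m.group(2)));
            String[] slot = byKey.computeIfAbsent(key, k -> new String[2]);
            slot[m.group(3).equals("rq") ? 0 : 1] = name;
        }
        List<Pair> pairs = new ArrayList<>();
        for (Map.Entry<String, String[]> entry : byKey.entrySet()) {
            String[] slot = entry.getValue();
            if (slot[0] != null && slot[1] != null) {
                int requirement = Integer.parseInt(entry.getKey().split("-")[0]);
                pairs.add(new Pair(requirement, slot[0], slot[1]));
            }
        }
        return pairs;
    }
}
```

Under this scheme, tests for a hypothetical requirement 31 would be picked up as soon as `query-r31-1.rq` and `query-r31-1.srx` are present, with no change to the evaluation code.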
The software is currently used by the development team of Virtuoso⁵
and the development team of GeoSPARQL Fuseki⁶ to extend their
support for GeoSPARQL. However, as the benchmark results have already
shown, the benchmark is also of interest to a variety of companies
in the business of semantic storage solutions, such as the
companies behind GraphDB,⁷ AllegroGraph,⁸ AnzoGraph,⁹ etc., and
to a variety of open source projects striving for compliance with the
GeoSPARQL standard, e.g. RDF4J,¹⁰ Blazegraph,¹¹ etc. They can use the
benchmark to track the progress of their support for the GeoSPARQL
standard and to compare their systems to those of competitors. In the
long run, with the volume of geospatial data on the Semantic Web increasing and
the ongoing development to extend the GeoSPARQL standard [9,10],
we expect additional companies developing triplestores to adopt our
test for measuring the extent of their GeoSPARQL compliance. To make
vendors aware of the upcoming updates of the GeoSPARQL standard and
to raise awareness of GeoSPARQL in general,
the GeoSPARQL working group has already created a list of potential
future implementers.¹²
The software of the benchmark has been used for the research and
analysis outlined in our publication which compares the GeoSPARQL
compliance scores of the most commonly used RDF triplestores [6].
4. Conclusions and future work
This publication introduced the software of the GeoSPARQL compliance
benchmark, which is used to test triplestores for GeoSPARQL
compliance. It fills an important gap in the geospatial domain of
the Semantic Web by giving both triplestore users and triplestore
developers an overview of, and insight into, the GeoSPARQL
support of the available RDF storage solutions. Given that a successor
to the GeoSPARQL standard, GeoSPARQL 2.0, is currently in development
[9,10], the modular design of the benchmark will allow for a
future extension that includes tests for the new capabilities
introduced by the next version.
³ OGC Compliance Program: https://www.ogc.org/compliance.
⁴ Results from the GeoSPARQL compliance benchmark: https://master.project-hobbit.eu/experiments/1612476122572,1612477003063,1612476116049,1612477500164,1612661614510,1612637531673,1612828110551,1612477849872.
⁵ Virtuoso: https://virtuoso.openlinksw.com.
⁶ GeoSPARQL Fuseki: https://jena.apache.org/documentation/geosparql/geosparql-fuseki.
⁷ GraphDB: https://www.ontotext.com/products/graphdb/.
⁸ AllegroGraph: https://allegrograph.com.
⁹ AnzoGraph: https://www.cambridgesemantics.com/anzograph/.
¹⁰ RDF4J: https://rdf4j.org.
¹¹ Blazegraph: https://blazegraph.com.
¹² GeoSPARQL implementers: https://github.com/opengeospatial/ogc-geosparql/issues/59.
Fig. 1. The HOBBIT benchmarking platform.
Fig. 2. Results from the GeoSPARQL compliance benchmark, from the public instance of the HOBBIT platform.
Declaration of competing interest
The authors declare the following financial interests/personal
relationships which may be considered as potential competing interests:
Milos Jovanovik and Mirko Spasić work for OpenLink Software which
is the vendor of Virtuoso – one of the benchmarked triplestores featured
in the results shown in Fig. 2.
Acknowledgment
This work has been partially supported by Eurostars Project SAGE
(GA no. E!10882).
References
[1] O.G. Consortium, et al., OGC GeoSPARQL - A geographic query language for
RDF data, OGC Candidate Implementat. Stand. (2012).
[2] R. Cyganiak, D. Wood, M. Lanthaler, G. Klyne, J.J. Carroll, B. McBride, RDF
1.1 concepts and abstract syntax, 2014, W3C Recommendation.
https://www.w3.org/TR/rdf11-concepts.
[3] T. Berners-Lee, J. Hendler, O. Lassila, The Semantic Web, Sci. Am. 284 (5) (2001)
34–43.
[4] J. Herring, et al., Simple feature access - Part 1: Common architecture, Open
Geospatial Consortium, 2011.
[5] C. Portele, OGC implementation specification 07-036: OpenGIS geography
markup language (GML) encoding standard, Open Geospatial Consortium, 2007.
[6] M. Jovanovik, T. Homburg, M. Spasić, A GeoSPARQL compliance benchmark,
2021, arXiv:2102.06139.
[7] A.-C.N. Ngomo, M. Röder, HOBBIT: Holistic benchmarking for Big Linked Data,
ERCIM News (105) (2016).
[8] M. Röder, D. Kuchelev, A.-C. Ngonga Ngomo, HOBBIT: A platform for
benchmarking Big Linked Data, Data Science 3 (1) (2020) 15–35.
[9] J. Abhayaratna, L. van den Brink, N. Car, R. Atkinson, T. Homburg, F. Knibbe,
K. McGlinn, A. Wagner, M. Bonduel, M. Holten Rasmussen, F. Thiery, OGC
Benefits of representing spatial data using semantic and graph technologies, Open
Geospatial Consortium, 2020, http://docs.ogc.org/wp/19-078r1/19-078r1.html.
[10] J. Abhayaratna, L. van den Brink, N. Car, T. Homburg, F. Knibbe, OGC
GeoSPARQL 2.0 SWG charter, Open Geospatial Consortium, 2020,
https://github.com/opengeospatial/geosemantics-dwg/tree/master/geosparql_2.0_swg_charter.