A GEOSPARQL COMPLIANCE BENCHMARK
A PREPRINT
Milos Jovanovik
Ss. Cyril and Methodius University in Skopje, N. Macedonia
OpenLink Software, London, UK
milos.jovanovik@finki.ukim.mk
Timo Homburg
Mainz University Of Applied Sciences, Germany
timo.homburg@hs-mainz.de
Mirko Spasić
University of Belgrade, Serbia
OpenLink Software, London, UK
mirko@matf.bg.ac.rs
February 12, 2021
ABSTRACT
We propose a series of tests that check for the compliance of RDF triplestores with the GeoSPARQL
standard. The purpose of the benchmark is to test how many of the requirements outlined in the
standard a tested system supports and to push triplestores forward in achieving a full GeoSPARQL
compliance. This topic is of concern because the support of GeoSPARQL varies greatly between
different triplestore implementations, and such support is of great importance for the domain of
geospatial RDF data. Additionally, we present a comprehensive comparison of triplestores, providing
an insight into their current GeoSPARQL support.
Keywords GeoSPARQL · Benchmarking · RDF · SPARQL
1 Introduction
The geospatial Semantic Web [1] as part of the Semantic Web [2] represents an ever-growing semantically interpreted
wealth of geospatial information. The initial research [3] and the subsequent introduction of the OGC GeoSPARQL
standard [4] formalized geospatial vector data representations (WKT [5] and GML [6]) in ontologies, and extended the
SPARQL query language [7] with support for spatial relation operators.
Several RDF storage solutions have since adopted GeoSPARQL to various extents as features of their triplestore
implementations [8, 9]. These varying levels of implementation may lead users to false assumptions when
choosing an appropriate triplestore implementation for their project. For example, some implementations allow for
defining a coordinate reference system (CRS) [10] in a given WKT geometry literal as stated in the GeoSPARQL standard
(e.g. GraphDB). Other implementations do not allow a CRS definition and instead only support the world geodetic
system WGS84 (e.g. RDF4J) [11]. Such implementations, even though incomplete according to the GeoSPARQL
standard, still cover many geospatial use-cases and can be useful in many scenarios. However, they are not useful, for
example, for a geospatial authority that needs to work with many different coordinate system definitions.
The requirements of GeoSPARQL-compliant triplestores have been clearly spelled out in the GeoSPARQL standard [4].
However, the Semantic Web and GIS communities lack a compliance test suite for GeoSPARQL, which we contribute in
this publication. We hope that our contribution may be added to the list of OGC conformance tests¹, as they lack a
suitable test suite for GeoSPARQL.
¹ OGC Test Suites: https://cite.opengeospatial.org/teamengine/
arXiv:2102.06139v1 [cs.DB] 11 Feb 2021
A PREPRINT - FEBRUARY 12, 2021
Our paper is organized as follows. In Section 2 we discuss existing approaches that worked towards evaluating
geospatial triplestores. Section 3 introduces the test framework of the benchmark and describes how the compliance
tests were implemented. Section 4 describes the application of the defined test framework against different triplestore
implementations, and we discuss the results in Section 5. In Section 6 we lay out the limitations of our approach, before
concluding the work in Section 7.
2 Related Work
Most standards define requirements which need to be fulfilled to satisfy the standard definition. However, not all
standards expose explicit descriptions on how to test compliance with their requirements or a test suite that tests the
overall compliance to the standard.
GeoSPARQL [4], as an extension of the SPARQL [7] query language, defines an ontology model to represent vector
geometries, their relations and serializations in WKT and GML, a set of geometry filter functions, an RDFS entailment
extension, a query rewrite extension to simplify geospatial queries and further geometry manipulation functions.
First, it is important that we distinguish between performance benchmarks and compliance benchmarks. Performance
benchmarks try to evaluate the performance of a system, usually by employing a set of queries. Performance benchmarks
may also consider semantically equivalent implementations that do not follow the syntax specified by a given
standard. On the other hand, compliance benchmarks are not concerned with the efficiency or overall performance of a
system, but rather with its ability to fulfill certain requirements.
Several benchmark implementations targeting geospatial triplestores, such as the Geographica Series [12, 13] or [9],
try to evaluate the performance of geospatial function implementations. Both approaches originate from the Linked
Data community. Additionally, [14] shows that the geospatial community is interested in benchmarking geospatial
triplestores, as well. Their benchmark includes a newly created dataset and tests GeoSPARQL filter functions. While
the aforementioned benchmarks might reveal if functions are implemented, they do not necessarily reveal an incorrect
implementation of a given function.
The Tests for Triplestores (TFT) benchmark [15] includes a GeoSPARQL subtest. However, the subtest used here is
based on the six example SPARQL queries and the example dataset defined in Annex B of the GeoSPARQL standard
[4]. Although these examples are a good starting point, they are of informative nature and are intended as guidelines.
Therefore, any benchmark based solely on them does not even begin to cover all possible requirements or the multiple
ways in which they have to be tested, in order for a system to be deemed as compliant with the standard.
Recently, the EuroSDR group reused the benchmark implementation of [14] to implement a small GeoSPARQL
compliance benchmark². This compliance benchmark consists of 27 queries testing a selection of GeoSPARQL
functions on a test dataset. In contrast to our benchmark, this implementation does not explicitly test all requirements
defined in the GeoSPARQL standard. In particular, GML support, RDFS entailment support and the query rewrite
extension, among others, have not been tested in this benchmark.
3 GeoSPARQL Compliance Benchmark
The GeoSPARQL compliance benchmark is based on the requirements defined in the GeoSPARQL standard [4]. The
30 requirements defined in the standard are grouped into 6 categories and refer to the core GeoSPARQL ontology model
and a set of extensions which systems need to implement, and which need to be tested in our benchmark:
1. Core component (CORE): Defines the top-level spatial vocabulary components (Requirements 1 - 3)
2. Topology vocabulary extension (TOP): Defines the topological relation vocabulary (Requirements 4 - 6)
3. Geometry extension (GEOEXT): Defines the geometry vocabulary and non-topological query functions
(Requirements 7 - 20)
4. Geometry topology extension (GTOP): Defines topological query functions for geometry objects (Requirements
21 - 24)
5. RDFS entailment extension (RDFSE): Defines a mechanism for matching implicit (inferred) RDF triples that
are derived based on RDF and RDFS semantics, i.e. derived from RDFS reasoning (Requirements 25 - 27)
6. Query rewrite extension (QRW): Defines query transformation rules for computing spatial relations between
spatial objects based on their associated geometries (Requirements 28 - 30)
² EuroSDR GeoSPARQL Test: https://data.pldn.nl/eurosdr/geosparql-test
Each of the specified requirements may be tested using a set of guidelines which are loosely defined in the abstract test
suite in Annex A of the GeoSPARQL standard [4]. While the abstract test suite defines the test purpose, method and
type to verify if a specific requirement has been fulfilled, it does not define a concrete set of SPARQL queries and a test
dataset which may be used for reference. We contribute the test dataset and the set of SPARQL queries to verify each
requirement in this publication.
In the GeoSPARQL compliance benchmark, each requirement is tested by one or more SPARQL queries, where there is
a single expected answer or a set of expected answers. The number of queries used to test a requirement, as well as the
number of expected answers per query, depends on the nature of the requirement. For some of them, it is sufficient to
have a single query and a single expected answer to test whether the system under test complies with it. In contrast,
other requirements have sub-requirements – for example, requirements which refer to multiple properties or functions,
requirements referring to functions which can be used with geometries with different serializations, or requirements
which need a broader coverage of cases, to make sure they are fully met. In these cases, multiple queries are used.
Multiple logically equivalent expected answers are used when the answer of a SPARQL query can be technically
expressed in different formats or literal serializations.
This approach of using queries and expected answers as tests allows us to measure the compliance of any RDF storage
system by using the HOBBIT benchmarking platform [16, 17].
The output of the benchmark is a percentage which measures the overall compliance of the tested system with the
GeoSPARQL standard. It measures the number of supported requirements of the system, out of the 30 specified
requirements, as a percentage.
3.1 Benchmark Dataset
The GeoSPARQL standard defines an example dataset for testing in its Annex B [4], which can be used with the set of
6 example test queries defined in the same section. This example dataset contains 6 geometries. We wanted to use this
dataset, but given that we aimed to test all requirements of the standard, we had to substantially extend the dataset both
with new geometries and additional properties of the existing geometries. Figure 1 shows the geometries included in
our extended dataset.
The extended benchmark dataset contains 13 geometries of Polygon, Point and LineString types, all expressed as
both WKT and GML literals. The total size of the RDF dataset is over 300 triples. The dataset is available as part of the
benchmark code [18], in RDF/XML, GeoJSON [19] and GML representations.
3.2 Benchmark Queries
We provide here an overview of the approach we took in writing the queries used by the benchmark to test the
requirements of the GeoSPARQL standard. The requirements are presented in the order of the GeoSPARQL extension
definitions given in Section 3. The benchmark queries are available as part of the benchmark code [18]. The details
about how each test and sub-test is scored are presented in Section 3.3.
Req. 1: Implementations shall support the SPARQL Query Language for RDF [7], the SPARQL Protocol for RDF
[20] and the SPARQL Query Results XML Format [21].
We test requirement 1 with a single, basic SPARQL query which selects the first triple where geometry A is the subject.
To get consistent results across different systems, we have to use a specific subject and have to order the results.
Req. 2: Implementations shall allow the RDFS [22] class geo:SpatialObject to be used in SPARQL graph
patterns.
Req. 3: Implementations shall allow the RDFS class geo:Feature to be used in SPARQL graph patterns.
Requirements 2 and 3 are tested with single SPARQL queries, which select the first entity of type geo:SpatialObject
and geo:Feature, respectively. In order to get consistent results for both queries across different systems, we order
the results.
Req. 4: Implementations shall allow the properties geo:sfEquals, geo:sfDisjoint, geo:sfIntersects,
geo:sfTouches, geo:sfCrosses, geo:sfWithin, geo:sfContains, geo:sfOverlaps to be used in
SPARQL graph patterns.
Req. 5: Implementations shall allow the properties geo:ehEquals, geo:ehDisjoint, geo:ehMeet,
geo:ehOverlap, geo:ehCovers, geo:ehCoveredBy, geo:ehInside, geo:ehContains to be used in
SPARQL graph patterns.
Figure 1: Abstract view of the geometries which are part of the benchmark dataset. Geometries A, B, C, D, G, J and
K represent Polygon geometries and (aside from J and K) all have a center Point geometry, as well. Geometry E
represents a LineString geometry, while geometries F, L and M represent Point geometries. Geometries H and I are
empty geometries and not visible in this figure. All geometries are represented in the CRS84 geodetic system, except
for geometry M which is represented in EPSG:4326. Each geometry is represented both using WKT and GML literals.
Req. 6: Implementations shall allow the properties geo:rcc8eq, geo:rcc8dc, geo:rcc8ec, geo:rcc8po,
geo:rcc8tppi, geo:rcc8tpp, geo:rcc8ntpp, geo:rcc8ntppi to be used in SPARQL graph patterns.
We test each of the requirements 4, 5 and 6 with eight different queries, to test the sub-requirements for each property
specified. Since the queries for requirements 28, 29 and 30 require the use of these same properties to test the system’s
compliance to the GeoSPARQL RIF [23] rules, we use an approach where the explicit RDF triples needed to test
requirements 4, 5 and 6 involve geometries which are the top result when using the ordering of the query results. For
this purpose, the queries for requirements 4, 5, and 6 order the results, and select the top result only, to ensure they test
the existence of the explicit and materialized RDF triple in the dataset.
Req. 7: Implementations shall allow the RDFS class geo:Geometry to be used in SPARQL graph patterns.
Req. 8: Implementations shall allow the properties geo:hasGeometry and geo:hasDefaultGeometry to be used
in SPARQL graph patterns.
Req. 9: Implementations shall allow the properties geo:dimension, geo:coordinateDimension,
geo:spatialDimension, geo:isEmpty, geo:isSimple, geo:hasSerialization to be used in
SPARQL graph patterns.
The tests for requirements 7, 8 and 9 are done by selecting all entities of type geo:Geometry (Req. 7), or by selecting
the object/value of geometry A denoted by the property in question (Req. 8 and 9). Since requirement 8 specifies two
distinct properties, and requirement 9 specifies six such properties, the tests for these requirements consist of two and
six queries, respectively.
Req. 10: All RDFS Literals of type geo:wktLiteral shall consist of an optional URI identifying the coordinate
reference system followed by Simple Features Well Known Text (WKT) describing a geometric value. Valid
geo:wktLiteral instances are formed by concatenating a valid, absolute URI as defined in [24], one or more
spaces (Unicode U+0020 character) as a separator, and a WKT string as defined in Simple Features [25].
We test requirement 10 by selecting and checking the datatype of a correctly defined WKT literal from the dataset, to
make sure the system under test supports the specified format of WKT literals and their datatype.
Req. 11: URI <http://www.opengis.net/def/crs/OGC/1.3/CRS84> shall be assumed as the spatial reference
system for geo:wktLiterals that do not specify an explicit spatial reference system URI.
We test requirement 11 by first defining two geometries in the dataset: J and K, which represent the same polygon, but
geometry K has a WKT literal with an explicitly specified reference system, while geometry J does not contain the URI
and only contains the polygon points in the literal value:
J: Polygon((-77.089005 38.913574, -77.029953 38.913574, -77.029953 38.886321,
-77.089005 38.886321, -77.089005 38.913574))
K: <http://www.opengis.net/def/crs/OGC/1.3/CRS84>
Polygon((-77.089005 38.913574, -77.029953 38.913574, -77.029953 38.886321,
-77.089005 38.886321, -77.089005 38.913574))
Then, we test whether these two geometries, i.e. their corresponding WKT literals, are geometrically equal. This ensures
that a correct answer to this test means that the underlying system assumes CRS84 as the default spatial reference
system for WKT literals which do not specify one explicitly.
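The default-CRS rule behind this test can be illustrated with a minimal Python sketch. It is not part of the benchmark code: the helper name `split_wkt_literal` is hypothetical, and the sketch compares literal strings, whereas a triplestore would compare the geometries themselves. It only shows how a literal without a URI resolves to CRS84.

```python
import re

# Default CRS mandated by requirement 11 of the GeoSPARQL standard.
CRS84 = "http://www.opengis.net/def/crs/OGC/1.3/CRS84"

# Optional "<uri>" prefix, then whitespace, then the WKT body.
_WKT_LITERAL = re.compile(r"^\s*(?:<([^>]+)>\s+)?(.*)$", re.DOTALL)


def split_wkt_literal(literal):
    """Return (crs_uri, wkt_body); hypothetical helper for illustration only."""
    match = _WKT_LITERAL.match(literal)
    crs_uri, body = match.group(1), match.group(2)
    # Collapse runs of whitespace so line breaks in the literal do not matter.
    body = " ".join(body.split())
    # Fall back to CRS84 when the literal carries no explicit CRS URI.
    return (crs_uri or CRS84, body)


geometry_j = ("Polygon((-77.089005 38.913574, -77.029953 38.913574, "
              "-77.029953 38.886321, -77.089005 38.886321, -77.089005 38.913574))")
geometry_k = "<" + CRS84 + "> " + geometry_j

# Both literals resolve to the same CRS and the same WKT body.
same = split_wkt_literal(geometry_j) == split_wkt_literal(geometry_k)
print(same)
```

A system that instead assumed, say, EPSG:4326 for geometry J would place the two polygons at different locations, and the equality test would fail.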
Req. 12: Coordinate tuples within geo:wktLiterals shall be interpreted using the axis order defined in the spatial
reference system used.
In order to test requirement 12, we define two new geometries in the dataset: L and M, which represent the same point.
Geometry L has a WKT literal which specifies the point using the CRS84 coordinate system, while geometry M uses
the EPSG:4326 coordinate system [26]. Compared to one another, these coordinate systems use an inverted axis order:
L: <http://www.opengis.net/def/crs/OGC/1.3/CRS84> Point(-88.38 31.95)
M: <http://www.opengis.net/def/crs/EPSG/0/4326> Point( 31.95 -88.38)
In order to test whether the system interprets the axis order correctly, i.e. according to the spatial reference system, we
test if the two geometries are equal based on the system under test.
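As a rough illustration of what a compliant system has to do here, the following Python sketch maps both literals to a common (longitude, latitude) tuple before comparing them. The helper `point_lon_lat` and the small axis-order table are assumptions made for this example only; a real implementation would take the axis order from the CRS definition.

```python
import re

CRS84 = "http://www.opengis.net/def/crs/OGC/1.3/CRS84"
EPSG_4326 = "http://www.opengis.net/def/crs/EPSG/0/4326"

# Axis order per CRS: CRS84 is longitude-latitude, EPSG:4326 is latitude-longitude.
AXIS_ORDER = {CRS84: ("lon", "lat"), EPSG_4326: ("lat", "lon")}

# Optional "<uri>" prefix followed by a two-dimensional Point.
_POINT = re.compile(
    r"^\s*(?:<([^>]+)>\s+)?Point\s*\(\s*([-+.\dEe]+)\s+([-+.\dEe]+)\s*\)\s*$",
    re.IGNORECASE,
)


def point_lon_lat(literal):
    """Parse a Point wktLiteral into (lon, lat); hypothetical helper."""
    match = _POINT.match(literal)
    if match is None:
        raise ValueError("not a simple Point literal: %r" % literal)
    crs = match.group(1) or CRS84  # default CRS from requirement 11
    first, second = float(match.group(2)), float(match.group(3))
    # Label the two coordinates according to the axis order of the CRS.
    coords = dict(zip(AXIS_ORDER[crs], (first, second)))
    return (coords["lon"], coords["lat"])


point_l = "<" + CRS84 + "> Point(-88.38 31.95)"
point_m = "<" + EPSG_4326 + "> Point( 31.95 -88.38)"

print(point_lon_lat(point_l) == point_lon_lat(point_m))
```

A system that ignores the CRS-specific axis order would read M as longitude 31.95 and latitude -88.38, and would report the two points as different.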
Req. 13: An empty RDFS Literal of type geo:wktLiteral shall be interpreted as an empty geometry.
We define two new geometries, H and I, for the purpose of testing requirement 13. Geometry H represents a LineString
geometry which has a WKT literal, which is an empty string. Geometry I represents an explicitly defined empty
LineString geometry:
H:
I: LineString EMPTY
Additionally, like most of the other geometries, these two geometries have a Point representation, as well. In the case of
geometry H, it is again represented by an empty value of the WKT literal, while geometry I has an explicitly defined
empty Point geometry in its WKT literal:
H:
I: Point EMPTY
The test then consists of two parts, where both check if the WKT literals of H and I are equal. The two parts refer to the
separate testing of the equality of the LineString geometries and the Point geometries. Both parts should be correct
in order for requirement 13 to be fulfilled and thus fully scored by the benchmark.
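The interpretation that requirement 13 asks for can be sketched in a few lines of Python. The function `is_empty_wkt` is a hypothetical helper written for this illustration and only covers the literal forms used for geometries H and I.

```python
def is_empty_wkt(literal):
    """True if a wktLiteral denotes an empty geometry (illustrative only)."""
    text = literal.strip()
    if text.startswith("<"):
        # Drop an optional leading CRS URI.
        _, _, text = text.partition(">")
        text = text.strip()
    # Either an empty string (requirement 13) or an explicit "<Type> EMPTY" body.
    return text == "" or text.upper().endswith(" EMPTY")


literals_h = ["", ""]                               # LineString and Point of H
literals_i = ["LineString EMPTY", "Point EMPTY"]    # LineString and Point of I

# Both parts of the test treat H and I as the same (empty) geometry.
print([is_empty_wkt(h) == is_empty_wkt(i) for h, i in zip(literals_h, literals_i)])
```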
Req. 14: Implementations shall allow the RDF property geo:asWKT to be used in SPARQL graph patterns.
We test requirement 14 by simply selecting the geo:asWKT value of geometry A and checking it against the expected
literal value.
Req. 15: All geo:gmlLiterals shall consist of a valid element from the GML schema that implements a subtype
of GM_Object as defined in [27].
For the purpose of testing requirement 15, we select all the values of the geo:asGML property, regardless of the RDF
subject, and check whether all of them contain a valid GM_Object subtype in the value and whether its datatype is
geo:gmlLiteral. The ordered list of results is then checked against the expected answers, which include all valid
GML literals from the dataset.
Req. 16: An empty geo:gmlLiteral shall be interpreted as an empty geometry.
Similarly to requirement 13, we test compliance to requirement 16 by providing an empty string as a GML literal value
in one geometry - geometry H, and an explicitly defined empty LineString in a GML literal - geometry I:
H:
I: <LineString><posList></posList></LineString>
Just like with requirement 13, here we use Point representations, as well. In the case of geometry H, it is again
represented by an empty value of the GML literal, while geometry I has an explicitly defined empty Point geometry in
its GML literal:
H:
I: <Point><pos></pos></Point>
The test for requirement 16 consists of two parts, as well, where both check if the GML literals of H and I are equal.
The two parts refer to the separate testing of the equality of the LineString geometries and the Point geometries.
Both parts should be correct in order for requirement 16 to be fulfilled.
Req. 17: Implementations shall document supported GML profiles.
Requirement 17 is the only non-technical requirement of the GeoSPARQL standard, and therefore cannot be automatically
checked and tested. This is the only requirement omitted by the benchmark tests. To keep it simple, we assume
that all GeoSPARQL implementations fulfill this requirement and provide proper documentation for supported GML
profiles, which we believe to be a reasonable assumption.
Req. 18: Implementations shall allow the RDF property geo:asGML to be used in SPARQL graph patterns.
Similarly to requirement 14, we test requirement 18 by simply selecting the geo:asGML value of geometry A and
checking it against the expected literal value.
Req. 19: Implementations shall support geof:distance, geof:buffer, geof:convexHull, geof:intersection,
geof:union, geof:difference, geof:symDifference, geof:envelope and geof:boundary as SPARQL
extension functions, consistent with the definitions of the corresponding functions (distance, buffer,
convexHull, intersection, difference, symDifference, envelope and boundary respectively) in Simple
Features [25].
In order to test requirement 19, we use separate tests for the nine functions in question, i.e. we check each function
separately. To test the full compliance of each function, we run three sub-tests for them: (a) we test the function with
geometry parameters which are expressed as WKT literals, (b) we test it with geometry parameters expressed as GML
literals, and (c) we test it with a combination of WKT and GML literals. If the function uses a single parameter, we
only use the (a) and (b) sub-tests. If it uses two parameters, we use the (a), (b) and (c) sub-tests, where (c) consists of
two queries in which WKT is the first and GML is the second parameter of the function (denoted as WKT-GML), and
vice-versa (denoted as GML-WKT). With this, the test for each function consists of either two sub-tests (WKT and
GML), or of four sub-tests (WKT-WKT, GML-GML, WKT-GML and GML-WKT). This ensures that the compliance
score for each function is thoroughly checked. The scoring details for these tests are presented in Section 3.3.
With this, the entire test for requirement 19 consists of tests for the nine functions, each with two or four sub-tests, for a
total of 28 SPARQL queries.
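The total of 28 queries can be reproduced with a short Python sketch. The table of geometry-parameter counts is our reading of the GeoSPARQL function signatures (it counts geometry arguments only, so the radius and units arguments of geof:buffer and geof:distance are ignored), and the names used in the code are illustrative.

```python
# Number of geometry parameters of each function named in requirement 19
# (non-geometry arguments such as units or radius are not counted).
GEOMETRY_PARAMS = {
    "geof:distance": 2, "geof:buffer": 1, "geof:convexHull": 1,
    "geof:intersection": 2, "geof:union": 2, "geof:difference": 2,
    "geof:symDifference": 2, "geof:envelope": 1, "geof:boundary": 1,
}


def sub_tests(n_params):
    """Serialization variants tested for a function with n geometry parameters."""
    if n_params == 1:
        return ["WKT", "GML"]
    return ["WKT-WKT", "GML-GML", "WKT-GML", "GML-WKT"]


queries = [(name, variant)
           for name, n in GEOMETRY_PARAMS.items()
           for variant in sub_tests(n)]
total = len(queries)
print(total)
```

Five functions take two geometries (four queries each) and four take one geometry (two queries each), which gives 5 x 4 + 4 x 2 = 28.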
Req. 20: Implementations shall support geof:getSRID as a SPARQL extension function.
We test requirement 20 by using the geof:getSRID function in two queries: one with the WKT literal of geometry A,
and the other with the GML literal of geometry A. In both cases we check if the system correctly returns
http://www.opengis.net/def/crs/OGC/1.3/CRS84 as an answer.
Req. 21: Implementations shall support geof:relate as a SPARQL extension function, consistent with the relate
operator defined in Simple Features [25].
For testing requirement 21, we use a relate operator which denotes the contains relation (expressed as T*****FF*
in DE-9IM [28]), and test it on geometries A and B, where A contains B in the dataset. Given that the geof:relate
function uses two parameters, there are four queries for this test: WKT-WKT, GML-GML, WKT-GML and GML-WKT.
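To make the pattern notation concrete, the following Python sketch checks a DE-9IM intersection matrix against a relate pattern. The function `matches_de9im` is a hypothetical helper, and the two sample matrices are illustrative values for a polygon strictly inside another polygon and for two disjoint polygons. They are not taken from the benchmark dataset.

```python
def matches_de9im(matrix, pattern):
    """Check a 9-character DE-9IM matrix against a relate pattern."""
    if len(matrix) != 9 or len(pattern) != 9:
        raise ValueError("DE-9IM strings must have exactly 9 characters")
    for cell, expected in zip(matrix.upper(), pattern.upper()):
        if expected == "*":
            continue                  # any value is accepted
        if expected == "T":
            if cell == "F":           # T requires a non-empty intersection
                return False
        elif cell != expected:        # F, 0, 1 and 2 must match exactly
            return False
    return True


CONTAINS = "T*****FF*"

# Illustrative matrices: outer polygon vs. inner polygon, and two disjoint polygons.
print(matches_de9im("212FF1FF2", CONTAINS))
print(matches_de9im("FF2FF1212", CONTAINS))
```

The pattern requires that the interiors intersect (first cell) and that the exterior of the first geometry meets neither the interior nor the boundary of the second (seventh and eighth cells).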
Req. 22: Implementations shall support geof:sfEquals, geof:sfDisjoint, geof:sfIntersects,
geof:sfTouches, geof:sfCrosses, geof:sfWithin, geof:sfContains, geof:sfOverlaps as
SPARQL extension functions, consistent with their corresponding DE-9IM intersection patterns [28], as
defined by Simple Features [25].
Req. 23: Implementations shall support geof:ehEquals, geof:ehDisjoint, geof:ehMeet, geof:ehOverlap,
geof:ehCovers, geof:ehCoveredBy, geof:ehInside, geof:ehContains as SPARQL extension functions,
consistent with their corresponding DE-9IM intersection patterns, as defined by Simple Features [25].
Req. 24: Implementations shall support geof:rcc8eq, geof:rcc8dc, geof:rcc8ec, geof:rcc8po,
geof:rcc8tppi, geof:rcc8tpp, geof:rcc8ntpp, geof:rcc8ntppi as SPARQL extension functions,
consistent with their corresponding DE-9IM intersection patterns [28], as defined by Simple Features
[25].
We test requirements 22, 23 and 24 by applying a separate set of tests for each of the twenty-four functions specified.
Each function is tested by employing four queries: one with two WKT literals (WKT-WKT), one with two GML literals
(GML-GML), and two with a combination of WKT and GML literals (WKT-GML and GML-WKT). Each of the
queries tests if the relation implemented by the tested function is correct for the used geometries from the dataset, and
each of them returns an xsd:boolean answer. The geometries used for the tests of each function are carefully selected
in order to provide an unambiguous assessment of whether the function is supported and correctly implemented in the
system under test.
Req. 25: Basic graph pattern matching shall use the semantics defined by the RDFS Entailment Regime [29].
For the purpose of testing requirements 25, 26 and 27, we use queries which require the system to select both materialized
RDF triples, as well as inferred RDF triples, based on the specifics of each requirement.
Therefore, we test requirement 25 using three separate queries: the first one selects all instances of the geo:Feature
class, where we expect the system to select instances of the subclasses of the class, as well, e.g. my:PlaceOfInterest;
the second and the third one select all instances with the geo:hasGeometry and geo:hasDefaultGeometry
properties, but expect the results to contain entities which use subproperties of these properties, as well,
e.g. my:hasExactGeometry.
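The kind of inference these queries rely on can be sketched with a tiny in-memory rule engine in Python. This is only an illustration of the RDFS subclass and subproperty rules. The instance names `my:A` and `my:AExactGeom` are placeholders, and the triples below are assumptions made for the example rather than an excerpt of the benchmark dataset.

```python
RDF_TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"
SUBPROP = "rdfs:subPropertyOf"


def rdfs_closure(triples):
    """Apply RDFS rules rdfs5, rdfs7, rdfs9 and rdfs11 until a fixpoint."""
    inferred = set(triples)
    while True:
        new = set()
        for s, p, o in inferred:
            for s2, p2, o2 in inferred:
                # rdfs11 / rdfs5: subClassOf and subPropertyOf are transitive.
                if p == p2 and p in (SUBCLASS, SUBPROP) and o == s2:
                    new.add((s, p, o2))
                # rdfs9: instances of a subclass are instances of the superclass.
                if p == RDF_TYPE and p2 == SUBCLASS and o == s2:
                    new.add((s, RDF_TYPE, o2))
                # rdfs7: a triple using a subproperty also holds for the superproperty.
                if p2 == SUBPROP and p == s2:
                    new.add((s, o2, o))
        if new <= inferred:
            return inferred
        inferred |= new


data = {
    ("my:PlaceOfInterest", SUBCLASS, "geo:Feature"),
    ("my:hasExactGeometry", SUBPROP, "geo:hasGeometry"),
    ("my:A", RDF_TYPE, "my:PlaceOfInterest"),           # placeholder instance
    ("my:A", "my:hasExactGeometry", "my:AExactGeom"),   # placeholder geometry
}
closure = rdfs_closure(data)
features = sorted(s for s, p, o in closure if p == RDF_TYPE and o == "geo:Feature")
print(features)
```

Without entailment support, a query for instances of geo:Feature or for the geo:hasGeometry property would find only explicitly asserted triples and would miss `my:A` in this example.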
Req. 26: Implementations shall support graph patterns involving terms from an RDFS/OWL [30] class hierarchy of
geometry types consistent with the one in the specified version of Simple Features [25].
For requirement 26, we use two separate queries: they select all instances of sf:Surface and sf:Curve, respectively,
but expect the results to contain all instances of their subclasses, as well, such as sf:LineString and sf:Polygon.
Req. 27: Implementations shall support graph patterns involving terms from an RDFS/OWL class hierarchy of
geometry types consistent with the GML schema that implements GM_Object using the specified version of
GML [27].
To test requirement 27, we use a single query which selects all instances of gml:Surface, but the expected results
include all instances of its subclass, gml:LineString.
Req. 28: Basic graph pattern matching shall use the semantics defined by the RIF Core Entailment Regime [W3C
SPARQL Entailment] for the RIF rules [31] geor:sfEquals, geor:sfDisjoint, geor:sfIntersects,
geor:sfTouches, geor:sfCrosses, geor:sfWithin, geor:sfContains, geor:sfOverlaps.
Req. 29: Basic graph pattern matching shall use the semantics defined by the RIF Core Entailment Regime [W3C
SPARQL Entailment] for the RIF rules [31] geor:ehEquals, geor:ehDisjoint, geor:ehMeet,
geor:ehOverlap, geor:ehCovers, geor:ehCoveredBy, geor:ehInside, geor:ehContains.
Req. 30: Basic graph pattern matching shall use the semantics defined by the RIF Core Entailment Regime [W3C
SPARQL Entailment] for the RIF rules [31] geor:rcc8eq, geor:rcc8dc, geor:rcc8ec, geor:rcc8po,
geor:rcc8tppi, geor:rcc8tpp, geor:rcc8ntpp, geor:rcc8ntppi.
We test the requirements 28, 29 and 30 with eight different queries each, in order to test the sub-requirements for each
individual rule specified. The queries used here are similar to the queries for requirements 4, 5 and 6, with the difference
that the tests for requirements 28, 29 and 30 require both materialized RDF triples and inferred RDF triples to be
selected for the query response. To ensure that the system selects all such entities, and therefore supports the semantics
defined in the RIF core entailment regime for the RIF rules, the tests require an ordered list of entities fulfilling the
query request.
3.3 Benchmark Results
The benchmark can test if the benchmarked system provides a correct or an incorrect answer on each of the 206
benchmark queries. In order to transform these individual results into an overall result, we calculate two benchmark
results from a given experiment:
Correct answers: The number of correct answers out of all GeoSPARQL queries, i.e. tests.
GeoSPARQL compliance percentage: The percentage of compliance with the requirements of the
GeoSPARQL standard.
The former is straightforward: it is the number of correct answers the system provided, out of the 206 test queries. The
latter is calculated from the perspective of the 30 requirements and measures the overall compliance of the benchmarked
system with the GeoSPARQL standard. It measures the number of supported requirements of the system, out of the 30
specified requirements, where all requirements carry the same weight, i.e. each requirement contributes
3.33% to the total result.
If a requirement contains multiple sub-test queries, its 3.33% is uniformly distributed among them. Therefore, for
instance, each of the eight sub-requirements of requirement 4 contributes 12.5% to the parent test score, i.e.
0.4167% (3.33% × 12.5%) to the total benchmark compliance percentage score. This means that a single requirement
from the GeoSPARQL standard can be fully supported, partially supported or not supported at all.
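These numbers can be cross-checked with a short Python sketch. The per-requirement query counts are tallied from the test descriptions in Section 3.2, where each "part" of the tests for requirements 13 and 16 is counted as one query. This tally is our reading of the text, not a table taken from the benchmark code.

```python
# Queries per requirement, read off Section 3.2 (requirement 17 is not tested).
QUERIES_PER_REQUIREMENT = {
    1: 1, 2: 1, 3: 1, 4: 8, 5: 8, 6: 8, 7: 1, 8: 2, 9: 6, 10: 1,
    11: 1, 12: 1, 13: 2, 14: 1, 15: 1, 16: 2, 17: 0, 18: 1, 19: 28, 20: 2,
    21: 4, 22: 32, 23: 32, 24: 32, 25: 3, 26: 2, 27: 1, 28: 8, 29: 8, 30: 8,
}

total_queries = sum(QUERIES_PER_REQUIREMENT.values())

# Every requirement contributes the same share of the overall percentage.
requirement_weight = 100.0 / len(QUERIES_PER_REQUIREMENT)

# Within requirement 4 the share is split evenly over its eight queries.
req4_query_weight = requirement_weight / QUERIES_PER_REQUIREMENT[4]

print(total_queries, round(requirement_weight, 2), round(req4_query_weight, 4))
```

The tally reproduces the 206 queries mentioned above, and the two weights round to 3.33% and 0.4167%.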
The only exception to this rule of uniform distribution of the weights between tests on the same level is the set of sub-test
queries which test GeoSPARQL functions with different serializations of literals as parameters, i.e. requirements 19 -
24. When we test a function for compliance to the standard while using (a) WKT-only literals, (b) GML-only literals
and (c) a combination of WKT and GML literals, the score is uniformly distributed between these three logical groups,
each contributing 33.33% to the parent test score. However, (c) is practically tested using two queries: one where
WKT is the first and GML is the second parameter of the function (denoted as WKT-GML), and vice-versa (denoted
as GML-WKT). These two queries technically contribute 16.67% to the parent test score each, so that the total
contribution from the logical group (c) remains 33.33%. With this, the technical weight of the queries themselves is
33.33% for the WKT-only query, 33.33% for the GML-only query, 16.67% for the WKT-GML query, and 16.67% for
the GML-WKT query. Technically, on a query level, this is an exception to the uniform distribution rule we practice,
but logically, on a group level, it still holds.
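The weighting of the four queries of a two-parameter function can be written down as a small Python sketch. The function name `function_score` and the example outcome are hypothetical, and only the two-parameter case described above is covered.

```python
from fractions import Fraction

# Weights of the four queries of a two-parameter function,
# as fractions of that function's score.
TWO_PARAM_WEIGHTS = {
    "WKT-WKT": Fraction(1, 3),   # group (a)
    "GML-GML": Fraction(1, 3),   # group (b)
    "WKT-GML": Fraction(1, 6),   # group (c), first half
    "GML-WKT": Fraction(1, 6),   # group (c), second half
}


def function_score(passed):
    """Map {query name: bool} to the fraction of the function score obtained."""
    return sum((TWO_PARAM_WEIGHTS[name] for name, ok in passed.items() if ok),
               Fraction(0))


# Hypothetical system: handles WKT inputs, but fails whenever GML comes first.
outcome = {"WKT-WKT": True, "GML-GML": False, "WKT-GML": True, "GML-WKT": False}
print(function_score(outcome))
```

In this hypothetical outcome the system earns one third for group (a) and half of group (c), i.e. one half of the score for that function.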
Given that requirement 17 is non-technical, and therefore not tested as part of the benchmark, each system gets its
3.33% score points automatically, when it provides at least one correct answer to the benchmark tests.
3.4 Benchmark Considerations
When creating the benchmark, we had to take into account certain considerations and interpretations which are only
implicitly given in the GeoSPARQL standard. We elaborate on these in this subsection.
Geometry Literals
Many query functions defined in the GeoSPARQL standard return an ogc:geomLiteral as a result, following
the GeoSPARQL standard definition. This means that, according to the standard, a function such as:
geof:boundary(ogc:geomLiteral):ogc:geomLiteral
may take either a WKT, a GML 2.0, or a GML 3.2 literal as an argument, and may return either a WKT, a GML 2.0, or
a GML 3.2 literal as a result. The dataset we use for our benchmark includes WKT and GML 3.2 formatted literals.
However, we provide query answers in WKT, GML 2.0 and GML 3.2 to support all possible outcomes from a system
tested by the benchmark.
The decision to include only GML 3.2 and not GML 2.0 literals in our dataset was taken because GML 2.0 has been
de-facto superseded by GML 3.2. GML 2.0 is not even supported as an export option in current GIS software, such as
QGIS for instance. In addition, in all systems we benchmarked, the only GML variant that was supported was GML 3.2.
Variations Between Literal Serializations
Within the same literal type, different semantically equivalent representations of geometries are possible. WKT
serializations may include a CRS URI, but they may also omit it (if it’s missing, the WGS84 CRS is assumed), and they may
differ in the amount and positioning of whitespace. GML literals may differ in the order of attributes and the definition of
namespaces. To be flexible about these variations, we apply a normalization process before comparing the results from
the tested system with the expected answer. WKT literals are trimmed and their whitespace is removed, and GML
literals are converted to canonicalized XML with normalized namespace definitions.
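A minimal sketch of such a normalization step is given below. It is an illustration under stated assumptions, not the benchmark's actual code: the function names are hypothetical, the WKT variant keeps the single spaces between coordinates (so that distinct coordinate pairs are not merged) and makes the default CRS explicit using the CRS84 URI which the GeoSPARQL standard assumes when none is given, and the GML variant relies on the C14N support in Python's standard library (Python 3.8 or later).

```python
import re
import xml.etree.ElementTree as ET

# Default CRS assumed by GeoSPARQL when a WKT literal carries no CRS URI.
DEFAULT_CRS = "<http://www.opengis.net/def/crs/OGC/1.3/CRS84>"


def normalize_wkt(literal):
    """Trim a WKT literal, make its CRS explicit and unify its whitespace."""
    text = literal.strip()
    if not text.startswith("<"):
        text = f"{DEFAULT_CRS} {text}"
    crs, _, geometry = text.partition(">")
    # Collapse whitespace runs, then drop spaces around brackets and commas.
    geometry = re.sub(r"\s+", " ", geometry.strip())
    geometry = re.sub(r"\s*([(),])\s*", r"\1", geometry)
    return f"{crs}> {geometry}"


def normalize_gml(literal):
    """Canonicalize a GML literal: sorted attributes, uniform prefixes."""
    return ET.canonicalize(literal.strip(), strip_text=True,
                           rewrite_prefixes=True)


gml_a = ('<gml:Point xmlns:gml="http://www.opengis.net/gml/3.2" gml:id="p1" '
         'srsName="http://www.opengis.net/def/crs/OGC/1.3/CRS84">'
         '<gml:pos>1 2</gml:pos></gml:Point>')
gml_b = ('<g:Point srsName="http://www.opengis.net/def/crs/OGC/1.3/CRS84" '
         'xmlns:g="http://www.opengis.net/gml/3.2" g:id="p1">\n'
         '  <g:pos>1 2</g:pos>\n</g:Point>')

print(normalize_wkt("  POINT ( 1  2 ) "))
print(normalize_gml(gml_a) == normalize_gml(gml_b))
```

The two GML strings differ in attribute order, namespace prefix and indentation, which are exactly the kinds of variation the normalization is meant to absorb.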
Alternative Answers
The GeoSPARQL standard defines the results of GeoSPARQL functions as ogc:geomLiteral values, but does not
define which geometry types these literals should serialize. Therefore, functions may not only return results in different
literal types, but also in different geometry representations even within the same literal serialization. One example
is the geof:boundary function which could return a sf:LinearRing or a sf:Polygon geometry as a result. Even
supposedly simple return values such as an xsd:boolean may be represented as either the xsd:boolean literals with
value true and false or 1 and 0.
In order to deal with these scenarios, we define alternative query answers for each of the aforementioned possibilities.
This means that each test consists of a single query which is issued to the system under test, and a set of several
alternative correct answers, which are logically equivalent, but may be technically represented in different serializations.
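The comparison against a set of alternative answers can be sketched as follows. This is a simplified illustration: `is_correct` is a hypothetical helper, and the `normalize` parameter stands in for whatever literal normalization is applied before the comparison.

```python
def is_correct(system_answer, alternatives, normalize=str.strip):
    """Return True if the answer matches any of the expected alternatives.

    Both the system answer and every alternative are normalized first,
    so that purely syntactic differences do not cause a mismatch.
    """
    normalized = normalize(system_answer)
    return any(normalized == normalize(alt) for alt in alternatives)


# xsd:boolean results may be serialized as true/false or as 1/0.
EXPECTED_TRUE = ["true", "1"]

print(is_correct(" 1 ", EXPECTED_TRUE))    # a system answering with 1
print(is_correct("false", EXPECTED_TRUE))  # a logically different answer
```

One query therefore yields exactly one pass or fail outcome, no matter how many alternative serializations are listed for it.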
3.5 Implementation
We have implemented the benchmark on the HOBBIT platform³, which is intended for holistic benchmarking
of big Linked Data [17]. The HOBBIT platform allows users to define and execute benchmarks, on the one hand, and to
provide and add triplestore systems, on the other. A user can run an experiment on the platform by selecting the desired
benchmark and the target triplestore system to be tested. The platform then loads the benchmark as a set of Docker
containers (benchmark controller, data generator, task generator and evaluation module), loads the system as a Docker
container (benchmarked system), and then runs the benchmark according to its logic, programmed in the controller
(Figure 2). The results of each experiment are stored in the platform and are made publicly available on the Web.
In our case, the GeoSPARQL compliance benchmark first loads the dataset into the benchmarked system, then reads
all the test queries and sends them to the benchmarked system for execution. The evaluation module reads the single
expected answer or the set of expected alternative answers for each query, and compares whether the benchmarked
system returns a correct or an incorrect answer, saving the result into the evaluation store. After all tests are done, the
evaluation module calculates two summarized results: (1) the number of correct answers, out of all possible tests, and
(2) the percentage of compliance to the requirements of the GeoSPARQL standard, as described in Section 3.3.
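As a rough illustration of how the two summarized results could be derived, consider the sketch below. It assumes that the per-query outcomes are grouped by requirement and that the query weights within one requirement sum to 1; the function name and the data layout are hypothetical and do not reflect the evaluation module's actual code.

```python
TOTAL_REQUIREMENTS = 30         # requirements in the GeoSPARQL standard
NON_TECHNICAL_REQUIREMENT = 17  # not tested; awarded after any correct answer


def summarize(results):
    """Summarize one experiment.

    `results` maps a requirement number to a list of (passed, weight)
    pairs, one pair per query. Returns a tuple of
    (number of correct answers, compliance percentage).
    """
    correct = sum(1 for queries in results.values()
                  for passed, _ in queries if passed)
    scores = {
        req: sum(weight for passed, weight in queries if passed)
        for req, queries in results.items()
    }
    if correct > 0:
        scores[NON_TECHNICAL_REQUIREMENT] = 1.0
    # Every requirement carries the same weight of 1/30 (about 3.33%).
    compliance = 100.0 * sum(scores.values()) / TOTAL_REQUIREMENTS
    return correct, compliance


# Toy input: requirement 1 fully met, requirement 2 half met.
example = {1: [(True, 1.0)], 2: [(True, 0.5), (False, 0.5)]}
correct, compliance = summarize(example)
print(correct, round(compliance, 2))
```

The two numbers are deliberately independent: the count treats every query equally, while the percentage weighs queries by the requirement they belong to.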
We decided to use the HOBBIT platform for our benchmark due to its plug-in nature: interested users can add
additional systems and then run an experiment with the benchmark over their own system.
A user can also run our GeoSPARQL compliance benchmark over any triplestore system which is already available on
the platform. Additionally, the public nature of the platform allows for greater transparency and reproducibility of the
results of each benchmark, including our GeoSPARQL compliance benchmark.
4 Experimental Setup
In order to showcase the usability and usefulness of the GeoSPARQL compliance benchmark, we set out to run a
number of experiments over some of the most commonly used triplestores. The set of chosen triplestores is shown in
Table 1.
³ Public instance of the HOBBIT Platform: http://master.project-hobbit.eu
Figure 2: The HOBBIT benchmarking platform.
Triplestore          Version   Reference
Apache Marmotta      3.4.0     [32]
Blazegraph           2.1.5     [33]
Eclipse RDF4J        3.4.0     [34]
Jena Fuseki          3.14.0    [35]
GeoSPARQL Fuseki     3.17.0    [9, 36]
Ontotext GraphDB     9.3.3     [37]
OpenLink Virtuoso    7.3       [38, 39]
Stardog              7.4.0     [40]
Table 1: Triplestores which have been tested using the GeoSPARQL compliance benchmark.
For each experiment, a system adapter has been created and published on a public HOBBIT platform instance, as well
as in the HOBBIT GitLab repository⁴. This allows for the reproduction of the experiments and the results.
Each triplestore version from Table 1 was the most recent available stable version of the implementation at the time
of testing. For each of the triplestores which have been tested, a system adapter implementation has been created
which handles the initial configuration of the triplestore, e.g. setting up a repository which contains the data to be
tested, enabling geospatial query support, etc. Where possible, this adapter implementation was added to the triplestore
implementation in a joint Docker image; otherwise, two Docker images (one for the adapter implementation and one
for the triplestore implementation) were created for testing. It needs to be stated that not all of the aforementioned
triplestores claim to support GeoSPARQL. In fact, Blazegraph and Jena Fuseki do not support GeoSPARQL. We included
them in our experiments in order to show which GeoSPARQL requirements are already supported by a non-GeoSPARQL
implementation of an RDF triplestore.
5 Results and Discussion
5.1 Overall Results
The results of the experiments with our benchmark and the systems listed in Table 1 are shown in Table 2 and in
Figure 3, and are available online on the HOBBIT platform⁵. They show that none of these widely used RDF storage
solutions fully complies with the GeoSPARQL standard. Aside from that, one of them stands out with
a significantly better GeoSPARQL compliance score than the others and, more generally, the top three stand out from
the rest. The triplestores in positions 4-7 share an almost identical result.
⁴ HOBBIT Platform Triplestores: https://git.project-hobbit.eu/triplestores
⁵ Results on the HOBBIT platform: https://master.project-hobbit.eu/experiments/1612476122572,1612477003063,1612476116049,1612477500164,1612661614510,1612637531673,1612828110551,1612477849872
Triplestore               Correct Answers   GeoSPARQL Compliance
GeoSPARQL Fuseki 3.17     177               82.75%
Ontotext GraphDB 9.3.3    80                69.75%
OpenLink Virtuoso 7.3     73                63.46%
Eclipse RDF4J 3.4.0       47                58.33%
Stardog 7.4.0             46                56.67%
Blazegraph 2.1.5          46                56.67%
Jena Fuseki 3.14          46                56.67%
Apache Marmotta 3.4.0     40                46.67%
Table 2: Results from the GeoSPARQL compliance benchmark.
Figure 3: Results from the GeoSPARQL compliance benchmark, from the public instance of the HOBBIT platform.
In order to see the reasons for these variations more closely, we made a breakdown of the compliance results into the
six extensions defined in the GeoSPARQL standard. These results are shown in Table 3. As we can see from this table,
the triplestores in positions 4-7 share the same result due to demonstrating full compliance with the CORE, TOP
and RDFSE extensions of the GeoSPARQL standard, but not with the other extensions. The reason why almost all
benchmarked triplestores comply with CORE, TOP and RDFSE is simple: these requirements are designed in such a
way that they are satisfied “out-of-the-box” by most RDF- and SPARQL-compliant storage solutions. They refer to the
use of specific classes (CORE) and properties (TOP) in SPARQL query patterns, as well as RDFS reasoning (RDFSE),
which are features supported in most triplestores nowadays. Since RDFS reasoning was not activated in the Marmotta
version we benchmarked, it shows no compliance with RDFSE, so its score comes only from its compliance with CORE and
TOP, and is thus lower than the scores of the other systems.
The bottom three systems are explicitly not GeoSPARQL-compliant, but we included them in our experiments as
baseline tests. As we can see, they all demonstrated compatibility with either two or with three extensions of the
GeoSPARQL standard (Table 3), and scored 56.67% or 46.67% of the GeoSPARQL compliance score (Table 2). This,
however, does not mean that the benchmark score should start at 56.67% or 46.67%, since a benchmarked RDF storage
system may fail these tests, too.
Triplestore          CORE   TOP    GEOEXT                GTOP                  RDFSE   QRW
GeoSPARQL Fuseki     Full   Full   Full/E                Full                  Full    Full/E
Ontotext GraphDB     Full   Full   Partial [WKT]         Partial [WKT]         Full    None
OpenLink Virtuoso    Full   Full   Partial [WKT]         Partial [WKT]         Full    None
Eclipse RDF4J        Full   Full   Partial [WKT CRS84]   Partial [WKT CRS84]   Full    None
Stardog              Full   Full   None                  None                  Full    None
Blazegraph           Full   Full   None                  None                  Full    None
Jena Fuseki          Full   Full   None                  None                  Full    None
Apache Marmotta      Full   Full   None                  None                  None    None
Table 3: Support of the different GeoSPARQL extensions by the tested triplestores. Full indicates full support, consisting
of correct query answers only, Full/E indicates that support is implemented but erroneous, Partial [GML/WKT] indicates
that support is partially implemented, None indicates that support for this GeoSPARQL extension is not present.
5.2 Discussion on the Results for each Triplestore
First, we tested RDF triplestores which claim GeoSPARQL support. We wanted to check how extensive their compliance
with the GeoSPARQL standard is. This list included: GeoSPARQL Fuseki, GraphDB, Virtuoso, RDF4J and
Stardog.
GeoSPARQL Fuseki is the triplestore with the highest GeoSPARQL compliance score in our experiments. It is the only
system with full GML and WKT support and the only system with a full implementation of all GeoSPARQL extensions
(Table 3). However, GeoSPARQL Fuseki produced incorrect results in many functions covered by the query rewrite
extension and in a few functions covered by the geometry extension. Also, just like all other triplestores we tested,
GeoSPARQL Fuseki fails to handle empty WKT and empty GML literals.
GraphDB provides a full implementation of all but the query rewrite extension. However, GraphDB can only handle
WKT literals, but not GML literals. This leads to a substantially lower score in our benchmark, as many queries require
either a GML literal as input, or a combination of a GML and a WKT literal in order to be executed. Most functions
with WKT-only literals in the GEOEXT and GTOP extension tests produced correct results.
Virtuoso provides support for WKT literals, but not GML literals. Similarly to GraphDB, it provides full implementation
for all GeoSPARQL extensions, except for the query rewrite extension. However, it has an additional issue: even though
it returns logically correct results for the tests for the functions in requirement 19 (part of the GEOEXT extension), the
literals are transformed from WKT literals to an internal literal type which is Virtuoso-specific. This renders a mismatch
between the provided and expected answer, and lowers the benchmark score for Virtuoso.
The RDF4J triplestore implements all the GeoSPARQL functions of the GEOEXT extension and the GTOP extension
for WKT literals. However, RDF4J fails almost all of the GeoSPARQL tests from these extensions because it does not
support CRS URIs in WKT literals. While the GeoSPARQL standard acknowledges that the integration of CRS URIs
in WKT Literals is optional, they are used in various use-cases, especially at geospatial authorities, and we expect them
to be supported in every triplestore which claims GeoSPARQL support. Thus, WKT literals with explicit CRS URIs are
included in most of the tests of the benchmark. In addition, RDF4J lacks support for GML literals and the query rewrite
extension.
The Stardog triplestore provides an implementation covering WKT literals and implements five geospatial functions
which are similar to the GeoSPARQL functions, but not fully compatible. More specifically, out of their
five geospatial functions (geof:within, geof:area, geof:nearby, geof:distance and geof:relate), only the
geof:distance function follows the signature of the GeoSPARQL function with the same URI. However, our tests for
this function include WKT literals with explicit CRS URIs, which Stardog doesn’t support, so the test for this function
fails. The tests for the other functions fail either because functions with those URIs don’t exist in the GeoSPARQL
standard, or because of a function signature mismatch. Thus, Stardog only scores in tests which cover the CORE, TOP
and RDFSE extensions.
Next, we tested triplestores which do not claim to support GeoSPARQL, but claim support for other geospatial
extensions. We expected that they would provide full support for the GeoSPARQL CORE, TOP and RDFSE extensions,
which do not rely on the implementation of additional geospatial operators. They thereby serve as baseline tests for
our approach. This list included: Blazegraph, Jena Fuseki, Apache Marmotta and Parliament.
Blazegraph supports some non-GeoSPARQL spatial functions in its GeoSpatial Search Extension⁶. This extension
allows the definition of Points via WKT literals, but is otherwise not GeoSPARQL-compliant. Blazegraph therefore
fails the GEOEXT, GTOP and QRW tests, as expected.
Jena Fuseki includes a customized spatial extension, Jena Spatial⁷, which is planned to be replaced by the GeoSPARQL
Fuseki implementation we tested. Jena Fuseki can cope with WKT literals and defines a custom set of functions, none
of which match the function signatures defined in the GeoSPARQL standard. Hence, Jena Fuseki is only awarded a
full score in the CORE, TOP and RDFSE extensions.
Apache Marmotta has a GeoSPARQL implementation which was created in a Google Summer of Code project⁸. At
the time of testing, the extension was not included in the last stable version of this triplestore, therefore the version we
tested was not GeoSPARQL-compliant. Even though Marmotta supports RDFS reasoning, we were unsuccessful in our
attempts to activate it on the instance we worked with, so even though we expected it to achieve the same score as the
other triplestores which do not support GeoSPARQL, it only scored as compliant with CORE and TOP.
Finally, we want to acknowledge that we also tested the Parliament 2.7.10 triplestore. Parliament validates WKT and
GML literals before they are added to the graph, and fails to load a dataset if a validation error occurs. In our test,
Parliament failed to parse GML 3.2 literals and the empty WKT literals. As a result, the benchmark dataset could not
be loaded and we could not conduct the experiment with the Parliament triplestore.
6 Limitations of the Benchmark
The GeoSPARQL compliance benchmark does not test every GeoSPARQL function with every available geometry
type and their combinations. We vary the WKT and GML serializations, but not the geometry types. The
reason for this is that the number of possible combinations of geometries would be prohibitively large and the
benefit of testing them all too low. WKT defines 27 geometry types, and GML defines at least as many, which would need
to be considered in both their GML 2.0 and their GML 3.2 variants for completeness. Instead, our dataset consists
of Points, LineStrings and Polygons, which are the most widely used geometry types. With this, we believe we
strike a good balance between the benchmark being too extensive and being sufficiently precise in measuring a system’s
compliance with the GeoSPARQL standard.
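To give a sense of scale, the following back-of-the-envelope sketch counts the queries a single two-argument function would need if every ordered pair of geometry types were tested. It assumes the four serialization combinations used elsewhere in the benchmark (WKT-only, GML-only, WKT-GML and GML-WKT) and a single GML version; the function name is hypothetical and the figures are illustrative only.

```python
# Serialization combinations tested per function in the benchmark.
SERIALIZATION_COMBOS = ["WKT-WKT", "GML-GML", "WKT-GML", "GML-WKT"]


def queries_per_binary_function(num_geometry_types):
    """Count queries for one two-argument function over all type pairs."""
    ordered_pairs = num_geometry_types * num_geometry_types
    return ordered_pairs * len(SERIALIZATION_COMBOS)


print(queries_per_binary_function(27))  # all 27 WKT geometry types
print(queries_per_binary_function(3))   # Points, LineStrings, Polygons
```

Even under these simplifying assumptions, covering all 27 geometry types would require 2916 queries per function, compared with 36 for the three geometry types used in the dataset.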
Regarding the GeoSPARQL compliance percentage score: as we already stated, this score measures the number of
requirements supported by the system, out of the 30 specified requirements, where the weight of each requirement is
uniformly distributed, i.e. each requirement contributes 3.33% to the total result. We decided to use a uniform
distribution instead of assigning requirement-specific weights because such weights
would be somewhat arbitrary. The authors of the GeoSPARQL standard have not discussed or assigned any varying
significance to the different requirements, which signals that, at least for the time being, we should treat them
as equally important. While in practice that isn’t the case, and different stakeholders may implicitly assign different
significance to them, we don’t think there is a better universal way to address this.
7 Conclusions
This paper introduces a GeoSPARQL compliance benchmark which aims to measure the extent to which an RDF
triplestore complies with the requirements specified in the GeoSPARQL standard. By doing a series of tests for each
requirement, the benchmark is able to assess whether the benchmarked system fully or partially supports a given
requirement, or not at all. The results from the 206 individual tests are transformed into a GeoSPARQL compliance
percentage which aims to provide a metric of the number of requirements covered by the benchmarked system.
In order to showcase the usefulness and usability of the benchmark, as part of the HOBBIT platform, we ran a series
of experiments with eight of the most commonly used RDF triplestores. The overall results show that GeoSPARQL
support varies greatly between the tested triplestores. While the CORE, TOP and RDFSE extensions are supported in
almost every triplestore, as they only depend on SPARQL and RDFS functionalities and are not GeoSPARQL-specific,
the GEOEXT and GTOP extensions show varying levels of implementation. Some triplestores, such as GraphDB or
Virtuoso, chose to implement support only for WKT literals; RDF4J supports only WKT literals without CRS URIs; and
only GeoSPARQL-Jena provides a full GeoSPARQL-compliant implementation of all functions with both GML and
WKT compatibility. GeoSPARQL-Jena is also the only implementation tested in our benchmark which implements the
QRW extension of GeoSPARQL.
⁶ Blazegraph GeoSpatial: https://github.com/blazegraph/database/wiki/GeoSpatial
⁷ Jena Spatial: https://jena.apache.org/documentation/query/spatial-query.html
⁸ Marmotta GeoSPARQL: http://marmotta.apache.org/kiwi/geosparql.html
In conclusion, we can see that the GeoSPARQL standard, almost nine years after its initial release, is often only partially
supported by major triplestore vendors. We hope that the contribution of our GeoSPARQL benchmark can help to
motivate implementers to improve their RDF storage solutions, give customers a guideline as to which implementation
is most suitable for their given use-case, and provide a starting point for a further standard-conformant expansion of the
geospatial Semantic Web.
7.1 Future Work
Recently, the OGC GeoSPARQL Working Group has been reactivated [41, 42] to define GeoSPARQL 2.0, a successor
to the GeoSPARQL standard. It is good practice for emerging OGC standards to first be defined, then reviewed, and at
the same time also implemented as a proof-of-concept. During the course of this implementation, compliance testing
becomes increasingly common, as can be seen by the establishment of the OGC Team Engine⁹, a compliance test suite
which enterprises may use to get official OGC compliance certifications for their software implementations. Given that
currently no OGC-endorsed OGC GeoSPARQL compliance test exists, we would welcome a collaboration with the
OGC and would like to extend our test suite to cover the changes which will be defined in GeoSPARQL 2.0.
Acknowledgement
This work has been partially supported by Eurostars Project SAGE (GA no. E!10882).
References
[1] Frederico Fonseca and A Sheth. The GeoSpatial Semantic Web. The Handbook of Geographic Information Science, pages 367–376, 2002.
[2] Tim Berners-Lee, James Hendler, and Ora Lassila. The Semantic Web. Scientific American, 284(5):34–43, 2001.
[3] Robert Battle and Dave Kolas. GeoSPARQL: Enabling a GeoSpatial Semantic Web. Semantic Web Journal, 3(4):355–370, 2011.
[4] Open Geospatial Consortium et al. OGC GeoSPARQL - A Geographic Query Language for RDF Data. OGC Candidate Implementation Standard, 2012.
[5] John Herring et al. OpenGIS® Implementation Standard for Geographic Information - Simple Feature Access - Part 1: Common Architecture [corrigendum]. 2011.
[6] Clemens Portele. OGC Geography Markup Language (GML) – Extended Schemas and Encoding Rules. In Open Geospatial Consortium. Citeseer, 2012.
[7] Eric Prud'hommeaux. SPARQL Query Language for RDF. W3C Recommendation. http://www.w3.org/TR/rdf-sparql-query/, 2008.
[8] Robert Battle and Dave Kolas. Enabling the Geospatial Semantic Web with Parliament and GeoSPARQL. Semantic Web, 3(4):355–370, 2012.
[9] Gregory L Albiston, Taha Osman, and Haozhe Chen. GeoSPARQL-Jena: Implementation and Benchmarking of a GeoSPARQL Graphstore. Under review in the Semantic Web Journal, 2018.
[10] Jr Leo W Tobin. Coordinate Reference System, April 28 1964. US Patent 3,131,292.
[11] B Louis Decker. World Geodetic System 1984. Technical report, Defense Mapping Agency, 1986.
[12] George Garbis, Kostis Kyzirakos, and Manolis Koubarakis. Geographica: A Benchmark for GeoSpatial RDF Stores (long version). In International Semantic Web Conference, pages 343–359. Springer, 2013.
[13] Theofilos Ioannidis, George Garbis, Kostis Kyzirakos, Konstantina Bereta, and Manolis Koubarakis. Evaluating Geospatial RDF stores Using the Benchmark Geographica 2. arXiv preprint arXiv:1906.01933, 2019.
⁹ OGC Team Engine: https://cite.opengeospatial.org/teamengine/
[14] Weiming Huang, Syed Amir Raza, Oleg Mirzov, and Lars Harrie. Assessment and Benchmarking of Spatially Enabled RDF Stores for the Next Generation of Spatial Data Infrastructure. ISPRS International Journal of Geo-Information, 8(7):310, 2019.
[15] Karima Rafes, Julien Nauroy, and Cécile Germain. TFT, Tests For Triplestores. In Semantic Web Challenge, part of the International Semantic Web Conference, 2014.
[16] Axel-Cyrille Ngonga Ngomo and Michael Röder. HOBBIT: Holistic Benchmarking for Big Linked Data. ERCIM News, (105), 2016.
[17] Michael Röder, Denis Kuchelev, and Axel-Cyrille Ngonga Ngomo. HOBBIT: A Platform for Benchmarking Big Linked Data. Data Science, 3(1):15–35, 2020.
[18] Milos Jovanovik, Timo Homburg, and Mirko Spasić. GeoSPARQL Compliance Benchmark. https://github.com/OpenLinkSoftware/GeoSPARQLBenchmark.
[19] Howard Butler, Martin Daly, Allan Doyle, Sean Gillies, Tim Schaub, and Christopher Schmidt. GeoJSON. Electronic. http://geojson.org, 2014.
[20] Kendall Grant Clark, Lee Feigenbaum, and Elias Torres. SPARQL Protocol for RDF. W3C Recommendation. https://www.w3.org/TR/rdf-sparql-protocol/, 2008.
[21] Dave Beckett and Jeen Broekstra. SPARQL Query Results XML Format. W3C Recommendation. https://www.w3.org/TR/rdf-sparql-XMLres/, 15, 2008.
[22] Dan Brickley and Ramanathan V Guha. RDF Schema 1.1. W3C Recommendation. https://www.w3.org/TR/rdf-schema/, 2014.
[23] Michael Kifer and Harold Boley. RIF Overview. W3C working draft, W3C, (October 2009). http://www.w3.org/TR/rif-overview, 2013.
[24] Tim Berners-Lee, Roy Fielding, and Larry Masinter. RFC2396: Uniform Resource Identifiers (URI): Generic Syntax, 1998.
[25] John Herring et al. Simple Feature Access - Part 1: Common Architecture. Open Geospatial Consortium.
[26] R Nicolai and G Simensen. The New EPSG Geodetic Parameter Registry. In 70th EAGE Conference and Exhibition incorporating SPE EUROPEC 2008, pages cp–40. European Association of Geoscientists & Engineers, 2008.
[27] C Portele. OGC Implementation Specification 07-036: OpenGIS Geography Markup Language (GML) Encoding Standard. Open Geospatial Consortium, 2007.
[28] Christian Strobl. Dimensionally Extended Nine-Intersection Model (DE-9IM), pages 470–476. Springer International Publishing, Cham, 2017.
[29] Birte Glimm, Chimezie Ogbuji, S Hawke, I Herman, B Parsia, A Polleres, and A Seaborne. SPARQL 1.1 Entailment Regimes. W3C Recommendation. https://www.w3.org/TR/sparql11-entailment/, 2013.
[30] Deborah L McGuinness, Frank Van Harmelen, et al. OWL Web Ontology Language Overview. W3C Recommendation, https://www.w3.org/TR/owl-features/, 2004.
[31] Harold Boley, Gary Hallmark, Michael Kifer, Adrian Paschke, Axel Polleres, and Dave Reynolds. RIF Core Dialect. W3C Recommendation. https://www.w3.org/TR/rif-core/, 2010.
[32] Apache Marmotta. http://marmotta.apache.org.
[33] Blazegraph. https://blazegraph.com.
[34] Eclipse RDF4J. https://rdf4j.org.
[35] Jena Fuseki. https://jena.apache.org/documentation/fuseki2/.
[36] GeoSPARQL Fuseki. https://jena.apache.org/documentation/geosparql/geosparql-fuseki.
[37] GraphDB. https://graphdb.ontotext.com.
[38] Orri Erling. Virtuoso, a Hybrid RDBMS/Graph Column Store. IEEE Data Eng. Bull., 35(1):3–8, 2012.
[39] Virtuoso. https://virtuoso.openlinksw.com.
[40] Stardog. https://www.stardog.com.
[41] Joseph Abhayaratna, Linda van den Brink, Nicholas Car, Rob Atkinson, Timo Homburg, Frans Knibbe, Kris McGlinn, Anna Wagner, Mathias Bonduel, Mads Holten Rasmussen, and Florian Thiery. OGC Benefits of Representing Spatial Data Using Semantic and Graph Technologies, 2020.
[42] Joseph Abhayaratna, Linda van den Brink, Nicholas Car, Timo Homburg, and Frans Knibbe. OGC GeoSPARQL 2.0 SWG Charter, 2020.