Figure - uploaded by Milos Jovanovik
Source publication
GeoSPARQL is an important standard for the geospatial linked data community, given that it defines a vocabulary for representing geospatial data in RDF, defines an extension to SPARQL for processing geospatial data, and provides support for both qualitative and quantitative spatial reasoning. However, what the community is missing is a comprehensive...
Contexts in source publication
Context 1
... results of the experiments with our benchmark and the systems listed in Table 1 are shown in Table 2 and in Figure 3, and are available online on the HOBBIT platform (Results on the HOBBIT platform: https://master.project-hobbit.eu/experiments/1612476122572, 1612477003063, 1612476116049, 1625421291667, 1612477500164, 1612661614510, 1612637531673, 1612828110551, 1612477849872; accessed on 8 July 2021). ...
Context 2
... bottom three systems are explicitly not GeoSPARQL-compliant, but we included them in our experiments as baseline tests. As we can see, they all demonstrated compatibility with either two or with three extensions of the GeoSPARQL standard (Table 3), and scored 56.67% or 46.67% of the GeoSPARQL compliance score (Table 2). This, however, does not mean that the benchmark score should start at 56.67% or 46.67%, since a benchmarked RDF storage system may fail these tests too. ...Context 3
Similar publications
We present the Linked SPARQL Queries (LSQ) dataset, which currently describes 43.95 million executions of 11.56 million unique SPARQL queries extracted from the logs of 27 different endpoints. The LSQ dataset provides RDF descriptions of each such query, which are indexed in a public LSQ endpoint, allowing interested parties to find queries with th...
Citations
... Furthermore, testing this implementation in RDF databases is still in its early stages [Habgood et al., 2022]. Few graph databases support DGGS-based indexing, such as those using S2 and H3 (e.g., NebulaGraphDB), but these are not RDF-based or GeoSPARQL-compliant [Jovanovik et al., 2021]. The commercial Foursquare GeoKG [Gundeti, 2023] utilizes H3 for location-specific business use cases, but only as an indexing framework. ...
... B. GeoSPARQL: In the RDF domain, OGC's GeoSPARQL standard is the leading specification for representing and querying geospatial data. It is implemented as an extension by many RDF graph databases [Jovanovik et al., 2021]. GeoSPARQL leverages OGC's Simple Features ontology for defining spatial entities, as shown in the figure. ...
Geospatial Knowledge Graphs (GeoKGs) have become integral to the growing field of Geospatial Artificial Intelligence. Initiatives like the U.S. National Science Foundation's Open Knowledge Network program aim to create an ecosystem of nation-scale, cross-disciplinary GeoKGs that provide AI-ready geospatial data aligned with FAIR principles. However, building this infrastructure presents key challenges, including 1) managing large volumes of data, 2) the computational complexity of discovering topological relations via SPARQL, and 3) conflating multi-scale raster and vector data. Discrete Global Grid Systems (DGGS) help tackle these issues by offering efficient data integration and representation strategies. The KnowWhereGraph utilizes Google's S2 Geometry -- a DGGS framework -- to enable efficient multi-source data processing, qualitative spatial querying, and cross-graph integration. This paper outlines the implementation of S2 within KnowWhereGraph, emphasizing its role in topologically enriching and semantically compressing data. Ultimately, this work demonstrates the potential of DGGS frameworks, particularly S2, for building scalable GeoKGs.
... GeoSPARQL defines an RDF vocabulary to express positions and polygons using either the WKT (OpenGIS, 2023) or GML (OpenGIS, 2016) serializations and a series of SPARQL functions to query spatial relationships. However, GeoSPARQL is inconsistently implemented and partially supported across triplestores (Jovanovik et al., 2021), which can lead to obtaining incorrect results (Jovanovik et al., 2021) or performance issues (Li et al., 2022). ...
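The vocabulary and query functions described in these snippets can be illustrated with a minimal GeoSPARQL query sketch. The prefixes below are the standard GeoSPARQL namespaces; the feature data and polygon coordinates are purely illustrative assumptions, not taken from the source publication:

```sparql
PREFIX geo:  <http://www.opengis.net/ont/geosparql#>
PREFIX geof: <http://www.opengis.net/def/function/geosparql/>

# Assumed example data (Turtle):
#   :park a geo:Feature ;
#         geo:hasGeometry :parkGeom .
#   :parkGeom a geo:Geometry ;
#         geo:asWKT "POLYGON((16.1 41.1, 16.2 41.1, 16.2 41.2, 16.1 41.2, 16.1 41.1))"^^geo:wktLiteral .

# Find all features whose WKT geometry lies within a given polygon
SELECT ?feature
WHERE {
  ?feature geo:hasGeometry/geo:asWKT ?wkt .
  FILTER(geof:sfWithin(?wkt,
    "POLYGON((16.0 41.0, 16.5 41.0, 16.5 41.5, 16.0 41.5, 16.0 41.0))"^^geo:wktLiteral))
}
```

Whether such a query runs correctly (or at all) depends on the triplestore at hand, which is exactly the inconsistency in GeoSPARQL support that the cited benchmark measures.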
Introduction
Modern forestry increasingly relies on the management of large datasets, such as forest inventories and land cover maps. Governments are typically in charge of publishing these datasets, but they often employ disparate data formats (sometimes proprietary ones), and published datasets are commonly disconnected from other sources, including previous versions of the same datasets. As a result, using forestry data is very challenging, especially when multiple datasets need to be combined.
Methods and results
Semantic Web technologies, standardized by the World Wide Web Consortium (W3C), have emerged in the last decades as a solution to publish heterogeneous data in an interoperable way. They enable the publication of self-describing data that can easily interlink with other sources. The concepts and the relationships between them are described using ontologies, and the data can be published as Linked Data on the Web, where it can be downloaded or queried online. National and international agencies promote the publication of governmental data as Linked Open Data, and research fields such as biosciences or cultural heritage make extensive use of Semantic Web technologies. In this study, we present the results of the European Cross-Forest project, addressing the integration and publication of national forest inventories and land cover maps from Spain and Portugal using Semantic Web technologies. We used a bottom-up methodology to design the ontologies, with the goal of being generalizable to other countries and forestry datasets. First, we created an ontology for each dataset to describe the concepts (plots, trees, positions, measures, and so on) and the relationships between the data in detail. We converted the source data into Linked Open Data by using the ontologies to annotate the data, linking it to resources such as species taxonomies. As a result, all the datasets are integrated into one place, the Cross-Forest dataset, and are available for querying and analysis through a SPARQL endpoint. These data have been used in real-world use cases such as (1) providing a graphical representation of all the data, (2) combining it with spatial planning data to reveal the forestry resources under the management of Spanish municipalities, and (3) facilitating data selection and ingestion to predict the evolution of forest inventories and simulate how different actions and conditions impact this evolution.
Discussion
The work started in the Cross-Forest project continues in current lines of research, including the addition of the temporal dimension to the data, aligning the ontologies and data with additional well-known vocabularies and datasets, and incorporating additional forestry resources.
... With an increasing focus on urban building energy consumption, this chapter introduces an ontology-driven method for urban building energy simulation. This chapter addresses the challenges related to data integration from diverse fields, streamlines simulation file generation, and achieves building energy simulation. The proposed Building Template Ontology not only outlines the necessary data fields in an energy simulation template but also clarifies the intrinsic relationships among multi-field data. ...
... This approach involves extracting relevant data using mappings and automating queries to construct a typical Linked Data graph from data instances. In line with this concept, this chapter develops UBO, aligning each pertinent term and concept appropriately under geo:SpatialObject of GeoSPARQL [44], as illustrated in Figure 2.3. The Open Geospatial Consortium (OGC) GeoSPARQL standard defines a vocabulary for representing geospatial data in RDF on the Semantic Web. ...
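As a rough illustration of this kind of alignment, a domain class can be placed under GeoSPARQL's class hierarchy so that its instances inherit spatial semantics. The ubo: namespace, class name, and geometry below are hypothetical stand-ins, not the actual UBO vocabulary:

```turtle
@prefix geo:  <http://www.opengis.net/ont/geosparql#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix ubo:  <https://example.org/ubo#> .

# Hypothetical alignment: a building class subclasses geo:Feature,
# which is itself a subclass of geo:SpatialObject in GeoSPARQL.
ubo:Building rdfs:subClassOf geo:Feature .

# An instance then carries a geometry usable by GeoSPARQL functions.
ubo:building_001 a ubo:Building ;
    geo:hasGeometry [ a geo:Geometry ;
        geo:asWKT "POLYGON((0 0, 10 0, 10 20, 0 20, 0 0))"^^geo:wktLiteral ] .
```

With this alignment, building footprints become queryable with the same spatial predicates and functions as any other GeoSPARQL feature.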
Urbanization poses a significant challenge in the 21st century. Currently, more than half of the global population resides in urban areas, and this percentage is projected to reach 68% by 2050. The increase in urban population has led to a substantial rise in residential energy consumption, alongside a surge in commercial energy use to meet the growing demand for services. Consequently, overall building energy consumption has witnessed a significant increase. Therefore, effectively managing energy use in urban buildings has become imperative. To achieve this goal, various methodologies and tools for urban building energy modeling have been developed. These models offer valuable insights into the energy demands of building stock, covering benchmarking analysis, scenario assessments, peak load evaluations, energy pattern analysis, and other specialized analyses.
Despite extensive research in the field of energy modeling, assessing urban energy remains complex due to three significant challenges. Firstly, urban building simulation involves various aspects such as geography, construction, materials, and HVAC (Heating, Ventilation, and Air Conditioning) systems, each of which is stored in its own unique data model. As a result, creating text-based simulation files for urban buildings from scratch is an intricate task which requires the integration and processing of cross-domain data models. Secondly, conventional simulation models rely on climate conditions provided by a limited number of weather stations, which do not accurately capture the microclimate variations caused by urban morphologies, natural conditions, and man-made structures. This limitation results in unrealistic and unreliable simulation outputs, further hindering effective decision-making for urban sustainability. Lastly, previous efforts have primarily focused on complex physical conditions within cities but have often encountered challenges such as intricate modeling and substantial computational loads.
To address these gaps, this dissertation proposes a system architecture for urban building energy distributed simulation. The first aspect involves designing ontologies using semantic network technology, grounded in the features of building energy simulation inputs, clearly defining the potential logical relationships between the inputs, and facilitating the generation of qualified simulation files. Additionally, the concept of UrbanPatch, which represents the microclimate perception domain of urban buildings, is introduced. By analyzing the building morphology and green spaces within each UrbanPatch, a microclimate tuning approach is proposed to localize weather conditions for buildings. Finally, a rapid simulation approach is created, which decomposes the city model into spatially correlated building blocks for distributed simulation. The proposed algorithm, known as distributed adjacency blocks (DABs), uses 2D footprints to construct 3D building groups and considers solar azimuth angles, altitude angles, and shading planes to simplify the simulation targets. Using multiple threads and abstracted inter-building boundary conditions, the energy dynamics of an entire city can be simulated in parallel.
The innovative system architecture for urban building energy distributed simulation proposed in this dissertation offers a novel solution that prompts researchers to reconsider the traditional bottom-up approach towards city-scale energy simulation. Centered around distributed building networks, this dissertation not only distributes the computational load across multiple computing components, enabling dynamic energy simulations for extensive metropolitan areas, but also accounts for the influence of microclimate on building energy consumption in the urban built environment, resulting in more precise and reliable simulation outcomes and enhancing the efficiency of city energy decision-making and management.
... It is written in the Java programming language and builds on a previous generator, in order to improve some of the metrics in the resulting graph and make its features closer to a real-world RDF dataset. Aside from this, our team has also worked with other RDF graph generators, for instance in the field of geospatial data [16][15][14] and in benchmarking RDF storage solutions [13][18]. All of these examples include purpose-built RDF data generators, which serve a specific need. ...
This paper introduces RDFGraphGen, a general-purpose, domain-independent generator of synthetic RDF graphs based on SHACL constraints. The Shapes Constraint Language (SHACL) is a W3C standard which specifies ways to validate data in RDF graphs, by defining constraining shapes. However, even though the main purpose of SHACL is validation of existing RDF data, in order to solve the problem with the lack of available RDF datasets in multiple RDF-based application development processes, we envisioned and implemented a reverse role for SHACL: we use SHACL shape definitions as a starting point to generate synthetic data for an RDF graph. The generation process involves extracting the constraints from the SHACL shapes, converting the specified constraints into rules, and then generating artificial data for a predefined number of RDF entities, based on these rules. The purpose of RDFGraphGen is the generation of small, medium or large RDF knowledge graphs for the purpose of benchmarking, testing, quality control, training and other similar purposes for applications from the RDF, Linked Data and Semantic Web domain. RDFGraphGen is open-source and is available as a ready-to-use Python package.
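To illustrate the idea of SHACL-driven generation described above, consider a shape of the kind such a generator could take as input. The shape below is a hypothetical example (the ex: namespace and the specific constraints are assumptions for illustration), not part of RDFGraphGen's distribution:

```turtle
@prefix sh:     <http://www.w3.org/ns/shacl#> .
@prefix xsd:    <http://www.w3.org/2001/XMLSchema#> .
@prefix schema: <https://schema.org/> .
@prefix ex:     <https://example.org/> .

# A shape constraining persons: exactly the kind of input a
# constraint-to-rule generator can invert into synthetic data.
ex:PersonShape a sh:NodeShape ;
    sh:targetClass schema:Person ;
    sh:property [ sh:path schema:name ;
                  sh:datatype xsd:string ;
                  sh:minCount 1 ] ;
    sh:property [ sh:path schema:age ;
                  sh:datatype xsd:integer ;
                  sh:minInclusive 0 ;
                  sh:maxInclusive 100 ] .
```

Read in reverse, these constraints become generation rules: every synthetic schema:Person entity must receive at least one string name and an integer age drawn from [0, 100].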
... GeoSPARQL forms the de-facto standard for representing and querying geospatial data on the Semantic Web, including an extension to the SPARQL query language for processing geospatial data. The level of support for GeoSPARQL varies among different triple stores, but remains limited and inconsistent (Chadzynski et al., 2021; Jovanovik et al., 2021): While RDF4J, for example, provides "partial GeoSPARQL support" (Eclipse Foundation, 2021), Blazegraph (which is used in this study) does not support GeoSPARQL at all and instead offers a custom subsystem for simple geospatial queries (Blazegraph, 2020b). ...
This article proposes a framework of linked software agents that continuously interact with an underlying knowledge graph to automatically assess the impacts of potential flooding events. It builds on the idea of connected digital twins based on the World Avatar dynamic knowledge graph to create a semantically rich asset of data, knowledge, and computational capabilities accessible to humans, applications, and artificial intelligence. We develop three new ontologies to describe and link environmental measurements and their respective reporting stations, flood events, and their potential impact on population and built infrastructure as well as the built environment of a city itself. These coupled ontologies are deployed to dynamically instantiate near real-time data from multiple fragmented sources into the World Avatar. Sequences of autonomous agents connected via the derived information framework automatically assess consequences of newly instantiated data, such as newly raised flood warnings, and cascade respective updates through the graph to ensure up-to-date insights into the number of people and building stock value at risk. Although we showcase the strength of this technology in the context of flooding, our findings suggest that this system-of-systems approach is a promising solution to build holistic digital twins for various other contexts and use cases to support truly interoperable and smart cities.
... In addition, this includes domain-specific research collections and Wikidata projects as part of the Collection Research Network [29] (p. 2), such as Roman Open Data (https://romanopendata.eu, accessed on 9 March 2022), Linked Open Samian Ware [30][31][32][33], Linked Ogham Data [34][35][36][37][38][39][40] (pp. 119–127), the Wiki-Project Prähistorische Keramik [41,42], and Linked Aegean Seals [43] within the Corpus of the Minoan and Mycenaean Seals (CMS) project (http://cmsheidelberg.uni-hd.de, accessed on 9 March 2023) using the iDAI.world ...
... In the field of archaeology, research on Digital Twins is on the rise, such as in graph-based data management for cultural heritage conservation with Digital Twins [112], Digital Twins and 3D documentation using multi-lens photogrammetric approaches [113], and SPARQLing Ogham Digital Twins as Linked Open Data [40] (pp. 119–127), [34] on the basis of the Ogham in 3D project, combining 3D capturing and EPIDOC XML modelling [114,115]. The following subsections describe geometric capturing (Section 2.4.1) and semantic archaeological modelling (Section 2.4.2) in the ARS3D project. ...
... For open graph-based provision under the FAIR principles, three community standards in particular are used in the field of cultural heritage: the CIDOC Conceptual Reference Model (CIDOC CRM) ontology, the Resource Description Framework (RDF) of the World Wide Web Consortium (W3C), and the Linked Open Data (LOD) Principles [119]. The Open Geospatial Consortium (OGC) standard GeoSPARQL has established itself in the field of geodata, enabling the semantic description of geodata using simple features and providing functions for querying, such as in PostGIS [120]. ...
In this paper, we introduce applications of Artificial Intelligence techniques, such as Decision Trees and Semantic Reasoning, for semi-automatic and semantic-model-based decision-making for archaeological feature comparisons. This paper uses the example of Roman African Red Slip Ware (ARS) and the collection of ARS at the LEIZA archaeological research institute. The main challenge is to create a Digital Twin of the ARS objects and artefacts using geometric capturing and semantic modelling of archaeological information. Moreover, the individualisation and comparison of features (appliqués), along with their visualisation, extraction, and rectification, results in a strategy and application for comparison of these features using both geometrical and archaeological aspects with a comprehensible rule set. This method of a semi-automatic semantic model-based comparison workflow for archaeological features on Roman ceramics is showcased, discussed, and concluded in three use cases: woman and boy, human–horse hybrid, and bears with local twists and shifts.
... A recent piece by Ioannidis, Garbis, Kyzirakos, Bereta, and Koubarakis (2021) provided a benchmark on geospatial RDF stores from a computational perspective; however, the RDF/triple stores they selected, such as Parliament and Strabon (Kyzirakos, Karpathiotakis, & Koubarakis, 2012), may not be the most active platforms, and the evaluation covers only a single type of semantic data repository: triple stores. Jovanovik, Homburg, and Spasić (2021) conducted a comprehensive review of the compatibility of various triple stores with GeoSPARQL (SPARQL with geospatial functions; SPARQL: SPARQL Protocol and RDF Query Language); however, computational performance is not evaluated. ...
... While recent literature has evaluated the availability and compatibility of semantic data repositories to support spatial queries, almost all focus on RDF triple stores (Ioannidis et al., 2021;Jovanovik et al., 2021;Raza, 2019) and very few have addressed this question from a computational performance perspective. In this paper, we provide a comprehensive analysis of a variety of semantic repository solutions, including RDF triple stores, property graph databases, and OBDA platforms in terms of their capabilities, community activeness, and computational efficiency for handling spatial-semantic queries. ...
Knowledge graph has become a cutting-edge technology for linking and integrating heterogeneous, cross-domain datasets to address critical scientific questions. As big data has become prevalent in today's scientific analysis, semantic data repositories that can store and manage large knowledge graph data have become critical in successfully deploying spatially explicit knowledge graph applications. This paper provides a comprehensive evaluation of the popular semantic data repositories and their computational performance in managing and providing semantic support for spatial queries. There are three types of semantic data repositories: (1) triple store solutions (RDF4j, Fuseki, GraphDB, Virtuoso), (2) property graph databases (Neo4j), and (3) an Ontology-Based Data Access (OBDA) approach (Ontop). Experiments were conducted to compare each repository's efficiency (e.g., query response time) in handling geometric, topological, and spatial-semantic related queries. The results show that Virtuoso achieves the overall best performance in both non-spatial and spatial-semantic queries. The OBDA solution, Ontop, has the second-best query performance in spatial and complex queries and the best storage efficiency, requiring the least data-to-RDF conversion efforts. Other triple store solutions suffer from various issues that cause performance bottlenecks in handling spatial queries, such as inefficient memory management and lack of proper query optimization.
... Conformance testing was performed with an updated version of an existing GeoSPARQL compliance benchmark [3]. ...
... Another touted benefit of DGGSes is their ability to represent both raster and vector spatial information in unified form, for a given spatial accuracy. Commercial companies exist internationally that specialise in raster and vector spatial data integration via DGGS, and some large technology companies are known to employ DGGS for large-scale spatial data operations. ...
... We chose an extended version of the GeoSPARQL 1.0 compliance benchmark [3] to test for the compatibility of the given implementations. We added new sub-tests for the existing requirements in order to include the new DGGS literals in the testing. ...
We set out to determine the feasibility of implementing Discrete Global Grid System (DGGS) representations of geometry support in a GeoSPARQL-enabled triplestore, and test the GeoSPARQL compliance for it. The implementation is a variant of Apache Jena's existing GeoSPARQL support. Compliance is tested using an adapted implementation of the GeoSPARQL Compliance Benchmark testing system developed previously to test for GeoSPARQL 1.0 compliance. The benchmark results confirm that a majority of the functions which were set out to be implemented in the course of this paper were implemented correctly, and point out possible future work towards full compliance.
... To test whether the reference implementation and all following implementations fulfil the criteria that the given standard sets, compliance benchmarking can be used. [13] created the first compliance benchmark for GeoSPARQL 1.0 using the HOBBIT benchmarking platform [34]. Once an execution of the GeoSPARQL compliance benchmark is finished, it may produce a benchmark result in RDF (https://github.com/hobbit-project/platform/issues/531, ...
... The GeoSPARQL implementation of the Apache Jena software library GeoSPARQL-Jena [38] provides, according to recent benchmarks [13], the only complete implementation of the GeoSPARQL 1.0 specification. In addition, GeoSPARQL-Jena has been extended in a prototypical use case to support raster data in [39]. ...
In 2012 the Open Geospatial Consortium published GeoSPARQL, defining "an RDF/OWL ontology for [spatial] information", "SPARQL extension functions" for performing spatial operations on RDF data, and "RIF rules" defining entailments to be drawn from graph pattern matching.
In the 8+ years since its publication, GeoSPARQL has become the most important spatial Semantic Web standard, as judged by references to it in other Semantic Web standards and its wide use for Semantic Web data.
An update to GeoSPARQL was proposed in 2019 to deliver a version 1.1, with a charter to handle outstanding change requests, to source new ones from the user community, and to "better present" the standard, that is, to better link all the standard's parts and better document & exemplify its elements. Expected updates included new geometry representations, alignments to other ontologies, handling of new spatial referencing systems, and new artifact presentation. This paper describes motivating change requests and the actual resultant updates in the candidate version 1.1 of the standard, alongside reference implementations and usage examples.
We also describe the theory behind particular updates, initial implementations of many parts of the standard, and our expectations for GeoSPARQL 1.1's use.
... Also, triplestores support GeoSPARQL in many different ways. (Jovanovik et al., 2021) tested the GeoSPARQL support of several triplestores and pointed out that the choice of the right triplestore is important for good geometry support. Their results also show that there is no triplestore which fully supports GeoSPARQL. ...
The integration of geodata and building models is one of the current challenges in the AECOO (architecture, engineering, construction, owner, operation) domain. Data from Building Information Models (BIM) and Geographical Information Systems (GIS) can't simply be mapped 1:1 to each other because of their different domains. One possible approach is to convert all data into a domain-independent format and link them together in a semantic database. To demonstrate how this data integration can be done in a federated database architecture, we utilize concepts of the Semantic Web, ontologies, and the Resource Description Framework (RDF). It turns out, however, that traditional object-relational approaches provide more efficient access methods on geometrical representations than triplestores. Therefore we developed a hybrid approach with files, geodatabases, and triplestores. This work-in-progress paper (extended abstract) demonstrates our intermediate research results by practical examples and identifies opportunities and limitations of the hybrid approach.