Chapter

Geolocation of Cultural Heritage Using Multi-view Knowledge Graph Embedding


Abstract

Knowledge Graphs (KGs) have proven to be a reliable way of structuring data. They can provide a rich source of contextual information about cultural heritage collections. However, cultural heritage KGs are far from complete. They are often missing important attributes such as geographical location, especially for sculptures and mobile or indoor entities such as paintings. In this paper, we first present a framework for ingesting knowledge about tangible cultural heritage entities from various data sources, together with their connected multi-hop knowledge, into a geolocalized KG. Second, we propose a multi-view learning model for estimating the relative distance between a given pair of cultural heritage entities, based on both the geographical and the knowledge connections of the entities.

Keywords: Cultural heritage, Geolocation, Knowledge graphs, Multi-view graph embedding
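
The chapter's model is not reproduced on this page; as a rough illustration of the multi-view idea, the sketch below fuses a hypothetical KG-embedding view and a geographical view of each entity and regresses the relative distance of a pair. All names and dimensions are invented for illustration.

```python
# Hypothetical sketch of a multi-view pairwise-distance model: each
# entity has a "knowledge" view (a KG embedding) and a "geo" view; the
# views are fused per entity, and the model regresses the relative
# distance of a pair. Names and sizes are invented for illustration.
import torch
import torch.nn as nn

class MultiViewDistanceEstimator(nn.Module):
    def __init__(self, kg_dim=128, geo_dim=32, hidden=64):
        super().__init__()
        # one projection per view, shared by both entities of a pair
        self.kg_proj = nn.Linear(kg_dim, hidden)
        self.geo_proj = nn.Linear(geo_dim, hidden)
        # regressor over the fused pair representation; Softplus keeps
        # the predicted distance non-negative
        self.head = nn.Sequential(nn.Linear(2 * hidden, hidden),
                                  nn.ReLU(),
                                  nn.Linear(hidden, 1),
                                  nn.Softplus())

    def encode(self, kg_view, geo_view):
        # fuse the two views of one entity into a single vector
        return torch.cat([torch.relu(self.kg_proj(kg_view)),
                          torch.relu(self.geo_proj(geo_view))], dim=-1)

    def forward(self, kg_a, geo_a, kg_b, geo_b):
        za = self.encode(kg_a, geo_a)
        zb = self.encode(kg_b, geo_b)
        # |za - zb| is symmetric, so dist(a, b) == dist(b, a)
        return self.head(torch.abs(za - zb)).squeeze(-1)
```

Taking the element-wise absolute difference of the two fused representations makes the estimate symmetric in the pair, as any relative distance should be.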


... A graph structure provides the realization of both kinds of connectedness with content provided by WikiData [18]. We follow the approach of [43] for constructing a Knowledge Graph (KG) to provide content suggestions in the form of textual information and images [44]. ...
Chapter
As the concept of the Metaverse becomes a reality, storytelling tools sharpen their teeth to include Artificial Intelligence and Augmented Reality as prominent enabling features. While digitally savvy and privileged populations are well-positioned to use technology, marginalized groups risk being left behind and excluded from societal progress, deepening the digital divide. In this paper, we describe MEMEX, an interactive digital storytelling tool where Artificial Intelligence and Augmented Reality play enabling roles in support of the cultural integration of communities at risk of exclusion. The tool was developed in the context of a three-year EU-funded project, and in this paper, we focus on describing its final working prototype and its pilot study.
Article
Recent years have witnessed a shift from the potential utility of digitisation to a crucial need to enjoy activities virtually. In fact, before 2019, data curators recognised the utility of performing data digitisation, while during the lockdowns caused by COVID-19, investing in virtual and remote activities to make culture survive became crucial, as no one could enjoy Cultural Heritage in person. The Cultural Heritage community heavily invested in digitisation campaigns, mainly modelling data as Knowledge Graphs, making Cultural Heritage one of the most successful application domains for Semantic Web technologies. Despite the vast investment in Cultural Heritage Knowledge Graphs, the syntactic complexity of RDF query languages, e.g., SPARQL, negatively affects and threatens data exploitation, risking leaving this enormous potential untapped. Thus, we aim to support the Cultural Heritage community (and everyone interested in Cultural Heritage) in querying Knowledge Graphs without requiring technical competencies in Semantic Web technologies. We propose an engaging exploitation tool accessible to all without losing sight of developers' technological challenges. Engagement is achieved by letting the Cultural Heritage community leave the passive position of the visitor and actively create their own Virtual Assistant extensions to exploit proprietary or public Knowledge Graphs in question-answering. By accessible to all, we mean that the proposed software framework is freely available on GitHub and Zenodo with an open-source license. We do not lose sight of developers' technical challenges, which are carefully considered in the design and evaluation phases. This article first analyses the effort invested in publishing Cultural Heritage Knowledge Graphs to quantify the data developers can rely on in designing and implementing data exploitation tools in this domain. Moreover, we point out challenges developers may face in exploiting them in automatic approaches. Second, it presents a domain-agnostic Knowledge Graph exploitation approach based on virtual assistants, as they naturally enable question-answering features where users formulate questions in natural language directly from their smartphones. Then, we discuss the design and implementation of this approach within an automatic community-shared software framework (a.k.a. generator) of virtual assistant extensions and its evaluation in terms of performance and perceived utility according to end-users. Finally, according to a taxonomy of the Cultural Heritage field, we present a use case for each category to show the applicability of the proposed approach in the Cultural Heritage domain. In overviewing our analysis and the proposed approach, we point out challenges that a developer may face in designing virtual assistant extensions to query Knowledge Graphs, and we show the effect of these challenges in practice.
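
As a minimal sketch of the kind of shim such a generator hides from end users, the snippet below compiles one fixed question template into SPARQL and runs it against the public Wikidata endpoint via the SPARQLWrapper library. The template and the property choice (P276, "location") are illustrative assumptions, not the paper's implementation.

```python
# Compile a fixed question template ("Where is X?") into SPARQL and
# run it against Wikidata. Template and property choice are assumed.
from SPARQLWrapper import SPARQLWrapper, JSON

def where_is(label: str) -> list[str]:
    query = """
    SELECT ?placeLabel WHERE {
      ?work rdfs:label "%s"@en ;
            wdt:P276 ?place .
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
    }""" % label
    endpoint = SPARQLWrapper("https://query.wikidata.org/sparql")
    endpoint.setQuery(query)
    endpoint.setReturnFormat(JSON)
    rows = endpoint.query().convert()["results"]["bindings"]
    return [row["placeLabel"]["value"] for row in rows]

print(where_is("Mona Lisa"))  # e.g. ['Louvre'] if the label matches exactly
```
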
Article
Global problems all occur at a particular location on or near the Earth's surface. Sitting at the junction of artificial intelligence (AI) and big data, knowledge graphs (KGs) organize, interlink, and create semantic knowledge, thus attracting much attention worldwide. Although the existing KGs are constructed from internet encyclopedias and contain abundant knowledge, they lack exact coordinates and geographical relationships. In light of this, a geographical knowledge graph (GeoKG) construction method based on multisource data is proposed, consisting of a modeling schema layer and a filling data layer. This method has two advantages: (1) the knowledge can be extracted from geographic datasets; (2) the knowledge on multisource data can be represented and integrated. Firstly, the schema layer is designed to represent geographical knowledge. Then, the methods of extraction and integration from multisource data are designed to fill the data layer, and a storage method is developed to associate semantics with geospatial knowledge. Finally, the GeoKG is verified through linkage rate, semantic relationship rate, and application cases. The experiments indicate that the method could automatically extract and integrate knowledge from multisource data. Additionally, our GeoKG has a higher success rate of linking web pages with geographic datasets, and its coverage of exact coordinates has increased to 100%. This paper could bridge the gap between a Geographic Information System and a KG, thus facilitating more geospatial applications.
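
As an illustrative sketch of the "filling data layer" idea (not the paper's actual schema), the snippet below attaches exact WGS84 coordinates to a geo-entity as RDF triples using rdflib; the namespace and class names are placeholders.

```python
# Attach exact coordinates to a geo-entity as RDF triples with rdflib.
# EX and the Landmark class are invented placeholders.
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF, RDFS

GEO = Namespace("http://www.w3.org/2003/01/geo/wgs84_pos#")
EX = Namespace("http://example.org/geokg/")

g = Graph()
tower = EX["EiffelTower"]
g.add((tower, RDF.type, EX["Landmark"]))
g.add((tower, RDFS.label, Literal("Eiffel Tower", lang="en")))
g.add((tower, GEO["lat"], Literal(48.8584)))   # exact coordinates let the
g.add((tower, GEO["long"], Literal(2.2945)))   # KG answer geospatial queries
print(g.serialize(format="turtle"))
```
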
Article
Research on digital cultural heritage has raised the importance of providing visitors with relevant assistance before and during their visits. With the advent of the social web, the cultural heritage area is affected by the problem of information overload. Indeed, a large number of available resources have emerged from social information systems (SocIS). Therefore, visitors are swamped with enormous choices in the cities they visit. SocIS platforms use the features of collaborative tagging, named folksonomy, to contribute collectively to the management of shared resources. However, collaborative tagging uses uncontrolled vocabulary, which semantically weakens the description of resources and consequently degrades their classification and clustering, and thereby their recommendation. Therefore, shared resources have to be pertinently described to improve their recommendation. In this paper, we aim to enhance cultural heritage visits by suggesting semantically related places that are most likely to interest a visitor. Our proposed approach is a semantic graph-based recommender system of cultural heritage places that proceeds in two steps: (1) constructing an emergent semantic description that semantically augments the place and (2) effectively modeling the emerging graphs representing the semantic relatedness of similar cultural heritage places and their related tags. The experimental evaluation shows relevant results, attesting to the efficiency of the proposed approach.
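
A rough sketch of the tag-based relatedness step, with invented tag data: places are described by cleaned folksonomy tags, vectorized, and compared by cosine similarity so that semantically close places can be recommended together.

```python
# Places described by (cleaned) folksonomy tags, compared by cosine
# similarity over TF-IDF vectors. The tag data is invented.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

place_tags = {
    "Louvre": "museum art painting glass pyramid paris",
    "Musee d'Orsay": "museum art impressionism paris station",
    "Eiffel Tower": "landmark iron tower paris view",
}
names = list(place_tags)
X = TfidfVectorizer().fit_transform(place_tags.values())
sim = cosine_similarity(X)

# recommend the most similar other place for each one
for i, name in enumerate(names):
    j = max((k for k in range(len(names)) if k != i), key=lambda k: sim[i, k])
    print(f"{name} -> {names[j]} ({sim[i, j]:.2f})")
```
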
Conference Paper
Cultural heritage institutions store and digitize large amounts of multimedia data inside archives to make archival records findable by archivists, scientists, and the general public. Cataloging standards vary from archive to archive and, therefore, the sharing and use of this data are limited. To solve this issue, linked open data (LOD) is rising as an essential paradigm to open and provide access to archival resources. Archives which are opened to the world's knowledge benefit from external connections by enabling the application of automated approaches to process archival records, helping all stakeholders to gain valuable insights. In this paper, we present the Archive Dynamics Ontology (ArDO), an ontology designed for describing the hierarchical nature of archival multimedia data, as well as its application on the example of archival resources about the Weimar Republic. Furthermore, ArDO semantically organizes multimedia archival resources in the form of texts, images, audio, and video by representing the dynamics related to their classification over time. ArDO tracks the changes of a specific hierarchical classification schema referred to as systematics, adopted to organize archival resources under semantically defined keywords.
Article
Online cultural heritage resources are widely available through digital libraries maintained by numerous organizations. In order to improve discoverability in cultural heritage, the typical approach is metadata aggregation, a method where centralized efforts such as Europeana improve the discoverability by collecting resource metadata. The redefinition of the traditional data models for cultural heritage resources into data models based on semantic technology has been a major activity of the cultural heritage community. Yet, linked data may bring new innovation opportunities for cultural heritage metadata aggregation. We present the outcomes of a case study that we conducted within the Europeana cultural heritage network. In this study, the National Library of The Netherlands contributed by providing the role of data provider, while the Dutch Digital Heritage Network contributed as an intermediary aggregator that aggregates datasets and provides them to Europeana, the central aggregator. We identified and analyzed the requirements for an aggregation solution for the linked data, guided by current aggregation practices of the Europeana network. These requirements guided the definition of a workflow that fulfils the same functional requirements as the existing one. The workflow was put into practice within this study and has led to the development of software applications for administrating datasets, crawling the web of data, harvesting linked data, data analysis and data integration. We present our analysis of the study outcomes and analyze the effort necessary, in terms of technology adoption, to establish a linked data approach, from the point of view of both data providers and aggregators. We also present the expertise requirements we identified for cultural heritage data analysts, as well as determining which supporting tools were required to be designed specifically for semantic data.
Article
A Geographic Knowledge Graph (GeoKG) links geographic relation triplets into a large-scale semantic network utilizing the semantics of geo-entities and geo-relations. Unfortunately, the sparsity of geo-related information distribution on the web leads to a situation where information extraction systems can hardly detect enough references of geographic information in the massive web resource to be able to build relatively complete GeoKGs. This incompleteness, due to missing geo-entities or geo-relations in GeoKG fact triplets, seriously impacts the performance of GeoKG applications. In this paper, a method with geospatial distance restriction is presented to optimize knowledge embedding for GeoKG completion. This method aims to encode both the semantic information and the geospatial distance restriction of geo-entities and geo-relations into a continuous, low-dimensional vector space. Then, the missing facts of the GeoKG can be supplemented through vector operations. Specifically, the geospatial distance restriction is realized as weights in the objective functions of current translation knowledge embedding models. These optimized models output optimized representations of geo-entities and geo-relations for the GeoKG's completion. The effects of the presented method are validated with a real GeoKG. Compared with the results of the original models, the presented method improves the metric Hits@10(Filter) by an average of 6.41% for geo-entity prediction, and Hits@1(Filter) by an average of 31.92% for geo-relation prediction. Furthermore, the capacity of the proposed method to predict the locations of unknown entities is validated. The results show the geospatial distance restriction reduced the average error distance of prediction by between 54.43% and 57.24%. All the results indicate that the geospatial distance restriction implicit in the GeoKG contributes to refining the embedding representations of geo-entities and geo-relations, which plays a crucial role in improving the quality of GeoKG completion.
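
A minimal sketch of the core idea, assuming a TransE-style translation model: the margin loss of each geographic triple is weighted by a decreasing function of the geospatial distance between head and tail, so spatially close entities are pulled closer in embedding space. The exact weighting scheme below is an assumption, not the paper's formulation.

```python
# Geospatial-distance-weighted TransE margin loss (weighting assumed).
import torch
import torch.nn.functional as F

def weighted_transe_loss(h, r, t, h_neg, t_neg, geo_dist_km, margin=1.0):
    # TransE scores: ||h + r - t|| for positive and corrupted triples
    pos = torch.norm(h + r - t, p=2, dim=-1)
    neg = torch.norm(h_neg + r - t_neg, p=2, dim=-1)
    # geospatially closer pairs get larger weights (illustrative choice)
    w = 1.0 / (1.0 + geo_dist_km)
    return (w * F.relu(margin + pos - neg)).mean()
```
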
Article
The INCEPTION project, "Inclusive Cultural Heritage in Europe through 3D Semantic Modelling", which started in June 2015 and ran for four years, aims at developing advanced 3D modelling for accessing and understanding European cultural assets. One of the main challenges of the project is to close the gap between effective user experiences of Cultural Heritage via digital tools and representations, and the enrichment of scientific knowledge. Within this framework, the INCEPTION project goals are consistently aligned with the main objectives of accessing, understanding and strengthening European cultural heritage by means of enriched 3D models. At the end of the third year of activity, the project is facing several challenging actions, building on the advances already achieved in 3D data capturing and holistic digital documentation, across interdisciplinary and cross-cutting fields of knowledge. In this direction, the approach and the methodology for semantic organization and data management toward H-BIM modelling will be presented, as well as a preliminary nomenclature for semantic enrichment of heritage 3D models. According to the overall INCEPTION workflow, the H-BIM modelling procedure starts with documenting user needs, including those of experts and non-experts. The identification of the Cultural Heritage buildings' semantic ontology and the data structure for the information catalogue will allow the integration of semantic attributes with hierarchically and mutually aggregated 3D digital geometric models for the management of heritage information.
Article
data.europeana.eu is an ongoing effort of making Europeana metadata available as Linked Open Data on the Web. It allows others to access metadata collected from Europeana data providers via standard Web technologies. The data are represented in the Europeana Data Model (EDM) and the described resources are addressable and dereferencable by their URIs. Links between Europeana resources and other resources in the Linked Data Web will enable the discovery of semantically related resources. We developed an approach that allows Europeana data providers to opt for their data to become Linked Data and converts their metadata to EDM, benefiting from Europeana efforts to link them to semantically related resources on the Web. With that approach, we produced a first Linked Data version of Europeana and published the resulting datasets on the Web. We also gained experiences with respect to EDM, HTTP URI design, and RDF store performance and report them in this paper.
Conference Paper
DBpedia is a community effort to extract structured information from Wikipedia and to make this information available on the Web. DBpedia allows you to ask sophisticated queries against datasets derived from Wikipedia and to link other datasets on the Web to Wikipedia data. We describe the extraction of the DBpedia datasets, and how the resulting information is published on the Web for human- and machine-consumption. We describe some emerging applications from the DBpedia community and show how website authors can facilitate DBpedia content within their sites. Finally, we present the current status of interlinking DBpedia with other open datasets on the Web and outline how DBpedia could serve as a nucleus for an emerging Web of open data.
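
One such query, issued over DBpedia's public SPARQL endpoint with plain HTTP; the property choice (dbo:museum) is illustrative:

```python
# Query DBpedia's public SPARQL endpoint for the museum holding a work.
import requests

query = """
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbr: <http://dbpedia.org/resource/>
SELECT ?museum WHERE { dbr:Mona_Lisa dbo:museum ?museum }
"""
resp = requests.get("https://dbpedia.org/sparql",
                    params={"query": query,
                            "format": "application/sparql-results+json"},
                    timeout=10)
for row in resp.json()["results"]["bindings"]:
    print(row["museum"]["value"])  # expected: the Louvre's resource URI
```
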
Chapter
ArCo is the Italian Cultural Heritage knowledge graph, consisting of a network of seven vocabularies and 169 million triples about 820 thousand cultural entities. It is distributed jointly with a SPARQL endpoint, a software for converting catalogue records to RDF, and a rich suite of documentation material (testing, evaluation, how-to, examples, etc.). ArCo is based on the official General Catalogue of the Italian Ministry of Cultural Heritage and Activities (MiBAC) - and its associated encoding regulations - which collects and validates the catalogue records of (ideally) all Italian Cultural Heritage properties (excluding libraries and archives), contributed by CH administrators from all over Italy. We present its structure, design methods and tools, its growing community, and delineate its importance, quality, and impact.
Chapter
Knowledge graphs enable a wide variety of applications, including question answering and information retrieval. Despite the great effort invested in their creation and maintenance, even the largest (e.g., Yago, DBpedia or Wikidata) remain incomplete. We introduce Relational Graph Convolutional Networks (R-GCNs) and apply them to two standard knowledge base completion tasks: link prediction (recovery of missing facts, i.e. subject-predicate-object triples) and entity classification (recovery of missing entity attributes). R-GCNs are related to a recent class of neural networks operating on graphs, and are developed specifically to handle the highly multi-relational data characteristic of realistic knowledge bases. We demonstrate the effectiveness of R-GCNs as a stand-alone model for entity classification. We further show that factorization models for link prediction such as DistMult can be significantly improved through the use of an R-GCN encoder model to accumulate evidence over multiple inference steps in the graph, demonstrating a large improvement of 29.8% on FB15k-237 over a decoder-only baseline.
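
A minimal single-layer sketch of the R-GCN propagation rule, without the basis decomposition the paper introduces to limit parameter count; the normalization here is simplified to the node's total in-degree rather than the per-relation constant.

```python
# One R-GCN layer:
#   h_i' = ReLU( W_0 h_i + sum_r sum_{j in N_r(i)} (1/c_i) W_r h_j )
# with c_i simplified to total in-degree (the paper uses per-relation
# constants and a basis decomposition of the W_r matrices).
import torch
import torch.nn as nn

class RGCNLayer(nn.Module):
    def __init__(self, in_dim, out_dim, num_rels):
        super().__init__()
        self.rel_w = nn.Parameter(torch.randn(num_rels, in_dim, out_dim) * 0.1)
        self.self_w = nn.Linear(in_dim, out_dim, bias=False)

    def forward(self, h, edges):
        # h: (num_nodes, in_dim); edges: iterable of (src, rel, dst) indices
        out = self.self_w(h)                  # self-loop term W_0 h_i
        msgs = torch.zeros_like(out)
        deg = torch.zeros(h.size(0))
        for s, r, d in edges:
            msgs[d] += h[s] @ self.rel_w[r]   # relation-specific message
            deg[d] += 1
        return torch.relu(out + msgs / deg.clamp(min=1).unsqueeze(-1))
```
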
Article
We present graph attention networks (GATs), novel neural network architectures that operate on graph-structured data, leveraging masked self-attentional layers to address the shortcomings of prior methods based on graph convolutions or their approximations. By stacking layers in which nodes are able to attend over their neighborhoods' features, we enable (implicitly) specifying different weights to different nodes in a neighborhood, without requiring any kind of costly matrix operation (such as inversion) or depending on knowing the graph structure upfront. In this way, we address several key challenges of spectral-based graph neural networks simultaneously, and make our model readily applicable to inductive as well as transductive problems. Our GAT models have achieved state-of-the-art results across three established transductive and inductive graph benchmarks: the Cora and Citeseer citation network datasets, as well as a protein-protein interaction dataset (wherein test graphs are entirely unseen during training).
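
A compact sketch of one GAT attention head on a dense adjacency matrix (with self-loops); practical implementations use sparse neighborhoods and multiple heads, but the attention computation is the same.

```python
# One GAT head: scores from a learned linear map plus LeakyReLU,
# softmax-normalized over each node's neighborhood.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GATHead(nn.Module):
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.W = nn.Linear(in_dim, out_dim, bias=False)
        self.a = nn.Parameter(torch.randn(2 * out_dim) * 0.1)

    def forward(self, h, adj):
        # h: (N, in_dim); adj: (N, N) boolean adjacency with self-loops
        z = self.W(h)                                   # (N, out_dim)
        pairs = torch.cat([z.unsqueeze(1).expand(-1, z.size(0), -1),
                           z.unsqueeze(0).expand(z.size(0), -1, -1)], dim=-1)
        e = F.leaky_relu(pairs @ self.a, 0.2)           # (N, N) raw scores
        e = e.masked_fill(~adj, float("-inf"))          # only real edges attend
        alpha = torch.softmax(e, dim=-1)                # per-neighborhood weights
        return torch.relu(alpha @ z)
```
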
Article
The dominant sequence transduction models are based on complex recurrent or convolutional neural networks in an encoder-decoder configuration. The best performing models also connect the encoder and decoder through an attention mechanism. We propose a new simple network architecture, the Transformer, based solely on attention mechanisms, dispensing with recurrence and convolutions entirely. Experiments on two machine translation tasks show these models to be superior in quality while being more parallelizable and requiring significantly less time to train. Our model achieves 28.4 BLEU on the WMT 2014 English-to-German translation task, improving over the existing best results, including ensembles, by over 2 BLEU. On the WMT 2014 English-to-French translation task, our model establishes a new single-model state-of-the-art BLEU score of 41.0 after training for 3.5 days on eight GPUs, a small fraction of the training costs of the best models from the literature. We show that the Transformer generalizes well to other tasks by applying it successfully to English constituency parsing both with large and limited training data.
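
The attention mechanism at the Transformer's core is compact enough to state directly: Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V.

```python
# Scaled dot-product attention, the Transformer's core operation.
import math
import torch

def scaled_dot_product_attention(q, k, v, mask=None):
    # q, k, v: (..., seq_len, d_k); mask broadcasts over the score matrix
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))
    if mask is not None:
        scores = scores.masked_fill(mask == 0, float("-inf"))
    return torch.softmax(scores, dim=-1) @ v
```
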
Article
Wikidata allows every user to extend and edit the stored information, even without creating an account. A form based interface makes editing easy. Wikidata's goal is to allow data to be used both in Wikipedia and in external applications. Data is exported through Web services in several formats, including JavaScript Object Notation, or JSON, and Resource Description Framework, or RDF. Data is published under legal terms that allow the widest possible reuse. The value of Wikipedia's data has long been obvious, with many efforts to use it. The Wikidata approach is to crowdsource data acquisition, allowing a global community to edit the data. This extends the traditional wiki approach of allowing users to edit a website. In March 2013, Wikimedia introduced Lua as a scripting language for automatically creating and enriching parts of articles. Lua scripts can access Wikidata, allowing Wikipedia editors to retrieve, process, and display data. Many other features were introduced in 2013, and development is planned to continue for the foreseeable future.
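
The JSON export mentioned above can be fetched from Wikidata's documented Special:EntityData URL pattern; in this example, Q243 is the Eiffel Tower and P625 its coordinate-location claim.

```python
# Fetch an entity's JSON export from Wikidata's public endpoint.
import requests

def fetch_entity(qid: str) -> dict:
    url = f"https://www.wikidata.org/wiki/Special:EntityData/{qid}.json"
    resp = requests.get(url, timeout=10)
    resp.raise_for_status()
    return resp.json()["entities"][qid]

entity = fetch_entity("Q243")
print(entity["labels"]["en"]["value"])    # "Eiffel Tower"
# P625 holds the coordinate-location claim, if present
coords = entity["claims"]["P625"][0]["mainsnak"]["datavalue"]["value"]
print(coords["latitude"], coords["longitude"])
```
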
Article
We introduce Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions. The method is straightforward to implement and is based on adaptive estimates of lower-order moments of the gradients. The method is computationally efficient, has low memory requirements and is well suited for problems that are large in terms of data and/or parameters. The method is also appropriate for non-stationary objectives and problems with very noisy and/or sparse gradients. The method exhibits invariance to diagonal rescaling of the gradients by adapting to the geometry of the objective function. The hyper-parameters have intuitive interpretations and typically require little tuning. Some connections to related algorithms, on which Adam was inspired, are discussed. We also analyze the theoretical convergence properties of the algorithm and provide a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework. We demonstrate that Adam works well in practice when experimentally compared to other stochastic optimization methods.
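
The update rule itself is short; a plain-NumPy sketch of one Adam step with bias-corrected moment estimates:

```python
# One Adam step: bias-corrected first/second moment estimates of the
# gradient drive an adaptively scaled parameter update.
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    m = b1 * m + (1 - b1) * grad          # first moment (mean)
    v = b2 * v + (1 - b2) * grad ** 2     # second moment (uncentered variance)
    m_hat = m / (1 - b1 ** t)             # bias correction, t is the step count
    v_hat = v / (1 - b2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v
```
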
Zhang, M., Chen, Y.: Link prediction based on graph neural networks. In: Bengio, S., Wallach, H., Larochelle, H., Grauman, K., Cesa-Bianchi, N., Garnett, R. (eds.) Advances in Neural Information Processing Systems, vol. 31. Curran Associates, Inc. (2018)
Haslhofer, B., Isaac, A.: data.europeana.eu: The Europeana linked open data pilot. In: International Conference on Dublin Core and Metadata Applications, pp. 94-104 (2011)
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. Advances in Neural Information Processing Systems 30 (2017)
Veličković, P., Cucurull, G., Casanova, A., Romero, A., Lio, P., Bengio, Y.: Graph attention networks. arXiv preprint arXiv:1710.10903 (2017)
Xu, K., Hu, W., Leskovec, J., Jegelka, S.: How powerful are graph neural networks? arXiv preprint arXiv:1810.00826 (2019)