Conference Paper

GraphChain: A Distributed Database with Explicit Semantics and Chained RDF Graphs

Authors:
  • MakoLab USA

Abstract

In this paper we present a new idea for creating a Blockchain-compliant distributed database which exposes its data with explicit semantics, is easily and natively accessible, and which applies Blockchain securitization mechanisms to the RDF graph data model directly, without additional packaging or specific serialisation. Essentially, the resulting database forms a linked chain of named RDF graphs and is given a name: GraphChain. Such graphs can then be published with the help of any standard mechanisms, either in triplestores or as linked data objects accessible via standard web mechanisms using the HTTP protocol, to make them available on the web. They can also be easily queried using techniques like SPARQL or methods typical of the available RDF graph frameworks (such as rdflib, Apache Jena, RDF4J, OWL API, RDF HDT, dotnetRDF and others). The GraphChain concept comes with its own OWL-compliant ontology that defines all the structural, invariant elements of a GraphChain together with their basic semantics. The paper also describes a few simple, prototypical GraphChain implementations, with examples created using Java, .NET/C# and JavaScript/Node.js frameworks.
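As a rough illustration of the "linked chain of named RDF graphs" idea described in the abstract, the hedged Python sketch below builds a short chain with rdflib, where each named graph records the IRI and hash of its predecessor. The gc: vocabulary terms, the graph IRIs and the naive sorted-serialisation hash are illustrative assumptions, not the GraphChain ontology or hashing procedure defined in the paper.

```python
# Illustrative sketch only: the gc: terms and the naive hashing below are
# assumptions for demonstration, not the GraphChain ontology or algorithm.
import hashlib
from rdflib import Dataset, Graph, Literal, Namespace, URIRef

GC = Namespace("http://example.org/graphchain#")   # hypothetical vocabulary
EX = Namespace("http://example.org/data#")

def naive_graph_hash(g: Graph) -> str:
    # Hash of the lexicographically sorted N-Triples lines; this placeholder
    # ignores the blank-node problem discussed elsewhere on this page.
    data = g.serialize(format="nt")
    if isinstance(data, bytes):          # older rdflib versions return bytes
        data = data.decode("utf-8")
    lines = sorted(line for line in data.splitlines() if line.strip())
    return hashlib.sha256("\n".join(lines).encode("utf-8")).hexdigest()

ds = Dataset()
prev_iri, prev_hash = None, None
for i in range(3):
    graph_iri = URIRef(f"http://example.org/chain/graph{i}")
    g = ds.graph(graph_iri)                           # a named graph in the dataset
    g.add((EX[f"entity{i}"], EX.payload, Literal(i)))  # some payload triples
    if prev_iri is not None:                          # back-link to the predecessor
        g.add((graph_iri, GC.previousGraph, prev_iri))
        g.add((graph_iri, GC.previousHash, Literal(prev_hash)))
    # The recorded hash covers the payload plus the back-link of this graph.
    prev_iri, prev_hash = graph_iri, naive_graph_hash(g)

print("head of chain:", prev_iri, prev_hash)
```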


... Sopek et al. [12] and Tomaszuk et al. [13] introduce and extend GraphChain as a framework for on-chain data management for BCT using the synergies between ontologies and BCT. ...
... Well-developed data models are urgently needed to make connections and knowledge explicit [19]. Official standards for querying and data storage in BCT are essential [12]. Processing files in BCT requires file storage combined with Knowledge Graph traceability [10]. ...
... Data can also be simply linked to other sources of information using ST approaches. It is now possible to connect domain-specific data from sources external to the chain, for example linking BCT Open Badge information with other Linked Data resources [12]. Interoperability will be simpler and complex domains will become easier to deal with [18]. ...
... RDF has been successfully implemented in projects using blockchain technology [7], [8], [9], [10]. Unfortunately, the presence of blank nodes in RDF complicates some fundamental operations on RDF graphs. ...
... "Ad hash" algorithm [13] meets this criteria and exhibits a good compromise between performance and security. RDF graph hashing procedure based on this algorithm was presented in [8]. The authors of that approach defined the hash of the graph as a result of a specific summation of the hashes of all triples of the graph: ...
... This is followed by lexical sorting of the RDF triples, after which the blank nodes are re-labelled. In [21] and [8], the authors propose algorithms that support graphs with connected blank nodes in large and dense structures. The first approach, by Lantzaki et al. [21], computes the signatures of blank nodes from the constant terms in their close vicinity. ...
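A minimal sketch of this neighbourhood-signature idea might look as follows; the SHA-256 hashing and the XOR combination of incident ground terms are illustrative assumptions, not the actual algorithm of [21].

```python
# Illustration only: signature of a blank node from the ground terms of the
# triples it appears in; the real algorithm in [21] is more elaborate.
import hashlib
from rdflib import Graph, BNode

def blank_node_signature(g: Graph, b: BNode) -> str:
    acc = 0
    for s, p, o in g:
        if b not in (s, o):
            continue                              # only triples touching b
        for term in (s, p, o):
            if isinstance(term, BNode):
                continue                          # skip other blank nodes
            digest = hashlib.sha256(str(term).encode("utf-8")).digest()
            acc ^= int.from_bytes(digest, "big")  # order-independent combine
    return format(acc, "064x")
```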
... The authors of this paper have created a working model of a system that directly addresses the challenge and brings Blockchain mechanisms to the RDF graph database [12]. This model is called GraphChain [13], and its first application was for enhancing trust of the digital identity system for legal entities [14]. ...
... The definition and the first implementation of GraphChain, which forms the foundation on which Ontospace has been built, were first proposed in 2018 [13]. GraphChain is a Blockchain solution in which the fundamental data model is a collection of linked named RDF Graphs. ...
... The fundamental advantage of such an approach for users is that they can work with the chained named graphs using standard tools developed in the domain of semantic web technology, like SPARQL for querying [16], Linked Data mechanisms for accessing the nodes of the graphs [17], reasoners for ontologies [18] and many others, while benefiting from Blockchain mechanisms in their capacity to guarantee trust in the data. The first implementation of GraphChain used the first-generation Blockchain framework Hyperledger Indy [13]. While it was beneficial for the first application (the digital identity system for legal entities implemented for the LEI.INFO portal [14]), it was not rich enough for more sophisticated applications that required smart-contract-based transactions. ...
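As a small, hedged illustration of querying chained named graphs with standard semantic web tooling, the sketch below runs a SPARQL query over the dataset ds built in the earlier rdflib sketch and follows the hypothetical gc:previousGraph links; the vocabulary and graph IRIs remain illustrative assumptions, not part of GraphChain itself.

```python
# Continues the illustrative `ds` Dataset from the earlier sketch;
# the gc: terms are hypothetical.
query = """
PREFIX gc: <http://example.org/graphchain#>
SELECT ?g ?prev ?h WHERE {
  GRAPH ?g {
    ?g gc:previousGraph ?prev ;
       gc:previousHash  ?h .
  }
}
ORDER BY ?g
"""
for row in ds.query(query):
    # Each row pairs a graph with its claimed predecessor and recorded hash,
    # which a verifier would re-compute and compare.
    print(f"{row.g} is chained to {row.prev} (recorded hash {row.h})")
```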
Article
Full-text available
In this article we present the technological foundations on which an ecosystem of semantic data objects can be implemented on the latest Blockchain-based systems. As the most important citizens among the semantic data objects are ontologies, the ecosystem is referred to as Ontospace. The foundations can be characterized by their architectural, cryptographic and transactional aspects. The architectural aspect borrows from the latest Layer-2 protocols of the 3rd generation blockchains and from the rules of Linked Data systems creation. The cryptographic aspect represents an original work that attempts to resolve the issue of efficient hashing of the graph data structures. The transactional aspect is concerned with the graph replication consistency, conditions for the direct access to graph data from the blockchain smart-contracts and with linkage between sidechains bearing semantic objects and the main network. Large parts of the work were implemented in the context of the Ontochain project – a part of the Next Generation Internet EU Initiative.
... Blockchains [15,27,32,37] are chains of data blocks that represent global ledgers. They rely on the consensus of participating nodes and facilitate updates by adding new blocks to the chain. ...
... Using blockchain technology for the Web of Data could thus allow users to collaborate on new updates to the data, keep the published data up-to-date, and improve its quality by correcting mistakes in community-driven efforts. However, blockchains typically replicate the chain on all participating nodes [37] to ensure immutability and persistence [32]. While this increases availability and security, it also requires every node to provide a considerably large amount of resources to store multiple large knowledge graphs. ...
... They rely on the consensus of participating nodes on the state of the ledger. Using blockchains over decentralized knowledge graphs has, to the best of our knowledge, only briefly been researched [15,32]. However, such systems require the entire chain to be stored on all nodes [37], pack structured data into blocks of a fixed size, and guarantee immutability of the data itself. ...
... It is best known from its popularized use in Bitcoin, Ethereum, Dogecoin, and other cryptocurrencies. Blockchain is not limited to financial usage, as it can be extended to encompass other types of systems and applications and to build decentralized networks [38]. Asymmetric cryptography and distributed consensus algorithms are part of Blockchain systems, providing user security and ledger consistency [39]. ...
... Blockchain exhibits the following key characteristics [38,39]: ...
... Graphchain also has the benefit of using parallel mining [52] for increased performance and transaction processing. Graphchain can be implemented with semantic technology that provides relations and "meaning" for the data structure, enhancing the distributed ledger component [38]. However, Graphchain has an issue: despite the steps taken to assure a decentralized system, centralization occurs within Graphchain because all newly created transactions share a common descendant [51]. ...
Preprint
Full-text available
Blockchain has made an impact on today's technology by revolutionizing the financial industry through its use in cryptocurrency and the decentralization features it provides. With the current trend of pursuing the decentralized Internet, many methods have been proposed to achieve decentralization considering different aspects of the current Internet model, ranging from infrastructure and protocols to services and applications. This paper focuses on using Blockchain to provide a robust and secure decentralized computing system. The paper conducts a literature review on Blockchain-based methods capable of decentralizing the future Internet. To achieve that decentralization, two research aspects of Blockchain have been investigated that are highly relevant in realizing the decentralized Internet. The first aspect is the consensus algorithms, which are vital components for the decentralization of Blockchain. We have identified three consensus algorithms, PoP, Paxos, and PoAH, as more adequate for reaching consensus in a Blockchain-enabled Internet architecture. The second aspect we investigated is the impact of future Internet technologies on Blockchain, where their combination with Blockchain would help it overcome its established flaws and become more optimized and applicable for Internet decentralization.
... How have the use cases of semantic blockchain been developed? This question could be formulated on the basis of [2,3,4,5,6,7,8,9]. One of the main interests for the semantic blockchain research topic is to find information about how these two technologies can be interconnected so that they work together. Question RQ1 will allow us to learn the current state of research and answer the following questions: which advances have been made in this area, and which applications have been conceived or built using the technology? ...
... - Development methodologies used to build ontologies for a blockchain network: information was found about only one, which follows the SWRL semantic rules [18]. RQ2: With which strategies are semantic smart contracts being built? This question was answered on the basis of [7], [9], [10], [11], [12], [13] and [14]. ...
... Given the additional benefits that blockchain provides, how could the Semantic Web take advantage of them to solve the challenges that have been encountered in its implementation? [2,3,4,5,6,7,8,9]. ...
Research
Full-text available
1. Abstract Semantic Blockchain is an area of knowledge in continuous growth that has emerged in recent years thanks to the progress of Blockchain as a distributed and decentralized database with a high level of reliability and traceability, which has made it an ideal candidate to contribute to Web 3.0. This literature review deals with Semantic Blockchain and the case studies carried out in this area, covering ontologies and the technologies employed. Applying the Torres-Carrión method, a central construct was defined for the research as the basis for building the search script, which was applied to the selected scientific databases (Scopus, IEEE Xplore and Google Scholar). Four research questions are proposed, involving the use cases for Semantic Blockchain, the guidelines on the methodologies employed, the frameworks used and available, the technologies in use, and some blockchain ontologies that we could take as a basis for our research. As a final result, a list of relevant journals and databases in the area was obtained; 20 articles were analysed to answer the research questions, and all the information has been organized in a structured way, which allows the researcher to establish a valid context from which to focus future research. 2. Introduction Semantic Blockchain is an emerging technology that has arisen in recent years as a result of the union of the Semantic Web and Blockchain, with the aim of each strengthening the other to contribute to the formation of the so-called Web 3.0, or web of data. This technology is very recent, so the existing information about it is very limited; for that reason, one of our objectives is to contribute to knowledge about the research and the applications that could be developed with this new area of knowledge in the specific field of education, and in particular to support traceability among academic consortia. For this reason, a systematic review of the scientific literature on these topics was carried out.
... Unfortunately, most of the existing DKG infrastructures tend to prioritize query efficiency over trustworthiness [2,3,8,17], although recent studies exist that focus on data integrity in DKGs [4,47,55]. For example, ColChain [4] enables data integrity in Byzantine environments by establishing storage nodes that maintain duplicate, immutable copies of subgraphs through blockchain consensus. ...
... Blockchain is a tamper-proof and decentralized ledger for reliable and auditable data storage [43,63] and has been used widely to enable trustworthy database management [25,32,44,49,51,68,75]. Several works have explored using blockchain to store linked data in KGs, which allows untrusted nodes to collaborate on data updates and to keep a trusted historical record of it [19,55,62]. However, a major challenge of using blockchain in KGs is that it requires every node to maintain a full copy of all data, which results in poor scalability and high hardware requirements. ...
Article
Full-text available
The ability to decentralize knowledge graphs (KG) is important to exploit the full potential of the Semantic Web and realize the Web 3.0 vision. However, decentralization also renders KGs more prone to attacks with adverse effects on data integrity and query verifiability. While existing studies focus on ensuring data integrity, how to ensure query verifiability - thus guarding against incorrect, incomplete, or outdated query results - remains unsolved. We propose VeriDKG, the first SPARQL query engine for decentralized knowledge graphs (DKG) that offers both data integrity and query verifiability guarantees. The core of VeriDKG is the RGB-Trie, a new blockchain-maintained authenticated data structure (ADS) facilitating correctness proofs for SPARQL query results. VeriDKG enables verifiability of subqueries by gathering global index information on subgraphs using the RGB-Trie, which is implemented as a new variant of the Merkle prefix tree with an RGB color model. To enable verifiability of the final query result, the RGB-Trie is integrated with a cryptographic accumulator to support verifiable aggregation operations. A rigorous analysis of query verifiability in VeriDKG is presented, along with evidence from an extensive experimental study demonstrating its state-of-the-art query performance on the largeRDFbench benchmark.
... In another interesting publication, GraphChain [89], the authors aim to build a blockchain-compliant, distributed database that exposes its data with explicit semantics, is easily and natively accessible, and applies blockchain securitization mechanisms to the RDF data model directly, without additional packaging or specific serialization. Essentially, the result is a collection of linked chains forming a distributed database of chained named RDF graphs. ...
... Improved data models are needed to clearly convey connections and knowledge [123]. Formal standards for querying blockchain data storage are essential [89]. Processing files on the blockchain requires integrating file storage with knowledge graph traceability [52]. ...
Article
In recent years, on the one hand, we have witnessed the rise of blockchain technology, which has led to better transparency, traceability, and therefore, trustworthy exchange of digital assets among different actors. On the other hand, achieving trustworthy content exchange has been one of the primary objectives of the Semantic Web, part of the World Wide Web Consortium. Semantic Web and blockchain technologies are the fundamental building blocks of Web3 (the third version of the Internet), which aims to link data through a decentralized approach. Blockchain provides a decentralized and secure framework for users to safeguard their data and take control over their data and Web3 experiences. However, developing trustworthy decentralized applications (Dapps) is a challenge because many blockchain-based functionalities must be developed from scratch, and combined with data semantics to open new innovative opportunities. In this survey paper, we explore the cross-cutting domain of the Semantic Web and blockchain and identify the critical building blocks required to achieve trust in the Next-Generation Internet. The application domains that could benefit from these technologies are also investigated. We developed a deep analysis of the published literature between 2015 and 2023. We performed our analysis in different digital libraries (e.g., Elsevier, IEEE, ACM), and as a result of our research, we retrieved 137 papers, of which 97 were deemed relevant for inclusion in the paper. Furthermore, we studied several aspects (e.g., network type, transactions per second) of existing blockchain platforms. Semantic Web and blockchain technologies can be used to realize a verification and certification process for data quality. Examples of mechanisms to achieve this are the Decentralized Identities of the Semantic Web or the various blockchain consensus protocols that help achieve decentralization and realize democratic principles. Therefore, Semantic Web and blockchain technologies should be combined to achieve trust in the highly decentralized, semantically complex, and dynamic environments needed to build smart applications of the future.
... The authors in [24,25] use RDF graphs as the fundamental structure and link them to the blockchain, i.e. every node of the blockchain is linked to a triple store, and the nodes can interact with their triple stores via SPARQL queries. In [22], the authors have followed the same principles to store data in an RDF graph. ...
Article
Full-text available
Drug traceability is a critical process involving monitoring and validation of the origin, quality, and safety of pharmaceutical products throughout their supply chain to prevent the distribution of counterfeit, substandard, or expired drugs that could harm patients. Traditional centralized solutions for drug traceability, relying on intermediaries and central authorities, introduce risks of data manipulation, corruption, and single points of failure. This work presents the design and implementation of a novel solution for decentralized drug traceability based on blockchain technology and on a reputation mechanism that operates on top of a trustworthy decentralized knowledge base, thus integrating three core technologies: blockchain, semantic, and reputation methods. Blockchain technologies ensure transparent and secure supply chain processes while providing a trustworthy estimation of the reputation of supply chain participants. Semantic technologies address drug data heterogeneity by ensuring interoperability and creating mappings between various data sources, including verifying the identities of the various users. Additionally, the reputation mechanism promotes transparency and accountability, as stakeholders contribute feedback on drug quality, authenticity, and reliability. This fosters a culture of trust and reliability, offering the drug supply chain an effective tool for continuous improvement and informed decision-making based on aggregated feedback, ultimately enhancing overall quality and safety throughout the distribution network. The design and implementation of the system, along with several evaluations, show the feasibility of the new semantic blockchain system in real-world scenarios and the improvement of the entities with a high reputation score. Our solution is more trustworthy, discouraging fraudulent activities as security is based on the various properties included in the semantic model.
... COLCHAIN builds upon PIQNIC and divides the entire network into communities of nodes that not only replicate the same data, but also collaborate on keeping certain data (fragments) up to date. This is done by using blockchain technology [27,54,65,73] where chains of updates maintain the history of changes to the data fragments. By linking such update chains to the data fragments in a community, COLCHAIN allows community participants to collaborate on keeping the data up-to-date while using consensus to make malicious updates less likely and allowing users to roll-back updates to an earlier version on request. ...
Article
Full-text available
While the Web of Data in principle offers access to a wide range of interlinked data, the architecture of the Semantic Web today relies mostly on the data providers to maintain access to their data through SPARQL endpoints. Several studies, however, have shown that such endpoints often experience downtime, meaning that the data they maintain becomes inaccessible. While decentralized systems based on Peer-to-Peer (P2P) technology have previously been shown to increase the availability of knowledge graphs, even when a large proportion of the nodes fail, processing queries in such a setup can be an expensive task since data necessary to answer a single query might be distributed over multiple nodes. In this paper, we therefore propose an approach to optimizing SPARQL queries over decentralized knowledge graphs, called Lothbrok. While there are potentially many aspects to consider when optimizing such queries, we focus on three aspects: cardinality estimation, locality awareness, and data fragmentation. We empirically show that Lothbrok is able to achieve significantly faster query processing performance compared to the state of the art when processing challenging queries as well as when the network is under high load.
... For data management with semantic precision, the GraphChain and OntoSpace projects are available. The former is an ONTOCHAIN Call 1 project and a Blockchain-compliant distributed database in which the data is delivered with assigned semantics expressed in RDF graphs [44]. The latter is the continuation project in Call 2 and goes further in that a novel Ethereum client is available to access graph databases. ...
Preprint
Full-text available
The audiovisual media content (AMC) industry, focused on film and television drama production, is confronted with a broken business model due to the dominance of centralized streaming platforms. The top platforms dominate global distribution but only offer slices of produced and heritage content. In addition, they compete with the AMC industry by producing a majority of the content distributed. This leaves fewer gatekeepers deciding on the content to be distributed and less diverse content easily accessible to audiences. Consequently, audiences are compelled to engage in pirating movies despite a willingness to pay. Recent blockchain innovations towards the so-called Web3 promise to restore this broken business model by re-establishing direct contact between the producers of films and their audiences. The benefits of networks (peer-to-peer or community based) in combination with Web3 follow the principle of decentralized disintermediation while comprising elements such as fiat-to-crypto payment mechanisms, self-sovereign identity authentication, blockchain oracles, decentralized autonomous organizations (DAO), and so on. A gap exists with regard to methodological designs of Web3 decentralized applications (DApp) and their ecosystems for restoring a viable AMC business model that not only eliminates the need for piracy activities but even the need for platforms. The DApp architecture designs for the film- and media-industry ecosystem creation in this paper allow, on the one hand, for a legal compliance check ahead of a costly deployment. On the other hand, the DApp designs of this paper also allow for a tailored blockchain technology stack development. Ultimately, this research is a continuation of an earlier whitepaper to establish a participatory economy in the film industry from peer-to-peer streaming.
... The work in [80] presents a distributed database compatible with Blockchain, named GraphChain, that exposes its data as RDF graphs in a semantic model. To define its semantics, GraphChain uses its own Web Ontology Language (OWL) ontology that defines the structural entities. ...
Thesis
Full-text available
With the advent and popularization of Internet of Things (IoT) devices, new possibilities for applications that use data extracted from the things we use in everyday life arise. Cars, wearables, health sensors, and home appliances will generate unprecedented amounts of data and bring insights that will revolutionize our daily routines. A potential scenario significantly impacted is Smart Cities (SC), which uses devices spread out on a large scale in an urban environment to extract traffic, weather, and equipment maintenance data to obtain insights acting on city management and disaster prevention. The network infrastructure currently available for these network applications uses proprietary communication technologies and is dependent on mobile phone companies. Their systems are proprietary, centralized, isolated from other databases, and constantly exposed to Single Point of Failure (SPOF). IoT applications are still primarily embryonic and do not provide reliable verification of the data source at the edge, as in the case of IoT devices, often with outdated firmware. Our work investigates the use in SC of a composition of Low Power Wide Area Networks (LPWAN) and the popular Personal Area Networks (PAN), independence of mobile network providers, and Low Power consumption. For this, we used development kits with LoRa and BLE to verify the feasibility and possible problems in this integration, and we evaluated the scalability of LoRa using a simulator. Security gaps in IoT Apps in Smart Cities mainly come from the difficulty of knowing and trusting edge devices. The problem of standardizing and updating these devices during their lifetime justifies our search for using tools that support transparency, scalability, reliability, resilience, and implicit requirements of decentralized Blockchain networks that support Smart Contracts. For this, we present a network architecture using Fog Computing and Smart Contracts Blockchain, which, through API gateways, authorizes and authenticates edge communication from IoT devices previously known by their metadata and firmware. To provide standard and link data from Blockchain with existing Web datasets, we use and add new components to ontologies that model Ethereum entities. This approach allows us to use the semantic web for data consumption and linking, which exposes data from Ethereum networks in soft-realtime through middleware. This work investigates the potential use of Fog Computing in SC in Low Power networks, strategies to identify and authenticate IoT devices at the edges using Blockchain and Smart Contract, and consumption and data link of Blockchain with the current web using the Semantic web. The set of these resources used in Fog computing allows searching for a composition of independent SC network infrastructures, Low Power, with reliable information coming from the edges and integrable with other pre-existing data sets. As the main results, we show the limits of the LoRa network, using a simulator in single-gateway and multi-gateway scenarios. We present scenarios of mixed use of traditional using Blockchain as authentication and validation background, by API gateway in Fog Computing architecture, and we present the times in transactions per second of this approach considering signatures and validation of payloads using Ethereum Blockchain. 
We present a middleware to expose Ethereum data in soft real-time using ontologies that model Ethereum in the literature, extended by our EthExtras ontology, which provides classes and properties for links and queries. The main advances of this work are the models using the Fog Computing paradigm for Smart Cities, where we present its use as a mixing point of LoRa and BLE and the Blockchain API Gateway to validate data from IoT devices, in addition to our middleware for extracting and consuming Ethereum data in soft real-time using our EthExtras and EthOn vocabularies.
... In [16,17], a BC-based data management system that could be used for the Global Legal Entity Identifier System (GLEIS) is proposed. Towards realizing this, the authors utilized Hyperledger Indy and their previously proposed GraphChain [18]. The implementation and utilization model of the proposed solution were presented, in which some challenges were faced, such as the limited message size of Indy. ...
Article
Full-text available
Several revolutionary applications have been built on the distributed ledgers of blockchain (BC) technology. Besides cryptocurrencies, we can find many other application fields in smart systems exploiting smart contracts and Self Sovereign Identity (SSI) management. The Hyperledger Indy platform is a suitable open-source solution for realizing permissioned BC systems for SSI projects. SSI applications usually require short response times from the underlying BC network, which may vary highly depending on the application type, the used BC software, and the actual BC deployment parameters. To support the developers and users of SSI applications, we present a detailed latency analysis of a private permissioned BC system built with Indy and Aries. To streamline our experiments, we developed a Python application using containerized Indy and Aries components from official Hyperledger repositories. We deployed our experimental application on multiple virtual machines in the public Google Cloud Platform and on our local, private cloud using a Docker platform with Kubernetes. We evaluated and compared their performance with the metrics of reading and writing response latency. We found that the local Indy ledger reads 30–50% faster, and writes 65–85% faster than the Indy ledger running on the Google Cloud Platform.
... COLCHAIN builds upon PIQNIC and divides the entire network into communities of nodes that not only replicate the same data, but also collaborate on keeping certain data (fragments) up-to-date. This is done by using blockchain technology [15][16][17][18] where chains of updates maintain the history of changes to the data fragments. By linking such update chains to the data fragments in a community, COLCHAIN allows community participants to collaborate on keeping the data up-to-date while using consensus to make malicious updates less likely and allowing users to roll back updates to an earlier version on request. ...
Preprint
While the Web of Data in principle offers access to a wide range of interlinked data, the architecture of the Semantic Web today relies mostly on the data providers to maintain access to their data through SPARQL endpoints. Several studies, however, have shown that such endpoints often experience downtime, meaning that the data they maintain becomes inaccessible. While decentralized systems based on Peer-to-Peer (P2P) technology have previously been shown to increase the availability of knowledge graphs, even when a large proportion of the nodes fail, processing queries in such a setup can be an expensive task since data necessary to answer a single query might be distributed over multiple nodes. In this paper, we therefore propose an approach to optimizing SPARQL queries over decentralized knowledge graphs, called Lothbrok. While there are potentially many aspects to consider when optimizing such queries, we focus on three aspects: cardinality estimation, locality awareness, and data fragmentation. We empirically show that Lothbrok is able to achieve significantly faster query processing performance compared to the state of the art when processing challenging queries as well as when the network is under high load.
... Blockchain technologies are currently widely discussed and have led to innovative solutions in various fields [8]. Multiple benefits have been previously identified for applying blockchains in the semantic web [10], e.g., for using RDF as the data storage format on blockchains and thus providing a decentralized, immutable, tamper-proof data storage for RDF graphs [45]. Another approach has been proposed in [16]. ...
Chapter
Full-text available
When applying ontologies in practice, human and machine agents need to ensure that their provenance is trustworthy and that the contained concepts can be relied upon. This is particularly crucial for sensitive tasks such as medical diagnostics or safety-critical applications. In this paper, we propose an architecture for the decentralized attestation and verification of the integrity and validity of ontologies using blockchain technologies. Blockchains are an immutable, tamper-resistant and decentralized storage where all transactions are digitally signed. Thus, they permit tracing the provenance of concepts and identifying responsible actors. For a proof-of-concept we extended the WebProtégé editor so that domain experts can attest to the provenance of ontologies via their Ethereum blockchain account, subsequently permitting other actors to reason about the validity and integrity of ontologies. For evaluating the applicability of this approach, we explore a use case in the biomedical domain and perform a cost analysis for the public Ethereum blockchain. It is shown that the attestation procedure is technically feasible and offers a new strategy for placing trust in ontologies.
... In (Sopek et al., 2018b) and (Raclawickie, 2019), a BC-based data management system that could be used for the Global Legal Entity Identifier System (GLEIS) is proposed. For realizing this, the authors utilized Hyperledger Indy and their previously proposed GraphChain (Sopek et al., 2018a). The implementation and utilization model of the proposed solution were presented, in which some challenges were faced such as the limited message size of Indy. ...
Conference Paper
Full-text available
Blockchain is the core technology behind several revolutionary applications that require consistent and immutable Distributed Ledgers, maintained by multi-party authorities. Examples of such applications include cryptocurrencies, smart contracts, Self Sovereign Identity (SSI) and Edge/Fog-enabled smart systems (eHealth, IIoT, IoV, etc.). Hyperledger Indy and Aries are suitable open-source tools for permissioned blockchain utilization in SSI projects. Those two frameworks have gained much attention from researchers interested in the topic, while being continuously maintained under the umbrella of the Linux Foundation. However, some SSI applications require a specific upper bound on response time depending on their business model. In this paper we aim at presenting a detailed latency analysis of Indy, on top of which Aries is typically built. With such an architecture, researchers and practitioners of SSI applications can decide whether this framework fulfills their application requirements. To realize our proposed architecture, we have developed a Python application with containerized Indy and Aries components. Its scripts use and build on the official open-source codes of the Hyperledger repositories. We have deployed our application on multiple virtual machines in the Google Cloud Platform, and evaluated it with various scenarios. Depending on the transaction rate, we found that the writing response latency of an Indy-based blockchain containing 4 and 8 nodes ranges between 1 and 16 seconds, while the reading response latency with similar settings ranges between 0.01 and 5 seconds.
... However, digital signatures typically rely on a centralized public-key infrastructure whereas blockchains offer a decentralized, distributed, peer-to-peer architecture [14]. Multiple benefits have been previously identified for applying blockchains in the semantic web [2], e.g., for using RDF as the data storage format on blockchains and thus providing a decentralized, immutable, tamper-proof data storage for RDF graphs [12]. Another approach has been proposed in the form of knowledge blockchains for the transparent monitoring of ontology evolution and proving the existence of concepts without disclosing them using so-called zero-knowledge proofs [4]. ...
Conference Paper
Full-text available
Ontologies are shared, formal conceptualizations of a domain that are consumed by human and machine agents alike. Trust in ontologies is a central issue for their application. For example, machine learning algorithms for medical diagnosis may rely on the correctness of ontologies and could potentially deliver false results. For enhancing trust, we developed a WebProtégé plugin for the decentralized attestation and verification of the integrity and validity of ontologies using the Ethereum blockchain. Blockchains are an immutable, tamper-resistant and decentralized storage where all transactions are digitally signed. Thus, they permit tracing the provenance of concepts and identifying responsible actors. For a first experimental evaluation, we evaluated the transaction costs for attesting to the provenance of ontologies.
... The blockchain nodes process transactions and achieve consensus over data that represent them. The difference is that in parallel to the creation of new blocks in the Blockchain's chain, the chain of named RDF-star graphs is created according to the GraphChain 1.0 specification [11]. More details are presented in Table 1 and Table 2. ...
Chapter
Full-text available
GraphChain, a framework for on-chain data management for Blockchains, is presented. The framework forms the foundational technology for the Ontochain project, offering the synergy between ontologies and the Blockchain mechanisms. The use of Ethereum-based Layer-2 mechanisms helped create the idea of Ontospace, which designates an ecosystem for trusted ontologies and trusted processing of smart contracts that can directly use the semantic data.
... Knowledge graphs represented in the Resource Description Framework (RDF) are stored using a blockchain technology, called GraphChain. 21 Blockchain is also applied to the decentralized construction of knowledge graphs. 22 In this system, company-level domain knowledge about employees' skills is constructed from the participation of employees in the company. ...
Article
Full-text available
This paper presents the use of gamified crowdsourcing for knowledge content validation. Constructing a high-quality knowledge base is crucial for building an intelligent system. We develop a refinement process for the knowledge base of our word retrieval assistant system, where each piece of knowledge is represented as a triple. To validate triples acquired from various sources, we introduce yes/no quizzes and present them to many casual users for their inputs. Only the triples voted “yes” by a sufficient number of users are incorporated into the main knowledge base. Users are incentivized by rewards based on their contribution to the validation process. To ensure transparency of the reward-giving process, blockchain is utilized to store logs of the users’ inputs from which the rewards are calculated. Different strategies are also proposed for selecting the next quiz. The simulation results indicate that the proposed approach has the potential to validate knowledge contents. This paper is a revised version of our conference paper presented at the 12th Asian Conference on Intelligent Information and Database Systems (ACIIDS 2020).
... However, due to its relatively simple data structure, blockchain itself reveals some shortcomings in handling complex operations such as data discovery and verification, which are vital for supply chain management. To cope with this challenge, approaches incorporating graph data models such as the Resource Description Framework (RDF) [5] have been proposed [6]-[10]. One of the initial works in this direction introduces how blockchain and the Semantic Web can mutually benefit from each other, suggesting potential use cases such as supply chains, data markets, and online educational credentials [6]. ...
Chapter
Ensuring traceability and orchestration of the participants in the food supply chain can help improve food production and reduce the distribution of unsafe or low-quality products. This article provides some insights into the design of a high-level architecture to support a semantic blockchain platform that ensures traceability and orchestration of food systems. The design involves two layers that cover: (a) the decentralized orchestration of the participants, (b) the semantic modeling of the data and processes involved, and (c) the storage and integrity of the data. To address the platform design, the operation and attributes of food supply chain management are analyzed, and it is discussed how the combination of Semantic and Blockchain technologies can address the platform features.
Chapter
Tangle is a novel directed acyclic graph (DAG)-based distributed ledger preferred over traditional linear ledgers in blockchain applications because of better transaction throughput. Earlier techniques have mostly focused on comparing the performance of graph chains over linear chains and incorporating the Markov Chain Monte Carlo process in probabilistic traversals to detect unverified transactions in DAG chains. In this paper, we present a parallel detection method for unverified transactions. Experimental evaluation of the proposed parallel technique demonstrates a significant, scalable average speed-up of close to 70%, and a peak speed-up of approximately 73% for a large number of transactions.
Chapter
This paper reports an application of blockchains for knowledge refinement. Constructing a high-quality knowledge base is crucial for building an intelligent system. One promising approach to this task is to make use of “the wisdom of the crowd,” commonly performed through crowdsourcing. To give users proper incentives, gamification could be introduced into crowdsourcing so that users are given rewards according to their contribution. In such a case, it is important to ensure transparency of the rewards system. In this paper, we consider a refinement process of the knowledge base of our word retrieval assistant system. In this knowledge base, each piece of knowledge is represented as a triple. To validate triples acquired from various sources, we introduce yes/no quizzes. Only the triples voted “yes” by a sufficient number of users are incorporated into the main knowledge base. Users are given rewards based on their contribution to this validation process. We describe how a blockchain can be used to ensure transparency of the process, and we present some simulation results of the knowledge refinement process.
Poster
Full-text available
Semantic Blockchain is an area of knowledge in continuous growth that has emerged in recent years thanks to the progress of Blockchain as a distributed and decentralized database with a high level of reliability and traceability, which has made it an ideal candidate to contribute to Web 3.0. This literature review deals with Semantic Blockchain and the case studies carried out in this area, covering ontologies and the technologies employed. Applying the Torres-Carrión method, a central construct was defined for the research as the basis for building the search script, which was applied to the selected scientific databases (Scopus, IEEE Xplore and Google Scholar). Four research questions are proposed, involving the use cases for Semantic Blockchain, the guidelines on the methodologies employed, the frameworks used and available, the technologies in use, and some blockchain ontologies that we could take as a basis for our research. As a final result, a list of relevant journals and databases in the area was obtained; 20 articles were analysed to answer the research questions, and all the information has been organized in a structured way, which allows the researcher to establish a valid context from which to focus future research.
Chapter
The main idea behind GraphChain is to use blockchain mechanisms on top of abstract RDF graphs. This paper presents an implementation of GraphChain in the Hyperledger Indy framework. The whole setting is shown to be applied to the RDF graphs containing information about Legal Entity Identifiers (LEIs). The blockchain-based data management system presented in the paper preserves all the benefits of using the RDF data model for the representation of LEI system reference data, including powerful querying mechanisms, explicit semantics and data model extensibility, together with the security and non-repudiation of LEIs as the digital identifiers for legal entities.
Conference Paper
Full-text available
Distributed cryptographic ledgers, such as the blockchain, are now being used in recordkeeping. However, they lack a key feature of more traditional recordkeeping systems needed to establish the authenticity of records and enable reliance on them for trustworthy recordkeeping. The missing feature is known in archival science as the archival bond – the mutual relationship that exists among documents by virtue of the actions in which they participate. In this paper, we propose a novel data model and syntax using core web principles that can be used to address this shortcoming in distributed ledgers as recordkeeping systems.
Conference Paper
Full-text available
To make digital resources on the web verifiable, immutable, and permanent, we propose a technique to include cryptographic hash values in URIs. We call them trusty URIs and we show how they can be used for approaches like nanopublications to make not only specific resources but their entire reference trees verifiable. Digital artifacts can be identified not only on the byte level but on more abstract levels such as RDF graphs, which means that resources keep their hash values even when presented in a different format. Our approach sticks to the core principles of the web, namely openness and decentralized architecture, is fully compatible with existing standards and protocols, and can therefore be used right away. Evaluation of our reference implementations shows that these desired properties are indeed accomplished by our approach, and that it remains practical even for very large files.
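A heavily simplified sketch of the hash-in-URI idea is shown below; it only appends a base64url-encoded SHA-256 digest to an identifier and does not reproduce the actual trusty URI module codes or encoding rules defined in the paper.

```python
# Simplified illustration of embedding a content hash in a URI; the real
# trusty URI specification defines its own module codes and character set.
import base64
import hashlib

def hash_suffixed_uri(base_uri: str, content: bytes) -> str:
    digest = hashlib.sha256(content).digest()
    suffix = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return f"{base_uri}.{suffix}"

print(hash_suffixed_uri("http://example.org/nanopub/np1", b"<s> <p> <o> ."))
```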
Article
Full-text available
The Semantic Web consists of many RDF graphs nameable by URIs. This paper extends the syntax and semantics of RDF to cover such Named Graphs. This enables RDF statements that describe graphs, which is beneficial in many Semantic Web application areas. As a case study, we explore the application area of Semantic Web publishing: Named Graphs allow publishers to communicate assertional intent, and to sign their graphs; information consumers can evaluate specific graphs using task-specific trust policies, and act on information from those Named Graphs that they accept. Graphs are trusted depending on: their content; information about the graph; and the task the user is performing. The extension of RDF to Named Graphs provides a formally defined framework to be a foundation for the Semantic Web trust layer.
Conference Paper
Full-text available
Being able to determine the provenance of statements is a fundamental step in any SW trust modeling. We propose a methodology that allows signing of small groups of RDF statements. Groups of statements signed with this methodology can be safely inserted into any existing triple store without the loss of provenance information, since only standard RDF semantics and constructs are used. This methodology has been implemented and is available both as an open-source library and deployed in a SW P2P project.
Article
Full-text available
In this paper we examine the particular case of a graph containing statements expressed in the Resource Description Framework (RDF) (see Section 1.1). By carefully avoiding the need for an intermediate canonical serialization we are able to efficiently compute a digest for such graphs.
Conference Paper
Searching for information in distributed ledgers is currently not an easy task, as information relating to an entity may be scattered throughout the ledger with no index. As distributed ledger technologies become more established, they will increasingly be used to represent real-world transactions involving many parties, and the search requirements will grow. An index providing the ability to search using domain-specific terms across multiple ledgers will greatly enhance the power, usability and scope of these systems. We have implemented a semantic index to the Ethereum blockchain platform, to expose distributed ledger data as Linked Data. As well as indexing block- and transaction-level data according to the BLONDiE ontology, we have mapped smart contracts to the Minimal Service Model ontology, to take the first steps towards connecting smart contracts with Semantic Web Services.
Book
This book constitutes the refereed proceedings of the 11th Extended Semantic Web Conference, ESWC 2014, held in Anissaras, Crete, Greece, in May 2014. The 50 revised full papers presented together with three invited talks were carefully reviewed and selected from 204 submissions. They are organized in topical sections on mobile, sensor and semantic streams; services, processes and cloud computing; social web and web science; data management; natural language processing; reasoning; machine learning; linked open data; cognition and semantic web; vocabularies, schemas, ontologies. The book also includes 11 papers presented at the PhD Symposium.
Article
Existential blank nodes greatly complicate a number of fundamental operations on Resource Description Framework (RDF) graphs. In particular, the problems of determining if two RDF graphs have the same structure modulo blank node labels (i.e., if they are isomorphic), or determining if two RDF graphs have the same meaning under simple semantics (i.e., if they are simple-equivalent), have no known polynomial-time algorithms. In this article, we propose methods that can produce two canonical forms of an RDF graph. The first canonical form preserves isomorphism such that any two isomorphic RDF graphs will produce the same canonical form; this iso-canonical form is produced by modifying the well-known canonical labelling algorithm Nauty for application to RDF graphs. The second canonical form additionally preserves simple-equivalence such that any two simple-equivalent RDF graphs will produce the same canonical form; this equi-canonical form is produced by, in a preliminary step, leaning the RDF graph, and then computing the iso-canonical form. These algorithms have a number of practical applications, such as for identifying isomorphic or equivalent RDF graphs in a large collection without requiring pairwise comparison, for computing checksums or signing RDF graphs, for applying consistent Skolemisation schemes where blank nodes are mapped in a canonical manner to Internationalised Resource Identifiers (IRIs), and so forth. Likewise a variety of algorithms can be simplified by presupposing RDF graphs in one of these canonical forms. Both algorithms require exponential steps in the worst case; in our evaluation we demonstrate that there indeed exist difficult synthetic cases, but we also provide results over 9.9 million RDF graphs that suggest such cases occur infrequently in the real world, and that both canonical forms can be efficiently computed in all but a handful of such cases.
Conference Paper
In this paper, we propose and evaluate a scheme to produce canonical labels for blank nodes in RDF graphs. These labels can be used as the basis for a Skolemisation scheme that gets rid of the blank nodes in an RDF graph by mapping them to globally canonical IRIs. Assuming no hash collisions, the scheme guarantees that two Skolemised graphs will be equal if and only if the two input graphs are isomorphic. Although the proposed scheme is exponential in the worst case, we claim that such cases are unlikely to be encountered in practice. To support these claims, we present the results of applying our Skolemisation scheme over a diverse collection of 43.5 million real-world RDF graphs (BTC-2014); we also provide results for some nasty synthetic cases.
Conference Paper
Existing algorithms for signing graph data typically do not cover the whole signing process. In addition, they lack distinctive features such as signing graph data at different levels of granularity, iterative signing of graph data, and signing multiple graphs. In this paper, we introduce a novel framework for signing arbitrary graph data provided, e.g., as RDF(S), Named Graphs, or OWL. We conduct an extensive theoretical and empirical analysis of the runtime and space complexity of different framework configurations. The experiments are performed on synthetic and real-world graph data of different size and different number of blank nodes. We investigate security issues, present a trust model, and discuss practical considerations for using our signing framework.
Article
This paper presents a hash and a canonicalization algorithm for Notation 3 (N3) and Resource Description Framework (RDF) graphs. The hash algorithm produces, given a graph, a hash value such that the same value would be obtained from any other equivalent graph. Contrary to previous related work, it is well-suited for graphs with blank nodes, variables and subgraphs. The canonicalization algorithm outputs a canonical serialization of a given graph (i.e. a canonical representative of the set of all the graphs that are equivalent to it). Potential applications of these algorithms include, among others, checking graphs for identity, computing differences between graphs and graph synchronization. The former could be especially useful for crawlers that gather RDF/N3 data from the Web, to avoid processing several times graphs that are equivalent. Both algorithms have been evaluated on a big dataset, with more than 29 million triples and several millions of subgraphs and variables.
Article
A purely peer-to-peer version of electronic cash would allow online payments to be sent directly from one party to another without going through a financial institution. Digital signatures provide part of the solution, but the main benefits are lost if a trusted third party is still required to prevent double-spending. We propose a solution to the double-spending problem using a peer-to-peer network. The network timestamps transactions by hashing them into an ongoing chain of hash-based proof-of-work, forming a record that cannot be changed without redoing the proof-of-work. The longest chain not only serves as proof of the sequence of events witnessed, but proof that it came from the largest pool of CPU power. As long as a majority of CPU power is controlled by nodes that are not cooperating to attack the network, they'll generate the longest chain and outpace attackers. The network itself requires minimal structure. Messages are broadcast on a best effort basis, and nodes can leave and rejoin the network at will, accepting the longest proof-of-work chain as proof of what happened while they were gone.
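To make the "ongoing chain of hash-based proof-of-work" concrete, here is a deliberately toy Python sketch with a fixed leading-zero difficulty and no transactions, Merkle trees or networking; it illustrates only the chaining idea, not Bitcoin's actual block format.

```python
# Toy hash chain with a leading-zero proof-of-work; illustration only.
import hashlib

def mine(prev_hash: str, payload: str, difficulty: int = 4):
    nonce, prefix = 0, "0" * difficulty
    while True:
        h = hashlib.sha256(f"{prev_hash}|{payload}|{nonce}".encode()).hexdigest()
        if h.startswith(prefix):      # proof-of-work found
            return h, nonce
        nonce += 1

chain = ["0" * 64]                    # genesis placeholder
for i in range(3):
    block_hash, nonce = mine(chain[-1], f"block {i}")
    chain.append(block_hash)          # each block commits to its predecessor
    print(i, nonce, block_hash)
```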
Conference Paper
Assuming P < GI < NP, the creation and verification of a digital signature of an arbitrary RDF graph cannot be done in polynomial time. However, it is possible to define a large class of canonicalizable RDF graphs, such that digital signatures for graphs in this class can be created and verified in O(nlog(n)). Without changing its meaning, an arbitrary RDF graph can be nondeterministically pre-canonicalized into a graph of this class, before signing. The techniques in this paper are key enablers for the use of digital signature technology in the Semantic Web.
Conference Paper
In this paper a method for Partial RDF Encryption (PRE) is proposed in which sensitive data in an RDF-graph is encrypted for a set of recipients while all non-sensitive data remain publicly readable. The result is an RDF-compliant self-describing graph containing encrypted data, encryption metadata, and plaintext data. For the representation of encrypted data and encryption metadata, the XML-Encryption and XML-Signature recommendations are used. The proposed method allows for fine-grained encryption of arbitrary subjects, predicates, objects and subgraphs of an RDF-graph. An XML vocabulary for specifying encryption policies is introduced.
Conference Paper
We present a simple, new paradigm for the design of collision-free hash functions. Any function emanating from this paradigm is incremental. (This means that if a message x which I have previously hashed is modified to x', then rather than having to re-compute the hash of x' from scratch, I can quickly "update" the old hash value to the new one, in time proportional to the amount of modification made in x to get x'.) Also any function emanating from this paradigm is parallelizable, useful for hardware implementation. We derive several specific functions from our paradigm. All use a standard hash function, assumed ideal, and some algebraic operations. The first function, MuHASH, uses one modular multiplication per block of the message, making it reasonably efficient, and significantly faster than previous incremental hash functions. Its security is proven, based on the hardness of the discrete logarithm problem. A second function, AdHASH, is even faster, using additions instead of multiplications, with security proven given either that approximation of the length of shortest lattice vectors is hard or that the weighted subset sum problem is hard. A third function, LtHASH, is a practical variant of recent lattice based functions, with security proven based, again, on the hardness of shortest lattice vector approximation.
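In the notation of the abstract above, AdHASH over message blocks x_1, ..., x_n with an underlying (ideal) hash H and modulus M has roughly the following shape; the precise block encoding is specified in the paper and only sketched here:

```latex
\mathrm{AdHASH}_{H,M}(x_1,\ldots,x_n) \;=\; \Bigl(\sum_{i=1}^{n} H(\langle i\rangle \,\|\, x_i)\Bigr) \bmod M
```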
Article
Contents: Abstract specification techniques; Analysis of algorithms; Towards more generalization in algorithms; Unstructured data types; Semi-structured data types; Linearly structured data types; Binary trees; Binary search trees; Multiway search trees; Directed graphs (digraphs); Undirected graphs and complexities; Generalized lists; Memory management.
M. English, S. Auer, and S. Domingue. 2016. Blockchain technologies & the Semantic Web: A framework for symbiotic development. In CS Conference for University of Bonn Students, J. Lehmann, H. Thakkar, L. Halilaj, and R. Asmat (Eds.). 47-61.
Mark Giereth. 2005. On Partial Encryption of RDF-Graphs. In The Semantic Web – ISWC 2005, 4th International Semantic Web Conference, ISWC 2005, Galway, Ireland, November 6-10, 2005, Proceedings (Lecture Notes in Computer Science), Yolanda Gil, Enrico Motta, V. Richard Benjamins, and Mark A. Musen (Eds.), Vol. 3729. Springer, 308-322. https://doi.org/10.1007/11574620_24
Edzard Höfig and Ina Schieferdecker. 2014. Hashing of RDF Graphs and a Solution to the Blank Node Problem. In Proceedings of the 10th International Workshop on Uncertainty Reasoning for the Semantic Web (URSW 2014), co-located with the 13th International Semantic Web Conference (ISWC 2014), Riva del Garda, Italy, October 19, 2014 (CEUR Workshop Proceedings), Fernando Bobillo, Rommel N. Carvalho, Davide Ceolin, Paulo Cesar G. da Costa, Claudia d'Amato, Nicola Fanizzi, Kathryn B. Laskey, Kenneth J. Laskey, Thomas Lukasiewicz, Trevor P.
Manu Sporny and Dave Longley. 2017. The Web Ledger Protocol 1.0: A format and protocol for decentralized ledgers on the Web. Technical Report. Digital Bazaar. https://w3c.github.io/web-ledger/