Article

Blank nodes in RDF

Authors:
  • Nantong Institute of Technology
  • School of Mathematics and Information Engineering, Taizhou University, China
  • Taizhou University
To read the full-text of this research, you can request a copy directly from the authors.

Abstract

The Semantic Web plays an important role in the future of the Web, and RDF data is the key component on which the Semantic Web is built. In this paper, we summarize the uses of blank nodes in RDF and the problems they can cause, based on a detailed analysis of their applications and semantics in RDF graphs. We pay special attention to the inconsistency between the RDF semantics and the SPARQL semantics of blank nodes. We employ the concept of a "lean graph" when pre-processing RDF data, propose a method that uses the entailment relations between RDF graphs and the transformation of blank nodes into URI references to eliminate blank nodes from RDF graphs, and give the theoretical background that supports the method. Lastly, we provide some reference methods for transforming blank nodes into URI references.
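
The transformation from blank nodes to URI references mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' method, which is not available in this listing. It assumes triples are plain Python tuples, blank nodes are strings starting with "_:", and minted identifiers live under a placeholder base (example.org with the ".well-known/genid/" path that RDF 1.1 suggests for Skolem IRIs). The names `is_bnode`, `skolemize` and `SKOLEM_BASE` are hypothetical.

```python
import uuid

# Placeholder base for minted IRIs (assumption: not taken from the paper).
SKOLEM_BASE = "http://example.org/.well-known/genid/"


def is_bnode(term):
    """In this sketch a blank node is any string with the '_:' prefix."""
    return isinstance(term, str) and term.startswith("_:")


def skolemize(graph, base=SKOLEM_BASE):
    """Return (new_graph, mapping) where every blank node in `graph`
    has been replaced by a freshly minted IRI under `base`."""
    mapping = {}

    def rename(term):
        if not is_bnode(term):
            return term
        if term not in mapping:
            # The same blank node always maps to the same new IRI.
            mapping[term] = base + uuid.uuid4().hex
        return mapping[term]

    new_graph = {(rename(s), p, rename(o)) for (s, p, o) in graph}
    return new_graph, mapping


# Illustrative data only: an address modelled with one blank node.
EXAMPLE_GRAPH = {
    ("ex:alice", "ex:address", "_:a"),
    ("_:a", "ex:city", "Miami"),
    ("_:a", "ex:street", "Main St"),
}
```

Calling `skolemize(EXAMPLE_GRAPH)` yields a graph of the same size with no blank nodes. Note that the result is no longer simply equivalent to the input, since the existential reading of the blank node is replaced by a named resource.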

... According to empirical surveys conducted in [22,13] about RDF data, a notable percentage of unique terms (25.7%) are blank nodes (bnodes for short), proving that they are prevalent in real-world data. Bnodes are useful for representing complex attributes, for describing multi-component structures (like RDF containers), for representing reification and provenance information [4], as well as for representing OWL classes defined by expressions (like unions, intersections, and others). However, due to their anonymity (i.e. ...
... However, by applying bnode matching, the delta will contain at most 32 change operations. In the optimal case, where the bnodes are matched as the following pairs indicate {(1,7), (2,8), (3,9), (4,10), (5,11), (6,12)}, only 2 change operations are needed (i.e. the triple (4, city, Miami) is deleted and the triple (10, city, NY) is added). The signature-based algorithm in [28] does not ensure optimality when applied in this example; notice that bnodes 1 and 2 of graph G1 and 7 and 8 of graph G2 have the same direct neighborhoods; thereby they are matched randomly. ...
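
How a bnode mapping determines the delta size can be sketched as follows. This is a toy illustration, not the algorithm of the cited work. The two graphs are hypothetical and much smaller than the example the snippet refers to, and the brute-force search over all mappings is only feasible for a handful of blank nodes. All function names are assumptions.

```python
from itertools import permutations


def is_bnode(term):
    """In this sketch a blank node is any string with the '_:' prefix."""
    return isinstance(term, str) and term.startswith("_:")


def bnodes(graph):
    """Sorted list of blank nodes in subject or object position."""
    return sorted({t for (s, _, o) in graph for t in (s, o) if is_bnode(t)})


def delta_size(g1, g2, mapping):
    """Number of triples to delete plus triples to add in order to turn
    g1 into g2 after renaming g1's blank nodes via `mapping`.
    Assumes the two graphs use disjoint blank node labels."""
    renamed = {(mapping.get(s, s), p, mapping.get(o, o)) for (s, p, o) in g1}
    return len(renamed - g2) + len(g2 - renamed)


def best_mapping(g1, g2):
    """Exhaustively try every injective mapping from g1's blank nodes to
    g2's blank nodes and return (smallest_delta, mapping)."""
    b1, b2 = bnodes(g1), bnodes(g2)
    if len(b1) > len(b2):
        raise ValueError("sketch assumes g2 has at least as many bnodes as g1")
    best = None
    for perm in permutations(b2, len(b1)):
        candidate = dict(zip(b1, perm))
        size = delta_size(g1, g2, candidate)
        if best is None or size < best[0]:
            best = (size, candidate)
    return best


# Hypothetical toy graphs: a person node pointing to an address node.
G1 = {("_:b1", "address", "_:b4"), ("_:b4", "city", "Miami")}
G2 = {("_:b7", "address", "_:b10"), ("_:b10", "city", "NY")}
```

With these graphs, mapping `_:b1` to `_:b7` and `_:b4` to `_:b10` leaves only the two `city` triples in the delta, whereas the crossed mapping leaves all four triples unmatched. The exhaustive search grows factorially with the number of blank nodes, which is why approximation algorithms such as signature-based matching are used in practice.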
Article
Full-text available
In the linked open data cloud, the biggest open data graph that currently exists, a remarkable percentage of data are unnamed resources, also called blank nodes. Several fundamental tasks, such as graph isomorphism checking and RDF data versioning, require computing a map between the sets of blank nodes of two graphs. This map aims at minimizing the delta size, i.e. the number of change operations that are required to make the graphs isomorphic. Computing the optimal map is NP-Hard in the general case, and various approximation algorithms have been proposed. In this work, we propose a novel radius-aware signature-based algorithm that is not restricted to the direct neighborhood of the compared blank nodes. Contrary to the older algorithms, the proposed algorithm manages to decrease the deviation from the optimal solution even for graphs that contain connected blank nodes in large and dense structures. The conducted experiments over real and synthetically generated datasets (including datasets from the Billion Triple Challenge 2012 and 2014) show the significantly smaller deltas. For isomorphism checking (simple RDF equivalence), with a wise configuration of radius, the proposed algorithm achieves optimality for 100 % of the datasets, while in non-isomorphic datasets the deltas are on average 50–75 % smaller than those of the previous algorithms. Finally, the trade-off between radius, deviation from the optimum and time efficiency is analyzed.
... Blanks: + code collections (lists); + code provenance (reification); + code non-binary relations and structures; + replace URIs when not known, needed or wanted; cause clutter with SPARQL queries; cause clutter and broken links in merging RDF graphs; complicate linking in Linked Data. [6] suggest three ways to alleviate the problems. ...
... Running these SPARQL scripts on blank data may generate non-lean RDF with duplicate and redundant triples. This is because SPARQL interprets blanks differently from RDF [6,14]: SPARQL blanks are anonymous terms, while RDF blanks are existentially quantified variables. ...
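
The notion of a non-lean graph in the snippet above can be made concrete with a small check. A graph is lean when no mapping of its blank nodes sends it onto a proper subgraph of itself. The sketch below tests this by brute force, assuming triples are tuples and blank nodes are strings starting with "_:". The function names and the example graphs are hypothetical, and the exhaustive search is only practical for very small graphs.

```python
from itertools import product


def is_bnode(term):
    """In this sketch a blank node is any string with the '_:' prefix."""
    return isinstance(term, str) and term.startswith("_:")


def is_lean(graph):
    """Return True if no assignment of the graph's blank nodes to terms
    of the graph maps it onto a proper subset of itself."""
    graph = set(graph)
    blanks = sorted({t for (s, _, o) in graph for t in (s, o) if is_bnode(t)})
    terms = sorted({t for (s, _, o) in graph for t in (s, o)})
    for image in product(terms, repeat=len(blanks)):
        m = dict(zip(blanks, image))
        mapped = {(m.get(s, s), p, m.get(o, o)) for (s, p, o) in graph}
        if mapped < graph:  # proper subset: some triple was redundant
            return False
    return True


# The blank-node triple says "someone knows bob", which the named
# triple already implies, so the first graph is not lean.
NON_LEAN = {
    ("ex:alice", "ex:knows", "ex:bob"),
    ("_:x", "ex:knows", "ex:bob"),
}
LEAN = {("_:x", "ex:knows", "ex:bob")}
```

Under the RDF reading of blank nodes as existential variables, the second triple of `NON_LEAN` adds no information. A tool that treats blank nodes as distinct anonymous terms would keep both triples, which is one way such redundancy arises.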
Article
Full-text available
This paper describes a workflow of simplifying and matching special language terms in RDF generated from trawling term candidates from Web terminology sites with TermFactory, a Semantic Web framework for professional terminology. Term candidates from such sources need to be matched and eventually merged with resources already in TermFactory. While merging anonymous data, it is important not to lose track of provenance. For coding provenance in RDF, TF uses a minor but apparently novel variant of RDF reification. In addition, TF implements a toolkit of methods for dealing with graphs containing anonymous (blank) nodes.
... Following [52], blank nodes give the capability to: ...
Article
Full-text available
Resource Description Framework (RDF) can be seen as a solution in today’s landscape of knowledge representation research. An RDF language has symmetrical features because subjects and objects in triples can be interchangeably used. Moreover, the regularity and symmetry of the RDF language allow knowledge representation that is easily processed by machines, and because its structure is similar to natural languages, it is reasonably readable for people. RDF provides some useful features for generalized knowledge representation. Its distributed nature, due to its identifier grounding in IRIs, naturally scales to the size of the Web. However, its use is often hidden from view and is, therefore, one of the less well-known of the knowledge representation frameworks. Therefore, we summarise RDF v1.0 and v1.1 to broaden its audience within the knowledge representation community. This article reviews current approaches, tools, and applications for mapping from relational databases to RDF and from XML to RDF. We discuss RDF serializations, including formats with support for multiple graphs and we analyze RDF compression proposals. Finally, we present a summarized formal definition of RDF 1.1 that provides additional insights into the modeling of reification, blank nodes, and entailments.
... When it is exported to IFC, various geometrical structures (points, polylines, and polygons) must be generated to conform to the IFC ontology. 3. Collections and n-ary predicates that need to be represented in RDF as graph fragments that contain blank nodes [12]. A particularly important source of blank nodes are linked list structures of RDF that are used to represent lists and arrays of IFC. ...
Article
Full-text available
The capability to accurately detect changes between successive versions of the IFC representation of a BIM model would enable the development of generic change management functionalities for construction projects. Unfortunately, IFC models consist mostly of anonymous objects without stable identities; therefore the computation of differences between versions is complicated. When IFC models are converted into an RDF representation, the uniform graph structure offers new algorithmic opportunities. We study how to assign unique and stable identities to anonymous nodes based on signatures generated from their graph environment, and present the Short Paths Crossings Algorithm (SPCA) that computes sets of paths with limited length from anonymous nodes taking into account their crossings. Empirical tests show that SPCA produces significantly smaller difference sets for IFC-derived graphs than previous algorithms for RDF change detection.
... [17]) have demonstrated the usefulness of blank nodes for the representation of the Semantic Web data. In a nutshell, from a theoretical perspective blank nodes play the role of the existential variables and from a technical perspective, as gathered in [7], they give the capability to (a) describe multi-component structures, like the RDF containers, (b) describe reification (e.g. provenance information) and (c) represent complex attributes without having to name explicitly the auxiliary node (e.g. the address of a person consisting of the street, the number, the postal code and the city). ...
Article
Full-text available
In various domains and cases, we observe the creation and usage of information elements which are unnamed. Such elements do not have a name, or may have a name that is not externally referable (usually meaningless and not persistent over time). This paper discusses why we will never `escape' from the problem of having to construct mappings between such unnamed elements in information systems. Since unnamed elements nowadays occur very often in the framework of the Semantic Web and Linked Data as blank nodes, the paper describes scenarios that can benefit from methods that compute mappings between the unnamed elements. For each scenario, the corresponding bnode matching problem is formally defined. Based on this analysis, we try to reach a more general formulation of the problem, which can be useful for guiding the required technological advances. To this end, the paper finally discusses methods to realize blank node matching, the implementations that exist, and identifies open issues and challenges.
... [10]) have demonstrated the usefulness of blank nodes for the representation of the Semantic Web data. In a nutshell, from a theoretical perspective blank nodes play the role of the existential variables and from a technical perspective, as gathered in [2], they give the capability to (a) describe multi-component structures, like the RDF containers, (b) apply reification (i.e. provenance information), (c) represent complex attributes without having to name explicitly the auxiliary node (e.g. the address of a person consisting of the street, the number, the postal code and the city) and (d) offer protection of the inner information (e.g. ...
Conference Paper
Full-text available
Generators for synthetic RDF datasets are very important for testing and benchmarking various semantic data management tasks (e.g. querying, storage, update, compare, integrate). However, the current generators do not support sufficiently (or totally ignore) blank node connectivity issues. Blank nodes are used for various purposes (e.g. for describing complex attributes), and a significant percentage of resources is currently represented with blank nodes. Moreover, several semantic data management tasks, like isomorphism checking (useful for checking equivalence), and blank node matching (useful in comparison, versioning, synchronization, and in semantic similarity functions), not only have to deal with blank nodes, but their complexity and optimality depends on the connectivity of blank nodes. To enable the comparative evaluation of the various techniques for carrying out these tasks, in this paper we present the design and implementation of a generator, called BGen, which allows building datasets containing blank nodes with the desired complexity, controllable through various features (morphology, size, diameter, density and clustering coefficient). Finally, the paper reports experimental results concerning the efficiency of the generator, as well as results from using the generated datasets, that demonstrate the value of the generator.
Article
The World Wide Web provides a vast range of resources to the public. Much research has been done on sharing resources among users by means of ontologies. Knowledge in an ontology is represented as triples (s-p-o), in which concepts are connected by a relation. When a resource that exists without an IRI needs to be represented, a blank node can be used in place of the resource. An increasing number of blank nodes increases the complexity of the ontology structure. Since blank nodes cannot be avoided in an ontology, their increased use may make the data intractable during information retrieval. This paper presents a new clause-based structure that is able to handle the N-ary, container, collection and reified knowledge issues raised by the use of blank nodes. The results show that the structure is able to store complicated knowledge without the need for blank nodes.
Conference Paper
Full-text available
We introduce the notion of the mixed DL and entailment-based (DLE) OWL reasoning, defining a framework inspired from the hybrid and homogeneous paradigms for integration of rules and ontologies. The idea is to combine the TBox inferencing capabilities of the DL algorithms and the scalability of the rule paradigm over large ABoxes. Towards this end, we define a framework that uses a DL reasoner to reason over the TBox of the ontology (hybrid-like) and a rule engine to apply a domain-specific version of ABox-related entailments (homogeneous-like) that are generated by TBox queries to the DL reasoner. The DLE framework enhances the entailment-based OWL reasoning paradigm in two directions. Firstly, it disengages the manipulation of the TBox semantics from any incomplete entailment-based approach, using the efficient DL algorithms. Secondly, it achieves faster application of the ABox-related entailments and efficient memory usage, comparing it to the conventional entailment-based approaches, due to the low complexity and the domain-specific nature of the entailments.
Conference Paper
Full-text available
RDF Schema (RDFS) as a lightweight ontology language is gaining popularity and, consequently, tools for scalable RDFS inference and querying are needed. SPARQL has become recently a W3C standard for querying RDF data, but it mostly provides means for querying simple RDF graphs only, whereas querying with respect to RDFS or other entailment regimes is left outside the current specification. In this paper, we show that SPARQL faces certain unwanted ramifications when querying ontologies in conjunction with RDF datasets that comprise multiple named graphs, and we provide an extension for SPARQL that remedies these effects. Moreover, since RDFS inference has a close relationship with logic rules, we generalize our approach to select a custom ruleset for specifying inferences to be taken into account in a SPARQL query. We show that our extensions are technically feasible by providing benchmark results for RDFS querying in our prototype system GiaBATA, which uses Datalog coupled with a persistent Relational Database as a back-end for implementing SPARQL with dynamic rule-based inference. By employing different optimization techniques like magic set rewriting our system remains competitive with state-of-the-art RDFS querying systems.
Conference Paper
Full-text available
Based on practical observations on rule-based inference on RDF data, we study the problem of redundancy elimination on RDF graphs in the presence of rules (in the form of Datalog rules) and constraints (in the form of so-called tuple-generating dependencies), as well as with respect to queries (ranging from conjunctive queries up to more complex ones, particularly covering features of SPARQL, such as union, negation, or filters). To this end, we investigate the influence of several problem parameters (like restrictions on the size of the rules, the constraints, and/or the queries) on the complexity of detecting redundancy. The main result of this paper is a fine-grained complexity analysis of both graph and rule minimisation in various settings.
Conference Paper
Full-text available
We introduce domain-restricted RDF (dRDF) which allows to associate an RDF graph with a fixed, finite domain that interpretations for it may range over. We show that dRDF is a real extension of RDF and discuss impacts on the complexity of entailment in dRDF. The entailment problem represents the key reasoning task for RDF and is well known to be NP-complete. Remarkably, we show that the restriction of domains in dRDF raises the complexity of entailment from NP- to Π₂^P-completeness. In order to lower complexity of entailment for both domain-restricted and unrestricted graphs, we take a closer look at the graph structure. For cases where the structure of RDF graphs is restricted via the concept of bounded treewidth, we prove that the entailment is tractable for unrestricted graphs and coNP-complete for domain-restricted graphs.
Article
Full-text available
This text sets out a series of approaches to the analysis and synthesis of the World Wide Web, and other web-like information structures. A comprehensive set of research questions is outlined, together with a sub-disciplinary breakdown, emphasising the multi-faceted nature of the Web, and the multi-disciplinary nature of its study and development. These questions and approaches together set out an agenda for Web Science, the science of decentralised information systems. Web Science is required both as a way to understand the Web, and as a way to focus its development on key communicational and representational requirements. The text surveys central engineering issues, such as the development of the Semantic Web, Web services and P2P. Analytic approaches to discover the Web’s topology, or its graph-like structures, are examined. Finally, the Web as a technology is essentially socially embedded; therefore various issues and requirements for Web use and governance are also reviewed.
Article
Based on practical observations on rule-based inference on RDF data, we study the problem of redundancy detection on RDF graphs in the presence of rules (in the form of Datalog rules) and constraints, (in the form of so-called tuple-generating dependencies), and with respect to queries (ranging from conjunctive queries up to more complex ones, particularly covering features of SPARQL, such as union, negation, or filters). To this end, we investigate the influence of several problem parameters (like restrictions on the size of the rules, the constraints, and/or the queries) on the complexity of detecting redundancy. The main result of this paper is a fine-grained complexity analysis of both graph and rule minimisation in various settings.
Article
The Resource Description Framework (RDF) is a Semantic Web standard that provides a data language, simply called RDF, as well as a lightweight ontology language, called RDF Schema. We investigate embeddings of RDF in logic and show how standard logic programming and description logic technology can be used for reasoning with RDF. We subsequently consider extensions of RDF with datatype support, considering D entailment, defined in the RDF semantics specification, and D* entailment, a semantic weakening of D entailment, introduced by ter Horst. We use the embeddings and properties of the logics to establish novel upper bounds for the complexity of deciding entailment. We subsequently establish two novel lower bounds, establishing that RDFS entailment is PTime-complete and that simple-D entailment is coNP-hard, when considering arbitrary datatypes, both in the size of the entailing graph. The results indicate that RDFS may not be as lightweight as one may expect.
Article
This chapter presents an overview of results relating to computational complexity of reasoning with Semantic Web ontologies. An overview of the completeness results that form the basis for these complexity results is also given. We prove NP-completeness of two standard entailment relations, simple entailment and RDFS (RDF Schema) entailment. These two entailment relations are in P if the target graph is assumed to contain no variables (blank nodes). We show that these results also apply to two stronger entailment relations, D* entailment and pD* entailment, which extend RDFS entailment to reasoning with datatypes and to reasoning with a subset of OWL (the Web Ontology Language), respectively. These results make use of deductive closure graphs that can be computed in polynomial time. We present new bounds on the size of these closure graphs.
Article
We show how to reduce ontology entailment for the OWL DL and OWL Lite ontology languages to knowledge base satisfiability in (respectively) the SHOIN(D) and SHIF(D) description logics. This is done by first establishing a correspondence between OWL ontologies and description logic knowledge bases and then by showing how knowledge base entailment can be reduced to knowledge base satisfiability.
Conference Paper
Semantic consequence (entailment) in RDF is usually computed using Pat Hayes's Interpolation Lemma. In this paper, we reformulate this mechanism as a graph homomorphism known as projection in the conceptual graphs community. Though most of the paper is devoted to a detailed proof of this result, we discuss the immediate benefits of this reformulation: it is now easy to translate results from different communities (e.g. conceptual graphs, constraint programming, ...) to obtain new polynomial cases for the NP-complete RDF entailment problem, as well as numerous algorithmic optimizations.
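
The homomorphism view of simple entailment described above can be sketched with a small backtracking search: G simply entails H when the blank nodes of H can be mapped to terms of G so that every triple of H lands on a triple of G. The code below is an illustrative sketch, not the projection algorithm of the paper. It assumes tuple triples and "_:"-prefixed blank nodes, and the function names are hypothetical.

```python
def is_bnode(term):
    """In this sketch a blank node is any string with the '_:' prefix."""
    return isinstance(term, str) and term.startswith("_:")


def _match(term, target, binding):
    """Unify one term of H with one term of G, extending `binding`."""
    if is_bnode(term):
        if term in binding:
            return binding[term] == target
        binding[term] = target
        return True
    return term == target  # ground terms must coincide exactly


def simply_entails(g, h):
    """Return True if some mapping of h's blank nodes to terms of g
    sends every triple of h to a triple of g."""
    g = set(g)

    def extend(triples, binding):
        if not triples:
            return True  # every triple of h has been placed in g
        (s, p, o), rest = triples[0], triples[1:]
        for (gs, gp, go) in g:
            if gp != p:
                continue
            trial = dict(binding)  # copy so a failed attempt leaves no trace
            if _match(s, gs, trial) and _match(o, go, trial):
                if extend(rest, trial):
                    return True
        return False

    return extend(list(h), {})


# Illustrative graphs only.
G = {("ex:alice", "ex:knows", "ex:bob")}
H = {("_:x", "ex:knows", "ex:bob")}
```

Here `simply_entails(G, H)` holds because `_:x` can be mapped to `ex:alice`, while the converse fails because a ground triple cannot be matched against a blank node in the entailing graph. The search is exponential in the worst case, in line with the NP-completeness mentioned in the abstract.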
Conference Paper
An important open question in the semantic Web is the precise relationship between the RDF(S) semantics and the semantics of standard knowledge representation formalisms such as logic programming and description logics. In this paper we address this issue by considering embeddings of RDF and RDFS in logic. Using these embeddings, combined with existing results about various fragments of logic, we establish several novel complexity results. The embeddings we consider show how techniques from deductive databases and description logics can be used for reasoning with RDF(S). Finally, we consider querying RDF graphs and establish the data complexity of conjunctive querying for the various RDF entailment regimes.
Conference Paper
We show how to reduce ontology entailment for the OWL DL and OWL Lite ontology languages to knowledge base satisfiability in (respectively) the SHOIN(D) and SHIF(D) description logics. This is done by first establishing a correspondence between OWL ontologies and description logic knowledge bases and then by showing how knowledge base entailment can be reduced to knowledge base satisfiability.
Chapter
Semantic Web models and technologies provide information in machine-readable languages that enable computers to access the Web more intelligently and perform tasks automatically without the direction of users. These technologies are relatively recent and advancing rapidly, creating a set of unique challenges for those developing applications. Semantic Web for the Working Ontologist is the essential, comprehensive resource on semantic modeling, for practitioners in health care, artificial intelligence, finance, engineering, military intelligence, enterprise architecture, and more. Focused on developing useful and reusable models, this market-leading book explains how to build semantic content (ontologies) and how to build applications that access that content. New in this edition: coverage of the latest Semantic Web tools for organizing, querying, and processing information; detailed information on the latest ontologies used in key web applications including ecommerce, social networking, data mining, using government data, and more. Provides practical information for all programmers and subject matter experts engaged in modeling data to fit the requirements of the Semantic Web. De-emphasizes algorithms and proofs, focusing instead on real-world problems, creative solutions, and highly illustrative examples. Presents detailed, ready-to-apply "recipes" for use in many specific situations. Shows how to create new recipes from RDF, RDFS, and OWL constructs.
Conference Paper
We complement the RDF semantics specification of the W3C by proving decidability of RDFS entailment. Furthermore, we show completeness and decidability of entailment for RDFS extended with datatypes and a property-related subset of OWL. The RDF semantics specification provides a complete set of entailment rules for reasoning with RDFS, but does not prove decidability of RDFS entailment: the closure graphs used in the completeness proof are infinite for finite RDF graphs. We define partial closure graphs, which can be taken to be finite for finite RDF graphs, which can be computed in polynomial time, and which are sufficient to decide RDFS entailment. We consider the extension of RDFS with datatypes and a property-related fragment of OWL: FunctionalProperty, InverseFunctionalProperty, sameAs, SymmetricProperty, TransitiveProperty, and inverseOf. In order to obtain a complete set of simple entailment rules, the semantics that we use for these extensions is in line with the ‘if-semantics’ of RDFS, and weaker than the ‘iff-semantics’ defining D-entailment and OWL (DL or Full) entailment. Classes can be used as instances, the use of FunctionalProperty and TransitiveProperty is not restricted to obtain decidability, and a partial closure that is sufficient for deciding entailment can be computed in polynomial time.
Article
Description Languages (DLs) are descendants of the kl-one [15] knowledge representation system, and form the basis of several object-centered knowledge base management systems developed in recent years, including ones in industrial use. Originally used for conceptual modeling (to define views), DLs are seeing increased use as query languages for retrieving information. This paper, aimed at a general audience that includes database researchers, considers the relationship between the expressive power of DLs and that of query languages based on Predicate Calculus. We show that all descriptions built using constructors currently considered in the literature can be expressed as formulae of the First Order Predicate Calculus with at most three variable symbols, though we have to allow numeric quantifiers and infinitary disjunction in order to handle a couple of special constructors. Conversely, we show that all first-order queries (formulae with one free variable) built up from unary and bin...
Enrico Franconi and Sergio Tessaris, "The Semantics of SPARQL", http://www.inf.unibz.it/krdb/w3c/sparql/, 2005.
M. Dean, Guus Schreiber, Sean Bechhofer, Frank van Harmelen, Jim Hendler, Ian Horrocks, et al., "OWL Web Ontology Language Reference", http://www.w3.org/TR/owl-ref/, February 2004.
Frank Manola, Eric Miller and Brian McBride, "RDF Primer", http://www.w3.org/TR/rdf-syntax/, February 2004.