Conference Paper

Abstract

The Linked Data concept uses a collection of Semantic Web technologies to interconnect, publish, and share pieces of data on the Web in a machine-readable format. It enables querying and combining data from different datasets in order to retrieve specific information and to enable use-case scenarios which are unavailable over isolated datasets. However, querying linked data published in different places on the Web poses several challenges. In general, each user must know the schema of the data, write a query and access a relevant linked data endpoint. Availability statistics show that endpoints are often unavailable and have significant downtime, which presents a serious obstacle for application development and for scenarios which rely on the data. In this paper, we present a facet browser for SPARQL endpoints, based on HTML5. It allows users to search for and retrieve RDF triples from public SPARQL endpoints, based on a keyword. Using HTML5 Web Storage, the triples from the results can be saved locally in the browser for future use. The facet browser provides management functionality over the stored data: the triples can be updated, refreshed, modified, deleted and downloaded in various formats, including JSON-LD, Turtle, N-Triples, RDF/XML, JSON and CSV. The locally stored RDF triples can also be shared with other users. We believe that these features of the facet browser will help overcome endpoint downtime issues by providing offline data accessibility for users and their applications.
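Below is a minimal, hypothetical TypeScript sketch of the approach described in the abstract: a keyword search against a public SPARQL endpoint whose result triples are cached with HTML5 Web Storage so they remain accessible when the endpoint is down. The endpoint URL, storage keys and query shape are illustrative assumptions, not the paper's actual implementation.

const ENDPOINT = "https://dbpedia.org/sparql"; // any public SPARQL endpoint (assumption)

async function searchAndCache(keyword: string): Promise<void> {
  // Retrieve triples whose literal object contains the keyword.
  const query = `
    SELECT ?s ?p ?o WHERE {
      ?s ?p ?o .
      FILTER(isLiteral(?o) && CONTAINS(LCASE(STR(?o)), LCASE("${keyword}")))
    } LIMIT 100`;

  const response = await fetch(ENDPOINT + "?query=" + encodeURIComponent(query), {
    headers: { Accept: "application/sparql-results+json" },
  });
  const data = await response.json();

  // Persist the result bindings in the browser (HTML5 Web Storage),
  // so they stay available offline or while the endpoint is down.
  localStorage.setItem("triples:" + keyword, JSON.stringify(data.results.bindings));
}

// Later, the cached triples can be read back without contacting the endpoint.
function loadCachedTriples(keyword: string): unknown[] | null {
  const raw = localStorage.getItem("triples:" + keyword);
  return raw ? JSON.parse(raw) : null;
}

From such cached bindings, a client-side serializer could then produce the listed download formats (JSON-LD, Turtle, N-Triples, etc.) for export or sharing.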
Thesis
The vast amount of data available over the distributed infrastructure of the Web has initiated the development of techniques for their representation, storage and usage. One of these techniques is the Linked Data paradigm, which aims to provide unified practices for publishing and contextually interlinking data on the Web, by using the World Wide Web Consortium (W3C) standards and the Semantic Web technologies. This approach enables the transformation of the Web from a web of documents to a web of data. With it, the Web transforms into a distributed network of data which can be used by software agents and machines. The interlinked nature of the distributed datasets enables the creation of advanced use-case scenarios for the end users and their applications, scenarios previously unavailable over isolated data silos. This creates opportunities for generating new business values in the industry. The adoption of the Linked Data principles by data publishers from the research community and the industry has led to the creation of the Linked Open Data (LOD) Cloud, a vast collection of interlinked data published on and accessible via the existing infrastructure of the Web. The experience in creating these Linked Data datasets has led to the development of a few methodologies for transforming and publishing Linked Data. However, even though these methodologies cover the process of modeling, transforming/generating and publishing Linked Data, they do not consider reuse of the steps in the life-cycle. This results in separate and independent efforts to generate Linked Data within a given domain, which always go through the entire set of life-cycle steps. In this PhD thesis, based on our experience with generating Linked Data in various domains and based on the existing Linked Data methodologies, we define a new Linked Data methodology with a focus on reuse. It consists of five steps which encompass the tasks of studying the domain, modeling the data, transforming the data, publishing it and exploiting it. In each of the steps, the methodology provides guidance to data publishers on defining reusable components in the form of tools, schemas and services for the given domain. With this, future Linked Data publishers in the domain would be able to reuse these components to go through the life-cycle steps in a more efficient and productive manner. With the reuse of schemas from the domain, the resulting Linked Data dataset will be compatible and aligned with other datasets generated by reusing the same components, which additionally increases the value of the datasets. This approach aims to encourage data publishers to generate high-quality, aligned Linked Data datasets from various domains, leading to further growth of the number of datasets on the LOD Cloud, their quality and the exploitation scenarios. With the emergence of data-driven scientific fields, such as Data Science, creating and publishing high-quality Linked Data datasets on the Web is becoming even more important, as it provides an open dataspace built on existing Web standards. Such a dataspace enables data scientists to perform data analytics over the cleaned, structured and aligned data in it, in order to produce new knowledge and introduce new value in a given domain. As the Linked Data principles are also applicable within closed environments over proprietary data, the same methods and approaches are applicable in the enterprise domain as well.
Article
The term Linked Data refers to a set of best practices for publishing and connecting structured data on the Web. These best practices have been adopted by an increasing number of data providers over the last three years, leading to the creation of a global data space containing billions of assertions: the Web of Data. In this article we present the concept and technical principles of Linked Data, and situate these within the broader context of related technological developments. We describe progress to date in publishing Linked Data on the Web, review applications that have been developed to exploit the Web of Data, and map out a research agenda for the Linked Data community as it moves forward.
Conference Paper
The success of Open Data initiatives has increased the amount of data available on the Web. Unfortunately, most of this data is only available in raw tabular form, which makes analysis and reuse quite difficult for non-experts. The Linked Data principles allow for a more sophisticated approach by making explicit both the structure and the semantics of the data. However, from the end-user viewpoint, the resulting datasets remain monolithic files that are completely opaque, or that can only be explored through tedious semantic queries. Our objective is to help users grasp what kinds of entities are in the dataset, how they are interrelated, which are their main properties and values, etc. Rhizomer is a tool for data publishing whose interface provides a set of components borrowed from Information Architecture (IA) that facilitate awareness of the dataset at hand. It automatically generates navigation menus and facets based on the kinds of things in the dataset and how they are described through metadata properties and values. Moreover, motivated by recent tests with end users, it also provides the possibility to pivot among the faceted views created for each class of resources in the dataset.
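As an illustration of this kind of facet generation, the following TypeScript sketch shows SPARQL queries that such a tool could use to discover the classes in a dataset (for navigation menus) and the properties describing a chosen class (for facets). The query shapes are assumptions for illustration, not Rhizomer's actual implementation.

// Classes present in the dataset, ranked by instance count -> navigation menu entries.
const classMenuQuery = `
  SELECT ?class (COUNT(?s) AS ?instances) WHERE {
    ?s a ?class .
  }
  GROUP BY ?class
  ORDER BY DESC(?instances)`;

// Properties used to describe instances of one class -> candidate facets for that class.
const facetQuery = (classUri: string): string => `
  SELECT ?property (COUNT(?value) AS ?uses) WHERE {
    ?s a <${classUri}> ;
       ?property ?value .
  }
  GROUP BY ?property
  ORDER BY DESC(?uses)`;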
Article
The paper discusses the Semantic Web and Linked Data. The classic World Wide Web is built upon the idea of setting hyperlinks between Web documents. These hyperlinks are the basis for navigating and crawling the Web. Technologically, the core idea of Linked Data is to use HTTP URLs not only to identify Web documents, but also to identify arbitrary real-world entities. Data about these entities is represented using the Resource Description Framework (RDF). Whenever a Web client resolves one of these URLs, the corresponding Web server provides an RDF/XML or RDFa description of the identified entity. These descriptions can contain links to entities described by other data sources. The Web of Linked Data can be seen as an additional layer that is tightly interwoven with the classic document Web. The author mentions the application of Linked Data in media, publications, life sciences, geographic data, user-generated content, and cross-domain data sources. The paper concludes by stating that the Web has succeeded as a single global information space that has dramatically changed the way we use information, disrupted business models, and led to profound societal change.
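A minimal TypeScript sketch of the dereferencing step described above: resolving an entity URL with content negotiation so that the server returns an RDF description instead of an HTML page. The example URI and the request for Turtle are assumptions for illustration.

async function describeEntity(entityUri: string): Promise<string> {
  const response = await fetch(entityUri, {
    headers: { Accept: "text/turtle" }, // ask for RDF rather than HTML
    redirect: "follow",                 // Linked Data URIs commonly answer with a 303 redirect
  });
  // The returned RDF description may itself contain links to entities
  // described by other data sources, forming the Web of Linked Data.
  return response.text();
}

// Hypothetical usage:
// describeEntity("http://dbpedia.org/resource/Berlin").then(console.log);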