Article
To read the full-text of this research, you can request a copy directly from the authors.

Abstract

Linked Data Wrappers (LDWs) turn Web APIs into RDF endpoints, supplying the Linked Open Data cloud with current data. Unfortunately, LDWs are fragile upon upgrades of the underlying APIs, compromising LDW stability. Hence, for API-based LDWs to become a sustainable foundation for the Web of Data, we should recognize LDW maintenance as a continuous effort that outlives their breakout projects. This is not new in Software Engineering. Other projects in the past faced similar issues. The strategy: becoming open source and turning towards dedicated platforms. By making LDWs open, we permit others not only to inspect them (hence increasing trust and consumption), but also to maintain them (to cope with API upgrades) and reuse them (to adapt them for their own purposes). Promoting consumption, adaptation and reuse might all help to increase the user base, and in so doing, might provide the critical mass of volunteers that current LDW projects lack. Drawing upon Helping Theory, we investigate three enablers of volunteering applied to LDW maintenance: impetus to respond, positive evaluation of contributing and increasing awareness. Insights are fleshed out through SYQL, an LDW platform on top of Yahoo's YQL. Specifically, SYQL capitalizes on the YQL community (i.e. impetus to respond), provides annotation overlays to ease participation (i.e. positive evaluation of contributing), and introduces a Health Checker (i.e. increasing awareness). Evaluation is conducted with 12 subjects involved in maintaining someone else's LDWs. Results indicate that both the Health Checker and the annotation overlays provide utility as enablers of awareness and contribution.
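A minimal, hedged sketch of the wrapping step, in Python with rdflib and requests: JSON fetched from a Web API is lifted into RDF that a client could then retrieve as Turtle. The API URL, namespace and field names ("id", "name") are hypothetical placeholders, not the authors' SYQL implementation (which sat on top of Yahoo's YQL). Note how a single renamed JSON field would silently break such a wrapper; that fragility is exactly the maintenance problem the paper studies.

import requests
from rdflib import Graph, Literal, Namespace, RDF

EX = Namespace("http://example.org/ldw/")        # hypothetical wrapper namespace
SCHEMA = Namespace("http://schema.org/")

def wrap_as_rdf(api_url):
    """Lift one JSON API response (assumed to be a list of records) into an RDF graph."""
    payload = requests.get(api_url, timeout=10).json()
    g = Graph()
    g.bind("schema", SCHEMA)
    for item in payload:
        subject = EX[str(item["id"])]            # mint a dereferenceable URI per record
        g.add((subject, RDF.type, SCHEMA.Thing))
        g.add((subject, SCHEMA["name"], Literal(item["name"])))
    return g

if __name__ == "__main__":
    graph = wrap_as_rdf("https://api.example.org/items")   # placeholder endpoint
    print(graph.serialize(format="turtle"))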


... Additionally, the Datawrapper browser-based data visualisation tool was used to create the choropleth map for region-based visualisation and spatio-temporal distribution [20][21][22]. Coding variables in Excel included: province, year, SS+, TB, SS-value and map file. Finally, data are visualised via the Datawrapper tool, which enables generating detailed interactive graphics and maps (Journalism++ Cologne: Data wrapper. ...
Article
Full-text available
Globally, tuberculosis is a leading cause of infectious disease deaths. China ranks third among the 30 high-burden countries for tuberculosis and accounts for approximately 7.4% of the cases reported worldwide. Since very few studies have investigated age differences in tuberculosis prevalence in mainland China, the preliminary characterisation of age differences in tuberculosis patients is not well understood. The data on reported sputum smear-positive, tuberculosis and sputum smear-negative cases in 340 prefectures of mainland China were extracted from the China Information System for Disease Control and Prevention from January 2009 to December 2018. Multiple statistical analyses and GIS techniques were used to investigate the temporal trend and identify the spatial distribution of sputum smear-positive, tuberculosis and sputum smear-negative cases in the study area. The results showed that the incidence of sputum smear-positive cases and tuberculosis has dropped to a stable level, while sputum smear-negative cases exhibited a rising trend. Additionally, sputum smear-positive, tuberculosis and sputum smear-negative cases are still highly prevalent in the northwestern and southwestern regions of China. Interestingly, the young adult group (20–50 years) and the elderly group (>50 years) are more susceptible to being infected with tuberculosis, while lower infection levels were recorded in the juvenile group (<20 years). The present study investigated the temporal–spatial distribution of sputum smear-positive, tuberculosis and sputum smear-negative cases in mainland China before the COVID-19 pandemic outbreak, which should help government agencies establish effective mechanisms of tuberculosis prevention in high-risk periods and high-risk areas in the study region.
... Among them, we list Java and Python. As for scalability, the authors of [12] write that LDWs are fragile upon upgrades of the underlying APIs, compromising LDW stability. The authors proposed SYQL, a Linked Data Wrapper (LDW) platform on top of Yahoo's YQL. ...
Article
Full-text available
We report a survey on the current state of the art of the semantic web and its applications. The Semantic Web plays a major role in integrating data, especially open data publicly available on the Internet. We reviewed scientific research papers, case studies, web sites and specialty books in order to discuss the main application areas, especially in the field of governmental use, and the main technologies and architectures involved. Both quantitative and qualitative analyses were carried out on the data, which related to 1460 ontologies belonging to the Linked Open Data Cloud. The second analysis concerned the content of scientific articles indexed in the Clarivate Analytics, Scopus and Google Scholar databases. We identified and analyzed 84,941 articles written on the subject of ontology, of which 36,264 belong to computer science. The results of our research show that semantic web technologies are an important tool for describing and integrating data and an important component in the data layer of any intelligent application. This study contributes to the mainstream research literature by presenting the application areas of the semantic web as well as the development tools, architectures, and methodologies for semantic web applications.
Article
Full-text available
User-generated content (UGC) projects involve large numbers of mostly unpaid contributors collaborating to create content. Motivation for such contributions has been an active area of research. In prior research, motivation for contribution to UGC has been considered a single, static and individual phenomenon. In this paper, we argue that it is instead three separate but interrelated phenomena. Using the theory of helping behaviour as a framework and integrating social movement theory, we propose a stage theory that distinguishes three separate sets (initial, sustained and meta) of motivations for participation in UGC. We test this theory using a data set from a Wikimedia Editor Survey (Wikimedia Foundation, 2011). The results suggest several opportunities for further refinement of the theory but provide support for the main hypothesis, that different stages of contribution have distinct motives. The theory has implications for both researchers and practitioners who manage UGC projects.
Article
Full-text available
CrunchBase is a database about startups and technology companies. The database can be searched, browsed, and edited via a website, but is also accessible via an entity-centric HTTP API in JSON format. We present a wrapper around the API that provides the data as Linked Data. The wrapper provides schema-level links to schema.org, Friend-of-a-Friend and Vocabulary-of-a-Friend, and entity-level links to DBpedia for organization entities. We describe how to harvest the RDF data to obtain a local copy of the data for further processing and querying that goes beyond the access facilities of the CrunchBase API. Further, we describe the cases in which the Linked Data API for CrunchBase and the crawled CrunchBase RDF data have been used in other works.
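As an illustration of the harvesting step described above, the sketch below (Python, rdflib) dereferences a seed Linked Data URI into a local graph and queries it with SPARQL. The seed URI is a hypothetical placeholder; the actual URI scheme and vocabulary of the CrunchBase wrapper are not reproduced here.

from rdflib import Graph

local = Graph()
seed = "https://example.org/linked-crunchbase/organization/acme"   # hypothetical seed URI
local.parse(seed)   # rdflib dereferences the URI and parses the returned RDF

# Once harvested, SPARQL queries can go beyond what the JSON API exposes,
# e.g. listing every predicate observed for the harvested entity.
results = local.query("SELECT DISTINCT ?p WHERE { ?s ?p ?o }")
for row in results:
    print(row.p)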
Article
Full-text available
One of the major barriers to the deployment of Linked Data is the difficulty that data publishers have in determining which vocabularies to use to describe the semantics of data. This systematic report describes Linked Open Vocabularies (LOV), a high-quality catalogue of reusable vocabularies for the description of data on the Web. The LOV initiative gathers and makes visible indicators such as the interconnections between vocabularies and each vocabulary's version history, along with its past and current editors (individuals or organizations). The report details the various components of the system along with some innovations, such as the introduction of a property-level boost in the vocabulary search scoring that takes into account the property's type (e.g., dc:comment) associated with a matching literal value. By providing an extensive range of data access methods (full-text search, SPARQL endpoint, API, data dump or UI), the project aims at facilitating the reuse of well-documented vocabularies in the Linked Data ecosystem. The adoption of LOV by many applications and methods shows the importance of such a set of vocabularies and related features for ontology design and the publication of data on the Web.
Article
Full-text available
The article reports on the evolution of data.open.ac.uk, the Linked Open Data platform of the Open University, from a research experiment to a data hub for the open content of the University. Entirely based on Semantic Web technologies (RDF and the Linked Data principles), data.open.ac.uk is used to curate, publish and access data about academic degree qualifications, courses, scholarly publications and open educational resources of the University. It exposes a SPARQL endpoint and several other services to support developers, including queries stored server-side and entity lookup using known identifiers such as course codes and YouTube video IDs. The platform is now a key information service at the Open University, with several core systems and websites exploiting linked data through data.open.ac.uk. Through these applications, data.open.ac.uk is now fulfilling a key role in the overall data infrastructure of the university, and in establishing connections with other educational institutions and information providers.
Conference Paper
Full-text available
It is widely accepted that proper data publishing is difficult. The majority of Linked Open Data (LOD) does not meet even a core set of data publishing guidelines. Moreover, datasets that are clean at creation can get stains over time. As a result, the LOD cloud now contains a high level of dirty data that is difficult for humans to clean and for machines to process. Existing solutions for cleaning data (standards, guidelines, tools) are targeted towards human data creators, who can (and do) choose not to use them. This paper presents the LOD Laundromat, which removes stains from data without any human intervention. This fully automated approach is able to make very large amounts of LOD more easily available for further processing right now. LOD Laundromat is not a new dataset, but rather a uniform point of entry to a collection of cleaned siblings of existing datasets. It provides researchers and application developers a wealth of data that is guaranteed to conform to a specified set of best practices, thereby greatly improving the chance of data actually being (re)used.
Article
Full-text available
The development and standardization of Semantic Web technologies has resulted in an unprecedented volume of data being published on the Web as Linked Data (LD). However, we observe widely varying data quality ranging from extensively curated datasets to crowdsourced and extracted data of relatively low quality. In this article, we present the results of a systematic review of approaches for assessing the quality of LD. We gather existing approaches and analyze them qualitatively. In particular, we unify and formalize commonly used terminologies across papers related to data quality and provide a comprehensive list of 18 quality dimensions and 69 metrics. Additionally, we qualitatively analyze the 30 core approaches and 12 tools using a set of attributes. The aim of this article is to provide researchers and data curators a comprehensive understanding of existing work, thereby encouraging further experimentation and development of new approaches focused towards data quality, specifically for LD.
Article
Full-text available
Services are part of our daily life: they represent an important means to deliver value to their consumers and have a great economic impact for organizations. Service consumption and the exponential proliferation of services show their importance and acceptance by customers. In this sense, it is possible to predict that the infrastructure of future cities will be supported by different kinds of services, such as smart city services, open data services, as well as common services (e.g., e-mail). Nowadays a large percentage of services are provided on the web and are commonly called web services (WSs). This kind of service has become one of the most used technologies in software systems. Among the challenges of integrating web services into a given system, requirements-driven selection occupies a prominent place. A comprehensive selection process needs to check compliance with Non-Functional Requirements (NFRs), which can be assessed by analyzing Quality of Service (QoS). In this paper, we describe a framework called WeSSQoS that aims at ranking available WSs based on the comparison of their QoS and the stated NFRs. The framework is designed as an open Service Oriented Architecture (SOA) that hosts a configurable portfolio of normalization procedures and ranking algorithms, which can be selected by users when starting a selection process. The QoS data of WSs can be obtained either from a static, WSDL-like description or dynamically through monitoring techniques. WeSSQoS is designed to work over multiple WS repositories and QoS sources. The impact of having a portfolio of different normalization and ranking algorithms is illustrated with an example.
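To illustrate the kind of computation such a selection step performs, here is a toy Python sketch that min-max normalizes two QoS metrics and ranks services by a weighted sum. WeSSQoS hosts a configurable portfolio of normalization procedures and ranking algorithms; this is only one plausible pair, with made-up figures, not the tool's actual implementation.

# Hypothetical QoS measurements for three candidate web services.
services = {
    "ws-a": {"latency_ms": 120, "availability": 0.999},
    "ws-b": {"latency_ms": 45,  "availability": 0.990},
    "ws-c": {"latency_ms": 80,  "availability": 0.995},
}
weights = {"latency_ms": 0.4, "availability": 0.6}   # weights derived from the stated NFRs
lower_is_better = {"latency_ms"}

def normalize(metric, value, values):
    """Min-max normalize one metric to [0, 1], inverting metrics where lower is better."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return 1.0
    score = (value - lo) / (hi - lo)
    return 1.0 - score if metric in lower_is_better else score

def rank(services, weights):
    """Rank services by the weighted sum of their normalized QoS scores."""
    columns = {m: [qos[m] for qos in services.values()] for m in weights}
    scores = {
        name: sum(weights[m] * normalize(m, qos[m], columns[m]) for m in weights)
        for name, qos in services.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

print(rank(services, weights))   # best-matching service first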
Conference Paper
Full-text available
Linked Open Data (LOD) comprises an unprecedented volume of structured data on the Web. However, these datasets are of varying quality, ranging from extensively curated datasets to crowd-sourced or extracted data of often relatively low quality. We present a methodology for test-driven quality assessment of Linked Data, which is inspired by test-driven software development. We argue that vocabularies, ontologies and knowledge bases should be accompanied by a number of test cases, which help to ensure a basic level of quality. We present a methodology for assessing the quality of linked data resources, based on a formalization of bad smells and data quality problems. Our formalization employs SPARQL query templates, which are instantiated into concrete quality test case queries. Based on an extensive survey, we compile a comprehensive library of data quality test case patterns. We perform automatic test case instantiation based on schema constraints or semi-automatically enriched schemata and allow the user to generate specific test case instantiations that are applicable to a schema or dataset. We provide an extensive evaluation of five LOD datasets, manual test case instantiation for five schemas and automatic test case instantiations for all available schemata registered with LOV. One of the main advantages of our approach is that domain-specific semantics can be encoded in the data quality test cases, making it possible to discover data quality problems beyond conventional quality heuristics.
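A minimal Python/rdflib sketch of the test-case idea follows: a SPARQL query template is instantiated with one concrete (property, expected datatype) constraint and run against sample data. The template and data are illustrative only; the paper compiles a much richer library of patterns and instantiates them automatically from schema constraints.

from rdflib import Graph

data = """
@prefix ex: <http://example.org/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
ex:alice ex:birthYear "1980"^^xsd:integer .
ex:bob   ex:birthYear "nineteen-eighty" .
"""

# Generic template: flag subjects whose value for a property has the wrong datatype.
TEMPLATE = """
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT ?s WHERE {{
  ?s <{property}> ?value .
  FILTER (datatype(?value) != xsd:{datatype})
}}
"""

g = Graph()
g.parse(data=data, format="turtle")

# Instantiate the template for one schema constraint and run it as a quality test case.
test_case = TEMPLATE.format(property="http://example.org/birthYear", datatype="integer")
violations = [str(row.s) for row in g.query(test_case)]
print("violations:", violations)   # expected: the ex:bob resource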
Article
Full-text available
The DBpedia community project extracts structured, multilingual knowledge from Wikipedia and makes it freely available on the Web using Semantic Web and Linked Data technologies. The project extracts knowledge from 111 different language editions of Wikipedia. The largest DBpedia knowledge base which is extracted from the English edition of Wikipedia consists of over 400 million facts that describe 3.7 million things. The DBpedia knowledge bases that are extracted from the other 110 Wikipedia editions together consist of 1.46 billion facts and describe 10 million additional things. The DBpedia project maps Wikipedia infoboxes from 27 different language editions to a single shared ontology consisting of 320 classes and 1,650 properties. The mappings are created via a world-wide crowd-sourcing effort and enable knowledge from the different Wikipedia editions to be combined. The project publishes releases of all DBpedia knowledge bases for download and provides SPARQL query access to 14 out of the 111 language editions via a global network of local DBpedia chapters. In addition to the regular releases, the project maintains a live knowledge base which is updated whenever a page in Wikipedia changes. DBpedia sets 27 million RDF links pointing into over 30 external data sources and thus enables data from these sources to be used together with DBpedia data. Several hundred data sets on the Web publish RDF links pointing to DBpedia themselves and make DBpedia one of the central interlinking hubs in the Linked Open Data (LOD) cloud. In this system report, we give an overview of the DBpedia community project, including its architecture, technical implementation, maintenance, internationalisation, usage statistics and applications.
Article
Full-text available
The TourMISLOD dataset exposes as linked data a significant portion of the content of TourMIS, a key source of European tourism statistics data. TourMISLOD contains information about the Arrivals, Bednights and Capacity tourism indicators, recorded from 1985 onwards, about over 150 European cities and in connection to 19 major markets. Due to licensing issues, the usage of this dataset is currently limited to the TourMIS consortium. Nevertheless, a prototype application has already revealed the dataset's usefulness for decision support.
Conference Paper
Full-text available
The amount of data available in the Linked Data cloud continues to grow. Yet, few services consume and produce linked data. There is recent work that allows a user to define a linked service from an online service, which includes the specifications for consuming and producing linked data, but building such models is time consuming and requires specialized knowledge of RDF and SPARQL. This paper presents a new approach that allows domain experts to rapidly create semantic models of services by demonstration in an interactive web-based interface. First, the user provides examples of the service request URLs. Then, the system automatically proposes a service model the user can refine interactively. Finally, the system saves a service specification using a new expressive vocabulary that includes lowering and lifting rules. This approach empowers end users to rapidly model existing services and immediately use them to consume and produce linked data.
Article
Full-text available
An ever-increasing amount of event-centric knowledge is spread over multiple web sites, either materialized as calendars of past and upcoming events or illustrated by cross-media items. This opens an opportunity to create an infrastructure unifying event-centric information derived from event directories, media platforms and social networks. In order to create such an infrastructure, EventMedia relies on Semantic Web technologies that ensure seamless aggregation and integration of disparate data sources, some of which overlap in their coverage. In this paper, we present the EventMedia dataset, composed of event descriptions associated with media and interlinked with the Linked Open Data cloud. We describe how the data has been extracted, converted, interlinked and published following the best practices of the Semantic Web.
Conference Paper
Full-text available
This paper furthers inquiry into the social structure of free and open source software (FLOSS) teams by undertaking social network analysis across time. Contrary to expectations, we confirmed earlier findings of a wide distribution of centralizations even when examining the networks over time. The paper also provides empirical evidence that, while change at the center of FLOSS projects is relatively uncommon, participation across the project communities is highly skewed, with many participants appearing for only one period. Surprisingly, large project teams are not more likely to undergo change at their centers.
Article
Full-text available
The development of knowledge requires investment, which may be made in terms of financial resources or time. Open source software (OSS) has challenged much of the traditional reasoning by suggesting that individuals behave altruistically and contribute to a public good, despite the opportunity to free-ride. The lion's share of the existing literature on OSS examines communities, that is, those individuals who are already part of the OSS community. In contrast, this paper starts from users with the requisite skill to use and develop OSS. This group of skilled individuals could potentially invest into the development of OSS knowledge, but they may or may not do so in actuality. This paper, therefore, examines three issues that have not been extensively explored in the literature, namely, (1) how frequently a group of skilled people use OSS, (2) reasons for differences among users and non-users in terms of use and attitudes, and (3) how frequently, and why, some users contribute to OSS projects (and thereby become developers). In doing so, we consider the opportunity costs of use and development of OSS, which have been largely neglected in the literature. We find that the individuals have a rather pragmatic attitude to firms and that many are active in both firms and the OSS community, which raises many questions for future research about the role and influence of firms on the development and diffusion of OSS.
Article
Web APIs have enjoyed a significant increase in popularity and usage over the last decade. They have become the core technology for exposing functionalities and data. Nevertheless, due to the lack of semantic Web API descriptions, their discovery, sharing, integration, and the assessment of their quality and consumption are limited. In this paper, we present the Linked Web APIs dataset, an RDF dataset with semantic descriptions about Web APIs. It provides semantic descriptions for 11,339 Web APIs, 7,415 mashups and 7,717 developer profiles, making it the largest available dataset in the Web APIs domain. The dataset captures the provenance, temporal, technical, functional, and non-functional aspects. In addition, we describe the Linked Web APIs Ontology, a minimal model which builds on top of several well-known ontologies. The dataset has been interlinked and published according to the Linked Data principles. Finally, we describe several possible usage scenarios for the dataset and show its potential.
Conference Paper
Linked Data Wrappers (LDWs) have been proposed to integrate Open APIs into the linked-data cloud. A main stumbling block is maintenance: LDWs need to be kept in sync with the APIs they wrap. Hence, LDWs are not single-shot efforts, but sustained endeavors that developers might not always afford. As a result, it is not uncommon for third-party LDWs to stop working when their underlying APIs upgrade. Collaborative development might offer a way out. This requires a common platform and a community to tap into. This work investigates the suitability of the YQL platform for this job. Specifically, we look into two main properties for LDW success: effectiveness (i.e. the capability of YQL to enable users to develop LDWs) and scalability (i.e. graceful time degradation on URI dereferencing). The aim: moving LDW development from in-house development to collaborative development as promoted by YQL, in the hope of increasing LDWs' lifespan.
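The scalability property mentioned above boils down to how URI dereferencing latency degrades. A rough client-side probe of that latency could look like the Python sketch below; YQL has since been retired, so the URIs are placeholders and this is not the paper's evaluation setup.

import time
import requests

ldw_uris = [
    "https://example.org/ldw/weather/bilbao",     # hypothetical LDW resources
    "https://example.org/ldw/weather/donostia",
]

for uri in ldw_uris:
    start = time.perf_counter()
    # Ask for an RDF serialization via content negotiation and time the round trip.
    response = requests.get(uri, headers={"Accept": "text/turtle"}, timeout=30)
    elapsed = time.perf_counter() - start
    print(f"{uri} -> HTTP {response.status_code} in {elapsed:.2f}s")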
Article
Facebook's Graph API is an API for accessing objects and connections in Facebook's social graph. To give some idea of the enormity of the social graph underlying Facebook, it was recently announced that Facebook has 901 million users, and the social graph consists of many types beyond just users. Until recently, the Graph API provided data to applications in only a JSON format. In 2011, an effort was undertaken to provide the same data in a semantically-enriched, RDF format containing Linked Data URIs. This was achieved by implementing a flexible and robust translation of the JSON output to a Turtle output. This paper describes the associated design decisions, the resulting Linked Data for objects in the social graph, and known issues.
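The translation step described above can be pictured with the following Python/rdflib sketch, which maps a Graph-API-style JSON object onto RDF triples and serializes them as Turtle. The sample object, namespace and property mapping are made up for illustration and are not the paper's actual translation rules.

from rdflib import Graph, Literal, Namespace, URIRef

FB = Namespace("http://example.org/facebook-ns#")      # hypothetical vocabulary

graph_api_json = {                                      # shape of a typical Graph API object
    "id": "19292868552",
    "name": "Facebook for Developers",
    "category": "Product/service",
}

g = Graph()
g.bind("fb", FB)
subject = URIRef("http://example.org/graph/" + graph_api_json["id"])
for key, value in graph_api_json.items():
    if key != "id":
        g.add((subject, FB[key], Literal(value)))       # each JSON field becomes a predicate

print(g.serialize(format="turtle"))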
Article
Since 2012, the Semantic Web journal has been accepting papers in a novel Linked Dataset description track. Here we motivate the track and provide some analysis of the papers accepted thus far. We look at the ratio of accepted papers in this time-frame that fall under this track, the relative impact of these papers in terms of citations, and we perform a technical analysis of the datasets they describe to see what sorts of resources they provide and to see if the datasets have remained available since publication. Based on a variety of such analyses, we present some lessons learnt and discuss some potential changes we could apply to the track in order to improve the overall quality of papers accepted.
Conference Paper
The central idea of Linked Data is that data publishers support applications in discovering and integrating data by complying to a set of best practices in the areas of linking, vocabulary usage, and metadata provision. In 2011, the State of the LOD Cloud report analyzed the adoption of these best practices by linked datasets within different topical domains. The report was based on information that was provided by the dataset publishers themselves via the datahub.io Linked Data catalog. In this paper, we revisit and update the findings of the 2011 State of the LOD Cloud report based on a crawl of the Web of Linked Data conducted in April 2014. We analyze how the adoption of the different best practices has changed and present an overview of the linkage relationships between datasets in the form of an updated LOD cloud diagram, this time not based on information from dataset providers, but on data that can actually be retrieved by a Linked Data crawler. Among others, we find that the number of linked datasets has approximately doubled between 2011 and 2014, that there is increased agreement on common vocabularies for describing certain types of entities, and that provenance and license metadata is still rarely provided by the data sources.
Conference Paper
Coping with the ever-increasing amount of data becomes increasingly challenging. To alleviate the information overload put on people, systems are progressively being connected directly to each other. They exchange, analyze, and manipulate humongous amounts of data without any human interaction. Most current solutions, however, do not exploit the whole potential of the architecture of the World Wide Web and completely ignore the possibilities offered by Semantic Web technologies. Based on the experiences gained by implementing and analyzing various RESTful APIs and drawing from the longer history of Semantic Web research we developed Hydra, a small vocabulary to describe Web APIs. It aims to simplify the development of truly RESTful services by leveraging the power of Linked Data. By breaking the descriptions down into small independent fragments, a new breed of interoperable Web APIs using decentralized, reusable, and composable contracts can be realized.
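As a flavour of what such a description fragment looks like, the Python sketch below emits a small JSON-LD snippet using terms from the Hydra core vocabulary (Collection, Operation, method, expects, returns) to advertise one operation on a resource. The resource and vocabulary URLs are hypothetical, and the fragment is hand-written rather than produced by any Hydra tooling, so treat it as a sketch of the idea only.

import json

issue_collection_description = {
    "@context": "http://www.w3.org/ns/hydra/context.jsonld",
    "@id": "https://example.org/api/issues",            # hypothetical resource
    "@type": "Collection",
    "operation": [{
        "@type": "Operation",
        "method": "POST",
        "expects": "https://example.org/vocab#Issue",    # hypothetical application class
        "returns": "https://example.org/vocab#Issue",
    }],
}

# A client that understands Hydra could read this fragment and learn,
# without out-of-band documentation, how to create a new issue.
print(json.dumps(issue_collection_description, indent=2))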
Conference Paper
Background / Purpose: The world wide web acts as a major publication platform for scientific publications, but more and more life sciences data are being published in a variety of incompatible formats that preclude easy integration, information retrieval and query answering. Although the semantic web effort offers machine understandable languages such as RDF and OWL to publish semantically annotated data on an emerging web of linked data, life sciences data providers don’t always use the same identifier to refer to the same entities and data must be processed in order to ensure link accuracy. Main conclusion: Bio2RDF is an open source project that uses semantic web technologies to build and provide the largest network of linked data for the life sciences. Here, we present the second release of the Bio2RDF project which features up-to-date, open-source scripts, IRI normalization through a common dataset registry, dataset provenance, data metrics, public SPARQL endpoints, compressed RDF files and full text-indexed virtuoso triple stores for download.
Article
The PSGR project is the first attempt to generate, curate, interlink and distribute daily updated public spending data in LOD formats that can be useful to both expert (i.e. scientists and professionals) and naïve users. The PSGR ontology is based on the UK payments ontology and reuses, among others, the W3C Registered Organization Vocabulary and the Core Business Vocabulary. RDFized data are linked to product classifications, Geonames and DBpedia resources. Online services contain advanced search features and domain-level information (e.g. local government), simple and complex visualizations based on network analysis, linked information about payment entities, and SPARQL endpoints. As of February 2013, the growing dataset consisted of approximately 2 million payment decisions, valued at 44.5 billion euros, forming 65 million triples.
Conference Paper
In this paper, we present the design and first results of the Dynamic Linked Data Observatory: a long-term experiment to monitor the two-hop neighbourhood of a core set of eighty thousand diverse Linked Data documents on a weekly basis. We present the methodology used for sampling the URIs to monitor, retrieving the documents, and further crawling part of the two-hop neighbourhood. Having now run this experiment for six months, we analyse the dynamics of the monitored documents over the data collected thus far. We look at the estimated lifespan of the core documents, how often they go on-line or off-line, how often they change; we further investigate domain-level trends. Next we look at changes within the RDF content of the core documents across the weekly snapshots, examining the elements (i.e., triples, subjects, predicates, objects, classes) that are most frequently added or removed. Thereafter, we look at how the links between dereferenceable documents evolves over time in the two-hop neighbourhood.
Article
RDF (Resource Description Framework) is seeing rapidly increasing adoption, for example, in the context of the Linked Open Data (LOD) movement and diverse life sciences data publishing and integration projects. This paper discusses how we have adapted OpenLink Virtuoso, a general purpose RDBMS, for this new type of workload. We discuss adapting Virtuoso's relational engine for native RDF support with dedicated data types, bitmap indexing and SQL optimizer techniques. We further discuss scaling out by running on a cluster of commodity servers, each with local memory and disk. We look at how this impacts query planning and execution and how we achieve high parallel utilization of multiple CPU cores on multiple servers. We present comparisons with other RDF storage models as well as other approaches to scaling out on server clusters. We present conclusions and metrics as well as a number of use cases, from DBpedia to bio informatics and collaborative web applications.
Conference Paper
A sizable amount of data on the Web is currently available via Web APIs that expose data in formats such as JSON or XML. Combining data from different APIs and data sources requires glue code which is typically not shared and hence not reused. We propose Linked Data Services (LIDS), a general, formalised approach for integrating data-providing services with Linked Data, a popular mechanism for data publishing which facilitates data integration and allows for decentralised publishing. We present conventions for service access interfaces that conform to Linked Data principles, and an abstract lightweight service description formalism. We develop algorithms that use LIDS descriptions to automatically create links between services and existing data sets. To evaluate our approach, we realise LIDS wrappers and LIDS descriptions for existing services and measure performance and effectiveness of an automatic interlinking algorithm over multiple billions of triples.
Conference Paper
Linked Data Services (LIDS) denote the integration of data-providing services and Linked Data. LIDS are parameterised and formally described web resources which return RDF when dereferenced via HTTP. In this paper we present a general method for creating Linked Data Services; LIDS consist of data access interface conventions that are compatible with Linked Data principles and a lightweight formal description model. Our approach is based on established Web standards including HTTP, RDF and SPARQL. Additionally, we announce several LIDS that we have created from existing real-life services, unlocking vast amounts of triples to the Web of Data.
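The access convention can be sketched in a few lines of Python: service inputs are encoded as query parameters of a resource URI, and dereferencing that URI returns RDF describing the result. The endpoint and parameter names below are hypothetical stand-ins, not one of the authors' published LIDS.

from urllib.parse import urlencode
from rdflib import Graph

def lids_uri(base, **bindings):
    """Build a parameterised LIDS-style resource URI from input bindings."""
    return f"{base}?{urlencode(bindings)}"

uri = lids_uri("https://example.org/lids/geocode", lat="43.26", lng="-2.93")
g = Graph()
g.parse(uri, format="turtle")   # dereference; the service is assumed to answer with RDF
print(len(g), "triples describe", uri)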
Article
International open government initiatives are releasing an increasing volume of raw government datasets directly to citizens via the Web. The transparency resulting from these releases not only creates new application opportunities but also imposes new burdens inherent to large-scale distributed data integration, collaborative data manipulation and transparent data consumption. The Tetherless World Constellation (TWC) at Rensselaer Polytechnic Institute (RPI) has developed the Semantic Web-based TWC LOGD portal to support the deployment of linked open government data (LOGD). The portal is both an open source infrastructure supporting linked open government data production and consumption and a vibrant community portal that educates and serves the growing international open government community of developers, data curators and end users. This paper motivates and introduces the TWC LOGD Portal and highlights innovative aspects and lessons learned.
Book
Like other sciences and engineering disciplines, software engineering requires a cycle of model building, experimentation, and learning. Experiments are valuable tools for all software engineers who are involved in evaluating and choosing between different methods, techniques, languages and tools. The purpose of Experimentation in Software Engineering is to introduce students, teachers, researchers, and practitioners to empirical studies in software engineering, using controlled experiments. The introduction to experimentation is provided through a process perspective, and the focus is on the steps that we have to go through to perform an experiment. The book is divided into three parts. The first part provides a background of theories and methods used in experimentation. Part II then devotes one chapter to each of the five experiment steps: scoping, planning, execution, analysis, and result presentation. Part III completes the presentation with two examples. Assignments and statistical material are provided in appendixes. Overall, the book provides indispensable information regarding empirical studies, in particular experiments, but also case studies, systematic literature reviews, and surveys. It is a revision of the authors' book, which was published in 2000. In addition, substantial new material, e.g. concerning systematic literature reviews and case study research, is introduced. The book is self-contained and is suitable as a course book in undergraduate or graduate studies where the need for empirical studies in software engineering is stressed. Exercises and assignments are included to combine the more theoretical material with practical aspects. Researchers will also benefit from the book, learning more about how to conduct empirical studies, and likewise practitioners may use it as a "cookbook" when evaluating new methods or techniques before implementing them in their organization.
Article
This document describes best practice recipes for publishing an RDFS or OWL vocabulary or ontology on the Web. The features of each recipe are clearly described, so that vocabulary or ontology creators may choose the recipe best suited to the needs of their particular situation. Each recipe contains an example configuration for use with an Apache HTTP server, although the principles involved may be adapted to other environments. The recipes are all designed to be consistent with the architecture of the Web as currently specified. (W3C Working Draft.)
Conference Paper
Software inspection is a known technique for improving software quality. It involves carefully examining the code, the design, and the documentation of software and checking these for aspects that are known to be potentially problematic based on past experience. Code smells are a metaphor to describe patterns that are generally associated with bad design and bad programming practices. Originally, code smells were used to find the places in software that could benefit from refactoring. In this paper we investigate how the quality of code can be automatically assessed by checking for the presence of code smells and how this approach can contribute to automatic code inspection. We present an approach for the automatic detection and visualization of code smells and discuss how this approach can be used in the design of a software inspection tool. We illustrate the feasibility of our approach with the development of jCOSMO, a prototype code smell browser that detects and visualizes code smells in Java source code. Finally, we show how this tool was applied in a case study.
M. Sporny, G. Kellogg, M. Lanthaler, A JSON-based serialization for Linked Data, Recommendation, W3C, January 2014. http://www.w3.org/TR/jsonld/. (Accessed 14 October 2019).

B. Hyland, G. Atemezing, B. Villazón-Terrazas, Best practices for publishing Linked Data, Working draft, W3C, January 2014. https://www.w3.org/TR/ld-bp/. (Accessed 14 October 2019).

C. Bizer, A. Seaborne, D2RQ – treating non-RDF databases as virtual RDF graphs, in: Proceedings of the 3rd International Semantic Web Conference, ISWC2004, 2004.

B. Norton, R. Krummenacher, A. Marte, D. Fensel, Dynamic linked data via linked open services, in: Proceedings of the Workshop on Linked Data in the Future Internet at the Future Internet Assembly (LDFI), volume 700, CEUR Workshop Proceedings, 2010.

P. Miller, R. Styles, T. Heath, Open data commons, a license for open data, in: Proceedings of the WWW2008 Workshop on Linked Data on the Web (LDOW), volume 369, CEUR Workshop Proceedings, 2008.

L. Sauermann, R. Cyganiak, D. Ayers, M. Völkel, Cool URIs for the semantic web, Working draft, W3C, December 2008. https://www.w3.org/TR/cooluris/. (Accessed 14 October 2019).

S.K. Shah, Motivation, governance, and the viability of hybrid forms in open source software development, Manage. Sci. 52 (7) (2006) 1000–1014, http://dx.doi.org/10.1287/mnsc.1060.0553.

R. Albertoni, A. Isaac, J. Debattista, M. Dekkers, C. Guret, D. Lee, N. Mihindukulasooriya, A. Zaveri, Data on the Web best practices: Data quality vocabulary, Working draft, W3C, December 2016. https://www.w3.org/TR/vocab-dqv/. (Accessed 14 October 2019).