Conference Paper

Self-Service Linked Government Data with dcat and Gridworks

Abstract

Open Government Data initiatives in the US, UK and elsewhere have made large amounts of raw data available to the public on the Web. There is enormous potential in applying Linked Data principles to these datasets. This potential currently remains largely untapped because governments lack the resources required to convert from raw data to high-quality Linked Data on a large scale. We present a "self-service" approach to this problem: By connecting a powerful Gridworks-based data workbench application directly to data catalogs, via a standard Data Catalog Vocabulary, data professionals outside of government can contribute to the Linked Data conversion process, thus obtaining data for their own needs and benefiting the larger Linked Government Data effort.

... Our goal is to find the corresponding URI in DBpedia. Trying to reconcile it based on label comparison only might give a large number of heterogeneous results, including University of Cambridge (dbpedia:University of Cambridge) and Cambridge Bay (dbpedia:Cambridge Bay). Adding a restriction to accept only results that are cities helps narrow the results down, but Cambridge, Ontario (dbpedia:Cambridge, Ontario) and Cambridge, Maryland (dbpedia:Cambridge, Maryland) will still be present in the results. ...
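The ambiguity described in the snippet above can be sketched in a few lines of Python. This is not the reconciliation service's actual algorithm; the candidate list and the `reconcile` helper are hypothetical, built only to show why a type restriction narrows but does not resolve the "Cambridge" case:

```python
# Hypothetical DBpedia candidates for the label "Cambridge".
CANDIDATES = [
    {"uri": "dbpedia:Cambridge",                "label": "Cambridge",                "types": {"City"}},
    {"uri": "dbpedia:University_of_Cambridge",  "label": "University of Cambridge",  "types": {"University"}},
    {"uri": "dbpedia:Cambridge_Bay",            "label": "Cambridge Bay",            "types": {"Settlement"}},
    {"uri": "dbpedia:Cambridge,_Ontario",       "label": "Cambridge, Ontario",       "types": {"City"}},
    {"uri": "dbpedia:Cambridge,_Maryland",      "label": "Cambridge, Maryland",      "types": {"City"}},
]

def reconcile(label, required_type=None):
    """Return candidate URIs whose label contains the query label,
    optionally restricted to candidates of a required type."""
    hits = [c for c in CANDIDATES if label.lower() in c["label"].lower()]
    if required_type is not None:
        hits = [c for c in hits if required_type in c["types"]]
    return [c["uri"] for c in hits]

# Label-only matching returns all five heterogeneous candidates; restricting
# to cities still leaves three ambiguous "Cambridge" entries.
```

In practice a reconciliation service would also score candidates (e.g. by string similarity or link popularity) rather than return a flat list, which is the gap the cited extension addresses.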
... Book Mashup URIs can be easily generated while translating any dataset containing ISBN values (assuming that dbpedia is the prefix for http://dbpedia.org/resource/; the Book Mashup is available at http://sites.wiwiss.fu-berlin.de/suhl/bizer/bookmashup). ...
... Google Refine does not support direct RDF export. In [8] we describe an extension that provides RDF export capabilities. The export functionality is based on describing the shape of the desired RDF through a skeleton detailing what resources and literals to include in the RDF graph, what relations to set between them and what URIs to use for resources. ...
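The skeleton idea in the snippet above can be illustrated with a small sketch. This is not the extension's real API; the `SKELETON` structure, field names and URIs are invented for the example, which simply maps each input row to a set of N-Triples lines:

```python
# A hypothetical export skeleton: a subject URI pattern plus a list of
# (predicate URI, value function) pairs applied to every input row.
SKELETON = {
    "subject": lambda row: f"http://example.org/school/{row['id']}",
    "triples": [
        ("http://xmlns.com/foaf/0.1/name", lambda row: f'"{row["name"]}"'),
        ("http://example.org/ns/pupils",   lambda row: f'"{row["pupils"]}"'),
    ],
}

def export_rdf(rows, skeleton):
    """Apply the skeleton to each row, yielding one N-Triples line
    per (predicate, value) pair."""
    lines = []
    for row in rows:
        subject = f"<{skeleton['subject'](row)}>"
        for predicate, value_fn in skeleton["triples"]:
            lines.append(f"{subject} <{predicate}> {value_fn(row)} .")
    return lines

rows = [{"id": "s1", "name": "Hill School", "pupils": "320"}]
```

The point of the template approach is that the same skeleton works for any number of rows, so converting a full spreadsheet only requires describing the graph shape once.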
Article
We observe that "LOD hubs" are emerging. They provide well-managed reference identifiers that attract a large share of the incoming links on the Web of Data and play a crucial role in data integration within communities of interest. But connecting to such hubs as part of the Linked Data publishing process is still a difficult task. In this paper, we explore several approaches to the implementation of reconciliation services that allow third-party publishers to link their data to LOD hubs as part of the data publishing process. We evaluate four approaches using the OAEI Instance Matching Benchmark, and describe their implementation in an extension to the popular data workbench application Google Refine.
... Formats are often inconvenient, (e.g. numerical tables as PDFs), there is little consistency across datasets, and documentation is often poor [6]. ...
... In a previous work, we presented a contribution towards supporting the production of high-quality LGD, the "self-service" approach [6]. It shifts the burden of Linked Data conversion towards the data consumer. ...
... The Case for "Self-service LGD" In a nutshell, the self-service approach enables consumers who need a Linked Data representation of a raw government dataset to produce the Linked Data themselves without waiting for the government to do so. Shifting the burden of Linked Data conversion towards the data consumer has several advantages [6]: (i) there are more of them; (ii) they have the necessary motivation for performing conversion and clean-up; (iii) they know which datasets they need, and don't have to rely on the government's data team to convert the right datasets. ...
Conference Paper
We tackle the challenges involved in converting raw government data into high-quality Linked Government Data (LGD). Our approach is centred around the idea of self-service LGD which shifts the burden of Linked Data conversion towards the data consumer. The self-service LGD is supported by a publishing pipeline that also enables sharing the results with sufficient provenance information. We describe how the publishing pipeline was applied to a local government catalogue in Ireland resulting in a significant amount of Linked Data published.
... More specifically, under its action on promoting semantic interoperability among the European Union Member States (SEMIC). It is mainly based on the Data Catalog Vocabulary (DCAT), which was initially developed at the Digital Enterprise Research Institute in Ireland [16] [17] and later became a W3C recommendation under the responsibility of the Government Linked Data Working Group. DCAT is an RDF vocabulary designed to facilitate interoperability between data catalogs published on the Web. ...
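A minimal sketch can illustrate the kind of catalog metadata DCAT standardizes. The dataset URI, title and publisher below are invented for the example, but `dct:title`, `dct:publisher`, `dcat:distribution` and `dcat:downloadURL` are genuine DCAT / Dublin Core terms:

```python
# Namespaces from the DCAT recommendation and Dublin Core terms.
DCAT = "http://www.w3.org/ns/dcat#"
DCT = "http://purl.org/dc/terms/"

def dcat_dataset(uri, title, publisher, download_url):
    """Return N-Triples describing one dataset entry in a DCAT catalog:
    the dataset's title and publisher, plus a distribution carrying the
    actual download link."""
    dist = uri + "/distribution"
    return [
        f'<{uri}> <{DCT}title> "{title}" .',
        f"<{uri}> <{DCT}publisher> <{publisher}> .",
        f"<{uri}> <{DCAT}distribution> <{dist}> .",
        f"<{dist}> <{DCAT}downloadURL> <{download_url}> .",
    ]

triples = dcat_dataset(
    "http://example.org/dataset/spending-2010",  # hypothetical catalog entry
    "Government Spending 2010",
    "http://example.org/gov",
    "http://example.org/files/spending-2010.csv",
)
```

Separating the dataset from its distribution is the design choice that lets one dataset offer several formats (CSV, RDF, API) under a single catalog record.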
... More precisely, on the European Data Portal, which is in charge of collecting (i.e., harvesting) the metadata from national portals in order to improve the accessibility and increase the value of Open Data. ...
Article
In Europe, an open government data ecosystem is being developed. This ecosystem is implemented using various technologies and platforms. In fact, the use of a common metadata standard for describing datasets and open data portals, i.e., the DCAT-AP specification, appears as the lingua franca that connects an otherwise fragmented environment. In this context, the standard-based consolidation of open data promotes the subsidiarity principle, allowing open data portal owners to choose platforms and internal representations based on their specific requirements. However, the portal owners must provide an export with DCAT-AP compliant metadata about the datasets they store. In this paper we provide a detailed study of how the DCAT-AP specification is used in practice, both at the national and the European level. Consequently, we also identify issues, challenges, and opportunities for improvement that can be used as input for the next revision cycle of the standard. Essentially, our goal is to contribute towards the enrichment of a growing and promising European open data ecosystem.
... -Energy Consumption: The use of smart meters and other sensors can help in reducing energy consumption through monitoring use in real-time. For example, an initiative throughout the European Union is currently ongoing with the aim of controlling energy consumption and providing for a more sustainable environment. ...
... This has the potential of creating the Web of Data (also known as the Semantic Web): a huge distributed dataset that aims to replace decentralized and isolated data sources [13]. The benefits of applying Linked Data principles to government data as covered in the literature include [10,18]: ...
Chapter
Full-text available
Governments are one of the largest producers and collectors of data in many different domains and one major aim of open government data initiatives is the release of social and commercial value. Hence, we here explore existing processes of value creation on government data. We identify the dimensions that impact, or are impacted by value creation, and distinguish between the different value creating roles and participating stakeholders. We propose the use of Linked Data as an approach to enhance the value creation process, and provide a Value Creation Assessment Framework to analyse the resulting impact. We also implement the assessment framework to evaluate two government data portals.
... We must rely on algorithms. Thankfully, there are user-friendly tools to help users transform their data more reliably [8]. In this respect, we find Google Refine [8] particularly interesting. ...
Article
Full-text available
It is becoming common to archive research datasets that are not only large but also numerous. In addition, their corresponding metadata and the software required to analyse or display them need to be archived. Yet the manual curation of research data can be difficult and expensive, particularly in very large digital repositories, hence the importance of models and tools for automating digital curation tasks. The automation of these tasks faces three major challenges: (1) research data and data sources are highly heterogeneous, (2) future research needs are difficult to anticipate, (3) data is hard to index. To address these problems, we propose the Extract, Transform and Archive (ETA) model for managing and mechanizing the curation of research data. Specifically, we propose a scalable strategy for addressing the research-data problem, ranging from the extraction of legacy data to its long-term storage. We review some existing solutions and propose novel avenues of research.
... One of the means to reduce the costs is to allow dataset consumers to participate voluntarily in linking OGD. The self-service approach introduced by [28] shifts the burden of interlinking datasets to the data consumers. It encourages them to interlink government datasets without waiting for the government to do so. ...
Article
Full-text available
This paper describes the enhanced LIRE (LInked RElations) architecture for creating relations between datasets available on open government portals. The architecture is improved to be applicable on different open government data platforms using minimal configuration at the Data processing layer. We evaluated the applicability of enhanced LIRE, its advantages and disadvantages, which resulted in necessary recommendations for the publication of datasets' metadata to obtain better associations between datasets. Moreover, we introduced a LINDAT indicator that reflects the percentage of linked data in the total possible number of linked data on open government data portals.
... Together with the characteristics of Open Data, some attributes of city councils were among the most researched in this field to date (e.g. the need for Non-Fragmented Strategies across departments (Cyganiak et al., 2010)). These aspects were further reflected in this research in the need for councils to have an established IT department and a city CIO in place. ...
Conference Paper
City councils produce large amounts of data. As this data becomes available, and as information and communication technology capabilities are in place to manage and exploit this data, open government data is seen as becoming more and more valuable as a catalyst for service innovation and economic growth. Notwithstanding this, evidence of open data adoption is currently largely scattered and anecdotal. This is reflected in the lack of literature focusing on users of open data for commercial purposes. This research aims to address this gap and contributes to the IS open data services debate by proposing a model of factors perceived by an open data services business as the most relevant in explaining adoption of open government data for commercial service innovation in cities. Adopting an inductive reasoning approach through qualitative methods was critical to capture the complexity of the open data services ecosystem perceived by those reusing this data.
... The application of Linked Data principles to government datasets brings enormous potential [3]. However, this potential is currently untapped, mostly because of the lack of resources required to transform raw data into high-quality Linked Data on a large scale [15]. While it is true that Linked Data generation and publication do not follow a set of common and clear guidelines for scaling out, the Methodological Guidelines for Publishing Government Linked Data proposed in [9] established that the process of publishing datasets as Linked Data must have a life cycle, just as every development project in Software Engineering has a life cycle. ...
Article
Full-text available
Scientific publication services are changing drastically: researchers demand intelligent search services to discover and relate scientific publications, and publishers need to incorporate semantic information to better organize their digital assets and make publications more discoverable. In this paper, we present ongoing work to publish a subset of scientific publications of CONICET Digital as Linked Open Data. The objective of this work is to improve the recovery and reuse of data through Semantic Web technologies and Linked Data in the domain of scientific publications. To achieve these goals, Semantic Web standards and reference RDF schemas have been taken into account (Dublin Core, FOAF, VoID, etc.). The conversion and publication process is guided by the methodological guidelines for publishing government linked data. We also outline how these data can be linked to other datasets (DBLP, Wikidata and DBpedia) on the Web of Data. Finally, we show some examples of queries that answer questions that CONICET Digital initially does not allow.
... Using dedicated platforms, data in heterogeneous formats coming from different sources can be integrated for uniform access. Most data published on open data platforms is in the original (raw) format (PDF, Word, etc.), and not related to other data on the same platform (Britain and Treasury 2009). Similarly, Cyganiak et al. (2010) state that "working with this data can still be a challenge, because they are provided in a haphazard way, driven by practicalities within the producing government agency, and not by the needs of the information users". In order to partially bridge this problem, open government platforms enable the adding of meta descriptions to raw data in the form of Who, What, When, Where and similar information, for a much wider context of data processing and definition of information patterns for knowledge extraction. ...
Chapter
Full-text available
Connecting data that government releases in diverse domains, such as economy, statistics, environment, transportation, and medicine, as open data available on the web by using semantic web technologies, leads to linked open government data (LOGD). LOGD supports development of innovative, intelligent applications that improve openness and transparency and deliver a smart environment for smart living. With LOGD, any stakeholder can browse a data source and subsequently navigate to related data sources. In this paper, we present an overview of applied Semantic Web technologies in the open government data domain that increase openness and transparency of government, provide for effective data use and deliver smart services to everyone needing information. We will explain how Semantic Web technologies can contribute to a smart collaborative environment that promotes government data use and helps citizens obtain information they need.
... The authors of [24] encode hierarchical topological relations between geographic entities over traditional spatial queries to link Great Britain datasets even in the absence of explicit geometric information. Other research focuses on the ontology linking level to discover semantic relationships [25] [26]. In [27], issues arising when interlinking the LoD cloud were addressed. ...
Conference Paper
The pressure of opening access to public sector geospatial information traditionally managed within disparate spatial data infrastructures (SDI) is driven by a combination of factors. These factors include the adoption of open data programs and the need to integrate spatial data across sectors and levels of government for specific applications. Informed by the success of the Linked Open Data community, efforts to leverage Linked Data in enabling global access to spatial data currently managed within national and regional SDIs are emerging. However, these early efforts do not provide guidelines for implementing such Linked SDI nor articulate the socio-technical requirements for a successful Linked Geospatial Data strategy. By analyzing existing SDI architectures and emerging Linked SDI requirements, we develop Reference Architecture for building interoperable Linked SDIs.
... This has the potential of creating the Web of Data (also known as the Semantic Web): a huge distributed dataset that aims to replace decentralized and isolated data sources [20]. The benefits of applying Linked Data principles to government data as covered in the literature include [21] [22]: • Simpler data access through a unified data model; • Rich representation of data enabling the documentation of data semantics; • Reuse of existing vocabularies; • Use of URIs allows fine-grained referencing of any information; • Related information is linked, allowing unified access to it. While significant efforts in the literature cover the advantages of using Linked Data (for example [23] [24] [17] [25]), there is no evident effort targeted towards the benefits of using Linked Data specifically in open government data value creation. ...
... The first aspect was, for example, tackled by approaches for semantic lifting of data by [5] and [4], who tried to build general strategies for putting large open government datasets in the Linked Data cloud. For the standardized structuring of metadata, the Data Catalog Vocabulary (DCAT) [3] was developed. However, cross-portal metadata alignment and reconciliation cannot be addressed by DCAT. ...
Article
Full-text available
This paper presents an approach for metadata reconciliation, curation and linking for Open Governmental Data Portals (ODPs). ODPs have lately been the standard solution for governments willing to make their public data available to society. Portal managers use several types of metadata to organize the datasets, one of the most important being tags. However, the tagging process is subject to many problems, such as synonyms, ambiguity and incoherence, among others. As our empirical analysis of ODPs shows, these issues are currently prevalent in most ODPs and effectively hinder the reuse of Open Data. In order to address these problems, we develop and implement an approach for tag reconciliation in Open Data Portals, encompassing local actions related to individual portals, and global actions for adding a semantic metadata layer above individual portals. The local part aims to enhance the quality of tags in a single portal, and the global part is meant to interlink ODPs by establishing relations between tags.
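The "local" tag-cleaning step described above can be sketched in miniature. This is a toy version, not the paper's implementation; the synonym table is hypothetical, and it only covers the simplest cases (case, whitespace and known synonyms), not ambiguity or incoherence:

```python
# Hypothetical synonym table mapping raw portal tags to canonical forms.
SYNONYMS = {"transportation": "transport", "env": "environment"}

def reconcile_tags(tags):
    """Normalize raw portal tags (case, surrounding whitespace) and
    collapse known synonyms into one canonical tag per concept."""
    canonical = set()
    for tag in tags:
        t = tag.strip().lower()
        canonical.add(SYNONYMS.get(t, t))
    return sorted(canonical)
```

After this local pass, the "global" step the paper describes would link the canonical tags across portals, which is where the semantic metadata layer comes in.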
... Currently, OpenRefine at its core does not support conversion from other sources to RDF, but this is achieved by RDF Refine, an extension that incorporates the functionalities of exporting to RDF (Cyganiak et al. 2010) and RDF reconciliation (Maali et al. 2011). The RDF export functionality is based on describing the shape of the generated RDF graph through a template that uses values from the input spreadsheet. ...
Chapter
This chapter provides an overview of the methodologies and technologies that support Linked Data designing and publishing. More specifically, this chapter starts with a presentation of the rationale and a discussion about how data can be opened up (i.e. published under an open license). Basic principles are first introduced regarding the cases in which content can be opened up and also, the most common approaches are presented in accomplishing this. Next, we discuss about how data can be modeled, authored, serialized and stored. In this chapter we also provide an overview of the most common technical solutions and widely used software tools that can serve this purpose. Overall, the chapter aims to provide an analysis of the sub-problems into which the Linked Open Data publishing task is to be broken down, namely opening, modeling, linking, processing, and visualizing content, followed by a presentation of the most representative software solutions.
Book
This book explains the Linked Data domain by adopting a bottom-up approach: it introduces the fundamental Semantic Web technologies and building blocks, which are then combined into methodologies and end-to-end examples for publishing datasets as Linked Data, and use cases that harness scholarly information and sensor data. It presents how Linked Data is used for web-scale data integration, information management and search. Special emphasis is given to the publication of Linked Data from relational databases as well as from real-time sensor data streams. The authors also trace the transformation from the document-based World Wide Web into a Web of Data. Materializing the Web of Linked Data is addressed to researchers and professionals studying software technologies, tools and approaches that drive the Linked Data ecosystem, and the Web in general.
... These records consist of metadata describing a dataset, which represents "a collection of information in a machine-readable format" [5]. Because data catalogs are a central point of access to datasets provided by different public sector bodies, they can become the key enablers of open government policy [6]. In this paper we present the experience we have gained during the cataloging of datasets published by Czech public sector bodies. ...
Article
Over the past few years, a number of public sector data catalogs have come into existence. These data catalogs cover data from different levels of the public sector and represent the results of both volunteer activity and government policy. In this paper we present our experience from cataloging the Czech public sector data using the Comprehensive Knowledge Archive Network (CKAN) software, and based on this experience we propose a generic model of data cataloging tool features. The content of the Czech CKAN is the result of an academic initiative and thus it is an unofficial catalog of datasets provided by Czech public sector bodies. Despite the limited scope of the cataloging activity, it provided valuable information about available datasets provided by the Czech public sector, and the results can help future cataloging initiatives.
... Furthermore, Semantic Web documents can be crawled automatically by following RDF links, and the data obtained in this way can be subjected to more sophisticated search capabilities. The Linked Open Data project, for its part, presents a periodically updated diagram of the datasets published in Linked Data format by the project community and other individuals and organizations. In this diagram a diversity of datasets classified by domain can be observed (Cyganiak et al., 2010). More information about these datasets can be obtained from the catalog maintained by CKAN (Comprehensive Knowledge Archive Network) (Cyganiak, 2011). ...
Article
Full-text available
The Open Access Initiative primarily encourages state agencies to publish their data as soon as possible, and Linked Data technologies help in this direction. This article introduces concepts and tools that can help implement Linked Data projects, as well as some applications in e-Government. Our goal is to provide an overview of this new area and its potential use.
... Technical approaches propose a particular framework for LOGD publishing. In [5] the authors evaluate a framework for collecting, cleaning and converting data to RDF. The publication step is mentioned, but not described in detail. ...
Conference Paper
Full-text available
Up to the present day, much effort has been made to publish government data on the Web. However, such data has been published in different formats. For any particular source and use (e.g. exploration, visualization, integration) of such information, special applications have to be written. This limits the overall usability of the information provided and makes it difficult to access information resources. These limitations can be overcome if the information is provided using a homogeneous data and access model complying with the Linked Data principles. In this paper we showcase how raw Open Government Data (OGD) from heterogeneous sources can be processed, converted, published and used on the Web of Linked Data. In particular we demonstrate our experience in processing OGD in two use cases: the Digital Agenda Scoreboard and the Financial Transparency System of the European Commission.
... Furthermore, this solution does not contain tools for editing metadata nor link to existing ontologies for use in dataset descriptions. A faceted search using Gridworks in combination with dcat was also proposed in [4]. The distributed semantic content creation and publishing approach, using shared metadata schemas, ontology services, and semantic portals for publication, has been originally developed in the semantic portals of the FinnONTO project [15]. ...
Conference Paper
Full-text available
The number of open datasets available on the web is increasing rapidly with the rise of the Linked Open Data (LOD) cloud and various governmental efforts for releasing public data in different formats, not only in RDF. The aim in releasing open datasets is for developers to use them in innovative applications, but the datasets need to be found first, and the metadata available is often minimal, heterogeneous, and distributed, making the search for the right dataset problematic. To address the problem, we present DataFinland, a semantic portal featuring a distributed content creation model and tools for annotating and publishing metadata about LOD and non-RDF datasets on the web. The metadata schema for DataFinland is based on a modified version of the voiD vocabulary for describing linked RDF datasets, and annotations are done using the online metadata editor SAHA connected to ONKI ontology services, providing a controlled set of annotation concepts. The content is published instantly on an integrated faceted search and browsing engine HAKO for human users, and as a SPARQL endpoint and a source file for machines. As a proof of concept, the system has been applied to LOD and Finnish governmental datasets.
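The faceted browsing model described above can be sketched generically. This is not HAKO's code; the dataset records and facet fields are invented, and the sketch only shows the core idea that selecting facet values intersects the matching sets:

```python
# Hypothetical dataset metadata records with two facet fields.
DATASETS = [
    {"title": "Bus timetables", "format": "RDF", "theme": "transport"},
    {"title": "Budget 2010",    "format": "CSV", "theme": "finance"},
    {"title": "Road sensors",   "format": "RDF", "theme": "transport"},
]

def faceted_search(datasets, **facets):
    """Return the datasets matching every selected facet value,
    i.e. the intersection of the per-facet result sets."""
    return [d for d in datasets
            if all(d.get(field) == value for field, value in facets.items())]
```

A real faceted engine would also report, for each remaining facet value, how many results selecting it would leave, so users never navigate into an empty result set.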
Chapter
The World Wide Web Consortium (W3C) as the main standardization body for Web standards has set a particular focus on publishing and integrating Open Data. In this chapter, the authors explain various standards from the W3C's Semantic Web activity and the potential role they play in the context of Open Data: RDF, as a standard data format for publishing and consuming structured information on the Web; the Linked Data principles for interlinking RDF data published across the Web and leveraging a Web of Data; RDFS and OWL to describe vocabularies used in RDF and for describing mappings between such vocabularies. The authors conclude with a review of current deployments of these standards on the Web, particularly within public Open Data initiatives, and discuss potential risks and challenges.
Conference Paper
Information Systems research on Open Data has been primarily focused on its contribution to e-government inquiries, government transparency, and open government. Recently, Open Data has been explored as a catalyser for service innovation as a consequence of big claims around the potential of such initiatives in terms of additional value that can be injected into the worldwide economy. Subsequently, the Open Data Services academic conversation was structured (Lindman et al. 2013a). The research project presented in this paper is an interpretive case study that was carried out to explore the factors that influence the diffusion of Open Data Services. This paper contributes to this debate by providing both a systematic literature review study that structures research efforts available to date in this topic, and an interpretive case study (Walsham, 1995) of a successful initiative that turned several city authorities' raw open datasets into a set of valuable services. Results demonstrate that 12 factors and 56 related variables are the most relevant in the process of diffusion of open data for new service development. Furthermore, this paper demonstrates the suitability of Social Constructionism and interpretive case study research to inductively generate knowledge in this field.
Conference Paper
Information Systems research on Open Data has been primarily focused on its contribution to e-government inquiries, government transparency, and open government. Recently, Open Data has been explored as a catalyser for service innovation as a consequence of big claims around the potential of such initiatives in terms of additional value that can be injected into the worldwide economy. Subsequently, the Open Data Services academic conversation was structured (Lindman et al. 2013a). The research project presented in this paper is an interpretive case study that was carried out to explore the factors that influence the diffusion of Open Data for new service development. This paper contributes to this debate by providing an interpretive inductive case study (Walsham, 1995) of a tourism company that successfully turned several city authorities' raw open datasets into a set of valuable services. Results demonstrate that 16 factors and 68 related variables are the most relevant in the process of diffusion of open data for new service development. Furthermore, this paper demonstrates the suitability of Social Constructionism and interpretive case study research to inductively generate knowledge in this field.
Book
Linking Government Data provides a practical approach to addressing common information management issues. The approaches taken are based on international standards of the World Wide Web Consortium. Linking Government Data gives both the costs and benefits of using linked data techniques with government data; describes how agencies can fulfill their missions with less cost; and recommends how intra-agency culture must change to allow public presentation of linked data. Case studies from early adopters of linked data approaches in international governments are presented in the last section of the book. Linking Government Data is designed as a professional book for those working in Semantic Web research and standards development, and for early adopters of Semantic Web standards and techniques. Enterprise architects, project managers and application developers in commercial, not-for-profit and government organizations concerned with scalability, flexibility and robustness of information management systems will also find this book valuable. Students focused on computer science and business management will also find value in this book.
Thesis
Full-text available
The extensive publishing of data in open formats on the Web seems to be an irreversible tendency. Regarding governments, demands for more transparency from civil society are pushing public administrations to publish government data through Open Data Portals (ODPs). The expected result is greater transparency of public administrations, which in turn feeds democracy with a well-informed population and inhibits the misuse of public resources by making open scrutiny possible. Alongside the great expectations created by open data policies, a wide range of problems still hinders more effective growth of open data initiatives. During the research related to this thesis, two problems drew our attention: (i) the lack of adequate descriptors for open datasets, and (ii) the difficulties of the general public in dealing with open data. This thesis therefore aims to contribute to the field of open data by proposing an approach to these problems. Several studies attest that even when open data are published, an empowered society is needed to make use of them. Otherwise, there is a risk of creating an elite able to profit from this information, deepening the digital divide even further, especially in countries like Brazil. To tackle this matter, we present in this thesis an approach to data literacy, inspired by the pedagogy of popular education and by participatory action research. Applying this approach in a field study revealed that poor-quality descriptions of open datasets are one of the factors hindering the advance of open data. ODP managers use several types of metadata to describe datasets, one of the most important being tags. However, the tagging process is subject to many problems, such as synonyms, ambiguity, and incoherence, among others.
As our empirical analysis of ODPs shows, these issues are currently prevalent in most ODPs and effectively hinder the reuse of Open Data. To address them, we developed and implemented the Semantic Tags for Open Data Portals (STODaP) approach for metadata clean-up, enrichment, and reconciliation in ODPs. The STODaP approach was evaluated, and the results show that it enables participants to find open datasets faster and more precisely than other search methods. We expect this thesis to contribute to the democratisation of information by contextualizing the publication of open data more adequately and allowing its use by a broader part of the population.
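The abstract does not detail STODaP's reconciliation mechanics. As a rough illustration of the kind of tag clean-up it targets (synonyms, accents, inconsistent casing), here is a minimal sketch; the synonym table and normalization rules are assumptions for illustration, not the thesis's actual method:

```python
from collections import defaultdict
import unicodedata

# Hypothetical synonym table mapping normalized tag variants to a canonical
# concept; a real portal would derive this from a thesaurus or ontology.
SYNONYMS = {
    "health": "healthcare",
    "saude": "healthcare",   # Portuguese "saúde", after accent stripping
    "transport": "transportation",
    "transit": "transportation",
}

def normalize(tag):
    """Lowercase, trim, and strip accents from a raw tag."""
    tag = tag.strip().lower()
    tag = unicodedata.normalize("NFKD", tag)
    return "".join(c for c in tag if not unicodedata.combining(c))

def reconcile(raw_tags):
    """Group raw tags under canonical concepts."""
    groups = defaultdict(set)
    for raw in raw_tags:
        norm = normalize(raw)
        canonical = SYNONYMS.get(norm, norm)  # fall back to the normalized form
        groups[canonical].add(raw)
    return dict(groups)

print(reconcile(["Health", "Saúde ", "transit", "Transport", "schools"]))
```

Grouping variants under one concept is what lets a search for "healthcare" also surface datasets tagged "Saúde" or "Health".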
Chapter
After a fierce presidential election campaign in 2014, the re-elected president Dilma Rousseff became the target of protests in 2015 calling for her impeachment. This sentiment of dissatisfaction was fomented by the tight result between the two leading candidates and by accusations of corruption in the media. Two main protests in March, two days apart, were organized and widely reported through social networks such as Twitter: one pro-government and the other against it. In this work, we apply two supervised learning algorithms to automatically classify tweets during the protests and perform an exploratory analysis to gain insight into their internal divisions and dynamics. Furthermore, we identify slightly different behavior on the two sides: while pro-government users criticized the opposing arguments before the event, the anti-government group produced attacks at different times, in response to government supporters.
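The abstract does not name its two supervised learning algorithms. As a self-contained illustration of supervised tweet classification in general, here is a toy multinomial Naive Bayes sketch; the training examples and labels are invented and the approach is not claimed to be the paper's:

```python
import math
from collections import Counter, defaultdict

# Toy labeled "tweets" standing in for the study's data; invented for illustration.
TRAIN = [
    ("fora dilma impeachment agora", "anti"),
    ("impeachment ja fora corrupcao", "anti"),
    ("nao vai ter golpe", "pro"),
    ("defenda a democracia contra o golpe", "pro"),
]

class NaiveBayes:
    """Multinomial Naive Bayes with Laplace (add-one) smoothing."""

    def fit(self, samples):
        self.class_counts = Counter()
        self.word_counts = defaultdict(Counter)
        self.vocab = set()
        for text, label in samples:
            self.class_counts[label] += 1
            for word in text.split():
                self.word_counts[label][word] += 1
                self.vocab.add(word)
        return self

    def predict(self, text):
        total = sum(self.class_counts.values())
        best_label, best_logprob = None, float("-inf")
        for label, count in self.class_counts.items():
            logprob = math.log(count / total)  # log of the class prior
            n_words = sum(self.word_counts[label].values())
            for word in text.split():
                logprob += math.log(
                    (self.word_counts[label][word] + 1) / (n_words + len(self.vocab))
                )
            if logprob > best_logprob:
                best_label, best_logprob = label, logprob
        return best_label

clf = NaiveBayes().fit(TRAIN)
print(clf.predict("fora dilma"))    # anti
print(clf.predict("nao ao golpe"))  # pro
```

Naive Bayes is a common baseline for this task; studies like the one above would typically compare it against a second classifier and richer features.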
Conference Paper
The aim of this paper is to analyze an implementation of the public data agenda, addressing the lack of empirical research on the subject. The focus of the paper is on the interplay between policy, process, and people. The approach was qualitative, interpretive research, and data was gathered through interaction, interviews, and observations over a period of 20 months. Findings showed that the policies are somewhat opportunistic and that it is not clear what data should be made available to attract citizens to take part in the agenda: raw data or processed data? Furthermore, the incentives for citizens to engage in the public data agenda were not obvious. I therefore wonder: do we believe too much in information? Are we being information determinists?
Conference Paper
Open Government Data (OGD) is seen as a key factor in Open Government initiatives. However, government data is often scattered across various government websites, which makes it difficult to find. OGD catalogs serve as a single point of access to open government datasets and thus support the discovery and use of OGD. In this paper we define the term Open Government Data and present current OGD activities in the Czech Republic. A number of OGD catalogs have been established over the past years, but recent experience shows that the quality of catalog records affects the ability of users to locate the data of interest. We therefore discuss the quality of catalog records and propose relevant techniques for improving it. In addition to the academic perspective, the authors reflect on the experience they gained as co-authors of the open data cataloging strategy of the Czech public administration.
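One simple, mechanical check of catalog record quality is completeness against a set of required fields. The sketch below assumes flat dictionary records with field names loosely inspired by DCAT properties (dct:title, dct:description, dcat:keyword, dcat:distribution); the exact checks are an assumption, not the techniques proposed in the paper:

```python
# Required metadata fields for a catalog record; the names are illustrative,
# loosely following DCAT, not an actual portal schema.
REQUIRED = ("title", "description", "keywords", "distribution_url")

def record_quality(record):
    """Return the missing/empty required fields and a 0..1 completeness score."""
    missing = [field for field in REQUIRED if not record.get(field)]
    score = 1 - len(missing) / len(REQUIRED)
    return missing, score

record = {
    "title": "Air quality measurements 2012",
    "description": "",  # present but empty, so it counts as missing
    "keywords": ["environment", "air"],
    "distribution_url": "http://example.org/air.csv",
}
missing, score = record_quality(record)
print(missing, score)  # ['description'] 0.75
```

Scores like this make it easy to rank a portal's records and flag the ones most in need of curation.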
Article
Full-text available
The number of open datasets available on the web is increasing rapidly with the rise of the Linked Open Data (LOD) cloud and various governmental efforts to release public data in various formats, not only in RDF. However, the metadata available for these datasets is often minimal, heterogeneous, and distributed, which makes finding a suitable dataset for a given need problematic. Governmental open datasets are often the basis of innovative applications, but developers first need to be able to find them. To address this problem, we present a distributed content creation model and tools for annotating and publishing metadata about linked data and non-RDF datasets on the web. The system, DATAFINLAND, is based on a modified version of the VoiD vocabulary for describing linked RDF datasets, and uses the online metadata editor SAHA3, connected to ONKI ontology services, to annotate content semantically. The resulting metadata can be published instantly in the integrated faceted search and browsing engine HAKO for human users, as a SPARQL endpoint for machine use, and as a source file. As a proof of concept, the system has been applied to LOD and Finnish governmental datasets.
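Faceted search of the kind HAKO offers boils down to counting facet values and filtering records on selected ones. Here is a minimal sketch over VoiD-style dataset descriptions; the records, field names, and values are invented, and this is not the DATAFINLAND implementation:

```python
from collections import Counter

# Invented dataset descriptions standing in for VoiD-style metadata records.
DATASETS = [
    {"title": "Finnish municipalities", "format": "RDF", "publisher": "gov"},
    {"title": "Traffic counts", "format": "CSV", "publisher": "gov"},
    {"title": "DBpedia extract", "format": "RDF", "publisher": "community"},
]

def facet_counts(datasets, facet):
    """Count how many datasets carry each value of a facet."""
    return dict(Counter(d[facet] for d in datasets))

def filter_by(datasets, **facets):
    """Keep only datasets matching every selected facet value."""
    return [d for d in datasets if all(d.get(k) == v for k, v in facets.items())]

print(facet_counts(DATASETS, "format"))  # {'RDF': 2, 'CSV': 1}
print([d["title"] for d in filter_by(DATASETS, format="RDF", publisher="gov")])
```

The counts drive the facet sidebar a user sees; selecting a value narrows the result list, and the counts are recomputed over the filtered set.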
Conference Paper
Full-text available
Recent activities of governments around the world regarding the publication of open government data on the Web have re-introduced the Open Data concept. Open Data represents the idea that certain data should be freely available to the public, i.e. the citizens, for use, reuse, republishing, and redistributing, with few or no restrictions. The goal is to make non-personal data open so that it can be used to build useful applications that leverage its value, allow insight, provide access to government services, and support transparency. Such data can contribute to the overall development of society, both by boosting the ICT business sector and by giving citizens deeper insight into the work of their government. This recent rise in interest in Open Data has introduced the need for efficient mechanisms that enable easy publishing, management, and consumption of such data. We therefore developed an Open Data Portal using Semantic Web technologies. It allows users to publish, manage, and consume data in machine-readable formats, interlink their data with data published elsewhere on the Web, publish applications built on top of the data, and interact with other users.
Chapter
Full-text available
Publishing Government Linked Data (and Linked Data in general) is a process that involves a large number of steps, design decisions, and technologies. Although some initial guidelines have already been provided by Linked Data publishers, these are still far from covering all the necessary steps (from data source selection to publication) or giving enough detail about the steps, technologies, intermediate products, etc. In this chapter we propose a set of methodological guidelines for the activities involved in this process. These guidelines are the result of our experience producing Linked Data in several governmental contexts. We validate them with the GeoLinkedData and AEMETLinkedData use cases.