A study on actions to make government datasets available in linked open data

Knowledge Organization and Cultural Diversity
Organizers
José Augusto Chaves Guimarães
Vera Dodebei
Knowledge Organization and Cultural Diversity
Marília, São Paulo, Brasil
Sociedade Brasileira de Organização do Conhecimento
(ISKO-Brasil)
2017
K73k Knowledge Organization and Cultural Diversity [recurso
eletrônico] / José Augusto Chaves Guimarães, Vera Dodebei,
organizadores. -- Pernambuco: ISKO-Brasil ; UFPE, 2017.
760 f. ; 30 cm.
ISBN: 978-85-415-0924-4
Livro digital
1. Organização do conhecimento. I. Título.
CDD 025.4
Scientific Committee
Vera Lúcia D. Louzada de Mattos Dodebei (UNIRIO)
Gercina Ângela Borém de Oliveira Lima (UFMG)
Marisa Bräscher Basílio Medeiros (UFSC)
Maria Aparecida Moura (UFMG)
Fabio Assis Pinho (UFPE)
Editors
Isadora Victorino Evangelista
Gilberto Gomes Cândido
Rafael Aparecido Moron Semidão
Rafael Cacciolari Dalessandro
Suellen Oliveira Milani
Design
Maíra Fernandes Alencar
Translation
Natália Nakano
© Reproduction of this book in whole or in part is permitted provided that the claims mentioned.
The sale is prohibited.
SUMMARY
THE EPISTEMOLOGIC DIMENSION OF KNOWLEDGE ORGANIZATION
Approaches and paradigms in knowledge organization…………………………..30
Leila Cristina Weiss e Marisa Bräscher (UFSC)
Historical-epistemological dimension in Knowledge Organization: Gesner's
taxonomy contributions, 16th century………………………………………………37
Andre Vieira de Freitas Araujo e Giulia Crippa (USP)
Terminological and interdisciplinary research configurations in knowledge
organization (2010-2014)………………………………………………………………47
Leilah Santiago Bufrem, Aline Elis Arboit e Juliana Lazzarotto Freitas (UNESP)
The conceptual dimension of Knowledge Organization in NASKO conferences:
Bardian's content analysis......................................................................................55
José Augusto Chaves Guimarães (UNESP); Rodrigo de Sales (UFF), André Ynada
dos Santos (UNESP) e Daniela Fernanda de Oliveira Matos (UNESP)
The relation between knowledge organization and information science in the
Brazilian scientific community: an investigation within ISKO-Brazil…………..73
Rodrigo de Sales (UFF)
Poole, index and fractures: indexing and serial publications in the United
States of the nineteenth century……………………………………………………….85
Gustavo Silva Saldanha (IBICT/Unirio) e Naira Christofoletti Silveira (Unirio)
Processes of archival knowledge representation: historical and conceptual
elements for classification and description………………………………………….94
Natália Bolfarini Tognoli (UNESP) e Thiago Henrique Bragato Barros (UFPA)
Ontologies and Simple Knowledge Organization Systems (SKOS): similarities
and differences…………………………………………………………………………..101
Rogério Aparecido Sá Ramalho (UFSCAR)
Academic genealogy as an approach to domain analysis……………………....108
Renata Cristina Gutierres Castanha e Maria Cláudia Cabrini Grácio (UNESP)
The epistemological dimension of Document Analysis of fictional content in
Knowledge Organization……………………………………………………………….116
Roberta Caroline Vesú Alves e João Batista Ernesto de Moraes (UNESP)
Indexing languages and domain analysis...........................................................122
Vera Lucia Ribeiro Guim e Mariângela Spotti Lopes Fujita (UNESP)
Reader-Interest classification as a method of stock control: the McClellan
Legacy……………………………………………………………………………………..131
Daniel Martínez-Ávila (UNESP) e M. P. Satija (Guru Nanak Dev University)
Terminology, exhaustivity and specificity: a conceptual relation……………140
Isadora Victorino Evangelista, Walter Moreira e João Batista Ernesto de Moraes
(UNESP)
Aspects of subject metadata in cataloging photographs………………………151
Ana Carolina Simionato (UFSCar)
Documentary languages and audiovisual resources treatment: theoretical and
methodological interfaces……………………………………………………………..160
Francisnaira Cristina Ravazzi (Diretoria de Ensino) e Walter Moreira (UNESP)
Knowledge Organization and Information Design: a convergence study…….170
Natalia Nakano, Mariana Cantisani Padua e Maria José Vicentini Jorente (UNESP)
Domain Analysis of the first ISKO-Brazil chapter editions………………………180
Exequiel Fontans e Mario Barité (Universidad de la Republica)
Teaching in knowledge organization and representation: languages and
standards………………………………………………………………………………….196
Cibele Araújo Marques dos Santos, Eduardo de Abreu de Jesus e João Ricardo de
Luca (USP)
Genealogy of the concept of Information Science in Brazil: a discursive
analysis from founding journals in the area………………………………………..206
Larissa de Melo Lima, João Batista Ernesto de Moraes e Daniel Martínez-Ávila
(UNESP)
THE APPLY DIMENSION OF KNOWLEDGE ORGANIZATION
Semantic Web Technologies applied to Knowledge Organization: SKOS for the
construction and use of decentralized controlled vocabularies………………..220
José Eduardo Santarem Segundo (USP) e Caio Saraiva Coneglian (UNESP)
Preliminary analysis on the conversion of classification plans into controlled
vocabularies……………………………………………………………………………229
Luciana Davanzo e Walter Moreira (UNESP)
Thesaurus Reengineering: the Thesagro case……………………………………237
Benildes Maculan (UFMG), Gercina Lima (UFMG), Ivo Pierozzi Jr. (Embrapa) e
Leandro Oliveira (Embrapa)
Conceptual Model of Organizational Competitive Intelligence.........................246
Thiciane Mary Carvalho Teixeira (UECE) e Marta Lígia Pomim Valentim (Unesp)
Vocabulary control in electronic scientific journals...........................................256
José Carlos Francisco dos Santos e Brígida Maria Nogueira Cervantes (UEL)
Documentary Languages in archival description...............................................266
Maria de Fátima Santos de Lima e Francisco Aragão Pedroza da Cunha (UFBA)
Research in knowledge organization systems in DCMI conferences………….277
Felipe Augusto Arakaki, Plácida Leopoldina Ventura Amorim da Costa Santos e
Rachel Cristina Vesú Alves (UNESP)
Conceptual models and descriptive representation of information……………288
Elisabete Gonçalves de Souza e Wellington Freire Cunha Costa (UFF)
Nanopublishing Modeling: an experimental approach.......................................296
Lorena Tavares de Paula e Maria Aparecida Moura (UFMG)
Application of ontologies in the information retrieval process applied in
academic settings……………………………………………………………………….310
Caio Saraiva Coneglian (UNESP), Elvis Fusco (Univem) e José Eduardo Santarém
Segundo (USP)
Librarian performance in the subject analysis of theses within the theoretical
dimensions of subject cataloging and indexing…………………………………322
Roberta Cristina Dal’Evedove Tartarotti, Paula Regina Dal’Evedove e Mariângela
Spotti Lopes Fujita (UNESP)
The contribution of archival identification to knowledge organization in
personal archives………………………………………………………………………..331
Gabrieli Aparecida da Fonseca e Sonia Maria Troitiño Rodriguez (UNESP)
Document analysis and the generative trajectory of meaning in the
representation of the archival document……………………………………………337
Gilberto Gomes Cândido (UNESP), João Batista Ernesto de Moraes (UNESP) e
Deise Sabbag (USP)
Contribution of the Generative Trajectory of Meaning for documentary reading
of narrative texts of fiction…………………………………………………………….347
Deise Sabbag (USP) e João Batista Ernesto de Moraes (UNESP)
Vocabulary control and its contribution to research: the case of agrarian
sciences in Uruguay…………………………………………………………………….365
Lucía Simón e Mario Barité (Universidad de la República)
Rural cultural goods: an experience report of the taxonomy construction in the
context of the historical farms of São Paulo.......................................................373
Mayara Cristina Bernardino (UFSCar), Luciana de Souza Gracioso (UFSCar), Maria
da Graça Melo Simões (Universidade de Coimbra) e Luzia Sigoli Fernandes Costa
(UFSCar)
Harmonization of CIDOC CRM ontology in the context of archives, libraries and
museums………………………………………………………………………………….385
Laís Barbudo Carrasco (UNESP), Silvana Aparecida Borsetti Gregório Vidotti
(UNESP) e Phil Manfred Thaller (Universität zu Köln)
Cataloging of interactive resources: analysis of recommendations and
practices in international catalogs…………………………………………………395
Víctor Amor, Daniel Martínez-Ávila (UNESP) e Rosa San Segundo (Universidad
Carlos III)
Metadata structure for an image base in pathology………………………………404
Elisabete Gonçalves Souza, Jóice Cleide Cardoso Ennes de Souza e Elan Cardozo
Paes de Almeida (UFF)
Indexing through images: accessibility via OPACs………………………………416
Rita de Cássia do Vale Caribé e Marcílio de Brito (UnB)
Interactive conceptual maps as a didactic instrument in Information and
Knowledge Organization……………………………………………………………….435
Marilda Lopes Ginez de Lara, Gabriela Previdello e Nair Yumiko Kobashi (USP)
Domain analysis in Knowledge Organization: exploring thematic and citation
relations…………………………………………………………………………………446
Bruno Henrique Alves, Ely Francina Tannuri de Oliveira e Maria Cláudia Cabrini
Grácio (UNESP)
A study on domain in knowledge organization through author citation and co-
citation analysis…………………………………………………………………………455
Lidyane Silva Lima, Pollyana Ágata Gomes da Rocha Custódia, Ely Francina Tannuri
de Oliveira e Leilah Santiago Bufrém (UNESP)
Classification approaches of the papers submitted to ENANCIB WG2:
professional classification and non-professional classification……………….463
Walter Moreira e Isabela Santana de Moraes (UNESP)
Information behavior within ISKO - International Society For Knowledge
Organization: a study for the 2004-2014 period……………………………………473
Marli Vitor da Silva e Helen de Castro Silva Casarin (UNESP)
Subject representation in Archival Science: an analysis of national and
international scientific articles………………………………………………………..481
Graziela Martins de Medeiros, Leolibia Luana Linden, Luciane Paula Vital e Marisa
Bräscher (UFSC)
Analysis of methodology in bibliometric studies: a proposal for context
indicators………………………………………………………………………………….490
Maria Guiomar da Cunha Frota e Ana Cláudia Ribeiro (UFMG)
Knowledge Organization and Representation: contribution to metric studies
………………………………………………………………………………………………501
Cibele Araújo Marques dos Santos (USP)
Relativizing indexes h and g: a study applied to the domain of metric studies
………………………………………………………………………..…………………….508
Deise Deolindo Silva e Maria Cláudia Cabrini Grácio (UNESP)
Domain Analysis in knowledge organization and representation in Archival
Science…………………………………………………………………………………515
Cynthia Maria Kiyonaga Suenaga (UNESP) e Brígida Maria Nogueira Cervantes
(UEL)
The FamilySearch Indexing as a crowdsourcing initiative in the context of
knowledge organization…………………………………………………………....526
Paula Carina de Araújo e José Augusto Chaves Guimarães (UNESP)
A study on actions to make government datasets available in linked open
data…………………………………………………………………………………….533
Fernando de Assis Rodrigues e Ricardo César Gonçalves Santana (UNESP)
Knowledge Organization and industrial heritage in São Paulo: the project
Eletromemória..................................................................................................544
Vânia Mara Alves Lima (USP), Marcia Cristina e Carvalho Pazin Vitoriano (UNESP)
e Cristina Hilsdorf Barbanti (USP)
THE SOCIAL, CULTURAL AND POLITICAL DIMENSION OF
KNOWLEDGE ORGANIZATION
Reflections on the development of a methodology for subject analysis……..585
Paula Regina Dal’Evedove (UFPE), Roberta Cristina Dal’Evedove Tartarotti
(UNESP) e Mariângela Spotti Lopes Fujita (UNESP)
Information organization, representation, retrieval and access:
(re)configuration of MARC21 Format and BIBFRAME for cultural diversity in
digital information environments?.......................................................................592
Zaira Regina Zafalon e Marcela Cristina Nespoli (UFSCar)
Music, literature and audiovisual: the contributions of knowledge organization
(KO) in the intersectional relations between the works by Dorival Caymmi and
Jorge Amado……………………………………………………………………………..598
Fabio Assis Pinho (UFPE), Francisco Arrais Nascimento (UFPE) e Andréa Carla
Melo Marinho (UFRGS)
Mediation in Knowledge Organization Domain...................................................605
Mona Cleide Quirino da Silva Farias, Carlos Cândido de Almeida e Daniel
Martínez-Ávila (UNESP)
The value of information and language in consumer society…………………617
Luciana de Souza Gracioso (UFSCar)
Academic faculty formation in Archival Description……………………………624
Laura Maria do Rego, José Augusto Chaves Guimarães e Natália Bolfarini Tognoli
(UNESP)
PROSPECTS IN KNOWLEDGE ORGANIZATION RESEARCH IN BRAZIL
Knowledge Organization and the core of Information Science: tales of big data,
computing clouds and social networks……………………………………………634
Renato Rocha Souza (FGV), Mauricio Barcellos Almeida (UFMG) e Renata Abrantes
Baracho Porto (UFMG)
Knowledge Organization: research and developments………………………….644
Gercina Ângela Borém de Oliveira Lima (UFMG)
ISKO-Brazil and the research groups in knowledge organization……………..663
Evelyn Goyannes Dill Orrico (UNIRIO)
Research perspectives on knowledge organization: reflections from PPGCI
IBICT-UFRJ academic papers………………………………………………………..671
Rosali Fernandez de Souza (IBICT)
Search perspectives on knowledge organization in Brazil………………………683
Leilah Santiago Bufrem (UNESP/UFPE)
RESEARCH IN DOCUMENTAL ANALYSIS IN BRAZIL: THE JEAN-CLAUDE
GARDIN INFLUENCE
The search for efficiency in information and knowledge representation – later
developments in Gardin’s thought…………………………………………………700
Johanna W. Smit (USP)
Jean Claude Gardin and Document Analysis: trajectory of a semiology of
representation…………………………………………………………………………710
Maria de Fátima Gonçalves Moreira Tálamo e Giovana Deliberali Maimone (USP)
Documentary Language from J.-C. Gardin's viewpoint…………………………..721
Marilda L. G de Lara (USP)
From Document Analysis to terminology: theoretical and methodological
trajectory…………………………………………………………………………………730
Vania M. A. Lima (USP)
Ordering Documents: underpinnings and relations with bibliographic
classification……………………………………………………………………………737
Cristina D. Ortega (UFMG)
A study on actions to make government datasets available in linked
open data
Fernando de Assis Rodrigues
São Paulo State University
fernando@elleth.org
Ricardo César Gonçalves Sant’Ana
São Paulo State University
ricardosantana@marilia.unesp.br
Introduction
The principles of Linked Open Data (LOD) establish a new way of sharing open
datasets over the Internet, aiming to promote the wide distribution of data structured
in languages such as eXtensible Markup Language (XML) and in compliance
with the recommendations of the Resource Description Framework (RDF)
(BERNERS-LEE, 2009; BIZER; HEATH; BERNERS-LEE, 2009; HEATH, 2015; W3C, 2014, 2015).
In this scenario, government datasets play a prominent role: they represent
18.58% of the total number of existing LOD datasets and 41.54% of these government
datasets have at least one relationship with ontologies or controlled vocabularies,
according to the results of the mapping developed by the Linking Open Data cloud diagram
(SCHMACHTENBERG; BIZER; PAULHEIM, 2014a, 2014b).
However, according to Schmachtenberg, Bizer, and Paulheim (2014a), the
structures of LOD datasets still show, at the moment of data retrieval, characteristics
that are not considered ideal and do not follow good practices, such as the absence of
metadata and license information.
Actions to make public government data accessible are an integral part of
discussions on trends in the modernization of public administration models, which seek
to redistribute skills and resources among different intra-governmental and extra-
governmental organizations, allowing greater institutional pluralism in public functions
(MALIN, 2006; SANT'ANA; RODRIGUES, 2013).
The strengthening of transparency actions can be expanded by building
information sharing environments that, among other characteristics, provide an
increase in information flows between public administration and society, thus ensuring
greater visibility of State activities (BOHMAN, 2000; MARCONDES; JARDIM,
2003). These environments become components of greater citizen participation,
extending possibilities of participation beyond voting; and the State can improve the
effectiveness and monitoring of the activities and results of its actions, in addition to
complying with the obligation to publish government data (BRASIL, 2011; SANT'ANA;
RODRIGUES, 2013).
Access to government datasets on the results of legislative votes is important
for monitoring the activities of representatives, supporting analyses
such as "[...] the identification of party clusters" and "[...] consistency of each of our
representatives in the voting during their mandates" (SANT'ANA; RODRIGUES, 2013,
p. 58).
The objective of this paper is to explore the actions needed to provide
government datasets as Linked Open Data, by applying the model of
recommendations for data publication "Linked Data Best Practices in Different Topical
Domains", proposed by Schmachtenberg, Bizer, and Paulheim (2014a), to the datasets
available on legislative votes of the Brazilian Senate.
The research object was limited to the datasets available in the Information and
Communication Technology (ICT) tools of the Brazilian Senate, more precisely the
votes published in the 'Portal e-Cidadania - Open Data', analyzed between January and
March 2015.
Methodological procedures
The methodology adopted was an exploratory analysis of the research object, with
a qualitative approach, through the specification of the characteristics of the existing
dataset (i.e. the location of the resource on the website, information about the
descriptive page, and the available files) and of the data structures found at the time
of data collection.
These characteristics formed a set of information that served as the basis for
proposing a strategy of actions needed to restructure the existing data in
compliance with the recommendations and good practices for making LOD
datasets available, proposed by Schmachtenberg, Bizer, and Paulheim (2014a).
Theoretical Background
Schmachtenberg, Bizer, and Paulheim (2014a, 2014b) propose a model with
recommendations for data publication, with the objective of identifying how far public
datasets from various domains comply with LOD concepts and good practices for data
sharing. These recommendations were elaborated from community practices
and the results presented by the LOD dataset mapping developed by the Linking Open
Data cloud diagram (JENTZSCH; CYGANIAK; BIZER, 2011).
The model is divided into nine recommendations:
Providing Provenance Information
In the process of data retrieval, datasets need unique identifiers that support
retrieval by external agents, in compliance with the first
principle of LOD (BIZER; HEATH; BERNERS-LEE, 2009). These unique identifiers
must conform to the rules established by the Uniform Resource Identifier (URI) standard and the
RDF (JENTZSCH; CYGANIAK; BIZER, 2011; SCHMACHTENBERG; BIZER;
PAULHEIM, 2014a, 2014b).
Defining links with other datasets
The dataset needs to have links to other datasets through the relationship rules
established by the RDF. This procedure facilitates automated data collection by
external agents, including from the other datasets to which it is linked (JENTZSCH;
CYGANIAK; BIZER, 2011; SCHMACHTENBERG; BIZER; PAULHEIM, 2014a,
2014b).
Use of controlled vocabularies and existing ontologies
Data is a basic element "[...] formed by a sign or finite set of signs that do not
contain, intrinsically, a semantic component, but only syntactic elements" (SANTOS;
SANT'ANA, 2002, np). It is therefore necessary to use controlled vocabularies and
ontologies to extend the semantic load at the moment of data collection by external
agents, such as the Dublin Core Metadata Element Set (DC), Friend of a Friend (FOAF),
and Simple Knowledge Organization System (SKOS), among others (JENTZSCH; CYGANIAK;
BIZER, 2011; SCHMACHTENBERG; BIZER; PAULHEIM, 2014a, 2014b).
Definitions of terms, elements, and attributes in vocabularies and ontologies
The additional documents linked to the dataset, containing information about
ontologies and controlled vocabularies, must have unique URIs for terms, elements,
and attributes (SCHMACHTENBERG; BIZER; PAULHEIM, 2014a, 2014b). For
example, in FOAF the definitions that describe terms such as 'name' or 'birthday' must
be accessible through unique URIs, using either a slash ('/') or a hash ('#') to
differentiate access to each term (BRICKLEY; MILLER, 2014).
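The difference between the two URI styles can be illustrated with a short Python sketch. The FOAF and SKOS namespaces used are the published ones; the helper names `term_style` and `document_uri` are only illustrative and are not part of any vocabulary specification.

```python
from urllib.parse import urldefrag

# Published term URIs: FOAF uses a slash namespace, SKOS a hash namespace.
FOAF_NAME = "http://xmlns.com/foaf/0.1/name"
SKOS_PREF_LABEL = "http://www.w3.org/2004/02/skos/core#prefLabel"


def term_style(term_uri: str) -> str:
    """Classify a term URI as 'hash' or 'slash' (illustrative helper)."""
    _, fragment = urldefrag(term_uri)
    return "hash" if fragment else "slash"


def document_uri(term_uri: str) -> str:
    """Return the URI without its fragment, i.e. the part sent in an HTTP request."""
    return urldefrag(term_uri)[0]


if __name__ == "__main__":
    for uri in (FOAF_NAME, SKOS_PREF_LABEL):
        print(term_style(uri), document_uri(uri))
```

With hash URIs, all terms share one document and the fragment selects the term on the client side; with slash URIs, each term can be requested separately.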
Linking terms between vocabularies
When it is necessary to develop a new vocabulary, it is important that
its terms are linked to terms of existing vocabularies, such as DC, FOAF, and
SKOS, among others. Linking a new vocabulary to comprehensive
vocabularies gives external agents a richer repertoire of information on the terms
developed for it (JENTZSCH; CYGANIAK; BIZER, 2011;
SCHMACHTENBERG; BIZER; PAULHEIM, 2014a, 2014b).
Providing metadata
At the time of data retrieval, datasets must have metadata elements to ensure
the quality of the retrieval process and to identify the data source
(JENTZSCH; CYGANIAK; BIZER, 2011). Metadata "[...] is a key factor to minimize
search and retrieval problems in the various informational environments [...]"
(SANTOS; ALVES, 2009), and it is recommended that metadata elements be available in
the root element and that DC elements be used (JENTZSCH; CYGANIAK; BIZER, 2011;
SCHMACHTENBERG; BIZER; PAULHEIM, 2014a, 2014b).
Use of license terms in metadata
The dataset metadata must contain license terms in its elements and
attributes, such as Creative Commons, Open Data Commons Attribution License, and
Open Database License (ODbL), among others (JENTZSCH; CYGANIAK; BIZER,
2011; W3C, 2011).
Providing metadata about the dataset structure
In dataset retrieval, there must be metadata containing information about the dataset
structure - made available with the data or in supplementary documents - delimiting
elements, iterations, terms used, and attributes (JENTZSCH; CYGANIAK; BIZER, 2011;
SCHMACHTENBERG; BIZER; PAULHEIM, 2014a, 2014b).
Use of alternative methods for data retrieval
The most common way of providing structured datasets in RDF is through
a SPARQL Protocol and RDF Query Language endpoint (SPARQL endpoint)
(JENTZSCH; CYGANIAK; BIZER, 2011), which enables external agents to perform
structured searches in the SPARQL query language. However, it is recommended that
dump files (RDF dumps) also be made available, expressed in the RDF/XML format or
an equivalent (SEMANTICWEB.ORG, 2011).
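As a minimal sketch of what a structured search against a SPARQL endpoint looks like, the Python snippet below composes the request URL for a query sent over HTTP GET, where the SPARQL Protocol carries the query text in the `query` parameter. The endpoint address is hypothetical: no SPARQL endpoint is named for the data discussed here.

```python
from urllib.parse import urlencode

# Hypothetical endpoint, used only for illustration.
ENDPOINT = "http://example.org/sparql"

# A generic query that lists ten arbitrary triples.
QUERY = """
SELECT ?s ?p ?o
WHERE { ?s ?p ?o }
LIMIT 10
"""


def sparql_get_url(endpoint: str, query: str) -> str:
    """Build a SPARQL Protocol GET URL; the query text goes in the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query.strip()})


if __name__ == "__main__":
    print(sparql_get_url(ENDPOINT, QUERY))
```

An RDF dump, by contrast, requires no query at all: the external agent downloads the whole file and processes it locally.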
Dataset Characteristics
The Portal e-Cidadania aims to promote transparency of the actions and activities of the
Brazilian Senate, through access to government data (BRASIL, 2015a). In January
2015, the website had forty-five datasets, divided into eight groups: ‘Projetos e
Matérias’, ‘Plenário’, ‘Parlamentares’, ‘Composição’, ‘Comissões’, ‘LexML’,
‘Legislação’, and ‘Processo Legislativo’.
The group 'Plenário' contains eight subdivisions: ‘Diários do Senado e do
Congresso’; ‘Legislaturas e Sessões Legislativas’; ‘Matérias com prazos’;
‘Pronunciamentos de senador’; ‘Questões de Ordem’; ‘Sessões do Plenário’; ‘Tabelas
de tipos relacionados a plenário’; and ‘Votações nominais’. The last contains data
about votes recorded in plenary and related information on sessions, bills, and votes,
such as subjects and the vote of each member (BRASIL, 2015b).
The subdivision 'Votações nominais' consists of 11 items: 9 dump files in XML,
containing data on votes grouped annually; 1 hyperlink to a web service; and 1
hyperlink to a voting search page. The last, in HyperText Markup Language (HTML)
format, is not the subject of this study.
Characteristics of data structure in retrieval process
Each dump file in XML format has a unique URL, composed of:
the domain/primary hierarchy ‘http://legis.senado.leg.br/dadosabertos/dados/’; the
prefix ‘ListaVotacoes’ followed by the year to which the data refer; and the file extension '.xml'.
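The composition of the dump file URL can be sketched in Python as follows. The base address and the prefix are the ones described above, as observed in 2015, and may no longer resolve; the function name `dump_url` is illustrative.

```python
# Domain/primary hierarchy reported for the dump files (as observed in 2015).
BASE_DUMP_URL = "http://legis.senado.leg.br/dadosabertos/dados/"


def dump_url(year: int) -> str:
    """Compose the dump file URL: base hierarchy + 'ListaVotacoes' + year + '.xml'."""
    return f"{BASE_DUMP_URL}ListaVotacoes{year}.xml"


if __name__ == "__main__":
    print(dump_url(2014))
```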
Dataset retrieval via the web service offers only queries grouped by daily
results. For example, to collect data on the votes of a given month, it is
necessary to perform 'x' queries, where 'x' is the number of days in the month. Each
daily result has its own URL, composed of:
the domain/primary hierarchy
‘http://legis.senado.leg.br/dadosabertos/plenario/lista/votacao/’; and the year, month, and
day.
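The 'x' queries needed for one month can be sketched in Python as below. One point is an assumption: the text only states that the URL ends with the year, month, and day, so the concatenated `YYYYMMDD` form used here is a guess at the exact format and should be checked against the service.

```python
import calendar
from datetime import date

# Domain/primary hierarchy reported for the web service (as observed in 2015).
BASE_SERVICE_URL = "http://legis.senado.leg.br/dadosabertos/plenario/lista/votacao/"


def daily_urls(year: int, month: int) -> list[str]:
    """Return one web service URL per day of the given month.

    Assumption: the date is appended as YYYYMMDD; the exact format is not
    specified in the text.
    """
    days_in_month = calendar.monthrange(year, month)[1]  # the 'x' in the text
    return [
        BASE_SERVICE_URL + date(year, month, day).strftime("%Y%m%d")
        for day in range(1, days_in_month + 1)
    ]


if __name__ == "__main__":
    urls = daily_urls(2015, 2)
    print(len(urls), urls[0])
```

Collecting a full year this way takes 365 or 366 requests, against a single request for the corresponding annual dump file.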
In both cases - collecting the dump files or retrieving data via the web service
- the datasets are expressed in XML, and the collections of elements, attributes,
and terms available are identical (Table 1).
Table 1
Element                  Associated to element  Type of Data   Attributes
ListaVotacoes            None (Root Element)    Group Element  'xmlns:xsi' and 'xsi:noNamespaceSchemaLocation'
Metadados                ListaVotacoes          Group Element  None
Votacoes                 ListaVotacoes          Group Element  None
Versao                   Metadados              Text           None
VersaoServico            Metadados              Integer        None
DescricaoDataSet         Metadados              Text           None
Votacao                  Votacoes               Group Element  None
CodigoSessao             Votacao                Integer        None
SiglaCasa                Votacao                Text           None
CodigoSessaoLegislativa  Votacao                Integer        None
TipoSessao               Votacao                Text           None
NumeroSessao             Votacao                Integer        None
DataSessao               Votacao                Text           None
HoraInicio               Votacao                Text           None
CodigoTramitacao         Votacao                Integer        None
CodigoSessaoVotacao      Votacao                Integer        None
SequencialSessao         Votacao                Integer        None
Secreta                  Votacao                Text           None
DescricaoVotacao         Votacao                Text           None
Resultado                Votacao                Text           None
TotalVotosSim            Votacao                Integer        None
TotalVotosNao            Votacao                Integer        None
TotalVotosAbstencao      Votacao                Integer        None
CodigoMateria            Votacao                Integer        None
SiglaMateria             Votacao                Text           None
NumeroMateria            Votacao                Integer        None
AnoMateria               Votacao                Integer        None
Votos                    Votacao                Group Element  None
VotoParlamentar          Votos                  Group Element  None
CodigoParlamentar        VotoParlamentar        Integer        None
NomeParlamentar          VotoParlamentar        Text           None
SexoParlamentar          VotoParlamentar        Text           None
Url                      VotoParlamentar        Text           None
Foto                     VotoParlamentar        Text           None
Tratamento               VotoParlamentar        Text           None
Voto                     VotoParlamentar        Text           None
Source: Authors
The root element is named ‘ListaVotacoes’ and has two attributes and two
elements. Its two attributes are responsible for binding the dataset with a
supplementary document (XML Schema), containing the delimitation of available
elements, content types, and attributes.
The elements 'Metadados' and 'Votacoes' are grouping elements, whose
value is formed by a set of one or more elements; neither has attributes. The grouping
element 'Metadados' has three elements; 'Votacoes' contains one or more
'Votacao' elements; and the element 'Votacao' has nineteen elements. None of the
elements have attributes.
The grouping element 'Votos' (a child of the element 'Votacao') contains one
or more 'VotoParlamentar' elements, with no attributes. The element 'VotoParlamentar'
has seven elements, also without attributes.
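To make the nesting concrete, the Python sketch below reads a small document that follows the structure in Table 1 and flattens it into one record per member and vote. The sample is hypothetical: all values, including the date format and the 'Sim'/'Nao' vote texts, are invented, only a few of the listed elements are included, and the root attributes are omitted for brevity.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample following the structure in Table 1; every value is invented.
SAMPLE = """<ListaVotacoes>
  <Metadados>
    <Versao>sample</Versao>
    <VersaoServico>1</VersaoServico>
    <DescricaoDataSet>sample description</DescricaoDataSet>
  </Metadados>
  <Votacoes>
    <Votacao>
      <CodigoSessao>1</CodigoSessao>
      <DataSessao>2015-01-01</DataSessao>
      <TotalVotosSim>1</TotalVotosSim>
      <TotalVotosNao>1</TotalVotosNao>
      <Votos>
        <VotoParlamentar>
          <CodigoParlamentar>10</CodigoParlamentar>
          <NomeParlamentar>Example A</NomeParlamentar>
          <Voto>Sim</Voto>
        </VotoParlamentar>
        <VotoParlamentar>
          <CodigoParlamentar>20</CodigoParlamentar>
          <NomeParlamentar>Example B</NomeParlamentar>
          <Voto>Nao</Voto>
        </VotoParlamentar>
      </Votos>
    </Votacao>
  </Votacoes>
</ListaVotacoes>"""


def read_votes(xml_text: str) -> list[dict]:
    """Flatten every VotoParlamentar into one record per member and vote."""
    root = ET.fromstring(xml_text)
    records = []
    for votacao in root.iter("Votacao"):
        session = int(votacao.findtext("CodigoSessao"))
        for voto in votacao.iter("VotoParlamentar"):
            records.append({
                "session": session,
                "member": int(voto.findtext("CodigoParlamentar")),
                "name": voto.findtext("NomeParlamentar"),
                "vote": voto.findtext("Voto"),
            })
    return records


if __name__ == "__main__":
    for record in read_votes(SAMPLE):
        print(record)
```

Because no element carries a URI or a vocabulary term, the meaning of names such as 'Voto' has to be known to the consuming program in advance.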
Results
From the analysis, eight actions were proposed, to be applied to the existing data
in order to publish government datasets as LOD (Figure 1).
Figure 1 - Synthesis of actions necessary for the development of government
datasets
Source: the authors.
Before these actions are implemented, information (input) must be available
about the characteristics of the existing databases, together with prior knowledge of the
available ontologies and vocabularies that may become part of the relationships and
elements of the new LOD dataset.
The actions identified in this study can be summarized as follows:
Action 1: extend the use of URIs to identify elements,
attributes, and terms, described in additional documents, to facilitate understanding
of the dataset and of the rules of this system;
Action 2: select ontologies and controlled vocabularies widely used by
communities that can be useful for expressing the relationships and elements of the new LOD
dataset;
Action 3: develop specific ontologies and vocabularies for relationships and
elements that do not exist in the ontologies and vocabularies adopted in Action 2;
Action 4: develop additional documents containing legal coverage, such as
licenses of use and copyright, and link these documents to the dataset. It is important
that these licenses be explicit in the metadata (Action 5);
Action 5: adopt metadata element sets from widely adopted initiatives to convey
more about the datasets' content at the moment of data retrieval by external agents;
Action 6: define the logical structure of the dataset: attributes, elements, values,
and validation rules;
Action 7: assign terms from the selected ontologies and vocabularies to the dataset
elements, linking them through URIs, to extend the semantic load of these data;
Action 8: implement RDF structures in XML markup, respecting the
established forms for RDF/XML documents.
These actions should produce, as output (result), the LOD dataset (expressed in RDF/XML
format or equivalent) and the vocabularies and ontologies designed to meet the
needs of the data context (Votações Nominais).
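Actions 3, 7, and 8 can be illustrated with a short Python sketch that turns one 'Votacao' element into an RDF/XML description. It is not the mapping proposed in this study, which does not fix one: the resource base URI and the `vot` vocabulary namespace are hypothetical, and the choice of `dcterms:description` for 'DescricaoVotacao' is an assumption. Only the RDF and Dublin Core Terms namespaces are the published ones. The sketch also assumes every element it reads is present.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DCTERMS_NS = "http://purl.org/dc/terms/"
# Hypothetical namespaces: no vocabulary or URI scheme is defined in the text.
VOT_NS = "http://example.org/vocab/votacao#"
RESOURCE_BASE = "http://example.org/resource/votacao/"

ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("dcterms", DCTERMS_NS)
ET.register_namespace("vot", VOT_NS)


def votacao_to_rdfxml(votacao: ET.Element) -> str:
    """Serialize one Votacao element as an RDF/XML description (illustrative mapping)."""
    rdf = ET.Element(f"{{{RDF_NS}}}RDF")
    # Action 1: give the resource a URI (hypothetical scheme based on its code).
    subject = RESOURCE_BASE + votacao.findtext("CodigoSessaoVotacao")
    desc = ET.SubElement(
        rdf, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": subject}
    )
    # Action 7: reuse a term from an existing vocabulary where one fits.
    ET.SubElement(desc, f"{{{DCTERMS_NS}}}description").text = votacao.findtext(
        "DescricaoVotacao"
    )
    # Action 3: fall back to a purpose-built (here hypothetical) vocabulary.
    ET.SubElement(desc, f"{{{VOT_NS}}}totalVotosSim").text = votacao.findtext(
        "TotalVotosSim"
    )
    ET.SubElement(desc, f"{{{VOT_NS}}}totalVotosNao").text = votacao.findtext(
        "TotalVotosNao"
    )
    # Action 8: emit the result in RDF/XML syntax.
    return ET.tostring(rdf, encoding="unicode")


if __name__ == "__main__":
    sample = ET.fromstring(
        "<Votacao><CodigoSessaoVotacao>1</CodigoSessaoVotacao>"
        "<DescricaoVotacao>sample</DescricaoVotacao>"
        "<TotalVotosSim>1</TotalVotosSim><TotalVotosNao>0</TotalVotosNao></Votacao>"
    )
    print(votacao_to_rdfxml(sample))
```

A production conversion would more likely rely on a dedicated RDF library and typed literals; the standard library is used here only to keep the sketch self-contained.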
Conclusions
The current form of dataset retrieval does not take into account important
characteristics in the context of data, such as the use of controlled vocabularies and
ontologies, which directly affects the independence between external agents and
producers in data collection.
Although there are three metadata elements, there is no information about the
content itself, such as author, license, source, date of creation, and date of publication.
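This gap can be checked mechanically. The Python sketch below compares the children of a 'Metadados' element with the descriptive fields listed above; mapping those fields to Dublin Core Terms names (`creator`, `license`, `source`, `created`, `issued`) is an assumption made for illustration, and `missing_metadata` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Descriptive fields named in the text, mapped to Dublin Core Terms names.
# The mapping itself is an assumption made for illustration.
EXPECTED = {
    "author": "creator",
    "license": "license",
    "source": "source",
    "date of creation": "created",
    "date of publication": "issued",
}


def missing_metadata(metadados: ET.Element) -> list[str]:
    """List the expected descriptive fields that have no matching child element."""
    # Strip any namespace part and compare element names case-insensitively.
    present = {child.tag.split("}")[-1].lower() for child in metadados}
    return [field for field, term in EXPECTED.items() if term not in present]


if __name__ == "__main__":
    current = ET.fromstring(
        "<Metadados><Versao/><VersaoServico/><DescricaoDataSet/></Metadados>"
    )
    print(missing_metadata(current))
```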
The application of a recommendation model in this context served as a guideline
for developing the actions proposed in this study, mainly by providing the basis for
elaborating the set of actions required to offer government datasets as LOD. It allows
public managers to see important points that may be changed in the available data so
that they can be restructured into LOD datasets and, therefore, collected by external agents.
It is expected that these actions, applied to bills datasets, will stimulate their
application to other databases, in other spheres, and on other websites, and will also
stimulate the emergence of new Brazilian government datasets in this area.
References
The references were prepared following the ABNT rules.
BERNERS-LEE, T. The next Web. In: TED2009. Estados Unidos da América: fev. 2009.
BIZER, C.; HEATH, T.; BERNERS-LEE, T. Linked Data - The Story So Far. International
Journal on Semantic Web and Information Systems, v. 5, n. 3, p. 1–22, 2009.
BOHMAN, J. Public deliberation: Pluralism, complexity, and democracy. [s.l.] MIT press, 2000.
BRASIL. Lei No 12.527, de 18 de Novembro de 2011. Regula o acesso a informações previsto
no inciso XXXIII do art. 5o, no inciso II do § 3o do art. 37 e no § 2o do art. 216 da Constituição
Federal; altera a Lei no 8.112, de 11 de dezembro de 1990; revoga a Lei no 11.111, de 5 de
maio de 2005, e dispositivos da Lei no 8.159, de 8 de janeiro de 1991; e dá outras
providências. Portal da Legislação, Brasília, 2011. Disponível em:
<http://www.planalto.gov.br/ccivil_03/_ato2011-2014/2011/lei/l12527.htm>. Acesso em: 3 abr.
2015.
BRASIL. Portal e-Cidadania, 2015a. Disponível em: <http://dadosabertos.senado.gov.br>.
Acesso em: 3 abr. 2015.
BRASIL. Votações Nominais, 2015b. Disponível em:
<http://dadosabertos.senado.gov.br/dataset/votaces-nominais>. Acesso em: 3 abr. 2015.
BRICKLEY, D.; MILLER, L. FOAF Vocabulary Specification 0.99, 14 jan. 2014. Disponível em:
<http://xmlns.com/foaf/spec/20140114.html>. Acesso em: 27 mar. 2015.
HEATH, T. Frequently Asked Questions. In: Linked Data - Connect Distributed Data across
the Web, 2015. Disponível em: <http://linkeddata.org/faq>. Acesso em: 22 mar. 2015.
JENTZSCH, A.; CYGANIAK, R.; BIZER, C. State of the LOD Cloud, 2011. Disponível em:
<http://lod-cloud.net/state>. Acesso em: 24 mar. 2015.
MALIN, A. M. B. Gestão da Informação Governamental: em direção a uma metodologia de
avaliação. DataGramaZero - Revista de Ciência da Informação, v. 7, n. 5, out. 2006.
MARCONDES, C. H.; JARDIM, J. M. Políticas de Informação Governamental: a construção
de Governo Eletrônico na Administração Federal do Brasil. DataGramaZero - Revista de
Ciência da Informação, v. 4, n. 2, abr. 2003.
SANT’ANA, R. C. G.; RODRIGUES, F. A. Visualização de afinidades entre parlamentares
mediante dados de votações no Senado Brasileiro. Informação & Sociedade: estudos, v. 23,
n. 1, p. 49–59, jan. 2013.
SANTOS, P. L. V. A. DA C.; ALVES, R. C. V. Metadados e Web Semântica para estruturação
da Web 2.0 e Web 3.0. DataGramaZero - Revista de Ciência da Informação, v. 10, n. 6, dez.
2009.
SANTOS, P. L. V. A. DA C.; SANT’ANA, R. C. G. Transferência da Informação: análise para
valoração de unidades de conhecimento. DataGramaZero - Revista de Ciência da Informação,
v. 3, n. 2, abr. 2002.
SCHMACHTENBERG, M.; BIZER, C.; PAULHEIM, H. Adoption of the Linked Data Best
Practices in Different Topical Domains. In: MIKA, P. et al. (Eds.). The Semantic Web – ISWC
2014. Cham: Springer International Publishing, 2014a. p. 245–260.
SCHMACHTENBERG, M.; BIZER, C.; PAULHEIM, H. State of the LOD Cloud 2014. University
of Mannheim, 2014b. Disponível em:
<http://linkeddatacatalog.dws.informatik.uni-mannheim.de/state>. Acesso em: 25 mar. 2015.
SEMANTICWEB.ORG. SPARQL endpoint. In: Semantic Web.org, 2011. Disponível em:
<http://semanticweb.org/index.php?title=SPARQL_endpoint&oldid=52541>. Acesso em: 04
jan. 2015.
W3C. Data Licensing. W3C, 2011. Disponível em:
<http://www.w3.org/wiki/index.php?title=TaskForces/CommunityProjects/LinkingOpenData/DataLicensing&oldid=49411>.
Acesso em: 20 mar. 2015.
W3C. RDF 1.1 XML Syntax. W3C, 2014. Disponível em:
<http://www.w3.org/TR/2014/REC-rdf-syntax-grammar-20140225>. Acesso em: 22 mar. 2015.
W3C. Linked Data Platform 1.0. W3C, 2015. Disponível em:
<http://www.w3.org/TR/2015/REC-ldp-20150226>. Acesso em: 24 mar. 2015.