Claudio Gutierrez
University of Chile · Departamento de Ciencias de la Computación
Full Professor
166
Publications
49,058
Reads
6,436
Citations
Publications (166)
Mapuzugun is the language of the Mapuche people. Due to political and historical reasons, its number of speakers has decreased and the language has been excluded from the educational system in Chile and Argentina. For this reason, it is very important to support the revitalization of the Mapuzugun in all spaces and media of society. In this work we...
The Cybersyn project has lately received increased attention. In this article, we study the local technical antecedents of Stafford Beer's Cybersyn project in Chile, particularly regarding Cybernetics and Systems ideas and local computing and networking developments. We show that the Cybersyn project in Chile was hosted by a rich intellectual envir...
As humans, we can deduce more from the data graph of Figure 2.1 than what the edges explicitly indicate. We may deduce, for example, that the Ñam festival (EID15) will be located in Santiago, even though the graph does not contain an edge (EID15) -location→ (Santiago). We may further deduce that the cities connected by flights must have some airpo...
While deductive knowledge is characterized by precise logical consequences, inductively acquiring knowledge involves generalizing patterns from a given set of input observations, which can then be used to generate novel but potentially imprecise predictions. For example, from a large data graph with geographical and flight information, we may obser...
The notion of Knowledge Graph stems from scientific advancements in diverse research areas such as Semantic Web, databases, knowledge representation and reasoning, NLP, and machine learning, among others. The integration of ideas and techniques from such disparate disciplines presents a challenge to practitioners and researchers to know how current...
In this chapter, we discuss some of the most prominent knowledge graphs that have emerged in the past years. We begin by discussing open knowledge graphs, most of which have been published on the Web per the guidelines and protocols described in Chapter 9. We later discuss enterprise knowledge graphs that have been created by companies from diverse...
In this chapter we describe extensions of the data graph–relating to schema, identity, and context–that provide additional structures for accumulating knowledge. Henceforth, we refer to a data graph as a collection of data represented as nodes and edges using one of the models discussed in Chapter 2. We refer to a knowledge graph as a data graph po...
Independent of the (kinds of) source(s) from which a knowledge graph is created, the resulting initial knowledge graph will usually be incomplete, and will often contain duplicate, contradictory or even incorrect statements, especially when taken from multiple sources. After the initial creation and enrichment of a knowledge graph from external sou...
At the foundation of any knowledge graph is the principle of first applying a graph abstraction to data, resulting in an initial data graph. We now discuss a selection of graph-structured data models that are commonly used in practice to represent data graphs. We then discuss the primitives that form the basis of graph query languages used to inter...
In this chapter, we discuss the principal techniques by which knowledge graphs can be created and subsequently enriched from diverse sources of legacy data that range from plain text to structured formats (and anything in between). The appropriate methodology to follow when creating a knowledge graph depends on the actors involved, the domain, the...
Beyond assessing the quality of a knowledge graph, there exist techniques to refine the knowledge graph, in particular to (semi-)automatically complete and correct the knowledge graph [Paulheim, 2017], aka knowledge graph completion and knowledge graph correction, respectively. As distinguished from the creation and enrichment tasks outlined in Ch...
While it may not always be desirable to publish knowledge graphs (for example, those that offer a competitive advantage to a company [Noy et al., 2019]), it may be desirable or even required to publish other knowledge graphs, such as those produced by volunteers [Lehmann et al., 2015, Mahdisoltani et al., 2015, Vrandecic and Krotzsch, 2014], by publ...
In this article, we provide a comprehensive introduction to knowledge graphs, which have recently garnered significant attention from both industry and academia in scenarios that require exploiting diverse, dynamic, large-scale collections of data. After some opening remarks, we motivate and contrast various graph-based data models, as well as lang...
Graphs have become the best way we know of representing knowledge. The computing community has investigated and developed the support for managing graphs by means of digital technology. Graph databases and knowledge graphs surface as the most successful solutions to this program. The goal of this document is to provide a conceptual map of the data...
Tracking the historical events that lead to the interweaving of data and knowledge.
The aim of this paper is to identify, given certain democratic normative standards regarding deliberation, some pros as well as cons of possible online deliberation designs due to variations in two key design dimensions: namely, asynchronicity and anonymity. In particular, we consider one crucial aspect of deliberative argumentation: namely, its re...
In this paper we provide a comprehensive introduction to knowledge graphs, which have recently garnered significant attention from both industry and academia in scenarios that require exploiting diverse, dynamic, large-scale collections of data. After a general introduction, we motivate and contrast various graph-based data models and query languag...
We report on a community effort between industry and academia to shape the future of graph query languages. We argue that existing graph database management systems should consider supporting a query language with two key characteristics. First, it should be composable, meaning, that graphs are the input and the output of queries. Second, the graph...
We research the problem of building knowledge maps of graph-like information. We live in the digital era and similarly to the Earth, the Web is simply too large and its interrelations too complex for anyone to grasp much of it through direct observation. Thus, the problem of applying cartographic principles also to digital landscapes is intriguing....
Recent availability of data about writing processes at keystroke-granularity has enabled research on the evolution of document writing. A natural task is to develop systems that can actually show this data, that is, user interfaces that transform the data of the process of writing --today a black box-- into intelligible forms. On this line, we prop...
Implementations of a standard language are expected to give the same outputs for identical queries. In this paper we study why different implementations of SPARQL (Fuseki, Virtuoso, Blazegraph and rdf4j) behave differently when evaluating queries with correlated variables. We show that at the core of this problem lies the historically troubling notion o...
Graph data management concerns the research and development of powerful technologies for storing, processing and analyzing large volumes of graph data. This chapter presents an overview of the foundations and systems for graph data management. Specifically, we present a historical overview of the area, studied graph database models, characterize...
A graph database is a database where the data structures for the schema and/or instances are modeled as a (labeled)(directed) graph or generalizations of it, and where querying is expressed by graph-oriented operations and type constructors. In this article we present the basic notions of graph databases, give an historical overview of its main dev...
Recent availability of data of writing processes at keystroke-granularity has enabled research on the evolution of document writing. A natural step is to develop systems that can actually show this data and make it understandable. Here we propose a data structure that captures a document's fine-grained history and an organic visualization that serv...
Federated SPARQL queries give unified answers from multiple distributed SPARQL endpoints. A good example may be the collection of stops from different transport companies in the same city to create a route planning application. The evaluation of these types of queries usually performs poorly, which makes their use difficult in...
Reflections on the Concept of Data and its Implications for Science and Society
The publication of semantic web data, commonly represented in Resource Description Framework (RDF), has experienced outstanding growth over the last few years. Data from all fields of knowledge are shared publicly and interconnected in active initiatives such as Linked Open Data. However, despite the increasing availability of applications managing...
The paper determines the algebraic and logical structure of the multiset semantics of the core patterns of SPARQL. We prove that the fragment formed by AND, UNION, OPTIONAL, FILTER, MINUS and SELECT corresponds precisely to both the intuitive multiset relational algebra (projection, selection, natural join, arithmetic union and except), and the mult...
This article presents the life and work of a nineteenth-century Chilean mathematician whose work was
recognized by the most important scientific centers of his time and, paradoxically,
remained forgotten in Chile for almost 150 years. Picarte can be considered the first
scientist fully educated and trained in Chile who published from Chile at a level...
We research the problem of building knowledge maps of graph-like information. There exist well-consolidated cartographic principles and techniques for mapping physical landscapes. However, we live in the digital era and similarly to the Earth, the Web is simply too large and its interrelations too complex for anyone to grasp much of it through dire...
In the current SPARQL specification the notions of correlation and substitution are not well defined. This problem triggers several ambiguities in the semantics. In fact, implementations such as Fuseki and Virtuoso assume different semantics. In this technical report, we provide a semantics of correlation and substitution following the classic philosophy...
This paper presents a thorough study of negation in SPARQL. The types of negation supported in SPARQL are firstly formalized and their main features discussed. Then, we study the expressive power of the corresponding negation operators. At this point, we identified a simple and minimal SPARQL algebra which could be used, instead of the original SPA...
Group centrality is an extension of the classical notion of centrality for
individuals, to make it applicable to sets of them. We perform a SWOT
(strengths, weaknesses, opportunities and threats) analysis of the use of group
centrality in semantic networks, for different centrality notions: degree,
closeness, betweenness, giving prominence to rando...
Criticism of the conference model should be put in context. Evidence suggests
that the essential features of this model have emerged as responses to
challenges posed by current trends of scientific research and the impact of the
new techno-economic paradigm, the age of Information and Communication
Technology. This context seems indispensable when...
The Web of Linked Data is a huge graph of distributed and interlinked datasources fueled by structured information. This new environment calls for formal languages and tools to automatize navigation across datasources (nodes in such graph) and enable semantic-aware and Web-scale search mechanisms. In this article we introduce a declarative navigati...
Recently Boldi and Vigna proposed axioms that would characterize good notions of centrality. We study a random-walk version of closeness centrality and prove that it satisfies the Boldi-Vigna axioms for undirected graphs.
We present the MaGe system, which helps users and developers to build maps of the Web graph. Maps abstract and represent in a concise and machine-readable way regions of information on the Web.
Carlos Grandjot (1900-1979) was a German mathematician with a doctorate from
Göttingen who moved to Chile in 1929 and developed his life and
career there. He was influential in the development of Chilean mathematics during the
period 1930 to 1960. This article reports our investigation of his biography
and describes the mathematical environment in his...
In this short note we give an overview of our research concerning cartography on the Web and its challenges. We present a mathematical formalism to capture the notion of a map on the Web, which allows the construction of maps to be automated.
ABSTRACT Until the end of the 19th century, the teaching of mathematics in Chile at the secondary and higher levels was in the hands of engineers and amateurs. With the founding of the Instituto Pedagógico in 1889, its teaching at these levels began to professionalize. This article studies this process, centering on the person who led this development...
This paper presents the swget portal. By using the portal, users can instruct software modules to (virtually) move from one place (data source) to another on the Web of Data, interpret knowledge and trigger actions much in the same spirit of intelligent agents. Instructions are specified via navigational expressions in the NautiLOD language. Such e...
Based on recent results, we argue that the right method for Web clients to access relevant information from Linked Datasets has not yet been found. We propose that something is needed between (i) Linked Data dereferencing, which is simple and reliable but too vaguely defined; (ii) data dumps, which are simple and reliable but too coarse-grained, an...
Until the end of the 19th century, the teaching of mathematics, at both the secondary and higher levels, was in the hands of engineers and amateurs. With the creation of the Instituto Pedagógico in 1889, its teaching began to professionalize. This paper studies this process, and centers on the figure of the person who led this development in the...
This chapter takes a guided tour to the challenges of Big Semantic Data management and the role that it plays in the emergent Web of Data. Section “Big Data” provides a brief overview of Big Data and its dimensions. Section “What is Semantic Data?” summarizes the semantic web foundations and introduces the main technologies used for describing and...
The normative version of RDF Schema (RDFS) gives non-standard (intensional) interpretations to some standard notions such as classes and properties, thus departing from standard set-based semantics. In this paper we develop a standard set-based (extensional) semantics for the RDFS vocabulary while preserving the simplicity and computational complex...
Censuses are one of the most relevant types of statistical data, allowing analyses of the population in terms of demography, economy, sociology, and culture. For fine-grained analysis, census agencies publish census microdata that consist of a sample of individual records of the census containing detailed anonymous individual information. Working w...
The naturally distributed character of software ecosystems calls for a shared conceptualization and language to describe their architecture and their evolution. In this regard, ontologies play a central role. In this paper, we argue in favor of such an approach by showing that there is successful experience applying ontologies to the fields of softwa...
Inspired by the CAP theorem, we identify three desirable properties when querying the Web of Data: Alignment (results up-to-date with sources), Coverage (results covering available remote sources), and Efficiency (bounded resources). In this short paper, we show that no system querying the Web can meet all three ACE properties, but instead must mak...
The Web of Data refers to the universal database constituted by interlinked data sources on the Web. This global system is creating a new way of publishing and consuming data on the Web. A number of assumptions that were valid in bounded, controlled, closed worlds of data are now being challenged. In this paper, following the seminal ideas presented...
A map is an abstract visual representation of a region, taken from a
given space, usually designed for final human consumption. Traditional
cartography focuses on the mapping of Euclidean spaces by using some
distance metric. In this paper we aim at mapping the Web space by
leveraging its relational nature. We introduce a general mathematical
frame...
The current Web of Data is producing increasingly large RDF datasets. Massive publication efforts of RDF data driven by initiatives like the Linked Open Data movement, and the need to exchange large datasets, have unveiled the drawbacks of traditional RDF representations, inspired and designed by a document-centric and human-readable Web. Among the m...
The Semantic Web is the initiative of the W3C to make information on the Web readable not only by humans but also by machines. RDF is the data model for Semantic Web data, and SPARQL is the standard query language for this data model. RDF also considers a special type of objects to describe anonymous resources, called blank nodes in the RDF data mo...
The normative version of RDF Schema (RDFS) gives non-standard (intensional) interpretations to some standard notions such as classes and properties, thus departing from standard set-based semantics. In this paper we develop a standard set-based (extensional) semantics for the RDFS vocabulary while preserving the simplicity and computational comple...
The normative version of RDFS gives non-standard (intensional) interpretations to some standard notions such as classes and properties. In this paper we develop the extensional semantics for the RDFS vocabulary, which surprisingly preserves the simplicity and computational complexity of deduction of the intensional case. This result will impact cur...
The main goal of current Web navigation languages is to retrieve the set of nodes reachable from a given node. No information is provided about the fragments of the Web navigated to reach these nodes. In other words, information about their connections is lost. This paper presents an efficient algorithm to extract relevant parts of these Web fragments...
The first digital computer for scientific and engineering applications was installed in Chile in 1962. It was an ER-56 Standard Elektrik Lorenz (“Lorenzo” by its Spanish nickname) made in Germany. It was acquired by the Faculty of Physical and Mathematical Sciences of the University of Chile. It was used in teaching, scientific and technological research,...
Digital computing in Chile dates back to the years between 1961, when the first digital computer arrived in the country, and 1982, when the discipline reached a critical mass in equipment, personnel, education, research, and applications. The authors distinguish three stages in this historical period: the introduction of computers; the convergence...
The massive semantic data sources linked in the Web of Data give new meaning
to old features like navigation; introduce new challenges like semantic
specification of Web fragments; and make it possible to specify actions relying
on semantic data. In this paper we introduce a declarative language to face
these challenges. Based on navigational featu...
We review and discuss A. H. Louie’s book “More than Life Itself: A Reflexion on Formal Systems and Biology” from an interdisciplinary
viewpoint, involving both biology and mathematics, taking into account new developments and related theories.
Keywords: Relational biology; Systems biology; (M,R) systems; Robert Rosen; Efficient causation; Autopoiesis; Org...
Publishing open data is going mainstream. There are diverse initiatives, ranging from international agencies to local governments, exposing data publicly in standard formats to foster transparency, innovation and public scrutiny. Nevertheless, the publishing of statistical data still presents huge challenges. Statistical modeling, privacy concerns,...
There is a comprehensive body of theory studying updates
and schema evolution of knowledge bases, ontologies, and in particular
of RDFS. In this paper we turn these ideas into practice by presenting
a feasible and practical procedure for updating RDFS. Along the lines
of ontology evolution, we treat schema and instance updates separately,
showing t...
The Semantic Web is based on the idea of a common and minimal language to enable large quantities of existing data to be analyzed and processed. This triggers the need to develop the database foundations of this basic language, which is the Resource Description Framework (RDF). This paper addresses this challenge by: 1) developing an abstract model...
In the paper, “semQA: SPARQL with Idempotent Disjunction”, the authors study the RDF query language SPARQL. In particular, they claim that some of the results presented in are not correct. In this note, we refute the claims made in, and actually show that some of the formal results of are incorrect.
Social Network (SN) data has become ubiquitous, demanding advanced and flexible means to represent, transform and query such data. In addition to the intrinsic challenges of querying graph data is the requirement that networks be restructured, and thus that new values be created. To address these, we introduce a dedicated data model and query
languag...
Subqueries are a powerful feature that enables reuse, composition, rewriting and optimization in a query language. In this paper we perform a comprehensive study of the incorporation of subqueries into SPARQL. We consider the many possible choices as suggested by the experience of similar languages, as well as features that developers are...
The Web of Data is producing large RDF datasets from diverse fields. The increasing size of the data being published threatens to make these datasets hard to exchange, index and consume. This scalability problem greatly diminishes the potential of interconnected RDF graphs. The HDT format addresses these problems through a compact RDF representat...
These notes are meant as a companion to a lecture on the topic at the Reasoning Web Summer School 2011. The goal of this work is to present diverse and known material on modeling the Web from a data perspective, to help students to get a first overview of the subject.
Methodologically, the objective is to give pointers to the relevant topics and li...
Huge RDF Datasets are currently being published in the Linked-Open-Data cloud. Appropriate data structures are required to address scalability and performance issues when storing, sharing and querying these datasets. HDT (Header, Dictionary, Triples) is a binary format that represents RDF data in a compressed manner, therefore saving space whilst p...
SPARQL currently does not include any form of nested queries. In this paper we present a proposal to incorporate nested queries into SPARQL along the design philosophy of SQL nested queries. We present rewriting algorithms and show that all the proposed nested queries can be expressed in a natural and simple extension of SPARQL syntax.
The Resource Description Framework (RDF) is the standard data model for representing information about World Wide Web resources. In January 2008, the W3C released its recommendation for querying RDF data, a query language called SPARQL. In this chapter, we give a detailed description of the semantics of this language. We start by focusing...