Gaetano Geck's research while affiliated with Technische Universität Dortmund and other places

Publications (13)

Preprint
The paper studies the rewriting problem, that is, the decision problem whether, for a given conjunctive query $Q$ and a set $\mathcal{V}$ of views, there is a conjunctive query $Q'$ over $\mathcal{V}$ that is equivalent to $Q$, for cases where the query, the views, and/or the desired rewriting are acyclic or even more restricted. It shows that, if...
Preprint
The Iltis project provides an interactive, web-based system for teaching the foundations of formal methods. It is designed to allow modular addition of educational tasks as well as to provide immediate and comprehensive feedback. Currently, exercises for various aspects of typical automated reasoning workflows for propositional logic, modal logic,...
Preprint
This paper introduces a declarative framework to specify and reason about distributions of data over computing nodes in a distributed setting. More specifically, it proposes distribution constraints which are tuple and equality generating dependencies (tgds and egds) extended with node variables ranging over computing nodes. In particular, they can...
Conference Paper
Iltis is an interactive, web-based system for teaching logic. It is designed to provide immediate and comprehensive feedback for exercises covering various aspects of the reasoning workflow. This poster presentation reports on new exercises and feedback mechanisms for modal and first-order logic.
Article
Single-round multiway join algorithms first reshuffle data over many servers and then evaluate the query at hand in a parallel and communication-free way. A key question is whether a given distribution policy for the reshuffle is adequate for computing a given query, also referred to as parallel-correctness. This article extends the study of the co...
Article
Logic is a foundation for many modern areas of computer science. In artificial intelligence, as a basis of database query languages, as well as in formal software and hardware verification — modelling scenarios using logical formalisms and inferring new knowledge are important skills for going-to-be computer scientists. The Iltis project aims at pr...
Article
A dominant cost for query evaluation in modern massively distributed systems is the number of communication rounds. For this reason, there is a growing interest in single-round multiway join algorithms where data are first reshuffled over many servers and then evaluated in a parallel but communication-free way. The reshuffling itself is specified a...
Article
Evaluating queries over massive amounts of data is a major challenge in the big data era. Modern massively parallel systems, like e.g. Spark, organize query answering as a sequence of rounds each consisting of a distinct communication phase followed by a computation phase. The communication phase redistributes data over the available servers, while...
Article
A dominant cost for query evaluation in modern massively distributed systems is the number of communication rounds. For this reason, there is a growing interest in single-round multiway join algorithms where data is first reshuffled over many servers and then evaluated in a parallel but communication-free way. The reshuffling itself is specified a...
Article
Single-round multiway join algorithms first reshuffle data over many servers and then evaluate the query at hand in a parallel and communication-free way. A key question is whether a given distribution policy for the reshuffle is adequate for computing a given query, also referred to as parallel-correctness. This paper extends the study of the comp...
Conference Paper
Given a set-comparison predicate P and given two lists of sets A = (A1,...,Am) and B = (B1,...,Bm), with all Ai, Bj ⊆ [n], the P-set join A ⋈_P B is defined to be the set {(i, j) ∈ [m] × [m] | P(Ai, Bj)}. When P(Ai, Bj) is the condition "Ai ∩ Bj ≠ ∅" we call this the set-intersection-not-empty join (a.k.a. the composition of A and B); when...
Article
A dominant cost for query evaluation in modern massively distributed systems is the number of communication rounds. For this reason, there is a growing interest in single-round multiway join algorithms where data is first reshuffled over many servers and then evaluated in a parallel but communication-free way. The reshuffling itself is specified as...
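
The P-set join defined in the conference-paper abstract above can be illustrated with a naive sketch. This is not the algorithm from that paper; it is a direct, quadratic reading of the definition {(i, j) | P(Ai, Bj)}, and the names `set_join` and `intersects` as well as the sample lists are assumptions made for illustration only.

```python
from typing import Callable, List, Set, Tuple

def set_join(
    A: List[Set[int]],
    B: List[Set[int]],
    P: Callable[[Set[int], Set[int]], bool],
) -> Set[Tuple[int, int]]:
    """Naive P-set join: all index pairs (i, j) with P(A_i, B_j).

    Indices are 1-based to match the notation A = (A1,...,Am).
    This checks every pair, so it is only a reading of the definition,
    not an efficient algorithm.
    """
    return {
        (i, j)
        for i, a in enumerate(A, start=1)
        for j, b in enumerate(B, start=1)
        if P(a, b)
    }

def intersects(a: Set[int], b: Set[int]) -> bool:
    """The 'intersection is not empty' predicate: a ∩ b ≠ ∅."""
    return not a.isdisjoint(b)

# Hypothetical sample input with m = 3 and all sets drawn from [5].
A = [{1, 2}, {3}, set()]
B = [{2, 4}, {5}, {1, 3}]

result = set_join(A, B, intersects)
print(sorted(result))
```

Passing a different predicate, for example `lambda a, b: a <= b` for set containment, yields other set joins under the same definition.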

Citations

... In addition, it facilitates the reading, writing, and understanding of formal specifications and their analysis (Mossakowski, 2010). Moreover, modelling scenarios using logical formalisms and inferring new knowledge are important skills in the field of computer science education (Geck et al., 2018). Therefore, learning logical formalisms and, in particular, logical modelling is of utmost importance for computer science students. ...
... There exist some cases with lower complexity, e.g., the Hypercube distribution policy used in [2,10,11] makes deciding parallel correctness NP-complete. See [26,38,54] for more results on parallel correctness. ...
... While the effect on students' performance has not been measured for this editor, a study examining ITA (Yacef, 2005), a set of editors similar to P-Logic Tutor, has shown that using the editors has a strong impact on students' performance. ILTIS (Geck et al., 2018) is another recently developed system of logic editors including an editor for resolution in propositional logic, where authors concluded that the feedback provided by their system supports students in understanding the subject matter as well. However, although there are already various editors for logical reasoning, the authors are not aware of any editor for Resolution for first order logic. ...
... In turn, this feature simplifies the development of analysis tools and techniques [18,16]. In fact, those have been exploited in various settings, but providing only empirical and experimental assessments [18,16,12,13], or formal models to study distributed query computation strategies [7,6,5], disregarding their temporal evolution. ...
... The second semantics comes from the database theory community and uses only set semantics [1, 6, 38]. This line of work has led to theoretical results characterizing languages for which query equivalence is decidable (and often fully characterizing the complexity of the equivalence problem), and separating them from richer languages where equivalence is undecidable [8, 19, 42, 51]. For example, equivalence is decidable (and $\Pi^P_2$-complete for a fixed database schema [51], and coNEXPTIME-complete in general [19]) for conjunctive queries with safe negation, but undecidable for conjunctive queries with unsafe negation. ...
... The hypercube algorithm was introduced by Afrati and Ullman [1] in the context of MapReduce, where it is called the shares algorithm, and has since then been intensively studied [5,6,3,11]. For a conjunctive query Q, the hypercube algorithm defines a reshuffling strategy based on hashing of data values, allowing Q to be computed through local evaluation on servers in parallel. ...
... Big data Internet technology is a computer Internet language description of huge amount of information (massive information), experimental analysis after algorithm, scattered data sources, and diverse data formats, which cannot be collected, analyzed, and processed by current technology within a specified time frame, and become a collection of data that can be used as a reference for national thinking and political decisions [23][24][25]. Big data Internet analytics methods focus more on data coverage than on data accuracy, replacing random analysis (sampling) with full-sample analysis and rigorous causal analysis with statistical correlation analysis [26]. ...
... Big Data is composed of several components, and various studies have proposed techniques to improve those components, ranging from efficient indexing [2-4] and caching/filtering [5] to techniques such as improved query execution plans [6-8] and effective data partitioning [9][10][11]. However, we found one aspect of Big Data that existing research has failed to address: improving query execution time by avoiding unnecessary query delegation to remote nodes in a Big Data cluster when it is known in advance that the nodes do not have the data for the requested partition. ...
... The second semantics comes from the database theory community and uses only set semantics [1,6,44]. This line of work has led to query containment techniques for conjunctive queries using tableau or homomorphisms [1,8] and to various complexity results for the query containment and equivalence problems [8,24,48,57]. The problem here is that it is restricted only to set semantics, and query equivalence under bag semantics is quite different; in fact, the first paper that noted that is entitled "Optimization of Real Conjunctive Queries," with an emphasis on real [10]. ...
... [14] pc-trans(CQ) is $\Pi^p_3$-complete. We note that the same complexity bounds continue to hold in the presence of inequalities and for unions of conjunctive queries [15]. It is shown in [14,15] that the complexity of deciding transferability can be lowered to NP in some cases when considering restricted classes of queries (as, for instance, the full queries) or when restricting the class of considered distribution policies (as, for instance, HyperCube distributions). ...