Laconic schema mappings: computing core universal solutions by means of SQL queries

Source: arXiv


We present a new method for computing core universal solutions in data exchange settings specified by source-to-target dependencies, by means of SQL queries. Unlike previously known algorithms, which are recursive in nature, our method can be implemented directly on top of any DBMS. It is based on the new notion of a laconic schema mapping: a schema mapping for which the canonical universal solution is the core universal solution. We give a procedure by which every schema mapping specified by FO s-t tgds can be turned into a laconic schema mapping specified by FO s-t tgds that may refer to a linear order on the domain of the source instance. We show that our results are optimal, in the sense that the linear order is necessary and the method cannot be extended to schema mappings involving target constraints.
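To illustrate the idea behind laconicity, consider a toy scenario (not from the paper, constructed here for illustration) with two s-t tgds: σ1: E(x,y) → F(x,y) and σ2: V(x) → ∃z F(x,z). Chasing σ2 naively invents a labelled null for every V-tuple, so a source with E = {(a,b)} and V = {a,c} yields the redundant fact F(a,N) alongside F(a,b), and the core must later prune it. A laconic rewriting adds a negative side condition to σ2 so that it fires only when no E-tuple already witnesses x; the canonical solution then coincides with the core and can be computed by a single non-recursive SQL query. The sketch below, using SQLite via Python, hedges the labelled nulls as string tags of the form `N_<x>`, which is a simplification of the Skolemization one would use in practice:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE E(x, y);              -- source relation
CREATE TABLE V(x);                 -- source relation
INSERT INTO E VALUES ('a', 'b');
INSERT INTO V VALUES ('a'), ('c');
""")

# Non-laconic mapping:
#   sigma1: E(x,y) -> F(x,y)
#   sigma2: V(x)   -> EXISTS z. F(x,z)
# Laconic rewriting of sigma2 (hypothetical rendition of the paper's idea):
# fire it only when no E-tuple already provides a witness for x.
rows = cur.execute("""
SELECT x, y FROM E
UNION
SELECT x, 'N_' || x                -- one labelled null per V-row that needs it
FROM V
WHERE NOT EXISTS (SELECT 1 FROM E WHERE E.x = V.x)
""").fetchall()

print(sorted(rows))                # [('a', 'b'), ('c', 'N_c')] -- already the core
```

Note that the query is a plain union of select-project-join-difference blocks, so it runs on any DBMS without recursion or post-processing, which is precisely the practical payoff the abstract claims.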



Available from: Phokion Kolaitis, Aug 25, 2014
  • Source
    • "Core identification has been shown to be a co-NP hard problem (Fagin et al, 2005) for certain mapping dependencies. Despite these complexity results, there have been successful developments of efficient techniques that given two schemas and a set of mapping dependencies between them, in the form of tuple generating dependencies, produce a set of transformation scripts, e.g., in XSLT or SQL, whose execution efficiently generates a core target instance (Mecca et al, 2009; ten Cate et al, 2009). Time performance is becoming particularly critical in ETL tools that typically deal with large volumes of data. "
    ABSTRACT: The increasing demand for matching and mapping tasks in modern integration scenarios has led to a plethora of tools for facilitating these tasks. While this abundance has made such tools available to a broader audience, it has also led to confusion regarding the exact nature, goals, core functionalities, expected features, and basic capabilities of these tools. Above all, it has made performance measurement and differentiation of these systems a difficult task. The need for comparison standards that allow the evaluation of these tools is becoming apparent. These standards are particularly important to mapping and matching system users, since they allow them to evaluate the relative merits of the systems and make the right business decisions. They are also important to mapping system developers, since they offer a way of comparing a system against competitors and motivating improvements and further development. Finally, they are important to researchers, as they illustrate the limitations of existing systems and trigger further research in the area. In this work, we provide a generic overview of the existing efforts on benchmarking schema matching and mapping tasks. We offer a comprehensive description of the problem, list the basic comparison criteria and techniques, and describe the main functionalities and characteristics of existing systems.
    Schema Matching and Mapping, 12/2010: pages 253-291;
  • Source
    Conference Paper: Core schema mappings
    ABSTRACT: Research has investigated mappings among data sources from two perspectives. On one side, there are studies of practical tools for schema mapping generation; these focus on algorithms to generate mappings based on visual specifications provided by users. On the other side, there is theoretical research on data exchange, which studies how to generate a solution - i.e., a target instance - given a set of mappings, usually specified as tuple generating dependencies. However, despite the fact that the core of a data exchange solution has been formally identified as an optimal solution, there are as yet no mapping systems that support core computation. In this paper we introduce several new algorithms that help bridge the gap between the practice of mapping generation and the theory of data exchange. We show how, given a mapping scenario, it is possible to generate an executable script that computes core solutions for the corresponding data exchange problem. The algorithms have been implemented and tested using common runtime engines, and they guarantee very good performance, orders of magnitude better than that of known algorithms that compute the core as a post-processing step.
    Proceedings of the ACM SIGMOD International Conference on Management of Data, SIGMOD 2009, Providence, Rhode Island, USA, June 29 - July 2, 2009; 01/2009
  • Source
    ABSTRACT: We introduce the +Spicy mapping system. The system is based on a number of novel algorithms that increase the quality and expressiveness of mappings. +Spicy integrates the computation of core solutions into the mapping generation process in a highly efficient way, based on a natural rewriting of the given mappings. This allows for an efficient implementation of core computations using common runtime languages like SQL or XQuery and guarantees very good performance, orders of magnitude better than that of previous algorithms. The rewriting algorithm can be applied both to mappings generated by the system and to pre-defined mappings provided as part of the input. To do this, the system was enriched with a set of expressive primitives, so that +Spicy is the first mapping system to bring together a sophisticated and expressive mapping generation algorithm with an efficient strategy for computing core solutions.
    Proceedings of the VLDB Endowment 08/2009; 2(2):1582-1585. DOI:10.14778/1687553.1687597