Article

Debugging Data Exchange with Vagabond

PVLDB 08/2011; 4(12):1383-1386.
Source: DBLP

ABSTRACT

In this paper, we present Vagabond, a system that uses a novel holistic approach to help users understand and debug data exchange scenarios. Developing such a scenario is a complex and labor-intensive process where errors are ...

Cited in:
    • "For most data integration tasks, this kind of fixed-schema benchmarks are not suitable because users do need schemas of different sizes and characteristics to allow them to test the properties of different integration systems, to quantify their performance , to reason about incompleteness, and so on. iBench has already been successfully used to evaluate the performance of mapping rewriting algorithms for value invention in data exchange [4], for correlating independent schema mappings [1], and for evaluating a provenance-based mapping debugger [9]. We believe that iBench is a tool that can help the integration community to provide stronger empirical evaluations. "
    ABSTRACT: Integration systems are typically evaluated using a few real-world scenarios (e.g., bibliographical or biological datasets) or using synthetic scenarios (e.g., based on star-schemas or other patterns for schemas and constraints). Reusing such evaluations is a cumbersome task because their focus is usually limited to showcasing a specific feature of an approach. This makes it difficult to compare integration solutions, understand their generality, and understand their performance for different application scenarios. Based on this observation, we demonstrate some of the requirements for developing integration benchmarks. We argue that the major abstractions used for integration problems have converged in the last decade, which enables the application of robust empirical methods to integration problems (from schema evolution, to data exchange, to answering queries using views, and many more). Specifically, we demonstrate that schema mappings are the main abstraction that now drives most integration solutions and show how a metadata generator can be used to create more credible evaluations of the performance and scalability of data integration systems. We will use the demonstration to evangelize for more robust, shared empirical evaluations of data integration systems.
    Article · Aug 2015 · Proceedings of the VLDB Endowment