Conference Paper

Relaxing Join and Selection Queries.

Conference: Proceedings of the 32nd International Conference on Very Large Data Bases, Seoul, Korea, September 12-15, 2006
Source: DBLP
  • Source
    ABSTRACT: One useful feature that is missing from today's database systems is an explain capability that enables users to seek clarifications on unexpected query results. There are two types of unexpected query results that are of interest: the presence of unexpected tuples, and the absence of expected tuples (i.e., missing tuples). Clearly, it would be very helpful to users if they could pose follow-up why and why-not questions to seek clarifications on, respectively, unexpected and expected (but missing) tuples in query results. While the why questions can be addressed by applying established data provenance techniques, the problem of explaining the why-not questions has received very little attention. There are currently two explanation models proposed for why-not questions. The first model explains a missing tuple t in terms of modifications to the database such that t appears in the query result with respect to the modified database. The second model explains by identifying the data manipulation operator in the query evaluation plan that is responsible for excluding t from the result. In this paper, we propose a new paradigm for explaining a why-not question that is based on automatically generating a refined query whose result includes both the original query's result as well as the user-specified missing tuple(s). In contrast to the existing explanation models, our approach goes beyond merely identifying the "culprit" query operator responsible for the missing tuple(s) and is useful for applications where it is not appropriate to modify the database to obtain missing tuples.
    Proceedings of the ACM SIGMOD International Conference on Management of Data, SIGMOD 2010, Indianapolis, Indiana, USA, June 6-10, 2010; 01/2010
  • Source
    ABSTRACT: While the growing number of learning resources increases the choice for learners on how, what and when to learn, it also makes it more and more difficult to find the learning resources that best match the learners' preferences and needs. The same applies to learning systems that aim to adapt or recommend suitable courses and learning resources according to a learner's wishes and requirements. Improved representations for a learner's preferences as well as improved search capabilities that take these preferences into account alleviate these issues. In this paper, we propose an approach for selecting optimal learning resources based on preference-enabled queries. A preference-enabled query does not only allow for hard constraints (like 'return lectures about Mathematics') but also for soft constraints (such as 'I prefer a course on Monday, but Tuesday is also fine') and therefore allows for a more fine-grained representation of a learner's requirements, interests and wishes. We show how to exploit the representation of learner's wishes and interests with preferences and how to use preferences in order to find optimal learning resources. We present the personal preference search service (PPSS), which offers significantly enhanced search capabilities for learning resources by taking the learner's detailed preferences into account.
    IEEE Transactions on Learning Technologies 04/2008; · 0.76 Impact Factor
  • Source
    ABSTRACT: We focus on large graphs where nodes have attributes, such as a social network where the nodes are labelled with each person's job title. In such a setting, we want to find subgraphs that match a user query pattern. For example, a "star" query would be, "find a CEO who has strong interactions with a Manager, a Lawyer, and an Accountant, or another structure as close to that as possible". Similarly, a "loop" query could help spot a money laundering ring. Traditional SQL-based methods, as well as more recent graph indexing methods, will return no answer when an exact match does not exist. This is the first main feature of our method. It can find exact-, as well as near-matches, and it will present them to the user in our proposed "goodness" order. For example, our method tolerates indirect paths between, say, the "CEO" and the "Accountant" of the above sample query, when direct paths don't exist. Its second feature is scalability. In general, if the query has n_q nodes and the data graph has n nodes, the problem needs polynomial time complexity O(n^{n_q}), which is prohibitive. Our G-Ray ("Graph X-Ray") method finds high-quality subgraphs in time linear in the size of the data graph. Experimental results on the DBLP author-publication graph (with 356K nodes and 1.9M edges) illustrate both the effectiveness and scalability of our approach. The results agree with our intuition, and the speed is excellent. It takes 4 seconds on average for a 4-node query on the DBLP graph.
    Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Jose, California, USA, August 12-15, 2007; 01/2007
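
As a rough illustration of the query-refinement idea described in the first abstract above (SIGMOD 2010), the following Python sketch relaxes a single selection threshold just enough that a user-specified missing tuple appears, while the original result is preserved. The sample rows, the attribute names, and the functions select and refine_threshold are hypothetical and are not taken from any of the listed papers; a real system would also have to handle joins and multiple predicates.

```python
from typing import Dict, List

Row = Dict[str, float]


def select(rows: List[Row], attr: str, threshold: float) -> List[Row]:
    """Selection query: keep rows whose `attr` value is at most `threshold`."""
    return [r for r in rows if r[attr] <= threshold]


def refine_threshold(threshold: float, missing: List[Row], attr: str) -> float:
    """Return the smallest relaxed threshold that also admits every missing row.

    Because the threshold is only ever raised, every row of the original
    result still satisfies the refined predicate.
    """
    return max([threshold] + [r[attr] for r in missing])


# Hypothetical data: the user asks why row 3 is absent from "price <= 100".
rows = [
    {"id": 1, "price": 80.0},
    {"id": 2, "price": 120.0},
    {"id": 3, "price": 105.0},
]
original = select(rows, "price", 100.0)
missing = [rows[2]]

relaxed = refine_threshold(100.0, missing, "price")
refined = select(rows, "price", relaxed)

print("original ids:", [r["id"] for r in original])
print("refined threshold:", relaxed)
print("refined ids:", [r["id"] for r in refined])
```

The refined query here differs from the original only in its constant, which keeps the explanation easy to read: the missing tuple was excluded because its price exceeded the original bound.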

