-
ABSTRACT: User-defined preferences allow personalized ranking of query results. A user provides a declarative specification of his/her preferences, and the system is expected to use that specification to give more prominence to preferred answers. We study constraint formalisms for expressing user preferences as base facts in a partial order. We consider a language that allows comparison and a limited form of arithmetic, and show that the transitive closure computation required to complete the partial order terminates. We consider various ways of composing partial orders from smaller pieces, and provide results on the size of the resulting transitive closures. Finally, we show how preference queries within our language can be supported by suitable index structures for efficient evaluation over large data sets. Our results provide guidance about when complex preferences can be efficiently evaluated, and when they cannot.
Data Engineering Workshop, 2007 IEEE 23rd International Conference on; 05/2007
-
ABSTRACT: The ability to compute top-k matches to XML queries is gaining importance due to the increasing number of large XML repositories. The efficiency of top-k query evaluation relies on using scores to prune irrelevant answers as early as possible in the evaluation process. In this context, evaluating the same query plan for all answers might be too rigid because, at any time in the evaluation, answers have gone through the same number and sequence of operations, which limits the speed at which scores grow. Therefore, adaptive query processing that permits different plans for different partial matches and maximizes the best scores is more appropriate. In this paper, we propose an architecture and adaptive algorithms for efficiently computing top-k matches to XML queries. Our techniques can be used to evaluate both exact and approximate matches where approximation is defined by relaxing XPath axes. In order to compute the scores of query answers, we extend the traditional tf*idf measure to account for document structure. We conduct extensive experiments on a variety of benchmark data and queries, and demonstrate the usefulness of the adaptive approach for computing top-k queries in XML.
Data Engineering, 2005. ICDE 2005. Proceedings. 21st International Conference on; 05/2005
-
ABSTRACT: Repositories of multimedia objects having multiple types of attributes (e.g., image, text) are becoming increasingly common. A query on these attributes will typically request not just a set of objects, as in the traditional relational query model (filtering), but also a grade of match associated with each object, which indicates how well the object matches the selection condition (ranking). Furthermore, unlike in the relational model, users may just want the k top-ranked objects for their selection queries for a relatively small k. In addition to the differences in the query model, another peculiarity of multimedia repositories is that they may allow access to the attributes of each object only through indexes. We investigate how to optimize the processing of top-k selection queries over multimedia repositories. The access characteristics of the repositories and the above query model lead to novel issues in query optimization. In particular, the choice of the indexes used to search the repository strongly influences the cost of processing the filtering condition. We define an execution space that is search-minimal, i.e., the set of indexes searched is minimal. Although the general problem of picking an optimal plan in the search-minimal execution space is NP-hard, we present an efficient algorithm that solves the problem optimally with respect to our cost model and execution space when the predicates in the query are independent. We also show that the problem of optimizing top-k selection queries can be viewed, in many cases, as that of evaluating more traditional selection conditions. Thus, both problems can be viewed together as an extended filtering problem to which techniques of query processing and optimization may be adapted.
IEEE Transactions on Knowledge and Data Engineering; 09/2004 · 1.66 Impact Factor
-
ABSTRACT: A query to a Web search engine usually consists of a list of keywords, to which the search engine responds with the best or "top" k pages for the query. This top-k query model is prevalent over multimedia collections in general, but also over plain relational data for certain applications. For example, consider a relation with information on available restaurants, including their location, price range for one diner, and overall food rating. A user who queries such a relation might simply specify the user's location and target price range, and expect in return the best 10 restaurants in terms of some combination of proximity to the user, closeness of match to the target price range, and overall food rating. Processing such top-k queries efficiently is challenging for a number of reasons. One such critical reason is that, in many Web applications, the relation attributes might not be available other than through external Web-accessible form interfaces, which we will have to query repeatedly for a potentially large set of candidate objects. In this paper, we study how to process top-k queries efficiently in this setting, where the attributes for which users specify target values might be handled by external, autonomous sources with a variety of access interfaces. We present several algorithms for processing such queries, and evaluate them thoroughly using both synthetic and real Web-accessible data.
Data Engineering, 2002. Proceedings. 18th International Conference on; 02/2002