ABSTRACT: Skyline queries are often used on data sets in multi-dimensional space for many decision-making applications. Traditionally, a point p is said to dominate another point q if p is no worse than q on every dimension and is better on at least one dimension. The skyline of a data set therefore consists of all points not dominated by any other point. To better cater to application requirements such as controlling the size of the skyline or handling data sets that are not well-structured, various works have proposed extending the definition of skyline based on variants of the dominance relationship. However, it is difficult to implement each of these variants separately in a system setting; effort should instead be made to provide a general framework over which these specific implementations can be easily materialized. In this paper, a generalized framework is proposed for this purpose. Our framework explicitly and carefully examines the various properties that should be preserved in a variant of the dominance relationship so that: (1) the original advantages of skyline are maintained while adaptivity to application semantics is also catered to, and (2) computational complexity is almost unaffected. We prove that traditional dominance is the only relationship satisfying all desirable properties, and we present some new dominance relationships to illustrate that other skyline variants always involve a tradeoff in relaxing some of the properties. We then develop generic algorithms that compute skyline variants subject to the constraint that certain properties are relaxed, and illustrate the use of our framework in computing skylines over data sets with missing values. Extensive experimental results are presented to evaluate the efficiency and effectiveness of our framework.
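As a minimal sketch of the traditional dominance definition stated in the abstract, the Python below checks whether one point dominates another and computes a skyline by naive pairwise comparison. It is not the paper's framework or any of its generic algorithms. The function names `dominates` and `skyline`, the convention that smaller values are better on every dimension, and the sample hotel data are all assumptions made for illustration.

```python
from typing import List, Sequence

Point = Sequence[float]


def dominates(p: Point, q: Point) -> bool:
    """Traditional dominance: p is no worse than q on every dimension
    and strictly better on at least one.

    Assumption for this sketch: smaller values are better on all dimensions.
    """
    if len(p) != len(q):
        raise ValueError("points must have the same number of dimensions")
    no_worse_everywhere = all(a <= b for a, b in zip(p, q))
    better_somewhere = any(a < b for a, b in zip(p, q))
    return no_worse_everywhere and better_somewhere


def skyline(points: List[Point]) -> List[Point]:
    """Naive O(n^2) skyline: keep every point that no other point dominates."""
    return [q for q in points if not any(dominates(p, q) for p in points)]


# Hypothetical data: (price, distance to beach), lower is better on both.
hotels = [(50, 8), (60, 3), (80, 2), (70, 5), (90, 9)]
print(skyline(hotels))  # prints [(50, 8), (60, 3), (80, 2)]
```

Here (70, 5) is dominated by (60, 3) and (90, 9) is dominated by (50, 8), so neither appears in the result. A point never dominates itself or an identical point, because the "better on at least one dimension" condition fails.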