Conference Paper

Preference Queries with SV-Semantics


Abstract

Personalization of database queries requires a semantically rich, easy-to-handle and flexible preference model. Building on preferences as strict partial orders, we provide a variety of intuitive base preference constructors for numerical and categorical data, including so-called d-parameters. As a novel semantic concept for complex preferences we introduce the notion of 'substitutable values' (SV-semantics), characterizing equally good values amongst indifferent values. Pareto and prioritized preference construction preserves strict partial orders, which instantly solves crucial well-known problems for preference queries. We point out a new semantics-guided way to cope with the infamous flooding effect of query engines. Contrary to a widespread belief, we give evidence that the result sizes of Pareto or skyline queries do not necessarily explode for multiple attributes. Moreover, we show that known laws from preference relational algebra remain valid under SV-semantics. Since most of these laws rely on transitivity, preservation of strict partial order is essential to algebraically optimize complex preference queries. Similarly, well-known efficient evaluation algorithms for the preference selection operator rely on transitivity. In a nutshell, preference constructors with SV-semantics enable an intuitive and powerful personalization of database queries and at the same time are the key to efficient preference query evaluation.
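The effect of SV-semantics on Pareto and prioritized composition can be illustrated with a small sketch. The composition rules below follow the commonly cited form (a tuple is worse if the other is better in one preference and better or substitutable in the other); they are an assumption about the paper's definitions, not a quotation from it. The `(price, color)` schema, the helper names, and the choice to treat all non-red colors as substitutable are hypothetical.

```python
from typing import Callable, Tuple

Row = Tuple[int, str]             # (price, color): a hypothetical schema
Rel = Callable[[Row, Row], bool]  # rel(x, y) -> bool


def pareto(lt1: Rel, sv1: Rel, lt2: Rel, sv2: Rel) -> Rel:
    """x <P y iff y is better in one preference and better or substitutable in the other."""
    def lt(x: Row, y: Row) -> bool:
        return (lt1(x, y) and (lt2(x, y) or sv2(x, y))) or \
               (lt2(x, y) and (lt1(x, y) or sv1(x, y)))
    return lt


def prioritized(lt1: Rel, sv1: Rel, lt2: Rel) -> Rel:
    """x <P y iff y is better in P1, or substitutable in P1 and better in P2."""
    def lt(x: Row, y: Row) -> bool:
        return lt1(x, y) or (sv1(x, y) and lt2(x, y))
    return lt


# Base preferences; lt(x, y) reads "y is preferred to x".
def price_lt(x: Row, y: Row) -> bool:   # lower price is better
    return y[0] < x[0]

def price_sv(x: Row, y: Row) -> bool:   # same price: substitutable
    return x[0] == y[0]

def color_lt(x: Row, y: Row) -> bool:   # red is better than any other color
    return x[1] != "red" and y[1] == "red"

def color_eq(x: Row, y: Row) -> bool:   # indifference as plain equality
    return x[1] == y[1]

def color_sv(x: Row, y: Row) -> bool:   # assumption: all non-red colors are equally good
    return (x[1] == "red") == (y[1] == "red")


pareto_eq = pareto(price_lt, price_sv, color_lt, color_eq)
pareto_sv = pareto(price_lt, price_sv, color_lt, color_sv)
price_then_color = prioritized(price_lt, price_sv, color_lt)

x, y = (100, "blue"), (90, "green")
print(pareto_eq(x, y))  # False: blue and green are merely "different", so x survives
print(pareto_sv(x, y))  # True: green substitutes for blue, so the cheaper y dominates x
```

With equality as the indifference relation both tuples stay in the result; with substitutable values the cheaper one dominates, which is the kind of result-size reduction the abstract alludes to.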


... The semantics of $f_1 <_P f_2$ is that $f_2$ is preferred to $f_1$; the semantics of $f_1 \cong_P f_2$ is that $f_2$ is equivalent (or substitutable) to $f_1$ [5]. ...
... CONTAIN|NEAR|COARSEST|FINEST where base preference constructors operate either on attributes, measures, or hierarchies. Adopting the SV-semantics allows for closing the set of composition operators on the set of preferences, thus obtaining an algebra [5]. ...
... As reported in [5], Pareto composition with SV-semantics preserves s.p.o.'s. Thus, the result of applying this composition operator starting from the base preference constructors defined in this section is still a preference according to Def. 4. Note that Pareto composition is commutative and associative. ...
Conference Paper
Multidimensional databases play a relevant role in statistical and scientific applications, as well as in business intelligence systems. Their users express complex OLAP queries, often returning huge volumes of facts, sometimes providing little or no information. Thus, expressing preferences could be highly valuable in this domain. The OLAP domain is representative of an unexplored class of preference queries, characterized by three peculiarities: preferences can be expressed on both numerical and categorical domains; they can also be expressed on the aggregation level of facts; the space on which preferences are expressed includes both elemental and aggregated facts. In this paper we propose a preference algebra for OLAP, that takes into account the three peculiarities above.
... On the other hand, in the database area several researchers have proposed approaches to extend standard database systems by preference handling towards more personalized and cooperative databases (see [5] for an overview). Here we will adopt the approach of preferences being modeled as strict partial orders, promoted by [11,12,14]. Within this powerful and very flexible model a variety of intuitive preference constructors are available, including constructors on finite categorical domains as well as on infinite numerical data types. ...
... Formally, a base preference constructor has one or more arguments, the first characterizing the attribute names A and the others the strict partial order $<_P$, referring to A. Now, we will provide a formal definition of some base preference constructors. Detailed information and more preference constructors are given in [12]. ...
... On the other side there is the BMO query model ("Best Matches Only") introduced in [11,12] for the preference model given by the database community. This BMO model also retrieves preferentially optimal outcomes where a match-making between wishes and reality has to be accomplished. ...
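The BMO idea mentioned above (return only the tuples that no other tuple beats) can be sketched as a naive maximal-element filter. This is a quadratic illustration, not the evaluation strategy of any cited system; the function names and the "around a target value" preference are hypothetical.

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")


def bmo(rows: List[T], lt: Callable[[T, T], bool]) -> List[T]:
    """Keep every row r for which no row s satisfies r <P s (s preferred to r)."""
    return [r for r in rows if not any(lt(r, s) for s in rows)]


def around(target: float) -> Callable[[float, float], bool]:
    """Hypothetical base preference: values closer to `target` are better."""
    def lt(x: float, y: float) -> bool:
        return abs(y - target) < abs(x - target)
    return lt


prices = [80, 120, 150, 95]
print(bmo(prices, around(100)))  # [95]: no exact match exists, the closest value is returned
```

The wish "price around 100" has no perfect match in the data, yet the query still returns the best available alternative instead of an empty result.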
Article
TCP-nets are a popular approach within the AI community for preference handling. Concurrently, within the database community lots of work has been done to integrate preference handling into database query languages like Preference SQL, building on a variety of preference constructors under a strict partial order semantics. Moreover, many sophisticated algebraic optimization techniques and preference query evaluation algorithms have been developed there. This paper attempts to build a bridge between these two disciplines. We show how to systematically transform TCP-net queries into database preference queries, introducing a special ceteris-paribus embedding preference constructor. Moreover, this transformation opens up options to enhance the expressiveness of TCP-nets. On the other hand, it forms a basis for future studies of advanced TCP-net optimization techniques that can benefit from known results for preference query optimization in database systems.
... Then, explicit ways to combine these single valuations into a common group preference are described followed by a depiction of numeric means to determine the quality of the corresponding group result in Section 3. Based on quality calculations, Section 4 presents heuristics designed to improve group quality through dynamic creation of subgroups. The laws of preference algebra defined by [8,9] as well as statistical data provided by the database serve as background knowledge for this optimization process. Finally, Section 5 sums up our preliminary results and points out areas of ongoing research. ...
... In order to express preferences on single attributes, the preference framework defined in [8,9] provides a base preference hierarchy for both numerical and non-numerical domains. The following preferences are presented exemplarily: All base preferences described in this paper can be considered as subconstructors of the SCORE preference. ...
... The algorithm distributes weight according to a 99 to 1 ratio as specified by λ for Prioritization nodes between the left and the right child node, deeming the left subtree as more important. No λ value can in general achieve the same behavior as a Prioritization [3,9]. However, by tweaking λ the cases in which the right subtree affects the result even with the left subtree value not being equal can be minimized which is the measure of precision in this case. ...
Conference Paper
Modern communication technologies, in particular social network services and mobile Internet applications, have generated a tremendous hype enabling new forms of human interaction patterns. Millions of users voluntarily give away their preference profiles and want to share common interests in groups. Thus, the proper and efficient management of group preferences becomes a challenging area. Since any approach towards its solution should be scalable to even very large user groups, we propose a database-centered framework in this paper. We extend a well-known constructor-driven approach for modeling preferences as strict partial orders towards modeling and optimization capabilities for dynamic group formation. Since the quality of group formations can be judged by semantic aspects such as group homogeneity, group optimization techniques must be capable of properly taking this aspect into consideration. In more detail we describe how to model the quality of group formations depending on individual group member preferences. Then we propose several optimization heuristics for subgroup formation, based on laws of preference algebra and also on statistical knowledge. In summary we are confident that the preliminary results presented here can be extended towards intuitive, robust and scalable algorithms for the dynamic management of even very large user groups with sophisticated preference profiles.
... Their work is orthogonal to the approach in this chapter, which focuses on which combination operator to choose. Kießling (2005) introduces three preference combination operators: ...
... In this thesis we consider quantitative preferences and preference combinations using score combination functions, similar to the rank F combination from Kießling (2005). There are three motivations why we chose scores rather than a partial ordering. ...
... Example 4.2.1. As an example of how to use scores to specify the weight of a set of preferences, consider the following preferences relating to an order that Marge wants to place at Homer, as taken from Kießling (2005): (1) As these preferences are shown as a motivation for the qualitative approach, from Kießling, it is not possible to represent everything using score functions. For example, preferences (1) and (2) ...
Article
In this thesis, we show how context data, taking into account its specific characteristics, can be used to provide proactive answers and rank existing query answers based on their relevance to the users, via a user representation.
... Pareto and Prioritized compositions are two popular ways to combine preference relations [11], [16], which naturally implement the ⊕ and ⊛ operators, respectively. In order to preserve the strict partial order property and, at the same time, to allow for an increased flexibility in combining preferences, a refinement of the indifference relation ∼ associated to a preference relation ≻ is needed [17]. ...
... Both Prioritized and Pareto composition preserve strict partial orders [17]. Concerning equivalence of tuples, let $\succ_{1,2}$ stand for either $(\succ_1 \circledast_{Pr} \succ_2)$ or $(\succ_1 \oplus_{Pa} \succ_2)$. ...
... Concerning equivalence of tuples, let $\succ_{1,2}$ stand for either $(\succ_1 \circledast_{Pr} \succ_2)$ or $(\succ_1 \oplus_{Pa} \succ_2)$. Then $t_1 \approx_{1,2} t_2$ if and only if $t_1 \approx_1 t_2$ and $t_1 \approx_2 t_2$ both hold [17]. It is known that $\oplus_{Pa}$ is commutative and associative and that $\circledast_{Pr}$ is associative [17] (but obviously not commutative). ...
Article
User preferences are a fundamental ingredient in the deployment of personalized database applications, in particular those in which context plays a key role. Given a set of preferences defined in different contexts, in this paper we address the problem of inferring which preferences should be used for answering queries in a given context. For the sake of generality, we work with an abstract context model and two uninterpreted algebraic operators for combining preferences. In this framework, we study how preferences should be combined according to a set of basic propagation principles, among which is the one stating that if a conflict arises, then the more specific context prevails. We investigate the general properties of the operators, define canonical ways to build expressions respecting the propagation principles, and identify syntactical conditions that guarantee the equivalence of all the expressions that are well-formed: these results hold for any interpretation of the operators. Then we consider a specific interpretation, which corresponds to the well-known Pareto and Prioritized composition rules. We study three alternative semantics for this scenario and provide precise containment relationships.
... These arise from a formalisation of the user's preferences in the form of partial strict-orders. A realisation of this idea is presented by the language Preference SQL [11] and its current implementation [12]. For example, a consumer wants to buy a new car. ...
... Strict partial orders satisfying negative transitivity are sometimes called strict weak orders. In the scope of preferences, i.e. in [11] they are called weak order preferences. In this paper we will only use the term layered preferences. ...
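Negative transitivity, the property that separates weak order (layered) preferences from general strict partial orders, can be checked by brute force on a finite domain. The sketch below uses the contrapositive form (x < z implies x < y or y < z for every y); the function names and sample domains are hypothetical.

```python
from itertools import product
from typing import Callable, Iterable, TypeVar

T = TypeVar("T")


def is_negatively_transitive(domain: Iterable[T], lt: Callable[[T, T], bool]) -> bool:
    """True iff for all x, y, z: not (x < y) and not (y < z) implies not (x < z)."""
    items = list(domain)
    return all(lt(x, y) or lt(y, z) or not lt(x, z)
               for x, y, z in product(items, repeat=3))


def lowest(x: int, y: int) -> bool:
    """Single-attribute preference: smaller values are better (x < y iff y is smaller)."""
    return y < x


def pareto_min(x: tuple, y: tuple) -> bool:
    """Two-attribute Pareto order with equality as indifference: y is nowhere worse and differs."""
    return all(b <= a for a, b in zip(x, y)) and x != y


print(is_negatively_transitive([1, 2, 3], lowest))                      # True
print(is_negatively_transitive([(2, 3), (1, 4), (2, 1)], pareto_min))   # False
```

In the second call, (1, 4) is incomparable to both (2, 3) and (2, 1), while (2, 1) is better than (2, 3); this violates negative transitivity, so the Pareto order here is a strict partial order but not a weak order.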
... b, it is quite intuitive to say that $t_2$ is better than $t_1$ (and not incomparable with $t_1$) in a & b. Formally this is reflected by the SV-semantics of [11]. "SV" stands for "substitutable values" and means that a comparison between two tuples $t_1$, $t_2$ with respect to a remains unchanged if $t_1$ is substituted for an SV-related $t_1$. ...
Article
Preferences allow more flexible and personalised queries in database systems. Evaluation of such a query means to select the maximal elements from the respective database w.r.t. the preference, which is a partial strict-order. We present a point-free calculus of such preferences and exemplify its use in proving algebraic laws about preferences that can be used in query optimisation. We show that this calculus can be mechanised using off-the-shelf automated first-order theorem provers.
... Preferences in Databases [1] as shown by a recent survey [2] is a well-established framework to create personalized information systems. By using well designed preference models, users can be provided with just the information they require, thereby overcoming the dreaded empty result set and flooding effect [3]. ...
... Preference modeling has been in focus for some time, leading to diverse approaches [1,3,10]. We follow the preference model from [1], which directly maps preferences to relational algebra and declarative query languages. It is semantically rich, easy to handle and very flexible to represent user preferences which are ubiquitous in our life. ...
Article
In this paper we present a framework for a novel kind of context-aware preference query composition whereby queries for the Preference SQL system are created. We choose a commercial e-business platform for outdoor activities as a use case and develop a context model for this domain within our framework. The suggested model considers explicit user input, domain-specific knowledge, contextual knowledge and location-based sensor data in a comprehensive approach. Aside from the theoretical background of preferences, the optimization of preference queries and our novel generator based model we give special attention to the aspects of the implementation and the practical experiences. We provide a sketch of the implementation and summarize our user studies which have been done in a joint project with an industrial partner. © 2012. The Korean Institute of Information Scientists and Engineers.
... An important subclass of preferences are weak order preferences (WOP, (Kießling 2005)), i.e., strict partial orders for which negative transitivity holds: ...
... For WOPs two domain values x and y having the same function value are considered as substitutable and are treated as one equivalence class (regular SV-semantics, cf. (Kießling 2005)). The utility function has to be defined individually for every type of base preference. ...
... The utility function has to be defined individually for every type of base preference. For numerical base preferences the utility function is interpreted as the numerical distance from a perfect value, e.g., $f_{\mathrm{LOWEST}(A)}(x) := x - \min$, where $\min$ is the minimum value of the domain of A (Kießling 2005). ...
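A weak order preference driven by such a utility function can be sketched as follows: a value is better when its distance from the perfect value is smaller, and two values are substitutable when their distances coincide. The `LOWEST` utility follows the formula quoted above; the `around` utility and all function names are assumptions added for illustration.

```python
from typing import Callable, Tuple

Utility = Callable[[float], float]
Rel = Callable[[float, float], bool]


def weak_order(f: Utility) -> Tuple[Rel, Rel]:
    """Derive (lt, sv) from a distance-style utility: smaller f means better."""
    def lt(x: float, y: float) -> bool:   # x <P y: y is preferred to x
        return f(y) < f(x)

    def sv(x: float, y: float) -> bool:   # same function value: one equivalence class
        return f(x) == f(y)

    return lt, sv


def lowest(domain_min: float) -> Utility:
    """f_LOWEST(x) := x - min, as in the quoted definition."""
    return lambda x: x - domain_min


def around(target: float) -> Utility:
    """Hypothetical AROUND-style utility: absolute distance from a target value."""
    return lambda x: abs(x - target)


lt_low, sv_low = weak_order(lowest(10))
lt_ar, sv_ar = weak_order(around(100))

print(lt_low(30, 20))   # True: 20 is closer to the minimum than 30
print(sv_ar(95, 105))   # True: different values, same distance, hence substitutable
print(lt_ar(120, 95))   # True: 95 is closer to 100 than 120
```

The second call shows why substitutability is weaker than equality: 95 and 105 are distinct values that fall into the same equivalence class.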
Article
Skyline evaluation techniques (also known as Pareto preference queries) follow a common paradigm that eliminates data elements by finding other elements in a data set that dominate them. Nowadays a variety of sophisticated skyline evaluation techniques are already known, hence skylines are considered a well-researched area. On the other hand, the skyline operator does not stand alone in database queries. In particular, the skyline operator may commute with the selection operator which may express hard constraints on the skyline. In this paper, we address skyline queries that satisfy some hard constraints, so-called constrained skyline queries. We will present novel optimization techniques for such queries, which allow more efficient computation. For this, we propose semi-skylines which can be used effectively for algebraic optimizations of skyline queries having a mixture of hard constraints and soft preference conditions. All our efficiency claims are supported by a series of performance benchmarks.
... A tuple r 1 ∈ R dominates a tuple r 2 ∈ R if r 1 is better at least in one preference contained in the Pareto preference and at the same time r 1 is not worse than or incomparable to r 2 in all other preferences. While two values that are equally good with respect to a preference, but not actually equal, are classified as "incomparable" in common Pareto preferences, the SV-semantics introduced in [38] often render them as "substitutable". Pareto preference queries are also known as skyline queries ( [6]), although the latter form just a specialization of Pareto preferences. ...
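The dominance test described above can be plugged into a window-based evaluation loop in the spirit of block-nested-loop skyline algorithms. The sketch below is a simplified in-memory variant (no blocks or temporary files) for numeric attributes where smaller is better, with equality as the indifference relation; it relies on the transitivity of the dominance relation, and all names are hypothetical.

```python
from typing import Callable, List, Tuple

Row = Tuple[int, ...]


def dominates(a: Row, b: Row) -> bool:
    """a dominates b: a is nowhere worse (smaller is better) and differs from b."""
    return all(x <= y for x, y in zip(a, b)) and a != b


def skyline(rows: List[Row], dom: Callable[[Row, Row], bool] = dominates) -> List[Row]:
    """Keep a window of currently undominated rows while scanning the input once."""
    window: List[Row] = []
    for r in rows:
        if any(dom(w, r) for w in window):
            continue                                      # r is dominated: discard it
        window = [w for w in window if not dom(r, w)]     # r evicts rows it dominates
        window.append(r)
    return window


rows = [(1, 4), (2, 3), (2, 5), (3, 1), (4, 4)]
print(skyline(rows))  # [(1, 4), (2, 3), (3, 1)]
```

Discarding a dominated row without comparing it to later rows is only safe because dominance is transitive, which is why preserving strict partial orders matters for this family of algorithms.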
... This book will describe different ways to improve the performance of Pareto preference evaluation. Before we come to this, existing algorithms and other related work will be discussed in chapter 2. Then Kießling's preference model ([37] and [38]) will be introduced in chapter 3. We will use this model throughout the whole paper. ...
... There is the need for a sophisticated but intuitive, a simple but comprehensive way to express preferences. Kießling's preference model introduced in [37,38] has these features on top of a well-founded theoretical base. Needless to say, different preference models have been introduced in the last ten years, but most of them lack some of the mentioned key features. ...
Thesis
Searching a database is one of the most common procedures in everyday life. Usually, the results of such a search match the query parameters perfectly. But if no perfect match is found, the user usually has to find out by himself how to change search parameters in order to get results. To overcome this problem, Kießling has introduced a model of preferences in databases. This model is based on simple strict partial orders as given in expressions like "red is better than blue". For every query, the best-matching objects are returned, whether these are perfect matches or not. A best match is a tuple that matches the preference not worse than any other tuple – or as we say – that is not dominated by any other tuple. The specific problem we address is finding best matches for Pareto preferences, the combination of preferences with all of them being equally important. This problem is closely related to skyline queries. Based on the better-than graph, a visualization of the strict partial orders constructed by Pareto preferences, we have found a novel type of optimization called pruning that can be applied to all existing generic algorithms. While common generic algorithms rely on tuple-to-tuple comparisons to identify dominated tuples, our optimization technique uses the structure of the better-than graph to identify elements in the order that are definitively dominated by some given tuple. This enables us to omit many comparisons. By further analysis of the better-than graph, we were able to find a new kind of algorithm. This generic algorithm, Hexagon, is capable of finding the best matches in some previously unknown set of tuples in linear time with respect to the size of the better-than graph. Apart from the standard algorithm, we present a number of optimizations for it regarding its memory requirements. But Hexagon is not limited to standard preference queries. We also address top-k queries with a variant of Hexagon. 
These queries return the best k tuples of an input relation with respect to some rating function. The performance benchmarks we have made show the superiority of algorithms using pruning and especially of Hexagon, although the latter cannot be used in all cases due to memory requirements. Moreover, Hexagon can be combined with existing algorithms that have been optimized by pruning to enable the cost-based algorithm selection for Pareto preference evaluation.
... An example of an approach where it is possible to define multiple preferences is presented in [Kie02] and [Kie05]. Preferences are expressed for attributes appearing in some relation schema. ...
... Another approach to expressing preferences among selection criteria is used in PREFERENCE SQL [Kie02,Kie05]. It enables defining positive (POS) or negative (NEG) sets of values for a given attribute and if these sets are not sufficient, explicit (EXPLICIT) preferences among a couple of values can be expressed. ...
... Some query languages enable defining preference hierarchies which specify partial orders between selection criteria. In PREFERENCE SQL [Kie02,Kie05] for example, this partial order is introduced with the help of two clauses: PREFERING and CASCADING. The first clause (PREFERING) fixes the initial choice of selection criteria. ...
Article
This thesis contains two parts. The first one is a study of the state of the art on data personalization and a proposition of a user profile model. The second one is a focus on a specific problem which is the query reformulation using profile knowledge. The goal of personalization is to facilitate the expression of the need for a particular user and to enable him to obtain relevant information when he accesses an information system. The relevance of the information is defined by a set of criteria and preferences specific to each user or community of users. These criteria describe the user's domain of interest, the quality level of the data he is looking for or the modalities of the presentation of this data. The data describing the users is often gathered in the form of profiles. In this thesis we propose a generic and extensible model of profile, which enables the classification of the profile's contents. Personalization may occur in each step of the query life cycle. The second contribution of this thesis is the study of two query reformulation approaches based on algorithms for query enrichment and query rewriting and the proposition of an advanced query reformulation approach. The three reformulation approaches are evaluated on a benchmark described in the thesis.
... Kießling in [Kie02]. That framework was extended in [Kie05] by relaxing definitions of the accumulation operators and using SV-relations, instead of equality, as indifference relations. [Kie05] shows that such an extension preserves the SPO properties of accumulated relations. Moreover, the resulting accumulated relations were shown to be larger (in the set theoretic sense) than the relations composed using the equality-based accumulational operators. ...
... Moreover, the resulting accumulated relations were shown to be larger (in the set theoretic sense) than the relations composed using the equality-based accumulational operators. However, the problems of relative importance of attributes induced by such preference relations were not addressed in [Kie02] and [Kie05]. The problems of containment of preference relations and minimal extensions were also not considered. ...
... Preferences and their integration with databases have been in focus for some time by now, leading to diverse approaches. We will follow Kießling's way ([8, 9]) and look at a preference as $P = (A, <_P)$, where A is a set of attributes and $<_P$ is a strict partial order on the domain of A. One implementation of this preference model is an extension of SQL, Preference SQL ([10]). Figure 1 shows the usage of user preferences in a database query, expressing some user preferences on a hotel after the keyword PREFERRING. It is a Pareto preference (AND) consisting of base preferences on the price (AROUND) and the hotel category (POS1), and a prioritization (PRIOR TO) stating that having a jacuzzi (POS2) is more important than a fitness room (POS3). ...
... The preference model of [8, 9] proposes to use intuitive preference constructors for preference definition. Before we have a look onto some of them, let's state an important subclass of strict partial orders. ...
... The sample query in figure 1 shows three POS preferences with singleton POS-sets for hotel category and flags indicating the existence of a jacuzzi and a fitness room. The generalization of the POS preference is the LAYERED preference introduced in [9]. It enables users to specify any number of different sets. ...
Article
Database queries expressing user preferences have been found to be crucial for personalized applications. Such preference queries, in particular Pareto preference queries, pose new optimization challenges for efficient evaluation. So far however, all known generic Pareto evaluation algorithms suffer from non-linear worst case runtimes. Here we present the first generic algorithm, called Hexagon, with linear worst case complexity for any data distribution under certain reasonable assumptions. In addition, our performance investigations provide evidence that Hexagon also beats competing Block-Nested-Loop style algorithms in the average case. Therefore Hexagon has the potential to become one key algorithm in each preference query optimizer's repertoire.
... In the following we will formalize the possible definitions of equivalence relations $eq_i$ and will show which definition is optimal in decreasing skyline sizes while always maintaining transitivity of the combined preference. There are already several instantiations of $eq_i$ in literature: a) Pareto Accumulation [15]: $(o_1, o_2) \in eq_i$ iff $att_i(o_1) =_{H_j} att_i(o_2)$, i.e. the attribute values have to be identical; b) (Maximal) Substitute Value (SV) Semantics [16]: $(o_1, o_2) \in eq_i$ iff $att_i(o_1)$ and $att_i(o_2)$ share the same sets of parents and descendants with respect to $H_j$ (this includes the case $att_i(o_1) =_{H_j} att_i(o_2)$); c) Pareto Composition [10]: $(o_1, o_2) \in eq_i$ iff $att_i(o_1)$ and $att_i(o_2)$ are incomparable with respect to $H_i$ (this also includes the case $att_i(o_1) =_{H_i} att_i(o_2)$, since then neither $att_i(o_1) <_{H_i} att_i(o_2)$ nor $att_i(o_2) <_{H_i} att_i(o_1)$ holds; note that in general $eq_i$ is not an equivalence relation). ...
... Obviously the definitions of equivalence get weaker from a) to b) and from b) to c). Also in case b) the aggregated preference is always transitive [16]. But of course there are many more possible definitions of equivalence relations between a) and c) (as we will discuss in section 4). ...
Conference Paper
Unlike numerical preferences, preferences on attribute values do not show an inherent total order, but skyline computation has to rely on partial orderings explicitly stated by the user. In such orders many object values are incomparable, hence skyline sizes become impractical. However, the Pareto semantics can be modified to benefit from indifferences: skyline result sizes can be essentially reduced by allowing the user to declare some incomparable values as equally desirable. A major problem of adding such equivalences is that they may result in intransitivity of the aggregated Pareto order and thus efficient query processing is hampered. In this paper we analyze how far the strict Pareto semantics can be relaxed while always retaining transitivity of the induced Pareto aggregation. Extensive practical tests show that skyline sizes can indeed be reduced by about two orders of magnitude when using the maximum possible relaxation still guaranteeing the consistency with all user preferences.
... The more general preference queries allow a more granular specification of user wishes as well as the specification of the relative importance of these preferences [2,12,3]. Thereby we follow the preference model of [2,12]. ...
... There are several other complex preference constructors which are essential in rule based preference query optimization, cp. [2,12]. ...
Article
Skyline query processing and the more general preference queries become reality in current database systems. Preference queries select those tuples from a database that are optimal with respect to a set of designated preference attributes. In a Skyline query these preferences only refer to minimum and maximum, whereas the more general approach of preference queries allows a more granular specification of user wishes as well as the specification of the relative importance of individual preferences. The incorporation of preferences into practical relational database engines necessitates an efficient and effective selectivity estimation module: a better understanding of the preference selectivity is useful for better design of algorithms and necessary to extend a database query optimizer's cost model to accommodate preference queries. This paper presents a survey on selectivity and cardinality estimation for arbitrary preference queries. The paper presents current approaches and discusses their advantages and disadvantages, such that one could decide which model should be used in a database engine to estimate optimization costs.
... Preference modeling has been in focus for some time, leading to diverse approaches, e.g. [3,10,11]. We follow the preference model from [11] which is a direct mapping to relational algebra and declarative query languages, e.g., Preference SQL which is discussed in Section 5. It is semantically rich, easy to handle and very flexible to represent user preferences which are ubiquitous in our life. ...
... Subsequently, we present some selected preference constructors used in this paper. More preference constructors as well as their formal definition can be found in [10,11,12]. ...
Conference Paper
In the last decade there has been much interest in preference query processing for various applications like personalized information or decision making systems. Preference queries aim to find only those objects that are most preferred by the user. However, the underlying data set may contain NULL values which represent unknown or incomplete data. Most of the existing algorithms for preference query evaluation do not know how to treat these NULL values and consider them worse than any other value. Other algorithms do not allow NULLs in their input data set. However, NULL values are common in data sets and must be considered in preference query evaluation. In this paper we introduce an approach to handle NULL values in preference queries which extends preference algebra, a formal model for preference specification. Our approach can be adopted by all preference query algorithms which rely on strict partial orders, because it does not violate the transitivity relation as other methods do.
... The pioneering work of introducing this concept to the area of databases was done in [BKS01] using the Skyline operator. This name stems from the fact that the Pareto front line in the diagram consists of those points which are visible when viewed from the hypothetical optimum, i.e., the point (0, 0) in Figure 1.1. Generalizations to database preferences with an extended set of operators have been introduced in [Kie02, Kie05, Cho03] and many following papers, many building on those three papers. In these works, additional constructs are introduced into the preference framework. ...
... • In Chapter 2 we model data sets and preferences as typed sets and relations following [MRE12,MR15]. We introduce the preference framework from [Kie02,Kie05] and adapt it to our formalism. This chapter defines a concrete relational model of database preferences. ...
... Primarily we introduce the general notion of a preference in a notational fashion closely adapted to the [Kie02] and SV semantics introduced in [Kie05]. We use the typing mechanism presented in the previous section to formally connect a preference relation with its domain. ...
Thesis
Relational methods in computer science have been studied intensively in the last decades, especially for program verification and correctness. In the present thesis we apply them to database preferences, which are a generalization of Skyline queries. This topic is connected to relations and algebra in various respects. The formal basis of databases is given by the relational data model, and preferences are strict order relations on tuples from a given data set. Moreover, preference operators and operands form an algebraic structure by themselves, which led to the research field of algebraic optimization of database preference queries. In this work we develop a coherent family of calculi for dealing with database preferences. Wherever possible, we use algebraic structures such as semirings, abstract relation algebras, and related concepts. The relational algebraic approach allows us to reason about many aspects of Skyline computation and preference term equivalences in a point-free way. We generalize and unify existing theorems in the scope of database preferences and simplify some of their proofs by means of algebraic structures. Next to this, we introduce the new field of preference decomposition. A subgoal of this is the characterization of the expressiveness of preference queries, i.e., the classification of orders constructed by a given class of preference terms. The results of this thesis have various applications regarding the correctness, soundness and efficiency of database preference implementations. In addition to our theoretical contributions we implemented the rPref package (available at CRAN) for handling preferences within the statistical computing software R. There is a tight connection between our calculi and the query language in that package. This allowed us to implement the algorithms and examples from the theoretical parts of this thesis, demonstrating the applicability of our results.
... A refinement of the indifference relation ∼ associated to a preference relation allows some indifferent tuples to be also equivalent, which, as we will see, is a key property for the composition of preference relations [9]. ...
... Actually, both Prioritized and Pareto composition preserve strict partial orders [9], whereas this is not guaranteed by replacing in their definition ≈ with ∼ [6]. It is known that ⊕ is commutative and associative and that Prioritized composition is associative [9] (but obviously not commutative). It is also evident that both operators are idempotent (that is, composing a preference with itself under either operator yields that same preference) and have ∅ as the identity. ...
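The role of the equivalence ≈ in the two composition operators can be made concrete with a small sketch. It follows the commonly used definitions of Pareto and Prioritized composition, with attribute equality generalized to a caller-supplied equivalence; it is an illustration under these assumptions, not the cited papers' exact formalization. The convention `less(x, y)` (meaning y is preferred to x), the `RANK` table and the sample tuples are hypothetical.

```python
from typing import Any, Callable

Rel = Callable[[Any, Any], bool]  # less(x, y): y is strictly preferred to x


def pareto(less1: Rel, eq1: Rel, less2: Rel, eq2: Rel) -> Rel:
    """Pareto composition on pairs: better on one attribute and better or
    equivalent on the other."""
    def less(x, y):
        a, b = less1(x[0], y[0]), less2(x[1], y[1])
        return (a and (b or eq2(x[1], y[1]))) or (b and (a or eq1(x[0], y[0])))
    return less


def prioritized(less1: Rel, eq1: Rel, less2: Rel) -> Rel:
    """Prioritized composition: the first preference decides; the second is
    consulted only for equivalent first-attribute values."""
    def less(x, y):
        return less1(x[0], y[0]) or (eq1(x[0], y[0]) and less2(x[1], y[1]))
    return less


# Hypothetical base preferences: red and blue are equally good, green is worse.
RANK = {"red": 0, "blue": 0, "green": 1}
def color_less(x, y): return RANK[x] > RANK[y]
def color_sv(x, y): return RANK[x] == RANK[y]   # substitutable values
def identical(x, y): return x == y              # plain equality
def price_less(x, y): return x > y              # lower price is better

sv_pareto = pareto(color_less, color_sv, price_less, identical)
plain_pareto = pareto(color_less, identical, price_less, identical)
sv_prio = prioritized(color_less, color_sv, price_less)

x, y = ("red", 100), ("blue", 80)
print(sv_pareto(x, y), plain_pareto(x, y))
```

With substitutable colors the cheaper blue item beats the red one, whereas under plain equality the two remain incomparable and both would be returned. This is one way to see why such semantics can shrink result sets.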
Conference Paper
User preferences are a fundamental ingredient of personalized database applications, in particular those in which the user context plays a key role. Given a set of preferences defined in different contexts, in this paper we study the problem of deriving the preferences that hold in one of them, that is, how preferences propagate through contexts. For the sake of generality, we work with an abstract context model, which only requires that the contexts form a poset. We first formalize the basic properties of the propagation process: specificity, stating that more specific contexts prevail on less specific ones, and fairness, stating that this behavior does not hold for incomparable contexts. We then introduce an algebraic model for preference propagation that relies on two well-known operators for combining preferences: Pareto and Prioritized composition. We study three alternative propagation methods and precisely characterize them in terms of the fairness and specificity properties.
... The PreferenceSQL query language supports "LAYERED" preferences (cf. [Kie05]) for situations where a domain dom(A) of an attribute can be partitioned into subsets that are ordered according to a "better than" relation. In our approach, all attribute values that have the same utility are grouped together in the same "layer", leading to a straight-forward application of the LAYERED preference constructor. ...
... Since the semantics of our approach relies on the notion that the customer is generally indifferent about attribute values with the same utility (i.e., all values with the same utility are mutually substitutable), we annotate each LAYERED preference with the additional "REGULAR" keyword (cf. section 4 in [Kie05]). These preferences are combined using the Pareto "AND" operator to form the complete PreferenceSQL query. ...
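A layered preference of the kind described in this excerpt can be sketched in a few lines. This is not PreferenceSQL syntax; the function name `layered`, the assumption that layers are listed best first, and the sample values are all hypothetical.

```python
from typing import Callable, Hashable, List, Set, Tuple


def layered(layers: List[Set[Hashable]]) -> Tuple[Callable, Callable]:
    """Build a layered preference from a partition of the domain.
    Assumption: layers[0] is the best layer, layers[-1] the worst."""
    level = {value: i for i, layer in enumerate(layers) for value in layer}

    def less(x, y) -> bool:
        # y is preferred to x if y sits in a better (lower-index) layer.
        return level[x] > level[y]

    def substitutable(x, y) -> bool:
        # Values in the same layer are treated as equally good.
        return level[x] == level[y]

    return less, substitutable


# Hypothetical layers for a network-technology attribute.
less, substitutable = layered([{"5G"}, {"LTE", "4G"}, {"3G"}])
print(less("3G", "LTE"), substitutable("LTE", "4G"))
```

The `substitutable` relation is the piece that a composition operator would use in place of plain equality when several such preferences are combined.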
Thesis
In this thesis a new approach to building product recommender systems is introduced. By using a customer-centric dialogue, the customers' preferences are elicited. These are the basis for inferring utility estimations about the desired technical properties of the products in question. Systems built this way can both operate autonomously, e.g., in an online store, and support a salesperson directly at the point-of-sale. The core of the approach is formed by a layered domain description that models customer stereotypes and needs, product attributes, the products themselves, and the causal interrelations between customer and product properties. Maintenance of the domain description, i.e., keeping the model up-to-date in face of frequent changes, is facilitated by the clear separation of concerns provided by the layered structure. In fact, the most frequently used class of updates can be handled in an entirely automated way if some constraints are satisfied. On a high level of abstraction, the system behavior is described by State Charts that are parameterized according to the domain description. Those parts of the system description where State Charts would be too imprecise are implemented by separate components realizing the required complex semantics. From the domain description, a Bayesian network is generated that forms the core of the inference engine of the recommender system. The network essentially controls the system-initiated dialogue flow and the recommendation process. Due to the characteristics of Bayesian networks, it is possible to respond to user-initiated dialogue steps in a natural way. Moreover, an explanation of the current recommendation can be generated without having to explicitly encode additional information in the modeling layer. Finally, a database structure and the SQL queries necessary to obtain recommendations can be inferred from the corresponding parts of the domain description. 
Instantiation of the system to a specific business domain is supported by a dedicated maintenance application that hides the complexities of the underlying algorithms. Thus, day-to-day system updates by non-technical domain experts, e.g., product managers, are facilitated. The developed concepts were implemented in cooperation with a local industry partner who intends to apply the recommender system in the field of mobile communications.
... In the course of this paper we therefore want to introduce an augmented approach to tackle the problem of group preferences. The preference framework introduced in [3] and extended in [4] provides powerful methods to model single user preferences and to combine the corresponding preference terms to form a common group preference. Furthermore, preferences can be interpreted in both a numeric and even more importantly a semantic fashion. ...
... This generalization of past preferences has to be assessed critically since environmental aspects and moderator variables [9] that vary across situations are neglected and thus these functions not necessarily reflect the individual's intuitive preferences at the given moment. In contrast, [3,4] introduce a hierarchy of base preference constructors that provide a framework to express preferences on single attributes. This process can be seen as an explicit means for a group member to express opinions on multiple criteria. ...
Conference Paper
With the beginning of the new millennium, the concept of group interactions in communication systems was boosted by the emergence of Web 2.0 technologies. Based on this new area of application, the notion of group decisions and group preferences also evolved, leading to new requirements for corresponding modeling frameworks. Purely numeric approaches are barely able to meet these newly emerging challenges. Therefore, we provide a comprehensive group preference framework to overcome the deficits of previous solutions and demonstrate possible applications in social network services. The concept provides both numeric and semantic means which can be applied to determine group preferences and to perform further evaluations based on the semantic value of preference terms. With Preference SQL, a powerful system exists to implement the presented group preference model using standard commercial databases.
... some other attributes being equal ). An interesting direction of the future work is to extend our semantics using the some other attributes being equivalent principle where instead of equality attribute values are considered to be equivalent according to some indifference relation [11]. One should also consider applying the existing preference modification and construction techniques [5, 10] to hierarchical CP-networks. ...
Article
We present here a variant of acyclic CP-networks. It allows not only finite but also infinite domain attributes. It also has the property that a preference over each attribute in the network has higher priority than all the descendants' preferences. We provide an algorithm for constructing a preference formula representing the order induced by a hierarchical CP-network, thus making it possible to work with hierarchical CP-networks in the database context. We also provide a complexity analysis of the size of the preference formula constructed by the algorithm.
... As reported in [24], Pareto composition and prioritization with substitutability semantics preserve s.p.o.'s. Thus, the result of applying these composition operators starting from the base constructors defined in this section is still a preference according to Definition 5. Note that, while Pareto is commutative and associative, prioritization is associative but not commutative. ...
Article
Multidimensional databases are the core of business intelligence systems. Their users express complex OLAP queries, often returning large volumes of facts, sometimes providing little or no information. Thus, expressing preferences could be highly valuable in this domain. The OLAP domain is representative of an unexplored class of preference queries, characterized by three peculiarities: preferences can be expressed on both numerical and categorical domains; they can also be expressed on the aggregation level of facts; the space on which preferences are expressed includes both elemental and aggregated facts. In this paper, we present myOLAP, an approach for expressing and evaluating OLAP preferences, devised by taking into account the three peculiarities above. We first propose a preference algebra where users are enabled to express their preferences, besides on attributes and measures, also on the aggregation level of facts, for instance, by stating that monthly data are preferred to yearly and daily data. Then, with respect to preference evaluation, we propose an algorithm called WeSt that relies on a novel graph representation where two types of domination between sets of facts may be expressed, which considerably improves efficiency. The approach is extensively tested for efficiency and effectiveness on real data, and compared against two other approaches in the literature.
... Preferences are usually considered as expressions which enable to order elements of the information [5,6,36,20] and most often, user profile is reduced to user preferences. Then, a user profile is a set of descriptors including which a user envisage to fulfil in the system, how to do it, what is the type and the order of the obtained results and how they can be displayed. ...
Article
Data warehousing is an essential element of decision support systems. It aims at enabling the user to make better and faster daily business decisions. To improve this decision support system and to give more relevant information to the user, the need to integrate users' profiles into the data warehouse process becomes crucial. In this paper, we propose to exploit users' preferences as a basis for adapting OLAP (On-Line Analytical Processing) queries to the user. For this, we present a user profile-driven data warehouse approach that allows defining a user profile composed of the user's identifier and a set of his/her preferences. Our approach is based on a general data warehouse architecture and an adaptive OLAP analysis system. Our main idea consists in creating a data warehouse materialized view for each user with respect to his/her profile. This task is performed off-line when the user defines his/her profile for the first time. Then, when a user query is submitted to the data warehouse, the system deals with his/her data warehouse materialized view instead of the whole data warehouse. In other words, the data warehouse view summarizes the data warehouse content for the user by taking into account his/her preferences. Moreover, we are implementing our data warehouse personalization approach under the SQL Server 2005 DBMS (DataBase Management System).
... These techniques are based on the winnow operator, an algebraic operator that picks from a given relation the set of the most preferred outcomes, according to a given preference formula. In [6], another methodology for combining complex preferences is proposed that is based on the SV-semantics, that is, a semantics characterizing equally good values among the indifferent ones. ...
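The winnow operator mentioned in this excerpt can be sketched as a filter that keeps every tuple for which no better tuple exists in the relation. The quadratic scan, the convention `less(x, y)` (y is preferred to x), and the sample preference formula over `cars` are assumptions made for illustration only.

```python
from typing import Callable, List, Tuple

Car = Tuple[str, int]  # hypothetical schema: (color, model year)


def winnow(relation: List[Car], less: Callable[[Car, Car], bool]) -> List[Car]:
    """Return the tuples of `relation` that are not worse than any other
    tuple in it, i.e. the most preferred ones under `less`."""
    return [t for t in relation if not any(less(t, other) for other in relation)]


def less(x: Car, y: Car) -> bool:
    """Hypothetical preference formula: newer cars are better; among cars of
    the same year, blue ones are preferred to all other colors."""
    return y[1] > x[1] or (y[1] == x[1] and y[0] == "blue" and x[0] != "blue")


cars = [("red", 2019), ("blue", 2021), ("green", 2021), ("blue", 2018)]
best = winnow(cars, less)
print(best)
```

Because the result is never empty for a non-empty finite relation under a strict partial order, the same filter also illustrates how soft constraints avoid empty answers.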
Article
We consider how to combine the preferences of multiple agents despite the presence of incompleteness and incomparability in their preference orderings. An agent's preference ordering may be incomplete because, for example, there is an ongoing preference elicitation process. It may also contain incomparability, which can be useful, for example, in multi-criteria scenarios. We focus on the problem of computing the possible and necessary winners, that is, those outcomes which can be or always are the most preferred for the agents. Possible and necessary winners are useful in many scenarios. For example, preference elicitation need only focus on the unknown relations between possible winners and can ignore completely all other outcomes. Whilst computing the sets of possible and necessary winners is in general a difficult problem, we identify sufficient conditions where we can obtain the necessary winners and an upper approximation of the set of possible winners in polynomial time. Such conditions concern either the language for stating preferences, or general properties of the preference aggregation function.
... Kadlag et al. [16] presented a query-relaxation algorithm that, given a user's initial range query and a desired cardinality for the answer set, produces a relaxed query that is expected to contain the required number of answers based on multi-dimensional histograms for query-size estimation. Finally, our work is also related to the work on preference queries [18,7,9,17] ...
Conference Paper
Database users can be frustrated by having an empty answer to a query. In this paper, we propose a framework to systematically relax queries involving joins and selections. When considering relaxing a query condition, intuitively one seeks the 'minimal' amount of relaxation that yields an answer. We first characterize the types of answers that we return to relaxed queries. We then propose a lattice based framework in order to aid query relaxation. Nodes in the lattice correspond to different ways to relax queries. We characterize the properties of relaxation at each node and present algorithms to compute the corresponding answer. We then discuss how to traverse this lattice in a way that a non-empty query answer is obtained with the minimum amount of query condition relaxation. We implemented this framework and we present our results of a thorough performance evaluation using real and synthetic data. Our results indicate the practical utility of our framework.
... Clearly the user should be free to express any desired preference; hence the preference relation could be, in principle, any set of pairs. However, in the literature, the preference relation is usually modeled as a strict partial order [35, 36, 37, 31], that is, a binary relation "<" satisfying the following properties: (i) x ≮ x for all x in X (non-reflexivity); (ii) x < y implies y ≮ x for all x, y in X such that x ≠ y (asymmetry); (iii) x < y and y < z implies x < z (transitivity). We consider this modeling choice as unnecessarily restrictive. In particular, we consider that transitivity should not be imposed as a constraint for the preference relation to be acceptable. ...
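The three properties listed in this excerpt can be checked mechanically for a finite relation. The sketch below represents "<" as a set of pairs; the function name and the sample relations are hypothetical.

```python
from typing import Hashable, Iterable, Set, Tuple

Pair = Tuple[Hashable, Hashable]


def is_strict_partial_order(domain: Iterable[Hashable], less: Set[Pair]) -> bool:
    """Check non-reflexivity, asymmetry and transitivity of a finite relation
    given as a set of pairs (x, y), read as x < y."""
    non_reflexive = all((x, x) not in less for x in domain)
    asymmetric = all((y, x) not in less for (x, y) in less)
    transitive = all(
        (x, z) in less
        for (x, y) in less
        for (y2, z) in less
        if y == y2
    )
    return non_reflexive and asymmetric and transitive


chain = {(1, 2), (2, 3), (1, 3)}   # 1 < 2 < 3 with the implied pair 1 < 3
gap = {(1, 2), (2, 3)}             # same pairs without 1 < 3
cycle = {(1, 2), (2, 1)}           # violates asymmetry

print(is_strict_partial_order([1, 2, 3], chain))
print(is_strict_partial_order([1, 2, 3], gap))
print(is_strict_partial_order([1, 2], cycle))
```

The relation `gap` is the kind of preference that a model without the transitivity requirement would still accept, while evaluation algorithms that rely on transitivity would not.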
Article
As information becomes available in increasing amounts, and to growing numbers of users, the shift towards a more user-centered, or personalized access to information becomes crucial. In this paper we consider the semantics and pragmatics of preference queries over tables containing information objects described through a set of attributes. In particular, we address two basic issues: – how to define a preference query and its answer (semantics) – how to evaluate a preference query (pragmatics) With respect to existing work, our main contributions are (a) the proposal of an expressive language for declaring qualitative preferences, (b) a unified framework for expressing and evaluating both quantitative and qualitative preference queries and (c) rewriting algorithms for processing such queries. Although our main motivation originates in digital libraries, our proposal is quite general and can be used in several application contexts.
... Mrs. Diet wants to find all such meals which fulfill the hard constraint and satisfy her preferences best possible. Using Kießling's approach of modelling preferences as strict partial orders (Kießling 2002; 2005 ), the above mentioned hard and soft constraints can be expressed by Preference SQL (Kießling and Köstler 2002) as follows: This query expresses Mrs. Diet's preferences after the keyword PREFERRING. It is a Pareto preference (AND) consisting of preferences on soups, meats and beverages. ...
Article
Many important applications, e.g. planning tasks, demand the flexible and efficient use of personalization and preference handling techniques. Applying preference-based search technology could improve things quite a lot, e.g. using Preference SQL where preferences (i.e. soft constraints) can be combined with hard constraints. However, there are still fundamental efficiency issues that need to be addressed. In this paper we study preference database queries involving hard constraints over the sum of multiple attributes. We develop algebraic optimization techniques to transform a preference query with a sum constraint in order to enable its efficient processing by database engines. For this purpose we present new transformation laws for an efficient solution of this problem.
... Instead, over the last decade queries with soft constraints have been studied. These arise from a formalisation of the user's preferences in the form of partial strict orders [12,13]. Instead of returning an empty result set, one can then present the user with the maximal or "best" tuples w.r.t. ...
Chapter
We present an algebra for the classical database operators. Contrary to most approaches we use (inner) join and projection as the basic operators. Theta joins result by representing theta as a database table itself and defining theta-join as a join with that table. The same technique works for selection. With this, (point-free) proofs of the standard optimisation laws become very simple and uniform. The approach also applies to proving join/projection laws for preference queries. Extending the earlier approach of [16], we replace disjointness assumptions on the table types by suitable consistency conditions. Selected results have been machine-verified.
... Werner Kießling [Kie02, KK02, Kie05]. Kießling proposes a framework for dealing with such preferences, named Preference-SQL [KK02], which extends the database query language SQL. ...
... Within the research program "It's a Preference World" at the University of Augsburg preferences are treated as significant factor for e-services. Preferences are modeled as strict partial orders with intuitive comprehensible "A is better than B" semantics (see [16], [17]). In the following we will demonstrate the usage as well as the benefits of our flexible middleware components within the tourism domain. ...
Article
Today there are still a lot of people who prefer to consult a human employee in a travel agency instead of using the internet for booking or organizing a trip. Technical problems, incomprehensible interfaces, and insufficient search engines are part of that problem. Through the interplay and adjustment of several advanced, preference-driven middleware components we manage to automate skills that so far could be executed only by a human employee in a travel agency. Our search engines Preference XPath and Preference SQL deliver best alternatives if there is no perfect match. It is possible to distinguish between hard and soft constraints like a human vendor would do. The Preference Presenter implements a smart, sales-psychology-based presentation of search results, supporting various human sales strategies; the Preference Repository provides the management of situated long-term preferences. A novel query rewriting approach enables the smart combination of single preferences, e.g. the category of a hotel, with global preferences like the overall price for the whole journey, enabling a deeply personalized packaging of travels. The key technologies for this breakthrough are based on preferences modeled as strict partial orders. Our first advanced prototype COSIMA T is promising.
... Note that alternative definitions using different semantics exist, as for instance the one given in [61]. ...
Article
OLAP (On-Line Analytical Processing), the process of efficiently enabling common analytical operations on the multidimensional view of data, is a cornerstone of Business Intelligence. While OLAP is now a mature, efficiently implemented technology, very little attention has been paid to the effectiveness of the analysis and the user-friendliness of this technology, often considered tedious to use. This dissertation is a contribution to developing user-centric OLAP, focusing on the use of former queries logged by an OLAP server to enhance subsequent analyses. It shows how logs of OLAP queries can be modeled, constructed, manipulated, compared, and finally leveraged for personalization and recommendation. Logs are modeled as sets of analytical sessions, sessions being modeled as sequences of OLAP queries. Three main approaches are presented for modeling queries: as unevaluated collections of fragments (e.g., group by sets, sets of selection predicates, sets of measures), as sets of references obtained by partially evaluating the query over dimensions, or as query answers. Such logs can be constructed even from sets of SQL query expressions, by translating these expressions into a multidimensional algebra, and bridging the translations to detect analytical sessions. Logs can be searched, filtered, compared, combined, modified and summarized with a language inspired by the relational algebra and parametrized by binary relations over sessions. In particular, these relations can be specialization relations or based on similarity measures tailored for OLAP queries and analytical sessions. Logs can be mined for various hidden knowledge that, depending on the query model used, accurately represents the user behavior extracted. This knowledge includes simple preferences, navigational habits and discoveries made during former explorations, and can be used in various query personalization or query recommendation approaches.
Such approaches vary in terms of formulation effort, proactiveness, prescriptiveness and expressive power: query personalization, i.e., coping with a current query that returns too few or too many results, can use dedicated operators for expressing preferences, or be based on query expansion; query recommendation, i.e., suggesting queries to pursue an analytical session, can be based on information extracted from the current state of the database and the query, or be purely history based, i.e., leveraging the query log. While they can be immediately integrated into a complete architecture for User-Centric Query Answering in data warehouses, the models and approaches introduced in this dissertation can also be seen as a starting point for assessing the effectiveness of analytical sessions, with the ultimate goal to enhance the overall decision making process.
Article
Skyline queries are well known for their intuitive query formalization and easy to understand semantics when selecting the most interesting database objects in a personalized fashion. They naturally fill the gap between set-based SQL queries and rank-aware database retrieval and thus have emerged in the last few years as a popular tool for personalized retrieval in the database research community. Unfortunately, the Skyline paradigm also exhibits some significant drawbacks. Most prevalent among those problems is the so called "curse of dimensionality" which often leads to unmanageable result set sizes. This flood of query results, usually containing a significant portion of the original database, in turn severely hampers the paradigm's applicability in real-life systems. In this chapter, we will provide a survey of techniques to remedy this problem by choosing the most interesting objects from the multitude of skyline objects in order to obtain truly manageable and personalized query results.
Chapter
Stream data analysis is a highly relevant topic in various academic and business fields. Users want to analyze data streams to extract information in order to learn from this ever-growing amount of data. Although many approaches exist for effective processing of data streams, learning from streams requires new algorithms and methods that can cope with evolving and unbounded data. In this chapter we focus on the task of preference-based stream processing and clustering to analyze data streams. We show that this method is a real alternative to the state-of-the-art approaches.
Conference Paper
Skyline queries are well known in the database community and there are many algorithms for the computation of the Pareto frontier. But users do not only think of finding the Pareto-optimal objects; they often want to find the best objects with respect to an explicitly specified preference order. While preferences themselves are often defined as general strict partial orders, almost all algorithms are designed to evaluate Pareto preferences combining weak orders, i.e., Skylines. In this paper, we consider general strict partial orders and present a method to evaluate such explicit preferences by embedding any strict partial order into a complete lattice. This enables preference evaluation with specialized lattice-based algorithms instead of algorithms relying on tuple-to-tuple comparisons and therefore speeds up their computation, as can be seen in our experiments.
Conference Paper
Web-based, database-backed advisory systems are currently becoming widespread because they can be maintained centrally while product ranges change constantly. The key to an optimal product recommendation is taking the customer's preferences into account, which should be specified in the simplest and most comprehensible way possible. We therefore present an approach that allows the selection conditions, which the user has to state anyway, to be additionally annotated with weights and thereby to influence the ordering of the recommendations. This is made possible by an extended SQL syntax through which a ranking on the result set is defined on a sound theoretical basis. 1 Introduction The demand for advisory systems is rising steadily due to increasing contextualization. One of the most prominent examples is Amazon's personalized book recommendations (LSY03). It is important to take the user's preferences into account and to use them for ranking the result set. It is therefore desirable to give users the ability to specify their wishes and preferences in a simple and understandable way. The contribution of this work is a uniform method for annotating SQL queries with weights that additionally allows soft constraints to be specified. Based on the given weights, a ranking of the result relation is defined. In contrast to other methods (see Section 7), the answer semantics of the query is preserved. This, and the fact that only the SQL query is annotated, makes it possible to use arbitrary database systems. This does not rule out, however, adapting the query optimizer so that the ranking information is already exploited during query evaluation. The remaining parts of the article are structured as follows. Section 2 briefly describes the use case.
Section 3 deals with the weight annotations in SQL, which are then used in the formal definition of the ranking (Section 4). Before the article concludes with a discussion (Section 7), Section 5 describes soft constraints and related constructs.
Conference Paper
Skyline queries have recently received a lot of attention due to their intuitive query formulation: users can state preferences with respect to several attributes. Unlike numerical preferences, preferences over discrete value domains do not show an inherent total order, but have to rely on partial orders as stated by the user. In such orders typically many object values are incomparable, increasing the size of skyline sets significantly, and making their computation expensive. In this paper we explore how to enable interactive tasks like query refinement or relevance feedback by providing 'prime cuts'. Prime cuts are interesting subsets of the full Pareto skyline, which give users a good overview of the skyline. They have to be small, efficient to compute, suitable for higher numbers of query predicates, and representative. The key to improved performance and reduced result set sizes is the relaxation of Pareto semantics to the concept of weak Pareto dominance. We argue that this relaxation yields intuitive results and show how it opens up the use of efficient and scalable query processing algorithms. Assessing the practical impact, our experiments show that our approach leads to lean result set sizes and outperforms Pareto skyline computations by up to two orders of magnitude.
Conference Paper
Personalization includes the adaptation of database queries according to the user's needs, wishes and situation. We examine the influence of the d-parameter as a powerful personalization instrument for the Preference XPath search engine. Using a heuristic approach we present a possibility to deliver not only the qualitatively best matching objects but also the desired amount of data to the user. Performing a series of test queries on proper e-catalog data, we demonstrate the effectiveness of our approach.
Conference Paper
As information becomes available in increasing amounts, and to growing numbers of users, the shift towards a more user-centered, or personalized, access to information becomes crucial. In this paper we consider the semantics and pragmatics of preference queries over tables containing information objects described through a set of attributes. In particular, we address two basic issues: how to define a preference query and its answer (semantics), and how to evaluate a preference query (pragmatics). The main contributions of this paper are (a) the proposal of an expressive language for declaring qualitative preferences, (b) a novel approach to evaluating a preference query, and (c) the design of a user-friendly interface with preference queries. Although our main motivation originates in digital libraries, our proposal is quite general and can be used in several application contexts.
Conference Paper
There is a strong demand for a deep personalization of search systems for many Internet applications. In this respect the proper handling of user preferences plays an important role. Here we focus on the efficient evaluation of the Pareto preference operator for structured data in very large databases. The result set of such a Pareto query, also known as the “skyline”, tends to become very large for higher dimensionalities. Often it is too time-consuming or just not necessary to compute the entire skyline; instead, only some fraction of it, called a “snippet”, is sufficient. In this paper we contribute a novel algorithm for a fast computation of such skyline snippets. Our solutions do not rely on the availability of specialized pre-computed indexes, hence are generally applicable. We demonstrate the performance of our approach by several benchmark studies. The presented results suggest that even for complex Pareto queries, yielding very large skylines, snippets can be computed sufficiently fast, and therefore can be integrated into online Web services.
Conference Paper
Preferences are an important natural concept in real life and are well-known in the database and artificial intelligence community. Modeling preferences as strict partial orders closely matches people’s intuition. There are many algorithms for the evaluation of these strict partial orders. In particular some algorithms rely on the total order or the lattice structure constructed by a preference query. This paper provides an overview of the structure of preference orders. We present several measures of the different “better-than graphs” and give a deep insight into the structure of preferences. In fact, a careful analysis of the underlying “better-than graph” enables one to develop efficient algorithms for preference computation.
Article
With the emergence of e-catalog standards, product search can be used in a very different and improved manner. Beyond the usual keyword search this opens the arena for attribute-based search engines, including parametric search and preference search. We analyze the impact of different search techniques for such e-catalogs on the overall search process costs. It turns out that preference search has a high potential to significantly reduce the process costs. A large-scale use case with the MAN2B e-procurement platform supports our claim. We identify improvements achievable by using preference search, in particular fewer navigation steps during the product search and better search results due to the BMO query model. Expensive cases, where frustrated users accept bad search results or phone up the company’s purchasing department, should decrease significantly. This in turn will enable the purchasing department to focus more on strategic issues like supplier relationship management than on operative issues, as still happens widely today.
Conference Paper
Preferences allow more flexible and personalised queries in database systems. Evaluation of such a query means to select the maximal elements from the respective database w.r.t. the preference, which is a strict partial order. Often one requires the additional property of negative transitivity; such a strict weak order induces equivalence classes of “equally good” tuples, arranged in layers of the order. We extend our recent algebraic, point-free calculus of database preferences to cope with weak orders. Since the approach is completely first-order, off-the-shelf automated provers can be used to show theorems concerning the evaluation algorithms for preference-based queries and their optimisation. We use the calculus to transform arbitrary preferences into layered ones and present a new kind of Pareto preference as an application.
Chapter
In this paper we have highlighted five existing approaches for introducing personalization in OLAP: preference constructors, dynamic personalization, visual OLAP, recommendations with user session analysis and recommendations with user profile analysis, and have analyzed research papers within these directions. We have provided an evaluation in order to point out (i) the personalization options described in these approaches and their applicability to OLAP schema elements, aggregate functions and OLAP operations, (ii) the type of constraints (hard, soft or other) used in each approach, and (iii) the methods for obtaining user preferences and collecting user information. The goal of our paper is to systematize the ideas already proposed in the field of OLAP personalization in order to identify further possibilities for extending existing features of OLAP personalization or developing new ones.
Chapter
As long as there have been database search engines there has been the problem of what to present to the user when there is no perfect match and how to present that query result to the user. Respecting the user’s search preferences is the suitable way to search for best matching alternatives. Modelling such preferences as strict partial orders in “A is better than B” semantics has been proven to be intuitive for users in various internet applications. The better the search result, the better is the psychological advantage of the presenter. Thus, there is the necessity to know the quality of the search result with respect to the search preferences. This chapter introduces a novel personalized and situated quality assessment for query results. Based on a human-comprehensible linguistic model of five quality categories, a very intuitive framework for valuations is defined for numerical as well as for categorical search preferences. These quality valuations provide human-comprehensible presentation arguments. Moreover, they are used to compute the situated overall quality of a search result. For delivery of the results a flexible and situated filter decides which results to present, e.g. by respecting quality requirements of the user. A so-called presentation preference determines which results are predestined to be especially pointed out to a user. Eventually, it will be evaluated how e-commerce applications will profit from the use of a preference-based search in combination with the introduced human-comprehensible quality assessment. Considering the procurement of goods via internet the idea is simple. A customer expects to have at least the service he or she has when directly contacting a human sales person. That means the customer wants to be treated individually according to his or her needs. But the misery already begins with the first step, the usage of the search engine.
Thesis
The combination of travel and tourism represents the leading domain for applications in B2C e-commerce. Thus, it deserves highest attention. Since most people only have a very limited number of vacation days each year, they have learned to be more demanding about their trips. More and more they ask for better-personalized travel products instead of standard packages designed by tourist operators. Due to insufficient search engines and the lack of personalization, however, arranging a trip on current online travel portals is often not as easy as it should be. Even for rather straightforward scenarios, searching and booking a suitable travel package can be tedious and might often take longer than 1 hour. In order to provide good sales experiences and custom-tailored products similar to the ones competent human travel agents can offer, a personalized search approach for online travel portals has been overdue for some time. This thesis, therefore, presents a novel personalized search process delivering travel products exactly tailored to customers with respect to their situations and preferences. In a first step, a novel model for the search process in electronic commerce will be introduced. A deep personalization of the search will be provided by dividing the process into four stages, namely Preference Analysis & Modeling, Search Interface, Query Processing, and Presentation. The main part of this thesis will then apply the new model to the tourism domain, i.e. each step of the search will be examined in the context of tourism. A situation model adequately adjusted to the tourism domain will then provide each stage of the search process with additional situational knowledge.
Based on this, several essential components for a domain specific search in tourism will be introduced accordingly: a new preference constructor dealing with typical price-quality tradeoffs, a smart preference elicitation process supporting customers who have to find an optimal departure airport, the composition and evaluation of database queries supporting the interplay of individual and global preferences, and an appropriate adaptation of search interface and product presentation. Moreover, by using preference search technologies as underlying basis for the search itself, best alternatives can be delivered in case there is no perfect match. Several novel software components for a personalized search process in tourism have come into existence in the context of this thesis, e.g., the personalized prototype COSIMAT. The interplay of these components with existing preference components will be examined and evaluated by means of numerous use case scenarios at the end of this work. It will be demonstrated that by a proper combination of these components, custom-tailored travel products with respect to preferences and situations can be found and presented to the customer in an intuitive, fast and more comfortable manner than before.
Article
Skylines with partial order preference semantics often result in huge answer sets and, what is worse, they cannot be computed efficiently. In this paper we will explore the evaluation of so-called restricted skyline queries with partial order preferences under the paradigm of weak Pareto dominance. Weak Pareto dominance removes all objects from skylines which are dominated by other objects in some query predicates, but in turn do not dominate these objects in any predicate. We will argue that this paradigm yields intuitive results, prove that it leads to lean sizes of the restricted skyline and show how it opens up the use of efficient algorithms for evaluation adopting the iteration of ranked result lists for each query predicate.
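The difference between Pareto dominance and the weak Pareto dominance described in this abstract can be illustrated with a small sketch. It is only an illustration under stated assumptions: the attribute orders, the object tuples and all function names below are hypothetical, each per-attribute order is given as an already transitively closed set of (better, worse) pairs, and the filter is a naive pairwise scan, not the ranked-list algorithm the abstract refers to.

```python
from typing import Callable, FrozenSet, List, Sequence, Tuple

# One strict partial order per attribute, as (better, worse) pairs.
# Assumption: each set is already transitively closed.
Order = FrozenSet[Tuple[str, str]]
Obj = Tuple[str, ...]


def pareto_dominates(p: Obj, o: Obj, orders: Sequence[Order]) -> bool:
    """p is equal or better in every attribute and strictly better in one."""
    no_worse = all(a == b or (a, b) in order for a, b, order in zip(p, o, orders))
    strictly = any((a, b) in order for a, b, order in zip(p, o, orders))
    return no_worse and strictly


def weakly_dominates(p: Obj, o: Obj, orders: Sequence[Order]) -> bool:
    """p is better in some attribute and o is better than p in none."""
    p_better = any((a, b) in order for a, b, order in zip(p, o, orders))
    o_better = any((b, a) in order for a, b, order in zip(p, o, orders))
    return p_better and not o_better


def skyline(objects: Sequence[Obj], orders: Sequence[Order],
            dominates: Callable[[Obj, Obj, Sequence[Order]], bool]) -> List[Obj]:
    """Keep every object that no other object dominates (naive pairwise scan)."""
    return [o for o in objects if not any(dominates(p, o, orders) for p in objects)]


# Hypothetical data: colour (red better than blue, green incomparable to both)
# and brand (A better than B).
orders = [frozenset({("red", "blue")}), frozenset({("A", "B")})]
objects = [("red", "A"), ("blue", "B"), ("green", "B")]

pareto_result = skyline(objects, orders, pareto_dominates)
weak_result = skyline(objects, orders, weakly_dominates)
print(pareto_result)
print(weak_result)
```

In this example ("green", "B") survives under Pareto dominance because green and red are incomparable, while under weak Pareto dominance it is removed, since ("red", "A") is better on brand and worse on nothing. This is the mechanism by which the relaxation shrinks the result set.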
Conference Paper
This paper presents a modular approach to context-aware preference query composition based on a novel kind of preference generator. We introduce a constructive model to generate preference terms within the Preference SQL framework. Given several sources for preference related knowledge like explicit user input, information extracted from a preference repository, domain-specific application knowledge, location-based sensor data, or web service feeds for weather data our preference generator can compile a user search request into one rather complex context-aware Preference SQL query. Choosing as use case a commercial e-business platform for outdoor activities, we demonstrate how such queries despite the power and complexity of this approach can be evaluated efficiently on a practical data set.
Article
Publishing house of the Faculty of Mathematics and Physics
Article
SQL queries containing Group-by are common in data warehouse environments and OLAP. From this the concept of grouped Skyline queries emerged, wherein a Skyline of each group of tuples is requested. Grouped preference queries generalize this kind of Skyline queries. In this paper we present new algebraic transformation rules for grouped preference queries, which are one of the most intuitive and practical types of queries. Our optimization laws reduce intermediate result sizes in the computation of joins, Cartesian products, and the preference selection. We have integrated these new rules into our rule-based Preference SQL query optimizer. Our performance benchmarks, building upon the well-known TPC-H and IMDB datasets, show that significant performance gains can be achieved.
Article
Personalized database systems give users answers tailored to their personal preferences. While numerous preference evaluation methods for databases have been proposed (e.g., skyline, top-k, k-dominance, k-frequency), the implementation of these methods at the core of a database system is a double-edged sword. Core implementation provides efficient query processing for arbitrary database queries; however, this approach is not practical since each existing (and future) preference method requires implementation within the database engine. To solve this problem, this article introduces FlexPref, a framework for extensible preference evaluation in database systems. FlexPref, implemented in the query processor, aims to support a wide array of preference evaluation methods in a single extensible code base. Integration with FlexPref is simple, involving the registration of only three functions that capture the essence of the preference method. Once integrated, the preference method “lives” at the core of the database, enabling the efficient execution of preference queries involving common database operations. This article also provides a query optimization framework for FlexPref, as well as a theoretical framework that defines the properties a preference method must exhibit to be implemented in FlexPref. To demonstrate the extensibility of FlexPref, this article also provides case studies detailing the implementation of seven state-of-the-art preference evaluation methods within FlexPref. We also experimentally study the strengths and weaknesses of an implementation of FlexPref in PostgreSQL over a range of single-table and multitable preference queries.
Article
A maximal vector of a set is one which is not less than any other vector in all components. A recurrence relation is derived for computing the average number of maximal vectors in a set of $n$ vectors in $d$-space under the assumption that all $(n!)^d$ relative orderings are equally probable. Solving the recurrence shows that the average number of maxima is $O((\ln n)^{d-1})$ for fixed $d$. This result is used to construct an algorithm for finding all the maxima that has expected running time linear in $n$ (for sets of vectors drawn under these assumptions). The result is then used to find an upper bound on the expected number of convex hull points in a random point set.
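The abstract does not spell out its recurrence, so the following sketch rests on an assumption: it uses the recurrence A(n, d) = A(n-1, d) + A(n, d-1)/n with A(n, 1) = 1, which follows from removing the vector with the smallest first coordinate (it can dominate no other vector, and it is itself maximal exactly when it is maximal in the remaining d-1 coordinates). The function names are hypothetical. The brute-force check enumerates all relative orderings for tiny n and d, fixing the first coordinate in sorted order, which by symmetry gives the same average as all (n!)^d orderings.

```python
from fractions import Fraction
from functools import lru_cache
from itertools import permutations, product


@lru_cache(maxsize=None)
def avg_maxima(n: int, d: int) -> Fraction:
    """Average number of maximal vectors among n vectors in d-space,
    computed by the assumed recurrence A(n,d) = A(n-1,d) + A(n,d-1)/n."""
    if n == 0:
        return Fraction(0)
    if d == 1:
        return Fraction(1)  # with distinct values there is exactly one maximum
    return avg_maxima(n - 1, d) + avg_maxima(n, d - 1) / n


def brute_force_avg(n: int, d: int) -> Fraction:
    """Exact average by enumerating rank orderings (feasible only for tiny n, d)."""
    perms = list(permutations(range(n)))
    total = 0
    count = 0
    for combo in product(perms, repeat=d - 1):
        # Vector i has rank i in coordinate 1 and rank p[i] in each other coordinate.
        vectors = [(i,) + tuple(p[i] for p in combo) for i in range(n)]
        total += sum(
            1
            for v in vectors
            if not any(all(w[k] > v[k] for k in range(d)) for w in vectors)
        )
        count += 1
    return Fraction(total, count)


print(avg_maxima(3, 2), brute_force_avg(3, 2))
print(avg_maxima(3, 3), brute_force_avg(3, 3))
```

For d = 2 the recurrence reduces to the harmonic numbers, which grow like ln n and so match the stated $O((\ln n)^{d-1})$ bound.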
Conference Paper
Advanced personalization of database applications is a big challenge, in particular for distributed mobile environments. We present several new results from a prototype of a route planning system. We demonstrate how to combine qualitative and quantitative preferences gained from situational aspects and from personal user preferences. For performance studies we analyze the runtime efficiency of the SR-Combine algorithm used to evaluate top-k queries. By determining the cost-ratio of random to sorted accesses SR-Combine can automatically tune its performance within the given system architecture. Top-k queries are generated by mapping linguistic variables to numerical weightings. Moreover, we analyze the quality of the query results by several test series, systematically varying the mappings of the linguistic variables. We report interesting insights into this rather under-researched important topic. More investigations, incorporating also cognitive issues, need to be conducted in the future.
Conference Paper
Advanced personalization techniques are required to cope with novel challenges posed by attribute-rich MPEG-7 based digital libraries. At the heart of our deeply personalized news dissemination system P-News is one extensible preference model that serves all purposes, preventing impedance mismatches between the various stages: User modeling by structured preference patterns, automatic query expansion including ontologies, preference query evaluation by Preference XPath including nested preferences on categorical data, quality assessment of query results, personalized notification and news syndication.
Conference Paper
Though skyline queries already have claimed their place in retrieval over central databases, their application in Web information systems up to now was impossible due to the distributed aspect of retrieval over Web sources. But due to the amount, variety and volatile nature of information accessible over the Internet extended query capabilities are crucial. We show how to efficiently perform distributed skyline queries and thus essentially extend the expressiveness of querying today's Web information systems. Together with our innovative retrieval algorithm we also present useful heuristics to further speed up the retrieval in most practical cases, paving the road towards meeting even the real-time challenges of on-line information services. We discuss performance evaluations and point to open problems in the concept and application of skylining in modern information systems. For the curse of dimensionality, an intrinsic problem in skyline queries, we propose a novel sampling scheme that allows users to get an early impression of the skyline for subsequent query refinement.
Article
Declarative languages for deductive and object-oriented databases require some high-level mechanism for specifying semantic control knowledge. This paper proposes user-supplied subsumption information as a paradigm to specify desired, preferred or useful deductions at the meta level. For this purpose we augment logic programming by subsumption relations and succeed in extending the classical theorems for least models, fixpoints and bottom-up evaluation accordingly. Moreover, we provide a differential fixpoint operator for efficient query evaluation in deductive databases. This operator discards subsumed tuples on the fly. We also exemplify the ease of use of this programming methodology. In particular, we demonstrate how heuristic AI search procedures can be integrated into deductive databases in this way.
Article
The paper is a theoretical study of a generalization of the lexicographic rule for combining ordering relations. We define the concept of priority operator: a priority operator maps a family of relations to a single relation which represents their lexicographic combination according to a certain priority on the family of relations. We present four kinds of results.
  • We show that the lexicographic rule is the only way of combining preference relations which satisfies natural conditions (similar to those proposed by Arrow).
  • We show in what circumstances the lexicographic rule propagates various conditions on preference relations, thus extending Grosof's results.
  • We give necessary and sufficient conditions on the priority relation to determine various relationships between combinations of preferences.
  • We give an algebraic treatment of this form of generalized prioritization. Two operators, called "but" and "on the other hand", are sufficient to express any prioritization. We present a complete equational axiomatization of these two operators.
These results can be applied in the theory of social choice (a branch of economics), in non-monotonic reasoning (a branch of artificial intelligence), and more generally wherever relations have to be combined.
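The simplest instance of a lexicographic combination, two strict preferences with the first taking priority, can be sketched as follows. This is a deliberately reduced illustration, not the paper's general priority operator over a family of relations: the function `prioritize`, the tuple layout and the example preferences are all assumptions made for the example.

```python
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")
Pref = Callable[[T, T], bool]  # pref(x, y) is True when x is strictly better than y


def prioritize(first: Pref, second: Pref, tie_on_first: Pref) -> Pref:
    """Two-level lexicographic combination (hypothetical helper):
    x beats y if `first` says so, or if x and y tie on the attribute
    that `first` looks at and `second` says so."""
    def better(x: T, y: T) -> bool:
        return first(x, y) or (tie_on_first(x, y) and second(x, y))
    return better


Car = Tuple[int, str]  # (price, colour); hypothetical schema


def cheaper(x: Car, y: Car) -> bool:
    return x[0] < y[0]


def same_price(x: Car, y: Car) -> bool:
    return x[0] == y[0]


def red_first(x: Car, y: Car) -> bool:
    return x[1] == "red" and y[1] != "red"


# Price has priority; colour only decides between equally priced cars.
price_then_colour = prioritize(cheaper, red_first, same_price)

print(price_then_colour((10, "blue"), (20, "red")))
print(price_then_colour((10, "red"), (10, "blue")))
print(price_then_colour((20, "red"), (10, "blue")))
```

The lower-priority preference is consulted only when the higher-priority one cannot separate the two objects, which is the defining feature of the lexicographic rule.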
Conference Paper
As information becomes available in increasing amounts to a wide spectrum of users, the need for a shift towards a more user-centered information access paradigm arises. We develop a personalization framework for database systems based on user profiles and identify the basic architectural modules required to support it. We define a preference model that assigns to each atomic query condition a personal degree of interest and provide a mechanism to compute the degree of interest in any complex query condition based on the degrees of interest in the constituent atomic ones. Preferences are stored in profiles. At query time, personalization proceeds in two steps: (a) preference selection and (b) preference integration into the original user query. We formulate the main personalization step, i.e. preference selection, as a graph computation problem and provide an efficient algorithm for it. We also discuss results of experimentation with a prototype query personalization system.
Conference Paper
Personalization of Web services requires a powerful preference model that smoothly and efficiently integrates with standard database query languages. We make the case for preferences as strict partial orders, supported in Preference SQL and Preference XPATH. Performance of Web services will crucially depend on various architectural design decisions. We pointed out that a central server architecture is desirable. Concerning the implementation of preference queries we investigated the tightly coupled architecture, presenting a novel approach for algebraic optimization based on preference algebra. We provided new transformation laws and gave evidence for the power of this heuristic optimization. This forms the basis for a new preference query optimization methodology, promising sufficient performance even for complex Web services.
Conference Paper
Personalization of e-services poses new challenges to database technology, demanding a powerful and flexible modeling technique for complex preferences. Preference queries have to be answered cooperatively by treating preferences as soft constraints, attempting a best possible match-making. We propose a strict partial order semantics for preferences, which closely matches people's intuition. A variety of natural and of sophisticated preferences are covered by this model. We show how to inductively construct complex preferences by means of various preference constructors. This model is the key to a new discipline called preference engineering and to a preference algebra. Given the Best-Matches-Only (BMO) query model we investigate how complex preference queries can be decomposed into simpler ones, preparing the ground for divide & conquer algorithms. Standard SQL and XPATH can be extended seamlessly by such preferences (presented in detail in the companion paper [15]). We believe that this model is appropriate to extend database technology towards effective support of personalization.
Conference Paper
The advent of the World Wide Web has created an explosion in the available on-line information. As the range of potential choices expand, the time and effort required to sort through them also expands. We propose a formal framework for expressing and combining user preferences to address this problem. Preferences can be used to focus search queries and to order the search results. A preference is expressed by the user for an entity which is described by a set of named fields; each field can take on values from a certain type. The * symbol may be used to match any element of that type. A set of preferences can be combined using a generic combine operator which is instantiated with a value function, thus providing a great deal of flexibility. Same preferences can be combined in more than one way and a combination of preferences yields another preference thus providing the closure property. We demonstrate the power of our framework by illustrating how a currently popular personalization system and a real-life application can be realized as special cases of our framework. We also discuss implementation of the framework in a relational setting.
Conference Paper
We present a new XML-based search technology that enables users to formulate complex customer or vendor preferences which typically occur within e-commerce applications. Preferences are modeled in a natural way by partial orders. Since our semantics of multi-attribute preferences implements the Pareto-optimality principle, Preference XPATH queries avoid both the unwanted “empty-result” effect and the flooding effect with lots of irrelevant query results. If perfect matches are not available, best possible alternatives are found instead. We have extended the XML query language XPATH by the capability to formulate preferences as soft selection conditions. As our extensions are fully compatible with the XPATH standard, both hard and soft selection conditions now become available to any XML-based e-commerce application. Several e-shopping examples show how easy and elegant it is to transform customer wishes into Preference XPATH queries. Our prototype implementation is smoothly integrated with the XML database system Tamino of Software AG. Moreover we show how Preference XPATH can be used within the XML query language QUILT. It even merges with XML style sheets (XSLT) and the XML pointer language (XPointer). Thus with Preference XPATH powerful personalized search engines and match-making processes for B2C and B2B can be implemented completely inside the XML framework.
Conference Paper
The design and implementation of advanced personalized database applications requires a preference-driven approach. Representing preferences as strict partial orders is a good choice in most practical cases. Therefore the efficient integration of preference querying into standard database technology is an important issue. We present a novel approach to relational preference query optimization based on algebraic transformations. A variety of new laws for preference relational algebra is presented. This forms the foundation for a preference query optimizer applying heuristics like 'push preference'. A prototypical implementation and a series of benchmarks show that significant performance gains can be achieved. In summary, our results give strong evidence that by extending relational databases by strict partial order preferences one can get both: good modelling capabilities for personalization and good query runtimes. Our approach extends to recursive databases as well.
Conference Paper
Advanced personalized e-applications require comprehensive knowledge about their user's likes and dislikes in order to provide individual product recommendations, personal customer advice and custom-tailored product offers. In our approach we model such preferences as strict partial orders with "A is better than B" semantics, which has been proven to be very suitable in various e-applications. In this paper we present novel Preference Mining techniques for detecting strict partial order preferences in user log data. The main advantage of our approach is the semantic expressiveness of the Preference Mining results. Experimental evaluations prove the effectiveness and efficiency of our algorithms. Since the Preference Mining implementation uses sophisticated SQL statements to execute all data-intensive operations on database layer, our algorithms scale well even for large log data sets. With our approach personalized e-applications can gain valuable knowledge about their customers' preferences, which is essential for a qualified customer service.
Article
The handling of user preferences is becoming an increasingly important issue in present-day information systems. Among others, preferences are used for information filtering and extraction to reduce the volume of data presented to the user. They are also used to keep track of user profiles and formulate policies to improve and automate decision making. We propose here a simple, logical framework for formulating preferences as preference formulas. The framework does not impose any restrictions on the preference relations, and allows arbitrary operation and predicate signatures in preference formulas. It also makes the composition of preference relations straightforward. We propose a simple, natural embedding of preference formulas into relational algebra (and SQL) through a single winnow operator parameterized by a preference formula. The embedding makes possible the formulation of complex preference queries, for example, involving aggregation, by piggybacking on existing SQL constructs. It also leads in a natural way to the definition of further, preference-related concepts like ranking. Finally, we present general algebraic laws governing the winnow operator and its interactions with other relational algebra operators. The preconditions on the applicability of the laws are captured by logical formulas. The laws provide a formal foundation for the algebraic optimization of preference queries. We demonstrate the usefulness of our approach through numerous examples.
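A winnow operator parameterized by a preference formula can be sketched in a few lines, assuming the usual reading that it returns the tuples to which no other tuple of the relation is preferred. The relation schema, the data and the function names below are hypothetical; the preference formula is modelled as an arbitrary Python predicate, mirroring the abstract's point that no restrictions are imposed on the preference relation.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def winnow(relation: Sequence[T], prefers: Callable[[T, T], bool]) -> List[T]:
    """Return the tuples t for which no tuple u in the relation satisfies
    prefers(u, t). `prefers` stands in for the preference formula."""
    return [t for t in relation if not any(prefers(u, t) for u in relation)]


Offer = Tuple[str, str, float]  # (isbn, vendor, price); hypothetical schema


def cheaper_same_book(u: Offer, t: Offer) -> bool:
    """Prefer u over t when both offer the same book and u costs less."""
    return u[0] == t[0] and u[2] < t[2]


offers = [
    ("isbn-1", "vendor-a", 14.75),
    ("isbn-1", "vendor-b", 13.50),
    ("isbn-2", "vendor-a", 7.30),
    ("isbn-2", "vendor-b", 7.55),
]

best_offers = winnow(offers, cheaper_same_book)
print(best_offers)
```

Because the preference only compares offers for the same book, the result contains the cheapest offer per book, a query that would otherwise need grouping and aggregation in plain SQL.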
Conference Paper
We present a fully automated electronic sales agent for e-procurement portals. The key technologies for this breakthrough are based on preferences modeled as strict partial orders, enabling a deep personalization of the B2B sales process. The interplay of several novel middleware components makes it possible to automate skills that so far could be executed only by a human vendor. As a personalized search engine for XML-based e-catalogs, we use Preference XPath; the Preference Presenter implements a sales-psychology-based presentation of search results, supporting various human sales strategies; the Preference Repository provides the management of situated long-term preferences; the flexible Personalized Price Offer and the multi-objective Preference Bargainer provide a personalized price fixing and the opportunity to bargain about the price of an entire product bundle, applying up/cross and down selling techniques. Our prototype COSIMA B2B, supported by industrial partners, has been successfully demonstrated at a large computer fair.
Article
We propose to extend database systems by a Skyline operation. This operation filters out a set of interesting points from a potentially large set of data points. A point is interesting if it is not dominated by any other point. For example, a hotel might be interesting for somebody traveling to Nassau if no other hotel is both cheaper and closer to the beach. We show how SQL can be extended to pose Skyline queries, present and evaluate alternative algorithms to implement the Skyline operation, and show how this operation can be combined with other database operations (e.g., join and Top N).
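The hotel example can be made concrete with a short sketch. It uses one simple technique for the two-dimensional case, sorting followed by a single scan, and is not claimed to be one of the algorithms evaluated in the paper. The hotel names, prices and distances are invented for illustration, and both attributes are assumed to be minimized.

```python
from typing import List, Optional, Sequence, Tuple

Hotel = Tuple[str, float, float]  # (name, price, distance to beach); hypothetical


def skyline_2d(hotels: Sequence[Hotel]) -> List[Hotel]:
    """Two-dimensional skyline where lower price and lower distance are better.

    After sorting by (price, distance), a hotel is undominated exactly when
    its distance is smaller than every distance seen so far. Exact duplicates
    of a kept point do not dominate each other, so they are kept as well.
    """
    result: List[Hotel] = []
    best_distance = float("inf")
    last_kept: Optional[Tuple[float, float]] = None
    for name, price, distance in sorted(hotels, key=lambda h: (h[1], h[2])):
        if distance < best_distance or (price, distance) == last_kept:
            result.append((name, price, distance))
            best_distance = min(best_distance, distance)
            last_kept = (price, distance)
    return result


hotels = [
    ("A", 50.0, 2.0),
    ("B", 80.0, 0.5),
    ("C", 60.0, 2.5),   # dominated by A: more expensive and farther away
    ("D", 120.0, 0.5),  # dominated by B: same distance, more expensive
    ("E", 40.0, 3.0),
]

interesting = skyline_2d(hotels)
print([name for name, _, _ in interesting])
```

The result lists the hotels in order of increasing price. Each one kept is closer to the beach than all cheaper hotels, which is the trade-off curve the Skyline operation is meant to expose.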
P.C. Fishburn: Nontransitive Preferences in Decision Theory. Journal of Risk and Uncertainty, Kluwer Academic Publishers, 4:113 – 134, 1991.
W. Kießling, S. Fischer, S. Holland, T. Ehm: Design and Implementation of COSIMA - A Smart and Speaking E-Sales Assistant. 3rd Intern. Workshop on Advanced Issues of E-Commerce and Web-Based Inform. Systems (WECWIS), pp. 21-30, San Jose, 2001. (demonstrated also at SIGMOD 2001, St. Barbara, U.S.A.)
W. Kießling: Preference Constructors for Deeply Personalized Database Queries. Technical Report 2004-7, Institute of Computer Science, Univ. of Augsburg, Febr. 2004. (extended version of this paper; www.informatik.uniaugsburg.de/forschung/techBerichte/reports/2004-7.pdf)
P. C. Fishburn: Preference Structures and their Numerical Representations. Theoretical Computer Science, 217, 1999, pp. 359-383.