Article

Optimal full ranking from pairwise comparisons

... Furthermore, Chen et al. (2022b) prove that for partial recovery, the MLE is optimal but the spectral method is not when the condition number is general. They further extend their framework to study the full ranking problem (Chen et al., 2022a). It is worth noting that the aforementioned prior art mostly focuses on studying the parametric BTL model. ...
... Most existing works on ranking mainly study the first-order asymptotic behavior of their estimators (Hunter, 2004; Chen and Suh, 2015; Jang et al., 2016; Shah and Wainwright, 2017; Chen et al., 2019, 2022a). Deriving limiting distributional results in ranking models is important for uncertainty quantification but rarely conducted, especially when covariate information is incorporated into the ranking problem. ...
... The percentage of total net assets allocated to the stocks in a portfolio shows the fund manager's views on their expected future returns. If the percentage of asset A is higher than that of asset B in a portfolio, it is an indication that the fund manager ranks asset A higher than asset B. As a result, similar to Chen et al. (2022a), the holding information of the mutual funds provides us with pairwise comparisons between the two assets. Although there are many financial assets such as stocks and derivatives on the market, we concentrate on the stocks in the S&P 500 list. ...
Preprint
This paper concerns statistical estimation and inference for ranking problems based on pairwise comparisons with additional covariate information, such as the attributes of the compared items. Despite extensive studies, little prior literature investigates this problem under the more realistic setting where covariate information exists. To tackle this issue, we propose a novel model, the Covariate-Assisted Ranking Estimation (CARE) model, that extends the well-known Bradley-Terry-Luce (BTL) model by incorporating covariate information. Specifically, instead of assuming every compared item has a fixed latent score $\{\theta_i^*\}_{i=1}^n$, we assume the underlying scores are given by $\{\alpha_i^* + x_i^\top\beta^*\}_{i=1}^n$, where $\alpha_i^*$ and $x_i^\top\beta^*$ represent the latent baseline and covariate score of the $i$-th item, respectively. We impose natural identifiability conditions and derive the $\ell_\infty$- and $\ell_2$-optimal rates for the maximum likelihood estimator of $\{\alpha_i^*\}_{i=1}^n$ and $\beta^*$ under a sparse comparison graph, using a novel 'leave-one-out' technique (Chen et al., 2019). To conduct statistical inference, we further derive asymptotic distributions for the MLE of $\{\alpha_i^*\}_{i=1}^n$ and $\beta^*$ with minimal sample complexity. This allows us to answer whether some covariates have any explanatory power for the latent scores and to threshold some sparse parameters to improve the ranking performance. We improve the approximation method used in (Gao et al., 2021) for the BTL model and generalize it to the CARE model. Moreover, we validate our theoretical results through large-scale numerical studies and an application to a mutual fund stock holding dataset.
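As a rough illustration of the score decomposition used by the CARE model, the sketch below computes a BTL-style win probability from latent scores of the form $\alpha_i + x_i^\top\beta$ (a minimal toy example; the function name, items, and numbers are invented, not taken from the paper):

```python
import numpy as np

def care_win_prob(alpha, X, beta, i, j):
    """P(item i beats item j) when the latent score of item k is
    alpha[k] + X[k] @ beta and comparison outcomes follow BTL
    (logistic) probabilities of score differences."""
    s_i = alpha[i] + X[i] @ beta
    s_j = alpha[j] + X[j] @ beta
    return 1.0 / (1.0 + np.exp(-(s_i - s_j)))

# toy example: 3 items, 2 covariates (all values invented)
alpha = np.array([0.5, 0.0, -0.5])                  # baseline scores
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])  # covariates
beta = np.array([0.2, -0.1])                        # covariate effects
p = care_win_prob(alpha, X, beta, 0, 2)             # item 0 vs item 2
```

Since item 0's total score (0.7) exceeds item 2's (-0.4), `p` is above one half; the identifiability conditions in the abstract are needed precisely because shifting every $\alpha_i$ by a constant leaves all such probabilities unchanged.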
... Since we consider a class of likelihood-based estimators rather than a single one, it is helpful to discuss each estimator separately. For the choice-one MLE and QMLE, which are the most common in the literature (Chen et al., 2022a; Fan et al., 2023), our method exclusively relies on the truncated error analysis. This approach is based on the Neumann series expansion of the normalized Hessian and differs from the state-of-the-art leave-one-out analysis in the field. ...
... Under such circumstances, we combine the truncated error analysis with a leave-one-out perturbation argument to obtain the desired results. While the latter is similar to Chen et al. (2022a); Fan et al. (2023) in spirit, there are notable differences. For instance, our perturbation argument is applied when the hypergraph sequence is deterministic. ...
Preprint
Full-text available
The Plackett--Luce model is a popular approach for rank data analysis, where a utility vector is employed to determine the probability of each outcome based on Luce's choice axiom. In this paper, we investigate the asymptotic theory of utility vector estimation by maximizing different types of likelihood, such as the full-, marginal-, and quasi-likelihood. We provide a rank-matching interpretation for the estimating equations of these estimators and analyze their asymptotic behavior as the number of items being compared tends to infinity. In particular, we establish the uniform consistency of these estimators under conditions characterized by the topology of the underlying comparison graph sequence and demonstrate that the proposed conditions are sharp for common sampling scenarios such as the nonuniform random hypergraph model and the hypergraph stochastic block model; we also obtain the asymptotic normality of these estimators and discuss the trade-off between statistical efficiency and computational complexity for practical uncertainty quantification. Both results allow for nonuniform and inhomogeneous comparison graphs with varying edge sizes and different asymptotic orders of edge probabilities. We verify our theoretical findings by conducting detailed numerical experiments.
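The stagewise structure described above, where each position is filled by choosing among the remaining items with probability proportional to an exponentiated utility, can be sketched as a full-likelihood computation (a minimal illustration; the function name and toy utilities are invented):

```python
import numpy as np

def plackett_luce_log_likelihood(utility, ranking):
    """Log-likelihood of one observed full ranking (best first) under
    the Plackett-Luce model: at each stage, the top remaining item is
    chosen with probability proportional to exp(utility)."""
    w = np.exp(np.asarray(utility, dtype=float))
    ll = 0.0
    remaining = list(ranking)
    while remaining:
        ll += np.log(w[remaining[0]]) - np.log(w[remaining].sum())
        remaining.pop(0)
    return ll

# utilities favor item 0 over 1 over 2 (toy values)
u = [1.0, 0.0, -1.0]
ll_best = plackett_luce_log_likelihood(u, [0, 1, 2])
ll_worst = plackett_luce_log_likelihood(u, [2, 1, 0])
```

The ranking that agrees with the utilities is more likely, and maximizing this quantity over `u` across many observed rankings is the full-likelihood estimation whose asymptotics the abstract studies.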
... As a result, maximum likelihood estimation is a natural candidate to recover the hidden scores from the binary measurements. The algorithm and performance of the MLE for BTL models are studied in works such as [11,19,24]. On the other hand, finding the MLE for the noisy sorting problem is usually NP-hard [7,3]. ...
Preprint
Full-text available
Given pairwise comparisons between multiple items, how to rank them so that the ranking matches the observations? This problem, known as rank aggregation, has found many applications in sports, recommendation systems, and other web applications. As it is generally NP-hard to find a global ranking that minimizes the mismatch (known as the Kemeny optimization), we focus on the Erdős–Rényi outliers (ERO) model for this ranking problem. Here, each pairwise comparison is a corrupted copy of the true score difference. We investigate spectral ranking algorithms that are based on unnormalized and normalized data matrices. The key is to understand their performance in recovering the underlying scores of each item from the observed data. This reduces to deriving an entry-wise perturbation error bound between the top eigenvectors of the unnormalized/normalized data matrix and its population counterpart. By using the leave-one-out technique, we provide a sharper $\ell_\infty$-norm perturbation bound of the eigenvectors and also derive an error bound on the maximum displacement for each item, with only $\Omega(n\log n)$ samples. Our theoretical analysis improves upon the state-of-the-art results in terms of sample complexity, and our numerical experiments confirm these theoretical findings.
... This separation condition is imposed to simplify our main results, and the full results in Section 6 are more general. This type of separation condition has appeared in, e.g., Chen et al. (2022), where the pairwise comparison design was studied. ...
Preprint
We consider a symmetric mixture of linear regressions with random samples from the pairwise comparison design, which can be seen as a noisy version of a type of Euclidean distance geometry problem. We analyze the expectation-maximization (EM) algorithm locally around the ground truth and establish that the sequence converges linearly, providing an $\ell_\infty$-norm guarantee on the estimation error of the iterates. Furthermore, we show that the limit of the EM sequence achieves the sharp rate of estimation in the $\ell_2$-norm, matching the information-theoretically optimal constant. We also argue through simulation that convergence from a random initialization is much more delicate in this setting, and does not appear to occur in general. Our results show that the EM algorithm can exhibit several unique behaviors when the covariate distribution is suitably structured.
... In the past few years, a series of works [4-6, 24] have studied in depth the theoretical properties of these two estimators, with focus on their $\ell_2$/$\ell_\infty$ estimation accuracy and performance in partial/full ranking; see Section 2 ahead for a detailed review of related results. ...
Article
The Bradley–Terry–Luce (BTL) model is a benchmark model for pairwise comparisons between individuals. Despite recent progress on the first-order asymptotics of several popular procedures, the understanding of uncertainty quantification in the BTL model remains largely incomplete, especially when the underlying comparison graph is sparse. In this paper, we fill this gap by focusing on two estimators that have received much recent attention: the maximum likelihood estimator (MLE) and the spectral estimator. Using a unified proof strategy, we derive sharp and uniform non-asymptotic expansions for both estimators in the sparsest possible regime (up to some poly-logarithmic factors) of the underlying comparison graph. These expansions allow us to obtain: (i) finite-dimensional central limit theorems for both estimators; (ii) construction of confidence intervals for individual ranks; (iii) the optimal constant of $\ell_2$ estimation, which is achieved by the MLE but not by the spectral estimator. Our proof is based on a self-consistent equation of the second-order remainder vector and a novel leave-two-out analysis.
... Both these stochastic methods and the generalization of the AHP can be used on incomplete data, when some of the paired comparisons are missing (Harker, 1987; Ishizaka and Labib, 2011), which is often demonstrated on sports examples (e.g., Bozóki et al., 2016). However, in this regard, several theoretical questions have been investigated in the most recent literature (Chen et al., 2022). ...
Preprint
Full-text available
Several methods of preference modeling, ranking, voting and multi-criteria decision making include pairwise comparisons. It is usually simpler to compare two objects at a time, furthermore, some relations (e.g., the outcome of sports matches) are naturally known for pairs. This paper investigates and compares pairwise comparison models and the stochastic Bradley-Terry model. It is proved that they provide the same priority vectors for consistent (complete or incomplete) comparisons. For incomplete comparisons, all filling in levels are considered. Recent results identified the optimal subsets and sequences of multiplicative/additive/reciprocal pairwise comparisons for small sizes of items (up to n = 6). Simulations of this paper show that the same subsets and sequences are optimal in case of the Bradley-Terry and the Thurstone models as well. This, somehow surprising, coincidence suggests the existence of a more general result. Further models of information and preference theory are subject to future investigation in order to identify optimal subsets of input data.
Preprint
The recent paper \cite{GSZ2023} on estimation and inference for the top-ranking problem in the Bradley-Terry-Luce (BTL) model presented a surprising result: componentwise estimation and inference can be done under much weaker conditions on the number of comparisons than is required for full-dimensional estimation. The present paper revisits this finding from a completely different viewpoint. Namely, we show how a theoretical study of estimation in sup-norm can be reduced to the analysis of plug-in semiparametric estimation. For the latter, we adopt and extend the general approach from \cite{Sp2024} for high-dimensional estimation. The main tool of the analysis is a theory of perturbed marginal optimization, where an objective function depends on a low-dimensional target parameter along with a high-dimensional nuisance parameter. A particular focus of the study is the critical dimension condition. Full-dimensional estimation requires in general the condition $N \gg p$ between the effective parameter dimension $p$ and the effective sample size $N$ corresponding to the smallest eigenvalue of the Fisher information matrix $F$. Inference on the estimated parameter is even more demanding: the condition $N \gg p^2$ cannot in general be avoided; see \cite{Sp2024}. However, for sup-norm estimation, the critical dimension condition can be reduced to $N \geq C \log(p)$. Compared to \cite{GSZ2023}, the proposed approach works for the classical MLE and does not require any resampling procedure, applies to a more general structure of the comparison graph, and yields more accurate expansions for each component of the parameter vector.
Article
Given pairwise comparisons between multiple items, how to rank them so that the ranking matches the observations? This problem, known as rank aggregation, has found many applications in sports, recommendation systems and other web applications. We focus on the ranking problem under the Erdős–Rényi outliers model: only a subset of pairwise comparisons is observed, being either clean or corrupted copies of the true score differences. We investigate the spectral ranking algorithms that are based on unnormalized and normalized data matrices. The key is to understand their performance in recovering the underlying scores of each item from the observed data. This reduces to deriving an entry-wise perturbation error bound between the top eigenvectors of the unnormalized/normalized data matrix and its population counterpart. By using the leave-one-out technique, we provide a sharper $\ell_\infty$-norm perturbation bound of the eigenvectors and derive an error bound on the maximum displacement for each item, with only $O(n\log n)$ samples. In addition, we also derive the sample complexity to perform top-K ranking under mild assumptions. Our theoretical analysis improves upon the state-of-the-art results in terms of sample complexity, and our numerical experiments confirm these theoretical findings.
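The recovery task in this observation model, estimating scores from pairwise score differences on a graph, can be mimicked by a much simpler least-squares baseline (a hedged sketch, not the spectral estimators analyzed in the paper; the function name and toy data are invented):

```python
import numpy as np

def lstsq_scores(n, edges, diffs):
    """Recover item scores, up to a global shift, from observed pairwise
    differences diffs[k] ~ s[i] - s[j] for (i, j) = edges[k], via least
    squares on the incidence matrix of the comparison graph."""
    A = np.zeros((len(edges), n))
    for k, (i, j) in enumerate(edges):
        A[k, i], A[k, j] = 1.0, -1.0
    s_hat, *_ = np.linalg.lstsq(A, np.asarray(diffs, dtype=float), rcond=None)
    return s_hat - s_hat.mean()   # fix the shift ambiguity by centering

# noiseless demo on a connected 4-item graph: exact recovery
s = np.array([2.0, 0.5, -1.0, -1.5])          # centered true scores
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (1, 3)]
diffs = [s[i] - s[j] for i, j in edges]
s_hat = lstsq_scores(4, edges, diffs)
```

With a fraction of entries corrupted, as in the ERO model, plain least squares degrades; that is one motivation for the eigenvector-based estimators and the entry-wise perturbation analysis the abstract describes.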
Preprint
This paper addresses the item ranking problem with associated covariates, focusing on scenarios where the preference scores cannot be fully explained by covariates and the remaining intrinsic scores are sparse. Specifically, we extend the pioneering Bradley-Terry-Luce (BTL) model by incorporating covariate information and considering sparse individual intrinsic scores. Our work introduces novel model identification conditions and examines the statistical rates of the regularized penalized Maximum Likelihood Estimator (MLE). We then construct a debiased estimator for the penalized MLE and analyze its distributional properties. Additionally, we apply our method to the goodness-of-fit test for models with no latent intrinsic scores, namely, models in which the covariates fully explain the preference scores of individual items. We also offer confidence intervals for ranks. Our numerical studies lend further support to our theoretical findings, validating our proposed method.
Preprint
We consider a covariate-assisted ranking model grounded in the Plackett--Luce framework. Unlike existing works focusing on pure covariates or individual effects with fixed covariates, our approach integrates individual effects with dynamic covariates. This added flexibility enhances realistic ranking yet poses significant challenges for analyzing the associated estimation procedures. This paper makes an initial attempt to address these challenges. We begin by discussing the sufficient and necessary condition for the model's identifiability. We then introduce an efficient alternating maximization algorithm to compute the maximum likelihood estimator (MLE). Under suitable assumptions on the topology of comparison graphs and dynamic covariates, we establish a quantitative uniform consistency result for the MLE with convergence rates characterized by the asymptotic graph connectivity. The proposed graph topology assumption holds for several popular random graph models under optimal leading-order sparsity conditions. A comprehensive numerical study is conducted to corroborate our theoretical findings and demonstrate the application of the proposed model to real-world datasets, including horse racing and tennis competitions.
Article
Rank aggregation with pairwise comparisons is widely encountered in sociology, politics, economics, psychology, sports, etc. Given the enormous social impact and the consequent incentives, the potential adversary has a strong motivation to manipulate the ranking list. However, the ideal attack opportunity and the excessive adversarial capability cause the existing methods to be impractical. To fully explore the potential risks, we leverage an online attack on the vulnerable data collection process. Since it is independent of rank aggregation and lacks effective protection mechanisms, we disrupt the data collection process by fabricating pairwise comparisons without knowledge of the future data or the true distribution. From the game-theoretic perspective, the confrontation scenario between the online manipulator and the ranker who takes control of the original data source is formulated as a distributionally robust game that deals with the uncertainty of knowledge. Then we demonstrate that the equilibrium in the above game is potentially favorable to the adversary by analyzing the vulnerability of the sampling algorithms such as Bernoulli and reservoir methods. According to the above theoretical analysis, different sequential manipulation policies are proposed under a Bayesian decision framework and a large class of parametric pairwise comparison models. For attackers with complete knowledge, we establish the asymptotic optimality of the proposed policies. To increase the success rate of the sequential manipulation with incomplete knowledge, a distributionally robust estimator, which replaces the maximum likelihood estimation in a saddle point problem, provides a conservative data generation solution. Finally, the corroborating empirical evidence shows that the proposed method manipulates the results of rank aggregation methods in a sequential manner.
Preprint
The Hanson-Wright inequality provides a powerful tool for bounding the norm $\|\xi\|$ of a centered stochastic vector $\xi$ with sub-gaussian behavior. This paper extends the bounds to the case when $\xi$ only has bounded exponential moments of the form $\log E \exp\langle V^{-1}\xi, u\rangle \leq \|u\|^2/2$, where $V^2 \geq \mathrm{Var}(\xi)$ and $\|u\| \leq g$ for some fixed $g$. For a linear mapping $Q$, we present an upper quantile function $z_c(B, x)$ ensuring $P(\|Q\xi\| > z_c(B, x)) \leq 3e^{-x}$ with $B = Q V^2 Q^\top$. The obtained results exhibit a phase transition effect: with a value $x_c$ depending on $g$ and $B$, for $x \leq x_c$, the function $z_c(B, x)$ replicates the case of a Gaussian vector $\xi$, that is, $z_c^2(B, x) = \mathrm{tr}(B) + 2\sqrt{x\,\mathrm{tr}(B^2)} + 2x\|B\|$. For $x > x_c$, the function $z_c(B, x)$ grows linearly in $x$. The results are specialized to the case of Bernoulli vector sums and to covariance estimation in Frobenius norm.
Preprint
This technical report studies the problem of ranking from pairwise comparisons in the classical Bradley-Terry-Luce (BTL) model, with a focus on score estimation. For general graphs, we show that, with sufficiently many samples, maximum likelihood estimation (MLE) achieves an entrywise estimation error matching the Cram\'er-Rao lower bound, which can be stated in terms of effective resistances; the key to our analysis is a connection between statistical estimation and iterative optimization by preconditioned gradient descent. We are also particularly interested in graphs with locality, where only nearby items can be connected by edges; our analysis identifies conditions under which locality does not hurt, i.e. comparing the scores between a pair of items that are far apart in the graph is nearly as easy as comparing a pair of nearby items. We further explore divide-and-conquer algorithms that can provably achieve similar guarantees even in the regime with the sparsest samples, while enjoying certain computational advantages. Numerical results validate our theory and confirm the efficacy of the proposed algorithms.
Article
Full-text available
The sustainable management of the Land-Water-Energy-Food (LWEF) nexus requires an environmental characterization that allows the comparison of complex interlinkages between nexus resources and livelihoods. This complexity makes the characterization difficult, coupled with the limited number of studies quantifying the sustainability of the LWEF nexus and its linkage with livelihoods. Therefore, the present study aimed to investigate the link between a sustainable LWEF nexus and livelihoods. To address this objective, the proposed methodology starts with a detailed identification of LWEF and livelihood indicators, which depicts well-defined, shared, and holistic methods to evaluate sustainability. With this we used the analytical hierarchy process and a pairwise comparison matrix in combination with a weighting model. The resulting composite LWEF nexus index was 0.083, representing low sustainability. This composite index implies that the use and management of LWEF nexus resources in the study area is very low; as the composite index approaches 1, the use and management of nexus resources are in a good condition characterized by sustainability. This could be linked with nexus resource consumption, use, and management. From the analysis of the weights of the land, water, energy, and food nexus resources, the highest weight was observed for food. The focus on food production alone shows no clear synergy in provisioning, supporting, or regulating nexus resources to address livelihoods. The result further showed that LWEF nexus resources have a strong correlation with livelihoods. This was evidenced by social (r > 0.8, p < 0.01), natural (r > 0.3, p < 0.05) and physical (r > 0.6, p < 0.01) livelihood indicators showing a strong positive correlation with LWEF nexus resources.
Based on the findings of the study, it was observed that managing nexus resources not only provides a significant contribution to achieving a sustainable LWEF nexus, but is also effective for enhancing livelihoods through food security. This could be attained by strong evidence-based policy to ensure sustainable use of nexus resources. The results provided by this study would serve as the foundation for future study, policy formulation and implementation.
Article
Full-text available
In this paper, we introduce a new ranking system where the data are preferences resulting from paired comparisons. When direct preferences are missing or unclear, preferences are determined through indirect comparisons. Given that a ranking of $n$ subjects implies $\binom{n}{2}$ paired preferences, the resultant computational problem is the determination of an optimal ranking where the agreement between the preferences implied by the ranking and the data preferences is maximized. Comparisons are carried out via simulation studies where the proposed rankings outperform Bradley–Terry in a particular predictive comparison.
Article
Full-text available
This paper is concerned with the problem of top-K ranking from pairwise comparisons. Given a collection of n items and a few pairwise comparisons across them, one wishes to identify the set of K items that receive the highest ranks. To tackle this problem, we adopt the logistic parametric model - the Bradley-Terry-Luce model, where each item is assigned a latent preference score, and where the outcome of each pairwise comparison depends solely on the relative scores of the two items involved. Recent works have made significant progress towards characterizing the performance (e.g. the mean square error for estimating the scores) of several classical methods, including the spectral method and the maximum likelihood estimator (MLE). However, where they stand regarding top-K ranking remains unsettled. We demonstrate that under a natural random sampling model, the spectral method alone, or the regularized MLE alone, is minimax optimal in terms of the sample complexity - the number of paired comparisons needed to ensure exact top-K identification, for the fixed dynamic range regime. This is accomplished via optimal control of the entrywise error of the score estimates. We complement our theoretical studies by numerical experiments, confirming that both methods yield low entrywise errors for estimating the underlying scores. Our theory is established via a novel leave-one-out trick, which proves effective for analyzing both iterative and non-iterative procedures. Along the way, we derive an elementary eigenvector perturbation bound for probability transition matrices, which parallels the Davis-Kahan $\sin\Theta$ theorem for symmetric matrices. This also allows us to close the gap between the $\ell_2$ error upper bound for the spectral method and the minimax lower limit.
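The BTL log-likelihood is concave in the score vector, so on small examples even plain gradient ascent recovers the MLE whose entrywise error the paper controls (a minimal sketch with invented data; this is neither the regularized MLE nor the spectral method from the abstract):

```python
import numpy as np

def btl_win_prob(theta, i, j):
    """P(item i beats item j) under the BTL model with scores theta."""
    return 1.0 / (1.0 + np.exp(-(theta[i] - theta[j])))

def btl_mle(n, comparisons, steps=500, lr=0.5):
    """Gradient-ascent MLE for BTL scores from (winner, loser) pairs.
    Scores are identifiable only up to a shift, so each iterate is
    re-centered."""
    theta = np.zeros(n)
    for _ in range(steps):
        grad = np.zeros(n)
        for w, l in comparisons:
            p = btl_win_prob(theta, w, l)
            grad[w] += 1.0 - p        # winner's score pushed up
            grad[l] -= 1.0 - p        # loser's score pushed down
        theta += lr * grad / len(comparisons)
        theta -= theta.mean()
    return theta

# toy data: item 0 mostly beats 1 and 2, item 1 mostly beats 2
comparisons = [(0, 1)] * 4 + [(1, 0)] + [(1, 2)] * 4 + [(2, 1)] + [(0, 2)] * 5
theta = btl_mle(3, comparisons)
```

The recovered ordering theta[0] > theta[1] > theta[2] matches the win counts; top-K identification then amounts to thresholding these estimates, which is why entrywise (rather than $\ell_2$) error control matters in the abstract.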
Article
Full-text available
We explore the top-K rank aggregation problem. Suppose a collection of items is compared in pairs repeatedly, and we aim to recover a consistent ordering that focuses on the top-K ranked items based on partially revealed preference information. We investigate the Bradley-Terry-Luce model in which one ranks items according to their perceived utilities modeled as noisy observations of their underlying true utilities. Our main contributions are two-fold. First, in a general comparison model where item pairs to compare are given a priori, we attain an upper and lower bound on the sample size for reliable recovery of the top-K ranked items. Second, more importantly, extending the result to a random comparison model where item pairs to compare are chosen independently with some probability, we show that in slightly restricted regimes, the gap between the derived bounds reduces to a constant factor, hence reveals that a spectral method can achieve the minimax optimality on the (order-wise) sample size required for top-K ranking. That is to say, we demonstrate a spectral method alone to be sufficient to achieve the optimality and advantageous in terms of computational complexity, as it does not require an additional stage of maximum likelihood estimation that a state-of-the-art scheme employs to achieve the optimality. We corroborate our main results by numerical experiments.
Article
Full-text available
This paper explores the preference-based top-K rank aggregation problem. Suppose that a collection of items is repeatedly compared in pairs, and one wishes to recover a consistent ordering that emphasizes the top-K ranked items, based on partially revealed preferences. We focus on the Bradley-Terry-Luce (BTL) model that postulates a set of latent preference scores underlying all items, where the odds of paired comparisons depend only on the relative scores of the items involved. We characterize the minimax limits on identifiability of top-K ranked items, in the presence of random and non-adaptive sampling. Our results highlight a separation measure that quantifies the gap of preference scores between the $K$-th and $(K+1)$-th ranked items. The minimum sample complexity required for reliable top-K ranking scales inversely with the separation measure irrespective of other preference distribution metrics. To approach this minimax limit, we propose a nearly linear-time ranking scheme, called Spectral MLE, that returns the indices of the top-K items in accordance with a careful score estimate. In a nutshell, Spectral MLE starts with an initial score estimate with minimal squared loss (obtained via a spectral method), and then successively refines each component with the assistance of coordinate-wise MLEs. Encouragingly, Spectral MLE allows perfect top-K item identification under minimal sample complexity. The practical applicability of Spectral MLE is further corroborated by numerical experiments.
Article
Full-text available
The question of aggregating pairwise comparisons to obtain a global ranking over a collection of objects has been of interest for a very long time: be it ranking of online gamers (e.g. MSR's TrueSkill system) and chess players, aggregating social opinions, or deciding which product to sell based on transactions. In most settings, in addition to obtaining a ranking, finding `scores' for each object (e.g. player's rating) is of interest for understanding the intensity of the preferences. In this paper, we propose Rank Centrality, an iterative rank aggregation algorithm for discovering scores for objects (or items) from pairwise comparisons. The algorithm has a natural random walk interpretation over the graph of objects, with an edge present between a pair of objects if they are compared; the score, which we call Rank Centrality, of an object turns out to be its stationary probability under this random walk. To study the efficacy of the algorithm, we consider the popular Bradley-Terry-Luce (BTL) model in which each object has an associated score which determines the probabilistic outcomes of pairwise comparisons between objects. We bound the finite sample error rates between the scores assumed by the BTL model and those estimated by our algorithm. In particular, the number of samples required to learn the score well with high probability depends on the structure of the comparison graph. When the Laplacian of the comparison graph has a strictly positive spectral gap, e.g. each item is compared to a subset of randomly chosen items, this leads to an order-optimal dependence on the number of samples. Experimental evaluations on synthetic datasets generated according to the BTL model show that our algorithm performs as well as the Maximum Likelihood estimator for that model and outperforms a recently proposed algorithm by Ammar and Shah (2011).
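The random-walk construction can be sketched directly from the description above (a toy implementation; the normalization by the maximum comparison degree follows the standard Rank Centrality recipe, but the function name and data are invented):

```python
import numpy as np

def rank_centrality(wins):
    """Rank Centrality scores: the stationary distribution of a random
    walk that moves from item i to item j with probability proportional
    to the fraction of their comparisons won by j."""
    n = wins.shape[0]
    total = wins + wins.T                            # comparisons per pair
    d = max(1, int((total > 0).sum(axis=1).max()))   # max comparison degree
    P = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i != j and total[i, j] > 0:
                P[i, j] = wins[j, i] / total[i, j] / d   # walk toward winners
        P[i, i] = 1.0 - P[i].sum()                   # lazy self-loop
    pi = np.ones(n) / n
    for _ in range(2000):                            # power iteration
        pi = pi @ P
    return pi / pi.sum()

# toy data: wins[i, j] = number of times i beat j
wins = np.array([[0, 8, 9],
                 [2, 0, 7],
                 [1, 3, 0]], dtype=float)
pi = rank_centrality(wins)
```

Item 0 wins most of its comparisons, so the walk spends the most time there and `pi[0]` is largest; the spectral-gap condition in the abstract is what guarantees this stationary distribution can be estimated accurately from few samples.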
Article
Full-text available
We study Cramér-Rao bounds (CRB's) for estimation problems on Riemannian manifolds. In [S. T. Smith, “Covariance, Subspace, and Intrinsic Cramér-Rao bounds,” IEEE Trans. Signal Process., vol. 53, no. 5, 1610-1630, 2005], the author gives intrinsic CRB's in the form of matrix inequalities relating the covariance of estimators and the Fisher information of estimation problems. We focus on estimation problems whose parameter space P̅ is a Riemannian submanifold or a Riemannian quotient manifold of a parent space P, that is, estimation problems on manifolds with either deterministic constraints or ambiguities. The CRB's in the aforementioned reference would be expressed w.r.t. bases of the tangent spaces to P̅. In some cases though, it is more convenient to express covariance and Fisher information w.r.t. bases of the tangent spaces to P. We give CRB's w.r.t. such bases expressed in terms of the geodesic distances on the parameter space. The bounds are valid even for singular Fisher information matrices. In two examples, we show how the CRB's for synchronization problems (including a type of sensor network localization problem) differ in the presence or absence of anchors, leading to bounds for estimation on either submanifolds or quotient manifolds with very different interpretations.
Article
Full-text available
The problem of matching two sets of features appears in various tasks of computer vision and can often be formalized as a problem of permutation estimation. We address this problem from a statistical point of view and provide a theoretical analysis of the accuracy of several natural estimators. To this end, the minimax rate of separation is investigated and its expression is obtained as a function of the sample size, noise level and dimension. We consider the cases of homoscedastic and heteroscedastic noise and establish, in each case, tight upper bounds on the separation distance of several estimators. These upper bounds are shown to be unimprovable both in the homoscedastic and heteroscedastic settings. Interestingly, these bounds demonstrate that a phase transition occurs when the dimension $d$ of the features is of the order of the logarithm of the number of features $n$. For $d = O(\log n)$, the rate is dimension free and equals $\sigma(\log n)^{1/2}$, where $\sigma$ is the noise level. In contrast, when $d$ is larger than $c\log n$ for some constant $c > 0$, the minimax rate increases with $d$ and is of the order $\sigma(d\log n)^{1/4}$. We also discuss the computational aspects of the estimators and provide empirical evidence of their consistency on synthetic data. Finally, we show that our results extend to more general matching criteria.
Article
Full-text available
From the viewpoint of networks, a ranking system for players or teams in sports is equivalent to a centrality measure for sports networks, whereby a directed link represents the result of a single game. Previously proposed network-based ranking systems are derived from static networks, i.e., aggregation of the results of games over time. However, the score of a player (or team) fluctuates over time. Defeating a renowned player at peak performance is intuitively more rewarding than defeating the same player in other periods. To account for this factor, we propose a dynamic variant of such a network-based ranking system and apply it to professional men's tennis data. We derive a set of linear online update equations for the score of each player. The proposed ranking system predicts the outcome of future games with higher accuracy than its static counterparts.
Article
Full-text available
Pairwise comparison matrices are widely used in Multicriteria Decision Making. This article applies incomplete pairwise comparison matrices in the area of sport tournaments, namely proposing alternative rankings for the 2010 Chess Olympiad Open tournament. It is shown that the results are robust with respect to the scaling technique. In order to compare different rankings, a distance function is introduced with the aim of taking into account the subjective nature of human perception. Analysis of the weight vectors implies that methods based on pairwise comparisons have common roots. Visualization of the results is provided by Multidimensional Scaling on the basis of the defined distance. In some cases the proposed rankings give intuitively better outcomes than the currently used lexicographical orders.
Conference Paper
Full-text available
The paper is concerned with learning to rank, which is to construct a model or a function for ranking objects. Learning to rank is useful for document retrieval, collaborative filtering, and many other applications. Several methods for learning to rank have been proposed, which take object pairs as 'instances' in learning. We refer to them as the pairwise approach in this paper. Although the pairwise approach offers advantages, it ignores the fact that ranking is a prediction task on a list of objects. The paper postulates that learning to rank should adopt the listwise approach in which lists of objects are used as 'instances' in learning. The paper proposes a new probabilistic method for the approach. Specifically, it introduces two probability models, respectively referred to as permutation probability and top-k probability, to define a listwise loss function for learning. Neural Network and Gradient Descent are then employed as model and algorithm in the learning method. Experimental results on information retrieval show that the proposed listwise approach performs better than the pairwise approach.
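The permutation probability the abstract refers to can be illustrated with the Plackett-Luce form, where the probability of an ordering is built up by repeatedly choosing the top remaining item in proportion to its score. A minimal sketch (the `scores` values are illustrative, not from the paper):

```python
from itertools import permutations

def permutation_probability(scores, order):
    """Plackett-Luce probability of `order` (item ids, best first):
    each position is filled by drawing among the remaining items
    with probability proportional to their scores."""
    prob, remaining = 1.0, list(order)
    for item in order:
        prob *= scores[item] / sum(scores[j] for j in remaining)
        remaining.remove(item)
    return prob

scores = {0: 3.0, 1: 2.0, 2: 1.0}  # illustrative latent scores
# Summed over all 3! orderings, the probabilities form a distribution.
total = sum(permutation_probability(scores, list(p))
            for p in permutations(scores))
```

Orderings that place high-score items first receive higher probability, which is exactly the property a listwise loss exploits.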
Conference Paper
Full-text available
The majority of recommender systems are designed to make recommendations for individual users. However, in some circumstances the items to be selected are not intended for personal usage but for a group; e.g., a DVD could be watched by a group of friends. In order to generate effective recommendations for a group the system must satisfy, as much as possible, the individual preferences of the group's members. This paper analyzes the effectiveness of group recommendations obtained aggregating the individual lists of recommendations produced by a collaborative filtering system. We compare the effectiveness of individual and group recommendation lists using normalized discounted cumulative gain. It is observed that the effectiveness of a group recommendation does not necessarily decrease when the group size grows. Moreover, when individual recommendations are not effective a user could obtain better suggestions looking at the group recommendations. Finally, it is shown that the more alike the users in the group are, the more effective the group recommendations are.
Book
Learning to Rank for Information Retrieval is an introduction to the field of learning to rank, a hot research topic in information retrieval and machine learning. It categorizes the state-of-the-art learning-to-rank algorithms into three approaches from a unified machine learning perspective, describes the loss functions and learning mechanisms in different approaches, reveals their relationships and differences, shows their empirical performances on real IR applications, and discusses their theoretical properties such as generalization ability. As a tutorial, Learning to Rank for Information Retrieval helps people find the answers to the following critical questions: To what respect are learning-to-rank algorithms similar and in which aspects do they differ? What are the strengths and weaknesses of each algorithm? Which learning-to-rank algorithm empirically performs the best? Is ranking a new machine learning problem? What are the unique theoretical issues for ranking as compared to classification and regression? Learning to Rank for Information Retrieval is both a guide for beginners who are embarking on research in this area, and a useful reference for established researchers and practitioners.
Conference Paper
Due to the prevalence of group activities in people's daily life, recommending content to a group of users becomes an important task in many information systems. A fundamental problem in group recommendation is how to aggregate the preferences of group members to infer the decision of a group. Toward this end, we contribute a novel solution, namely AGREE (short for ''Attentive Group REcommEndation''), to address the preference aggregation problem by learning the aggregation strategy from data, which is based on the recent developments of attention network and neural collaborative filtering (NCF). Specifically, we adopt an attention mechanism to adapt the representation of a group, and learn the interaction between groups and items from data under the NCF framework. Moreover, since many group recommender systems also have abundant interactions of individual users on items, we further integrate the modeling of user-item interactions into our method. Through this way, we can reinforce the two tasks of recommending items for both groups and users. By experimenting on two real-world datasets, we demonstrate that our AGREE model not only improves the group recommendation performance but also enhances the recommendation for users, especially for cold-start users that have no historical interactions individually.
Article
There has been a recent surge of interest in studying permutation-based models for ranking from pairwise comparison data. Despite being structurally richer and more robust than parametric ranking models, permutation-based models are less well understood statistically and generally lack efficient learning algorithms. In this work, we study a prototype of permutation-based ranking models, namely, the noisy sorting model. We establish the optimal rates of learning the model under two sampling procedures. Furthermore, we provide a fast algorithm to achieve near-optimal rates if the observations are sampled independently. Along the way, we discover properties of the symmetric group which are of theoretical interest.
Conference Paper
Consider a noisy linear observation model with an unknown permutation, based on observing y = Π*Ax* + w, where x* ∈ ℝ^d is an unknown vector, Π* is an unknown n × n permutation matrix, and w ∈ ℝ^n is additive Gaussian noise. We analyze the problem of permutation recovery in a random design setting in which the entries of the matrix A are drawn i.i.d. from a standard Gaussian distribution, and establish sharp conditions on the SNR, sample size n, and dimension d under which Π* is exactly and approximately recoverable. On the computational front, we show that the maximum likelihood estimate of Π* is NP-hard to compute, while also providing a polynomial time algorithm when d = 1.
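The d = 1 case is tractable because, with a scalar x, the best permutation simply aligns the sorted order of y with the sorted order of a (or of -a, since the sign of x is unknown). The following is a sketch under the simplifying assumptions of distinct entries and, in the demo, noiseless data; it is not the paper's exact procedure:

```python
import numpy as np

def recover_permutation_1d(y, a):
    """When d = 1 the model reads y = x * a[perm] + w for scalar x,
    so we try both sorted alignments (one per sign of x) and keep the
    one with the smaller least-squares residual."""
    n = len(y)
    best_perm, best_cost = None, np.inf
    order_y = np.argsort(y)
    for sign in (1.0, -1.0):
        order_a = np.argsort(sign * a)
        perm = np.empty(n, dtype=int)
        perm[order_y] = order_a          # k-th smallest y <- k-th smallest sign*a
        ap = a[perm]
        x = ap @ y / (ap @ ap)           # least-squares scalar for this alignment
        cost = np.sum((y - x * ap) ** 2)
        if cost < best_cost:
            best_perm, best_cost = perm, cost
    return best_perm

rng = np.random.default_rng(0)
a = rng.normal(size=8)
true_perm = rng.permutation(8)
y = 2.5 * a[true_perm]                   # noiseless data for illustration
est = recover_permutation_1d(y, a)
```

With noise added, the same sorting step remains the maximum likelihood alignment for each candidate sign, which is why the d = 1 problem is polynomial while general d is NP-hard.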
Article
Spearman's measure of disarray D is the sum of the absolute values of the differences between the ranks. We treat D as a metric on the set of permutations. The limiting mean, variance and normality are established. D is shown to be related to the metric I arising from Kendall's τ through the combinatorial inequality I ≤ D ≤ 2I.
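The two metrics, and the stated inequality I ≤ D ≤ 2I, can be checked exhaustively for small n; a minimal sketch measuring both distances against the identity permutation:

```python
from itertools import combinations, permutations

def footrule(sigma):
    """Spearman's D: total displacement of sigma from the identity."""
    return sum(abs(s - i) for i, s in enumerate(sigma))

def kendall(sigma):
    """Kendall's I: number of pairs that sigma puts in inverted order."""
    return sum(1 for i, j in combinations(range(len(sigma)), 2)
               if sigma[i] > sigma[j])

# Check I <= D <= 2I over every permutation of 4 elements.
holds = all(kendall(p) <= footrule(p) <= 2 * kendall(p)
            for p in permutations(range(4)))
```

For an adjacent transposition, e.g. (1, 0, 2, 3), D = 2 and I = 1, so both ends of the inequality are tight.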
Chapter
The Analytic Hierarchy Process (AHP) is a problem solving framework. It is a systematic procedure for representing the elements of any problem. It organizes the basic rationality by breaking down a problem into its smaller constituent parts and then calls for only simple pairwise comparison judgments, to develop priorities in each hierarchy.
Conference Paper
The recent explosion of sports tracking data has dramatically increased the interest in effective data processing and access of sports plays (i.e., short trajectory sequences of players and the ball). And while there exist systems that offer improved categorizations of sports plays (e.g., into relatively coarse clusters), to the best of our knowledge there does not exist any retrieval system that can effectively search for the most relevant plays given a specific input query. One significant design challenge is how best to phrase queries for multi-agent spatiotemporal trajectories such as sports plays. We have developed a novel query paradigm and retrieval system, which we call Chalkboarding, that allows the user to issue queries by drawing a play of interest (similar to how coaches draw up plays). Our system utilizes effective alignment, templating, and hashing techniques tailored to multi-agent trajectories, and achieves accurate play retrieval at interactive speeds. We showcase the efficacy of our approach in a user study, where we demonstrate orders-of-magnitude improvements in search quality compared to baseline systems.
Article
We consider data in the form of pairwise comparisons of n items, with the goal of precisely identifying the top k items for some value of k < n, or alternatively, recovering a ranking of all the items. We consider a simple counting algorithm that ranks the items in order of the number of pairwise comparisons won, and show it has three important and useful features: (a) Computational efficiency: the simplicity of the method leads to speed-ups of several orders of magnitude in computation time as compared to prior work; (b) Robustness: our theoretical guarantees make no assumptions on the pairwise-comparison probabilities, while prior work is restricted to the specific BTL model and performs poorly if the data is not true to it; and (c) Optimality: we show that up to constant factors, our algorithm achieves the information-theoretic limits for recovering the top-k subset. Finally, we extend our results to obtain sharp guarantees for approximate recovery under the Hamming distortion metric.
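The counting algorithm described above, ranking items by the number of pairwise comparisons won, is simple enough to sketch in a few lines; the toy dataset is hypothetical:

```python
from collections import Counter

def copeland_rank(comparisons, n):
    """Rank items 0..n-1 by the number of pairwise comparisons won;
    `comparisons` is an iterable of (winner, loser) pairs."""
    wins = Counter(w for w, _ in comparisons)
    return sorted(range(n), key=lambda i: wins[i], reverse=True)

def top_k(comparisons, n, k):
    """Top-k subset: the k items with the most wins."""
    return set(copeland_rank(comparisons, n)[:k])

# Hypothetical outcomes: item 0 beats everyone, 1 beats 2 and 3, 2 beats 3.
data = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
ranking = copeland_rank(data, 4)
```

The single pass over the comparisons plus one sort is what gives the method its speed advantage over likelihood-based estimators.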
Article
A probability distribution is defined over the r! permutations of r objects in such a way as to incorporate up to r! − 1 parameters. Problems of estimation and testing are considered. The results are applied to data on voting at elections and beanstores.
Article
This survey is divided into three major sections. The first concerns mathematical results about the choice axiom and the choice models that devolve from it. For example, its relationship to Thurstonian theory is satisfyingly understood; much is known about how choice and ranking probabilities may relate, although little of this knowledge seems empirically useful; and there are certain interesting statistical facts. The second section describes attempts that have been made to test and apply these models. The testing has been done mostly, though not exclusively, by psychologists; the applications have been mostly in economics and sociology. Although it is clear from many experiments that the conditions under which the choice axiom holds are surely delicate, the need for simple, rational underpinnings in complex theories, as in economics and sociology, leads one to accept assumptions that are at best approximate. And the third section concerns alternative, more general theories which, in spirit, are much like the choice axiom. Perhaps I had best admit at the outset that, as a commentator on this scene, I am qualified no better than many others and rather less well than some who have been working in this area recently, which I have not been. My pursuits have led me along other, somewhat related routes. On the one hand, I have contributed to some of the recent, purely algebraic aspects of fundamental measurement (for a survey of some of this material, see Krantz, Luce, Suppes, & Tversky, 1971). And on the other hand, I have worked in the highly probabilistic area of psychophysical theory; but the empirical materials have led me away from axiomatic structures, such as the choice axiom, to more structural, neural models which are not readily axiomatized at the present time.
After some attempts to apply choice models to psychophysical phenomena (discussed below in its proper place), I was led to conclude that it is not a very promising approach to these data, and so I have not been actively studying any aspect of the choice axiom in over 12 years. With that understood, let us begin.
Book
1. Choosing as a way of life Appendix A1. Choosing a residential telecommunications bundle 2. Introduction to stated preference models and methods 3. Choosing a choice model Appendix A3. Maximum likelihood estimation technique Appendix B3. Linear probability and generalised least squares models 4. Experimental design 5. Design of choice experiments Appendix A5. 6. Relaxing the IID assumption-introducing variants of the MNL model Appendix A6. Detailed characterisation of the nested logit model Appendix B6. Advanced discrete choice methods 7. Complex, non-IID multiple choice designs 8. Combining sources of preference data 9. Implementing SP choice behaviour projects 10. Marketing case studies 11. Transportation case studies 12. Environmental valuation case studies 13. Cross and external validity of SP models.
Article
I am grateful to Joseph B. Kadane for numerous constructive suggestions offered during discussions of this research. The financial sponsorship of the U.S. Department of Transportation through grant DOT-OS-4006 is also acknowledged. The opinions and conclusions expressed herein are solely those of the author.
Article
Pairwise comparison is commonly used to estimate preference values of finite alternatives with respect to a given criterion. We discuss 18 estimating methods for deriving preference values from pairwise judgment matrices under a common framework of effectiveness: distance minimization and correctness in error free cases. We point out the importance of commensurate scales when aggregating all the columns of a judgment matrix and the desirability of weighting the columns according to the preference values. The common framework is useful in differentiating the strengths and weaknesses of the estimation methods. Some comparison results of these 18 methods on two sets of judgment matrices with small and large errors are presented. We also give insight regarding the underlying mathematical structure of some of the methods.
Conference Paper
We study the subset ranking problem, motivated by its important application in web-search. In this context, we consider the standard DCG criterion (discounted cumulated gain) that measures the quality of items near the top of the rank-list. Similar to error minimization for binary classification, the DCG criterion leads to a non-convex optimization problem that can be NP-hard. Therefore a computationally more tractable approach is needed. We present bounds that relate the approximate optimization of DCG to the approximate minimization of certain regression errors. These bounds justify the use of convex learning formulations for solving the subset ranking problem. The resulting estimation methods are not conventional, in that we focus on the estimation quality in the top-portion of the rank-list. We further investigate the generalization ability of these formulations. Under appropriate conditions, the consistency of the estimation schemes with respect to the DCG metric can be derived.
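The DCG criterion discussed above is itself easy to compute; the sketch below uses the common exponential-gain form, which is one of several DCG variants and not necessarily the exact definition used in the paper:

```python
import math

def dcg_at_k(relevances, k):
    """DCG of a ranked list of relevance grades, using the
    exponential-gain form (2^rel - 1) / log2(position + 2)."""
    return sum((2 ** rel - 1) / math.log2(pos + 2)
               for pos, rel in enumerate(relevances[:k]))

ideal = dcg_at_k([3, 2, 1, 0], 3)      # best item ranked first
swapped = dcg_at_k([2, 3, 1, 0], 3)    # top two items exchanged
```

Because the discount shrinks with position, any swap that moves a more relevant item down the list strictly decreases DCG, which is the top-of-list emphasis the paper's regression bounds target.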
Article
This tutorial is concerned with a comprehensive introduction to the research area of learning to rank for information retrieval. In the first part of the tutorial, we will introduce three major approaches to learning to rank, i.e., the pointwise, pairwise, and listwise approaches, analyze the relationship between the loss functions used in these approaches and the widely-used IR evaluation measures, evaluate the performance of these approaches on the LETOR benchmark datasets, and demonstrate how to use these approaches to solve real ranking applications. In the second part of the tutorial, we will discuss some advanced topics regarding learning to rank, such as relational ranking, diverse ranking, semi-supervised ranking, transfer ranking, query-dependent ranking, and training data preprocessing. In the third part, we will briefly mention the recent advances on statistical learning theory for ranking, which explain the generalization ability and statistical consistency of different ranking methods. In the last part, we will conclude the tutorial and show several future research directions.
Article
This paper is a practical study of how to implement the Quicksort sorting algorithm and its best variants on real computers, including how to apply various code optimization techniques. A detailed implementation combining the most effective improvements to Quicksort is given, along with a discussion of how to implement it in assembly language. Analytic results describing the performance of the programs are summarized. A variety of special situations are considered from a practical standpoint to illustrate Quicksort's wide applicability as an internal sorting method which requires negligible extra storage.
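As a compact illustration of the kind of Quicksort refinements analyzed in such studies, the sketch below combines median-of-three pivot selection with recursing on the smaller subarray first to bound stack depth; it is a readable Python rendition, not Sedgewick's tuned implementation:

```python
def quicksort(a, lo=0, hi=None):
    """In-place quicksort with median-of-three pivot selection and
    tail-call elimination on the larger partition."""
    if hi is None:
        hi = len(a) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        # Median-of-three: order a[lo], a[mid], a[hi]; the middle is the pivot.
        if a[mid] < a[lo]: a[lo], a[mid] = a[mid], a[lo]
        if a[hi] < a[lo]:  a[lo], a[hi] = a[hi], a[lo]
        if a[hi] < a[mid]: a[mid], a[hi] = a[hi], a[mid]
        pivot = a[mid]
        i, j = lo, hi
        while i <= j:
            while a[i] < pivot: i += 1
            while a[j] > pivot: j -= 1
            if i <= j:
                a[i], a[j] = a[j], a[i]
                i, j = i + 1, j - 1
        # Recurse on the smaller half; loop on the larger one.
        if j - lo < hi - i:
            quicksort(a, lo, j)
            lo = i
        else:
            quicksort(a, i, hi)
            hi = j
    return a
```

Median-of-three both avoids the quadratic worst case on already-sorted input and supplies sentinels for the inner loops, two of the classic improvements such analyses quantify.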
Conference Paper
We consider the problem of combining ranking results from various sources. In the context of the Web, the main applications include building meta-search engines, combining ranking functions, selecting documents based on multiple criteria, and improving search precision through word associations. We develop a set of techniques for the rank aggregation problem and compare their performance to that of well-known methods. A primary goal of our work is to design rank aggregation techniques that can effectively combat "spam," a serious problem in Web searches. Experiments show that our methods are simple, efficient, and effective.
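As a concrete baseline of the kind such rank-aggregation methods are compared against, positional (Borda) aggregation can be sketched as follows; the input lists are illustrative:

```python
def borda_aggregate(rankings):
    """Positional (Borda) aggregation: an item in position p of a
    length-n ranking scores n - p; items are ordered by total score,
    with ties broken alphabetically for determinism."""
    items = sorted(set().union(*rankings))
    n = len(items)
    score = {item: 0 for item in items}
    for ranking in rankings:
        for pos, item in enumerate(ranking):
            score[item] += n - pos
    return sorted(items, key=lambda x: (-score[x], x))

# Three hypothetical input rankings over the same items.
lists = [["a", "b", "c"], ["a", "c", "b"], ["b", "a", "c"]]
agg = borda_aggregate(lists)
```

Positional methods like this are cheap but can be manipulated by a single bad input list, which is precisely the spam concern that motivates the more robust aggregation techniques studied in the paper.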
Article
The efficiency of mergesort programs is analysed under a simple unit-cost model. In our analysis the time performance of the sorting programs includes the costs of key comparisons, element moves and address calculations. The goal is to establish the best possible time-bound relative to the model when sorting n integers. By the well-known information-theoretic argument, n log₂ n − O(n) is a lower bound for the integer-sorting problem in our framework. New implementations for two-way and four-way bottom-up mergesort are given, the worst-case complexities of which are shown to be bounded by 5.5 n log₂ n + O(n) and 3.25 n log₂ n + O(n), respectively. The theoretical findings are backed up with a series of experiments which show the practical relevance of our analysis when implementing library routines for internal-memory computations.
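A two-way bottom-up mergesort of the kind analyzed above can be sketched as follows; this is an idiomatic Python version rather than the paper's tuned implementation, so the constant factors differ:

```python
def bottom_up_mergesort(a):
    """Iterative two-way bottom-up mergesort: merge sorted runs of
    width 1, 2, 4, ... until the whole array is one sorted run."""
    n = len(a)
    src, dst = list(a), [None] * n
    width = 1
    while width < n:
        for lo in range(0, n, 2 * width):
            mid = min(lo + width, n)
            hi = min(lo + 2 * width, n)
            i, j, k = lo, mid, lo
            while i < mid and j < hi:
                if src[i] <= src[j]:
                    dst[k] = src[i]; i += 1
                else:
                    dst[k] = src[j]; j += 1
                k += 1
            # Exactly one side has leftovers; copy it verbatim.
            dst[k:hi] = src[i:mid] if i < mid else src[j:hi]
        src, dst = dst, src   # ping-pong between the two buffers
        width *= 2
    return src
```

Alternating between two buffers instead of merging in place is what keeps the per-element move count low, the quantity the unit-cost analysis above tracks.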
Article
This paper studies problems of inferring order given noisy information. In these problems there is an unknown order (permutation) π\pi on n elements denoted by 1,...,n. We assume that information is generated in a way correlated with π\pi. The goal is to find a maximum likelihood π\pi^* given the information observed. We will consider two different types of observations: noisy comparisons and noisy orders. The data in Noisy orders are permutations given from an exponential distribution correlated with \pi (this is also called the Mallow's model). The data in Noisy Comparisons is a signal given for each pair of elements which is correlated with their true ordering. In this paper we present polynomial time algorithms for solving both problems with high probability. As part of our proof we show that for both models the maximum likelihood solution π\pi^{\ast} is close to the original permutation π\pi. Our results are of interest in applications to ranking, such as ranking in sports, or ranking of search items based on comparisons by experts.
Article
The Bradley-Terry model for paired comparisons is a simple and much-studied means to describe the probabilities of the possible outcomes when individuals are judged against one another in pairs. Among the many studies of the model in the past 75 years, numerous authors have generalized it in several directions, sometimes providing iterative algorithms for obtaining maximum likelihood estimates for the generalizations. Building on a theory of algorithms known by the initials MM, for minorization-maximization, this paper presents a powerful technique for producing iterative maximum likelihood estimation algorithms for a wide class of generalizations of the Bradley-Terry model. While algorithms for problems of this type have tended to be custom-built in the literature, the techniques in this paper enable their mass production. Simple conditions are stated that guarantee that each algorithm described will produce a sequence that converges to the unique maximum likelihood estimator. Several of the algorithms and convergence results herein are new.
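For the basic (ungeneralized) Bradley-Terry model, the MM iteration is only a few lines; the sketch below uses illustrative win counts, and a fixed iteration count stands in for a proper convergence test:

```python
import numpy as np

def bradley_terry_mm(wins, iters=2000):
    """MM iteration for the basic Bradley-Terry model.
    `wins[i, j]` counts how often item i beat item j; the returned
    scores w satisfy P(i beats j) ~= w[i] / (w[i] + w[j])."""
    wins = np.asarray(wins, dtype=float)
    n = wins.shape[0]
    comps = wins + wins.T              # n_ij: comparisons between i and j
    total_wins = wins.sum(axis=1)      # W_i: total wins of item i
    w = np.ones(n)
    for _ in range(iters):
        denom = comps / (w[:, None] + w[None, :])
        np.fill_diagonal(denom, 0.0)
        w = total_wins / denom.sum(axis=1)
        w /= w.sum()                   # resolve the scale ambiguity
    return w

# Fractional "expected" win counts generated from illustrative true scores.
true = np.array([0.5, 0.3, 0.2])
W = np.zeros((3, 3))
for i in range(3):
    for j in range(3):
        if i != j:
            W[i, j] = 100 * true[i] / (true[i] + true[j])
est = bradley_terry_mm(W)
```

Each update is a closed-form maximization of a minorizing surrogate, so the likelihood never decreases; with a connected comparison graph the iterates converge to the unique normalized MLE.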
Article
This paper considers mixed, or random coefficients, multinomial logit (MMNL) models for discrete response, and establishes the following results: under mild regularity conditions, any discrete choice model derived from random utility maximization has choice probabilities that can be approximated as closely as one pleases by an MMNL model. Practical estimation of a parametric mixing family can be carried out by Maximum Simulated Likelihood Estimation or Method of Simulated Moments, and easily computed instruments are provided that make the latter procedure fairly efficient. The adequacy of a mixing specification can be tested simply as an omitted variable test with appropriately defined artificial variables. An application to a problem of demand for alternative vehicles shows that MMNL provides a flexible and computationally practical approach to discrete response analysis.
Article
In this paper we study noisy sorting without re-sampling. In this problem there is an unknown order a_{π(1)} < ... < a_{π(n)}, where π is a permutation on n elements. The input is the status of the n(n−1)/2 queries of the form q(a_i, a_j), where q(a_i, a_j) = + with probability at least 1/2 + γ if π(i) > π(j), for all pairs i ≠ j, where γ > 0 is a constant, and q(a_i, a_j) = −q(a_j, a_i) for all i and j. It is assumed that the errors are independent. Given the status of the queries, the goal is to find the maximum likelihood order; in other words, to find a permutation σ that minimizes the number of pairs with σ(i) > σ(j) and q(σ(i), σ(j)) = −. The problem so defined is the feedback arc set problem on distributions of inputs, each of which is a tournament obtained as a noisy perturbation of a linear order. Note that when γ < 1/2 and n is large, it is impossible to recover the original order π. It is known that the weighted feedback arc set problem on tournaments is NP-hard in general. Here we present an algorithm with running time n^{O(γ^{−4})} and sampling complexity O_γ(n log n) that with high probability solves the noisy sorting without re-sampling problem. We also show that if a_{σ(1)}, a_{σ(2)}, ..., a_{σ(n)} is an optimal solution of the problem then it is "close" to the original order: with high probability, Σ_i |σ(i) − π(i)| = Θ(n) and max_i |σ(i) − π(i)| = Θ(log n). Our results are of interest in applications to ranking, such as ranking in sports, or ranking of search items based on comparisons by experts.
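For small n, the maximum likelihood order that minimizes the number of disagreeing pairs can be found by brute force, which makes the objective concrete even though it is infeasible at scale (the paper's polynomial-time algorithm is far more sophisticated); the signal encoding below is a hypothetical convention for illustration:

```python
from itertools import combinations, permutations

def ml_order(q, n):
    """Brute-force MLE: the permutation minimizing the number of pairs
    on which the comparison signal q disagrees.  For i < j,
    q[(i, j)] = +1 means the signal says i precedes j; -1 the opposite.
    Only feasible for small n."""
    def disagreements(sigma):
        pos = {item: r for r, item in enumerate(sigma)}
        return sum(1 for i, j in combinations(range(n), 2)
                   if (pos[i] < pos[j]) != (q[(i, j)] > 0))
    return min(permutations(range(n)), key=disagreements)

# Signals consistent with the order 0, 1, 2, 3 except one flipped pair.
q = {(i, j): +1 for i, j in combinations(range(4), 2)}
q[(2, 3)] = -1
best = ml_order(q, 4)
```

This is exactly the (unweighted) feedback arc set objective on a tournament; the n! search over permutations is what the NP-hardness result says cannot in general be shortcut, except in the noisy-perturbation regime the paper exploits.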
A brief survey of bandwidth selection for density estimation
  • M C Jones
  • J S Marron
  • S J Sheather
Optimal sample complexity of m-wise data for top-k ranking
  • M Jang
  • S Kim
  • C Suh
  • S Oh
Permutation estimation and minimax matching thresholds
  • O Collier
  • A Dalalyan
Estimation of skill distributions
  • A Jadbabaie
  • A Makur
  • D Shah
Trueskill 2: An improved Bayesian skill rating system
  • T Minka
  • R Cleven
  • Y Zaykov
Supplement to “Optimal full ranking from pairwise comparisons”
  • P Chen
  • C Gao
  • A Y Zhang