How Similarity Helps to Efficiently Compute Kemeny Rankings∗
Nadja Betzler, Institut für Informatik, Friedrich-Schiller-Universität Jena, Ernst-Abbe-Platz 2, D-07743 Jena, Germany (betzler@minet.uni-jena.de)
Jiong Guo, Institut für Informatik, Friedrich-Schiller-Universität Jena, Ernst-Abbe-Platz 2, D-07743 Jena, Germany (guo@minet.uni-jena.de)
Michael R. Fellows, PC Research Unit, Office of DVC (Research), University of Newcastle, Callaghan, NSW 2308, Australia (michael.fellows@newcastle.edu.au)
Rolf Niedermeier, Institut für Informatik, Friedrich-Schiller-Universität Jena, Ernst-Abbe-Platz 2, D-07743 Jena, Germany (niedermeier@minet.uni-jena.de)
Frances A. Rosamond, PC Research Unit, Office of DVC (Research), University of Newcastle, Callaghan, NSW 2308, Australia (frances.rosamond@newcastle.edu.au)
ABSTRACT
The computation of Kemeny rankings is central to many applications in the context of rank aggregation. Unfortunately, the problem is NP-hard. We show that the Kemeny score (and a corresponding Kemeny ranking) of an election can be computed efficiently whenever the average pairwise distance between two input votes is not too large. In other words, Kemeny Score is fixed-parameter tractable with respect to the parameter "average pairwise Kendall-Tau distance da". We describe a fixed-parameter algorithm with running time 16^⌈da⌉ · poly. Moreover, we extend our studies to the parameters "maximum range" and "average range" of positions a candidate takes in the input votes. Whereas Kemeny Score remains fixed-parameter tractable with respect to the parameter "maximum range", it becomes NP-complete in case of an average range value of two. This excludes fixed-parameter tractability with respect to the parameter "average range" unless P = NP.
Categories and Subject Descriptors
F.2.2 [Theory of Computation]: Analysis of Algorithms and Problem Complexity—Nonnumerical Algorithms and Problems; G.2.1 [Mathematics of Computing]: Discrete Mathematics—Combinatorics; I.2.8 [Computing Methodologies]: Artificial Intelligence—Problem Solving, Control Methods, and Search; J.4 [Computer Applications]: Social and Behavioral Sciences

∗Most of the results of this paper have been presented at COMSOC '08 under the title "Computing Kemeny Rankings, Parameterized by the Average KT-Distance".
General Terms
Algorithms

Keywords
Rank aggregation, NP-hard problem, exact algorithm, fixed-parameter tractability, structural parameterization
1. INTRODUCTION
Aggregating inconsistent information has many applications ranging from voting scenarios to meta search engines and fighting spam [1, 8, 11, 14]. In some sense, one deals with consensus problems where one wants to find a solution to various "input demands" such that these demands are met as well as possible. Naturally, contradicting demands cannot be fulfilled at the same time. Hence, the consensus solution has to provide a balance between opposing requirements. The concept of Kemeny consensus (or Kemeny ranking) is among the most important conflict resolution proposals in this context. In this paper, extending and improving previous results [3], we study new algorithmic approaches based on parameterized complexity analysis [13, 17, 21] for efficiently computing optimal Kemeny consensus solutions in practically relevant special cases. To this end, we employ the "similarity" between votes by measuring their average pairwise distance.
Kemeny's voting scheme can be described as follows. An election (V,C) consists of a set V of n votes and a set C of m candidates. A vote is a preference list of the candidates, that is, a permutation on C. For instance, in the case of three candidates a, b, c, the order c > b > a would mean that candidate c is the best-liked and candidate a is the least-liked for this voter. A "Kemeny consensus" is a preference
list that is "closest" to the preference lists of the voters.

Cite as: How Similarity Helps to Efficiently Compute Kemeny Rankings, Nadja Betzler, Michael R. Fellows, Jiong Guo, Rolf Niedermeier, Frances A. Rosamond, Proc. of 8th Int. Conf. on Autonomous Agents and Multiagent Systems (AAMAS 2009), Decker, Sichman, Sierra and Castelfranchi (eds.), May 10–15, 2009, Budapest, Hungary, pp. 657–664. Copyright © 2009, International Foundation for Autonomous Agents and Multiagent Systems (www.ifaamas.org). All rights reserved.

For each pair of votes v, w, the so-called Kendall-Tau distance (KT-distance for short) between v and w, also known as the inversion distance between two permutations, is defined as

  KTdist(v, w) = Σ_{{c,d} ⊆ C} d_{v,w}(c, d),

where the sum is taken over all unordered pairs {c, d} of candidates, and d_{v,w}(c, d) is 0 if v and w rank c and d in the same order, and 1 otherwise. Using divide and conquer, the KT-distance can be computed in O(m · log m) time [20]. The
score of a preference list l with respect to an election (V,C) is defined as Σ_{v∈V} KTdist(l, v). A preference list l with the minimum score is called a Kemeny consensus of (V,C), and its score Σ_{v∈V} KTdist(l, v) is the Kemeny score of (V,C), denoted as Kscore(V,C). The underlying decision problem is as follows:

Kemeny Score
Input: An election (V,C) and an integer k > 0.
Question: Is Kscore(V,C) ≤ k?
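As a concrete reference point, the decision problem above can be solved on tiny instances by exhaustive search over all m! preference lists. The following is a minimal sketch (the helper names are ours, not from the paper), assuming each vote is given as a list of candidates ordered from best to worst:

```python
from itertools import permutations

def ktdist(v, w):
    # O(m^2) count of candidate pairs that v and w rank in opposite order
    pos_v = {c: i for i, c in enumerate(v)}
    pos_w = {c: i for i, c in enumerate(w)}
    cands = list(v)
    return sum(1 for i in range(len(cands)) for j in range(i + 1, len(cands))
               if (pos_v[cands[i]] < pos_v[cands[j]]) != (pos_w[cands[i]] < pos_w[cands[j]]))

def kemeny(votes):
    """Exhaustively search all m! preference lists; returns (Kscore, consensus)."""
    candidates = votes[0]
    best = min(permutations(candidates),
               key=lambda l: sum(ktdist(list(l), v) for v in votes))
    return sum(ktdist(list(best), v) for v in votes), list(best)

votes = [['a', 'b', 'c'], ['a', 'b', 'c'], ['b', 'a', 'c']]
print(kemeny(votes))  # (1, ['a', 'b', 'c'])
```

This is of course only practical for very small m; the point of the paper is to do much better when the votes are similar.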
Known results. Bartholdi et al. [2] showed that Kemeny Score is NP-complete, and it remains so even when restricted to instances with only four votes [14, 15]. Given the computational hardness of Kemeny Score on the one side and its practical relevance on the other side, polynomial-time approximation algorithms have been studied. The Kemeny score can be approximated to a factor of 8/5 by a deterministic algorithm [23] and to a factor of 11/7 by a randomized algorithm [1]. Recently, a polynomial-time approximation scheme (PTAS) has been developed [19]. However, its running time is completely impractical. Conitzer, Davenport, and Kalagnanam [11, 8] performed computational studies for the efficient exact computation of a Kemeny consensus, using heuristic approaches such as greedy and branch-and-bound. Their experimental results encourage the search for practically relevant, efficiently solvable special cases. These experimental investigations focus on computing strong admissible bounds for speeding up search-based heuristic algorithms. In contrast, our focus is on exact algorithms with provable asymptotic running time bounds.
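The O(m · log m) KT-distance computation mentioned earlier (divide and conquer [20]) can be sketched by rewriting one vote in the other's ranks and counting inversions during merge sort. The code below is our own illustration, not taken from [20]:

```python
def kt_distance(v, w):
    """Kendall-Tau distance between two votes (lists of candidates, best first),
    via merge-sort inversion counting in O(m log m)."""
    pos_in_v = {c: i for i, c in enumerate(v)}  # rank of each candidate in v
    seq = [pos_in_v[c] for c in w]              # w rewritten in v's ranks

    def count_inversions(a):
        # returns (sorted copy of a, number of inversions in a)
        if len(a) <= 1:
            return a, 0
        mid = len(a) // 2
        left, inv_l = count_inversions(a[:mid])
        right, inv_r = count_inversions(a[mid:])
        merged, inv = [], inv_l + inv_r
        i = j = 0
        while i < len(left) and j < len(right):
            if left[i] <= right[j]:
                merged.append(left[i]); i += 1
            else:
                merged.append(right[j]); j += 1
                inv += len(left) - i    # all of left[i:] forms inversions with right[j]
        merged.extend(left[i:]); merged.extend(right[j:])
        return merged, inv

    return count_inversions(seq)[1]

print(kt_distance(['a', 'b', 'c'], ['c', 'b', 'a']))  # 3: all pairs disagree
```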
Hemaspaandra et al. [18] provided further, exact classifications of the classical computational complexity of Kemeny elections. More specifically, whereas Kemeny Score is NP-complete, they provided P^NP_|| completeness results for other, more general versions of the problem. Very recently, a parameterized complexity study based on various problem parameterizations has been initiated [3]. There, fixed-parameter tractability results for the parameters "Kemeny score", "number of candidates", and "maximum KT-distance between two input votes" are reported.

Finally, it is interesting to note that Conitzer [7] uses a (different) notion of similarity (which is, furthermore, imposed on candidates rather than voters) to efficiently compute the closely related Slater rankings. Using the concept of similar candidates, he identifies efficiently solvable special cases, also yielding a powerful preprocessing technique for computing Slater rankings.
New results. Our main result is that Kemeny Score can be solved in 16^⌈da⌉ · poly(n,m) time, where da denotes the average KT-distance between the pairs of input votes. This is a significant improvement over the previous algorithm for the maximum KT-distance dmax between pairs of input votes, which has running time (3dmax + 1)! · poly(n,m) [3]. Clearly, da ≤ dmax. In addition, using similar ideas, we can show that Kemeny Score can be solved in 32^rmax · poly(n,m) time, where rmax denotes the maximum range of candidate positions of an election (see Section 2 for a formal definition). In contrast, these two fixed-parameter tractability results are complemented by an NP-completeness result for the case of an average range of candidate positions of only two, thus destroying hopes for fixed-parameter tractability with respect to this parameterization.
2. PRELIMINARIES
Let the position of a candidate c in a vote v, denoted by v(c), be the number of candidates that are better than c in v. That is, the leftmost (and best) candidate in v has position 0 and the rightmost has position m − 1. For an election (V,C) and a candidate c ∈ C, the average position pa(c) of c is defined as

  pa(c) := (1/n) · Σ_{v∈V} v(c).
For an election (V,C), the average KT-distance da is defined as¹

  da := (1/(n(n − 1))) · Σ_{u,v∈V, u≠v} KTdist(u, v).

Note that an equivalent definition is given by

  da := (1/(n(n − 1))) · Σ_{a,b∈C} #v(a > b) · #v(b > a),

where for two candidates a and b the number of votes in which a is ranked better than b is denoted by #v(a > b). The latter definition is useful if the input is provided by the outcomes of the pairwise elections of the candidates including the margins of victory. Furthermore, we define

  d := ⌈da⌉.
Further, for an election (V,C) and for a candidate c ∈ C, the range r(c) of c is defined as

  r(c) := max_{v,w∈V} {v(c) − w(c)} + 1.

The maximum range rmax of an election is given by rmax := max_{c∈C} r(c) and the average range ra is defined as

  ra := (1/m) · Σ_{c∈C} r(c).
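All quantities defined in this section can be computed straightforwardly from the votes. Below is a small sketch (helper names are ours) that also cross-checks the two equivalent definitions of da:

```python
from itertools import combinations

def positions(vote):
    # v(c): number of candidates ranked better than c in the vote
    return {c: i for i, c in enumerate(vote)}

def ktdist(v, w):
    pv, pw = positions(v), positions(w)
    return sum(1 for c, d in combinations(v, 2) if (pv[c] < pv[d]) != (pw[c] < pw[d]))

def parameters(votes):
    n, m = len(votes), len(votes[0])
    pos = [positions(v) for v in votes]
    p_a = {c: sum(p[c] for p in pos) / n for c in votes[0]}   # average positions
    d_a = sum(ktdist(u, w) for u, w in combinations(votes, 2)) * 2 / (n * (n - 1))
    # equivalent definition via pairwise-election margins (ordered candidate pairs)
    d_alt = sum(sum(1 for p in pos if p[x] < p[y]) * sum(1 for p in pos if p[y] < p[x])
                for x in votes[0] for y in votes[0] if x != y) / (n * (n - 1))
    assert abs(d_a - d_alt) < 1e-9
    r = {c: max(p[c] for p in pos) - min(p[c] for p in pos) + 1 for c in votes[0]}
    return p_a, d_a, max(r.values()), sum(r.values()) / m     # pa, da, rmax, ra

votes = [['a', 'b', 'c'], ['a', 'b', 'c'], ['b', 'a', 'c']]
p_a, d_a, r_max, r_a = parameters(votes)
print(round(d_a, 3), r_max)  # 0.667 2
```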
Finally, we briefly introduce the relevant notions of parameterized complexity theory [13, 17, 21]. Parameterized algorithmics aims at a multivariate complexity analysis of problems. This is done by studying relevant problem parameters and their influence on the computational complexity of problems. The hope lies in accepting the seemingly inevitable combinatorial explosion for NP-hard problems, but confining it to the parameter. Thus, the decisive question is whether a given parameterized problem is fixed-parameter
¹To simplify the presentation, the definition of da counts the pair (u,v) as well as the pair (v,u), thus having to divide by n(n − 1) to obtain the correct average distance value.
  v_1     : a > b > c > d > e > f > ...
  ...
  v_i     : a > b > c > d > e > f > ...
  v_{i+1} : b > a > d > c > f > e > ...
  ...
  v_{2i}  : b > a > d > c > f > e > ...

Figure 1: Small maximum range but large average KT-distance.
tractable (FPT) with respect to the parameter. In other words, for an input instance I together with the parameter k, we ask for the existence of a solving algorithm with running time f(k) · poly(|I|) for some computable function f.
3. ON PARAMETERIZATIONS OF KEMENY SCORE
This section discusses the "art" of finding different, practically relevant parameterizations of Kemeny Score. Our paper focuses on structural parameterizations, that is, structural properties of input instances that may be exploited to develop efficient solving algorithms for Kemeny Score. To this end, here we investigate the realistic scenario (which, to some extent, is also motivated by previous experimental results [11, 8]) that the given preference lists of the voters show some form of similarity. More specifically, we consider the parameters "average KT-distance" between the input votes, "maximum range of candidate positions", and "average range of candidate positions". Clearly, the maximum value is always an upper bound for the average value. The parameter "average KT-distance" reflects the situation that in an ideal world all votes would be the same, and differences occur due to some (limited) form of noise which makes the actual votes different from each other (see [12, 10, 9]). With average KT-distance as parameter we can affirmatively answer the question whether a consensus list that is closest to the input votes can be found efficiently. By way of contrast, the parameterization by position range rather reflects the situation that, whereas voters can be more or less decided concerning groups of candidates (e.g., political parties), they may be quite undecided and, thus, unpredictable concerning the ranking within these groups. If these groups are small, this can also imply small range values, thus making the quest for a fixed-parameter algorithm in terms of range parameterization attractive.
It is not hard to see, however, that the parameterizations by "average KT-distance" and by "range of position" can differ significantly. As described in the following, there are input instances of Kemeny Score that have a small range value and a large average KT-distance, and vice versa. This justifies separate investigations for both parameterizations; these are performed in Sections 4 and 5, respectively.
end this section with some concrete examples that exhibit
the announced differences between our notions of vote sim
ilarity, that is, our parameters under investigation. First,
we provide an example where one can observe a small max
imum candidate range whereas one has large average KT
distance, see Figure 1. The election in Figure 1 consists
of n = 2i votes such that there are two groups of i identi
v1
v2
v?
:
:
:
a
b
a
...
>
>
>
b
c
b
>
>
>
c
d
c
>
>
>
d
e
d
>
>
>
e
f
e
>
>
>
f>
>
>
...
...
f
a
1
...
Figure 2: Small average KTdistance but large max
imum range.
cal votes. The votes of the second group are obtained from
the first group by swapping neighboring pairs of candidates.
Clearly, the maximum range of candidates is 2. However,
for m candidates the average KTdistance da is
da =2 · (n/2)2· (m/2)
n(n − 1)
> m/4
and, thus, da is unbounded for an unbounded number of
candidates.
Second, we present an example where the average KT
distance is small but the maximum range of candidates is
large, see Figure 2. In the election of Figure 2 all votes
are equal except that candidate a is at the last position in
the second vote, but on the first position in all other votes.
Thus, the maximum range equals the range of candidate a
which equals the number of candidates, whereas by adding
more copies of the first vote the average KTdistance can be
made smaller than one.
Finally, we have a somewhat more complicated example displaying a case where one observes a small average KT-distance but a large average range of candidates.² To this end, we make use of the following construction based on an election with m candidates. Let Vm be a set of m votes such that every candidate is in one of the votes at the first and in one of the votes at the last position; the remaining positions can be filled arbitrarily. Then, for some N > m³, add N further votes VN in which all candidates have the same arbitrary order. Then, the average KT-distance of the constructed election is

  da = D(Vm) + D(VN) + D(VN, Vm),

where D(Vm) (D(VN)) is the average KT-distance within the votes of Vm (VN) and D(VN, Vm) is the average KT-distance between pairs of votes with one vote from VN and the other vote from Vm. Since m² is an upper bound for the pairwise (and average) KT-distance between any two votes, it holds that D(Vm) ≤ m², D(VN) = 1, and D(VN, Vm) ≤ m². Further, we have m · (m − 1) ordered pairs of votes within Vm, N · m pairs between VN and Vm, and N · (N − 1) pairs within VN. Since N > m³ it follows that

  da ≤ (m(m − 1) · m² + Nm · m² + N(N − 1) · 1) / (N(N − 1)) ≤ 3.

In contrast, the range of every candidate is m, thus the average range is m.
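The construction above is easy to instantiate and check numerically. In the sketch below (our own code, not from the paper), Vm is realized by cyclic rotations, which indeed places every candidate first in one vote and last in another:

```python
from itertools import combinations

def ktdist(v, w):
    pv = {c: i for i, c in enumerate(v)}
    pw = {c: i for i, c in enumerate(w)}
    return sum(1 for c, d in combinations(v, 2) if (pv[c] < pv[d]) != (pw[c] < pw[d]))

def build_election(m, N):
    cands = list(range(m))
    # V_m: cyclic rotations -- every candidate appears once first and once last
    V_m = [cands[i:] + cands[:i] for i in range(m)]
    V_N = [list(cands) for _ in range(N)]   # N identical votes
    return V_m + V_N

m = 4
votes = build_election(m, m ** 3 + 1)
n = len(votes)
d_a = sum(ktdist(u, v) for u, v in combinations(votes, 2)) * 2 / (n * (n - 1))
ranges = [max(v.index(c) for v in votes) - min(v.index(c) for v in votes) + 1
          for c in range(m)]
print(d_a <= 3, sum(ranges) / m == m)  # True True
```

For m = 4 and N = m³ + 1 this confirms da ≤ 3 while every candidate has range m.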
²Clearly, this example also exhibits the situation of a large maximum candidate range with a small average KT-distance. We chose nevertheless to present the example from Figure 2 because of its simplicity.

4. PARAMETER "AVERAGE KT-DISTANCE"
In this section, we further extend the range of parameterizations studied so far (see [3]) by giving a fixed-parameter algorithm with respect to the parameter "average KT-distance". We start by showing how the average KT-distance can be used to upper-bound the range of positions that a candidate can take in any optimal Kemeny consensus. Based on this crucial observation, we then state the algorithm.
4.1 A Crucial Observation
Our fixed-parameter tractability result with respect to the average KT-distance of the votes is based on the following lemma.

Lemma 1. Let da be the average KT-distance of an election (V,C) and d = ⌈da⌉. Then, in every optimal Kemeny consensus l, for every candidate c ∈ C with average position pa(c) we have pa(c) − d < l(c) < pa(c) + d.
Proof. The proof is by contradiction and consists of two claims: First, we show that we can find a vote with Kemeny score less than d · n, that is, the Kemeny score of the instance is less than d · n. Second, we show that in every Kemeny consensus every candidate is in the claimed range. More specifically, we prove that every consensus in which the position of a candidate is not in a "range d of its average position" has a Kemeny score of at least d · n, a contradiction to the first claim.
Claim 1: Kscore(V,C) < d · n.
Proof of Claim 1: To prove Claim 1, we show that there is a vote v ∈ V with Σ_{w∈V} KTdist(v, w) < d · n, implying this upper bound for an optimal Kemeny consensus as well. By definition,

  da = (1/(n(n − 1))) · Σ_{v,w∈V, v≠w} KTdist(v, w)                        (1)

  ⇒ ∃v ∈ V with da ≥ (1/(n(n − 1))) · n · Σ_{w∈V, w≠v} KTdist(v, w)       (2)
                    = (1/(n − 1)) · Σ_{w∈V, w≠v} KTdist(v, w)              (3)

  ⇒ ∃v ∈ V with da · n > Σ_{w∈V, w≠v} KTdist(v, w).                        (4)

Since we have d = ⌈da⌉, Claim 1 follows directly from Inequality (4).
The next claim shows the given bound on the range of possible candidate positions.

Claim 2: In every optimal Kemeny consensus l, every candidate c ∈ C fulfills pa(c) − d < l(c) < pa(c) + d.
Proof of Claim 2: We start by showing that, for every candidate c ∈ C, we have

  Kscore(V,C) ≥ Σ_{v∈V} |l(c) − v(c)|.        (5)

Note that, for every candidate c ∈ C and two votes v, w, we must have KTdist(v, w) ≥ |v(c) − w(c)|. Without loss of generality, assume that v(c) > w(c). Then, there must be at least v(c) − w(c) candidates that have a smaller position than c in v and that have a greater position than c in w. Further, each of these candidates increases the value of KTdist(v, w) by one. Based on this, Inequality (5) directly follows as, by definition, Kscore(V,C) = Σ_{v∈V} KTdist(v, l).

To simplify the proof of Claim 2, in the following, we shift the positions in l such that l(c) = 0. Accordingly, we shift the positions in all votes in V, that is, for every v ∈ V and every a ∈ C, we decrease v(a) by the original value of l(c). Clearly, shifting all positions does not affect the relative differences of positions between two candidates. Then, let the set of votes in which c has a nonnegative position be V+ and let V− denote the remaining set of votes, that is, V− := V \ V+.

Now, we show that if candidate c is placed outside of the given range in an optimal Kemeny consensus l, then Kscore(V,C) ≥ d · n, a contradiction. We distinguish two cases:

Case 1: l(c) ≥ pa(c) + d.
As l(c) = 0, in this case pa(c) becomes negative: 0 ≥ pa(c) + d ⇔ −pa(c) ≥ d. The following shows that Claim 2 holds for this case:

  Σ_{v∈V} |l(c) − v(c)| = Σ_{v∈V} |v(c)|                     (6)
                        = Σ_{v∈V+} v(c) − Σ_{v∈V−} v(c).     (7)

Next, replace the term Σ_{v∈V−} v(c) in (7) by an equivalent term that depends on pa(c) and Σ_{v∈V+} v(c). For this, use the following, derived from the definition of pa(c):

  n · pa(c) = Σ_{v∈V+} v(c) + Σ_{v∈V−} v(c)
  ⇔ −Σ_{v∈V−} v(c) = n · (−pa(c)) + Σ_{v∈V+} v(c).

The replacement results in

  Σ_{v∈V} |l(c) − v(c)| = 2 · Σ_{v∈V+} v(c) + n · (−pa(c))
                        ≥ n · (−pa(c)) ≥ n · d.

With Inequality (5), this says that Kscore(V,C) ≥ n · d, a contradiction to Claim 1.

Case 2: l(c) ≤ pa(c) − d.
Since l(c) = 0, the condition is equivalent to 0 ≤ pa(c) − d ⇔ d ≤ pa(c), and we have that pa(c) is nonnegative. Now, we show that Claim 2 also holds for this case:

  Σ_{v∈V} |l(c) − v(c)| = Σ_{v∈V} |v(c)| = Σ_{v∈V+} v(c) − Σ_{v∈V−} v(c)
                        ≥ Σ_{v∈V+} v(c) + Σ_{v∈V−} v(c) = pa(c) · n ≥ d · n.

Thus, also in this case, Kscore(V,C) ≥ n · d, a contradiction to Claim 1.
Based on Lemma 1, for every position we can define the set of candidates that can take this position in an optimal Kemeny consensus. The subsequent definition will be useful for the formulation of the algorithm.

Definition 1. Let (V,C) be an election. For every integer i ∈ {0,...,m − 1}, let Pi denote the set of candidates that can assume position i in an optimal Kemeny consensus, that is, Pi := {c ∈ C | pa(c) − d < i < pa(c) + d}.

Using Lemma 1, we can easily show the following.

Lemma 2. For every position i, |Pi| ≤ 4d.

Proof. The proof is by contradiction. Assume that there is a position i with |Pi| > 4d. Due to Lemma 1, for every candidate c ∈ Pi the positions which c may assume in an optimal Kemeny consensus can differ by at most 2d − 1. This is true because, otherwise, candidate c could not be in the given range around its average position. Then, in a Kemeny consensus, each of the at least 4d + 1 candidates must hold a position that differs by at most 2d − 1 from position i. As there are only 4d − 1 such positions (2d − 1 to the left and 2d − 1 to the right of i, plus position i itself), one obtains a contradiction.
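Definition 1 and Lemma 2 are easy to check experimentally. The sketch below (helper names are ours) computes d = ⌈da⌉, builds the sets Pi, and verifies the 4d bound:

```python
from math import ceil
from itertools import combinations

def ktdist(v, w):
    pv = {c: i for i, c in enumerate(v)}
    pw = {c: i for i, c in enumerate(w)}
    return sum(1 for c, d in combinations(v, 2) if (pv[c] < pv[d]) != (pw[c] < pw[d]))

def position_sets(votes):
    n, m = len(votes), len(votes[0])
    d_a = sum(ktdist(u, v) for u, v in combinations(votes, 2)) * 2 / (n * (n - 1))
    d = ceil(d_a)
    p_a = {c: sum(v.index(c) for v in votes) / n for c in votes[0]}
    # Pi per Definition 1: candidates within distance d of their average position
    P = [{c for c in votes[0] if p_a[c] - d < i < p_a[c] + d} for i in range(m)]
    return d, P

votes = [['a', 'b', 'c', 'd'], ['a', 'b', 'c', 'd'], ['b', 'a', 'd', 'c']]
d, P = position_sets(votes)
assert all(len(Pi) <= 4 * d for Pi in P)   # Lemma 2
print(d, [sorted(Pi) for Pi in P])
```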
4.2 Basic Idea of the Algorithm
In Subsection 4.4, we will present a dynamic programming algorithm for Kemeny Score. It exploits the fact that every candidate can only appear in a fixed range of positions in an optimal Kemeny consensus.³ The algorithm "generates" a Kemeny consensus from left to right. It tries out all possibilities for ordering the candidates locally and then combines these local solutions to yield an optimal Kemeny consensus.

More specifically, according to Lemma 2, the number of candidates that can take a position i in an optimal Kemeny consensus for any 0 ≤ i ≤ m − 1 is at most 4d. Thus, for position i, we can test all possible candidates. Having chosen a candidate for position i, the remaining candidates that could also assume i must be either left or right of i in a Kemeny consensus. Thus, we test all possible two-partitionings of this subset of candidates and compute a "partial" Kemeny score for every possibility. For the computation of the partial Kemeny scores at position i, we make use of the partial solutions computed for position i − 1.
4.3 Definitions for the Algorithm
To state the dynamic programming algorithm, we need some further definitions. For i ∈ {0,...,m − 1}, let I(i) denote the set of candidates that could be "inserted" at position i for the first time, that is,

  I(i) := {c ∈ C | c ∈ Pi and c ∉ Pi−1}.

Let F(i) denote the set of candidates that must be "forgotten" at latest at position i, that is,

  F(i) := {c ∈ C | c ∉ Pi and c ∈ Pi−1}.

³In contrast, the previous dynamic programming algorithms [3] for the parameters "maximum range of candidate positions" and "maximum KT-distance" rely on decomposing the input, whereas here we rather have a decomposition of the score into partial scores. Further, here we obtain a much better running time by using a more involved dynamic programming approach.
For our algorithm, it is essential to subdivide the overall Kemeny score into partial Kemeny scores (pK). More precisely, for a candidate c and a subset R of candidates with c ∉ R, we set

  pK(c, R) := Σ_{c'∈R} Σ_{v∈V} d^R_v(c, c'),

where for c ∉ R and c' ∈ R we have d^R_v(c, c') := 0 if in v we have c > c', and d^R_v(c, c') := 1 otherwise. Intuitively, the partial Kemeny score denotes the score that is "induced" by candidate c and the candidate subset R if the candidates of R have greater positions than c in an optimal Kemeny consensus.⁴ Then, for a Kemeny consensus l := c0 > c1 > ··· > c_{m−1}, the overall Kemeny score can be expressed by partial Kemeny scores as follows:

  Kscore(V,C) = Σ_{i=0}^{m−2} Σ_{j=i+1}^{m−1} Σ_{v∈V} d_{v,l}(ci, cj)                      (8)
              = Σ_{i=0}^{m−2} Σ_{c'∈R} Σ_{v∈V} d^R_v(ci, c')  for R := {cj | i < j < m}    (9)
              = Σ_{i=0}^{m−2} pK(ci, {cj | i < j < m}).                                    (10)
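Equality (10) can be sanity-checked directly: summing pK(ci, ·) along any ranking l reproduces its Kemeny score. A minimal check (code and names are ours):

```python
from itertools import combinations

def ktdist(v, w):
    pv = {c: i for i, c in enumerate(v)}
    pw = {c: i for i, c in enumerate(w)}
    return sum(1 for c, d in combinations(v, 2) if (pv[c] < pv[d]) != (pw[c] < pw[d]))

def pk(c, R, votes):
    # pK(c, R): over all votes, the number of candidates in R ranked above c
    return sum(1 for cp in R for v in votes if v.index(cp) < v.index(c))

votes = [['a', 'b', 'c'], ['b', 'c', 'a'], ['a', 'c', 'b']]
l = ['a', 'b', 'c']                       # some ranking to score
kscore = sum(ktdist(l, v) for v in votes)
pk_sum = sum(pk(l[i], l[i + 1:], votes) for i in range(len(l) - 1))
print(kscore == pk_sum)  # True
```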
Next, consider the corresponding three-dimensional dynamic programming table T. Roughly speaking, we define an entry for every position i, every candidate c that can assume i, and every candidate subset C' ⊆ Pi \ {c}. The entry stores the "minimum partial Kemeny score" over all possible orders of the candidates of C' under the condition that c takes position i and all candidates of C' take positions smaller than i. To define the dynamic programming table formally, we need some further notation.

Let Π(C') denote the set of all possible orders of the candidates in C', where C' ⊆ C. Further, consider a Kemeny consensus in which every candidate of C' has a position smaller than every candidate in C \ C'. Then, the minimum partial Kemeny score restricted to C' is defined as

  min_{(d1 > d2 > ··· > dx) ∈ Π(C')} { Σ_{s=1}^{x} pK(ds, {dj | s < j ≤ x} ∪ (C \ C')) }

with x := |C'|. That is, it denotes the minimum partial Kemeny score over all orders of C'. We define an entry of the dynamic programming table T for a position i, a candidate c ∈ Pi, and a candidate subset P'i ⊆ Pi with c ∉ P'i. For this, we define L := ∪_{j≤i} F(j) ∪ P'i. Then, an entry T(i, c, P'i) denotes the minimum partial Kemeny score restricted to the candidates in L ∪ {c} under the assumptions that c is at position i in a Kemeny consensus, all candidates of L have positions smaller than i, and all other candidates have positions greater than i. That is, for |L| = i − 1, define

  T(i, c, P'i) := min_{(d1 > ··· > d_{i−1}) ∈ Π(L)} Σ_{s=1}^{i−1} pK(ds, C \ {dj | j ≤ s}) + pK(c, C \ (L ∪ {c})).

⁴By convention and somewhat counterintuitively, we say that a candidate c has a greater position than a candidate c' in a vote if c' > c.

4.4 Dynamic Programming Algorithm
Input: An election (V,C) and, for every 0 ≤ i < m, the set Pi of candidates that can assume position i in an optimal Kemeny consensus.
Output: The Kemeny score of (V,C).

Initialization:
01 for i = 0,...,m − 1
02   for all c ∈ Pi
03     for all P'i ⊆ Pi \ {c}
04       T(i, c, P'i) := +∞
05 for all c ∈ P0
06   T(0, c, ∅) := pK(c, C \ {c})

Update:
07 for i = 1,...,m − 1
08   for all c ∈ Pi
09     for all P'i ⊆ Pi \ {c}
10       if |P'i ∪ ∪_{j≤i} F(j)| = i − 1 and T(i − 1, c', (P'i ∪ F(i)) \ {c'}) is defined then
11         T(i, c, P'i) = min_{c'∈P'i∪F(i)} T(i − 1, c', (P'i ∪ F(i)) \ {c'}) + pK(c, (Pi ∪ ∪_{i<j<m} I(j)) \ (P'i ∪ {c}))

Output:
12 Kscore = min_{c∈P_{m−1}} T(m − 1, c, P_{m−1} \ {c})

Figure 3: Dynamic programming algorithm for Kemeny Score
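For contrast with Figure 3, which confines the search to the sets Pi, the same pK-based decomposition (Equality (10)) also yields a simpler Held-Karp-style dynamic program over all candidate subsets, exponential in m rather than in d. The sketch below is our own illustration, not the paper's algorithm:

```python
from itertools import combinations

def kemeny_score_dp(votes):
    """Exact Kemeny score via DP over subsets: dp[S] is the minimum cost of
    placing the candidates of S at the first |S| positions, charging each
    placed candidate its partial Kemeny score against everything after it."""
    cands = votes[0]
    pos = [{c: i for i, c in enumerate(v)} for v in votes]
    # above[x][y]: number of votes ranking y better than x
    above = {x: {y: sum(1 for p in pos if p[y] < p[x])
                 for y in cands if y != x} for x in cands}
    dp = {frozenset(): 0}
    for size in range(1, len(cands) + 1):
        new = {}
        for S in map(frozenset, combinations(cands, size)):
            rest = [y for y in cands if y not in S]
            # c is the last candidate of the prefix S; rest comes after c
            new[S] = min(dp[S - {c}] + sum(above[c][y] for y in rest) for c in S)
        dp = new
    return dp[frozenset(cands)]

votes = [['a', 'b', 'c', 'd'], ['b', 'a', 'd', 'c'], ['a', 'b', 'c', 'd']]
print(kemeny_score_dp(votes))  # 2
```

This runs in O(2^m · m² + n · m²)-ish time, so it is feasible only for few candidates; the point of the position-window algorithm above is to replace the 2^m factor by a function of d alone.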
The algorithm is displayed in Figure 3. It is easy to modify the algorithm such that it outputs an optimal Kemeny consensus: for every entry T(i, c, P'i), one additionally has to store a candidate c' that minimizes T(i − 1, c', (P'i ∪ F(i)) \ {c'}) in line 11. Then, starting with a minimum entry for position m − 1, one reconstructs an optimal Kemeny consensus by iteratively adding the "predecessor" candidate. The asymptotic running time remains unchanged. Moreover, in several applications, it is useful to compute not just one optimal Kemeny consensus but to enumerate all of them. At the expense of an increased running time, which clearly depends on the number of possible optimal consensus rankings, our algorithm can be extended to provide such an enumeration by storing all possible predecessor candidates.
Lemma 3. The algorithm in Figure 3 correctly computes Kemeny Score.

Proof. For the correctness, we have to show two points: First, all table entries are well-defined, that is, for an entry T(i, c, P'i) concerning position i there must be exactly i − 1 candidates that have positions smaller than i. This condition is assured by line 10 of the algorithm.⁵

Second, we must ensure that our algorithm finds an optimal solution. Due to Equality (10), we know that the Kemeny score can be decomposed into partial Kemeny scores. Thus, it remains to show that the algorithm considers a decomposition that leads to an optimal solution. For every position i, the algorithm tries all candidates in Pi. According to Lemma 1, one of these candidates must be the "correct" candidate c for this position. Further, for c we can observe that the algorithm tries a sufficient number of possibilities to partition all remaining candidates C \ {c} such that they have either smaller or greater positions than i. More precisely, every candidate from C \ {c} must be in exactly one of the following three subsets:

1. The set F of candidates that have already been forgotten, that is, F := ∪_{0≤j≤i} F(j).
2. The set of candidates that can assume position i, that is, Pi \ {c}.
3. The set I of candidates that are not inserted yet, that is, I := ∪_{i<j<m} I(j).

Due to Lemma 1 and the definition of F(j), we know that a candidate from F cannot take a position greater than i − 1 in an optimal Kemeny consensus. Thus, it is sufficient to explore only those partitions in which the candidates from F have positions smaller than i. Analogously, one can argue that for all candidates in I, it is sufficient to consider partitions in which they have positions greater than i. Thus, it remains to try all possibilities for partitioning the candidates from Pi. This is done in line 09 of the algorithm. Thus, the algorithm returns an optimal Kemeny score.

⁵It can still happen that a candidate takes a position outside of the required range around its average position. Since such an entry cannot lead to an optimal solution according to Lemma 1, this does not affect the correctness of the algorithm. To improve the running time it would be convenient to "cut away" such possibilities. We leave considerations in this direction to future work.
Theorem 1. Kemeny Score can be solved in O(16^d · (d^2 · m + d · m^2 log m · n) + n^2 · m log m) time with average KT-distance d_a and d := ⌈d_a⌉. The size of the dynamic programming table is O(16^d · d · m).
Proof. The dynamic programming procedure requires the set of candidates P_i for 0 ≤ i < m as input. To determine P_i for all 0 ≤ i < m, one needs the average positions of all candidates and the average KT-distance d_a of (V, C). To determine d_a, compute the pairwise distances of all pairs of votes. As there are O(n^2) pairs and the pairwise KT-distance can be computed in O(m log m) time [20], this takes O(n^2 · m log m) time. The average positions of all candidates can be computed in O(n · m) time by iterating once over every vote and adding the position of every candidate to a counter variable for this candidate. Thus, the input for the dynamic programming algorithm can be computed in O(n^2 · m log m) time.
Concerning the dynamic programming algorithm itself, due to Lemma 2, for 0 ≤ i < m, the size of P_i is upper-bounded by 4d. Then, for the initialization as well as for the update, the algorithm iterates over m positions, 4d candidates, and 2^{4d} subsets of candidates. Whereas the initialization in the innermost instruction (line 04) can be done in constant time, in every innermost instruction of the update phase (line 11) one has to look for a minimum entry and one has to compute a pK-score. To find the minimum, one has to consider all candidates from P'_i ∪ F(i). As P'_i ∪ F(i) is a subset of P_{i−1}, it can contain at most 4d candidates. Further, the required pK-score can be computed in O(n · m log m) time. Thus, for the dynamic programming we arrive at a running time of O(m · 4d · 2^{4d} · (4d + n · m log m)) = O(16^d · (d^2 · m + d · m^2 log m · n)).
Concerning the size of the dynamic programming table, there are m positions and any position can be assumed by at most 4d candidates. The number of considered subsets is bounded from above by 2^{4d}. Hence, the size of the table T is O(16^d · d · m).
Nadja Betzler, Michael R. Fellows, Jiong Guo, Rolf Niedermeier, Frances A. Rosamond • How Similarity Helps to Efficiently Compute Kemeny Rankings
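The preprocessing described in this proof can be sketched as follows (function names are ours, not from the paper): the pairwise KT-distance is computed in O(m log m) time by counting inversions with merge sort, and the average positions take one pass per vote.

```python
def kendall_tau_distance(u, v):
    """KT-distance between two votes, each a list of candidates
    ordered from best to worst, via inversion counting (O(m log m))."""
    pos = {c: i for i, c in enumerate(v)}
    seq = [pos[c] for c in u]  # v's positions, read in u's order

    def count_inversions(a):
        if len(a) <= 1:
            return a, 0
        mid = len(a) // 2
        left, inv_l = count_inversions(a[:mid])
        right, inv_r = count_inversions(a[mid:])
        merged, inv = [], inv_l + inv_r
        i = j = 0
        while i < len(left) and j < len(right):
            if left[i] <= right[j]:
                merged.append(left[i]); i += 1
            else:
                merged.append(right[j]); j += 1
                inv += len(left) - i  # right[j] jumps over remaining left items
        merged.extend(left[i:]); merged.extend(right[j:])
        return merged, inv

    return count_inversions(seq)[1]

def average_kt_distance(votes):
    """Average pairwise KT-distance d_a over all unordered pairs of votes."""
    n = len(votes)
    total = sum(kendall_tau_distance(votes[i], votes[j])
                for i in range(n) for j in range(i + 1, n))
    return total / (n * (n - 1) / 2)

def average_positions(votes):
    """Average position of every candidate, one pass per vote (O(n·m))."""
    avg = {c: 0 for c in votes[0]}
    for v in votes:
        for i, c in enumerate(v):
            avg[c] += i
    return {c: s / len(votes) for c, s in avg.items()}
```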
Finally, let us discuss the differences between the dynamic programming algorithm used for the "maximum pairwise KT-distance" in [3] and the algorithm presented in this work. In [3], the dynamic programming table stored all possible orders of the candidates of a given subset of candidates. In this work, we eliminate the need to store all orders by using the decomposition of the Kemeny score into partial Kemeny scores. This allows us to restrict the considerations for a position to a candidate and its order relative to all other candidates.
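As a small, brute-force illustration of this decomposition (all names hypothetical): the score of a ranking equals the sum, over positions, of the partial Kemeny score of the candidate at that position against all candidates ranked behind it.

```python
from itertools import permutations

def pk_score(c, behind, votes):
    """Partial Kemeny score of candidate c versus the candidates ranked
    behind c: one point for every vote that disagrees with the order."""
    return sum(1 for v in votes for d in behind
               if v.index(d) < v.index(c))

def kemeny_score(ranking, votes):
    """Kemeny score of a ranking via the partial-score decomposition."""
    return sum(pk_score(ranking[i], ranking[i + 1:], votes)
               for i in range(len(ranking)))

def brute_force_kemeny(votes):
    """Optimal score and one optimal consensus by exhaustive search."""
    return min((kemeny_score(list(r), votes), list(r))
               for r in permutations(votes[0]))
```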
5. SMALL CANDIDATE RANGE
In this section, we consider two further parameterizations, namely the "maximum range" and the "average range" of candidates. As exhibited in Section 3, the range parameters in general are "orthogonal" to the distance parameterizations dealt with in Section 4. Whereas for the parameter "maximum range" we can obtain fixed-parameter tractability by using the dynamic programming algorithm given in Figure 3, the Kemeny Score problem becomes NP-complete already in case of an average range of two.
5.1 Parameter Maximum Range
In the following, we show how to bound the number of candidates that can assume a position in an optimal Kemeny consensus by a function of the maximum range. This enables the application of the algorithm from Figure 3.
Lemma 4. Let r_max be the maximum range of an election (V, C). Then, for every candidate, its relative order in an optimal consensus with respect to all but at most 3·r_max candidates can be computed in O(n · m^2) time.
Proof. We use an observation that follows directly from the Extended Condorcet criterion [22]: if for two candidates b, c ∈ C we have v(b) > v(c) for all v ∈ V, then in every Kemeny consensus l it holds that l(b) > l(c). Thus, it follows that for b, c ∈ C with max_{v∈V} v(b) < min_{v∈V} v(c), in an optimal Kemeny consensus l we have l(b) < l(c). That is, for two candidates with "non-overlapping range", their relative order in an optimal Kemeny consensus can be determined using this observation. Clearly, all these candidate pairs can be computed in O(n · m^2) time.
Next, we show that for every candidate c there are at most 3·r_max candidates whose range overlaps with the range of c. The proof is by contradiction. Let the range of c go from position i to j, with i < j. Further, assume that there is a subset of candidates S ⊆ C with |S| ≥ 3·r_max + 1 such that for every candidate s ∈ S there is a vote v ∈ V with i ≤ v(s) ≤ j. Now, consider an arbitrary input vote v ∈ V. Since there are at most 3·r_max positions p with i − r_max ≤ p ≤ j + r_max, for one candidate s ∈ S it must hold that v(s) < i − r_max or v(s) > j + r_max. Thus, the range of s is greater than r_max, a contradiction. Hence, there can be at most 3·r_max candidates that have a position in the range of c in a vote v ∈ V. As described above, for all other candidates we can compute the relative order in O(n · m^2) time. Hence, the lemma follows.
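A minimal sketch of this preprocessing (helper names hypothetical), assuming the range of a candidate is its maximum position minus its minimum position plus one:

```python
def candidate_ranges(votes):
    """Position interval [min, max] of every candidate over all votes."""
    lo, hi = {}, {}
    for v in votes:
        for i, c in enumerate(v):
            lo[c] = min(lo.get(c, i), i)
            hi[c] = max(hi.get(c, i), i)
    return {c: (lo[c], hi[c]) for c in lo}

def forced_pairs(votes):
    """Pairs (b, c) with non-overlapping ranges: b must precede c in
    every optimal Kemeny consensus (extended Condorcet criterion)."""
    r = candidate_ranges(votes)
    return {(b, c) for b in r for c in r
            if b != c and r[b][1] < r[c][0]}

def max_range(votes):
    """Maximum range r_max taken over all candidates
    (assumed convention: max position - min position + 1)."""
    r = candidate_ranges(votes)
    return max(hi - lo + 1 for lo, hi in r.values())
```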
As a direct consequence of Lemma 4, we conclude that every candidate can assume one of at most 3·r_max consecutive positions in an optimal Kemeny consensus. Recall that for a position i, the set of candidates that can assume i in an optimal consensus is denoted by P_i (see Definition 1). Then, using the same argument as in Lemma 2, one obtains the following.
Lemma 5. For every position i, |P_i| ≤ 6·r_max.
In complete analogy to Theorem 1, one arrives at the following.
Theorem 2. Kemeny Score can be solved in O(32^{r_max} · (r_max^2 · m + r_max · m^2 log m · n) + n^2 · m log m) time with maximum range r_max. The size of the dynamic programming table is O(32^{r_max} · r_max · m).
5.2 Parameter Average Range
Theorem 3. Kemeny Score is NP-complete for elections with average range two.
Proof. The proof uses a reduction from an arbitrary instance ((V, C), k) of Kemeny Score to a Kemeny Score instance ((V', C'), k) with average range less than two. The construction of the election (V', C') is given in the following. To this end, let a_i, 1 ≤ i ≤ |C|^2, be new candidates not occurring in C.
• C' := C ∪ {a_i | 1 ≤ i ≤ |C|^2}.
• For every vote v = c_1 > c_2 > ··· > c_m in V, put the vote v' := c_1 > c_2 > ··· > c_m > a_1 > a_2 > ··· > a_{m^2} into V'.
It follows from the extended Condorcet criterion [22] that if a pair of candidates has the same order in all votes, it must have this order in a Kemeny consensus as well. Thus, in a Kemeny consensus it holds that a_i > a_j for i < j and, therefore, adding the candidates from C' \ C does not increase the Kemeny score. Hence, an optimal Kemeny consensus with score k for (V', C') can be transformed into an optimal Kemeny consensus with score k for (V, C) by deleting the candidates of C' \ C. The average range of (V', C') is bounded
as follows:
  r_a = 1/(m + m^2) · Σ_{c∈C'} r(c)
      = 1/(m + m^2) · ( Σ_{c∈C} r(c) + Σ_{c∈C'\C} r(c) )
      ≤ 1/(m + m^2) · (m^2 + m^2) < 2.
Clearly, the reduction can easily be modified to work for every constant average-range value of at least two by choosing C' of appropriate size.
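The padding construction can be sketched as follows (helper names hypothetical; we assume the range of a candidate counts positions inclusively, so a candidate with the same position in every vote has range 1):

```python
def pad_election(votes):
    """Append m^2 dummy candidates, in a fixed order, behind every vote."""
    m = len(votes[0])
    dummies = [f"a{i}" for i in range(1, m * m + 1)]
    return [list(v) + dummies for v in votes]

def average_range(votes):
    """Average over candidates of (max position - min position + 1)."""
    lo, hi = {}, {}
    for v in votes:
        for i, c in enumerate(v):
            lo[c] = min(lo.get(c, i), i)
            hi[c] = max(hi.get(c, i), i)
    return sum(hi[c] - lo[c] + 1 for c in lo) / len(lo)
```

Since every dummy candidate occupies a fixed position, each contributes only 1 to the sum of ranges, which is what drives the average below two.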
6. CONCLUSION
Compared to earlier work [3], we significantly improved the running time for the natural parameterization "maximum KT-distance" for the Kemeny Score problem. There have been some experimental studies [11, 8] that hinted that the Kemeny problem is easier when the votes are close to a consensus and, thus, tend to have a small average distance. Our results for the average distance parameterization can also be regarded as a theoretical explanation with provable guarantees for this behavior. Moreover, we provided fixed-parameter tractability in terms of the parameter "maximum range of positions", whereas this is excluded for the parameter "average range of positions" unless P = NP. These results are of particular interest because we indicated in Section 3 that the parameters "position range" and "pairwise distance" are independent of each other.
AAMAS 2009 • 8th International Conference on Autonomous Agents and Multiagent Systems • 10–15 May, 2009 • Budapest, Hungary
As challenges for future work, we envisage the following:
• Extend our findings to the Kemeny Score problem with input votes that may have ties or that may be incomplete (also see [3]).
• Improve the running time as well as the memory consumption (which is exponential in the parameter); we believe that significant improvements are still possible.
• Implement the algorithms, perhaps including heuristic improvements of the running times, and perform experimental studies.
• Investigate typical values of the average KT-distance and the maximum candidate range, either under some distributional assumption or for real-world data.
Finally, we want to advocate parameterized algorithmics [13, 17, 21] as a very helpful tool for better understanding and exploiting the numerous natural parameters occurring in voting scenarios with associated NP-hard combinatorial problems. Only few investigations in this direction have been performed so far; see, for instance, [4, 5, 6, 16].
7. ACKNOWLEDGEMENTS
We are grateful to an anonymous referee of COMSOC 2008 for constructive feedback. This work was supported by the DFG, research project DARE, GU 1023/1, Emmy Noether research group PIAF, NI 369/4, and project PALG, NI 369/8 (Nadja Betzler and Jiong Guo). Michael R. Fellows and Frances A. Rosamond were supported by the Australian Research Council. This work was done while Michael Fellows stayed in Jena as a recipient of the Humboldt Research Award of the Alexander von Humboldt Foundation, Bonn, Germany.
8. ADDITIONAL AUTHORS
9. REFERENCES
[1] N. Ailon, M. Charikar, and A. Newman. Aggregating
inconsistent information: ranking and clustering.
Journal of the ACM, 55(5), 2008. Article 23 (October
2008).
[2] J. Bartholdi III, C. A. Tovey, and M. A. Trick. Voting
schemes for which it can be difficult to tell who won
the election. Social Choice and Welfare, 6:157–165,
1989.
[3] N. Betzler, M. R. Fellows, J. Guo, R. Niedermeier, and F. A. Rosamond. Fixed-parameter algorithms for Kemeny scores. In Proc. of 4th AAIM, volume 5034 of LNCS, pages 60–71. Springer, 2008.
[4] N. Betzler, J. Guo, and R. Niedermeier.
Parameterized computational complexity of Dodgson
and Young elections. In Proc. of 11th SWAT, volume
5124 of LNCS, pages 402–413. Springer, 2008.
[5] N. Betzler and J. Uhlmann. Parameterized complexity
of candidate control in elections and related digraph
problems. In Proc. of 2nd COCOA ’08, volume 5165
of LNCS, pages 43–53. Springer, 2008.
[6] R. Christian, M. R. Fellows, F. A. Rosamond, and
A. Slinko. On complexity of lobbying in multiple
referenda. Review of Economic Design, 11(3):217–224,
2007.
[7] V. Conitzer. Computing Slater rankings using
similarities among candidates. In Proc. 21st AAAI,
pages 613–619. AAAI Press, 2006.
[8] V. Conitzer, A. Davenport, and J. Kalagnanam.
Improved bounds for computing Kemeny rankings. In
Proc. 21st AAAI, pages 620–626. AAAI Press, 2006.
[9] V. Conitzer, M. Rognlie, and L. Xia. Preference functions that score rankings and maximum likelihood estimation. In Proc. of 2nd COMSOC, pages 181–192, 2008.
[10] V. Conitzer and T. Sandholm. Common voting rules
as maximum likelihood estimators. In Proc. of 21st
UAI, pages 145–152. AUAI Press, 2005.
[11] A. Davenport and J. Kalagnanam. A computational
study of the Kemeny rule for preference aggregation.
In Proc. 19th AAAI, pages 697–702. AAAI Press,
2004.
[12] M. J. A. N. de Caritat (Marquis de Condorcet). Essai sur l'application de l'analyse à la probabilité des décisions rendues à la pluralité des voix. Paris: L'Imprimerie Royale, 1785.
[13] R. G. Downey and M. R. Fellows. Parameterized
Complexity. Springer, 1999.
[14] C. Dwork, R. Kumar, M. Naor, and D. Sivakumar.
Rank aggregation methods for the Web. In Proc. of
10th WWW, pages 613–622, 2001.
[15] C. Dwork, R. Kumar, M. Naor, and D. Sivakumar.
Rank aggregation revisited, 2001. Manuscript.
[16] P. Faliszewski, E. Hemaspaandra, L. A.
Hemaspaandra, and J. Rothe. Llull and Copeland
voting broadly resist bribery and control. In Proc. of
22nd AAAI, pages 724–730. AAAI Press, 2007.
[17] J. Flum and M. Grohe. Parameterized Complexity
Theory. Springer, 2006.
[18] E. Hemaspaandra, H. Spakowski, and J. Vogel. The
complexity of Kemeny elections. Theoretical Computer
Science, 349:382–391, 2005.
[19] C. Kenyon-Mathieu and W. Schudy. How to rank with few errors. In Proc. 39th STOC, pages 95–103. ACM, 2007.
[20] J. Kleinberg and E. Tardos. Algorithm Design.
Addison Wesley, 2006.
[21] R. Niedermeier. Invitation to Fixed-Parameter Algorithms. Oxford University Press, 2006.
[22] M. Truchon. An extension of the Condorcet criterion and Kemeny orders. Technical report, cahier 98-15 du Centre de Recherche en Économie et Finance Appliquées, Université Laval, Québec, Canada, 1998.
[23] A. van Zuylen and D. P. Williamson. Deterministic
algorithms for rank aggregation and other ranking and
clustering problems. In Proc. 5th WAOA, volume 4927
of LNCS, pages 260–273. Springer, 2007.