Mathematical Programming
https://doi.org/10.1007/s10107-023-01929-5
FULL LENGTH PAPER
Series B
A simple and fast linear-time algorithm for divisor methods
of apportionment
Raphael Reitzig¹ · Sebastian Wild²
Received: 31 January 2021 / Accepted: 12 January 2023
© The Author(s) 2023
Abstract
Proportional apportionment is the problem of assigning seats to states (resp. parties) according to their relative share of the population (resp. votes), a field heavily influenced by the early work of Michel Balinski, not least his influential 1982 book with Peyton Young (Fair representation, 2nd edn. Brookings Institution Press, Washington, D.C., 2001). In this article, we consider the computational cost of divisor methods (also known as highest averages methods), the de-facto standard solution that is used in many countries. We show that a simple linear-time algorithm can exactly simulate all instances of the family of divisor methods of apportionment by reducing the problem to a single call to a selection algorithm. All previously published solutions were iterative methods that either offer no linear-time guarantee in the worst case or require a complex update step that suffers from numerical instability.
Keywords Proportional apportionment · Selection algorithms · Divisor methods · d'Hondt method · Fair division · Rounding percentages
Mathematics Subject Classification 68Q25 · 68Q17 · 91B12
1 Introduction
The mathematical problem of proportional apportionment arises whenever we have a finite supply of $k$ indivisible, identical resources which are to be distributed across $n$ parties proportionally to their publicly known and agreed-upon values $v_1,\dots,v_n$. The indivisibility constraint makes a perfectly proportional assignment impossible unless the quotas $k \cdot v_i/V$ with $V = v_1+\dots+v_n$ happen to be all integral for $i=1,\dots,n$; apportionment methods decide how to allocate resources in the general case.

The majority of this research was done while both authors were at University of Kaiserslautern.
✉ Sebastian Wild, Sebastian.Wild@liverpool.ac.uk
Raphael Reitzig, reitzig@verrech.net
¹ Würzburg, Germany
² Department of Computer Science, University of Liverpool, Liverpool, UK

Apportionment directly arises in politics in two forms:
• In a proportional-representation electoral system, seats in parliament are assigned to political parties according to their share of all votes. (The resources are seats, and the values are vote counts.)
• In federal states, the number of representatives from each component state often reflects the population of that state. (Resources are again seats, values are the numbers of residents.)
While not identical in their requirements—for example, any state will typically have at
least one representative no matter how small it is—the same mathematical framework
applies to both instances. Further applications are tables wherein rounded percentages
should add up to 100%, the assignment of workers to jobs, or the allocation of service
facilities to areas proportional to demand.
In order to use consistent language throughout this article, we will stick to the first metaphor. That is, we assign $k$ seats to $n$ parties proportionally to their respective votes $v_i$; we call $k$ the house size. In the case of electoral systems which exclude parties below a certain threshold of overall votes from seat allocation altogether, we assume they have already been removed from our list of $n$ parties. An apportionment method maps vote counts $v=(v_1,\dots,v_n)$ and house size $k$ to a seat allocation $s=(s_1,\dots,s_n)$ so that $s_1+\dots+s_n=k$. We interpret $s$ as party $i$ getting $s_i$ seats.
There are many conceivable such methods, but [3] show that the divisor methods (introduced below) are the only methods that guarantee pairwise vote monotonicity (population monotonicity in [3]), which requires that a party $i$ cannot lose seats to a party $j$ when $i$ gains votes while $j$ loses votes (and all other parties remain unchanged). For a comprehensive introduction into the topic with its historical, political, and mathematical dimensions, including desirable and undesirable properties of various apportionment methods and corresponding impossibility results, we refer the reader to the books of [3, 9].
1.1 Problem definition
Divisor methods (also known as highest-averages methods) are characterized by the used rounding rule $\llbracket\cdot\rrbracket$; examples include rounding down, rounding up, or rounding to the nearest even integer (see also Table 1 and [9, p. 70]). Party $i$ is then assigned $\llbracket v_i/D \rrbracket$ seats, where $D$ is a divisor chosen so that $s_1+\dots+s_n=k$; such a $D$ is guaranteed to exist for any sensible rounding rule and is obtained by solving the following optimization problem:
\[
  \max D \quad\text{s.t.}\quad \sum_{i=1}^{n} \llbracket v_i/D \rrbracket \ge k .
\]
We point out that without an algorithm to solve this problem, divisor methods of apportionment cannot feasibly be applied in practice.
While the concept of divisor methods can be used more generally, a typical assumption is that $\lfloor x\rfloor \le \llbracket x\rrbracket \le \lceil x\rceil$, which implies that $s_i$ is roughly proportional to $v_i/V$. For this section, we also make this assumption; later, we slightly weaken it to nearly-arithmetic divisor sequences (Definition 1).
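To make the role of the divisor concrete, the following minimal sketch evaluates the seat counts for a fixed divisor under rounding down (the greatest-divisors method of Table 1). The function name `seats_for_divisor` and the two trial divisors are illustrative assumptions; the vote counts are those of the example in Table 2 with house size k = 17.

```python
import math

def seats_for_divisor(votes, divisor):
    """Seats under rounding down: party i receives floor(v_i / D)."""
    return [math.floor(v / divisor) for v in votes]

# Vote counts of the example in Table 2 (house size k = 17).
austria_2009 = [858921, 680041, 506092, 364207, 284505, 131261]

# A divisor that is too large hands out fewer than k seats ...
too_large = seats_for_divisor(austria_2009, 150_000)
# ... while a smaller divisor reaches exactly k = 17 seats.
feasible = seats_for_divisor(austria_2009, 140_000)
```

Trying divisors by hand like this is exactly what the optimization problem above replaces: the task is to find the largest divisor for which the seat total still reaches k.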
Different rounding rules yield in general different apportionment methods, and
there is no per se best choice. For example, there are competing notions of fairness,
each favoring a different divisor method [3, Sect. A.3]. A reasonable approach is
therefore to run computer simulations of different methods and compare their outcomes
empirically, for example w.r.t. the distribution of final average votes per seat vi/si.
For this purpose, many apportionments may have to be computed, making efficient
algorithms desirable. Apart from that, settling the computational complexity of this
fundamental optimization problem is interesting in its own right.
1.2 Previous work
While methods for proportional apportionment have been studied for a long time, the question of algorithmic complexity has only more recently been considered. A direct iterative method (see Sect. 2) has complexity $\Theta(nk)$ when implemented naively, and $\Theta(k \log n)$ when using a priority queue. Note that typically $k \gg n$, and indeed, the input consists of $n+1$ numbers, so this running time can be exponential in the size of a binary encoding of the input.
A simple refinement, the jump-and-step algorithm described by [9] (see also Sect. 4.1), avoids any dependency on $k$. It is based on the iterative method, but jumps to within $O(n)$ of the target value, so worst-case running times are $O(n^2)$ with naive iteration and $O(n \log n)$ using a priority queue. These bounds seem to be folklore; they are mentioned explicitly for example by [6, 14]. This running time is not optimal, but the algorithm is simple and performs provably well in certain average-case scenarios [9, Sect. 6.7].
Finally, [5] obtained an algorithm with the optimal $O(n)$ complexity in the worst case. They reduce the problem of finding a divisor $D$ to selecting the $k$th smallest element from a multiset formed by $n$ arithmetic progressions, and design a somewhat involved algorithm to solve this special rank-selection problem in $O(n)$ time. This settles the theoretical complexity of the problem since clearly $\Omega(n)$ time is necessary to read the input. However, apart from conceptual complexity, Cheng and Eppstein's algorithm suffers from a numerical-instability issue that we uncovered when implementing their algorithm.
1.3 Contribution
Our main contribution is a much simpler algorithm than Cheng and Eppstein's algorithm for divisor methods of apportionment. It directly constructs a multiset $\hat{A}$ of size $O(n)$ and a rank $\hat{k}$ so that $D$ is obtained from the $\hat{k}$th smallest element in $\hat{A}$. An example execution of our algorithm is shown in Table 2. Apart from its improved conceptual simplicity and practical efficiency (see Sect. 4), this also circumvents any issues from imprecise arithmetic. Formally, our result is as follows.
Theorem 1 (Main result) Given any rounding rule $\llbracket\cdot\rrbracket$ with $\lfloor x\rfloor \le \llbracket x\rrbracket \le \lceil x\rceil$ for all $x \in \mathbb{R}_{\ge 0}$, any vector of votes $v \in \mathbb{N}^n$, and house size $k \in \mathbb{N}$, our algorithm SandwichSelect computes a divisor $D$ that yields seat allocations $s=(s_1,\dots,s_n)$ respecting $s_i \in \llbracket v_i/D \rrbracket$ and $s_1+\dots+s_n=k$ using running time in $O(n)$. It can do so without explicitly computing $\llbracket\cdot\rrbracket$.
Moreover, we report from an extensive running-time study of the above apportionment
methods. We find that our new method is almost an order of magnitude (a factor 10)
faster than Cheng and Eppstein’s algorithm while at the same time avoiding the super-
linear complexity of the jump-and-step algorithm for large inputs. Implementations
of all algorithms and sources for the experiments are available online [10].
Outline. Section 2 defines divisor methods formally. In Sect. 3, we describe the selection-based algorithms, including our new method. Section 4 describes results of our running-time study; Sect. 5 concludes the paper. For the reader's convenience, we include an index of notation in Appendix A.
2 Preliminaries
Our exposition follows the notation of [5], but we also give the names as used by [9].
2.1 Divisor sequences
A divisor sequence is a nonnegative, strictly increasing and unbounded sequence of real numbers. Throughout the paper, we consider a fixed divisor sequence $d = (d_j)_{j=0}^{\infty}$; for notational convenience, we set $d_{-1} := -\infty$. We require a monotonic continuation $\delta$ of $d$ on the reals which is easy to invert; formally, we assume a function $\delta : \mathbb{R}_{\ge 0} \to \mathbb{R}_{\ge d_0}$ with
(D1) $\delta$ is continuous and strictly increasing,
(D2) $\delta^{-1}(x)$ for $x \ge d_0$ can be computed with a constant number of arithmetic operations, and
(D3) $\delta(j) = d_j$ (and thus $\delta^{-1}(d_j) = j$) for all $j \in \mathbb{N}_0$.
All the divisor sequences used in practice fulfill these requirements; cf. Table 1. For convenience, we continue $\delta^{-1}$ on the complete real line requiring
(D4) $\delta^{-1}(x) \in [-1, 0)$ for $x < d_0$.
Lemma 1 Assuming (D1) to (D4), $\delta^{-1}(x)$ is continuous and strictly increasing on $\mathbb{R}_{\ge d_0}$. Furthermore, it is the inverse of $j \mapsto d_j$ in the sense that
\[
  \lfloor \delta^{-1}(x) \rfloor = \max\{\, j \in \mathbb{Z}_{\ge -1} \mid d_j \le x \,\}
\]
for all $x \in \mathbb{R}$.
In particular, $\lfloor \delta^{-1}(x) \rfloor + 1 = |\{\, j \in \mathbb{N}_0 : d_j \le x \,\}|$ is the rank function for the set of all $d_j$.
Table 1 Commonly used divisor methods [5, Table 1]

Method                | Also known as            | Divisor sequence       | δ(x)                                     | Sandwich
----------------------|--------------------------|------------------------|------------------------------------------|----------------
Smallest divisors     | Adams                    | 0, 1, 2, 3, …          | x                                        | –
Greatest divisors     | d'Hondt, Jefferson       | 1, 2, 3, 4, …          | x + 1                                    | –
Sainte-Laguë          | Webster, major fractions | 1, 3, 5, 7, …          | 2x + 1                                   | –
Modified Sainte-Laguë | –                        | 1.4, 3, 5, 7, …        | 2x + 1 for x ≥ 1; 1.6x + 1.4 for x < 1   | 2x + 6/5 ± 1/5
Equal proportions     | Huntington–Hill          | 0, √2, √6, √12, …      | √(x(x+1))                                | x + 1/4 ± 1/4
Harmonic mean         | Dean                     | 0, 4/3, 12/5, 24/7, …  | 2x(x+1)/(2x+1)                           | x + 1/4 ± 1/4
Imperiali             | –                        | 2, 3, 4, 5, …          | x + 2                                    | –
Danish                | –                        | 1, 4, 7, 10, …         | 3x + 1                                   | –

For each of the methods, we give a possible continuation δ of the respective divisor sequence as well as linear sandwich bounds on δ (where nontrivial; cf. Lemma 4).
In the terminology of [9], $d$ is a jumppoint sequence (of a rounding rule, see below), but with a shift of indices (we start with $d_0$ instead of $s(1)$). A divisor sequence with $j \le d_j \le j+1$ for all $j \in \mathbb{N}_0$ is called a signpost sequence.¹
While divisor methods can be defined for any nonnegative, strictly increasing and unbounded sequence, we will focus our attention on those with the following property.
Definition 1 (nearly arithmetic) A divisor sequence $(d_j)_{j\in\mathbb{N}_0}$ resp. its continuation $\delta$ is called nearly arithmetic if there are constants $\alpha > 0$, $\underline{\beta} \in [0,\alpha]$, and $\overline{\beta} \ge 0$ so that
\[
  \forall x \in \mathbb{R}_{\ge 0} : \quad \alpha x + \underline{\beta} \;\le\; \delta(x) \;\le\; \alpha x + \overline{\beta} .
\]
Table 1 lists divisor sequences for common apportionment methods; all are nearly arithmetic. Further, any signpost sequence is trivially nearly arithmetic, including power-mean signposts [9, Sect. 3.11–3.12] and geometric-mean signposts [6]. Nearly arithmetic sequences are also exactly the class of divisor sequences addressed by [5].
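The sandwich condition of Definition 1 can be spot-checked numerically. The sketch below tests the bounds listed in Table 1 on a finite grid of points; the helper name `is_sandwiched` and the grid are assumptions of this illustration, and a grid check is of course no substitute for a proof.

```python
import math

def is_sandwiched(delta, alpha, beta_lo, beta_hi, xs):
    """Check alpha*x + beta_lo <= delta(x) <= alpha*x + beta_hi on sample points xs."""
    return all(alpha * x + beta_lo <= delta(x) <= alpha * x + beta_hi for x in xs)

# Sample points 0, 1/8, 2/8, ..., 100 (exactly representable in binary floating point).
grid = [i / 8 for i in range(801)]

hill = lambda x: math.sqrt(x * (x + 1))                     # equal proportions
dean = lambda x: 2 * x * (x + 1) / (2 * x + 1)              # harmonic mean
msl = lambda x: 2 * x + 1 if x >= 1 else 1.6 * x + 1.4      # modified Sainte-Lague

hill_ok = is_sandwiched(hill, 1, 0.0, 0.5, grid)   # x + 1/4 +- 1/4
dean_ok = is_sandwiched(dean, 1, 0.0, 0.5, grid)   # x + 1/4 +- 1/4
msl_ok = is_sandwiched(msl, 2, 1.0, 1.4, grid)     # 2x + 6/5 +- 1/5

# A lower offset that is too large fails already at x = 0, where hill(0) = 0.
too_tight = is_sandwiched(hill, 1, 0.25, 0.5, grid)
```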
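2.2 Ties, rounding rules, and seat allocations
Since the actual seat allocation $s$ is not uniquely determined in case of ties, it is convenient to have a set-valued rounding rule in addition to the rank function. The rounding rule $\llbracket\cdot\rrbracket$ induced by divisor sequence $d$ is defined by $\llbracket x \rrbracket = \lfloor\!\lfloor \delta^{-1}(x) \rfloor\!\rfloor + 1$, where $\lfloor\!\lfloor x \rfloor\!\rfloor = \{\lfloor x \rfloor\}$ if $x \notin \mathbb{Z}$ and $\lfloor\!\lfloor n \rfloor\!\rfloor = \{n-1, n\}$ for $n \in \mathbb{Z}$. (The $+1$ is due to the index shift in divisor sequences; $\lfloor\!\lfloor\cdot\rfloor\!\rfloor$ is the natural extension of $\lfloor\cdot\rfloor$ that returns both limits at jump discontinuities.) Note that we have $\lfloor x\rfloor \le \llbracket x\rrbracket \le \lceil x\rceil$ (for one of the possible values of $\llbracket x\rrbracket$ in case of ties) if and only if the jumppoint sequence is a signpost sequence, making these particularly natural choices for rounding rules. The set of valid seat assignments for given votes and house size is then given by
\[
  S(v,k) = \Bigl\{\, s \in \mathbb{N}_0^n \;\Bigm|\; \sum_{i=1}^{n} s_i = k \;\wedge\; \exists D > 0.\ \forall i \in [n].\ s_i \in \llbracket v_i/D \rrbracket \,\Bigr\}. \tag{1}
\]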
2.3 Highest averages
Divisor methods can equivalently be defined by an iterative method [3, Prop. 3.3]:
Starting with no allocated seats, $s_i = 0$ for $i \in [n]$, we iteratively assign the next seat to a party with a currently "highest average", i.e., maximal $v_i/d_{s_i}$: a party with the most votes per seat. For technical reasons, it turns out to be much more convenient to work with reciprocal averages, i.e., assign the next seat to a party with minimal $d_{s_i}/v_i$ (fewest current seats per vote). In case of ties, any choice leads to a valid seat allocation $s \in S(v,k)$.
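The iterative method just described can be sketched in a few lines with a priority queue keyed by the reciprocal averages. This is an illustrative sketch, not the authors' implementation: the function name is hypothetical, exact rational arithmetic via `fractions.Fraction` is an assumption that requires integer (or rational) divisors and votes, and ties are broken by party index.

```python
import heapq
from fractions import Fraction

def highest_averages(votes, k, d):
    """Assign k seats one by one to a party with minimal d(s_i) / v_i."""
    seats = [0] * len(votes)
    # Heap entries are (reciprocal average d_{s_i} / v_i, party index).
    heap = [(Fraction(d(0), v), i) for i, v in enumerate(votes)]
    heapq.heapify(heap)
    for _ in range(k):
        _, i = heapq.heappop(heap)          # party with fewest seats per vote
        seats[i] += 1
        heapq.heappush(heap, (Fraction(d(seats[i]), votes[i]), i))
    return seats

# Example input of Table 2 with the greatest-divisors sequence d_j = j + 1.
austria_2009 = [858921, 680041, 506092, 364207, 284505, 131261]
result = highest_averages(austria_2009, 17, lambda j: j + 1)
```

Each of the k rounds costs one pop and one push, which is where a running time proportional to k log n comes from.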
This iterative method does not yield an efficient algorithm, but it gives rise to a key structural observation: the minimal quotients $d_j/v_i$ are weakly increasing over subsequent iterations (by monotonicity of $d$), and we obtain the final (largest) quotient directly as
\[
  a^* = \max\Bigl\{\, \frac{d_{s_i - 1}}{v_i} \;\Bigm|\; i \in [n] \,\Bigr\} \tag{2}
\]
using the final seat assignment $s$. The iterative method yields the same seat assignments as Eq. (1) using $D = 1/a^*$ (cf. [9, 59f]); to get the full set of all feasible assignments $S(v,k)$, one has to simulate all possibilities of breaking ties when selecting the next party to be awarded a seat.

¹ Note that [9] additionally requires that signpost sequences do not touch both endpoints of the interval $[j, j+1]$ ("left-right disjunction"). Our conditions (D1) to (D4) already imply this property.
3 Fast apportionment through selection
Worst-case optimal algorithms for divisor methods of apportionment exploit that the quotients $d_j/v_i$ in the iterative method change monotonically: The final multiplier $a^*$ is the $k$th smallest of all possible quotients $d_j/v_i$, and can hence be found directly using a selection algorithm [5]. The challenge is to suitably restrict the candidate set from which to select.
We need some more notation. Given a (multi)set $M$ of (not necessarily distinct) numbers, we write $M_{(k)}$ for the $k$th order statistic of $M$, i.e., the $k$th smallest element (counting duplicates) in $M$. For example, if $M = \{5, 8, 8, 8, 10, 10\}$, we have $M_{(1)} = 5$, $M_{(2)} = M_{(3)} = M_{(4)} = 8$, and $M_{(5)} = M_{(6)} = 10$. For given votes $v = (v_1,\dots,v_n) \in \mathbb{Q}^n_{>0}$, we define the sets
\[
  A_i := \{\, a_{i,j} \mid j = 0, 1, 2, \dots \,\} \quad\text{with}\quad a_{i,j} := \frac{d_j}{v_i}
\]
and their multiset union $A := \biguplus_{i=1}^{n} A_i$. With that notation, we obtain that $a^* = A_{(k)}$.
We further define the rank function $r(x, A)$ as the number of elements in multiset $A$ that are no larger than $x$, that is
\[
  r(x, A) := \bigl| A \cap (-\infty, x] \bigr| = \sum_{i=1}^{n} \bigl| \{\, a_{i,j} \in A_i \mid a_{i,j} \le x \,\} \bigr| . \tag{3}
\]
We write $r(x)$ instead of $r(x, A)$ when $A$ is clear from context. In light of the optimization formulation, $\min a$ s.t. $r(a, A) \ge k$, we call a value $a$ feasible if $r(a, A) \ge k$, otherwise it is infeasible. Feasible $a > a^*$ are called suboptimal.
Note that $A$ is infinite, but $A_{(k)}$ always exists since the terms $a_{i,j} = d_j/v_i$ are strictly increasing in $j$ for all $i \in \{1,\dots,n\}$.
3.1 Cheng and Eppstein’s algorithm
Cheng and Eppstein [5] devise an iterative method that maintains an approximation $\xi$ of $a^*$. In each step, the method either (at least) halves the difference of $r(\xi)$ to $k$ or it (at least) halves the number of parties still under consideration. By ensuring that the initial distance of $r(\xi)$ from $k$ is $O(n)$, their algorithm terminates after $O(n)$ iterations. Each iteration selects the median of the set of $a_{i,j}$ closest to $\xi$ for all remaining parties $i$; using a linear-time selection algorithm, this yields overall $O(n)$ time. More concretely, their algorithm, ChengEppsteinSelect, uses the following three steps.
(a) Identify contributing sequences and compute an initial coarse solution $\xi$, i.e., a value with rank $r(\xi) = k \pm O(n)$.
(The initial coarse solution is essentially our $\overline{a}$ as defined below.)
(b) Compute a lower-rank coarse solution $\xi'$ with rank $r(\xi') \in [k-n..k]$ starting with $\xi$.
(c) Compute $a^*$ starting at $\xi'$.
Each of the steps involves a variant of the iterative median-based algorithm sketched above.
Remark 1 (Precision issues) While implementing it for our running time study, we discovered the following shortcoming of ChengEppsteinSelect. After the median selection, one has to determine for how many parties $i$ the closest $a_{i,j}$ to $\xi$ yields exactly the new upper bound $u$; all but one of these have to be excluded and their number must be known precisely (see the computation of $m$ in Algorithm 2 of [5]). This comes down to testing whether $d_j/v_i = d_{j'}/v_{i'}$ for various values of $i, j, i', j'$; with a naive implementation based on floating-point arithmetic, this cannot be done reliably. The situation is aggravated by the fact that such an implementation can return incorrect results without any obvious signs of failure.
To circumvent this issue, one can either work with exact (rational) arithmetic, which slows down comparisons during median selection considerably, or keep a mapping from quotients $a_{i,j}$ back to party $i$ to check $d_j/v_i = d_{j'}/v_{i'}$ by testing $d_j v_{i'} = d_{j'} v_i$. The latter requires additional space and slows down swaps during median selection. We are not aware of a fully satisfactory solution to this issue.
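The cross-multiplication test mentioned in Remark 1 can be illustrated as follows. This is a sketch under the assumption that divisors and votes are integers (or other exact rationals); the function name is hypothetical and the code is not taken from [5].

```python
def quotients_equal(d_j, v_i, d_j2, v_i2):
    """Exact test of d_j / v_i == d_j2 / v_i2 without performing a division."""
    return d_j * v_i2 == d_j2 * v_i

# Equal quotients written differently: 2/4 == 3/6.
same = quotients_equal(2, 4, 3, 6)

# Two distinct quotients that differ by only 1 / (n * (n + 1)), about 1e-18.
# This gap is far below the spacing of double-precision numbers near 1
# (about 2.2e-16), so a comparison of the floating-point quotients cannot be
# trusted to tell them apart, whereas the integer test is exact.
n = 10**9
different = quotients_equal(n + 1, n, n + 2, n + 1)
```

The price is the one named in the remark: the test needs the pair (divisor, votes) behind every quotient, so that mapping has to be carried along during selection.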
3.2 Our algorithm
Our algorithm relies on explicitly constructing a small "slice" $A \cap [\underline{a}, \overline{a}]$ that contains $a^*$; we can then directly apply a rank-selection algorithm on this slice. We delay any detailed justifications to Sect. 3.3 and first state our algorithm. An application of the algorithm to an exemplary input is given in Table 2.
Recall that we assume a fixed apportionment scheme with a nearly-arithmetic divisor sequence, i.e., $\alpha j + \underline{\beta} \le d_j \le \alpha j + \overline{\beta}$ (Definition 1).
Algorithm 1 SandwichSelect$_d(v, k)$:
Step 1 Compute $\underline{a} := \max\bigl\{0,\ \alpha\frac{k}{V} - (\alpha - \underline{\beta})\frac{n}{V}\bigr\}$ and $\overline{a} := \alpha\frac{k}{V} + \overline{\beta}\frac{n}{V}$.
Step 2 Initialize $\hat{A} := \emptyset$ and $\hat{k} := k$.
Step 3 For $i = 1,\dots,n$ do:
Step 3.1 Compute $\underline{j} := \max\bigl\{0, \lceil \delta^{-1}(v_i \cdot \underline{a}) \rceil\bigr\}$ and $\overline{j} := \lfloor \delta^{-1}(v_i \cdot \overline{a}) \rfloor$.
Step 3.2 For all $j = \underline{j},\dots,\overline{j}$, add $d_j/v_i$ to $\hat{A}$.
Step 3.3 Update $\hat{k} := \hat{k} - \underline{j}$.
Step 4 Select and return the $\hat{k}$th smallest element of $\hat{A}$.
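The steps of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' Java implementation: the function and parameter names are hypothetical, `d` and `d_inv` stand for the divisor sequence and the inverse of its continuation, and Step 4 uses sorting for brevity, which costs O(n log n) instead of the linear time achievable with a worst-case linear selection algorithm. Seats are recovered as described in the note to Table 2, by adding the number of contributed quotients not exceeding the selected value to each party's lower index.

```python
import math

def sandwich_select(votes, k, alpha, beta_lo, beta_hi, d, d_inv):
    """Return (a_star, seats) for a nearly-arithmetic divisor sequence d."""
    n, total = len(votes), sum(votes)
    # Step 1: sandwich bounds on the target multiplier.
    a_lo = max(0.0, (alpha * k - (alpha - beta_lo) * n) / total)
    a_hi = (alpha * k + beta_hi * n) / total
    # Step 2: empty candidate collection, rank still k.
    candidates, safe, k_hat = [], [], k
    # Step 3: collect the quotients d_j / v_i that fall into [a_lo, a_hi].
    for v in votes:
        j_lo = max(0, math.ceil(d_inv(v * a_lo)))
        j_hi = math.floor(d_inv(v * a_hi))
        candidates.append([d(j) / v for j in range(j_lo, j_hi + 1)])
        safe.append(j_lo)       # quotients with index < j_lo lie below a_lo
        k_hat -= j_lo
    # Step 4: select the k_hat-th smallest candidate (by sorting, for brevity).
    pool = sorted(q for per_party in candidates for q in per_party)
    a_star = pool[k_hat - 1]
    # Seats: safe seats plus contributed quotients that are no larger than a_star.
    seats = [s + sum(q <= a_star for q in c) for s, c in zip(safe, candidates)]
    return a_star, seats

# Example of Table 2: d_j = j + 1, so alpha = 1, both offsets are 1, d_inv(x) = x - 1.
austria_2009 = [858921, 680041, 506092, 364207, 284505, 131261]
a_star, seats = sandwich_select(austria_2009, 17, 1, 1, 1,
                                lambda j: j + 1, lambda x: x - 1)
```

Because the seat counts are derived from the very same floating-point quotients that were placed in the candidate pool, the comparison `q <= a_star` is deterministic. If several parties tie exactly at the selected value, this simple recount would hand a seat to each of them, so tie-breaking needs separate handling.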
The intuition behind bounds $[\underline{a}, \overline{a}]$ on $a^*$ is to investigate the rank of $a^*$ in the multiset $A$ of all candidates. Since any number between one and $n$ parties can tie for the last seat, all we can say a priori is that $k \le r(a^*) \le k + n$. We thus make an ansatz with $r(\overline{a}) \ge k + n$ and $r(\underline{a}) < k$, and try to solve for $\overline{a}$ and $\underline{a}$. While it seems not possible to do this exactly, we can obtain sufficiently tight bounds for nearly-arithmetic divisor sequences to guarantee $|\hat{A}| = O(n)$.
Remark 2 (Numerical stability) We note that all arithmetic computations in SandwichSelect can safely be implemented with imprecise floating-point arithmetic when rounding conservatively, i.e., rounding towards $-\infty$ for $\underline{a}$ and $\underline{j}$, resp. towards $+\infty$ for $\overline{a}$ and $\overline{j}$. Round-off errors may imply a minor slow-down (by making $\hat{A}$ slightly larger than necessary), but they do not affect correctness since we use the same value $\underline{j}$ for filling $\hat{A}$ and for adjusting $\hat{k}$.
Remark 3 (Avoid evaluation of $\llbracket\cdot\rrbracket$) Functions $\delta^{-1}$ resp. $\llbracket\cdot\rrbracket$ might be expensive to evaluate in general. We can replace Step 3.1 by $\underline{j} := \max\bigl\{0, \lceil (v_i \underline{a} - \overline{\beta})/\alpha \rceil\bigr\}$ and $\overline{j} := \lfloor (v_i \overline{a} - \underline{\beta})/\alpha \rfloor$. This may make $\hat{A}$ slightly larger, but our upper bound from Lemma 4 on $|\hat{A}|$ still applies (cf. Eq. (8)). Thus SandwichSelect can run without ever evaluating a rank function or computing an inverse of the divisor sequence. Although this may not be a serious concern for the divisor sequences used in applications, it is unclear whether ChengEppsteinSelect can similarly avoid evaluating rank functions precisely.
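The replacement for Step 3.1 described in Remark 3 needs only the three sandwich constants. A minimal sketch, with the hypothetical helper name `index_range` and the example input of Table 2 (greatest divisors, so all three constants equal 1):

```python
import math

def index_range(v, a_lo, a_hi, alpha, beta_lo, beta_hi):
    """Index range [j_lo, j_hi] computed from the linear sandwich bounds only."""
    j_lo = max(0, math.ceil((v * a_lo - beta_hi) / alpha))
    j_hi = math.floor((v * a_hi - beta_lo) / alpha)
    return j_lo, j_hi

austria_2009 = [858921, 680041, 506092, 364207, 284505, 131261]
total = sum(austria_2009)
a_lo, a_hi = 17 / total, 23 / total      # Step 1 for k = 17, n = 6
ranges = [index_range(v, a_lo, a_hi, 1, 1, 1) for v in austria_2009]
```

For this sequence the sandwich is tight, so the ranges coincide with those obtained from the exact inverse; an empty range such as (2, 1) simply means the party contributes no candidate. For sequences such as equal proportions the ranges can be slightly wider than necessary.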
Remark 4 (Relation to envy-free stick-division) SandwichSelect is based on a generalization of our solution for the envy-free stick-division problem [12], a task that arose as a subproblem in a cake-cutting protocol [13]. Given $n$ sticks of lengths $L_1,\dots,L_n$ and an integer $k$, the task is to find the longest length so that we can cut $k$ sticks of exactly that length from the given sticks (without gluing pieces together); this is essentially equivalent to apportionment with $d_j = j+1$.
3.3 Proof of main result
Towards proving Theorem 1, we first establish a few intermediate results. We will indeed prove the slightly stronger statement that SandwichSelect correctly computes $a^*$ using $O(n)$ arithmetic operations for any nearly-arithmetic divisor sequence, not just signpost sequences. We point out that the running time of SandwichSelect is thus independent of $k$, even when $k$ grows much faster than $n$. The proofs are elementary, but require care to correctly deal with ties and boundary cases, so we give detailed calculations.
We start by expressing the rank function $r(x)$ in terms of $\delta^{-1}$.
Table 2 Execution of SandwichSelect on the example input from [9, §4.9], the 2009 European Parliament election in Austria with n = 6 parties competing for k = 17 seats

Party                        | ÖVP     | SPÖ     | Martin  | FPÖ     | GRÜNE   | BZÖ     | Total
-----------------------------|---------|---------|---------|---------|---------|---------|----------
Votes                        | 858 921 | 680 041 | 506 092 | 364 207 | 284 505 | 131 261 | 2 825 027
Step 1                       | $\underline{a}$ = 17/2 825 027 = 6.01764 · 10⁻⁶, $\overline{a}$ = 23/2 825 027 = 8.14151 · 10⁻⁶
Step 3.1: $v_i \cdot \underline{a}$ | 5.1687  | 4.0922  | 3.0455  | 2.1917  | 1.7121  | 0.7899  |
$\underline{j}$              | 5       | 4       | 3       | 2       | 1       | 0       |
$v_i \cdot \overline{a}$     | 6.9929  | 5.5366  | 4.1204  | 2.9652  | 2.3163  | 1.0687  |
$\overline{j}$               | 5       | 4       | 3       | 1       | 1       | 0       |
Step 3.2: $\hat{A}$ (· 10⁻⁶) | 6.9855  | 7.3525  | 7.9037  | –       | 7.0298  | 7.6184  |
$\hat{k}$                    | 17 − 5 − 4 − 3 − 2 − 1 − 0 = 2
Step 4                       | $\hat{A}_{(\hat{k})} = a^*$ = 7.0298 · 10⁻⁶
Safe seats ($\underline{j}$) | 5       | 4       | 3       | 2       | 1       | 0       | 15
+ number of $a_{i,j} \le a^*$ | 1       | 0       | 0       | 0       | 1       | 0       | 2
Seats $s_i$                  | 6       | 4       | 3       | 2       | 2       | 0       | 17

The divisor sequence used is $d_j = j+1$, so we have $\alpha = 1$ and $\underline{\beta} = \overline{\beta} = 1$ and we obtain $\delta^{-1}(x) = x - 1$. The final seat distribution for party $i$ is obtained by summing the $\underline{j}$ value for that party and the number of terms $a_{i,j}$ contributed to $\hat{A}$ that are no larger than $a^*$. Note that to determine the seat distribution, we can simply recompute $a_{i,j}$ for $\underline{j} \le j \le \overline{j}$ using the exact same computation; any imprecision in these computations is without consequence as long as they are deterministic and errors are small enough to not affect the relative order of terms in $\hat{A}$.
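Lemma 2 (Rank via continuation) The rank function $r(x, A)$ satisfies
\[
  r(x, A) = \sum_{i=1}^{n} \bigl( \lfloor \delta^{-1}(v_i \cdot x) \rfloor + 1 \bigr) .
\]
Proof By Eq. (3), it suffices to show that
\[
  \bigl| \{\, j \in \mathbb{N}_0 \mid d_j/v_i \le x \,\} \bigr| = r(x, A_i) = \lfloor \delta^{-1}(v_i x) \rfloor + 1
\]
for each $i \in \{1,\dots,n\}$. By Lemma 1, $r(y, \{d_0, d_1, \dots\}) = \lfloor \delta^{-1}(y) \rfloor + 1$ for all $y \in \mathbb{R}$. Since $x \ge d_j/v_i$ if and only if $y = x v_i \ge d_j$, it follows that $r(x, A_i) = r(x v_i, \{d_0, d_1, \dots\}) = \lfloor \delta^{-1}(v_i x) \rfloor + 1$. □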
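Lemma 2 turns the rank function into a sum of n constant-time terms. The sketch below evaluates it with exact rationals for the example of Table 2; the function name `rank` is an assumption of this illustration, and `d_inv` is the inverse continuation $x - 1$ of the greatest-divisors sequence.

```python
import math
from fractions import Fraction

def rank(x, votes, d_inv):
    """r(x, A) = sum over parties of floor(d_inv(v_i * x)) + 1 (Lemma 2)."""
    return sum(math.floor(d_inv(v * x)) + 1 for v in votes)

austria_2009 = [858921, 680041, 506092, 364207, 284505, 131261]
d_inv = lambda x: x - 1          # inverse of delta(x) = x + 1

# At the multiplier 2/284505 from Table 2 the rank reaches k = 17 ...
rank_at_a_star = rank(Fraction(2, 284505), austria_2009, d_inv)
# ... while any slightly smaller value is infeasible.
rank_just_below = rank(Fraction(2, 284506), austria_2009, d_inv)
```

The two evaluations show the selected value to be the smallest feasible one in its neighbourhood, in line with the optimization formulation of Sect. 3.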
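Next, we show simple sufficient conditions for bounds $[\underline{a}, \overline{a}]$ to contain our target multiplier $a^*$.
Lemma 3 (Valid slices) If $\underline{a}$ and $\overline{a}$ are chosen so that they fulfill
\[
  \sum_{i=1}^{n} \delta^{-1}(v_i \cdot \underline{a}) \le k - n
  \quad\text{and}\quad
  \sum_{i=1}^{n} \delta^{-1}(v_i \cdot \overline{a}) \ge k ,
\]
then $\underline{a} \le a^* \le \overline{a}$.
Proof As a direct consequence of Lemma 2 together with the fundamental bounds $y - 1 < \lfloor y \rfloor \le y$ on floors, we find that
\[
  \sum_{i=1}^{n} \delta^{-1}(v_i \cdot x) \;<\; r(x) \;\le\; \sum_{i=1}^{n} \bigl( \delta^{-1}(v_i \cdot x) + 1 \bigr) \;=\; n + \sum_{i=1}^{n} \delta^{-1}(v_i \cdot x) \tag{4}
\]
for any $x$. We now first show that any $a < \underline{a}$ is infeasible. There are two cases: if there is a $v_i$ such that $v_i a \ge d_0$, we get by strict monotonicity of $\delta^{-1}$
\[
  r(a) \;\overset{(4)}{\le}\; n + \sum_{i=1}^{n} \delta^{-1}(v_i \cdot a) \;<\; n + \sum_{i=1}^{n} \delta^{-1}(v_i \cdot \underline{a}) \;\le\; k
\]
and $a$ is infeasible. If otherwise $v_i a < d_0$ for all $i$, $a$ must clearly have rank $r(a) = 0$ as it is smaller than any element $a_{i,j} \in A$.
In both cases we found that $a < \underline{a}$ has rank $r(a) < k$.
It remains to show that $a^* \le \overline{a}$. By assumption we have
\[
  r(\overline{a}) \;\overset{(4)}{>}\; \sum_{i=1}^{n} \delta^{-1}(v_i \cdot \overline{a}) \;\ge\; k ,
\]
so $|A \cap (-\infty, \overline{a}]| > k$. Hence $a^* = A_{(k)} \in A \cap (-\infty, \overline{a}]$ and the claim $a^* \le \overline{a}$ follows. □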
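The next lemma shows how to compute explicit bounds for $a^*$ for nearly-arithmetic divisor sequences.
Lemma 4 (Sandwich bounds) Assume the continuation $\delta$ of divisor sequence $d$ fulfills
\[
  \alpha x + \underline{\beta} \;\le\; \delta(x) \;\le\; \alpha x + \overline{\beta}
\]
for all $x \in \mathbb{R}_{\ge 0}$ with $\alpha > 0$, $\underline{\beta} \in [0, \alpha]$ and $\overline{\beta} \ge 0$. Then, the pair $(\underline{a}, \overline{a})$ defined by
\[
  \underline{a} := \max\Bigl\{ 0,\ \frac{\alpha k - (\alpha - \underline{\beta}) \cdot n}{V} \Bigr\}
  \quad\text{and}\quad
  \overline{a} := \frac{\alpha k + \overline{\beta} \cdot n}{V} \tag{5}
\]
fulfills the conditions of Lemma 3, that is $\underline{a} \le a^* \le \overline{a}$. Moreover,
\[
  \bigl| A \cap [\underline{a}, \overline{a}] \bigr| \;\le\; 2 \Bigl( 1 + \frac{\overline{\beta} - \underline{\beta}}{\alpha} \Bigr) \cdot n .
\]
Proof We consider the linear divisor sequence continuations
\[
  \underline{\delta}(j) = \alpha j + \underline{\beta} \quad\text{and}\quad \overline{\delta}(j) = \alpha j + \overline{\beta}
\]
for all $j \in \mathbb{R}_{\ge 0}$ and start by noting that the inverses are
\[
  \underline{\delta}^{-1}(x) = x/\alpha - \underline{\beta}/\alpha \quad\text{and}\quad \overline{\delta}^{-1}(x) = x/\alpha - \overline{\beta}/\alpha
\]
for $x \ge \underline{\delta}(0) = \underline{\beta}$ and $x \ge \overline{\delta}(0) = \overline{\beta}$, respectively. For smaller $x$, we are free to choose the value of the continuation from $[-1, 0)$ (cf. (D4)); noting that $x/\alpha - \beta/\alpha < 0$ for $x < \beta$, a choice that will turn out convenient is
\[
  \underline{\delta}^{-1}(x) := \max\Bigl\{ \frac{x}{\alpha} - \frac{\underline{\beta}}{\alpha},\ -1 \Bigr\}
  \quad\text{resp.}\quad
  \overline{\delta}^{-1}(x) := \max\Bigl\{ \frac{x}{\alpha} - \frac{\overline{\beta}}{\alpha},\ -1 \Bigr\} . \tag{6}
\]
We state the following simple property for reference; it follows from $\underline{\delta}(j) \le \delta(j) \le \overline{\delta}(j)$ and the definition of the inverses (recall that $\underline{\beta} \le \alpha$):
\[
  \frac{x}{\alpha} - \frac{\overline{\beta}}{\alpha} \;\le\; \overline{\delta}^{-1}(x) \;\le\; \delta^{-1}(x) \;\le\; \underline{\delta}^{-1}(x) \;\le\; \frac{x}{\alpha} - \frac{\underline{\beta}}{\alpha}, \quad\text{for } x \ge 0 . \tag{7}
\]
Equipped with these preliminaries, we compute
\[
  \overline{a} = \frac{\alpha k + \overline{\beta} n}{V}
  \iff
  k = \frac{\overline{a} V - \overline{\beta} n}{\alpha}
    = \sum_{i=1}^{n} \Bigl( \frac{v_i \cdot \overline{a}}{\alpha} - \frac{\overline{\beta}}{\alpha} \Bigr)
    \;\overset{(7)}{\le}\; \sum_{i=1}^{n} \delta^{-1}(v_i \cdot \overline{a}) ,
\]
so $\overline{a}$ satisfies the condition of Lemma 3. Similarly, whenever the maximum in (5) is attained by its second argument, we find
\[
  \underline{a} \le \frac{\alpha k - (\alpha - \underline{\beta}) n}{V}
  \iff
  k \ge \frac{\underline{a} V + (\alpha - \underline{\beta}) n}{\alpha}
    = n + \sum_{i=1}^{n} \Bigl( \frac{v_i \cdot \underline{a}}{\alpha} - \frac{\underline{\beta}}{\alpha} \Bigr)
    \;\overset{(7)}{\ge}\; n + \sum_{i=1}^{n} \delta^{-1}(v_i \cdot \underline{a}) ,
\]
that is $\underline{a}$ also fulfills the conditions of Lemma 3 (if the maximum is attained by $0$, then $\underline{a} = 0 \le a^*$ holds trivially since all $a_{i,j}$ are nonnegative).
For the bound on the number of elements falling between $\underline{a}$ and $\overline{a}$, we compute
\begin{align*}
  \bigl| A \cap [\underline{a}, \overline{a}] \bigr|
  &= \sum_{i=1}^{n} \bigl| A_i \cap [\underline{a}, \overline{a}] \bigr| \\
  &= \sum_{i=1}^{n} \Bigl| \Bigl\{ j \in \mathbb{N}_0 \Bigm| \underline{a} \le \frac{d_j}{v_i} \le \overline{a} \Bigr\} \Bigr| \\
  &= \sum_{i=1}^{n} \bigl| \{ j \in \mathbb{N}_0 \mid v_i \cdot \underline{a} \le d_j \le v_i \cdot \overline{a} \} \bigr| \\
  &= \sum_{i=1}^{n} \bigl| \{ j \in \mathbb{N}_0 \mid \delta^{-1}(v_i \cdot \underline{a}) \le j \le \delta^{-1}(v_i \cdot \overline{a}) \} \bigr| \\
  &\overset{(7)}{\le} \sum_{i=1}^{n} \bigl| \{ j \in \mathbb{N}_0 \mid \overline{\delta}^{-1}(v_i \cdot \underline{a}) \le j \le \underline{\delta}^{-1}(v_i \cdot \overline{a}) \} \bigr| \\
  &\le \sum_{i=1}^{n} \bigl( \underline{\delta}^{-1}(v_i \cdot \overline{a}) - \overline{\delta}^{-1}(v_i \cdot \underline{a}) + 1 \bigr) \\
  &\overset{(7)}{\le} \sum_{i=1}^{n} \Bigl( \frac{v_i \cdot \overline{a} - \underline{\beta}}{\alpha} - \frac{v_i \cdot \underline{a} - \overline{\beta}}{\alpha} + 1 \Bigr) \\
  &= \Bigl( 1 + \frac{\overline{\beta} - \underline{\beta}}{\alpha} \Bigr) \cdot n + (\overline{a} - \underline{a}) \cdot \frac{V}{\alpha} \\
  &\le \Bigl( 1 + \frac{\overline{\beta} - \underline{\beta}}{\alpha} \Bigr) \cdot n + \frac{(\alpha + \overline{\beta} - \underline{\beta}) \cdot n}{V} \cdot \frac{V}{\alpha} \\
  &= 2 \Bigl( 1 + \frac{\overline{\beta} - \underline{\beta}}{\alpha} \Bigr) \cdot n . \tag{8}
\end{align*}
□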
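We are now in the position to prove our main result.
Proof (Theorem 1) We construct the multiset $\hat{A} \subseteq A$ as the subsequent union of $A_i \cap [\underline{a}, \overline{a}]$, that is
\begin{align*}
  \hat{A}
  &= \biguplus_{i=1}^{n} \Bigl\{ \frac{d_j}{v_i} \in A \Bigm| \underline{j}(i) \le j \le \overline{j}(i) \Bigr\} \\
  &= \biguplus_{i=1}^{n} \Bigl\{ \frac{d_j}{v_i} \in A \Bigm| \lceil \delta^{-1}(v_i \cdot \underline{a}) \rceil \le j \le \lfloor \delta^{-1}(v_i \cdot \overline{a}) \rfloor \Bigr\} \\
  &= \biguplus_{i=1}^{n} \Bigl\{ \frac{d_j}{v_i} \in A \Bigm| v_i \cdot \underline{a} \le d_j \le v_i \cdot \overline{a} \Bigr\} \\
  &= \biguplus_{i=1}^{n} \Bigl\{ \frac{d_j}{v_i} \in A \Bigm| \underline{a} \le \frac{d_j}{v_i} \le \overline{a} \Bigr\} \\
  &= A \cap [\underline{a}, \overline{a}] .
\end{align*}
By Lemma 4, we know that $\underline{a} \le a^* \le \overline{a}$ for the bounds computed in Step 1, so we get in particular that $a^* \in \hat{A}$. It remains to show that we calculate $\hat{k}$ correctly. Clearly, we discard with $(a_{i,0},\dots,a_{i,\underline{j}-1})$ exactly $\underline{j}$ elements in Step 3.2, that is $|A_i \cap (-\infty, \underline{a})| = \underline{j}$. Therefore, we compute with
\[
  \hat{k} = k - \sum_{i=1}^{n} \bigl| A_i \cap (-\infty, \underline{a}) \bigr|
          = r(a^*, A) - \bigl| A \cap (-\infty, \underline{a}) \bigr|
          = r(a^*, \hat{A})
\]
the correct rank of $a^*$ in $\hat{A}$.
For the running time, we observe that the computations in Step 1 and Step 2 are easily done with $O(n)$ primitive instructions. The loop in Step 3 and therewith Step 3.1 and Step 3.3 are executed $n$ times. The overall number of set operations in Step 3.2 is $|\hat{A}| = O(n)$ (cf. Lemma 4). Finally, Step 4 runs in time $O(|\hat{A}|) \subseteq O(n)$ when using a (worst-case) linear-time rank selection algorithm (e.g., the median-of-medians algorithm [4]). □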
4 Comparison of algorithms
We now report from an extensive empirical comparison of all algorithms for divisor
methods of apportionment that we found reported in the literature. A more complete
discussion of our results is given in our technical report [11]; all source codes are
available on GitHub [10]. For the reader’s convenience, we first briefly summarize the
algorithms that have not yet been introduced in this article.
4.1 Iterative methods
A naive implementation of the iterative apportionment method (Sect. 2.3) takes time $\Theta(kn)$; using a priority queue, this can be sped up to $O(k \log n)$.
Pukelsheim [9] notes that the above iterative method can be vastly improved in many instances by starting from a more intelligently chosen initial value for $s$. His jump-and-step algorithm [9, Sect. 4.6] can be formulated using our notation as follows:
Algorithm 2 JumpAndStep$_d(v, k)$:
Step 1 Compute an estimate $a$ for $a^*$; the exact value depends on $d$:
(a) If $d$ is a stationary signpost sequence, i.e., $d_j = \alpha j + \beta$ and $\beta/\alpha \in [0, 1]$, then set $a := \frac{\alpha}{V}\bigl(k + n \cdot (\beta/\alpha - 1/2)\bigr)$.
(b) If $d$ is a stationary jumppoint sequence, i.e., $d_j = \alpha j + \beta$ but $\beta/\alpha \notin [0, 1]$, then set $a := \frac{\alpha}{V}\bigl(k + n \cdot \lfloor \beta/\alpha \rfloor\bigr)$.
(c) Otherwise set $a := \alpha k / V$.
Step 2 Initialize $s_i = \lfloor \delta^{-1}(v_i \cdot a) \rfloor + 1$ for $i = 1,\dots,n$.
Step 3 While $\sum s_i \ne k$:
Step 3.1 If $\sum s_i < k$ set $I := \arg\max_{i=1}^{n} v_i/d_{s_i}$ and $s_I := s_I + 1$;
else set $I := \arg\min_{i=1}^{n} v_i/d_{s_i}$ and $s_I := s_I - 1$.
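A compact sketch of the jump-and-step idea with naive (linear-scan) selection is given below. It is an illustration under several assumptions rather than the implementation benchmarked in this paper: the function name is hypothetical, the initial estimate `a0` is passed in by the caller, all divisors are assumed positive, and the decrement branch compares $v_i/d_{s_i-1}$, the average at which each party's most recent seat was awarded, restricted to parties that hold a seat.

```python
import math

def jump_and_step(votes, k, d, d_inv, a0):
    """Jump to the allocation induced by a0, then step seat by seat towards k."""
    parties = range(len(votes))
    # Jump: seats implied by the initial estimate a0.
    seats = [math.floor(d_inv(v * a0)) + 1 for v in votes]
    # Step: repair the discrepancy one seat at a time (naive linear scans).
    while sum(seats) != k:
        if sum(seats) < k:
            i = max(parties, key=lambda p: votes[p] / d(seats[p]))
            seats[i] += 1
        else:
            holders = [p for p in parties if seats[p] > 0]
            i = min(holders, key=lambda p: votes[p] / d(seats[p] - 1))
            seats[i] -= 1
    return seats

# Example of Table 2 with d_j = j + 1 (alpha = beta = 1), k = 17, n = 6.
austria_2009 = [858921, 680041, 506092, 364207, 284505, 131261]
total = sum(austria_2009)
d, d_inv = (lambda j: j + 1), (lambda x: x - 1)

from_case_a = jump_and_step(austria_2009, 17, d, d_inv, (17 + 6 * 0.5) / total)
from_below = jump_and_step(austria_2009, 17, d, d_inv, 17 / total)
from_above = jump_and_step(austria_2009, 17, d, d_inv, 23 / total)
```

With the estimate of case (a) the jump already lands on a total of 17 seats for this input, so no step is needed; the other two starting values exercise the increment and decrement branches respectively.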
The performance of JumpAndStep clearly depends on the initial distance from the house size, $\Delta_a := \sum_{i=1}^{n} \bigl( \lfloor \delta^{-1}(v_i \cdot a) \rfloor + 1 \bigr) - k$: the running time is in $\Theta(n + |\Delta_a| \log n)$ when using priority queues for Step 3.1. With initial estimate $a = k/V$, we have $|\Delta_a| \le n$ for any signpost sequence [6, Prop. 1], yielding an $O(n \log n)$ method overall. We multiply by $\alpha$ in the formula for $a$ to account for nearly arithmetic sequences that are not signposts. Slight improvements to $|\Delta_a| \le n/2$ are possible for stationary signpost sequences, $d_j = j + \beta$ with $\beta \in [0, 1]$, using $a = (k + n(\beta - \frac{1}{2}))/V$ [9, Sect. 6.1]; this corresponds to the case (a) in Step 1. This bound for $\Delta_a$ is best possible for the worst case [9, Chap. 6]; it is therefore not possible to obtain a worst-case linear-time algorithm based on JumpAndStep.
4.2 Running time comparison
We have implemented all discussed algorithms in Java and conducted a running-time study to compare the practical efficiency of the methods. We use artificial instances; for a given number of parties $n$, house size $k$ and divisor sequence, we draw multiple vote vectors $v$ at random according to different distributions. We fix $k$ to a multiple of $n$ and consider arithmetic divisor sequences of the form $(\alpha j + \beta)_{j \in \mathbb{N}_0}$.
We focus on two scenarios here: one resembling current political applications and one exhibiting the worst-case behavior of JumpAndStep; see our technical report [11] for a more comprehensive evaluation. We note that in the context of democratic elections, the effort of tallying up the votes likely dwarfs the effort spent for apportioning afterwards; similarly for census data. However, the algorithmic tasks solved in this context are fundamental optimization problems interesting in their own right. We therefore do not want to limit ourselves to the characteristics of specific current applications.

Fig. 1 This figure shows average running times on a logarithmic scale for SandwichSelect, ChengEppsteinSelect, JumpAndStep with naive resp. priority-queue minimum selection, and IterativeMethod with naive resp. priority-queue minimum selection, normalized by dividing by the number of parties $n$. The inputs are random apportionment instances with vote counts $v_i$ drawn i.i.d. uniformly from $[1, 3]$. The numbers of parties $n$, house size $k$ and method parameters $(\alpha, \beta)$ have been chosen to resemble national parliaments in Europe (left) and the U.S. House of Representatives (right), respectively.

The implementation of ChengEppsteinSelect posed the complication mentioned in Remark 1. To not unduly slow it down in our running time study, we adopted a fast ad-hoc solution of adding a small constant before computing floors of floating-point numbers. We could manually determine a suitable constant for our benchmark inputs, but we point out that this approach will in general lead to incorrect results (if vote counts are very close).
Figure 1 shows the results of two experiments with practical parameter choices. It is clear that JumpAndStep dominates the field, although SandwichSelect comes close. All other algorithms are substantially slower. As shown in our report [11], these observations are stable across many parameter choices. It is worth noting that for small instances, the priority-queue based implementations are slower than the sequential-search based implementations of the iterative method. This is likely due to the initialization overhead for the priority queue.
4.3 Super-linear worst case for JUMPANDSTEP
While JumpAndStep outperforms SandwichSelect for parameters modeling realistic political scenarios, where its initial jump brings it close to the desired house size, in other configurations it clearly exhibits superlinear behavior; Fig. 2 shows such a scenario. Although the sizes are beyond current political applications, for sufficiently large n this makes JumpAndStep slower than SandwichSelect.
Our report [11] further shows that SandwichSelect has much smaller variance
in running times compared to JumpAndStep, both when varying the individual vote
vectors and the used divisor sequences.
Fig. 2 The left plot shows normalized running times of SandwichSelect and JumpAndStep on instances with $k = 2n$ and Pareto(3)-distributed $v_i$ for $(\alpha, \beta) = (1, 0.001)$. The right plot shows $|\Delta_a|/n$ for all 1000 runs per $n$; it shows that the expected $|\Delta_a|$ seems to converge to a constant fraction of $n$ in this case. The challenge of this family of apportionment instances lies in the heavy tail of the votes distribution: the majority of parties will get 2 seats unless there are sufficiently many sufficiently popular parties in the instance. Since JumpAndStep's initial estimate only considers $V$, which for large $n$ is dominated by the vast majority of parties with few votes, the initial seat allocation will give most parties 2 seats and additional seats to the popular parties. More precisely, the expected value of $a$ is $1 + \frac{2}{3}\beta = 1.0006$; fixing $a = 1$, the expected number of allocated seats is $(1 + \zeta(3))\,n \approx 2.2021\,n$, so the expected allocation in excess of $k = 2n$ is $(\zeta(3) - 1)\,n$, which matches the data very well
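The constant in the caption of Fig. 2 can be checked numerically. The sketch below rests on two assumptions that are not spelled out in the caption: Pareto(3) is taken to mean shape 3 and scale 1, i.e. P(v ≥ x) = x^{-3} for x ≥ 1, and the divisor sequence is taken to be d_j = j + β with the tiny β neglected. Under these assumptions a party with v votes has ⌊v⌋ + 1 ratios d_j/v that are at most a = 1, and E[⌊v⌋] = Σ_{m≥1} P(v ≥ m) = Σ_{m≥1} m^{-3} = ζ(3). The function names are hypothetical.

```python
import random


def zeta3(terms: int = 100_000) -> float:
    """Partial sum of zeta(3) = sum over m >= 1 of m**-3."""
    return sum(m ** -3 for m in range(1, terms + 1))


def mean_seats_at_a_equal_1(samples: int = 200_000, seed: int = 1) -> float:
    """Monte Carlo estimate of the expected seats per party when a = 1."""
    rng = random.Random(seed)
    # int() truncates, which equals the floor for the positive Pareto samples.
    return sum(1 + int(rng.paretovariate(3)) for _ in range(samples)) / samples


expected = 1 + zeta3()
print(round(expected, 4))         # 2.2021, seats per party at a = 1
print(round(expected - 2, 4))     # 0.2021, excess over k = 2n per party
print(mean_seats_at_a_equal_1())  # simulation estimate, close to the value above
```

The simulated mean agrees with 1 + ζ(3) up to sampling noise, in line with the constant fraction visible in the right plot.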
In summary, we see that, despite its ω(n) worst case, JumpAndStep is very fast for
many scenarios, and is the best choice for small inputs. SandwichSelect allows for a
robust implementation and provides reliable performance across all tested scenarios,
independent of divisor sequence and vote distributions; for large instances of the
apportionment optimization problem, it is the fastest choice available.
5 Conclusion
We have shown that divisor methods of apportionment can be implemented by a simple
and numerically stable algorithm, SandwichSelect, that achieves the optimal linear
complexity even in the worst case; the same algorithm works for any rounding rule. The
algorithm is simple to state and implement, but its efficiency derives from a close study
of the structure of the problem. This concludes the quest for a robust and worst-case
efficient implementation of divisor methods.
A closely related area where this quest has not yet been concluded is biproportional
apportionment (double proportionality) [1,2,9]. We leave for future work the question
of whether new insights from the one-dimensional version can be put to good use in the
two-dimensional variant.
Acknowledgements We thank Chao Xu for pointing us towards the work by Cheng and Eppstein [5] and noting that the problem
of envy-free stick-division [12] is related to proportional apportionment as discussed there. He also observed
that our approach for cutting sticks—the core ideas of which turned out to carry over to this article—could be
improved to run in linear time. Furthermore, we owe thanks to an anonymous reviewer whose constructive
feedback sparked broad changes which have greatly improved the article over its previous incarnations.
Declarations
Conflict of interest The authors declare that they have no conflict of interest.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which
permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give
appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence,
and indicate if changes were made. The images or other third party material in this article are included
in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If
material is not included in the article’s Creative Commons licence and your intended use is not permitted
by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the
copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
A Notation index
In this section, we collect the notation used in this paper.
Generic mathematical notation
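$\lfloor x \rfloor$, $\lceil x \rceil$: floor and ceiling functions, as used in [8].
·: used rounding rule; see Sect. 2.2.
·: set-valued floor; $x \mapsto \{\lfloor x \rfloor\}$ if $x \notin \mathbb{N}$ and $n \mapsto \{n - 1, n\}$ for $n \in \mathbb{N}$.
$O(f(n))$, $\Omega$, $\Theta$: asymptotic notation as defined, e.g., in [7, Sect. A.2].
$M_{(k)}$: the $k$th smallest element of (multi)set/vector $M$ (assuming it exists); if the elements of $M$ can be written in non-decreasing order, $M$ is given by $M_{(1)} \le M_{(2)} \le M_{(3)} \le \cdots$.
$\mathbf{x} = (x_1, \ldots, x_d)$: to emphasize that $\mathbf{x}$ is a vector, it is written in bold.
$\mathcal{M}$: to emphasize that $\mathcal{M}$ is a multiset, it is written in calligraphic type.
$\mathcal{M}_1 \uplus \mathcal{M}_2$: multiset union; multiplicities add up.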
Notation specific to the problem
party, seat, vote (count), house size: Parties are assigned seats (in parliament), so that the number of seats $s_i$ that party $i$ is assigned is (roughly) proportional to that party's vote count $v_i$ and the overall number of assigned seats equals the house size $k$.
$\mathbf{d} = (d_j)_{j=0}^{\infty}$: the divisor sequence used in the highest averages method; $\mathbf{d}$ must be a nonnegative, (strictly) increasing and unbounded sequence.
$\delta$, $\delta^{-1}$: a continuation of $j \mapsto d_j$ on the reals and its inverse.
$n$: number of parties in the input.
$\mathbf{v}$, $v_i$: $\mathbf{v} = (v_1, \ldots, v_n) \in \mathbb{Q}_{>0}^n$, vote counts of the parties in the input.
$V$: the sum $v_1 + \cdots + v_n$ of all vote counts.
$k$: $k \in \mathbb{N}$, the number of seats to be assigned; also called house size.
$\mathbf{s}$, $s_i$: $\mathbf{s} = (s_1, \ldots, s_n) \in \mathbb{N}_0^n$, the number of seats assigned to the respective parties; the result.
$a_{i,j}$: $a_{i,j} := d_j / v_i$, the ratio used to define divisor methods; $i$ is the party, $j$ is the number of seats $i$ has already been assigned.
$A_i$: for party $i$, $A_i := \{a_{i,0}, a_{i,1}, a_{i,2}, \ldots\}$ is the list of (reciprocals of) party $i$'s ratios.
$a$: we use $a$ as a free variable when an arbitrary $a_{i,j}$ is meant.
$\mathcal{A}$: $\mathcal{A} := A_1 \uplus \cdots \uplus A_n$ is the multiset of all averages.
$r(x, \mathcal{A})$: the rank of $x$ in $\mathcal{A}$, that is, the number of elements in multiset $\mathcal{A}$ that are no larger than $x$; $r(x)$ for short if $\mathcal{A}$ is clear from context.
$a^*$: the ratio $a^* = a_{i^*, j^*}$ selected for assigning the last, i.e., $k$th seat.
$\underline{a}$, $\overline{a}$: lower and upper bounds on candidates, $\underline{a} \le a^* \le \overline{a}$, such that still $a^* \in \mathcal{A} \cap [\underline{a}, \overline{a}]$.
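As an illustration of how these definitions fit together, the following Python sketch computes an apportionment directly from the notation: it materializes the ratios d_j/v_i, selects the kth smallest as the threshold ratio for the last seat, and counts each party's ratios up to that threshold. This is a brute-force reference under stated assumptions, not SandwichSelect: it sorts instead of using linear-time selection, the name `apportion_by_selection` is hypothetical, and ties at the threshold (where more than k seats would be counted) are not resolved.

```python
from fractions import Fraction


def apportion_by_selection(votes, k, d):
    """Return (threshold, seats): threshold is the k-th smallest ratio d_j / v_i,
    and seats[i] is the number of party i's ratios that are <= threshold."""
    # Only j = 0..k-1 can matter: each party's ratios increase strictly in j,
    # so any ratio with j >= k has at least k smaller ratios before it.
    ratios = [[Fraction(d(j)) / v for j in range(k)] for v in votes]
    threshold = sorted(a for row in ratios for a in row)[k - 1]
    seats = [sum(1 for a in row if a <= threshold) for row in ratios]
    return threshold, seats


# D'Hondt divisors d_j = j + 1 on a small example instance.
threshold, seats = apportion_by_selection([100000, 80000, 30000, 20000], 8,
                                          lambda j: j + 1)
print(threshold)  # 1/25000
print(seats)      # [4, 3, 1, 0]
```

The sketch takes O(nk log(nk)) time and serves only to make the definitions tangible; the point of the article is that the same threshold can be found in linear time.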
References
1. Balinski, M.L., Demange, G.: Algorithms for proportional matrices in reals and integers. Math. Program. 45(1–3), 193–210 (1989)
2. Balinski, M.L., Demange, G.: An axiomatic approach to proportionality between matrices. Math. Oper.
Res. 14(4), 700–719 (1989)
3. Balinski, M.L., Young, H.P.: Fair Representation, 2nd edn. Brookings Institution Press, Washington,
D.C. (2001)
4. Blum, M., Floyd, R.W., Pratt, V., Rivest, R.L., Tarjan, R.E.: Time bounds for selection. J. Comput.
Syst. Sci. 7(4), 448–461 (1973)
5. Cheng, Z., Eppstein, D.: Linear-time algorithms for proportional apportionment. In: International
Symposium on Algorithms and Computation (ISAAC) 2014. Springer, Berlin (2014)
6. Dorfleitner, G., Klein, T.: Rounding with multiplier methods: an efficient algorithm and applications
in statistics. Stat. Pap. 40(2), 143–157 (1999)
7. Flajolet, P., Sedgewick, R.: Analytic Combinatorics. Cambridge University Press, Cambridge (2009)
8. Graham, R.L., Knuth, D.E., Patashnik, O.: Concrete Mathematics: A Foundation for Computer Science.
Addison-Wesley, Boston (1994)
9. Pukelsheim, F.: Proportional Representation, 1st edn. Springer, Berlin (2014)
10. Reitzig, R., Wild, S.: Companion source code (2015). https://github.com/reitzig/2015_apportionment,
revision db43ee7f05
11. Reitzig, R., Wild, S.: A practical and worst-case efficient algorithm for divisor methods of apportion-
ment (2017). arXiv:1504.06475
12. Reitzig, R., Wild, S.: Building fences straight and high: an optimal algorithm for finding the maximum
length you can cut k times from given sticks. Algorithmica 80(11), 3365–3396 (2018)
13. Segal-Halevi, E., Hassidim, A., Aumann, Y.: Waste makes haste: bounded time algorithms for envy-free
cake cutting with free disposal. ACM Trans. Algorithms 13(1), 1–32 (2016)
14. Zachariasen, M.: Algorithmic aspects of divisor-based biproportional rounding. Technical Report
06/05, University of Copenhagen (2006)
Publisher’s Note Springer Nature remains neutral with regard to jurisdictional claims in published maps
and institutional affiliations.