IBM Journal of Research and Development
Volume 27, Number 2, March 1983

Ronald Fagin
John H. Williams

A Fair Carpool Scheduling Algorithm

© Copyright 1983 by International Business Machines Corporation. See individual articles for copying information. Pages containing the table of contents and “Recent Papers by IBM Authors” may be freely copied and distributed, in any form. ISSN 0018-8646. Printed in USA.
We present a simple carpool scheduling algorithm in which no penalty is assessed to a carpool member who does not ride on any given day. The algorithm is shown to be fair, in a certain reasonable sense. The amount of bookkeeping grows only linearly with the number of carpool members.

1. Introduction

Suppose that N people, tired of spending their time and money in gasoline lines, decide to form a carpool. We present a scheduling algorithm for determining which person should drive on any given day. We want a scheduling algorithm that will be perceived as fair by all the members so as to encourage their continued participation. We begin by presenting three algorithms (Scheduling Algorithms 1-3 below) and discussing their flaws. We then present the algorithm (Scheduling Algorithm 4) that we propose. We assume for now that on any given day at most one car is the “carpool car.” This assumption is relaxed later.

Scheduling Algorithm 1 (simple rotation)
The simplest scheme, and the one most often used, is simply to rotate driving, e.g., in alphabetical order. Thus, if there are N members of the carpool, then person i is responsible for driving on the ith day and every N driving days thereafter. This scheme has the obvious advantage that it is simple to describe and it is easy to determine who drives next. The difficulty with this scheme arises when one or more people do not participate in the carpool on a particular day. If the designated driver has to stay out on the day that he is supposed to drive, then he will have to swap days with someone else. After a few such occurrences, it may become difficult to determine who is to drive the next day. If a non-driver misses one or more days, should he be expected to drive in his normal rotation? If so, he may soon perceive the carpool to be more of a burden than a blessing and drop out altogether.

Just as big a problem as the person who cannot drive on his scheduled day is the person who must (for personal reasons) drive on someone else’s day but could otherwise participate in the carpool (for example, a person who is going to work as usual but needs to have his car in order to go to the bank to deposit the money he has saved by carpooling). We want a scheduling algorithm that will always be tolerant of exceptional conditions and that will never discourage participation. In particular, we want an algorithm that is robust, in the following sense: A person can drive on a day that the algorithm says someone else should drive, and it is then easy to see how to get “back in synch” later.

Scheduling Algorithm 2 (simple tokens)
In order to correct the deficiencies of simple rotation, we might adopt the following procedure. Each time a person R rides with a driver D ≠ R, then R pays D one “ride token.” Of course, the tokens would not actually need to be handled; each person’s current token holding could simply be recorded somewhere, and that record could be updated daily. Then the algorithm for determining who drives next would be to choose, from among the people participating that day, the person with the smallest holding of tokens.
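
The token rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `simple_tokens_day`, the dictionary layout, and the tie-breaking rule (first participant listed among those tied) are hypothetical choices, since the text leaves tie-breaking open.

```python
def simple_tokens_day(tokens, participants):
    """Pick the driver and update token holdings for one day.

    tokens: dict mapping member name -> current token holding.
    participants: list of members riding today.
    Assumption: ties go to the first tied participant in list order.
    """
    driver = min(participants, key=lambda p: tokens[p])
    for rider in participants:
        if rider != driver:
            tokens[rider] -= 1    # each rider pays one ride token...
            tokens[driver] += 1   # ...to the driver
    return driver


tokens = {"Don": 0, "John": 0, "Phyllis": 0, "Ron": 0}
first = simple_tokens_day(tokens, ["John", "Phyllis", "Ron"])
second = simple_tokens_day(tokens, ["Don", "John", "Phyllis", "Ron"])
```

Note that the driver on a well-attended day collects more tokens than the driver on a sparsely attended day, which is the source of the objections discussed next.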

When we formally define fairness, in Section 3, we shall see that this scheduling algorithm is not fair in our sense. In the worst case, some carpool member may be forced to drive far more than his “fair share,” as we shall see. We now briefly mention a few intuitive reasons why this algorithm is not fair. (1) It is certainly quite advantageous to drive on days when many people are participating (since the driver gets one ride token from each of the other participants). If a carpool member were unlucky enough to be the designated driver on several “bad” (sparsely attended) days, then he might decide that the algorithm is not fair, and might even be driven to drop out of the carpool. (2) On a “good” day (a day in which there are many participants), if two carpool participants A and B were tied for the lowest score, then both A and B would want very much to drive, and some tie-breaking scheme would have to be devised. (3) Finally, this algorithm is not robust in the sense we have defined: If A were a carpool member, if it were not A’s turn to drive according to the algorithm (that is, A did not have the lowest score among the participants on that day), and if A insisted on driving his car on that day for personal reasons, then the other carpool members would be quite unhappy if this were a “good” day.

© Copyright 1983 by International Business Machines Corporation. Copying in printed form for private use is permitted without payment of royalty provided that (1) each reproduction is done without alteration and (2) the Journal reference and IBM copyright notice are included on the first page. The title and abstract, but no other portions, of this paper may be copied or distributed royalty free without further permission by computer-based and other information-service systems. Permission to republish any other portion of this paper must be obtained from the Editor.

Figure 1 Books for Scheduling Algorithm 3. [Tally tables headed by subsets of Don, John, Phyllis, and Ron.]

Figure 2 Books for Scheduling Algorithm 4. [Table with columns Date, Don, John, Phyllis, Ron.]

Scheduling Algorithm 3 (subsets)
The next scheduling algorithm to be described does turn out to be fair in our sense; the problem, as we shall see, is the amount of bookkeeping required. This algorithm records, for each of the 2^N - (N + 1) nontrivial subsets of carpool members (subsets of two or more), the number of times that each member of the subset has driven that particular group of people. For example, if there are four people named Don, John, Phyllis, and Ron in the carpool, then the books at a given point might look like Fig. 1 (where, for example, a tally is entered under Phyllis in the Don-Phyllis-Ron table on a day in which only Don, Phyllis, and Ron participate in the carpool and Phyllis drives). If the table is as in Fig. 1, then on the next day in which the only participants are Don, Phyllis, and Ron, the driver should be the person (in this case, Ron) with the least number of tallies in the Don-Phyllis-Ron table. With this method, it is clear that a person is not penalized for nonparticipation on any day. It is intuitively clear that this algorithm is fair, since it is essentially simple rotation applied separately to each of the 2^N - (N + 1) nontrivial subsets. Further, it is clear that this algorithm is robust in our sense. Unfortunately, the bookkeeping for this algorithm becomes a nightmare (if the number N of people is, say, four or more) because the size of the book grows exponentially with the size of the carpool. Further, this scheduling algorithm neglects certain trade-offs. For example, Phyllis and John appear together in four of the tables in Fig. 1, but Scheduling Algorithm 3 makes no attempt to trade off rides in the tables in which Phyllis and John appear together. In fact, in Fig. 1, Phyllis has driven more times than John in each of the four tables in which they both appear.
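
To make the bookkeeping cost concrete, the following Python sketch builds one tally table per nontrivial subset and picks the driver with the fewest tallies. The helper names (`nontrivial_subsets`, `subsets_day`) and the tie-breaking rule (alphabetical order) are assumptions made for illustration, not part of the original description.

```python
from itertools import combinations


def nontrivial_subsets(members):
    """All subsets of two or more members, as tuples in list order."""
    return [c for size in range(2, len(members) + 1)
            for c in combinations(members, size)]


members = ["Don", "John", "Phyllis", "Ron"]   # kept in sorted order
subsets = nontrivial_subsets(members)

# One tally table per subset; each table holds one counter per member.
books = {s: {name: 0 for name in s} for s in subsets}


def subsets_day(books, participants):
    """Driver = member with the fewest tallies in today's subset table."""
    key = tuple(sorted(participants))
    table = books[key]
    driver = min(key, key=lambda name: table[name])  # ties: alphabetical
    table[driver] += 1
    return driver
```

With four members there are 2^4 - (4 + 1) = 11 tables; with ten members the same call produces 1013 tables, which illustrates the exponential growth mentioned above.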

2. The proposed scheduling algorithm

We now give our proposed scheduling algorithm.

Scheduling Algorithm 4 (fair carpool scheduling algorithm)
We begin by defining U to be a value that, intuitively, represents the total cost of a trip. It is convenient to take U to be the least common multiple of 1, 2, ..., m, where m is the largest number of people who ever ride together at a time in the carpool. In the running example we shall give, we assume that this number m is taken equal to the total number N of members of the carpool, which in turn is assumed to be 4. Thus, U is taken to be the least common multiple of 1, 2, 3, and 4; that is, U is 12.

As drawn in Fig. 2, the books consist of a single table, with one column for the date and one column for each carpool participant. Each day that the carpool drives, a new row is entered into the table. The table is initialized with a row of all 0’s (the first row of the table in Fig. 2). If, on a given day, there are k participants in the carpool and A is the driver, then the A entry is increased by U(k - 1)/k units (that is, the entry for that day in the A column is U(k - 1)/k more than the A entry in the previous row), and the entries of the riders who do not drive are each decreased by U/k. For example, in Fig. 2, the first day of the carpool was May 1, and John was the driver. On that day, Phyllis and Ron rode in John’s car. Thus, John gained 8 units, and Phyllis and Ron each lost 4 units. On the next day, May 2, all four carpool members participated, and Ron was the driver. (The algorithm says that either Phyllis or Ron should be the driver on May 2, since they are tied for the lowest score, with -4 units each.) Since Ron drove, he gained 9 units, and each of the others lost 3 units. On the next day, May 3, only Don and Phyllis participated. Since Phyllis had a lower score than Don (-7 versus -3), she was the driver. She gained 6 units and Don lost 6 units. Note that by choosing U as we have (in this case, U = 12), every entry of the table is an integer.
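
The running example can be replayed with a short Python sketch. The function names (`lcm_upto`, `pick_driver`, `record_day`) and the dictionary-per-row layout are assumptions chosen for illustration; the driver is passed in explicitly because the text allows either tied member to drive on May 2.

```python
from functools import reduce
from math import gcd


def lcm_upto(m):
    """Least common multiple of 1, 2, ..., m."""
    return reduce(lambda a, b: a * b // gcd(a, b), range(1, m + 1), 1)


def pick_driver(scores, participants):
    """Lowest score among today's participants (ties: list order)."""
    return min(participants, key=lambda name: scores[name])


def record_day(scores, participants, driver, U):
    """Return the next table row after `driver` drives `participants`."""
    k = len(participants)
    row = dict(scores)
    for name in participants:
        if name == driver:
            row[name] += U * (k - 1) // k   # driver gains U(k - 1)/k
        else:
            row[name] -= U // k             # each rider loses U/k
    return row


members = ["Don", "John", "Phyllis", "Ron"]
U = lcm_upto(len(members))                  # 12 for four members
rows = [{name: 0 for name in members}]      # row 0: all zeros
days = [
    (["John", "Phyllis", "Ron"], "John"),          # May 1
    (["Don", "John", "Phyllis", "Ron"], "Ron"),    # May 2
    (["Don", "Phyllis"], "Phyllis"),               # May 3
]
for participants, driver in days:
    rows.append(record_day(rows[-1], participants, driver, U))
```

After May 2 the row reads -3, 5, -7, 5 (Don, John, Phyllis, Ron), matching the figures quoted in the text, and `pick_driver(rows[2], ["Don", "Phyllis"])` selects Phyllis for May 3.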

An intuitive way to view this scheduling algorithm is that the “cost” of driving is taken to be U units, and this cost is divided equally among each of the participants. So, if there are k participants, then the cost to each participant is U/k. Thus, each of the participants who is not the driver “pays” U/k units to the driver.

We now show that for each row, the checksum (the sum of the entries) is zero. For example, on May 2, the entries are -3, 5, -7, and 5, which add to 0. This property provides a redundancy check on the arithmetic.

Proposition 1
In each table generated by Scheduling Algorithm 4, the checksum of each row is zero.

Proof. When k people participate, one of them (the driver) gains U(k - 1)/k units, the other k - 1 participants each lose U/k units, and the values of the nonparticipants are unchanged. Thus, the net gain or loss is 0, and since the table is initialized to all 0’s, the checksum is always 0. □
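
The cancellation behind the checksum can be written out as a one-line calculation; the symbol $\Delta$ for the change in the row sum on one day is introduced here only for illustration.

```latex
\Delta
  \;=\; \underbrace{\frac{U(k-1)}{k}}_{\text{driver's gain}}
  \;-\; \underbrace{(k-1)\cdot\frac{U}{k}}_{\text{loss of the } k-1 \text{ riders}}
  \;=\; 0 ,
\qquad\text{e.g. } k = 4,\ U = 12:\quad 9 - 3\cdot 3 = 0 .
```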

We now show that the entries in the table are bounded for each N (where N is the number of members of the carpool). We shall make use of this result later, in our proof of fairness.

The schedule of arrivals is a finite sequence $(S_1, S_2, \ldots, S_n)$, where $S_i$ is the set of participants in the carpool on day i (or as we may also say, at time i). Intuitively, the schedule of arrivals tells who participated in the carpool, day by day. For example, the schedule of arrivals (ABC, BD, ACD), where ABC is an abbreviation for {A, B, C}, etc., corresponds to persons A, B, and C participating in the carpool (riding in the carpool car) on the first day, persons B and D participating on the second day, and so on.

Theorem 2
Let N, the number of members of the carpool, be fixed. Then there is a number M such that, for each schedule of arrivals, the table derived by applying Scheduling Algorithm 4 contains no entry larger than M.

Proof. Assume that the theorem is false; we shall derive a contradiction. Find N such that, for each M, there is a table T (which can be derived by applying Scheduling Algorithm 4 to some schedule of arrivals) with an entry larger than M. Define the sequence $a_1, \ldots, a_N$ recursively by letting $a_1 = 0$, and $a_{i+1} = 1 + i a_i$, for $1 \le i < N$. Let M be $a_N U$. Let T be a table (that is derived by applying Scheduling Algorithm 4) with an entry larger than M. Let us call the top row (with all zeros as entries) of table T row 0, the next row of the table row 1, and so on.

If the N entries of row t are $b_1 \ge b_2 \ge \cdots \ge b_N$, then define $s_t(i)$ to be $b_i$. Thus (with ties properly accounted for), $s_t(i)$ is the ith largest entry of row t. We think of row t as containing the scores of members of the carpool just after the carpool has driven on time t (that is, the scores after time t but before time t + 1).

Since table T contains an entry larger than M, we know that $s_t(j) > M$, for some t and j. Hence, $s_t(1) > M$, since $s_t(1) \ge s_t(j)$. Let $t_1$ be the least t such that $s_t(1) > M$. We now show that there are $t_2, \ldots, t_N$, where $t_1 > t_2 > \cdots > t_N$, such that for each i ($1 \le i \le N$),

$$s_{t_i}(i) > M - a_i U. \tag{1}$$

We already know that (1) holds when i = 1, since $a_1 = 0$. Assume inductively that we have found $t_1 > t_2 > \cdots > t_i$ such that $s_{t_p}(p) > M - a_p U$ for $1 \le p \le i$; in particular (when p = i) we see that (1) holds. We must find $t_{i+1} < t_i$ such that

$$s_{t_{i+1}}(i + 1) > M - a_{i+1} U. \tag{2}$$

Now $s_t(j) \ge s_t(i)$ when $1 \le j \le i$. Hence,

$$s_t(1) + \cdots + s_t(i) \ge i\, s_t(i). \tag{3}$$

By (1) and (3), it follows that when $t = t_i$, we have

$$s_t(1) + \cdots + s_t(i) > iM - i a_i U. \tag{4}$$

Let k be the least value of t such that (4) holds. Note for later use that k > 0, since $s_0(j) = 0$ for each j. We now show that

$$s_k(i) > M - i a_i U. \tag{5}$$

If i = 1, then $k = t_1$, by definition of $t_1$ (since $a_1 = 0$). So, if i = 1, then (5) holds. We now show that (5) holds if i > 1. We know that $k \le t_i$, since, as we showed, (4) holds when $t = t_i$. Since $k \le t_i < t_1$, it follows by minimality of $t_1$ that $s_k(j) \le M$, for $1 \le j \le N$. In particular,

$$s_k(j) \le M, \quad \text{for } 1 \le j \le i - 1. \tag{6}$$

By (4), with t = k, and by (6), it follows that (5) holds, which was to be shown.

We know that k is the least value of t such that (4) holds, and that, as noted, k > 0. Therefore,

$$s_{k-1}(1) + \cdots + s_{k-1}(i) < s_k(1) + \cdots + s_k(i). \tag{7}$$

We now show that (7) implies that

$$s_{k-1}(i + 1) + U > s_k(i). \tag{8}$$

Now (7) says that the sum of the i biggest scores strictly increases between rows k - 1 and k. How can this happen? Let A be the driver of the carpool at time k. Thus, A has the lowest score in row k - 1 among those who participate in the carpool on day k. It is not hard to see that for the sum of the i biggest scores to strictly increase between rows k - 1 and k, it is necessary that

1. A's score in row k - 1 is $s_{k-1}(j)$ for some j > i; that is, A's score is one of the lowest N - i scores in row k - 1, and
2. A's score in row k is $s_k(m)$ for some $m \le i$; that is, A's score is one of the biggest i scores in row k.

Now the driver's score increases by less than U when he drives. Therefore, A's score just before he drove [that is, $s_{k-1}(j)$] differs from his score just after he drove [that is, $s_k(m)$] by less than U. Hence,

$$s_{k-1}(j) + U > s_k(m). \tag{9}$$

Now $s_{k-1}(i + 1) \ge s_{k-1}(j)$, since j > i, and so (by adding U to both sides), we get

$$s_{k-1}(i + 1) + U \ge s_{k-1}(j) + U. \tag{10}$$

Further,

$$s_k(m) \ge s_k(i), \tag{11}$$

since $m \le i$. Clearly, (8) follows immediately from (9), (10), and (11).

Now (5) and (8) together imply that $s_{k-1}(i + 1) > M - (i a_i + 1)U$, that is,

$$s_{k-1}(i + 1) > M - a_{i+1} U. \tag{12}$$

Define $t_{i+1}$ to be k - 1. Then (12) tells us that (2) holds. Further, $t_{i+1} < t_i$, since we already showed that $k \le t_i$. This completes the induction. Hence, (1) holds for each i ($1 \le i \le N$).

Let $t = t_N$. We see from (1), when i = N, that $s_t(N) > M - a_N U$. But $M = a_N U$, and so

$$s_t(N) > 0. \tag{13}$$

Since $s_t(i) \ge s_t(N)$ for $1 \le i \le N$, it follows from (13) that $s_t(i) > 0$ for each i ($1 \le i \le N$). Thus, every entry of row t is strictly positive, and so the checksum of row t is strictly positive. But this contradicts Proposition 1, which says that the checksum of every row is 0. This contradiction completes the proof. □

Corollary 3
Let N, the number of members of the carpool, be fixed. Then there is a number M' such that for each schedule of arrivals, the table derived by applying Scheduling Algorithm 4 contains no entry whose absolute value is larger than M'.

Proof. Let M be as in Theorem 2, and let T be a table derived by applying Scheduling Algorithm 4 to some schedule of arrivals. By Theorem 2, we know that no positive entry in the table can be larger than M. How large in absolute value can the smallest entry (the negative entry with the biggest absolute value) in the table be? Let r be a row of the table. Now no entry of the table can be larger than M, and there can be at most N - 1 positive entries in row r (because, by Proposition 1, the checksum of row r is 0). Hence, the sum of the positive entries in row r is at most (N - 1)M. Since the checksum of row r is 0, the absolute value of the sum of the negative entries in row r is equal to the sum of the positive entries in row r, and so is also at most (N - 1)M. Therefore, the absolute value of the smallest (“most negative”) member of row r is at most (N - 1)M. Thus, we can take M' to be (N - 1)M. □

It follows from our proof of Theorem 2 that an upper bound M on the size of the biggest entry that can ever appear in the table is $a_N U$, where N is the number of carpool members and where $a_1 = 0$ and $a_{i+1} = 1 + i a_i$ ($1 \le i < N$). This bound is not the best possible. For example, if N = 2, then our upper bound is U, whereas it is very easy to see that in this case the actual upper bound is only U/2.
If N = 3, then our upper bound is 3U (since $a_3 = 1 + 2a_2 = 3$), whereas a careful examination of the possibilities shows that the actual upper bound is (5/6)U. Let us define the function f by letting f(N)U be the actual upper bound if there are N carpool members. Thus, f(2) = 1/2 and f(3) = 5/6. We note that f(4) = 7/6 and f(5) = 8/5. We have not found f(N) exactly for N ≥ 6.

Proposition 4
The function f is monotone and unbounded.

Note. By monotone, we mean that if $N_1 \le N_2$, then $f(N_1) \le f(N_2)$. By unbounded, we mean f(N) gets arbitrarily large as N gets large.

Proof. Any score that can be obtained in a carpool with $N_1$ members can be obtained in a carpool with $N_2 \ge N_1$ members: we can simply assume that $N_2 - N_1$ members of the larger carpool never participate. Monotonicity follows immediately.

We now show unboundedness. Let $N = 2^r$, and assume that the carpool members are $A_1, \ldots, A_N$. Assume that on the first day, the participants are $A_1$ and $A_2$, and the driver is $A_1$; on the second day, the participants are $A_3$ and $A_4$, and the driver is $A_3$; and so on for a total of N/2 days. Then there is a second round that begins on the ((N/2) + 1)th day. On the first day of the second round, the participants are $A_1$ and $A_3$, and the driver is $A_1$; on the next day, the participants are $A_5$ and $A_7$, and the driver is $A_5$; and so on. Then there is a third round; on the first day of the third round, the participants are $A_1$ and $A_5$, and the driver is $A_1$; and so on. This continues for a total of $\lfloor \log_2 N \rfloor$ rounds, where $\lfloor x \rfloor$ is the greatest integer not exceeding x. It is straightforward to see that after the final round, $A_1$'s score is $rU/2$ (where $N = 2^r$). Thus, $f(2^r) \ge r/2$. Hence, f is unbounded.

We close the proof by noting another way of showing unboundedness. As before, let $A_1, \ldots, A_N$ be the carpool members. Assume that on the first day, everyone participates, and the driver is $A_N$. Assume that on the second day, the participants are $A_1, \ldots, A_{N-1}$, and the driver is $A_{N-1}$, and so on. Thus, on the ith day ($1 \le i \le N - 1$), the participants are $A_1, \ldots, A_{N-i+1}$, and the driver is $A_{N-i+1}$. It is clear that $A_1$'s score after N - 1 days is $-U/N - U/(N - 1) - U/(N - 2) - \cdots - U/2$, which gets arbitrarily large in absolute value as N increases (and which, in particular, is asymptotic to $-U \log N$). □
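
The second construction (a carpool that shrinks by one member each day) is easy to simulate. The sketch below is an illustration only: the function name `shrinking_carpool` and the use of exact fractions with U = 1 are assumptions, and the inline assertion checks that the chosen driver really is a lowest scorer, so the schedule is one that Scheduling Algorithm 4 permits.

```python
from fractions import Fraction


def shrinking_carpool(N, U=1):
    """Scores after the N - 1 days of the shrinking-carpool schedule.

    Members are numbered 1..N. On day i the participants are members
    1..N-i+1 and the highest-numbered participant drives.
    """
    scores = {a: Fraction(0) for a in range(1, N + 1)}
    for day in range(1, N):
        participants = list(range(1, N - day + 2))
        driver = participants[-1]
        k = len(participants)
        # The driver must be tied for the lowest score among participants.
        assert scores[driver] == min(scores[p] for p in participants)
        for p in participants:
            if p == driver:
                scores[p] += Fraction(U * (k - 1), k)
            else:
                scores[p] -= Fraction(U, k)
    return scores


final = shrinking_carpool(4)
```

For N = 4, member 1 ends at -(1/4 + 1/3 + 1/2)U = -(13/12)U, and the magnitude keeps growing as N increases, in line with the harmonic sum in the proof.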

It follows from the proof of Proposition 4 that $f(N) \ge (1/2)\lfloor \log_2 N \rfloor$. D. Coppersmith (private communication) has improved this logarithmic lower bound to a linear lower bound by using the following argument. Let N be the number of members of the carpool. As in the proof of Theorem 2, let the scores just after time t be $s_t(1) \ge s_t(2) \ge \cdots \ge s_t(N)$. Define the “figure of merit” just after time t to be $(N - 1)s_t(1) + (N - 2)s_t(2) + \cdots + (0)s_t(N)$. We now define the schedule of arrivals. On each day, the set of participants consists of two members with the same score. If there are no two members with the same score, then the carpool stops running. If i and j ride together, and i is the driver, then i's score increases by U/2 and j's score decreases by U/2. The net effect on the figure of merit of increasing one value $s_t(i)$ by U/2 and decreasing $s_t(i + 1)$ by U/2 is to increase the figure of merit by U/2. Further, it is easy to see that the net effect of reshuffling the scores to keep the $s_t(i)$'s nondecreasing can only increase the figure of merit further. Keep the carpool running until either no two participants have the same score, or until the figure of merit has gone beyond $N^3 U$, whichever comes first. In the first case (where the carpool is run until no two participants have the same score), we know that since all carpool members started with a score of 0, and since scores change by U/2 at a time, the scores will be at least U/2 apart. That is, no two scores will be closer together in value than U/2. It is not hard to verify that this fact, along with the fact that the sum of the scores is 0, implies that the largest score is at least (N - 1)U/4. In the second case (where the carpool is run until the figure of merit has gone beyond $N^3 U$), it is clear that the largest score is greater than NU. So in either case, the largest score is at least (N - 1)U/4, which is linear in N, as promised. Note that the linear lower bound is attained even when no more than two carpool members ever ride together. Coppersmith also shows (by a more detailed analysis) a lower bound of (N - 1)U/3, which is attained even with no more than three carpool members ever riding together.
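
The step described as "not hard to verify" in the first case can be spelled out as follows. This is a sketch: $s(1) > \cdots > s(N)$ denotes the sorted final scores, following the notation of Theorem 2 with the time index dropped.

```latex
s(j) \;\le\; s(1) - (j-1)\,\frac{U}{2} \quad (1 \le j \le N)
\;\;\Longrightarrow\;\;
0 \;=\; \sum_{j=1}^{N} s(j) \;\le\; N\,s(1) - \frac{U}{2}\cdot\frac{N(N-1)}{2}
\;\;\Longrightarrow\;\;
s(1) \;\ge\; \frac{(N-1)\,U}{4}.
```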

Coppersmith's argument, taken together with the proof of Theorem 2, shows that $(N - 1)/3 \le f(N) \le a_N$, where $a_1 = 0$ and $a_{i+1} = 1 + i a_i$ ($1 \le i < N$). There is an exponential gap between these lower and upper bounds. It is an interesting combinatorial problem to tighten these bounds [1].

We close this section by noting another interesting combinatorial problem. Let us say that a vector $(a_1, \ldots, a_N)$, where $a_1 \ge \cdots \ge a_N$, is an attainable vector of scores if there is a schedule of arrivals such that, starting with a score of 0 for every member of the carpool, and always applying Scheduling Algorithm 4, there is a time t where the vector $(s_t(1), \ldots, s_t(N))$ of scores is equal to $(a_1, \ldots, a_N)$. We conjecture that if $(a_1, \ldots, a_N)$ is an attainable vector of scores, then so is the negation $(-a_N, \ldots, -a_1)$. If the conjecture is true, then the M' of Corollary 3 and the M of Theorem 2 can, of course, be taken to be the same.

3. Fairness

In this section, we discuss a concept of fairness and show that our scheduling algorithm (Scheduling Algorithm 4) is fair. However, we shall see that Scheduling Algorithm 2 (simple tokens) is not fair. We shall also see that Scheduling Algorithm 1 (simple rotation) is fair (when it can be applied), and that Scheduling Algorithm 3 (subsets) is fair (but it requires too much bookkeeping).

To help us understand fairness, let us first consider Scheduling Algorithm 3 (subsets). Scheduling Algorithm 3 is fair in the sense that among the times that person A rides precisely with, say, B and C, the driver is person A approximately 1/3 of the time (with the obvious generalization that A is the driver approximately 1/k of the time that he rides with a fixed subset of k - 1 others). Less restrictively, we might consider a scheduling algorithm fair if each person is the driver approximately 1/k of the time that he rides with k - 1 others (not necessarily a fixed subset of k - 1 others). Thus, if the carpool consists precisely of A, B, C, and D, then A might be expected to drive approximately 1/3 of the time that he rides with precisely two among B, C, and D. In other words, let $c_X$ be the number of times (through time t) that X is precisely the set of those participating in the carpool on that day. Then during those days that A rides with precisely two among B, C, and D, the number of times that we might want A to drive is approximately

$$\frac{1}{3}\,(c_{ABC} + c_{ABD} + c_{ACD}).$$

Even less restrictively, assume that through time t, person A has participated in the carpool on $b_2$ days when exactly 2 persons participated in the carpool, on $b_3$ days when exactly 3 persons participated in the carpool, and so on. Let us define A's ideal number of drives to be the number

$$\frac{b_2}{2} + \frac{b_3}{3} + \cdots + \frac{b_N}{N}. \tag{14}$$

Our notion of fairness is that A should be the driver of the carpool car approximately this number of times.

We are now ready to give our formal definition of fairness. We say that a carpool scheduling algorithm is fair if for each N (where N is the number of members of the carpool), there is a number P such that whatever the schedule of arrivals, it is the case that at each time t and for each carpool member A, the number of times that A has actually driven differs from his ideal number of drives in absolute value by no more than P.

We shall show that our scheduling algorithm is fair. We first prove a simple proposition.

Proposition 5
Let x be the number of times that A has actually driven through time t, and let y be A's ideal number of drives through time t. Then the number (x - y)U is A's entry in row t of the table in Scheduling Algorithm 4.

Proof. We prove the proposition by induction on t. It is obviously true for t = 0, since every entry in row 0 is 0, and in this case x = y = 0. Assume inductively that the statement of the proposition is true for t = m; we shall show that it holds for t = m + 1. Let $x_t$ be the number of times that A has actually driven through time t, let $y_t$ be A's ideal number of drives through time t, and let $A_t$ be A's entry in row t of the table. By inductive assumption, the number $(x_m - y_m)U$ equals $A_m$ (that is, A's entry in row m of the table). We must show that the number $(x_{m+1} - y_{m+1})U$ equals $A_{m+1}$ (that is, A's entry in row m + 1 of the table). If A does not participate at time m + 1, then $x$, $y$, and A's entry are all unchanged, so we may assume that A participates. Assume that there are k participants in the carpool at (on the day corresponding to) time m + 1. There are two cases, depending on whether A is the driver at time m + 1.

Case 1. A is the driver at time m + 1. Then $x_{m+1} = x_m + 1$, and $y_{m+1} = y_m + (1/k)$. Hence,

$$(x_{m+1} - y_{m+1})U = (x_m - y_m)U + U(k - 1)/k. \tag{15}$$

But by assumption,

$$(x_m - y_m)U = A_m. \tag{16}$$

Since A is the driver at time m + 1, it follows from Scheduling Algorithm 4 that

$$A_{m+1} = A_m + U(k - 1)/k. \tag{17}$$

It follows from (15), (16), and (17) that $(x_{m+1} - y_{m+1})U = A_{m+1}$, which was to be shown.

Case 2. A is not the driver at time m + 1. Then $x_{m+1} = x_m$, and $y_{m+1} = y_m + (1/k)$. Hence, $(x_{m+1} - y_{m+1})U = (x_m - y_m)U - U/k$. As in Case 1, it follows easily that $(x_{m+1} - y_{m+1})U$ is A's entry in row m + 1. □
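
Proposition 5 can be checked numerically on the May 1 to May 3 example from Section 2. The sketch below is illustrative: the function name `check_proposition_5` and the three dictionaries are assumptions, with `drives` playing the role of x and `ideal` the role of y.

```python
from fractions import Fraction


def check_proposition_5(members, days, U):
    """Replay `days` and assert entry == (x - y) * U after every day."""
    scores = {name: Fraction(0) for name in members}
    drives = {name: 0 for name in members}             # x: actual drives
    ideal = {name: Fraction(0) for name in members}    # y: ideal drives
    for participants, driver in days:
        k = len(participants)
        for name in participants:
            ideal[name] += Fraction(1, k)
            if name == driver:
                drives[name] += 1
                scores[name] += Fraction(U * (k - 1), k)
            else:
                scores[name] -= Fraction(U, k)
        for name in members:
            assert scores[name] == (drives[name] - ideal[name]) * U
    return drives, ideal, scores


members = ["Don", "John", "Phyllis", "Ron"]
days = [
    (["John", "Phyllis", "Ron"], "John"),          # May 1
    (["Don", "John", "Phyllis", "Ron"], "Ron"),    # May 2
    (["Don", "Phyllis"], "Phyllis"),               # May 3
]
drives, ideal, scores = check_proposition_5(members, days, U=12)
```

For instance, Don has ridden on one 4-person day and one 2-person day without driving, so his ideal number of drives is 1/4 + 1/2 = 3/4 and his entry is (0 - 3/4) · 12 = -9.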

The next theorem discusses the fairness or unfairness of the scheduling algorithms we have discussed. We are most interested in the result that Scheduling Algorithm 4 is fair.

Theorem 6
Scheduling Algorithm 1 (when it applies), Scheduling Algorithm 3, and Scheduling Algorithm 4 are fair, but Scheduling Algorithm 2 is not fair.

Proof. Recall that a carpool scheduling algorithm is fair if for each N (where N is the number of members of the carpool), there is a number P such that whatever the schedule of arrivals, it is the case that at each time t and for each carpool member A, the number of times that A has actually driven differs from his ideal number of drives in absolute value by no more than P.

Scheduling Algorithm 1 (simple rotation) is fair (when it applies)
Of course, Scheduling Algorithm 1 is very limited, since it is not even defined unless every carpool member participates in the carpool on every day. If so, then it is easy to see that the desired number P above can be taken to be 1.

Scheduling Algorithm 2 (simple tokens) is not fair
Assume that there are 6 carpool members A, B, C, A', B', and C', and that the schedule of arrivals is

(AA', ABC, AB, AC, A'B'C', A'B', A'C', ABC, AB, AC, A'B'C', A'B', A'C', ..., ABC, AB, AC, A'B'C', A'B', A'C'),

where the sequence ABC, AB, AC, A'B'C', A'B', A'C' repeats over and over a total of m times after the initial AA' (and so the number of days is 6m + 1). We shall show that there is no number P as defined above that works for every m. On the first day, when AA' is the set of carpool participants, either A or A' is the driver. Assume without loss of generality that A' is the driver; otherwise, everything we now say holds when we replace A, B, C by (respectively) A', B', C'. We leave to the reader the simple verification that under Scheduling Algorithm 2 it follows that on each of the m days that the set of carpool participants is precisely ABC (respectively, AB or AC), the driver is always A (respectively, B or C). Now there are exactly 2m + 1 days that A participates in the carpool when precisely 2 people participate (namely, A and one of A', B, or C), and there are exactly m days that A participates in the carpool when precisely 3 people participate (namely A, B, and C). Thus, A's ideal number of drives is (1/2)(2m + 1) + (1/3)m, which equals (4/3)m + (1/2). The number of times that A actually drives is m + 1. The difference between ideal and actual is (m/3) - (1/2), which is not bounded by any fixed number P (as m gets large). This was to be shown.
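
The unbounded drift under simple tokens can be observed by simulating the schedule above. The sketch below is an illustration under stated assumptions: the function name `token_gap_for_A` is hypothetical, and ties are broken in favor of the last-listed tied participant so that A' drives on the first day, as assumed in the text.

```python
from fractions import Fraction


def token_gap_for_A(m):
    """|ideal - actual| drives for member A after the 6m + 1 day schedule."""
    block = [["A", "B", "C"], ["A", "B"], ["A", "C"],
             ["A'", "B'", "C'"], ["A'", "B'"], ["A'", "C'"]]
    schedule = [["A", "A'"]] + block * m
    tokens = {name: 0 for name in ["A", "B", "C", "A'", "B'", "C'"]}
    actual, ideal = 0, Fraction(0)
    for participants in schedule:
        low = min(tokens[p] for p in participants)
        # Assumed tie-break: the last-listed participant among those tied.
        driver = [p for p in participants if tokens[p] == low][-1]
        for p in participants:
            if p != driver:
                tokens[p] -= 1        # rider pays one token...
                tokens[driver] += 1   # ...to the driver
        if "A" in participants:
            ideal += Fraction(1, len(participants))
            actual += (driver == "A")
    return abs(ideal - actual)


gap_small = token_gap_for_A(3)
gap_large = token_gap_for_A(30)
```

Each additional three repetitions of the six-day sequence widens the gap by one drive, so no fixed P can bound it.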

Scheduling Algorithm 3 (subsets) is fair
It is easy to see that P can be taken to be equal to the number of subsets that contain a given member A, that is, P can be taken to be $2^{N-1}$, where N is the number of carpool members.

Scheduling Algorithm 4 is fair
As in the statement of Proposition 5, let x be the number of times that A has actually driven through time t, and let y be A's ideal number of drives through time t. By Proposition 5, the number (x - y)U is A's entry in row t of the table. By Corollary 3, we can find a positive number M' (which depends only on N, the number of members of the carpool) such that no entry in the table is larger in absolute value than M'. Thus, $|(x - y)U| \le M'$. Hence, $|x - y| \le M'/U$. So, we can take P to be M'/U. □

4. Further observations

As well as being both fair and manageable, our carpool scheduling algorithm (Scheduling Algorithm 4) has some additional attractive features. First, although it will always determine whose turn it is to drive on a particular day, it is robust in the presence of deliberate imbalance. Thus a person could drive (or not drive) for several days in a row if he needed to, regardless of whether the scheduling algorithm says he should or should not drive, and the imbalance would eventually be eliminated. (It is beyond the scope of this paper to make this last sentence precise. One meaning is that after the driving table is artificially made imbalanced, the entries will remain bounded from then on, as in Theorem 2, provided the scheduling algorithm is faithfully adhered to from then on.)

Second, the “ride units” of the method can become a commodity that can be bought and sold. This can also allow for “carpool members” who never drive at all. Thus, if A does not have a car but wishes to participate in the carpool, if B is a carpool participant (with a car), and if A and B can agree on a fair market value for a ride unit, then B can sell ride units to A, and A need never drive. (In effect, B is “driving for” A.) In fact, the group for which this scheme was developed had such a participant. His name being Don and twelve being the least common multiple of the possible subset sizes of the carpool, the “ride unit” became affectionately known as the Duodecadon.

Finally, although we derived this scheduling algorithm on the assumption that there would be only one official carpool car on any one day, that assumption turned out to be superfluous! In fact, there can be as many carpool cars as there are people driving. Each driver of a car containing k participants gets credited with U(k - 1)/k units, and each rider who does not drive in such a car gets debited U/k units in some master (perhaps company-wide!) record. Note that a person driving alone gets 0 units, or no change to the record. In this generalized scheduling algorithm, an arbitrary group of people, who had never previously carpooled with one another, could decide to ride to work together, and it would make perfect sense for them to ask, “Whose turn is it to drive today?”
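
The multi-car generalization amounts to applying the same credit and debit rule once per car against a shared record. A minimal sketch follows; the function name `record_cars`, the `(driver, passengers)` pair format, and the choice to add unseen members with a score of 0 are assumptions made for illustration.

```python
from fractions import Fraction


def record_cars(scores, cars, U):
    """Apply one day's rides to a shared master record.

    cars: list of (driver, passengers) pairs, one per car on the road.
    Members not yet in the record are added with a score of 0.
    """
    for driver, passengers in cars:
        k = 1 + len(passengers)                 # people in this car
        scores[driver] = scores.get(driver, 0) + Fraction(U * (k - 1), k)
        for rider in passengers:
            scores[rider] = scores.get(rider, 0) - Fraction(U, k)
    return scores


master = {}
record_cars(master, [("John", ["Phyllis", "Ron"]), ("Don", [])], U=12)
```

Here Don drives alone and his entry stays at 0, while John's car is scored exactly as on May 1 of the running example.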

Acknowledgments

The first author would like to acknowledge his former Berkeley roommates, Larry Carter and John Gill, who successfully implemented with him Scheduling Algorithm 3 (to determine whose turn it was to cook). Further, the authors thank Phyllis Reisner, Don Stanat, and Jim Sutton, who at various times participated with the authors in the carpool in which Scheduling Algorithm 4 was developed. We are grateful to Carl Hauser for calculating f(5) and to Don Coppersmith for proving the linear lower bound on f in Section 2.

Note

1. After this paper went to press, Coppersmith lowered the upper bound to (N - 1)/2. Thus we now know that $(N - 1)/3 \le f(N) \le (N - 1)/2$.

Received July 6, 1982; revised September 13, 1982

Ronald Fagin  IBM Research Division, 5600 Cottle Road, San Jose, California 95193. Dr. Fagin is the manager of the foundations of computer science group in the Computer Science Department in San Jose. He joined IBM in 1973 at the Thomas J. Watson Research Center, Yorktown Heights, New York. While there, he did research on storage management analysis. In 1975, he transferred to San Jose, where most of his research has centered on the theory of relational data bases. He has received two IBM Outstanding Innovation Awards. The first, received in 1981, was for fundamental contributions to relational data base theory. The second, also received in 1981, was for his joint research on extendible hashing, a fast access method for dynamic files. He received his B.A. in mathematics from Dartmouth College, Hanover, New Hampshire, in 1967, and his Ph.D. in mathematics from the University of California at Berkeley in 1973. Dr. Fagin is a member of the Association for Computing Machinery and its special interest groups on the Management of Data and on Automata and Computability Theory.

John Hayden Williams  IBM Research Division, 5600 Cottle Road, San Jose, California 95193. Dr. Williams is a Research staff member in San Jose, where he is working with IBM Fellow John Backus on the development of Functional Programming Languages, an alternative to conventional programming languages. He joined IBM in 1978; prior to that, he was an Associate Professor of Computer Science at Cornell University, Ithaca, New York. Dr. Williams received his B.S. and M.S. in mathematics and his Ph.D. in computer science in 1969 from the University of Wisconsin at Madison.