The Effects of Untruthful Bids on User Utilities and Stability in Computing Markets.
ABSTRACT Markets of computing resources typically consist of a cluster (or a multi-cluster) and jobs that arrive over time and request computing resources in exchange for payment. In this paper we study a real system that is capable of preemptive process migration (i.e., moving jobs across nodes) and that uses a market-based resource allocation mechanism for job allocation. Specifically, we formalize our system into a market model and employ simulation-based analysis (performed on real data) to study the effects of users' behavior on performance and utility. Online settings are typically characterized by a large amount of uncertainty; therefore it is reasonable to assume that users will consider simple strategies to game the system. We thus suggest a novel approach to modeling users' behavior called the Small Risk-aggressive Group (SRG) model. We show that under this model untruthful users experience degraded performance. The main result and contribution of this paper is that using the k-th price payment scheme, which is a natural adaptation of the classical second-price scheme, discourages these users from attempting to game the market. The preemptive capability not only makes it possible to use the k-th price scheme, but also makes our scheduling algorithm superior to other non-preemptive algorithms. Finally, we design a simple one-shot game to model the interaction between the provider and the consumers. We then show (using the same simulation-based analysis) that market stability in the form of (symmetric) Nash-equilibrium is likely to be achieved in several cases.
Sergei Shudler∗, Lior Amar†, Amnon Barak†
Department of Computer Science
The Hebrew University of Jerusalem
Jerusalem, 91904 Israel
Social and Information Sciences Laboratory
California Institute of Technology
Pasadena, California, USA 91125
†Supported in part by the EU IST program, grant 034286 “SORMA”.
1. Introduction
In recent years the area of High-Performance Computing (HPC) has witnessed a shift towards market-based resource allocation mechanisms. These mechanisms incorporate market features into their architectures [12, 5, 17, 4, 3, 21]
and allocate scarce resources more efficiently, not only by
increasing utilization (in the traditional sense) but also by
maximizing users’ utilities . One of the central princi-
ples in these mechanisms is incentive compatibility; i.e., the
ability to discourage strategic users from gaming the mar-
ket to obtain better personal utility  for themselves. By
gaming the market these users disproportionally harm other
users who are best described as “conservative” in terms of
gaming attempts. Usually the goal of incentive compati-
ble mechanisms is to maximize the utility for all the users
such that gaming the market will not increase the utility of
any of the users. However, there are costs associated with
using incentive compatible mechanisms. For example, in the combinatorial auctions field, the well-known Vickrey-Clarke-Groves (VCG) mechanism is known to be incentive compatible, but it is computationally hard to use in real systems. As a result, many real-world implementations use various approximations of VCG [14, 8, 17, 18], thus compromising incentive compatibility in its game-theoretical meaning.
This paper is based on  in which it was shown that
using a type of preemptive job scheduler, specifically the
Online Greedy Migration (GM) scheduler, in a cluster re-
sulted in increased performance and increased fairness. The
GM scheduler and its preemptive capability are part of the
MOSIX  multi-cluster grid deployed at the Hebrew Uni-
versity. Our goal in this paper is to analyze the effects of
users’ untruthful behavior, suggest a way to discourage this
and to explore the stability of the market. To this end,
in Section 2, we transform the GM scheduler into a mar-
ket mechanism by allowing it to support various payment
schemes and introduce the Small Risk-aggressive Group
(SRG) untruthfulness model which provides a simple char-
acterization of users’ behavior. The study was performed
by running real-world workload traces in a simulated envi-
ronment. The rationale for performing simulations instead
of running workloads on a real environment was to save
time and resources, since normally workloads span months
(or even years) and assume clusters of hundreds of nodes.
Although there is a significant amount of theoretical work
in this field [13, 10, 14], the theoretical treatment of the
problem is complicated due to: (a) an on-line environment
in which the number and the arrival time of future jobs is
unknown; (b) no job run-time information is available; (c)
the preemptive capability of the scheduler. In Section 3
we show that untruthful users experience degraded perfor-
mance in that their slowdown is higher than for truthful
users. This section also verifies that compared to other non-
preemptive scheduling algorithms, our upgraded scheduler
retains the same properties as the GM scheduler. In Sec-
tion 4 we report the most significant results of this paper. By
dividing all of the jobs in a workload into three groups with
increasing run-times and analyzing each group separately,
we show that the k-th price payment scheme discourages
untruthful users from attempting to game the market. In ad-
dition to being robust, it also increases the utility of all the
users compared to the simple First price payment scheme.
These results are another strong case for preemptive migra-
tion ability in clusters since this ability is the key to using
the k-th price scheme within our mechanism. In Section 5
we explore the stability of the market by formulating a one-
shot game between the mechanism designer and untruth-
ful users. As it turns out, there are specific combinations
of strategies that produce equilibria in this game. Finally,
Section 6 discusses related work and Section 7 presents the
conclusions and ideas for future work.
2. Computing Market Models
2.1. The Basic Model
We now formalize our existing system into a somewhat
simpler model that allows us to specify the scheduling algo-
rithm. Consider a set of n homogeneous (same architecture
and same speed) computers (a cluster or a multi-cluster) in
which an on-line bidding scheduling algorithm is used to
assign incoming sequential jobs to computers (nodes). As-
sume that each node runs only one job at a time and that the
system supports job preemptions; i.e., a job can be stopped
at any stage of its execution and be resumed later, not nec-
essarily in the same node. Before starting a job, each user
must determine his or her private bid, which is the maximal
amount that the user is willing to pay per unit of run-time.
Note that the private bid is not known to the other users.
Upon submitting a job each user states a bid, called the re-
ported bid, which might be lower than the private bid. After
the job is submitted the bid cannot be changed. Throughout this article the notions of jobs and users are used interchangeably.
The following Highest-Bid (HB) on-line algorithm is based on the Online Greedy Migration (GM) algorithm extensively studied in , with the addition of a payment calculation. The algorithm is used to determine whether to assign a job to a node or to place it in a queue of waiting jobs. It supports various payment schemes by assuming that $p'_i$, which is the current payment per unit of run-time for job $i$, was already calculated by one of the payment schemes.
Algorithm 1 Highest-Bid (HB)
Upon job arrival, job termination or bid update do:
1. If it is a new job $j$, set its total payment: $p_j = 0$.
2. For each running or newly terminated job $i$, set: $p_i = p_i + t'_i \cdot p'_i$, where $t'_i$ is the elapsed time since the last assignment of the job and $p'_i$ is the current payment per unit of run-time.
3. Sort the set of new and currently queued and uncompleted jobs in a descending order according to their bid. Break ties according to the submission time.
4. Assign the first n jobs from the sorted list to the n nodes, possibly preempting jobs with lower bids.
5. For each assigned job $i$, determine its current $p'_i$.
6. Queue unassigned jobs until the next run of the algorithm.
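The steps of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the scheduler used in the study: the `Job` class, the `hb_step` function, the `finished_ids` argument and the `payment_rate` callback are hypothetical names introduced here, and each call charges running jobs for the time since the previous call at the rate fixed in that call.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Set, Tuple


@dataclass
class Job:
    job_id: int
    bid: float                 # reported bid per unit of run-time
    submit_time: float
    paid: float = 0.0          # total payment so far (step 1: starts at 0)
    rate: float = 0.0          # current payment per unit of run-time
    running_since: Optional[float] = None  # None while the job is queued


def hb_step(
    jobs: List[Job],
    finished_ids: Set[int],
    n_nodes: int,
    now: float,
    payment_rate: Callable[[Job, List[Job], List[Job]], float],
) -> Tuple[List[Job], List[Job]]:
    """One run of the HB sketch; returns (running, queued)."""
    # Step 2: charge every running or newly terminated job for the time
    # elapsed since it was last (re)assigned, at its current rate.
    for job in jobs:
        if job.running_since is not None:
            job.paid += (now - job.running_since) * job.rate
            job.running_since = None

    # Step 3: sort uncompleted jobs by descending bid; ties go to the
    # earlier submission.
    active = [j for j in jobs if j.job_id not in finished_ids]
    ordered = sorted(active, key=lambda j: (-j.bid, j.submit_time))

    # Step 4: the first n jobs run; any lower-bid job is preempted.
    running, queued = ordered[:n_nodes], ordered[n_nodes:]

    # Step 5: the payment scheme (a hypothetical callback) sets the rate.
    for job in running:
        job.running_since = now
        job.rate = payment_rate(job, running, queued)

    # Step 6: unassigned jobs wait, and pay nothing while queued.
    for job in queued:
        job.rate = 0.0
    return running, queued
```

Passing the payment scheme as a callback mirrors the statement that the algorithm assumes the per-unit payment is computed elsewhere.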
In terms of algorithmic mechanism design, each user has
a well defined utility function (see below) that represents
the user’s preference over various outcomes of the HB algo-
rithm. It is assumed that users will try to game the system,
for example by reporting false information to optimize their
respective utilities .
For each job i, let:
• $w_i$ and $\tilde{w}_i$ be the private and the reported bids respectively.
• $\varphi_i$ be the flow time (in seconds), defined as the total time between the submission time and the time when the job was finished. Note that $\varphi_i$ depends on $\tilde{w}_i$.
• $BSD_i$ be the bounded slowdown factor of the job, defined as:
$BSD_i = \max\left\{ \frac{\varphi_i}{\max\{\tau_i, 60\}},\ 1 \right\}$ (1)
where $\tau_i$ is the actual run-time  in seconds and 60 is a threshold value which prevents getting excessive BSD values in extreme conditions.
The private valuation $v_i$ of job $i$, which represents the user's preference for a shorter completion time, was defined in  as $-w_i \varphi_i$. In this paper we use the same definition:
$v_i = -w_i \varphi_i$ (2)
Definition: Quasi-Linear Utility
For job $i$ define: $u_i = v_i - p_i$.
By using Eq. 2 the utility function of job $i$ becomes:
$u_i = -w_i \varphi_i - p_i$ (3)
Ideally users truthfully reveal their private bids, which can be formally stated as:
Full-Truthfulness (FT) Model: For each job $i$: $w_i = \tilde{w}_i$.
However, assuming that users are self-interested, rational and strategic, they may report untruthful bids (i.e., $w_i \neq \tilde{w}_i$) if it maximizes their utility.
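As a small worked illustration of the quasi-linear utility, the following Python sketch evaluates the valuation and the utility for one job. The function names and the numbers in the usage note are hypothetical and chosen only for this example.

```python
def valuation(private_bid: float, flow_time: float) -> float:
    """Private valuation of a job: minus the private bid times the flow time."""
    return -private_bid * flow_time


def utility(private_bid: float, flow_time: float, total_payment: float) -> float:
    """Quasi-linear utility: the valuation minus the total payment."""
    return valuation(private_bid, flow_time) - total_payment
```

For instance, a job with a private bid of 30, a flow time of 100 seconds and a total payment of 500 has a valuation of -3000 and a utility of -3500. Note that the utility depends on the private bid, while the reported bid affects it only indirectly through the flow time and the payment.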
2.2. The Untruthful Bids Model
To the best of our knowledge, there is no sufficiently clear definition of untruthful behavior that fits the cluster environment we are modelling. Below we define a model which can be used for this purpose. The reported bid range can be parameterized by $0 \le \beta < 1$, such that $\tilde{w}_i \in [(1 - \beta) w_i, w_i]$. Assume that users strongly prefer to avoid higher payments. Assume that $\tilde{w}_i$ is chosen randomly and uniformly from the above range defined by the private bid. We now define the users' behavior model implemented in this paper:
Small Risk-aggressive Group (SRG) Model: Divide all of the jobs randomly and uniformly into two groups: the first group contains 90% of the jobs with $\beta$ = 10% and is called the “risk-conservative group”; and the second group contains the remaining 10%, with $\beta$ = 90%, and is called the “risk-aggressive group”.
In words, our first assumption in the SRG model is that users facing uncertainties have simple strategies, and therefore can only underbid their values. That is, users would never overbid and would never delay their arrival time to the system. Our second assumption in this model is that the majority of users are conservative, and therefore their willingness to underbid is very small. Only a small group of aggressive users can severely underbid their values.
Although this model assumes that all of the jobs are known in advance, it can also be applied without knowing the total number of jobs. Each time a new job is submitted a biased 10%-90% coin is flipped so that the job has a 10% chance of being “aggressive” and a 90% chance of being “conservative”. More elaborate models of user untruthfulness could make the market more complicated to analyze and alter the original FT bid distribution, which is assumed to be bi-modal. For example, increasing the percentage of the risk-aggressive users to 80% and lowering the percentage of the risk-conservative users to 20% (the $\beta$ values stay unchanged) gives a distribution which is closer to exponential than bi-modal. In the case of the SRG model the distribution is nearly identical to FT.
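The per-job coin flip and the uniform underbidding range can be sketched as follows. This is an illustrative sketch of the SRG model in its on-line form; the function name `srg_reported_bid` and its parameter names are assumptions introduced here, while the 10%/90% split and the two β values come from the model definition.

```python
import random
from typing import Tuple


def srg_reported_bid(
    private_bid: float,
    rng: random.Random,
    aggressive_share: float = 0.10,   # chance that a job is risk-aggressive
    beta_conservative: float = 0.10,  # beta of the risk-conservative group
    beta_aggressive: float = 0.90,    # beta of the risk-aggressive group
) -> Tuple[float, bool]:
    """Return (reported bid, is_aggressive) for one newly submitted job."""
    # Biased coin: 10% aggressive, 90% conservative.
    aggressive = rng.random() < aggressive_share
    beta = beta_aggressive if aggressive else beta_conservative
    # Users only underbid: draw uniformly from [(1 - beta) * w, w].
    reported = rng.uniform((1.0 - beta) * private_bid, private_bid)
    return reported, aggressive
```

Using an explicit `random.Random` instance keeps a simulation run reproducible for a fixed seed.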
It is not reasonable to assume that every job is highly
important and consequently has a higher bid; a much more
realistic assumption is that only a fraction of jobs are really
important (highly prioritized). The majority of the jobs are
regular jobs with normal priorities. The aforementioned bi-
modal distribution for FT is generated by uniformly com-
bining two normal distributions such that the first one is
used for 80% of the jobs and the second one is used for
the other 20%. The first distribution models jobs with low
bids (normally prioritized) with a mean of 30, whereas the
second one models jobs with high bids (highly prioritized)
whose mean is 150. For both distributions the standard
deviation is 15. Throughout the article, low-bid jobs refer to jobs with $w_i \in [0, 60]$, middle-bid jobs refer to jobs with $w_i \in (60, 120)$ and high-bid jobs refer to jobs with $w_i \ge 120$.
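The bi-modal FT bid distribution and the three bid classes can be sketched as follows. The function names are hypothetical, clipping negative draws to zero is an assumption made here so that bids stay non-negative, and the high-bid boundary at 120 is inferred from the low and middle ranges.

```python
import random


def sample_private_bid(
    rng: random.Random,
    high_share: float = 0.20,  # fraction of highly prioritized jobs
    low_mean: float = 30.0,
    high_mean: float = 150.0,
    std: float = 15.0,
) -> float:
    """Draw one private bid from a mixture of two normal distributions."""
    mean = high_mean if rng.random() < high_share else low_mean
    # Assumption: negative draws are clipped to 0 (not stated in the text).
    return max(0.0, rng.gauss(mean, std))


def bid_class(private_bid: float) -> str:
    """Classify a private bid as low, middle or high."""
    if private_bid <= 60.0:
        return "low"
    if private_bid < 120.0:
        return "middle"
    return "high"
```

Because both means lie two standard deviations away from the (60, 120) range, only a small fraction of sampled bids falls into the middle class, which is consistent with the small number of middle-bid jobs reported in the results.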
In the next section we present the performance of Algorithm HB with preemptions and other non-preemptive algorithms.
3. Performance of HB for the SRG model
In this section we compare the performance of Algo-
rithm HB on the FT and the SRG models. We also compare
Algorithm HB with preemptions to other, non-preemptive algorithms.
3.1. SRG vs. FT
As was stated earlier, we have taken an empirical ap-
proach in this study and carried out simulations using
real-world workloads from three homogeneous clusters :
DAS2 , LPC  and REQUIN . Each workload
consists of records for each submitted job that includes the
job submission time, its run-time, the user’s estimate of its
run-time, and other non-relevant parameters. Since bidding
was not used in these workloads we used randomly gener-
ated bidding values. For each workload, Table 1 presents
the number of jobs, the actual number of nodes used (in the
simulations) and the mean (average) run-time of all the jobs
(in seconds). Note that since our model does not deal with
parallel jobs, we converted such jobs to sequences of se-
rial jobs . Also note that due to under-utilization of both
the DAS2 and the LPC clusters, we artificially reduced the
number of nodes in these clusters by 25%, in order to create
a competitive environment. Although the simulations were
performed for all of the aforementioned workloads only the
results of the LPC workload are presented since the results
for the other workloads were conceptually the same.
Table 1: Workloads (columns: number of serial jobs, number of nodes, mean run-time in seconds; the table values were not recoverable)
Average BSD vs. Private bid
Fig. 1 presents the average BSD for different values of private bids ($w_i$). In the figure, the red line presents the average BSD of the risk-aggressive users in the SRG model, while the blue line presents the average BSD of all the users in the FT model.
Figure 1: BSD comparison: SRG vs. FT
From the figure it can be seen that the average BSD of
the low-bid jobs is higher in the SRG model. The number of
middle-bid jobs is very small, which leads to noisy results
for this range of bids. In the case of high-bid jobs, there is
model have a greater $\tilde{w}_i$ than the majority of other jobs.
Since the HB algorithm prefers jobs with higher bids over
jobs with lower bids, lowering a job’s bid leads to increased
slowdown of this job. As a result, the risk-aggressive jobs
are subjected to a greater slowdown.
Measure of SSJs
Based on the analysis in , it is assumed that users show
tolerance to the growing slowdown up to a certain point
(threshold). We now define this threshold value for exces-
sive BSD values:
Definition: Severely Slowdowned Job (SSJ)
A job $i$ such that $BSD_i \ge 5$.
The value of 5 was chosen based on the analysis in .
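The SSJ measure can be computed as in the following sketch. The function names are hypothetical, and the bounded slowdown is written here in its common form (flow time divided by the larger of the run-time and the 60-second threshold, never below 1), which is an assumption of this sketch.

```python
from typing import Iterable, Tuple


def bounded_slowdown(flow_time: float, run_time: float, threshold: float = 60.0) -> float:
    """Bounded slowdown in its common form (an assumption of this sketch)."""
    return max(flow_time / max(run_time, threshold), 1.0)


def ssj_percentage(jobs: Iterable[Tuple[float, float]], limit: float = 5.0) -> float:
    """Percentage of jobs, given as (flow_time, run_time), whose BSD >= limit."""
    jobs = list(jobs)
    if not jobs:
        return 0.0
    severe = sum(1 for flow, run in jobs if bounded_slowdown(flow, run) >= limit)
    return 100.0 * severe / len(jobs)
```

Applying `ssj_percentage` separately to the jobs of each bid range gives the kind of per-bid percentages that the SSJ comparison reports.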
Fig. 2 presents the percentage of SSJs for different values of private bids ($w_i$). In the figure, the red line presents the percentage of SSJs of the risk-aggressive users in the SRG model, while the blue line presents the percentage of SSJs of all the users in the FT model.
Figure 2: SSJ comparison: SRG vs. FT
From the figure it can be seen that the percentage of low-
bid SSJs is higher in the SRG model than in the FT model
(as before, the results for middle-bid jobs are noisy). This
means that a non-negligible percentage of risk-aggressive
users’ jobs are subject to excessive slowdown. It is as-
sumed that a job’s bid correlates with its importance and
that the user wants more important jobs to have less slow-
down. Therefore, good performance means that jobs with
higher bids do not become SSJs and have the lowest slow-
down possible. The above results show that risk-aggressive
users are penalized by degraded performance compared to truthful users.
3.2. HB vs. Non-Preemptive Algorithms
In this section we compare the performance of the HB
preemptive algorithm to several non-preemptive algorithms
using the adjusted workloads described in the previous sec-
tion. Specifically, we extend the analysis presented in 
by incorporating the SRG model.
We used the following three algorithms:
1. A non-preemptive version of the HB (NPHB) algo-
rithm that sorts the jobs in the queue according to their
reported bid and then assigns jobs to available nodes.
2. WSPT - similar to NPHB, except that the queue is
sorted according to the ratio between the reported bid
and the run-time: ˜ wi/τi, known as the Smith promi-
nent ratio rule . Note that this algorithm assumes
that the run-times are known.
3. WSPT-EST - a version of WSPT, in which the actual
run-time τi, is replaced by a user estimation of the
run-time ˜ τi. Note that the run-time estimation either
appears in the workload records (DAS2, LPC) or was
generated (REQUIN) according to .
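The three non-preemptive algorithms differ only in how they order the waiting queue, which can be sketched as follows. The dictionary keys (`bid`, `run_time`, `estimate`) and the function names are hypothetical; the sketch shows the ordering rules only, not the assignment of jobs to free nodes.

```python
from typing import Dict, List

Job = Dict[str, float]  # hypothetical record: id, bid, run_time, estimate


def nphb_order(queue: List[Job]) -> List[Job]:
    """NPHB: highest reported bid first."""
    return sorted(queue, key=lambda j: -j["bid"])


def wspt_order(queue: List[Job]) -> List[Job]:
    """WSPT: highest ratio of reported bid to actual run-time first."""
    return sorted(queue, key=lambda j: -j["bid"] / j["run_time"])


def wspt_est_order(queue: List[Job]) -> List[Job]:
    """WSPT-EST: as WSPT, but with the user's run-time estimate."""
    return sorted(queue, key=lambda j: -j["bid"] / j["estimate"])
```

A job with a high bid but a badly underestimated run-time moves up the WSPT-EST queue relative to WSPT, which is one way inaccurate estimates can change the resulting schedule.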
Average BSD vs. Private bid
Fig. 3 presents the average BSD of the risk-aggressive users
for different values of private bids (wi) and for the afore-
mentioned algorithms. In the figure, the green line presents
the average BSD for the HB algorithm, while the orange,
blue and red lines present the average BSD for NPHB,
WSPT and WSPT-EST algorithms respectively.
Figure 3: BSD comparison: HB vs.
From the figure it can be seen that the WSPT-EST per-
forms less well than WSPT, but both these algorithms per-
form better than HB for low-bid jobs. In case of the high-
bid jobs, the performance of HB improves and the average
BSD values become smaller. As for the middle-bid jobs, the
results are noisy as before. The HB algorithm also outper-
forms the NPHB algorithm for the whole spectrum of bids.
These results correlate with the performance results in ;
the difference, however, is the new perspective of the SRG model.
Measure of SSJs
Fig. 4 presents the percentage of SSJs of the risk-aggressive
users for different values of private bids (wi) and for the
aforementioned algorithms. In the figure, the green line
presents the percentage of SSJs for the HB algorithm, while
the orange, blue and red lines present the percentage of SSJs
for NPHB, WSPT and WSPT-EST algorithms respectively.
From the figure it can be seen that the HB algorithm has
consistently fewer SSJs for almost the whole spectrum of
bids, which means that its performance is better. As in the
previous case, the results correlate with .
Figure 4: SSJ comparison: HB vs.
The next section analyzes the utility values produced by the HB algorithm combined with various payment schemes.
4. Analysis of Different Payment Schemes
The previous section presented the performance of the HB algorithm without considering market aspects of payments and utilities. It was concluded that risk-aggressive
users are penalized in terms of BSD for reporting untruthful
bids. This section analyzes users’ utilities under different
payment schemes and under different truthfulness models
(FT and SRG). As in the previous section only the results
of the LPC workload are presented since the results for the
other workloads were conceptually the same. For our anal-
ysis we used the following two payment schemes:
1. First price: each running job pays exactly the bid it re-
ported. Whenever the number of jobs drops below the
number of nodes, the payment of each job becomes the
reservation price of the node. Obviously, it is econom-
ically unreasonable to request a full price if there is no
demand. Note that in order to simplify the model, the
reservation price of each node is set to 1.
2. k-th price: each running job pays the bid of the job
with the highest bid in the queue; if the waiting queue
is empty the payment of each job becomes the reserva-
tion price of the node. Note that this scheme is the nat-
ural dynamic adaptation of the classical second-price
scheme. Essentially each job pays the minimal bid it
could have reported and still be running.
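The two payment schemes can be compared on a single snapshot of reported bids, as in the sketch below. The function names are hypothetical; the reservation price of 1 follows the text, and the sketch computes per-unit payment rates only, not the accumulated payments.

```python
from typing import List, Tuple

RESERVATION_PRICE = 1.0  # per-node reservation price, set to 1 in the text


def split_by_bid(reported_bids: List[float], n_nodes: int) -> Tuple[List[float], List[float]]:
    """Return (running, queued) bids, with the n highest bids running."""
    ordered = sorted(reported_bids, reverse=True)
    return ordered[:n_nodes], ordered[n_nodes:]


def first_price_rates(reported_bids: List[float], n_nodes: int) -> List[float]:
    """First price: each running job pays its own reported bid, or the
    reservation price when there are fewer jobs than nodes."""
    running, _ = split_by_bid(reported_bids, n_nodes)
    if len(reported_bids) < n_nodes:
        return [RESERVATION_PRICE] * len(running)
    return running


def kth_price_rates(reported_bids: List[float], n_nodes: int) -> List[float]:
    """k-th price: each running job pays the highest queued bid, or the
    reservation price when the queue is empty."""
    running, queued = split_by_bid(reported_bids, n_nodes)
    rate = max(queued) if queued else RESERVATION_PRICE
    return [rate] * len(running)
```

With reported bids 90, 70, 50, 40 and 20 on three nodes, the First price rates are 90, 70 and 50, whereas under the k-th price scheme all three running jobs pay 40. A running job's payment under the k-th price scheme does not depend on its own reported bid as long as it stays among the running jobs.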
The utility function (Eq. 3) decreases linearly as a job's run-time grows; i.e., comparing the utility value of a short job with the utility value of a long job almost always shows that the short job has a higher utility. This effect is negated by dividing the jobs according to their run-times and comparing them within the same run-time group.
Therefore all of the jobs were divided into three equal sized
groups, SHORT, MEDIUM and LONG according to their
respective run-times and the utility values were analyzed
within these groups. Naturally, the utility values of the
LONG group were generally much smaller than the values
of the SHORT and MEDIUM. Fig. 5 presents the average
utility values for different values of private bids (wi) and for
the First price scheme. In the figure, the red line presents
the average utility values of the risk-aggressive users in the
SRG model, while the blue line presents the average utility