Avtomatika i Telemekhanika, No. 99, 9999
© 9999 N.A. Korgin, V.O. Korepanov

An Efficient Solution of the Resource Allotment Problem with the Groves–Ledyard Mechanism under Transferable Utility¹
This paper designs a mechanism for allotting a limited amount of an infinitely divisible good (resource) among a finite number of agents under transferable utility. The mechanism is efficient in the sense of maximizing the agents' total utility. As a solution, we introduce an adaptation of the Groves–Ledyard "quadratic government" that was initially suggested for the problem of public good.
1. INTRODUCTION
A key problem in control theory for social and economic systems is minimization
of losses due to the absence of complete information required for decision-making.
The specifics of social and economic systems allow separating a class of problems,
where such information is unavailable to a decision-maker (DM) but available
to other system participants whose interests depend on decisions—the problems
of decision-making under incomplete asymmetrical awareness [11]. For this class of problems, researchers endeavor to design mechanisms in which a DM obtains the necessary information from the remaining system participants and the resulting solutions are efficient in the sense of maximizing the total utility of all system participants.
The framework of modern mechanism design for social and economic systems
includes two classes of problems, viz., mechanism design problems under nontransferable
utility (system participants cannot transfer their utilities) and mechanism design
problems under transferable utility (otherwise).
The following fact is a classical result for the first class of problems [2, 27]: information transmission from system participants to a DM does not guarantee the same efficiency of decision-making as in the case of the DM's complete awareness of the necessary system parameters. However, this can be achieved for the second class of problems: the efficiency of decision mechanisms under incomplete asymmetrical awareness can be no smaller than under complete awareness.
Investigations in the field of efficient mechanism design for the second class of problems date back to the 1970s, see [19, 21, 32]. Unfortunately, all those mechanisms suffered from certain shortcomings which complicated their application. In particular, the Groves–Ledyard mechanism [19] does not ensure the individual rationality of a resulting solution, as the latter is not a Lindahl allocation. The Hurwicz mechanism [21] employs a structure of messages from system participants to a DM which hampers its practical implementation. The Walker mechanism [32] yields an unstable solution. Moreover, efficient solutions gained by the last two
¹This work was supported by the Russian Foundation for Basic Research (projects no. 12-07-3124412, 14-07-00875) and by the Presidential Grant (project no. MD-6075.2015.9).
mechanisms are unachievable in learning dynamics, i.e., their practical implementation
appears difficult, too.
The recent years have demonstrated a rising tide of interest in this problem.
A series of new publications presented the results of theoretical research and
simulation experiments; for instance, we refer to [12, 20, 28, 31]. Investigations are
par excellence focused on designing mechanisms implementing Lindahl equilibria
in the problem of public good. Most mechanisms in the cited papers have common
drawbacks as follows:
1) complex messages from system participants to a DM;
2) unbalanced side payments under nonequilibrium requests.
The allotment problem of limited resources using non-market mechanisms
under nontransferable utility was considered in Russia [2, 10] and abroad [13, 30].
Design of such mechanisms still represents a topical problem [17, 26] from a
practical point of view.
Research works dedicated to the design of efficient resource allotment mechanisms under transferable utility mostly concern multi-agent systems with a complex networked structure. In the first place, analysis covers the so-called "auction approaches" to client prioritization (e.g., see [14]), where resources are allotted depending on the unit resource prices announced by candidates for compensating the resource amounts not obtained by their opponents. Nevertheless, these models place the main emphasis on complex resource allotment procedures over networked structures [22, 24], and the mechanisms proper are constructed on the basis of the strategy-proof Vickrey–Clarke–Groves mechanisms [23].
Another direction of investigations aims at designing “quasi”-optimal mechanisms
implementing almost optimal resource distributions [27].
As a separate direction, we also mention research in the field of distributed optimization procedures, where some recent results [15] have a close connection to mechanism design. However, these iterative procedures are currently not represented in the form of mechanisms, which impedes the analysis of their game-theoretic properties.
The closest work to the approach introduced below is the paper [20], where
the authors proposed a mechanism implementing a Walras equilibrium in public
good distribution. But:
1) the proposed solution does not take into account the limited amount of
resources;
2) the mechanism also possesses a complex structure of requests, see the
discussion above.
This paper suggests a refinement of the Groves–Ledyard mechanism used in [8] to solve the active expertise problem under transferable utility. In particular, it was demonstrated that the Groves–Ledyard mechanism is individually rational in the active expertise problem (perhaps, this forms its key disadvantage in the problem of public good). We endeavor to adapt the Groves–Ledyard mechanism to the resource allotment problem, as a natural assumption is that the resource allotment mechanism also implements individually rational solutions and that utility transfers can be balanced, in an efficient solution or beyond it.
The adaptation proceeds from the idea of representing the resource allotment
problem as the multi-criteria active expertise problem. The reasonability and
productivity of such approach was demonstrated in [7] for the resource allotment
problem under nontransferable utility. The whole essence of the idea is simple:
the resource allotment problem is treated not as the individual good distribution
problem but as a multi-criteria choice problem, where each agent can report the
desired value of a multi-criteria public good (i.e., resource allotment among all
agents).
Further exposition has the following structure. Section 2 gives the formal statement of the efficient resource allotment problem (in the sense of maximizing the agents' total utility) and the model used to solve this problem. In Section 3 we present the suggested resource allotment mechanism based on multi-criteria voting and prove that it implements the efficient resource allotment among the agents. Section 4 describes an iterative negotiation process based on the suggested mechanism, which is applied in the conditions where each agent may have incomplete information about the utility functions of all resource-claiming agents. In Section 5 we construct and study a "reduced" version of the suggested mechanism, where each agent reports only the desired amount of resource. The main results of the paper are illustrated by several examples using a certain setting of the resource allotment problem; the latter also underlies a business game intended for experimental testing of the developed mechanism, see Section 6 for its description and the results of some testing games.
2. PROBLEM STATEMENT AND BASIC DEFINITIONS
Formally, the resource allotment problem has the following statement. An organizational system consists of a Principal and a set $N = \{1, \dots, n\}$ of agents. The Principal disposes of some limited resource in an amount $R \in \mathbb{R}^1_+$ to be allotted among the agents in arbitrary proportions.

The utility of each agent $i \in N$ in terms of the amount of resources $x_i \in [0, R]$ it obtains is defined by a function $u_i(\cdot): \mathbb{R}^1_+ \to \mathbb{R}^1$ belonging to a given set $U_i$ of admissible utility functions.
Denote by
$$A = \Big\{x = (x_1, \dots, x_n) : \sum_{i \in N} x_i \le R,\ x \in \mathbb{R}^n_+\Big\}$$
the set of admissible resource allotments and by
$$U = \{u = (u_1(\cdot), \dots, u_n(\cdot)) : u_i(\cdot) \in U_i,\ i \in N\}$$
the set of admissible utility profiles of the agents.
The "basic" problem lies in constructing a utility-efficient mapping $g(\cdot): U \to A$, i.e., a mapping which maximizes the total utility of all agents from the allotted resources for any admissible utility profile $u \in U$:
$$g(u) \in \operatorname{Arg\,max}_{x \in A} \sum_{i \in N} u_i(x_i). \qquad (1)$$
However, even if the solution (1) exists, it can be manipulable [11] (or incentive-incompatible, e.g., [27]). That is, $\exists u \in U$ and $\exists k \in N$ such that a utility profile $\tilde u = (\tilde u_k, u_{-k}) \in U$ satisfies the condition
$$u_k(g_k(\tilde u)) > u_k(g_k(u)),$$
where $u_{-k}$ is the utility profile of all agents except $k$, $u = (u_k, u_{-k})$, and $g_k(u)$ indicates the amount of resources allotted to agent $k$ under the utility profile $u$.
Within the scope of this paper, we consider the set of utility profiles $\bar U$ meeting the following assumptions:
1) the utility function of any agent is strictly concave, nondecreasing and twice continuously differentiable;
2) $\forall u \in \bar U$ the problem (1) has an internal solution.
Obviously, the above assumptions guarantee that the solution of the problem (1) exists and is unique. And so, our analysis can be focused on the problem of incentive compatibility. Consider an illustrative example.
Example 1. Imagine that the utility of each agent is determined by the function¹
$$u_i(x) = \sqrt{r_i + x_i}, \quad i \in N, \qquad (2)$$
where $r_i$ specifies the internal resource "reserves" of agent $i$ known to it only. The maximum total utility of all agents is achieved through allotting the following amount of resources to each agent:
$$x_i = \Big(R + \sum_{j \in N} r_j\Big)\Big/n - r_i, \quad i \in N.$$
In other words, to solve the problem (1), one has to receive information on $r_i$ from agent $i$. Obviously, being asked about $r_i$, agent $i$ strives to underrate the value of this parameter instead of truth-telling. This means that the derived rule of efficient resource allotment is incentive-incompatible and the agents would not reveal actual information on their utility functions. □²
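The closed-form allotment above is easy to check numerically. Below is a minimal sketch (our own Python code, not part of the paper; the data r = {1; 9; 25} and R = 115 are the values reused later in Example 2):

```python
import math

def efficient_allotment(r, R):
    """Maximizer of sum_i sqrt(r_i + x_i) subject to sum_i x_i = R:
    every agent ends up with the same total reserve (R + sum(r)) / n,
    so x_i = (R + sum(r)) / n - r_i."""
    n = len(r)
    level = (R + sum(r)) / n
    return [level - ri for ri in r]

r, R = [1, 9, 25], 115
x = efficient_allotment(r, R)
print(x)  # [49.0, 41.0, 25.0]
# The allotment equalizes marginal utilities u_i'(x_i) = 1/(2*sqrt(r_i + x_i)).
marginals = [1 / (2 * math.sqrt(ri + xi)) for ri, xi in zip(r, x)]
print(round(marginals[0], 4))  # 0.0707
```

Equal marginal utilities are exactly the first-order condition of the total-utility maximization (1) under the budget constraint.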
A mechanism is a set $\rho = \langle S, \pi, t \rangle$, where $S = \times_{i \in N} S_i$ designates some set of the agents' admissible actions; $\pi(\cdot): S \to A$ means a certain procedure mapping the agents' actions into the set of admissible resource allotments; $t(\cdot): S \to \mathbb{R}^n$ indicates a certain utility transfer procedure among the agents. Denote by $\Gamma(\rho) = \langle N, S, \varphi_{u,\rho} \rangle$ the game induced by the mechanism, where $\varphi_{u,\rho} = \{\varphi_1, \dots, \varphi_n\}$ is the agents' preference profile defined through their utility profile $u \in U$ and the procedures $\pi(\cdot)$ and $t(\cdot)$:
$$\varphi_i(s) = u_i(\pi(s)) - t_i(s), \quad i \in N.$$
¹The motives of choosing the current setting of the resource allotment problem used to illustrate the basic results of the paper (the number of agents, the form and parameters of the utility functions, and the mechanism parameters) will be explained in Section 6, which describes a business game designed for experimental testing of the proposed algorithm.
²Throughout the paper, this symbol marks the end of an example.
In this paper, we study the following setting of the resource allotment problem: is it possible to find a mechanism allowing Nash implementation of an efficient resource allotment in the case when the solution of the problem (1) appears incentive-incompatible, i.e., $\forall u \in U$ the game admits a unique Nash equilibrium $s^*(u) \in S$:
$$\forall i \in N,\ \forall \tilde s_i \in S_i:\quad \varphi_i(s^*(u)) > \varphi_i(\tilde s_i, s^*_{-i}(u)),$$
such that $\pi(s^*(u)) = g(u)$.

Moreover, the mechanism proper can be considered efficient only if $\sum_{i \in N} \varphi_i(s^*(u)) = \sum_{i \in N} u_i(g(u))$. This implies the balance of payments: $\sum_{i \in N} t_i(s^*(u)) = 0$.
3. APPLICATION OF THE GROVES–LEDYARD MECHANISM TO THE
RESOURCE ALLOTMENT PROBLEM
Consider the mechanism $\rho = \langle S, \pi, t \rangle$. Each agent $i \in N$ reports a desired resource allotment within the system, possibly demanding "replenishment" from one or several agents. The only requirement is that the suggested allotment satisfies the initial resource constraint:
$$S_i = \Big\{s_i \in \mathbb{R}^n : \sum_{j \in N} s_{ji} \le R\Big\}, \qquad S = \times_{i \in N} S_i,$$
where $s_{ji}$ is the request of agent $i$, which specifies the amount of resources to be allotted to agent $j$.
The resource allotment procedure $\pi(s) = \{x_1(s), \dots, x_n(s)\}$ averages the requests of all agents:
$$x_i(s) = \frac{1}{n} \sum_{j=1}^{n} s_{ij}, \quad i \in N. \qquad (3)$$
Agents' transfers are defined in the following way. Each agent pays a "disagreement" penalty described by
$$p_i(s) = \beta \sum_{j=1}^{n} (x_j(s) - s_{ji})^2, \quad i \in N. \qquad (4)$$
The parameter $\beta \ge 0$ can be comprehended as the penalty strength. A share $\alpha \in [0, 1]$ of all collected penalties goes back to the agents in equal portions, and the total transfer has the form
$$t_i(s) = p_i(s) - \frac{\alpha}{n} \sum_{j=1}^{n} p_j(s), \quad i \in N. \qquad (5)$$
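Formulas (3)–(5) translate directly into code. The sketch below is our own illustration (the function names and the matrix convention s[j][i], the request of agent i concerning the amount for agent j, are assumptions rather than the paper's notation):

```python
def allot(s):
    """Procedure (3): x_j is the average of all agents' requests for agent j."""
    n = len(s)
    return [sum(row) / n for row in s]

def penalties(s, beta):
    """Penalties (4): agent i pays for the squared deviations of its
    requested allotment from the realized one."""
    n = len(s)
    x = allot(s)
    return [beta * sum((x[j] - s[j][i]) ** 2 for j in range(n))
            for i in range(n)]

def transfers(s, alpha, beta):
    """Transfers (5): a share alpha of the collected penalties is
    returned to the agents in equal portions."""
    n = len(s)
    p = penalties(s, beta)
    rebate = alpha * sum(p) / n
    return [pi - rebate for pi in p]
```

Under α = 1 the transfers sum to zero for every request profile, which is the balance property discussed next.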
The unallotted share of penalties is lost or goes to the Principal.
The parameter $\alpha$ plays the role of a balance coefficient, since under $\alpha = 1$ all transfers are always balanced:
$$\forall s \in S:\quad \sum_{i=1}^{n} t_i(s) = 0.$$
Furthermore, in the case of $\alpha = 1$ the mechanism represents the Groves–Ledyard "quadratic government" [12, 19] applied to $n$ problems of public good under zero production costs and an existing constraint $\sum_{i \in N} x_i \le R$ on the total production volume of the public goods, which actually connects these problems.³ Due to the presence of joint constraints, the applicability of the results established for the Groves–Ledyard "quadratic government" to the posed problem is not obvious and requires further theoretical study.
Now, we describe some trivial (yet fruitful for further analysis) properties of this mechanism and of the game induced by it with the agents' preference functions
$$\varphi_i(s) = u_i(\pi(s)) - t_i(s), \quad i \in N.$$
Denote by $s_{-i} \in S_{-i}$ the opponents' request profile for agent $i \in N$, $S_{-i} = \times_{j \in N \setminus \{i\}} S_j$. Let $br_i(s_{-i}) \in S_i$ be the best response function of agent $i$:
$$br_i(s_{-i}) \in \operatorname{Arg\,max}_{s_i \in S_i} \varphi_i(s_i, s_{-i}).$$
The following statement holds true.
Lemma 1. Suppose that $n > \alpha + 1$. Then $\forall u \in \bar U$, $\forall i \in N$, $\forall s_{-i} \in S_{-i}$ the preference function $\varphi_i(s_i, s_{-i})$ is concave and $\exists!\, br_i(s_{-i}) = \arg\max_{s_i \in S_i} \varphi_i(s_i, s_{-i})$.
The proofs of Lemma 1 and the other assertions are postponed to the Appendix. In the studied game, the sets $S_i$ form compact subsets of $\mathbb{R}^n$, whereas the functions $\varphi_i(s_i, s_{-i})$ enjoy concavity in $s_i$ by virtue of Lemma 1 and continuity in $s_{-i}$ by construction. Thus, this game admits Nash equilibria, e.g., see [4].
An important corollary of Lemma 1 is that, under $n = 2$, compulsory transfer balancing becomes impossible: only values $\alpha < 1$ are admissible. Moreover, if we explore the dependence of $br_i(s_{-i})$ on the parameters $\alpha$ and $\beta$ (designated by $br_i(s_{-i}, \alpha, \beta)$), then $\forall i \in N$, $\forall s_{-i} \in S_{-i}$: $br_i(s_{-i}, \alpha, \beta) = br_i(s_{-i}, 0, \tilde\beta)$, where
$$\tilde\beta = \beta\,\frac{n - \alpha - 1}{n - 1}.$$
In other words, the best response functions of the agents for the mechanism with the parameters $\alpha$ and $\beta$ are equivalent to their best response functions for the mechanism with the parameters $\tilde\alpha = 0$ and $\tilde\beta$, where the transfers (5) of the agents take the simpler form
$$t_i(s) = \tilde\beta \sum_{j=1}^{n} (x_j(s) - s_{ji})^2.$$

³The Appendix elucidates this statement.
Denote by $x_{j,-i} = \frac{1}{n-1} \sum_{k \in N \setminus \{i\}} s_{jk}$ the amount of resources to be allotted to agent $j \in N$ according to the requests of all agents except agent $i \in N$. Clearly, if agent $i$ agrees with the "opinion" of the rest of the "society" about the amount of resources allotted to agent $j$, i.e., $s_{ji} = x_{j,-i}$, then the procedure $\pi(s)$ yields $x_j = x_{j,-i}$.

Set $\Delta_{ji} = br_{ji}(s_{-i}) - x_{j,-i}$ and $A_i = \max\big(0;\ \Delta_{ii} + \sum_{j \in N} x_{j,-i} - R\big)$, $\forall i, j \in N$. In the accepted notation, the best response functions $br_i(s_{-i}) = \{br_{ji}(s_{-i})\}_{j \in N}$, $i \in N$, are defined by the following statement.
Lemma 2. Suppose that $n > \alpha + 1$. Then $\forall u \in \bar U$, $\forall i \in N$, $\forall s_{-i} \in S_{-i}$ the quantity $\Delta_{ii}$ is calculated by solving the equation⁴
$$u'_i\Big(\frac{1}{n}\Delta_{ii} + x_{i,-i}\Big) = 2\tilde\beta\,\frac{n-1}{n}\big((n-1)\Delta_{ii} + A_i\big), \qquad (6)$$
and
$$\forall j \in N \setminus \{i\}:\quad \Delta_{ji} = A_i/(1-n).$$
After best response design using Lemma 2, one can find Nash equilibrium messages of the agents as fixed points of the mapping $BR(\cdot) = \{br_i(s_{-i})\}_{i \in N}: S \to S$.

In addition, Lemma 2 gives formal grounds for the following rational behavior of agents in the induced game. Whenever it is reasonable for an agent to request a greater amount of resources than society offers, the optimal strategy to eliminate the resulting "deficit" in its request is to reduce the requests for all other agents by an identical quantity, i.e., $\forall j \in N \setminus \{i\}: \Delta_{ji} = \Delta_{ii}/(1-n)$. In this case, if the whole amount of resources $\big(\sum_{j \in N} x_{j,-i} = R\big)$ has to be allotted according to the requests of all other agents, equation (6) acquires the form
$$u'_i\Big(\frac{1}{n}\Delta_{ii} + x_{i,-i}\Big) = 2\tilde\beta(n-1)\Delta_{ii}, \qquad (7)$$
and $\Delta_{ji} = \Delta_{ii}/(1-n)$, $j \in N \setminus \{i\}$.
⁴$u'_i(\cdot)$ and $u''_i(\cdot)$ indicate the first- and second-order derivatives of $u_i(\cdot)$ with respect to $x_i$.
Introduce the quantity $\Delta = \sum_{i \in N} s^*_{ii} - R$ (the "deficit" of resources in the system) as the difference between the sum of all requests and the available amount of resources. Let $u'^{-1}_i(\cdot)$ stand for the inverse function of $u'_i(\cdot)$.

Assertion 1. $\forall u \in \bar U$, $\forall \alpha \in [0, \min(1, n-1)]$, $\forall \beta > 0$ the game $\Gamma(\rho)$ admits a unique Nash equilibrium $s^* \in S$ such that
$$\pi(s^*) = \arg\max_{x \in A} \sum_{i \in N} u_i(x_i).$$
Moreover, $s^*$ and $\pi(s^*)$ are connected via the expressions
$$\forall i \in N:\quad x_i(s^*) = s^*_{ii} - \frac{\Delta}{n},$$
$$\forall j \in N \setminus \{i\}:\quad x_j(s^*) = s^*_{ji} + \frac{\Delta}{n(n-1)},$$
where $\Delta$ gives a unique solution of the equation
$$\sum_{i \in N} u'^{-1}_i(2\tilde\beta\Delta) = R. \qquad (8)$$
(8)
Therefore, Assertion 1 demonstrates that the mechanism $\rho$ ensures the solution of the problem (1) as a Nash equilibrium in the induced game $\Gamma(\rho)$ (which is unique!), as well as explains the connection between the equilibrium requests of the agents, the mechanism parameters and the solution of the problem (1).
Using Assertion 1, it is possible to establish the following properties of the suggested mechanism. In the first place, define the transfers (5) of the agents in the equilibrium:
$$t_i(s^*) = \tilde\beta(1-\alpha)\frac{\Delta^2}{n(n-1)}, \quad i \in N. \qquad (9)$$
Formula (9) directly implies that all agents have identical transfers under any parameter values of the mechanism. The balanced mechanism ($\alpha = 1$) includes no transfers at all. This result can be interpreted as follows. The mechanism $\rho$ uses transfers as "threats," without obligatory implementation in the case of efficient resource allotment. Furthermore, a central shortcoming of the Groves–Ledyard mechanism applied to the problem of public good was that the solution can be individually irrational for separate agents (e.g., see [19]). The absence of transfers in the balanced mechanism applied to our problem means that, if the optimal resource allotment appears individually rational, then the solution of the game is also individually rational for all agents.
However, the mechanism guarantees the maximum total utility of agents only under $\alpha = 1$.⁵
⁵If the requirement that the agents' utility functions be nondecreasing is rejected, payments can be balanced under $\alpha < 1$ provided that the problem (1) has an internal solution.
Corollary 1. $\forall u \in \bar U$: $\sum_{i=1}^{n} t_i(s^*) = 0$ if and only if $\alpha = 1$.
Interestingly, in the unbalanced mechanism (under $\alpha < 1$) the final transfer of each agent goes down for higher penalty strength. Indeed, the expression (9) can be rewritten as
$$t_i(s^*) = \frac{(1-\alpha)}{\beta n(n-\alpha-1)}\, u'^2_i(x_i(s^*)).$$
Hence, for given $N$, $u \in \bar U$ and $R$, the transfers depend on $\alpha$ and $\beta$ only, being inversely proportional to the latter parameter (penalty strength). The transfers can be made arbitrarily small by an appropriate increase of penalty strength.
Thus, the suggested mechanism implements the efficient resource allotment as
a unique Nash equilibrium; under balanced payments, this mechanism is efficient.
Now, we illustrate the mechanism using the resource allotment problem from Example 1.
Example 2. Three agents claim a limited resource available in the amount of $R = 115$. Each agent has the utility function described by (2). The internal reserves of the agents make up the vector $r = \{1; 9; 25\}$, where $r_i$ specifies the reserves of agent $i$ known to it only.

In this case, the efficient allotment is $x = \{49; 41; 25\}$, and the corresponding utility of each agent approximates 7.07.

Application of the suggested mechanism with the parameters $\alpha = 1$, $\beta = 0.0005$ yields the following equilibrium requests in the induced game (rounded to two decimal digits):
$s_1 = \{96.14;\ 17.43;\ 1.43\}$,
$s_2 = \{25.43;\ 88.14;\ 1.43\}$,
$s_3 = \{25.43;\ 17.43;\ 72.14\}$.

By averaging these requests for each agent, we construct the efficient resource allotment $x = \{49; 41; 25\}$. That is, each agent requests 47.14 resource units more than it actually receives, lowering its requests for the rest of the agents by 23.57 resource units.

According to (4), the disagreement penalty of each agent constitutes $p_i \approx 1.67$, $i \in \{1; 2; 3\}$. The transfers are obviously equal to 0 and balanced.
Doubled penalty strength ($\beta = 0.001$) halves the "disagreements": each agent requests 23.57 resource units more than it actually receives, lowering its requests for the rest of the agents by 11.785 resource units:
$s_1 = \{72.57;\ 29.215;\ 13.215\}$,
$s_2 = \{37.215;\ 64.57;\ 13.215\}$,
$s_3 = \{37.215;\ 29.215;\ 48.57\}$.
The resource allotment again appears optimal: $x = \{49; 41; 25\}$. The disagreement penalties are cut in half: $p_i \approx 0.83$, $\forall i \in \{1; 2; 3\}$. And the transfers are equal to 0 and balanced.
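These equilibrium figures can be reproduced from Assertion 1. For the utilities (2) we have $u'_i(x) = 1/(2\sqrt{r_i + x})$, so $u'^{-1}_i(y) = 1/(4y^2) - r_i$ and equation (8) admits a closed-form solution for $\Delta$. A small sketch (our own code; the function name is illustrative):

```python
import math

def equilibrium_deficit(r, R, alpha, beta):
    """Solve equation (8) for the resource 'deficit' Delta under the
    square-root utilities (2): sum_i u_i'^{-1}(2*beta_t*Delta) = R with
    u_i'^{-1}(y) = 1/(4*y*y) - r_i gives Delta in closed form."""
    n = len(r)
    beta_t = beta * (n - alpha - 1) / (n - 1)  # effective penalty strength
    return math.sqrt(n / (R + sum(r))) / (4 * beta_t)

r, R = [1, 9, 25], 115
n = len(r)
delta = equilibrium_deficit(r, R, alpha=1, beta=0.0005)
print(round(delta / n, 2))               # 47.14 -- each agent's over-request
print(round(delta / (n * (n - 1)), 2))   # 23.57 -- cut applied to each opponent
# Equilibrium penalty (4): beta * Delta^2 / (n*(n-1))
print(round(0.0005 * delta ** 2 / (n * (n - 1)), 2))  # 1.67
```

Doubling β halves Δ, which matches the halved over-requests and penalties reported above.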
In the case of the same penalty strength and no balancing ($\beta = 0.0005$ and $\alpha = 0$), the requests of the agents coincide with the ones described in the previous paragraph. The disagreement penalties remain invariable, too, but the transfers lose balance: $t_i = p_i \approx 0.83$, $\forall i \in \{1; 2; 3\}$. The total utility of the agents decreases by 2.5.
4. CONVERGENCE ANALYSIS OF AN ITERATIVE NEGOTIATION
PROCESS BASED ON THE SUGGESTED MECHANISM
Nash equilibrium implementation requires the agents' complete awareness of the parameters of the game induced by the mechanism, which is not generally the case in practical resource allotment problems. Let us examine the applicability of the mechanism provided that each agent knows its utility function, the available amount of resources, the total number of agents and the mechanism. To define the resource allotment among the agents, use the following iterative negotiation process $I_\rho$ on the basis of the suggested mechanism $\rho = \langle S, \pi, t \rangle$:
$$x(\tau) = \pi(s(\tau)), \qquad \varphi_i(\tau) = u_i(x(\tau)) - t_i(s(\tau)),$$
where $s(\tau) = (s_1(\tau), \dots, s_n(\tau)) \in S$ are the messages of the agents at iteration $\tau \ge 1$. The negotiation process continues until an iteration $T$ such that the agents no longer modify their requests: $s(T-1) = s(T)$.
Does the above process converge to the efficient resource allotment in a finite number of iterations? To answer this question, we have to make assumptions on the decision-making principles of each agent at each iteration of the process and to study its properties as a discrete dynamic process. An elementary hypothesis of agents' decision-making is the Cournot dynamics: each agent chooses its action as the best response to the actions of all other agents at the previous iteration (e.g., see [12, 28]):
$$s_i(\tau) = br_i(s_{-i}(\tau - 1)).$$
Denote $|s_i(\tau)| = \sum_{j \in N} s_{ji}(\tau)$.
Lemma 3. If for some $\tau \ge 1$ the profile $s(\tau) \in S$ is such that $|\pi(s(\tau))| < R$, then the agents' behavior according to the Cournot dynamics yields $|\pi(s(\tau+1))| > |\pi(s(\tau))|$. In the case of $|\pi(s(\tau))| = R$, the result is $|\pi(s(\tau+1))| = R$.
In other words, whenever in the Cournot dynamics $s(\tau)$ ensures incomplete resource allotment, $s(\tau+1) = br(s(\tau))$ increases the amount of allotted resources. If $s(\tau)$ allots the resources completely, then $s(\tau+1) = br(s(\tau))$ does so, too.
Analyze the iterative process in the domain $\bar S = \{s \in S : \forall i \in N\ |s_i| = R\}$. As a matter of fact, it suffices to check whether the Nash equilibrium is an attracting fixed point of the mapping $BR(\cdot)$ or not (e.g., see [1]):
$$\forall \{s, s'\} \in \bar S^2:\quad d(s, s') > d(BR(s), BR(s')),$$
where $d(s, s')$ indicates an arbitrary metric on $\bar S$.
Accordingly, if the mechanism induces a game such that the mapping $BR(\cdot)$ is contracting for some $u \in \bar U$, then it is called contracting for this $u \in \bar U$ [20, 31].
In the case of a contracting mechanism, for a series of behavioral hypotheses including the Cournot dynamics, the iterative process converges to $s^*(u)$ under a given utility profile $u \in \bar U$. The situation is somewhat more complicated for our mechanism. Designate by $BR^2(\cdot) = BR(BR(\cdot)): S \to S$ the "dual mapping" constructed using $BR(\cdot)$.
Lemma 4. $\forall u \in \bar U$ such that $\forall s \in \bar S$ there exists a finite $C \in \mathbb{R}^1_+$: $\max_{i \in N}(-u''_i(x_i(s))) \le C$, it is possible to find $\tilde\beta \ge \frac{1}{2n} \max_{i \in N}(-u''_i(x_i(s)))$ such that $BR^2(\cdot)$ is a contracting mapping.
So long as the mapping $BR^2(\cdot)$ is contracting, it has a unique fixed point. Obviously, the fixed point of the mapping $BR(\cdot)$ calculated above is also the fixed point of the mapping $BR^2(\cdot)$. And the following result holds true.

Assertion 2. The iterative process $I_\rho$ implements the efficient resource allotment for any preference profile from $\bar U$ such that there exists a finite $C \in \mathbb{R}^1_+$: $\max_{i \in N}(-u''_i(x_i(s))) \le C$, and the agents act according to the Cournot dynamics.
Unfortunately, an endeavor to obtain results for other behavioral strategies [9, 12, 28] (differing from the Cournot dynamics) requires an additional examination for the suggested mechanism, which is not contracting in the classical sense. The results established for the Cournot dynamics are of independent value, as such dynamics arises in distributed optimization algorithms in multi-agent systems [15]. Our mechanism can be applied to solve similar problems.
In addition, a still open issue concerns the convergence rate of the iterative process. According to the proof of Lemma 4, making $\tilde\beta$ very large seems unreasonable, as in this case the agents strive for penalty minimization. As a result, the iterative process converges fast enough to the arithmetical mean of the initial requests and then moves slowly towards the efficient allotment.
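For the square-root utilities (2), the Cournot iteration can be simulated directly: each best response is obtained from Lemma 2 by bisection on equation (6). The sketch below is our own implementation (the matrix convention s[j][i] and all names are assumptions); started from the profile used in Example 3 below, it approaches the efficient allotment {49; 41; 25}:

```python
import math

def best_response(i, s, r, R, beta_t):
    """Best response of agent i (Lemma 2): bisection on equation (6).
    s[j][k] is the request of agent k concerning the amount for agent j."""
    n = len(r)
    x_minus = [sum(s[j][k] for k in range(n) if k != i) / (n - 1)
               for j in range(n)]          # opponents' averaged requests
    T = sum(x_minus)

    def F(d):  # equation (6) rearranged as F(d) = 0; F is decreasing in d
        A = max(0.0, d + T - R)
        lhs = 1.0 / (2.0 * math.sqrt(r[i] + d / n + x_minus[i]))
        return lhs - 2.0 * beta_t * (n - 1) / n * ((n - 1) * d + A)

    lo = -n * (r[i] + x_minus[i]) + 1e-9   # keeps the sqrt argument positive
    hi = 1e6
    for _ in range(200):
        mid = (lo + hi) / 2.0
        lo, hi = (mid, hi) if F(mid) > 0 else (lo, mid)
    d_ii = (lo + hi) / 2.0
    A = max(0.0, d_ii + T - R)
    br = [x_minus[j] + A / (1 - n) for j in range(n)]
    br[i] = x_minus[i] + d_ii
    return br

def cournot(s, r, R, beta_t, steps):
    """Iterate simultaneous best responses (the Cournot dynamics)."""
    n = len(r)
    for _ in range(steps):
        new = [best_response(i, s, r, R, beta_t) for i in range(n)]
        s = [[new[i][j] for i in range(n)] for j in range(n)]
    return s

# Starting profile: at iteration 1, everything is requested for agent 1.
r, R, beta_t = [1, 9, 25], 115, 0.00025   # beta_t = beta*(n-alpha-1)/(n-1)
s = [[R, R, R], [0, 0, 0], [0, 0, 0]]
s = cournot(s, r, R, beta_t, 300)
x = [sum(row) / len(r) for row in s]
print([round(xi, 2) for xi in x])          # expected to approach [49, 41, 25]
```

One can verify against (6) that the equilibrium $s^*$ of Assertion 1 is a fixed point of this iteration; convergence from a given start is an empirical matter, as Section 4 discusses.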
We illustrate the results of this section using several examples.
Example 3. Mechanism operation under the Cournot dynamics.
Revert to the allotment problem statement from Example 1 with the parameters from Example 2: $\alpha = 1$ and $\beta = 0.0005$. It is clear from Examples 1 and 2 that such parameter values make the equilibrium disagreement penalties ($\approx 1.67$) "comparable" with the utility of any agent under the efficient resource allotment ($\approx 7.07$). Figure 1 shows the dynamics of the payoffs the agents would obtain if the negotiations finished at the corresponding iterations, and of their requests for agent 1, if at iteration 1 each agent requests to give all available resources to agent 1.
The parameters $\alpha = 1$ and $\beta = 0.0005$ satisfy the condition of Lemma 4 in the segment of the domain $\bar S$ where any agent receives at least $31 - r_i$ resource units.
Starting from iteration 8, for each agent the amount of received resources deviates from the optimal one by at most 1 resource unit. The optimal allotment is achieved at iteration 39 only. At any subsequent iteration the agents do not modify their requests.
Fig. 1. The payoffs of agents and their requests for agent 1.
The Cournot dynamics is one of the basic decision-making models, see the discussion above. It can be a priori "programmed" into artificial multi-agent systems [15], but decision-making by real people is hardly described by this dynamics [17]. A rational subject can follow this behavioral model only if it increases its payoff. We give an example of agents' rational decision-making where the efficient resource allotment cannot be achieved during iterative negotiations.
Example 4. Agents' refusal to follow the Cournot dynamics.
Within the model of the previous examples, if the resources are equally shared by all agents, agent 3 ($r_3 = 25$) gains the utility $\approx 7.95$, which is higher than under the efficient resource allotment ($\approx 7.07$). Agents 1 ($r_1 = 1$) and 2 ($r_2 = 9$) have smaller utilities than under the efficient resource allotment. In other words, agent 3 would prefer equal sharing to the efficient allotment, whereas agents 1 and 2 would choose the opposite.
Consider the dynamics of agents’ requests obtained in Example 3, where at
iteration 1 each agent claims all available resources for itself. It follows from (3)
that in this case resources are equally shared by all agents. By virtue of formula
(5), each agent has zero transfer. And the payoff of each agent coincides with the
utility from gained resources.
Imagine that agent 3 acts according to the Cournot dynamics, at each iteration
choosing its request as the best response to the profile at the preceding iteration.
Then its payoff in the game (utility minus transfers) decreases at each iteration
except iteration 1, see Fig. 1. Moreover, its expected payoff under best response
is always smaller than the payoff realized at a corresponding iteration. On the
contrary, the payoffs of two other agents increase.
Therefore, agent 3 can be motivated not to follow the Cournot dynamics as its behavioral strategy. In particular, the agent may refuse to reduce the request for itself, claiming all available resources at each iteration.
Figure 2 presents the trajectories of the agents' payoffs and of the resource amounts received by agent 1 in the case when agent 3 submits the same request $s_{33} = R$ at all iterations but minimizes its transfer by choosing the requests $s_{13}$ and $s_{23}$ according to the Cournot dynamics.
Agents 1 and 2 follow the Cournot dynamics up to iteration 10. At iteration 2, the payoff of agent 3 goes down appreciably, whereas the payoffs of agents 1 and 2 demonstrate growth. But starting from iteration 3 and right up to iteration 15, the payoffs of agents 1 and 2 are smaller than at iteration 1. Moreover, since iteration 13, agents 1 and 2 have "equilibrium" requests: each agent benefits nothing from a deviation provided that agent 3 does not vary its request.⁶ For agent 3, the request $s_{33} = R$ is not a component of its best response at all iterations right up to iteration 14: $R \ne br_{33}(s_{-3}(\tau - 1))$.

At iteration 15, agent 2 rejects the behavioral model based on the Cournot dynamics, being motivated by the fact that its payoff has continuously decreased (in contrast to agent 1, whose payoff demonstrated small growth). As we have mentioned, after iteration 13 the agents do not vary their requests. Therefore, at iteration 15 agent 2 reverts to the request for resources it demanded at the starting iteration: $s_{22}(15) = R$. And then the agent does not vary $s_{22}$, following the Cournot dynamics only in order to determine the requests $s_{12}$ and $s_{32}$ which minimize its transfer. At iteration 15, agent 2 slightly loses in its payoff, but at iteration 16 and subsequently agent 2 gains more than by choosing best response strategies as at the earlier iterations.

⁶The accuracy of modeling is $10^{-3}$.
Starting from iteration 15, only agent 1 chooses its request as the best response to the opponents' requests. Agents 2 and 3 request negative amounts of resources for agent 1, minimizing their own transfers.
Fig. 2. The payoffs of agents and their requests for agent 1 under successive
rejection of the Cournot dynamics by agents.
However, since iteration 20 the amount of resources agent 1 can obtain and its best response do not vary appreciably, while its payoff approximates 4.85. Starting from iteration 23, agent 1 cannot increase its payoff by request variations within the Cournot dynamics (under the above accuracy of analysis). If agent 1 rejects the Cournot dynamics, it can enlarge its payoff. In our example, at iteration 25 agent 1 stops following the best response and requests all available resources for itself. And then this agent does not vary $s_{11}$, following the Cournot dynamics and choosing only the requests $s_{21}$ and $s_{31}$ which minimize its transfer. In this case, if all agents do not vary their requests for themselves, but choose the requests for the rest of the agents by minimizing their own transfers, then starting from iteration 34 the requests and payoffs of all agents and the amounts of resources allotted to them coincide with those at iteration 1.

At all iterations except iteration 2, the payoff of agent 3 is higher than under the efficient resource allotment. Hence, the strategy chosen by this agent can be considered rational (more specifically, boundedly rational). This behavioral strategy can be characterized as "intractability."

The total payoff of the agents appears smaller than the maximum payoff, which seems natural.
Acting within the framework of the "robust" approach, we claim that Example 4 illustrates the following fact: the iterative negotiation process does not guarantee the efficient resource allotment under fairly rational behavioral hypotheses of the agents.
5. DIMENSIONALITY REDUCTION FOR THE AGENT MESSAGE SPACE
The suggested mechanism requires that agents report the complete vector of
resource allotment. This can be difficult in practice, especially if the number of
agents is large. Furthermore, agents can benefit from complete-vector reporting, i.e., cooperate with each other by submitting coordinated requests. Particularly, in some situations society is divided into two groups. Their interaction can accordingly be considered as a two-agent game, and efficient allotment implementation possibly fails under balanced payments: it may happen that $\tilde{\beta} = 0$, and then the two groups have no stimuli to reach an agreement. Consequently, we study the feasibility of eliminating cooperation among agents via coordination of their requests.
The above results on the best response functions of agents (see Section 3) motivate us to design a modification of the suggested mechanism, where each agent reports only the amount of resources desired for itself: $\hat{\rho} = \langle \hat{S}, \hat{\pi}, \hat{t} \rangle$, where $\hat{S}_i \subseteq \mathbb{R}$.
As before, denote by $s_{ii}$ the request of agent $i$, $i \in N$.
According to Assertion 1, in an equilibrium the procedure $\hat{\pi}: \mathbb{R}^n \to X$ yields the following amount of resources for each agent:
$$\hat{x}_i = s_{ii} - \frac{\sum_{i \in N} s_{ii} - R}{n}. \qquad (10)$$
Suppose that resources will be allotted in this way for all admissible requests such that $\sum_{i \in N} s_{ii} - R \ge 0$. Without deficit, each agent receives exactly what it requests: $\forall i \in N$: $x_i = s_{ii}$.
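The allotment rule above is simple enough to state in code. The sketch below is illustrative only (the function name is ours), assuming the deficit, if any, is shared equally:

```python
def allot(requests, R):
    """Allotment rule (10): each agent receives its own request minus
    an equal share of the total deficit (if any)."""
    n = len(requests)
    deficit = sum(requests) - R
    if deficit <= 0:
        # no deficit: every agent receives exactly what it requested
        return list(requests)
    return [s - deficit / n for s in requests]
```

For instance, with $R = 115$ and requests $(60, 40, 30)$ the deficit of 15 units is split equally, giving $(55, 35, 25)$.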
Let each agent pay $\hat{t}_i = \hat{\beta}\,(s_{ii} - x_i)^2$. Obviously, all agents have identical payments, and balancing them is inadmissible (otherwise, the real payment of each agent vanishes). Then the following result is true.
Lemma 5. The mechanism $\hat{\rho} = \langle \hat{S}, \hat{\pi}, \hat{t} \rangle$ implements the efficient resource allotment as a unique Nash equilibrium of the induced game $\Gamma(\hat{\rho})$ of agents.
Lemma 5 allows establishing the equivalence of the mechanisms $\rho = \langle S, \pi, t \rangle$ and $\hat{\rho} = \langle \hat{S}, \hat{\pi}, \hat{t} \rangle$. In fact, we say that (resource allotment) mechanisms are equivalent if they implement an identical resource allotment for any preference profile of the agents.
Assertion 3. Consider the mechanism $\rho = \langle S, \pi, t \rangle$ defined by the parameters $\beta > 0$, $\alpha < 1$. The mechanism $\hat{\rho} = \langle \hat{S}, \hat{\pi}, \hat{t} \rangle$, where
$$\hat{\beta} = \beta\,\frac{n - \alpha - 1}{n(n-1)^2},$$
1) is equivalent to the mechanism $\rho = \langle S, \pi, t \rangle$;
2) has equilibrium requests of agents coinciding with the equilibrium requests of agents for themselves in the mechanism $\rho = \langle S, \pi, t \rangle$.
Therefore, for some parameters of the mechanism $\rho = \langle S, \pi, t \rangle$, it is possible to design the equivalent mechanism $\hat{\rho} = \langle \hat{S}, \hat{\pi}, \hat{t} \rangle$, where all agents report only requests for themselves. But this mechanism admits no balancing and, in contrast to the mechanism $\rho = \langle S, \pi, t \rangle$, it is not efficient. Yet, the equivalent mechanism satisfies the same relationship between the absolute payments of agents and the penalty strength:
$$\hat{p}_i(\hat{s}^*) = \frac{n(n-1)}{\hat{\beta}}\left(\frac{\partial u_i}{\partial x_i}\right)^2(\hat{x}_i(\hat{s}^*)).$$
In other words, agents' payments can be made arbitrarily small, and the mechanism appears almost efficient, see [29].
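Taking the stated relationship at face value, a small sketch (with hypothetical numeric inputs) shows that the payments scale as $1/\hat{\beta}$, so increasing $\hat{\beta}$ makes them arbitrarily small:

```python
def equilibrium_payment(beta_hat, marginal_utility, n):
    # the stated relationship between absolute payments and penalty
    # strength: p_i = n(n-1)/beta_hat * (u_i')^2; the inputs here
    # are hypothetical, for illustration only
    return n * (n - 1) / beta_hat * marginal_utility ** 2
```

Doubling $\hat{\beta}$ halves every payment, so the payments vanish as $\hat{\beta}$ grows.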
However, due to nonzero payments, the mechanism $\hat{\rho} = \langle \hat{S}, \hat{\pi}, \hat{t} \rangle$ does not guarantee the individual rationality of a resulting solution for all agents.
Using the iterative negotiation process $I_\rho$, it is possible to suggest a “reduced” iterative negotiation process $\hat{I}_\rho$, where at each iteration agents submit only the requests for themselves $s_{ii}(\tau)$, $i \in N$, with $\forall i \in N$: $s_{ii}(1) \le R$. For a given agent, its requests for the rest agents are defined as follows.
At iteration 1, if the submitted requests of agents cannot be satisfied, an agent offers allotting all residual resources (remaining after satisfaction of its own request) equally among all other agents. That is, if
$$\sum_{i \in N} s_{ii}(1) > R,$$
then $\forall i \in N$, $\forall j \in N\setminus\{i\}$: $s_{ji}(1) = \frac{R - s_{ii}(1)}{n-1}$.
Whenever the system has no resource deficit, by assumption all agents agree with the submitted requests. That is, under
$$\sum_{i \in N} s_{ii}(1) \le R,$$
$\forall i \in N$, $\forall j \in N\setminus\{i\}$: $s_{ji}(1) = s_{jj}(1)$.
At any iteration $\tau > 1$, for each agent its requests for the other agents are calculated as its best response to the profile at the previous iteration in the game $\Gamma(\rho)$, i.e., $\forall j \in N\setminus\{i\}$: $s_{ji}(\tau) = br_{ji}(s_{-i}(\tau-1))$, provided that $br_{ii}(s_{-i}(\tau-1)) = s_{ii}(\tau)$.
In other words, if
$$s_{ii}(\tau) + \sum_{j \in N\setminus\{i\}} x_{j-i}(\tau-1) < R,$$
then $s_{ji}(\tau) = x_{j-i}(\tau-1)$. Otherwise,
$$s_{ji}(\tau) = x_{j-i}(\tau-1) - \frac{1}{n-1}\,A_i,$$
where $A_i = s_{ii} + \sum_{j \in N\setminus\{i\}} x_{j-i} - R$.
Within the described iterative negotiation process, the Cournot dynamics implies that at each iteration an agent chooses its request as the solution of equation (6), taking the requests of all agents for itself as the profile $s_{i-i}(\tau-1)$.
Obviously, if the iterative process $I_\rho$ ensures convergence to the Nash equilibrium in the game $\Gamma(\rho)$ with the Cournot dynamics, then the iterative process $\hat{I}_\rho$ does so, too.
The suggested iterative process $\hat{I}_\rho$ eliminates the feasibility of agents' cooperation via coordination of their requests, since each agent chooses only the request “for itself,” and the mechanism reallots the resulting deficit equally among all other agents. But it remains vulnerable to the behavior illustrated by Fig. 2, when some (or even all) agents do not follow the Cournot dynamics. Moreover, in the situation of Example 4, agent 3 does not face the threat of cooperation between agents 1 and 2.
6. EXPERIMENTAL TESTING
For experimental testing of the obtained theoretical results and the computer simulation results, we have constructed a business game and an information system for conducting it in z-Tree [16].
This section describes a series of games played to test the suggested mechanism $\rho$ in maximally free conditions and with different participants:
– the information system simultaneously implements both iterative negotiation processes $I_\rho$ and $\hat{I}_\rho$, and players choose freely between them during the game;
– players can communicate with each other during the experiments (games).
Players face the following situation resembling reality: student players allot the available time of a tutorial with their tutor; the payoff of each player is his/her exam mark and depends monotonically on the tutorial time of this player according to the function (2), where the type $r_i$ of player $i$ characterizes his/her initial “knowledge” (a parameter known to this agent only).
The game parameters are adjusted so that the Pareto optimal payoff of each agent makes up approximately 7 marks in the 10-mark grading system (4 marks in the 5-mark grading system). For a higher (excellent) mark, a player has to “negotiate” more resources than in the optimal allotment.
The experimental testing has been organized in the following way:
1. Theoretical explanation of the game and the resource allotment mechanism
applied.
2. A learning game with the feasibility of discussing any obscure situation with the game leader.
3. A series of real games.
4. Announcement of the results.
Game organizers divide all participants into groups (3 or 5 persons), and each group plays its own game. If participants can be divided into several groups, then prior to each subsequent game the current division is changed randomly. Numbers (and, accordingly, types) are assigned to players randomly prior to each game and remain invariable during this game. Therefore, the type of a person participating in several games may vary from one game to another.
Game participants are MIPT students from Department of Radio Engineering
and Cybernetics (DREC) and Department of Informational Business Systems
(DIBS), as well as researchers from Trapeznikov Institute of Control Sciences
(ICS RAS). Games are played independently with MIPT students from DREC
(bachelor’s degree program), MIPT students from DIBS (master’s degree program)
and researchers from ICS RAS.
Players receive printed instructions describing the game proper and its information
system. The full-text version of the game instruction is available at http://www.mtas.ru/games/gl.
The “basic” game is the one studied in Example 2: $n = 3$ players allot the tutorial time $R = 115$; the individual knowledge (the types of agents) belongs to the set $r = \{1; 9; 25\}$ so that the types of all players differ. The mechanism parameters $\alpha = 1$ and $\beta = 0.0005$ are chosen according to the following considerations. First, the equilibrium penalties of players ($\approx 1.67$, see Example 2) are comparable with the utility they gain in the optimal resource allotment ($\approx 7.07$, see Example 2). Second, these parameters of the mechanism meet the requirements of Assertion 2 for a wide range of resource allotment vectors, as shown in Example 3.
Each player knows all the above-mentioned parameters except the precise types of the opponents. Moreover, a player does not know a priori which participants of the experiment play the same game with him/her. Such information can be obtained during the game, since communication between participants is not restricted.
A game in a group terminates if all players no longer vary their requests, or after iteration 100. The player's payoff is defined at the last iteration.
Although the Cournot dynamics does not yield a negative amount of resource for any player, in a real situation this possibility cannot be ruled out. Therefore, we have implemented a punishment system imposing a high penalty (10 000 points) on a player with a negative amount of resources and on a player whose request has predetermined such an allotment.
At each iteration, a player chooses between acting by the mechanism $I_\rho$, choosing all components of his/her request independently (the complete resource allotment vector), or by the mechanism $\hat{I}_\rho$, choosing merely the component of the request for him/herself and delegating the choice of the other components to the system.
In addition to the “basic” game, the experiment involves two modifications of it:
1. “5 players,” where $r = \{1; 9; 25; 1; 9\}$ and $R = 205$.
2. “Unbalanced,” where $\alpha = 0$.
Table 1 demonstrates the total number of games of each type and the total
number of iterations.
Table 1. The types of games

The type of game   The values of the parameters (α, β, n)   The number of games   The total number of iterations
Basic              (1; 0.0005; 3)                           6                     553
5 players          (1; 0.0005; 5)                           3                     123
Unbalanced         (0; 0.0005; 3)                           10                    93
This series of experiments has yielded the following outcomes.
1. The Cournot dynamics has not been observed in any of the games.
2. The efficient (and almost efficient) resource allotment has been achieved
in 8 games (“basic”—1, “5 players”—1, “unbalanced”—6), in all cases owing to full
cooperation among agents.
3. Cooperative behavior has been observed in 13 games (“basic”—3, “5 players”—3, “unbalanced”—7), including full coalition formation in 12 cases. Games without cooperative behavior will be called noncooperative, while the rest will be called cooperative: (non)cooperative “basic” games, and so on. In one (“unbalanced”) game, two coalitions have been formed (two players against one).
4. The average number of iterations has been 14.95, while the average numbers of iterations in the different types of games have been 33 (“basic”), 15 (“5 players”) and 4.1 (“unbalanced”).
Noncooperative behavior has been observed merely in 6 games: 3 “basic” and 3 “unbalanced” ones. But these games differ appreciably even in the average number of iterations, and so their comparison makes no sense.
Let us compare the average requests of players for themselves at the last iterations (at the end of the games) with their Nash equilibrium counterparts, i.e., analyze the deviation between the average requests of players for themselves and the Nash equilibrium requests at the end of the games. Table 2 contains the average requests of players (types 1, 2 and 3) in the 3 noncooperative “basic” games. Here the requests of players for themselves are shown in bold type, and the Nash equilibrium requests are shown after the symbol “/”. Clearly, the divergence is considerable and even has noncoinciding signs for different-type players. For instance, the average requests of players 1 and 2 for themselves are smaller than the Nash equilibrium requests, while the average request of player 3 exceeds its Nash equilibrium analog.
Table 2. The average requests of players for themselves at the last iteration in “basic” noncooperative games

               The request of player 1   The request of player 2   The request of player 3
for player 1   63/72.6                   42.8/37.2                 29.6/37.2
for player 2   32/29.2                   50.3/64.6                 12.05/29.2
for player 3   20/13.2                   21.9/13.2                 61.7/48.6
Now, consider the average resource allotments at the last iterations and compare them with the efficient resource allotment and the equal-sharing allotment. According to Table 3, the average resource allotment approaches the efficient one, except for player 2, and in sum is even smaller than the total amount of 115 units.
Table 3. The game-average amounts of resources obtained by players at the last iteration in noncooperative “basic” games

                The amount of player 1   The amount of player 2   The amount of player 3
Equal sharing   38 1/3                   38 1/3                   38 1/3
Average         45.1                     31.5                     34.5
Efficient       49                       41                       25
Table 3 shows that player 2 obtains appreciably less resources than under both equal sharing and the efficient allotment. The next table illustrates how this affects his/her payoff. Actually, Table 4 compares the game-average payoffs of players at the last iteration. The requests and obtained amounts of resources differ considerably from their equilibrium counterparts, but the payoffs of players are not far from the efficient ones.
18
Table 4. The game-average payoffs of players at the last iteration in noncooperative “basic” games

                The payoff of player 1   The payoff of player 2   The payoff of player 3   Total
Equal sharing   6.3                      6.9                      7.9                      21.1
Average         6.8                      6.5                      7.4                      20.7
Efficient       7                        7                        7                        21.2
Interestingly, players of type 2 gain less than players of types 1 and 3. This is also the case for noncooperative “unbalanced” games.
For statistical verification of the distinctions in the behavior of different-type players, we have conducted the Kruskal–Wallis test [25] for 3 groups, i.e., the requests for themselves of the agents of types 1–3 in “basic” games over all iterations. Under the threshold 0.05, the test has resulted in a p-value of order $10^{-5}$. This means that the requests for themselves of different-type agents in “basic” games are statistically dissimilar.
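For reference, the Kruskal–Wallis H statistic used in this test can be computed directly; the sketch below (with synthetic data, not the experimental data, and assuming no tied observations) illustrates the procedure:

```python
def kruskal_wallis_h(*groups):
    """Kruskal-Wallis H statistic (assuming no tied observations)."""
    pooled = sorted(x for g in groups for x in g)
    rank = {x: i + 1 for i, x in enumerate(pooled)}
    N = len(pooled)
    h = 0.0
    for g in groups:
        mean_rank = sum(rank[x] for x in g) / len(g)
        h += len(g) * (mean_rank - (N + 1) / 2) ** 2
    return 12.0 / (N * (N + 1)) * h

# three synthetic groups with clearly different locations
h = kruskal_wallis_h([60, 62, 65, 63, 61, 64],
                     [50, 48, 52, 51, 49, 47],
                     [70, 72, 69, 71, 73, 68])
# h exceeds the chi-square critical value 5.99 (2 degrees of freedom,
# threshold 0.05), so these three samples are statistically dissimilar
```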
Next, we present the trajectories of the average requests of players for themselves $s_{ii}$ over all iterations for cooperative and noncooperative “basic” games, see Fig. 3. Here the solid line corresponds to the average requests over all games, while the dashed lines correspond to the average requests in concrete games. The overall average request over all iterations is written on the trajectory. According to Fig. 3a, two games have terminated within 30 iterations.
Fig. 3. The trajectories of the average requests of players for themselves: (a) noncooperative and (b) cooperative “basic” games. The overall average request: 65.4.
Note that in noncooperative games the average request of players for themselves appears higher, whereas the evolution of separate games differs. In “unbalanced” games, the average request of players for themselves is also greater in the noncooperative case. Perhaps this is caused by the “struggle” of players for their amounts of resources in the noncooperative case.
The absence of the Cournot dynamics in the behavior of real agents during the whole game could be anticipated. Using the experiment, we have tried to answer whether this behavior appears locally (at some iterations) or not.
For this, let us explore the requests of all players at all iterations and their
conformity with the following behavioral models:
•“Stationary requests”;
•“Indicator behavior” [3];
•“Best response.”
To elucidate these models, we introduce some additional notions. The direction of a vector is the vector of the signs of its components. For player $i$ and iteration $\tau$, the direction of variation of a request vector $s_i(\tau)$ is the direction of the vector $\Delta_i(\tau) = s_i(\tau) - s_i(\tau-1)$. Define
$$s_i(\tau, \gamma) = s_i(\tau-1) + \gamma\,\big[br_i(s_{-i}(\tau-1)) - s_i(\tau-1)\big].$$
The class “Stationary requests” includes the following formal behavioral models:
– “Stationary (C)”: $\{s_i(\tau) = s_i(\tau-1)\}$;
– “Almost stationary (C(0.1))”: $\{s_i(\tau) \mid s_i(\tau) = s_i(\tau,\gamma),\ -0.1 < \gamma < 0.1\}$, i.e., the relative variation of the player's request vector does not exceed 10% of the distance to the best response in each coordinate.
The class “Indicator behavior” includes the following formal behavioral models:
– “Indicator behavior (IB)”: $\{s_i(\tau) \mid s_i(\tau) = s_i(\tau,\gamma),\ 0 < \gamma \le 1\}$; this is the classical definition of indicator behavior [3];
– “Towards best response (IB+)”: $\{s_i(\tau) \mid s_i(\tau) = s_i(\tau,\gamma),\ 0 < \gamma\}$, i.e., the player's request moves towards the best response, but can be arbitrarily far from it.
The class “Best response” includes the following formal behavioral models:
– “Best response with accuracy $\varepsilon$ (BR($\varepsilon$))”: $\{s_i(\tau) \mid s_i(\tau) = s_i(\tau,\gamma),\ 1-\varepsilon < \gamma \le 1+\varepsilon\}$. Actually, BR(0) is the Cournot dynamics: $s_i(\tau) = br_i(s_{-i}(\tau-1))$;
– “Best response for the player's request for him/herself with accuracy $\varepsilon$ (BR$_i$($\varepsilon$))”: $\{s_i(\tau) \mid s_{ii}(\tau) = s_{ii}(\tau,\gamma),\ 1-\varepsilon < \gamma \le 1+\varepsilon\}$;
– “Best response for the player's rest requests (BR$_{-i}$($\varepsilon$))”: $\{s_i(\tau) \mid s_{-ii}(\tau) = s_{-ii}(\tau,\gamma),\ 1-\varepsilon < \gamma \le 1+\varepsilon\}$.
We consider $\varepsilon \in \{0; 0.1; 0.2\}$: “0” corresponds to the accurate models, while the other parameter values describe the models “with accuracy.” The above choice of $\varepsilon$ is explained empirically by the desire to analyze small neighborhoods. Note that $\varepsilon = 0.2$ is not a small value, as it refers to the relative, not absolute, distance to the best response.
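The matching of a single request update against these classes can be sketched as follows (our own illustrative implementation; `fit_gamma` recovers the common step $\gamma$ if one exists):

```python
def fit_gamma(s_prev, s_cur, br, tol=1e-6):
    """Return the common step gamma with s_cur = s_prev + gamma*(br - s_prev)
    in every coordinate, or None if no single gamma fits."""
    gammas = []
    for p, c, b in zip(s_prev, s_cur, br):
        if abs(b - p) < tol:
            if abs(c - p) > tol:
                return None  # moved in a coordinate already at best response
            continue  # any gamma fits this coordinate
        gammas.append((c - p) / (b - p))
    if not gammas:
        return 0.0
    g = gammas[0]
    return g if all(abs(x - g) < tol for x in gammas) else None

def classify(s_prev, s_cur, br, eps=0.1):
    """Labels of the behavioral models matched by one request update."""
    labels = []
    g = fit_gamma(s_prev, s_cur, br)
    if g is not None:
        if -0.1 < g < 0.1:
            labels.append("C(0.1)")   # almost stationary
        if 0 < g <= 1:
            labels.append("IB")       # indicator behavior
        if g > 0:
            labels.append("IB+")      # towards best response
        if 1 - eps < g <= 1 + eps:
            labels.append("BR(%g)" % eps)
    if list(s_cur) == list(s_prev):
        labels.append("C")            # stationary
    return labels
```

For example, a half-step towards the best response matches IB and IB+, while reproducing the best response exactly matches BR(0.1) as well.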
Table 5 shows the correspondence of the requests of the studied games to the
classes and models of behavior.
Obviously, a good model elucidating the behavior of players has not yet been found: except for the class “(Almost) stationary,” which describes about 30% of all requests, none of the models exceeds 10% of occurrence. As a matter of fact, the dynamics in Example 4 agrees exactly with this behavioral model.
Table 5. Behavioral analysis

Class and model    All types     Type 1       Type 9       Type 25
Requests totally   553           183          185          185
C                  141 (25.5%)   44 (24.0%)   58 (31.4%)   39 (21.1%)
C(0.1)             191 (34.5%)   66 (36.1%)   75 (40.5%)   50 (27.0%)
IB                 36 (6.5%)     13 (7.1%)    8 (4.3%)     15 (8.1%)
IB+                58 (10.5%)    18 (9.8%)    17 (9.2%)    23 (12.4%)
BR(0)              0 (0.0%)      0 (0.0%)     0 (0.0%)     0 (0.0%)
BR(0.1)            0 (0.0%)      0 (0.0%)     0 (0.0%)     0 (0.0%)
BR(0.2)            2 (0.4%)      1 (0.5%)     0 (0.0%)     1 (0.5%)
BRi(0.1)           10 (1.8%)     1 (0.5%)     4 (2.2%)     5 (2.7%)
BRi(0.2)           18 (3.3%)     4 (2.2%)     5 (2.7%)     9 (4.9%)
BR−i(0.1)          2 (0.4%)      0 (0.0%)     0 (0.0%)     2 (1.1%)
BR−i(0.2)          3 (0.5%)      1 (0.5%)     0 (0.0%)     2 (1.1%)
In terms of their occurrence in the presented data, the behavioral models can be placed in the following descending order: “Towards BR,” BR$_i$(0.2), IB, “For BR.” The first and last models are too abstract, while the second and third ones testify that improving the player's requests for him/herself and indicator behavior play an appreciable role in the players' behavior. Interestingly, players of type $r = 25$ have the maximum number of requests coinciding with IB.
And finally, let us show the trajectories of agents' average requests for themselves in comparison with the best responses. In Fig. 4, light-colored columns correspond to the best responses, whereas dark-colored ones correspond to the players' average requests; the heights of these columns reflect the deviation of a request at the previous iteration from the best response and from the request at the current iteration. Deviations from the best responses are displayed in their absolute values. Dark-colored columns pointing down describe cases when a request at the current iteration moves “in the opposite direction” from the best response to the profile at the previous iteration. Clearly, players of each type occasionally act irrationally.
Fig. 4. The dynamics of players’ average requests for themselves in comparison
with best responses in noncooperative “Basic” games: averaging over (a) all types;
(b) the first type; (c) the second type; (d) the third type.
The requests not identified by the suggested models contain components varying away from the best response in the corresponding component with respect to the request at the previous iteration (i.e., irrational ones at all). We have not suggested an adequate model for such behavior so far. Perhaps players act “irrationally,” struggling for the resource regardless of penalties. Or they try to predict the opponents' behavior for several future iterations, i.e., apply reflexion. In the future, it is possible to study players' moves within the framework of reflexive models, e.g., [9]. A separate issue to be analyzed concerns games with cooperative behavior of more than two coalitions.
The next stage of mechanism testing lies in playing a series of games with the “reduced” mechanism implementing only the iterative process $\hat{I}_\rho$, which cuts off some undesired behavioral models (particularly, cooperation via coordinated requests).
7. CONCLUSION
The major result of this paper is that we have succeeded in suggesting an efficient mechanism for the resource allotment problem based on a single-criterion voting mechanism, using the approach pioneered in [7] (representing the resource allotment problem as a multi-criteria voting problem). The constructed mechanism guarantees the efficient resource allotment as a unique Nash equilibrium in the induced game of agents and maximizes the total utility of agents.
The original tools developed by us for playing experimental games allow testing the mechanism subject to different applied problems of resource allotment.
However, several issues remain open and require further theoretical and experimental study, in the first place:
1. Weakening of the restrictions on the class of agents' utility functions. At best, most applied problems satisfy the concavity condition. The paper [8], dedicated to the application of this mechanism to the active expertise problem, demonstrated that the positive results on the existence of the efficient solution as a Nash equilibrium in the agents' game can be extended to the class of piecewise linear utility functions. The dynamic properties of the mechanism were not explored there. Therefore, it seems interesting to extend the results derived in the current paper to the class of piecewise linear utility functions.
2. Analysis of the properties of the mechanism and the iterative negotiation processes under different behavioral hypotheses of agents, since this paper has examined only the “elementary” behavioral model (the Cournot dynamics). It has been shown that, besides the unique and efficient Nash equilibrium, the game may admit equilibria of other types, e.g., equilibria in safety strategies [6], which fail to guarantee the efficient resource allotment. In defense of the proposed mechanism, note that this issue is still underinvestigated in the design of efficient economic mechanisms.
3. Study of methods to stabilize the mechanism against cooperative behavior when players form several coalitions. The obtained results allow supposing that the worst-case situation for the mechanism is decomposition into two coalitions. The rest of the coalitional configurations seem to create no obstacles for the suggested mechanism. But this issue needs a detailed treatment, including comparison with cooperative models of resource allotment [5, 10].
APPENDIX
Comparison of the mechanism $\rho = \langle S, \pi, t \rangle$ with the Groves–Ledyard “quadratic government.”
Designate by $s_k = \{s_{ki}\}_{i \in N}$ the request vector of all agents with respect to the resource allotted to agent $k \in N$.
Clearly, then (3) can be rewritten as
$$x_k(s) = x_k(s_k) = \frac{1}{n}\sum_{i=1}^{n} s_{ki}.$$
And $\forall i \in N$ the expression (4) admits the representation
$$p_i(s) = \sum_{k=1}^{n} p_{ki}(s_k), \quad \text{where} \quad p_{ki}(s_k) = \beta\,\big(x_k(s_k) - s_{ki}\big)^2.$$
In turn, the expression (5) takes the form
$$t_i(s) = \sum_{k=1}^{n} t_{ki}(s_k), \quad \text{where} \quad t_{ki}(s_k) = p_{ki}(s_k) - \frac{\alpha}{n}\sum_{j=1}^{n} p_{kj}(s_k).$$
Introduce the notation $t_k = \{t_{ki}\}_{i \in N}$ and $p_k = \{p_{ki}\}_{i \in N}$. Then the mechanism $\rho = \langle S, \pi, t \rangle$ can be treated as a set of $n$ mechanisms $\{\rho_k\}_{k \in N}$, where $\rho_k = \langle \mathbb{R}^n, x_k(s_k), t_k \rangle$, bounded by the following constraints:
$$\sum_{k \in N} s_{ki} \le R \quad \forall i \in N.$$
Consider a separate mechanism $\rho_k = \langle \mathbb{R}^n, x_k(s_k), t_k \rangle$ under the condition $\alpha = 1$. Apply the change of variables which makes the description of the mechanism the same as in [19]:
$$m_i = \frac{s_{ki}}{n}, \quad m = \{m_i\}_{i \in N}, \quad \bar{m} = \frac{1}{n}\sum_{i \in N} m_i.$$
Hence, it appears that
$$x_k(m) = \sum_{i \in N} m_i, \quad \bar{m} = \frac{1}{n}\,x_k(m),$$
$$t_{ki}(m) = \beta n^2\Big((\bar{m} - m_i)^2 - \frac{1}{n}\sum_{j \in N}(\bar{m} - m_j)^2\Big).$$
This formula is equivalent to the Groves–Ledyard “quadratic government” expression in [19, p. 1491], taking into account Remark 8 and zero production costs of the public good ($q = 0$ in the notation accepted in [19]).
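The decomposition above can be checked numerically; in the sketch below (with our own illustrative names), the per-mechanism transfers $t_{ki}$ balance exactly when $\alpha = 1$, since $\sum_i t_{ki} = (1-\alpha)\sum_i p_{ki}$:

```python
def x_k(s_k):
    # amount allotted to agent k given the request column s_k = {s_ki}
    return sum(s_k) / len(s_k)

def p_k(s_k, beta):
    # penalties p_ki = beta * (x_k - s_ki)^2
    xk = x_k(s_k)
    return [beta * (xk - ski) ** 2 for ski in s_k]

def t_k(s_k, beta, alpha):
    # transfers t_ki = p_ki - alpha/n * sum_j p_kj
    n = len(s_k)
    p = p_k(s_k, beta)
    total = sum(p)
    return [pki - alpha / n * total for pki in p]
```

With $\alpha = 1$ the transfers for each $k$ sum to zero (balanced payments), while with $\alpha = 0$ they coincide with the penalties.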
Proof of Lemma 1. In fact, we have to demonstrate that $\forall i \in N$, $\forall s_{-i} \in S_{-i}$ there exists a unique maximum of $\varphi_i(s_i, s_{-i})$ in $s_i$.
Obviously, $\forall i \in N$ the functions $p_i(s)$ are strictly convex.
Now, show that for $n > \alpha + 1$ and $\forall i \in N$ the functions $t_i(s)$ are also strictly convex. Actually, $\forall j \in N$, $\forall s \in S$:
$$\frac{\partial t_i(s)}{\partial s_{ji}} = 2\beta\Big[(x_j - s_{ji})\Big(\frac{1}{n} - 1 + \frac{\alpha}{n}\Big) - \frac{\alpha}{n^2}\sum_{k \in N}(x_j - s_{jk})\Big].$$
So long as $\forall j \in N$: $\sum_{k \in N} s_{jk} = n x_j$, we have that
$$\frac{\partial t_i(s)}{\partial s_{ji}} = 2\beta\,(x_j - s_{ji})\Big(\frac{1}{n} - 1 + \frac{\alpha}{n}\Big) = \frac{n - \alpha - 1}{n - 1}\,\frac{\partial p_i(s)}{\partial s_{ji}}.$$
Since $n > 1$, for $n > \alpha + 1$ the signs of $\frac{\partial t_i(s)}{\partial s_{ji}}$ and $\frac{\partial p_i(s)}{\partial s_{ji}}$ do coincide, and any higher-order derivatives of the two functions are proportional with the coefficient $\frac{n - \alpha - 1}{n - 1}$.
And so, if $p_i(s)$ is strictly convex, then $t_i(s)$ is also strictly convex for $n > \alpha + 1$. Therefore, the function $\varphi_i(s) = u_i(\pi(s)) - t_i(s)$ is strictly concave, which testifies to the uniqueness of its maximum in $s_i$ $\forall s_{-i} \in S_{-i}$.
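The proportionality of the derivatives of $t_i$ and $p_i$ can also be verified by finite differences; a sketch under assumed parameter values ($n = 3$, $\alpha = 0.5$, $\beta = 0.01$, and an arbitrary profile $s$, all chosen for illustration):

```python
import copy

beta, alpha, n = 0.01, 0.5, 3  # assumed parameter values

def x(s):
    # s[k][i] is the request of agent i concerning agent k
    return [sum(row) / n for row in s]

def p_i(s, i):
    xs = x(s)
    return sum(beta * (xs[k] - s[k][i]) ** 2 for k in range(n))

def t_i(s, i):
    xs = x(s)
    total = 0.0
    for k in range(n):
        pki = beta * (xs[k] - s[k][i]) ** 2
        pk_sum = sum(beta * (xs[k] - s[k][j]) ** 2 for j in range(n))
        total += pki - alpha / n * pk_sum
    return total

def num_deriv(f, s, k, i, h=1e-6):
    # central finite difference in the coordinate s[k][i]
    s1 = copy.deepcopy(s); s1[k][i] += h
    s2 = copy.deepcopy(s); s2[k][i] -= h
    return (f(s1, i) - f(s2, i)) / (2 * h)

s = [[30.0, 40.0, 45.0], [35.0, 50.0, 30.0], [50.0, 25.0, 40.0]]
dt = num_deriv(t_i, s, k=1, i=0)
dp = num_deriv(p_i, s, k=1, i=0)
ratio = (n - alpha - 1) / (n - 1)  # = 0.75 for these parameters
```

The numeric derivatives satisfy `dt == ratio * dp` up to rounding, matching the proportionality coefficient $\frac{n-\alpha-1}{n-1}$.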
Proof of Lemma 2. For any agent $i \in N$, the best response choice (see Lemma 1) is a convex optimization problem with the Lagrange function
$$L_i(s_i) = \varphi_i(s_i, s_{-i}) + \lambda_i\Big(\sum_{j \in N} s_{ji} - R\Big).$$
Consequently,
$$\frac{\partial L_i(s_i)}{\partial s_{ii}} = u_i'(x_i)\,\frac{1}{n} - 2\tilde{\beta}(x_i - s_{ii})\Big(\frac{1}{n} - 1\Big) + \lambda_i,$$
$$\frac{\partial L_i(s_i)}{\partial s_{ji}} = -2\tilde{\beta}(x_j - s_{ji})\Big(\frac{1}{n} - 1\Big) + \lambda_i, \quad j \in N\setminus\{i\},$$
$$\frac{\partial L_i(s_i)}{\partial \lambda_i} = \sum_{j \in N} s_{ji} - R.$$
Hence, it follows that in the optimal solution $\Delta_{ji}$ is the same for all $j \in N\setminus\{i\}$, as far as
$$\Delta_{ji} = br_{ji}(s_{-i}) - x_{j-i} = \frac{n^2}{2\tilde{\beta}(n-1)^2}\,\lambda_i.$$
1. Suppose that the problem has an internal solution:
$$\sum_{j \in N} br_{ji}(s_{-i}) \le R, \quad \lambda_i = 0.$$
Then $\forall j \in N\setminus\{i\}$: $\Delta_{ji} = 0$, which implies that $br_{ji}(s_{-i}) = x_{j-i}$. Having in mind that $\Delta_{ii} = br_{ii}(s_{-i}) - x_{i-i}$, we obtain that
$$\sum_{j \in N} br_{ji}(s_{-i}) = \Delta_{ii} + \sum_{j \in N} x_{j-i} \le R.$$
That is, $A_i = 0$. Therefore, $br_{ii}(s_{-i})$ is defined by solving the equation $u_i'(x_i) = 2\tilde{\beta}(x_i - s_{ii})(1 - n)$ or the equivalent equation
$$u_i'\Big(\frac{1}{n}\,\Delta_{ii} + x_{i-i}\Big) = 2\tilde{\beta}\,\frac{(n-1)^2}{n}\,\Delta_{ii}.$$
And the conclusion follows.
2. Assume that the problem has a boundary solution: $\sum_{j \in N} br_{ji}(s_{-i}) = R$, $\lambda_i < 0$.
Taking into account that $\sum_{j \in N\setminus\{i\}} \Delta_{ji} = R - \Delta_{ii} - \sum_{j \in N} x_{j-i}$, we obtain that $\sum_{j \in N\setminus\{i\}} \Delta_{ji} < 0$, so long as $\Delta_{ii} > 0$ (this is immediate from the properties of the class $U$). This means that
$$\Delta_{ii} + \sum_{j \in N} x_{j-i} - R = \frac{n^2}{2\tilde{\beta}(1-n)}\,\lambda_i,$$
and
$$\Delta_{ii} + \sum_{j \in N} x_{j-i} \ge R.$$
Therefore,
$$u_i'\Big(\frac{1}{n}\,\Delta_{ii} + x_{i-i}\Big) = 2\tilde{\beta}\,\frac{n-1}{n}\,\big((n-1)\Delta_{ii} + A_i\big),$$
and $\forall j \in N\setminus\{i\}$: $\Delta_{ji} = A_i/(1-n)$, where
$$A_i = \Delta_{ii} + \sum_{j \in N} x_{j-i} - R.$$
And the conclusion follows.
Proof of Assertion 1.
1. Let us establish that $\forall u \in U$ the fixed point satisfies the condition $\forall i \in N$: $\sum_{j \in N} s^*_{ji} = R$. Clearly, only in this case we have $\sum_{i \in N} x_i(s^*) = R$.
The fixed point meets $BR(s^*) = s^*$ and $\pi(BR(s^*)) = \pi(s^*)$, which is equivalent to the system of equalities
$$\sum_{i \in N} br_{ji}(s^*_{-i}) = n\,x_j(s^*), \quad j \in N.$$
Hence, it appears that
$$\sum_{j \in N}\sum_{i \in N} br_{ji}(s^*_{-i}) = n\sum_{j \in N} x_j(s^*). \qquad (A.1)$$
Show that the last equality holds true only under
$$\sum_{j \in N} x_j(s^*) = R.$$
The definition of $\Delta_{ji}$ implies that $\forall \{i,j\} \in N^2$:
$$br_{ji}(s^*_{-i}) = x_j(s^*) + \frac{n-1}{n}\,\Delta_{ji}.$$
According to Lemma 2, $\forall i \in N$:
$$\sum_{j \in N} br_{ji}(s^*_{-i}) = \sum_{j \in N} x_j(s^*) + \frac{n-1}{n}\,(\Delta_{ii} - A_i).$$
And so,
$$\sum_{j \in N}\sum_{i \in N} br_{ji}(s^*_{-i}) = n\sum_{j \in N} x_j(s^*) + \frac{n-1}{n}\sum_{i \in N}(\Delta_{ii} - A_i).$$
In other words, it follows from (A.1) that the fixed point satisfies the condition
$$\sum_{i \in N}(\Delta_{ii} - A_i) = 0.$$
If $\sum_{i \in N} x_i(s^*) < R$, then there exists a nonempty subset
$$K = \Big\{k \in N : \sum_{j \in N} s^*_{jk} < R\Big\} \subseteq N.$$
This means that the best response problem possesses an internal solution for the agents belonging to $K$. According to Lemma 2, $\forall l \in K$: $A_l = 0$, and $\forall i \in N\setminus K$:
$$A_i = \Delta_{ii} + \sum_{j \in N} x_{j-i} - R.$$
Consequently,
$$\sum_{i \in N}(\Delta_{ii} - A_i) = \sum_{l \in K} \Delta_{ll} + \sum_{i \in N\setminus K}\Big(R - \sum_{j \in N} x_{j-i}\Big).$$
The properties of the class $U$ dictate that $\forall i \in N$: $\Delta_{ii} > 0$. Since the solution is internal, we have that $\forall i \in N\setminus K$:
$$R - \sum_{j \in N} x_{j-i} > 0.$$
Thus, if $\sum_{i \in N} x_i(s^*) < R$ for $s^* \in S$, then
$$\sum_{i \in N}(\Delta_{ii} - A_i) > 0,$$
which is inadmissible for a fixed point.
If $\sum_{i \in N} x_i(s^*) = R$, then $\forall i \in N$: $A_i = \Delta_{ii}$ and
$$\sum_{i \in N}(\Delta_{ii} - A_i) = 0.$$
In other words, the condition (A.1) holds true only under $\sum_{i \in N} x_i(s^*) = R$.
2. Now, demonstrate that the Nash equilibrium requests of agents define the efficient resource allotment. Since $\forall i \in N$: $A_i = \Delta_{ii}$, due to (7) any fixed point $s^*$ satisfies the condition
$$\sum_{i \in N} u_i'(x_i(s^*)) = 2\tilde{\beta}(n-1)\sum_{i \in N} \Delta_{ii} = 2\tilde{\beta}\,n\Big(\sum_{i \in N} s^*_{ii} - R\Big).$$
It follows from Lemma 2 that $\forall i \in N$:
$$x_i = \frac{1}{n}\Big(n x_i + \frac{n-1}{n}\,\Delta_{ii} - \frac{1}{n}\sum_{j \in N\setminus\{i\}} \Delta_{jj}\Big).$$
Hence, $\forall i \in N$:
$$(n-1)\Delta_{ii} = \sum_{j \in N\setminus\{i\}} \Delta_{jj}.$$
This system possesses a unique solution as a system of $n$ linearly independent equations with $n$ variables. The solution has the following form: $\forall \{i,j\} \in N^2$: $\Delta_{ii} = \Delta_{jj}$.
Taking into account that $\Delta = \sum_{i \in N} s^*_{ii} - R$, we obtain that $\forall i \in N$:
$$\Delta_{ii} = \frac{1}{n-1}\,\Delta. \qquad (A.2)$$
Consequently, $\forall i \in N$:
$$u_i'(x_i(s^*)) = 2\tilde{\beta}\Big(\sum_{i \in N} s^*_{ii} - R\Big).$$
Therefore, any set of Nash equilibrium requests $s^*$ of agents meets the equality
$$\sum_{i \in N} x_i(s^*) = R$$
and $\forall i \in N$: $u_i'(x_i(s^*)) = \lambda$, where $\lambda = 2\tilde{\beta}\Delta$.
In other words, $\pi(s^*)$ is a solution of the problem (1).
3. Let us prove the uniqueness of the Nash equilibrium. The problem (1) is convex and admits a unique solution, i.e., $\lambda$ and $x_i(s^*)$, $i \in N$, are uniquely defined. Thus, $s^*$ is also unique, as the condition (A.2) implies that $\forall i \in N$:
$$s^*_{ii} = x_i(s^*) + \frac{\lambda}{2\tilde{\beta}n}.$$
4. And finally, we show how $\Delta$ can be defined in an equilibrium.
That $\forall i \in N$: $u_i'(x_i(s^*)) = 2\tilde{\beta}\Delta$ yields $x_i(s^*) = u_i'^{-1}(2\tilde{\beta}\Delta)$. The properties of the class $U$ dictate that $\forall u \in U$ and $\forall i \in N$ the function $u_i'^{-1}(\cdot)$ is strictly decreasing.
Therefore, the function $\sum_{i \in N} u_i'^{-1}(2\tilde{\beta}\Delta)$ is also strictly decreasing in $\Delta$. Then the equation
$$\sum_{i \in N} u_i'^{-1}(2\tilde{\beta}\Delta) = R$$
always possesses a unique solution defining $\Delta$ in the equilibrium (under the assumption that resources are allotted among all agents in the solution of the problem (1)).
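The uniqueness argument can be illustrated numerically by bisection over the strictly decreasing left-hand side. The utilities below, with $u_i'(x) = a_i/(1+x)$, are an assumption for illustration only, not the paper's function (2); the values $a = (1, 9, 25)$ and $R = 115$ merely echo the experimental parameters:

```python
def solve_delta(a, beta_t, R, lo=1e-9, hi=1e6, tol=1e-10):
    """Bisection for the unique Delta solving
    sum_i u_i'^{-1}(2*beta_t*Delta) = R, under the illustrative
    utilities with u_i'(x) = a_i/(1+x), so u_i'^{-1}(y) = a_i/y - 1."""
    def g(d):  # strictly decreasing in d, so the root is unique
        y = 2 * beta_t * d
        return sum(ai / y - 1 for ai in a) - R
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2
```

For these illustrative utilities the root has the closed form $\Delta = \sum_i a_i \big/ \big(2\tilde{\beta}(R + n)\big)$, which the bisection reproduces.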
Proof of Corollary 1. If $\alpha < 1$, then $\sum_{i=1}^{n} t_i(s^*) = 0$ only under $\Delta = 0$. Then, according to (2), $\forall i \in N$: $u_i'(x_i^*) = 0$, which is impossible due to the definition of $U$.
Proof of Lemma 3. If $|\pi(s(\tau))| < R$, it appears from Assertion 1 that
$$\sum_{j \in N}\sum_{i \in N} br_{ji}(s_{-i}) > n\sum_{j \in N} x_j(s).$$
In other words, $n|\pi(s(\tau+1))| > n|\pi(s(\tau))|$. Moreover, $n|\pi(s(\tau+1))| - n|\pi(s(\tau))| \to 0$ only when
$$\sum_{i \in N} u_i'(x_i(s(\tau+1))) \to 0.$$
This would mean that the problem (1) has an “almost” internal solution (which is not true).
If $|\pi(s(\tau))| = R$, then
$$\sum_{j \in N}\sum_{i \in N} br_{ji}(s_{-i}) = n\sum_{j \in N} x_j(s).$$
That is, $n|\pi(s(\tau+1))| = n|\pi(s(\tau))| = nR$.
Proof of Lemma 4. Let us analyze the mapping BR(•) : S→S, notably, its
contraction properties by the Jacobian matrix
J(s) = ∂brij
∂slk
(s)i,j,l,k∈N
.
Given a certain set of agents strategies s∈S, a sufficient contraction condition
is the inequality kJ(s)k<1satisfied for an arbitrary matrix norm, see [1]. By
analogy with [31], employ the maximal row norm defined by
kJ(s)k= max
{i,j}∈N2X
{l,k}∈N2
∂brij
∂slk
.
28
It is easy to establish that ∀i∈N,˜
β > 0and concave utility functions:
X
{l,k}∈N2
∂brii
∂slk
=X
j∈N\{i}
∂brii
∂sij
= (n−1)
2˜
βn +u00
i(xi)
−u00
i(xi)+2˜
βn(n−1)
<1.
Introduce the notation
Di=2˜
βn +u00
i(xi)
−u00
i(xi)+2˜
βn(n−1) ,¯
D= max
i∈N|Di|, D = min
i∈NDi.
Then (n−1) ¯
D < 1,(n−1)D < 1.
We assume that each agent follows the Cournot dynamics:
brji =xj−i+1
n−1(xi−i−brii), j ∈N\{i},
where xi−j=1
n−1P
k∈N\{j}
sik. Consequently, ∀j∈N\{i}:
X
{l,k}∈N2
∂brj i
∂slk
=X
k∈N\{i}
1
n−1+1
n−1X
k∈N\{i}
(1
n−1−Di).
Having in mind that (n−1)Di<1, we arrive at
X
{l,k}∈N2
∂brj i
∂slk
= 1 + 1
n−1−Di>1.
That is, the mapping BR(•) : S→Sdoes not satisfy the sufficient contraction
conditions.
To proceed, analyze the mapping BR2(•) = BR(BR(•)) : S→S. Due to the
aforesaid, ∀i∈N:
X
{l,k}∈N2
∂br2
ii
∂slk
=|Di|X
j∈N\{i}
(1 + 1
n−1−Dj).
Since each agent follows the Cournot dynamics, we also have that ∀j∈N\{i}:
X
{l,k}∈N2
∂br2
ji
∂slk
= (1 + 1
n−1−Di)X
j∈N\{i}|Dj|.
Hence, it appears that ∀i∈N:
X
{l,k}∈N2
∂br2
ii
∂slk ≤¯
D(n−(n−1)D),
29
and ∀j∈N\{i}:
X
{l,k}∈N2
∂br2
ji
∂slk ≤¯
D(n−(n−1)D).
Obviously, under the condition ¯
D(n−(n−1)D)<1the mapping BR2(•)is
contracting.
Let us find out for which $\tilde\beta$ this can be achieved. If $\tilde\beta \ge \frac{1}{2n}\max_{i \in N}(-u_i''(x_i(s)))$, then $\underline D \ge 0$. In this case, one can always ensure the condition $\bar D\,(n - (n-1)\underline D) \le 1$ by choosing $\underline D \le \bar D < \frac{1}{n-1}$, since $D(n - (n-1)D) < 1$ for $D < \frac{1}{n-1}$. As we increase $\tilde\beta$, we obtain $\underline D \to \bar D \to \frac{1}{n-1}$, but $\bar D\,(n - (n-1)\underline D) \to 1$. Hence, very large values of $\tilde\beta$ seem unreasonable.
We have shown that, under $\max_{i \in N}(-u_i''(x_i(s))) \le C$, the mapping $BR^2(\cdot)$ can be made contracting by an appropriate choice of the parameter $\tilde\beta$.
Proof of Assertion 2. By virtue of Lemma 3, the iterative process starting in $S$ moves to $\bar S$. According to Lemma 4, under the condition $\max_{i \in N}(-u_i''(x_i(s))) \le C$ one can guarantee the convergence of the iterative processes $s(\tau+2) = BR^2(s(\tau))$ and $s(\tau+3) = BR^2(s(\tau+1))$, where $s(\tau+1) = BR(s(\tau))$, $\tau \ge 1$, to the same Nash equilibrium by an appropriate choice of $\tilde\beta$. Indeed, this equilibrium represents the unique fixed point of each of these processes.
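The device used here — a mapping that fails the row-norm contraction test while its second iterate passes it, so that the iteration still converges to the unique fixed point — can be seen on a toy affine map. The $2 \times 2$ matrix and offset below are purely illustrative assumptions with no connection to the mechanism itself.

```python
# Toy affine map T(s) = A s + c: the max-row-sum norm of A exceeds 1,
# but the norm of A*A is below 1, so T^2 is contracting and the plain
# iteration s <- T(s) converges to the unique fixed point anyway.
# A and c are illustrative assumptions.

def row_norm(M):
    return max(sum(abs(x) for x in row) for row in M)

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[0.0, 2.0], [0.1, 0.0]]
c = [1.0, 1.0]

assert row_norm(A) > 1             # T fails the sufficient condition
assert row_norm(matmul(A, A)) < 1  # but T^2 satisfies it

s = [0.0, 0.0]
for _ in range(200):
    s = [A[0][0] * s[0] + A[0][1] * s[1] + c[0],
         A[1][0] * s[0] + A[1][1] * s[1] + c[1]]

# The fixed point solves s = A s + c, i.e. s* = (3.75, 1.375).
assert abs(s[0] - 3.75) < 1e-9 and abs(s[1] - 1.375) < 1e-9
```

By the Banach principle applied to $T^2$, both even- and odd-indexed subsequences converge, and continuity of $T$ forces a common limit, mirroring the argument for the two interleaved $BR^2$ processes above.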
Proof of Lemma 5. In the game $\Gamma(\hat\rho)$ induced by the mechanism $\hat\rho = \langle \hat S, \hat\pi, \hat t \rangle$, agents possess the preference functions $\hat\varphi_i(s) = u_i(\hat x_i(s)) - \hat p_i(s)$. Obviously, these functions are concave under $\hat\beta > 0$.
The best response of agent $i \in N$ is defined as the unique solution of the equation
$$u_i'(\hat x_i)\left(1 - \frac{1}{n}\right) - 2\hat\beta\,(br_{ii} - \hat x_i)\,\frac{1}{n} = 0.$$
Having in mind (5), we obtain that the equilibrium messages of agents satisfy the following system of equations:
$$u_i'(\hat x_i^*)(n-1) = 2\hat\beta\,\frac{\sum_{i \in N} br_{ii} - R}{n}, \quad i \in N.$$
By analogy with Lemmas 1 and 2, this system has a unique solution. Furthermore, this solution is such that $u_i'(\hat x_i^*) = \mathrm{const}$ for all $i \in N$, which corresponds to the solution of problem (1).
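The efficiency characterization $u_i'(\hat x_i^*) = \mathrm{const}$ together with $\sum_i x_i = R$ has a closed form for quadratic utilities $u_i(x) = a_i x - b_i x^2$: from $a_i - 2b_i x_i = \lambda$ one gets $x_i = (a_i - \lambda)/(2b_i)$, and the resource constraint fixes $\lambda$. The sketch below uses assumed example coefficients chosen so that the solution is interior.

```python
# Efficient allotment for quadratic utilities u_i(x) = a_i*x - b_i*x**2:
# equal marginal utilities a_i - 2*b_i*x_i = lam, with sum(x_i) = R.
# Coefficients and R are illustrative assumptions (interior solution).

a = [10.0, 8.0, 6.0]
b = [1.0, 2.0, 1.0]
R = 6.0

# lam solves sum((a_i - lam)/(2*b_i)) = R
lam = (sum(ai / (2 * bi) for ai, bi in zip(a, b)) - R) / \
      sum(1 / (2 * bi) for bi in b)
x = [(ai - lam) / (2 * bi) for ai, bi in zip(a, b)]

assert abs(sum(x) - R) < 1e-12                 # resource is fully allotted
assert all(abs((ai - 2 * bi * xi) - lam) < 1e-12
           for ai, bi, xi in zip(a, b, x))     # marginal utilities coincide
```

For these data $\lambda = 3.2$ and $x = (3.4, 1.2, 1.4)$, i.e., the total-utility-maximizing allotment that any mechanism solving problem (1) must implement.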
Proof of Assertion 3. The equivalence of the mechanisms follows directly from the fact that problem (1) possesses a unique solution.
In the game $\Gamma(\hat\rho)$ the equilibrium requests of agents obey the system of equations
$$u_i'(\hat x_i^*)(n-1) = 2\hat\beta\,\frac{\sum_{i \in N} br_{ii} - R}{n}, \quad i \in N,$$
see Lemma 5.
According to Assertion 1, in the game $\Gamma(\rho)$ we have
$$u_i'(x_i^*) = 2\tilde\beta\left(\sum_{i \in N} br_{ii} - R\right).$$
Consequently, $\widehat{br}_{ii} = br_{ii}$ for all $i \in N$ under $\hat\beta = \tilde\beta\,\frac{1}{n(n-1)}$. That is,
$$\hat\beta = \frac{\beta\, n^{-\alpha-1}}{n(n-1)^2}.$$
REFERENCES
1. Boss, V., Lektsii po matematike: differentsial’nye uravneniya (Lectures on
Mathematics: Differential Equations), Moscow: Editorial URSS, 2004.
2. Burkov, V.N., Danev, B., Enaleev, A.K., et al., Bol’shie sistemy: modelirovanie
organizatsionnykh mekhanizmov (Large Systems: Modeling of Organizational
Mechanisms), Moscow: Nauka, 1989.
3. Burkov, V.N., Dzhavakhadze, G.S., Dinova, N.I., and Shchepkin, D.A., Primenenie
igrovogo imitatzionnogo modelirovaniya dlya otsenki effektivnosti ekonomicheskikh
mekhanizmov (Application of Game Simulation Modeling for Estimating the
Efficiency of Economic Mechanisms), Moscow: Inst. Probl. Upravlen., 2003.
4. Goubko, M.V. and Novikov, D.A., Teoriya igr v upravlenii organizatsionnymi
sistemami (Game Theory and Organizational Systems Control), Moscow: Sinteg,
2002.
5. Goubko, M.V. and Spryskov, D.S., Consideration of Cooperative Interactions in
Planning Mechanisms, Upravlen. Bol’sh. Sist., 2000, no. 2, pp. 28–38.
6. Iskakov, M.B., Equilibrium in Safety Strategies and Equilibriums in Objections
and Counter-objections in Noncooperative Games, Autom. Remote Control, 2008,
vol. 69, no. 2, pp. 278–298.
7. Korgin, N.A., Representing a Sequential Resource Allotment Rule in the Form
of a Strategy-proof Mechanism of Multicriteria Active Expertise, Autom. Remote
Control, 2014, vol. 75, no. 5, pp. 983–995.
8. Korgin, N.A. and Khristyuk, A.A., The Effective Mechanism of Active Examination
with the Payment for Participation as the Tool of Acceptance of the Coordinated
Decisions, Vestn. Voronezh. Gos. Tekh. Univ., 2011, vol. 7, no. 6, pp. 117–121.
9. Korepanov, V.O. and Novikov, D.A., The Reflexive Partitions Method in Models
of Collective Behavior and Control, Autom. Remote Control, 2012, vol. 73, no. 8,
pp. 1424–1441.
10. Mazalov, V.V., Mencher, A.E., and Tokareva, Yu.S., Peregovory. Matematicheskaya
teoriya (Negotiations. Mathematical Theory), St. Petersburg: Lan', 2012.
11. Novikov, D.A., Theory of Control in Organizations, New York: Nova Science
Publishers, 2013.
12. Arifovic, J. and Ledyard, J.O., A Behavioral Model for Mechanism Design:
Individual Evolutionary Learning, J. Econom. Behav. Organiz., 2011, no. 78,
pp. 375–395.
13. Barberá, S., Jackson, M., and Neme, A., Strategy-Proof Allotment Rules, Games
Econom. Behav., 1997, vol. 18, no. 1, pp. 1–21.
14. Basar, T. and Maheswaran, R., Social Welfare of Selfish Agents: Motivating
Efficiency for Divisible Resources, Proc. Control Decision Conf. (CDC), 2004,
pp. 361–395.
15. Boyd, S., Parikh, N., and Chu, E., Distributed Optimization and Statistical
Learning via the Alternating Direction Method of Multipliers, Foundat. Trends
Machine Learn., 2011, vol. 3, no. 1, pp. 1–122.
16. Fischbacher, U., z-Tree: Zurich Toolbox for Ready-made Economic Experiments,
Experim. Econom., 2007, vol. 10, no. 2, pp. 171–178.
17. Fudenberg, D. and Levine, D., Theory of Learning in Games, Cambridge: MIT
Press, 1999.
18. Goetz, R., Martinez, Y., and Jofre, R., Water Allocation by Social Choice Rules:
The Case of Sequential Rules, Ecol. Econom., 2008, vol. 65, no. 2, pp. 304–314.
19. Groves, T. and Ledyard, J.O., The Existence of Efficient and Incentive Compatible
Equilibria with Public Goods, Econometrica, 1980, no. 6, pp. 1487–1506.
20. Healy, P. and Mathevet, L., Designing Stable Mechanisms for Economic
Environments, Theoret. Econom., 2012, vol. 7, no. 3, pp. 609–661.
21. Hurwicz, L., Outcome Functions Yielding Walrasian and Lindahl Allocations at
Nash Equilibrium Points, Rev. Econom. Stud., 1979, no. 46, pp. 217–225.
22. Jain, R. and Walrand, J., An Efficient Nash-Implementation Mechanism for
Divisible Resource Allocation, Automatica, 2010, vol. 46, no. 8, pp. 1276–1283.
23. Johari, R. and Tsitsiklis, J.N., Efficiency of Scalar-Parameterized Mechanisms,
Oper. Res., 2009, no. 57, pp. 823–839.
24. Kakhbod, A. and Teneketzis, D., An Efficient Game Form for Unicast Service
Provisioning, IEEE Trans. Autom. Control, 2012, vol. 57, no. 2, pp. 392–404.
25. Kruskal, W.H. and Wallis, W.A., Use of Ranks in One-Criterion Variance Analysis,
J. Am. Statist. Ass., 1952, vol. 47, pp. 583–621.
26. Lefebvre, M., Sharing Rules for Common-Pool Resources When Self-insurance
Is Available: An Experiment, Working Papers 11–22, LAMETA, University of
Montpellier, 2012.
27. Maskin, E., The Theory of Nash Equilibrium: A Survey, in Hurwicz, L., Schmeidler,
D., and Sonnenschein, H., Social Goals and Social Organization, Cambridge:
Cambridge Univ. Press, 1985, pp. 173–204.
28. Mathevet, L., Supermodular Mechanism Design, Theoret. Econom., 2010, vol. 5,
no. 3, pp. 403–443.
29. Moulin, H., An Efficient and Almost Budget Balanced Cost Sharing Method, Games
Econom. Behav., 2010, vol. 70, no. 1, pp. 107–131.
30. Sprumont, Y., The Division Problem with Single-Peaked Preferences: A
Characterization of the Uniform Rule, Econometrica, 1991, vol. 59, pp. 509–519.
31. Van Essen, M., A Note on the Stability of Chen's Lindahl Mechanism, Soc. Choice
Welfare, 2012, vol. 38, no. 2, pp. 365–370.
32. Walker, M., A Simple Incentive Compatible Scheme for Attaining Lindahl
Allocations, Econometrica, 1981, no. 49, pp. 65–71.