Adaptive Human-Agent Multi-Issue Bilateral
Negotiation Using the
Thomas-Kilmann Conflict Mode Instrument
Gaurav Koley
gaurav.koley@iiitb.org
Shrisha Rao
shrao@ieee.org
Abstract—Automated negotiation is an important class of problems with wide-reaching applications in the real world. While much work has been done on Agent-Agent negotiations, Human-Agent negotiations remain relatively unexplored. Human-Agent multi-issue bilateral negotiation deals with autonomous agents negotiating with humans over more than one item.
Designing agents which can engage in such negotiations requires
estimating the preferences of the human opponent in real time
and proposing offers which are likely to be accepted before the
session timeout. We design an agent that estimates the human
opponent’s preferences using two new heuristics, Most Changed
Least Preferred and Most Offered Most Preferred. Also, the agent
utilises the Thomas-Kilmann Conflict Mode Instrument to judge
the negotiation strategy of the opponent and then adapts its own
strategy to reach agreement faster. The agent does so without the
use of historical data, therefore, remaining free from the problems
that arise from lack of or biased historical data. Our results show
that the agent reaches good agreements against a wide variety of human negotiators. The agreements fall on or near the Pareto-optimal frontier, with the probability of an agreement resulting in an optimal distribution being 97.7%.
Index Terms—Multi-Issue Negotiation, Bilateral Negotiation,
Human-Agent Negotiations
I. INTRODUCTION
Negotiation, the process of joint decision making, is an
inescapable phenomenon in our society. It is an important pro-
cess in forming social and political alliances and reaching trade
agreements. People avoid negotiations out of fear or lack of
skill and that contributes to income inequality, social injustice
and political gridlock [1]. This calls for focus on designing
autonomous negotiators which are capable of independently
negotiating with others.
Automated negotiation research originates in various disci-
plines including economics, social science, game theory and
artificial intelligence and is fueled by a number of benefits that
computerized negotiation offers, including better deals, and
reduction in time, costs, and stress. Agents which typically
negotiate with a human opponent are studied as a part of the
Human-Agent negotiation domain.
Such automated agents can be used alongside a human
negotiator engaging in important negotiations. Agents can
make negotiations faster as well as assist people who are less
qualified in the negotiation process. Another application of
automated agents is in E-commerce where agents can negotiate
with humans to sell items faster by making a trade off between
price and services like after-sales.
One particular problem of interest in Human-Agent negoti-
ations is multi-issue bilateral negotiations, which involves two
participants, who negotiate over more than one issue or item.
Applications of the solutions of multi-issue bilateral negoti-
ations are far reaching in trade and commerce. The problem
of modeling an automated agent for bilateral negotiation is
not new and has been well studied in the fields of Multi-
Agent Systems, i.e., Agent-Agent Negotiations [2], [3], [4],
[5], [6]. However, multi-issue Human-Agent negotiation is still in need of study.
Most studies in the domains of Human-Agent and Agent-
Agent negotiations use Bayesian and other sophisticated math-
ematical models and assume complete information to identify
the opponent’s preferences [7], [8]. However, these methods
tend to over-fit and are not suited to Human-Agent negotiation
scenarios where the agent must deal with incomplete informa-
tion and the human opponent may change strategy frequently.
Also, in a Human-Agent Negotiation setting, the agent must
have the ability to adapt its negotiation strategy to be able to
deal with the different types of human negotiators that it may
encounter.
Studies in the Agent-Agent negotiation domain hold much
relevance for Human-Agent negotiations. Fujita [6] proposed
the use of past negotiation session data to learn about the
opponent’s negotiation style by characterizing the opponent in
terms of a known conflict-handling style, the Thomas-Kilmann
Conflict Mode Instrument (TKI) [9].
This paper adapts Fujita’s idea to Human-Agent nego-
tiations working with just real-time data from the current
session and without past session data. We design an adaptive
strategy that adjusts the speed of compromise in real-time to
be compatible with the opponent’s strategy, determined using
TKI.
To retain the flexibility of our agent in being able to respond
to various situations, in place of mathematical models, we
propose the new heuristics Most Changed Least Preferred
(MCLP) and Most Offered Most Preferred (MOMP), to es-
timate the opponent’s preferences.
978-1-5386-5048-6/18/$31.00 © 2018 IEEE

Previous works [10], [11], [12] have described domain-specific agents capable of Human-Agent Negotiations.
Baarslag et al. [10] presented an agent which autonomously negotiates the permission to exchange private data between users and services, on behalf of the user. Lin et al. [11] experimented in a limited setting with three possible agent types, i.e., a set of six different utility functions, while the agents by Mell and Gratch [12] highlighted the importance of emotional and language expression in influencing negotiations. Such agents
work well within their defined limited domains but cannot be used in general resource-allocation problems, which our agent can. Thus, the existing agents come nowhere close to ours in terms of flexibility, optimality, and fairness of the negotiation outcomes, and have no comparable performance.
To demonstrate this, we performed 24 trials with the
agent and reached 23 negotiated agreements. Our proposed
strategies enabled our agent to reach fast agreements with
an average time of 6 minutes and 30 seconds. The outcomes
of the negotiation mostly resulted in the best distribution of
items between the participants with a Pareto optimal efficacy
of 97.7%.
In summary:
• Our agent uses the TKI to adapt its strategy in accordance with different human negotiation styles. Some opponents may drive a hard bargain, while some might be easygoing.
• Our agent negotiates without knowledge of historical data. This ability is very useful, particularly when historical data are unavailable or biased.
• Our agent uses two new heuristics, MCLP and MOMP, to assess the value that the human opponent attaches to each item of negotiation, which might vary significantly from human to human.
• Our experiments show that the agent reaches agreement quickly and fairly. Nearly all experiments performed with different human opponents resulted in equal value sharing between the agent and the human, with the utilisation of the value reaching close to the Pareto-optimal frontier.
The remainder of the paper is organised as follows. Sec-
tion II describes the environment in which our agent works.
Section III describes our agent’s strategy with respect to
assessing the opponent’s preferences and strategy. Then we
propose a way of adapting the agent's strategy in accordance with the opponent. Section IV demonstrates the results of our experimental trials and some benchmarks. Finally, we present
our conclusions in Section V.
II. HUMAN-AGENT NEGOTIATION ENVIRONMENT
The interaction between the negotiating parties is regulated
by a negotiation protocol that lays down the rules for the
exchange of proposals/offers.
Our agent conforms to the alternating-offers protocol for
bilateral negotiation, in which the negotiating parties exchange
offers in turns [2].
The agent and the human take turns in the negotiation. The
human is expected to start the negotiation and the agent is
informed about the action taken. The possible actions are:
• Accept: This indicates that the participant accepts the opponent's last bid.
• Offer: This indicates that the participant proposes a new bid.
• End: This indicates that the participant terminates the entire negotiation session, and the participants leave with the least possible score.
We divide the total Negotiation Time into n rounds. In round
t, if the negotiation has not terminated earlier, each participant
can propose a possible agreement, and the opponent can either
accept or reject the offer. A round can have exactly one offer
exchange.
If the action was an Offer, the participant is subsequently
expected to determine their next action and the turn goes to the
other participant. If the action is not an Offer, the negotiation
session ends either with an End or an Accept and the final
score (utility of the last bid) is determined for each of the
participants, as follows:
• The action of the participant is an Accept. The last bid of the opponent is taken, and its utility is determined in the utility spaces of the two participants.
• The action of the participant is an End. Both the participants are assigned the lowest score.
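The actions and terminal scoring described above can be sketched as follows. This is a hypothetical encoding; the names `Action` and `final_scores` are illustrative, not from the paper.

```python
from enum import Enum, auto

class Action(Enum):
    ACCEPT = auto()  # accept the opponent's last bid
    OFFER = auto()   # propose a new bid
    END = auto()     # terminate the session; both leave with the least score

def final_scores(action, last_bid_utilities, lowest_score=0.0):
    """Score a terminal action: Accept takes the last bid's utility in each
    participant's utility space; End assigns both the lowest score."""
    if action is Action.ACCEPT:
        return last_bid_utilities  # (utility to agent, utility to human)
    if action is Action.END:
        return (lowest_score, lowest_score)
    raise ValueError("Offer is not a terminal action")
```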
The participants negotiate over a number of issues, and every issue has an associated range of values. A negotiation outcome, i.e., a bid, is a mapping of every issue to a value, and the set Ω of all possible bids is called the negotiation domain. The domain is common knowledge to both the participants and remains static in a negotiation session. Both participants have certain preferences for issues and values defined by a preference profile over Ω. These preferences are expressed through a utility function U : bid ∈ Ω → [0, 1]. While the domain information is common knowledge, the preference profile, i.e., the utility function U of a participant, is private. The utility of a bid to the agent is represented as U_A(bid) and its utility to the human is represented as U_H(bid).
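As a minimal illustration of this setup, a bid can be encoded as a mapping from issues to values, with a private utility over bids. The additive weighted utility form, the issue names, and `make_utility` below are assumptions for illustration only; the paper does not prescribe a particular form for U.

```python
def make_utility(weights, value_scores):
    """weights: issue -> importance (summing to 1);
    value_scores: issue -> {value: score in [0, 1]}."""
    def U(bid):
        # Weighted sum of per-issue value scores; result lies in [0, 1].
        return sum(weights[i] * value_scores[i][bid[i]] for i in weights)
    return U

# Hypothetical two-issue domain with quantities 0..5 per issue.
U_A = make_utility(
    {"apples": 0.6, "bananas": 0.4},
    {"apples": {k: k / 5 for k in range(6)},
     "bananas": {k: k / 5 for k in range(6)}},
)
```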
Fig. 1: Flow of the Agent. The agent receives an offer from the human; the Opponent Model computes the TKI mode and opponent preferences; the Acceptance Strategy computes the utility and accepts or rejects; the Bidding Strategy produces a counter offer to send.
III. NEGOTIATING AGENT DESIGN
During negotiation, the agent needs to determine which of the received offers to accept, and which offers to propose to the human to increase the likelihood of acceptance. This is difficult as the agent must deal with incomplete information about the human's preferences over the issues of negotiation. The agent must also take into consideration that humans do not necessarily behave rationally or maximize expected utility (either their own or the joint utility). In particular, results from the social sciences suggest that, in general, equilibrium strategies are not followed by humans and that the theoretical equilibrium strategy is not necessarily optimal [13], [14], [15]. So the agent must incorporate heuristics to allow for deviations from the expected behavior of the human.
Human-Agent negotiation is further complicated by the time-bound nature of the negotiation protocol: whether or not to make a fresh offer depends on the expectation of future offers, the amount of time left, and the offers received so far. To address this, we formulate an optimal strategy (with respect to the expected utility of the human as estimated by the agent) and specify which offers to take and what offers to make.
Our agent is designed along the Bidding, Acceptance,
Opponent (BOA) agent architecture as described in detail by
Baarslag [3]. As indicated in Figure 1, the agent invokes the
following:
1) Acceptance Strategy: It aggregates offers received in a
round and accepts if the current offer is better than those
offered in the previous rounds.
2) Opponent Model: Simple heuristics (MCLP and
MOMP) and the TKI are used to estimate and model
the human’s preferences and strategy.
3) Bidding Strategy: It is an adaptive strategy that adjusts
the speed of compromise depending on the opponent’s
strategy, as estimated by the Opponent Model.
A. Acceptance Strategy
The acceptance strategy of the agent determines whether to
accept the current offer or to wait for better offers in the future.
To do so, we use a conditional model, which aggregates offers
received from the human in rounds and for every received
offer, accepts if:
• The current received offer is better than the offer made by the agent in the previous round, or
• The current offer is better than the average utility of all offers in the previous rounds, or
• The current offer is better than any offer seen before.
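The three acceptance conditions above can be sketched directly. This is a minimal sketch, assuming utilities in [0, 1]; the function and parameter names are illustrative, not from the paper.

```python
def should_accept(received_u, my_last_offer_u, past_received_utilities):
    """Accept if the received offer beats the agent's own previous offer,
    the average of offers from previous rounds, or every offer seen so far."""
    # Condition 1: better than the agent's own offer in the previous round.
    if my_last_offer_u is not None and received_u > my_last_offer_u:
        return True
    if past_received_utilities:
        # Condition 2: better than the average of previous rounds' offers.
        avg = sum(past_received_utilities) / len(past_received_utilities)
        # Condition 3: better than any offer seen before.
        if received_u > avg or received_u > max(past_received_utilities):
            return True
    return False
```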
B. Opponent Modeling using Heuristics
Henceforth, “agent” refers to the autonomous agent and
“opponent” refers to the human opponent.
The agent estimates the values the opponent will offer in
the future bids based on the opponent’s previous offers. To do
so, we use the commonly used heuristic in the negotiation
domain: The first bid made by the opponent is the most
preferred bid [3]. The best bid is the selection of the most
preferred value for each issue, and thereby immediately reveals
which values are the best for each issue. Agent-agent models use this heuristic, and it is grounded in human behavior: the work by Oesch and Galinsky [16] shows that first offers by humans have anchoring effects and model their preferences most accurately. Even in most human-human negotiations, the bidding starts with the best bid stated outright.
Adding to this, we propose the following two heuristics:
1) Most Changed Least Preferred (MCLP): There is an inverse relation between the preference of an issue and the number of times its value is significantly changed.
2) Most Offered Most Preferred (MOMP): There is a direct relation between the preference of a value and the frequency with which it is offered.
To elaborate the reasoning behind the above heuristics, we
note that if the value of a particular issue/item is often sig-
nificantly changed, then it can be assumed that it is preferred
less by the opponent, since they evidently do not value it enough to demand the same amount consistently. Had the issue been
important to the opponent, they would have tried to maximise
the value of that issue and its value would not have changed
frequently. Further, if a particular value of an issue maximises
gains for the opponent, they would offer it more frequently.
This also correlates directly to our second proposed heuristic,
which posits a direct relation between the preference of a value
and its frequency of being offered.
There have been previous heuristics [17] in Agent-Agent
literature which use the frequency of an issue to fix its prefer-
ence. In contrast, our heuristic MOMP is novel, since it deals
with the preference for the value of an issue. A negotiation
participant has two kinds of preference: the preference for an
issue, and that for a particular quantity/number of an issue
(which is termed the value of an issue [3]). MOMP deals
specifically with the case when only a bundle or a quantity of
an issue has utility for the bidder as opposed to a unit which
may have none. For example, a builder has higher preference
for bricks as opposed to large boulders, but an offer containing
a single brick does not create sufficient utility and the builder
would reject such an offer.
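The MCLP and MOMP heuristics above can be sketched over a history of opponent bids. The change threshold and the ranking scheme below are illustrative choices, not fixed by the paper.

```python
from collections import Counter

def estimate_preferences(bids, change_threshold=1):
    """bids: list of {issue: value} offers from the opponent, oldest first.
    Returns (issue_rank, value_pref): issues ordered most- to least-preferred
    (MCLP: fewer significant changes => more preferred), and per issue the
    most frequently offered value (MOMP)."""
    issues = list(bids[0].keys())
    changes = {i: 0 for i in issues}
    freq = {i: Counter() for i in issues}
    # MCLP: count significant changes in each issue's value between bids.
    for prev, cur in zip(bids, bids[1:]):
        for i in issues:
            if abs(cur[i] - prev[i]) >= change_threshold:
                changes[i] += 1
    # MOMP: count how often each value of each issue is offered.
    for bid in bids:
        for i in issues:
            freq[i][bid[i]] += 1
    issue_rank = sorted(issues, key=lambda i: changes[i])            # MCLP
    value_pref = {i: freq[i].most_common(1)[0][0] for i in issues}   # MOMP
    return issue_rank, value_pref
```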
C. Thomas-Kilmann Conflict Mode Instrument
In our agent’s design, we use the Thomas-Kilmann Conflict
Mode Instrument (TKI) to determine the opponent’s strategy
and adapt the agent’s bidding strategy.
An opponent’s strategy can be characterized in terms of
some global style, such as the negotiation styles, or a known
conflict-handling style. One important style is the TKI [9], [6].
The TKI is designed to assess a person’s behavioral re-
sponse to conflict situations. “Conflict situations” are those
where the interests or concerns of two people are seemingly
incompatible. In such a situation, an individual’s behavior
has two dimensions: (1) assertiveness, the extent to which
the person attempts to fulfill their own interests, and (2)
cooperativeness, the extent to which the person attempts to
fulfill the opponent’s interests. These two basic dimensions of
behavior define five different modes for responding to conflict situations: Competing, Accommodating, Avoiding, Collaborating, and Compromising, as Figure 2 shows.
Fig. 2: Thomas-Kilmann Conflict Mode Instrument
1) Estimating Opponent Strategy using Real-Time Data:
The agent uses the real-time negotiation information to judge
the opponent’s conflict-handling style, henceforth considered
as the opponent’s negotiation style, using TKI. To judge
assertiveness and cooperativeness, it uses the same criteria as
proposed by Fujita [6]. Using the mean of previous bids as a
measure of the opponent’s cooperativeness and the variance
of bids as a measure of their assertiveness, it judges the
negotiation style of the opponent. The major difference in
our approach from Fujita is that we use only current data to
estimate the cooperativeness and assertiveness of the opponent.
While TKI has been introduced in Agent-Agent negotia-
tions, it has not been applied in the Human-Agent negotiation
domain previously. While previous work compares the current opponent to past opponents and tries to estimate their negotiation style, such a mechanism is susceptible to biased data from previous negotiation sessions [18].
Thus, unlike Fujita’s agent [6] which uses historical ne-
gotiation data in the Agent-Agent domain to estimate the
opponent’s negotiation style, our agent uses only current data.
The agent retains no knowledge of the previous negotiation
sessions. The current bid of the opponent is compared to
the opponent’s past bids in the same negotiation session.
This gives the agent the flexibility to adapt to all kinds of
opponents (strong headed, cooperative, collaborative, passive,
etc.) since the opponent is being compared to that opponent
themselves, and not with others. This method also makes the
agent impervious to bias that might come from historical data.
TABLE I: Estimation of cooperativeness and assertiveness

Condition          Cooperativeness  |  Condition         Assertiveness
U_H(bid_t) > µ     Uncooperative    |  σ²(t) > σ²_h      Passive
U_H(bid_t) = µ     Neutral          |  σ²(t) = σ²_h      Neutral
U_H(bid_t) < µ     Cooperative      |  σ²(t) < σ²_h      Assertive

Source: Fujita [6]
Table I relates the conditions to assertiveness and cooperativeness. In summary, when U_H(bid_t) (the utility of the opponent's bid in round t) is lower than µ (the mean utility of the previous offers from the opponent), the agent regards the opponent as cooperative. On the other hand, when U_H(bid_t) is higher than µ, the agent regards the opponent as uncooperative. In addition, our agent evaluates the opponent's assertiveness by comparing σ²(t) (the variance of the current bid) with σ²_h (the variance of the previous proposals so far in the session). If σ²(t) is greater than σ²_h, then the bids are spread out and the opponent is seen as passive and can be forced. In the other case, where σ²(t) is less than σ²_h, the bids are confined to a smaller region and the opponent is assertive about their decision.
2) Bidding Strategy: To determine the right offer to make to
reach agreement, the agent utilizes an adaptive bidding strategy
by making use of the information about the negotiation style
of the opponent, using TKI, as described earlier. Empirically, the best bid for the agent to make depends on the negotiation time left as well as the previous offer received from the opponent. Such a strategy takes care of the estimated preferences of the opponent, thereby increasing the chances of the bid being accepted by the opponent.
In concrete terms, at negotiation round t (out of a total of n rounds), when the utility of the last bid by the human has been U_A(bid_{t−1}), the strategy is to offer a bid with a target utility given by:

    target(t, α) = Υ_min + Δ · Γ(t, α)    (1a)

where

    Δ = Υ_max − Υ_min    (1b)

and

    Γ(t, α) = 1 − (U_A(bid_{t−1}) · t/n)^{1/α}    (1c)

where:
α ∈ [0, 1] is the concession rate, determined from the opponent's negotiation style, using TKI, calculated by the opponent model (see Algorithm 1);
Υ_max is the maximum target utility for the agent (set to 1);
Υ_min is the minimum target utility for the agent (set to 0.3).

The values for Υ_max and Υ_min are empirically determined and vary depending on the items of negotiation. Any minimum target utility Υ_min higher than 0.3 will almost always result in failure to reach negotiated agreements in the allotted time.
In equation (1a), Υ_min sets the lower bound for the target utility. The rest of the utility value is determined by Δ, which represents the width of the utility space that the agent might need to explore, and Γ(t, α), which represents the utility component dependent on the negotiation time left and the last bid offered by the opponent. If U_A(bid_{t−1}) (the utility of the last bid made by the opponent) is very low, then the agent must explore a larger slice of the utility space, as evident from equation (1c). If U_A(bid_{t−1}) is high, then the agent need only bid within a small slice of the utility space. Also, if there is a lot of time left in the negotiation, the agent can afford to explore a large slice of the utility space, since t/n will be small. However, as the end of the allotted time for negotiation approaches, the agent will narrow down to a small part of the utility space.
The effect of the opponent's utility offering itself can be made more valuable, or less so, to the calculation of the target utility target(t, α). If the opponent is driving a hard bargain, then their offers must be taken seriously, which is represented with a low α value. If the opponent is compliant and compromising, then their offers can be discounted, and this is done by increasing the value of α, which in turn reduces the contribution of the (U_A(bid_{t−1}) · t/n)^{1/α} term to the target utility value.
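Equations (1a)-(1c) can be sketched as a single function, with Υ_max = 1 and Υ_min = 0.3 as in the paper; the variable names below are illustrative.

```python
def target_utility(t, n, u_last, alpha, y_max=1.0, y_min=0.3):
    """Target utility at round t of n, given the agent-side utility u_last
    of the opponent's previous bid and the concession rate alpha > 0."""
    gamma = 1.0 - (u_last * t / n) ** (1.0 / alpha)  # equation (1c)
    return y_min + (y_max - y_min) * gamma           # equations (1a), (1b)
```

With a small alpha the exponent 1/α is large, so the bracketed term stays near zero and the target stays near y_max (the agent barely concedes); as alpha approaches 1 and t approaches n, the target falls toward y_min.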
3) Adapting Bidding Strategy using TKI: As equations (1a), (1b), and (1c) show, the speed of compromise is decided by α in target(t, α). α is set very close to 0 initially. We compare the mean and variance of the utility of the past bids with the current bid made by the opponent to the agent, to determine the cooperativeness and assertiveness of the opponent. α is increased when the opponent is Accommodating or Compromising, according to the cooperativeness and assertiveness scale for TKI. By introducing this adjustment parameter, the agent can adjust its strategy from strong-headed to cooperative, becoming compatible with the opponent to reach agreement faster.

The exact method for adapting the agent's strategy by adjusting α is given in Algorithm 1.
In lines 1–2, we define the initial conditions for the algorithm, set at the beginning of the negotiation session. Line 4 refers to the bid received from the opponent. In lines 5–9, the mean and variance are calculated to be subsequently used in lines 11–17 for cooperativeness and lines 18–24 for assertiveness. In lines 25–28, if the assertiveness and cooperativeness indicate that the opponent's negotiation style is Compromising (line 25) or Accommodating (line 26), then α is increased by 0.1 to increase the speed of compromise. Line 31 adds the current bid to the list of bids, which is used in subsequent iterations in lines 5–9.
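The body of Algorithm 1's loop can be transcribed into Python as follows. This is a sketch: the U_H values passed in are the agent's estimates of the opponent's utilities (as the paper notes), and the function and parameter names are illustrative.

```python
from statistics import mean, pvariance

def update_alpha(alpha, past_utilities, u_current):
    """One iteration of Algorithm 1's loop: classify cooperativeness and
    assertiveness from the current bid against the session history, then
    adjust alpha. Mutates past_utilities by appending the current bid."""
    mu = mean(past_utilities) if past_utilities else 0.0
    var_h = pvariance(past_utilities) if past_utilities else 0.0
    var_t = (u_current - mu) ** 2  # "variance" of the current bid
    # Judging cooperativeness (Table I, left half).
    c = ("Uncooperative" if u_current > mu
         else "Neutral" if u_current == mu else "Cooperative")
    # Judging assertiveness (Table I, right half).
    a = ("Passive" if var_t > var_h
         else "Neutral" if var_t == var_h else "Assertive")
    # Compromising (Neutral/Neutral) or Accommodating (Passive/Cooperative).
    if ((a == "Neutral" and c == "Neutral")
            or (a == "Passive" and c == "Cooperative")) and alpha < 1:
        alpha += 0.1
    past_utilities.append(u_current)
    return alpha
```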
Figure 3 highlights the concept of adapting the speed of compromise. It is presented as a plot of the agent's utility with respect to the α value selected. Plugging values of α ranging from 0.01 to 0.91 and randomly generated opponent bids (bid_{t−1}) into equation (1a), we get the resultant graph, which shows that as the value of α is increased, the agent concedes faster.
For a low value of α, the agent never concedes. This
would be the case when the opponent is never cooperative or
accommodating. However, if the opponent compromises and
keeps accommodating the agent’s bids by reducing their utility,
the agent will compromise as well and reach agreements faster.
IV. EXPERIMENTAL TRIALS AND RESULTS
We conducted 24 trials, against 24 different human partic-
ipants, of which 23 reached negotiated agreement. We used
Interactive Arbitration Guide Online (IAGO) [19] to conduct
these trials.
The human and agent were to negotiate over a set of 4 items,
each 5 in quantity. The human and the agent were assigned
Algorithm 1: Updating α using TKI
Result: α
 1  α ← 0.01;
 2  bids ← [];
 3  while offer received is a bid and t < Time do
 4      bid_t ← OfferedBid(t);
 5      µ ← mean utility for opponent(bids);
 6      σ²_h ← variance of utility for opponent(bids);
 7      U_H(bid_t) ← estimated utility of bid_t for opponent;
 8          // determined from preferences
 9      σ²(t) ← (U_H(bid_t) − µ)²;
10      // Judging cooperativeness
11      if U_H(bid_t) > µ then
12          c ← "Uncooperative";
13      else if U_H(bid_t) = µ then
14          c ← "Neutral";
15      else
16          c ← "Cooperative";
17      end
18      if σ²(t) > σ²_h then
19          a ← "Passive";
20      else if σ²(t) = σ²_h then
21          a ← "Neutral";
22      else
23          a ← "Assertive";
24      end
25      if (a = "Neutral" and c = "Neutral") or
26         (a = "Passive" and c = "Cooperative") then
27          if α < 1 then
28              α ← α + 0.1;
29          end
30      end
31      Insert bid_t in bids;
32  end
predefined dissimilar interests over the set of 4 items. The
maximum score a participant could get was 50 points. These
sessions had a time limit of 10 minutes.
To test the effectiveness of the agent’s strategies, each trial
had a fresh instance of the agent with no knowledge of past
negotiation sessions. At the end of each session, the time of
negotiated agreement and the points for both human and agent
were captured.
Figure 4 shows the time taken in the trials to reach
negotiated agreement. Trial 13 did not reach agreement in
the given time duration of 600 seconds. 19 of the 23 agreements completed before round 11, i.e., around 7 minutes
(420 seconds) after the negotiations began. On average, it took 6.5 minutes or 390 seconds to reach agreement with our
agent. The maximum time to agreement was 493 seconds or
8 minutes and 13 seconds. The minimum time to agreement
was 290 seconds, i.e., 4 minutes and 50 seconds.
Figure 5 shows the resultant points of human and agent on
reaching agreement in the negotiation session. The points in
Fig. 3: With increasing α, the agent concedes faster (at earlier rounds). For α = 0.01 the agent never concedes, i.e., drives a hard bargain.
grey represent 1 agreement, while the points in black repre-
sent 2 agreements. As evident from Figure 5, the negotiated
agreements occur very near to the line which marks the Pareto
optimal frontier for this negotiation setup. The Pareto optimal
frontier represents the situation where the agreed upon items
are allotted in the most value efficient manner between the
participants, and it is impossible for any participant to do any
better without causing loss to the other participant [20]. This
means that the best outcomes of the negotiated agreements
lie on the Pareto optimal frontier. Thus, an agreement on the
Pareto optimal frontier shows that the combined value that the
participants are getting from the agreement is the best possible.
However, agreements on the Pareto optimal frontier do not
mean that the distribution of value between the participants is
equal or fair, just that the combined value derived from the
negotiation is optimal.
A. Observations
The results from the trials were plotted in Figures 4 and 5.
To find the most fair and optimal agreement points, we use the following equations:

    β ∈ arg max_{bid ∈ Ω} [U_H(bid) + U_A(bid)]    (2a)

    δ ∈ arg min_{bid ∈ Ω} |U_H(bid) − U_A(bid)|    (2b)

If β = δ, then U_H(β) and U_A(β) are the optimal and most fair values.

Equation (2a) expresses the optimality property, and equation (2b) expresses the fairness property. When both are satisfied, the agreement is both fair and optimal. This point is at the 0.7–0.7 mark, i.e., with a utility of 0.7 for the human and the agent each, since it is the point where both participants derive equal value from the negotiated agreement and the agreement lies on the Pareto optimal frontier.
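Equations (2a) and (2b) can be evaluated directly over a discrete outcome set. The utility pairs below are made up purely for illustration.

```python
def optimal_and_fair(outcomes, U_H, U_A):
    """Return the maximizer of joint utility (2a) and the minimizer of the
    utility gap (2b) over a finite set of outcomes."""
    beta = max(outcomes, key=lambda b: U_H(b) + U_A(b))        # (2a) optimality
    delta = min(outcomes, key=lambda b: abs(U_H(b) - U_A(b)))  # (2b) fairness
    return beta, delta

outcomes = [(0.9, 0.4), (0.7, 0.7), (0.4, 0.9)]  # (U_H, U_A) per outcome
beta, delta = optimal_and_fair(outcomes, lambda b: b[0], lambda b: b[1])
# beta == delta == (0.7, 0.7): that agreement is both optimal and most fair
```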
Most of the agreements in our results are clustered around the 0.7–0.7 mark. Also, almost all the agreements happen very close to the frontier, signifying that the agent almost always helps reach an optimal agreement.

This suggests that our proposed heuristics and our TKI-based strategy help reach faster and fairer agreements.
B. Benchmarks
The best case scenario for negotiated agreement is the Pareto
optimal frontier, which is the theoretical limit.
We calculate the Root Mean Square (RMS) distance of the agreement utilities in the trial results from the Pareto optimal frontier using (3), as given by Hyndman and Koehler [21]:

    RMS = √( Σ_{i=1}^{N} euclideanDist(result_i, ParetoFront)² / N )    (3)

We then measure the efficacy of the agent using (4) (see Bosse [5]) and get an efficacy score of 97.7% for our agent:

    EfficacyScore = (1 − RMS / √(0.7² + 0.7²)) × 100    (4)
This metric can be understood as follows: on average, the probability of a negotiated agreement resulting in an optimal distribution of points between the human and the agent is 97.7%. The result is derived from the collective results of the 23 successful negotiations and does not depend on the specific strategies used by individual participants. Therefore, if an agreement is reached between our agent and the human opponent, there is a 97.7% probability of it being optimal regardless of the human's strategy.
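Equations (3) and (4) can be sketched as follows; here the Pareto frontier is approximated by a discrete set of points, an illustrative simplification, and the function name is not from the paper.

```python
import math

def efficacy_score(results, pareto_points, best=(0.7, 0.7)):
    """results, pareto_points: lists of (U_H, U_A) pairs. Computes the RMS
    distance to the frontier (3), then the efficacy score (4)."""
    def dist_to_front(p):
        # Euclidean distance to the nearest sampled frontier point.
        return min(math.dist(p, q) for q in pareto_points)
    rms = math.sqrt(sum(dist_to_front(r) ** 2 for r in results) / len(results))
    return (1 - rms / math.hypot(*best)) * 100  # denominator: sqrt(0.7² + 0.7²)
```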
V. CONCLUSION
This paper focused on Human-Agent bilateral multi-issue
negotiation, which is an important class of real-life negotia-
tions. We designed an adaptive strategy for a novel agent that
estimates the preferences of the opponent using simple heuristics (MCLP and MOMP). Our agent estimates the opponent's strategy by means of the Thomas-Kilmann Conflict Mode Instrument and adapts its own strategy accordingly.
Through 24 trials we demonstrated that the proposed method
results in close to optimal negotiated agreements.
Such an agent can be used for permissions management for
mobile apps where the human and the app can negotiate data
sharing consent for features in the app [10]. Also, such an
agent can be highly useful in E-commerce, where the system may negotiate with prospective customers, helping close more sales by giving out discounts and freebies without the need for manual interventions like online flash sales.
Fig. 4: Time taken for negotiations to reach agreement. Here the "*" indicates that no agreement was reached for this session.
Fig. 5: The best case negotiation scenario is the Pareto optimal
frontier specified by the bounding line. Results from negotiation trials
are marked as dots. Grey dots represent 1 agreement, Black dots
represent 2 agreements.
REFERENCES
[1] T. Eisenberg and C. Lanvers, “What is the settlement rate and why
should we care?” Journal of Empirical Legal Studies, vol. 6, no. 1, pp.
111–146, 2009.
[2] R. Aydoğan, D. Festen, K. V. Hindriks, and C. M. Jonker, "Alternating
offers protocols for multilateral negotiation,” Studies in Computational
Intelligence, vol. 674, pp. 153–167, 2017.
[3] T. Baarslag, “What to bid and when to stop,” Ph.D. dissertation, Dept. of
Electrical Engineering, Mathematics and Computer Science, TU Delft,
Sept 2014.
[4] T. Baarslag, M. Kaisers, C. M. Jonker, E. H. Gerding, and J. Gratch,
“When will negotiation agents be able to represent us? The challenges
and opportunities for autonomous negotiators,” in IJCAI International
Joint Conference on Artificial Intelligence, 2017.
[5] T. Bosse, C. M. Jonker, L. van der Meij, V. Robu, and J. Treur, “A
System for Analysis of Multi-Issue Negotiation,” in Software Agent-
Based Applications, Platforms and Development Kits, 2005.
[6] K. Fujita, “Efficient Strategy Adaptation for Complex Multi-times
Bilateral Negotiations,” in 2014 IEEE 7th International Conference on
Service-Oriented Computing and Applications. IEEE, Nov 2014, pp.
207–214.
[7] P. Faratin, C. Sierra, and N. R. Jennings, “Using similarity criteria to
make issue trade-offs in automated negotiations,” Artificial Intelligence,
vol. 142, no. 2, pp. 205–237, 2002.
[8] M. Oprea, The Use of Adaptive Negotiation by a Shopping Agent in
Agent-Mediated Electronic Commerce. Berlin, Heidelberg: Springer
Berlin Heidelberg, 2003, pp. 594–605.
[9] R. H. Kilmann and K. W. Thomas, “Developing a Forced-Choice
Measure of Conflict-Handling Behavior: The ”Mode” Instrument,” Ed-
ucational and Psychological Measurement, vol. 37, no. 2, pp. 309–325,
jul 1977.
[10] T. Baarslag, A. T. Alan, R. Gomer, M. Alam, C. Perera, E. H. Gerding,
and m. schraefel, “An automated negotiation agent for permission man-
agement,” in Proceedings of the 16th Conference on Autonomous Agents
and MultiAgent Systems, ser. AAMAS ’17’. Richland, SC: International
Foundation for Autonomous Agents and Multiagent Systems, 2017, pp.
380–390.
[11] R. Lin, S. Kraus, J. Wilkenfeld, and J. Barry, “An automated agent
for bilateral negotiation with bounded rational agents with incomplete
information,” in Proceedings of the 2006 conference on ECAI 2006: 17th
European Conference on Artificial Intelligence August 29–September 1,
2006, Riva del Garda, Italy. IOS Press, 2006, pp. 270–274.
[12] J. Mell and J. Gratch, “Grumpy & Pinocchio: Answering Human-Agent
Negotiation Questions Through Realistic Agent Design,” in Proceedings
of the 16th Conference on Autonomous Agents and MultiAgent Systems,
ser. AAMAS ’17’. Richland, SC: International Foundation for Au-
tonomous Agents and Multiagent Systems, 2017, pp. 401–409.
[13] R. D. McKelvey and T. R. Palfrey, “An experimental study of the
centipede game,” Econometrica, vol. 60, no. 4, pp. 803–836, 1992.
[Online]. Available: http://www.jstor.org/stable/2951567
[14] I. Erev and A. E. Roth, “Predicting How People Play Games:
Reinforcement Learning in Experimental Games with Unique,
Mixed Strategy Equilibria,” American Economic Review, vol. 88,
no. 4, pp. 848–881, September 1998. [Online]. Available:
https://ideas.repec.org/a/aea/aecrev/v88y1998i4p848-81.html
[15] A. Tversky and D. Kahneman, "The framing of decisions and the psychology of choice," Science, vol. 211, no. 4481, pp. 453–458, 1981. [Online]. Available: http://science.sciencemag.org/content/211/4481/453
[16] J. M. Oesch and A. D. Galinsky, “First Offers in Negotiations: Deter-
minants and Effects,” in 16th Annu. IACM Conf., Melbourne, Australia,
2003.
[17] T. Baarslag, K. Fujita, E. H. Gerding, K. Hindriks, T. Ito, N. R.
Jennings, C. Jonker, S. Kraus, R. Lin, V. Robu, and C. R. Williams,
“Evaluating practical negotiating agents: Results and analysis of the
2011 international competition,” Artificial Intelligence, 2013.
[18] D. Danks and A. J. London, "Algorithmic bias in autonomous systems," in Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, IJCAI-17, 2017, pp. 4691–4697.
[19] J. Mell and J. Gratch, "IAGO: Interactive Arbitration Guide Online (Demonstration)," in Proceedings of AAMAS 2016, 2016, pp. 1510–1512.
[20] E. B. Hyder, M. J. Prietula, and L. R. Weingart, “Getting to best:
Efficiency versus optimality in negotiation,” Cognitive Science, vol. 24,
no. 2, pp. 169–204, 2000.
[21] R. J. Hyndman and A. B. Koehler, “Another look at measures of forecast
accuracy,” Int. J. Forecast., vol. 22, no. 4, pp. 679–688, 2006.