Approximating Optimal Policies for Agents with Limited Execution Resources

Source: CiteSeer

ABSTRACT: An agent with limited consumable execution resources needs policies that attempt to achieve good performance while respecting these limitations.

  • ABSTRACT: Despite significant recent advances in decision-theoretic frameworks for reasoning about multiagent teams, little attention has been paid to applying such frameworks in adversarial domains, where the agent team may face security threats from other agents. This paper focuses on domains where such threats are caused by unseen adversaries whose actions or payoffs are unknown. In such domains, action randomization is recognized as a key technique for degrading an adversary's capability to predict and exploit an agent's or agent team's actions. Unfortunately, there are two key challenges in such randomization. First, randomization can reduce the expected reward (quality) of the agent team's plans, and thus we must provide some guarantees on such rewards. Second, communication within an agent team can help alleviate the miscoordination that arises from randomization, but communication is a scarce resource in most real domains. To address these challenges, this paper provides the following contributions. First, we recall the Multiagent Constrained MDP (MCMDP) framework, which enables policy generation for a team of agents where each agent may have a limited (communication) resource. Second, since randomized policies generated directly for MCMDPs lead to miscoordination, we introduce a transformation algorithm that converts the MCMDP into a transformed MCMDP incorporating explicit communication actions. Third, we develop a non-linear program with non-convex constraints for the transformed MCMDP that randomizes team policy while attaining a threshold reward without violating the communication constraints. Finally, we experimentally illustrate the benefits of our work.
  • International Journal of Computational Intelligence Research. 01/2007; 3(1).

Available from
Jun 4, 2014