Leveraging Domain Knowledge to Learn Normative Behavior: A Bayesian Approach.
ABSTRACT This paper addresses the problem of norm adaptation using Bayesian reinforcement learning. We are concerned with how effective prior domain knowledge is when an agent faces environments with different settings, and with how quickly the agent adapts to a new environment. Individuals develop their normative framework through interaction with their surrounding environment (including other individuals). An agent acquires domain-dependent knowledge in one environment and later reuses it in different settings. This work is novel in that it represents normative behaviors as probabilities over belief sets. We propose a two-level learning framework that learns the values of normative actions and, once the agent is confident about them, feeds them back into its belief set as prior knowledge. Developing a prior belief set about a domain can improve an agent's learning process as it adjusts its norms to a new environment's dynamics. Our evaluation shows that a normative agent, having been trained in an initial environment, is able to adjust its beliefs about the dynamics and behavioral norms in a new environment. As a result, it converges to the optimal policy more quickly, especially in the early stages of learning.
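The idea of reusing a learned belief as a prior only once the agent is confident in it can be illustrated with a small sketch. This is not the paper's two-level framework: the Beta belief over whether a normative action pays off, the variance threshold `max_variance`, and the down-weighting factor `weight` are all assumptions introduced here for illustration.

```python
from dataclasses import dataclass


@dataclass
class BetaBelief:
    """Beta(alpha, beta) belief that a normative action is rewarded (hypothetical model)."""
    alpha: float = 1.0
    beta: float = 1.0

    def update(self, success: bool) -> None:
        # Standard conjugate update: one pseudo-count per observation.
        if success:
            self.alpha += 1.0
        else:
            self.beta += 1.0

    @property
    def mean(self) -> float:
        return self.alpha / (self.alpha + self.beta)

    @property
    def variance(self) -> float:
        n = self.alpha + self.beta
        return self.alpha * self.beta / (n * n * (n + 1.0))


def transfer_prior(source: BetaBelief,
                   max_variance: float = 0.01,
                   weight: float = 0.5) -> BetaBelief:
    """Carry a belief into a new environment only if the agent is confident in it.

    Both the confidence test (variance threshold) and the down-weighting of
    the old pseudo-counts are illustrative assumptions.
    """
    if source.variance > max_variance:
        return BetaBelief()  # not confident: fall back to a uniform prior
    return BetaBelief(
        alpha=1.0 + weight * (source.alpha - 1.0),
        beta=1.0 + weight * (source.beta - 1.0),
    )


# Belief learned in the initial environment: 18 rewarded, 2 unrewarded trials.
trained = BetaBelief()
for outcome in [True] * 18 + [False] * 2:
    trained.update(outcome)

informed = transfer_prior(trained)   # prior for the new environment
uninformed = BetaBelief()            # agent with no prior domain knowledge
```

With few observations in the new environment, the informed agent's estimate stays close to what it learned before, while down-weighting the old pseudo-counts leaves room for new evidence to override the prior if the norm no longer holds.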
Conference Paper: Coalitional Bargaining with Agent Type Uncertainty.
ABSTRACT: Coalition formation is a problem of great interest in AI, allowing groups of autonomous, individually rational agents to form stable teams. Automating the negotiations underlying coalition formation is, naturally, of special concern. However, research to date in both AI and economics has largely ignored the potential presence of uncertainty in coalitional bargaining. We present a model of discounted coalitional bargaining where agents are uncertain about the types (or capabilities) of potential partners, and hence the value of a coalition. We cast the problem as a Bayesian game in extensive form, and describe its Perfect Bayesian Equilibria as the solutions to a polynomial program. We then present a heuristic algorithm using iterative coalition formation to approximate the optimal solution, and evaluate its performance.
IJCAI 2007, Proceedings of the 20th International Joint Conference on Artificial Intelligence, Hyderabad, India, January 6-12, 2007; 01/2007
Conference Paper: Emergence of Norms through Social Learning.
ABSTRACT: Behavioral norms are key ingredients that allow agent coordination where societal laws do not sufficiently constrain agent behaviors. Whereas social laws need to be enforced in a top-down manner, norms evolve in a bottom-up manner and are typically more self-enforcing. While effective norms can significantly enhance performance of individual agents and agent societies, there has been little work in multiagent systems on the formation of social norms. We propose a model that supports the emergence of social norms via learning from interaction experiences. In our model, individual agents repeatedly interact with other agents in the society over instances of a given scenario. Each interaction is framed as a stage game. An agent learns its policy to play the game over repeated interactions with multiple agents. We term this mode of learning social learning, which is distinct from an agent learning from repeated interactions against the same player. We are particularly interested in situations where multiple action combinations yield the same optimal payoff. The key research question is to find out if the entire population learns to converge to a consistent norm. In addition to studying such emergence of social norms among homogeneous learners via social learning, we study the effects of heterogeneous learners, population size, multiple social groups, etc.
IJCAI 2007, Proceedings of the 20th International Joint Conference on Artificial Intelligence, Hyderabad, India, January 6-12, 2007; 01/2007
ABSTRACT: The reinforcement learning problem can be decomposed into two parallel types of inference: (i) estimating the parameters of a model for the underlying process; (ii) determining behavior which maximizes return under the estimated model. Following Dearden, Friedman and Andre (1999), it is proposed that the learning process estimates online the full posterior distribution over models. To determine behavior, a hypothesis is sampled from this distribution and the greedy policy with respect to the hypothesis is obtained by dynamic programming. By using a different hypothesis for each trial, appropriate exploratory and exploitative behavior is obtained. This Bayesian method always converges to the optimal policy for a stationary process with discrete states.
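The loop described in this abstract (sample one model from the posterior per trial, solve it by dynamic programming, act greedily, update the posterior) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes a small tabular MDP, a known reward table `R`, a uniform Dirichlet prior over transitions, and value iteration as the dynamic-programming step. All function and variable names are hypothetical.

```python
import numpy as np


def value_iteration(P, R, gamma=0.95, tol=1e-8, max_iter=10_000):
    """Greedy policy for a given model. P: (S, A, S) transitions, R: (S, A) rewards."""
    V = np.zeros(P.shape[0])
    for _ in range(max_iter):
        Q = R + gamma * (P @ V)          # Bellman backup, shape (S, A)
        V_new = Q.max(axis=1)
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break
    Q = R + gamma * (P @ V)
    return Q.argmax(axis=1), V


def sample_model(counts, rng):
    """Draw one transition model from the Dirichlet posterior given counts (S, A, S)."""
    S, A, _ = counts.shape
    P = np.empty((S, A, S))
    for s in range(S):
        for a in range(A):
            P[s, a] = rng.dirichlet(counts[s, a])
    return P


def posterior_sampling(true_P, R, n_trials=200, horizon=20, gamma=0.95, seed=0):
    """One sampled hypothesis per trial; act greedily with respect to it."""
    rng = np.random.default_rng(seed)
    S, A, _ = true_P.shape
    counts = np.ones((S, A, S))          # uniform Dirichlet prior (assumption)
    policy = np.zeros(S, dtype=int)
    for _ in range(n_trials):
        P_hat = sample_model(counts, rng)            # hypothesis for this trial
        policy, _ = value_iteration(P_hat, R, gamma)
        s = 0                                        # each trial starts in state 0
        for _ in range(horizon):
            a = policy[s]
            s_next = rng.choice(S, p=true_P[s, a])   # environment step
            counts[s, a, s_next] += 1.0              # posterior update
            s = s_next
    return policy, counts


# Toy two-state problem (illustrative): being in state 1 is rewarded.
true_P = np.array([[[0.9, 0.1], [0.1, 0.9]],
                   [[0.1, 0.9], [0.9, 0.1]]])
R = np.array([[0.0, 0.0],
              [1.0, 1.0]])
policy, counts = posterior_sampling(true_P, R)
```

Because a fresh hypothesis is drawn for every trial, state-action pairs with few observations have wide posteriors and are occasionally sampled as attractive, which drives exploration; as counts accumulate, the samples concentrate and behavior becomes exploitative.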