Figure 1 - available via license: Creative Commons Attribution-ShareAlike 4.0 International
Source publication
Efficient exploration in complex environments remains a major challenge for reinforcement learning (RL). In contrast to previous Thompson sampling-inspired mechanisms that enable temporally extended exploration, i.e., deep exploration, we focus on deep exploration in distributional RL. We develop here a general-purpose approach, Bag of Policies (BoP),...
Similar publications
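Hideyoshi Yanagisawa, "Emotion Dynamics and the Exploration Cycle (Mathematical Principles of Interest and Curiosity)," 設計工学 (Journal of the Japan Society for Design Engineering), Vol. 58, No. 11, 2023.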
As artificial intelligence (AI) plays a more prominent role in our everyday lives, it becomes increasingly important to introduce basic AI concepts to K-12 students. To help do this, we combined physical robots and augmented reality (AR) software to help students learn some of the fundamental concepts of reinforcement learning (RL). We chose RL...
Both entropy-minimizing and entropy-maximizing (curiosity) objectives for unsupervised reinforcement learning (RL) have been shown to be effective in different environments, depending on the environment's level of natural entropy. However, neither method alone results in an agent that will consistently learn intelligent behavior across environments...
This paper shows how a tool that explores future possibilities, ReadySet- Future_, helped a major automotive maker understand how shifts in consumer values may impact the features and use cases that surprise and delight future vehicle consumers in the year 2033. It contrasts two common mindsets when thinking about the future: a there is no alternati...