Yiqin Yang
  • Tsinghua University

About

21 Publications · 3,657 Reads
162 Citations
Publications (21)
Preprint
Full-text available
Offline reinforcement learning (RL) represents a significant shift in RL research, allowing agents to learn from pre-collected datasets without further interaction with the environment. A key, yet underexplored, challenge in offline RL is selecting an optimal subset of the offline dataset that enhances both algorithm performance and training effici...
Preprint
Full-text available
Exploration in sparse reward environments remains a significant challenge in reinforcement learning, particularly in Contextual Markov Decision Processes (CMDPs), where environments differ across episodes. Existing episodic intrinsic motivation methods for CMDPs primarily rely on count-based approaches, which are ineffective in large state spaces,...
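As context for the count-based approaches this abstract critiques, here is a minimal sketch of an episodic count-based intrinsic reward. The class name and the 1/√N bonus form are illustrative conventions, not details taken from the paper:

```python
import math
from collections import Counter

class EpisodicCountBonus:
    """Intrinsic reward bonus = 1 / sqrt(N(s)), with counts reset each episode."""

    def __init__(self):
        self.counts = Counter()

    def reset(self):
        # Episodic: visitation counts are cleared at the start of each episode.
        self.counts.clear()

    def bonus(self, state):
        # Increment the within-episode count for this state, then reward
        # novelty with a bonus that decays as the state is revisited.
        self.counts[state] += 1
        return 1.0 / math.sqrt(self.counts[state])
```

In large or continuous state spaces, exact counts like these collapse (nearly every state is seen once, so every bonus is ≈1), which is the limitation the abstract points to.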
Preprint
Offline reinforcement learning (RL) is crucial for real-world applications where exploration can be costly or unsafe. However, offline learned policies are often suboptimal, and further online fine-tuning is required. In this paper, we tackle the fundamental dilemma of offline-to-online fine-tuning: if the agent remains pessimistic, it may fail to...
Article
Large action spaces are a fundamental obstacle to deploying Reinforcement Learning methods in the real world. Numerous redundant actions cause agents to make repeated or invalid attempts, even leading to task failure. Although current algorithms have made initial explorations of this issue, they either suffer from rule-based sy...
Article
Among the remarkable successes of Reinforcement Learning (RL), self-play algorithms have played a crucial role in solving competitive games. However, current self-play RL methods commonly optimize the agent to maximize the expected win-rates against its current or historical copies, resulting in a limited strategy style and a tendency to get stuck...
Article
Offline reinforcement learning (RL) enables the agent to effectively learn from logged data, which significantly extends the applicability of RL algorithms in real-world scenarios where exploration can be expensive or unsafe. Previous works have shown that extracting primitive skills from the recurring and temporally extended structures in the logg...
Preprint
Among the great successes of Reinforcement Learning (RL), self-play algorithms play an essential role in solving competitive games. Current self-play algorithms optimize the agent to maximize expected win-rates against its current or historical copies, which often leaves it stuck in a local optimum with a simple, homogeneous strategy style. A possi...
Conference Paper
Goal-conditioned Reinforcement Learning (GcRL) has achieved remarkable success in navigating towards goals in recent years. However, learning efficiency and generalization ability remain challenging issues when dealing with uncertain motion patterns of dynamic objects in the environment. To address these issues, existing model-based GcRL algorithms...
Preprint
Offline reinforcement learning (RL) enables the agent to effectively learn from logged data, which significantly extends the applicability of RL algorithms in real-world scenarios where exploration can be expensive or unsafe. Previous works have shown that extracting primitive skills from the recurring and temporally extended structures in the logg...
Preprint
Offline reinforcement learning (RL) enables effective learning from previously collected data without exploration, which shows great promise in real-world applications when exploration is expensive or even infeasible. The discount factor, $\gamma$, plays a vital role in improving online RL sample efficiency and estimation accuracy, but the role of...
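For reference, $\gamma$ enters RL through the standard discounted return (this is the textbook definition, not a formula from the paper):

```latex
G_t = \sum_{k=0}^{\infty} \gamma^{k}\, r_{t+k+1}, \qquad \gamma \in [0, 1)
```

A smaller $\gamma$ shortens the effective planning horizon, roughly $1/(1-\gamma)$, which is one well-known reason the discount factor affects both sample efficiency and value-estimation accuracy.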
Preprint
Full-text available
Offline reinforcement learning (RL) shows promise of applying RL to real-world problems by effectively utilizing previously collected data. Most existing offline RL algorithms use regularization or constraints to suppress extrapolation error for actions outside the dataset. In this paper, we adopt a different framework, which learns the V-function...
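The abstract does not detail its V-function objective; as an illustrative sketch of one way a value can be learned from data alone, expectile regression (used by IQL-style methods — an assumption here, possibly differing from the paper's actual loss) approximates a maximum over in-dataset values without ever querying out-of-dataset actions:

```python
import numpy as np

def expectile_loss(diff, tau=0.9):
    # Asymmetric squared loss: positive errors weighted by tau,
    # negative errors by (1 - tau). tau -> 1 pushes the fit toward the max.
    weight = np.where(diff > 0, tau, 1.0 - tau)
    return (weight * diff ** 2).mean()

# Toy example: the tau-expectile of a sample set lies between the mean
# (tau = 0.5) and the max (tau -> 1), so it upper-estimates the value
# using only values observed in the dataset.
samples = np.array([0.0, 1.0, 2.0, 10.0])
vs = np.linspace(-5, 15, 2001)           # candidate values, step 0.01
losses = [expectile_loss(samples - v, tau=0.9) for v in vs]
v_star = float(vs[int(np.argmin(losses))])
```

Avoiding the explicit max over actions is what suppresses extrapolation error for actions outside the dataset, which is the motivation the abstract describes.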
Article
Full-text available
Reinforcement Learning (RL) agents are often fed with large-dimensional observations to achieve the ideal performance in complex environments. Unfortunately, the massive observation space usually contains useless or even adverse features, which leads to low sample efficiency. Existing methods rely on domain knowledge and cross-validation to discove...
Preprint
Full-text available
Learning from datasets without interacting with the environment (offline learning) is an essential step toward applying Reinforcement Learning (RL) algorithms in real-world scenarios. However, compared with its single-agent counterpart, offline multi-agent RL introduces more agents with larger state and action spaces, which is more challenging but attract...
Preprint
Full-text available
Value-based methods of multi-agent reinforcement learning (MARL), especially the value decomposition methods, have been demonstrated on a range of challenging cooperative tasks. However, current methods pay little attention to the interaction between agents, which is essential to teamwork in games or real life. This limits the efficiency of value-b...
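For context, the simplest value decomposition is a VDN-style additive sketch (illustrative only; the paper's point is precisely that such methods neglect inter-agent interaction):

```python
import numpy as np

# VDN-style additive decomposition: Q_tot(s, a_1..a_n) = sum_i Q_i(obs_i, a_i).
# Because the sum is monotone in each Q_i, every agent can take its own argmax
# and the resulting joint action also maximizes Q_tot (decentralized execution).
rng = np.random.default_rng(0)
n_agents, n_actions = 3, 4
q_locals = rng.normal(size=(n_agents, n_actions))   # per-agent local utilities

greedy_actions = q_locals.argmax(axis=1)            # independent per-agent argmax
q_tot_greedy = q_locals[np.arange(n_agents), greedy_actions].sum()
```

The additive form buys decentralized greedy action selection, but it cannot represent cases where one agent's best action depends on another's — the interaction structure the abstract argues is essential.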
Article
Full-text available
Writing is a pivotal part of language exams and is considered a useful tool for accurately reflecting students' language competence. As Chinese language tests become popular, manual grading becomes a heavy and expensive task for test organizers. In recent years, there has been a large volume of research on automated English evaluat...
Article
Full-text available
Deep neural networks (DNNs) offer many advantages, and autonomous driving has become a popular research topic. In this paper, an improved stacked autoencoder based on deep learning techniques is proposed to learn the driving characteristics of an autonomous car. These techniques handle input-data adjustment and address the gradient diffusion problem. A Raspb...
Article
Full-text available
The automation level of autonomous marine vehicles is limited: they are typically semi-autonomous and reliant on operator interaction. To improve this, an autonomous collision avoidance method is proposed based on visual techniques, analogous to the human visual system. A deep convolutional neural network (AlexNet), with strong visual processing capability,...
