Hao Zhang’s research while affiliated with East China Normal University and other places


Publications (2)


Enhancing Reinforcement Learning via Causally Correct Input Identification and Targeted Intervention

Fig. 1. We learn causal variables z of pixel values x. The bottom illustrates the consequence of intervention on the causal variables.
  • Article
  • Full-text available

March 2024 · 79 Reads · 3 Citations

Proceedings - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing

Hu Lu · Hao Zhang · [...] · Yue Lu

Causal confusion, characterized by the learning of spurious correlations, detrimentally affects the generalization and effectiveness of reinforcement learning (RL) algorithms, especially in environments without latent confounders, as often encountered in robot autonomous navigation tasks. This study addresses this gap by developing a causal structure within a Partially Observable Markov Decision Process (POMDP). Subsequently, we introduce a targeted intervention that mitigates the influence of spurious correlations by isolating causally significant state variables and discarding irrelevant inputs. Testing in three real-world scenarios confirms the approach's feasibility and superiority in enhancing the RL algorithms' performance and generalization ability, signifying a promising step towards more robust online RL frameworks.
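As a rough illustration of the intervention idea described in the abstract (not the authors' implementation), the sketch below shows one common way to isolate causally significant state variables: masking out the remaining observation dimensions before they reach the policy. The CausalInputFilter class and the hard-coded mask are hypothetical; in the paper's setting, the set of relevant variables would come from the causal structure learned within the POMDP.

# A minimal sketch, assuming a fixed-size observation vector and a
# precomputed boolean mask over its dimensions. Not the authors' code.
import numpy as np

class CausalInputFilter:
    """Discards state dimensions not identified as causal parents of the reward."""

    def __init__(self, causal_mask: np.ndarray):
        # causal_mask[i] is True if dimension i is causally significant.
        self.causal_mask = causal_mask.astype(bool)

    def __call__(self, observation: np.ndarray) -> np.ndarray:
        # Targeted intervention: spuriously correlated dimensions are
        # zeroed out so the policy cannot exploit them.
        return np.where(self.causal_mask, observation, 0.0)

# Usage: suppose dimensions 0 and 2 of a 4-dim observation are causal.
mask = np.array([1, 0, 1, 0])
filter_fn = CausalInputFilter(mask)
obs = np.array([0.5, -1.2, 3.0, 0.7])
print(filter_fn(obs))  # -> [0.5 0.  3.  0. ]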


Enhanced Deep Reinforcement Learning for Parcel Singulation in Non-Stationary Environments

Fig. 1. From left to right, the parcels are densely packaged in the infeed area. After singulation processing, the parcels are separated and spaced with a predefined interval ∆d.
Fig. 6. Normalized reward of learning-based methods (evaluation metrics: pass rate and parcel singulation efficiency).

March 2024 · 108 Reads · 2 Citations

Proceedings - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing

In the rapidly expanding logistics sector, parcel singulation has emerged as a significant bottleneck. To address this, we propose an automated parcel singulator utilizing a sparse actuator array, which presents an optimal balance between cost and efficiency, albeit one requiring a sophisticated control policy. In this study, we frame the parcel singulation problem as a Markov Decision Process with a variable state space dimension, addressed through a deep reinforcement learning (RL) algorithm complemented by a State Space Standardization Module (S3). Distinct from previous RL approaches, our methodology accounts for the non-stationary environment from the problem modeling phase onward. To counter this challenge, the S3 module standardizes the dynamic input state, thereby stabilizing the RL training process. We validate our method through simulation experiments in complex environments, comparing it with several baseline algorithms. Results indicate that our algorithm excels in parcel singulation tasks, achieving a higher success rate and enhanced efficiency.
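To make the standardization idea concrete, here is a minimal sketch (not the paper's S3 module) of one common way to handle a variable state space dimension: padding or truncating per-parcel features to a fixed shape so the policy network always receives the same input size. The standardize_state function, its parameters, and the (x, y, velocity) feature layout are illustrative assumptions.

# A minimal sketch, assuming the state is a variable-length list of
# per-parcel feature vectors. Not the paper's S3 implementation.
import numpy as np

def standardize_state(parcel_features: np.ndarray,
                      max_parcels: int,
                      feature_dim: int) -> np.ndarray:
    """Pads or truncates an (n_parcels, feature_dim) array to a fixed
    (max_parcels, feature_dim) shape, then flattens it for the policy."""
    n = min(len(parcel_features), max_parcels)
    fixed = np.zeros((max_parcels, feature_dim), dtype=np.float32)
    fixed[:n] = parcel_features[:n]  # empty slots remain zero-padded
    return fixed.reshape(-1)

# Usage: 3 parcels with (x, y, velocity) features, standardized to 5 slots.
state = np.random.rand(3, 3).astype(np.float32)
flat = standardize_state(state, max_parcels=5, feature_dim=3)
print(flat.shape)  # (15,) regardless of how many parcels are present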