March 2024
Proceedings - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing
Causal confusion, characterized by the learning of spurious correlations, detrimentally affects the generalization and effectiveness of reinforcement learning (RL) algorithms, especially in environments with latent confounders, which are often encountered in robot autonomous navigation tasks. This study addresses this problem by developing a causal structure within a Partially Observable Markov Decision Process (POMDP). Subsequently, we introduce a targeted intervention that mitigates the influence of spurious correlations by isolating causally significant state variables and discarding irrelevant inputs. Testing in three real-world scenarios confirms the approach's feasibility and superiority in enhancing the RL algorithms' performance and generalization ability, signifying a promising step towards more robust online RL frameworks.