The effects of rewards on the ability of an autonomous UAV controlled by a Reinforcement Learning agent to accomplish a target localization task were investigated. It was shown that, as the reward obtained by the learning agent upon correct detection increases, systems become more risk-tolerant and efficient, and tend to locate targets faster with an increase in the sensor
sensitivity after systems achieve steady-state performance.