Figure 5

Reinforcement learning with parameterized action space and sparse reward for UAV navigation

Figure 5. Comparison of results for HER-MPDQN and the baselines. The left panel shows the task completion rate, computed periodically over the last 10 episodes; the right panel shows the overall success rate across the entire training process. The shaded area is the variance across multiple experiments. A narrower shaded band indicates that the algorithm is less sensitive to random seeds.
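The two curves described in the caption can be illustrated with a minimal sketch. It assumes each episode is logged as 1 (task completed) or 0 (failed), and that both rates are sampled every `period` episodes. The function name `success_curves` and the parameters `window` and `period` are hypothetical and not taken from the paper, whose exact evaluation procedure may differ.

```python
from typing import List, Sequence, Tuple


def success_curves(
    outcomes: Sequence[int], window: int = 10, period: int = 10
) -> Tuple[List[float], List[float]]:
    """Return (windowed, cumulative) success rates sampled every `period` episodes.

    outcomes: per-episode results, 1 for success and 0 for failure (assumed format).
    window:   number of most recent episodes used for the windowed rate.
    period:   sampling interval in episodes (an assumption, not from the paper).
    """
    windowed: List[float] = []
    cumulative: List[float] = []
    successes = 0
    for i, outcome in enumerate(outcomes, start=1):
        successes += outcome
        if i % period == 0:
            # Rate over the most recent `window` episodes (left-panel style curve).
            recent = outcomes[max(0, i - window):i]
            windowed.append(sum(recent) / len(recent))
            # Rate over all episodes seen so far (right-panel style curve).
            cumulative.append(successes / i)
    return windowed, cumulative


if __name__ == "__main__":
    # Ten failures followed by ten successes.
    print(success_curves([0] * 10 + [1] * 10))
```

The windowed rate reacts quickly to recent progress, while the cumulative rate changes more slowly because it averages over every episode since the start of training.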

Intelligence & Robotics
ISSN 2770-3541 (Online)

Portico

All published articles are preserved here permanently:

https://www.portico.org/publishers/oae/
