Vision-Guided Outdoor Obstacle Evasion for UAVs via Reinforcement Learning
Shiladitya Dutta
EECS Department, University of California, Berkeley
Technical Report No. UCB/EECS-2025-132
May 23, 2025
http://www2.eecs.berkeley.edu/Pubs/TechRpts/2025/EECS-2025-132.pdf
Although quadcopters boast impressive traversal capabilities enabled by their omnidirectional maneuverability, the need for continuous navigation in complex environments impedes their application in GNSS- and telemetry-denied scenarios. To this end, we propose a novel sensorimotor policy that uses stereo-vision depth and visual-inertial odometry (VIO) to autonomously navigate through obstacles in an unknown environment to reach a goal point. The policy consists of a pre-trained autoencoder as the perception head, followed by a planning-and-control LSTM network that outputs velocity commands executable by an off-the-shelf commercial drone. We employ reinforcement learning and privileged learning paradigms to train the policy in simulation via Flightmare using a simplified environment model. Training follows a two-stage process: (1) supervised initialization using optimal trajectories generated by a global motion planner, and (2) curriculum-based fine-tuning to improve policy robustness and generalization. To bridge the sim-to-real gap, we apply domain randomization and reward shaping to create a policy that is robust to both noise and domain shift. In real outdoor experiments, our approach achieves zero-shot transfer both to a drone platform (DJI M300) and to environments with obstacles never encountered during training.
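The report itself does not reproduce source code; the following PyTorch sketch is only a minimal illustration of the policy layout described in the abstract, in which a pre-trained depth autoencoder produces a latent code that, together with VIO state and the goal direction, drives an LSTM emitting velocity commands. The module names, latent dimension, state layout, and four-component command (vx, vy, vz, yaw rate) are assumptions made for illustration, not the author's implementation.

    # Illustrative sketch only: layer sizes and input/output layout are assumptions.
    import torch
    import torch.nn as nn

    class DepthEncoder(nn.Module):
        """Stand-in for the pre-trained stereo-depth autoencoder's encoder half."""
        def __init__(self, latent_dim: int = 64):
            super().__init__()
            self.conv = nn.Sequential(
                nn.Conv2d(1, 16, kernel_size=5, stride=2), nn.ReLU(),
                nn.Conv2d(16, 32, kernel_size=5, stride=2), nn.ReLU(),
                nn.Conv2d(32, 64, kernel_size=3, stride=2), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            )
            self.fc = nn.Linear(64, latent_dim)

        def forward(self, depth):  # depth: (B, 1, H, W) depth image
            return self.fc(self.conv(depth))

    class NavigationPolicy(nn.Module):
        """Depth latent + VIO state + goal direction -> LSTM -> velocity command."""
        def __init__(self, latent_dim: int = 64, state_dim: int = 9, hidden_dim: int = 128):
            super().__init__()
            self.encoder = DepthEncoder(latent_dim)  # pre-trained, typically frozen
            self.lstm = nn.LSTM(latent_dim + state_dim, hidden_dim, batch_first=True)
            self.head = nn.Linear(hidden_dim, 4)     # vx, vy, vz, yaw rate

        def forward(self, depth_seq, state_seq, hidden=None):
            # depth_seq: (B, T, 1, H, W); state_seq: (B, T, state_dim) VIO odometry + goal vector
            B, T = depth_seq.shape[:2]
            z = self.encoder(depth_seq.flatten(0, 1)).view(B, T, -1)
            out, hidden = self.lstm(torch.cat([z, state_seq], dim=-1), hidden)
            return self.head(out), hidden            # velocity command per time step

Under the report's two-stage scheme, a network of this shape would first be trained to imitate trajectories from a global motion planner and then fine-tuned with curriculum-based reinforcement learning.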
Advisor: Avideh Zakhor
BibTeX citation:
@mastersthesis{Dutta:EECS-2025-132,
    Author = {Dutta, Shiladitya},
    Title = {Vision-Guided Outdoor Obstacle Evasion for UAVs via Reinforcement Learning},
    School = {EECS Department, University of California, Berkeley},
    Year = {2025},
    Month = {May},
    Url = {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2025/EECS-2025-132.html},
    Number = {UCB/EECS-2025-132},
    Abstract = {Although quadcopters boast impressive traversal capabilities enabled by their omnidirectional maneuverability, the need for continuous navigation in complex environments impedes their application in GNSS- and telemetry-denied scenarios. To this end, we propose a novel sensorimotor policy that uses stereo-vision depth and visual-inertial odometry (VIO) to autonomously navigate through obstacles in an unknown environment to reach a goal point. The policy consists of a pre-trained autoencoder as the perception head, followed by a planning-and-control LSTM network that outputs velocity commands executable by an off-the-shelf commercial drone. We employ reinforcement learning and privileged learning paradigms to train the policy in simulation via Flightmare using a simplified environment model. Training follows a two-stage process: (1) supervised initialization using optimal trajectories generated by a global motion planner, and (2) curriculum-based fine-tuning to improve policy robustness and generalization. To bridge the sim-to-real gap, we apply domain randomization and reward shaping to create a policy that is robust to both noise and domain shift. In real outdoor experiments, our approach achieves zero-shot transfer both to a drone platform (DJI M300) and to environments with obstacles never encountered during training.}
}
EndNote citation:
%0 Thesis
%A Dutta, Shiladitya
%T Vision-Guided Outdoor Obstacle Evasion for UAVs via Reinforcement Learning
%I EECS Department, University of California, Berkeley
%D 2025
%8 May 23
%@ UCB/EECS-2025-132
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2025/EECS-2025-132.html
%F Dutta:EECS-2025-132