Deep Reinforcement Learning for Robot Navigation


At a glance

Enabling robots to autonomously navigate complex environments is essential for real-world deployment. Prior methods approach this problem by having the robot maintain an internal map of the world and then use localization and planning methods to navigate through that map. However, these approaches often rely on a variety of assumptions, are computationally intensive, and do not learn from failures. In contrast, learning-based methods improve as the robot acts in the environment, but are difficult to deploy in the real world due to their high sample complexity.

The goal of this project is to investigate how navigation and collision avoidance mechanisms can be learned from scratch, via a continuous, self-supervised learning process using onboard sensors, such as a monocular camera. To address the need to learn complex policies with few samples, we have formulated a generalized computation graph that subsumes value-based model-free methods and model-based methods, with specific instantiations interpolating between the two. We can then instantiate this graph to form a navigation model that learns from raw images and is sample efficient. Our simulated car experiments explore the design decisions of our navigation model, and show that our approach outperforms single-step and N-step double Q-learning. We also evaluate our approach on a real-world RC car and show that it can learn to navigate through a complex indoor environment with a few hours of fully autonomous, self-supervised training. Videos of the experiments and code can be found at
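For readers unfamiliar with the baselines named above, the following is a minimal sketch of the standard N-step double Q-learning target, not the project's computation graph or its code. The function name, argument names, and the plain-list representation of Q-values are assumptions made for illustration. With a single reward the function reduces to single-step double Q-learning; longer reward sequences place more weight on observed rewards and less on the bootstrapped value estimate.

```python
def n_step_double_q_target(rewards, q_online_next, q_target_next,
                           gamma=0.99, done=False):
    """Standard N-step double Q-learning target (illustrative sketch).

    rewards:        the N observed rewards r_t, ..., r_{t+N-1}
    q_online_next:  online-network Q-values at s_{t+N}, one per action
    q_target_next:  target-network Q-values at s_{t+N}, one per action
    gamma:          discount factor
    done:           True if the episode ended within the N steps
                    (e.g. a collision), so there is nothing to bootstrap from
    """
    n = len(rewards)
    # Discounted sum of the N observed rewards.
    discounted = sum((gamma ** k) * r for k, r in enumerate(rewards))
    if done:
        return discounted
    # Double Q-learning: the online network selects the action,
    # the target network evaluates it.
    best_action = max(range(len(q_online_next)),
                      key=lambda a: q_online_next[a])
    return discounted + (gamma ** n) * q_target_next[best_action]


# Hypothetical numbers, chosen only to make the arithmetic easy to follow.
three_step = n_step_double_q_target(
    rewards=[1.0, 0.0, -1.0],
    q_online_next=[0.2, 0.9],
    q_target_next=[4.0, 8.0],
    gamma=0.5,
)
one_step = n_step_double_q_target(
    rewards=[1.0],
    q_online_next=[0.2, 0.9],
    q_target_next=[4.0, 8.0],
    gamma=0.5,
)
```

Here `three_step` combines three observed rewards with a heavily discounted bootstrap term, while `one_step` relies almost entirely on the target network's estimate. That horizon is the knob both baselines expose.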

To continue evaluating and expanding the scope of our learning-based approaches in the real world, we have redesigned the RC car platform around the needs of our reinforcement learning algorithms: robustness, longevity, multiple sensor modalities, and high computational demand. The RC car has a motor to drive the wheels, a servo to control the steering angle, and is built to handle off-road terrain. Laser-cut materials and 3D-printed parts secure the hardware to the chassis, allow for easy access, and protect the hardware from damage. Two large-capacity batteries enable long-term autonomy experiments. The expanded sensor suite consists of an RGB camera, magnetic wheel encoders, an inertial measurement unit, a magnetometer, and GPS. An Arduino Uno reads the sensors, while the upgraded NVIDIA Jetson TX2 offers increased computational power to run our algorithms. With an easy-to-use ROS interface, the RC car is ready to run reinforcement learning algorithms.




principal investigators: Pieter Abbeel
researchers: Gregory Kahn