Uncertainty-Aware End-to-End Prediction for Robust Decision Making
ABOUT THE PROJECT
At a glance
The standard approach to perception in autonomous driving is to detect objects of interest, in either 2D or 3D, form a map or internal model of the world, and then use this map or model for planning. Virtually all standard approaches use some variant of this sense-plan-act pipeline. The effectiveness of this pipeline is, however, limited by two key factors: (1) the particular representation that serves as the output of the perception system must be adequate for making the required decisions, and (2) the perception system must be able to infer this representation from observations (e.g., camera images or LIDAR point clouds) with sufficient confidence and accuracy to make correct and safe decisions. Unfortunately, most current systems come up short in both regards. (1) Abstract hand-designed vision representations -- such as bounding box detections of other vehicles -- can be inadequate for making decisions that require inferring intent (e.g., inferring whether another vehicle will change lanes by observing the rotation of its front tire, something many novice drivers are taught to do in driving school). (2) The error functions used to train these systems account for neither the importance nor the confidence of the inferences: detecting a car that is on a collision course is far more crucial than detecting a car on the other side of a center divider, and outputting meaningful confidence intervals along with the predictions can be critical for making uncertainty-aware decisions in unfamiliar situations. Handling unfamiliar situations is in turn a crucial component of dealing with the “long tail” of unexpected events, which currently stands as one of the biggest obstacles to widespread deployment of learning-based perception for autonomous systems such as cars.
In this project, we propose to investigate algorithms that use end-to-end prediction to predict task-relevant events directly from raw sensory observations, and to study how Bayesian neural network models can endow these predictions with uncertainty estimates for intelligent exploration and safe decision-making in unfamiliar situations.
Current State of the Art. Prior work in this area has explored control of robots and autonomous vehicles either by using a pipelined approach, where the output of a dedicated perception system (e.g., bounding boxes) is fed into a control system, or by using a fully end-to-end approach, where a policy directly learns to map observations into actions. Both methods have severe shortcomings: the pipelined approach divorces the goals of perception from the needs of control, while the fully end-to-end approach provides little flexibility at test time. We will explore how to get the best of both: a perception system that detects task-relevant events, and a control system that can flexibly accommodate new goals at test time while being aware of its own uncertainty in new and unfamiliar situations.
Proposed Innovation. The proposed research will focus on developing decision-making systems that predict task-relevant events directly from raw sensory observations using deep convolutional neural networks, rather than relying on intermediate hand-labeled representations such as bounding boxes or 3D locations. Relevant events in a driving scenario might include a collision, a violation of the rules of the road, or success in meeting a navigational objective (e.g., successfully making a desired turn at an intersection). Relevant events in robotic manipulation might include dropping or picking up an object, or placing an object in a particular pose. This approach is aimed at addressing problem (1) -- the shortcomings of hand-designed representations such as bounding boxes -- while still allowing the predicted events to be used within a flexible decision-making system that integrates multiple sources of information, as opposed to an opaque, fully end-to-end system that outputs actions directly. It will also address problem (2): because the chosen events are directly relevant to the decisions in question, training a model to predict them will focus its attention on the most salient observations. We will further study how Bayesian neural networks can provide meaningful uncertainty estimates for these events, so as to enable intelligent responses even in novel situations, which is critical for safe operation of autonomous robots and vehicles.
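As a rough illustration of what an uncertainty estimate for a predicted event could look like, the sketch below treats disagreement among an ensemble of predictors as a proxy for uncertainty. This is one common approximation to Bayesian inference and is an assumption here, not necessarily the method this project will adopt. The function names, the example probabilities, and the `spread_threshold` parameter are all hypothetical.

```python
from statistics import mean, pstdev


def summarize_ensemble(member_probs):
    """Return (mean, spread) of one event probability across ensemble members."""
    return mean(member_probs), pstdev(member_probs)


def looks_unfamiliar(member_probs, spread_threshold=0.1):
    """Flag an input as unfamiliar when the ensemble members disagree strongly.

    spread_threshold is a hypothetical tuning knob, not a value from the project.
    """
    _, spread = summarize_ensemble(member_probs)
    return spread > spread_threshold


# Made-up collision probabilities from four hypothetical ensemble members.
familiar_scene = [0.10, 0.12, 0.09, 0.11]    # members roughly agree
unfamiliar_scene = [0.05, 0.40, 0.02, 0.35]  # members disagree

for name, probs in [("familiar", familiar_scene), ("unfamiliar", unfamiliar_scene)]:
    mu, spread = summarize_ensemble(probs)
    print(f"{name}: mean={mu:.3f} spread={spread:.3f} "
          f"unfamiliar={looks_unfamiliar(probs)}")
```

In this toy setting the mean alone would not separate the two scenes clearly, whereas the spread across members does, which is the kind of signal a downstream decision rule could use to act more cautiously.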
Project Details. The specific plan will leverage the following insights: (i) it is often much easier to build a perception system that can detect an event once it has occurred (e.g., a collision, running a stop sign, or changing a lane) than it is to build one that can predict whether an event will occur in the future (e.g., if I don’t slow down now, I will run a stop sign); (ii) methods derived from reinforcement learning can “back up” these events to construct predictors of whether the event will occur; (iii) predictors for a rich repertoire of events can be dynamically recombined at test time to plan actions that will bring about a chosen combination of events (e.g., change lanes, while avoiding collision, while also maintaining a target speed); (iv) if all of the event predictors are trained with calibrated uncertainty estimates, they can detect unfamiliar situations and act conservatively (e.g., avoid collision with high probability, even if it means failing to match a target speed or achieve a navigational goal). To that end, the goal of this research will be to develop fundamental algorithmic tools for using reinforcement learning to learn repertoires of event predictors, where the events themselves are detected in the training data, in hindsight, by conventional computer vision systems, and the predictors have calibrated estimates of uncertainty obtained with Bayesian neural networks. In essence, the events act as “reward functions” for an ensemble of value functions. At test time, these predictors can be flexibly recombined to achieve new goals, such as driving at a particular speed in a particular lane. Our preliminary research in this direction has already been effective at producing collision avoidance policies that are uncertainty-aware, as well as generalized computation graph frameworks that enable reinforcement learning to predict future collision events.
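Insights (i) and (ii) can be illustrated with a small sketch: given per-timestep event flags produced in hindsight for a recorded trajectory, a backward pass turns them into training targets for a predictor of whether the event will occur in the future. The discounted recursion below is one simple choice assumed for illustration, and `backed_up_targets` and `discount` are hypothetical names. The backup actually used in the project may differ.

```python
def backed_up_targets(event_flags, discount=0.9):
    """Turn hindsight event flags into "event will occur" training targets.

    event_flags[t] is True if the event was detected at step t of a recorded
    trajectory. The target at step t is 1.0 if the event happens at step t + 1,
    and otherwise the discounted target of step t + 1. The last step gets 0.0
    because nothing is known beyond the end of the recording.
    """
    targets = [0.0] * len(event_flags)
    value_from_previous_step = 0.0
    for t in reversed(range(len(event_flags))):
        targets[t] = value_from_previous_step
        # What step t - 1 should predict: an event at step t counts fully,
        # otherwise the prediction is the discounted value carried back.
        value_from_previous_step = (
            1.0 if event_flags[t] else discount * value_from_previous_step
        )
    return targets


# A recorded trajectory in which a collision is detected at the final step.
flags = [False, False, False, True]
print(backed_up_targets(flags, discount=0.5))  # [0.25, 0.5, 1.0, 0.0]
```

The targets grow as the event approaches, so a predictor trained on them learns to anticipate the event from earlier observations rather than merely detect it after the fact.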
This project will aim to extend these ideas to work with a broad repertoire of events that can be detected in hindsight via computer vision methods, and for which a repertoire of predictors can enable flexible execution of user-specified objectives at test time. Research will be conducted using our small-scale car platform (shown on the right), which is built and developed from scratch in our group. Code for general computation graph predictors and uncertainty-aware prediction will be submitted to the BDD repository, along with simple demonstrations in simulation for rapid prototyping.
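To make the idea of recombining event predictors at test time more concrete, the following sketch scores candidate actions with a user-specified weighting over events and picks the cheapest one. It assumes each predictor returns a mean probability and an uncertainty (standard deviation) per action. The pessimistic scoring rule, the `risk_weight` parameter, the event names, and all numbers are hypothetical and are not taken from the project's code.

```python
def choose_action(candidates, weights, risk_weight=2.0):
    """Pick the action with the lowest uncertainty-aware cost.

    candidates: {action: {event: (mean_probability, std)}}
    weights:    {event: weight}; positive for events to avoid,
                negative for events the user wants to bring about.
    risk_weight: hypothetical knob; larger values act more conservatively.
    """
    def cost(predictions):
        total = 0.0
        for event, weight in weights.items():
            mu, sigma = predictions[event]
            if weight >= 0:
                # Event to avoid: assume it is more likely than the mean says.
                prob = min(1.0, mu + risk_weight * sigma)
            else:
                # Desired event: assume it is less likely than the mean says.
                prob = max(0.0, mu - risk_weight * sigma)
            total += weight * prob
        return total

    return min(candidates, key=lambda action: cost(candidates[action]))


# Made-up predictor outputs for two candidate actions.
candidates = {
    "keep_speed": {"collision": (0.05, 0.20), "on_target_speed": (0.90, 0.05)},
    "slow_down": {"collision": (0.02, 0.01), "on_target_speed": (0.30, 0.05)},
}
# Objective chosen at test time: avoid collisions, prefer the target speed.
weights = {"collision": 10.0, "on_target_speed": -1.0}

print(choose_action(candidates, weights, risk_weight=0.0))  # keep_speed
print(choose_action(candidates, weights, risk_weight=2.0))  # slow_down
```

Changing `weights` changes the objective without retraining any predictor, and raising `risk_weight` trades progress toward the goal for safety when the collision predictor is uncertain.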
- Gregory Kahn, Adam Villaflor, Vitchyr Pong, Pieter Abbeel, Sergey Levine. Uncertainty-Aware Reinforcement Learning for Collision Avoidance. arXiv. 2017.
- Gregory Kahn, Adam Villaflor, Bosen Ding, Pieter Abbeel, Sergey Levine. Self-Supervised Deep Reinforcement Learning with Generalized Computation Graphs for Robot Navigation. arXiv. 2017.
- Mariusz Bojarski et al. End to End Learning for Self-Driving Cars. arXiv. 2016.
deep reinforcement learning, uncertainty, robustness