Automated Search for Neural Net Architectures


At a glance

Designing neural networks (NNs) is a critical element of many deep learning-based algorithms. In particular, automotive applications on embedded processors demand especially efficient NN architectures for tasks such as classification, object detection, and semantic segmentation. A better NN design can lead to better performance and efficiency in many computer vision tasks. Until recently, NNs were designed mainly through manual exploration that largely ignored the constraints of the target hardware. This process is inefficient, does not scale, and limits the performance of the resulting networks.

Recently, neural architecture search (NAS) has attracted growing research attention. NAS aims to automatically design neural networks for optimal performance and efficiency. Many works show that automatically discovered networks achieve higher accuracy and efficiency than their hand-designed counterparts.

However, despite recent progress, several significant limitations of NAS need to be addressed:

1) High computational cost: many NAS algorithms sample a large number of candidate architectures, each of which must be trained, which in turn incurs an enormous computational cost. As a typical example, [Zoph2016] requires 450 GPUs running for 4-5 days to complete the search.

2) Limited to classification tasks: most previous NAS algorithms focus on image classification networks with a simple, single-stage training pipeline. Autonomous driving, however, cannot rely on classification alone; it also depends heavily on object detection, semantic segmentation, and other tasks. For problems such as object detection, the training pipeline contains several stages, so most NAS algorithms cannot be applied directly.

3) Hardware constraints not integrated: many NAS algorithms focus solely on improving accuracy. Some recent NAS works also target network efficiency, but efficiency is usually measured by hardware-independent proxies such as FLOPs or parameter count rather than actual performance on the target hardware.
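To illustrate the third limitation, a hardware-aware search can rank candidates by accuracy penalized by latency measured on the target device, rather than by a proxy such as FLOPs. The sketch below is a minimal, hypothetical scoring function; the function name, the multiplicative-penalty form, and all numbers are illustrative assumptions, not the project's actual algorithm.

```python
# Hypothetical sketch of a hardware-aware NAS objective: candidates are
# scored by accuracy, discounted by how far their measured on-device
# latency exceeds the latency budget of the target hardware.
# All names and numbers are illustrative.

def latency_aware_score(accuracy, measured_latency_ms, target_latency_ms, alpha=0.1):
    """Higher accuracy raises the score; latency above the budget of the
    target embedded processor lowers it multiplicatively. alpha controls
    how strongly latency is traded off against accuracy."""
    penalty = (measured_latency_ms / target_latency_ms) ** alpha
    return accuracy / penalty

# Two candidates with similar accuracy but different measured latency
# (made-up numbers): the faster one wins despite slightly lower accuracy.
fast = latency_aware_score(accuracy=0.74, measured_latency_ms=20.0, target_latency_ms=25.0)
slow = latency_aware_score(accuracy=0.75, measured_latency_ms=60.0, target_latency_ms=25.0)
```

Because the penalty uses latency measured on the actual device, the same search automatically adapts to different embedded processors, which a FLOP count cannot do.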

In this project, we aim to address these problems by proposing a fast, general, and hardware-aware NAS algorithm.

Principal investigators: Kurt Keutzer
Researchers: Bichen Wu

Themes: Neural architecture search, efficient deep learning, computer vision