FPGA PRET Accelerators of Deep Learning Classifiers for Autonomous Vehicles
About This Project
At a glance
This project focuses on predictable, repeatable, and transparent (PRET) computation infrastructure for the efficient implementation of safety-critical cyber-physical systems (CPS), such as control systems in autonomous vehicles. Deep Learning (DL) is a promising approach to sensing in this space. However, it is unclear how DL-based sensor implementations would influence the control structures, because standard implementations largely ignore the timing and energy constraints present in embedded settings. This project will investigate how to leverage previously developed PRET technology to build timing-predictable, energy-optimized classifier implementations.
In recent work, the team focused on PRET challenges in numerical computation for high-performance control system design. A parameterized processor template based on a Very Long Instruction Word (VLIW) architecture was developed (shown below), along with a set of highly transparent programming tools. The architecture is characterized by complete predictability and repeatability of computation timing and memory access, facilitating the design and verification of control systems.
Over the course of this project, it has become clear that memory access dominates the energy and time cost of computation in DL inference. Hence, the team is exploring methods for reducing memory traffic between the compute core and coefficient and tensor storage. One of the most popular techniques involves training models with sparse weight matrices in fully connected layers; it has been reported to reduce the number of model coefficients by 10x or more without significant loss in inference accuracy. The team is looking into sparsification strategies and inference implementation techniques that can exploit such sparse coefficient matrices efficiently.
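To illustrate why sparse coefficient matrices reduce memory traffic, the following is a minimal sketch (not the project's actual toolchain) of a fully connected layer evaluated through a compressed sparse row (CSR) representation: only the nonzero coefficients and their column indices are stored and fetched, so a 10x-sparsified layer moves roughly 10x fewer weights per inference. All function and variable names here are illustrative assumptions.

```python
def csr_from_dense(w, tol=0.0):
    """Convert a dense weight matrix (list of rows) to CSR arrays.

    Only coefficients with magnitude above `tol` are stored, which is
    where the memory-traffic savings of a sparsified layer come from.
    """
    values, col_idx, row_ptr = [], [], [0]
    for row in w:
        for j, v in enumerate(row):
            if abs(v) > tol:  # keep only nonzero coefficients
                values.append(v)
                col_idx.append(j)
        row_ptr.append(len(values))  # end of this row's nonzeros
    return values, col_idx, row_ptr


def csr_matvec(values, col_idx, row_ptr, x):
    """Compute y = W @ x, touching only the stored nonzeros."""
    y = []
    for r in range(len(row_ptr) - 1):
        acc = 0.0
        for k in range(row_ptr[r], row_ptr[r + 1]):
            acc += values[k] * x[col_idx[k]]
        y.append(acc)
    return y


# Example: this 3x4 layer has 2 nonzeros, so CSR stores 2 coefficients
# instead of 12.
W = [[0.0, 2.0, 0.0, 0.0],
     [0.0, 0.0, 0.0, 0.0],
     [0.0, 0.0, -1.5, 0.0]]
vals, cols, ptrs = csr_from_dense(W)
y = csr_matvec(vals, cols, ptrs, [1.0, 1.0, 2.0, 1.0])
# y == [2.0, 0.0, -3.0], matching the dense product
```

A hardware implementation faces an additional irregular-access cost (the `x[col_idx[k]]` gather), which is one reason the inference-side implementation techniques mentioned above matter as much as the sparsification strategy itself.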
Team: Ranko Sredojevic, Lazar Supic and Rawan Naous
Topic: Deep Learning