Hessian Aware Neural Cleansing: Searching for Optimal Trade-offs between Adversarial Robustness, Accuracy, and Speed


At a glance

One of the ongoing challenges with neural networks (NNs) is their vulnerability to adversarial attacks. Despite significant efforts to develop defense strategies, this challenge remains unresolved, and all of the proposed defenses have been broken by very simple attacks. However, in the past year, several important works have provided intriguing new insights into the sources of this vulnerability. These sources fall into two categories: non-robust features in the training data, and non-robust components of the NN model itself. In this proposal, we aim to build upon these works and develop a Neural Cleansing framework for designing robust and accurate models. To achieve this, we will first investigate the "boundary thickness" of the decision boundary and its correlation with NN architecture, as well as with different types of non-robust training data. Based on this, we will design a novel Hessian Aware Neural Cleansing approach to systematically prune and replace non-robust components of the NN model. This will be combined with a distillation method, along with adversarial training and data augmentation, to reduce the impact of non-robust features in the training dataset. The result will be a complete framework that detects non-robust features in the data and the model and provides interpretable information about the different sources of vulnerability in each. Finally, we will integrate Neural Cleansing with Neural Architecture Search so that the model architecture can be adapted to find the optimal trade-offs between robustness and accuracy.

Principal investigators, researchers, themes

Kurt Keutzer

Michael Mahoney

Amir Gholami

Yaoqing Yang

Adversarial Robustness, Neural Architecture Search, Efficient Deep Learning

This project builds on the work of the project: Systematic Study of Neural Network Robustness Against Adversarial Attacks