Anomaly Detection in Autonomous Driving Systems through Multiple Transformer Interpretations
ABOUT THIS PROJECT
At a glance
Autonomous driving has made great progress, yet it has not replaced human drivers as quickly as expected, in part because it faces very challenging anomaly detection problems on the road. Transformers have exhibited impressive capabilities in various autonomous driving applications in recent years by learning intricate nonlinear relationships between features. However, the difficulty of comprehending these relationships has led to transformers being largely regarded as black boxes. As discussed in previous work, this opaqueness can lead to cybersecurity vulnerabilities and deceptive model behaviors, such as susceptibility to adversarial attacks and hallucinations, which are critical shortcomings for self-driving cars. To mitigate these risks, trustworthy model interpretability is of paramount importance. This proposal aims to introduce a holistic interpretation framework to aid the detection of anomalies (e.g., errors, outliers) in transformers for autonomous driving. Such anomalies include wrong predictions made for object detection in the perception layer, or extreme values of the spatial coordinates of road agents (such as cars, buses, and pedestrians) calculated in the decision-making layer; either may lead to dangerous driving decisions.
Due to the complexity of transformers, multiple approaches are necessary to uncover different kinds of anomalies. Our proposed framework is built on three novel interpretation methods, providing interpretations from three critical aspects: (1) empirical feature-space analyses, (2) mechanistic circuit discovery of internal components (e.g., attention heads, feed-forward networks), and (3) an out-of-distribution (OOD) score for generative models. We then stress-test the anomalies identified by each method using the newly developed Predictability, Computability, Stability (PCS) framework by the Yu Group to ensure reliable anomaly detection. With promising preliminary evidence for each of the three interpretation methods in high-stakes domains such as medicine and biology, we are hopeful of transferring their success to autonomous driving systems by integrating them effectively for more reliable anomaly detection.
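To illustrate the flavor of aspect (3), the sketch below computes a likelihood-based OOD score and flags test points whose likelihood falls below a quantile of the in-distribution scores. This is only a minimal, hedged illustration: it uses a diagonal Gaussian fit as a stand-in for a generative model's density, and the function names and threshold choice are assumptions for exposition, not the proposal's actual method.

```python
import numpy as np

def fit_gaussian(train):
    # Fit a diagonal Gaussian as a simple stand-in for a generative density model.
    mu = train.mean(axis=0)
    var = train.var(axis=0) + 1e-6  # small floor to avoid division by zero
    return mu, var

def log_likelihood(x, mu, var):
    # Per-sample log-likelihood under the diagonal Gaussian.
    return -0.5 * np.sum(np.log(2 * np.pi * var) + (x - mu) ** 2 / var, axis=1)

def ood_scores(train, test, alpha=0.05):
    # Threshold = bottom-alpha quantile of in-distribution log-likelihoods;
    # test points scoring below it are flagged as likely OOD anomalies.
    mu, var = fit_gaussian(train)
    ref = log_likelihood(train, mu, var)
    thresh = np.quantile(ref, alpha)
    ll = log_likelihood(test, mu, var)
    return ll, ll < thresh
```

For example, if `train` holds typical spatial coordinates of road agents, a test point with extreme coordinates (far outside the training distribution) would receive a very low log-likelihood and be flagged, mirroring the decision-layer anomalies described above.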
Principal investigators | Researchers | Themes
---|---|---
Bin Yu | Aliyah Hsu, Omer Ronen | Anomaly detection in autonomous driving systems, transformer interpretability, explainable AI