Tracking Anyone and Anycar for 3D Auto-Labeling from Video Inputs

ABOUT THE PROJECT
At a glance
Autonomous driving technology has been evolving rapidly, but its reliance on extensive labeled data, particularly 3D bounding boxes derived from LiDAR [1, 4], remains a bottleneck. This proposal outlines an approach to 3D object detection using purely visual data [5, 6], which is more abundant and less costly to acquire. Our goal is to introduce a system that can track any person or vehicle in 3D using only RGB video, significantly advancing the capabilities of autonomous vehicles in diverse environments. Several challenges stand in the way: video data lacks explicit depth information and a description of 3D geometry, so it is hard to annotate 3D bounding boxes directly from video; dynamic objects such as cars and pedestrians are difficult for existing visual models to track consistently; and pedestrians are deformable targets, so the rigid-body assumption cannot easily be leveraged for their 3D annotations.
principal investigators | researchers | themes |
---|---|---|
 | | 3D tracking, auto-labeling |