Robust Multimodal Perception Stack for High-Speed Autonomous Racecars

Kaushik Singh

EECS Department
University of California, Berkeley
Technical Report No. UCB/EECS-2025-110
May 16, 2025

http://www2.eecs.berkeley.edu/Pubs/TechRpts/2025/EECS-2025-110.pdf

Autonomous racing presents a uniquely constrained yet demanding testbed for perception systems: cars compete at high speeds on fixed circuits with known boundaries, but must reliably detect and track only their opponents under stringent real-time constraints. This thesis addresses the challenge of robust multimodal perception for autonomous racecars by developing, analyzing, and experimentally validating modular fusion architectures that leverage LiDAR, radar, and camera sensors. We begin by formulating the problem of opponent detection (estimating the two-dimensional position, orientation, and velocity of other vehicles) under the assumptions of a separate localization system and a predefined track. After surveying classical and end-to-end learning approaches, we motivate a classical “early-stage” fusion pipeline based on perspective projection and extrinsic calibration, alongside a “late-stage” fusion design that independently processes each modality before combining outputs via an Extended Kalman Filter. Preliminary experiments, benchmarked against transponder-derived ground truth, evaluate positional accuracy and computational load for both fusion methods. Results give promising indications that the late-stage fusion method is more robust to misclassification and miscalibration while maintaining real-time performance on the racecar’s onboard compute.
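
To make the late-stage design concrete, the sketch below shows one minimal way such a fusion stage could look: each modality's detection pipeline emits an independent planar position measurement, and a single Extended Kalman Filter maintains the opponent track. The state layout, constant-velocity motion model, and all noise values are illustrative assumptions for exposition, not the parameters or implementation from the thesis.

    # Minimal EKF late-fusion sketch (illustrative, not the thesis code).
    # State of one opponent: x = [px, py, theta, v] in track coordinates.
    # Each sensor pipeline (LiDAR / radar / camera) is assumed to publish
    # an independent (px, py) detection with its own noise covariance R.
    import numpy as np

    class LateFusionEKF:
        def __init__(self):
            self.x = np.zeros(4)                       # [px, py, theta, v]
            self.P = np.eye(4)                         # state covariance
            self.Q = np.diag([0.05, 0.05, 0.01, 0.5])  # process noise (assumed)

        def predict(self, dt):
            px, py, th, v = self.x
            # Constant-speed, constant-heading motion model.
            self.x = np.array([px + v * np.cos(th) * dt,
                               py + v * np.sin(th) * dt,
                               th,
                               v])
            # Jacobian of the motion model with respect to the state.
            F = np.array([[1, 0, -v * np.sin(th) * dt, np.cos(th) * dt],
                          [0, 1,  v * np.cos(th) * dt, np.sin(th) * dt],
                          [0, 0, 1, 0],
                          [0, 0, 0, 1]])
            self.P = F @ self.P @ F.T + self.Q * dt

        def update_position(self, z, R):
            # Position measurement z = [px, py]; R is the 2x2 covariance
            # of the reporting modality (e.g. looser for camera than LiDAR).
            H = np.array([[1.0, 0.0, 0.0, 0.0],
                          [0.0, 1.0, 0.0, 0.0]])
            y = z - H @ self.x                      # innovation
            S = H @ self.P @ H.T + R                # innovation covariance
            K = self.P @ H.T @ np.linalg.inv(S)     # Kalman gain
            self.x = self.x + K @ y
            self.P = (np.eye(4) - K @ H) @ self.P

    # Usage: asynchronous detections from each modality refine the same track.
    ekf = LateFusionEKF()
    ekf.predict(dt=0.02)
    ekf.update_position(np.array([12.3, -4.1]), R=np.diag([0.02, 0.02]))  # LiDAR
    ekf.update_position(np.array([12.5, -4.0]), R=np.diag([0.50, 0.50]))  # camera

Because each modality is reduced to a low-dimensional measurement before fusion, a misclassified or miscalibrated sensor only perturbs its own update step, which is one intuition for the robustness advantage the abstract reports for the late-stage design.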

Advisor: S. Shankar Sastry

\"Edit"; ?>


BibTeX citation:

@mastersthesis{Singh:EECS-2025-110,
    Author = {Singh, Kaushik},
    Title = {Robust Multimodal Perception Stack for High-Speed Autonomous Racecars},
    School = {EECS Department, University of California, Berkeley},
    Year = {2025},
    Month = {May},
    URL = {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2025/EECS-2025-110.html},
    Number = {UCB/EECS-2025-110},
    Abstract = {Autonomous racing presents a uniquely constrained yet demanding testbed for perception systems: cars compete at high speeds on fixed circuits with known boundaries, but must reliably detect and track only their opponents under stringent real-time constraints. This thesis addresses the challenge of robust multimodal perception for autonomous racecars by developing, analyzing, and experimentally validating modular fusion architectures that leverage LiDAR, radar, and camera sensors. We begin by formulating the problem of opponent detection (estimating the two-dimensional position, orientation, and velocity of other vehicles) under the assumptions of a separate localization system and a predefined track. After surveying classical and end-to-end learning approaches, we motivate a classical “early-stage” fusion pipeline based on perspective projection and extrinsic calibration, alongside a “late-stage” fusion design that independently processes each modality before combining outputs via an Extended Kalman Filter. Preliminary experiments, benchmarked against transponder-derived ground truth, evaluate positional accuracy and computational load for both fusion methods. Results give promising indications that the late-stage fusion method is more robust to misclassification and miscalibration while maintaining real-time performance on the racecar’s onboard compute.}
}

EndNote citation:

%0 Thesis
%A Singh, Kaushik
%T Robust Multimodal Perception Stack for High-Speed Autonomous Racecars
%I EECS Department, University of California, Berkeley
%D 2025
%8 May 16
%@ UCB/EECS-2025-110
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2025/EECS-2025-110.html
%F Singh:EECS-2025-110