Demystifying Decision-Making of Deep RL through Validated Language Explanations
Ashwin Dara
EECS Department
University of California, Berkeley
Technical Report No. UCB/EECS-2025-51
May 13, 2025
http://www2.eecs.berkeley.edu/Pubs/TechRpts/2025/EECS-2025-51.pdf
Reinforcement learning (RL) controllers have been shown in both simulation and real-world deployments to significantly improve traffic flow and fuel efficiency, even when only a small fraction of vehicles are autonomous. Despite these benefits, real-world adoption remains limited due to a lack of transparency, which leads human operators to distrust and often override RL policies. In response, we introduce CLEAR (Contextual Language Explanations for Actions from RL), a framework that generates step-by-step natural language explanations of RL decisions using large language models (LLMs). To address the risk of hallucinations in high-stakes settings, CLEAR integrates a multi-stage validation pipeline that verifies explanations against policy outputs, tests robustness under input perturbations, and checks for logical consistency. Unlike static fine-tuning methods, CLEAR adapts online to new scenarios and maintains alignment with the underlying policy. When evaluated on real-world highway data from the VanderTest, CLEAR significantly outperformed few-shot prompting and retrieval-based workflows in both predictive accuracy and explanation quality. This work extends a prior conference submission and demonstrates the potential of validated language-based interpretability for safe and trustworthy RL deployment.
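The abstract names the three validation stages (checking against policy outputs, perturbation robustness, logical consistency) but not how they are implemented. The sketch below is a minimal illustration of how such a gate could be wired up; every name in it (the Explanation dataclass, the explain and policy_action callables, the tolerance and perturbation settings, the placeholder consistency check) is a hypothetical stand-in and not CLEAR's actual interface.

# Illustrative sketch only: CLEAR's implementation is not given in this abstract,
# so the interfaces and checks below are assumed stand-ins for the described
# multi-stage validation pipeline.
from dataclasses import dataclass
from typing import Callable, List
import random


@dataclass
class Explanation:
    predicted_action: float      # action the LLM claims the policy will take
    rationale: str               # step-by-step natural language reasoning
    cited_features: List[str]    # state features the rationale refers to


def validate(
    state: List[float],
    policy_action: Callable[[List[float]], float],   # the underlying RL policy
    explain: Callable[[List[float]], Explanation],   # LLM explanation generator
    tolerance: float = 0.1,
    n_perturbations: int = 5,
    noise_scale: float = 0.01,
) -> bool:
    """Accept an explanation only if it passes all three validation stages."""
    exp = explain(state)

    # Stage 1: verify the explanation against the actual policy output.
    if abs(exp.predicted_action - policy_action(state)) > tolerance:
        return False

    # Stage 2: robustness under small input perturbations -- the explanation's
    # predicted action should keep tracking the policy on nearby states.
    for _ in range(n_perturbations):
        noisy = [x + random.gauss(0.0, noise_scale) for x in state]
        if abs(explain(noisy).predicted_action - policy_action(noisy)) > tolerance:
            return False

    # Stage 3: logical consistency -- reduced here to a trivial placeholder:
    # the rationale must actually mention every feature it claims to rely on.
    if not all(f in exp.rationale for f in exp.cited_features):
        return False

    return True

In a deployment loop, an explanation that fails this gate could be regenerated or flagged rather than shown to the operator, which is one way a validation pipeline of this kind could keep explanations aligned with the policy online.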
Advisor: Alexandre Bayen
";
?>
BibTeX citation:
@mastersthesis{Dara:EECS-2025-51,
    Author = {Dara, Ashwin},
    Title = {Demystifying Decision-Making of Deep RL through Validated Language Explanations},
    School = {EECS Department, University of California, Berkeley},
    Year = {2025},
    Month = {May},
    URL = {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2025/EECS-2025-51.html},
    Number = {UCB/EECS-2025-51},
    Abstract = {Reinforcement learning (RL) controllers have been shown in both simulation and real-world deployments to significantly improve traffic flow and fuel efficiency, even when only a small fraction of vehicles are autonomous. Despite these benefits, real-world adoption remains limited due to a lack of transparency, which leads human operators to distrust and often override RL policies. In response, we introduce CLEAR (Contextual Language Explanations for Actions from RL), a framework that generates step-by-step natural language explanations of RL decisions using large language models (LLMs). To address the risk of hallucinations in high-stakes settings, CLEAR integrates a multi-stage validation pipeline that verifies explanations against policy outputs, tests robustness under input perturbations, and checks for logical consistency. Unlike static fine-tuning methods, CLEAR adapts online to new scenarios and maintains alignment with the underlying policy. When evaluated on real-world highway data from the VanderTest, CLEAR significantly outperformed few-shot prompting and retrieval-based workflows in both predictive accuracy and explanation quality. This work extends a prior conference submission and demonstrates the potential of validated language-based interpretability for safe and trustworthy RL deployment.}
}
EndNote citation:
%0 Thesis
%A Dara, Ashwin
%T Demystifying Decision-Making of Deep RL through Validated Language Explanations
%I EECS Department, University of California, Berkeley
%D 2025
%8 May 13
%@ UCB/EECS-2025-51
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2025/EECS-2025-51.html
%F Dara:EECS-2025-51