Reinforcement Learning for Adaptive Traffic Control in Mixed-Autonomy Environments
Han Wang
EECS Department, University of California, Berkeley
Technical Report No. UCB/EECS-2025-182
December 3, 2025
http://www2.eecs.berkeley.edu/Pubs/TechRpts/2025/EECS-2025-182.pdf
The integration of autonomous vehicles into transportation infrastructure presents a fundamental challenge: these systems must operate within environments designed for human drivers while sharing road spaces with human-operated vehicles. This thesis addresses how autonomous vehicles can function as traffic flow actuators in mixed-autonomy systems, where strategic control of automated vehicles improves overall system performance.
The research develops a mathematical framework using a PDE-ODE coupled system that models the interaction between aggregate human traffic dynamics and individual autonomous vehicle trajectories. The partial differential equation captures bulk traffic flow through conservation laws, while ordinary differential equations represent controlled vehicle trajectories. This formulation enables control strategies where autonomous vehicles influence macroscopic traffic properties through local speed modulation.
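The coupled model described above commonly takes the following form in the moving-bottleneck literature; the Greenshields-type flux $V(\rho)$ and the speed-cap coupling below are illustrative assumptions, and the exact model in the thesis may differ:

```latex
% Macroscopic human traffic: LWR conservation law for density \rho(t,x)
\partial_t \rho + \partial_x \bigl(\rho\, V(\rho)\bigr) = 0,
\qquad V(\rho) = v_{\max}\Bigl(1 - \frac{\rho}{\rho_{\max}}\Bigr),

% Microscopic AV trajectories y_j(t): commanded speed u_j(t), capped by
% the local traffic speed so a controlled vehicle cannot outrun the flow
\dot{y}_j(t) = \min\bigl\{\, u_j(t),\; V\bigl(\rho(t, y_j(t))\bigr) \,\bigr\}.
```

The `min` coupling is what lets local speed modulation by the AVs act on the macroscopic density: slowing $u_j$ below the ambient speed creates a controlled moving restriction in the PDE.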
Building on this foundation, the thesis presents a hierarchical Speed Planner that extends single-vehicle control to fleet-level coordination. The system integrates real-time traffic state estimation from commercial data sources (INRIX) with microscopic vehicle observations, employing prediction and fusion modules to compensate for data latency. A reinforcement learning-based buffer design module generates target speed profiles that optimize bottleneck density through an Actor-Critic algorithm trained on the PDE-ODE mathematical model.
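As a minimal sketch of the Actor-Critic idea behind the buffer design module (the observation dimension, Gaussian policy, toy reward, and random "environment" below are illustrative assumptions, not the thesis's PDE-ODE training setup):

```python
import numpy as np

rng = np.random.default_rng(0)

OBS_DIM = 8   # length of a discretized density profile (assumed)
SIGMA = 1.0   # fixed exploration std-dev of the Gaussian policy (assumed)

w_actor = np.zeros(OBS_DIM)   # linear mean of the Gaussian policy
w_critic = np.zeros(OBS_DIM)  # linear state-value estimate

def sample_action(obs):
    """Sample a scalar target-speed adjustment from the Gaussian policy."""
    mu = float(w_actor @ obs)
    return mu + SIGMA * rng.standard_normal(), mu

def ac_step(obs, action, mu, reward, next_obs, gamma=0.99, lr=1e-3):
    """One TD(0) critic update plus a policy-gradient actor update."""
    global w_actor, w_critic
    td_error = reward + gamma * float(w_critic @ next_obs) - float(w_critic @ obs)
    w_critic = w_critic + lr * td_error * obs
    # grad of log N(a; mu, sigma^2) wrt w_actor is (a - mu) / sigma^2 * obs
    w_actor = w_actor + lr * td_error * (action - mu) / SIGMA**2 * obs
    return td_error

# Toy rollout: the reward penalizes squared "bottleneck density" (first entry),
# standing in for the density objective optimized against the PDE-ODE model.
obs = rng.random(OBS_DIM)
for _ in range(200):
    action, mu = sample_action(obs)
    next_obs = rng.random(OBS_DIM)   # placeholder for the simulated traffic state
    reward = -next_obs[0] ** 2
    ac_step(obs, action, mu, reward, next_obs)
    obs = next_obs
```

In the actual system, the rollout environment would be the PDE-ODE model and the action would parameterize a target speed profile rather than a single scalar.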
Numerical experiments validate the proposed approach. Simulations demonstrate a 15% improvement in minimum traffic flux and a 35% reduction in speed deviation. Extended experiments in a realistic I-24 highway environment show a 12.62% reduction in bottleneck density and a 5.01% increase in throughput at only 4% autonomous vehicle penetration. Field deployment through the MegaVanderTest, a large-scale experiment deploying 100 automated vehicles on Interstate 24, achieved a 52% reduction in congestion formation density at the bottleneck location, validating the hierarchical control architecture under real-world traffic conditions.
Advisor: Alexandre Bayen
BibTeX citation:
@mastersthesis{Wang:EECS-2025-182,
    Author = {Wang, Han},
    Title = {Reinforcement Learning for Adaptive Traffic Control in Mixed-Autonomy Environments},
    School = {EECS Department, University of California, Berkeley},
    Year = {2025},
    Month = {Dec},
    Url = {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2025/EECS-2025-182.html},
    Number = {UCB/EECS-2025-182},
    Abstract = {The integration of autonomous vehicles into transportation infrastructure presents a fundamental challenge: these systems must operate within environments designed for human drivers while sharing road spaces with human-operated vehicles. This thesis addresses how autonomous vehicles can function as traffic flow actuators in mixed-autonomy systems, where strategic control of automated vehicles improves overall system performance.
The research develops a mathematical framework using a PDE-ODE coupled system that models the interaction between aggregate human traffic dynamics and individual autonomous vehicle trajectories. The partial differential equation captures bulk traffic flow through conservation laws, while ordinary differential equations represent controlled vehicle trajectories. This formulation enables control strategies where autonomous vehicles influence macroscopic traffic properties through local speed modulation.
Building on this foundation, the thesis presents a hierarchical Speed Planner that extends single-vehicle control to fleet-level coordination. The system integrates real-time traffic state estimation from commercial data sources (INRIX) with microscopic vehicle observations, employing prediction and fusion modules to compensate for data latency. A reinforcement learning-based buffer design module generates target speed profiles that optimize bottleneck density through an Actor-Critic algorithm trained on the PDE-ODE mathematical model.
Numerical experiments validate the proposed approach. Simulations demonstrate a 15\% improvement in minimum traffic flux and a 35\% reduction in speed deviation. Extended experiments in a realistic I-24 highway environment show a 12.62\% reduction in bottleneck density and a 5.01\% increase in throughput at only 4\% autonomous vehicle penetration. Field deployment through the MegaVanderTest, a large-scale experiment deploying 100 automated vehicles on Interstate 24, achieved a 52\% reduction in congestion formation density at the bottleneck location, validating the hierarchical control architecture under real-world traffic conditions.},
}
EndNote citation:
%0 Thesis %A Wang, Han %T Reinforcement Learning for Adaptive Traffic Control in Mixed-Autonomy Environments %I EECS Department, University of California, Berkeley %D 2025 %8 December 3 %@ UCB/EECS-2025-182 %U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2025/EECS-2025-182.html %F Wang:EECS-2025-182