Comparative Studies on Sample Complexity Bounds in Multi-Agent Reinforcement Learning
Jiaqi Yang
EECS Department, University of California, Berkeley
Technical Report No. UCB/EECS-2022-47
May 10, 2022
http://www2.eecs.berkeley.edu/Pubs/TechRpts/2022/EECS-2022-47.pdf
In this report, we survey the existing sample complexity bounds from the multi-agent reinforcement learning (MARL) literature and from the game theory literature. Along the way, we give a unified notation for game theory and MARL, and summarize the different definitions of equilibria in game theory and MARL.
By comparing the existing bounds, we identify several interesting open gaps in MARL and take preliminary steps towards answering these open questions. This report can serve as a starting point for future studies in MARL theory.
Advisor: Jiantao Jiao
BibTeX citation:
@mastersthesis{Yang:EECS-2022-47,
    Author = {Yang, Jiaqi},
    Title = {Comparative Studies on Sample Complexity Bounds in Multi-Agent Reinforcement Learning},
    School = {EECS Department, University of California, Berkeley},
    Year = {2022},
    Month = {May},
    Url = {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2022/EECS-2022-47.html},
    Number = {UCB/EECS-2022-47},
    Abstract = {In this report, we survey on the existing sample complexity bounds from multi-agent reinforcement learning (MARL) literature and those from game theory literature. Along the way, we give unified notations for game theory and MARL, and summarize different definitions of equilibria in game theory and MARL. By comparative studies on the existing bounds, we identify several interesting open gaps in MARL, and we take preliminary steps towards answering these open questions. This report can serve as a starting point for future studies in MARL theory.},
}
EndNote citation:
%0 Thesis
%A Yang, Jiaqi
%T Comparative Studies on Sample Complexity Bounds in Multi-Agent Reinforcement Learning
%I EECS Department, University of California, Berkeley
%D 2022
%8 May 10
%@ UCB/EECS-2022-47
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2022/EECS-2022-47.html
%F Yang:EECS-2022-47