Learning to play collaborative-competitive games

Kshama Dwarakanath and S. Shankar Sastry

EECS Department
University of California, Berkeley
Technical Report No. UCB/EECS-2020-201
December 16, 2020

http://www2.eecs.berkeley.edu/Pubs/TechRpts/2020/EECS-2020-201.pdf

In this project, we formalize a collaborative-competitive game with competition between two teams and collaboration between members of each team. The players on the first team seek to reach their individual goals while avoiding capture by the second team, and the second team seeks to capture all players on the first team; this opposition of objectives is the source of competition between the teams. The players within each team can collaborate with each other to achieve their individual and team goals. The rules of game play are cast as a Markov Decision Process, with the goal of learning optimal game play strategies for members of the first team. We collect trajectories from human experts who played the game, and use this data to learn similar game play strategies designed to ensure that the first team wins the game. A recent approach to imitation learning, Generative Adversarial Imitation Learning (GAIL), is examined in the context of these collaborative-competitive games. The results of running GAIL on expert data are contrasted against those obtained from state-of-the-art algorithms from imitation learning as well as (forward) reinforcement learning. We see that the learnt policies resemble, in their logic, those used by human experts in playing the game, while being successful in about 70% of new games played. This success rate is very close to that of the human experts playing the game.

Advisors: S. Shankar Sastry and Yi Ma


BibTeX citation:

@mastersthesis{Dwarakanath:EECS-2020-201,
    Author = {Dwarakanath, Kshama and Sastry, S. Shankar},
    Title = {Learning to play collaborative-competitive games},
    School = {EECS Department, University of California, Berkeley},
    Year = {2020},
    Month = {Dec},
    URL = {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2020/EECS-2020-201.html},
    Number = {UCB/EECS-2020-201},
    Abstract = {In this project, we formalize a collaborative-competitive game with competition between two teams and collaboration between members of each team. The players on the first team seek to reach their individual goals while avoiding capture by the second team, and the second team seeks to capture all players on the first team; this opposition of objectives is the source of competition between the teams. The players within each team can collaborate with each other to achieve their individual and team goals. The rules of game play are cast as a Markov Decision Process, with the goal of learning optimal game play strategies for members of the first team. We collect trajectories from human experts who played the game, and use this data to learn similar game play strategies designed to ensure that the first team wins the game. A recent approach to imitation learning, Generative Adversarial Imitation Learning (GAIL), is examined in the context of these collaborative-competitive games. The results of running GAIL on expert data are contrasted against those obtained from state-of-the-art algorithms from imitation learning as well as (forward) reinforcement learning. We see that the learnt policies resemble, in their logic, those used by human experts in playing the game, while being successful in about 70% of new games played. This success rate is very close to that of the human experts playing the game.}
}

EndNote citation:

%0 Thesis
%A Dwarakanath, Kshama
%A Sastry, S. Shankar
%T Learning to play collaborative-competitive games
%I EECS Department, University of California, Berkeley
%D 2020
%8 December 16
%@ UCB/EECS-2020-201
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2020/EECS-2020-201.html
%F Dwarakanath:EECS-2020-201