Enabling Generalization of Human Models for Human-AI Collaboration to New Tasks
Xiaocheng Yang and Anca Dragan
EECS Department, University of California, Berkeley
Technical Report No. UCB/EECS-2021-199
August 13, 2021
http://www2.eecs.berkeley.edu/Pubs/TechRpts/2021/EECS-2021-199.pdf
Human modeling is a crucial step toward good human-AI collaboration, and human data plays an important role in it because it tells us how humans actually behave. Existing methods work well on a single task when plenty of on-task human data is available, but real-world human-AI collaboration usually involves a distribution of disjoint tasks, and collecting human data on every task is unrealistic. Consequently, naive human modeling can fail on tasks without human data. However, as long as we know the distribution of tasks, we can still use self-play to obtain a multi-task self-play policy. Because this policy must learn robust representations of all tasks, it can serve as an effective initialization for human models. We provide theoretical justification for this technique and show its benefits in a challenging multi-task setting: multi-layout Overcooked-AI.
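The warm-start idea in the abstract can be illustrated with a small sketch. It is not the report's implementation: it assumes a tabular softmax policy in place of a neural network, uses random logits (`sp_weights`, a hypothetical name) as a stand-in for a trained multi-task self-play policy, and fits a toy "human" dataset by behavior cloning. All names, sizes, and data below are invented for illustration.

```python
import numpy as np


def softmax(logits):
    # Numerically stable softmax over the last axis.
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def cross_entropy(weights, states, actions):
    # Mean negative log-likelihood of the demonstrated actions.
    probs = softmax(weights[states])
    return float(-np.log(probs[np.arange(len(states)), actions]).mean())


def behavior_clone(init_weights, states, actions, steps=200, lr=0.5):
    # Gradient descent on the cross-entropy loss, starting from init_weights.
    # weights[s, a] is the logit of action a in state s (tabular policy).
    weights = init_weights.copy()
    n_actions = weights.shape[1]
    for _ in range(steps):
        probs = softmax(weights[states])
        grad_logits = probs - np.eye(n_actions)[actions]
        grad = np.zeros_like(weights)
        # Accumulate per-sample gradients into the rows of the visited states.
        np.add.at(grad, states, grad_logits / len(states))
        weights -= lr * grad
    return weights


rng = np.random.default_rng(0)
n_states, n_actions = 6, 4

# ASSUMPTION: random logits stand in for a multi-task self-play policy
# that already covers every state (i.e. every task).
sp_weights = rng.normal(size=(n_states, n_actions))

# ASSUMPTION: toy human demonstrations exist only for states 0-2.
human_states = np.array([0, 0, 1, 2, 2, 1])
human_actions = np.array([1, 1, 3, 0, 0, 3])

# Human model initialized from the self-play policy versus from scratch.
human_model = behavior_clone(sp_weights, human_states, human_actions)
scratch_model = behavior_clone(np.zeros_like(sp_weights), human_states, human_actions)
```

In this toy, states 3 to 5 play the role of a task with no human data: the warm-started model keeps the self-play behavior there, while the model trained from scratch stays uniform. A tabular policy cannot share representations across states, so the sketch only shows the initialization mechanics, not the generalization benefit the report argues for.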
Advisors: Anca Dragan
BibTeX citation:
@mastersthesis{Yang:EECS-2021-199,
    Author = {Yang, Xiaocheng and Dragan, Anca},
    Title = {Enabling Generalization of Human Models for Human-AI Collaboration to New Tasks},
    School = {EECS Department, University of California, Berkeley},
    Year = {2021},
    Month = {Aug},
    Url = {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2021/EECS-2021-199.html},
    Number = {UCB/EECS-2021-199},
    Abstract = {Human modeling is a crucial step for achieving good human-AI collaboration, and human data provides us with information on human behavior and thus plays an important role in the process. Even though existing methods work well on a single task with the help of plenty of on-task human data, real-world human-AI collaborations usually involve a distribution of disjoint tasks, and collecting human data on every single task is unrealistic. Consequently, naive human modeling could fail in tasks without human data. However, as long as we know the distribution of tasks, we can still use self-play to obtain a multi-task self-play policy. Since this policy will need to learn robust representations of all tasks, it can serve as an effective initialization for human models. We provide theoretical justification for this technique, and show its benefits on a challenging multi-task setting: multi-layout Overcooked-AI.},
}
EndNote citation:
%0 Thesis
%A Yang, Xiaocheng
%A Dragan, Anca
%T Enabling Generalization of Human Models for Human-AI Collaboration to New Tasks
%I EECS Department, University of California, Berkeley
%D 2021
%8 August 13
%@ UCB/EECS-2021-199
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2021/EECS-2021-199.html
%F Yang:EECS-2021-199