Learning Representations that Enable Generalization in Assistive Tasks
Zhiyang He and Anca Dragan
EECS Department, University of California, Berkeley
Technical Report No. UCB/EECS-2022-233
October 25, 2022
http://www2.eecs.berkeley.edu/Pubs/TechRpts/2022/EECS-2022-233.pdf
Recent work in sim2real has successfully enabled robots to act in physical environments by training in simulation with a diverse “population” of environments (i.e., domain randomization). In this work, we focus on enabling generalization in assistive tasks: tasks in which the robot acts to assist a user (e.g., helping someone with motor impairments with bathing or scratching an itch). Such tasks are particularly interesting relative to prior sim2real successes because the environment now contains a human who is also acting. This complicates the problem because the diversity of human users (rather than merely physical environment parameters) is more difficult to capture in a population, increasing the likelihood of encountering out-of-distribution (OOD) human policies at test time. We advocate that generalization to such OOD policies benefits from (1) learning a good latent representation for human policies to which test-time humans can accurately be mapped, and (2) making that representation adaptable with test-time interaction data, instead of relying on it to perfectly capture the space of human policies from the simulated population alone. We study how to best learn such a representation by evaluating on purposefully constructed OOD test policies. We find that sim2real methods that encode environment (or population) parameters, while effective for tasks robots perform in isolation, do not work well in assistance. In assistance, it seems crucial to train the representation directly on the history of interaction, because that is what the robot will have access to at test time. Further, training these representations to predict human actions not only gives them better structure, but also enables them to be fine-tuned at test time, when the robot observes the partner act.
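To make the abstract's recipe concrete, here is a minimal, hedged sketch (in PyTorch) of one way its two ingredients could be implemented: a recurrent encoder that maps the robot's observed interaction history to a latent z, and an action-prediction head that both structures z during training and supplies a loss for fine-tuning on the test-time partner's observed actions. All names (HistoryEncoder, ActionPredictor) and dimensions are illustrative assumptions, not the report's actual architecture.

```python
# Illustrative sketch only -- not the report's actual implementation.
# Idea: encode the robot-human interaction history into a latent z, and
# train z with a human-action-prediction head so the same loss can
# fine-tune z at test time from the observed partner.
import torch
import torch.nn as nn

class HistoryEncoder(nn.Module):
    """Maps a history of (observation, human action) pairs to a latent z."""
    def __init__(self, obs_dim, act_dim, latent_dim=8, hidden_dim=64):
        super().__init__()
        self.rnn = nn.GRU(obs_dim + act_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, latent_dim)

    def forward(self, history):  # history: (batch, T, obs_dim + act_dim)
        _, h = self.rnn(history)
        return self.head(h[-1])   # z: (batch, latent_dim)

class ActionPredictor(nn.Module):
    """Predicts the human's next action from the current observation and z."""
    def __init__(self, obs_dim, act_dim, latent_dim=8, hidden_dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + latent_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, act_dim),
        )

    def forward(self, obs, z):
        return self.net(torch.cat([obs, z], dim=-1))

# Training: jointly fit encoder and predictor on simulated partners, so z
# is learned from the interaction history (what the robot actually sees at
# test time) rather than from privileged population parameters.
obs_dim, act_dim = 10, 4
enc = HistoryEncoder(obs_dim, act_dim)
pred = ActionPredictor(obs_dim, act_dim)
opt = torch.optim.Adam(list(enc.parameters()) + list(pred.parameters()), lr=3e-4)

history = torch.randn(32, 20, obs_dim + act_dim)  # placeholder batch
obs, human_act = torch.randn(32, obs_dim), torch.randn(32, act_dim)

z = enc(history)
loss = nn.functional.mse_loss(pred(obs, z), human_act)
opt.zero_grad(); loss.backward(); opt.step()

# Test time: the same prediction loss can be applied to the OOD partner's
# observed actions to adapt the representation beyond the sim population.
```

Under this sketch's assumptions, the robot's policy would condition on z alongside its own observations; the key design choice mirrored from the abstract is that z is produced from raw interaction history and remains trainable at deployment, rather than being a fixed embedding of population parameters.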
Advisor: Anca Dragan
BibTeX citation:
@mastersthesis{He:EECS-2022-233,
    Author = {He, Zhiyang and Dragan, Anca},
    Title = {Learning Representations that Enable Generalization in Assistive Tasks},
    School = {EECS Department, University of California, Berkeley},
    Year = {2022},
    Month = {Oct},
    Url = {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2022/EECS-2022-233.html},
    Number = {UCB/EECS-2022-233}
}
EndNote citation:
%0 Thesis
%A He, Zhiyang
%A Dragan, Anca
%T Learning Representations that Enable Generalization in Assistive Tasks
%I EECS Department, University of California, Berkeley
%D 2022
%8 October 25
%@ UCB/EECS-2022-233
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2022/EECS-2022-233.html
%F He:EECS-2022-233