Offline Learning for Scalable Decision Making

Justin Fu

EECS Department, University of California, Berkeley

Technical Report No. UCB/EECS-2021-168

July 29, 2021

http://www2.eecs.berkeley.edu/Pubs/TechRpts/2021/EECS-2021-168.pdf

The remarkable success of modern machine learning has arguably been due to the ability of algorithms to combine powerful models, such as neural networks, with large-scale datasets. This data-driven paradigm has been applied to a variety of applications, from computer vision and speech processing to machine translation and question answering. However, the majority of these successes have been in prediction problems, such as supervised learning. In contrast, many real-world applications of machine learning involve decision-making problems, where one must leverage learned models to select optimal actions that maximize some objective of interest. Unfortunately, learned models can often fail in these situations, due to issues such as distribution shift and model exploitation. This thesis proposes methods and algorithms that are designed to handle these shortcomings in modern machine learning methods in order to produce reliable decision-making agents. We begin in the area of reinforcement learning, where we study robust algorithms for offline reinforcement learning and model-based reinforcement learning. We discuss considerations for benchmarking offline reinforcement learning and off-policy evaluation, and propose a variety of domains and datasets designed to stress-test state-of-the-art algorithms in the area. Finally, we study the more general problem of model-based optimization, and show how information-theoretic principles can guide us to construct uncertainty-aware models that mitigate exploitation.

Advisor: Sergey Levine

BibTeX citation:

@phdthesis{Fu:EECS-2021-168,
    Author= {Fu, Justin},
    Title= {Offline Learning for Scalable Decision Making},
    School= {EECS Department, University of California, Berkeley},
    Year= {2021},
    Month= {Jul},
    Url= {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2021/EECS-2021-168.html},
    Number= {UCB/EECS-2021-168},
    Abstract= {The remarkable success of modern machine learning has arguably been due to the ability of algorithms to combine powerful models, such as neural networks, with large-scale datasets. This data-driven paradigm has been applied to a variety of applications, from computer vision and speech processing to machine translation and question answering. However, the majority of these successes have been in prediction problems, such as supervised learning. In contrast, many real-world applications of machine learning involve decision-making problems, where one must leverage learned models to select optimal actions that maximize some objective of interest. Unfortunately, learned models can often fail in these situations, due to issues such as distribution shift and model exploitation. This thesis proposes methods and algorithms that are designed to handle these shortcomings in modern machine learning methods in order to produce reliable decision-making agents. We begin in the area of reinforcement learning, where we study robust algorithms for offline reinforcement learning and model-based reinforcement learning. We discuss considerations for benchmarking offline reinforcement learning and off-policy evaluation, and propose a variety of domains and datasets designed to stress-test state-of-the-art algorithms in the area. Finally, we study the more general problem of model-based optimization, and show how information-theoretic principles can guide us to construct uncertainty-aware models that mitigate exploitation.},
}

EndNote citation:

%0 Thesis
%A Fu, Justin
%T Offline Learning for Scalable Decision Making
%I EECS Department, University of California, Berkeley
%D 2021
%8 July 29
%@ UCB/EECS-2021-168
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2021/EECS-2021-168.html
%F Fu:EECS-2021-168