Hierarchical Deep Reinforcement Learning For Robotics and Data Science
Sanjay Krishnan
EECS Department, University of California, Berkeley
Technical Report No. UCB/EECS-2018-101
August 7, 2018
http://www2.eecs.berkeley.edu/Pubs/TechRpts/2018/EECS-2018-101.pdf
This dissertation explores learning important structural features of a Markov Decision Process from offline data to significantly improve the sample efficiency, stability, and robustness of solutions, even with high-dimensional action spaces and long time horizons. It presents applications to surgical robot control, data cleaning, and generating efficient execution plans for relational queries. The dissertation contributes: (1) Sequential Windowed Reinforcement Learning, a framework that approximates a long-horizon MDP with a sequence of shorter-term MDPs with smooth quadratic cost functions learned from a small number of expert demonstrations; (2) Deep Discovery of Options, an algorithm that discovers hierarchical structure in the action space from observed demonstrations; (3) AlphaClean, a system that decomposes a data cleaning task into a set of independent search problems and uses deep Q-learning to share structure across the problems; and (4) a Learning Query Optimizer, a system that observes executions of a dynamic program for SQL query optimization and learns a model to predict cost-to-go values to greatly speed up future search problems.
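Since the abstract only describes contribution (1) at a high level, the following minimal Python sketch illustrates one plausible reading of the idea: split expert demonstrations into a sequence of windows and fit a smooth quadratic cost to each window, so that each window defines a shorter-horizon subproblem. The function names, the uniform segmentation, and the Gaussian-style quadratic fit are illustrative assumptions for exposition, not the dissertation's actual algorithm or API.

# Illustrative sketch only: a hypothetical version of the "sequence of
# shorter-term MDPs with quadratic costs" idea described in the abstract.
import numpy as np

def segment_demonstrations(demos, num_segments):
    """Split each demonstration trajectory into equal-length windows.

    The dissertation learns segment boundaries from data; uniform windows
    are used here only to keep the sketch short.
    """
    segments = [[] for _ in range(num_segments)]
    for traj in demos:                      # traj: array of shape (T, d)
        pieces = np.array_split(traj, num_segments)
        for k, piece in enumerate(pieces):
            segments[k].append(piece)
    return segments

def fit_quadratic_cost(segment_trajs):
    """Fit a smooth quadratic cost c(x) = (x - mu)^T Sigma^{-1} (x - mu)
    centered on the states visited in this segment."""
    states = np.vstack(segment_trajs)
    mu = states.mean(axis=0)
    cov = np.cov(states, rowvar=False) + 1e-3 * np.eye(states.shape[1])
    precision = np.linalg.inv(cov)
    return lambda x: float((x - mu) @ precision @ (x - mu))

# Toy usage: five random-walk "demonstrations", four windows.
demos = [np.random.randn(100, 2).cumsum(axis=0) for _ in range(5)]
segments = segment_demonstrations(demos, num_segments=4)
cost_fns = [fit_quadratic_cost(s) for s in segments]
print([round(c(np.zeros(2)), 2) for c in cost_fns])

Each short-horizon window k would then be solved against its own cost_fns[k] (for example, with a standard reinforcement learning or optimal control method) instead of against a single sparse, long-horizon reward; that decomposition is what the abstract credits with improving sample efficiency and stability.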
Advisor: Ken Goldberg
BibTeX citation:
@phdthesis{Krishnan:EECS-2018-101,
    Author = {Krishnan, Sanjay},
    Title = {Hierarchical Deep Reinforcement Learning For Robotics and Data Science},
    School = {EECS Department, University of California, Berkeley},
    Year = {2018},
    Month = {Aug},
    Url = {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2018/EECS-2018-101.html},
    Number = {UCB/EECS-2018-101},
    Abstract = {This dissertation explores learning important structural features of a Markov Decision Process from offline data to significantly improve the sample efficiency, stability, and robustness of solutions, even with high-dimensional action spaces and long time horizons. It presents applications to surgical robot control, data cleaning, and generating efficient execution plans for relational queries. The dissertation contributes: (1) Sequential Windowed Reinforcement Learning, a framework that approximates a long-horizon MDP with a sequence of shorter-term MDPs with smooth quadratic cost functions learned from a small number of expert demonstrations; (2) Deep Discovery of Options, an algorithm that discovers hierarchical structure in the action space from observed demonstrations; (3) AlphaClean, a system that decomposes a data cleaning task into a set of independent search problems and uses deep Q-learning to share structure across the problems; and (4) a Learning Query Optimizer, a system that observes executions of a dynamic program for SQL query optimization and learns a model to predict cost-to-go values to greatly speed up future search problems.}
}
EndNote citation:
%0 Thesis
%A Krishnan, Sanjay
%T Hierarchical Deep Reinforcement Learning For Robotics and Data Science
%I EECS Department, University of California, Berkeley
%D 2018
%8 August 7
%@ UCB/EECS-2018-101
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2018/EECS-2018-101.html
%F Krishnan:EECS-2018-101