Hierarchical Deep Reinforcement Learning For Robotics and Data Science
Sanjay Krishnan
EECS Department, University of California, Berkeley
Technical Report No. UCB/EECS-2018-101
August 7, 2018
http://www2.eecs.berkeley.edu/Pubs/TechRpts/2018/EECS-2018-101.pdf
This dissertation explores learning important structural features of a Markov Decision Process from offline data to significantly improve the sample efficiency, stability, and robustness of solutions, even with high-dimensional action spaces and long time horizons. It presents applications to surgical robot control, data cleaning, and generating efficient execution plans for relational queries. The dissertation contributes: (1) Sequential Windowed Reinforcement Learning, a framework that approximates a long-horizon MDP with a sequence of shorter-term MDPs with smooth quadratic cost functions learned from a small number of expert demonstrations; (2) Deep Discovery of Options, an algorithm that discovers hierarchical structure in the action space from observed demonstrations; (3) AlphaClean, a system that decomposes a data cleaning task into a set of independent search problems and uses deep Q-learning to share structure across the problems; and (4) a Learning Query Optimizer, a system that observes executions of a dynamic program for SQL query optimization and learns a model to predict cost-to-go values to greatly speed up future search problems.
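Since the abstract only describes contribution (1) at a high level, the following minimal Python sketch illustrates one plausible reading of the idea: split expert demonstrations into a sequence of windows and fit a smooth quadratic cost to each window, so that each window defines a shorter-horizon subproblem. The function names, the uniform segmentation, and the Gaussian-style quadratic fit are illustrative assumptions for exposition, not the dissertation's actual algorithm or API.

# Illustrative sketch only: a hypothetical version of the "sequence of
# shorter-term MDPs with quadratic costs" idea described in the abstract.
import numpy as np

def segment_demonstrations(demos, num_segments):
    """Split each demonstration trajectory into equal-length windows.

    The dissertation learns segment boundaries from data; uniform windows
    are used here only to keep the sketch short.
    """
    segments = [[] for _ in range(num_segments)]
    for traj in demos:                      # traj: array of shape (T, d)
        pieces = np.array_split(traj, num_segments)
        for k, piece in enumerate(pieces):
            segments[k].append(piece)
    return segments

def fit_quadratic_cost(segment_trajs):
    """Fit a smooth quadratic cost c(x) = (x - mu)^T Sigma^{-1} (x - mu)
    centered on the states visited in this segment."""
    states = np.vstack(segment_trajs)
    mu = states.mean(axis=0)
    cov = np.cov(states, rowvar=False) + 1e-3 * np.eye(states.shape[1])
    precision = np.linalg.inv(cov)
    return lambda x: float((x - mu) @ precision @ (x - mu))

# Toy usage: five random-walk "demonstrations", four windows.
demos = [np.random.randn(100, 2).cumsum(axis=0) for _ in range(5)]
segments = segment_demonstrations(demos, num_segments=4)
cost_fns = [fit_quadratic_cost(s) for s in segments]
print([round(c(np.zeros(2)), 2) for c in cost_fns])

Each short-horizon window k would then be solved against its own cost_fns[k] (for example, with a standard reinforcement learning or optimal control method) instead of against a single sparse, long-horizon reward; that decomposition is what the abstract credits with improving sample efficiency and stability.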
Advisor: Ken Goldberg
BibTeX citation:
@phdthesis{Krishnan:EECS-2018-101,
    Author = {Krishnan, Sanjay},
    Title = {Hierarchical Deep Reinforcement Learning For Robotics and Data Science},
    School = {EECS Department, University of California, Berkeley},
    Year = {2018},
    Month = {Aug},
    Url = {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2018/EECS-2018-101.html},
    Number = {UCB/EECS-2018-101},
    Abstract = {This dissertation explores learning important structural features of a Markov Decision Process from offline data to significantly improve the sample efficiency, stability, and robustness of solutions, even with high-dimensional action spaces and long time horizons. It presents applications to surgical robot control, data cleaning, and generating efficient execution plans for relational queries. The dissertation contributes: (1) Sequential Windowed Reinforcement Learning, a framework that approximates a long-horizon MDP with a sequence of shorter-term MDPs with smooth quadratic cost functions learned from a small number of expert demonstrations; (2) Deep Discovery of Options, an algorithm that discovers hierarchical structure in the action space from observed demonstrations; (3) AlphaClean, a system that decomposes a data cleaning task into a set of independent search problems and uses deep Q-learning to share structure across the problems; and (4) a Learning Query Optimizer, a system that observes executions of a dynamic program for SQL query optimization and learns a model to predict cost-to-go values to greatly speed up future search problems.}
}
EndNote citation:
%0 Thesis
%A Krishnan, Sanjay
%T Hierarchical Deep Reinforcement Learning For Robotics and Data Science
%I EECS Department, University of California, Berkeley
%D 2018
%8 August 7
%@ UCB/EECS-2018-101
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2018/EECS-2018-101.html
%F Krishnan:EECS-2018-101