Efficient Policy Learning for Robust Robot Grasping

Jeffrey Mahler

EECS Department
University of California, Berkeley
Technical Report No. UCB/EECS-2018-120
August 10, 2018

http://www2.eecs.berkeley.edu/Pubs/TechRpts/2018/EECS-2018-120.pdf

While humans can grasp and manipulate novel objects with ease, rapid and reliable robot grasping of a wide variety of objects is highly challenging due to sensor noise, partial observability, imprecise control, and hardware limitations. Analytic approaches to robot grasping use models from physics to predict grasp success but require precise knowledge of the robot and objects in the environment, making them well-suited for controlled industrial applications but difficult to scale to many objects. On the other hand, deep neural networks trained on large datasets of grasps labeled with empirical successes and failures can rapidly plan grasps across a diverse set of objects, but data collection is tedious, robot-specific, and prone to mislabeling.

To improve the efficiency of learning deep grasping policies, we propose a hybrid method to automate dataset collection by generating millions of synthetic 3D point clouds, robot grasps, and success metrics using analytic models of contact, collision geometry, and image formation. We present the Dexterity-Network (Dex-Net), a framework for generating training datasets by analyzing mechanical models of contact forces and torques under stochastic perturbations across thousands of 3D object CAD models. We describe dataset generation models for training policies to lift and transport novel objects from a tabletop or cluttered bin using a 3D depth sensor and a parallel-jaw (two-finger) or suction cup gripper. To study the effects of learning from massive amounts of training data, we generate datasets containing millions of training examples using distributed cloud computing, simulations, and parallel GPU processing. We use these datasets to train robust grasping policies based on Grasp Quality Convolutional Neural Networks (GQ-CNNs) that take as input a depth image and a candidate grasp with up to five degrees of freedom and predict the probability of grasp success on an object in the image. To transfer from simulation to reality, we develop novel analytic grasp success metrics based on resisting disturbing forces and torques under stochastic perturbations and bounding an object's mobility under an energy field such as gravity. In addition, we study techniques in algorithmic supervision to guide dataset collection using full knowledge of the object geometry and pose in simulation. We explore extensions to learning policies that sequentially pick novel objects from dense clutter in a bin and that can rapidly decide which gripper hardware is best in a particular scenario.

To substantiate the method, we describe thousands of experimental trials on a physical robot which suggest that deep learning on synthetic Dex-Net datasets can be used to rapidly and reliably plan grasps across a diverse set of novel objects for a variety of depth sensors, robot grippers, and robot arms. Results suggest that policies trained on Dex-Net datasets can achieve up to 95% success in picking novel objects from densely cluttered bins at a rate of over 310 mean picks per hour with no additional training or tuning on the physical system.

Advisor: Ken Goldberg


BibTeX citation:

@phdthesis{Mahler:EECS-2018-120,
    Author = {Mahler, Jeffrey},
    Title = {Efficient Policy Learning for Robust Robot Grasping},
    School = {EECS Department, University of California, Berkeley},
    Year = {2018},
    Month = aug,
    URL = {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2018/EECS-2018-120.html},
    Number = {UCB/EECS-2018-120},
    Abstract = {While humans can grasp and manipulate novel objects with ease, rapid and reliable robot grasping of a wide variety of objects is highly challenging due to sensor noise, partial observability, imprecise control, and hardware limitations.
Analytic approaches to robot grasping use models from physics to predict grasp success but require precise knowledge of the robot and objects in the environment, making them well-suited for controlled industrial applications but difficult to scale to many objects.
On the other hand, deep neural networks trained on large datasets of grasps labeled with empirical successes and failures can rapidly plan grasps across a diverse set of objects, but data collection is tedious, robot-specific, and prone to mislabeling.

To improve the efficiency of learning deep grasping policies, we propose a hybrid method to automate dataset collection by generating millions of synthetic 3D point clouds, robot grasps, and success metrics using analytic models of contact, collision geometry, and image formation.
We present the Dexterity-Network (Dex-Net), a framework for generating training datasets by analyzing mechanical models of contact forces and torques under stochastic perturbations across thousands of 3D object CAD models.
We describe dataset generation models for training policies to lift and transport novel objects from a tabletop or cluttered bin using a 3D depth sensor and a parallel-jaw (two-finger) or suction cup gripper.
To study the effects of learning from massive amounts of training data, we generate datasets containing millions of training examples using distributed cloud computing, simulations, and parallel GPU processing.
We use these datasets to train robust grasping policies based on Grasp Quality Convolutional Neural Networks (GQ-CNNs) that take as input a depth image and a candidate grasp with up to five degrees of freedom and predict the probability of grasp success on an object in the image.
To transfer from simulation to reality, we develop novel analytic grasp success metrics based on resisting disturbing forces and torques under stochastic perturbations and bounding an object's mobility under an energy field such as gravity.
In addition, we study techniques in algorithmic supervision to guide dataset collection using full knowledge of the object geometry and pose in simulation.
We explore extensions to learning policies that sequentially pick novel objects from dense clutter in a bin and that can rapidly decide which gripper hardware is best in a particular scenario.

To substantiate the method, we describe thousands of experimental trials on a physical robot which suggest that deep learning on synthetic Dex-Net datasets can be used to rapidly and reliably plan grasps across a diverse set of novel objects for a variety of depth sensors, robot grippers, and robot arms.
Results suggest that policies trained on Dex-Net datasets can achieve up to 95\% success in picking novel objects from densely cluttered bins at a rate of over 310 mean picks per hour with no additional training or tuning on the physical system.}
}

EndNote citation:

%0 Thesis
%A Mahler, Jeffrey
%T Efficient Policy Learning for Robust Robot Grasping
%I EECS Department, University of California, Berkeley
%D 2018
%8 August 10
%@ UCB/EECS-2018-120
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2018/EECS-2018-120.html
%F Mahler:EECS-2018-120