3D Object Detection with Sparse Sampling Neural Networks

Ryan Goy and Avideh Zakhor

EECS Department
University of California, Berkeley
Technical Report No. UCB/EECS-2018-172
December 14, 2018

http://www2.eecs.berkeley.edu/Pubs/TechRpts/2018/EECS-2018-172.pdf

The advent of inexpensive 3D sensors has resulted in an abundance of 3D point clouds and datasets. For instance, RGB-D sensors such as the Kinect can produce 3D point clouds by projecting 2D pixels into 3D world coordinates using depth and pose information. Recent advances in deep learning have yielded promising solutions to 2D and 3D recognition problems, including 3D object detection. Compared to 3D classification, 3D object detection has received less attention in the research community. In this thesis, we propose a novel approach to 3D object detection, the Sparse Sampling Neural Network (SSNN), which takes large, unordered point clouds as input. We overcome the challenges of processing three-dimensional data by convolving a collection of "probes" across a point cloud input, which then feeds into a 3D convolutional neural network. This approach allows us to efficiently and accurately infer bounding boxes and their associated classes without discretizing the volumetric space into voxels. We demonstrate that our network performs well on indoor scenes, achieving mean Average Precision (mAP) of 54.48% on the Matterport3D dataset, 62.93% on the Stanford Large-Scale 3D Indoor Spaces Dataset, and 48.4% on the SUN RGB-D dataset.
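To make the probe idea concrete, here is a minimal sketch of one plausible reading of probe-based sampling: place a few randomly offset probe points in each cell of a coarse 3D grid over the point cloud, and record each probe's distance to its nearest input point. This yields a dense tensor suitable for a 3D convolutional network without voxelizing the points themselves. All names, shapes, and the nearest-distance response are illustrative assumptions, not the thesis's exact formulation.

```python
import numpy as np

def probe_response(points, grid_size=8, probes_per_cell=4, seed=0):
    """Illustrative probe sampling (an assumption, not the SSNN spec).

    points: (N, 3) array of xyz coordinates.
    Returns a (grid_size, grid_size, grid_size, probes_per_cell) tensor of
    nearest-point distances, one value per probe.
    """
    rng = np.random.default_rng(seed)
    lo, hi = points.min(axis=0), points.max(axis=0)
    cell = (hi - lo) / grid_size
    # Cell corners of a coarse grid spanning the cloud's bounding box.
    idx = np.stack(np.meshgrid(*[np.arange(grid_size)] * 3, indexing="ij"),
                   axis=-1)                                # (g, g, g, 3)
    corners = lo + idx * cell
    # Each probe sits at a random offset inside its cell.
    offsets = rng.uniform(0, 1, (grid_size,) * 3 + (probes_per_cell, 3)) * cell
    probes = corners[..., None, :] + offsets               # (g, g, g, p, 3)
    # Probe response: distance to the nearest input point (brute force).
    flat = probes.reshape(-1, 3)
    d = np.linalg.norm(flat[:, None, :] - points[None, :, :], axis=-1).min(axis=1)
    return d.reshape((grid_size,) * 3 + (probes_per_cell,))

# Toy usage: 500 random points in a unit cube.
pts = np.random.default_rng(1).uniform(0, 1, (500, 3))
resp = probe_response(pts)
print(resp.shape)  # (8, 8, 8, 4)
```

The resulting fixed-size grid of responses is what a standard 3D CNN could then consume; in practice the probe layout and response function would be learned or tuned rather than fixed as here.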

Advisor: Avideh Zakhor


BibTeX citation:

@mastersthesis{Goy:EECS-2018-172,
    Author = {Goy, Ryan and Zakhor, Avideh},
    Title = {3D Object Detection with Sparse Sampling Neural Networks},
    School = {EECS Department, University of California, Berkeley},
    Year = {2018},
    Month = {Dec},
    URL = {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2018/EECS-2018-172.html},
    Number = {UCB/EECS-2018-172},
    Abstract = {The advent of inexpensive 3D sensors has resulted in an abundance of 3D
point clouds and datasets. For instance, RGB-D sensors such as the Kinect can produce 3D
point clouds by projecting 2D pixels into 3D world coordinates using depth and pose
information. Recent advances in deep learning have yielded promising solutions to 2D and
3D recognition problems, including 3D object detection. Compared to 3D classification, 3D
object detection has received less attention in the research community. In this thesis, we
propose a novel approach to 3D object detection, the Sparse Sampling Neural Network (SSNN),
which takes large, unordered point clouds as input. We overcome the challenges of processing
three-dimensional data by convolving a collection of "probes" across a point cloud input,
which then feeds into a 3D convolutional neural network. This approach allows us to
efficiently and accurately infer bounding boxes and their associated classes without
discretizing the volumetric space into voxels. We demonstrate that our network performs well
on indoor scenes, achieving mean Average Precision (mAP) of 54.48% on the Matterport3D
dataset, 62.93% on the Stanford Large-Scale 3D Indoor Spaces Dataset, and 48.4% on the
SUN RGB-D dataset.}
}

EndNote citation:

%0 Thesis
%A Goy, Ryan
%A Zakhor, Avideh
%T 3D Object Detection with Sparse Sampling Neural Networks
%I EECS Department, University of California, Berkeley
%D 2018
%8 December 14
%@ UCB/EECS-2018-172
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2018/EECS-2018-172.html
%F Goy:EECS-2018-172