LabelAR: A spatial guidance interface for fast computer vision image collection

James Smith and Michael Laielli and Giscard Biamby and Trevor Darrell and Björn Hartmann

EECS Department, University of California, Berkeley

Technical Report No. UCB/EECS-2019-58

May 17, 2019

http://www2.eecs.berkeley.edu/Pubs/TechRpts/2019/EECS-2019-58.pdf

Computer vision is applied in an ever-expanding range of applications, many of which require custom training data to perform well. We present a novel interface for rapid collection and labeling of training images to improve computer-vision-based object detectors. LabelAR leverages the spatial tracking capabilities of an AR-enabled camera, allowing users to place persistent bounding volumes that stay centered on real-world objects. The interface then guides the user to move the camera to cover a wide variety of viewpoints. We eliminate the need for post-hoc manual labeling of images by automatically projecting 2D bounding boxes around objects in the images as they are captured from AR-marked viewpoints. In a user study with 12 participants, LabelAR significantly outperforms existing approaches in terms of the trade-off between model performance and collection time.
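The automatic labeling step in the abstract amounts to a standard pinhole-camera projection: the eight corners of the AR-placed 3D bounding volume are projected into the image at capture time, and the enclosing axis-aligned rectangle becomes the 2D label. A minimal NumPy sketch of that step, assuming a 3x4 world-to-camera extrinsic matrix and a 3x3 intrinsic matrix (the function name and arguments are hypothetical, not from the report):

```python
import numpy as np

def project_bounding_volume(corners_world, extrinsic, intrinsic, img_w, img_h):
    """Project the 8 corners of a 3D bounding volume into the image and
    return the enclosing 2D box as (x_min, y_min, x_max, y_max)."""
    # Homogeneous world coordinates: (8, 4)
    ones = np.ones((corners_world.shape[0], 1))
    pts_h = np.hstack([corners_world, ones])
    # World -> camera frame via the 3x4 extrinsic [R|t]: result is (3, 8)
    pts_cam = extrinsic @ pts_h.T
    # Perspective projection through the 3x3 intrinsic matrix
    pts_img = intrinsic @ pts_cam
    pts_img = pts_img[:2] / pts_img[2]  # divide by depth
    # Enclosing axis-aligned box, clipped to the image bounds
    x_min, y_min = np.clip(pts_img.min(axis=1), 0, [img_w, img_h])
    x_max, y_max = np.clip(pts_img.max(axis=1), 0, [img_w, img_h])
    return x_min, y_min, x_max, y_max
```

In practice the AR framework supplies the camera pose and intrinsics for each captured frame, so no manual annotation is needed; this sketch omits details such as corners behind the camera or boxes that leave the frame entirely.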

Advisor: Björn Hartmann


BibTeX citation:

@mastersthesis{Smith:EECS-2019-58,
    Author= {Smith, James and Laielli, Michael and Biamby, Giscard and Darrell, Trevor and Hartmann, Björn},
    Title= {LabelAR: A spatial guidance interface for fast computer vision image collection},
    School= {EECS Department, University of California, Berkeley},
    Year= {2019},
    Month= {May},
    Url= {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2019/EECS-2019-58.html},
    Number= {UCB/EECS-2019-58},
    Abstract= {Computer vision is applied in an ever expanding range of applications, many of which require custom training data to perform well.  We present a novel interface for rapid collection and labeling of training images to improve computer vision based object detectors. LabelAR leverages the spatial tracking capabilities of an AR-enabled camera, allowing users to place persistent bounding volumes that stay centered on real-world objects. The interface then guides the user to move the camera to cover a wide variety of viewpoints. We eliminate the need for post-hoc manual labeling of images by automatically projecting 2D bounding boxes around objects in the images as they are captured from AR-marked viewpoints. In a user study with 12 participants, LabelAR significantly outperforms existing approaches in terms of the trade-off between model performance and collection time.},
}

EndNote citation:

%0 Thesis
%A Smith, James 
%A Laielli, Michael 
%A Biamby, Giscard 
%A Darrell, Trevor 
%A Hartmann, Björn 
%T LabelAR: A spatial guidance interface for fast computer vision image collection
%I EECS Department, University of California, Berkeley
%D 2019
%8 May 17
%@ UCB/EECS-2019-58
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2019/EECS-2019-58.html
%F Smith:EECS-2019-58