Learning Single-view 3D Reconstruction of Objects and Scenes

Shubham Tulsiani

EECS Department, University of California, Berkeley

Technical Report No. UCB/EECS-2018-93

July 26, 2018

http://www2.eecs.berkeley.edu/Pubs/TechRpts/2018/EECS-2018-93.pdf

We address the task of inferring the 3D structure underlying an image, focusing on two questions: how we can plausibly obtain a supervisory signal for this task, and what forms of representation we should pursue. We first show that we can leverage image-based supervision to learn single-view 3D prediction, using geometry as a bridge between the learning systems and the available indirect supervision. We demonstrate that this approach enables learning 3D structure across diverse setups, e.g., learning deformable models, predictive models for volumetric 3D, or inferring textured meshes. We then make the case for inferring interpretable and compositional 3D representations. We present a method that discovers coherent compositional structure across objects in an unsupervised manner by attempting to assemble shapes from volumetric primitives, and then demonstrate the advantages of predicting similar factored 3D representations for complex scenes.

Advisor: Jitendra Malik


BibTeX citation:

@phdthesis{Tulsiani:EECS-2018-93,
    Author= {Tulsiani, Shubham},
    Title= {Learning Single-view 3D Reconstruction of Objects and Scenes},
    School= {EECS Department, University of California, Berkeley},
    Year= {2018},
    Month= {Jul},
    Url= {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2018/EECS-2018-93.html},
    Number= {UCB/EECS-2018-93},
    Abstract= {We address the task of inferring the 3D structure underlying an image, focusing on two questions: how we can plausibly obtain a supervisory signal for this task, and what forms of representation we should pursue. We first show that we can leverage image-based supervision to learn single-view 3D prediction, using geometry as a bridge between the learning systems and the available indirect supervision. We demonstrate that this approach enables learning 3D structure across diverse setups, e.g., learning deformable models, predictive models for volumetric 3D, or inferring textured meshes. We then make the case for inferring interpretable and compositional 3D representations. We present a method that discovers coherent compositional structure across objects in an unsupervised manner by attempting to assemble shapes from volumetric primitives, and then demonstrate the advantages of predicting similar factored 3D representations for complex scenes.},
}

EndNote citation:

%0 Thesis
%A Tulsiani, Shubham 
%T Learning Single-view 3D Reconstruction of Objects and Scenes
%I EECS Department, University of California, Berkeley
%D 2018
%8 July 26
%@ UCB/EECS-2018-93
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2018/EECS-2018-93.html
%F Tulsiani:EECS-2018-93