High-Fidelity 3D Mesh Reconstruction of Humans and Objects
Shubham Goel
EECS Department, University of California, Berkeley
Technical Report No. UCB/EECS-2023-254
December 1, 2023
http://www2.eecs.berkeley.edu/Pubs/TechRpts/2023/EECS-2023-254.pdf
Humans perceive the world through their eyes, where the images formed on the retina are two-dimensional projections of the underlying three-dimensional world. Akin to human vision, the goal of computer vision is to extract information about the 3D world from 2D images. A fundamental problem in computer vision is to recover the 3D structure underlying such 2D images. Even though this problem is mathematically ill-posed, the ambiguity can be resolved either by using multiple 2D views or by using priors about how the world is structured.
In this thesis, I present my work on high-fidelity 3D mesh reconstruction of humans and objects from 2D images. I discuss the more classical setting of optimizing shape and texture from multiple input images, as well as how we can learn priors that enable mesh reconstruction even from a single image. Specifically, I first present work on multi-view 3D reconstruction, where we reconstruct meshes of an object given a few images with noisy camera poses. Then, I continue with 3D reconstruction from single images, enabled by learning category-specific shape priors from natural image datasets. Finally, I focus on learning single-view 3D human reconstruction using large models and large-scale data. Such robust 3D reconstruction of humans enables downstream applications like 3D tracking and action recognition.
Advisors: Jitendra Malik and Angjoo Kanazawa
BibTeX citation:
@phdthesis{Goel:EECS-2023-254,
    Author = {Goel, Shubham},
    Title = {High-Fidelity 3D Mesh Reconstruction of Humans and Objects},
    School = {EECS Department, University of California, Berkeley},
    Year = {2023},
    Month = {Dec},
    Url = {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2023/EECS-2023-254.html},
    Number = {UCB/EECS-2023-254},
    Abstract = {Humans perceive the world through their eyes, where the images formed on the retina are two-dimensional projections of the underlying three-dimensional world. Akin to human vision, the goal of computer vision is to extract information about the 3D world from 2D images. A fundamental problem in computer vision is to recover the 3D structure underlying such 2D images. Even though this problem is mathematically ill-posed, the ambiguity can be resolved either by using multiple 2D views or by using priors about how the world is structured. In this thesis, I present my work on high-fidelity 3D mesh reconstruction of humans and objects from 2D images. I discuss the more classical setting of optimizing shape and texture from multiple input images, as well as how we can learn priors that enable mesh reconstruction even from a single image. Specifically, I first present work on multi-view 3D reconstruction, where we reconstruct meshes of an object given a few images with noisy camera poses. Then, I continue with 3D reconstruction from single images, enabled by learning category-specific shape priors from natural image datasets. Finally, I focus on learning single-view 3D human reconstruction using large models and large-scale data. Such robust 3D reconstruction of humans enables downstream applications like 3D tracking and action recognition.}
}
EndNote citation:
%0 Thesis
%A Goel, Shubham
%T High-Fidelity 3D Mesh Reconstruction of Humans and Objects
%I EECS Department, University of California, Berkeley
%D 2023
%8 December 1
%@ UCB/EECS-2023-254
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2023/EECS-2023-254.html
%F Goel:EECS-2023-254