Yarden Goraly
EECS Department
University of California, Berkeley
Technical Report No. UCB/EECS-2025-112
May 16, 2025
http://www2.eecs.berkeley.edu/Pubs/TechRpts/2025/EECS-2025-112.pdf
Unsupervised object-centric representation learning is an active area of research with promising applications to robotics and computer vision. These models go beyond segmenting the objects in a scene: the goal is for them to develop a disentangled internal representation of objects in latent space. Some models can even encode specific interpretable properties of these objects, such as position, size, and shape, in the latent space. In this work, we review the literature and history of unsupervised object-centric learning, evaluating the impact of each model and how it compares to human perception. We then examine the current theory of object-centric latent disentanglement and suggest avenues for future research. Finally, we present several novel experiments that improve the segmentation performance of these methods and address sim-to-real problems. We find that knowledge distillation can improve the segmentation performance of unsupervised object-centric models while retaining the latent encoding of object properties. We also identify ways in which the type of dataset affects reconstruction quality for real and synthetic inputs.
Advisor: Claire Tomlin
BibTeX citation:
@mastersthesis{Goraly:EECS-2025-112,
    Author = {Goraly, Yarden},
    Editor = {Stocking, Kaylene and Tomlin, Claire},
    Title = {On Unsupervised Object-Centric Representation Learning: Advantages and Shortcomings},
    School = {EECS Department, University of California, Berkeley},
    Year = {2025},
    Month = {May},
    URL = {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2025/EECS-2025-112.html},
    Number = {UCB/EECS-2025-112},
    Abstract = {Unsupervised object-centric representation learning is an active area of research with promising applications to robotics and computer vision. These models go beyond the ability to segment objects in a scene. The goal is for these models to develop a disentangled internal representation of objects in latent space. Some models can even encode specific interpretable properties of these objects, such as position, size and shape, in the latent space. In this work, we review the current literature and history of unsupervised object-centric learning and evaluate the impact of each model and how they compare to human perception. We then look at the current theory related to object-centric latent disentanglement and suggest avenues for future research. Finally, we look into a few novel experiments that improve the segmentation performance of these methods and solve sim-to-real problems. We found that it is possible to improve segmentation performance of unsupervised object-centric models using knowledge distillation while retaining latent encoding of object properties. We also uncover unique ways in which the type of dataset can affect reconstruction quality for real and synthetic inputs.}
}
EndNote citation:
%0 Thesis
%A Goraly, Yarden
%E Stocking, Kaylene
%E Tomlin, Claire
%T On Unsupervised Object-Centric Representation Learning: Advantages and Shortcomings
%I EECS Department, University of California, Berkeley
%D 2025
%8 May 16
%@ UCB/EECS-2025-112
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2025/EECS-2025-112.html
%F Goraly:EECS-2025-112