Disentangled Visual Generative Models

Dave Epstein

EECS Department, University of California, Berkeley

Technical Report No. UCB/EECS-2024-65

May 8, 2024

http://www2.eecs.berkeley.edu/Pubs/TechRpts/2024/EECS-2024-65.pdf

Generative modeling promises an elegant solution to learning about high-dimensional data distributions such as images and videos — but how can we expose and utilize the rich structure these models discover? Rather than just drawing new samples, how can an agent actually harness p(x) as a source of knowledge about how our world works? This thesis explores scalable inductive biases that unlock a generative model's understanding of the entities latent in visual data, enabling much richer interaction with the model as a result.

Advisor: Alexei (Alyosha) Efros


BibTeX citation:

@phdthesis{Epstein:EECS-2024-65,
    Author= {Epstein, Dave},
    Title= {Disentangled Visual Generative Models},
    School= {EECS Department, University of California, Berkeley},
    Year= {2024},
    Month= {May},
    Url= {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2024/EECS-2024-65.html},
    Number= {UCB/EECS-2024-65},
    Abstract= {Generative modeling promises an elegant solution to learning about high-dimensional data distributions such as images and videos — but how can we expose and utilize the rich structure these models discover? Rather than just drawing new samples, how can an agent actually harness p(x) as a source of knowledge about how our world works? This thesis explores scalable inductive biases that unlock a generative model's understanding of the entities latent in visual data, enabling much richer interaction with the model as a result.},
}

EndNote citation:

%0 Thesis
%A Epstein, Dave 
%T Disentangled Visual Generative Models
%I EECS Department, University of California, Berkeley
%D 2024
%8 May 8
%@ UCB/EECS-2024-65
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2024/EECS-2024-65.html
%F Epstein:EECS-2024-65