Machine Learning for Deep Image Synthesis

Taesung Park

EECS Department, University of California, Berkeley

Technical Report No. UCB/EECS-2021-143

May 20, 2021

http://www2.eecs.berkeley.edu/Pubs/TechRpts/2021/EECS-2021-143.pdf

Common image editing methods focus on low-level characteristics. In this thesis, I leverage machine learning to enable image editing that operates at a higher conceptual level. Fundamentally, the proposed methods aim to factor out the visual information that must be preserved during editing from the information that may be edited, by incorporating generic visual knowledge. As a result, the new methods can transform images in human-interpretable ways, such as turning one object into another, stylizing photographs into a specific artist's paintings, or adding a sunset to a photo taken in daylight. We explore designing such methods in settings with varying amounts of supervision: per-pixel labels, per-image labels, and no labels. First, using per-pixel supervision, I propose a new deep neural network architecture that can synthesize realistic images from scene layouts and optional target styles. Second, using per-image supervision, I explore the task of domain translation, where an input image of one class is transformed into another. Lastly, using no labels at all, I design a framework that can still discover disentangled manipulation of structure and texture from a collection of unlabeled images. We present convincing visuals in a wide range of applications, including interactive photo drawing tools, object transfiguration, domain gap reduction between virtual and real environments, and realistic manipulation of image textures.
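To give a concrete sense of the per-pixel setting, the sketch below shows what a generator conditioned on a scene layout and an optional target style might look like at the interface level. This is a minimal, hypothetical PyTorch illustration of layout-conditioned synthesis in general, not the architecture proposed in the thesis; the class and parameter names (LayoutConditionedGenerator, num_classes, style_dim) are assumptions made here for illustration only.

import torch
import torch.nn as nn
from typing import Optional


class LayoutConditionedGenerator(nn.Module):
    """Hypothetical sketch: render an image from a per-pixel semantic layout
    and an optional style vector. Not the architecture from the thesis."""

    def __init__(self, num_classes: int = 20, style_dim: int = 64, width: int = 64):
        super().__init__()
        self.style_dim = style_dim
        # Encode the one-hot semantic layout into feature maps.
        self.layout_encoder = nn.Sequential(
            nn.Conv2d(num_classes, width, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(width, width, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
        )
        # Map the style vector to per-channel scale and shift, so the same
        # layout can be rendered with different appearances.
        self.style_to_affine = nn.Linear(style_dim, 2 * width)
        self.to_rgb = nn.Conv2d(width, 3, kernel_size=3, padding=1)

    def forward(self, layout: torch.Tensor, style: Optional[torch.Tensor] = None) -> torch.Tensor:
        # layout: (B, num_classes, H, W) one-hot semantic map
        feat = self.layout_encoder(layout)
        if style is None:
            # No target style given: sample one at random.
            style = torch.randn(layout.size(0), self.style_dim, device=layout.device)
        scale, shift = self.style_to_affine(style).chunk(2, dim=1)
        feat = feat * (1 + scale[:, :, None, None]) + shift[:, :, None, None]
        return torch.tanh(self.to_rgb(feat))


if __name__ == "__main__":
    g = LayoutConditionedGenerator(num_classes=20)
    layout = torch.zeros(1, 20, 128, 128)
    layout[:, 3] = 1.0  # label every pixel with class 3, e.g. "sky"
    image = g(layout)   # no style given, so one is sampled
    print(image.shape)  # torch.Size([1, 3, 128, 128])

The real systems described in the thesis are of course far more elaborate; the point here is only the interface: a per-pixel label map in, a realistic image out, with an optional style code controlling appearance.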

Advisor: Alexei (Alyosha) Efros


BibTeX citation:

@phdthesis{Park:EECS-2021-143,
    Author= {Park, Taesung},
    Title= {Machine Learning for Deep Image Synthesis},
    School= {EECS Department, University of California, Berkeley},
    Year= {2021},
    Month= {May},
    Url= {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2021/EECS-2021-143.html},
    Number= {UCB/EECS-2021-143},
    Abstract= {Common image editing methods focus on low-level characteristics. In this thesis, I leverage machine learning to enable image editing that operates at a higher conceptual level. Fundamentally, the proposed methods aim to factor out the visual information that must be preserved during editing from the information that may be edited, by incorporating generic visual knowledge. As a result, the new methods can transform images in human-interpretable ways, such as turning one object into another, stylizing photographs into a specific artist's paintings, or adding a sunset to a photo taken in daylight. We explore designing such methods in settings with varying amounts of supervision: per-pixel labels, per-image labels, and no labels. First, using per-pixel supervision, I propose a new deep neural network architecture that can synthesize realistic images from scene layouts and optional target styles. Second, using per-image supervision, I explore the task of domain translation, where an input image of one class is transformed into another. Lastly, using no labels at all, I design a framework that can still discover disentangled manipulation of structure and texture from a collection of unlabeled images. We present convincing visuals in a wide range of applications, including interactive photo drawing tools, object transfiguration, domain gap reduction between virtual and real environments, and realistic manipulation of image textures.},
}

EndNote citation:

%0 Thesis
%A Park, Taesung 
%T Machine Learning for Deep Image Synthesis
%I EECS Department, University of California, Berkeley
%D 2021
%8 May 20
%@ UCB/EECS-2021-143
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2021/EECS-2021-143.html
%F Park:EECS-2021-143