The Interplay between Sampling and Optimization

Cheng Xiang

EECS Department, University of California, Berkeley

Technical Report No. UCB/EECS-2020-145

August 12, 2020

http://www2.eecs.berkeley.edu/Pubs/TechRpts/2020/EECS-2020-145.pdf

We study the connections between optimization and sampling. In one direction, we study sampling algorithms from an optimization perspective. We will see how the Langevin MCMC algorithm can be viewed as a deterministic gradient descent in probability space, a view that enables convergence analysis in KL divergence. We will also see how adding a momentum term improves the convergence rate of Langevin MCMC, much as acceleration does for gradient descent. Finally, we will study the problem of sampling from non-logconcave distributions, which is roughly analogous to non-convex optimization.
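
For concreteness, the following is a minimal NumPy sketch of the unadjusted Langevin algorithm for sampling from p(x) proportional to exp(-f(x)); the function names and the Gaussian example are illustrative assumptions, not code from the thesis. Each iterate is exactly a gradient-descent step on f plus injected Gaussian noise, which is the update that the gradient-descent-in-probability-space view formalizes.

    import numpy as np

    def langevin_mcmc(grad_f, x0, step_size, n_steps, rng=None):
        """Unadjusted Langevin algorithm targeting p(x) ~ exp(-f(x)).

        Each update is a gradient step on f plus sqrt(2 * step_size)
        Gaussian noise; viewed in distribution space, the iterates
        follow a discretized gradient flow of KL(. || p).
        """
        rng = np.random.default_rng() if rng is None else rng
        x = np.array(x0, dtype=float)
        samples = []
        for _ in range(n_steps):
            x = (x - step_size * grad_f(x)
                 + np.sqrt(2.0 * step_size) * rng.standard_normal(x.shape))
            samples.append(x.copy())
        return np.array(samples)

    # Hypothetical example: standard Gaussian target, f(x) = ||x||^2 / 2.
    samples = langevin_mcmc(grad_f=lambda x: x, x0=np.zeros(2),
                            step_size=0.01, n_steps=5000)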

Conversely, we will also study optimization algorithms from a sampling perspective. We will approximate stochastic gradient descent by a Langevin-like stochastic differential equation and use this approximation to explain some of its remarkable generalization properties.
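
In the same spirit, here is a sketch of that approximation under the assumption that the minibatch gradient noise has a fixed covariance C: one Euler-Maruyama step of the Langevin-like SDE d(theta) = -grad f(theta) dt + sqrt(eta) C^{1/2} dB_t, taken with dt = eta, matches the mean and covariance of one SGD step with learning rate eta. The function and parameter names here are hypothetical.

    import numpy as np

    def sgd_sde_step(theta, grad_f, noise_cov_sqrt, lr, rng):
        """One Euler-Maruyama step of the SDE modeling SGD.

        SGD computes theta - lr * (grad_f(theta) + xi), where xi is
        minibatch gradient noise with covariance C. Discretizing
        d(theta) = -grad_f(theta) dt + sqrt(lr) C^{1/2} dB_t with
        dt = lr gives a noise term lr * C^{1/2} z whose covariance,
        lr^2 * C, matches that of the SGD noise term lr * xi.
        """
        z = rng.standard_normal(theta.shape)
        return theta - lr * grad_f(theta) + lr * (noise_cov_sqrt @ z)

    # Hypothetical usage: quadratic loss with isotropic gradient noise.
    rng = np.random.default_rng(0)
    theta = np.ones(2)
    for _ in range(1000):
        theta = sgd_sde_step(theta, lambda t: t, 0.1 * np.eye(2),
                             lr=0.05, rng=rng)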

Advisors: Michael Jordan and Peter Bartlett


BibTeX citation:

@phdthesis{Xiang:EECS-2020-145,
    Author= {Xiang, Cheng},
    Title= {The Interplay between Sampling and Optimization},
    School= {EECS Department, University of California, Berkeley},
    Year= {2020},
    Month= {Aug},
    Url= {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2020/EECS-2020-145.html},
    Number= {UCB/EECS-2020-145},
    Abstract= {
We study the connections between optimization and sampling. In one direction, we study sampling algorithms from an optimization perspective. We will see how the Langevin MCMC algorithm can be viewed as a deterministic gradient descent in probability space, a view that enables convergence analysis in KL divergence. We will also see how adding a momentum term improves the convergence rate of Langevin MCMC, much as acceleration does for gradient descent. Finally, we will study the problem of sampling from non-logconcave distributions, which is roughly analogous to non-convex optimization.

Conversely, we will also study optimization algorithms from a sampling perspective. We will approximate stochastic gradient descent by a Langevin-like stochastic differential equation and use this approximation to explain some of its remarkable generalization properties.},
}

EndNote citation:

%0 Thesis
%A Xiang, Cheng 
%T The Interplay between Sampling and Optimization
%I EECS Department, University of California, Berkeley
%D 2020
%8 August 12
%@ UCB/EECS-2020-145
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2020/EECS-2020-145.html
%F Xiang:EECS-2020-145