Bayesian Posterior Sampling via Stochastic Gradient Descent with Collisions

Quico Spaen

EECS Department, University of California, Berkeley

Technical Report No. UCB/EECS-2019-169

December 10, 2019

http://www2.eecs.berkeley.edu/Pubs/TechRpts/2019/EECS-2019-169.pdf

Markov Chain Monte Carlo algorithms, with step proposals based on Hamiltonian or Langevin dynamics, are commonly used in Bayesian machine learning and inference methods to sample from the posterior distribution over model parameters. In addition to providing accurate predictions, these methods quantify parameter uncertainty and are robust to overfitting. Until recently, these methods were limited to small datasets since they require a full pass over the data per update step. New developments have enabled mini-batch updates through the use of a new mini-batch acceptance test and by combining stochastic gradient descent with additional noise to correct the noise distribution. We propose a novel method that redistributes the stochastic gradient noise across all degrees of freedom via collisions between particles instead of inserting additional noise into the system. Since no additional noise is added to the system, the proposed method has a higher rate of diffusion. This should result in faster convergence as well as improved exploration of the posterior distribution. We observe this behavior in initial experiments on a multivariate Gaussian model with a highly skewed and correlated distribution.
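
For readers unfamiliar with the "stochastic gradient descent with additional noise" approach that the abstract contrasts against, the following is a minimal sketch of stochastic gradient Langevin dynamics (SGLD) on a toy problem. It is not the method proposed in the report, and it does not implement collisions. The model (inferring the mean of a unit-variance Gaussian under a zero-mean Gaussian prior), the function name `sgld_sample`, and all parameter values are assumptions chosen purely for illustration.

```python
import numpy as np

def sgld_sample(data, n_steps=5000, batch_size=32, step_size=1e-4,
                prior_var=10.0, seed=0):
    """Illustrative SGLD sampler for the mean of a unit-variance Gaussian.

    Assumed toy model (not taken from the report):
        theta ~ N(0, prior_var),  x_i | theta ~ N(theta, 1)
    """
    rng = np.random.default_rng(seed)
    n = len(data)
    theta = 0.0
    samples = np.empty(n_steps)
    for t in range(n_steps):
        # Draw a mini-batch instead of making a full pass over the data.
        batch = rng.choice(data, size=batch_size, replace=False)
        # Stochastic gradient of the log posterior: prior term plus the
        # mini-batch likelihood term rescaled to the full dataset size.
        grad = -theta / prior_var + (n / batch_size) * np.sum(batch - theta)
        # Langevin update: half a step along the gradient plus explicitly
        # injected Gaussian noise with variance equal to the step size.
        noise = rng.normal(0.0, np.sqrt(step_size))
        theta += 0.5 * step_size * grad + noise
        samples[t] = theta
    return samples

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    data = rng.normal(2.0, 1.0, size=1000)
    draws = sgld_sample(data)[1000:]  # discard early iterations as burn-in
    print("data mean:", data.mean(), "mean of SGLD draws:", draws.mean())
```

In this sketch each update contains two noise sources: the noise from sub-sampling the gradient and the explicitly injected `noise` term. The abstract's proposal is to avoid the injected term and instead redistribute the gradient noise through particle collisions.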

Advisor: John F. Canny

BibTeX citation:

@mastersthesis{Spaen:EECS-2019-169,
    Author= {Spaen, Quico},
    Title= {Bayesian Posterior Sampling via Stochastic Gradient Descent with Collisions},
    School= {EECS Department, University of California, Berkeley},
    Year= {2019},
    Month= {Dec},
    Url= {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2019/EECS-2019-169.html},
    Number= {UCB/EECS-2019-169},
    Abstract= {Markov Chain Monte Carlo algorithms, with step proposals based on Hamiltonian or Langevin dynamics, are commonly used in Bayesian machine learning and inference methods to sample from the posterior distribution over model parameters. In addition to providing accurate predictions, these methods quantify parameter uncertainty and are robust to overfitting. Until recently, these methods were limited to small datasets since they require a full pass over the data per update step. New developments have enabled mini-batch updates through the use of a new mini-batch acceptance test and by combining stochastic gradient descent with additional noise to correct the noise distribution.
	
We propose a novel method that redistributes the stochastic gradient noise across all degrees of freedom via collisions between particles instead of inserting additional noise into the system. Since no additional noise is added to the system, the proposed method has a higher rate of diffusion. This should result in faster convergence as well as improved exploration of the posterior distribution. We observe this behavior in initial experiments on a multivariate Gaussian model with a highly skewed and correlated distribution.},
}

EndNote citation:

%0 Thesis
%A Spaen, Quico 
%T Bayesian Posterior Sampling via Stochastic Gradient Descent with Collisions
%I EECS Department, University of California, Berkeley
%D 2019
%8 December 10
%@ UCB/EECS-2019-169
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2019/EECS-2019-169.html
%F Spaen:EECS-2019-169