TY - JOUR
T1 - Global convergence of Langevin dynamics based algorithms for nonconvex optimization
AU - Xu, Pan
AU - Zou, Difan
AU - Chen, Jinghui
AU - Gu, Quanquan
N1 - Funding Information:
We would like to thank the anonymous reviewers for their helpful comments. We thank Maxim Raginsky for insightful comments and discussion on the first version of this paper. We also thank Tianhao Wang for discussion on this work. This research was sponsored in part by National Science Foundation grant IIS-1652539. The views and conclusions contained in this paper are those of the authors and should not be interpreted as representing any funding agencies.
Publisher Copyright:
© 2018 Curran Associates Inc. All rights reserved.
PY - 2018
Y1 - 2018
N2 - We present a unified framework to analyze the global convergence of Langevin dynamics based algorithms for nonconvex finite-sum optimization with n component functions. At the core of our analysis is a direct analysis of the ergodicity of the numerical approximations to Langevin dynamics, which leads to faster convergence rates. Specifically, we show that gradient Langevin dynamics (GLD) and stochastic gradient Langevin dynamics (SGLD) converge to the almost minimizer within Õ(nd/(λε)) and Õ(d^7/(λ^5 ε^5)) stochastic gradient evaluations respectively, where d is the problem dimension, and λ is the spectral gap of the Markov chain generated by GLD. Both results improve upon the best known gradient complexity results [45]. Furthermore, for the first time we prove the global convergence guarantee for variance reduced stochastic gradient Langevin dynamics (SVRG-LD) to the almost minimizer within Õ(√n d^5/(λ^4 ε^{5/2})) stochastic gradient evaluations, which outperforms the gradient complexities of GLD and SGLD in a wide regime. Our theoretical analyses shed some light on using Langevin dynamics based algorithms for nonconvex optimization with provable guarantees.
AB - We present a unified framework to analyze the global convergence of Langevin dynamics based algorithms for nonconvex finite-sum optimization with n component functions. At the core of our analysis is a direct analysis of the ergodicity of the numerical approximations to Langevin dynamics, which leads to faster convergence rates. Specifically, we show that gradient Langevin dynamics (GLD) and stochastic gradient Langevin dynamics (SGLD) converge to the almost minimizer within Õ(nd/(λε)) and Õ(d^7/(λ^5 ε^5)) stochastic gradient evaluations respectively, where d is the problem dimension, and λ is the spectral gap of the Markov chain generated by GLD. Both results improve upon the best known gradient complexity results [45]. Furthermore, for the first time we prove the global convergence guarantee for variance reduced stochastic gradient Langevin dynamics (SVRG-LD) to the almost minimizer within Õ(√n d^5/(λ^4 ε^{5/2})) stochastic gradient evaluations, which outperforms the gradient complexities of GLD and SGLD in a wide regime. Our theoretical analyses shed some light on using Langevin dynamics based algorithms for nonconvex optimization with provable guarantees.
UR - http://www.scopus.com/inward/record.url?scp=85063009242&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85063009242&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:85063009242
VL - 2018-December
SP - 3122
EP - 3133
JO - Advances in Neural Information Processing Systems
JF - Advances in Neural Information Processing Systems
SN - 1049-5258
T2 - 32nd Conference on Neural Information Processing Systems, NeurIPS 2018
Y2 - 2 December 2018 through 8 December 2018
ER -