TY - JOUR
T1 - Revisiting Normalized Gradient Descent
T2 - Fast Evasion of Saddle Points
AU - Murray, Ryan
AU - Swenson, Brian
AU - Kar, Soummya
N1 - Funding Information:
Manuscript received July 23, 2018; revised December 3, 2018; accepted March 20, 2019. Date of publication May 6, 2019; date of current version October 30, 2019. The work of B. Swenson and S. Kar was supported in part by the National Science Foundation under Grant CCF-1513936. Recommended by Associate Editor C. W. Scherer. R. Murray and B. Swenson contributed equally to this paper. (Corresponding author: Brian Swenson.) R. Murray is with the Department of Mathematics, Pennsylvania State University, State College, PA 16801 USA (e-mail: rwm22@psu.edu).
Publisher Copyright:
© 2019 IEEE.
PY - 2019/11
Y1 - 2019/11
N2 - The paper considers normalized gradient descent (NGD), a natural modification of classical gradient descent (GD) in optimization problems. It is shown that, contrary to GD, NGD escapes saddle points 'quickly.' A serious shortcoming of GD in nonconvex problems is that it can take arbitrarily long to escape from the neighborhood of a saddle point. In practice, this issue can significantly slow the convergence of GD, particularly in high-dimensional nonconvex problems. The paper focuses on continuous-time dynamics. It is shown that 1) NGD 'almost never' converges to saddle points and 2) the time required for NGD to escape from a ball of radius r about a saddle point x* is at most 5κr, where κ is the condition number of the Hessian of f at x*. As a simple application of these results, a global convergence-time bound is established for NGD under mild assumptions.
AB - The paper considers normalized gradient descent (NGD), a natural modification of classical gradient descent (GD) in optimization problems. It is shown that, contrary to GD, NGD escapes saddle points 'quickly.' A serious shortcoming of GD in nonconvex problems is that it can take arbitrarily long to escape from the neighborhood of a saddle point. In practice, this issue can significantly slow the convergence of GD, particularly in high-dimensional nonconvex problems. The paper focuses on continuous-time dynamics. It is shown that 1) NGD 'almost never' converges to saddle points and 2) the time required for NGD to escape from a ball of radius r about a saddle point x* is at most 5κr, where κ is the condition number of the Hessian of f at x*. As a simple application of these results, a global convergence-time bound is established for NGD under mild assumptions.
UR - http://www.scopus.com/inward/record.url?scp=85074535152&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85074535152&partnerID=8YFLogxK
U2 - 10.1109/TAC.2019.2914998
DO - 10.1109/TAC.2019.2914998
M3 - Article
AN - SCOPUS:85074535152
SN - 0018-9286
VL - 64
SP - 4818
EP - 4824
JO - IEEE Transactions on Automatic Control
JF - IEEE Transactions on Automatic Control
IS - 11
M1 - 8706530
ER -