TY - JOUR
T1 - Riemannian Stochastic Proximal Gradient Methods for Nonsmooth Optimization over the Stiefel Manifold
AU - Wang, Bokun
AU - Ma, Shiqian
AU - Xue, Lingzhou
N1 - Funding Information:
The authors are grateful for comments and suggestions that led to significant improvement of the presentation of this paper. The research of S. Ma is supported in part by NSF grants DMS-1953210 and CCF-2007797, and UC Davis CeDAR (Center for Data Science and Artificial Intelligence Research) Innovative Data Science Seed Funding Program. The research of L. Xue is supported in part by NSF grants DMS-1811552, DMS-1953189, and CCF-2007823.
Publisher Copyright:
© 2022 Bokun Wang, Shiqian Ma, Lingzhou Xue.
PY - 2022
Y1 - 2022
N2 - Riemannian optimization has drawn considerable attention due to its wide applications in practice. Riemannian stochastic first-order algorithms have been studied in the literature to solve large-scale machine learning problems over Riemannian manifolds. However, most of the existing Riemannian stochastic algorithms require the objective function to be differentiable, and they do not apply to the case where the objective function is nonsmooth. In this paper, we present two Riemannian stochastic proximal gradient methods for minimizing a nonsmooth function over the Stiefel manifold. The two methods, named R-ProxSGD and R-ProxSPB, are generalizations of proximal SGD and proximal SpiderBoost from the Euclidean setting to the Riemannian setting. An analysis of the incremental first-order oracle (IFO) complexity of the proposed algorithms is provided. Specifically, the R-ProxSPB algorithm finds an ϵ-stationary point with O(ϵ⁻³) IFOs in the online case, and O(n + √n ϵ⁻²) IFOs in the finite-sum case, with n being the number of summands in the objective. Experimental results on online sparse PCA and robust low-rank matrix completion show that our proposed methods significantly outperform the existing methods that use Riemannian subgradient information.
AB - Riemannian optimization has drawn considerable attention due to its wide applications in practice. Riemannian stochastic first-order algorithms have been studied in the literature to solve large-scale machine learning problems over Riemannian manifolds. However, most of the existing Riemannian stochastic algorithms require the objective function to be differentiable, and they do not apply to the case where the objective function is nonsmooth. In this paper, we present two Riemannian stochastic proximal gradient methods for minimizing a nonsmooth function over the Stiefel manifold. The two methods, named R-ProxSGD and R-ProxSPB, are generalizations of proximal SGD and proximal SpiderBoost from the Euclidean setting to the Riemannian setting. An analysis of the incremental first-order oracle (IFO) complexity of the proposed algorithms is provided. Specifically, the R-ProxSPB algorithm finds an ϵ-stationary point with O(ϵ⁻³) IFOs in the online case, and O(n + √n ϵ⁻²) IFOs in the finite-sum case, with n being the number of summands in the objective. Experimental results on online sparse PCA and robust low-rank matrix completion show that our proposed methods significantly outperform the existing methods that use Riemannian subgradient information.
UR - http://www.scopus.com/inward/record.url?scp=85130310380&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85130310380&partnerID=8YFLogxK
M3 - Article
AN - SCOPUS:85130310380
SN - 1532-4435
VL - 23
JO - Journal of Machine Learning Research
JF - Journal of Machine Learning Research
ER -