On stochastic gradient and subgradient methods with adaptive steplength sequences

Farzad Yousefian, Angelia Nedić, Vinayak V. Shanbhag

Research output: Contribution to journal › Article

46 Citations (Scopus)

Abstract

Traditionally, stochastic approximation (SA) schemes have been popular choices for solving stochastic optimization problems. However, the performance of standard SA implementations can vary significantly based on the choice of the steplength sequence, and in general, little guidance is provided about good choices. Motivated by this gap, we present two adaptive steplength schemes for strongly convex differentiable stochastic optimization problems, equipped with convergence theory, that aim to overcome some of the reliance on user-specific parameters. The first scheme, referred to as a recursive steplength stochastic approximation (RSA) scheme, optimizes the error bounds to derive a rule that expresses the steplength at a given iteration as a simple function of the steplength at the previous iteration and certain problem parameters. The second scheme, termed a cascading steplength stochastic approximation (CSA) scheme, maintains the steplength sequence as a piecewise-constant decreasing function, with the reduction in the steplength occurring when a suitable error threshold is met. We then allow for nondifferentiable objectives with bounded subgradients over a certain domain. In such a regime, we propose a local smoothing technique, based on random local perturbations of the objective function, that leads to a differentiable approximation of the function. Assuming a uniform distribution on the local randomness, we establish a Lipschitzian property for the gradient of the approximation and prove that the obtained Lipschitz bound grows at a modest rate with problem size. This facilitates the development of an adaptive steplength stochastic approximation framework, which now requires sampling in the product space of the original measure and the artificially introduced distribution.
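
The abstract describes two algorithmic ingredients: a stochastic approximation iteration whose steplength is updated recursively from the previous steplength and problem parameters (the RSA idea), and a local smoothing of a nondifferentiable objective obtained by averaging over uniform random perturbations. The Python sketch below is an illustration only, written under stated assumptions: the box projection, the strong-convexity constant eta, the particular recursion gamma_{k+1} = gamma_k * (1 - eta * gamma_k), and the Monte Carlo smoothed-gradient estimator are hypothetical choices, not the paper's exact constructions.

import numpy as np

# Illustrative sketch only: the steplength recursion, the box projection, the
# constant eta, and the Monte Carlo smoothing estimator are assumptions chosen
# for demonstration; they are not the exact constructions of the paper.

def project_box(x, lo=-10.0, hi=10.0):
    # Euclidean projection onto a box constraint set (assumed feasible region).
    return np.clip(x, lo, hi)

def rsa_like_sgd(grad_sample, x0, eta, gamma0, n_iters, seed=0):
    # Projected stochastic gradient method with a recursively updated steplength,
    # in the spirit of the RSA idea: the next steplength is a simple function of
    # the current one and a problem parameter (here, an assumed strong-convexity
    # constant eta, with gamma0 < 1/eta so the steplength stays positive).
    rng = np.random.default_rng(seed)
    x, gamma = np.array(x0, dtype=float), float(gamma0)
    for _ in range(n_iters):
        g = grad_sample(x, rng)                 # noisy gradient sample
        x = project_box(x - gamma * g)          # SA step with projection
        gamma = gamma * (1.0 - eta * gamma)     # hypothetical recursive update
    return x

def smoothed_grad_estimate(subgrad, x, eps, rng, n_samples=16):
    # Gradient estimate for a locally smoothed objective f_eps(x) = E_z[f(x + z)],
    # with z uniform on the ball of radius eps: average subgradients at perturbed
    # points (a simple, assumed estimator for illustration).
    n = x.size
    total = np.zeros(n)
    for _ in range(n_samples):
        d = rng.normal(size=n)
        z = eps * (rng.uniform() ** (1.0 / n)) * d / np.linalg.norm(d)  # uniform in eps-ball
        total += subgrad(x + z)
    return total / n_samples

if __name__ == "__main__":
    # Toy strongly convex objective f(x) = 0.5 * ||x||^2 with additive gradient noise.
    noisy_grad = lambda x, rng: x + 0.1 * rng.normal(size=x.size)
    x_final = rsa_like_sgd(noisy_grad, x0=np.ones(5), eta=1.0, gamma0=0.5, n_iters=2000)
    print("final iterate (should be near zero):", x_final)

    # Smoothed-gradient estimate for the nondifferentiable f(x) = ||x||_1,
    # whose (elementwise) subgradient is np.sign.
    g = smoothed_grad_estimate(np.sign, np.array([0.2, -0.3, 0.0]), eps=0.1,
                               rng=np.random.default_rng(1))
    print("smoothed subgradient estimate:", g)

The recursion gamma_{k+1} = gamma_k * (1 - eta * gamma_k) is only one simple way a steplength can be written as a function of its predecessor and a problem parameter; per the abstract, the paper derives its specific rule by optimizing error bounds, and its smoothing analysis yields a Lipschitz constant for the smoothed gradient that grows modestly with problem size.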

Original language: English (US)
Pages (from-to): 56-67
Number of pages: 12
Journal: Automatica
Volume: 48
Issue number: 1
DOI: 10.1016/j.automatica.2011.09.043
State: Published - Jan 1 2012

All Science Journal Classification (ASJC) codes

  • Control and Systems Engineering
  • Electrical and Electronic Engineering

Cite this

Yousefian, Farzad ; Nedić, Angelia ; Shanbhag, Vinayak V. / On stochastic gradient and subgradient methods with adaptive steplength sequences. In: Automatica. 2012 ; Vol. 48, No. 1. pp. 56-67.
@article{2165cd7bdfb0437a97de12628130df39,
title = "On stochastic gradient and subgradient methods with adaptive steplength sequences",
author = "Farzad Yousefian and Angelia Nedić and Shanbhag, {Vinayak V.}",
year = "2012",
month = "1",
day = "1",
doi = "10.1016/j.automatica.2011.09.043",
language = "English (US)",
volume = "48",
pages = "56--67",
journal = "Automatica",
issn = "0005-1098",
publisher = "Elsevier Limited",
number = "1",

}
