Linear convergence with condition number independent access of full gradients

Lijun Zhang, Mehrdad Mahdavi, Rong Jin

Research output: Contribution to journal › Conference article

38 Citations (Scopus)

Abstract

For smooth and strongly convex optimization problems, the optimal iteration complexity of gradient-based algorithms is O(√κ log 1/ε), where κ is the condition number. When the problem is ill-conditioned, a large number of full gradients must be evaluated, which can be computationally expensive. In this paper, we propose to remove the dependence on the condition number by allowing the algorithm to access stochastic gradients of the objective function. To this end, we present a novel algorithm named Epoch Mixed Gradient Descent (EMGD) that is able to utilize both kinds of gradients. A distinctive step in EMGD is the mixed gradient descent, where we use a combination of the full and stochastic gradients to update the intermediate solution. Theoretical analysis shows that EMGD is able to find an ε-optimal solution by computing O(log 1/ε) full gradients and O(κ² log 1/ε) stochastic gradients.
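
The abstract describes EMGD only at a high level. As a purely illustrative aid, the Python sketch below shows one way an epoch-based mixed gradient method of this kind can be organized: a single full gradient is computed at the start of each epoch, and the inner loop takes on the order of κ² stochastic-gradient steps that combine the stored full gradient with fresh stochastic gradients. The names full_grad and stoch_grad, the finite-sum oracle, and all parameter choices are assumptions made for this sketch; the exact update rule, step sizes, epoch lengths, and any projection steps of EMGD are specified in the paper and may differ.

import numpy as np

def epoch_mixed_gd_sketch(full_grad, stoch_grad, x0, kappa, n_epochs, eta, n_samples, seed=0):
    """Illustrative epoch-based mixed gradient descent (not the paper's exact EMGD).

    full_grad(x)     : exact gradient of the objective at x.
    stoch_grad(x, i) : gradient of the i-th component of a finite-sum objective at x
                       (assumed oracle; any unbiased stochastic gradient would do).
    kappa            : estimate of the condition number; sets the inner epoch length.
    eta              : inner-loop step size.
    n_samples        : number of components to draw indices from.

    Gradient budget matches the abstract: one full gradient per epoch
    (O(log 1/eps) epochs) and O(kappa^2) stochastic gradients per epoch.
    """
    rng = np.random.default_rng(seed)
    x_bar = np.asarray(x0, dtype=float).copy()   # epoch anchor point
    T = max(1, int(np.ceil(kappa ** 2)))         # inner iterations per epoch

    for _ in range(n_epochs):
        g_full = full_grad(x_bar)                # the single full gradient of this epoch
        x = x_bar.copy()
        running_sum = np.zeros_like(x_bar)
        for _ in range(T):
            i = rng.integers(n_samples)          # same sample used at x and at the anchor
            # Mixed direction: full gradient at the anchor, corrected by the
            # difference of stochastic gradients at the current point and the anchor.
            g_mixed = g_full + stoch_grad(x, i) - stoch_grad(x_bar, i)
            x = x - eta * g_mixed
            running_sum += x
        x_bar = running_sum / T                  # next anchor: average of the epoch's iterates

    return x_bar

For a quick test, full_grad and stoch_grad can be defined from a well-conditioned least-squares problem (e.g. a small data matrix), where the outer loop then plays the role of the O(log 1/ε) full-gradient stage and the inner loop the O(κ²) stochastic-gradient stage quoted above.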

Original language: English (US)
Journal: Advances in Neural Information Processing Systems
State: Published - Jan 1 2013
Event: 27th Annual Conference on Neural Information Processing Systems, NIPS 2013 - Lake Tahoe, NV, United States
Duration: Dec 5 2013 - Dec 10 2013

Fingerprint

Convex optimization

All Science Journal Classification (ASJC) codes

  • Computer Networks and Communications
  • Information Systems
  • Signal Processing

Cite this

@article{93f18c561c22490aa1bd0e7e16a217dc,
title = "Linear convergence with condition number independent access of full gradients",
abstract = "For smooth and strongly convex optimizations, the optimal iteration complexity of the gradient-based algorithm is O(√κ log 1/ε), where κ is the condition number. In the case that the optimization problem is ill-conditioned, we need to evaluate a large number of full gradients, which could be computationally expensive. In this paper, we propose to remove the dependence on the condition number by allowing the algorithm to access stochastic gradients of the objective function. To this end, we present a novel algorithm named EpochMixed Gradient Descent (EMGD) that is able to utilize two kinds of gradients. A distinctive step in EMGD is the mixed gradient descent, where we use a combination of the full and stochastic gradients to update the intermediate solution. Theoretical analysis shows that EMGD is able to find an ε-optimal solution by computing O(log 1/ε) full gradients and O(κ2 log 1/ε) stochastic gradients.",
author = "Lijun Zhang and Mehrdad Mahdavi and Rong Jin",
year = "2013",
month = "1",
day = "1",
language = "English (US)",
journal = "Advances in Neural Information Processing Systems",
issn = "1049-5258",

}

Linear convergence with condition number independent access of full gradients. / Zhang, Lijun; Mahdavi, Mehrdad; Jin, Rong.

In: Advances in Neural Information Processing Systems, 01.01.2013.

Research output: Contribution to journal › Conference article

TY - JOUR

T1 - Linear convergence with condition number independent access of full gradients

AU - Zhang, Lijun

AU - Mahdavi, Mehrdad

AU - Jin, Rong

PY - 2013/1/1

Y1 - 2013/1/1

N2 - For smooth and strongly convex optimization problems, the optimal iteration complexity of gradient-based algorithms is O(√κ log 1/ε), where κ is the condition number. When the problem is ill-conditioned, a large number of full gradients must be evaluated, which can be computationally expensive. In this paper, we propose to remove the dependence on the condition number by allowing the algorithm to access stochastic gradients of the objective function. To this end, we present a novel algorithm named Epoch Mixed Gradient Descent (EMGD) that is able to utilize both kinds of gradients. A distinctive step in EMGD is the mixed gradient descent, where we use a combination of the full and stochastic gradients to update the intermediate solution. Theoretical analysis shows that EMGD is able to find an ε-optimal solution by computing O(log 1/ε) full gradients and O(κ² log 1/ε) stochastic gradients.

AB - For smooth and strongly convex optimization problems, the optimal iteration complexity of gradient-based algorithms is O(√κ log 1/ε), where κ is the condition number. When the problem is ill-conditioned, a large number of full gradients must be evaluated, which can be computationally expensive. In this paper, we propose to remove the dependence on the condition number by allowing the algorithm to access stochastic gradients of the objective function. To this end, we present a novel algorithm named Epoch Mixed Gradient Descent (EMGD) that is able to utilize both kinds of gradients. A distinctive step in EMGD is the mixed gradient descent, where we use a combination of the full and stochastic gradients to update the intermediate solution. Theoretical analysis shows that EMGD is able to find an ε-optimal solution by computing O(log 1/ε) full gradients and O(κ² log 1/ε) stochastic gradients.

UR - http://www.scopus.com/inward/record.url?scp=84898971059&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84898971059&partnerID=8YFLogxK

M3 - Conference article

AN - SCOPUS:84898971059

JO - Advances in Neural Information Processing Systems

JF - Advances in Neural Information Processing Systems

SN - 1049-5258

ER -