TY - GEN

T1 - Data-driven first-order methods for misspecified convex optimization problems

T2 - 2014 53rd IEEE Annual Conference on Decision and Control, CDC 2014

AU - Ahmadi, Hesam

AU - Shanbhag, Uday V.

N1 - Publisher Copyright:
© 2014 IEEE.

PY - 2014

Y1 - 2014

N2 - We consider a misspecified optimization problem that requires minimizing a convex function f(x; θ∗) in x over a closed and convex set X, where θ∗ is an unknown vector of parameters. Suppose θ∗ may be learnt by a parallel learning process that generates a sequence of estimators θk, each of which is an increasingly accurate approximation of θ∗. In this context, we examine the development of coupled schemes that generate iterates (xk, θk) such that as the iteration index k → ∞, xk → x∗, a minimizer of f(x; θ∗) over X, and θk → θ∗. We make two sets of contributions along this direction. First, we consider the use of gradient methods and proceed to show that such techniques are globally convergent. In addition, such schemes show a quantifiable degradation in the linear rate of convergence observed for strongly convex optimization problems. When strong convexity assumptions are weakened, we see a modification in the convergence rate in function values of O(1/K) by an additive factor ‖θ0 − θ∗‖O(qg^K + 1/K), where ‖θ0 − θ∗‖ represents the initial misspecification in θ∗ and qg denotes the contractive factor associated with the learning process. Second, we present an averaging-based subgradient scheme and show that the optimal constant steplength leads to a modification in the rate by ‖θ0 − θ∗‖O(qg^K + 1/K), implying no effect on the standard rate of O(1/√K).

AB - We consider a misspecified optimization problem that requires minimizing a convex function f(x; θ∗) in x over a closed and convex set X, where θ∗ is an unknown vector of parameters. Suppose θ∗ may be learnt by a parallel learning process that generates a sequence of estimators θk, each of which is an increasingly accurate approximation of θ∗. In this context, we examine the development of coupled schemes that generate iterates (xk, θk) such that as the iteration index k → ∞, xk → x∗, a minimizer of f(x; θ∗) over X, and θk → θ∗. We make two sets of contributions along this direction. First, we consider the use of gradient methods and proceed to show that such techniques are globally convergent. In addition, such schemes show a quantifiable degradation in the linear rate of convergence observed for strongly convex optimization problems. When strong convexity assumptions are weakened, we see a modification in the convergence rate in function values of O(1/K) by an additive factor ‖θ0 − θ∗‖O(qg^K + 1/K), where ‖θ0 − θ∗‖ represents the initial misspecification in θ∗ and qg denotes the contractive factor associated with the learning process. Second, we present an averaging-based subgradient scheme and show that the optimal constant steplength leads to a modification in the rate by ‖θ0 − θ∗‖O(qg^K + 1/K), implying no effect on the standard rate of O(1/√K).

UR - http://www.scopus.com/inward/record.url?scp=84931846719&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84931846719&partnerID=8YFLogxK

U2 - 10.1109/CDC.2014.7040048

DO - 10.1109/CDC.2014.7040048

M3 - Conference contribution

AN - SCOPUS:84931846719

T3 - Proceedings of the IEEE Conference on Decision and Control

SP - 4228

EP - 4233

BT - 53rd IEEE Conference on Decision and Control, CDC 2014

PB - Institute of Electrical and Electronics Engineers Inc.

Y2 - 15 December 2014 through 17 December 2014

ER -