Calibrating nonconvex penalized regression in ultra-high dimension

Lan Wang, Yongdai Kim, Runze Li

Research output: Contribution to journal, Article

47 Citations (Scopus)

Abstract

We investigate high-dimensional nonconvex penalized regression, where the number of covariates may grow at an exponential rate. Although recent asymptotic theory established that there exists a local minimum possessing the oracle property under general conditions, it is still largely an open problem how to identify the oracle estimator among potentially multiple local minima. There are two main obstacles: (1) due to the presence of multiple minima, the solution path is nonunique and is not guaranteed to contain the oracle estimator; (2) even if a solution path is known to contain the oracle estimator, the optimal tuning parameter depends on many unknown factors and is hard to estimate. To address these two challenging issues, we first prove that an easy-to-calculate calibrated CCCP algorithm produces a consistent solution path which contains the oracle estimator with probability approaching one. Furthermore, we propose a high-dimensional BIC criterion and show that it can be applied to the solution path to select the optimal tuning parameter which asymptotically identifies the oracle estimator. The theory for a general class of nonconvex penalties in the ultra-high dimensional setup is established when the random errors follow the sub-Gaussian distribution. Monte Carlo studies confirm that the calibrated CCCP algorithm combined with the proposed high-dimensional BIC has desirable performance in identifying the underlying sparsity pattern for high-dimensional data analysis.
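The abstract names a calibrated CCCP algorithm and a high-dimensional BIC but gives neither in detail, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes a two-step scheme (an initial lasso fit with a reduced penalty `tau * lam`, followed by one weighted-lasso step whose weights come from the SCAD penalty derivative) and a BIC-type criterion of the form log(RSS/n) + k * c_n * log(p) / n. The function names, `tau = 0.1`, the SCAD constant `a = 3.7`, the choice `c_n = log(log n)`, and the toy orthogonal design are all hypothetical choices made for illustration.

```python
import math


def soft_threshold(z, t):
    """Soft-thresholding operator used by lasso coordinate descent."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def scad_derivative(t, lam, a=3.7):
    """Derivative of the SCAD penalty at |t| (a = 3.7 is an assumed default)."""
    t = abs(t)
    if t <= lam:
        return lam
    return max(a * lam - t, 0.0) / (a - 1.0)


def weighted_lasso(X, y, weights, n_sweeps=100):
    """Coordinate descent for (1/2n)||y - Xb||^2 + sum_j weights[j] * |b_j|."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)
    scale = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_sweeps):
        for j in range(p):
            if scale[j] == 0.0:
                continue
            # Partial-residual correlation for coordinate j.
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + scale[j] * beta[j]
            new = soft_threshold(rho, weights[j]) / scale[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
    return beta


def calibrated_two_step(X, y, lam, tau):
    """Assumed two-step scheme: lasso at tau*lam, then one SCAD-weighted step."""
    p = len(X[0])
    initial = weighted_lasso(X, y, [tau * lam] * p)
    weights = [scad_derivative(b, lam) for b in initial]
    return weighted_lasso(X, y, weights)


def hbic(X, y, beta, c_n):
    """Assumed BIC-type criterion: log(RSS/n) + k * c_n * log(p) / n."""
    n, p = len(X), len(X[0])
    rss = sum((y[i] - sum(X[i][j] * beta[j] for j in range(p))) ** 2
              for i in range(n))
    k = sum(1 for b in beta if b != 0.0)
    return math.log(rss / n) + k * c_n * math.log(p) / n


def hadamard(i, j):
    """Entry (i, j) of a Sylvester-type Hadamard matrix (toy orthogonal design)."""
    return -1.0 if bin(i & j).count("1") % 2 else 1.0


# Toy data: only the first covariate is active; the "noise" is a fixed
# vector orthogonal to all three columns, so the example is deterministic.
n = 8
X = [[hadamard(i, 1), hadamard(i, 2), hadamard(i, 4)] for i in range(n)]
y = [3.0 * hadamard(i, 1) + 0.1 * hadamard(i, 7) for i in range(n)]

grid = [5.0, 2.0, 1.0, 0.5]
c_n = math.log(math.log(n))
path = {lam: calibrated_two_step(X, y, lam, tau=0.1) for lam in grid}
best_lam = min(grid, key=lambda lam: hbic(X, y, path[lam], c_n))
```

On this toy design the candidate fits in `path` play the role of the solution path, and `best_lam` is the tuning parameter that the assumed criterion selects from it. The initial fit uses a smaller penalty than the second step so that large coefficients receive little or no shrinkage when the SCAD weights are applied.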

Original language: English (US)
Pages (from-to): 2505-2536
Number of pages: 32
Journal: Annals of Statistics
Volume: 41
Issue number: 5
DOI: 10.1214/13-AOS1159
State: Published - Oct 1 2013

Fingerprint

Penalized Regression
Higher Dimensions
Estimator
High-dimensional
Path
Parameter Tuning
Local Minima
Oracle Property
Random Error
Asymptotic Theory
High-dimensional Data
Monte Carlo Study
Sparsity
Gaussian distribution
Penalty
Covariates
Open Problems
Data analysis
Calculate
Unknown

All Science Journal Classification (ASJC) codes

  • Statistics and Probability
  • Statistics, Probability and Uncertainty

Cite this

Wang, Lan; Kim, Yongdai; Li, Runze. Calibrating nonconvex penalized regression in ultra-high dimension. In: Annals of Statistics. 2013; Vol. 41, No. 5. pp. 2505-2536.
@article{41ab4d2e51ef47f49340523e8658359c,
title = "Calibrating nonconvex penalized regression in ultra-high dimension",
author = "Lan Wang and Yongdai Kim and Runze Li",
year = "2013",
month = "10",
day = "1",
doi = "10.1214/13-AOS1159",
language = "English (US)",
volume = "41",
pages = "2505--2536",
journal = "Annals of Statistics",
issn = "0090-5364",
publisher = "Institute of Mathematical Statistics",
number = "5",

}

TY - JOUR
T1 - Calibrating nonconvex penalized regression in ultra-high dimension
AU - Wang, Lan
AU - Kim, Yongdai
AU - Li, Runze
PY - 2013/10/1
Y1 - 2013/10/1
UR - http://www.scopus.com/inward/record.url?scp=84888861633&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84888861633&partnerID=8YFLogxK
U2 - 10.1214/13-AOS1159
DO - 10.1214/13-AOS1159
M3 - Article
AN - SCOPUS:84888861633
VL - 41
SP - 2505
EP - 2536
JO - Annals of Statistics
JF - Annals of Statistics
SN - 0090-5364
IS - 5
ER -