Regularized rank-based estimation of high-dimensional nonparanormal graphical models

Lingzhou Xue, Hui Zou

Research output: Contribution to journal (Article)

89 Citations (Scopus)

Abstract

A sparse precision matrix can be directly translated into a sparse Gaussian graphical model under the assumption that the data follow a joint normal distribution. This neat property makes high-dimensional precision matrix estimation very appealing in many applications. However, in practice we often face nonnormal data, and variable transformation is often used to achieve normality. In this paper we consider the nonparanormal model that assumes that the variables follow a joint normal distribution after a set of unknown monotone transformations. The nonparanormal model is much more flexible than the normal model while retaining the good interpretability of the latter in that each zero entry in the sparse precision matrix of the nonparanormal model corresponds to a pair of conditionally independent variables. In this paper we show that the nonparanormal graphical model can be efficiently estimated by using a rank-based estimation scheme which does not require estimating these unknown transformation functions. In particular, we study the rank-based graphical lasso, the rank-based neighborhood Dantzig selector and the rank-based CLIME. We establish their theoretical properties in the setting where the dimension is nearly exponentially large relative to the sample size. It is shown that the proposed rank-based estimators work as well as their oracle counterparts defined with the oracle data. Furthermore, the theory motivates us to consider the adaptive version of the rank-based neighborhood Dantzig selector and the rank-based CLIME that are shown to enjoy graphical model selection consistency without assuming the irrepresentable condition for the oracle and rank-based graphical lasso. Simulated and real data are used to demonstrate the finite-sample performance of the rank-based estimators.
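As a rough illustration of why a rank-based scheme can avoid estimating the unknown monotone transformations, the sketch below builds a correlation estimate from ranks only. It is a minimal sketch under stated assumptions, not the paper's implementation: the function name `rank_based_correlation`, the simulated data, and the choice of Spearman's rho with the 2 sin(πρ/6) adjustment (the standard link between Spearman's rho and the Pearson correlation under a Gaussian copula) are choices made for this example and are not stated in the abstract.

```python
import numpy as np


def rank_based_correlation(X):
    """Estimate the latent Gaussian correlation matrix from ranks only.

    X is an (n, p) array. Assumption: no ties within any column.
    """
    # argsort of argsort yields 0-based ranks when a column has no ties.
    ranks = np.argsort(np.argsort(X, axis=0), axis=0).astype(float)
    # Spearman's rho is the Pearson correlation of the ranks.
    rho = np.corrcoef(ranks, rowvar=False)
    # Map Spearman's rho to the Pearson scale (valid under a Gaussian copula).
    R = 2.0 * np.sin(np.pi * rho / 6.0)
    np.fill_diagonal(R, 1.0)
    return R


# Hypothetical example data: latent Gaussian variables with a known correlation.
rng = np.random.default_rng(0)
sigma = np.array([[1.0, 0.6, 0.0],
                  [0.6, 1.0, 0.3],
                  [0.0, 0.3, 1.0]])
Z = rng.multivariate_normal(np.zeros(3), sigma, size=2000)

# Observed data: each coordinate passed through a strictly increasing map.
X = np.column_stack([np.exp(Z[:, 0]), Z[:, 1] ** 3, Z[:, 2]])

R_latent = rank_based_correlation(Z)    # uses the (normally unobserved) latent data
R_observed = rank_based_correlation(X)  # uses only the transformed data
```

Because strictly increasing maps leave ranks unchanged, `R_observed` coincides with `R_latent`. A matrix of this kind could then be passed to a regularized precision-matrix estimator in place of the sample correlation matrix; note that it is not guaranteed to be positive semidefinite, so a solver may need a projection step first.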

Original language: English (US)
Pages (from-to): 2541-2571
Number of pages: 31
Journal: Annals of Statistics
Volume: 40
Issue number: 5
DOI: 10.1214/12-AOS1041
State: Published - Oct 1 2012

Fingerprint

Graphical Models
High-dimensional
Lasso
Selector
Joint Distribution
Gaussian distribution
Estimator
Variable Transformation
Unknown
Data Transformation
Interpretability
Gaussian Model
Model Selection
Normality
Model
Monotone
Sample Size
Zero

All Science Journal Classification (ASJC) codes

  • Statistics and Probability
  • Statistics, Probability and Uncertainty

Cite this

@article{46d039e457644ac497af947b738aec77,
title = "Regularized rank-based estimation of high-dimensional nonparanormal graphical models",
abstract = "A sparse precision matrix can be directly translated into a sparse Gaussian graphical model under the assumption that the data follow a joint normal distribution. This neat property makes high-dimensional precision matrix estimation very appealing in many applications. However, in practice we often face nonnormal data, and variable transformation is often used to achieve normality. In this paper we consider the nonparanormal model that assumes that the variables follow a joint normal distribution after a set of unknown monotone transformations. The nonparanormal model is much more flexible than the normal model while retaining the good interpretability of the latter in that each zero entry in the sparse precision matrix of the nonparanormal model corresponds to a pair of conditionally independent variables. In this paper we show that the nonparanormal graphical model can be efficiently estimated by using a rank-based estimation scheme which does not require estimating these unknown transformation functions. In particular, we study the rank-based graphical lasso, the rank-based neighborhood Dantzig selector and the rank-based CLIME. We establish their theoretical properties in the setting where the dimension is nearly exponentially large relative to the sample size. It is shown that the proposed rank-based estimators work as well as their oracle counterparts defined with the oracle data. Furthermore, the theory motivates us to consider the adaptive version of the rank-based neighborhood Dantzig selector and the rank-based CLIME that are shown to enjoy graphical model selection consistency without assuming the irrepresentable condition for the oracle and rank-based graphical lasso. Simulated and real data are used to demonstrate the finite-sample performance of the rank-based estimators.",
author = "Lingzhou Xue and Hui Zou",
year = "2012",
month = "10",
day = "1",
doi = "10.1214/12-AOS1041",
language = "English (US)",
volume = "40",
pages = "2541--2571",
journal = "Annals of Statistics",
issn = "0090-5364",
publisher = "Institute of Mathematical Statistics",
number = "5",
}

Regularized rank-based estimation of high-dimensional nonparanormal graphical models. / Xue, Lingzhou; Zou, Hui.

In: Annals of Statistics, Vol. 40, No. 5, 01.10.2012, p. 2541-2571.

TY - JOUR

T1 - Regularized rank-based estimation of high-dimensional nonparanormal graphical models

AU - Xue, Lingzhou

AU - Zou, Hui

PY - 2012/10/1

Y1 - 2012/10/1

N2 - A sparse precision matrix can be directly translated into a sparse Gaussian graphical model under the assumption that the data follow a joint normal distribution. This neat property makes high-dimensional precision matrix estimation very appealing in many applications. However, in practice we often face nonnormal data, and variable transformation is often used to achieve normality. In this paper we consider the nonparanormal model that assumes that the variables follow a joint normal distribution after a set of unknown monotone transformations. The nonparanormal model is much more flexible than the normal model while retaining the good interpretability of the latter in that each zero entry in the sparse precision matrix of the nonparanormal model corresponds to a pair of conditionally independent variables. In this paper we show that the nonparanormal graphical model can be efficiently estimated by using a rank-based estimation scheme which does not require estimating these unknown transformation functions. In particular, we study the rank-based graphical lasso, the rank-based neighborhood Dantzig selector and the rank-based CLIME. We establish their theoretical properties in the setting where the dimension is nearly exponentially large relative to the sample size. It is shown that the proposed rank-based estimators work as well as their oracle counterparts defined with the oracle data. Furthermore, the theory motivates us to consider the adaptive version of the rank-based neighborhood Dantzig selector and the rank-based CLIME that are shown to enjoy graphical model selection consistency without assuming the irrepresentable condition for the oracle and rank-based graphical lasso. Simulated and real data are used to demonstrate the finite-sample performance of the rank-based estimators.

UR - http://www.scopus.com/inward/record.url?scp=84873376907&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84873376907&partnerID=8YFLogxK

U2 - 10.1214/12-AOS1041

DO - 10.1214/12-AOS1041

M3 - Article

AN - SCOPUS:84873376907

VL - 40

SP - 2541

EP - 2571

JO - Annals of Statistics

JF - Annals of Statistics

SN - 0090-5364

IS - 5

ER -