TY - JOUR
T1 - Robust Covariance Matrix Estimation for High-Dimensional Compositional Data with Application to Sales Data Analysis
AU - Li, Danning
AU - Srinivasan, Arun
AU - Chen, Qian
AU - Xue, Lingzhou
N1 - Publisher Copyright: © 2022 American Statistical Association.
PY - 2022
Y1 - 2022
N2 - Compositional data arises in a wide variety of research areas when some form of standardization and composition is necessary. Estimating covariance matrices is of fundamental importance for high-dimensional compositional data analysis. However, existing methods require the restrictive Gaussian or sub-Gaussian assumption, which may not hold in practice. We propose a robust composition adjusted thresholding covariance procedure based on Huber-type M-estimation to estimate the sparse covariance structure of high-dimensional compositional data. We introduce a cross-validation procedure to choose the tuning parameters of the proposed method. Theoretically, by assuming a bounded fourth moment condition, we obtain the rates of convergence and signal recovery property for the proposed method and provide the theoretical guarantees for the cross-validation procedure under the high-dimensional setting. Numerically, we demonstrate the effectiveness of the proposed method in simulation studies and also a real application to sales data analysis.
AB - Compositional data arises in a wide variety of research areas when some form of standardization and composition is necessary. Estimating covariance matrices is of fundamental importance for high-dimensional compositional data analysis. However, existing methods require the restrictive Gaussian or sub-Gaussian assumption, which may not hold in practice. We propose a robust composition adjusted thresholding covariance procedure based on Huber-type M-estimation to estimate the sparse covariance structure of high-dimensional compositional data. We introduce a cross-validation procedure to choose the tuning parameters of the proposed method. Theoretically, by assuming a bounded fourth moment condition, we obtain the rates of convergence and signal recovery property for the proposed method and provide the theoretical guarantees for the cross-validation procedure under the high-dimensional setting. Numerically, we demonstrate the effectiveness of the proposed method in simulation studies and also a real application to sales data analysis.
UR - http://www.scopus.com/inward/record.url?scp=85139021690&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85139021690&partnerID=8YFLogxK
U2 - 10.1080/07350015.2022.2106990
DO - 10.1080/07350015.2022.2106990
M3 - Article
AN - SCOPUS:85139021690
SN - 0735-0015
JO - Journal of Business and Economic Statistics
JF - Journal of Business and Economic Statistics
ER -