TY - JOUR
T1 - Nonparametric Regularized Regression for Phenotype-Associated Taxa Selection and Network Construction with Metagenomic Count Data
AU - Guo, Wenchuan
AU - Liu, Zhenqiu
AU - Ma, Shujie
N1 - Funding Information:
The research of Liu is partially supported by NSF grant DMS-222381. The research of Ma is supported, in part, by the U.S. NSF grant DMS-13-06972 and Hellman Fellowship.
Publisher Copyright:
© Copyright 2016, Mary Ann Liebert, Inc.
PY - 2016/11
Y1 - 2016/11
N2 - We use a metagenomic approach and network analysis to investigate the relationships between phenotypes across taxa under different environmental conditions. The network structure of taxa can be affected by the disease-associated environmental conditions. In addition, taxa abundance is differentiated under conditions. Therefore, knowing how the correlation or relative abundance changes with these factors would be of great interest to researchers. We develop a nonparametric regularized regression method to construct taxa association networks under different clinical conditions. We let the coefficients be unknown functions of the environmental variable. The varying coefficients are estimated by using regression splines. The proposed method is regularized with concave penalties, and an efficient group descent algorithm is developed for computation. We also apply the varying coefficient model to estimate taxa abundance to see how it changes across different environmental conditions. Moreover, for conducting inference, we propose a bootstrap method to construct the simultaneous confidence bands for the corresponding coefficients. We use different simulated designs and a real data set to demonstrate that our method can identify the network structures successfully under different environmental conditions. As such, the proposed method has potential applications for researchers to construct differential networks and identify taxa.
UR - http://www.scopus.com/inward/record.url?scp=84995390432&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84995390432&partnerID=8YFLogxK
U2 - 10.1089/cmb.2016.0023
DO - 10.1089/cmb.2016.0023
M3 - Article
C2 - 27427793
AN - SCOPUS:84995390432
SN - 1066-5277
VL - 23
SP - 877
EP - 890
JO - Journal of Computational Biology
JF - Journal of Computational Biology
IS - 11
ER -