TY - JOUR
T1 - The architecture of the protein domain universe
AU - Dokholyan, Nikolay V.
N1 - Funding Information: We would like to thank Shantanu Sarma for implementing the Brandes (2001) algorithm for fast calculation of the betweenness centrality and Drs. C. Carter Jr. and B. Kuhlman for insightful discussions. This work is supported by the UNC/IBM Junior Faculty Award.
PY - 2005/3/14
Y1 - 2005/3/14
N2 - Understanding the design of the universe of protein structures may provide insights into protein evolution. We study the architecture of the protein domain universe, which has been found to possess peculiar scale-free properties [Dokholyan, N.V., Shakhnovich, B., Shakhnovich, E.I., 2002b. Expanding protein universe and its origin from the biological Big Bang. Proceedings of the National Academy of Sciences of the United States of America 99, 14132-14136]. We examine the origin of these scale-free properties of the graph of protein domain structures (PDUG) and determine that the PDUG is not modular, i.e. it does not consist of modules with uniform properties. Instead, we find the PDUG to be self-similar at all scales. We further characterize the PDUG architecture by studying the properties of the hub nodes that are responsible for the scale-free connectivity of the PDUG. We introduce a measure of the betweenness centrality of protein domains in the PDUG and find a power-law distribution of the betweenness centrality values. The scale-free distribution of hubs in the protein universe suggests that a set of specific statistical mechanics models, such as the self-organized criticality model, can potentially identify the principal driving forces of protein evolution. We also find a gatekeeper protein domain, removal of which partitions the largest cluster into two large sub-clusters. We suggest that the loss of such gatekeeper protein domains in the course of evolution is responsible for the creation of new fold families.
AB - Understanding the design of the universe of protein structures may provide insights into protein evolution. We study the architecture of the protein domain universe, which has been found to possess peculiar scale-free properties [Dokholyan, N.V., Shakhnovich, B., Shakhnovich, E.I., 2002b. Expanding protein universe and its origin from the biological Big Bang. Proceedings of the National Academy of Sciences of the United States of America 99, 14132-14136]. We examine the origin of these scale-free properties of the graph of protein domain structures (PDUG) and determine that the PDUG is not modular, i.e. it does not consist of modules with uniform properties. Instead, we find the PDUG to be self-similar at all scales. We further characterize the PDUG architecture by studying the properties of the hub nodes that are responsible for the scale-free connectivity of the PDUG. We introduce a measure of the betweenness centrality of protein domains in the PDUG and find a power-law distribution of the betweenness centrality values. The scale-free distribution of hubs in the protein universe suggests that a set of specific statistical mechanics models, such as the self-organized criticality model, can potentially identify the principal driving forces of protein evolution. We also find a gatekeeper protein domain, removal of which partitions the largest cluster into two large sub-clusters. We suggest that the loss of such gatekeeper protein domains in the course of evolution is responsible for the creation of new fold families.
UR - http://www.scopus.com/inward/record.url?scp=15544372313&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=15544372313&partnerID=8YFLogxK
U2 - 10.1016/j.gene.2004.12.020
DO - 10.1016/j.gene.2004.12.020
M3 - Article
C2 - 15777630
AN - SCOPUS:15544372313
SN - 0378-1119
VL - 347
SP - 199
EP - 206
JO - Gene
JF - Gene
IS - 2 SPEC. ISS.
ER -