Understanding the design of the universe of protein structures may provide insights into protein evolution. We study the architecture of the protein domain universe, which has been found to possess peculiar scale-free properties [Dokholyan, N.V., Shakhnovich, B., Shakhnovich, E.I., 2002b. Expanding protein universe and its origin from the biological Big Bang. Proceedings of the National Academy of Sciences of the United States of America 99, 14132-14136]. We examine the origin of these scale-free properties of the graph of protein domain structures (PDUG) and determine that the PDUG is not modular, i.e., it does not consist of modules with uniform properties. Instead, we find the PDUG to be self-similar at all scales. We further characterize the PDUG architecture by studying the properties of the hub nodes that are responsible for its scale-free connectivity. We introduce a measure of the betweenness centrality of protein domains in the PDUG and find that the betweenness centrality values follow a power-law distribution. The scale-free distribution of hubs in the protein universe suggests that a set of specific statistical mechanics models, such as the self-organized criticality model, can potentially identify the principal driving forces of protein evolution. We also find a gatekeeper protein domain, the removal of which partitions the largest cluster into two large sub-clusters. We suggest that the loss of such gatekeeper protein domains in the course of evolution is responsible for the creation of new fold families.