Data mining in genome sequences can identify distant homologues of known protein families, and is most powerful if solved structures are available to reveal the three-dimensional implications of very dissimilar sequences. Here we describe putative serpin sequences identified with very high statistical significance in the Caenorhabditis elegans genome. When mapped onto vertebrate serpins such as α1-antitrypsin, they suggest novel structural features. Some appear complete, some show extensive deletions, and others appear to contain only the C-terminal part of the known serpin fold, probably in partnership with N-terminal regions that have conformations unlike those of known serpins. The observation of such striking sequence similarity, in proteins that must have significantly different overall structures, substantially extends the structural characteristics of the serpin family of proteins.
| Field | Value |
|---|---|
| Original language | English (US) |
| Number of pages | 11 |
| Journal | Proteins: Structure, Function and Genetics |
| State | Published - Jul 1 1999 |
All Science Journal Classification (ASJC) codes
- Structural Biology
- Molecular Biology