Classification of protein families and detection of the determinant residues with an improved self-organizing map |
| |
Authors: | Miguel A Andrade, Georg Casari, Chris Sander, Alfonso Valencia |
| |
Institution: | (1) Protein Design Group, Centro Nacional de Biotecnología-CSIC, Cantoblanco, E-28049 Madrid, Spain; (2) Protein Design Group, EMBL, Meyerhofstrasse 1, D-69120 Heidelberg, Germany |
| |
Abstract: | Using a self-organizing map (SOM), we can classify the sequences within a protein family into subgroups that generally correspond
to biological subcategories. These maps tend to show sequence similarity as proximity in the map. By combining maps generated
at different levels of resolution, we can capture the structure of relations in protein families that could not otherwise
be represented in a single map. The underlying representation of the maps enables us to retrieve characteristic sequence patterns
for individual subgroups of sequences. Such patterns tend to correspond to functionally important regions. We present a modified
SOM algorithm that includes a convergence test, which dynamically adapts the learning parameters to the learning
set instead of leaving them fixed and externally optimized by trial and error. Given the variability in the size and distribution of protein families,
the addition of this feature is necessary. The method was tested successfully on a number of families. The rab family of small GTPases is used to illustrate the performance of the method.
Received: 25 July 1996 / Accepted in revised form: 13 February 1997 |
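The abstract gives no algorithmic detail, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes one-hot encoded aligned sequences, a 1-D chain of nodes seeded from evenly spaced training samples, a damped batch update, and a simple convergence rule (when the mean quantization error stops changing, shrink the neighbourhood radius and halve the learning rate). The function names, parameter values, and toy sequences are all hypothetical.

```python
ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def encode(seq):
    """One-hot encode an aligned sequence: 20 values per alignment column."""
    return [1.0 if aa == a else 0.0 for aa in seq for a in ALPHABET]


def sq_dist(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def winner(nodes, x):
    """Index of the node closest to x (ties go to the lowest index)."""
    return min(range(len(nodes)), key=lambda i: sq_dist(nodes[i], x))


def train_som(data, n_nodes=2, alpha=0.5, radius=1, tol=1e-4, max_epochs=500):
    """Train a 1-D SOM whose parameters are adapted by a convergence test."""
    # Assumption: seed the chain of nodes from evenly spaced training samples.
    nodes = [list(data[i * len(data) // n_nodes]) for i in range(n_nodes)]
    prev_err = float("inf")
    for _ in range(max_epochs):
        wins = [winner(nodes, x) for x in data]
        for i in range(n_nodes):
            # Neighbourhood weight: 1 for the winner, 0.5 within the radius.
            h = [1.0 if w == i else (0.5 if abs(w - i) <= radius else 0.0)
                 for w in wins]
            total = sum(h)
            if total == 0:
                continue  # this node has no samples in its neighbourhood
            for j in range(len(nodes[i])):
                target = sum(hk * x[j] for hk, x in zip(h, data)) / total
                nodes[i][j] += alpha * (target - nodes[i][j])
        # Mean quantization error over the learning set.
        err = sum(sq_dist(nodes[winner(nodes, x)], x) for x in data) / len(data)
        # Convergence test: when the error has stopped changing, tighten the
        # learning parameters; stop once the radius is already zero.
        if abs(prev_err - err) < tol:
            if radius == 0:
                break
            radius -= 1
            alpha *= 0.5
        prev_err = err
    return nodes


def consensus(node):
    """Decode a node into its most heavily weighted residue per column."""
    n = len(ALPHABET)
    cols = [node[k:k + n] for k in range(0, len(node), n)]
    return "".join(ALPHABET[max(range(n), key=col.__getitem__)] for col in cols)


# Hypothetical toy "family" with two obvious subgroups.
family = ["AAAA", "AAAC", "AACA", "WWWW", "WWWY", "WWYW"]
data = [encode(s) for s in family]
nodes = train_som(data)
groups = [winner(nodes, x) for x in data]
patterns = [consensus(node) for node in nodes]
```

Here `groups` assigns each sequence to a map node, and `patterns` reads a characteristic residue pattern back out of each node's weights, loosely mirroring the subgroup classification and pattern retrieval described above. A real protein family would need a larger two-dimensional map and a more careful encoding.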
| |
This article is indexed in SpringerLink and other databases. |
|