Grouping of amino acids and recognition of protein structurally conserved regions by reduced alphabets of amino acids |
| |
Authors: | Li Jing; Wang Wei |
| |
Affiliation: | 1. National Laboratory of Solid State Microstructure and Department of Physics, Nanjing University, Nanjing, China; 2. Interdisciplinary Center of Theoretical Studies, Chinese Academy of Sciences, Beijing, China |
| |
Abstract: | Sequence alignment is a common method for finding structurally conserved or similar regions in proteins. However, alignments
are often inaccurate when the identity between the sequences to be aligned is below 30%. At such low identity,
different residues may play similar structural roles, and a substitution matrix over all 20 residue types can align
them incorrectly. Residues can be clustered into a few groups based on the similarity of their physicochemical features.
Such simplified alphabets reduce the complexity of protein sequences while retaining
the key information encoded in them. As a result, the accuracy of sequence alignment may
improve if the residues are properly clustered. Here, using a database of aligned protein structures (DAPS), a new
clustering method based on substitution scores is proposed for grouping residues, and substitution matrices
at different levels of simplification are constructed. The validity of the reduced alphabets is confirmed by relative
entropy analysis. The reduced alphabets are then applied to the recognition of structurally conserved or similar regions by sequence
alignment. The results indicate that the accuracy or efficiency of sequence alignment can be improved with the optimal reduced
alphabet, whose size N is around 9. |
| |
Keywords: | grouping of amino acids; structural recognition; sequence alignment |
This article has been indexed by Wanfang Data, SpringerLink, and other databases. |
|
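The idea of a reduced alphabet described in the abstract can be illustrated with a minimal Python sketch. The grouping in `GROUPS` is a hypothetical nine-group partition chosen only for illustration; it is not the alphabet derived in the paper, and the function names and the two short pre-aligned sequences are likewise assumptions. The sketch maps each residue to a representative of its group and compares position-wise identity before and after the reduction.

```python
# Hypothetical grouping of the 20 amino acids into 9 groups.
# This is an illustrative assumption, NOT the alphabet from the paper.
GROUPS = ["AVLIMC", "FWY", "ST", "NQ", "DE", "KR", "H", "G", "P"]


def build_map(groups):
    """Map every residue to the first letter of its group."""
    mapping = {}
    for group in groups:
        representative = group[0]
        for residue in group:
            mapping[residue] = representative
    return mapping


def reduce_sequence(seq, mapping):
    """Rewrite a sequence in the reduced alphabet."""
    return "".join(mapping[residue] for residue in seq)


def identity(a, b):
    """Fraction of identical positions in two equal-length, pre-aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


REDUCE = build_map(GROUPS)

# Two made-up, already aligned fragments (illustrative data only).
seq_a = "MKVLDE"
seq_b = "LRIVEE"

full_identity = identity(seq_a, seq_b)
reduced_identity = identity(
    reduce_sequence(seq_a, REDUCE), reduce_sequence(seq_b, REDUCE)
)
print(f"20-letter identity: {full_identity:.2f}")
print(f"reduced identity:   {reduced_identity:.2f}")
```

In this toy case, residues that differ in the 20-letter alphabet but fall in the same group become identical after reduction, which is the effect the abstract relies on for low-identity sequences. A real application would pair the reduced alphabet with a matching reduced substitution matrix instead of plain identity.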