Secondary structure-based assignment of the protein structural classes |
| |
Authors: | Lukasz A. Kurgan, Tuo Zhang, Hua Zhang, Shiyi Shen, Jishou Ruan |
| |
Affiliation: | (1) Department of Electrical and Computer Engineering, University of Alberta, 2nd floor, ECERF (9107 116 Street), T6G 2V4 Edmonton, AB, Canada; (2) College of Mathematical Science and LPMC, Nankai University, Tianjin, People’s Republic of China; (3) Chern Institute of Mathematics, Tianjin, People’s Republic of China |
| |
Abstract: | Structural classes categorize proteins based on the amount and arrangement of their constituent secondary structures. Knowledge of structural classes is applied in numerous important predictive tasks that address structural and functional features of proteins. We propose novel structural class assignment methods that use one-dimensional (1D) secondary structure as the input. The methods are designed based on a large set of low-identity sequences for which secondary structure is either predicted from the sequence (PSSAsc model) or assigned based on the tertiary structure (SSAsc model). The secondary structure is encoded using a comprehensive set of features describing the count, content, and size of secondary structure segments, which are fed into a small decision tree that uses ten features to perform the assignment. The proposed models were compared against seven secondary structure-based and ten sequence-based structural class predictors. Using the 1D secondary structure, SSAsc and PSSAsc can assign proteins to the four main structural classes, while the existing secondary structure-based assignment methods can predict only three classes. Empirical evaluation shows that the proposed models are quite promising. Using the structure-based assignment performed in SCOP (Structural Classification of Proteins) as the gold standard, the accuracies of SSAsc and PSSAsc are 76% and 75%, respectively. We show that using secondary structure predicted from the sequence as the input does not have a detrimental effect on the quality of structural class assignment when compared with using secondary structure derived from the tertiary structure. Therefore, PSSAsc can be used to perform automated assignment of structural classes based on sequences. |
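The feature encoding described in the abstract (count, content, and size of secondary structure segments) can be illustrated with a minimal sketch. It assumes a three-state secondary structure string over the alphabet H (helix), E (strand), and C (coil). The function names `segments` and `ss_features` and the specific features computed here are hypothetical illustrations, not the ten features actually selected by the authors' decision tree.

```python
from itertools import groupby


def segments(ss):
    """Split a secondary structure string into (state, length) runs."""
    return [(state, len(list(run))) for state, run in groupby(ss)]


def ss_features(ss):
    """Compute illustrative count, content, and size features.

    Assumes a three-state string (H = helix, E = strand, C = coil).
    These features are examples only, not the authors' feature set.
    """
    if not ss:
        raise ValueError("empty secondary structure string")
    segs = segments(ss)
    feats = {}
    for state in "HE":
        lengths = [n for s, n in segs if s == state]
        # Count: number of segments of this type.
        feats[f"count_{state}"] = len(lengths)
        # Content: fraction of residues in this state.
        feats[f"content_{state}"] = sum(lengths) / len(ss)
        # Size: longest and average segment length.
        feats[f"max_len_{state}"] = max(lengths, default=0)
        feats[f"avg_len_{state}"] = sum(lengths) / len(lengths) if lengths else 0.0
    return feats


if __name__ == "__main__":
    example = "CCHHHHHHCCEEEECCHHHHCC"  # made-up string for illustration
    for name, value in ss_features(example).items():
        print(name, value)
```

A feature dictionary of this kind could then be passed to any decision tree learner; which features the tree keeps and which thresholds it applies depend on the training data and are not specified in the abstract.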
| |
Keywords: | |
This article is indexed in PubMed, SpringerLink, and other databases. |
|