TI2BioP: Topological Indices to BioPolymers. Its practical use to unravel cryptic bacteriocin-like domains |
| |
Authors: | Guillermín Agüero-Chapin, Gisselle Pérez-Machado, Reinaldo Molina-Ruiz, Yunierkis Pérez-Castillo, Aliuska Morales-Helguera, Vítor Vasconcelos, Agostinho Antunes |
| |
Institution: | 1. CIMAR/CIIMAR, Centro Interdisciplinar de Investigação Marinha e Ambiental, Universidade do Porto, Porto, Portugal; 2. Molecular Simulation and Drug Design (CBQ), Central University of Las Villas, Santa Clara, Cuba; 3. Department of Organic Chemistry, Vigo University, Vigo, Spain; 4. Department of Chemistry, Central University of Las Villas, Santa Clara, Cuba; 5. REQUIMTE, Department of Chemistry, University of Porto, Porto, Portugal; 6. Departamento de Biologia, Faculdade de Ciências, Universidade do Porto, Porto, Portugal |
| |
Abstract: | Bacteriocins are proteinaceous toxins produced and exported by both Gram-negative and Gram-positive bacteria as a defense
mechanism. The bacteriocin protein family is highly diverse, which complicates the identification of bacteriocin-like sequences
by alignment approaches. Topological indices (TIs), which do not depend on sequence similarity, are a promising alternative
for predicting proteinaceous bacteriocins. We therefore present Topological Indices to BioPolymers (TI2BioP), an alignment-free
approach inspired by both the Topological Substructural Molecular Design (TOPS-MODE) and the Markov Chain Invariants for Network
Selection and Design (MARCH-INSIDE) methodologies. TI2BioP calculates spectral moments as simple TIs for building
quantitative sequence-function relationship (QSFR) models. Since hydrophobicity and basicity are major criteria for the bactericidal
activity of bacteriocins, the spectral moments (HPμ_k) were derived for the first time from artificial protein secondary
structures based on clustering amino acids into a Cartesian system of hydrophobicity and polarity. Several orders of HPμ_k
numerically characterized 196 bacteriocin-like sequences and a control group of 200 representative CATH domains.
These indices were then used to develop an alignment-free QSFR model that achieved 76.92% discrimination of bacteriocin proteins
from other domains, a relevant result considering the high sequence diversity among the members of both groups. The model
showed an overall prediction performance of 72.16%, specifically detecting 66.7% of proteinaceous bacteriocins, whereas
InterProScan retrieved just 60.2%. As a practical validation, the model also successfully predicted the cryptic bactericidal
function of the Cry 1Ab C-terminal domain of the Bacillus thuringiensis endotoxin, which has not been detected by classical alignment methods. |
| |
Keywords: | |
This article has been indexed by SpringerLink and other databases! |
|
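The spectral moments mentioned in the abstract can be illustrated with a minimal Python sketch. In TOPS-MODE-style approaches, the k-th spectral moment is the trace of the k-th power of a graph matrix, μ_k = Tr(A^k). The abstract does not give the amino acid clusters, the lattice directions, or the matrix weighting used by TI2BioP, so everything below is an assumption made for illustration: the four-class grouping in `CLASS_STEP`, the helper names `sequence_to_graph` and `spectral_moments`, and the use of a plain unweighted node adjacency matrix as a stand-in. It should not be read as the authors' implementation.

```python
# Hypothetical four-class grouping and lattice directions (assumptions;
# the abstract does not list the actual clusters or axes).
CLASS_STEP = {}
for aa in "AVLIMFWPG":
    CLASS_STEP[aa] = (1, 0)    # nonpolar -> step along +x
for aa in "STCYNQ":
    CLASS_STEP[aa] = (-1, 0)   # polar uncharged -> step along -x
for aa in "KRH":
    CLASS_STEP[aa] = (0, 1)    # basic -> step along +y
for aa in "DE":
    CLASS_STEP[aa] = (0, -1)   # acidic -> step along -y


def sequence_to_graph(seq):
    """Walk the sequence over a 2D lattice and return a node adjacency matrix.

    Residues that land on the same lattice point share one node, so the
    resulting graph can contain cycles (a pseudo secondary structure).
    """
    pos = (0, 0)
    index = {pos: 0}           # lattice point -> node id
    edges = set()
    for aa in seq:
        dx, dy = CLASS_STEP[aa]
        nxt = (pos[0] + dx, pos[1] + dy)
        if nxt not in index:
            index[nxt] = len(index)
        a, b = index[pos], index[nxt]
        edges.add((min(a, b), max(a, b)))
        pos = nxt
    n = len(index)
    adj = [[0] * n for _ in range(n)]
    for a, b in edges:
        adj[a][b] = adj[b][a] = 1
    return adj


def mat_mul(x, y):
    """Multiply two square matrices given as lists of lists."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def spectral_moments(adj, max_order):
    """Return [Tr(A^0), Tr(A^1), ..., Tr(A^max_order)]."""
    n = len(adj)
    power = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    moments = []
    for _ in range(max_order + 1):
        moments.append(sum(power[i][i] for i in range(n)))
        power = mat_mul(power, adj)
    return moments


if __name__ == "__main__":
    # "AK" yields a three-node path graph under the assumed mapping.
    print(spectral_moments(sequence_to_graph("AK"), 4))  # [3, 0, 4, 0, 8]
```

In a QSFR setting, a vector of such moments per sequence would serve as the numerical descriptors fed to a classifier. The classifier itself is not described in the abstract and is not sketched here.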