In silico Structural Study of Random Amino Acid Sequence Proteins Not Present in Nature |
| |
Authors: | Katarzyna Prymula, Monika Piwowar, Marek Kochanczyk, Lukasz Flis, Maciej Malawski, Tomasz Szepieniec, Giovanni Evangelista, Giuseppe Minervini, Fabio Polticelli, Zdzisław Wiśniowski, Kinga Sałapa, Ewa Matczyńska, Irena Roterman |
| |
Institution: | 1. Department of Bioinformatics and Telemedicine, Collegium Medicum – Jagiellonian University, Lazarza 16, PL‐31‐530 Krakow (phone and fax: +48 12 619 96 93); 2. Faculty of Chemistry, Jagiellonian University, Ingardena 3, PL‐30‐060 Krakow; 3. Faculty of Physics, Astronomy and Applied Informatics, Reymonta 4, PL‐30‐059 Krakow; 4. Institute of Computer Science AGH, Mickiewicza 30, PL‐30‐059 Krakow; 5. Academic Computer Center CYFRONET, Nawojki 11, PL‐30‐950 Krakow; 6. Department of Biology, University Roma Tre, Viale G. Marconi 446, I‐00146 Rome; 7. G. Evangelista and F. Polticelli are the authors of the software for random sequence generation and selection against the Non‐Redundant sequence database. G. Minervini set up Rosetta on the grid and generated the predictions. M. Malawski and T. Szepieniec were responsible for setting up the FOD model on the grid system. M. Kochanczyk and L. Flis prepared the program implementing the FOD model and introduced some modifications (a procedure that avoids atom overlaps) and the RMS‐D calculation that allows structural comparison. K. Prymula was responsible for monitoring the progress of the calculations. M. Piwowar and E. Matczyńska are the authors of the program that checks the amino acid sequences. Z. Wiśniowski and K. Sałapa performed the statistical calculations. I. Roterman is the author of the FOD model. |
| |
Abstract: | The three‐dimensional structures of a set of ‘never born proteins’ (NBPs, random amino acid sequence proteins with no significant homology to known proteins) were predicted using two methods: Rosetta and one based on the ‘fuzzy‐oil‐drop’ (FOD) model. More than 3000 different random amino acid sequences were generated, filtered against the non‐redundant protein sequence database to remove sequences with significant homology to known proteins, and subjected to three‐dimensional structure prediction. Comparison of the Rosetta and FOD predictions allowed selection of the top ten (highest structural similarity) and the bottom ten (lowest structural similarity) structures from a list ranked by RMS‐D value. The selected structures were analyzed in detail to define the extent of structural agreement and discrepancy between the two methods. The structural similarity measurements revealed discrepancies between the structures generated by the two methods. Their potential biological functions appeared to be quite different as well. The bottom ten structures appeared to be ‘unfoldable’ for the FOD model. Some aspects of the general characteristics of the NBPs are also discussed. The calculations were performed on the EUChinaGRID grid platform to test the performance of this infrastructure for massive protein structure predictions. |
| |
Keywords: | In silico studies; Biological activity; Active center recognition; Hydrophobicity deficiency; Proteins |
|
|