Distinct Stages of Protein Evolution as Suggested by Protein Sequence Analysis
Authors: Edward N. Trifonov, Alla Kirzhner, Valery M. Kirzhner, Igor N. Berezovsky
Affiliation: (1) Department of Structural Biology, The Weizmann Institute of Science, Rehovot 76100, Israel; (2) Genome Diversity Center, Institute of Evolution, University of Haifa, Haifa 31905, Israel
Abstract: The evolution of proteins encoded in nucleotide sequences began with the advent of the triplet code. The chronological order in which amino acids appeared on the evolutionary scene, and the steps in the evolution of the triplet code, were recently reconstructed (Trifonov, 2000b) on the basis of 40 different ranking criteria and hypotheses. According to the consensus chronology, the complementary codons GGC and GCC, encoding glycine and alanine respectively, appeared first. Other codons also appeared as complementary pairs, which divided their respective amino acids into two alphabets, encoded by triplets with either central purines or central pyrimidines: G, D, S, E, N, R, K, Q, C, H, Y, and W (Glycine alphabet G) and A, V, P, S, L, T, I, F, and M (Alanine alphabet A). It is speculated that the earliest polypeptide chains were very short, presumably of uniform length, and belonged to one of two alphabet types encoded in the two complementary strands of the earliest mRNA duplexes. After fusion of the minigenes, a mosaic of the alphabets would form. Traces of the predicted mosaic structure have indeed been detected in the protein sequences of complete prokaryotic genomes, as weak oscillations with a period of 12 residues that reflect the alternation of two types of 6-residue-long units. The next stage of protein evolution corresponded to the closure of the chains into loops of 25–30 residues (Berezovsky et al., 2000). Autocorrelation analysis of the proteins of 23 complete archaebacterial and eubacterial genomes revealed that the preferred distances between valine, alanine, glycine, leucine, and isoleucine residues along the sequences fall in the same range of 25–30 residues, indicating that the loops are closed primarily by hydrophobic interactions between their ends. The loop closure stage was followed by the formation of typical folds of 100–200 amino acids via end-to-end fusion of the genes encoding the loop-size chains. This size was apparently dictated by the optimal ring closure for DNA. In both cases the closure into a ring (loop) conferred evolutionarily advantageous stability on the respective structures. Further gene fusions led to the formation of modern multidomain proteins. Recombinational gene splicing is likely to have appeared after the DNA circularization stage.
Received: 21 December 2000 / Accepted: 28 February 2001
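The relation between complementary codon pairs and the two alphabets can be checked with a short script. The sketch below is only an illustration and is not taken from the paper: it assumes the standard genetic code written in DNA letters, and the names `CODON_TABLE`, `reverse_complement`, and `alphabets` are hypothetical. It relies on the fact that the reverse complement of a codon with a central purine always has a central pyrimidine, so each complementary pair contributes one amino acid to each alphabet.

```python
# Standard genetic code (an assumption of this sketch), DNA letters,
# codons enumerated in TCAG order for each of the three positions.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
PURINES = set("AG")


def reverse_complement(codon):
    """Return the codon read on the complementary strand (5' to 3')."""
    return "".join(COMPLEMENT[base] for base in reversed(codon))


def alphabets():
    """Split amino acids by the central base of their codons.

    Returns (central-purine set, central-pyrimidine set); stop codons
    are skipped. An amino acid may appear in both sets.
    """
    purine_set, pyrimidine_set = set(), set()
    for codon, amino_acid in CODON_TABLE.items():
        if amino_acid == "*":
            continue
        if codon[1] in PURINES:
            purine_set.add(amino_acid)
        else:
            pyrimidine_set.add(amino_acid)
    return purine_set, pyrimidine_set


if __name__ == "__main__":
    print(reverse_complement("GGC"), CODON_TABLE["GGC"], CODON_TABLE["GCC"])
    glycine_alphabet, alanine_alphabet = alphabets()
    print(sorted(glycine_alphabet))
    print(sorted(alanine_alphabet))
```

Under the standard code, serine (S) lands in both sets because it is encoded both by AGT/AGC (central purine) and by TCN codons (central pyrimidine), which agrees with its appearance in both alphabets listed in the abstract.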
Keywords: Early evolution; Amino acid chronology; Codon chronology; Triplet code; Homopeptides; Heteropeptides; Sequence mosaic; Closed loops; Autocorrelation; DNA ring closure; Protein folds; Multidomain proteins; Gene splicing
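The idea behind the autocorrelation analysis mentioned in the abstract, looking for preferred distances between selected residues along a sequence, can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' actual procedure: it only counts raw pairwise distances between residues from the set V, A, G, L, I, with no normalization against a background expectation. The function name `distance_counts` and the `max_dist` cutoff of 50 are hypothetical choices.

```python
from collections import Counter


def distance_counts(seq, residues="VAGLI", max_dist=50):
    """Count sequence distances between all pairs of selected residues.

    seq      -- protein sequence in one-letter code
    residues -- residue types of interest (assumed set: V, A, G, L, I)
    max_dist -- largest separation to record (illustrative cutoff)
    """
    positions = [i for i, aa in enumerate(seq) if aa in residues]
    counts = Counter()
    for index, first in enumerate(positions):
        for second in positions[index + 1:]:
            distance = second - first
            if distance > max_dist:
                break  # positions are sorted, so later ones are farther
            counts[distance] += 1
    return counts


if __name__ == "__main__":
    # Toy sequence: hydrophobic residues spaced 27 positions apart.
    toy = "V" + "D" * 26 + "L" + "D" * 26 + "A"
    print(distance_counts(toy))
```

On real data the counts would be accumulated over all proteins of a genome and compared with what is expected from residue composition alone before any peak in the 25–30 residue range could be interpreted.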
This article is indexed in SpringerLink and other databases.