Similar literature
 Found 20 similar documents
1.
In a recent paper Giulio & Caldararo (1987) used a Markov chain model to study the evolution of proteins. Unfortunately, their use of a first-order Markov chain model at the amino acid level is incorrect. The model has to be applied at the codon level [Jorre & Curnow (1975a)] followed by amalgamation of the codon states corresponding to each amino acid and of the three codons specifying termination. The model is correctly applied in this paper. The results obtained do not differ substantially from those obtained by Giulio & Caldararo (1987). The interpretation of the results as supporting the neutralist view of protein evolution is criticized.

2.
To understand more fully how amino acid composition of proteins has changed over the course of evolution, a method has been developed for estimating the composition of proteins in an ancestral genome. Estimates are based upon the composition of conserved residues in descendant sequences and empirical knowledge of the relative probability of conservation of various amino acids. Simulations are used to model and correct for errors in the estimates. The method was used to infer the amino acid composition of a large protein set in the Last Universal Ancestor (LUA) of all extant species. Relative to the modern protein set, LUA proteins were found to be generally richer in those amino acids that are believed to have been most abundant in the prebiotic environment and poorer in those amino acids that are believed to have been unavailable or scarce. It is proposed that the inferred amino acid composition of proteins in the LUA probably reflects historical events in the establishment of the genetic code.

3.
Understanding the patterns and causes of protein sequence evolution is a major challenge in evolutionary biology. One of the critical unresolved issues is the relative contribution of selection and genetic drift to the fixation of amino acid sequence differences between species. Molecular homoplasy, the independent evolution of the same amino acids at orthologous sites in different taxa, is one potential signature of selection; however, relatively little is known about its prevalence in eukaryotic proteomes. To quantify the extent and type of homoplasy among evolving proteins, we used phylogenetic methodology to analyze 8 genome-scale data matrices from clades of different evolutionary depths that span the eukaryotic tree of life. We found that the frequency of homoplastic amino acid substitutions in eukaryotic proteins was more than 2-fold higher than expected under neutral models of protein evolution. The overwhelming majority of homoplastic substitutions were parallelisms that involved the most frequently exchanged amino acids with similar physicochemical properties and that could be reached by a single-mutational step. We conclude that the role of homoplasy in shaping the protein record is much larger than generally assumed, and we suggest that its high frequency can be explained by both weak positive selection for certain substitutions and purifying selection that constrains substitutions to a small number of functionally equivalent amino acids.

4.
M Hasegawa, T A Yano. Origins of Life, 1975, 6(1-2): 219-227
The entropy of the amino acid sequences coded by DNA is considered as a measure of diversity or variety of proteins, and is taken as a measure of evolution. The DNA or mRNA sequence is considered as a stationary second-order Markov chain composed of four kinds of bases. Because of the biased nature of the genetic code table, an increase in the entropy of amino acid sequences is possible with a biased nucleotide sequence. Thus the biased DNA base composition and the extreme rarity of the base doublet CpG in higher organisms are explained. It is expected that the amino acid composition was highly biased in the days of the origin of the genetic code table, and that the more frequent amino acids have tended to get rarer, and the rarer ones more frequent. This tendency is observed in the evolution of hemoglobin, cytochrome C, fibrinopeptide, immunoglobulin and lysozyme, and of protein as a whole.
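
The entropy measure described in this abstract can be illustrated with a minimal sketch. It is not the authors' model: it assumes bases are drawn independently (a zeroth-order simplification of the second-order Markov chain mentioned above), uses the standard genetic code, and excludes stop codons. The function name `amino_acid_entropy` is hypothetical.

```python
import math
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... (bases T, C, A, G).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def amino_acid_entropy(base_freqs):
    """Shannon entropy (bits) of the amino acid distribution implied by base_freqs.

    Assumption: the three codon positions are independent draws from base_freqs.
    Stop codons are dropped and the remaining probabilities are renormalized.
    """
    aa_prob = {}
    for codon, aa in CODON_TABLE.items():
        if aa == "*":
            continue
        p = math.prod(base_freqs[b] for b in codon)
        aa_prob[aa] = aa_prob.get(aa, 0.0) + p
    total = sum(aa_prob.values())
    return -sum((p / total) * math.log2(p / total) for p in aa_prob.values() if p > 0)


uniform = {b: 0.25 for b in BASES}
print(round(amino_acid_entropy(uniform), 3))
```

Passing a different `base_freqs` dictionary lets one compare the entropy under a biased base composition with the uniform case; the maximum possible value for 20 amino acids is `log2(20)`, which the uneven codon degeneracy prevents a uniform base composition from reaching.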

5.
Sato Y, Nishida M. Gene, 2009, 441(1-2): 3-11
Previous studies of protein evolution have identified important mutations in various proteins that affect a small number of residues, but dramatically alter protein function. However, the evolutionary process underlying the three-dimensional protein properties, which are determined by a much larger number of residues, remains unclear. Based on a comparative evolutionary analysis of teleost phosphoglucose isomerases (PGIs; EC 5.3.1.9), we previously demonstrated that the relatively weak selection on many amino acid sites has played an important role in the evolution of protein electric charge as a model of three-dimensional protein properties. To ascertain the generality of this finding, we sought further evidence of this type of protein evolution. For this purpose, we analyzed the vertebrate isoforms of fructose-1,6-bisphosphate aldolase (ALD; EC 4.1.2.13), for which electric charges are known to have diverged after gene duplication. The results showed that the divergence in electric charge between the ALD isoforms was also driven by weak selection on many amino acid sites, as in PGI, confirming the generality of earlier findings. To obtain further insights, ALD and PGI were compared to the proteins pancreatic ribonuclease (EC 3.1.27.5) and triose-phosphate isomerase (EC 5.3.1.1), for which electric charges likely evolved through a well-defined mode of molecular evolution; namely, strong selection on specific amino acid sites. Comparison of the number and composition of amino acids on the protein surface suggested that the absolute number of evolutionarily changeable amino acids in a protein affects the strength of selection pressure acting on individual amino acid sites.

6.
7.
A variety of strategies to incorporate unnatural amino acids into proteins have been pursued, but all have limitations with respect to technical accessibility, scalability, applicability to in vivo studies, or site specificity of amino acid incorporation. The ability to selectively introduce unnatural functional groups into specific sites within proteins, in vivo, provides a potentially powerful approach to the study of protein function and to large-scale production of novel proteins. Here we describe a combined genetic selection and screen that allows the rapid evolution of aminoacyl-tRNA synthetase substrate specificity. Our strategy involves the use of an "orthogonal" aminoacyl-tRNA synthetase and tRNA pair that cannot interact with any of the endogenous synthetase-tRNA pairs in Escherichia coli. A chloramphenicol-resistance (Cm(r)) reporter is used to select highly active synthetase variants, and an amplifiable fluorescence reporter is used together with fluorescence-activated cell sorting (FACS) to screen for variants with the desired change in amino acid specificity. Both reporters are contained within a single genetic construct, eliminating the need for plasmid shuttling and allowing the evolution to be completed in a matter of days. Following evolution, the amplifiable fluorescence reporter allows visual and fluorimetric evaluation of synthetase activity and selectivity. Using this system to explore the evolvability of an amino acid binding pocket of a tyrosyl-tRNA synthetase, we identified three new variants that allow the selective incorporation of amino-, isopropyl-, and allyl-containing tyrosine analogs into a desired protein. The new enzymes can be used to produce milligram-per-liter quantities of unnatural amino acid-containing protein in E. coli.

8.
The estimation of amino acid replacement frequencies during molecular evolution is crucial for many applications in sequence analysis. Score matrices for database search programs or phylogenetic analysis rely on such models of protein evolution. Pioneering work was done by Dayhoff et al. (1978) who formulated a Markov model of evolution and derived the famous PAM score matrices. Her estimation procedure for amino acid exchange frequencies is restricted to pairs of proteins that have a constant and small degree of divergence. Here we present an improved estimator, called the resolvent method, that is not subject to these limitations. This extension of Dayhoff's approach enables us to estimate an amino acid substitution model from alignments of varying degree of divergence. Extensive simulations show the capability of the new estimator to recover accurately the exchange frequencies among amino acids. Based on the SYSTERS database of aligned protein families (Krause and Vingron, 1998) we recompute a series of score matrices.
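
The Markov-model idea behind PAM-style matrices can be sketched as follows. This is not the resolvent method from the abstract; it only illustrates the general Dayhoff-style step of raising a one-step transition matrix to a power and converting the result to log-odds scores. The two-letter alphabet, the matrix `P1`, the stationary distribution `PI`, and all function names are made up for illustration.

```python
import math


def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def mat_pow(p, n):
    """Return p raised to the n-th power (n >= 0) by repeated multiplication."""
    size = len(p)
    result = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = mat_mul(result, p)
    return result


def log_odds(p_n, stationary):
    """Score of aligning state i with state j: log2(P_n[i][j] / stationary[j])."""
    size = len(p_n)
    return [[math.log2(p_n[i][j] / stationary[j]) for j in range(size)] for i in range(size)]


# Hypothetical one-step matrix for a toy two-letter alphabet (1% change per step).
P1 = [[0.99, 0.01], [0.01, 0.99]]
PI = [0.5, 0.5]  # stationary distribution of P1

P2 = mat_pow(P1, 2)        # transition probabilities after two steps
scores = log_odds(P2, PI)  # positive on the diagonal, negative off it
```

A real application would use a 20-state matrix estimated from alignments; the abstract's point is that such estimation need not be restricted to closely related sequence pairs.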

9.
The complete primary structure of the coat protein of strain VRU of alfalfa mosaic virus (AMV) is reported. The strain is morphologically different from all other AMV strains as it contains large amounts of unusually long virus particles. This is caused by structural differences in the coat protein chain. The amino acid sequence has mainly been established by the characterization of peptides obtained after cleavage with cyanogen bromide and digestion with trypsin, chymotrypsin, thermolysin or Staphylococcus aureus protease. The major sequencing technique used was the dansyl-Edman procedure. The VRU coat protein consists of 219 amino acid residues corresponding to a molecular weight of 24056. Compared to the coat protein of strain 425 [Van Beynum et al. (1977) Eur. J. Biochem. 72, 63-78], 15 amino acid substitutions were localized. Most of them have a conservative character and may be explained by single-point mutations. A correction is given for the AMV 425 coat protein: Asn-216 was shown to be Asp-216. The prediction of the secondary structure for the two viral coat proteins was not significantly influenced by the various amino acid substitutions except for the region containing residues 65-100. This led us to the hypothesis that the AMV coat protein may occur in two different conformations favouring its incorporation into either a pentagonal or hexagonal quasi-equivalent position in the lattice of the protein shell. The substitutions in the above-mentioned region of the VRU coat protein may have caused a strong preference for the hexagonal lattice conformation. The model is supported by preliminary sequence data of the same coat protein region in AMV 15/64, a strain morphologically intermediate between 425 and VRU.

10.
The entropy of the amino acid sequences coded by DNA is considered as a measure of diversity or variety of proteins, and is taken as a measure of evolution. The DNA or mRNA sequence is considered as a stationary second-order Markov chain composed of four kinds of bases. Because of the biased nature of the genetic code table, an increase in the entropy of amino acid sequences is possible with a biased nucleotide sequence. Thus the biased DNA base composition and the extreme rarity of the base doublet CpG in higher organisms are explained. It is expected that the amino acid composition was highly biased in the days of the origin of the genetic code table, and that the more frequent amino acids have tended to get rarer, and the rarer ones more frequent. This tendency is observed in the evolution of hemoglobin, cytochrome C, fibrinopeptide, immunoglobulin and lysozyme, and of protein as a whole.

11.
Jia M, Luo L, Liu C. Biopolymers, 2004, 73(1): 16-26
A new integrated sequence-structure database, called IADE (Integrated ASTRAL-DSSP-EMBL), incorporating matching mRNA sequence, amino acid sequence, and protein secondary structural data, is constructed. It includes 648 protein domains. Based on the IADE database, we studied the relation between RNA stem-loop frequencies and protein secondary structure. It was found that the alpha-helices and beta-strands of proteins tend to be preferentially "coded" by mRNA stem regions, while the coils of proteins tend to be preferentially "coded" by mRNA loop regions. These tendencies are more obvious if we observe the structural words (SWs). An SW is defined as a four-amino-acid fragment that shows a pronounced secondary structural (alpha-helix or beta-strand) propensity. It is demonstrated that the deduced correlation between protein and mRNA structure can hardly be explained as a stochastic fluctuation effect.

12.
Understanding the cause of the changes in the amino acid composition of proteins is essential for understanding the evolution of protein functions. Since the early 1970s, it has been known that the frequency of some amino acids in protein sequences is increasing and that of others is decreasing. Recently, it was found that the trends of amino acid changes were similar in 15 taxa representing Bacteria, Archaea, and Eukaryota. However, the cause of this similarity in the trend of the gains and losses of amino acids continued to be debated. Here, we show that this trend of the gain and loss of amino acids can be simply explained by CpG hypermutability. We found that the frequency of amino acids coded by codons with TpG dinucleotides and those with CpA dinucleotides is increasing, while that of amino acids coded by codons with CpG dinucleotides is decreasing. We also found that organisms that lack DNA methyltransferase show different trends of the gain and loss of amino acids. DNA methyltransferase methylates CpG dinucleotides and induces CpG hypermutability. The incorporation of CpG hypermutability into models of protein evolution will improve studies on protein evolution in different organisms.
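
The codon classes named in this abstract can be listed with a short sketch. It assumes the standard genetic code, looks only at dinucleotides that fall inside a single codon (not those spanning two adjacent codons), and ignores stop codons; the function name `amino_acids_with_dinucleotide` is hypothetical and not taken from the paper.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... (bases T, C, A, G).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def amino_acids_with_dinucleotide(dinuc):
    """Amino acids that have at least one codon containing the given dinucleotide."""
    return {aa for codon, aa in CODON_TABLE.items() if aa != "*" and dinuc in codon}


for dinuc in ("CG", "TG", "CA"):
    print(dinuc, sorted(amino_acids_with_dinucleotide(dinuc)))
```

Under these assumptions, CpG-containing codons encode Arg, Ser, Pro, Thr and Ala, while a C-to-T change on either strand turns CpG into TpG or CpA, which is the link the abstract draws between CpG hypermutability and amino acid gains and losses.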

13.
The presence in proteins of amino acid residues that change in concert during evolution is associated with keeping the protein spatial structure and functions constant. As is the case with morphological features, correlated substitutions may become the cause of homoplasies, that is, the independent evolution of identical non-homologous adaptations. Our data obtained on model phylogenetic trees and corresponding sets of sequences have shown that the presence of correlated substitutions distorts the results of phylogenetic reconstructions. A method for accounting for co-evolving amino acid residues in phylogenetic analysis is proposed. According to this method, only a single site from each group of correlated amino acid positions should remain, whereas the other positions should not be used in further phylogenetic analysis. Simulations have shown that replacing, on average, 8% of the variable positions in a pair of model sequences with coordinately evolving amino acid residues is enough to change the tree topology. The removal of such amino acid residues from sequences before phylogenetic analysis restores the correct topology.

14.
Data on the amino acid composition of proteins having various functions from organisms representing different evolutionary levels (83 superfamilies) are used in order to elucidate the trends in protein molecular evolution. Two relationships are studied: that between evolutionary rate (rate of mutation acceptance) and amino acid composition, and that between the evolutionary level of the organism and amino acid composition (in the case of proteins of the same or very similar function). The amino acid compositions of proteins jointly performing evolutionarily old functions are also juxtaposed. The mean contemporary protein composition is used as a basis for comparison. The obtained results are evidence in favour of the existence of a trend for an increase of the special amino acids (Met, Ile, Gln, His, Lys, Asn, Phe, Tyr, Trp, Cys) at the expense of the usual ones (Thr, Pro, Ala, Ser, Arg, Gly, Leu, Val, Glu, Asp). The tests of statistical significance of the obtained results (comparison of the mean compositions of proteins from low evolutionary level organisms with that of all sequenced proteins; comparison of the mean contemporary protein composition with that obtained after simulation of the evolutionary process) confirm and generalize the observed trend. The above results direct attention to the concept of a smaller number of amino acids in the ancient proteins and a correspondingly simpler genetic code. A fluctuation around the initial primitive level is suggested to explain the conservatism of proteins of the same function in organisms of low evolutionary level. The observed trend could be applied to the design of new proteins.

15.
MOTIVATION: Knowledge of how proteomic amino acid composition has changed over time is important for constructing realistic models of protein evolution and increasing our understanding of molecular evolutionary history. The proteomic amino acid composition of the Last Universal Ancestor (LUA) of life is of particular interest, since that might provide insight into the early evolution of proteins and the nature of the LUA itself. RESULTS: We introduce a method to estimate ancestral amino acid composition that is based on expectation-maximization. On simulated data, the approach was found to be very effective in estimating ancestral amino acid composition, with accuracy improving as the number of residues in the dataset was increased. The method was then used to infer the amino acid composition of a set of proteins in the LUA. In general, as compared with the modern protein set, LUA proteins were found to be richer in amino acids that are believed to have been most abundant in the prebiotic environment and poorer in those believed to have been unavailable or scarce. Additionally, we found the inferred amino acid composition of this protein set in the LUA to be more similar to the observed composition of the same set in extant thermophilic species than in extant mesophilic species, supporting the idea that the LUA lived in a thermophilic environment. AVAILABILITY: The program is available at http://compbio.cs.princeton.edu/ancestralaa

16.
MOTIVATION: The observed correlations between pairs of homologous protein sequences are typically explained in terms of a Markovian dynamic of amino acid substitution. This model assumes that every location on the protein sequence has the same background distribution of amino acids, an assumption that is incompatible with the observed heterogeneity of protein amino acid profiles and with the success of profile multiple sequence alignment. RESULTS: We propose an alternative model of amino acid replacement during protein evolution based upon the assumption that the variation of the amino acid background distribution from one residue to the next is sufficient to explain the observed sequence correlations of homologs. The resulting dynamical model of independent replacements drawn from heterogeneous backgrounds is simple and consistent, and provides a unified homology match score for sequence-sequence, sequence-profile and profile-profile alignment.

17.
The effect of cadmium (Cd) exposure on Cd-binding ligands was investigated for the first time in a beetle (Coleoptera), using the mealworm Tenebrio molitor (L) as a model species. Exposure to Cd resulted in an approximate doubling of the Cd-binding capacity of the protein extracts from whole animals. Analysis showed that the increase was mainly explained by the induction of a Cd-binding protein of 7134.5 Da, with non-metallothionein characteristics. Amino acid analysis and de novo sequencing revealed that the protein has an unusually high content of the acidic amino acids aspartic and glutamic acid, which may explain how this protein can bind Cd even without cysteine residues. Similarities in the amino acid composition suggest that it belongs to a group of little-studied proteins often referred to as "Cd-binding proteins without high cysteine content". This is the first report on isolation and peptide sequence determination of such a protein from a coleopteran.

18.
Parisi G, Echave J. Gene, 2005, 345(1): 45-53
The Structurally Constrained Protein Evolution (SCPE) model simulates protein evolution by introducing random mutations into the evolving sequences and selecting them against too much structural perturbation. Given a single protein structure, the SCPE model can be used to obtain a whole set of site-dependent amino acid substitution matrices. The set of SCPE substitution matrices for a given protein family can be seen as an independent-sites model of evolution for that family. Thus, these matrices can be compared with other substitution-matrix-based models of evolution. So far, SCPE has been tested only on left-handed parallel beta helix (LbetaH) proteins. Here, we address the question of generality by assessing the SCPE model on representatives of the four main classes of folds: alpha, beta, alpha+beta, and alpha/beta. We compare with other models using the likelihood ratio test with parametric bootstrapping. We show that SCPE performs better than the popular JTT model for all cases considered. Furthermore, by considering the relative contributions of mutation and selection, we found that the key to the success of the SCPE model is the selection step.

19.
That the physicochemical properties of amino acids constrain the structure, function and evolution of proteins is not in doubt. However, principles derived from information theory may also set bounds on the structure (and thus also the evolution) of proteins. Here we analyze the global properties of the full set of proteins in release 13-11 of the SwissProt database, showing by experimental test of predictions from information theory that their collective structure exhibits properties that are consistent with their being guided by a conservation principle. This principle (Conservation of Information) defines the global properties of systems composed of discrete components each of which is in turn assembled from discrete smaller pieces. In the system of proteins, each protein is a component, and each protein is assembled from amino acids. Central to this principle is the inter-relationship of the unique amino acid count and total length of a protein and its implications for both average protein length and occurrence of proteins with specific unique amino acid counts. The unique amino acid count is simply the number of distinct amino acids (including those that are post-translationally modified) that occur in a protein, and is independent of the number of times that the particular amino acid occurs in the sequence. Conservation of Information does not operate at the local level (it is independent of the physicochemical properties of the amino acids) where the influences of natural selection are manifest in the variety of protein structure and function that is well understood. Rather, this analysis implies that Conservation of Information would define the global bounds within which the whole system of proteins is constrained; thus it appears to be acting to constrain evolution at a level different from natural selection, a conclusion that appears counter-intuitive but is supported by the studies described herein.
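
The "unique amino acid count" defined in this abstract reduces to counting distinct residue labels. The sketch below is an illustration only: the function name `unique_amino_acid_count` and the label `"pS"` for a modified serine are hypothetical conventions, not taken from the paper or from SwissProt.

```python
def unique_amino_acid_count(residues):
    """Number of distinct residue labels, ignoring how often each one occurs.

    `residues` may be a plain one-letter sequence string, or a list of labels
    so that post-translationally modified residues can be given their own
    label and counted as distinct.
    """
    return len(set(residues))


print(unique_amino_acid_count("MKKLLA"))          # distinct letters: M, K, L, A
print(unique_amino_acid_count(["S", "pS", "S"]))  # serine and a modified serine
```

Together with `len(residues)` this gives the pair (unique count, total length) whose inter-relationship the abstract describes as central to the analysis.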

20.
Protein evolution can be seen as the successive replacement of amino acids by other amino acids. In general, it is a very slow process which is triggered by point mutations in the nucleotide sequence. These mutations can transform into single nucleotide polymorphisms (SNPs) within populations and diverging proteins between species. It is well known that in many cases amino acids can be replaced by others without impeding the functioning of the protein, even if these are of quite different physico-chemical character. In some cases, however, almost any replacement would result in a functionally deficient protein. Based upon comprehensive published SNP data and applying correlation analysis, we quantified the two antagonistic factors controlling the process of amino acid replacement and thus protein evolution: first, the degenerate structure of the genetic code, which facilitates the exchange of certain amino acids, and, second, the physico-chemical forces which limit the range of possible exchanges to maintain a functional protein. We found that the observed frequencies of amino acid exchanges within species are best explained by the genetic code and that the conservation of physico-chemical properties plays a subordinate role, but nevertheless has to be considered a key factor. Between moderately diverged species, genetic code and physico-chemical properties exert comparable influence on amino acid exchanges. We furthermore studied amino acid exchanges in more detail for six species (four mammals, one bird, and one insect) and found that the profiles are highly correlated across all examined species despite their large evolutionary divergence of up to 800 million years. The species-specific exchange profiles are also correlated with the exchange profile observed between different species. The huge body of SNP data currently available makes it possible to characterize the role of two major shaping forces of protein evolution more quantitatively than before.
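
The way the genetic code facilitates some exchanges over others can be made concrete with a sketch that lists which amino acids are reachable from a given one by a single nucleotide change. It assumes the standard genetic code and ignores changes to stop codons; the function name `single_step_neighbors` is hypothetical, and the sketch does not reproduce the SNP-based correlation analysis of the abstract.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... (bases T, C, A, G).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def single_step_neighbors(aa):
    """Amino acids reachable from `aa` by changing one base in one of its codons."""
    neighbors = set()
    for codon, source in CODON_TABLE.items():
        if source != aa:
            continue
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                mutant = codon[:pos] + base + codon[pos + 1:]
                target = CODON_TABLE[mutant]
                # Skip synonymous changes and changes to a stop codon.
                if target not in ("*", aa):
                    neighbors.add(target)
    return neighbors


print(sorted(single_step_neighbors("M")))
print(sorted(single_step_neighbors("W")))
```

Exchanges outside these neighbor sets need at least two nucleotide changes, which is one reason the genetic code alone already predicts an uneven exchange profile.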
