Similar literature
Found 20 similar articles (search time: 31 ms)
1.
Stabilization of secondary structure elements by specific combinations of hydrophobic and hydrophilic amino acids was studied by analyzing pentapeptide fragments from twelve partial bacterial proteomes. PDB files describing structures of proteins from species with extremely high and low genomic GC-content, as well as with average G + C, were included in the study. Amino acid residues in 78,009 pentapeptides from alpha helices, beta strands and coil regions were classified as hydrophobic or hydrophilic. A common propensity scale for the 32 possible combinations of hydrophobic and hydrophilic amino acid residues in a pentapeptide was created, and pentapeptides specific to helix, sheet and coil were described. The usage of pentapeptides that preferentially form alpha helices decreases in alpha helices of partial bacterial proteomes as the average genomic GC-content in first and second codon positions increases. The usage of pentapeptides that preferentially form beta strands increases in coil regions and in helices of partial bacterial proteomes as the average genomic GC-content in first and second codon positions grows. Consequently, the probability of coil-sheet and helix-sheet transitions should be higher in proteins encoded by GC-rich genes, making them prone to form amyloid under certain conditions. Possible causes of the observation that stabilization of alpha helix and coil by specific combinations of hydrophobic and hydrophilic amino acids becomes more important as genomic GC-content decreases are discussed.
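The 32 combinations mentioned above follow from treating each of the five positions as either hydrophobic or hydrophilic (2^5 = 32). A minimal Python sketch of that encoding is given here; the hydrophobic residue set, the use of overlapping windows, and the function names are assumptions made for illustration, not the classification used in the study.

```python
from collections import Counter

# Assumed hydrophobic set, for illustration only; the study's own
# classification of residues may differ.
HYDROPHOBIC = set("AVLIMFWC")

def hp_pattern(pentapeptide):
    """Encode a 5-residue fragment as a string of H (hydrophobic) / P (hydrophilic)."""
    if len(pentapeptide) != 5:
        raise ValueError("expected exactly five residues")
    return "".join("H" if aa in HYDROPHOBIC else "P" for aa in pentapeptide.upper())

def pattern_counts(sequence):
    """Count H/P patterns over all overlapping pentapeptides in a sequence."""
    return Counter(hp_pattern(sequence[i:i + 5]) for i in range(len(sequence) - 4))
```

Counting these patterns separately for helix, strand and coil fragments would give the raw numbers from which a propensity scale could be derived.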

2.
Analysis of extant proteomes has the potential of revealing how amino acid frequencies within proteins have evolved over biological time. Evidence is presented here that cysteine, tyrosine, and phenylalanine residues have substantially increased in frequency since the three primary lineages diverged more than three billion years ago. This inference was derived from a comparison of amino acid frequencies within conserved and non-conserved residues of a set of proteins dating to the last universal ancestor in the face of empirical knowledge of the relative mutability of these amino acids. The under-representation of these amino acids within last universal ancestor proteins relative to their modern descendants suggests their late introduction into the genetic code. Thus, it appears that extant ancient proteins contain evidence pertaining to early events in the formation of biological systems.

3.
Evolutionary traces of thermophilic adaptation are manifest, on the whole-genome level, in compositional biases toward certain types of amino acids. However, it is sometimes difficult to discern their causes without a clear understanding of underlying physical mechanisms of thermal stabilization of proteins. For example, it is well-known that hyperthermophiles feature a greater proportion of charged residues, but, surprisingly, the excess of positively charged residues is almost entirely due to lysines but not arginines in the majority of hyperthermophilic genomes. All-atom simulations show that lysines have a much greater number of accessible rotamers than arginines of similar degree of burial in folded states of proteins. This finding suggests that lysines would preferentially entropically stabilize the native state. Indeed, we show in computational experiments that arginine-to-lysine amino acid substitutions result in noticeable stabilization of proteins. We then hypothesize that if evolution uses this physical mechanism as a complement to electrostatic stabilization in its strategies of thermophilic adaptation, then hyperthermostable organisms would have much greater content of lysines in their proteomes than comparably sized and similarly charged arginines. Consistent with that, high-throughput comparative analysis of complete proteomes shows extremely strong bias toward arginine-to-lysine replacement in hyperthermophilic organisms and overall much greater content of lysines than arginines in hyperthermophiles. This finding cannot be explained by genomic GC compositional biases or by the universal trend of amino acid gain and loss in protein evolution. We discovered here a novel entropic mechanism of protein thermostability due to residual dynamics of rotamer isomerization in native state and demonstrated its immediate proteomic implications. 
Our study provides an example of how analysis of a fundamental physical mechanism of thermostability helps to resolve a puzzle in comparative genomics as to why amino acid compositions of hyperthermophilic proteomes are significantly biased toward lysines but not similarly charged arginines.

4.
The usage of synonymous codons and the frequencies of amino acids were investigated in the complete genome of the bacterium Thermotoga maritima using a multivariate statistical approach. The GC3 content of each gene was the most prominent source of variation of codon usage. Surprisingly, the usage of UGU and UGC (synonymous triplets coding for Cys, the least frequent amino acid in this species) was detected as the second most prominent source of variation. However, this result is probably an artifact due to the very low frequency of Cys together with the nonbiased composition of this genome. The third trend was related to the preferential usage of a subset of codons among highly expressed genes, and these triplets are presumed to be translationally optimal. Concerning amino acid usage, the hydropathy level of each protein (and therefore the frequency of charged residues) was the main trend, while the second factor was related to the frequency of usage of the smaller residues, suggesting that the cell economy strongly influences the architecture of the proteins. The third axis of the analysis discriminated the usage of Phe, Tyr, Trp (aromatic residues) plus Cys, Met, and His. These six residues have in common the property of being the preferential targets of reactive oxygen species, and therefore the anaerobic condition of T. maritima is an important factor for the amino acid frequencies. Finally, the Cys content of each protein was the fourth trend. Received: 22 June 2001 / Accepted: 1 October 2001
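GC3, named above as the main source of codon usage variation, is the fraction of G or C at third codon positions. A small Python sketch of that calculation follows; the function name is hypothetical, and the input is assumed to be an in-frame coding sequence.

```python
def gc3(cds):
    """Fraction of G or C at third codon positions of an in-frame coding sequence."""
    cds = cds.upper().replace("U", "T")  # accept RNA or DNA notation
    third = cds[2::3]                    # every third base, starting at codon position 3
    if not third:
        raise ValueError("sequence too short")
    return sum(base in "GC" for base in third) / len(third)
```

Computing this value per gene gives the kind of per-gene variable that a multivariate analysis of codon usage can be compared against.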

5.
6.
We analyzed the dependence of the percent of highly immunogenic amino acid residues included in B-cell epitopes of homologous proteins on the GC-content (G+C) of the genes coding for them in twenty-seven lineages of proteins (and their corresponding genes), which belong to seven Varicello and five Simplex viruses. We found that proteins encoded by genes of a high GC-content usually contain more targets for humoral immune response than their homologs encoded by GC-poor genes. This tendency is characteristic not only of the lineages of glycoproteins, which are the main targets for humoral immune response against Simplex and Varicello viruses, but also of the lineages of capsid proteins and even "housekeeping" enzymes. The percent of amino acids included in linear B-cell epitopes was predicted for 324 proteins by the BepiPred algorithm (www.cbs.dtu.dk/services/BepiPred); the percent of highly immunogenic amino acids included in discontinuous B-cell epitopes and the percent of exposed amino acid residues were predicted by the Epitopia algorithm (http://epitopia.tau.ac.il/). Immunological consequences of the directional mutational GC-pressure are mostly due to the decrease in the total usage of highly hydrophobic amino acids and to the increase in proline and glycine usage in proteins. The weaker the negative selection on amino acid substitutions caused by symmetric mutational pressure, the steeper the slope of the direct dependence of the percent of highly immunogenic amino acids included in B-cell epitopes on G+C.

7.
Recent studies across animal phyla have suggested a possible link between amino acid compositional shifts and adaptive evolution across mitochondrial proteomes enabling longer lifespans. These studies examined associations of a gradual loss of cysteine (Cys) residues, increased usage of methionine (Met), and increased usage of threonine (Thr), with the evolution of longevity. Here, we examine all three hypotheses in a framework that considers nucleotide composition. We find that nucleotide composition is strongly correlated across codon positions, and with the above amino acid frequency patterns. We also find that the ND6 gene, which in vertebrates is the only mitochondrial gene situated on the “light-strand”, shows no significant pattern for any of the amino acid associations. We also reasoned that if the mitochondrially-encoded proteins of oxidative phosphorylation (OXPHOS) were under selection for such shifts, then nuclear-encoded components should also reflect such pressure. However, we found non-correspondence of these patterns in the nuclear genes when compared to the mitochondrial genes previously associated with positive selection. These results are strongly suggestive of mutational bias, or less efficient purifying selection, as the primary driver of whole proteome shifts in amino acid composition.

8.
Archaea, bacteria and eukaryotes represent the main kingdoms of life. Is there any trend for amino acid compositions of proteins found in full genomes of species of different kingdoms? What is the percentage of totally unstructured proteins in various proteomes? We obtained amino acid frequencies for different taxa using 195 known proteomes and all annotated sequences from the Swiss-Prot database. Investigation of the two databases (proteomes and Swiss-Prot) shows that the amino acid compositions of proteins differ substantially for different kingdoms of life, and this difference is larger between different proteomes than between different kingdoms of life. Our data demonstrate that there is a surprisingly small selection for the amino acid composition of proteins for higher organisms (eukaryotes) and their viruses in comparison with the "random" frequency following from a uniform usage of codons of the universal genetic code. On the contrary, lower organisms (bacteria and especially archaea) demonstrate an enhanced selection of amino acids. Moreover, according to our estimates, 12%, 3% and 2% of the proteins in eukaryotic, bacterial and archaean proteomes are totally disordered, and long (> 41 residues) disordered segments are found to occur in 16% of archaean, 20% of eubacterial and 43% of eukaryotic proteins for 19 archaean, 159 bacterial and 17 eukaryotic proteomes, respectively. Correlations between amino acid compositions of proteins of various taxa show that the highest correlation is observed between eukaryotes and their viruses (the correlation coefficient is 0.98), and between bacteria and their viruses (the correlation coefficient is 0.96), while the correlation between eukaryotes and archaea is only 0.85.
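The "random" frequency referred to above can be illustrated by counting how many sense codons of the standard genetic code encode each amino acid and dividing by 61. The Python sketch below does this; it is an assumed reading of "uniform usage of codons", and the exact baseline used in the study may have been computed differently.

```python
from collections import Counter

BASES = "TCAG"
# Standard genetic code, one letter per codon in the order TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

def uniform_codon_frequencies():
    """Expected amino acid frequencies if all 61 sense codons were used equally."""
    codons = [a + b + c for a in BASES for b in BASES for c in BASES]
    table = dict(zip(codons, AMINO_ACIDS))
    counts = Counter(aa for aa in table.values() if aa != "*")  # drop stop codons
    total = sum(counts.values())
    return {aa: n / total for aa, n in counts.items()}
```

Under this baseline, six-codon amino acids such as Leu, Ser and Arg are each expected at 6/61, while Met and Trp are each expected at 1/61.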

9.
Correspondence analysis of 28 proteomes selected to span the entire realm of prokaryotes revealed universal biases in the proteins' amino acid distribution. Integral Inner Membrane Proteins always form an individual cluster, which can then be used to predict protein localisation in unknown proteomes, independently of the organism's biotope or kingdom. Orphan proteins are consistently rich in aromatic residues. Another bias is also ubiquitous: the amino acid composition is driven by the G + C content of the first codon position. An unexpected bias is driven, in many proteomes, by the AAN box of the genetic code, suggesting some functional biochemical relationship between asparagine and lysine. Less-significant biases are driven by the rare amino acids, cysteine and tryptophan. Some allow identification of species-specific functions or localisation such as surface or exported proteins. Errors in genome annotations are also revealed by correspondence analysis, making it useful for quality control and correction.

10.
11.
A full repertoire of octapeptides that are present in at least 30 of the 131 bacterial proteomes currently available is computationally derived and filtered. An original search technique is used that, in terms of computational time and memory, is similar to the suffix tree method. The presence of a given sequence in a large number of proteomes qualifies it as a conserved sequence. The larger the number of proteomes in which it is found, the higher the conservation. The concept of compositional age of the amino acid sequences (“compositional clock”) is introduced for the first time. The compositional age is calculated on the basis of the consensus temporal order of appearance of amino acids in early evolution. The correlation between the compositional age and the sequence conservation is established. [Reviewing Editor: Martin Kreitman]
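The conservation criterion above (an octapeptide counts as conserved when it occurs in at least 30 proteomes) can be sketched in Python with plain hash sets. This is not the original search technique the abstract mentions, only a simple stand-in; the function name and the input layout (a dict mapping proteome name to a list of protein sequences) are assumptions.

```python
from collections import Counter

def conserved_kmers(proteomes, k=8, min_proteomes=30):
    """Return k-mers found in at least `min_proteomes` proteomes, with their proteome counts."""
    presence = Counter()
    for proteins in proteomes.values():
        seen = set()  # each k-mer is counted once per proteome
        for seq in proteins:
            for i in range(len(seq) - k + 1):
                seen.add(seq[i:i + k])
        presence.update(seen)
    return {kmer: n for kmer, n in presence.items() if n >= min_proteomes}
```

The returned count per k-mer corresponds to the conservation measure described above: the more proteomes contain the sequence, the higher its conservation.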

12.
Zhou XX, Wang YB, Pan YJ, Li WF. Amino Acids, 2008, 34(1): 25-33
Summary. Thermophilic proteins show substantially higher intrinsic thermal stability than their mesophilic counterparts. Amino acid composition is believed to alter the intrinsic stability of proteins. Several investigations and mutagenesis experiments have been carried out to understand how amino acid composition contributes to the thermostability of proteins. This review presents some generalized features of amino acid composition found in thermophilic proteins, including an increase in residue hydrophobicity, a decrease in uncharged polar residues, an increase in charged residues, an increase in aromatic residues, certain amino acid coupling patterns and amino acid preferences for thermophilic proteins. The differences in amino acid composition between thermophilic and mesophilic proteins are related to some properties of amino acids. These features provide guidelines for engineering mesophilic proteins into thermophilic proteins. Authors’ addresses: Yuan-Jiang Pan, Institute of Chemical Biology and Pharmaceutical Chemistry, Zhejiang University, Zhejiang University Road 38, Hangzhou 310027, China; Wei-Fen Li, Microbiology Division, College of Animal Science, Zhejiang University, Hangzhou 310029, China

13.
Snake venom contains a diverse array of proteins and polypeptides. Cytotoxins and short neurotoxins are non-enzymatic polypeptide components of snake venom. The three-dimensional structure of cytotoxins and short neurotoxins has the three-finger appearance characteristic of the three-finger protein superfamily. Different members of the three-finger protein superfamily are employed in diverse biological functions. In this work we analyzed the cytotoxins, short neurotoxins and related non-toxin proteins of other chordates in terms of functional analysis, amino acid compositional (%) profile, number of amino acids, molecular weight, theoretical isoelectric point (pI), number of positively charged and negatively charged amino acid residues, instability index and grand average of hydropathy with the help of different bioinformatics tools. Among the interesting results, the profile of amino acid composition (%) shows that all sequences contain a conserved amount of cysteine but differing amounts of other amino acid residues, which follow a family-specific pattern. Involvement in different biological functions is one of the driving forces that contribute to the distinctive amino acid composition profile of these proteins. Adaptation to different biological systems gives rise to enriched biomolecules. Understanding the physicochemical properties of these proteins will help to generate medicinally important therapeutic molecules for the betterment of human lives.
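Two of the descriptors listed above are straightforward to compute from a sequence: the percent amino acid composition and the grand average of hydropathy (GRAVY), which is the mean Kyte-Doolittle hydropathy value over all residues. The Python sketch below illustrates both; the function names are hypothetical, and the abstract does not say which tools the authors actually used.

```python
from collections import Counter

# Kyte-Doolittle hydropathy scale
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def composition_percent(seq):
    """Percent of each amino acid in the sequence."""
    counts = Counter(seq.upper())
    return {aa: 100.0 * n / len(seq) for aa, n in counts.items()}

def gravy(seq):
    """Grand average of hydropathy: mean Kyte-Doolittle value per residue."""
    seq = seq.upper()
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)

def charged_counts(seq):
    """Return (positively charged, negatively charged) residue counts: (R + K, D + E)."""
    seq = seq.upper()
    return seq.count("R") + seq.count("K"), seq.count("D") + seq.count("E")
```

Applying these functions to each family member would make it possible to compare, for example, cysteine percentages across the cytotoxin, short neurotoxin and non-toxin groups.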

14.
Sequence analysis of short fragments resulting from trypsin digestion of the thermolabile shrimp alkaline phosphatase (SAP) from Northern shrimp Pandalus borealis formed the basis for amplification of its encoding cDNA. The predicted protein sequence was recognized as containing the consensus alkaline phosphatase motif comprising the active site of this protein family. Protein sequence homology searches identified several eukaryote alkaline phosphatases with which the 475-amino acid SAP polypeptide shares 45% amino acid sequence identity. Residues for potential metal binding seem to be conserved in these proteins. The predicted 54-kDa molecular mass of SAP is smaller than previously reported, but is consistent with our recent SDS-PAGE analysis of the native protein. Compared to its homologs, the shrimp enzyme has a surplus of negatively charged amino acids, while the relative number of prolines is lower and the frequency of aromatic residues is higher than in mesophilic counterparts.

15.
Evolution of Chitin-Binding Proteins in Invertebrates (total citations: 11; self-citations: 0; citations by others: 11)
Analysis of a group of invertebrate proteins, including chitinases and peritrophic matrix proteins, reveals the presence of chitin-binding domains that share significant amino acid sequence similarity. The data suggest that these domains evolved from a common ancestor which may be a protein containing a single chitin-binding domain. The duplication and transposition of this chitin-binding domain may have contributed to the functional diversification of chitin-binding proteins. Sequence comparisons indicated that invertebrate and plant chitin-binding domains do not share significant amino acid sequence similarity, suggesting that they are not coancestral. However, both the invertebrate and the plant chitin-binding domains are cysteine-rich and have several highly conserved aromatic residues. In plants, cysteines have been implicated in maintaining protein folding and aromatic amino acids in interacting with saccharides [Wright HT, Sanddrasegaram G, Wright CS (1991) J Mol Evol 33:283–294]. It is likely that these residues perform similar functions in invertebrates. We propose that the invertebrate and the plant chitin-binding domains share similar mechanisms for folding and saccharide binding and that they evolved by convergent evolution. Furthermore, we propose that the disulfide bonds and aromatic residues are hallmarks for saccharide-binding proteins. Received: 2 March 1998 / Accepted: 17 July 1998

16.
In this study we classified regions of random coil into four types: coil between alpha helix and beta strand, coil between beta strand and alpha helix, coil between two alpha helices and coil between two beta strands. This classification may be considered natural. We used 610 3D structures of proteins collected from the Protein Data Bank from bacteria with low, average and high genomic GC-content. Relatively short regions of coil are not random: certain amino acid residues are more or less frequent in each of the types of coil. Namely, hydrophobic amino acids with branched side chains (Ile, Val and Leu) are rare in coil between two beta strands, unlike some acrophilic amino acids (Asp, Asn and Gly). In contrast, coil between two alpha helices is enriched in Leu. Regions of coil between alpha helix and beta strand are enriched in positively charged amino acids (Arg and Lys), while the usage of residues with side chains possessing a hydroxyl group (Ser and Thr) is low in them, in contrast to the regions of coil between beta strand and alpha helix. Regions of coil between beta strand and alpha helix are significantly enriched in Cys residues. The response to the symmetric mutational pressure (AT-pressure or GC-pressure) is also quite different for the four types of coil. The most conserved regions of coil are “connecting bridges” between beta strand and alpha helix, since their amino acid content depends less strongly on the GC-content of genes than the amino acid contents of the other three types of coil. Possible causes and consequences of the described differences in amino acid content distribution between different types of random coil are discussed.

17.
The levels of cellular organization in living organisms are the results of a variety of selection pressures. We have investigated here the final outcome of this integrated selective process in proteins of the best known microbial models Escherichia coli, Bacillus subtilis, and Methanococcus jannaschii, supposed to have undergone separate evolution for more than 1 billion years. Using multivariate analysis methods, including correspondence analysis, we studied the overall amino acid composition of all proteins making a proteome. Starting from and further developing previous results that had pointed out some general forces driving the amino acid composition of the proteomes of these model bacteria, we explored the correlations existing between the structure and functions of the proteins forming a proteome and their amino acid composition. The electric charge of amino acids measured against hydrophobicity creates a highly homogeneous cluster, made exclusively of proteins that are core components of the cytoplasmic membrane of the cell (integral inner membrane proteins). A second bias is imposed by the G+C content of the genome, indicating that protein functions are so robust with respect to amino acid changes that they can accommodate a large shift in the nucleotide content of the genome. A remarkable role of aromatic amino acids was uncovered. Expressed orphan proteins are enriched in these residues, suggesting that they might participate in a process of gain of function during evolution.

18.
Assembly of the ribosome from its protein and RNA constituents has been studied extensively over the past 50 years, and here we utilize a comparative analysis approach to relate the composition of ribosomal proteins (r-proteins) to their role in the assembly process. We computed the amino acid distributions for the 30S subunit r-protein sequences from 560 bacterial species and compared this composition to those of other house-keeping proteins from the same species. We found that r-proteins have a significantly higher content of positively charged residues (Lysine, K, and Arginine, R) than do nonribosomal proteins (10% for R and 11% for K in r-proteins, vs. 4.7% R and 5.9% K in non-ribosomal proteins), which is consistent with prior knowledge of net positive charges carried by r-proteins (Baker et al., 2001; Klein et al., 2004; Burton et al., 2012). Furthermore, these two residues are also highly represented at contact sites along the protein/RNA interface (contact enrichment factor (CEF) > 1). These results provide further evidence of the importance of electrostatic interactions between the positively charged proteins and negatively charged ribosomal RNA (rRNA) during ribosome assembly. Other highly represented contact residues include polar and aromatic residues, which are likely to interact with rRNA via hydrogen bonds and base stacking interactions, respectively. Interestingly, the proportion of K residues generally decreases with r-protein size, reflecting a negative correlation between protein lengths and the proportion of K (Spearman’s rank correlation, ρ = −0.802, p = 2.60e−5). We suggest that this trend helps the smaller r-proteins, which experience higher translational entropy than large proteins, overcome the increased free energy barrier during assembly.
When the r-protein sequences were categorized according to the species’ optimal growth temperature, we found that thermophiles show increased R, Isoleucine (I), and Tyrosine (Y) content, whereas mesophiles have increased proportions of Serine (S) and Threonine (T). These results reflect one typical distinction between thermophiles and mesophiles (Kumar and Nussinov 2001), yet these differences in amino acid distributions do not extend to their respective contact sites. That is, the makeup of thermophilic and mesophilic r-protein contact residues is not significantly different (p > 0.01). This indicates that, while the percent compositions of amino acids relating to qualities such as thermostability and protein folding are expected to vary with environmental temperature, the distributions of residues in contact with rRNA are comparable for all bacterial species. From this, we conclude that the electrostatic interactions that guide ribosome assembly are independent of temperature.
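The Spearman rank correlation reported above between protein length and lysine proportion is the Pearson correlation of the ranked values. A dependency-free Python sketch is shown below; the function names are hypothetical, and the numbers in the usage note are invented for illustration rather than taken from the study.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j over the run of values tied with values[order[i]]
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5
```

For example, with made-up lengths `[60, 100, 150, 240]` and lysine fractions `[0.15, 0.12, 0.10, 0.07]`, the fractions fall strictly as length rises, so `spearman_rho` returns a value of -1 up to floating-point rounding.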

19.
The far-ultraviolet circular dichroic spectrum of the 39-residue peptide hormone porcine corticotropin and the biologically active fragment corticotropin 1–24 is negative from 250 nm to 195 nm in water, but in 6M guanidinium chloride a positive band appears at about 225 nm. The temperature and guanidinium chloride dependence of this spectral transition indicates the absence of any stable ordered secondary structure in corticotropin and the spectrum is seen to be in only partial agreement with results using the model peptide chromophore, Ala-Ala-Ala. Using oligopeptides containing aromatic amino acid residues sandwiched between glycyl residues, it is shown that the shape and intensity of the corticotropin 225 nm positive band which appears in 6M guanidinium chloride is in agreement with the far-ultraviolet transitions of the aromatic chromophores in the hormone. Curve resolution of the near-ultraviolet circular dichroic spectrum of corticotropin and comparison of the rotational strengths of the phenylalanyl and tyrosyl bands reveals no evidence for increased rotational freedom in 6M guanidinium hydrochloride. Spectral changes are observed, however, in the transitions arising from the single tryptophan. This study suggests that corticotropin in aqueous solution may serve as a better model for the circular dichroic spectrum of the aperiodic regions in globular proteins than either synthetic homopolypeptides or reference proteins for which spectral and X-ray diffraction data are available.

20.
A comparative study of the compositional properties of various protein sets from both cellular and viral organisms is presented. Invariants and contrasts of amino acid usages have been discerned for different protein function classes and for different species using robust statistical methods based on quantile distributions and stochastic ordering relationships. In addition, a quantitative criterion to assess amino acid compositional extremes relative to a reference protein set is proposed and applied. Invariants of amino acid usage relate mainly to the central range of quantile distributions, whereas contrasts occur mainly in the tails of the distributions, especially contrasts between eukaryote and prokaryote species. Influences from genomic constraint are evident, for example, in the arginine:lysine ratios and the usage frequencies of residues encoded by G + C-rich versus A + T-rich codon types. The structurally similar amino acids, glutamate versus aspartate and phenylalanine versus tyrosine, show stochastic dominance relationships for most species protein sets favoring glutamate and phenylalanine respectively. The quantile distribution of hydrophobic amino acid usages in prokaryote data dominates the corresponding quantile distribution in human data. In contrast, glutamate, cysteine, proline and serine usages in human proteins dominate the corresponding quantile distributions in Escherichia coli. E. coli dominates human in the use of basic residues, but no dominance ordering applies to acidic residues. The discussion centers on commonalities and anomalies of the amino acid compositional spectrum in relation to species, function, cellular localization, biochemical and steric attributes, complexity of the amino acid biosynthetic pathway, amino acid relative abundances and founder effects.
