Similar literature
 Found 20 similar articles (search time: 31 ms)
1.
Evolutionary traces of thermophilic adaptation are manifest, on the whole-genome level, in compositional biases toward certain types of amino acids. However, it is sometimes difficult to discern their causes without a clear understanding of underlying physical mechanisms of thermal stabilization of proteins. For example, it is well-known that hyperthermophiles feature a greater proportion of charged residues, but, surprisingly, the excess of positively charged residues is almost entirely due to lysines but not arginines in the majority of hyperthermophilic genomes. All-atom simulations show that lysines have a much greater number of accessible rotamers than arginines of similar degree of burial in folded states of proteins. This finding suggests that lysines would preferentially entropically stabilize the native state. Indeed, we show in computational experiments that arginine-to-lysine amino acid substitutions result in noticeable stabilization of proteins. We then hypothesize that if evolution uses this physical mechanism as a complement to electrostatic stabilization in its strategies of thermophilic adaptation, then hyperthermostable organisms would have much greater content of lysines in their proteomes than comparably sized and similarly charged arginines. Consistent with that, high-throughput comparative analysis of complete proteomes shows extremely strong bias toward arginine-to-lysine replacement in hyperthermophilic organisms and overall much greater content of lysines than arginines in hyperthermophiles. This finding cannot be explained by genomic GC compositional biases or by the universal trend of amino acid gain and loss in protein evolution. We discovered here a novel entropic mechanism of protein thermostability due to residual dynamics of rotamer isomerization in native state and demonstrated its immediate proteomic implications. 
Our study provides an example of how analysis of a fundamental physical mechanism of thermostability helps to resolve a puzzle in comparative genomics as to why amino acid compositions of hyperthermophilic proteomes are significantly biased toward lysines but not similarly charged arginines.
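
The proteome-level comparison described in this abstract can be illustrated with a minimal sketch that pools residue counts over a set of protein sequences and reports the lysine and arginine fractions and their ratio. The function name `lys_arg_ratio` and the toy sequences are assumptions made for illustration; they are not the authors' pipeline or data.

```python
from collections import Counter

def lys_arg_ratio(proteins):
    """Return (Lys fraction, Arg fraction, Lys/Arg ratio) pooled over all sequences."""
    counts = Counter()
    for seq in proteins:
        counts.update(seq.upper())
    total = sum(counts.values())
    k, r = counts["K"], counts["R"]
    # Guard against a proteome with no arginine at all.
    ratio = k / r if r else float("inf")
    return k / total, r / total, ratio

# Toy sequences for illustration only, not real proteome data.
toy_proteome = ["MKKLEKR", "MRKEKKA"]
k_frac, r_frac, ratio = lys_arg_ratio(toy_proteome)
```

Applied to complete proteomes, a ratio well above 1 would be in line with the lysine bias the abstract reports for hyperthermophiles, although the ratio alone does not separate that effect from GC-content bias.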

2.
Nine human neurodegenerative diseases, including Huntington's disease and several spinocerebellar ataxias, are associated with the aggregation of proteins comprising an extended tract of consecutive glutamine residues (polyQs) once it exceeds a certain length threshold. This event is believed to be the consequence of the expansion of polyCAG codons during the replication process. This is in apparent contradiction with the fact that many polyQ-containing proteins remain soluble and are encoded by invariant genes in a number of eukaryotes. The latter suggests that polyQ expansion and/or aggregation might be counter-selected through a genetic and/or protein context. To identify this context, we designed software that scrutinizes entire proteomes in search of imperfect polyQs. The nature of residues flanking the polyQs and that of residues other than Gln within polyQs (insertions) were assessed. We discovered strong amino acid residue biases robustly associated with polyQs in the 15 eukaryotic proteomes we examined, with an over-representation of Pro, Leu and His and an under-representation of Asp, Cys and Gly amino acid residues. These biases are conserved amongst unrelated proteins and are independent of specific functional classes. Our findings suggest that specific residues have been co-selected with polyQs during evolution. We discuss the possible selective pressures responsible for the observed biases.
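
The abstract does not say how imperfect polyQs are defined, so the following is only a sketch under stated assumptions: a tract is a run of Gln residues that may be interrupted by insertions of at most `max_gap` non-Gln residues, and it is reported when it contains at least `min_q` glutamines. The function name, both thresholds, and the test sequence are hypothetical.

```python
import re

def find_imperfect_polyq(seq, min_q=8, max_gap=1):
    """Return (start, end, tract) for Q tracts that tolerate short non-Q insertions.

    Assumes max_gap >= 1; start is inclusive and end is exclusive.
    """
    # Q runs joined by insertions of 1..max_gap non-Q residues.
    pattern = re.compile(r"Q+(?:[^Q]{1,%d}Q+)*" % max_gap)
    hits = []
    for m in pattern.finditer(seq):
        tract = m.group()
        if tract.count("Q") >= min_q:
            hits.append((m.start(), m.end(), tract))
    return hits

# Hypothetical sequence with Pro and His insertions inside the tract.
hits = find_imperfect_polyq("AQQQQPQQQQHQQA")
```

The residues just before `start` and at `end` are the flanking residues, and the non-Q characters inside `tract` are the insertions, which are the two quantities the abstract says were assessed.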

3.
Analysis of the Arabidopsis thaliana, Saccharomyces cerevisiae, Mus musculus, Escherichia coli, Bacillus subtilis, Thermoplasma acidophilum, and Sulfolobus tokodaii genomes demonstrates that many amino acid biases occur at the N- and C-termini of proteins, that a statistically significant number of these biases are evolutionarily conserved, and that these biases extend beyond the first and last five amino acids. Analyses designed to shed light on the mechanism causing amino acid biases suggest that in at least some cases the bias is caused by forces acting at the nucleic acid level. It is also demonstrated that in E. coli functionally related proteins show similar biases at the N- and C-termini, suggesting that the mechanisms causing the biases are complex and in some cases are related to function.

4.
Few quantitative measures of genome architecture or organization exist to support assumptions of differences between microorganisms that are broadly defined as being free-living or pathogenic. General principles about complete proteomes exist for codon usage, amino acid biases and essential or core genes. Genome-wide shifts in amino acid usage between free-living and pathogenic microorganisms result in fundamental differences in the complexity of their respective proteomes that are independent of size and gene content. These differences are evident across broad phylogenetic groups and result from environmental factors and population genetic forces rather than phylogenetic distance. A novel comparative analysis of amino acid usage, utilizing linguistic analyses of word frequency in language and text, identified a global pattern of higher peptide word repetition in 376 free-living versus 421 pathogen genomes across broad ranges of genome size, G+C content and phylogenetic ancestry. This imprint of repetitive word usage indicates free-living microorganisms have a bias for repetitive sequence usage compared to pathogens. These findings quantify fundamental differences in microbial genomes relative to life-history function.

5.
During the course of evolution, amino acid shifts might have resulted in mitochondrial proteomes better endowed to resist oxidative stress. However, owing to the problem of distinguishing between functional constraints/adaptations in protein sequences and mutation-driven biases in the composition of these sequences, the adaptive value of such amino acid shifts remains under discussion. We have analyzed the coding sequences of mtDNA from 173 mammalian species, dissecting the effect of nucleotide composition on amino acid usages. We found remarkable cysteine avoidance in mtDNA-encoded proteins. However, no effect of longevity on cysteine content could be detected. On the other hand, nucleotide compositional shifts fully accounted for threonine usages. In spite of a strong effect of mutational bias on methionine abundances, our results suggest a role of selection in determining the composition of methionine. Whether this selective effect is linked or not to protection against oxidative stress is still a subject of debate.

6.
In this study, the relative synonymous codon and amino acid usage biases of the broad-host range phage, KVP40, were investigated in an attempt to understand the structure and function of its proteins/protein-coding genes, as well as the role of its tRNAs. Synonymous codons in KVP40 were determined to be AT-rich at the third codon positions, and their variations are dictated principally by both mutational bias and translational selection. Further analysis revealed that the relative synonymous codon usage (RSCU) of KVP40 is distinct from that of its Vibrio hosts, V. cholerae and V. parahaemolyticus. Interestingly, the expression of the putative highly expressed genes of KVP40 appears to be preferentially influenced by the abundant host tRNA species, whereas the tRNAs expressed by KVP40 may be required for the efficient synthesis of all its proteins in a diverse array of hosts. The data generated in this study also revealed that KVP40 proteins are rich in low molecular weight amino acid residues, and that these variations are influenced primarily by hydropathy, mean molecular weight, aromaticity, and cysteine content.
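
RSCU is commonly defined as the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The sketch below computes it for a single in-frame coding sequence using the standard genetic code; the function name `rscu` and the toy sequence are assumptions for illustration, and the abstract does not say which software the authors used.

```python
from collections import Counter, defaultdict

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AAS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AAS[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}

def rscu(cds):
    """Return {codon: RSCU} for amino acids that occur in the in-frame sequence."""
    cds = cds.upper()
    counts = Counter(cds[i:i + 3] for i in range(0, len(cds) - 2, 3))
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa != "*":  # stop codons are left out
            families[aa].append(codon)
    values = {}
    for aa, codons in families.items():
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue  # amino acid absent, RSCU undefined
        for c in codons:
            values[c] = counts[c] * len(codons) / total
    return values

# Toy sequence: AAA AAA AAG GAA GAA GAG (Lys x3, Glu x3).
values = rscu("AAAAAAAAGGAAGAAGAG")
```

A value above 1 means the codon is used more often than expected under equal synonymous usage. In the toy sequence the A-ending codons AAA and GAA score 4/3, which is the kind of third-position AT preference the abstract describes.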

7.
Metabolic efficiency, as a selective force shaping proteomes, has been shown to exist in Escherichia coli and Bacillus subtilis and in a small number of organisms with photoautotrophic and thermophilic lifestyles. Earlier attempts at larger-scale analyses have utilized proxies (such as molecular weight) for biosynthetic cost, and did not consider lifestyle or auxotrophy. This study extends the analysis to all currently sequenced microbial organisms that are amenable to these analyses while utilizing lifestyle specific amino acid biosynthesis pathways (where possible) to determine protein production costs and compensating for auxotrophy. The tendency for highly expressed proteins (with adherence to codon usage bias as a proxy for expressivity) to utilize less biosynthetically expensive amino acids is taken as evidence of cost selection. A comprehensive analysis of sequenced genomes to identify those that exhibit strong translational efficiency bias (389 out of 1,700 sequenced organisms) is also presented.

8.
9.
Wu S, Wan P, Li J, Li D, Zhu Y, He F. Proteomics 2006, 6(2):449-455
Multi-modality of pI distribution is a common feature of whole proteomes. Some researchers consider it to be related to proteins with different subcellular locations, and hence a result of natural selection. We explored the pI distribution of predicted proteomes (including animals, plants, bacteria and archaea) and of a random proteome [random protein sequences constructed according to the specific amino acid composition and molecular weight (MW) distribution of the human predicted proteome]. Our results suggest that the multi-modality is the result of discrete pK(R) values for different amino acids. Amino acid composition and MW distribution of a proteome also contribute to the specific pI distribution. Although protein subcellular location was related to pI value, our analyses revealed that, compared with the random proteome, neither the multi-modality phenomenon nor the distribution bias of pI values is caused by subcellular location. It seems that the multi-modal distribution is just a mathematical curiosity. The blank region near the neutral pI was caused by the absence of amino acids with neutral pK(R), and suggests that the selection of amino acids with ionizable side chains might have been restricted by the requirement for a special pH environment during the origin of life. From this point of view, the special distribution was the result of natural selection.
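
To make the role of discrete pK(R) values concrete, here is a minimal sketch that estimates pI as the pH at which the Henderson-Hasselbalch net charge of a sequence crosses zero. The pK values below are approximate, illustrative numbers chosen for this example (published scales differ, and the abstract does not state which one was used), and the function names are hypothetical.

```python
# Approximate, illustrative pK values (assumption: published scales differ).
PK_POS = {"Nterm": 8.6, "K": 10.8, "R": 12.5, "H": 6.5}
PK_NEG = {"Cterm": 3.6, "D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}

def net_charge(seq, ph):
    """Henderson-Hasselbalch net charge of an unmodified peptide at a given pH."""
    pos = 1 / (1 + 10 ** (ph - PK_POS["Nterm"]))
    neg = 1 / (1 + 10 ** (PK_NEG["Cterm"] - ph))
    for aa in seq:
        if aa in PK_POS:
            pos += 1 / (1 + 10 ** (ph - PK_POS[aa]))
        elif aa in PK_NEG:
            neg += 1 / (1 + 10 ** (PK_NEG[aa] - ph))
    return pos - neg

def isoelectric_point(seq, tol=1e-4):
    """Bisection on pH in [0, 14]; net charge decreases monotonically with pH."""
    lo, hi = 0.0, 14.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if net_charge(seq, mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

acidic_pi = isoelectric_point("DDDD")
basic_pi = isoelectric_point("KKKK")
```

Because none of the side-chain pK values in such a table sits close to 7, few sequences can balance their charge near neutral pH, which is one way to picture the sparse region around neutral pI described above.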

10.
Analysis of 15 complete bacterial chromosomes revealed important biases in gene organization. Strong compositional asymmetries between the genes lying on the leading versus lagging strands were observed at the level of nucleotides, codons and, surprisingly, amino acids. For some species, the bias is so high that the sole knowledge of a protein sequence allows one to predict with almost no errors whether the gene is transcribed from one strand or the other. Furthermore, we show that these biases are not species specific but appear to be universal. These findings may have important consequences in our understanding of fundamental biological processes in bacteria, such as replication fidelity, codon usage in genes and even amino acid usage in proteins.

11.
Adenine nucleotides have been found to appear preferentially in the regions after the initiation codons or before the termination codons of bacterial genes. Our previous experiments showed that AAA and AAT, the two most frequent second codons in Escherichia coli, significantly enhance translation efficiency. To determine whether such a characteristic feature of base frequencies exists in eukaryote genes, we performed a comparative analysis of the base biases at the gene terminal portions using the proteomes of seven eukaryotes. Here we show that the base appearance at the codon third positions of gene terminal regions is highly biased in eukaryote genomes, although the codon third positions are almost free from amino acid preference. The bias changes depending on its position in a gene, and is characteristic of each species. We also found that the bias is most pronounced at the second codon, the codon after the initiation codon. NCN is preferred in every genome; in particular, GCG is strongly favored in human and plant genes. The presence of the bias implies that the base sequences at the second codon affect translation efficiency in eukaryotes as well as bacteria.

12.
All amino acid sequences derived from 248 prokaryotic genomes, 10 invertebrate genomes (plants and fungi) and 10 vertebrate genomes were analysed by the autocorrelation function of charge sequences. The analysis of the total amino acid sequences derived from the 268 biological genomes showed that a significant periodicity of 28 residues is observable for the vertebrate genomes, but not for the other genomes. When proteins with a charge periodicity of 28 residues (PCP28) were selected from the total proteomes, we found that PCP28 in fact exists in all proteomes, but the number of PCP28 is much larger for the vertebrate proteomes than for the other proteomes. Although excess PCP28 in the vertebrate proteomes are only poorly characterized, a detailed inspection of the databases suggests that most excess PCP28 are nuclear proteins.
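
The abstract does not give the exact form of the autocorrelation function, so the sketch below assumes a simple one: Lys and Arg are coded as +1, Asp and Glu as -1, all other residues as 0, and the value at each lag is the mean product of charges that far apart. The function name, the charge coding, and the synthetic test sequence are assumptions for illustration.

```python
def charge_autocorrelation(seq, max_lag=60):
    """Return {lag: mean of c[i] * c[i + lag]} for a +1/-1/0 charge coding."""
    charge = {"K": 1, "R": 1, "D": -1, "E": -1}
    c = [charge.get(aa, 0) for aa in seq]
    n = len(c)
    acf = {}
    for lag in range(1, min(max_lag, n - 1) + 1):
        products = (c[i] * c[i + lag] for i in range(n - lag))
        acf[lag] = sum(products) / (n - lag)
    return acf

# Synthetic sequence with a 28-residue charge period (not a real protein).
unit = "K" * 7 + "A" * 7 + "E" * 7 + "A" * 7
acf = charge_autocorrelation(unit * 5)
```

For this synthetic sequence the function peaks at lag 28 and is most negative at lag 14, where positive blocks line up with negative ones. A positive peak at lag 28 in real sequences is the kind of signal the abstract refers to as PCP28.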

13.
The levels of cellular organization in living organisms are the results of a variety of selection pressures. We have investigated here the final outcome of this integrated selective process in proteins of the best known microbial models Escherichia coli, Bacillus subtilis, and Methanococcus jannaschii, supposed to have undergone separate evolution for more than 1 billion years. Using multivariate analysis methods, including correspondence analysis, we studied the overall amino acid composition of all proteins making a proteome. Starting from and further developing previous results that had pointed out some general forces driving the amino acid composition of the proteomes of these model bacteria, we explored the correlations existing between the structure and functions of the proteins forming a proteome and their amino acid composition. The electric charge of amino acids measured against hydrophobicity creates a highly homogeneous cluster, made exclusively of proteins that are core components of the cytoplasmic membrane of the cell (integral inner membrane proteins). A second bias is imposed by the G+C content of the genome, indicating that protein functions are so robust with respect to amino acid changes that they can accommodate a large shift in the nucleotide content of the genome. A remarkable role of aromatic amino acids was uncovered. Expressed orphan proteins are enriched in these residues, suggesting that they might participate in a process of gain of function during evolution.

14.
Archaea, bacteria and eukaryotes represent the main kingdoms of life. Is there any trend for amino acid compositions of proteins found in full genomes of species of different kingdoms? What is the percentage of totally unstructured proteins in various proteomes? We obtained amino acid frequencies for different taxa using 195 known proteomes and all annotated sequences from the Swiss-Prot database. Investigation of the two databases (proteomes and Swiss-Prot) shows that the amino acid compositions of proteins differ substantially for different kingdoms of life, and this difference is larger between different proteomes than between different kingdoms of life. Our data demonstrate that there is a surprisingly small selection for the amino acid composition of proteins for higher organisms (eukaryotes) and their viruses in comparison with the "random" frequency following from a uniform usage of codons of the universal genetic code. On the contrary, lower organisms (bacteria and especially archaea) demonstrate an enhanced selection of amino acids. Moreover, according to our estimates, 12%, 3% and 2% of the proteins in eukaryotic, bacterial and archaeal proteomes are totally disordered, and long (> 41 residues) disordered segments are found to occur in 16% of archaeal, 20% of eubacterial and 43% of eukaryotic proteins for 19 archaeal, 159 bacterial and 17 eukaryotic proteomes, respectively. A comparison of the amino acid compositions of proteins of various taxa shows that the highest correlation is observed between eukaryotes and their viruses (the correlation coefficient is 0.98), and bacteria and their viruses (the correlation coefficient is 0.96), while the correlation between eukaryotes and archaea is only 0.85.
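
The "random" frequency mentioned in this abstract can be read as the amino acid frequency expected if all sense codons of the standard genetic code were used equally, so that each amino acid's share is its number of codons divided by 61. The sketch below computes that baseline; the function name is an assumption, and the abstract does not spell out the authors' exact procedure.

```python
from collections import Counter

# Standard genetic code written as one amino acid letter per codon
# (codons ordered with bases T, C, A, G at each position; '*' marks stops).
AAS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

def expected_aa_frequencies():
    """Amino acid frequencies under uniform usage of the 61 sense codons."""
    counts = Counter(aa for aa in AAS if aa != "*")
    n_sense = sum(counts.values())
    return {aa: n / n_sense for aa, n in counts.items()}

freqs = expected_aa_frequencies()
```

Under this baseline Leu, Ser and Arg each get 6/61 and Met and Trp each get 1/61. Observed proteome frequencies can then be compared against `freqs`, for example with a correlation coefficient.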

15.
The genetic code has an inherent bias towards some amino acids because of the variable number of synonymous codons per amino acid. The extent to which these biases are expressed in protein secondary structure is described through the analysis of the overall amino acid compositions of the alpha-helix, beta-sheet, beta-turn and random coil segments elucidated by X-ray crystallography. Given the concept of neutral mutation in proteins, the allocation of synonyms in the genetic code appears to protect secondary structures from amino acid changes and discourages the appearance of chemically complex residues. The level of protection is similar for each structural form, despite their clear preferences for certain amino acids. The organization of the code is therefore relevant to the preservation of conformation seen in the evolution of many protein families.

16.
Malaria parasites (species of the genus Plasmodium) harbor a relict chloroplast (the apicoplast) that is the target of novel antimalarials. Numerous nuclear-encoded proteins are translocated into the apicoplast courtesy of a bipartite N-terminal extension. The first component of the bipartite leader resembles a standard signal peptide present at the N-terminus of secreted proteins that enter the endomembrane system. Analysis of the second portion of the bipartite leaders of P. falciparum, the so-called transit peptide, indicates similarities to plant transit peptides, although the amino acid composition of P. falciparum transit peptides shows a strong bias, which we rationalize by the extraordinarily high AT content of P. falciparum DNA. 786 plastid transit peptides were also examined from several other apicomplexan parasites, as well as from angiosperm plants. In each case, amino acid biases were correlated with nucleotide AT content. A comparison of a spectrum of organisms containing primary and secondary plastids also revealed features unique to secondary plastid transit peptides. These unusual features are explained in the context of secondary plastid trafficking via the endomembrane system.

17.
Recent studies across animal phyla have suggested a possible link between amino acid compositional shifts and adaptive evolution across mitochondrial proteomes enabling longer lifespans. These studies examined associations of a gradual loss of cysteine (Cys) residues, increased usage of methionine (Met), and increased usage of threonine (Thr), with the evolution of longevity. Here, we examine all three hypotheses in a framework that considers nucleotide composition. We find that nucleotide composition is strongly correlated across codon positions, and with the above amino acid frequency patterns. We also find that the ND6 gene, which in vertebrates is the only mitochondrial gene situated on the “light-strand”, shows no significant pattern for any of the amino acid associations. We also reasoned that if the mitochondrially-encoded proteins of oxidative phosphorylation (OXPHOS) were under selection for such shifts, then nuclear-encoded components should also reflect such pressure. However, we found non-correspondence of these patterns in the nuclear genes when compared to the mitochondrial genes previously associated with positive selection. These results are strongly suggestive of mutational bias, or less efficient purifying selection, as the primary driver of whole proteome shifts in amino acid composition.

18.
An exhaustive statistical analysis of the amino acid sequences at the carboxyl (C) and amino (N) termini of proteins and of coding nucleic acid sequences at the 5' side of the stop codons was undertaken. At the N ends, Met and Ala residues are over-represented at the first (+1) position whereas at positions 2 and 5 Thr is preferred. These peculiarities at N-termini are most probably related to the mechanism of initiation of translation (for Met) and to the mechanisms governing the life-span of proteins via regulation of their degradation (for Ala and Thr). We assume that the C-terminal bias facilitates fixation of the C ends on the protein globule by a preference for charged and Cys residues. The terminal biases, a novel feature of protein structure, have to be taken into account when molecular evolution, three-dimensional structure, initiation and termination of translation, protein folding and life-span are concerned. In addition, the bias of protein termini composition is an important feature which should be considered in protein engineering experiments.

19.
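Chen Hao, Zhu Sheng, Chen Liangbiao. Acta Genetica Sinica (《遗传学报》) 2005, 32(3):315-321
In the 1970s, Ohno proposed a theory of the origin of functional proteins, holding that periodic repetition of oligopeptide segments is one way in which proteins originated. Internal repeat segments are important in the evolution of protein sequences. We selected eight representative species from prokaryotes, archaea, and eukaryotes, designed a new method for extracting internal repeat segments from proteins, and used a matrix representation to visualize the types of repeat segments and the frequencies at which they occur, which preserves the sequence features of the repeats while allowing a global statistical description. The analysis shows that eukaryotes use simple repeat sequences at high frequency, that eubacteria also use them but at low frequency, and that archaea use almost none. Further study shows that, in all three groups, the biased use of amino acids in internal repeat segments is closely correlated with the amino acid usage frequencies of the proteome. The correlation coefficient is above 0.95 in eubacteria and archaea and slightly lower in eukaryotes. The heavy use of simple repeat segments in eukaryotic proteomes, together with the lower correlation in amino acid usage between repeats and proteome, suggests that rapid evolution of simple repeat sequences is a key factor behind the high complexity of eukaryotic proteomes.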

20.
Amino acid repeats, or homorepeats, are low complexity protein motifs consisting of tandem repetitions of a single amino acid. Their presence and relative number vary in different proteomes, and some studies have tried to address this variation, proteome by proteome. In this work, we present a full characterization of amino acid homorepeats across evolution. We studied the presence and differential usage of each possible homorepeat in proteomes from various taxonomic groups, using clusters of very similar proteins to eliminate redundancy. The position of each amino acid repeat within proteins, and the order of co‐occurring amino acid repeats were also addressed. As a result, we present evidence about the uneven evolution of homorepeats, as well as the functional implications of their relative position in proteins. We discuss some of these cases in their taxonomic context. Collectively, our results show evolutionary and positional signals that suggest that homorepeats have biological function, likely creating unspecific protein interactions or modulating specific interactions in a context dependent manner. In conclusion, our work supports the functional importance of homorepeats and establishes a basis for the study of other low complexity repeats. Proteins 2017; 85:709–719. © 2016 Wiley Periodicals, Inc.
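
Since a homorepeat is defined here as a tandem repetition of a single amino acid, a perfect homorepeat can be located with a back-referencing regular expression. The sketch below is illustrative only: the function name, the minimum length `min_len`, and the example sequence are assumptions, and the abstract does not state the length threshold the authors applied.

```python
import re

def find_homorepeats(seq, min_len=4):
    """Return (residue, start, length) for each run of >= min_len identical residues."""
    # One residue followed by at least (min_len - 1) copies of itself.
    pattern = re.compile(r"([A-Z])\1{%d,}" % (min_len - 1))
    return [(m.group(1), m.start(), m.end() - m.start())
            for m in pattern.finditer(seq.upper())]

# Hypothetical sequence with a polyA and a polyQ run.
repeats = find_homorepeats("MAAAAAGQQQQQQLKS")
```

Dividing each `start` by the sequence length gives a relative position, and the order of the tuples gives the order of co-occurring repeats, the two positional features the abstract says were examined.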
