Similar articles
 Found 20 similar articles (search time: 718 ms)
1.
Li W, Zou H, Tao M. Antonie van Leeuwenhoek, 2007, 92(4): 417-427
The mechanism of translation initiation is responsible for shaping the mRNA sequences downstream of the start codon. However, this region has not been systematically analyzed in prokaryotes. We used sequence logos and statistical methods to analyze the patterns of overrepresented sequences in this region for 125 species of bacteria and 23 species of archaea. The specific positions are compared to the first 33 amino acids in the proteins. At the 2nd amino acid position, Lys, Ser or Thr is highly overrepresented for 68% to 84% of the genomes examined and Ala is highly overrepresented for 57% of the genomes. Overrepresentation of Lys2 is negatively correlated with the G + C content and overrepresentation of Ser2 or Thr2 is positively correlated with the G + C content of genomes. Ile at the 4th to the 8th positions was found to be overrepresented for 91% of the genomes analyzed, and this seemed to be conserved for both bacteria and archaea. Organisms growing at high temperatures have a relatively low extent of nucleotide bias at the 5′ termini of open reading frames (ORFs). The extent of overrepresenting A and underrepresenting G at ORF 5′ termini is reduced in thermophiles and hyperthermophiles for both archaea and bacteria.

2.
The genomic and structural relationships of phycobiliproteins (PBPs) in different cyanobacterial species are determined by nucleotide as well as amino acid composition. The genomic GC constituents influence the amino acid variability and codon usage of particular subunits of PBPs. We analyzed 11 cyanobacterial species to explore the variation of amino acids and the causal relationship between GC constituents and codon usage. The study of GC content at the first, second and third codon positions showed relatively more amino acid variability at the G3 + C3 position than at the first and second positions. The GC-rich level of the encoded amino acids, whether G rich, C rich or both, correlates with codon variability and amino acid availability. Fluctuation in amino acids such as Arg, Ala, His, Asp, Gly, Leu and Glu in the α and β subunits was observed at the G1C1 position, whereas fluctuation in other amino acids such as Ser, Thr, Cys and Trp was observed at the G2C2 position. The coding selection pressure on amino acids such as Ala, Thr, Tyr, Asp, Gly, Ile, Leu, Asn and Ser in the α and β subunits of PBPs was more pronounced at the G3C3 position. In this study, we observed that each subunit of PBPs is codon specific for a particular amino acid. These results suggest that a genomic constraint linked with GC constituents selects the codon for particular amino acids; furthermore, codon-level study may be a novel approach to explore many problems associated with the genomics and proteomics of cyanobacteria.

3.
Sorimachi K, Okayasu T. Amino acids, 2008, 34(4): 661-668
When nucleotide (G, C, T and A) contents were plotted against each nucleotide, their relationships were clearly expressed by a linear formula, y = αx + β, in the coding and non-coding regions. This linear relationship was obtained from the complete single-stranded DNA. Similarly, nucleotide contents at all three codon positions were expressed by linear regression lines based on the content of each nucleotide. In addition, 64 codon usages were also expressed by linear formulas against nucleotide content. Thus, the nucleotide content not only in coding sequence but also in non-coding sequence can be expressed by a linear formula, y = αx + β, in 145 organisms (112 bacteria, 15 archaea and 18 eukaryotes). Based on these results, from the ratio C/T, G/T, C/A or G/A one can essentially estimate all four nucleotide contents in the complete single-stranded DNA, and the determination of any ratio of two kinds of nucleotides can essentially estimate the four nucleotide contents, the nucleotide contents at the three different codon positions and the codon distributions at 64 codons in the coding region. The maximum and minimum values of G content were ∼0.35 and ∼0.15, respectively, among the various organisms examined. Codon evolution occurs according to linear formulas between these two values.

4.
To explore how chemical structures of both nucleobases and amino acids may have played a role in shaping the genetic code, numbers of sp2 hybrid nitrogen atoms in nucleobases were taken as a determinative measure for empirical stereo-electronic property to analyze the genetic code. Results revealed that amino acid hydropathy correlates strongly with the sp2 nitrogen atom numbers in nucleobases rather than with the overall electronic property such as redox potentials of the bases, reflecting that stereo-electronic property of bases may play a role. In the rearranged code, five simple but stereo-structurally distinctive amino acids (Gly, Pro, Val, Thr and Ala) and their codon quartets form a crossed intersection “core”. Secondly, a re-categorization of the amino acids according to their β-carbon stereochemistry, verified by charge density (at β-carbon) calculation, results in five groups of stereo-structurally distinctive amino acids, the group leaders of which are Gly, Pro, Val, Thr and Ala, remarkably overlapping the above “core”. These two lines of independent observations provide empirical arguments for a contention that a seemingly “frozen” “core” could have formed at a certain evolutionary stage. The possible existence of this codon “core” is in conformity with a previous evolutionary model whereby stereochemical interactions may have shaped the code. Moreover, the genetic code listed in UCGA succession together with this codon “core” has recently facilitated an identification of the unprecedented icosikaioctagon symmetry and bi-pyramidal nature of the genetic code.

5.
Compositional distributions in the three codon positions of the coding sequences of 12 fully sequenced, publicly available prokaryotic genomes were investigated. A universal compositional correlation was observed in most of the genomes under investigation irrespective of their overall genomic GC contents. In all the genomes, the GC contents at the first codon positions are always greater than the overall GC contents of the genomes, whereas the reverse is true for the second codon positions. GC contents at the third codon positions are higher than the overall genomic GC contents in high-GC genomes, and the opposite situation was found in low-GC genomes except for Helicobacter pylori. In high-GC genomes, the GC contents at the first + second codon positions are less than the GC contents at the third codon positions, and they are low in low-GC genomes except for Helicobacter pylori. The distributions of the four bases at the three different positions were also investigated for all 12 organisms. It was observed that in high-GC genomes G is the most dominant base and in low-GC genomes A is the most dominant base in the first codon positions. Overall, purine bases, i.e., (A + G), predominantly occur in the first codon position. In the second codon position, A is the most dominant base in most of the organisms and G is the least dominant base in all the organisms. There is no unique regular pattern of individual bases at the third codon positions; however, there are significant differences in the (G + C) contents of the third codon positions among the different organisms. Calculations of dinucleotide frequencies in the 12 organisms indicate that in GC-rich genomes GG, GC, CC and CG dinucleotides are the most dominant, whereas the reverse is true in low-GC genomes. Biological implications of these results are discussed in this paper.
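The position-specific GC contents discussed in this abstract (often written GC1, GC2 and GC3) can be computed directly from an in-frame coding sequence. The following is a minimal sketch, not the authors' own procedure: the function name `gc_by_codon_position` is hypothetical, and the input is assumed to be a plain DNA string that starts at the first base of a codon.

```python
def gc_by_codon_position(cds):
    """Return (GC1, GC2, GC3) as fractions for an in-frame coding sequence.

    Assumption: `cds` starts at codon position 1; a trailing partial
    codon, if any, is ignored.
    """
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3  # drop an incomplete final codon
    if usable == 0:
        raise ValueError("sequence is shorter than one codon")
    fractions = []
    for pos in range(3):
        # Every third base starting at offset `pos` is one codon position.
        bases = seq[pos:usable:3]
        gc = sum(base in "GC" for base in bases)
        fractions.append(gc / len(bases))
    return tuple(fractions)
```

Comparing each returned value with the overall GC fraction of the same sequence gives the kind of per-position contrast the abstract describes.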

6.
The number of completely sequenced archaeal genomes is now sufficient for a large-scale bioinformatic study. We analyzed each coding region from 36 archaeal genomes using the original CGS algorithm, calculating the total GC content (G+C); the GC content in first, second and third codon positions, as well as in fourfold and twofold degenerate sites from third codon positions; levels of arginine codon usage (Arg2: AGA/G; Arg4: CGX); levels of amino acid usage; and the entropy of amino acid content distribution. In archaeal genomes with strong GC pressure, arginine is coded preferably by GC-rich Arg4 codons, whereas in most archaeal genomes with G+C < 0.6, arginine is coded preferably by AT-rich Arg2 codons. In the genome of Haloquadratum walsbyi, which is closely related to GC-rich archaea, GC content has decreased mostly in third codon positions, while the Arg4 > Arg2 bias still persists. Proteomes of archaeal species carry characteristic amino acid biases: levels of isoleucine and lysine are elevated, while levels of alanine, histidine, glutamine and cysteine are relatively decreased. Numerous observed genomic and proteomic biases can be explained by the hypothesis of a previously existing strong mutational AT pressure in the common predecessor of all archaea.

7.
Veitia RA. Genomics, 2004, 83(3): 502-507
A compositional analysis of a sample of 50 zebrafish proteins containing at least one alanine run and of their open reading frames (ORFs) has been performed. The sample of poly(Ala) proteins showed a tendency to have runs of other amino acids (His/H, Gln/Q, Ser/S, Pro/P). Their ORFs and the first and second codon positions had higher GC contents than a reference gene set. The "universal" correlation between the GC content of the first+second and third codon positions (GC1+2 vs GC3) does not hold, but I provide an explanation in terms of genomic heterogeneity. Significant correlation between AHQS content and GC3 was obtained, reflecting codon bias favoring G/C at the third codon position of these amino acids. A correspondence analysis (COA) of relative synonymous codon usage showed that the poly(Ala) proteins have a biased distribution according to the second axis of the COA, which correlates with gene expression in zebrafish. A comparison with human is undertaken.

8.
Genome-wide analysis of sequence divergence patterns in 12,024 human-mouse orthologous pairs reveals, for the first time, that the trends in nucleotide and amino acid substitutions in orthologs of high and low GC composition are highly asymmetric and polarized in opposite directions. The entire dataset has been divided into three groups on the basis of the GC content at third codon sites of human genes: high, medium, and low. High-GC orthologs exhibit significant bias in favor of the replacements Thr → Ala, Ser → Ala, Val → Ala, Lys → Arg, Asn → Ser, Ile → Val, etc., from mouse to human, whereas in low-GC orthologs the reverse trends prevail. In general, in the high-GC group, residues encoded by A/U-rich codons of mouse proteins tend to be replaced by residues encoded by relatively G/C-rich codons in their human orthologs, whereas the opposite trend is observed among the low-GC orthologous pairs. The medium-GC group shares some trends with the high-GC group and some with the low-GC group. The only significant trend common to all groups of orthologs, irrespective of their GC bias, is the Asp (mouse) → Glu (human) replacement. At the nucleotide level, high-GC orthologs have undergone a large excess of A/T (mouse) → G/C (human) substitutions over G/C (mouse) → A/T (human) substitutions at each codon position, whereas for low-GC orthologs the reverse is true.

9.
Codon usage patterns in 16 chromosomes coincided with each other in Saccharomyces cerevisiae, and the same result was obtained from Encephalitozoon cuniculi consisting of 11 chromosomes, although each chromosome function differs. In addition, preferential codon usage in the regenerated coding systems for Leu and Lys differed between Saccharomyces cerevisiae and Encephalitozoon cuniculi. These results cannot be explained by Darwin's natural selection theory or by the neutral theory proposed against Darwin's. Furthermore, the codon usage patterns were examined in both prokaryotes and eukaryotes. The use of G or C at the third codon position was much lower than T or A in Ureaplasma urealyticum, whereas inversely the use of G or C at the third codon position was much higher than T or A in Mycobacterium tuberculosis. Additionally, Candida albicans and Plasmodium falciparum also showed a very low usage of G or C at the third codon position. It is a difficult leap to speculate that the inverse codon usage change occurred over the genome during biological evolution. Thus, the present results strongly suggest that organisms were derived from different origins, indicating that the origin of life was plural, based on genomic structures.

10.
The relative contribution of mutation and selection to the G+C content of DNA was analyzed in bacterial species having widely different G+C contents. The analysis used two methods that were developed previously. The first method was to plot the average G+C content of a set of nucleotides against the G+C content of the third codon position for each gene. This method was used to present the G+C distribution of the third codon position and to assess the relative neutrality of a set of nucleotides to that of the G+C content of the third codon position. The second method was to plot the intrastrand bias of the third codon position from Parity Rule 2 (PR2), where A=T and G=C. It was found that whereas intragenomic distributions of the DNA G+C content of these bacteria are narrow in the majority of species, in some species the G+C content of the minor class of genes distributes over wider ranges than the major class of genes. On the other hand, ubiquitous PR2 biases are amino acid specific and independent of the G+C content of DNA, so that when averaged over the amino acids, the biases are small and not correlated with the DNA G+C content. Therefore, translation-coupled PR2 biases are unlikely to explain the wide range of G+C contents among different species. Considering all data available, it was concluded that the amino acid-specific PR2 bias has only a minor effect, if any, on the average G+C content. In addition, PR2 bias patterns of different species show phylogenetic relationships, and the pattern can be used as a taxonomic fingerprint. Received: 5 November 1998 / Accepted: 1 March 1999
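The second method in this abstract, the PR2 bias at the third codon position, is commonly summarized by two ratios, G3/(G3+C3) and A3/(A3+T3), both of which equal 0.5 when A=T and G=C hold within a strand. The sketch below is a simplified illustration rather than the published procedure: the function name `pr2_bias` is hypothetical, and it pools all third positions of one sequence instead of treating each amino acid separately.

```python
from collections import Counter


def pr2_bias(cds):
    """Return (G3/(G3+C3), A3/(A3+T3)) for an in-frame coding sequence.

    Under Parity Rule 2 both ratios are expected to be 0.5. A ratio is
    returned as None when its denominator is zero.
    Assumption: all third positions are pooled, which is a simplification.
    """
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3  # ignore an incomplete final codon
    counts = Counter(seq[2:usable:3])  # bases at third codon positions
    gc_total = counts["G"] + counts["C"]
    at_total = counts["A"] + counts["T"]
    gc_bias = counts["G"] / gc_total if gc_total else None
    at_bias = counts["A"] / at_total if at_total else None
    return gc_bias, at_bias
```

Plotting the first ratio on the horizontal axis and the second on the vertical axis for each gene places sequences that obey PR2 at the point (0.5, 0.5).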

11.
Tao N, Hu Z, Liu Q, Xu J, Cheng Y, Guo L, Guo W, Deng X. Plant cell reports, 2007, 26(6): 837-843
Citrus is an important fruit crop as regards accumulation of carotenoids. In plant carotenoid biosynthesis, the phytoene synthase gene (Psy) plays a key role in catalyzing the head-to-head condensation of geranylgeranyl diphosphate molecules to produce colorless phytoene. In the present paper, we report the determination of phytoene contents and the characterization of Psy during fruit ripening of “Washington” navel orange and its red-fleshed mutant “Cara Cara”. Results showed that phytoene was exclusively accumulated in the peel and pulp of “Cara Cara”. Although phytoene was observed accumulating with fruit ripening of “Cara Cara”, the contents in pulp were 10 times higher than those in peel. The two isolated Psy cDNAs were both 1520 bp in full length, encoding 436 deduced amino acid residues, and differed by one amino acid at position 412. Genomic hybridization results showed that one or two copies might be present in the “Cara Cara” and “Washington” genomes. During “Cara Cara” and “Washington” fruit coloration, expression of Psy was observed to be up-regulated, as revealed by tissue-specific profiles in the flavedo, albedo, segment membrane and juice sacs. However, Psy expression in the albedo of “Cara Cara” was higher than that in “Washington”, as evidenced by phytoene accumulation in the peel.

12.
Aquifex pyrophilus is one of the hyperthermophilic bacteria that can grow at temperatures up to 95°C. To obtain information about its genomic structure, random sequencing was performed on plasmid libraries containing 0.5–2 kb genomic DNA fragments of A. pyrophilus. Comparison of the obtained sequence tags with known proteins revealed that 123 tags showed strong similarity to previously identified proteins in the PIR or GenBank databases. These included three proteases, two amino acid racemases, and three enzymes utilizing oxygen as substrate. Although the GC ratio of the genome is about 40%, the codon usage of A. pyrophilus showed biased occurrence of G and C at the third position of codons, especially those for amino acids such as asparagine, aspartic acid, cysteine, glutamine, glutamic acid, histidine, lysine, and tyrosine. A higher ratio of positively charged amino acids in A. pyrophilus proteins as compared with proteins from mesophiles suggested that Aquifex proteins might contain increased ion-pair interaction that could help to maintain heat stability. Received: March 1, 1997 / Accepted: May 9, 1997

13.
The extent of codon usage in the protein-coding genes of the mycobacteriophage Bxz1 and its plating bacterium, M. smegmatis, was determined, and it was observed that codons ending with G and/or C were predominant in both organisms. Multivariate statistical analysis showed that in both organisms, the genes were separated along the first major explanatory axis according to their expression levels and their genomic GC content at the synonymous third positions of the codons. The second major explanatory axis differentiates the genes according to their genome type. A comparison of the relative synonymous codon usage between 20 highly and 20 lowly expressed genes from Bxz1 identified 21 codons that are statistically overrepresented in the former group of genes. Further analysis found that the Bxz1-specific tRNA species could recognize 13 of the 21 overrepresented synonymous codons, which incorporated 13 amino acid residues preferentially into the highly expressed proteins of Bxz1. In contrast, seven amino acid residues were preferentially incorporated into the lowly expressed proteins by 10 other tRNA species of Bxz1. This analysis predicts for the first time that the Bxz1-specific tRNA species modulate the optimal expression of its proteins during development.

14.
Codon usage in Clonorchis sinensis was analyzed using 12,515 codons from 38 coding sequences. Total GC content was 49.83%, and GC1, GC2 and GC3 contents were 56.32%, 43.15% and 50.00%, respectively. The effective number of codons converged at 51-53 codons. When plotted against total GC content or GC3, codon usage was distributed in relation to GC3 biases. Relative synonymous codon usage for each codon revealed a single major trend, which was highly correlated with GC content at the third position when codons began with A or U at the first two positions. In codons beginning with G or C base at the first two positions, the G or C base rarely occurred at the third position. These results suggest that codon usage is shaped by a bias towards G or C at the third base, and that this is affected by the first and second bases.
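Relative synonymous codon usage (RSCU), mentioned in this abstract, is conventionally the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The sketch below illustrates that definition under stated assumptions: the names `CODON_TABLE` and `rscu` are hypothetical, the standard genetic code is assumed, and stop codons are excluded.

```python
from collections import Counter, defaultdict

# Standard genetic code, built in TCAG order (assumed to apply here).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO))


def rscu(cds):
    """Return {codon: RSCU} for amino acids that occur in `cds`.

    RSCU = observed count * number of synonymous codons / total count
    of the codon family. Families with no occurrences are omitted.
    """
    seq = cds.upper().replace("U", "T")
    counts = Counter(seq[i:i + 3] for i in range(0, len(seq) - 2, 3))
    families = defaultdict(list)
    for codon, amino_acid in CODON_TABLE.items():
        if amino_acid != "*":  # skip stop codons
            families[amino_acid].append(codon)
    values = {}
    for family in families.values():
        total = sum(counts[codon] for codon in family)
        if total == 0:
            continue
        for codon in family:
            values[codon] = counts[codon] * len(family) / total
    return values
```

A value of 1.0 means a codon is used as often as expected under equal usage; values above or below 1.0 indicate preferred or avoided codons.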

15.
Sueoka N, Kawanishi Y. Gene, 2000, 261(1): 53-62
The human genome, as in other eukaryotes, has a wide heterogeneity in the DNA base composition. The evolutionary basis for this heterogeneity has been unknown. A previous study of the human genome (846 genes analyzed) has shown that, in the major range of the G+C content in the third codon position (0.25-0.75), biases from the Parity Rule 2 (PR2) among the synonymous codons of the four-codon amino acids are similar except in the highest G+C range (Sueoka, N., 1999. Translation-coupled violation of Parity Rule 2 in human genes is not the cause of heterogeneity of the DNA G+C content of third codon position. Gene 238, 53-58.). PR2 is an intra-strand rule where A=T and G=C are expected when there are no biases between the two complementary strands of DNA in mutation and selection rates (substitution rates). In this study, 14,026 human genes were analyzed. In addition, the third codon positions of two-codon amino acids were analyzed. New results show the following: (a) The G+C contents of the third codon position of human genes are scattered in the G+C range of 0.22-0.96 in the third codon position. (b) The PR2 biases are similar in the range of 0.25-0.75, whereas, in the high G+C range (0.75-0.96; 13% of the genes), the PR2-bias fingerprints are different from those of the major range. (c) Unlike the PR2 biases, the G+C contents of the third codon position for both four-codon and two-codon amino acids are all correlated almost perfectly with the G+C content of the third codon position over the total G+C ranges. These results support the notion that the directional mutation pressure, rather than the directional selection pressure, is mainly responsible for the heterogeneity of the G+C content of the third codon position.

16.
A very powerful method for detecting functional constraints operative in biological macromolecules is presented. This method entails performing a base permanence analysis of protein-coding genes at each codon position simultaneously in different species. It calculates the degree of permanence of subregions of the gene by dividing it into segments c codons long and counting how many sites remain unchanged in each segment among all species compared. By comparing the base permanence among several sequences with the expectations based on a stochastic evolutionary process, gene regions showing different degrees of conservation can be selected. This means that wherever the permanence deviates significantly from the expected value generated by the simulation, the corresponding regions are considered “constrained” or “hypervariable”. The constrained regions are of two types: α and β. The α regions result from constraints at the amino acid level, whereas the β regions are those probably involved in “control” processing. The method has been applied to mitochondrial genes coding for subunit 6 of the ATPase and subunit 1 of the cytochrome oxidase in four mammalian species: human, rat, mouse, and cow. In the two mitochondrial genes a few regions that are highly conserved in all codon positions have been identified. Among these regions, a sequence common to both genes that is complementary to a strongly conserved region of 12S rRNA has been found. This method can also be of great help in studying molecular evolution mechanisms.
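The counting step described in this abstract (segments of c codons, unchanged sites tallied per codon position across species) can be sketched as follows. This is only an illustration of the counting, under assumptions not stated in the source: the function name `base_permanence` is hypothetical, the input is an ungapped, in-frame alignment of equal-length sequences, and the comparison against a simulated stochastic expectation is left out.

```python
def base_permanence(aligned, c):
    """Count invariant sites per codon position in segments of c codons.

    `aligned` is a list of equal-length, in-frame, ungapped sequences
    (one per species). Returns one (pos1, pos2, pos3) tuple per segment,
    where each entry is the number of sites identical in all sequences.
    """
    if c < 1:
        raise ValueError("segment length c must be at least one codon")
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must have equal length")
    n_codons = length // 3
    result = []
    for start in range(0, n_codons, c):
        counts = [0, 0, 0]
        for codon in range(start, min(start + c, n_codons)):
            for pos in range(3):
                # A site is "permanent" if every species has the same base.
                column = {seq[3 * codon + pos].upper() for seq in aligned}
                if len(column) == 1:
                    counts[pos] += 1
        result.append(tuple(counts))
    return result
```

Segments whose counts are much higher or lower than a chosen null expectation would then correspond to the "constrained" or "hypervariable" regions the abstract refers to.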

17.
Codon usage bias (CUB) is an omnipresent phenomenon, which occurs in nearly all organisms. Previous studies of codon bias in Plasmodium species were based on a limited dataset. This study uses whole-genome datasets for comparative genome analysis of six Plasmodium species using CUB and other related methods for the first time. Codon usage bias, compositional variation in translated amino acid frequency, effective number of codons and optimal codons are analyzed for P. falciparum, P. vivax, P. knowlesi, P. berghei, P. chabaudi and P. yoelii. A plot of effective number of codons versus GC3 shows that their differential codon usage pattern arises from a combination of mutational and translational selection pressure. The increased relative usage of adenine- and thymine-ending optimal codons in highly expressed genes of P. falciparum is the result of higher composition-biased pressure, and usage of guanine and cytosine bases at the third codon position can be explained by translational selection pressure acting on them. Higher usage of adenine and thymine bases at the third codon position in optimal codons of P. vivax, by contrast, highlights the role of translational selection pressure, apart from composition-biased mutation pressure, in shaping their codon usage pattern. The frequency of amino acids encoded by AT-ending codons is significantly higher in P. falciparum than in other Plasmodium species, owing to high composition-biased mutational pressure. The CUB variation in the three rodent parasites, P. berghei, P. chabaudi and P. yoelii, is strikingly similar to that of P. falciparum. The simian and human malarial parasite P. knowlesi shows a variation in codon usage bias similar to P. vivax, but on closer study there are differences, confirmed by Principal Component Analysis (PCA).

Abbreviations

CDS - Coding sequences, GC1 - GC composition at first site of codon, GC2 - GC composition at second site of codon, GC3 - GC composition at third site of codon, Ala - Alanine, Arg - Arginine, Asn - Asparagine, Asp - Aspartic acid, Cys - Cysteine, Gln - Glutamine, Glu - Glutamic acid, Gly - Glycine, His - Histidine, Ile - Isoleucine, Leu - Leucine, Lys - Lysine, Met - Methionine, Phe - Phenylalanine, Pro - Proline, Ser - Serine, Thr - Threonine, Trp - Tryptophan, Tyr - Tyrosine, Val - Valine.

18.
The thermoacidophilic microbial community inhabiting the groundwater with pH 4.0 and temperature 50°C at the East Thermal Field of Uzon Caldera, Kamchatka, was examined using pyrosequencing of the V3 region of the 16S rRNA gene. Bacteria comprise about 30% of microorganisms and are represented primarily by aerobic lithoautotrophs using energy sources of volcanic origin: thermoacidophilic methanotrophs of the phylum Verrucomicrobia and Acidithiobacillus spp., which oxidise metals and reduced sulfur compounds. More than 70% of the microbial population in this habitat was represented by archaea, mostly affiliated with “uncultured” lineages. The most numerous group (39% of all archaea) represented a novel division in the phylum Euryarchaeota related to the order Thermoplasmatales. Another abundant group (33% of all archaea) was related to the MCG1 lineage of the phylum Crenarchaeota, originally detected in a Yellowstone hot spring as the environmental clone pJP89. The organisms belonging to these two groups are widespread in hydrothermal environments worldwide. These data indicate an important environmental role of these two archaeal groups and should stimulate the investigation of their metabolism by cultivation or metagenomic approaches.

19.
The Bacillus subtilis phage φ29 DNA polymerase, involved in protein-primed viral DNA replication, contains several amino acid consensus sequences common to other eukaryotic-type DNA polymerases. Using site-directed mutagenesis, we have studied the functional significance of a C-terminal conserved region, represented by the Lys-X-Tyr (“K-Y”) motif. Single point mutants have been constructed and the corresponding proteins have been overproduced and characterized. Measurements of the activity of the mutant proteins indicated that the invariant Lys and Tyr residues play a critical role in DNA polymerization. Interestingly, substitution of the invariant Lys either by Arg or Thr, produced enzymes with an increased or a largely reduced, respectively, capability to use a protein as primer, an intrinsic property of TP-priming DNA polymerases. On the other hand, the viral protein p6, which stimulates initiation of φ29 DNA replication by formation of a nucleoprotein complex at both DNA replication origins, increased (about 5-fold) the insertion fidelity of φ29 DNA polymerase during the formation of the TP-dAMP initiation complex. We propose a model in which the special strategy to maintain the integrity of the φ29 DNA ends, by means of a “sliding-back” mechanism, could also contribute to increase the fidelity of φ29 DNA replication.

20.
Kothe U, Rodnina MV. Molecular cell, 2007, 25(1): 167-174
tRNAs reading four-codon families often have a modified uridine, cmo(5)U(34), at the wobble position of the anticodon. Here, we examine the effects on the decoding mechanism of a cmo(5)U modification in tRNA(1B)(Ala), anticodon C(36)G(35)cmo(5)U(34). tRNA(1B)(Ala) reads its cognate codons in a manner that is very similar to that of tRNA(Phe). As Ala codons are GC rich and Phe codons AU rich, this similarity suggests a uniform decoding mechanism that is independent of the GC content of the codon-anticodon duplex or the identity of the tRNA. The presence of cmo(5)U at the wobble position of tRNA(1B)(Ala) permits fairly efficient reading of non-Watson-Crick and nonwobble bases in the third codon position, e.g., the GCC codon. The ribosome accepts the C-cmo(5)U pair as an almost-correct base pair, unlike third-position mismatches, which lead to the incorporation of incorrect amino acids and are efficiently rejected.
