Similar articles
Found 20 similar articles; search took 15 ms
1.
2.
3.
An intact gene for the ribosomal protein S19 (rps19) is absent from Oenothera mitochondria. The conserved rps19 reading frame found in the mitochondrial genome is interrupted by a termination codon. This rps19 pseudogene is cotranscribed with the downstream rps3 gene and is edited on both sides of the translational stop. Editing, however, changes the amino acid sequence at positions that were well conserved before editing. Other strange editings create translational stops in open reading frames coding for functional proteins. In coxI and rps3 mRNAs CGA codons are edited to UGA stop codons only five and three codons, respectively, downstream of the initiation codon. These aberrant editings in essential open reading frames and in the rps19 pseudogene appear to have been shifted to these positions from other editing sites. These observations suggest a requirement for a continuous evolutionary constraint on the editing specificities in plant mitochondria.

4.
A correspondence between open reading frames in sense and antisense strands is expected from the hypothesis that the prototypic triplet code was of general form RNY, where R is a purine base, N is any base, and Y is a pyrimidine. A deficit of stop codons in the antisense strand (and thus long open reading frames) is predicted for organisms with high G + C percentages; however, two bacteria (Azotobacter vinelandii, Rhodobacter capsulatum) have larger average antisense strand open reading frames than predicted from (G + C)%. The similar codon frequencies found in sense and antisense strands can be attributed to the wide distribution of inverted repeats (stem-loop potential) in natural DNA sequences.
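The link between (G + C)% and stop-codon frequency mentioned in this abstract can be illustrated with a simple null model. The sketch below is not the authors' method: it assumes independent bases with equal G and C frequencies and equal A and T frequencies, and the function names are hypothetical.

```python
def stop_codon_probability(gc: float) -> float:
    """Probability that a random triplet is TAA, TAG or TGA.

    Assumes independent bases with P(G) = P(C) = gc / 2 and
    P(A) = P(T) = (1 - gc) / 2 (an illustrative null model only).
    """
    if not 0.0 <= gc <= 1.0:
        raise ValueError("gc must be a fraction between 0 and 1")
    at_base = (1.0 - gc) / 2.0  # probability of A, and likewise of T
    gc_base = gc / 2.0          # probability of G, and likewise of C
    p_taa = at_base ** 3
    p_tag = at_base * at_base * gc_base
    p_tga = at_base * gc_base * at_base
    return p_taa + p_tag + p_tga


def expected_orf_codons(gc: float) -> float:
    """Mean number of triplets read up to and including the first stop.

    Uses the mean of a geometric distribution, 1 / p.
    """
    p = stop_codon_probability(gc)
    if p == 0.0:
        return float("inf")
    return 1.0 / p
```

Under this model a sequence with 50% G + C has a stop probability of 3/64 per triplet, and the probability falls as G + C rises, which is the direction of the deficit the abstract describes.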

5.
BACKGROUND: The composition and sequence of amino acids in a protein may serve the underlying needs of the nucleic acids that encode the protein (the genome phenotype). In extreme form, amino acids become mere placeholders inserted between functional segments or domains, and--apart from increasing protein length--playing no role in the specific function or structure of a protein (the conventional phenotype). METHODS: We studied the genomes of two malarial parasites and 521 prokaryotes (144 complete) that differ widely in GC% and optimum growth temperature, comparing the base compositions of the protein coding regions and corresponding lengths (kilobases). RESULTS: Malarial parasites show distinctive responses to base-compositional pressures that increase as protein lengths increase. A low-GC% species (Plasmodium falciparum) is likely to have more placeholder amino acids than an intermediate-GC% species (P. vivax), so that homologous proteins are longer. In prokaryotes, GC% is generally greater and AG% is generally less in open reading frames (ORFs) encoding long proteins. The increased GC% in long ORFs increases as species' GC% increases, and decreases as species' AG% increases. In low- and intermediate-GC% prokaryotic species, increases in ORF GC% as encoded proteins increase in length are largely accounted for by the base compositions of first and second (amino acid-determining) codon positions. In high-GC% prokaryotic species, first and third (non-amino acid-determining) codon positions play this role. CONCLUSION: In low- and intermediate-GC% prokaryotes, placeholder amino acids are likely to be well defined, corresponding to codons enriched in G and/or C at first and second positions. In high-GC% prokaryotes, placeholder amino acids are likely to be less well defined. 
Increases in ORF GC% as encoded proteins increase in length are greater in mesophiles than in thermophiles, which are constrained from increasing protein lengths in response to base-composition pressures.

6.
An algebraic and geometrical approach is used to describe the primaeval RNA code and a proposed Extended RNA code. The former consists of all codons of the type RNY, where R means purines, Y pyrimidines, and N any of them. The latter comprises the 16 codons of the type RNY plus codons obtained by considering the RNA code but in the second (NYR type) and third (YRN type) reading frames. In each of these reading frames, there are 16 triplets that altogether complete a set of 48 triplets, which specify 17 out of the 20 amino acids, including AUG, the start codon, and the three known stop codons. The other 16 codons do not pertain to the Extended RNA code and constitute the union of the triplets YYY and RRR, which we define as the RNA-less code. The codons in each of the three subsets of the Extended RNA code are represented by a four-dimensional hypercube and the set of codons of the RNA-less code is portrayed as a four-dimensional hyperprism. Remarkably, the union of these four symmetrical pairwise disjoint sets comprises precisely the already known six-dimensional hypercube of the Standard Genetic Code (SGC) of 64 triplets. These results suggest a plausible evolutionary path from which the primaeval RNA code could have originated the SGC, via the Extended RNA code plus the RNA-less code. We argue that the life forms that probably obeyed the Extended RNA code were intermediate between the ribo-organisms of the RNA World and the last common ancestor (LCA) of the Prokaryotes, Archaea, and Eucarya, that is, the cenancestor. A general encoding function, E, which maps each codon to its corresponding amino acid or the stop signal is also derived. In 45 out of the 64 cases, this function takes the form of a linear transformation F, which projects the whole six-dimensional hypercube onto a four-dimensional hyperface conformed by all triplets that end in cytosine.
In the remaining 19 cases the function E adopts the form of an affine transformation, i.e., the composition of F with a particular translation. Graphical representations of the four local encoding functions and of E are illustrated and discussed. For every amino acid and for the stop signal, a single triplet, among those that specify it, is selected as a canonical representative. From this mapping a graphical representation of the 20 amino acids and the stop signal is also derived. We conclude that the general encoding function E represents the SGC itself.
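The codon counts in this abstract (16 + 16 + 16 triplets in the Extended RNA code, plus 16 RRR and YYY triplets) can be checked by direct enumeration. The following is a minimal sketch, assuming only the class definitions stated in the abstract; the names `CLASSES` and `partition_codons` are hypothetical.

```python
from itertools import product

PURINES = set("AG")      # R
PYRIMIDINES = set("CU")  # Y

# One membership test per codon class named in the abstract.
CLASSES = {
    "RNY": lambda a, b, c: a in PURINES and c in PYRIMIDINES,
    "NYR": lambda a, b, c: b in PYRIMIDINES and c in PURINES,
    "YRN": lambda a, b, c: a in PYRIMIDINES and b in PURINES,
    "RRR": lambda a, b, c: {a, b, c} <= PURINES,
    "YYY": lambda a, b, c: {a, b, c} <= PYRIMIDINES,
}


def partition_codons():
    """Assign each of the 64 RNA triplets to the classes it satisfies."""
    groups = {name: [] for name in CLASSES}
    for codon in product("ACGU", repeat=3):
        matches = [name for name, test in CLASSES.items() if test(*codon)]
        # The sets are pairwise disjoint and exhaustive, so exactly one matches.
        if len(matches) != 1:
            raise AssertionError(f"{''.join(codon)} matched {matches}")
        groups[matches[0]].append("".join(codon))
    return groups


if __name__ == "__main__":
    for name, codons in partition_codons().items():
        print(name, len(codons))
```

The enumeration places AUG in the NYR class and the three stop codons (UAA, UAG, UGA) in the YRN class, consistent with the statement that the Extended RNA code contains the start codon and all three stops.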

7.
Starting from two datasets of codon usage in coding sequences from mesophilic and thermophilic bacteria, we used internal correspondence analysis to study the variability of codon usage within and between species, and within and between amino acids. The first dataset included 18,958,458 codons from 58,482 coding sequences from completely sequenced genomes of 25 species, along with 6,793,581 dinucleotides from 21,876 intergenic spaces. The second dataset, with partially sequenced genomes, included 97,095,873 codons from 293 bacterial species. Results were consistent between the two datasets. The trend for the amino-acid composition of thermophilic proteins was found to be under the control of a pressure at the nucleic acid level, not a selection at the protein level. This effect was not present in intergenic spaces, ruling out a pressure at the DNA level. The pattern at the mRNA level was more complex than a simple purine enrichment of the sense strand of coding sequences. Outliers in the partial genome dataset introduced a note of caution about the interpretation of temperature as the direct determinant of the trend observed in thermophiles. The surprising lack of selection on the amino-acid content of thermophilic proteins suggests that the amino-acid repertoire was set up in a hot environment.

8.
We describe here a repetitive chromosomal element, which appears to be an insertion sequence, isolated from Clavibacter xyli subsp. cynodontis, a gram-positive plant-associated bacterium. The element, IS1237, is 905 bp in size, is bounded by 19-bp perfect inverted repeats and 3-bp direct repeats, and appears at least 16 times in the genome. It contains three open reading frames which show similarity to open reading frames from various other insertion sequences. We have found that there are two groups of related mobile elements: one in which two open reading frames are read separately and the other in which these two open reading frames are fused together to give one predicted protein product. Using one of these open reading frames to search amino acid sequence databases, we found two instances in which similar reading frames flank genes carried on plasmids. We believe therefore that these plasmid-borne genes may be parts of previously unidentified mobile elements. For IS1237, a frameshift in two of the open reading frames and a stop codon in the third may indicate that this particular copy of the element is no longer active in transposition. The similarity of IS1237 to other elements from both gram-negative and gram-positive bacteria provides further evidence that mobile elements have been transferred between these two bacterial groups.

9.
Summary: It has been shown that codons coding for strongly hydrophilic amino acids are complemented by codons that code for strongly hydrophobic ones, leading to a hypothesis stating that peptides thus encoded should interact. Though the principle has been validated in a number of experimental models, its general applicability has been questioned. I have discussed this principle, showing that the correlation between coding and noncoding strand amino acids was maintained, indeed slightly improved, when weighted averages based on codon usage tables were used to determine noncoding strand amino acid hydropathies. The coding capacity of the noncoding strand and its content of open reading frames were also discussed. Another point of contention that was afforded further clarification is the chemical plausibility of interactions between hydrophobic and hydrophilic amino acids implicit in this concept. The extension of complementary domains was also dealt with. Finally, I have discussed what I called the evolutionary drift of primary structure, and I showed as an example that though nucleotide sequences coding for the substance K receptor bear little resemblance to the inverse complement of that which codes for the SK peptide, a peptide spanning residues 130–139 is hydropathically very similar to that predicted from such an inverse complement.

10.
11.
Human immunodeficiency virus type 1 (HIV-1) and other lentiviridae demonstrate a strong preference for the A-nucleotide, which can account for up to 40% of the viral RNA genome. The biological mechanism responsible for this nucleotide bias is currently unknown. The increased A-content of these viral genomes corresponds to the typical use of synonymous codons by all members of the lentiviral family (HIV, SIV, BIV, FIV, CAEV, EIAV, visna) and the human spuma retrovirus, but not by other retroviruses like the human T-cell leukemia viruses HTLV-I and HTLV-II. In this article, we analyzed A-bias for all codon groups in all open reading frames of several lentiviruses. The extent of lentiviral codon bias could be related to host cellular translation. By calculating codon bias indices (CBIs), we were able to demonstrate an inverse correlation between the extent of codon bias and the rate of translation of individual reading frames in these viruses. Specifically, the shift toward A-rich codons is more pronounced in pol than in gag lentiviral genes. Since it is known that Gag synthesis exceeds Pol synthesis by a factor of 20 due to infrequent ribosomal frame-shifting during translation of the gag-pol mRNA molecule, we propose that the aminoacyl-tRNA availability in the host cell restricts the lentiviral preference for A-rich codons. In addition, fewer A-nucleotides were found in regions of the viral genome encoding multiple functions; e.g., overlapping reading frames (tat-rev-env) or in genes that overlap regulatory sequences (nef-LTR region). Finally, the characteristics of lentiviral codon usage are presented as a phylogenetic tree without the need for prior sequence alignment. Correspondence to: B. Berkhout

12.
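Codon usage analysis of Pichia pastoris   Cited by: 135 (self-citations: 5; citations by others: 130)
By analyzing synonymous codon usage in 28 protein-coding genes of Pichia pastoris and calculating the codon usage of this yeast, 19 optimal codons for high expression in P. pastoris were identified for the first time. The results are broadly similar to the known codon usage of Saccharomyces cerevisiae and Kluyveromyces lactis, but the codon choice for the amino acid glutamate is exactly the opposite, suggesting that this may be a codon usage preferred by P. pastoris.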

13.
BBA, 2022, 1863(8): 148597
The origin of the genetic code is an abiding mystery in biology. Hints of a ‘code within the codons’ suggest biophysical interactions, but these patterns have resisted interpretation. Here, we present a new framework, grounded in the autotrophic growth of protocells from CO2 and H2. Recent work suggests that the universal core of metabolism recapitulates a thermodynamically favoured protometabolism right up to nucleotide synthesis. Considering the genetic code in relation to an extended protometabolism allows us to predict most codon assignments. We show that the first letter of the codon corresponds to the distance from CO2 fixation, with amino acids encoded by the purines (G followed by A) being closest to CO2 fixation. These associations suggest a purine-rich early metabolism with a restricted pool of amino acids. The second position of the anticodon corresponds to the hydrophobicity of the amino acid encoded. We combine multiple measures of hydrophobicity to show that this correlation holds strongly for early amino acids but is weaker for later species. Finally, we demonstrate that redundancy at the third position is not randomly distributed around the code: non-redundant amino acids can be assigned based on size, specifically length. We attribute this to additional stereochemical interactions at the anticodon. These rules imply an iterative expansion of the genetic code over time with codon assignments depending on both distance from CO2 and biophysical interactions between nucleotide sequences and amino acids. In this way the earliest RNA polymers could produce non-random peptide sequences with selectable functions in autotrophic protocells.

14.
The codon table for the canonical genetic code can be rearranged in such a way that the code is divided into four quarters and two halves according to the variability of their GC and purine contents, respectively. For prokaryotic genomes, when the genomic GC content increases, their amino acid contents tend to be restricted to the GC-rich quarter and the purine-content insensitive half, where all codons are fourfold degenerate and relatively mutation-tolerant. Conversely, when the genomic GC content decreases, most of the codons retract to the AU-rich quarter and the purine-content sensitive half; most of the codons not only remain encoding physicochemically diversified amino acids but also vary when transversion (between purine and pyrimidine) happens. Amino acids with sixfold-degenerate codons are distributed into all four quarters and across the two halves; their fourfold-degenerate codons are all partitioned into the purine-insensitive half in favor of robustness against mutations. The features manifested in the rearranged codon table explain most of the intrinsic relationship between protein coding sequences (the informational content) and amino acid compositions (the functional content). The renovated codon table is useful in predicting abundant amino acids and positioning the amino acids with related or distinct physicochemical properties.
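The claim that the GC-rich part of the table is fourfold degenerate can be checked against the standard genetic code. The sketch below rests on an assumption not stated in the abstract: that the "quarters" are defined by the G/C content of the first two codon positions. The function names are hypothetical, and this is not the authors' rearranged table.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ... ("*" = stop).
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def fourfold_prefixes():
    """Two-base prefixes whose four third-position variants encode one amino acid."""
    prefixes = []
    for first, second in product(BASES, repeat=2):
        encoded = {CODON_TABLE[first + second + third] for third in BASES}
        if len(encoded) == 1:
            prefixes.append(first + second)
    return prefixes


def gc_count(prefix: str) -> int:
    """Number of G or C bases in a two-base prefix (0, 1 or 2)."""
    return sum(base in "GC" for base in prefix)


if __name__ == "__main__":
    for prefix in fourfold_prefixes():
        print(prefix + "N", CODON_TABLE[prefix + "U"], "GC bases:", gc_count(prefix))
```

With this definition, eight prefixes (32 codons, half the table) are fourfold degenerate. All four prefixes with two G/C bases (CC, CG, GC, GG) are among them, and none of the four prefixes with two A/U bases is.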

15.
Methionine synthase is a key enzyme poised at the intersection of folate and sulfur metabolism and functions to reclaim homocysteine to the methionine cycle. The 5' leader sequence in human MS is 394 nucleotides long and harbors two upstream open reading frames (uORFs). In this study, regulation of the main open reading frame by the uORFs has been elucidated. Both uORFs downregulate translation as demonstrated by mutation of the upstream AUG codons (uAUG) either singly or simultaneously. The uAUGs are capable of recruiting the 40S ribosomal complex as revealed by their ability to drive reporter expression in constructs in which the luciferase is fused to the uORFs. uORF2, which is predicted to encode a 30 amino acid long polypeptide, has a clustering of rare codons encoding arginine and proline. Mutation of a tandemly repeated rare codon for arginine at positions 3 and 4 in uORF2 to either common codons for the same amino acid or common codons for alanine results in complete alleviation of translation inhibition. This suggests a mechanism for ribosome stalling and demonstrates that the cis-effects on translation by uORF2 are dependent on the nucleotide sequence but are apparently independent of the sequence of the encoded peptide. This study reveals complex regulation of the essential housekeeping gene, methionine synthase, by the uORFs in its leader sequence.

16.
Thalassiosira weissflogii (Grun.) Fryxell et Hasle is one of the more commonly studied centric diatoms, and yet molecular studies of this organism are still in their infancy. The ability to identify open reading frames and thus distinguish between introns and exons, coding and noncoding sequence is essential to move from nuclear DNA sequences to predicted amino acid sequences. To facilitate the identification of open reading frames in T. weissflogii, two newly identified nuclear genes encoding β-tubulin and t-complex polypeptide (TCP)-γ, along with six previously published nuclear DNA sequences, were examined for general structural features. The coding region of the nuclear open reading frames had a G + C content of about 49% and could readily be distinguished from noncoding sequence due to a significant difference in G + C content. The introns were uniformly small, about 100 base pairs in size. Furthermore, the 5' and 3' splice sites of introns displayed the canonical GT/AG sequence, further facilitating recognition of noncoding regions. Six of the nuclear open reading frames displayed relatively little bias in the use of synonymous codons, as exemplified by the cDNAs encoding β-tubulin and TCP-γ. Two open reading frames displayed strong bias in the use of particular codons (although the codons used were different), as exemplified by the cDNA encoding fucoxanthin chlorophyll a/c binding protein. Knowledge of codon bias should facilitate, for example, design of degenerate PCR primers and potential heterologous reporter gene constructs.

17.
Summary: The first AUG in the Chlamydomonas reinhardtii ADP/ATP translocator (CRANT) mRNA initiates an open reading frame (ORF) which is very similar (51–79% amino acid identity) to other ANT proteins. In contrast to higher plants, no evidence for a long amino-terminal extension was obtained. The 5′ non-transcribed region of the single-copy CRANT gene contains sequence motifs present in other C. reinhardtii nuclear genes. Four introns, whose positions are not conserved in other ANT genes, interrupt the protein coding region. A short heat shock specifically reduces CRANT mRNA levels. CRANT mRNA levels were unaffected by a mutation in photosynthesis. In a dark/light regime CRANT mRNA levels are high in the dark phase and low in the early light phase. Data on translation initiation sites, splice junctions and the codon preferences of C. reinhardtii nuclear genes were compiled. With the exception of two rare codons, ACA and GGA, the CRANT gene exhibits the biased codon usage of C. reinhardtii nuclear genes that are highly expressed during normal vegetative growth.

18.
Singer GA, Hickey DA. Gene, 2003, 317(1-2): 39-47
A number of recent studies have shown that thermophilic prokaryotes have distinguishable patterns of both synonymous codon usage and amino acid composition, indicating the action of natural selection related to thermophily. On the other hand, several other studies of whole genomes have illustrated that nucleotide bias can have dramatic effects on synonymous codon usage and also on the amino acid composition of the encoded proteins. This raises the possibility that the thermophile-specific patterns observed at both the codon and protein levels are merely reflections of a single underlying effect at the level of nucleotide composition. Moreover, such an effect at the nucleotide level might be due entirely to mutational bias. In this study, we have compared the genomes of thermophiles and mesophiles at three levels: nucleotide content, codon usage and amino acid composition. Our results indicate that the genomes of thermophiles are distinguishable from mesophiles at all three levels and that the codon and amino acid frequency differences cannot be explained simply by the patterns of nucleotide composition. At the nucleotide level, we see a consistent tendency for the frequency of adenine to increase at all codon positions within the thermophiles. Thermophiles are also distinguished by their pattern of synonymous codon usage for several amino acids, particularly arginine and isoleucine. At the protein level, the most dramatic effect is a two-fold decrease in the frequency of glutamine residues among thermophiles. These results indicate that adaptation to growth at high temperature requires a coordinated set of evolutionary changes affecting (i) mRNA thermostability, (ii) stability of codon-anticodon interactions and (iii) increased thermostability of the protein products. We conclude that elevated growth temperature imposes selective constraints at all three molecular levels: nucleotide content, codon usage and amino acid composition. 
In addition to these multiple selective effects, however, the genomes of both thermophiles and mesophiles are often subject to superimposed large changes in composition due to mutational bias.

19.
The 50 non-coding bases immediately internal to the telomeric repeats in the two 5′ ends of macronuclear DNA molecules of a group of hypotrichous ciliates are anomalous in composition, consisting of 61% purines and 39% pyrimidines, A>T (ratio of 44:32), and G>C (ratio of 17:7). These ratio imbalances violate parity rule 2, according to which A should equal T and G should equal C within a DNA strand and therefore pyrimidines should equal purines. The purine richness and base ratio imbalances are in marked contrast to the rest of the non-coding parts of the molecules, which have the theoretically expected purine content of 50%, with A = T and G = C. The ORFs contain an average of 52% purines as a result of bias in codon usage. The 50 bases that flank the 5′ ends of macronuclear sequences in micronuclear DNA (12 cases) consist of ~50% purines. Thus, the 50 bases in the 5′ ends of macronuclear sequences in micronuclear DNA are islands of purine richness in which A>T and G>C. These islands may serve as signals for the excision of macronuclear molecules during macronuclear development. We have found no published reports of coding or non-coding native DNA with such anomalous base composition.
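The arithmetic behind the parity rule 2 violation reported in this abstract can be reproduced from the stated base ratios. The sketch below is illustrative only: the function names are hypothetical, and the test sequence is synthetic, built to match the reported percentages (A 44, T 32, G 17, C 7) rather than taken from real data.

```python
def base_counts(seq: str) -> dict:
    """Count A, C, G and T in a single DNA strand, ignoring other symbols."""
    seq = seq.upper()
    return {base: seq.count(base) for base in "ACGT"}


def purine_fraction(seq: str) -> float:
    """Fraction of counted bases that are purines (A or G)."""
    counts = base_counts(seq)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return (counts["A"] + counts["G"]) / total


def parity2_skews(seq: str) -> tuple:
    """Return (AT skew, GC skew) = ((A - T)/(A + T), (G - C)/(G + C)).

    Both skews are 0 when parity rule 2 holds exactly within the strand.
    """
    c = base_counts(seq)
    at_total = c["A"] + c["T"]
    gc_total = c["G"] + c["C"]
    at_skew = (c["A"] - c["T"]) / at_total if at_total else 0.0
    gc_skew = (c["G"] - c["C"]) / gc_total if gc_total else 0.0
    return at_skew, gc_skew


if __name__ == "__main__":
    # Synthetic 100-base strand with the composition reported in the abstract.
    example = "A" * 44 + "T" * 32 + "G" * 17 + "C" * 7
    print(purine_fraction(example), parity2_skews(example))
```

For the reported composition the purine fraction is 0.61 and both skews are positive (12/76 for A versus T, 10/24 for G versus C), whereas a strand obeying parity rule 2 would give 0.5 and zero skews.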

20.