Similar articles
1.
Does the 'non-coding' strand code?
The hypothesis that DNA strands complementary to the coding strand contain in phase coding sequences has been investigated. Statistical analysis of the 50 genes of bacteriophage T7 shows no significant correlation between patterns of codon usage on the coding and non-coding strands. In Bacillus and yeast genes the correlation observed is not different from that expected with random synonymous codon usage, while a high correlation seen in 52 E. coli genes can be explained in terms of an excess of RNY codons. A deficiency of UUA, CUA and UCA codons (complementary to termination) seems to be restricted to the E. coli genes, and may be due to low abundance of the relevant cognate tRNA species. Thus the analysis shows that the non-coding strand has the properties expected of a sequence complementary to a coding strand, with no indications that it encodes, or may have encoded, proteins.
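
The comparison described in this abstract can be illustrated with a minimal sketch. It assumes that "in phase" means each codon on the complementary strand is the reverse complement of a coding-strand codon, and it uses a plain Pearson correlation over the 64 codon counts; the function names are hypothetical and the statistical treatment in the original study may differ.

```python
from collections import Counter
from itertools import product
from math import sqrt

COMPLEMENT = str.maketrans("ACGT", "TGCA")
CODONS = ["".join(p) for p in product("ACGT", repeat=3)]


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def codon_counts(cds):
    """Count codons in frame 0 of a coding sequence (trailing bases ignored)."""
    usable = len(cds) - len(cds) % 3
    return Counter(cds[i:i + 3] for i in range(0, usable, 3))


def in_phase_noncoding_counts(cds):
    """Codon counts read 5'->3' on the complementary strand, in phase."""
    counts = Counter()
    for codon, n in codon_counts(cds).items():
        counts[reverse_complement(codon)] += n
    return counts


def pearson(xs, ys):
    """Pearson correlation; undefined (ZeroDivisionError) for constant input."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


def strand_correlation(cds):
    """Correlate codon usage on the coding strand with the in-phase complement."""
    coding = codon_counts(cds)
    noncoding = in_phase_noncoding_counts(cds)
    return pearson([coding[c] for c in CODONS], [noncoding[c] for c in CODONS])
```

A sequence such as `ATGCAT` gives a correlation of 1 because its two codons are reverse complements of each other; real genes need many codons before the value is informative.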

2.
Human rhinoviruses are single-stranded, positive-sense RNA viruses that are present in more than 50% of acute upper respiratory tract infections. Despite extensive studies on the genetic diversity of the virus, little is known about the forces driving it. To explain this diversity, many research groups have focused on the protein sequence requirements for a viable, functional and transmissible virus but have overlooked an important aspect of viral evolution: the genomic ontology of the virus. This study presents for the first time the genomic signature of 111 fully sequenced HRV strains from all three groups, HRV-A, HRV-B and HRV-C. We observed a tendency of the HRV genome to eliminate CpG and UpA dinucleotides, coupled with over-representation of UpG and CpA. We propose a specific mechanism which describes how rapid changes in the HRV genomic sequence can take place under the strict control of conservation of the polypeptide backbone. Moreover, the distribution of the observed under- and over-represented dinucleotides along the HRV genome is presented. Distance matrix tables based on CpG and UpA odds ratios were constructed and viewed as heatmaps and distance trees. None of the suppressions can be attributed to codon usage or to RNA secondary structure requirements. Since viral recognition is dependent on RNA motifs rich in CpG and UpA, it is possible that the overall described genome evolution mechanism acts to protect the virus from host recognition.
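
The dinucleotide odds ratios mentioned in this abstract are commonly computed as the observed dinucleotide frequency divided by the product of the two mononucleotide frequencies. The sketch below assumes that definition; the abstract does not give the exact formula used, and the function name is hypothetical.

```python
from collections import Counter


def dinucleotide_odds_ratios(seq):
    """Return {dinucleotide: f_xy / (f_x * f_y)} for every dinucleotide present.

    Values below 1 suggest under-representation, values above 1 suggest
    over-representation relative to the base composition of the sequence.
    """
    seq = seq.upper()
    n = len(seq)
    mono = Counter(seq)
    di = Counter(seq[i:i + 2] for i in range(n - 1))
    ratios = {}
    for pair, count in di.items():
        f_x = mono[pair[0]] / n
        f_y = mono[pair[1]] / n
        f_xy = count / (n - 1)
        ratios[pair] = f_xy / (f_x * f_y)
    return ratios
```

For an RNA genome, looking up `ratios.get("CG")` and `ratios.get("UA")` gives the CpG and UpA values; a dinucleotide that never occurs is simply absent from the result.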

3.
To understand the variation in genomic composition and its effect on codon usage, we performed a comparative analysis of codon usage and nucleotide usage in the genes of three dicots, Glycine max, Arabidopsis thaliana and Medicago truncatula. The dicot genes were found to be A/T rich and to have predominantly A-ending and/or T-ending codons. GC3s directly mimic the usage pattern of global GC content. Relative synonymous codon usage analysis suggests that the high usage frequency of A/T- over G/C-containing codons in the AT-rich dicot genomes is due to compositional constraint as a factor of codon usage bias. Odds ratio analysis identified the dinucleotides TpG, TpC, GpA, CpA and CpT as over-represented, whereas CpG and TpA were under-represented. The results of the (NcExp − NcObs)/NcExp plot suggest that selection pressure other than mutation played a significant role in influencing the pattern of codon usage in these dicots. PR2 analysis revealed the significant role of selection pressure on codon usage. Analysis of variance on codon usage at start and stop sites showed variation in codon selection at these sites. This study provides evidence that the dicot genes were subjected to compositional selection pressure.
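
Relative synonymous codon usage (RSCU), as used in this abstract, is usually defined as the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The sketch below assumes that definition and the standard genetic code; the function name is hypothetical and stop codons are skipped.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

SYNONYMS = defaultdict(list)
for _codon, _aa in CODON_TABLE.items():
    SYNONYMS[_aa].append(_codon)


def rscu(cds):
    """Return {codon: RSCU} for every amino acid observed in the sequence."""
    usable = len(cds) - len(cds) % 3
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))
    values = {}
    for aa, codons in SYNONYMS.items():
        total = sum(counts[c] for c in codons)
        if aa == "*" or total == 0:
            continue  # skip stop codons and amino acids that never occur
        for codon in codons:
            # observed count / (total for this amino acid / number of synonyms)
            values[codon] = counts[codon] * len(codons) / total
    return values
```

An RSCU of 1 means a codon is used as often as expected under equal usage; values above 1 mark preferred codons.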

4.
5.
Analysis of codon usage frequency for the combined coding sequences of 52 E. coli genes, taken from the European Molecular Biology Laboratory Nucleotide Sequence Data Library, Release 2, shows that there is a significant positive correlation between the frequency with which a given codon appears on the coding strand and the frequency with which it appears, in phase, on the non-coding strand.

6.
Starting from two datasets of codon usage in coding sequences from mesophilic and thermophilic bacteria, we used internal correspondence analysis to study the variability of codon usage within and between species, and within and between amino acids. The first dataset included 18,958,458 codons from 58,482 coding sequences from completely sequenced genomes of 25 species, along with 6,793,581 dinucleotides from 21,876 intergenic spaces. The second dataset, with partially sequenced genomes, included 97,095,873 codons from 293 bacterial species. Results were consistent between the two datasets. The trend for the amino-acid composition of thermophilic proteins was found to be under the control of a pressure at the nucleic acid level, not a selection at the protein level. This effect was not present in intergenic spaces, ruling out a pressure at the DNA level. The pattern at the mRNA level was more complex than a simple purine enrichment of the sense strand of coding sequences. Outliers in the partial genome dataset introduced a note of caution about the interpretation of temperature as the direct determinant of the trend observed in thermophiles. The surprising lack of selection on the amino-acid content of thermophilic proteins suggests that the amino-acid repertoire was set up in a hot environment.

7.
Codon bias is the non-random use of synonymous codons, a phenomenon that has been observed in species as diverse as bacteria, plants and mammals. The preferential use of particular synonymous codons may reflect neutral mechanisms (e.g. mutational bias, G|C-biased gene conversion, genetic drift) and/or selection for mRNA stability, translational efficiency and accuracy. The extent to which these different factors influence codon usage is unknown, so we dissected the contribution of mutational bias and selection towards codon bias in genes from 15 eudicots, 4 monocots and 2 mosses. We analysed the frequency of mononucleotides, dinucleotides and trinucleotides and investigated whether the compositional genomic background could account for the observed codon usage profiles. Neutral forces such as mutational pressure and G|C-biased gene conversion appeared to underlie most of the observed codon bias, although there was also evidence for the selection of optimal translational efficiency and mRNA folding. Our data confirmed the compositional differences between monocots and dicots, with the former featuring in general a lower background compositional bias but a higher overall codon bias.

8.
The general property of asymmetry in word use in meaningful texts written in a variety of languages motivates a quantification of the differences in the use of mutually symmetric triplets in genomic sequences. When this is done in the three reading frames, high values found for one of them are used as an indication that the sequence codes for a protein. Moreover, a similar quantification of the differences in the use of complementary triplets is introduced, again with predictive power for the coding character of a sequence. This method reflects the non-equivalence between the sense and anti-sense strands of a coding segment. In both approaches, "linguistic asymmetry" in coding sequences is related to the form of the genetic code and to the bias in codon usage and amino acid use skews.

9.
Replicative fitness of poliovirus can be modulated systematically by replacement of preferred capsid region codons with synonymous unpreferred codons. To determine the key genetic contributors to fitness reduction, we introduced different sets of synonymous codons into the capsid coding region of an infectious clone derived from the type 2 prototype strain MEF-1. Replicative fitness in HeLa cells, measured by plaque areas and virus yields in single-step growth experiments, decreased sharply with increased frequencies of the dinucleotides CpG (suppressed in higher eukaryotes and most RNA viruses) and UpA (suppressed nearly universally). Replacement of MEF-1 capsid codons with the corresponding codons from another type 2 prototype strain (Lansing), a randomization of MEF-1 synonymous codons, increased the %G+C without increasing CpG, and reductions in the effective number of codons used had much smaller individual effects on fitness. Poliovirus fitness was reduced to the threshold of viability when CpG and UpA dinucleotides were saturated within and across synonymous codons of a capsid region interval representing only ∼9% of the total genome. Codon replacements were associated with moderate decreases in total virion production but large decreases in the specific infectivities of intact poliovirions and viral RNAs. Replication of codon replacement viruses, but not MEF-1, was temperature sensitive at 39.5°C. Synthesis and processing of viral intracellular proteins were largely unaltered in most codon replacement constructs. Replacement of natural codons with synonymous codons with increased frequencies of CpG and UpA dinucleotides may offer a general approach to the development of attenuated vaccines with well-defined antigenicities and very high genetic stabilities.
Diversification of genomic sequences is constrained in all biological systems. At the level of primary sequences, the range of variability in coding regions is restricted by the codon usage bias (CUB), whereby a subset of synonymous codons are preferentially used in translation (24, 53, 69). The intensity of the CUB and the specific set of preferred codons vary widely across biological systems (39). Intertwined with the CUB is the suppression of the dinucleotides CpG and TpA (or UpA in RNA viruses) in the genomes of higher eukaryotes (4, 7, 26, 61) and many of their RNA viruses and small DNA viruses (28, 49). Variation in the primary sequences of RNA virus genomes is further constrained by requirements to maintain essential secondary and higher-order structures (42, 54, 68).
We previously described the modulation of the replicative fitness of the Sabin type 2 oral poliovirus vaccine (OPV) strain (Sabin 2) by systematically changing the CUB in the capsid region, replacing the naturally occurring preferred codons with an unpreferred synonymous codon (isocodon) for each of nine amino acids (8). We called our approach “codon deoptimization” to contrast with the process of codon optimization, which is frequently used to maximize expression of foreign proteins in heterologous host systems (1, 27, 70). Apart from its potential application to development of improved poliovirus vaccines (8, 13, 38), experimental investigations of codon deoptimization directly test the relationships between replicative fitness, the extent of CUB, and the intensity of CpG and UpA suppression. As a model system for such studies, polioviruses offer several favorable properties, including (i) intrinsically high error rates for the poliovirus RNA-dependent RNA polymerase (2, 14, 16, 65), (ii) very high evolution rates (25), (iii) short generation times (8 to 10 h) and large progeny yields of prototype polioviruses, and (iv) well-developed reverse genetics (9).
In this report, we extend our codon deoptimization strategy to the type 2 wild poliovirus prototype strain MEF-1. As before, we restricted our replacement of synonymous codons to the capsid coding region, which encodes two of the defining properties of polioviruses, namely, (i) the capacity to bind the CD155 poliovirus receptor (PVR) (23) and (ii) the poliovirus type-specific neutralizing antigenic sites (35). No changes were made to the flanking 5′-untranslated region and noncapsid region sequences, as they contain essential secondary structural elements (42, 54, 68) and are frequently exchanged out by recombination during circulation of poliovirus in human populations (20, 30, 32). MEF-1 was selected because of its high fitness level (hence, its use as the type 2 component of the inactivated poliovirus vaccine [IPV]) and because of its neurovirulence for humans (15), for nontransgenic mice (52), and for transgenic mice expressing the PVR (71). Type 2 polioviruses were selected first for study because the Sabin 2 OPV strain is most frequently associated with vaccine-associated paralytic poliomyelitis in contacts of OPV recipients (57, 59), with prolonged excretion of immunodeficiency-associated vaccine-derived polioviruses (VDPVs) (10, 31, 60), and with the emergence of circulating VDPVs in areas of low OPV coverage (10, 31).
Consistent with our previous findings, the fitness of MEF-1 decreased in proportion to the total number of synonymous replacement codons. Fitness was reduced most efficiently by increasing the frequencies of CpG and UpA dinucleotides within and across synonymous codons. Saturation of CpG and UpA in a small capsid interval (representing only ∼9% of the genome) reduced fitness to the threshold of viability, even though the MEF-1 amino acid sequence was unaltered. The most prominent biological effect of deoptimization of codon usage and the large-scale incorporation of CpG and UpA was a sharp reduction in virus specific infectivities. In contrast, translation and processing of viral proteins and yields of intact virus particles with native antigenicities were reduced only moderately by increased CpG and UpA frequencies. Codon deoptimization with concurrent increases in the frequencies of CpG and/or UpA dinucleotides in RNA virus genomes may provide a novel general approach to the rational design of improved attenuated vaccines with predictable and stable genetic properties.

10.
The genetic code is degenerate; thus, protein evolution does not uniquely determine the coding sequence. One of the puzzles in evolutionary genetics is therefore to uncover evolutionary driving forces that result in specific codon choice. In many bacteria, the first 5–10 codons of protein‐coding genes are often codons that are less frequently used in the rest of the genome, an effect that has been argued to arise from selection for slowed early elongation to reduce ribosome traffic jams. However, genome analysis across many species has demonstrated that the region shows reduced mRNA folding consistent with pressure for efficient translation initiation. This raises the possibility that unusual codon usage is a side effect of selection for reduced mRNA structure. Here we discriminate between these two competing hypotheses, and show that in bacteria selection favours codons that reduce mRNA folding around the translation start, regardless of whether these codons are frequent or rare. Experiments confirm that primarily mRNA structure, and not codon usage, at the beginning of genes determines the translation rate.

11.
In recent years, the amount of molecular sequencing data from Tetrahymena thermophila has dramatically increased. We analyzed G + C content, codon usage, initiator codon context and stop codon sites in the extremely A + T rich genome of this ciliate. Average G + C content was 38% for protein coding regions, 21% for 5' non-coding sequences, 19% for 3' non-coding sequences, 15% for introns, 19% for micronuclear limited sequences and 17% for macronuclear retained sequences flanking micronuclear specific regions. The 75 available T. thermophila protein coding sequences favored codons ending in T and, where possible, avoided those with G in the third position. Highly expressed genes were relatively G + C-rich and exhibited an extremely biased pattern of codon usage while developmentally regulated genes were more A + T-rich and showed less codon usage bias. Regions immediately preceding Tetrahymena translation initiator codons were generally A-rich. For the 60 stop codons examined, the frequency of G in the end + 1 site was much higher than expected whereas C never occupied this position.
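
The two composition measures reported in this abstract, overall G + C content and the base preference at the third codon position, can be sketched as follows. This is an illustration under simple assumptions (upper- or lower-case DNA input, reading frame starting at the first base); the function names are hypothetical and are not taken from the study.

```python
from collections import Counter


def gc_content(seq):
    """Fraction of G and C among all bases of a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)


def third_position_frequencies(cds):
    """Relative frequency of each base at codon position 3 (frame 0)."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    thirds = Counter(cds[i] for i in range(2, usable, 3))
    total = sum(thirds.values())
    if total == 0:
        raise ValueError("sequence shorter than one codon")
    return {base: count / total for base, count in thirds.items()}
```

Applied to a set of coding sequences, a low `G` entry and a high `T` entry in the third-position result would correspond to the pattern the abstract describes.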

12.
In species where actin genes exist as single copies, analysis of their synonymous codon usage and of the substitutions occurring between the genes of closely related species shows that there is a positive selection for codons that do not have highly mutable CpG dinucleotides in codon positions 2 and 3 when the GC content of these genes is less than 57%.

13.
14.
Synonymous codons are neutral at the protein level, therefore natural selection at the protein level should have no effect on their frequencies. Synonymous codons, however, differ in their capacity to reduce the effects of errors: after mutation, certain codons keep on coding for the same amino acid or for amino acids with similar properties, while other synonymous codons produce very different amino acids. Therefore, the impact of errors on a coding sequence (genetic robustness) can be measured by analysing its codon usage. I analyse the codon usage of sequenced nuclear and cytoplasmic genomes and I show that there is an extensive variation in genetic robustness at the DNA sequence level, both among genomes and among genes of the same genome. I also show theoretically that robustness can be adaptive, that is natural selection may lead to a preference for codons that reduce the impact of errors. If selection occurs only among the mutants of a codon (e.g. among the progeny before the adult phase), however, the codons that are more sensitive to the effects of mutations may increase in frequency because they manage to get rid more easily of deleterious mutations. I also suggest other possible explanations for the evolution of genetic robustness at the codon level.

15.
It has been shown that codons coding for strongly hydrophilic amino acids are complemented by codons that code for strongly hydrophobic ones, leading to a hypothesis stating that peptides thus encoded should interact. Though the principle has been validated in a number of experimental models, its general applicability has been questioned. I have discussed this principle, showing that the correlation between coding and noncoding strand amino acids was maintained, indeed slightly improved, when weighted averages based on codon usage tables were used to determine noncoding strand amino acid hydropathies. The coding capacity of the noncoding strand and its content of open reading frames were also discussed. Another point of contention that was afforded further clarification is the chemical plausibility of interactions between hydrophobic and hydrophilic amino acids implicit in this concept. The extension of complementary domains was also dealt with. Finally, I have discussed what I called the evolutionary drift of primary structure, and I showed as an example that though nucleotide sequences coding for the substance K receptor bear little resemblance to the inverse complement of that which codes for the SK peptide, a peptide spanning residues 130–139 is hydropathically very similar to that predicted from such an inverse complement.

16.
17.
The genetic code appears to be optimized in its robustness to missense errors and frameshift errors. In addition, the genetic code is near-optimal in terms of its ability to carry information in addition to the sequences of encoded proteins. As evolution has no foresight, optimality of the modern genetic code suggests that it evolved from less optimal code variants. The length of codons in the genetic code is also optimal, as three is the minimal nucleotide combination that can encode the twenty standard amino acids. The apparent impossibility of transitions between codon sizes in a discontinuous manner during evolution has resulted in an unbending view that the genetic code was always triplet. Yet, recent experimental evidence on quadruplet decoding, as well as the discovery of organisms with ambiguous and dual decoding, suggest that the possibility of the evolution of triplet decoding from living systems with non-triplet decoding merits reconsideration and further exploration. To explore this possibility we designed a mathematical model of the evolution of primitive digital coding systems which can decode nucleotide sequences into protein sequences. These coding systems can evolve their nucleotide sequences via genetic events of Darwinian evolution, such as point-mutations. The replication rates of such coding systems depend on the accuracy of the generated protein sequences. Computer simulations based on our model show that decoding systems with codons of length greater than three spontaneously evolve into predominantly triplet decoding systems. Our findings suggest a plausible scenario for the evolution of the triplet genetic code in a continuous manner. This scenario suggests an explanation of how protein synthesis could be accomplished by means of long RNA-RNA interactions prior to the emergence of the complex decoding machinery, such as the ribosome, that is required for stabilization and discrimination of otherwise weak triplet codon-anticodon interactions. 

18.
Understanding the cause of the changes in the amino acid composition of proteins is essential for understanding the evolution of protein functions. Since the early 1970s, it has been known that the frequency of some amino acids in protein sequences is increasing and that of others is decreasing. Recently, it was found that the trends of amino acid changes were similar in 15 taxa representing Bacteria, Archaea, and Eukaryota. However, the cause of this similarity in the trend of the gains and losses of amino acids continued to be debated. Here, we show that this trend of the gain and loss of amino acids can be simply explained by CpG hypermutability. We found that the frequency of amino acids coded by codons with TpG dinucleotides and those with CpA dinucleotides is increasing, while that of amino acids coded by codons with CpG dinucleotides is decreasing. We also found that organisms that lack DNA methyltransferase show different trends of the gain and loss of amino acids. DNA methyltransferase methylates CpG dinucleotides and induces CpG hypermutability. The incorporation of CpG hypermutability into models of protein evolution will improve studies on protein evolution in different organisms.

19.
The genetic code is degenerate: most amino acids can be encoded by two to as many as six different codons. The synonymous codons are not used with equal frequency: not only are some codons favored over others, but their usage can also vary significantly from species to species and between different genes in the same organism. Known causes of codon bias include differences in mutation rates as well as selection pressure related to the expression level of a gene, but the standard analysis methods can account for only a fraction of the observed codon usage variation. We here introduce an explicit model of codon usage bias, inspired by statistical physics. Combining this model with a maximum likelihood approach, we are able to clearly identify different sources of bias in various genomes. We have applied the algorithm to Saccharomyces cerevisiae as well as 325 prokaryote genomes, and in most cases our model explains essentially all observed variance.

20.
Understanding the extent and causes of biases in codon usage and nucleotide composition is essential to the study of viral evolution, particularly the interplay between viruses and host cells or immune responses. To understand the common features and differences among viruses we analyzed the genomic characteristics of a representative collection of all sequenced vertebrate-infecting DNA viruses. This revealed that patterns of codon usage bias are strongly correlated with overall genomic GC content, suggesting that genome-wide mutational pressure, rather than natural selection for specific coding triplets, is the main determinant of codon usage. Further, we observed a striking difference in CpG content between DNA viruses with large and small genomes. While the majority of large genome viruses show the expected frequency of CpG, most small genome viruses had CpG contents far below expected values. The exceptions to this generalization, the large gammaherpesviruses and iridoviruses and the small dependoviruses, have sufficiently different life-cycle characteristics that they may help reveal some of the factors shaping the evolution of CpG usage in viruses. [Reviewing Editor: Dr. Nicolas Galtier]
