Similar literature
 20 similar articles found
1.
The multiple codes of nucleotide sequences   Cited by: 4 (self-citations: 0, citations by others: 4)
Nucleotide sequences carry genetic information of many different kinds, not just instructions for protein synthesis (triplet code). Several codes of nucleotide sequences are discussed including: (1) the translation framing code, responsible for correct triplet counting by the ribosome during protein synthesis; (2) the chromatin code, which provides instructions on appropriate placement of nucleosomes along the DNA molecules and their spatial arrangement; (3) a putative loop code for single-stranded RNA-protein interactions. The codes are degenerate and corresponding messages are not only interspersed but actually overlap, so that some nucleotides belong to several messages simultaneously. Tandemly repeated sequences frequently considered as functionless “junk” are found to be grouped into certain classes of repeat unit lengths. This indicates some functional involvement of these sequences. A hypothesis is formulated according to which the tandem repeats are given the role of weak enhancer-silencers that modulate, in a copy number-dependent way, the expression of proximal genes. Fast amplification and elimination of the repeats provides an attractive mechanism of species adaptation to a rapidly changing environment.

2.
Code domains in tandem repetitive DNA sequence structures   Cited by: 6 (self-citations: 0, citations by others: 6)
Peter Vogt. Chromosoma, 1992, 101(10): 585-589
Traditionally, many people doing research in molecular biology attribute coding properties to a given DNA sequence if this sequence contains an open reading frame for translation into a sequence of amino acids. This protein coding capability of DNA was detected about 30 years ago. The underlying genetic code is highly conserved and present in every biological species studied so far. Today, it is obvious that DNA has a much larger coding potential for other important tasks. Apart from coding for specific RNA molecules such as rRNA, snRNA and tRNA molecules, specific structural and sequence patterns of the DNA chain itself express distinct codes for the regulation and expression of its genetic activity. A chromatin code has been defined for phasing of the histone-octamer protein complex in the nucleosome. A translation frame code has been shown to exist that determines correct triplet counting at the ribosome during protein synthesis. A loop code seems to organize the single stranded interaction of the nascent RNA chain with proteins during the splicing process, and a splicing code phases successive 5' and 3' splicing sites. Most of these DNA codes are not exclusively based on the primary DNA sequence itself, but also seem to include specific features of the corresponding higher order structures. Based on the view that these various DNA codes are genetically instructive for specific molecular interactions or processes, important in the nucleus during interphase and during cell division, the coding capability of tandem repetitive DNA sequences has recently been reconsidered.

3.
4.
If we define a genetic code as a widespread DNA sequence pattern that carries a message with an impact on biology, then there are multiple genetic codes. Sequences involved in these codes overlap and, thus, both interact with and constrain each other, such as for the triplet code, the intron-splicing code, the code for amphipathic alpha helices, and the chromatin code. Nucleosomes are preferentially located at the ends of exons, thus protecting splice junctions, with the N9 positions of guanines of the GT and AG junctions oriented toward the histones. Analysis of protein-coding sequences reveals numerous traces of tandem repeats, apparently formed by triplet expansion, which in effect is a genome inflation “code”. Our data are consistent with the hypothesis that expansion of simple tandem repetition of certain aggressive triplets has been a characteristic of life from its emergence. Such expanding triplets appear to be the major factor underlying observed codon usage biases.

5.
Herein we derive two genetic codes through which the primeval RNA code could have given rise to the standard genetic code (SGC). One of them, called extended RNA code type I, consists of all codons of the type RNY (purine-any base-pyrimidine) plus codons obtained by considering the RNA code but in the second (NYR type) and third (YRN type) reading frames. The extended RNA code type II comprises all codons of the type RNY plus codons that arise from transversions of the RNA code in the first (YNY type) and third (RNR type) nucleotide bases. In order to test whether putative nucleotide sequences in the RNA World and in both extended RNA codes share the same scaling and statistical properties as those encountered in current prokaryotes, we used the genomes of four Eubacteria and three Archaea. For each prokaryote, we obtained their respective genomes obeying the RNA code or the extended RNA codes types I and II. In each case, we estimated the scaling properties of triplet sequences via a renormalization group approach, and we calculated the frequency distributions of distances for each codon. Remarkably, the scaling properties of the distance series of some codons from the RNA code and most codons from both extended RNA codes turned out to be identical or very close to the scaling properties of codons of the SGC. To test the robustness of these results, we show, via computer simulation experiments, that random mutations of current genomes, at a rate of 10^-10 per site per year over three billion years, were not enough to destroy the observed patterns. Therefore, we conclude that most current prokaryotes may still contain relics of the primeval RNA World and that both extended RNA codes may well represent two plausible evolutionary paths between the RNA code and the current SGC.
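
The codon classes named in this abstract can be enumerated directly. The following is a minimal sketch, not the authors' code: it assumes the usual IUPAC meanings (R = purine A/G, Y = pyrimidine C/U, N = any base), and the names `codons`, `rna_code`, `extended_type_1` and `extended_type_2` are hypothetical, introduced only for illustration.

```python
from itertools import product

# IUPAC nucleotide classes used in the abstract (RNA alphabet).
CLASSES = {
    "R": "AG",    # purines
    "Y": "CU",    # pyrimidines
    "N": "ACGU",  # any base
}

def codons(pattern):
    """Return the set of codons matching a three-letter class pattern such as 'RNY'."""
    return {"".join(bases) for bases in product(*(CLASSES[symbol] for symbol in pattern))}

# Primeval RNA code: all RNY codons.
rna_code = codons("RNY")

# Extended RNA code type I: RNY plus the patterns seen in the
# second (NYR) and third (YRN) reading frames.
extended_type_1 = rna_code | codons("NYR") | codons("YRN")

# Extended RNA code type II: RNY plus transversions in the
# first (YNY) and third (RNR) base.
extended_type_2 = rna_code | codons("YNY") | codons("RNR")

print(len(rna_code), len(extended_type_1), len(extended_type_2))  # 16 48 48
```

Because the three patterns in each extended code are mutually exclusive, each union contains 3 × 16 = 48 of the 64 possible triplets.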

6.
7.
Selection for affinity for free histidine yields a single RNA aptamer, which was isolated 54 times independently. This RNA is highly specific for the side chain and binds protonated L-histidine with 10^2–10^3-fold stereoselectivity and a dissociation constant (KD) of 8–54 μM in different isolates. These histidine-binding RNAs have a common internal loop–hairpin loop structure, based on a conserved RAAGUGGGKKN(0–36) AUGUN(0–2)AGKAACAG sequence. Notably, the repetitively isolated sequence contains two histidine anticodons, both implicated by conservation and chemical data in amino acid affinity. This site is probably the simplest structure that can meet our histidine affinity selection, which strengthens experimental support for a “stereochemical” origin of the genetic code. [Reviewing Editor: Niles Lehman]

8.
9.
A parsimony analysis of 133 sequences of the nuclear ribosomal DNA ITS1 + 5.8S + ITS2 region from 71 taxa in Armeria was carried out. The presence of additive polymorphic sites (APS; occurring in 14 accessions) fits the reticulate scenario proposed in previous work for explaining the ITS pattern of variation on a much smaller scale and is based mainly on the geographical structure of the data, irrespective of taxonomic boundaries. Despite the relatively low bootstrap values and large polytomies, part of which are likely due to disruptive effects of reticulation and concerted evolution in these multicopy sequences, the ITS analysis has phylogenetic and biogeographic implications. APS detected in this study are consistent with hypothesized hybridization events, although biased concerted evolution, previously documented in the genus, needs to be invoked for specific cases and may be responsible for a possible “sink” effect in terminals from a large clade. The causes for sequences of the same species appearing in different clades (here termed transclade) are discussed.

10.
Coding plays a universal and pervasive role in biological organization, in forms such as genetic coding (DNA to protein translation), RNA processing, gene regulation, protein modification, cell signalling, immune responses, epigenetic development and natural language. Nevertheless, the ways and means by which organic codes are formed and used are still poorly understood. A formal model is presented in this paper to investigate the emergence of conventional codes among code users. The relationship between the formation and the usage of codes is discussed, and a biological mechanism involving coding is identified in the context of the immune system.

11.
12.
We performed 3′ RNA sequence analyses of [32P]pCp-end-labeled La Crosse (LAC) virus, alternate LAC virus isolate L74, and snowshoe hare bunyavirus large (L), medium (M), and small (S) negative-stranded viral RNA species to determine the coding capabilities of these species. These analyses were confirmed by dideoxy primer extension studies in which we used a synthetic oligodeoxynucleotide primer complementary to the conserved 3′-terminal decanucleotide of the three viral RNA species (Clerx-van Haaster and Bishop, Virology 105:564-574, 1980). The deduced sequences predicted translation of two S-RNA gene products that were read in overlapping reading frames. So far, only single contiguous open reading frames have been identified for the viral M- and L-RNA species. For the negative-stranded M-RNA species of all three viruses, the single reading frame developed from the first 3′-proximal UAC triplet. Likewise, for the L-RNA of the alternate LAC isolate, a single open reading frame developed from the first 3′-proximal UAC triplet. The corresponding L-RNA sequences of prototype LAC and snowshoe hare viruses initiated open reading frames; however, for both viral L-RNA species there was a preceding 3′-proximal UAC triplet in another reading frame that was followed shortly afterward by a termination codon. A comparison of the sequence data obtained for snowshoe hare virus, LAC virus, and the alternate LAC virus isolate showed that the identified nucleotide substitutions were sufficient to account for some of the fingerprint differences in the L-, M-, and S-RNA species of the three viruses. Unlike the distribution of the L- and M-RNA substitutions, significantly fewer nucleotide substitutions occurred after the initial UAC triplet of the S-RNA species than before this triplet, implying that the overlapping genes of the S RNA provided a constraint against evolution by point mutation. 
The comparative sequence analyses predicted amino acid differences among the corresponding L-, M-, and S-RNA gene products of snowshoe hare virus and the two LAC virus isolates.

13.
Ribosomal frameshifting is used by various organisms to maximize protein coding potential of genomic sequences. It is commonly exploited by RNA viruses to overcome the constraint of their limited genome size. Frameshifting requires specific RNA structural features, such as a suitable heptanucleotide “slippery” sequence and an RNA pseudoknot. Previous genomic analysis of HIV-1 indicated the potential for several hidden genes encoded through frameshifting; one of these, overlapping the envelope gene, has an RNA pseudoknot just downstream from a slippery sequence, AAAAAGA, that features an adenine quadruplet prior to a potential hungry arginine codon (AGA). This env-frameshift (env-fs) gene has been shown to encode a truncated glutathione peroxidase homologue, with both antioxidant and anti-apoptotic activities in transfected cells. Using a dual reporter cell-based frameshift assay, we demonstrate that the env-fs frameshift sequence is active in vitro. Furthermore, in arginine-deficient media, env-fs frameshifting increased over 100% (p < 0.005), consistent with the hypothesized hungry codon mechanism. As a response to arginine deficiency, increased expression of the antioxidant viral GPx gene (env-fs) by upregulation of frameshifting could be protective to HIV-infected cells, as a countermeasure to the increased oxidative stress induced by arginine deficiency (because NO is a known scavenger of hydroxyl radical).

14.

Background

The degeneracy of the genetic code makes it possible for the same amino acid string to be coded by different messenger RNA (mRNA) sequences. These “synonymous mRNAs” may differ largely in a number of aspects related to their overall translational efficiency, such as secondary structure content and availability of the encoded transfer RNAs (tRNAs). Consequently, they may render different yields of the translated polypeptides. These mRNA features related to translation efficiency are also playing a role locally, resulting in a non-uniform translation speed along the mRNA, which has been previously related to some protein structural features and also used to explain some dramatic effects of “silent” single-nucleotide-polymorphisms (SNPs). In this work we perform the first large scale analysis of the relationship between three experimental proxies of mRNA local translation efficiency and the local features of the corresponding encoded proteins.

Results

We found that a number of protein functional and structural features are reflected in the patterns of ribosome occupancy, secondary structure and tRNA availability along the mRNA. One or more of these proxies of translation speed have distinctive patterns around the mRNA regions coding for certain protein local features. In some cases the three patterns follow a similar trend. We also show specific examples where these patterns of translation speed point to the protein’s important structural and functional features.

Conclusions

This supports the idea that the genome not only codes the protein functional features as sequences of amino acids, but also as subtle patterns of mRNA properties which, probably through local effects on the translation speed, have some consequence on the final polypeptide. These results open the possibility of predicting a protein’s functional regions based on a single genomic sequence, and have implications for heterologous protein expression and fine-tuning protein function.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1734-7) contains supplementary material, which is available to authorized users.
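
The Background of this abstract notes that code degeneracy lets many "synonymous mRNAs" encode the same amino acid string. As a rough illustration of how fast that number grows, the sketch below multiplies the number of codons available for each residue in the standard genetic code. It is not taken from the article; the names `CODON_DEGENERACY` and `count_synonymous_mrnas` are hypothetical, and stop codons are ignored.

```python
from math import prod

# Number of codons per amino acid in the standard genetic code
# (one-letter amino acid symbols; the values sum to the 61 sense codons).
CODON_DEGENERACY = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2,
    "Q": 2, "E": 2, "G": 4, "H": 2, "I": 3,
    "L": 6, "K": 2, "M": 1, "F": 2, "P": 4,
    "S": 6, "T": 4, "W": 1, "Y": 2, "V": 4,
}

def count_synonymous_mrnas(peptide):
    """Count the distinct coding sequences that translate to the given peptide."""
    return prod(CODON_DEGENERACY[residue] for residue in peptide.upper())

print(count_synonymous_mrnas("MRL"))  # 1 * 6 * 6 = 36
```

Even a tripeptide such as Met-Arg-Leu has 36 synonymous coding sequences, which is why properties like secondary structure and tRNA availability can vary widely among mRNAs for the same protein.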

15.
In a construct containing a GUS reporter gene driven by the 5′ regulatory elements from rubi3, expression was enhanced 4-fold when a 20-nucleotide (nt) GUS 5′ untranslated sequence was replaced with 9 nt sequences derived from rubi3's second exon. The roles of the sequences immediately upstream from the GUS translation initiation codon, and their significance in gene expression, were investigated. Sequence analysis suggests that complementarity between sequences immediately 5′ of a translation initiation codon and the rice 17S rRNA may be responsible for the reduction in protein levels from constructs containing the GUS leader sequence. The results demonstrate the effect that sequences immediately upstream from transgenic coding sequences have on expression, particularly when the rubi3 5′ regulatory sequence is used.

16.
Protein coding sequences carry an additional message in the form of a universal three-base periodical pattern (G-non-G-N)n, which is expressed as a strong preference for guanines in the first positions of the codons in mRNA and lack of guanines in the second positions. This periodicity appears immediately after the initiation codon and is maintained along the mRNA as far as the termination triplet, where it disappears abruptly. Known cases of ribosome slippage during translation (leaky frameshifts, out-of-frame gene fusion) are analyzed. At the sites of the slippage the G-periodical pattern is found to be interrupted. It reappears downstream from the slippage sites, in a new frame that corresponds to the new translation frame. This suggests that the (G-non-G-N)n pattern in the mRNA may be responsible for monitoring the correct reading frame during translation. Several sites with complementary C-periodical structure are found in the Escherichia coli 16S rRNA sequence. Only three of them are exposed to various interactions at the surface of the small ribosomal subunit: (517)gcCagCagCcgC, (1395)caCacCgcC and (1531)auCacCucC. A model of a frame-monitoring mechanism is suggested based on the weak complementarity of G-periodical mRNA to the C-periodical sites in the ribosomal RNA. The model is strongly supported by the fact that the hypothetical frame-monitoring sites in the 16S rRNA that are derived from the nucleotide sequence analysis are also the only sites known to be actually involved or implicated in rRNA-mRNA interactions.
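
The (G-non-G-N)n periodicity described in this abstract can be checked on any coding sequence by tallying guanines at each codon position. The sketch below is only an illustration under simple assumptions, not the authors' analysis: the function names `guanine_by_position` and `best_frame` are hypothetical, the toy sequence is made up, and the frame score (G frequency at position 1 minus position 2) is a simplification chosen for this example.

```python
def guanine_by_position(mrna, frame=0):
    """Return the fraction of guanines at codon positions 1, 2 and 3 for a given frame."""
    seq = mrna.upper().replace("T", "U")
    n_codons = (len(seq) - frame) // 3
    if n_codons <= 0:
        return [0.0, 0.0, 0.0]
    counts = [0, 0, 0]
    for i in range(n_codons):
        codon = seq[frame + 3 * i : frame + 3 * i + 3]
        for position, base in enumerate(codon):
            if base == "G":
                counts[position] += 1
    return [count / n_codons for count in counts]

def best_frame(mrna):
    """Pick the frame (0, 1 or 2) with the strongest G-non-G signal (assumed score)."""
    scores = {}
    for frame in range(3):
        g1, g2, _ = guanine_by_position(mrna, frame)
        scores[frame] = g1 - g2  # high G at position 1, low G at position 2
    return max(scores, key=scores.get)

toy = "GCAGAUGCCGUA"  # made-up sequence: codons GCA GAU GCC GUA
print(guanine_by_position(toy))  # [1.0, 0.0, 0.0]
print(best_frame(toy))           # 0
```

On real sequences the signal is statistical rather than absolute, so a sliding-window version of the same tally would be needed to locate a frame change such as the slippage sites mentioned above.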

17.
Mono-ADP-ribosylation is one of the posttranslational protein modifications regulating cellular metabolism, e.g., nitrogen fixation, in prokaryotes. Several bacterial toxins mono-ADP-ribosylate and inactivate specific proteins in their animal hosts. Recently, two mammalian GPI-anchored cell surface enzymes with similar activities were cloned (designated ART1 and ART2). We have now identified six related expressed sequence tags (ESTs) in the public database and cloned the two novel human genes from which these are derived (designated ART3 and ART4). The deduced amino acid sequences of the predicted gene products show 28% sequence identity to one another and 32–41% identity vs the muscle and T cell enzymes. They contain signal peptide sequences characteristic of GPI anchorage. Southern Zoo blot analyses suggest the presence of related genes in other mammalian species. By PCR screening of somatic cell hybrids and by in situ hybridization, we have mapped the two genes to human chromosomes 4p14–p15.1 and 12q13.2–q13.3. Northern blot analyses show that these genes are specifically expressed in testis and spleen, respectively. Comparison of genomic and cDNA sequences reveals a conserved exon/intron structure, with an unusually large exon encoding the predicted mature membrane proteins. Secondary structure prediction analyses indicate conserved motifs and amino acid residues consistent with a common ancestry of this emerging mammalian enzyme family and bacterial mono(ADP-ribosyl)transferases. It is possible that the four human gene family members identified so far represent the “tip of an iceberg,” i.e., a larger family of enzymes that influences the function of target proteins via mono-ADP-ribosylation.

18.
The canonical genetic code has been reported both to be error minimizing and to show stereochemical associations between coding triplets and binding sites. In order to test whether these two properties are unexpectedly overlapping, we generated 200,000 randomized genetic codes using each of five randomization schemes, with and without randomization of stop codons. Comparison of the code error (difference in polar requirement for single-nucleotide codon interchanges) with the coding triplet concentrations in RNA binding sites for eight amino acids shows that these properties are independent and uncorrelated. Thus, one is not the result of the other, and error minimization and triplet associations probably arose independently during the history of the genetic code. We explicitly show that prior fixation of a stereochemical core is consistent with an effective later minimization of error. [Reviewing Editor: Dr. Stephen Freeland]

19.
Several models have been advanced, both in this journal and others, for the development of the genetic code and translation apparatus. Eigen in particular has put forward a detailed model based on the hypercycle. This paper uses some of these previous ideas to develop a new model of the code and translation in which the pairs AU and GC play complementary roles, and in which tRNAs develop from a molecule with two loops which stacks in repetitive patterns without the need for a messenger RNA. Thus a bridge is provided between random (or autocatalytic) polymerization and coded translation. In addition, alternative postulates to several of Eigen's ideas are tested by computer simulation.

20.