Similar literature
Found 20 similar articles; search took 15 ms
1.
2.
An overview is presented of the status of studies on multiple codes in genetic sequences. Indirectly, the existence of multiple codes is recognized in the form of several rediscoveries of a Second Genetic Code that is different each time. Due credit is given to earlier seminal work on the codes that is often neglected in the literature. The latest developments in the field of the chromatin code are discussed, as well as the prospects for single-base-resolution studies of nucleosome positioning, including the rotational setting of DNA on the surface of the histone octamers.

3.
It is well known that sequences of bases in DNA are translated into sequences of amino acids in cells via the genetic code. More recently, it has been discovered that the sequence of DNA bases also influences the geometry and deformability of the DNA. These two correspondences represent a naturally arising example of duplexed codes, providing two different ways of interpreting the same DNA sequence. This paper will set up the notation and basic results necessary to mathematically investigate the relationship between these two natural DNA codes. It then undertakes two very different such investigations: one graphical approach based only on expected values and another analytic approach incorporating the deformability of the DNA molecule and approximating the mutual information of the two codes. Special emphasis is paid to whether there is evidence that pressure to maximize the duplexing efficiency influenced the evolution of the genetic code. Disappointingly, the results fail to support the hypothesis that the genetic code was influenced in this way. In fact, applying both methods to samples of realistic alternative genetic codes shows that the duplexing of the genetic code found in nature is just slightly less efficient than average. The implications of this negative result are considered in the final section of the paper.

4.
The multiple codes of nucleotide sequences   Total citations: 4, self-citations: 0, citations by others: 4
Nucleotide sequences carry genetic information of many different kinds, not just instructions for protein synthesis (triplet code). Several codes of nucleotide sequences are discussed including: (1) the translation framing code, responsible for correct triplet counting by the ribosome during protein synthesis; (2) the chromatin code, which provides instructions on appropriate placement of nucleosomes along the DNA molecules and their spatial arrangement; (3) a putative loop code for single-stranded RNA-protein interactions. The codes are degenerate and corresponding messages are not only interspersed but actually overlap, so that some nucleotides belong to several messages simultaneously. Tandemly repeated sequences frequently considered as functionless “junk” are found to be grouped into certain classes of repeat unit lengths. This indicates some functional involvement of these sequences. A hypothesis is formulated according to which the tandem repeats are given the role of weak enhancer-silencers that modulate, in a copy number-dependent way, the expression of proximal genes. Fast amplification and elimination of the repeats provides an attractive mechanism of species adaptation to a rapidly changing environment.

6.
The genetic code provides the translation table necessary to transform the information contained in DNA into the language of proteins. In this table, a correspondence between each codon and each amino acid is established: tRNA is the main adaptor that links the two. Although the genetic code is nearly universal, several variants of this code have been described in a wide range of nuclear and organellar systems, especially in metazoan mitochondria. These variants are generally found by searching for conserved positions that consistently code for a specific alternative amino acid in a new species. We have devised an accurate computational method to automate these comparisons, and have tested it with 626 metazoan mitochondrial genomes. Our results indicate that several arthropods have a new genetic code and translate the codon AGG as lysine instead of serine (as in the invertebrate mitochondrial genetic code) or arginine (as in the standard genetic code). We have investigated the evolution of the genetic code in the arthropods and found several events of parallel evolution in which the AGG codon was reassigned between serine and lysine. Our analyses also revealed correlated evolution between the arthropod genetic codes and the tRNA-Lys/-Ser, which show specific point mutations at the anticodons. These rather simple mutations, together with a low usage of the AGG codon, might explain the recurrence of the AGG reassignments.
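The codon reassignment described in this abstract can be illustrated with a small translation-table sketch. This is not the authors' detection method, which the abstract does not specify; it only shows how a variant code differs from the standard one. The names `translate`, `INVERTEBRATE_MITO`, and `ARTHROPOD_VARIANT` are hypothetical, and the invertebrate mitochondrial differences are assumed to be the four usually listed for that code (AGA and AGG as serine, ATA as methionine, TGA as tryptophan).

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in T, C, A, G order.
BASES = "TCAG"
STANDARD_AA = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD_CODE = dict(
    zip(("".join(c) for c in product(BASES, repeat=3)), STANDARD_AA)
)

# Assumed differences of the invertebrate mitochondrial code from the standard code.
INVERTEBRATE_MITO = {"AGA": "S", "AGG": "S", "ATA": "M", "TGA": "W"}

# Hypothetical variant for illustration: AGG read as lysine, as the abstract reports.
ARTHROPOD_VARIANT = {**INVERTEBRATE_MITO, "AGG": "K"}


def translate(dna, overrides=None):
    """Translate a DNA string codon by codon, applying optional reassignments."""
    table = {**STANDARD_CODE, **(overrides or {})}
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(table[dna[i:i + 3]] for i in range(0, usable, 3))
```

For example, `translate("ATGAGGTGA")` uses the standard table, while passing `ARTHROPOD_VARIANT` as `overrides` changes how AGG and TGA are read in the same sequence.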

7.
Code domains in tandem repetitive DNA sequence structures   Total citations: 6, self-citations: 0, citations by others: 6
Peter Vogt 《Chromosoma》1992,101(10):585-589
Traditionally, many people doing research in molecular biology attribute coding properties to a given DNA sequence if this sequence contains an open reading frame for translation into a sequence of amino acids. This protein coding capability of DNA was detected about 30 years ago. The underlying genetic code is highly conserved and present in every biological species studied so far. Today, it is obvious that DNA has a much larger coding potential for other important tasks. Apart from coding for specific RNA molecules such as rRNA, snRNA and tRNA molecules, specific structural and sequence patterns of the DNA chain itself express distinct codes for the regulation and expression of its genetic activity. A chromatin code has been defined for phasing of the histone-octamer protein complex in the nucleosome. A translation frame code has been shown to exist that determines correct triplet counting at the ribosome during protein synthesis. A loop code seems to organize the single stranded interaction of the nascent RNA chain with proteins during the splicing process, and a splicing code phases successive 5' and 3' splicing sites. Most of these DNA codes are not exclusively based on the primary DNA sequence itself, but also seem to include specific features of the corresponding higher order structures. Based on the view that these various DNA codes are genetically instructive for specific molecular interactions or processes, important in the nucleus during interphase and during cell division, the coding capability of tandem repetitive DNA sequences has recently been reconsidered.

8.
If we define a genetic code as a widespread DNA sequence pattern that carries a message with an impact on biology, then there are multiple genetic codes. Sequences involved in these codes overlap and, thus, both interact with and constrain each other, such as for the triplet code, the intron-splicing code, the code for amphipathic alpha helices, and the chromatin code. Nucleosomes preferentially are located at the ends of exons, thus protecting splice junctions, with the N9 positions of guanines of the GT and AG junctions oriented toward the histones. Analysis of protein-coding sequences reveals numerous traces of tandem repeats, apparently formed by triplet expansion, which in effect is a genome inflation "code". Our data are consistent with the hypothesis that expansion of simple tandem repetition of certain aggressive triplets has been a characteristic of life from its emergence. Such expanding triplets appear to be the major factor underlying observed codon usage biases.

9.
It has been suggested that tRNA acceptor stems specify an operational RNA code for amino acids. In the last 20 years several attributes of the putative code have been elucidated for a small number of model organisms. To gain insight about the ensemble attributes of the code, we analyzed 4925 tRNA sequences from 102 bacterial and 21 archaeal species. Here, we used a classification and regression tree (CART) methodology, and we found that the degrees of degeneracy or specificity of the RNA codes in both Archaea and Bacteria differ from those of the genetic code. We found instances of taxon-specific alternative codes, i.e., identical acceptor stem determinants encrypting different amino acids in different species, as well as instances of ambiguity, i.e., identical acceptor stem determinants encrypting two or more amino acids in the same species. When partitioning the data by class of synthetase, the degree of code ambiguity was significantly reduced. In cryptographic terms, a plausible interpretation of this result is that the class distinction in synthetases is an essential part of the decryption rules for resolving the subset of RNA code ambiguities enciphered by identical acceptor stem determinants of tRNAs acylated by enzymes belonging to the two classes. In evolutionary terms, our findings lend support to the notion that in the pre-DNA world, interactions between tRNA acceptor stems and synthetases formed the basis for the distinction between the two classes; hence, ambiguities in the ancient RNA code were pivotal for the fixation of these enzymes in the genomes of ancestral prokaryotes.

10.
Reprogramming of the standard genetic code to include non-canonical amino acids (ncAAs) opens new prospects for medicine, industry, and biotechnology. There are several methods of code engineering, which allow us to store new genetic information in DNA sequences and to produce proteins with new properties. Here, we provide a theoretical background for optimal genetic code expansion, which may find application in the experimental design of the genetic code. We assumed that the expanded genetic code includes both canonical and non-canonical information stored in the 64 classical codons. In addition, the new coding system is robust to point mutations and minimizes the possibility of reversion from the new to the old information. In order to find such codes, we applied graph theory to analyze the properties of optimal codon sets. We present a formal procedure for finding the optimal codes with various numbers of vacant codons that could be assigned to new amino acids. Finally, we discuss the optimal number of newly incorporated ncAAs and the optimal size of the codon groups that can be assigned to ncAAs.

11.
The universally valid genetic code is the final result of a multi-stage course of development. Degeneracy, as an important property of the genetic code, was possibly not yet present in the earliest code, first appearing at a later stage of development (Code III). Possibly this step in development is coupled with the presence of a total of four amino acid groups (L, I, E, F). Each group contains a specific number of amino acids (AL, AI, AE, AF). The amino acid groups are:
- (L) hydrophobic
- (I) weakly hydrophobic, or polar but uncharged
- (E) hydrophilic, acidic
- (F) hydrophilic, basic
- (D) hydrophobic, aromatic (only in Code IV and Code M; this group is not considered in the calculations below)
In a subsequent stage of development the number of amino acids increases further. At the same time the code becomes more degenerate. The universal genetic code is characterized by three constants of degeneracy. Its immediate predecessor has linear degeneration with two constants. The mitochondrial code represents a transitional form between these two codes.

12.
We propose the existence of a relationship of stereochemical complementarity between gene sequences that code for interacting components: nucleic acid-nucleic acid, protein-protein and protein-nucleic acid. Such a relationship would impose evolutionary constraints on the DNA sequences themselves, thus retaining these sequences and governing the direction of the evolutionary process. Therefore, we propose that prebiotic, template-directed autocatalytic synthesis of mutually cognate peptides and polynucleotides resulted in their amplification and evolutionary conservation in contemporary prokaryotic and eukaryotic organisms as a genetic regulatory apparatus. If this proposal is correct, then the relationships between the sequences in DNA coding for these interactions constitute a life code of which the genetic code is only one aspect of the many related interactions encoded in DNA.

13.
Since a genome is a discrete sequence, the elements of which belong to a set of four letters, the question as to whether or not there is an error-correcting code underlying DNA sequences is unavoidable. The most common approach to answering this question is to propose a methodology to verify the existence of such a code. However, none of the methodologies proposed so far, although quite clever, has achieved that goal. In a recent work, we showed that DNA sequences can be identified as codewords in a class of cyclic error-correcting codes known as Hamming codes. In this paper, we show that a complete intron-exon gene, and even a plasmid genome, can be identified as a Hamming code codeword as well. Although this does not constitute a definitive proof that there is an error-correcting code underlying DNA sequences, it is the first evidence in this direction.
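To make the notion of "being a Hamming codeword" concrete, here is a minimal sketch of the textbook binary Hamming(7,4) syndrome check. It is only an illustration of the general idea: the abstract does not describe the authors' construction or how nucleotides are mapped to code symbols, so nothing below should be read as their method. The function names `syndrome` and `correct_single_error` are hypothetical.

```python
def syndrome(word):
    """Return the syndrome of a 7-bit word under the binary Hamming(7,4) code.

    The parity-check matrix is taken to have the binary representation of the
    position (1..7) as its columns, so the syndrome is the XOR of the positions
    holding a 1. A zero syndrome means the word is a codeword; a nonzero value
    is the position of a single flipped bit.
    """
    if len(word) != 7:
        raise ValueError("expected exactly 7 bits")
    s = 0
    for position, bit in enumerate(word, start=1):
        if bit:
            s ^= position
    return s


def correct_single_error(word):
    """Return a copy of the word with a single-bit error (if any) flipped back."""
    fixed = list(word)
    s = syndrome(fixed)
    if s:
        fixed[s - 1] ^= 1
    return fixed
```

For instance, a word with ones only at positions 1, 2 and 3 has a zero syndrome and is therefore a codeword; flipping any one bit yields a syndrome equal to the flipped position.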

14.
M A Santos, V M Perreau, M F Tuite 《The EMBO journal》1996,15(18):5060-5068
The human pathogenic yeast Candida albicans and a number of other Candida species translate the standard leucine CUG codon as serine. This is the latest addition to an increasing number of alterations to the standard genetic code which invalidate the theory that the code is frozen and universal. The unexpected finding that some organisms evolved alternative genetic codes raises two important questions: how have these alternative codes evolved and what evolutionary advantages could they create to allow for their selection? To address these questions in the context of serine CUG translation in C. albicans, we have searched for unique structural features in seryl-tRNA(CAG), which translates the leucine CUG codon as serine, and attempted to reconstruct the early stages of this genetic code switch in the closely related yeast species Saccharomyces cerevisiae. We show that a purine at position 33 (G33) in the C. albicans Ser-tRNA(CAG) anticodon loop, which replaces a conserved pyrimidine found in all other tRNAs, is a key structural element in the reassignment of the CUG codon from leucine to serine in that it decreases the decoding efficiency of the tRNA, thereby allowing cells to survive low level serine CUG translation. Expression of this tRNA in S. cerevisiae induces the stress response which allows cells to acquire thermotolerance. We argue that acquisition of thermotolerance may represent a positive selection for this genetic code change by allowing yeasts to adapt to sudden changes in environmental conditions and therefore colonize new ecological niches.

15.
Summary: We lay new foundations to the hypothesis that the genetic code is adapted to evolutionary retention of information in the antisense strands of natural DNA/RNA sequences. In particular, we show that the genetic code exhibits, beyond the neutral replacement patterns of amino acid substitutions, optimal properties by favoring simultaneous evolution of proteins encoded in DNA/RNA sense-antisense strands. This is borne out in the sense-antisense transformations of the codons of every amino acid which target amino acids physicochemically similar to each other. Moreover, silent mutations in the sense strand generate conservative ones in its antisense counterpart and vice versa. Coevolution of proteins coded by complementary strands is shown to be a definite possibility, a result which does not depend on any physical interaction between the coevolving proteins. Likewise, the degree to which the present genetic code is dedicated to evolutionary sense-antisense tolerance is demonstrated by comparison with many randomized codes. Double-strand coding is quantified from an information-theoretical point of view.

16.
In many eukaryotic genomes only a small fraction of the DNA codes for proteins, but the non-protein coding DNA harbors important genetic elements directing the development and the physiology of the organisms, like promoters, enhancers, insulators, and micro-RNA genes. The molecular evolution of these genetic elements is difficult to study because their functional significance is hard to deduce from sequence information alone. Here we propose an approach to the study of the rate of evolution of functional non-coding sequences at a macro-evolutionary scale. We identify functionally important non-coding sequences as Conserved Non-Coding Nucleotide (CNCN) sequences from the comparison of two outgroup species. The CNCN sequences so identified are then compared to their homologous sequences in a pair of ingroup species, and we monitor the degree of modification these sequences suffered in the two ingroup lineages. We propose a method to test for rate differences in the modification of CNCN sequences among the two ingroup lineages, as well as a method to estimate their rate of modification. We apply this method to the full sequences of the HoxA clusters from six gnathostome species: a shark, Heterodontus francisci; a basal ray finned fish, Polypterus senegalus; the amphibian, Xenopus tropicalis; as well as three mammalian species, human, rat and mouse. The results show that the evolutionary rate of CNCN sequences is not distinguishable among the three mammalian lineages, while the Xenopus lineage has a significantly increased rate of evolution. Furthermore the estimates of the rate parameters suggest that in the stem lineage of mammals the rate of CNCN sequence evolution was more than twice the rate observed within the placental amniotes clade, suggesting a high rate of evolution of cis-regulatory elements during the origin of amniotes and mammals. 
We conclude that the proposed methods can be used for testing hypotheses about the rate and pattern of evolution of putative cis-regulatory elements.

17.
Liu C  Shi L  Xu X  Li H  Xing H  Liang D  Jiang K  Pang X  Song J  Chen S 《PloS one》2012,7(5):e35146
The DNA barcoding technology uses a standard region of DNA sequence for species identification and discovery. At present, "DNA barcode" actually refers to DNA sequences, which are not amenable to information storage, recognition, and retrieval. Our aim is to identify the best symbology that can represent DNA barcode sequences in practical applications. A comprehensive set of sequences for five DNA barcode markers ITS2, rbcL, matK, psbA-trnH, and CO1 was used as the test data. Fifty-three different types of one-dimensional and ten two-dimensional barcode symbologies were compared based on different criteria, such as coding capacity, compression efficiency, and error detection ability. The quick response (QR) code was found to have the largest coding capacity and relatively high compression ratio. To facilitate the further usage of QR code-based DNA barcodes, a web server was developed and is accessible at http://qrfordna.dnsalias.org. The web server allows users to retrieve the QR code for a species of interest, convert a DNA sequence to and from a QR code, and perform species identification based on local and global sequence similarities. In summary, the first comprehensive evaluation of various barcode symbologies has been carried out. The QR code has been found to be the most appropriate symbology for DNA barcode sequences. A web server has also been constructed to allow biologists to utilize QR codes in practical DNA barcoding applications.

18.
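Since 1936, when the bacteriologist Buchanan took charge of drafting a dedicated code of bacterial nomenclature, the International Code of Nomenclature of Prokaryotes (ICNP) has, through continual development and refinement, actively promoted the development of prokaryotic taxonomy and related disciplines. With the application of omics technologies to the study of prokaryotic diversity, more and more new uncultivated taxa of bacteria and archaea have been discovered, yet they cannot obtain validly published names because the ICNP requires living biological material as the nomenclatural type. In 2022, the Code of Nomenclature of Prokaryotes Described from Sequence Data (SeqCode) was formally released to compensate for the ICNP's shortcomings in naming uncultivated microbial taxa. The SeqCode does not seek major divergence from the ICNP and preserves, as far as possible, the possibility of merging with the ICNP in the future. However, with the two codes operating independently, it is not yet clear what effect the coexistence of the SeqCode and the ICNP will have on the academic community. This article systematically introduces the development history and main content of the ICNP and the SeqCode, analyzes the strengths and limitations of each, and calls on scholars in microbiology-related fields to pay attention to the codes of prokaryotic nomenclature and apply them in practice, with a view to building a more rational and effective system of prokaryotic names.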

19.
The codon-degeneracy model (CDM) predicts that patterns of nucleotide substitution in protein-coding genes are largely determined by the relative frequencies of four-fold (4f), two-fold, and non-degenerate sites, the attributes of which are determined by the structure of the governing genetic code. The CDM thus further predicts that genetic codes with alternative structures will "filter" molecular evolution differentially. A method, therefore, is presented by which the CDM may be applied to the unique structure of any genetic code. The mathematical relationship between the proportion of transitions at 4f degenerate nucleotide sites and the transition-to-transversion ratio is described. Predictions for five individual genetic codes, relative to the relationship between code structure and expected patterns of nucleotide substitution, are clearly defined. To test this "filter" hypothesis of genetic codes, simulated DNA sequence data sets were generated with a variety of input parameter values to estimate the relationship between patterns of nucleotide substitution and best-fit estimates of transition bias at 4f degenerate sites for both the universal genetic code and the vertebrate mitochondrial genetic code. These analyses confirm the prediction of the CDM that, all else being equal, even small differences in the structure of alternative genetic codes may result in significant shifts in the overall pattern of nucleotide substitution.
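The dependence of degenerate-site frequencies on code structure can be sketched by tallying, for every codon, how many third-position bases leave the encoded amino acid unchanged. This is an illustrative calculation only, not the CDM itself, whose formulas the abstract does not give. The names `make_code`, `third_position_degeneracy`, and `VERTEBRATE_MITO` are hypothetical, and the vertebrate mitochondrial code is assumed to differ from the standard code at four codons (AGA and AGG as stops, ATA as methionine, TGA as tryptophan).

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated with bases in T, C, A, G order.
BASES = "TCAG"
STANDARD_AA = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Assumed differences of the vertebrate mitochondrial code from the standard code.
VERTEBRATE_MITO = {"AGA": "*", "AGG": "*", "ATA": "M", "TGA": "W"}


def make_code(overrides=None):
    """Build a codon -> amino acid table from the standard code plus overrides."""
    codons = ["".join(c) for c in product(BASES, repeat=3)]
    table = dict(zip(codons, STANDARD_AA))
    table.update(overrides or {})
    return table


def third_position_degeneracy(table):
    """Count codons by the number of third-position bases giving the same product.

    A value of 4 marks a four-fold degenerate third position, 2 a two-fold one,
    and 1 a non-degenerate one (stop codons are treated like any other symbol).
    """
    counts = Counter()
    for codon, aa in table.items():
        same = sum(table[codon[:2] + base] == aa for base in BASES)
        counts[same] += 1
    return dict(counts)
```

Under these assumptions both codes have 32 codons with four-fold degenerate third positions, but the standard code also has three-fold and non-degenerate third positions, whereas in the vertebrate mitochondrial table every remaining codon is two-fold degenerate.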

20.
The DNA sequence of approximately 80% of the transcribed region of the kinetoplast maxicircle DNA of Leishmania tarentolae was obtained, and structural genes were localized by comparison of the translated amino acid sequences with those of known mitochondrial genes from other organisms. By this method, the genes for cytochrome oxidase subunits I, II, and III, cytochrome b, and human mitochondrial unidentified reading frames 4 and 5 were identified. By comparing the amino acid sequences of the putative L. tarentolae genes with those of known genes, we conclude that TGA codes for tryptophan, as in most other mitochondrial systems. This is the only apparent change from the universal genetic code. The six identified structural genes show various degrees of divergence from the homologous genes in other species, with cytochrome oxidase subunit I being the most conserved and cytochrome oxidase subunit III being the least conserved. A comparison of the cytochrome b genes from L. tarentolae and Trypanosoma brucei showed that the ratio of transversions to transitions is 1:1, suggesting that these species diverged from each other more than 80 × 10^6 years ago. Several as yet unidentified open reading frames were also present in the maxicircle sequence. These data confirm that maxicircle DNA has a coding potential which typifies other mitochondrial systems.
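The transversion-to-transition comparison mentioned in this abstract can be sketched as a simple count over two aligned sequences. This is a generic illustration rather than the authors' procedure; the function name `count_ts_tv` is hypothetical, and the sequences are assumed to be pre-aligned and of equal length, with any non-ACGT characters (such as gaps) skipped.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def count_ts_tv(seq_a, seq_b):
    """Count transitions and transversions between two aligned DNA sequences.

    A transition swaps purine for purine (A<->G) or pyrimidine for pyrimidine
    (C<->T); every other substitution is a transversion. Positions where either
    sequence holds a non-ACGT character are ignored.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == b or a not in "ACGT" or b not in "ACGT":
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    return transitions, transversions
```

A 1:1 ratio, as reported for the cytochrome b comparison, would correspond to the two returned counts being equal.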
