Similar articles
 20 similar articles found (search time: 15 ms)
1.
Codon catalog usage and the genome hypothesis. (citations: 34 total, 31 self, 34 by others)
Frequencies for each of the 61 amino acid codons have been determined in every published mRNA sequence of 50 or more codons. The frequencies are shown for each kind of genome and for each individual gene. A surprising consistency of choices exists among genes of the same or similar genomes. Thus each genome, or kind of genome, appears to possess a "system" for choosing between codons. Frameshift genes, however, have widely different choice strategies from normal genes. Our work indicates that the main factors distinguishing between mRNA sequences relate to choices among degenerate bases. These systematic third base choices can therefore be used to establish a new kind of genetic distance, which reflects differences in coding strategy. The choice patterns we find seem compatible with the idea that the genome and not the individual gene is the unit of selection. Each gene in a genome tends to conform to its species' usage of the codon catalog; this is our genome hypothesis.
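The codon tabulation described in this abstract can be illustrated with a minimal Python sketch. It assumes an in-frame coding sequence and the standard genetic code; the function names `codon_frequencies` and `usage_distance` are hypothetical, and the squared-difference distance is only a simple stand-in, not necessarily the genetic distance the authors used.

```python
from collections import Counter
from itertools import product

# Standard genetic code stop codons (assumption: no alternative code is in use).
STOP_CODONS = {"TAA", "TAG", "TGA"}
# The 61 sense codons, written in DNA letters.
SENSE_CODONS = [
    "".join(c) for c in product("TCAG", repeat=3) if "".join(c) not in STOP_CODONS
]
_SENSE_SET = set(SENSE_CODONS)


def codon_frequencies(cds):
    """Return per-thousand frequencies of the 61 sense codons in an in-frame CDS."""
    seq = cds.upper().replace("U", "T")  # accept mRNA or DNA letters
    usable = len(seq) - len(seq) % 3     # ignore a trailing partial codon
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(c for c in codons if c in _SENSE_SET)
    total = sum(counts.values())
    if total == 0:
        return {c: 0.0 for c in SENSE_CODONS}
    return {c: 1000.0 * counts[c] / total for c in SENSE_CODONS}


def usage_distance(freq_a, freq_b):
    """Illustrative distance: sum of squared frequency differences over all sense codons."""
    return sum((freq_a[c] - freq_b[c]) ** 2 for c in SENSE_CODONS)
```

Comparing the frequency tables of two genes with `usage_distance` gives zero for identical usage and grows as their codon choices diverge.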

2.
The utility of mining DNA sequence data to understand the structure and expression of cereal prolamin genes is demonstrated by the identification of a new class of wheat prolamins. This previously unrecognized wheat prolamin class, given the name δ-gliadins, is the most direct ortholog of barley γ3-hordeins. Phylogenetic analysis shows that the orthologous δ-gliadins and γ3-hordeins form a distinct prolamin branch that existed separate from the γ-gliadins and γ-hordeins in an ancestral Triticeae prior to the branching of wheat and barley. The expressed δ-gliadins are encoded by a single gene in each of the hexaploid wheat genomes. This single δ-gliadin/γ3-hordein ortholog may be a general feature of the Triticeae tribe since examination of ESTs from three barley cultivars also confirms a single γ3-hordein gene. Analysis of ESTs and cDNAs shows that the genes are expressed in at least five hexaploid wheat cultivars in addition to diploids Triticum monococcum and Aegilops tauschii. The latter two sequences also allow assignment of the δ-gliadin genes to the A and D genomes, respectively, with the third sequence type assumed to be from the B genome. Two wheat cultivars for which there are sufficient ESTs show different patterns of expression, i.e., with cv Chinese Spring expressing the genes from the A and B genomes, while cv Recital has ESTs from the A and D genomes. Genomic sequences of Chinese Spring show that the D genome gene is inactivated by tandem premature stop codons. A fourth δ-gliadin sequence occurs in the D genome of both Chinese Spring and Ae. tauschii, but no ESTs match this sequence and limited genomic sequences indicate a pseudogene containing frame shifts and premature stop codons. Sequencing of BACs covering a 3 Mb region from Ae. tauschii locates the δ-gliadin gene to the complex Gli-1 plus Glu-3 region on chromosome 1.

3.
4.
It is generally believed that bryophytes are the earliest land plants. However, the phylogenetic relationships among bryophytes, including mosses, liverworts and hornworts, are not clearly resolved. To obtain more information on the earliest land plants, we determined the complete nucleotide sequence of the chloroplast genome from the hornwort Anthoceros formosae. The circular double-stranded DNA of 161,162 bp is the largest genome ever reported among land plant chloroplasts. It contains 76 protein, 32 tRNA and 4 rRNA genes and 10 open reading frames (ORFs), which are identical with the chloroplast genome of the other green plants analyzed. The major difference is a larger inverted repeat than that of the liverwort Marchantia; Anthoceros contains an excess of ndhB and rps7 genes and the 3′ exon of rps12. The genes matK and rps15, commonly found in the chloroplast genomes of land plants, are pseudogenes. The intron of rrn23 is the first finding in the known chloroplast genomes of land plants. A striking feature of the hornwort chloroplast is that more than half of the protein-coding genes have nonsense codons, which are converted into sense codons by RNA editing. Maximum-likelihood (ML) analysis, based on 11,518 amino acid sites of 52 proteins encoded in the chloroplast genomes of the green plants, placed liverworts as the sister to all other land plants.

5.
The Selective Advantage of Synonymous Codon Usage Bias in Salmonella (citations: 1 total, 0 self, 1 by others)
The genetic code in mRNA is redundant, with 61 sense codons translated into 20 different amino acids. Individual amino acids are encoded by up to six different codons but within codon families some are used more frequently than others. This phenomenon is referred to as synonymous codon usage bias. The genomes of free-living unicellular organisms such as bacteria have an extreme codon usage bias and the degree of bias differs between genes within the same genome. The strong positive correlation between codon usage bias and gene expression levels in many microorganisms is attributed to selection for translational efficiency. However, this putative selective advantage has never been measured in bacteria and theoretical estimates vary widely. By systematically exchanging optimal codons for synonymous codons in the tuf genes we quantified the selective advantage of biased codon usage in highly expressed genes to be in the range 0.2–4.2 × 10⁻⁴ per codon per generation. These data quantify for the first time the potential for selection on synonymous codon choice to drive genome-wide sequence evolution in bacteria, and in particular to optimize the sequences of highly expressed genes. This quantification may have predictive applications in the design of synthetic genes and for heterologous gene expression in biotechnology.

6.
Datura stramonium is a widely used poisonous plant with great medicinal and economic value. Its chloroplast (cp) genome is 155,871 bp in length with a typical quadripartite structure of the large (LSC, 86,302 bp) and small (SSC, 18,367 bp) single-copy regions, separated by a pair of inverted repeats (IRs, 25,601 bp). The genome contains 113 unique genes, including 80 protein-coding genes, 29 tRNAs and four rRNAs. A total of 11 forward, 9 palindromic and 13 tandem repeats were detected in the D. stramonium cp genome. Most simple sequence repeats (SSRs) are AT-rich and are less abundant in coding regions than in non-coding regions. Both SSRs and GC content were unevenly distributed in the entire cp genome. All preferred synonymous codons were found to use A/T ending codons. The difference in GC contents of entire genomes and of the three codon positions suggests that the D. stramonium cp genome might possess different genomic organization, in part due to different mutational pressures. The five most divergent coding regions and four non-coding regions (trnH-psbA, rps4-trnS, ndhD-ccsA, and ndhI-ndhG) were identified using whole plastome alignment, which can be used to develop molecular markers for phylogenetics and barcoding studies within the Solanaceae. Phylogenetic analysis based on 68 protein-coding genes supported Datura as a sister to Solanum. This study provides valuable information for phylogenetic and cp genetic engineering studies of this poisonous and medicinal plant.
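Two of the quantities in this abstract lend themselves to a short Python sketch: the region lengths of the quadripartite structure can be checked against the reported genome length, and GC content can be computed separately for each of the three codon positions. The function name `gc_by_codon_position` is hypothetical, and the code assumes an in-frame coding sequence; it is an illustration, not the authors' pipeline.

```python
def gc_by_codon_position(cds):
    """Return the GC fraction at codon positions 1, 2 and 3 of an in-frame CDS."""
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    fractions = []
    for pos in range(3):
        bases = seq[pos:usable:3]     # every third base, starting at this position
        gc = sum(1 for b in bases if b in "GC")
        fractions.append(gc / len(bases) if bases else 0.0)
    return tuple(fractions)


# Consistency check on the lengths reported in the abstract:
# LSC + SSC + two copies of the inverted repeat should equal the genome length.
lsc, ssc, ir = 86_302, 18_367, 25_601
genome_length = lsc + ssc + 2 * ir  # 155,871 bp, as reported
```

A third-position GC fraction well below the first- and second-position values would be consistent with the reported preference for A/T-ending codons.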

7.
Regularities of context-dependent codon bias in eukaryotic genes (citations: 10 total, 1 self, 9 by others)
Nucleotides surrounding a codon influence the choice of this particular codon from among the group of possible synonymous codons. The strongest influence on codon usage arises from the nucleotide immediately following the codon and is known as the N1 context. We studied the relative abundance of codons with N1 contexts in genes from four eukaryotes for which the entire genomes have been sequenced: Homo sapiens, Drosophila melanogaster, Caenorhabditis elegans and Arabidopsis thaliana. For all the studied organisms it was found that 90% of the codons have a statistically significant N1 context-dependent codon bias. The relative abundance of each codon with an N1 context was compared with the relative abundance of the same 4mer oligonucleotide in the whole genome. This comparison showed that in about half of all cases the context-dependent codon bias could not be explained by the sequence composition of the genome. Ranking statistics were applied to compare context-dependent codon biases for codons from different synonymous groups. We found regularities in N1 context-dependent codon bias with respect to the codon nucleotide composition. Codons with the same nucleotides in the second and third positions and the same N1 context have a statistically significant correlation of their relative abundances.
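The N1 context defined in this abstract (a codon together with the nucleotide immediately following it) can be tallied with a few lines of Python. This is a minimal sketch assuming an in-frame coding sequence; the function name `n1_context_counts` is hypothetical, and the statistical tests the authors applied are not reproduced here.

```python
from collections import Counter


def n1_context_counts(cds):
    """Count (codon, following nucleotide) pairs in an in-frame coding sequence."""
    seq = cds.upper()
    counts = Counter()
    # Stop before the last codon that has no nucleotide after it.
    for i in range(0, len(seq) - 3, 3):
        codon = seq[i:i + 3]
        n1 = seq[i + 3]  # the nucleotide immediately 3' of the codon
        counts[(codon, n1)] += 1
    return counts
```

Each key, such as `("GCT", "G")`, corresponds to one 4-mer, so the counts could then be compared with the abundance of the same 4-mer across the genome, as the abstract describes.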

8.
Codon usage in higher plants, green algae, and cyanobacteria (citations: 3 total, 1 self, 2 by others)
Codon usage is the selective and nonrandom use of synonymous codons by an organism to encode the amino acids in the genes for its proteins. During the last few years, a large number of plant genes have been cloned and sequenced, which now permits a meaningful comparison of codon usage in higher plants, algae, and cyanobacteria. For the nuclear and organellar genes of these organisms, a small set of preferred codons is used for encoding proteins. Codon usage is different for each genome type with the variation mainly occurring in choices between codons ending in cytidine (C) or guanosine (G) versus those ending in adenosine (A) or uridine (U). For organellar genomes, chloroplastic and mitochondrial proteins are encoded mainly with codons ending in A or U. In most cyanobacteria and the nuclei of green algae, proteins are encoded preferentially with codons ending in C or G. Although only a few nuclear genes of higher plants have been sequenced, a clear distinction between Magnoliopsida (dicot) and Liliopsida (monocot) codon usage is evident. Dicot genes use a set of 44 preferred codons with a slight preference for codons ending in A or U. Monocot codon usage is more restricted with an average of 38 codons preferred, which are predominantly those ending in C or G. But two classes of genes can be recognized in monocots. One set of monocot genes uses codons similar to those in dicots, while the other genes are highly biased toward codons ending in C or G with a pattern similar to nuclear genes of green algae. Codon usage is discussed in relation to evolution of plants and prospects for intergenic transfer of particular genes.

9.
I have analysed the coding regions of 96 eukaryotic genes for their use of iso-coding codons. Specific codons occur more frequently in specific positions in all members of some gene families than would be expected if codon choice was determined solely by the frequency of codon usage. In the absence of evidence a priori for selection for particular codons at particular positions, I term such co-occurring codons “coincident codons”. Coincident codons are not confined to particular regions of genes, and their occurrence is not detectably linked with the location of introns in the genomic sequence. Their presence is partly but not completely explained by the exchange of sequence between similar functional genes within a species: homologous genes from different organisms also possess the same codons at some sites with greater than expected frequencies. The relative excess of coincident codons correlates well with the overall length of the genes analysed, but not with the length of mRNA or coding regions, or with qualitative features of gene structure or expression. This, and the unusual sequence environment of coincident codons, suggests that they are a feature of the overall secondary structure of the heterogeneous nuclear RNA. Such considerations suggest approaches for optimizing the expression of exogenous genes in eukaryotic systems, and for predicting the structure of genes for which only partial sequence data is available.

10.
11.
12.
Ophidascaris species are parasitic roundworms that inhabit the python gut, resulting in severe granulomatous lesions or even death. However, the classification and nomenclature of these roundworms are still controversial. Our study aims to identify a snake roundworm from the Burmese python (Python molurus bivittatus) and analyze the mitochondrial genome. We identified this roundworm as Ophidascaris baylisi based on the morphology and cytochrome c oxidase subunit I (cox1) sequence. The complete mitochondrial genome of Ophidascaris baylisi was 14,784 bp in length, consisting of two non-coding regions and 36 mitochondrial genes (12 protein-coding genes, 22 tRNA genes, and two rRNA genes). The protein-coding genes used TTG, ATG, ATT, or TTA as start codons and TAG, TAA, or T as stop codons. All tRNA genes showed a TV-loop structure, except trnS1(AGN) and trnS2(UCN), which showed a D-loop structure. The mitochondrial large ribosomal subunit 16S (rrnL) and small ribosomal subunit 12S (rrnS) were 956 bp and 700 bp long, respectively. Phylogenetic analysis based on O. baylisi mitochondrial protein-coding genes demonstrated that O. baylisi clustered with the family Ascarididae members and was most closely related to Ophidascaris wangi. These results may enhance the nematode mitochondrial genome database and provide valuable molecular markers for further research on the taxonomy, phylogeny, and genetic relationships of Ophidascaris nematodes.

13.
14.
Twenty-nine genes for 27 species of tRNAs were deduced from the complete nucleotide sequence of the mitochondrial genome from a liverwort, Marchantia polymorpha. One to three species of tRNA genes corresponded to each of 20 amino acids including three species for leucine and arginine, two species for serine and glycine, and one for the rest of the amino acids. Interestingly, all tRNA genes were located in the semicircle of the liverwort mitochondrial genome except for the trnY and trnR genes. The region containing these tRNA genes was originally duplicated, and two trnR genes have diverged from each other. On the other hand, trnY and trnfM are present as two identical copies. The G:U and U:N wobbling between the first nucleotide of the anticodon and the third nucleotide of the codon permits the 27 identified tRNA species to translate almost all codons. However, at least two additional tRNA genes, trnI-GAU for AUY codon and trnT-UGU for ACR codon, are required to read all codons used in the liverwort mitochondrial genome. All of the identified tRNA genes are 'native' in liverwort mitochondria, not 'chloroplast-like' tRNAs as are found in the mitochondria of higher plants. This result implies that the tRNA gene transfer from chloroplast to mitochondrial genome in higher plants has occurred after the divergence from bryophytes.

15.
The complete mitochondrial genome (mitogenome) of Bombyx mori strain H9 (Lepidoptera: Bombycidae) is 15,670 base pairs (bp) in length, encoding 13 protein-coding genes (PCGs), two rRNA genes, 22 tRNA genes and a control region. The nucleotide composition of the genome is highly A + T biased, accounting for 81.31%, with a slightly positive AT skewness (0.059). The arrangement of 13 PCGs is similar to that of other sequenced lepidopterans. All the PCGs are initiated by ATN codons, except for the cytochrome c oxidase subunit 1 (cox1) gene, which is proposed to be initiated by the TTAG sequence as observed in other lepidopterans. Unlike the other PCGs, the cox1 and cytochrome c oxidase subunit 2 (cox2) genes have incomplete stop codons consisting of just a T. All tRNAs have typical structures of insect mitochondrial tRNAs, which is different from other sequenced lepidopterans. The structure of the A + T-rich region is similar to that of other sequenced lepidopterans, including non-repetitive sequences, the ATAGA binding domain, an 18 bp poly-T stretch and a poly-A element upstream of the transfer RNA M (trnM) gene. Phylogenetic analysis shows that the domesticated silkmoth B. mori originated from the Chinese Bombyx mandarina.
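The A + T content and AT skewness quoted in this abstract can be computed from a sequence as in the Python sketch below. It assumes the commonly used definition of AT skew, (A − T) / (A + T), which the abstract itself does not spell out; the function name `at_content_and_skew` is hypothetical.

```python
def at_content_and_skew(seq):
    """Return (A+T fraction, AT skew) for a nucleotide sequence.

    AT skew is taken here as (A - T) / (A + T), an assumed definition;
    ambiguous bases such as N are ignored.
    """
    s = seq.upper()
    a, t, g, c = (s.count(base) for base in "ATGC")
    total = a + t + g + c
    if total == 0 or a + t == 0:
        raise ValueError("sequence contains no A or T bases to compute skew from")
    at_fraction = (a + t) / total
    at_skew = (a - t) / (a + t)
    return at_fraction, at_skew
```

Under this definition, a positive skew such as the reported 0.059 means the analysed strand carries slightly more A than T.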

16.
The nucleotide sequence of a 1082 bp fragment from the pea (Pisum sativum) chloroplast genome is presented. This fragment contains genes for tRNA-Glu, tRNA-Tyr and tRNA-Asp as well as an open reading frame (ORF) of 91 codons on one strand and two ORFs of 52 and 59 codons on the complementary strand. The tRNA-Asp gene is located entirely within the ORF of 91 codons. The first 366 bp of the fragment correspond to 376 bp at one end of a recently published (1) sequence from the broad bean (Vicia faba) chloroplast genome. These regions contain the tRNA-Glu and tRNA-Tyr genes, which are identical and separated by 60 bp in both species. These two genes are probably cotranscribed. The intergenic regions in the corresponding segments from the two species are, except for a 10 bp deletion in the pea sequence, 94% homologous.

17.
We present here the complete 16,338 nucleotide DNA sequence of the bovine mitochondrial genome. This sequence is homologous to that of the human mitochondrial genome (Anderson et al., 1981) and the genes are organized in virtually identical fashion. The bovine mitochondrial protein genes are 63 to 79% homologous to their human counterparts, and most of the nucleotide differences occur in the third positions of codons. The minimum rate of base substitution that accounts for the nucleotide differences in the codon third positions is very high: at least 6 × 10⁻⁹ changes per position per year. The bovine and human mitochondrial transfer RNA genes exhibit more interspecies variation than do their cytoplasmic counterparts, with the “TΨC” loop being the most variable part of the molecule. The bovine 12 S and 16 S ribosomal RNA genes, when compared with those from human mitochondrial DNA, show conserved features that are consistent with proposed secondary structure models for the ribosomal RNAs. Unlike the pattern of moderate-to-high homology between the bovine and human mitochondrial DNAs found over most of the genome, the DNA sequence in the bovine D-loop region is only slightly homologous to the corresponding region in the human mitochondrial genome. This region is also quite variable in length, and accounts for the bulk of the size difference between the human and bovine mitochondrial DNAs.

18.
Codon usage in bacteria: correlation with gene expressivity (citations: 153 total, 53 self, 100 by others)
The nucleic acid sequence bank now contains over 600 protein coding genes of which 107 are from prokaryotic organisms. Codon frequencies in each new prokaryotic gene are given. Analysis of genetic code usage in the 83 sequenced genes of the Escherichia coli genome (chromosome, transposons and plasmids) is presented, taking into account new data on gene expressivity and regulation as well as iso-tRNA specificity and cellular concentration. The codon composition of each gene is summarized using two indexes: one is based on the differential usage of iso-tRNA species during gene translation, the other on choice between Cytosine and Uracil for third base. A strong relationship between codon composition and mRNA expressivity is confirmed, even for genes transcribed in the same operon. The influence of codon use on peptide elongation rate and protein yield is discussed. Finally, the evolutionary aspect of codon selection in mRNA sequences is studied.

19.
The cytochrome oxidase subunit II gene has been localized in the mitochondrial genome of Oenothera berteriana and the nucleotide sequence has been determined. The coding sequence contains 777 bp and, unlike the corresponding gene in Zea mays, is not interrupted by an intron. No TGA codon is found within the open reading frame. The codon CGG, as in the maize gene, is used in place of tryptophan codons of corresponding genes in other organisms. At position 742 in the Oenothera sequence the TGG of maize is changed into a CGG codon, where Trp is conserved as the amino acid in other organisms. Homologous sequences occur more than once in the mitochondrial genome as several mitochondrial DNA species hybridize with DNA probes of the cytochrome oxidase subunit II gene.

20.