Similar articles
 20 similar articles found (search time: 31 ms)
1.
In recent years, the amount of molecular sequencing data from Tetrahymena thermophila has dramatically increased. We analyzed G + C content, codon usage, initiator codon context and stop codon sites in the extremely A + T rich genome of this ciliate. Average G + C content was 38% for protein coding regions, 21% for 5' non-coding sequences, 19% for 3' non-coding sequences, 15% for introns, 19% for micronuclear limited sequences and 17% for macronuclear retained sequences flanking micronuclear specific regions. The 75 available T. thermophila protein coding sequences favored codons ending in T and, where possible, avoided those with G in the third position. Highly expressed genes were relatively G + C-rich and exhibited an extremely biased pattern of codon usage while developmentally regulated genes were more A + T-rich and showed less codon usage bias. Regions immediately preceding Tetrahymena translation initiator codons were generally A-rich. For the 60 stop codons examined, the frequency of G in the end + 1 site was much higher than expected whereas C never occupied this position.
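As an illustration of the kind of measurement this abstract reports, the following is a minimal sketch of how G + C content and third-position base counts could be computed for a coding sequence. It is not the authors' method; the function names and the toy sequence `toy_cds` are assumptions made up for this example.

```python
def gc_content(seq: str) -> float:
    """Percentage of G and C among the unambiguous bases (A, C, G, T)."""
    counted = [b for b in seq.upper() if b in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no A, C, G or T")
    gc = sum(1 for b in counted if b in "GC")
    return 100.0 * gc / len(counted)


def third_position_counts(cds: str) -> dict:
    """Count the bases found at the third position of each complete codon."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    counts = {"A": 0, "C": 0, "G": 0, "T": 0}
    for i in range(2, usable, 3):
        if cds[i] in counts:
            counts[cds[i]] += 1
    return counts


# Hypothetical in-frame toy sequence (ATG AAA TTT GCT TAA), not real data.
toy_cds = "ATGAAATTTGCTTAA"
print(f"G+C content: {gc_content(toy_cds):.1f}%")
print("Third-position bases:", third_position_counts(toy_cds))
```

Applied to real coding sequences, a strong excess of T and a deficit of G in the third-position counts would correspond to the codon preference described above.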

2.
Eukaryotic DNA is punctuated by many A+T-rich segments that we named A+T-rich linkers. Two types of these A+T-rich linkers can be distinguished: (i) isolated A+T-rich linkers, and (ii) A+T-rich linkers crowded in clusters. We have analysed the distribution of A+T-rich linkers across the alpha- and beta-globin gene domains in the Xenopus laevis and human genomes using isodenaturation and electron microscopy. Comparison of our data with those previously obtained for the avian globin genes leads us to conclude that genes can be harboured indifferently in either domain. A correlation is established between the presence of A+T-rich linkers inside introns and flanking regions and the A+T content of the coding sequence. For the coding sequence, a high A+T content is strongly correlated with high A+T content in the codon's third position and weakly in the first position.

3.
Ovarian poly(A)+ RNA from Xenopus laevis and Xenopus borealis was used to construct two cDNA libraries which were screened for histone sequences. cDNA clones of H4 mRNA were obtained from both species and an H3 cDNA clone from Xenopus laevis. The complete DNA sequences of these clones have been determined and are presented. These new sequences are compared with other H3 and H4 DNA sequences both in the coding and 3' noncoding regions. We find that there is considerable non-random codon usage in ten H4 genes. In addition there are some sequence similarities in the 3' noncoding regions of H3 and H4 genes.

4.
Summary: Ubiquitin is ubiquitous in all eukaryotes and its amino acid sequence shows extreme conservation. Ubiquitin genes comprise direct repeats of the ubiquitin coding unit with no spacers. The nucleotide sequences coding for 13 ubiquitin genes from 11 species reported so far have been compiled and analyzed. The G+C content of codon third base reveals a positive linear correlation with the genome G+C content of the corresponding species. The slope strongly suggests that the overall G+C content of codons of polyubiquitin genes clearly reflects the genome G+C content by AT/GC substitutions at the codon third position. The G+C content of ubiquitin codon third base also shows a positive linear correlation with the overall G+C content of coding regions of compiled genes, indicating the codon choices among synonymous codons reflect the average codon usage pattern of corresponding species. On the other hand, the monoubiquitin gene, which is different from the polyubiquitin gene in gene organization, gene expression, and function of the encoded protein, shows a different codon usage pattern compared with that of the polyubiquitin gene. From comparisons of the levels of synonymous substitutions among ubiquitin repeats and the homology of the amino acid sequence of the tail of monomeric ubiquitin genes, we propose that the molecular evolution of ubiquitin genes occurred as follows: Plural primitive ubiquitin sequences were dispersed across the genome in ancestral eukaryotes. Some of them situated in a particular environment fused with the tail sequence to produce monomeric ubiquitin genes that were maintained across species. After divergence of species, polyubiquitin genes were formed by duplication of the other primitive ubiquitin sequences on different chromosomes. Differences in the environments in which ubiquitin genes are embedded reflect the differences in codon choice and in gene expression pattern between poly- and monomeric ubiquitin genes.

5.
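The phylogenetic relationships between the zerynthiine butterflies and other groups of Papilionidae, and the taxonomic status of the zerynthiines, have long been controversial. In this study, the complete mitochondrial genome of Sericinus montelus, a zerynthiine, was sequenced using PCR and long PCR. Combined with the corresponding sequence data available for other Papilionidae species, phylogenetic trees of the main papilionid groups were reconstructed from the 13 protein-coding genes, and the relationships among these groups were examined. Genome analysis showed that the S. montelus mitochondrial genome is 15,242 bp long and comprises 13 protein-coding genes (ATP6, ATP8, COI-III, ND1-6, ND4L and Cytb), 22 tRNA genes, the 16S and 12S rRNA genes, and a non-coding control region. The A, T, G and C contents of the genome are 40.1%, 40.8%, 7.4% and 11.7%, respectively, showing a pronounced AT bias. All protein-coding genes use the standard start codon (ATN); ND4 and ND4L use a single T as the stop codon, while the remaining protein-coding genes use the standard stop codon (TAA). All tRNA genes form the typical cloverleaf structure except the serine tRNA, which lacks the dihydrouridine loop. The genome contains 12 intergenic spacers of 2-65 bp and 15 gene overlaps of 1-8 bp; among these, the 24 bp spacer between COII and tRNALys has not been observed in other lepidopteran insects. Phylogenetic analyses of Papilionidae were performed with the neighbor-joining and maximum parsimony methods based on the 13 protein-coding gene sequences. The results show that S. montelus and Luehdorfia chinensis first form a clade, which is then sister to Parnassius bremeri, indicating that the zerynthiines should be placed as a tribe-level taxon within the subfamily Parnassiinae of Papilionidae.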

6.
7.
8.
The sequencing of the cloned Locusta migratoria mitochondrial genome has been completed. The sequence is 15,722 bp in length and contains 75.3% A+T, the lowest value in any of the five insect mitochondrial sequences so far determined. The protein coding genes have a similar A+T content (74.1%) but are distinguished by a high cytosine content at the third codon position. The gene content and organization are the same as in Drosophila yakuba except for a rearrangement of the two tRNA genes tRNAlys and tRNAasp. The A+T-rich region has a lower A+T nucleotide content than in other insects, and this is largely due to the presence of two G+C-rich 155-bp repetitive sequences at the 5' end of this section and the beginning of the adjacent small rRNA gene. The sizes of the large and small rRNA genes are 1,314 and 827 bp, respectively, and both sequences can be folded to form secondary structures similar to those previously predicted for Drosophila. The tRNA genes have also been modeled and these show a strong resemblance to the dipteran tRNAs, all anticodons apparently being conserved between the two species. A comparison of the protein coding nucleotide sequences of the locust DNA with the homologous sequences of five other arthropods (Drosophila yakuba, Anopheles quadrimaculatus, Anopheles gambiae, Apis mellifera, and Artemia franciscana) was performed. The amino acid composition of the encoded proteins in Locusta is similar to that of Drosophila, with a Dayhoff distance twice that of the distance between the fruit fly and the mosquitoes. A phylogenetic analysis revealed the locust genes to be more similar to those of the dipterans than to those of the honeybee at both the nucleotide and amino acid levels. A comparative analysis of tRNA orders, using crustacean mtDNAs as outgroups, supported this.
This high level of divergence in the Apis genome has been noted elsewhere and is possibly an effect of directional mutation pressure having resulted in an accelerated pattern of sequence evolution. If the general assumption that the Holometabola are monophyletic holds, then these results emphasize the difficulties of reconstructing phylogenies that include lineages with variable substitution rates and base composition biases. The need to exercise caution in using information about tRNA gene orders in phylogenetic analysis is also illustrated. However, if the honeybee sequence is excluded, the correspondence between the other five arthropod sequences supports the findings of previous studies which have endorsed the use of mtDNA sequences for studies of phylogeny at deep levels of taxonomy when mutation rates are equivalent. Correspondence to: P.K. Flook

9.
In the present study, we determined the complete mitochondrial genome sequence of Oncicola luehei (14,281 bp), the first archiacanthocephalan representative and the second complete sequence from the phylum Acanthocephala. The complete genome contains 36 genes including 12 protein coding genes, 22 transfer RNA (tRNA) genes and 2 ribosomal RNA genes (rrnL and rrnS) as reported for other syndermatan species. All genes are encoded on the same strand. The overall nucleotide composition of O. luehei mtDNA is 37.7% T, 29.6% G, 22.5% A, and 10.2% C. The overall A+T content (60.2%) is much lower than in other syndermatan species reported so far, due to the high frequency (18.3%) of valine encoded by GTN in its protein-coding genes. Results from phylogenetic analyses of amino acid sequences for 10 protein-coding genes from 41 representatives of major metazoan groups including O. luehei supported monophyly of the phylum Acanthocephala and of the clade Syndermata (Acanthocephala+Rotifera), and the paraphyly of the clade Eurotatoria (classes Bdelloidea+Monogononta from phylum Rotifera). Considering the position of the acanthocephalan species within Syndermata, it is inferred that obligatory parasitism characteristic of acanthocephalans was acquired after the common ancestor of acanthocephalans diverged from its sister group, Bdelloidea. Additional comparison of complete mtDNA sequences from unsampled acanthocephalan lineages, especially classes Polyacanthocephala and Eoacanthocephala, is required to test if mtDNA provides reliable information for the evolutionary relationships and pattern of life history diversification found in the syndermatan groups.
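The composition figures quoted in this abstract can be cross-checked with simple arithmetic. The sketch below only re-adds the four percentages given above; the variable names are assumptions chosen for this example.

```python
# Nucleotide composition of O. luehei mtDNA as quoted in the abstract (percent).
composition = {"T": 37.7, "G": 29.6, "A": 22.5, "C": 10.2}

# Round to one decimal place to avoid floating-point noise in the sums.
total = round(sum(composition.values()), 1)
at_content = round(composition["A"] + composition["T"], 1)
gc_content = round(composition["G"] + composition["C"], 1)

print(f"Total: {total}%")        # the four values should add up to 100%
print(f"A+T:   {at_content}%")   # compare with the 60.2% stated in the abstract
print(f"G+C:   {gc_content}%")
```

The A+T sum agrees with the 60.2% stated in the text, and the large G and T shares are consistent with the reported abundance of GTN valine codons.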

10.
11.
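Analysis of the complete nucleotide sequence of the bighead carp mitochondrial genome   Total citations: 1, self-citations: 0, citations by others: 1
The complete mitochondrial DNA sequence of bighead carp collected from the Yangtze River in China was determined. The bighead carp mitochondrial DNA is 16,621 bp long, with a base composition of A = 31.6%, C = 27.1%, G = 16.0% and T = 25.3%, and an A+T content of 56.9%. The arrangement, structure and composition of the bighead carp mitochondrial genome are similar to those of other cyprinid fishes: it comprises 37 genes (13 protein-coding genes, 2 rRNA genes and 22 tRNA genes) and one non-coding control region (D-loop). Of the 13 protein-coding genes, ND6 is encoded on the light strand and the other 12 are encoded on the heavy strand. The start codon of the COI gene is GTG, whereas the other 12 protein-coding genes all start with ATG.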

12.
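Correlation analysis of local GC levels in human protein-coding genes   Total citations: 2, self-citations: 0, citations by others: 2
Chen Xianggui, Hu Jun, Yang Xiao. Hereditas (Beijing) 2008, 30(9): 1169-1174
GC content is an important compositional feature of genomic DNA sequences and carries information on gene structure, function and evolution. Here, the DNA sequences of 7,992 non-redundant human protein-coding genes were extracted from public databases, and the local GC content of different gene regions and the correlations among them were analyzed. Local GC content within genes is heterogeneous: the 5' untranslated region has the highest GC level (62.56%), whereas the 3' untranslated region has the lowest (43.97%). The GC content of the 3' flanking sequence is a good proxy for the GC level of the long DNA segment in which the gene resides. Although the GC content of the open reading frame is higher than that of the introns, the 3' untranslated region and the 3' flanking sequence, the GC contents of these four regions are highly correlated with one another. The mean GC content at the third codon position (GC3) is 58.09%, significantly higher than at the first and second codon positions, and it is highly correlated with the GC level of the open reading frame, with a correlation coefficient of 0.91. GC3 is also well correlated with the GC levels of the introns, the 3' untranslated region and the 3' flanking sequence; the slope of the linear regression of GC3 on the GC content of the 3' flanking sequence is 1.25. GC3 can therefore serve as a sensitive indicator of variation in the GC level of the region in which a gene lies. By contrast, the GC levels of the first and second codon positions, the 5' flanking sequence and the 5' untranslated region are only weakly correlated with those of other gene regions. These results suggest that bases at the third codon position of protein-coding regions, in introns, in the 3' untranslated region and in the 3' flanking sequence may have undergone similar evolutionary processes, whereas the first and second codon positions, the 5' flanking sequence and the 5' untranslated region have experienced different mutation and selection because of functional requirements.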
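The quantities discussed in this abstract (GC content at each codon position, a correlation coefficient, and a regression slope) can be sketched as follows. This is only an illustration under stated assumptions, not the authors' pipeline: the function names, the toy coding sequence, and the paired values in `flank_gc` and `gc3` are all made up for this example.

```python
from math import sqrt


def gc_by_codon_position(cds):
    """Return (GC1, GC2, GC3) in percent for an in-frame coding sequence."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    if usable == 0:
        raise ValueError("sequence is shorter than one codon")
    n_codons = usable // 3
    return tuple(
        100.0 * sum(1 for i in range(pos, usable, 3) if cds[i] in "GC") / n_codons
        for pos in range(3)
    )


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def regression_slope(xs, ys):
    """Least-squares slope of the regression of ys on xs."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


# Hypothetical toy data, not taken from the study.
print(gc_by_codon_position("ATGGCCAAGTGA"))
flank_gc = [40.0, 50.0, 60.0]  # made-up 3' flanking GC levels (%)
gc3 = [45.0, 57.5, 70.0]       # made-up GC3 values (%) for the same genes
print(pearson_r(flank_gc, gc3), regression_slope(flank_gc, gc3))
```

A slope above 1, as reported in the abstract, means GC3 changes more steeply than the flanking GC level, which is why it can act as a sensitive indicator of regional GC variation.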

13.
The Leishmania tarentolae Parrot-TarII strain genome sequence was resolved to a mean 16-fold coverage by next-generation DNA sequencing technologies. This is the first genome to be described from a kinetoplastid protozoan that is non-pathogenic to humans, thus providing an opportunity for comparison with the completed genomes of pathogenic Leishmania species. A high synteny was observed between all sequenced Leishmania species. A limited number of chromosomal regions diverged between L. tarentolae and L. infantum, while remaining syntenic to L. major. Globally, >90% of the L. tarentolae gene content was shared with the other Leishmania species. We identified 95 predicted coding sequences unique to L. tarentolae and 250 genes that were absent from L. tarentolae. Interestingly, many of the latter genes were expressed in the intracellular amastigote stage of pathogenic species. In addition, genes coding for products involved in antioxidant defence or participating in vesicular-mediated protein transport were underrepresented in L. tarentolae. In contrast to other Leishmania genomes, two gene families were expanded in L. tarentolae, namely the zinc metallo-peptidase surface glycoprotein GP63 and the promastigote surface antigen PSA31C. Overall, L. tarentolae's gene content appears better adapted to the promastigote insect stage than to the amastigote mammalian stage.

14.
Two G protein alpha subunit genes orthologous to gpa-2 and gpa-3 in Caenorhabditis elegans have been identified in the parasitic nematode, Strongyloides stercoralis. These genes mediate chemosensory signal transduction regulating dauer arrest in C. elegans. In the parasite, they represent candidate mediators for regulation of the choice between free-living and parasitic life cycles, the obligatory developmental arrest of infective larvae, and reactivation of development after infection. The (A+T) content of these genes is 72.2% for coding sequences, 90% for introns, and 84.1% for 5' and 3' flanking regions, requiring the use of low extension temperatures for long distance PCR. The possible significance of conserved structural motifs of these proteins is discussed.

15.
16.
The complete sequences of two Musa acuminata bacterial artificial chromosome (BAC) clones are presented, providing the first analysis of the banana genome organization. One clone (MuH9) is 82,723 bp long with an overall G+C content of 38.2%. Twelve putative protein-coding sequences were identified, representing a gene density of one per 6.9 kb, which is slightly less than that previously reported for Arabidopsis but similar to rice. One coding sequence was identified as a partial M. acuminata malate synthase, while the remaining sequences showed a similarity to predicted or hypothetical proteins identified in genome sequence data. A second BAC clone (MuG9) is 73,268 bp long with an overall G+C content of 38.5%. Only seven putative coding regions were discovered, representing a gene density of only one gene per 10.5 kb, which is strikingly lower than that of the first BAC. One coding sequence showed significant homology to the soybean ribonucleotide reductase (large subunit). A transition point between coding regions and repeated sequences was found at approximately 45 kb, separating the coding upstream BAC end from its downstream end that mainly contained transposon-like sequences and regions similar to known repetitive sequences of M. acuminata. This gene organization resembles Gramineae genome sequences, where genes are clustered in gene-rich regions separated by gene-poor DNA containing abundant transposons. Communicated by J.S. Heslop-Harrison
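The gene densities quoted in this abstract follow directly from the clone lengths and gene counts. A minimal sketch of that arithmetic, using only the figures given above (the helper name `kb_per_gene` is an assumption made for this example):

```python
def kb_per_gene(length_bp, n_genes):
    """Average spacing between genes, in kilobases per gene."""
    return length_bp / n_genes / 1000.0


# Clone lengths (bp) and putative gene counts as stated in the abstract.
clones = {"MuH9": (82_723, 12), "MuG9": (73_268, 7)}

for name, (length_bp, n_genes) in clones.items():
    print(f"{name}: one gene per {kb_per_gene(length_bp, n_genes):.1f} kb")
```

Both results round to the values reported in the text (6.9 kb and 10.5 kb per gene).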

17.
A positive correlation was found between the G + C content at the codon third position in genes of vertebrates and the G + C content of the genome portion surrounding each gene. Exons of genes with a high G + C% at the codon third position are surrounded by G + C-rich introns and G + C-rich flanking sequences, and those with a low G + C% at that position by A + T-rich introns and flanking sequences. Analysis of G + C content distribution along DNA sequences using a DNA Sequence Data Bank supported the view that the vertebrate genome is a mosaic of regions with clear differences in their G + C content. The biological significance of the variation in G + C content throughout the vertebrate genome is discussed in connection with chromosomal banding.

18.
J E Hyde, P F Sims. Gene 1987, 61(2): 177-187
We have statistically analysed the distribution of nucleotides and dinucleotides in 21 genes of the 81% A + T-rich human malaria parasite Plasmodium falciparum. The mRNA-synonymous strands of this protozoan show in general a marked excess of purines over pyrimidines, correlated with abnormally high levels of Lys and Glu. We have used the large differences in base composition between coding and non-coding regions to estimate that the parasite possesses in the range of 2700-5400 genes. The dinucleotide preference patterns are compared with consensus patterns derived from other organisms [Nussinov, Nucl. Acids Res. 12 (1984) 1749-1763]. Patterns in the coding regions surprisingly resemble those of higher, rather than lower eukaryotes, particularly with respect to TG elevation and CG suppression. The latter is correlated with an abnormally low level of Arg in these parasites. In the non-coding regions, the four dinucleotides made up of C and/or G are found with significantly higher frequencies than expected (approx. 50-150%), specifically to the 5' side of the coding regions. The possible role of these dinucleotides in control sequences is discussed.

19.
Abstract: Degenerate PCR primers were used to amplify a conserved portion of the gene encoding chitin synthase from genomic DNA of six species of ectomycorrhizal truffles. DNA was extracted from both hypogeous fruitbodies and in vitro growing mycelium of Tuber borchii. A single fragment of about 600 bp was amplified for each species. The amplification products from Tuber magnatum, T. borchii and T. ferrugineum were cloned and sequenced, revealing a high degree of identity (91.5%) at the nucleotide level. On the basis of the deduced amino acid sequences these clones were assigned to class II chitin synthase. Southern blot experiments performed on genomic DNA showed that the amplification products derive from a single copy gene. Phylogenetic analysis of the nucleotide sequences of class II chitin synthase genes confirmed the current taxonomic position of the genus Tuber, and suggested a close relationship between T. magnatum and T. uncinatum.

20.