Similar literature
 20 similar documents found (search time: 31 ms)
1.
2.
3.
A correspondence between open reading frames in sense and antisense strands is expected from the hypothesis that the prototypic triplet code was of the general form RNY, where R is a purine base, N is any base, and Y is a pyrimidine. A deficit of stop codons in the antisense strand (and thus long open reading frames) is predicted for organisms with high G + C percentages; however, two bacteria (Azotobacter vinelandii, Rhodobacter capsulatum) have larger average antisense strand open reading frames than predicted from (G + C)%. The similar codon frequencies found in sense and antisense strands can be attributed to the wide distribution of inverted repeats (stem-loop potential) in natural DNA sequences.
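
The RNY argument in this abstract can be checked mechanically. The Python sketch below is an illustration only (the names `reverse_complement`, `is_rny` and `rny_codons` are not from the cited paper): it enumerates the 16 RNY codons and tests whether each reverse complement is again of RNY form and whether any RNY codon is a stop codon.

```python
from itertools import product

PURINES = "AG"        # R
PYRIMIDINES = "CT"    # Y
COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq):
    """Return the antisense strand read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def is_rny(codon):
    """True if the codon is purine - any base - pyrimidine."""
    return codon[0] in PURINES and codon[2] in PYRIMIDINES


# All 2 x 4 x 2 = 16 codons of the form RNY.
rny_codons = ["".join(c) for c in product(PURINES, "ACGT", PYRIMIDINES)]

# The complement of a pyrimidine is a purine and vice versa, so reversing
# and complementing an RNY codon yields another RNY codon.
antisense_also_rny = all(is_rny(reverse_complement(c)) for c in rny_codons)

# Every stop codon starts with T (a pyrimidine), so no RNY codon is a stop.
rny_has_stop = any(c in STOP_CODONS for c in rny_codons)
```

Under these definitions a sense strand written purely in RNY codons would carry no stop codons in the in-phase antisense frame either, which is the correspondence the abstract refers to.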

4.
The question of whether the noncoding DNA strand had, or still has, the capability to encode functional polypeptides has been addressed in several articles. The theoretical background of the views advocating this idea arose from two groups of findings. One was based on various observations implying that the genetic code was adapted for double-strand coding. The other arose from the observation of gene-length overlapping open reading frames (O-ORFs) on the antisense DNA strand in a number of genes. The above theories, which I term selectionist, advance a novel conception of gene evolution, proposing that new genes can be created by utilizing the antisense DNA strand. In contrast, the neutralist theory claims that the O-ORFs are mere by-products of evolutionary processes acting to create special codon usage and base distribution patterns in the coding sequences. Received: 16 June 2000 / Accepted: 31 August 2000

5.
6.
7.
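Zhong Zhi, Li Hong. Acta Biophysica Sinica (《生物物理学报》), 2008, 24(5): 379-392
Taking the 5′ UTR sequences of bacterial and archaeal genomes as the object of study, the distribution of the triplet AUG in the three reading frames of the 5′ UTR was analyzed. Both bacterial and archaeal genomes show a very pronounced depletion of AUG in reading frame 1. This depletion suggests that an AUG upstream of the start codon is likely to affect translation initiation of the gene. The analysis shows that the vast majority of AUGs occur as part of a uORF (upstream open reading frame); uAUGs (upstream AUGs) are few, especially in reading frame 1, and in reading frame 1 of bacterial genomes uAUGs occur more often upstream of genes that contain an SD sequence. A comparison shows that sequences led by a uAUG have a weaker synonymous codon usage bias than genuine coding sequences, which may indicate that synonymous codon usage bias in bacteria and archaea is also one of the important factors determining accurate translation initiation.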
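
The frame-by-frame AUG count described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' procedure: the abstract does not define its frame numbering, so the code assumes that reading frame 1 is the frame in phase with the downstream start codon, and the function name `aug_counts_by_frame` is hypothetical.

```python
def aug_counts_by_frame(utr):
    """Count AUG triplets in a 5' UTR, grouped by reading frame.

    Assumptions (not from the source): `utr` ends immediately before the
    annotated start codon, frame 1 is in phase with that start codon, and
    frames 2 and 3 are the frames shifted by one and two bases.
    """
    utr = utr.upper().replace("T", "U")
    n = len(utr)
    counts = {1: 0, 2: 0, 3: 0}
    for i in range(n - 2):
        if utr[i:i + 3] == "AUG":
            # Distance from this triplet to the start codon, modulo 3.
            offset = (n - i) % 3
            counts[offset + 1] += 1
    return counts


# Example: one AUG in phase with the start codon, one shifted by one base.
example = aug_counts_by_frame("GGAUGCCAUGA")
```

A depletion of the kind reported would show up as a frame 1 count well below the counts in the other two frames when the function is summed over many genes.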

8.
The bacterial DNA sequences in the GenBank database were divided into coding and noncoding regions and examined for the base-trimer distribution in every triplet frame on the sense and antisense strands. The results revealed that for the noncoding region, both strands have very similar base-trimer distributions and have no frame specificity; that is, DNA is symmetric in the noncoding region. For the coding region, on the other hand, the symmetry is broken only in the triplet framework, and we found a special triplet-frame-specific symmetry which appears when the two complementary strands of the coding region are read from their 5′ ends. In addition, the following frame specificity was also observed in the distribution of stop codons on the antisense strand of the coding region. When the antisense sequences of the open reading frames (ORFs) in the database are read in the three reading frames, the same reading frame as the corresponding ORF contains a significantly larger number of long open frames without stop codons (i.e., nonstop frames [NSFs]) than expected, while the number of NSFs in the other two reading frames is similar to the expected number. That is, NSFs as well as ORFs are maintained in a frame-specific manner, and in this sense, DNA becomes symmetrical even in the coding region. These two kinds of frame-specific symmetries indicate that only an ORF and its complementary triplets are specifically recognized and maintained in DNA. We suppose that the antisense strands as well as the sense strands in the coding region may be transcribed, thereby producing various kinds of proteins corresponding to NSFs, though their amount may not be large. The presence of these proteins should have some benefits for living organisms, and therefore we propose that these proteins are upcoming enzymes having novel functions. Correspondence to: I. Urabe
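
The nonstop-frame (NSF) test described in this abstract amounts to reading the antisense strand of an ORF in three frames and looking for stop codons. The sketch below is a simplified illustration with hypothetical names (`antisense_stop_counts`); it assumes the ORF length is a multiple of three, so that frame 0 of the reverse complement is the frame in phase with the ORF.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq):
    """Return the antisense strand read 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def antisense_stop_counts(orf):
    """Count stop codons in the three frames of the antisense strand.

    Index 0 is the frame in phase with the sense ORF (assuming len(orf)
    is a multiple of 3); indices 1 and 2 are the shifted frames.
    A frame with a count of 0 is a nonstop frame (NSF).
    """
    anti = reverse_complement(orf)
    counts = []
    for frame in range(3):
        codons = (anti[i:i + 3] for i in range(frame, len(anti) - 2, 3))
        counts.append(sum(1 for codon in codons if codon in STOP_CODONS))
    return counts


# Sense ORF ATG GCC GCC TAA has the antisense strand TTAGGCGGCCAT.
example = antisense_stop_counts("ATGGCCGCCTAA")
```

In the in-phase frame a stop codon can only arise where the sense strand carries TTA, CTA or TCA, which is one way to see why codon usage influences how often NSFs occur.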

9.
The short-chain oxidoreductase (SCOR) family of enzymes includes over 6000 members, extending from bacteria and archaea to humans. Nucleic acid sequence analysis reveals that significant numbers of these genes are remarkably free of stop codons in reading frames other than the coding frame, including those on the antisense strand. The genes from this subset also use almost entirely the GC-rich half of the 64 codons. Analysis of a million hypothetical genes having random nucleotide composition shows that the percentage of SCOR genes having multiple open reading frames exceeds random by a factor of as much as 1 × 10^6. Nevertheless, screening the content of the SWISS-PROT TrEMBL database reveals that 15% of all genes contain multiple open reading frames. The SCOR genes having multiple open reading frames and a GC-rich coding bias exhibit a similar GC bias in the nucleotide triple composition of their DNA. This bias is not correlated with the GC content of the species in which the SCOR genes are found. One possible explanation for the conservation of multiple open reading frames and extreme bias in nucleic acid composition in the family of Rossmann folds is that the primordial member of this family was encoded early using only very stable GC-rich DNA and that evolution proceeded with extremely limited introduction of any codons having two or more adenine or thymine nucleotides. These and other data suggest that the SCOR family of enzymes may even have diverged from a common ancestor before most of the AT-rich half of the genetic code was fully defined.

10.
The general property of asymmetry in word use in meaningful texts written in a variety of languages motivates a quantification of the differences in the use of mutually symmetric triplets in genomic sequences. When this is done in the three reading frames, high values found for one of them are used as an indication that the sequence is coding for a protein. Moreover, a similar quantification of the differences in the use of complementary triplets is introduced, again with predictive power for the coding character of a sequence. This method reflects the non-equivalence between the sense and antisense strands of a coding segment. In both approaches, "linguistic asymmetry" in coding sequences is related to the form of the genetic code and to the bias in codon usage and amino acid use skews.

11.
Long open reading frames (ORFs) in antisense DNA strands have been reported in the literature as being rare events. However, an extensive analysis of the GenBank database revealed that a substantial number of genes from several species contain an in-phase ORF in the antisense strand that overlaps the coding sequence of the sense strand entirely, or even extends beyond it. The findings described in this paper show that this is a frequent, non-random phenomenon, which is primarily dependent on codon usage, and to a lesser extent on gene size and GC content. Examination of the sequence database for several prokaryotic and eukaryotic organisms demonstrates that coding sequences with in-phase, 100% overlapping antisense ORFs are present in every genome studied so far.

12.
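Analysis of codon usage bias in the chloroplast genome of alfalfa (Medicago sativa)   Cited by: 1 (self-citations: 0, citations by others: 1)
Yu Feng, Han Ming. Guihaia (《广西植物》), 2021, 41(12): 2069-2076
To analyze the codon usage pattern of the alfalfa (Medicago sativa) chloroplast genome, 49 protein-coding sequences selected from the chloroplast genome were studied, and CodonW, CUSP, CHIPS, SPSS and other software were used to investigate their codon usage patterns and bias. The results showed that: (1) the average GC content at the third codon position of alfalfa chloroplast genes was 26.44%, the effective number of codons (ENC) ranged from 40.6 to 51.41, and most codons showed weak bias. (2) Relative synonymous codon usage (RSCU) analysis found 30 codons with RSCU > 1, of which 29 end in A or U, indicating that A and U occur at a high frequency in the alfalfa chloroplast genome. (3) Neutrality analysis found no significant correlation between GC3 and GC12, indicating that codon bias is mainly influenced by natural selection; ENC-plot analysis found that some genes fall below and around the expected curve, indicating that mutation also contributed to the formation of part of the codon bias. In addition, 17 codons were identified as optimal codons of the alfalfa chloroplast genome. Codon usage bias in the alfalfa chloroplast genome may be shaped jointly by natural selection and mutation. This study lays a foundation for chloroplast genetic engineering in alfalfa and for the genetic improvement of target traits.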
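
Relative synonymous codon usage (RSCU), as used in this abstract, is the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The Python sketch below illustrates the calculation only; it is not the CodonW implementation, the function name `rscu` is hypothetical, and it assumes the amino acid assignments of the standard genetic code.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code in the conventional TCAG ordering ("*" = stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def rscu(cds_list):
    """Return {codon: RSCU} pooled over a list of in-frame coding sequences."""
    counts = Counter()
    for cds in cds_list:
        cds = cds.upper().replace("U", "T")
        for i in range(0, len(cds) - 2, 3):
            codon = cds[i:i + 3]
            if codon in CODON_TABLE:  # skips codons with ambiguous bases
                counts[codon] += 1

    # Group sense codons into synonymous families.
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families[aa].append(codon)

    values = {}
    for codons in families.values():
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue  # amino acid absent from the data
        for c in codons:
            values[c] = counts[c] * len(codons) / total
    return values


# Alanine appears four times: GCT three times and GCA once.
example = rscu(["GCTGCTGCTGCA"])
```

A value above 1 marks a codon used more often than expected under equal usage, which is the criterion behind the "RSCU > 1" count in the abstract.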

13.
The phenomenon of codon usage bias is known to exist in many genomes, and it is mainly determined by mutation and selection. To understand the patterns of codon usage in nemertean mitochondrial genomes, we used bioinformatic approaches to analyze the protein-coding sequences of eight nemertean species. Neutrality analysis did not find a significant correlation between GC12 and GC3. The ENc-plot showed a few genes on or close to the expected curve, but the majority of points with low ENc values are below it. The ENc-plot suggested that mutational bias plays a major role in shaping codon usage. The Parity Rule 2 (PR2) plot analysis showed that GC and AT were not used proportionally, and we propose that codons containing A or U at the third position are used preferentially in nemertean species, regardless of whether corresponding tRNAs are encoded in the mitochondrial DNA. Context-dependent analysis indicated that the nucleotide at the second codon position slightly affects synonymous codon choices. These results suggest that mutational and selection forces are probably both acting on codon usage bias in nemertean mitochondrial genomes.
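
A neutrality analysis of the kind mentioned in this abstract plots GC12 (the mean GC content of codon positions 1 and 2) against GC3 for each gene and examines the regression. The sketch below is a minimal illustration with hypothetical function names (`gc12_and_gc3`, `neutrality_slope`); it assumes in-frame coding sequences, at least two genes, and GC3 values that are not all identical.

```python
def gc12_and_gc3(cds):
    """Return (GC12, GC3) for one in-frame coding sequence."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3      # drop any trailing partial codon
    gc = [0, 0, 0]
    for i in range(usable):
        if cds[i] in "GC":
            gc[i % 3] += 1
    n_codons = usable // 3
    gc1, gc2, gc3 = (count / n_codons for count in gc)
    return (gc1 + gc2) / 2, gc3


def neutrality_slope(genes):
    """Least-squares slope of GC12 (y) regressed on GC3 (x) across genes."""
    points = [gc12_and_gc3(g) for g in genes]
    xs = [gc3 for _, gc3 in points]
    ys = [gc12 for gc12, _ in points]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


gc12, gc3 = gc12_and_gc3("GCAGCTGCC")     # codons GCA, GCT, GCC
slope = neutrality_slope(["ATA", "GCG"])  # points (0, 0) and (1, 1)
```

A slope near 1 with a significant correlation is usually read as mutation pressure acting equally on all positions, while a weak or non-significant relationship is read as a sign of selection.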

14.
Two species of the DNA virus Torque teno sus virus (TTSuV), TTSuV1 and TTSuV2, have become widely distributed in pig-farming countries in recent years. In this study, we performed a comprehensive analysis of synonymous codon usage bias in 41 available TTSuV2 coding sequences (CDS), and compared the codon usage patterns of TTSuV2 and TTSuV1. TTSuV codon usage patterns were found to be phylogenetically conserved. Values for the effective number of codons (ENC) indicated that the overall extent of codon usage bias in both TTSuV2 and TTSuV1 was not significant; the most frequently occurring codons had an A or C at the third codon position. Correspondence analysis (COA) was performed, and TTSuV2 and TTSuV1 sequences were located in different quadrants of the first two major axes. A plot of the ENC revealed that compositional constraint was the major factor determining the codon usage bias for TTSuV2. In addition, hierarchical cluster analysis of 41 TTSuV2 isolates based on relative synonymous codon usage (RSCU) values suggested that there was no association between geographic distribution and codon bias of TTSuV2 sequences. Finally, the comparison of RSCU for TTSuV2, TTSuV1 and the corresponding host sequence indicated that the codon usage pattern of TTSuV2 was similar to that of TTSuV1. However, the similarity was low between each virus and its host. These conclusions provide important insight into the synonymous codon usage pattern of TTSuV2, as well as a better understanding of the molecular evolution of TTSuV2 genomes.
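
An ENC plot compares each gene's observed ENC with the value expected when codon usage is constrained only by GC content at synonymous third positions (GC3s). The sketch below uses the expression commonly used for that reference curve, generally attributed to Wright (1990); the function name `expected_enc` is illustrative, and nothing here is taken from the cited study's own code.

```python
def expected_enc(gc3s):
    """Expected effective number of codons for a given GC3s (0 to 1).

    Reference curve commonly used in ENC plots:
        ENC = 2 + s + 29 / (s^2 + (1 - s)^2)
    where s is the GC content at synonymous third codon positions.
    """
    s = gc3s
    return 2 + s + 29 / (s ** 2 + (1 - s) ** 2)


# The curve peaks near balanced composition and falls toward the extremes.
curve = [(s / 10, expected_enc(s / 10)) for s in range(11)]
```

Genes lying on or near this curve are usually interpreted as shaped mainly by compositional constraint, and genes well below it as subject to additional pressures such as selection.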

15.
Complete DNA sequences have been determined for the mitochondrial genomes of the crinoids Phanogenia gracilis (15892 bp) and Gymnocrinus richeri (15966 bp). The mitochondrial genetic map of the stalkless feather star P. gracilis is identical to that of the comatulid feather star Florometra serratissima (Scouras, A., Smith, M.J., 2001. Mol. Biol. Evol. 18, 61-73). The mitochondrial gene order of the stalked crinoid G. richeri differs from that of F. serratissima and P. gracilis by the transposition of the nad4L protein gene. The G. richeri nad4L mitochondrial map position is unique among metazoa and is likely a derived feature in this stalked crinoid. Nucleotide compositional analyses of protein genes encoded on the major sense strand confirm earlier conclusions regarding a crinoid-distinctive T over C bias. All three crinoids exhibit high T levels in third codon positions, whereas other echinoderm classes favor A or C in the third codon position. The nucleotide bias is reflected in the relative synonymous codon usage patterns of crinoids versus other echinoderms. We suggest that the nucleotide bias of crinoids, in comparison to other echinoderms, indicates that a physical inversion of the origin of replication has occurred in the crinoid lineage. Evolutionary rate tests support the use of the cytochrome b (cob) gene in molecular phylogenetic analyses of echinoderms. A consensus echinoderm tree was generated based on cytochrome b nucleotide alignments that placed the asteroids as a sister group to a clade containing the ophiuroids and the (echinoids+holothuroids) with the crinoids basal to the rest of the echinoderm classes: [Crinoid,(Asteroid,(Ophiuroid,(Echinoid,Holothuroid)))].

16.
T Itoh, H Matsuda, H Mori. DNA Research, 1999, 6(5): 299-305
Novel members of the highly conserved protein family, Hsp70, have been found in the complete sequences of several genomes. To elucidate a phylogenetic relationship among Hsp70 proteins of Escherichia coli, we searched all open reading frames derived from 13 complete genomes for Hsp70/actin-related proteins by the single-linkage clustering method. Phylogenetic analysis of this superfamily revealed that E. coli possesses at least three Hsp70 homologs (DnaK, Hsc66 and Hsc62). We found that Hsc62, which is the product of hscC, is a new member of the Hsc66 subfamily, and is specific to E. coli. The analysis also suggested that YegD of E. coli is closely related to the actin family, which consists of the actin, FtsA and MreB subfamilies. A further database search revealed that two dnaJ homologs, ybeS and ybeV, were located on the opposite strand near hscC. Consequently, E. coli seems to have three gene clusters composed of DnaK and DnaJ homologs.

17.
J L Weber. Gene, 1987, 52(1): 103-109
The genome of the human malaria parasite Plasmodium falciparum has an A + T content of about 82%, higher than that of any other organism whose DNA has been characterized. Computer analysis of 36 kb of available nucleotide sequences from this species showed that the coding regions, with an A + T content of 69.0%, are flanked by more A + T-rich regions of 86.0% A + T. Within the coding sequences, the A/T ratio was 1.68 in the mRNA sense strand, and overall A + T content in the three codon positions increased in the order 1st-2nd-3rd position. Codons with T or especially A in the third position were strongly preferred. Codon usage among individual parasite genes was very similar compared to genes from other species. Dinucleotide frequencies for the parasite DNA were close to those expected for a random sequence with the known base composition, except that the CpG frequency in the coding sequences was low.
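
The dinucleotide comparison in this abstract can be expressed as an observed-to-expected ratio, where the expectation for a random sequence is the product of the two base frequencies. The sketch below is a generic illustration (the function name `dinucleotide_ratio` is hypothetical and this is not the program used in the cited paper); it assumes both bases of the dinucleotide occur in the sequence.

```python
from collections import Counter


def dinucleotide_ratio(seq, dinuc):
    """Observed/expected frequency of a dinucleotide such as "CG".

    Expected frequency = f(first base) * f(second base), i.e. what a random
    sequence with the same base composition would show. A ratio well below
    1 indicates under-representation.
    """
    seq = seq.upper()
    n = len(seq)
    base_counts = Counter(seq)
    # Count overlapping occurrences across the n - 1 adjacent base pairs.
    pair_count = sum(1 for i in range(n - 1) if seq[i:i + 2] == dinuc)
    observed = pair_count / (n - 1)
    expected = (base_counts[dinuc[0]] / n) * (base_counts[dinuc[1]] / n)
    return observed / expected


# In CGCG, two of the three adjacent pairs are CG; f(C) = f(G) = 0.5.
example = dinucleotide_ratio("CGCG", "CG")
```

Applied to coding sequences, a CpG ratio clearly below 1 alongside ratios near 1 for the other dinucleotides would correspond to the pattern the abstract reports.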

18.
Mycoplasma bovis is a major pathogen causing arthritis, respiratory disease and mastitis in cattle. A better understanding of its genetic features and evolution might provide evidence of how it survives in host environments. In this study, multiple factors influencing synonymous codon usage patterns in M. bovis (the genomes of three strains) were analyzed. The overall nucleotide content of genes in the M. bovis genome is AT-rich. Although the G and C contents at the third codon position of genes in the leading strand differ from those in the lagging strand (p<0.05), the 59 synonymous codon usage patterns of genes in the leading strand are highly similar to those in the lagging strand. The over-represented codons and the under-represented codons were identified. A comparison of the synonymous codon usage pattern of M. bovis and cattle (susceptible host) indicated the independent formation of synonymous codon usage of M. bovis. Principal component analysis revealed that (i) strand-specific mutational bias fails to affect the synonymous codon usage pattern in the leading and lagging strands, (ii) mutation pressure from nucleotide content plays a role in shaping the overall codon usage, and (iii) the major trend of synonymous codon usage has a significant correlation with the gene expression level that is estimated by the codon adaptation index. The plot of the effective number of codons against the G+C content at the third codon position also reveals that mutation pressure undoubtedly contributes to the synonymous codon usage pattern of M. bovis. Additionally, the formation of the overall codon usage is determined by certain evolutionary selections for gene function classification (30S protein, 50S protein, transposase, membrane protein, and lipoprotein) and translation elongation region of genes in M. bovis.
The information could be helpful in further investigations of evolutionary mechanisms of the Mycoplasma family and heterologous expression of its functionally important proteins.

19.
20.
The large open reading frames of insertion sequences from Escherichia coli were examined for their spatial pattern of codon usage bias and distribution of rarely used codons. There is a bias in codon usage that is generally lower toward the terminal ends of the coding regions, which is reflected in the occurrence of an excess of nonpreferred codons in the 3′ portions of the coding regions as compared with the 5′ portions. In contrast, typical chromosomal genes have a lower codon usage bias toward the 5′ ends of the coding regions. These results imply that the selective forces reflected in codon usage bias may differ according to position within the coding sequence. In addition, these constraints apparently differ in important ways between genes contained in insertion sequences and those in the chromosome.
