Similar articles
1.
MOTIVATION: Translation initiation sites (TISs) of genes are the key points of protein synthesis. Exact recognition of TISs in eukaryotic genes is one of the most important tasks in gene-finding algorithms. However, the task has not yet been satisfactorily accomplished. Here, we propose a cooperatively scanning model for recognizing TISs and the first exons of eukaryotic genes on the basis of the structural characteristics of multi-exon genes. RESULTS: The model was employed to cooperatively scan the TISs and 3' splicing sites in eukaryotic genes, and the TISs and the first exons of 132 mammalian gene sequences were identified to evaluate the model. The accuracy of exactly recognizing the TISs and the first exons was 64.4% and 51.5%, respectively. We believe that the model will be a useful tool for genome annotation and that it can be easily incorporated into other algorithms to achieve higher accuracy in recognizing TISs and the first exons. AVAILABILITY: The program is available upon request.

2.
The prediction of translation initiation sites (TISs) in eukaryotic mRNAs has been a challenging problem in computational molecular biology. In this paper, we present a new algorithm to recognize TISs with a very high accuracy. Our algorithm includes two novel ideas. First, we introduce a class of new sequence-similarity kernels based on string editing, called edit kernels, for use with support vector machines (SVMs) in a discriminative approach to predict TISs. The edit kernels are simple and have significant biological and probabilistic interpretations. Although the edit kernels are not positive definite, it is easy to make the kernel matrix positive definite by adjusting the parameters. Second, we convert the region of an input mRNA sequence downstream of a putative TIS into an amino acid sequence before applying SVMs to avoid the high redundancy in the genetic code. The algorithm has been implemented and tested on previously published data. Our experimental results on real mRNA data show that both ideas improve the prediction accuracy greatly and that our method performs significantly better than those based on neural networks and SVMs with polynomial kernels or Salzberg kernels.

3.
Although ribosome-profiling and translation initiation sequencing (TI-seq) analyses have identified many noncanonical initiation codons, the precise detection of translation initiation sites (TISs) remains a challenge, mainly because of experimental artifacts of such analyses. Here, we describe a new method, TISCA (TIS detection by translation Complex Analysis), for the accurate identification of TISs. TISCA proved to be more reliable for TIS detection compared with existing tools, and it identified a substantial number of near-cognate codons in Kozak-like sequence contexts. Analysis of proteomics data revealed the presence of methionine at the NH2-terminus of most proteins derived from near-cognate initiation codons. Although eukaryotic initiation factor 2 (eIF2), eIF2A and eIF2D have previously been shown to contribute to translation initiation at near-cognate codons, we found that most noncanonical initiation events are most probably dependent on eIF2, consistent with the initial amino acid being methionine. Comprehensive identification of TISs by TISCA should facilitate characterization of the mechanism of noncanonical initiation.

4.

Background  

Accurate annotation of translation initiation sites (TISs) is essential for understanding the translation initiation mechanism. However, the reliability of TIS annotation in widely used databases such as RefSeq is uncertain due to the lack of experimental benchmarks.

5.
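Accurate localization of translation initiation sites (TISs, i.e., the 5' ends of genes) is a key problem in prokaryotic gene prediction, and genomic GC content and the diversity of translation initiation mechanisms are important factors limiting current TIS prediction performance. By integrating complex information on genome structure (including GC content, sequences adjacent to the TIS and upstream regulatory signals, coding potential, and operon structure), we developed a statistical model characterizing translation initiation mechanisms and used it to design a new TIS prediction algorithm, MED-StartPlus. MED-StartPlus was systematically compared with and evaluated against similar methods, including RBSfinder, GS-Finder, MED-Start, TiCo and Hon-yaku. Tests were performed on two kinds of data sets: the 14 currently known gene data sets with confirmed TISs, and gene data sets with known function from 300 species. The results show that the prediction accuracy of MED-StartPlus is higher overall than that of similar methods. MED-StartPlus has a clear advantage particularly for genomes with high GC content and for genomes with complex translation initiation mechanisms.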

6.
7.
MOTIVATION: Tightly packed prokaryotic genes frequently overlap with each other. This feature, rarely seen in eukaryotic DNA, makes detection of translation initiation sites and, therefore, exact predictions of prokaryotic genes notoriously difficult. Improving the accuracy of precise gene prediction in prokaryotic genomic DNA remains an important open problem. RESULTS: A software program implementing a new algorithm utilizing a uniform Hidden Markov Model for prokaryotic gene prediction was developed. The algorithm analyzes a given DNA sequence in each of six possible global reading frames independently. Twelve complete prokaryotic genomes were analyzed using the new tool. The accuracy of gene finding, predicting locations of protein-coding ORFs, as well as the accuracy of precise gene prediction, and detecting the whole gene including translation initiation codon were assessed by comparison with existing annotation. It was shown that in terms of gene finding, the program performs at least as well as the previously developed tools, such as GeneMark and GLIMMER. In terms of precise gene prediction the new program was shown to be more accurate, by several percentage points, than earlier developed tools, such as GeneMark.hmm, ECOPARSE and ORPHEUS. The results of testing the program indicated the possibility of systematic bias in start codon annotation in several early sequenced prokaryotic genomes. AVAILABILITY: The new gene-finding program can be accessed through the Web site: http://dixie.biology.gatech.edu/GeneMark/fbf.cgi CONTACT: mark@amber.gatech.edu.
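
The abstract above mentions analyzing a DNA sequence in each of six global reading frames. As a rough illustration only (this is not the Hidden Markov Model described in the abstract, and the function names `reverse_complement` and `six_frames` are hypothetical), the six frames of a sequence might be enumerated as follows:

```python
# Illustrative sketch: split a DNA sequence into codons for all six reading
# frames (three on the forward strand, three on the reverse complement).
# Function names are hypothetical and not taken from the cited program.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def six_frames(seq: str) -> dict:
    """Map frame labels '+1'..'+3' and '-1'..'-3' to lists of complete codons."""
    seq = seq.upper()
    frames = {}
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            # Step through the strand three bases at a time, keeping only
            # complete codons.
            codons = [s[i:i + 3] for i in range(offset, len(s) - 2, 3)]
            frames[f"{strand}{offset + 1}"] = codons
    return frames


if __name__ == "__main__":
    for label, codons in six_frames("ATGAAATAG").items():
        print(label, codons)
```

A gene finder would then score each frame separately, which is what allows overlapping genes in different frames to be detected independently.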

8.

Background  

Computational prediction methods are currently used to identify genes in prokaryote genomes. However, identification of the correct translation initiation sites remains a difficult task. Accurate translation initiation sites (TISs) are important not only for the annotation of unknown proteins but also for the prediction of operons, promoters, and small non-coding RNA genes, as this typically makes use of the intergenic distance. A further problem is that most existing methods are optimized for Escherichia coli data sets; applying these methods to newly sequenced bacterial genomes may not result in an equivalent level of accuracy.

9.
Liu H  Han H  Li J  Wong L 《In silico biology》2004,4(3):255-269
The translation initiation site (TIS) prediction problem is about how to correctly identify TIS in mRNA, cDNA, or other types of genomic sequences. High prediction accuracy can be helpful in a better understanding of protein coding from nucleotide sequences. This is an important step in genomic analysis to determine protein coding from nucleotide sequences. In this paper, we present an in silico method to predict translation initiation sites in vertebrate cDNA or mRNA sequences. This method consists of three sequential steps as follows. In the first step, candidate features are generated using k-gram amino acid patterns. In the second step, a small number of top-ranked features are selected by an entropy-based algorithm. In the third step, a classification model is built to recognize true TISs by applying support vector machines or ensembles of decision trees to the selected features. We have tested our method on several independent data sets, including two public ones and our own extracted sequences. The experimental results achieved are better than those reported previously using the same data sets. Our high accuracy not only demonstrates the feasibility of our method, but also indicates that there might be "amino acid" patterns around TIS in cDNA and mRNA sequences.

10.
In recent years, biotechnology has permitted regulation of the expression of endogenous plant genes to improve agronomically important traits. Genetic modification of crops has benefited from emerging knowledge of new genes, especially genes that exhibit novel functions, one of which is eukaryotic initiation factor 4E (eIF4E). eIF4E is one of the most important translation initiation factors involved in eukaryotic initiation. Recent research has demonstrated that virus resistance mediated by eIF4E and its isoform eIF(iso)4E occurs in several plant-virus interactions, thus indicating a potential new role for eIF4E/eIF(iso)4E in resistance strategies against plant viruses. In this review, we briefly describe eIF4E activity in plant translation, its potential role, and functions of the eIF4E subfamily in plant-virus interactions. Other initiation factors such as eIF4G could also play a role in plant resistance against viruses. Finally, the potential for developing eIF4E-mediated resistance to plant viruses in the future is discussed. Future research should focus on elucidation of the resistance mechanism and spectrum mediated by eIF4E. Knowledge of a particular plant-virus interaction will help to deepen our understanding of eIF4E and other eukaryotic initiation factors, and their involvement in virus disease control.

11.
Feature selection for the prediction of translation initiation sites   Total citations: 3 (self-citations: 0, citations by others: 3)
Translation initiation sites (TISs) are important signals in cDNA sequences. In many previous attempts to predict TISs in cDNA sequences, three major factors affect the prediction performance: the nature of the cDNA sequence sets, the relevant features selected, and the classification methods used. In this paper, we examine different approaches to select and integrate relevant features for TIS prediction. The top selected significant features include the features from the position weight matrix and the propensity matrix, the number of nucleotide C in the sequence downstream of the ATG, the number of downstream stop codons, the number of upstream ATGs, and the number of some amino acids, such as amino acids A and D. With the numerical data generated from these features, different classification methods, including decision tree, naive Bayes, and support vector machine, were applied to three independent sequence sets. The identified significant features were found to be biologically meaningful, while the experiments showed promising results.
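
Three of the count-based features listed in this abstract are simple enough to sketch directly. The snippet below is a minimal illustration under stated assumptions: the function name `tis_count_features` is hypothetical, stop codons are counted only in the reading frame of the candidate ATG (the abstract does not say whether the count is in-frame), and no windowing of the upstream or downstream region is applied.

```python
# Illustrative sketch of three count features for a candidate ATG in a cDNA
# sequence. Assumptions (not from the abstract): stop codons are counted
# in-frame with the candidate ATG, and the whole flanking sequence is used.

STOP_CODONS = {"TAA", "TAG", "TGA"}


def tis_count_features(seq: str, atg_pos: int) -> dict:
    """Compute simple count features around the ATG starting at atg_pos."""
    seq = seq.upper()
    if seq[atg_pos:atg_pos + 3] != "ATG":
        raise ValueError("atg_pos does not point at an ATG")

    upstream = seq[:atg_pos]
    downstream = seq[atg_pos + 3:]

    # Count every ATG occurrence upstream, in any frame.
    upstream_atgs = sum(
        1 for i in range(len(upstream) - 2) if upstream[i:i + 3] == "ATG"
    )
    # Count stop codons in the frame set by the candidate ATG.
    downstream_stops = sum(
        1
        for i in range(0, len(downstream) - 2, 3)
        if downstream[i:i + 3] in STOP_CODONS
    )
    return {
        "upstream_atgs": upstream_atgs,
        "downstream_in_frame_stops": downstream_stops,
        "downstream_c_count": downstream.count("C"),
    }


if __name__ == "__main__":
    print(tis_count_features("CATGCCATGGCCTAACCC", 6))
```

Feature vectors of this kind could then be passed to any of the classifiers the abstract names; the classifier itself is outside the scope of this sketch.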

12.
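Genome-wide prediction and analysis of CpG islands in the human genome   Total citations: 1 (self-citations: 0, citations by others: 1)
Methylation of CpG islands is an important mechanism of gene expression regulation in epigenetics. Although several criteria already exist for identifying CpG islands from DNA sequence, how to choose appropriate parameters within these criteria remains a focus of research. By analyzing and comparing two classic CpG island criteria and three prediction methods, this article proposes an improved CpG island prediction method, CpGISeeker. Using this method together with 13 parameter sets assembled from the three basic parameters of the criteria, CpG islands were predicted across the whole human genome, and the repeat-sequence composition of the CpG islands and their positional distribution relative to gene transcription start sites were statistically analyzed. The results indicate that CpGISeeker identifies CpG islands more precisely; they also suggest that as the criteria become more stringent, the repeat content of CpG islands decreases and their association with gene transcription start sites increases. A parameter set with a minimum CpG island size of 500 bp, a GC content of 60% and a CpG occurrence rate of 0.65 is currently the best way to predict CpG islands.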
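
The three thresholds named at the end of this abstract (500 bp, 60%, 0.65) can be checked for a single candidate region with a few lines of code. This is a hedged sketch, not the CpGISeeker method: it assumes the "CpG occurrence rate" is the commonly used observed/expected CpG ratio, (number of CpG × length) / (number of C × number of G), and the function names `cpg_stats` and `meets_cpg_criteria` are hypothetical.

```python
# Illustrative check of one candidate region against three CpG island
# thresholds. Assumption: the "CpG occurrence rate" is the observed/expected
# CpG ratio, (#CpG * length) / (#C * #G). This is not the CpGISeeker algorithm.


def cpg_stats(seq: str) -> tuple:
    """Return (GC content, observed/expected CpG ratio) for a sequence."""
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")
    gc_content = (c + g) / n if n else 0.0
    # The ratio is undefined without both C and G; report 0.0 in that case.
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return gc_content, obs_exp


def meets_cpg_criteria(seq: str, min_len: int = 500,
                       min_gc: float = 0.60, min_oe: float = 0.65) -> bool:
    """True if the region satisfies all three thresholds."""
    gc_content, obs_exp = cpg_stats(seq)
    return len(seq) >= min_len and gc_content >= min_gc and obs_exp >= min_oe


if __name__ == "__main__":
    print(cpg_stats("CGCG"))
    print(meets_cpg_criteria("CG" * 300))
    print(meets_cpg_criteria("AT" * 300))
```

A genome-wide scan would apply such a test over sliding windows and merge adjacent hits; how windows are chosen and merged is exactly where prediction methods differ, so it is left out here.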

13.
Identifying potential tRNA genes in genomic DNA sequences.   Total citations: 16 (self-citations: 0, citations by others: 16)
We have developed an algorithm that automatically and reproducibly identifies potential tRNA genes in genomic DNA sequences, and we present a general strategy for testing the sensitivity of such algorithms. This algorithm is useful for the flagging and characterization of long genomic sequences that have not been experimentally analyzed for identification of functional regions, and for the scanning of nucleotide sequence databases for errors in the sequences and the functional assignments associated with them. In an exhaustive scan of the GenBank database, 97.5% of the 744 known tRNA genes were correctly identified (true-positives), and 42 previously unidentified sequences were predicted to be tRNAs. A detailed analysis of these latter predictions reveals that 16 of the 42 are very similar to known tRNA genes, and we predict that they do, in fact, code for tRNA, yielding a false-positive rate for the algorithm of 0.003%. The new algorithm and testing strategy are a considerable improvement over any previously described strategies for recognizing tRNA genes, and they allow detections of genes (including introns) embedded in long genomic sequences.

14.
Protein synthesis in eukaryotic cells is fundamental for gene expression. This process involves the binding of an mRNA molecule to the small ribosomal subunit in a group of reactions catalyzed by eukaryotic translation initiation factors (eIF) eIF4. To date, the role of each of the four eIF4, i.e. eIF4E, eIF4G, eIF4A and eIF4B, is well established. However, with the advent of genome-wide sequencing projects of various organisms, families of genes for each translation initiation factor have been identified. Intriguingly, recent studies have now established that certain eIF4 proteins can promote or inhibit translation of specific mRNAs, and also that some of them are active in processes other than translation. In addition, there is evidence of tissue- and developmental-stage-specific expression for some of these proteins. These new findings point to an additional level of complexity in the translation initiation process. In this review, we analyze the latest advances concerning the functionality of members of the eIF4 families in eukaryotic organisms and discuss the implications of this in the context of our current understanding of regulation of the translation initiation process.

15.
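In eukaryotes, mRNA translation is a complex multistep process comprising three stages: initiation, elongation and termination. Regulation of the initiation stage is key to mRNA translation. Several modes of translation initiation have been discovered to date; the earliest discovered, the m7G cap-dependent scanning mechanism, is the canonical one. When cells are under stress and the canonical mechanism is inhibited, however, other types of initiation mechanism replace it to ensure that translation proceeds. This article reviews the different mRNA translation initiation mechanisms discovered so far in eukaryotes, in particular the alternatives to the canonical mechanism, with the aim of providing a reference for a deeper understanding of the regulation of eukaryotic gene expression at the translational level.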

16.
17.
MicroReview: Control of translation initiation in Saccharomyces cerevisiae   Total citations: 1 (self-citations: 0, citations by others: 1)
The first observations regarding the control of translation initiation in the yeast Saccharomyces cerevisiae were made by Fred Sherman and his colleagues in 1971. Elegant genetic studies of the CYC1 gene resulted in the formulation of 'Sherman's Rules' for translation initiation as follows: (i) AUG is the only initiator codon; (ii) the most proximal AUG from the 5' end of a message will serve as the start site of translation; and (iii) if the upstream AUG codon is mutated then initiation begins at the next available AUG in the message. Hidden within these rules is the mechanism of eukaryotic translation initiation, as these very same rules were later shown to apply to higher eukaryotic organisms and were formulated into the scanning model. However, only in the past five years has yeast been taken seriously as an organism for studying the mechanism of eukaryotic translation initiation. The basis for this is that the yeast genes for at least four mammalian translation initiation factor homologues have been identified and the number is growing. Similar factors suggest similar mechanisms for translation initiation between yeast and mammals. For some translation initiation factors, the genetics of yeast has provided new insights into their function. A mechanism for regulating translation initiation in mammalian cells is now evident in yeast. It seems clear that the molecular genetics of yeast coupled with the available in vitro translation system will provide a wealth of information in the future regarding translational control and regulatory mechanisms. The purpose of this review is to summarize what is known about translational control in S. cerevisiae.
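
Sherman's Rules as stated in this abstract amount to a left-to-right scan for the first AUG. The toy example below illustrates rules (ii) and (iii) only; the function name `first_aug` and the example sequences are hypothetical, and the sketch ignores sequence context, leaky scanning and every other refinement of the scanning model.

```python
# Toy illustration of Sherman's Rules (ii) and (iii): initiation at the AUG
# closest to the 5' end, falling back to the next AUG when the first one is
# mutated. Sequence context and leaky scanning are deliberately ignored.


def first_aug(mrna: str) -> int:
    """Return the 0-based index of the 5'-most AUG, or -1 if none exists."""
    # Accept DNA-style input by converting T to U before scanning.
    return mrna.upper().replace("T", "U").find("AUG")


if __name__ == "__main__":
    wild_type = "GGAUGCCAUGAA"
    mutant = "GGAUCCCAUGAA"  # first AUG changed to AUC
    print(first_aug(wild_type))  # index of the 5'-proximal AUG
    print(first_aug(mutant))     # index of the next available AUG
```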

18.
Initiation of translation in prokaryotes and eukaryotes.   Total citations: 74 (self-citations: 0, citations by others: 74)
M Kozak 《Gene》1999,234(2):187-208

19.
20.