首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 46 毫秒
1.
原核生物操纵子结构的准确注释对基因功能和基因调控网络的研究具有重要意义,通过生物信息学方法计算预测是当前基因组操纵子结构注释的最主要来源.当前的预测算法大都需要实验确认的操纵子作为训练集,但实验确认的操纵子数据的缺乏一直成为发展算法的瓶颈.基于对操纵子结构的认识,从基因间距离、转录翻译相关的调控信号以及COG功能注释等特征出发,建立了描述操纵子复杂结构的概率模型,并提出了不依赖于特定物种操纵子数据作为训练集的迭代自学习算法.通过对实验验证的操纵子数据集的测试比较,结果表明算法对于预测操纵子结构非常有效.在不依赖于任何已知操纵子信息的情况下,算法在总体预测水平上超过了目前最好的操纵子预测方法,而且这种自学习的预测算法要优于依赖特定物种进行训练的算法.这些特点使得该算法能够适用于新测序的物种,有别于当前常用的操纵子预测方法.对细菌和古细菌的基因组进行大规模比较分析,进一步提高了对基因组操纵子结构的普遍特征和物种特异性的认识.  相似文献   

2.
We have carried out a systematic analysis of the contribution of a set of selected features that include three new features to the accuracy of operon prediction. Our analyses have led to a number of new insights about operon prediction, including that (i) different features have different levels of discerning power when used on adjacent gene pairs with different ranges of intergenic distance, (ii) certain features are universally useful for operon prediction while others are more genome-specific and (iii) the prediction reliability of operons is dependent on intergenic distances. Based on these new insights, our newly developed operon-prediction program achieves more accurate operon prediction than the previous ones, and it uses features that are most readily available from genomic sequences. Our prediction results indicate that our (non-linear) decision tree-based classifier can predict operons in a prokaryotic genome very accurately when a substantial number of operons in the genome are already known. For example, the prediction accuracy of our program can reach 90.2 and 93.7% on Bacillus subtilis and Escherichia coli genomes, respectively. When no such information is available, our (linear) logistic function-based classifier can reach the prediction accuracy at 84.6 and 83.3% for E.coli and B.subtilis, respectively.  相似文献   

3.
4.
A fuzzy guided genetic algorithm for operon prediction   总被引:4,自引:0,他引:4  
Motivation: The operon structure of the prokaryotic genome isa critical input for the reconstruction of regulatory networksat the whole genome level. As experimental methods for the detectionof operons are difficult and time-consuming, efforts are beingput into developing computational methods that can use availablebiological information to predict operons. Method: A genetic algorithm is developed to evolve a startingpopulation of putative operon maps of the genome into progressivelybetter predictions. Fuzzy scoring functions based on multiplecriteria are used for assessing the ‘fitness’ ofthe newly evolved operon maps and guiding their evolution. Results: The algorithm organizes the whole genome into operons.The fuzzy guided genetic algorithm-based approach makes it possibleto use diverse biological information like genome sequence data,functional annotations and conservation across multiple genomes,to guide the organization process. This approach does not requireany prior training with experimental operons. The predictionsfrom this algorithm for Escherchia coli K12 and Bacillus subtilisare evaluated against experimentally discovered operons forthese organisms. The accuracy of the method is evaluated usingan ROC (receiver operating characteristic) analysis. The areaunder the ROC curve is around 0.9, which indicates excellentaccuracy. Contact: roschen_csir{at}rediffmail.com  相似文献   

5.
Accurate prediction of operons can improve the functional annotation and application of genes within operons in prokaryotes. Here, we review several features: (i) intergenic distance, (ii) metabolic pathways, (iii) homologous genes, (iv) promoters and terminators, (v) gene order conservation, (vi) microarray, (vii) clusters of orthologous groups, (viii) gene length ratio, (ix) phylogenetic profiles, (x) operon length/size and (xi) STRING database scores, as well as some other features, which have been applied in recent operon prediction methods in prokaryotes in the literature. Based on a comparison of the prediction performances of these features, we conclude that other, as yet undiscovered features, or feature selection with a receiver operating characteristic analysis before algorithm processing can improve operon prediction in prokaryotes.  相似文献   

6.
Operon prediction in Pyrococcus furiosus   总被引:1,自引:0,他引:1  
Identification of operons in the hyperthermophilic archaeon Pyrococcus furiosus represents an important step to understanding the regulatory mechanisms that enable the organism to adapt and thrive in extreme environments. We have predicted operons in P.furiosus by combining the results from three existing algorithms using a neural network (NN). These algorithms use intergenic distances, phylogenetic profiles, functional categories and gene-order conservation in their operon prediction. Our method takes as inputs the confidence scores of the three programs, and outputs a prediction of whether adjacent genes on the same strand belong to the same operon. In addition, we have applied Gene Ontology (GO) and KEGG pathway information to improve the accuracy of our algorithm. The parameters of this NN predictor are trained on a subset of all experimentally verified operon gene pairs of Bacillus subtilis. It subsequently achieved 86.5% prediction accuracy when applied to a subset of gene pairs for Escherichia coli, which is substantially better than any of the three prediction programs. Using this new algorithm, we predicted 470 operons in the P.furiosus genome. Of these, 349 were validated using DNA microarray data.  相似文献   

7.
Detecting uber-operons in prokaryotic genomes   总被引:3,自引:1,他引:3       下载免费PDF全文
Che D  Li G  Mao F  Wu H  Xu Y 《Nucleic acids research》2006,34(8):2418-2427
  相似文献   

8.
Operon prediction without a training set   总被引:5,自引:0,他引:5  
  相似文献   

9.
10.
11.
翻译起始位点(TIS,即基因5’端)的精确定位是原核生物基因预测的一个关键问题,而基因组GC含量和翻译起始机制的多样性是影响当前TIS预测水平的重要因素.结合基因组结构的复杂信息(包括GC含量、TIS邻近序列及上游调控信号、序列编码潜能、操纵子结构等),发展刻画翻译起始机制的数学统计模型,据此设计TIS预测的新算法MED.StartPlus.并将MED.StartPlus与同类方法RBSfinder、GS.Finder、MED-Start、TiCo和Hon-yaku等进行系统地比较和评价.测试针对两种数据集进行:当前14个已知的TIS被确认的基因数据集,以及300个物种中功能已知的基因数据集.测试结果表明,MED-StartPlus的预测精度在总体上超过同类方法.尤其是对高GC含量基因组以及具有复杂翻译起始机制的基因组,MED-StartPlus具有明显的优势.  相似文献   

12.
Operons are a major feature of all prokaryotic genomes, but how and why operon structures vary is not well understood. To elucidate the life-cycle of operons, we compared gene order between Escherichia coli K12 and its relatives and identified the recently formed and destroyed operons in E. coli. This allowed us to determine how operons form, how they become closely spaced, and how they die. Our findings suggest that operon evolution may be driven by selection on gene expression patterns. First, both operon creation and operon destruction lead to large changes in gene expression patterns. For example, the removal of lysA and ruvA from ancestral operons that contained essential genes allowed their expression to respond to lysine levels and DNA damage, respectively. Second, some operons have undergone accelerated evolution, with multiple new genes being added during a brief period. Third, although genes within operons are usually closely spaced because of a neutral bias toward deletion and because of selection against large overlaps, genes in highly expressed operons tend to be widely spaced because of regulatory fine-tuning by intervening sequences. Although operon evolution may be adaptive, it need not be optimal: new operons often comprise functionally unrelated genes that were already in proximity before the operon formed.  相似文献   

13.
14.
Yan Y  Moult J 《Proteins》2006,64(3):615-628
Operons are clusters of genes that are transcribed as a single message, and regulated by the same gene expression machinery. They are found primarily in prokaryotic genomes. Because genes in the same operon are likely to have related functions, identification of the operon structure is potentially useful for assigning gene function. We report the development and benchmarking of two different methods for detecting operons, based on an analysis of 42 fully sequenced prokaryotic organisms. The Gene Neighbor method (GNM) utilizes the relatively high conservation of gene order in operons, compared with genes in general. The Gene Gap Method (GGM) makes use of the relatively short gap between genes in operons compared with that otherwise found between adjacent genes. The methods have been benchmarked using KEGG pathway data and RegulonDB Escherichia coli operon data. With optimum parameters, the specificity of the GNM is 93% and the sensitivity is 70%. For the GGM, the specificity is 95% and the sensitivity is 68%. Together, the two methods have a sensitivity of 87.2%, while joint predictions have a sensitivity of 50% and a specificity of 98%. The methods are used to infer possible functions for some hypothetical genes in prokaryotic genomes. The methods have proven a useful addition to structure information in deriving protein function in a structural genomics project.  相似文献   

15.
16.
Operons are an important feature of prokaryotic genomes. Evolution of operons is hypothesized to be adaptive and has contributed significantly towards coordinated optimization of functions. Two conflicting theories, based on (i) in situ formation to achieve co-regulation and (ii) horizontal gene transfer of functionally linked gene clusters, are generally considered to explain why and how operons have evolved. Furthermore, effects of operon evolution on genomic traits such as intergenic spacing, operon size and co-regulation are relatively less explored. Based on the conservation level in a set of diverse prokaryotes, we categorize the operonic gene pair associations and in turn the operons as ancient and recently formed. This allowed us to perform a detailed analysis of operonic structure in cyanobacteria, a morphologically and physiologically diverse group of photoautotrophs. Clustering based on operon conservation showed significant similarity with the 16S rRNA-based phylogeny, which groups the cyanobacterial strains into three clades. Clade C, dominated by strains that are believed to have undergone genome reduction, shows a larger fraction of operonic genes that are tightly packed in larger sized operons. Ancient operons are in general larger, more tightly packed, better optimized for co-regulation and part of key cellular processes. A sub-clade within Clade B, which includes Synechocystis sp. PCC 6803, shows a reverse trend in intergenic spacing. Our results suggest that while in situ formation and vertical descent may be a dominant mechanism of operon evolution in cyanobacteria, optimization of intergenic spacing and co-regulation are part of an ongoing process in the life-cycle of operons.  相似文献   

17.
18.
Since operons are unstable across Prokaryotes, it has been suggested that perhaps they re-combine in a conservative manner. Thus, genes belonging to a given operon in one genome might re-associate in other genomes revealing functional relationships among gene products. We developed a system to build networks of functional relationships of gene products based on their organization into operons in any available genome. The operon predictions are based on inter-genic distances. Our system can use different kinds of thresholds to accept a functional relationship, either related to the prediction of operons, or to the number of non-redundant genomes that support the associations. We also work by shells, meaning that we decide on the number of linking iterations to allow for the complementation of related gene sets. The method shows high reliability benchmarked against knowledge-bases of functional interactions. We also illustrate the use of Nebulon in finding new members of regulons, and of other functional groups of genes. Operon rearrangements produce thousands of high-quality new interactions per prokaryotic genome, and thousands of confirmations per genome to other predictions, making it another important tool for the inference of functional interactions from genomic context.  相似文献   

19.
Almost 50 years following the discovery of the prokaryotic operon, the functional relevance of gene order within operons remains unclear. In this work, we take advantage of the eroded genome of Mycobacterium leprae to add evidence supporting the notion that functionally less important genes have a tendency to be located at the end of its operons. M. leprae’s genome includes 1133 pseudogenes and 1614 protein-coding genes and can be compared with the close genome of M. tuberculosis. Assuming M. leprae’s pseudogenes to represent dispensable genes, we have studied the position of these pseudogenes in the operons of M. leprae and of their orthologs in M. tuberculosis. We observed that both tend to be located in the 3′ (downstream) half of the operon (P-values of 0.03 and 0.18, respectively). Analysis of pseudogenes in all available prokaryotic genomes confirms this trend (P-value of 7.1 × 10−7). In a complementary analysis, we found a significant tendency for essential genes to be located at the 5′ (upstream) half of the operon (P-value of 0.006). Our work provides an indication that, in prokarya, functionally less important genes have a tendency to be located at the end of operons, while more relevant genes tend to be located toward operon starts.  相似文献   

20.
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号