Similar literature
 20 similar documents found
1.
Operon prediction without a training set   Total citations: 5, self-citations: 0, citations by others: 5

2.
3.
4.
5.
We have carried out a systematic analysis of how a set of selected features, including three new ones, contributes to the accuracy of operon prediction. Our analyses have led to a number of new insights about operon prediction, including that (i) different features have different levels of discerning power when used on adjacent gene pairs with different ranges of intergenic distance, (ii) certain features are universally useful for operon prediction while others are more genome-specific and (iii) the prediction reliability of operons is dependent on intergenic distances. Based on these new insights, our newly developed operon-prediction program achieves more accurate operon prediction than the previous ones, and it uses features that are most readily available from genomic sequences. Our prediction results indicate that our (non-linear) decision tree-based classifier can predict operons in a prokaryotic genome very accurately when a substantial number of operons in the genome are already known. For example, the prediction accuracy of our program can reach 90.2% and 93.7% on the Bacillus subtilis and Escherichia coli genomes, respectively. When no such information is available, our (linear) logistic function-based classifier can reach prediction accuracies of 84.6% and 83.3% for E. coli and B. subtilis, respectively.
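As a rough illustration of what a linear, logistic function-based classifier over gene-pair features looks like, here is a minimal sketch. The function names, the single feature (intergenic distance in base pairs), and the weight and bias values are all hypothetical; the abstract does not give the authors' features in detail or any fitted coefficients.

```python
import math


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def same_operon_probability(features, weights, bias):
    """Score an adjacent gene pair with a linear logistic model.

    `features` and `weights` are equal-length sequences of floats.
    The values used below are made up for illustration only.
    """
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


# Hypothetical one-feature model: probability drops as the gap widens.
WEIGHTS = [-0.02]  # assumed weight per base pair of intergenic distance
BIAS = 1.0         # assumed intercept

p_overlap = same_operon_probability([0.0], WEIGHTS, BIAS)
p_border = same_operon_probability([50.0], WEIGHTS, BIAS)
p_far = same_operon_probability([300.0], WEIGHTS, BIAS)
```

With these assumed numbers the decision boundary sits at a 50 bp gap; in practice the weights would be estimated from data and more features would be included.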

6.
7.
8.
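Accurate annotation of operon structure in prokaryotes is of great importance for studying gene function and gene regulatory networks, and computational prediction by bioinformatics methods is currently the main source of operon annotation for genomes. Most current prediction algorithms require experimentally confirmed operons as a training set, but the scarcity of such data has long been a bottleneck in algorithm development. Based on an understanding of operon structure, and starting from features such as intergenic distance, regulatory signals related to transcription and translation, and COG functional annotation, a probabilistic model describing the complex structure of operons was built, and an iterative self-learning algorithm that does not rely on operon data from a specific species as a training set was proposed. Tests and comparisons on experimentally verified operon datasets show that the algorithm is very effective for predicting operon structure. Without relying on any known operon information, the algorithm exceeds the best current operon prediction methods in overall prediction performance, and this self-learning approach outperforms algorithms trained on a specific species. These characteristics make the algorithm applicable to newly sequenced species, which distinguishes it from commonly used operon prediction methods. A large-scale comparative analysis of bacterial and archaeal genomes further improved the understanding of the general features and species-specificity of genome operon structure.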
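The abstract does not spell out the probabilistic model or the update rule, so the following is only a toy illustration of the general idea of iterative self-learning without labelled operons. It is not the authors' algorithm: it uses a single feature (intergenic distance), hard label assignments, and a nearest-class-mean rule, all of which are simplifying assumptions, as are the function name and the initial threshold.

```python
from statistics import mean


def self_train_on_distance(distances, init_threshold=50.0, max_iter=100):
    """Label gene pairs as within-operon (True) or not (False) without
    any training set, by alternating two steps until labels stop changing:

    1. estimate the mean distance of each class from the current labels;
    2. relabel every pair by whichever class mean it is closer to.

    The starting labels come from an assumed distance threshold.
    """
    labels = [d <= init_threshold for d in distances]
    for _ in range(max_iter):
        inside = [d for d, is_in in zip(distances, labels) if is_in]
        outside = [d for d, is_in in zip(distances, labels) if not is_in]
        if not inside or not outside:
            break  # one class is empty; nothing left to estimate
        m_in, m_out = mean(inside), mean(outside)
        new_labels = [abs(d - m_in) <= abs(d - m_out) for d in distances]
        if new_labels == labels:
            break  # converged
        labels = new_labels
    return labels


# Made-up intergenic distances in base pairs (negative means overlap).
toy_distances = [5, 10, -4, 20, 150, 300, 220, 60]
toy_labels = self_train_on_distance(toy_distances)
```

In this toy run the pair with a 60 bp gap starts outside the operon class under the initial 50 bp threshold and is moved into it once the class means are re-estimated, which is the kind of self-correction an iterative scheme is meant to provide.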

9.
Yan Y, Moult J. Proteins, 2006, 64(3): 615-628
Operons are clusters of genes that are transcribed as a single message, and regulated by the same gene expression machinery. They are found primarily in prokaryotic genomes. Because genes in the same operon are likely to have related functions, identification of the operon structure is potentially useful for assigning gene function. We report the development and benchmarking of two different methods for detecting operons, based on an analysis of 42 fully sequenced prokaryotic organisms. The Gene Neighbor Method (GNM) utilizes the relatively high conservation of gene order in operons, compared with genes in general. The Gene Gap Method (GGM) makes use of the relatively short gap between genes in operons compared with that otherwise found between adjacent genes. The methods have been benchmarked using KEGG pathway data and RegulonDB Escherichia coli operon data. With optimum parameters, the specificity of the GNM is 93% and the sensitivity is 70%. For the GGM, the specificity is 95% and the sensitivity is 68%. Together, the two methods have a sensitivity of 87.2%, while joint predictions have a sensitivity of 50% and a specificity of 98%. The methods are used to infer possible functions for some hypothetical genes in prokaryotic genomes. The methods have proven a useful addition to structure information in deriving protein function in a structural genomics project.
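The gap-based idea can be sketched in a few lines: adjacent genes on the same strand are called co-operonic when the gap between them is short. This is a simplified sketch, not the published GGM. The `Gene` type, the 50 bp cutoff, and the toy coordinates are hypothetical, and the metric definitions used here (sensitivity = TP/(TP+FN), specificity = TN/(TN+FP)) are the common ones and may differ from those used in the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    name: str
    start: int   # 1-based, inclusive
    end: int     # 1-based, inclusive
    strand: str  # "+" or "-"


def predict_pairs(genes, max_gap=50):
    """Map each adjacent gene pair to True if predicted to share an operon:
    same strand and at most `max_gap` bases between them (assumed cutoff)."""
    ordered = sorted(genes, key=lambda g: g.start)
    preds = {}
    for a, b in zip(ordered, ordered[1:]):
        gap = b.start - a.end - 1  # negative when the genes overlap
        preds[(a.name, b.name)] = a.strand == b.strand and gap <= max_gap
    return preds


def sensitivity_specificity(preds, truth):
    """Return (sensitivity, specificity); None where undefined."""
    tp = sum(1 for k, p in preds.items() if p and truth[k])
    fn = sum(1 for k, p in preds.items() if not p and truth[k])
    tn = sum(1 for k, p in preds.items() if not p and not truth[k])
    fp = sum(1 for k, p in preds.items() if p and not truth[k])
    sens = tp / (tp + fn) if tp + fn else None
    spec = tn / (tn + fp) if tn + fp else None
    return sens, spec


# Made-up genes and reference labels, for illustration only.
toy_genes = [
    Gene("a", 1, 100, "+"), Gene("b", 111, 200, "+"),
    Gene("c", 500, 600, "+"), Gene("d", 620, 700, "-"),
    Gene("e", 800, 900, "-"),
]
toy_truth = {("a", "b"): True, ("b", "c"): False,
             ("c", "d"): False, ("d", "e"): True}
toy_preds = predict_pairs(toy_genes)
toy_sens, toy_spec = sensitivity_specificity(toy_preds, toy_truth)
```

The toy data include one true operon pair with a 99 bp gap, which a strict cutoff misses; this mirrors the trade-off in the abstract, where a gap criterion gives high specificity at the cost of sensitivity.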

10.
11.
12.
13.
14.
15.
16.
17.
Escherichia coli G3/10 is a component of the probiotic drug Symbioflor 2. In an in vitro assay with human intestinal epithelial cells, E. coli G3/10 is capable of suppressing adherence of enteropathogenic E. coli E2348/69. In this study, we demonstrate that a completely novel class II microcin, produced by probiotic E. coli G3/10, is responsible for this behavior. We named this antibacterial peptide microcin S (MccS). Microcin S is encoded on a 50.6 kb megaplasmid of E. coli G3/10, which we have completely sequenced and annotated. The microcin S operon is about 4.7 kb in size and comprises four genes. Subcloning of the genes and gene fragments followed by gene expression experiments enabled us to functionally characterize all members of this operon, and to clearly identify the nucleotide sequences encoding the microcin itself (mcsS), its transport apparatus and the gene mcsI conferring self-immunity against microcin S. Overexpression of cloned mcsI antagonizes MccS activity, thus protecting indicator strain E. coli E2348/69 in the in vitro adherence assay. Moreover, growth of E. coli transformed with a plasmid containing mcsS under control of an araC PBAD activator-promoter is inhibited upon mcsS induction. Our data provide further mechanistic insight into the probiotic behavior of E. coli G3/10.

18.
19.
20.