Similar articles
 Found 20 similar articles (search took 31 ms)
3.
Operon prediction without a training set   Total citations: 5 (self-citations: 0, citations by others: 5)

4.
We have carried out a systematic analysis of how a set of selected features, three of them new, contributes to the accuracy of operon prediction. Our analyses have led to a number of new insights about operon prediction, including that (i) different features have different levels of discerning power when used on adjacent gene pairs with different ranges of intergenic distance, (ii) certain features are universally useful for operon prediction while others are more genome-specific and (iii) the prediction reliability of operons depends on intergenic distance. Based on these insights, our newly developed operon-prediction program achieves more accurate operon prediction than previous ones, and it uses features that are readily available from genomic sequences. Our prediction results indicate that our (non-linear) decision tree-based classifier can predict operons in a prokaryotic genome very accurately when a substantial number of operons in the genome are already known. For example, the prediction accuracy of our program reaches 90.2% and 93.7% on the Bacillus subtilis and Escherichia coli genomes, respectively. When no such information is available, our (linear) logistic function-based classifier reaches prediction accuracies of 84.6% and 83.3% for E. coli and B. subtilis, respectively.
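As a rough illustration of the logistic function-based idea mentioned in this abstract, the sketch below scores an adjacent gene pair from its intergenic distance alone. The function name, the same-strand rule, and both weights are hypothetical placeholders chosen for illustration; they are not the features or fitted parameters of the published classifier.

```python
import math


def operon_pair_probability(intergenic_distance_bp, same_strand,
                            w0=2.0, w_dist=-0.02):
    """Hypothetical logistic score that two adjacent genes share an operon.

    w0 and w_dist are made-up weights (assumption), not values from the paper.
    """
    # Assumption: genes on opposite strands are never scored as one operon.
    if not same_strand:
        return 0.0
    # Linear combination of features passed through the logistic function.
    z = w0 + w_dist * intergenic_distance_bp
    return 1.0 / (1.0 + math.exp(-z))


if __name__ == "__main__":
    for distance in (10, 100, 300):
        print(distance, round(operon_pair_probability(distance, True), 3))
```

With these placeholder weights the score falls as the intergenic distance grows, which mirrors the abstract's point that prediction reliability depends on intergenic distance.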

5.
Prediction of operons in microbial genomes   Total citations: 28 (self-citations: 7, citations by others: 21)

7.
Conservation of adjacency as evidence of paralogous operons   Total citations: 5 (self-citations: 2, citations by others: 3)

10.
The study of conserved gene clusters is important for understanding the forces behind genome organization and evolution, as well as the function of individual genes or gene groups. In this paper, we present a new model and algorithm for identifying conserved gene clusters from pairwise genome comparison. This generalizes a recent model called "gene teams." A gene team is a set of genes that appear homologously in two or more species, possibly in a different order, yet with the distance between adjacent genes of the team on each chromosome never exceeding a certain threshold. We remove the constraint in the original model that each gene must have a unique occurrence in each chromosome, and thus allow the analysis of complex prokaryotic or eukaryotic genomes with extensive paralogs. Our algorithm analyzes a pair of chromosomes in O(mn) time and uses O(m+n) space, where m and n are the numbers of genes in the respective chromosomes. We demonstrate the utility of our methods by studying two bacterial genomes, E. coli K-12 and B. subtilis. Many of the teams identified by our algorithm correlate with documented E. coli operons, while several others match predicted operons previously suggested by computational techniques. Our implementation and data are publicly available at euler.slu.edu/~goldwasser/homologyteams/.
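The distance condition in the gene-team definition above can be sketched as a simple check. This is a minimal illustration of the original unique-occurrence model only, not the O(mn) algorithm the abstract describes; it does not test whether the set is maximal, and the function names and the position dictionaries are assumptions made for the example.

```python
def max_gap_ok(positions, delta):
    """True if consecutive sorted positions are never more than delta apart."""
    ordered = sorted(positions)
    return all(b - a <= delta for a, b in zip(ordered, ordered[1:]))


def is_gene_team(team, chrom_a, chrom_b, delta):
    """Check the gene-team distance condition on two chromosomes.

    chrom_a and chrom_b map a gene family name to its single position
    (assumption: each family occurs exactly once per chromosome).
    """
    # Every team member must be present on both chromosomes.
    if not all(g in chrom_a and g in chrom_b for g in team):
        return False
    # Gene order may differ, so each chromosome is checked independently.
    return (max_gap_ok([chrom_a[g] for g in team], delta)
            and max_gap_ok([chrom_b[g] for g in team], delta))


if __name__ == "__main__":
    chrom_a = {"a": 1, "b": 3, "c": 4, "d": 20}
    chrom_b = {"c": 10, "a": 11, "b": 13, "d": 14}
    print(is_gene_team({"a", "b", "c"}, chrom_a, chrom_b, delta=2))
    print(is_gene_team({"a", "b", "c", "d"}, chrom_a, chrom_b, delta=2))
```

In the example, the three genes a, b, c appear in a different order on the two chromosomes but stay within the threshold on both, while adding d breaks the condition on the first chromosome.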

18.
The availability of a growing number of completely sequenced genomes opens new opportunities for understanding complex biological systems. The success of genome-based biology will, to a large extent, depend on the development of new approaches and tools for efficient comparative analysis of genomes and their organization. We have developed a technique for detecting possible functional coupling between genes based on the detection of potential operons. The approach involves computation of "pairs of close bidirectional best hits", which are pairs of genes that apparently occur within operons in multiple genomes. Using these pairs, one can compose evidence (based on the number of distinct genomes and the phylogenetic distance between the orthologous pairs) that a pair of genes is potentially functionally coupled. The technique has revealed a surprisingly rich and apparently accurate set of functionally coupled genes. The approach depends on the use of a relatively large number of genomes, and the amount of detected coupling grows dramatically as the number of genomes increases.
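One way to picture "pairs of close bidirectional best hits" is the sketch below for two genomes. It assumes the best-hit tables have already been computed by some similarity search, and it treats "close" as same strand with a gap of at most max_gap base pairs. That closeness rule, the 300 bp default, and all names are assumptions made for illustration, not details taken from the abstract.

```python
from itertools import combinations


def bidirectional_best_hits(best_a_to_b, best_b_to_a):
    """Keep gene a -> gene b only when each is the other's best hit."""
    return {a: b for a, b in best_a_to_b.items() if best_b_to_a.get(b) == a}


def is_close(g1, g2, coords, max_gap):
    """coords maps gene -> (start, end, strand); closeness rule is assumed."""
    (s1, e1, strand1), (s2, e2, strand2) = coords[g1], coords[g2]
    if strand1 != strand2:
        return False
    # Gap between the two genes; negative when they overlap.
    gap = max(s1, s2) - min(e1, e2)
    return gap <= max_gap


def close_bbh_pairs(best_a_to_b, best_b_to_a, coords_a, coords_b, max_gap=300):
    """Gene pairs that are close in genome A whose BBH partners are close in B."""
    bbh = bidirectional_best_hits(best_a_to_b, best_b_to_a)
    pairs = []
    for xa, ya in combinations(sorted(bbh), 2):
        if (is_close(xa, ya, coords_a, max_gap)
                and is_close(bbh[xa], bbh[ya], coords_b, max_gap)):
            pairs.append(((xa, ya), (bbh[xa], bbh[ya])))
    return pairs
```

Counting how many distinct genomes support the same pair, as the abstract describes, would be a further step on top of this pairwise comparison.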

20.
Chen X, Su Z, Dam P, Palenik B, Xu Y, Jiang T. Nucleic Acids Research, 2004, 32(7): 2147-2157
We present a computational method for operon prediction based on a comparative genomics approach. A group of consecutive genes is considered a candidate operon if both their gene sequences and functions are conserved across several phylogenetically related genomes. In addition, various supporting data for operons are collected through the application of public domain computer programs and used in our prediction method. These include the prediction of conserved gene functions, promoter motifs and terminators. An apparent advantage of our approach over other operon prediction methods is that it does not require much experimental data (such as gene expression data and pathway data) as input. This feature makes it applicable to many newly sequenced genomes that do not have extensive experimental information. To validate our prediction, we tested the method on Escherichia coli K12, in which operon structures have been extensively studied, through a comparative analysis against Haemophilus influenzae Rd and Salmonella typhimurium LT2. Our method successfully predicted most of the 237 known operons. After this initial validation, we applied the method to a newly sequenced and annotated microbial genome, Synechococcus sp. WH8102, through a comparative genome analysis with two other cyanobacterial genomes, Prochlorococcus marinus sp. MED4 and P. marinus sp. MIT9313. Our results are consistent with previously reported results and statistics on operons in the literature.
