Similar Literature
 Found 20 similar documents (search time: 31 ms)
1.
Next-generation sequencing projects continue to drive a vast accumulation of metagenomic sequence data. Given the growth rate of this data, automated approaches to functional annotation are indispensable, and a cornerstone heuristic of many computational protocols is the concept of guilt by association. The guilt by association paradigm has been heavily exploited by genomic context methods that offer functional predictions complementary to homology-based annotations, thereby offering a means to extend functional annotation. In particular, operon methods that exploit co-directional intergenic distances can provide homology-free functional annotation through the transfer of functions among co-operonic genes, under the assumption that guilt by association is indeed applicable. Although guilt by association is a well-accepted annotative device, its applicability to metagenomic functional annotation has not been definitively demonstrated. Here a large-scale assessment of metagenomic guilt by association is undertaken where functional associations are predicted on the basis of co-directional intergenic distances. Specifically, functional annotations are compared within pairs of adjacent co-directional genes, as well as operons of various lengths (i.e. number of member genes), in order to reveal new information about annotative cohesion versus operon length. The results suggest that co-directional gene pairs offer reduced confidence for metagenomic guilt by association due to difficulty in resolving the existence of functional associations when intergenic distance is the sole predictor of pairwise gene interactions. However, metagenomic operons, particularly those with substantial lengths, appear to be capable of providing a superior basis for metagenomic guilt by association due to increased annotative stability. The need for improved recognition of metagenomic operons is discussed, as well as the limitations of the present work.
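
As a rough illustration of predicting operons from co-directional intergenic distances, the sketch below chains adjacent genes that share a strand and are separated by a small gap. The `Gene` type, the `predict_operons` function, the 50 bp `max_gap` cutoff, and the example coordinates are all assumptions made for illustration; the abstract does not give the threshold or procedure the authors used.

```python
from typing import List, NamedTuple, Optional


class Gene(NamedTuple):
    name: str
    start: int   # inclusive start coordinate on the contig
    end: int     # inclusive end coordinate on the contig
    strand: str  # "+" or "-"


def predict_operons(genes: List[Gene], max_gap: int = 50) -> List[List[str]]:
    """Chain adjacent same-strand genes whose intergenic gap is <= max_gap.

    max_gap is a hypothetical cutoff chosen for this example only.
    Genes with no qualifying neighbour come back as single-gene units.
    """
    ordered = sorted(genes, key=lambda g: g.start)
    operons: List[List[str]] = []
    previous: Optional[Gene] = None
    for gene in ordered:
        joins_previous = (
            previous is not None
            and gene.strand == previous.strand
            and gene.start - previous.end - 1 <= max_gap
        )
        if joins_previous:
            operons[-1].append(gene.name)
        else:
            operons.append([gene.name])
        previous = gene
    return operons


# Made-up coordinates, not taken from any real metagenome.
genes = [
    Gene("a", 1, 900, "+"),
    Gene("b", 920, 1800, "+"),
    Gene("c", 1810, 2500, "+"),
    Gene("d", 3000, 3900, "-"),
    Gene("e", 3950, 4800, "-"),
    Gene("f", 5200, 6000, "+"),
]
operons = predict_operons(genes)
print(operons)
```

Under these assumptions, functional annotations could then be compared within each multi-gene group, which is the kind of cohesion check the abstract describes for operons of different lengths.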

2.
Since operons are unstable across Prokaryotes, it has been suggested that perhaps they re-combine in a conservative manner. Thus, genes belonging to a given operon in one genome might re-associate in other genomes revealing functional relationships among gene products. We developed a system to build networks of functional relationships of gene products based on their organization into operons in any available genome. The operon predictions are based on inter-genic distances. Our system can use different kinds of thresholds to accept a functional relationship, either related to the prediction of operons, or to the number of non-redundant genomes that support the associations. We also work by shells, meaning that we decide on the number of linking iterations to allow for the complementation of related gene sets. The method shows high reliability benchmarked against knowledge-bases of functional interactions. We also illustrate the use of Nebulon in finding new members of regulons, and of other functional groups of genes. Operon rearrangements produce thousands of high-quality new interactions per prokaryotic genome, and thousands of confirmations per genome to other predictions, making it another important tool for the inference of functional interactions from genomic context.
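
The genome-support threshold described above can be sketched as follows. This is a minimal illustration, not the system's actual implementation: `build_network`, `min_genomes`, and the family identifiers are hypothetical names, and the input is assumed to be predicted operons already expressed as ortholog-family identifiers for a set of non-redundant genomes.

```python
from collections import defaultdict
from itertools import combinations
from typing import Dict, List, Set, Tuple


def build_network(
    operons_by_genome: Dict[str, List[List[str]]],
    min_genomes: int = 2,
) -> Dict[Tuple[str, str], int]:
    """Link two gene families when they share a predicted operon in at
    least min_genomes genomes (a hypothetical acceptance threshold)."""
    support: Dict[Tuple[str, str], Set[str]] = defaultdict(set)
    for genome, operons in operons_by_genome.items():
        for operon in operons:
            # Sorting makes each pair key independent of gene order.
            for pair in combinations(sorted(set(operon)), 2):
                support[pair].add(genome)
    return {
        pair: len(genomes)
        for pair, genomes in support.items()
        if len(genomes) >= min_genomes
    }


# Made-up example data.
operons_by_genome = {
    "genome1": [["famA", "famB", "famC"]],
    "genome2": [["famB", "famC"], ["famA", "famD"]],
    "genome3": [["famC", "famB"]],
}
network = build_network(operons_by_genome)
print(network)
```

Raising `min_genomes` trades coverage for confidence. The "shells" idea in the abstract would correspond to repeatedly expanding a seed gene set through such links, which this sketch leaves out.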

3.
4.
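Accurate annotation of operon structure in prokaryotes is important for studying gene function and gene regulatory networks, and computational prediction by bioinformatics methods is currently the main source of operon structure annotation for genomes. Most current prediction algorithms require experimentally confirmed operons as a training set, but the scarcity of experimentally confirmed operon data has long been a bottleneck for algorithm development. Based on an understanding of operon structure, and starting from features such as intergenic distance, transcription- and translation-related regulatory signals, and COG functional annotation, a probabilistic model describing the complex structure of operons was built, and an iterative self-learning algorithm that does not depend on operon data from a specific species as a training set was proposed. Tests and comparisons on experimentally verified operon data sets show that the algorithm is very effective at predicting operon structure. Without relying on any known operon information, the algorithm exceeds the best current operon prediction methods in overall prediction performance, and this self-learning algorithm outperforms algorithms trained on specific species. These properties make the algorithm applicable to newly sequenced species, which distinguishes it from commonly used operon prediction methods. Large-scale comparative analysis of bacterial and archaeal genomes further improved the understanding of the general features and species-specific characteristics of genomic operon structure.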

5.
6.
A fuzzy guided genetic algorithm for operon prediction   Total citations: 4, self-citations: 0, citations by others: 4
Motivation: The operon structure of the prokaryotic genome is a critical input for the reconstruction of regulatory networks at the whole genome level. As experimental methods for the detection of operons are difficult and time-consuming, efforts are being put into developing computational methods that can use available biological information to predict operons. Method: A genetic algorithm is developed to evolve a starting population of putative operon maps of the genome into progressively better predictions. Fuzzy scoring functions based on multiple criteria are used for assessing the ‘fitness’ of the newly evolved operon maps and guiding their evolution. Results: The algorithm organizes the whole genome into operons. The fuzzy guided genetic algorithm-based approach makes it possible to use diverse biological information like genome sequence data, functional annotations and conservation across multiple genomes, to guide the organization process. This approach does not require any prior training with experimental operons. The predictions from this algorithm for Escherichia coli K12 and Bacillus subtilis are evaluated against experimentally discovered operons for these organisms. The accuracy of the method is evaluated using an ROC (receiver operating characteristic) analysis. The area under the ROC curve is around 0.9, which indicates excellent accuracy. Contact: roschen_csir{at}rediffmail.com

7.
8.
Operon prediction without a training set   Total citations: 5, self-citations: 0, citations by others: 5

9.
Gene arrangement into operons varies between bacterial species. Genes in a given system can be on one operon in some organisms and on several operons in other organisms. Existing theories explain why genes that work together should be on the same operon, since this allows for advantageous lateral gene transfer and accurate stoichiometry. But what causes the frequent separation into multiple operons of co-regulated genes that act together in a pathway? Here we suggest that separation is due to benefits made possible by differential regulation of each operon. We present a simple mathematical model for the optimal distribution of genes into operons based on a balance of the cost of operons and the benefit of regulation that provides 'just-when-needed' temporal order. The analysis predicts that genes are arranged such that genes on the same operon do not skip functional steps in the pathway. This prediction is supported by genomic data from 137 bacterial genomes. Our work suggests that gene arrangement is not only the result of random historical drift, genome re-arrangement and gene transfer, but has elements that are solutions of an evolutionary optimization problem. Thus gene functional order may be inferred by analyzing the operon structure across different genomes.
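
The "no skipped steps" prediction can be checked mechanically once a pathway order is known. The sketch below is one possible reading of that prediction rather than the authors' model: `skips_steps` and the gene names are hypothetical, and each gene is assumed to carry out exactly one step of a linear pathway.

```python
from typing import List, Sequence


def skips_steps(operon_genes: Sequence[str], pathway_order: List[str]) -> bool:
    """Return True if the operon's genes leave a gap in the pathway order.

    pathway_order lists genes by functional step (an assumed input);
    an operon is consistent with the prediction when its genes occupy
    a contiguous run of steps.
    """
    step_of = {gene: step for step, gene in enumerate(pathway_order)}
    steps = sorted(step_of[gene] for gene in operon_genes)
    return any(later - earlier > 1 for earlier, later in zip(steps, steps[1:]))


# Hypothetical four-step pathway.
pathway = ["g1", "g2", "g3", "g4"]
print(skips_steps(["g1", "g2"], pathway))  # contiguous steps
print(skips_steps(["g1", "g3"], pathway))  # step g2 lies on another operon
```

Applied across many genomes, the fraction of operons for which this check returns True would indicate how often the predicted arrangement is violated.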

10.
11.
Biodiversity estimates based on ribosomal operon sequence diversity rely on the premise that a sequence is characteristic of a single specific taxon or operational taxonomic unit (OTU). Here, we have studied the sequence diversity of 14 ribosomal RNA operons (rrn) contained in the genomes of two isolates (five operons in each genome) and four metagenomic fosmids, all from the same seawater sample. Complete sequencing of the isolate genomes and the fosmids establishes that they represent strains of the same species, Alteromonas macleodii, with average nucleotide identity (ANI) values >97%. Nonetheless, we observed high levels of intragenomic heterogeneity (i.e., variability between operons of a single genome) affecting multiple regions of the 16S and 23S rRNA genes as well as the internally transcribed spacer 1 (ITS-1) region. Furthermore, the ribosomal operons exhibited intergenomic heterogeneity (i.e., variability between operons located in separate genomes) in each of these regions, compounding the variability. Our data reveal the extensive heterogeneity observed in natural populations of A. macleodii at a single point in time and support the idea that distinct lineages of A. macleodii exist in the deep Mediterranean. These findings highlight the potential of rRNA fingerprinting methods to misrepresent species diversity while simultaneously failing to recognize the ecological significance of individual strains.

12.
Genome-wide operon prediction in Staphylococcus aureus   Total citations: 5, self-citations: 0, citations by others: 5

13.
Almost 50 years following the discovery of the prokaryotic operon, the functional relevance of gene order within operons remains unclear. In this work, we take advantage of the eroded genome of Mycobacterium leprae to add evidence supporting the notion that functionally less important genes have a tendency to be located at the end of its operons. M. leprae’s genome includes 1133 pseudogenes and 1614 protein-coding genes and can be compared with the close genome of M. tuberculosis. Assuming M. leprae’s pseudogenes to represent dispensable genes, we have studied the position of these pseudogenes in the operons of M. leprae and of their orthologs in M. tuberculosis. We observed that both tend to be located in the 3′ (downstream) half of the operon (P-values of 0.03 and 0.18, respectively). Analysis of pseudogenes in all available prokaryotic genomes confirms this trend (P-value of 7.1 × 10⁻⁷). In a complementary analysis, we found a significant tendency for essential genes to be located at the 5′ (upstream) half of the operon (P-value of 0.006). Our work provides an indication that, in prokarya, functionally less important genes have a tendency to be located at the end of operons, while more relevant genes tend to be located toward operon starts.
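
A positional bias of this kind could be tested, for example, with a one-sided exact binomial test on how many pseudogenes fall in the downstream half of their operon. The abstract does not say which test produced the reported P-values, so the choice of test, the helper names `in_downstream_half` and `binomial_upper_tail`, and the counts in the example are all assumptions for illustration.

```python
from math import comb
from typing import Optional


def in_downstream_half(position: int, operon_length: int) -> Optional[bool]:
    """Classify a gene by its 1-based position counted from the 5' end.

    Returns None for the middle gene of an odd-length operon, which
    belongs to neither half (a convention assumed here).
    """
    midpoint = (operon_length + 1) / 2
    if position == midpoint:
        return None
    return position > midpoint


def binomial_upper_tail(successes: int, trials: int, p: float = 0.5) -> float:
    """Exact P(X >= successes) for X ~ Binomial(trials, p)."""
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Invented counts: 8 of 10 classified pseudogenes lie in the 3' half.
p_value = binomial_upper_tail(8, 10)
print(p_value)
```

With no positional preference, each classified gene would fall in the downstream half with probability 0.5, so a small upper-tail probability suggests enrichment toward the 3′ end.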

14.
The slow-growing Mycobacterium celatum is known to have two different 16S rRNA gene sequences. This study confirms the presence of two rrn operons and describes their organization. One operon (rrnA) was found to be located downstream from murA and the other (rrnB) was found downstream from tyrS. The promoter regions were sequenced, and also the intergenic transcribed spacer (ITS1 and ITS2) regions separating the 16S rRNA, 23S rRNA and 5S rRNA gene coding regions. Analysis of the RNA fraction revealed that rrnA is regulated by two (P1 and PCL1) promoters and rrnB is regulated by one (P1). These data show that the two rrn operons of M. celatum are organized in the same way as the two rrn operons of classical fast-growing mycobacteria. This information was incorporated into a phylogenetic analysis of the genus based on both 16S rRNA gene sequences and (where possible) the number of rrn operons per genome. The results suggest that the ancestral Mycobacterium possessed two (rrnA and rrnB) operons per genome and that subsequently, on two separate occasions, an operon (rrnB) was lost, leading to two clusters of species having a single operon (rrnA); one cluster includes the classical pathogens and the other includes Mycobacterium abscessus and Mycobacterium chelonae.

15.
16.
17.
18.
Operon prediction in Pyrococcus furiosus   Total citations: 1, self-citations: 0, citations by others: 1
Identification of operons in the hyperthermophilic archaeon Pyrococcus furiosus represents an important step to understanding the regulatory mechanisms that enable the organism to adapt and thrive in extreme environments. We have predicted operons in P. furiosus by combining the results from three existing algorithms using a neural network (NN). These algorithms use intergenic distances, phylogenetic profiles, functional categories and gene-order conservation in their operon prediction. Our method takes as inputs the confidence scores of the three programs, and outputs a prediction of whether adjacent genes on the same strand belong to the same operon. In addition, we have applied Gene Ontology (GO) and KEGG pathway information to improve the accuracy of our algorithm. The parameters of this NN predictor are trained on a subset of all experimentally verified operon gene pairs of Bacillus subtilis. It subsequently achieved 86.5% prediction accuracy when applied to a subset of gene pairs for Escherichia coli, which is substantially better than any of the three prediction programs. Using this new algorithm, we predicted 470 operons in the P. furiosus genome. Of these, 349 were validated using DNA microarray data.

19.
Mycobacteria are thought to have either one or two rRNA operons per genome. All mycobacteria investigated to date have an operon, designated rrnA, located downstream from the murA gene. We report that Mycobacterium fortuitum has a second rrn operon, designated rrnB, which is located downstream from the tyrS gene; tyrS is very close to the 3' end of a gene (3-mag) coding for 3-methylpurine-DNA-glycosylase. The second rrn operon of Mycobacterium smegmatis was shown to have a similar organization, namely, 5' 3-mag-tyrS-rrnB 3'. The rrnB operon of M. fortuitum was found to have a single dedicated promoter. During exponential growth in a rich medium, the rrnB and rrnA operons were the major and minor contributors, respectively, to pre-rRNA synthesis. Genomic DNA was isolated from eight other fast-growing mycobacterial species. Samples were investigated by Southern blot analysis using probes for murA, tyrS, and 16S rRNA sequences. The results revealed that both rrnA and rrnB operons were present in each species. The results form the basis for a proposed new scheme for the classification of mycobacteria. The approach, which is phylogenetic in concept, is based on particular properties of the rrn operons of a cell, namely, the number per genome and a feature of 16S rRNA gene sequences.

20.