Similar articles
1.
The BioCyc database collection is a set of 160 pathway/genome databases (PGDBs) for most eukaryotic and prokaryotic species whose genomes have been completely sequenced to date. Each PGDB in the BioCyc collection describes the genome and predicted metabolic network of a single organism, inferred from the MetaCyc database, which is a reference source on metabolic pathways from multiple organisms. In addition, each bacterial PGDB includes predicted operons for the corresponding species. The BioCyc collection provides a unique resource for computational systems biology, namely global and comparative analyses of genomes and metabolic networks, and a supplement to the BioCyc resource of curated PGDBs. The Omics viewer available through the BioCyc website allows scientists to visualize combinations of gene expression, proteomics and metabolomics data on the metabolic maps of these organisms. This paper discusses the computational methodology by which the BioCyc collection has been expanded, and presents an aggregate analysis of the collection that includes the range of the number of pathways present in these organisms and the most frequently observed pathways. We seek scientists to adopt and curate individual PGDBs within the BioCyc collection. Only by harnessing the expertise of many scientists can we hope to produce biological databases that accurately reflect the depth and breadth of knowledge that the biomedical research community is producing.

2.
In the post-genomic era, biochemical information on the individual compounds, enzymes and reactions found within named organisms has become readily available. The well-known KEGG and BioCyc databases provide a comprehensive catalogue of this information and have thereby substantially aided the scientific community. Using these databases, the complement of enzymes present in a given organism can be determined and, in principle, used to reconstruct the metabolic network. However, such reconstructed networks have numerous properties that contradict biological expectation. The metabolic networks for a number of organisms are reconstructed from the KEGG and BioCyc databases, and features of these networks are related to properties of their originating database.

3.
Yang JO  Charny P  Lee B  Kim S  Bhak J  Woo HG 《Bioinformation》2007,2(5):194-196
GS2PATH is a Web-based pipeline tool to permit functional enrichment of a given gene set from prior knowledge databases, including gene ontology (GO) database and biological pathway databases. The tool also provides an estimation of gene set enrichment, in GO terms, from the databases of the KEGG and BioCarta pathways, which may allow users to compute and compare functional over-representations. This is especially useful in the perspective of biological pathways such as metabolic, signal transduction, genetic information processing, environmental information processing, cellular process, disease, and drug development. It provides relevant images of biochemical pathways with highlighting of the gene set by customized colors, which can directly assist in the visualization of functional alteration.

Availability


5.
With numerous whole genomes now in hand, and experimental data about genes and biological pathways on the increase, a systems approach to biological research is becoming essential. Ontologies provide a formal representation of knowledge that is amenable to computational as well as human analysis, an obvious underpinning of systems biology. Mapping function to gene products in the genome consists of two somewhat intertwined enterprises: ontology building and ontology annotation. Ontology building is the formal representation of a domain of knowledge; ontology annotation is the association of specific genomic regions (which we refer to simply as 'genes', including genes and their regulatory elements and products such as proteins and functional RNAs) with parts of the ontology. We consider two complementary representations of gene function: the Gene Ontology (GO) and pathway ontologies. GO represents function from the gene's eye view, in relation to a large and growing context of biological knowledge at all levels. Pathway ontologies represent function from the point of view of biochemical reactions and interactions, which are ordered into networks and causal cascades. The more mature GO provides an example of ontology annotation: how conclusions from the scientific literature and from evolutionary relationships are converted into formal statements about gene function. Annotations are made using a variety of different types of evidence, which can be used to estimate the relative reliability of different annotations.

6.
Yang HH  Hu Y  Buetow KH  Lee MP 《Genomics》2004,84(1):211-217
This study uses a computational approach to analyze coherence of expression of genes in pathways. Microarray data were analyzed with respect to coherent gene expression in a group of genes defined as a pathway in the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. Our hypothesis is that genes in the same pathway are more likely to be coordinately regulated than a randomly selected gene set. A correlation coefficient for each pair of genes in a pathway was estimated based on gene expression in normal or tumor samples, and statistically significant correlation coefficients were identified. The coherence indicator was defined as the ratio of the number of gene pairs in the pathway whose correlation coefficients are significant, divided by the total number of gene pairs in the pathway. We defined all genes that appeared in the KEGG pathways as a reference gene set. Our analysis indicated that the mean coherence indicator of pathways is significantly larger than the mean coherence indicator of random gene sets drawn from the reference gene set. Thus, the result supports our hypothesis. The significance of each individual pathway of n genes was evaluated by comparing its coherence indicator with coherence indicators of 1000 random permutation sets of n genes chosen from the reference gene set. We analyzed three data sets: two Affymetrix microarrays and one cDNA microarray. For each of the three data sets, statistically significant pathways were identified among all KEGG pathways. Seven of 96 pathways had a significant coherence indicator in normal tissue and 14 of 96 pathways had a significant coherence indicator in tumor tissue in all three data sets. The increase in the number of pathways with significant coherence indicators may reflect the fact that tumor cells have a higher rate of metabolism than normal cells. 
Five pathways involved in oxidative phosphorylation, ATP synthesis, protein synthesis, or RNA synthesis were coherent in both normal and tumor tissue, demonstrating that these are essential genes whose high-level expression is required regardless of cell type.
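The coherence indicator and the random-set comparison described in this abstract can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the abstract does not say how significance of a correlation coefficient was determined, so the `r_cutoff` threshold on |r| is a hypothetical stand-in, and the function names and the `expression` dictionary layout (gene name mapped to a list of expression values) are assumptions.

```python
import itertools
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a flat profile carries no correlation signal
    return sxy / math.sqrt(sxx * syy)


def coherence_indicator(genes, expression, r_cutoff=0.8):
    """Ratio of gene pairs with |r| >= r_cutoff to all gene pairs.

    r_cutoff is a hypothetical stand-in for the significance test,
    which the abstract does not specify.
    """
    pairs = list(itertools.combinations(genes, 2))
    if not pairs:
        return 0.0
    hits = sum(
        1 for g, h in pairs
        if abs(pearson(expression[g], expression[h])) >= r_cutoff
    )
    return hits / len(pairs)


def permutation_p_value(pathway, reference, expression,
                        n_perm=1000, r_cutoff=0.8, seed=0):
    """Share of random same-size gene sets at least as coherent as the pathway.

    reference must be a list of gene names (the reference gene set).
    """
    rng = random.Random(seed)
    observed = coherence_indicator(pathway, expression, r_cutoff)
    at_least = sum(
        1 for _ in range(n_perm)
        if coherence_indicator(rng.sample(reference, len(pathway)),
                               expression, r_cutoff) >= observed
    )
    return at_least / n_perm
```

Calling `permutation_p_value(pathway, reference, expression)` with a pathway of n genes draws 1000 random sets of n genes from the reference list, mirroring the comparison described above; a small returned value suggests the pathway is more coherent than chance under these assumptions.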

7.
We used established databases in standard ways to systematically characterize gene ontologies, pathways and functional linkages in the large set of genes now associated with autism spectrum disorders (ASDs). These conditions are particularly challenging: they lack clear pathognomonic biological markers and involve great heterogeneity across multiple levels (genes, systemic biological and brain characteristics, and nuances of behavioral manifestations), and yet everyone with this diagnosis meets the same defining behavioral criteria. Using the human gene list from the Simons Foundation Autism Research Initiative (SFARI), we performed gene set enrichment analysis with the Kyoto Encyclopedia of Genes and Genomes (KEGG) Pathway Database, and then derived a pathway network from pathway-pathway functional interactions, again in reference to KEGG. Through identifying the GO (Gene Ontology) groups in which SFARI genes were enriched, mapping the coherence between pathways and GO groups, and ranking the relative strengths of representation of pathway network components, we 1) identified 10 disease-associated and 30 function-associated pathways, 2) revealed the calcium signaling pathway and neuroactive ligand-receptor interaction as the most enriched, statistically significant pathways from the enrichment analysis, 3) showed the calcium signaling pathway and MAPK signaling pathway to be interactive hubs with other pathways and also to be involved with pervasively present biological processes, 4) found convergent indications that the process “calcium-PKC (protein kinase C)-Ras-Raf-MAPK/ERK” is likely a major contributor to ASD pathophysiology, and 5) noted that perturbations associated with KEGG’s category of environmental information processing were common.
These findings support the idea that ASD-associated genes may contribute not only to core features of ASD themselves but also to vulnerability to other chronic and systemic problems potentially including cancer, metabolic conditions and heart diseases. ASDs may thus arise, or emerge, from underlying vulnerabilities related to pleiotropic genes associated with pervasively important molecular mechanisms, vulnerability to environmental input and multiple systemic co-morbidities.

9.
Given a compounds-forming system, i.e., a system consisting of some compounds and their relationships, can it form a biologically meaningful pathway? This is a fundamental problem in systems biology. A large amount of information on different organisms, at both the genetic and metabolic levels, has now been collected and stored in dedicated databases, and these data make it feasible to address the problem. A metabolic pathway is one kind of compounds-forming system, and we analyzed such systems in yeast by extracting different (biological and graph) features from each of 13,736 compounds-forming systems, of which 136 were positive pathways, i.e., known metabolic pathways from KEGG, and 13,600 were negative. Each of these compounds-forming systems was represented by 144 features, of which 88 were graph features and 56 were biological features. "Minimum Redundancy Maximum Relevance" and "Incremental Feature Selection" were used to analyze these features, and 16 optimal features were selected as best able to predict a query compounds-forming system. Jackknife cross-validation showed that the overall success rate of identifying the positive pathways was 74.26%. It is anticipated that this novel approach and encouraging result will help to illuminate further investigation of this important topic.

12.
Identification of missing genes or proteins that participate in metabolic pathways as enzymes is of great interest. One such pathway is involved in the bioconversion of eugenol to vanillin. Our goal is to develop an integral approach for identifying the topology of a reference or known pathway in another organism. We successfully identified the missing enzymes and then reconstructed the vanillin biosynthetic pathway in Aspergillus niger. The procedure combines enzyme sequence similarity, searched through BLAST homology search, with ortholog detection through the COG and KEGG databases. Conservation of protein domains and motifs was searched through the CDD, PFAM and PROSITE databases. Predictions regarding how proteins act in the pathway were validated experimentally and also compared with reported data. The bioconversion of vanillin was screened on UV-TLC plates and later confirmed through GC and GC-MS techniques. We applied a procedure for identifying missing enzymes on the basis of conserved functional motifs and then reconstructed the metabolic pathway in the target organism. Using the vanillin biosynthetic pathway of Pseudomonas fluorescens as a case study, we indicate how this approach can be used to reconstruct the reference pathway in A. niger; the results were later validated experimentally through chromatography and spectroscopy techniques.

14.
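miR-17-92 is a highly conserved gene cluster that participates in the development of multiple organs in mammals and is closely associated with the occurrence of many solid tumors. Using several online databases, multiple feed-forward and feedback loops were identified between the upstream transcription factors and the downstream target genes of miR-17-92. Functional clustering analysis was performed on the genes participating in the miR-17-92 regulatory loops, and a core regulatory network map of miR-17-92 was then drawn. The results suggest that the target genes co-regulated by miR-17-92 and its upstream transcription factors may participate in a variety of biological processes, including cell cycle regulation, migration, apoptosis, hormone response and immune system development; KEGG pathway analysis suggests that they are also closely associated with multiple tumor signaling pathways. Bioinformatic analysis of the miR-17-92 molecular regulatory network can therefore help to clarify its mechanism of action in cell development and tumorigenesis, and can provide good guidance for subsequent experimental validation.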

15.
Enormous amounts of data result from genome sequencing projects and new experimental methods. Within this tremendous amount of genomic data, 30-40 per cent of the genes identified in an organism remain unknown in terms of their biological function. As a consequence of this lack of information, the overall schema of all the biological functions occurring in a specific organism cannot be properly represented. To understand the functional properties of the genomic data, more experimental data must be collected. A pathway database is an effort to handle the current knowledge of biochemical pathways and in addition can be used for the interpretation of sequence data. Some of the existing pathway databases can be interpreted as detailed functional annotations of genomes because they are tightly integrated with genomic information. However, experimental data are often lacking in these databases. This paper summarises a list of pathway databases and some of their corresponding biological databases, and also focuses on the content and structure of these databases, the organisation of the data and the reliability of the stored information from a biological point of view. Moreover, information about the representation of the pathway data and tools to work with the data is given. Advantages and disadvantages of the analysed databases are pointed out, and an overview for biological scientists on how to use these pathway databases is given.

16.
KEGG: Kyoto Encyclopedia of Genes and Genomes.   Total citations: 14 (self-citations: 0, citations by others: 14)
Kyoto Encyclopedia of Genes and Genomes (KEGG) is a knowledge base for systematic analysis of gene functions in terms of the networks of genes and molecules. The major component of KEGG is the PATHWAY database, which consists of graphical diagrams of biochemical pathways, including most of the known metabolic pathways and some of the known regulatory pathways. The pathway information is also represented by the ortholog group tables, which summarize orthologous and paralogous gene groups among different organisms. KEGG maintains the GENES database for the gene catalogs of all organisms with complete genomes and selected organisms with partial genomes, which are continuously re-annotated, as well as the LIGAND database for chemical compounds and enzymes. Each gene catalog is associated with a graphical genome map for chromosomal locations, which is represented by a Java applet. In addition to the data collection efforts, KEGG develops and provides various computational tools, such as tools for reconstructing biochemical pathways from the complete genome sequence and for predicting gene regulatory networks from gene expression profiles. The KEGG databases are updated daily and made freely available (http://www.genome.ad.jp/kegg/).

17.
The KEGG databases at GenomeNet   Total citations: 30 (self-citations: 0, citations by others: 30)
The Kyoto Encyclopedia of Genes and Genomes (KEGG) is the primary database resource of the Japanese GenomeNet service (http://www.genome.ad.jp/) for understanding higher-order functional meanings and utilities of the cell or the organism from its genome information. KEGG consists of the PATHWAY database for the computerized knowledge on molecular interaction networks such as pathways and complexes, the GENES database for the information about genes and proteins generated by genome sequencing projects, and the LIGAND database for the information about chemical compounds and chemical reactions that are relevant to cellular processes. In addition to these three main databases, limited amounts of experimental data for microarray gene expression profiles and yeast two-hybrid systems are stored in the EXPRESSION and BRITE databases, respectively. Furthermore, a new database, named SSDB, is available for exploring the universe of all protein-coding genes in the complete genomes and for identifying functional links and ortholog groups. The data objects in the KEGG databases are all represented as graphs, and various computational methods are developed to detect graph features that can be related to biological functions. For example, the correlated clusters are graph similarities that can be used to predict a set of genes coding for a pathway or a complex, as summarized in the ortholog group tables, and the cliques in the SSDB graph are used to annotate genes. The KEGG databases are updated daily and made freely available (http://www.genome.ad.jp/kegg/).

18.
Over-representation analysis (ORA) is one of the most common pathway analysis approaches used for the functional interpretation of metabolomics datasets. Despite the widespread use of ORA in metabolomics, the community lacks guidelines detailing its best-practice use. Many factors have a pronounced impact on the results, but to date their effects have received little systematic attention. Using five publicly available datasets, we demonstrated that changes in parameters such as the background set, differential metabolite selection methods, and pathway database used can result in profoundly different ORA results. The use of a non-assay-specific background set, for example, resulted in large numbers of false-positive pathways. Pathway database choice, evaluated using three of the most popular metabolic pathway databases (KEGG, Reactome, and BioCyc), led to vastly different results in both the number and function of significantly enriched pathways. Factors that are specific to metabolomics data, such as the reliability of compound identification and the chemical bias of different analytical platforms, also impacted ORA results. Simulated metabolite misidentification rates as low as 4% resulted in both gain of false-positive pathways and loss of truly significant pathways across all datasets. Our results have several practical implications for ORA users, as well as those using alternative pathway analysis methods. We offer a set of recommendations for the use of ORA in metabolomics, alongside a set of minimal reporting guidelines, as a first step towards the standardisation of pathway analysis in metabolomics.
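To make the role of the background set concrete, ORA for a single pathway is commonly computed as a right-tailed hypergeometric test. The abstract does not state which test the authors used, so the following is a minimal sketch under that assumption; the function name `ora_p_value` and the toy metabolite identifiers are hypothetical.

```python
from math import comb


def ora_p_value(background, pathway, hits):
    """Right-tailed hypergeometric p-value for one pathway.

    background: all metabolites treated as measurable (the background set)
    pathway:    metabolites annotated to the pathway
    hits:       metabolites flagged as differential
    """
    background = set(background)
    pathway = set(pathway) & background  # only members of the background count
    hits = set(hits) & background
    N, K, n = len(background), len(pathway), len(hits)
    k = len(pathway & hits)
    # P(X >= k) when n metabolites are drawn without replacement from N,
    # of which K belong to the pathway
    tail = sum(comb(K, i) * comb(N - K, n - i)
               for i in range(k, min(K, n) + 1))
    return tail / comb(N, n)


# Toy comparison: the same pathway and hits against two background sets.
assay_background = [f"m{i}" for i in range(10)]     # compounds the assay can detect
generic_background = [f"m{i}" for i in range(100)]  # a larger, non-assay-specific set
pathway = ["m0", "m1", "m2", "m3"]
hits = ["m0", "m1", "m2"]

p_assay = ora_p_value(assay_background, pathway, hits)
p_generic = ora_p_value(generic_background, pathway, hits)
```

In this toy case the larger, non-assay-specific background yields a much smaller p-value for identical hits, which is consistent with the abstract's observation that such backgrounds can inflate the number of pathways called significant.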
