Similar literature
 20 similar articles found
1.
KEGG spider is a web-based tool for interpreting experimentally derived gene lists in order to understand variations in metabolism at the genomic level. KEGG spider implements a 'pathway-free' framework that overcomes a major bottleneck of enrichment analyses: it provides global models uniting genes from different metabolic pathways. By analyzing a number of experimentally derived gene lists, we demonstrate that KEGG spider provides deeper insights into metabolic variations than existing methods.

2.
A major challenge in microarray data analysis is the functional interpretation of gene lists. A common approach is over-representation analysis (ORA), which uses the hypergeometric test (or its variants) to evaluate whether a particular functionally defined group of genes is represented within a gene list more often than expected by chance. Existing applications of ORA have been largely limited to pre-defined terminologies such as GO and KEGG. We report our explorations of whether ORA can be applied to a wider mining of free text. We found that a hitherto underappreciated feature of experimentally derived gene lists is that their constituents have substantially more annotation associated with them, because they have been studied for a longer period of time. This bias, a result of patterns of research activity within the biomedical community, is a major problem for classical hypergeometric test-based ORA approaches, which cannot account for it. We have therefore developed three approaches to overcome this bias, and demonstrate their usability on a wide range of published datasets covering different species. A comparison with existing tools that use GO terms suggests that mining PubMed abstracts can reveal additional biological insight that may not be obtainable by mining pre-defined ontologies alone.
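As an illustration of the classical hypergeometric ORA test mentioned in this abstract, the following is a minimal sketch in Python. The function name and parameter names are hypothetical and not taken from the paper, and the sketch does not implement the three bias-correcting approaches the authors describe; it only shows the uncorrected one-sided test.

```python
from math import comb


def hypergeom_ora_pvalue(population_size, annotated_in_population,
                         list_size, annotated_in_list):
    """One-sided hypergeometric p-value for over-representation.

    Returns P(X >= annotated_in_list), where X is the number of annotated
    genes in a list of `list_size` genes drawn without replacement from a
    population of `population_size` genes, of which
    `annotated_in_population` carry the annotation.
    (Hypothetical helper; all names are illustrative assumptions.)
    """
    N, K = population_size, annotated_in_population
    n, k = list_size, annotated_in_list
    upper = min(K, n)  # cannot observe more annotated genes than this
    total = comb(N, n)  # number of possible gene lists of size n
    # Sum the upper tail; comb() returns 0 for impossible configurations.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


if __name__ == "__main__":
    # Toy numbers: 20,000 genes, 200 annotated, list of 100 with 8 annotated.
    print(hypergeom_ora_pvalue(20000, 200, 100, 8))
```

Because the test treats every gene as equally likely to carry an annotation, it cannot account for the annotation bias described above, which is the limitation the abstract points out.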

3.
Curated gene sets from databases such as KEGG Pathway and Gene Ontology are often used to systematically organize lists of genes or proteins derived from high-throughput data. However, the information content inherent to some relationships between the interrogated gene sets, such as pathway crosstalk, is often underutilized. A gene set network, in which nodes representing individual gene sets such as KEGG pathways are connected to indicate a functional dependency, is well suited to visualizing and analyzing global gene set relationships. Here we introduce a novel gene set network construction algorithm that integrates gene lists derived from high-throughput experiments with curated gene sets to construct co-enrichment gene set networks. Along with previously described co-membership and linkage algorithms, we apply the co-enrichment algorithm to eight gene set collections to construct integrated multi-evidence gene set networks with multiple edge types connecting gene sets. We demonstrate the utility of this approach through examples of novel gene set networks such as the chromosome map co-differential expression gene set network. A total of twenty-four gene set networks are exposed via a web tool called MetaNet, where context-specific multi-edge gene set networks are constructed from enriched gene sets within user-defined gene lists. MetaNet is freely available at http://blaispathways.dfci.harvard.edu/metanet/.
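The abstract does not define how co-membership edges are computed. The sketch below therefore assumes one common choice, connecting two gene sets when the Jaccard overlap of their members reaches a threshold. The function name, the threshold value, and the use of Jaccard similarity are all assumptions for illustration and should not be read as the MetaNet algorithm.

```python
from itertools import combinations


def co_membership_edges(gene_sets, min_jaccard=0.1):
    """Build edges between gene sets that share members.

    `gene_sets` maps a gene set name to an iterable of gene identifiers.
    An edge (name_a, name_b, jaccard) is emitted when the Jaccard index
    of the two sets is at least `min_jaccard` (an assumed criterion).
    """
    edges = []
    for name_a, name_b in combinations(sorted(gene_sets), 2):
        genes_a, genes_b = set(gene_sets[name_a]), set(gene_sets[name_b])
        union = genes_a | genes_b
        if not union:
            continue  # two empty sets carry no evidence of a relationship
        jaccard = len(genes_a & genes_b) / len(union)
        if jaccard >= min_jaccard:
            edges.append((name_a, name_b, jaccard))
    return edges


if __name__ == "__main__":
    toy_sets = {
        "A": {"g1", "g2"},
        "B": {"g2", "g3"},
        "C": {"g9"},
    }
    print(co_membership_edges(toy_sets))
```

A co-enrichment network, as introduced in the abstract, would additionally require the user's gene list and an enrichment statistic per gene set, which this sketch leaves out.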

4.
The metabolic and water evaporation strategies in spiders may be part of a set of physiological adaptations to tolerate low or unpredictable food availability, buffering spiders against environmental fluctuations such as those of the high mountains of the central Andes. The aim of this study is to analyze experimentally the variations in metabolic rate and the rate of evaporative water loss under food and/or water restriction in a high-mountain mygalomorph spider population (Paraphysa sp.). We found that the low metabolism of this spider was not affected by water restriction, but its metabolism was depressed after 3 weeks of food deprivation. The spider did not show seasonal metabolic changes, but it did show seasonal changes in the rate of evaporative water loss at high temperatures. Females with egg sacs reduced their metabolic rate and evaporative water loss at high temperatures. These findings constitute a set of possible adaptations to a highly fluctuating Mediterranean environment, which is completely covered with snow for many months and then progresses rapidly to a very dry climate with high temperatures.

5.

Background

High-throughput technologies such as functional screens and gene expression analysis produce extended lists of candidate genes. Gene-Set Enrichment Analysis is a commonly used and well-established technique to test for the statistically significant over-representation of particular pathways. A shortcoming of this method, however, is that most genes investigated in the experiments have very sparse functional or pathway annotation and therefore cannot be the target of such an analysis. The approach presented here aims to assign lists of genes with limited annotation to previously described functional gene collections or pathways. It works by comparing InterPro domain signatures of the candidate gene lists with domain signatures of gene sets derived from known classifications, e.g. KEGG pathways.

Results

In order to validate our approach, we designed a simulation study. Based on all pathways available in the KEGG database, we create test gene lists by randomly selecting pathway genes, removing these genes from the known pathways and adding variable amounts of noise in the form of genes not annotated to the pathway. We show that we can recover pathway memberships based on the simulated gene lists with high accuracy. We further demonstrate the applicability of our approach on a biological example.

Conclusion

Results based on simulation and data analysis show that domain-based pathway enrichment analysis is a very sensitive method to test for enrichment of pathways in sparsely annotated lists of genes. An R-based software package, domainsignatures, to routinely perform this analysis on the results of high-throughput screening, is available via Bioconductor.
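The simulation design described under Results (sample genes from a pathway, remove them from the known pathway, add noise genes) can be sketched as follows. This is a hedged Python illustration, not the authors' R code from domainsignatures; the function name, argument names, and the seeded random generator are assumptions.

```python
import random


def simulate_gene_list(pathway_genes, background_genes, n_pathway, n_noise, seed=0):
    """Create one simulated test gene list for a single pathway.

    Returns (test_list, remaining_pathway):
      test_list         -- `n_pathway` genes sampled from the pathway plus
                           `n_noise` genes not annotated to the pathway
      remaining_pathway -- the pathway with the sampled genes removed
    (Hypothetical helper; names and defaults are illustrative.)
    """
    rng = random.Random(seed)
    pathway = sorted(set(pathway_genes))  # sorted for reproducible sampling
    noise_pool = sorted(set(background_genes) - set(pathway))
    sampled = rng.sample(pathway, n_pathway)
    noise = rng.sample(noise_pool, n_noise)
    remaining_pathway = set(pathway) - set(sampled)
    return sampled + noise, remaining_pathway


if __name__ == "__main__":
    pathway = {f"p{i}" for i in range(20)}
    background = pathway | {f"b{i}" for i in range(200)}
    test_list, remaining = simulate_gene_list(pathway, background, 5, 3)
    print(test_list, len(remaining))
```

Recovery accuracy could then be estimated by checking how often the true pathway ranks first when the test list is scored against all reduced pathways, using whatever similarity measure is under evaluation.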

6.
Recent advances in experimental technologies allow for the detection of a complete cell proteome. Proteins that are expressed in a particular cell state or in a particular compartment, as well as proteins with differential expression between various cell states, are commonly reported by proteomics studies. Once a list of proteins is derived, a major challenge is to interpret the identified set of proteins in its biological context. Protein-protein interaction (PPI) data represent abundant information that can be employed for this purpose. However, these data have not yet been fully exploited due to the absence of a methodological framework that can integrate this type of information. Here, we propose to infer a network model from an experimentally identified protein list based on the available information about the topology of the global PPI network. We propose to use a Monte Carlo simulation procedure to compute the statistical significance of the inferred models. The method has been implemented as a freely available web-based tool, PPI spider (http://mips.helmholtz-muenchen.de/proj/ppispider). To support the practical significance of PPI spider, we collected several hundred recently published experimental proteomics studies that reported lists of proteins in various biological contexts. We reanalyzed them using PPI spider and demonstrated that in most cases PPI spider could provide statistically significant hypotheses that are helpful for understanding the protein list.
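The abstract does not specify the test statistic used in the Monte Carlo procedure. As a rough sketch of the general idea, the Python code below assumes a simple statistic (the number of interactions whose two endpoints both lie in the protein list) and a null model of random protein lists of the same size. Both choices, and all names, are assumptions for illustration rather than the PPI spider method.

```python
import random


def count_internal_edges(proteins, edges):
    """Count interactions with both endpoints inside `proteins`."""
    members = set(proteins)
    return sum(1 for a, b in edges if a in members and b in members)


def monte_carlo_pvalue(protein_list, all_proteins, edges, n_iter=1000, seed=0):
    """Empirical p-value for the connectivity of `protein_list`.

    Random lists of the same size are drawn from `all_proteins`; the
    p-value is the fraction of random lists that are at least as
    connected as the observed list, with the usual +1 correction so
    that the estimate is never exactly zero.
    """
    rng = random.Random(seed)
    observed = count_internal_edges(protein_list, edges)
    universe = sorted(set(all_proteins))  # sorted for reproducible sampling
    hits = 0
    for _ in range(n_iter):
        sample = rng.sample(universe, len(protein_list))
        if count_internal_edges(sample, edges) >= observed:
            hits += 1
    return (hits + 1) / (n_iter + 1)


if __name__ == "__main__":
    proteins = [f"P{i}" for i in range(50)]
    # Toy network: a dense cluster among P0..P4 plus a sparse chain.
    edges = [(f"P{i}", f"P{j}") for i in range(5) for j in range(i + 1, 5)]
    edges += [(f"P{i}", f"P{i + 1}") for i in range(5, 49)]
    print(monte_carlo_pvalue(["P0", "P1", "P2", "P3", "P4"], proteins, edges))
```

A degree-preserving null model would be a stricter alternative, since highly studied proteins tend to have more recorded interactions; the uniform sampling used here is the simplest option.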

7.
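Analysis of migraine-related enzymes and KEGG pathways (cited 1 time: 0 self-citations, 1 by others)
Huang Rui, Zheng Heng 《生物信息学》 (Chinese Journal of Bioinformatics) 2014,12(3):218-226
The aim was to collect genes encoding enzymes associated with migraine and to analyze the distribution and function of these genes using KEGG pathways, in order to advance migraine genetics research and the search for new drug targets. PubMed was searched with the query "gene name" AND migraine; migraine-related enzyme gene data were collected and curated from the original literature and processed with the DAVID online analysis tool. The search yielded 31 migraine-related enzyme genes, and seven KEGG metabolic pathways were analyzed: tryptophan metabolism, tyrosine metabolism, arginine and proline metabolism, the folate one-carbon cycle, drug metabolism, metabolism of xenobiotics by cytochrome P450, and the renin-angiotensin pathway. The drug metabolism pathway involves nine drugs, of which citalopram, a highly selective serotonin reuptake inhibitor, shows the greatest promise for application. For six migraine-related genes, including DDC, DBH, and MTHFD1, polymorphism studies remain to be completed. CYP450 and monoamine oxidase both play important roles in the pathology and treatment of migraine. Analyzing the metabolic pathways of disease-related enzyme genes helps to elucidate the molecular pathological basis of a disease and provides reliable targets for new drug design.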

8.
Analysis of multivariate data sets from, for example, microarray studies frequently results in lists of genes that are associated with some response of interest. The biological interpretation is often complicated by the statistical instability of the obtained gene lists, which may partly be due to functional redundancy among genes, implying that multiple genes can play exchangeable roles in the cell. In this paper, we use the concept of exchangeability of random variables to model this functional redundancy and thereby account for the instability. We present a flexible framework to incorporate exchangeability into the representation of lists. The proposed framework supports straightforward comparison between any two lists. It can also be used to generate new, more stable gene rankings that incorporate more information from the experimental data. Using two microarray data sets, we show that the proposed method provides gene rankings that are more robust to sampling variation than those of existing methods, without compromising the biological significance of the rankings.

9.
Linking networks of molecular interactions to cellular functions and phenotypes is a key goal in systems biology. Here, we adapt concepts of spatial statistics to assess the functional content of molecular networks. Based on the guilt-by-association principle, our approach (called SANTA) quantifies the strength of association between a gene set and a network, and functionally annotates molecular networks just as other enrichment methods annotate lists of genes. As a general association measure, SANTA can (i) functionally annotate experimentally derived networks using a collection of curated gene sets, (ii) annotate experimentally derived gene sets using a collection of curated networks, and (iii) prioritize genes for follow-up analyses. We exemplify the efficacy of SANTA in several case studies using the S. cerevisiae genetic interaction network and genome-wide RNAi screens in cancer cell lines. Our theory, simulations, and applications show that SANTA provides a principled statistical way to quantify the association between molecular networks and cellular functions and phenotypes. SANTA is available from http://bioconductor.org/packages/release/bioc/html/SANTA.html.

10.
KEGG: Kyoto Encyclopedia of Genes and Genomes (cited 85 times: 3 self-citations, 82 by others)
KEGG (Kyoto Encyclopedia of Genes and Genomes) is a knowledge base for systematic analysis of gene functions, linking genomic information with higher-order functional information. The genomic information is stored in the GENES database, which is a collection of gene catalogs for all the completely sequenced genomes and some partial genomes with up-to-date annotation of gene functions. The higher-order functional information is stored in the PATHWAY database, which contains graphical representations of cellular processes, such as metabolism, membrane transport, signal transduction and cell cycle. The PATHWAY database is supplemented by a set of ortholog group tables for information about conserved subpathways (pathway motifs), which are often encoded by positionally coupled genes on the chromosome and which are especially useful in predicting gene functions. A third database in KEGG is LIGAND, which holds information about chemical compounds, enzyme molecules and enzymatic reactions. KEGG provides Java graphics tools for browsing genome maps, comparing two genome maps and manipulating expression maps, as well as computational tools for sequence comparison, graph comparison and path computation. The KEGG databases are updated daily and made freely available (http://www.genome.ad.jp/kegg/).

11.
Genome-scale technologies are increasingly adopted by the stem cell research community because of their potential to uncover the molecular events most informative about a stem cell state. These technologies also present enormous challenges around the sharing and visualisation of data derived from different laboratories or under different experimental conditions. Stemformatics is an easy-to-use, publicly accessible portal that hosts a large collection of exemplar stem cell data. It provides fast visualisation of gene expression across a range of mouse and human datasets, with transparent links back to the original studies. One difficulty in the analysis of stem cell signatures is the paucity of public pathways and gene lists relevant to stem cell or developmental biology. Stemformatics provides a simple mechanism to create, share and analyse gene sets, providing a repository of community-annotated stem cell gene lists that are informative about pathways, lineage commitment, and common technical artefacts. Stemformatics can be accessed at stemformatics.org.

12.
13.
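To identify transcriptome-level differences in the fish-derived Yersinia ruckeri strain SC09 at different temperatures in the aquatic environment, strand-specific RNA-seq was performed on cells at the physiological temperature (28 °C) and at the laboratory culture temperature (37 °C). After quality control of the raw data, differentially expressed genes were identified and subjected to enrichment analysis against the KEGG (Kyoto Encyclopedia of Genes and Genomes) database, and important prokaryotic gene clusters identified with the Rockhopper software were used for validation. In total, 173 significantly differentially expressed genes were obtained (P<0.05): 58 upregulated genes, enriched mainly in pathways related to the metabolism of certain special carbohydrates, and 115 downregulated genes, enriched mainly in metabolic pathways related to the tricarboxylic acid cycle within the two-component signaling system, with some genes enriched in gene clusters encoding flagellin. These results indicate that, relative to the laboratory culture temperature of 37 °C, strain SC09 at the physiological temperature of the aquatic environment (28 °C) has higher motility and stronger glucose metabolism, but a reduced ability to metabolize certain special sugars.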

14.
The search for feature enrichment is a widely used method to characterize a set of genes. While several tools have been designed for nominal features such as Gene Ontology annotations or KEGG Pathways, very little has been proposed to tackle numerical features such as the chromosomal positions of genes. For instance, microarray studies typically generate lists of genes that are differentially expressed in the sample subgroups under investigation, and when studying diseases caused by genome alterations, it is of great interest to delineate the chromosomal regions that are significantly enriched in these lists. In this article, we present a positional gene enrichment analysis method (PGE) for the identification of chromosomal regions that are significantly enriched in a given set of genes. The strength of our method lies in an original query optimization approach that makes it possible to consider virtually all possible chromosomal regions for enrichment, and in the multiple testing correction, which distinguishes truly enriched regions from those that can occur by chance. We have developed a Web tool implementing this method applied to the human genome (http://www.esat.kuleuven.be/~bioiuser/pge). We validated PGE on published lists of differentially expressed genes. These analyses showed significant overrepresentation of known aberrant chromosomal regions.

15.
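The species diversity of spiders is extremely rich, yet only a small fraction of spider species has been described so far. More than 40,000 spider species, belonging to 110 families, have been described worldwide. Even within the small area in which we live, there may be several hundred species of spiders from at least 30 families. For China, it is estimated that there may be more than 40,000 spider species, but only about 4,000 have been named to date. This key lists, for the first time, the diagnostic characters of the 67 spider families currently known from China, together with the similarities and differences among families.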

16.
Wang C  Wang J  Ju Z  Zhai R  Zhou L  Li Q  Li J  Li R  Huang J  Zhong J 《Molecular biology reports》2012,39(7):7311-7318
Understanding bovine metabolism and its relationship with milk products is important in cow breeding. In the present work, the metabolic network in the mammary gland tissue of cattle was reconstructed from the available bovine genome information using several public datasets from NCBI, Uniprot, and KEGG. The network consisted of 1,743 metabolites, named by KEGG compound numbers, as nodes and 657 enzymes that catalyzed the corresponding reactions as edges. The characteristics of the network were analyzed. The top 20 hub metabolites were determined, and the mean path length was found to be 6.52. Moreover, 11 key enzymes with significant changes in expression under mastitis were identified and analyzed by integrating microarray expression data from normal tissue and clinical mastitis. Aside from the GATM gene, 10 downregulated enzymes were detected in cattle with mastitis. In addition, many of the identified enzymes were involved in amino acid metabolism or had a direct connection to it. These results indicate that mastitis could affect the expression of enzymes that are vital in the metabolism of some amino acids, resulting in the reduction of milk proteins. The present work provides information that may improve the understanding of bovine milk production and mastitis.
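The mean path length reported in this abstract is a standard network statistic. The sketch below shows one way it could be computed for an unweighted graph with breadth-first search, averaging over all ordered pairs of reachable nodes. The adjacency-dictionary representation, the function name, and the treatment of unreachable pairs (they are skipped) are assumptions; the abstract does not say how the authors handled disconnected metabolites or reaction direction.

```python
from collections import deque


def mean_path_length(adjacency):
    """Average shortest-path length over all reachable ordered node pairs.

    `adjacency` maps each node to an iterable of its neighbours. For an
    undirected network every edge must appear in both directions.
    Unreachable pairs are ignored (an assumed convention).
    """
    total_distance = 0
    reachable_pairs = 0
    for source in adjacency:
        distance = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from `source`
            node = queue.popleft()
            for neighbour in adjacency.get(node, ()):
                if neighbour not in distance:
                    distance[neighbour] = distance[node] + 1
                    queue.append(neighbour)
        total_distance += sum(distance.values())
        reachable_pairs += len(distance) - 1  # exclude the source itself
    return total_distance / reachable_pairs if reachable_pairs else 0.0


if __name__ == "__main__":
    # Toy metabolite chain: C00001 - C00002 - C00003 (undirected).
    toy_network = {
        "C00001": ["C00002"],
        "C00002": ["C00001", "C00003"],
        "C00003": ["C00002"],
    }
    print(mean_path_length(toy_network))
```

Hub metabolites, also mentioned in the abstract, could be ranked from the same structure by sorting nodes by the number of neighbours.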

17.
KEGG: Kyoto Encyclopedia of Genes and Genomes. (cited 14 times: 0 self-citations, 14 by others)
Kyoto Encyclopedia of Genes and Genomes (KEGG) is a knowledge base for systematic analysis of gene functions in terms of the networks of genes and molecules. The major component of KEGG is the PATHWAY database, which consists of graphical diagrams of biochemical pathways, including most of the known metabolic pathways and some of the known regulatory pathways. The pathway information is also represented by ortholog group tables summarizing orthologous and paralogous gene groups among different organisms. KEGG maintains the GENES database for the gene catalogs of all organisms with complete genomes and selected organisms with partial genomes, which are continuously re-annotated, as well as the LIGAND database for chemical compounds and enzymes. Each gene catalog is associated with a graphical genome map of chromosomal locations that is displayed by a Java applet. In addition to the data collection efforts, KEGG develops and provides various computational tools, such as tools for reconstructing biochemical pathways from the complete genome sequence and for predicting gene regulatory networks from gene expression profiles. The KEGG databases are updated daily and made freely available (http://www.genome.ad.jp/kegg/).

18.
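To compare the complete metabolic pathways of Streptococcus mutans and Streptococcus sanguinis, all metabolic pathways of the two species were compared item by item using the KEGG database (http://www.genome.ad.jp/kegg). The results show that the two species take part in 85 metabolic pathways, including central metabolic pathways in which they mostly use the same enzymes, such as glycolysis, the tricarboxylic acid cycle, and the pentose phosphate pathway, and pathways in which they mostly use different enzymes, such as two-component sensing systems. Comparing the whole metabolic networks of S. mutans and S. sanguinis clarified the theoretically complete set of metabolic pathways of both species and laid a foundation for comprehensively elucidating the metabolic interactions between them.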

19.
Rice (Oryza sativa) feeds over half of the global population. A web-based integrated platform for rice microarray annotation and data analysis in various biological contexts is presented, which provides convenient querying of comprehensive annotation compared with similar databases. Coupled with existing rice microarray data, it provides online analysis methods from the perspective of bioinformatics. This comprehensive bioinformatics analysis platform is composed of five modules: data retrieval, microarray annotation, sequence analysis, results visualization and data analysis. The BioChip module facilitates the retrieval of microarray data information via the identifiers “Probe Set ID”, “Locus ID” and “Analysis Name”. The BioAnno module is used to annotate a gene or probe set based on gene function, domain information, KEGG biochemical and regulatory pathways, and the potential microRNAs that regulate the genes. The BioSeq module lists all of the sequence information related to a microarray probe set. The BioView module provides various visual results for the microarray data. The BioAnaly module is used to analyze rice microarray data sets.

20.
In the modern chicken industry, fast-growing broilers have undergone strong artificial selection for muscle growth, which has led to remarkable phenotypic variations compared with slow-growing chickens. However, the molecular mechanism underlying these phenotypic differences remains unknown. In this study, a systematic identification of candidate genes and new pathways related to myofiber development and composition in chicken soleus muscle (SOL) was made using gene expression profiles of two distinct breeds: Qingyuan partridge (QY), a slow-growing Chinese breed with high meat quality, and Cobb 500 (CB), a commercial fast-growing broiler line. Agilent cDNA microarray analyses were conducted to determine gene expression profiles of soleus muscle sampled at the age of sexual maturity of QY (112 d) and CB (42 d). A total of 1318 genes with at least 2-fold differences were identified (P…
