Extending pathways based on gene lists using InterPro domain signatures
Authors: Florian Hahne, Alexander Mehrle, Dorit Arlt, Annemarie Poustka, Stefan Wiemann, Tim Beissbarth
Institution: 1. Bioinformatics Group, School of Computer Science, Institute for Studies in Theoretical Physics and Mathematics (IPM), Tehran, Iran
2. Computer Engineering Department, Sharif University of Technology, Tehran, Iran
3. Department of Bioinformatics, Institute of Biochemistry and Biophysics, University of Tehran, Tehran, Iran
4. National Institute of Genetic Engineering and Biotechnology, Tehran-Karaj Highway, Tehran, Iran
5. Center of Excellence in Biomathematics, School of Mathematics, Statistics and Computer Science, College of Science, University of Tehran, Tehran, Iran
6. Faculty of Mathematics, Shahid-Beheshti University, Tehran, Iran
7. IMPRS-CBSC, Max Planck Institute for Molecular Genetics, Ihnestr. 63-73, Berlin, Germany
8. DFG-Research Center Matheon, FB Mathematik und Informatik, Freie Universität Berlin, Arnimallee 6, D-14195, Berlin, Germany
Abstract:

Background

High-throughput technologies such as functional screens and gene expression analysis produce extended lists of candidate genes. Gene-Set Enrichment Analysis is a commonly used and well-established technique to test for the statistically significant over-representation of particular pathways. A shortcoming of this method, however, is that most genes investigated in the experiments have very sparse functional or pathway annotation and therefore cannot be the target of such an analysis. The approach presented here aims to assign lists of genes with limited annotation to previously described functional gene collections or pathways. It works by comparing the InterPro domain signatures of the candidate gene lists with the domain signatures of gene sets derived from known classifications, e.g. KEGG pathways.
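The comparison of domain signatures described above can be illustrated with a minimal Python sketch. It is not the API of the domainsignatures package, and the abstract does not specify the similarity measure or the significance test, so both are assumptions here: a signature is taken to be the set of domain identifiers found in a gene list, similarity is the Jaccard index, and significance is estimated by resampling random gene lists of the same size. All function names and the `gene_to_domains` mapping are hypothetical.

```python
import random


def domain_signature(genes, gene_to_domains):
    """Union of the domain IDs annotated to the given genes.

    Genes missing from gene_to_domains contribute nothing.
    """
    signature = set()
    for gene in genes:
        signature.update(gene_to_domains.get(gene, ()))
    return frozenset(signature)


def jaccard(a, b):
    """Jaccard index of two sets (assumed similarity measure); 0.0 if both are empty."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def rank_pathways(candidates, pathways, gene_to_domains):
    """Return (pathway name, similarity) pairs, most similar first."""
    candidate_sig = domain_signature(candidates, gene_to_domains)
    scores = {
        name: jaccard(candidate_sig, domain_signature(members, gene_to_domains))
        for name, members in pathways.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


def empirical_p_value(candidates, pathway_genes, gene_to_domains,
                      universe, n_resamples=1000, seed=0):
    """Fraction of random gene lists scoring at least as high as the candidates.

    The +1 in numerator and denominator keeps the estimate above zero.
    """
    rng = random.Random(seed)
    pathway_sig = domain_signature(pathway_genes, gene_to_domains)
    observed = jaccard(domain_signature(candidates, gene_to_domains), pathway_sig)
    pool = sorted(universe)  # fixed order so results are reproducible
    hits = 0
    for _ in range(n_resamples):
        sample = rng.sample(pool, len(candidates))
        if jaccard(domain_signature(sample, gene_to_domains), pathway_sig) >= observed:
            hits += 1
    return (hits + 1) / (n_resamples + 1)
```

Because the score depends only on domains, a candidate list can match a pathway even when none of its genes carries a pathway annotation, which is the situation the abstract targets.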

Results

To validate our approach, we designed a simulation study. Based on all pathways available in the KEGG database, we created test gene lists by randomly selecting pathway genes, removing these genes from the known pathways and adding variable amounts of noise in the form of genes not annotated to the pathway. We show that we can recover pathway memberships based on the simulated gene lists with high accuracy. We further demonstrate the applicability of our approach on a biological example.
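The construction of the simulated gene lists can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' simulation code: the function names, the `score` callback and the choice to remove the selected genes from every reference pathway are hypothetical, and the abstract gives no list sizes or noise levels.

```python
import random


def make_test_list(pathway_genes, background_genes, n_signal, n_noise, rng):
    """Draw signal genes from a pathway and noise genes from outside it.

    Returns (signal, noise) as two lists. rng.sample raises ValueError if
    more genes are requested than are available.
    """
    members = sorted(pathway_genes)  # fixed order for reproducibility
    outside = sorted(set(background_genes) - set(pathway_genes))
    signal = rng.sample(members, n_signal)
    noise = rng.sample(outside, n_noise)
    return signal, noise


def recovery_rate(pathways, background_genes, score,
                  n_signal, n_noise, n_rounds=100, seed=0):
    """Fraction of simulated lists whose source pathway gets the top score.

    score(test_list, pathway_members) is any caller-supplied similarity
    function, for example one based on domain signatures.
    """
    rng = random.Random(seed)
    hits = 0
    total = 0
    for _ in range(n_rounds):
        for name, members in pathways.items():
            if len(members) <= n_signal:
                continue  # nothing would remain in the pathway
            signal, noise = make_test_list(
                members, background_genes, n_signal, n_noise, rng)
            # Assumption: the selected genes are removed from all pathways,
            # so a match cannot come from direct gene overlap.
            reference = {p: set(m) - set(signal) for p, m in pathways.items()}
            test_list = signal + noise
            best = max(reference, key=lambda p: score(test_list, reference[p]))
            hits += int(best == name)
            total += 1
    return hits / total if total else float("nan")
```

Varying `n_noise` while holding `n_signal` fixed gives one way to examine how recovery degrades as unrelated genes are added to the list.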

Conclusion

Results based on simulation and data analysis show that domain-based pathway enrichment analysis is a very sensitive method to test for enrichment of pathways in sparsely annotated lists of genes. An R-based software package, domainsignatures, for routinely performing this analysis on the results of high-throughput screening is available via Bioconductor.
This article is indexed in SpringerLink and other databases.