An improved hypergeometric probability method for identification of functionally linked proteins using phylogenetic profiles
Authors:Appala Raju Kotaru, Khader Shameer, Pandurangan Sundaramurthy, Ramesh Chandra Joshi
Affiliation:1. Department of Electronics and Computer Engineering, Indian Institute of Technology Roorkee, 247667, Roorkee, India; 2. Division of Biomedical Statistics and Informatics & Division of Cardiovascular Diseases, Mayo Clinic, Rochester 55905, USA; 3. Department of Mathematics, Indian Institute of Technology Roorkee, 247667, Roorkee, India; 4. School of Advanced Sciences, VIT University, Vellore - 632014, Tamil Nadu, India
Abstract:Predicting functions of proteins and alternatively spliced isoforms encoded in a genome is one of the important applications of bioinformatics in the post-genome era. Because experimentally characterizing all proteins encoded in a genome through biochemical studies is impractical, bioinformatics methods provide powerful tools for function annotation and prediction. These methods also help narrow the growing sequence-to-function gap. Phylogenetic profiling is a bioinformatics approach to identify the influence of a trait across species and can be employed to infer the evolutionary history of proteins encoded in genomes. Here we propose an improved phylogenetic profile-based method that considers the co-evolution of the reference genome to derive the basic similarity measure, and the background phylogeny of target genomes for profile generation and for assigning weights to target genomes. The ordering of genomes and the runs of consecutive matches between the proteins were used to define phylogenetic relationships in the approach. We used the Escherichia coli K12 genome as the reference genome, and its 4195 proteins were used in the current analysis. We compared our approach with two existing methods, and our initial results show that its predictions outperform both. In addition, we validated our method using a targeted protein-protein interaction network derived from the protein-protein interaction database STRING. Our preliminary results indicate that function prediction can be improved by bringing coevolution-based similarity measures and the runs onto the same scale instead of computing them on different scales. Our method can be applied at the whole-genome level for annotating hypothetical proteins from prokaryotic genomes.
Keywords:Protein function prediction   phylogenetic profiles   functional annotation   functional similarity
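
Illustration:The abstract names two ingredients, a hypergeometric probability over phylogenetic profiles and runs of consecutive matches across ordered genomes. The sketch below shows only the textbook form of each, not the authors' improved measure, whose weighting and scaling are not specified in the abstract. The function names, the 0/1 profile encoding, and the definition of a match as "both proteins present in the same genome" are assumptions made for this example.

```python
from math import comb


def hypergeom_tail(profile_a, profile_b):
    """P(X >= k): chance of seeing at least the observed number of shared genomes.

    Profiles are equal-length lists of 0/1 flags, one per target genome
    (assumed encoding). N genomes in total, protein A present in n of them,
    protein B in m, both present in k.
    """
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same genomes")
    N = len(profile_a)
    n = sum(profile_a)
    m = sum(profile_b)
    k = sum(1 for a, b in zip(profile_a, profile_b) if a and b)
    total = comb(N, m)
    # Sum the upper tail of the hypergeometric distribution.
    tail = sum(comb(n, i) * comb(N - n, m - i) for i in range(k, min(n, m) + 1))
    return tail / total


def longest_match_run(profile_a, profile_b):
    """Length of the longest run of consecutive genomes where both proteins occur.

    Assumes the profiles are already ordered by some genome ordering;
    how that ordering is chosen is outside this sketch.
    """
    best = current = 0
    for a, b in zip(profile_a, profile_b):
        if a and b:
            current += 1
            best = max(best, current)
        else:
            current = 0
    return best
```

A smaller tail probability suggests that the co-occurrence of the two proteins is less likely to be due to chance, and a longer run indicates that the shared presence is clustered among neighbouring genomes in the chosen ordering. How the two quantities are combined on a common scale is the contribution of the paper and is not reproduced here.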