Bayesian Markov Random Field Analysis for Protein Function Prediction Based on Network Data
Authors: Yiannis A. I. Kourmpetis, Aalt D. J. van Dijk, Marco C. A. M. Bink, Roeland C. H. J. van Ham, Cajo J. F. ter Braak
Affiliation: 1. Biometris, Wageningen University and Research Centre, Wageningen, The Netherlands; 2. Applied Bioinformatics, Plant Research International, Wageningen, The Netherlands; 3. Laboratory of Bioinformatics, Wageningen University, Wageningen, The Netherlands; Miami University, United States of America
Abstract: Inference of protein functions is one of the most important aims of modern biology. To fully exploit the large volumes of genomic data typically produced in modern-day genomic experiments, automated computational methods for protein function prediction are urgently needed. Established methods use sequence or structure similarity to infer functions, but those types of data do not suffice to determine the biological context in which proteins act. Current high-throughput biological experiments produce large amounts of data on the interactions between proteins. Such data can be used to infer interaction networks and to predict the biological process that the protein is involved in. Here, we develop a probabilistic approach for protein function prediction using network data, such as protein-protein interaction measurements. We take a Bayesian approach to an existing Markov Random Field method by performing simultaneous estimation of the model parameters and prediction of protein functions. We use an adaptive Markov Chain Monte Carlo algorithm that leads to more accurate parameter estimates and consequently to improved prediction performance compared to the standard Markov Random Fields method. We tested our method using a high-quality S. cerevisiae validation network with 1622 proteins against 90 Gene Ontology terms of different levels of abstraction. Compared to three other protein function prediction methods, our approach shows very good prediction performance. Our method can be directly applied to protein-protein interaction or coexpression networks, but also can be extended to use multiple data sources. We apply our method to physical protein interaction data from S. cerevisiae and provide novel predictions, using 340 Gene Ontology terms, for 1170 unannotated proteins, and we evaluate the predictions using the available literature.
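
The idea of estimating model parameters and predicting unknown protein functions in the same sampling run can be illustrated with a small sketch. This is not the authors' implementation. It assumes a single binary function label per protein, a simplified logistic conditional in which the log-odds of a protein having the function are `alpha + beta1 * n1 + beta0 * n0` (with `n1` and `n0` the numbers of neighbours with and without the function), a pseudo-likelihood for the parameters, a normal prior, and a plain random-walk Metropolis step in place of the adaptive MCMC algorithm mentioned in the abstract. All function names, parameter names, default values, and the toy network are hypothetical.

```python
import math
import random


def log_sigmoid(z):
    """Numerically stable log(1 / (1 + exp(-z)))."""
    return min(z, 0.0) - math.log1p(math.exp(-abs(z)))


def neighbor_counts(i, labels, adj):
    """Count neighbours of node i with label 1 and with label 0."""
    n1 = sum(labels[j] for j in adj[i])
    return n1, len(adj[i]) - n1


def log_pseudolikelihood(params, labels, adj):
    """Sum of log conditional probabilities of each label given its neighbours."""
    alpha, beta1, beta0 = params
    total = 0.0
    for i in adj:
        n1, n0 = neighbor_counts(i, labels, adj)
        z = alpha + beta1 * n1 + beta0 * n0
        total += log_sigmoid(z) if labels[i] == 1 else log_sigmoid(-z)
    return total


def sample_posterior(adj, known, n_iter=2000, burn_in=500,
                     step=0.3, prior_sd=3.0, seed=0):
    """Alternate Gibbs updates of unknown labels with a Metropolis update
    of the parameters; return posterior frequencies of label 1."""
    rng = random.Random(seed)
    unknown = [i for i in adj if i not in known]
    labels = dict(known)
    for i in unknown:
        labels[i] = rng.randint(0, 1)  # random initial state
    params = [0.0, 0.0, 0.0]

    def log_post(p):
        # Pseudo-likelihood plus an assumed independent normal prior.
        prior = -sum(v * v for v in p) / (2.0 * prior_sd ** 2)
        return log_pseudolikelihood(p, labels, adj) + prior

    hits = {i: 0 for i in unknown}
    for it in range(n_iter):
        # Gibbs step: resample each unannotated protein given its neighbours.
        for i in unknown:
            n1, n0 = neighbor_counts(i, labels, adj)
            z = params[0] + params[1] * n1 + params[2] * n0
            labels[i] = 1 if rng.random() < math.exp(log_sigmoid(z)) else 0
        # Metropolis step: random-walk proposal for (alpha, beta1, beta0).
        proposal = [v + rng.gauss(0.0, step) for v in params]
        diff = log_post(proposal) - log_post(params)
        if rng.random() < math.exp(min(0.0, diff)):
            params = proposal
        if it >= burn_in:
            for i in unknown:
                hits[i] += labels[i]
    return {i: hits[i] / (n_iter - burn_in) for i in unknown}


# Toy symmetric network: proteins "c" and "d" are unannotated.
adj = {
    "a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"],
    "d": ["c", "e", "f"], "e": ["d", "f"], "f": ["d", "e"],
}
known = {"a": 1, "b": 1, "e": 0, "f": 0}
posterior = sample_posterior(adj, known)
```

Here `posterior` maps each unannotated protein to the fraction of retained iterations in which it carried the function. Because the parameters are resampled alongside the labels, these frequencies average over parameter uncertainty instead of relying on a single point estimate.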