

Direct prediction of profiles of sequences compatible with a protein structure by neural networks with fragment‐based local and energy‐based nonlocal profiles
Authors:Zhixiu Li  Yuedong Yang  Eshel Faraggi  Jian Zhan  Yaoqi Zhou
Institution:1. School of Informatics and Computing, Indiana University‐Purdue University, Indianapolis, Indiana, 46202; 2. Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, Indianapolis, Indiana, 46202; 3. Institute for Glycomics and School of Information and Communication Technology, Griffith University; 4. Department of Biochemistry and Molecular Biology, Indiana University School of Medicine, Indianapolis, Indiana, 46202; 5. Battelle Center for Mathematical Medicine, Nationwide Children's Hospital, Columbus, Ohio, 43215; 6. Research and Information Systems, Indiana, 46032
Abstract:Locating sequences compatible with a protein structural fold is the well‐known inverse protein‐folding problem. While significant progress has been made, the success rate of protein design remains low. As a result, a library of designed sequences or a profile of sequences is currently employed for guiding experimental screening or directed evolution. Sequence profiles can be computationally predicted by iterative mutations of a random sequence to produce energy‐optimized sequences, or by combining sequences of structurally similar fragments in a template library. The latter approach is computationally more efficient but yields less accurate profiles than the former because it lacks tertiary structural information. Here we present a method called SPIN that predicts Sequence Profiles by Integrated Neural network based on fragment‐derived sequence profiles and structure‐derived energy profiles. SPIN improves over the fragment‐derived profile by 6.7 percentage points (from 23.6% to 30.3%) in sequence identity between predicted and wild‐type sequences. The method also reduces the number of residues in low‐complexity regions by 15.7% and has a significantly better balance of hydrophilic and hydrophobic residues at the protein surface. The accuracy of the resulting sequence profiles is comparable to that of profiles generated by the protein design program RosettaDesign 3.5. This highly efficient method for predicting sequence profiles from structures will be useful as a single‐body scoring term for improving scoring functions used in protein design and fold recognition. It also complements protein design programs in guiding experimental design of the sequence library for screening and directed evolution of designed sequences. The SPIN server is available at http://sparks‐lab.org . Proteins 2014; 82:2565–2573. © 2014 Wiley Periodicals, Inc.
Keywords:protein design  knowledge‐based energy function  neural network  sequence profiles  inverse protein folding problem
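
The sequence identity quoted in the abstract compares a predicted profile with the wild‐type sequence. The abstract does not specify the exact procedure, so the following is only a minimal sketch under the assumption that the most probable residue is taken at each position and identity is the fraction of matching positions. The function names, the 20‐letter alphabet ordering, and the toy profile are all hypothetical and are not taken from SPIN.

```python
# Hypothetical illustration: sequence identity between the consensus of a
# predicted profile and a wild-type sequence. Not the SPIN implementation.

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # assumed column order of the profile


def consensus_sequence(profile):
    """Return the highest-probability amino acid at each position."""
    residues = []
    for row in profile:
        if len(row) != len(AMINO_ACIDS):
            raise ValueError("each profile row needs 20 probabilities")
        best = max(range(len(row)), key=row.__getitem__)
        residues.append(AMINO_ACIDS[best])
    return "".join(residues)


def sequence_identity(predicted, wild_type):
    """Fraction of aligned positions where the two sequences agree."""
    if len(predicted) != len(wild_type):
        raise ValueError("sequences must have the same length")
    if not wild_type:
        raise ValueError("sequences must not be empty")
    matches = sum(p == w for p, w in zip(predicted, wild_type))
    return matches / len(wild_type)


def peaked_row(residue, peak=0.6):
    """Toy profile row: probability `peak` on one residue, the rest uniform."""
    rest = (1.0 - peak) / (len(AMINO_ACIDS) - 1)
    return [peak if aa == residue else rest for aa in AMINO_ACIDS]


# Toy data (made up): a 4-residue profile whose consensus is "ACDA".
profile = [peaked_row(r) for r in "ACDA"]
predicted = consensus_sequence(profile)
identity = sequence_identity(predicted, "ACDE")
```

With this toy input, three of the four consensus residues match the made‐up wild‐type string, so `identity` is 0.75. Taking the single most probable residue discards the rest of the profile; a comparison that scores the full probability distribution would be a different, equally plausible choice.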