Prediction of protein‐glucose binding sites using support vector machines
Authors: Houssam Nassif, Hassan Al‐Ali, Sawsan Khuri, Walid Keirouz
Institutions: 1. Department of Computer Sciences, University of Wisconsin‐Madison, Madison, Wisconsin; 2. Department of Biochemistry and Molecular Biology, University of Miami, Miller School of Medicine, Miami, Florida; 3. Center for Computational Science, University of Miami, Miller School of Medicine, Miami, Florida; 4. Dr. John T. Macdonald Foundation, Department of Human Genetics, University of Miami, Miller School of Medicine, Florida; 5. Department of Computer Science, American University of Beirut, Beirut, Lebanon
Abstract: Glucose is a simple sugar that plays an essential role in many basic metabolic and signaling pathways. Many proteins have binding sites that are highly specific to glucose. The exponential increase of genomic data has revealed the identity of many proteins that seem to be central to biological processes, but whose exact functions are unknown. Many of these proteins seem to be associated with disease processes. Being able to predict glucose‐specific binding sites in these proteins will greatly enhance our ability to annotate protein function and may significantly contribute to drug design. We hereby present the first glucose‐binding site classifier algorithm. We consider the sugar‐binding pocket as a spherical spatio‐chemical environment and represent it as a vector of geometric and chemical features. We then perform Random Forests feature selection to identify key features and analyze them using support vector machines classification. Our work shows that glucose binding sites can be modeled effectively using a limited number of basic chemical and residue features. Using a leave‐one‐out cross‐validation method, our classifier achieves an 8.11% error, an 89.66% sensitivity and a 93.33% specificity over our dataset. From a biochemical perspective, our results support the relevance of ordered water molecules and ions in determining glucose specificity. They also reveal the importance of carboxylate residues in glucose binding and the high concentration of negatively charged atoms in direct contact with the bound glucose molecule. Proteins 2009. © 2009 Wiley‐Liss, Inc.
Keywords: hexose, carbohydrate, protein‐carbohydrate interaction, substrate recognition, binding site signature, feature vector, SVM, random forests
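
The evaluation protocol named in the abstract (leave‐one‐out cross‐validation, scored by error, sensitivity and specificity) can be sketched in a few lines of Python. This is a minimal illustration and not the authors' code. The `nearest_centroid` function is a hypothetical stand‐in classifier used only to keep the example dependency‐free; it is not a support vector machine. The toy feature vectors are invented. The confusion counts in the last step (26, 3, 42, 3) are also an assumption: the abstract does not state them, and they were chosen only because they reproduce the reported percentages.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Vector = Sequence[float]
Classifier = Callable[[List[Vector], List[int], Vector], int]


def nearest_centroid(train_X: List[Vector], train_y: List[int], x: Vector) -> int:
    """Hypothetical stand-in classifier (NOT an SVM): label of the closest class mean."""
    best_label, best_dist = -1, float("inf")
    for label in sorted(set(train_y)):
        rows = [r for r, l in zip(train_X, train_y) if l == label]
        centroid = [sum(col) / len(rows) for col in zip(*rows)]
        dist = sum((a - b) ** 2 for a, b in zip(x, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def leave_one_out(X: List[Vector], y: List[int], classify: Classifier) -> List[int]:
    """Predict each sample with a model built from all the other samples."""
    preds = []
    for i in range(len(X)):
        train_X = [row for j, row in enumerate(X) if j != i]
        train_y = [lab for j, lab in enumerate(y) if j != i]
        preds.append(classify(train_X, train_y, X[i]))
    return preds


def confusion(y_true: List[int], y_pred: List[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fn, tn, fp), treating label 1 as 'glucose-binding'."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp, fn, tn, fp


def rates(tp: int, fn: int, tn: int, fp: int) -> Dict[str, float]:
    """Error, sensitivity and specificity; assumes both classes are present."""
    total = tp + fn + tn + fp
    return {
        "error": (fp + fn) / total,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Invented one-feature toy data: 0 = non-binding site, 1 = binding site.
X = [[0.0], [0.1], [0.2], [1.0], [1.1], [1.2]]
y = [0, 0, 0, 1, 1, 1]
preds = leave_one_out(X, y, nearest_centroid)
toy = rates(*confusion(y, preds))

# Assumed counts that are arithmetically consistent with the abstract.
reported = rates(tp=26, fn=3, tn=42, fp=3)
print({k: round(100 * v, 2) for k, v in reported.items()})
# → {'error': 8.11, 'sensitivity': 89.66, 'specificity': 93.33}
```

Because `leave_one_out` takes the classifier as a parameter, a real SVM prediction function could be passed in place of the stand‐in without changing the scoring code.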