Improving the performance of an SVM-based method for predicting protein-protein interactions |
| |
Authors: | Dohkan Shinsuke; Koike Asako; Takagi Toshihisa |
| |
Affiliation: | Department of Computational Biology, Graduate School of Frontier Sciences, University of Tokyo, (CB01) 5-1-5 Kashiwanoha, Kashiwa-shi, Chiba 277-8581, Japan. dohkan@cb.k.u-tokyo.ac.jp |
| |
Abstract: | Predicting the interactions between all the possible pairs of proteins in a given organism (making a protein-protein interaction map) is a crucial subject in bioinformatics. Most of the previous methods based on supervised machine learning use datasets containing approximately the same number of interacting pairs of proteins (positives) and non-interacting pairs of proteins (negatives) for training a classifier and are estimated to yield a large number of false positives. Thinking that the negatives used in previous studies cannot adequately represent all the negatives that need to be taken into account, we have developed a method based on multiple Support Vector Machines (SVMs) that uses more negatives than positives for predicting interactions between pairs of yeast proteins and pairs of human proteins. We show that the performance of a single SVM improved as we increased the number of negatives used for training and that, if more than one CPU is available, an approach using multiple SVMs is useful not only for improving the performance of classifiers but also for reducing the time required for training them. Our approach can also be applied to assessing the reliability of high-throughput interactions. |
| |
Keywords: | |
This article is indexed in PubMed and other databases. |
|
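The abstract describes training multiple SVMs on more negatives than positives, but it does not say how the negatives are distributed among the classifiers or how their outputs are combined. The sketch below shows one possible reading, under stated assumptions: the negatives are split into disjoint parts, each part is paired with the full set of positives to form one training set, and the resulting classifiers are combined by averaging their scores. The names `split_negatives`, `build_training_sets`, and `ensemble_score` are hypothetical, and the SVM training step itself is left out.

```python
import random
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]
Example = Tuple[Vector, int]          # (feature vector, label: 1 = interacting, 0 = not)
Scorer = Callable[[Vector], float]    # stand-in for a trained SVM's decision function


def split_negatives(negatives: Sequence[Vector], n_parts: int, seed: int = 0) -> List[List[Vector]]:
    """Shuffle the negatives and deal them into n_parts disjoint subsets."""
    shuffled = list(negatives)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i::n_parts] for i in range(n_parts)]


def build_training_sets(
    positives: Sequence[Vector],
    negatives: Sequence[Vector],
    n_parts: int,
    seed: int = 0,
) -> List[List[Example]]:
    """Assumption: every classifier sees all positives plus one subset of negatives."""
    return [
        [(x, 1) for x in positives] + [(x, 0) for x in part]
        for part in split_negatives(negatives, n_parts, seed)
    ]


def ensemble_score(scorers: Sequence[Scorer], x: Vector) -> float:
    """Assumption: combine the classifiers by averaging their scores."""
    return sum(score(x) for score in scorers) / len(scorers)
```

Because the training sets are independent of one another, each could in principle be trained on a separate CPU, which is consistent with the abstract's remark that multiple SVMs reduce training time when more than one CPU is available.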