Computational Biophysics and Bioinformatics, Department of Physics, Clemson University, Clemson, SC 29634, United States
Abstract:
Development of sequence-based methods for predicting putative interfacial residues is an extremely important task in modeling 3D structures of protein–protein complexes. In the present paper we used non-gapped sequence segments to predict both interacting and interfacial residues. We demonstrated that continuous sequence segments do occur at protein–protein interfaces and showed that continuous interacting interfacial segments (CIIS) of length nine are present, on average, in 37% of the complexes in our dataset. Our results indicate that CIIS consist mostly of interacting strands and/or loops, while CIIS involving helices are scarce. We scored CIIS using four different scoring schemes and found that the scores of CIIS differ significantly from the scores calculated for random stretches of residues. We argue that this statistical difference, inferred through the corresponding Z-scores, could be used for detecting putative interfacial residue segments without using any structural information. This hypothesis was tested on our dataset, and benchmarking resulted in 10–60% prediction accuracy, depending on the type of benchmarking and the scoring scheme used in the calculations. Such predictions, which do not depend on the availability of the 3D structures of monomers, can be quite valuable in modeling 3D structures of obligatory complexes, for which structures of the separated monomers do not exist.
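The Z-score idea described above can be illustrated with a minimal Python sketch. It is not the scoring used in the paper: the propensity table `TOY_PROPENSITY`, the functions `segment_score`, `z_score` and `rank_segments`, and the null model (random stretches drawn from the residues of the same sequence) are all assumptions introduced for illustration, and the sketch scores single segments rather than pairs of interacting segments.

```python
import random
import statistics

# Hypothetical per-residue interface propensities (illustrative values only,
# not taken from the paper or from any published scale).
TOY_PROPENSITY = {"W": 0.8, "Y": 0.7, "F": 0.6, "L": 0.4, "I": 0.4, "M": 0.5,
                  "K": -0.4, "E": -0.5, "D": -0.6}


def segment_score(segment):
    """Sum the toy propensities over a non-gapped segment (unknown residues score 0)."""
    return sum(TOY_PROPENSITY.get(res, 0.0) for res in segment)


def z_score(segment, background, n_random=1000, seed=0):
    """Z-score of a segment's score against random stretches of the same length.

    Random stretches are built by sampling residues from `background` with
    replacement; this null model is an assumption of the sketch.
    """
    rng = random.Random(seed)
    length = len(segment)
    random_scores = [
        segment_score(rng.choices(background, k=length)) for _ in range(n_random)
    ]
    mean = statistics.fmean(random_scores)
    std = statistics.pstdev(random_scores)
    if std == 0:
        return 0.0  # degenerate background: no spread to compare against
    return (segment_score(segment) - mean) / std


def rank_segments(sequence, length=9, **kwargs):
    """Score every non-gapped window of `length` and sort by descending Z-score."""
    windows = [(i, sequence[i:i + length]) for i in range(len(sequence) - length + 1)]
    scored = [(i, seg, z_score(seg, sequence, **kwargs)) for i, seg in windows]
    return sorted(scored, key=lambda item: item[2], reverse=True)
```

Calling `rank_segments(sequence, length=9)` on a one-letter amino acid string returns `(start_index, segment, z)` tuples, with the highest-scoring windows first as candidate interfacial segments. A fixed `seed` keeps the random background reproducible, so all windows of one sequence are compared against the same null distribution.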