A statistical method to predict protein pKa has been developed by using the 3D structure of a protein and a database of 434 experimental protein pKa values. Each pKa in the database is associated with a fingerprint that describes the chemical environment around an ionizable residue. A computational tool, MoKaBio, has been developed to automatically identify ionizable residues in a protein, generate fingerprints that describe the chemical environment around such residues, and predict pKa from the experimental pKa values in the database by using a similarity metric. The method, which correctly retrieved the pKa of 429 of the 434 ionizable sites in the database, was cross-validated by leave-one-out and yielded a root mean square error (RMSE) of 0.95, a result superior to those obtained with the Null Model (RMSE = 1.07) and with other well-established protein pKa prediction tools. This novel approach is suitable for rationalizing protein pKa by comparing the region around the ionizable site with similar regions whose ionizable site pKa is known. The pKa of residues that have a unique environment not represented in the training set cannot be predicted accurately; however, the method offers the advantage of being trainable to increase its predictive power. Proteins 2009. © 2009 Wiley-Liss, Inc.
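
The similarity-based prediction and leave-one-out validation described in the abstract can be illustrated with a minimal sketch. It is not MoKaBio's implementation: the bit-vector fingerprints, the Tanimoto coefficient as the similarity metric, the similarity-weighted average over the `k` nearest neighbours, the toy database, and the reference values in `MODEL_PKA` are all assumptions chosen for illustration. The null model is assumed here to predict a fixed reference pKa per residue type.

```python
import math

# Hypothetical toy database: (residue type, fingerprint, experimental pKa).
# The fingerprints are made-up bit vectors, not MoKaBio's actual descriptors.
DATABASE = [
    ("ASP", (1, 0, 1, 1, 0, 0), 3.2),
    ("ASP", (1, 0, 1, 0, 0, 0), 3.9),
    ("ASP", (0, 1, 0, 0, 1, 1), 6.1),
    ("GLU", (1, 0, 1, 1, 0, 1), 4.1),
    ("GLU", (0, 1, 0, 0, 1, 0), 5.4),
    ("HIS", (1, 1, 0, 1, 0, 0), 6.8),
    ("HIS", (1, 1, 0, 0, 0, 0), 6.2),
]

# Illustrative per-residue reference pKa values for the null model (assumed).
MODEL_PKA = {"ASP": 4.0, "GLU": 4.4, "HIS": 6.3}


def tanimoto(a, b):
    """Tanimoto similarity between two equal-length bit vectors."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 0.0


def predict(residue, fingerprint, database, k=2):
    """Similarity-weighted mean pKa of the k most similar same-type sites."""
    candidates = [
        (tanimoto(fingerprint, fp), pka)
        for res, fp, pka in database
        if res == residue
    ]
    if not candidates:
        # No comparable site in the database: fall back to the null model.
        return MODEL_PKA[residue]
    top = sorted(candidates, reverse=True)[:k]
    total = sum(sim for sim, _ in top)
    if total == 0:
        # Nothing similar at all: use the plain mean of the selected sites.
        return sum(pka for _, pka in top) / len(top)
    return sum(sim * pka for sim, pka in top) / total


def rmse(errors):
    """Root mean square of a list of prediction errors."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def leave_one_out_rmse(database, k=2):
    """Predict each site from all the other sites and return the RMSE."""
    errors = []
    for i, (res, fp, pka) in enumerate(database):
        rest = database[:i] + database[i + 1:]
        errors.append(predict(res, fp, rest, k) - pka)
    return rmse(errors)


def null_model_rmse(database):
    """RMSE obtained when every site is assigned its reference pKa."""
    return rmse([MODEL_PKA[res] - pka for res, _, pka in database])


if __name__ == "__main__":
    print("leave-one-out RMSE:", round(leave_one_out_rmse(DATABASE), 2))
    print("null-model RMSE:   ", round(null_model_rmse(DATABASE), 2))
```

In this sketch a site whose environment has no close analogue in the database receives a low-similarity, and therefore unreliable, estimate, which mirrors the limitation noted in the abstract. Adding more experimental sites to `DATABASE` is the toy equivalent of retraining.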