1.

An atomic environment potential for use in protein structure prediction

Summa CM Levitt M Degrado WF 《Journal of molecular biology》2005,352(4):986-1001

We describe the derivation and testing of a knowledge-based atomic environment potential for the modeling of protein structural energetics. An analysis of the probabilities of atomic interactions in a dataset of high-resolution protein structures shows that the probabilities of non-bonded inter-atomic contacts are not statistically independent events, and that the multi-body contact frequencies are poorly predicted from pairwise contact potentials. A pseudo-energy function is defined that measures the preferences for protein atoms to be in a given microenvironment defined by the number of contacting atoms in the environment and its atomic composition. This functional form is tested for its ability to recognize native protein structures amongst an ensemble of decoy structures and a detailed relative performance comparison is made with a number of common functions used in protein structure prediction.

2.

Prediction of protein residue contacts with a PDB-derived likelihood matrix (total citations: 8; self-citations: 0; citations by others: 8)

Singer MS Vriend G Bywater RP 《Protein engineering》2002,15(9):721-725

Proteins with similar folds often display common patterns of residue variability. A widely discussed question is how these patterns can be identified and deconvoluted to predict protein structure. In this respect, correlated mutation analysis (CMA) has shown considerable promise. CMA compares multiple members of a protein family and detects residues that remain constant or mutate in tandem. Often this behavior points to structural or functional interdependence between residues. CMA has been used to predict pairs of amino acids that are distant in the primary sequence but likely to form close contacts in the native three-dimensional structure. Until now these methods have used evolutionary or biophysical models to score the fit between residues. We wished to test whether empirical methods, derived from known protein structures, would provide useful predictive power for CMA. We analyzed 672 known protein structures, derived contact likelihood scores for all possible amino acid pairs, and used these scores to predict contacts. We then tested the method on 118 different protein families for which structures have been solved to atomic resolution. The mean performance was almost seven times better than random prediction. Used in concert with secondary structure prediction, the new CMA method could supply restraints for predicting still undetermined structures.
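
The idea of deriving contact likelihood scores from known structures can be illustrated with a minimal sketch. It is not the procedure used in the paper: the function name `contact_likelihoods`, the 8 Å cutoff, the use of one representative point per residue, and the pseudocounts are all assumptions made for illustration. Each residue-type pair receives a log-odds score comparing its contact frequency with the background contact frequency over all pairs.

```python
import math
from collections import Counter
from itertools import combinations


def contact_likelihoods(structures, cutoff=8.0):
    """Log-odds contact scores per residue-type pair (illustrative only).

    structures: list of structures, each a list of (residue_type, (x, y, z)).
    cutoff: contact distance in angstroms (assumed value, not from the paper).
    """
    contact_counts = Counter()  # pairs of this type found in contact
    pair_counts = Counter()     # all pairs of this type, in contact or not
    for residues in structures:
        for (a, pos_a), (b, pos_b) in combinations(residues, 2):
            key = tuple(sorted((a, b)))
            pair_counts[key] += 1
            if math.dist(pos_a, pos_b) <= cutoff:
                contact_counts[key] += 1

    # Background contact frequency over all pair types, with pseudocounts
    # so that the logarithm below is always defined.
    total_pairs = sum(pair_counts.values())
    total_contacts = sum(contact_counts.values())
    background = (total_contacts + 1) / (total_pairs + 2)

    scores = {}
    for key, n_pairs in pair_counts.items():
        frequency = (contact_counts[key] + 1) / (n_pairs + 2)
        scores[key] = math.log(frequency / background)
    return scores
```

A positive score marks a residue pair that is in contact more often than the average pair in the toy dataset; a negative score marks one that is in contact less often.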

3.

Preliminarily investigating the polymorphism of self-organized actin filament in vitro by atomic force microscope

Zhang J Wang YL Chen XY He CL Cheng C Cao Y 《Acta biochimica et biophysica Sinica》2004,36(9):637-643

Actin is a major structural component of eukaryotic cytoskeleton and exists in monomer G-actin and filamentous F-actin. G-actin consists of 375 amino acid residues with molecular weight 43 kD and is a highly conserved protein expressed in most living organi…

4.

Design of multispecific protein sequences using probabilistic graphical modeling

Menachem Fromer Chen Yanover Michal Linial 《Proteins》2010,78(3):530-547

In nature, proteins partake in numerous protein–protein interactions that mediate their functions. Moreover, proteins have been shown to be physically stable in multiple structures, induced by cellular conditions, small ligands, or covalent modifications. Understanding how protein sequences achieve this structural promiscuity at the atomic level is a fundamental step in the drug design pipeline and a critical question in protein physics. One way to investigate this subject is to computationally predict protein sequences that are compatible with multiple states, i.e., multiple target structures or binding to distinct partners. The goal of engineering such proteins has been termed multispecific protein design. We develop a novel computational framework to efficiently and accurately perform multispecific protein design. This framework utilizes recent advances in probabilistic graphical modeling to predict sequences with low energies in multiple target states. Furthermore, it is also geared to specifically yield positional amino acid probability profiles compatible with these target states. Such profiles can be used as input to randomly bias high‐throughput experimental sequence screening techniques, such as phage display, thus providing an alternative avenue for elucidating the multispecificity of natural proteins and the synthesis of novel proteins with specific functionalities. We prove the utility of such multispecific design techniques in better recovering amino acid sequence diversities similar to those resulting from millions of years of evolution. We then compare the approaches of prediction of low energy ensembles and of amino acid profiles and demonstrate their complementarity in providing more robust predictions for protein design. Proteins 2010. © 2009 Wiley‐Liss, Inc.

5.

Identification of specificity and promiscuity of PDZ domain interactions through their dynamic behavior

Z. Nevin Gerek Ozlem Keskin S. Banu Ozkan 《Proteins》2009,77(4):796-811

PDZ domains (PDZs), the most common interaction domain proteins, play critical roles in many cellular processes. PDZs perform their job by binding specific protein partners. However, they are very promiscuous, binding to more than one protein, yet selective at the same time. We examined the binding related dynamics of various PDZs to have insight about their specificity and promiscuity. We used full atomic normal mode analysis and a modified coarse‐grained elastic network model to compute the binding related dynamics. In the latter model, we introduced specificity for each single parameter constant and included the solvation effect implicitly. The modified model, referred to as specific‐Gaussian Network Model (s‐GNM), highlights some interesting differences in the conformational changes of PDZs upon binding to Class I or Class II type peptides. By clustering the residue fluctuation profiles of PDZs, we have shown: (i) binding selectivities can be discriminated from their dynamics, and (ii) the dynamics of different structural regions play critical roles for Class I and Class II specificity. s‐GNM is further tested on a dual‐specific PDZ which showed only Class I specificity when a point mutation exists on the βA‐βB loop. We observe that the binding dynamics change consistently in the mutated and wild type structures. In addition, we found that the binding induced fluctuation profiles can be used to discriminate the binding selectivity of homolog structures. These results indicate that s‐GNM can be a powerful method to study the changes in binding selectivities for mutant or homolog PDZs. Proteins 2009. © 2009 Wiley‐Liss, Inc.

6.

Quantitative theory of hydrophobic effect as a driving force of protein structure

Nikolay Perunov Jeremy L. England 《Protein science : a publication of the Protein Society》2014,23(4):387-399

Various studies suggest that the hydrophobic effect plays a major role in driving the folding of proteins. In the past, however, it has been challenging to translate this understanding into a predictive, quantitative theory of how the full pattern of sequence hydrophobicity in a protein shapes functionally important features of its tertiary structure. Here, we extend and apply such a phenomenological theory of the sequence‐structure relationship in globular protein domains, which had previously been applied to the study of allosteric motion. In an effort to optimize parameters for the model, we first analyze the patterns of backbone burial found in single‐domain crystal structures, and discover that classic hydrophobicity scales derived from bulk physicochemical properties of amino acids are already nearly optimal for prediction of burial using the model. Subsequently, we apply the model to studying structural fluctuations in proteins and establish a means of identifying ligand‐binding and protein–protein interaction sites using this approach.

7.

Predicting nucleic acid binding interfaces from structural models of proteins

Srayanta Mukherjee Yang Zhang Fabian Glaser Yael Mandel‐Gutfreund 《Proteins》2012,80(2):482-489

The function of DNA‐ and RNA‐binding proteins can be inferred from the characterization and accurate prediction of their binding interfaces. However, the main pitfall of various structure‐based methods for predicting nucleic acid binding function is that they are all limited to a relatively small number of proteins for which high‐resolution three‐dimensional structures are available. In this study, we developed a pipeline for extracting functional electrostatic patches from surfaces of protein structural models, obtained using the I‐TASSER protein structure predictor. The largest positive patches are extracted from the protein surface using the patchfinder algorithm. We show that functional electrostatic patches extracted from an ensemble of structural models highly overlap the patches extracted from high‐resolution structures. Furthermore, by testing our pipeline on a set of 55 known nucleic acid binding proteins for which I‐TASSER produces high‐quality models, we show that the method accurately identifies the nucleic acid binding interface on structural models of proteins. Employing a combined patch approach we show that patches extracted from an ensemble of models better predict the real nucleic acid binding interfaces compared with patches extracted from independent models. Overall, these results suggest that combining information from a collection of low‐resolution structural models could be a valuable approach for functional annotation. We suggest that our method will be further applicable for predicting other functional surfaces of proteins with unknown structure. Proteins 2012. © 2011 Wiley Periodicals, Inc.

8.

Discovery of Rab1 binding sites using an ensemble of clustering methods

Suryani Lukman Minh N. Nguyen Kelvin Sim Jeremy C.M. Teo 《Proteins》2017,85(5):859-871

Targeting non‐native‐ligand binding sites for potential investigative and therapeutic applications is an attractive strategy in proteins that share common native ligands, as in Rab1 protein. Rab1 is a subfamily member of Rab proteins, which are members of Ras GTPase superfamily. All Ras GTPase superfamily members bind to native ligands GTP and GDP, that switch on and off the proteins, respectively. Rab1 is physiologically essential for autophagy and transport between endoplasmic reticulum and Golgi apparatus. Pathologically, Rab1 is implicated in human cancers, a neurodegenerative disease, cardiomyopathy, and bacteria‐caused infectious diseases. We have performed structural analyses on Rab1 protein using a unique ensemble of clustering methods, including multi‐step principal component analysis, non‐negative matrix factorization, and independent component analysis, to better identify representative Rab1 proteins than the application of a single clustering method alone does. We then used the identified representative Rab1 structures, resolved in multiple ligand states, to map their known and novel binding sites. We report here at least a novel binding site on Rab1, involving Rab1‐specific residues that could be further explored for the rational design and development of investigative probes and/or therapeutic small molecules against the Rab1 protein. Proteins 2017; 85:859–871. © 2016 Wiley Periodicals, Inc.

9.

Lessons from the design of a novel atomic potential for protein folding (total citations: 2; self-citations: 0; citations by others: 2)

Chen WW Shakhnovich EI 《Protein science : a publication of the Protein Society》2005,14(7):1741-1752

10.

AFM force spectroscopy reveals how subtle structural differences affect the interaction strength between Candida albicans and DC‐SIGN

Joost te Riet Inge Reinieren‐Beeren Carl G. Figdor Alessandra Cambi 《Journal of molecular recognition : JMR》2015,28(11):687-698

The fungus Candida albicans is the most common cause of mycotic infections in immunocompromised hosts. Little is known about the initial interactions between Candida and immune cell receptors, such as the C‐type lectin dendritic cell‐specific intracellular cell adhesion molecule‐3 (ICAM‐3)‐grabbing non‐integrin (DC‐SIGN), because a detailed characterization at the structural level is lacking. DC‐SIGN recognizes specific Candida‐associated molecular patterns, that is, mannan structures present in the cell wall of Candida. The molecular recognition mechanism is however poorly understood. We postulated that small differences in mannan‐branching may result in considerable differences in the binding affinity. Here, we exploit atomic force microscope‐based dynamic force spectroscopy with single Candida cells to gain better insight in the carbohydrate recognition capacity of DC‐SIGN. We demonstrate that slight differences in the N‐mannan structure of Candida, that is, the absence or presence of a phosphomannan side chain, results in differences in the recognition by DC‐SIGN as follows: (i) it contributes to the compliance of the outer cell wall of Candida, and (ii) its presence results in a higher binding energy of 1.6 k_BT. The single‐bond affinity of tetrameric DC‐SIGN for wild‐type C. albicans is ~10.7 k_BT and a dissociation constant k_D of 23 μM, which is relatively strong compared with other carbohydrate–protein interactions described in the literature. In conclusion, this study shows that DC‐SIGN specifically recognizes mannan patterns on C. albicans with high affinity. Knowledge on the binding pocket of DC‐SIGN and its pathogenic ligands will lead to a better understanding of how fungal‐associated carbohydrate structures are recognized by receptors of the immune system and can ultimately contribute to the development of new anti‐fungal drugs. Copyright © 2015 John Wiley & Sons, Ltd.

11.

Structures of mesophilic and extremophilic citrate synthases reveal rigidity and flexibility for function

Stephen A. Wells Susan J. Crennell Michael J. Danson 《Proteins》2014,82(10):2657-2670

Citrate synthase (CS) catalyses the entry of carbon into the citric acid cycle and is highly‐conserved structurally across the tree of life. Crystal structures of dimeric CSs are known in both “open” and “closed” forms, which differ by a substantial domain motion that closes the substrate‐binding clefts. We explore both the static rigidity and the dynamic flexibility of CS structures from mesophilic and extremophilic organisms from all three evolutionary domains. The computational expense of this wide‐ranging exploration is kept to a minimum by the use of rigidity analysis and rapid all‐atom simulations of flexible motion, combining geometric simulation and elastic network modeling. CS structures from thermophiles display increased structural rigidity compared with the mesophilic enzyme. A CS structure from a psychrophile, stabilized by strong ionic interactions, appears to display likewise increased rigidity in conventional rigidity analysis; however, a novel modified analysis, taking into account the weakening of the hydrophobic effect at low temperatures, shows a more appropriate decreased rigidity. These rigidity variations do not, however, affect the character of the flexible dynamics, which are well conserved across all the structures studied. Simulation trajectories not only duplicate the crystallographically observed symmetric open‐to‐closed transitions, but also identify motions describing a previously unidentified antisymmetric functional motion. This antisymmetric motion would not be directly observed in crystallography but is revealed as an intrinsic property of the CS structure by modeling of flexible motion. This suggests that the functional motion closing the binding clefts in CS may be independent rather than symmetric and cooperative. Proteins 2014; 82:2657–2670. © 2014 Wiley Periodicals, Inc.

12.

Method for comparing the structures of protein ligand-binding sites and application for predicting protein-drug interactions

Minai R Matsuo Y Onuki H Hirota H 《Proteins》2008,72(1):367-381

Many drugs, even ones that are designed to act selectively on a target protein, bind unintended proteins. These unintended bindings can explain side effects or indicate additional mechanisms for a drug's medicinal properties. Structural similarity between binding sites is one of the reasons for binding to multiple targets. We developed a method for the structural alignment of atoms in the solvent-accessible surface of proteins that uses similarities in the local atomic environment, and carried out all-against-all structural comparisons for 48,347 potential ligand-binding regions from a nonredundant protein structure subset (nrPDB, provided by NCBI). The relationships between the similarity of ligand-binding regions and the similarity of the global structures of the proteins containing the binding regions were examined. We found 10,403 known ligand-binding region pairs whose structures were similar despite having different global folds. Of these, we detected 281 region pairs that had similar ligands with similar binding modes. These proteins are good examples of convergent evolution. In addition, we found a significant correlation between Z-score of structural similarity and true positive rate of "active" entries in the PubChem BioAssay database. Moreover, we confirmed the interaction between ibuprofen and a new target, porcine pancreatic elastase, by NMR experiment. Finally, we used this method to predict new drug-target protein interactions. We obtained 540 predictions for 105 drugs (e.g., captopril, lovastatin, flurbiprofen, metyrapone, and salicylic acid), and calculated the binding affinities using AutoDock simulation. The results of these structural comparisons are available at http://www.tsurumi.yokohama-cu.ac.jp/fold/database.html.

13.

Statistical analysis of structural determinants for protein–DNA‐binding specificity

Rosario I. Corona Jun‐tao Guo 《Proteins》2016,84(8):1147-1161

DNA‐binding proteins play critical roles in biological processes including gene expression, DNA packaging and DNA repair. They bind to DNA target sequences with different degrees of binding specificity, ranging from highly specific (HS) to nonspecific (NS). Alterations of DNA‐binding specificity, due to either genetic variation or somatic mutations, can lead to various diseases. In this study, a comparative analysis of protein–DNA complex structures was carried out to investigate the structural features that contribute to binding specificity. Protein–DNA complexes were grouped into three general classes based on degrees of binding specificity: HS, multispecific (MS), and NS. Our results show a clear trend of structural features among the three classes, including amino acid binding propensities, simple and complex hydrogen bonds, major/minor groove and base contacts, and DNA shape. We found that aspartate is enriched in HS DNA binding proteins and predominately binds to a cytosine through a single hydrogen bond or two consecutive cytosines through bidentate hydrogen bonds. Aromatic residues, histidine and tyrosine, are highly enriched in the HS and MS groups and may contribute to specific binding through different mechanisms. To further investigate the role of protein flexibility in specific protein–DNA recognition, we analyzed the conformational changes between the bound and unbound states of DNA‐binding proteins and structural variations. The results indicate that HS and MS DNA‐binding domains have larger conformational changes upon DNA‐binding and larger degree of flexibility in both bound and unbound states. Proteins 2016; 84:1147–1161. © 2016 Wiley Periodicals, Inc.

14.

Counterbalance of ligand‐ and self‐coupled motions characterizes multispecificity of ubiquitin

Bhaskar Dasgupta Haruki Nakamura Akira R Kinjo 《Protein science : a publication of the Protein Society》2013,22(2):168-178

Date hub proteins are a type of proteins that show multispecificity in a time‐dependent manner. To understand dynamic aspects of such multispecificity we studied Ubiquitin as a typical example of a date hub protein. Here we analyzed 9 biologically relevant Ubiquitin‐protein (ligand) heterodimer structures by using normal mode analysis based on an elastic network model. Our result showed that the self‐coupled motion of Ubiquitin in the complex, rather than its ligand‐coupled motion, is similar to the motion of Ubiquitin in the unbound condition. The ligand‐coupled motions are correlated to the conformational change between the unbound and bound conditions of Ubiquitin. Moreover, ligand‐coupled motions favor the formation of the bound states, due to its in‐phase movements of the contacting atoms at the interface. The self‐coupled motions at the interface indicated loss of conformational entropy due to binding. Therefore, such motions disfavor the formation of the bound state. We observed that the ligand‐coupled motions are embedded in the motions of unbound Ubiquitin. In conclusion, multispecificity of Ubiquitin can be characterized by an intricate balance of the ligand‐ and self‐coupled motions, both of which are embedded in the motions of the unbound form.

15.

Effective knowledge‐based potentials

Evandro Ferrada Francisco Melo 《Protein science : a publication of the Protein Society》2009,18(7):1469-1485

Empirical or knowledge‐based potentials have many applications in structural biology such as the prediction of protein structure, protein–protein, and protein–ligand interactions and in the evaluation of stability for mutant proteins, the assessment of errors in experimentally solved structures, and the design of new proteins. Here, we describe a simple procedure to derive and use pairwise distance‐dependent potentials that rely on the definition of effective atomic interactions, which attempt to capture interactions that are more likely to be physically relevant. Based on a difficult benchmark test composed of proteins with different secondary structure composition and representing many different folds, we show that the use of effective atomic interactions significantly improves the performance of potentials at discriminating between native and near‐native conformations. We also found that, in agreement with previous reports, the potentials derived from the observed effective atomic interactions in native protein structures contain a larger amount of mutual information. A detailed analysis of the effective energy functions shows that atom connectivity effects, which mostly arise when deriving the potential by the incorporation of those indirect atomic interactions occurring beyond the first atomic shell, are clearly filtered out. The shape of the energy functions for direct atomic interactions representing hydrogen bonding and disulfide and salt bridges formation is almost unaffected when effective interactions are taken into account. On the contrary, the shape of the energy functions for indirect atom interactions (i.e., those describing the interaction between two atoms bound to a direct interacting pair) is clearly different when effective interactions are considered. Effective energy functions for indirect interacting atom pairs are not influenced by the shape or the energy minimum observed for the corresponding direct interacting atom pair. Our results suggest that the dependency between the signals in different energy functions is a key aspect that needs to be addressed when empirical energy functions are derived and used, and also highlight the importance of additivity assumptions in the use of potential energy functions.
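
For readers unfamiliar with pairwise distance‐dependent potentials, the sketch below shows the generic inverse Boltzmann construction, in which the pseudo-energy of a distance bin is the negative log ratio of observed to reference frequencies. It does not implement the effective-interaction filtering described in the abstract; the function name `distance_potential`, the pseudocount, and the choice of kT = 1 are assumptions for illustration.

```python
import math


def distance_potential(observed, reference, kT=1.0, pseudocount=1.0):
    """Inverse Boltzmann pseudo-energy per distance bin (illustrative only).

    observed:  counts of an atom-type pair per distance bin in native structures.
    reference: counts per bin expected in a reference state (assumed given).
    Returns one energy per bin; negative values mark favoured distances.
    """
    if len(observed) != len(reference):
        raise ValueError("observed and reference must have the same bins")
    # Pseudocounts keep empty bins from producing log(0) or division by zero.
    n_obs = sum(observed) + pseudocount * len(observed)
    n_ref = sum(reference) + pseudocount * len(reference)
    energies = []
    for obs, ref in zip(observed, reference):
        p_obs = (obs + pseudocount) / n_obs
        p_ref = (ref + pseudocount) / n_ref
        energies.append(-kT * math.log(p_obs / p_ref))
    return energies
```

With toy counts such as `distance_potential([2, 40, 20, 10], [18, 18, 18, 18])`, the over-represented second bin receives the lowest energy and the depleted first bin a positive one.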

16.

Spatial features of proteins related to their phosphorylation and associated structural changes

Dmitry A. Karasev Darya A. Veselova Alexander V. Veselovsky Boris N. Sobolev Victor G. Zgoda Alexander I. Archakov 《Proteins》2018,86(1):13-20

Protein phosphorylation is widely used in biological regulatory processes. The study of spatial features related to phosphorylation sites is necessary to increase the efficacy of recognition of phosphorylation patterns in protein sequences. Using the data on phosphosites found in amino acid sequences, we mapped these sites onto 3D structures and studied the structural variability of the same sites in different PDB entries related to the same proteins. Solvent accessibility was calculated for the residues known to be phosphorylated. A significant change in accessibility was shown for many sites, but several ones were determined as buried in all the structures considered. Most phosphosites were found in coil regions. However, a significant portion was located in the structurally stable ordered regions. Comparison of structures with the same sites in modified and unmodified states showed that the region surrounding a site could be significantly shifted due to phosphorylation. Comparison between non‐modified structures (as well as between the modified ones) suggested that phosphorylation stabilizes one of the possible conformations. The local structure around the site could be changed due to phosphorylation, but often the initial conformation of the site surrounding is not altered within bounds of a rather large substructure. In this case, we can observe an extensive displacement within a protein domain. Phosphorylation without structural alteration seems to provide the interface for domain‐domain or protein‐protein interactions. Accounting for structural features is important for revealing more specific patterns of phosphorylation. It is also necessary for explaining structural changes as a basis for regulatory processes.

17.

Three enhancements to the inference of statistical protein‐DNA potentials

Mohammed AlQuraishi Harley H. McAdams 《Proteins》2013,81(3):426-442

The energetics of protein‐DNA interactions are often modeled using so‐called statistical potentials, that is, energy models derived from the atomic structures of protein‐DNA complexes. Many statistical protein‐DNA potentials based on differing theoretical assumptions have been investigated, but little attention has been paid to the types of data and the parameter estimation process used in deriving the statistical potentials. We describe three enhancements to statistical potential inference that significantly improve the accuracy of predicted protein‐DNA interactions: (i) incorporation of binding energy data of protein‐DNA complexes, in conjunction with their X‐ray crystal structures, (ii) use of spatially‐aware parameter fitting, and (iii) use of ensemble‐based parameter fitting. We apply these enhancements to three widely‐used statistical potentials and use the resulting enhanced potentials in a structure‐based prediction of the DNA binding sites of proteins. These enhancements are directly applicable to all statistical potentials used in protein‐DNA modeling, and we show that they can improve the accuracy of predicted DNA binding sites by up to 21%. Proteins 2013. © 2012 Wiley Periodicals, Inc.

18.

Enhanced meta-analysis of acetylcholine binding protein structures reveals conformational signatures of agonism in nicotinic receptors

Stober ST Abrams CF 《Protein science : a publication of the Protein Society》2012,21(3):307-317

The soluble acetylcholine binding protein (AChBP) is the default structural proxy for pentameric ligand‐gated ion channels (LGICs). Unfortunately, it is difficult to recognize conformational signatures of LGIC agonism and antagonism within the large set of AChBP crystal structures in both apo and ligand‐bound states, primarily because AChBP conformations in this set are nearly superimposable (root mean square deviation < 1.5 Å). We have undertaken a systematic, alignment‐free approach to elucidate conformational differences displayed by AChBP that cleanly differentiate apo/antagonist‐bound from agonist‐bound states. Our approach uses statistical inference based on both crystallographic states and conformations sampled during long molecular dynamics simulations to select important inter‐C_α distances and map their collective values onto functional states. We observe that binding of (nAChR) agonists to AChBP elicits clockwise rotation of the inner β‐sheet with respect to the outer β‐sheet, causing tilting of the cys‐loop away from the five‐fold axis, in a manner quite similar to that speculated for α‐subunits of the heteromeric nAChR structure (Unwin, J Mol Biol 2005;346:967), making this motion potentially important in transmission of the gating signal to the transmembrane domain of a LGIC. The method is also successful at discriminating partial from full agonists and supports the hypothesis that a particularly controversial ligand, lobeline, is in fact an LGIC antagonist.

19.

Coarse-grained normal mode analysis in structural biology

Bahar I Rader AJ 《Current opinion in structural biology》2005,15(5):586-592

The realization that experimentally observed functional motions of proteins can be predicted by coarse-grained normal mode analysis has renewed interest in applications to structural biology. Notable applications include the prediction of biologically relevant motions of proteins and supramolecular structures driven by their structure-encoded collective dynamics; the refinement of low-resolution structures, including those determined by cryo-electron microscopy; and the identification of conserved dynamic patterns and mechanically key regions within protein families. Additionally, hybrid methods that couple atomic simulations with deformations derived from coarse-grained normal mode analysis are able to sample collective motions beyond the range of conventional molecular dynamics simulations. Such applications have provided great insight into the underlying principles linking protein structures to their dynamics and their dynamics to their functions.
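
As a concrete illustration of one common coarse-grained model, the Gaussian network model, the sketch below builds the Kirchhoff (connectivity) matrix from one representative point per residue. The function name `kirchhoff_matrix` and the 7 Å cutoff are assumptions, not values taken from the review. In this model, residue fluctuations are proportional to the diagonal of the pseudo-inverse of this matrix; that eigen-decomposition step is omitted here.

```python
import math


def kirchhoff_matrix(coords, cutoff=7.0):
    """Kirchhoff matrix of a Gaussian network model (illustrative only).

    coords: list of (x, y, z) positions, one per residue (e.g. C-alpha atoms).
    cutoff: distance in angstroms below which two residues are connected
            by a spring (assumed value).
    Off-diagonal entries are -1 for connected pairs and 0 otherwise;
    each diagonal entry is the number of springs attached to that residue.
    """
    n = len(coords)
    gamma = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                gamma[i][j] = gamma[j][i] = -1
                gamma[i][i] += 1
                gamma[j][j] += 1
    return gamma
```

Every row of the matrix sums to zero, which reflects the fact that a rigid translation of the whole network costs no energy.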

20.

Structure-based prediction of DNA target sites by regulatory proteins (total citations: 15; self-citations: 0; citations by others: 15)

Kono H Sarai A 《Proteins》1999,35(1):114-131

Regulatory proteins play a critical role in controlling complex spatial and temporal patterns of gene expression in higher organisms, by recognizing multiple DNA sequences and regulating multiple target genes. Increasing amounts of structural data on the protein-DNA complex provides clues for the mechanism of target recognition by regulatory proteins. The analyses of the propensities of base-amino acid interactions observed in those structural data show that there is no one-to-one correspondence in the interaction, but clear preferences exist. On the other hand, the analysis of spatial distribution of amino acids around bases shows that even those amino acids with strong base preference such as Arg with G are distributed in a wide space around bases. Thus, amino acids with many different geometries can form a similar type of interaction with bases. The redundancy and structural flexibility in the interaction suggest that there are no simple rules in the sequence recognition, and its prediction is not straightforward. However, the spatial distributions of amino acids around bases indicate a possibility that the structural data can be used to derive empirical interaction potentials between amino acids and bases. Such information extracted from structural databases has been successfully used to predict amino acid sequences that fold into particular protein structures. We surmised that the structures of protein-DNA complexes could be used to predict DNA target sites for regulatory proteins, because determining DNA sequences that bind to a particular protein structure should be similar to finding amino acid sequences that fold into a particular structure. Here we demonstrate that the structural data can be used to predict DNA target sequences for regulatory proteins. Pairwise potentials that determine the interaction between bases and amino acids were empirically derived from the structural data. These potentials were then used to examine the compatibility between DNA sequences and the protein-DNA complex structure in a combinatorial "threading" procedure. We applied this strategy to the structures of protein-DNA complexes to predict DNA binding sites recognized by regulatory proteins. To test the applicability of this method in target-site prediction, we examined the effects of cognate and noncognate binding, cooperative binding, and DNA deformation on the binding specificity, and predicted binding sites in real promoters and compared with experimental data. These results show that target binding sites for several regulatory proteins are successfully predicted, and our data suggest that this method can serve as a powerful tool for predicting multiple target sites and target genes for regulatory proteins.
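
The combinatorial "threading" idea can be sketched as follows: every DNA sequence of a given length is placed onto a fixed set of base-amino acid contacts and scored by summing pairwise potentials. This is a simplified illustration, not the authors' implementation; the function name `rank_dna_sequences`, the contact representation, and the energy values in the usage example are hypothetical.

```python
from itertools import product


def rank_dna_sequences(contacts, potential, length):
    """Rank all DNA sequences by summed pairwise energy (illustrative only).

    contacts:  list of (base_position, amino_acid) pairs taken from a
               protein-DNA complex structure (hypothetical representation).
    potential: dict mapping (base, amino_acid) to an energy; lower is better.
               Pairs missing from the dict contribute zero.
    length:    number of base positions in the binding site.
    Returns a list of (energy, sequence) tuples sorted from best to worst.
    """
    scored = []
    for bases in product("ACGT", repeat=length):
        energy = sum(potential.get((bases[pos], aa), 0.0) for pos, aa in contacts)
        scored.append((energy, "".join(bases)))
    scored.sort()
    return scored
```

For example, with the made-up potential `{("G", "Arg"): -1.5, ("A", "Asn"): -1.0, ("C", "Asp"): -0.8}` and contacts `[(0, "Arg"), (1, "Asn"), (2, "Asp")]`, calling `rank_dna_sequences(contacts, potential, 3)` places `GAC` at the top of the ranking. Exhaustive enumeration grows as 4 to the power of the site length, so this form is only practical for short sites.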