Similar articles
 20 similar articles found
1.
The genetic algorithm is a technique of function optimization derived from the principles of evolutionary theory. We have adapted it to perform conformational search on polypeptides and proteins. The algorithm was first tested on several small polypeptides and the 46-amino-acid protein crambin under the AMBER potential energy function. The probable global minimum conformations of the polypeptides were located 90% of the time and a non-native conformation of crambin was located that was 150 kcal/mol lower in potential energy than the minimized crystal structure conformation. Next, we used a knowledge-based potential function to predict the structures of melittin, pancreatic polypeptide, and crambin. A 2.31 Å ΔRMS conformation of melittin and a 5.33 Å ΔRMS conformation of pancreatic polypeptide were located by genetic algorithm-based conformational search under the knowledge-based potential function. Although the ΔRMS of pancreatic polypeptide was somewhat high, most of the secondary structure was correct. The secondary structure of crambin was predicted correctly, but the potential failed to promote packing interactions. Finally, we tested the packing aspects of our potential function by attempting to predict the tertiary structure of cytochrome b562 given correct secondary structure as a constraint. The final predicted conformation of cytochrome b562 was an almost completely extended continuous helix, which indicated that the knowledge-based potential was useless for tertiary structure prediction. This work serves as a warning against testing potential functions designed for tertiary structure prediction on small proteins.

2.
Protein structure prediction remains an unsolved problem. Since prediction of the native structure seems very difficult, one usually tries to predict the correct fold of a protein. Here the "fold" is defined by the approximate backbone structure of the protein. However, physicochemical factors that determine the correct fold are not well understood. It has recently been reported that molecular mechanics energy functions combined with effective solvent terms can discriminate the native structures from misfolded ones. Using such a physicochemical energy function, we studied the factors necessary for discrimination of correct and incorrect folds. We first selected correct and incorrect folds by a conventional threading method. Then, all-atom models of those folds were constructed by simply minimizing the atomic overlaps. The constructed correct model representing the native fold has almost the same backbone structure as the native structure but differs in side-chain packing. Finally, the energy values of the constructed models were compared with that of the experimentally determined native structure. The correct model as well as the native structure showed lower energy than misfolded models. However, a large energy gap was found between the native structure and the correct model. By decomposing the energy values into their components, it was found that solvent effects such as the hydrophobic interaction or solvent shielding and the Born energy stabilized the correct model rather than the native structure. The large energetic stabilization of the native structure was attained by specific side-chain packing. The stabilization by solvent effects is small compared to that by side-chain packing. Therefore, it is suggested that in order to confidently predict the correct fold of a protein, it is also necessary to predict correct side-chain packing.

3.
Prediction of protein side-chain conformation by packing optimization   Cited by: 16 (self-citations: 0, citations by others: 16)
We have developed a rapid and completely automatic method for prediction of protein side-chain conformation, applying the simulated annealing algorithm to optimization of side-chain packing (van der Waals) interactions. The method directly attacks the combinatorial problem of simultaneously predicting the conformations of many residues, solving in 8 to 12 hours problems for which the systematic search would require over 10^300 central processing unit years. Over a test set of nine proteins ranging in size from 46 to 323 residues, the program's predictions for side-chain atoms had a root-mean-square (r.m.s.) deviation of 1.77 Å overall versus the native structures. More importantly, the predictions for core residues were especially accurate, with an r.m.s. value of 1.25 Å overall: 80 to 90% of the large hydrophobic side-chains dominating the internal core were correctly predicted, versus 30 to 40% for most current methods. The predictions' main errors were in surface residues poorly constrained by packing and small residues with greater steric freedom and hydrogen bonding interactions, which were not included in the program's potential function. van der Waals interactions appear to be the supreme determinant of the arrangement of side-chains in the core, enforcing a unique allowed packing that in every case so far examined matches the native structure.
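The simulated annealing search described in this abstract can be illustrated with a minimal sketch. It is not the authors' program: the function `anneal_rotamers`, the geometric cooling schedule, the parameter defaults, and the toy energy are all assumptions made for illustration, and a real application would supply a van der Waals packing energy instead.

```python
import math
import random


def anneal_rotamers(n_rotamers, energy, steps=5000, t_start=5.0, t_end=0.05, seed=0):
    """Metropolis simulated annealing over discrete rotamer choices.

    n_rotamers[i] is the number of candidate rotamers at residue i.
    energy(state) returns the total energy of a list of rotamer indices.
    All names and defaults here are hypothetical.
    """
    rng = random.Random(seed)
    state = [rng.randrange(n) for n in n_rotamers]
    current = energy(state)
    best_state, best = list(state), current
    for step in range(steps):
        # Geometric cooling from t_start down to t_end.
        temp = t_start * (t_end / t_start) ** (step / max(steps - 1, 1))
        pos = rng.randrange(len(state))
        old = state[pos]
        state[pos] = rng.randrange(n_rotamers[pos])
        trial = energy(state)
        delta = trial - current
        # Accept downhill moves always, uphill moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current = trial
            if current < best:
                best_state, best = list(state), current
        else:
            state[pos] = old  # reject the move and restore the old rotamer
    return best_state, best


def toy_energy(state):
    """Toy stand-in for a packing energy: penalize equal neighboring rotamers."""
    return sum(1.0 for a, b in zip(state, state[1:]) if a == b)


best_state, best_energy = anneal_rotamers([3] * 6, toy_energy)
```

The sketch returns the lowest-energy state seen during the run rather than the final state, a common safeguard because the last accepted move can be uphill.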

4.
Computational determination of optimal side-chain conformations in protein structures has been a long-standing and challenging problem. Solving this problem is important for many applications including homology modeling, protein docking, and for placing small molecule ligands on protein-binding sites. Programs available as of this writing are very fast and reasonably accurate, as measured by deviations of side-chain dihedral angles; however, often due to multiple atomic clashes, they produce structures with high positive energies. This is problematic in applications where the energy values are important, for example when placing small molecules in docking applications; the relatively small binding energy of the small molecule is drowned by the large energy due to atomic clashes that hampers finding the lowest energy state of the docked ligand. To address this we have developed an algorithm for generating a set of side-chain conformations that is dense enough that at least one of its members would have a root mean-square deviation of no more than R Å from any possible side-chain conformation of the amino acid. We call such a set a side-chain cover set of order R for the amino acid. The size of the set is constrained by the energy of the interaction of the side chain to the backbone atoms. Then, side-chain cover sets are used to optimize the conformation of the side chains given the coordinates of the backbone of a protein. The method we use is based on a variety of dead-end elimination methods and the recently discovered dynamic programming algorithm for this problem. This was implemented in a computer program called Octopus where we use side-chain cover sets with very small values for R, such as 0.1 Å, which ensures that for each amino-acid side chain the set contains a conformation with a root mean-square deviation of, at most, R from the optimal conformation. 
The side-chain dihedral-angle accuracy of the program is comparable to other implementations; however, it has the important advantage that the structures produced by the program have negative energies that are very close to the energies of the crystal structure for all tested proteins.

5.
A graph-theory algorithm for rapid protein side-chain prediction   Cited by: 19 (self-citations: 0, citations by others: 19)
Fast and accurate side-chain conformation prediction is important for homology modeling, ab initio protein structure prediction, and protein design applications. Many methods have been presented, although only a few computer programs are publicly available. The SCWRL program is one such method and is widely used because of its speed, accuracy, and ease of use. A new algorithm for SCWRL is presented that uses results from graph theory to solve the combinatorial problem encountered in the side-chain prediction problem. In this method, side chains are represented as vertices in an undirected graph. Any two residues that have rotamers with nonzero interaction energies are considered to have an edge in the graph. The resulting graph can be partitioned into connected subgraphs with no edges between them. These subgraphs can in turn be broken into biconnected components, which are graphs that cannot be disconnected by removal of a single vertex. The combinatorial problem is reduced to finding the minimum energy of these small biconnected components and combining the results to identify the global minimum energy conformation. This algorithm is able to complete predictions on a set of 180 proteins with 34342 side chains in <7 min of computer time. The total χ1 and χ1+2 dihedral angle accuracies are 82.6% and 73.7% using a simple energy function based on the backbone-dependent rotamer library and a linear repulsive steric energy. The new algorithm will allow for use of SCWRL in more demanding applications such as sequence design and ab initio structure prediction, as well as the addition of a more complex energy function and conformational flexibility, leading to increased accuracy.
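The first decomposition step in this abstract (residues as vertices, an edge wherever some rotamer pair interacts, then a split into connected subgraphs) can be sketched as follows. This is an illustrative sketch, not SCWRL code: the function names and the `pair_energies` layout are assumptions, and the further split into biconnected components is not shown.

```python
from collections import defaultdict


def interaction_graph(pair_energies):
    """Build an undirected residue graph.

    pair_energies maps (res_i, rot_i, res_j, rot_j) -> interaction energy
    (a hypothetical layout). Two residues share an edge if any of their
    rotamer pairs has a nonzero interaction energy.
    """
    graph = defaultdict(set)
    for (res_i, _rot_i, res_j, _rot_j), energy in pair_energies.items():
        if energy != 0.0 and res_i != res_j:
            graph[res_i].add(res_j)
            graph[res_j].add(res_i)
    return graph


def connected_components(residues, graph):
    """Split residues into independent clusters with an iterative depth-first search."""
    seen = set()
    components = []
    for start in residues:
        if start in seen:
            continue
        stack = [start]
        seen.add(start)
        component = []
        while stack:
            node = stack.pop()
            component.append(node)
            for neighbor in graph.get(node, ()):
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        components.append(sorted(component))
    return components


pair_energies = {
    ("A", 0, "B", 1): 2.5,   # A and B interact
    ("B", 0, "C", 0): 0.0,   # zero energy: no edge between B and C
    ("C", 1, "D", 0): 1.0,   # C and D interact
}
graph = interaction_graph(pair_energies)
clusters = connected_components(["A", "B", "C", "D", "E"], graph)
# clusters == [["A", "B"], ["C", "D"], ["E"]]
```

Each cluster can then be optimized independently, because no rotamer choice in one cluster changes the energy of another.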

6.
Side-chain modeling with an optimized scoring function   Cited by: 1 (self-citations: 0, citations by others: 1)
Modeling side-chain conformations on a fixed protein backbone has a wide application in structure prediction and molecular design. Each effort in this field requires decisions about a rotamer set, scoring function, and search strategy. We have developed a new and simple scoring function, which operates on side-chain rotamers and consists of the following energy terms: contact surface, volume overlap, backbone dependency, electrostatic interactions, and desolvation energy. The weights of these energy terms were optimized to achieve the minimal average root mean square (rms) deviation between the lowest energy rotamer and real side-chain conformation on a training set of high-resolution protein structures. In the course of optimization, for every residue, its side chain was replaced by varying rotamers, whereas conformations for all other residues were kept as they appeared in the crystal structure. We obtained prediction accuracy of 90.4% for χ1, 78.3% for χ1+2, and 1.18 Å overall rms deviation. Furthermore, the derived scoring function combined with a Monte Carlo search algorithm was used to place all side chains onto a protein backbone simultaneously. The average prediction accuracy was 87.9% for χ1, 73.2% for χ1+2, and 1.34 Å rms deviation for 30 protein structures. Our approach was compared with available side-chain construction methods and showed improvement over the best among them: 4.4% for χ1, 4.7% for χ1+2, and 0.21 Å for rms deviation. We hypothesize that the scoring function instead of the search strategy is the main obstacle in side-chain modeling. Additionally, we show that a more detailed rotamer library is expected to increase χ1+2 prediction accuracy but may have little effect on χ1 prediction accuracy.

7.
The ab initio folding problem can be divided into two sequential tasks of approximately equal computational complexity: the generation of native-like backbone folds and the positioning of side chains upon these backbones. The prediction of side-chain conformation in this context is challenging, because at best only the near-native global fold of the protein is known. To test the effect of displacements in the protein backbones on side-chain prediction for folds generated ab initio, sets of near-native backbones (≤ 4 Å Cα RMS error) for four small proteins were generated by two methods. The steric environment surrounding each residue was probed by placing the side chains in the native conformation on each of these decoys, followed by torsion-space optimization to remove steric clashes on a rigid backbone. We observe that on average 40% of the χ1 angles were displaced by 40° or more, effectively setting the limits in accuracy for side-chain modeling under these conditions. Three different algorithms were subsequently used for prediction of side-chain conformation. The average prediction accuracy for the three methods was remarkably similar: 49% to 51% of the χ1 angles were predicted correctly overall (33% to 36% of the χ1+2 angles). Interestingly, when the inter-side-chain interactions were disregarded, the mean accuracy increased. A consensus approach is described, in which side-chain conformations are defined based on the most frequently predicted χ angles for a given method upon each set of near-native backbones. We find that consensus modeling, which de facto includes backbone flexibility, improves side-chain prediction: χ1 accuracy improved to 51–54% (36–42% of χ1+2). Implications of a consensus method for ab initio protein structure prediction are discussed. Proteins 33:204–217, 1998. © 1998 Wiley-Liss, Inc.
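The χ1 and χ1+2 accuracy figures quoted in these abstracts can be computed along the following lines. This is a sketch under stated assumptions: the abstract reports displacements of "40° or more", so a 40° tolerance is assumed here as the correctness cutoff, and the function names and the `(chi1, chi2)` tuple layout are hypothetical.

```python
def angle_diff(a, b):
    """Smallest absolute difference between two angles, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def chi_accuracy(predicted, native, tolerance=40.0):
    """Return (chi1 accuracy, chi1+2 accuracy) as fractions.

    predicted and native are equal-length lists of (chi1, chi2) tuples in
    degrees; chi2 is None for residues without a second side-chain dihedral.
    chi1+2 counts a residue as correct only if both angles are within tolerance.
    """
    if not predicted or len(predicted) != len(native):
        raise ValueError("need two non-empty lists of equal length")
    chi1_ok = chi12_ok = chi12_total = 0
    for (p1, p2), (n1, n2) in zip(predicted, native):
        ok1 = angle_diff(p1, n1) < tolerance
        chi1_ok += ok1
        if p2 is not None and n2 is not None:
            chi12_total += 1
            chi12_ok += ok1 and angle_diff(p2, n2) < tolerance
    chi12 = chi12_ok / chi12_total if chi12_total else None
    return chi1_ok / len(predicted), chi12


predicted = [(-60.0, 180.0), (175.0, None), (60.0, 60.0)]
native = [(-65.0, 170.0), (-175.0, None), (-60.0, 65.0)]
chi1_acc, chi12_acc = chi_accuracy(predicted, native)
```

Taking the difference modulo 360° matters here: 175° and -175° are only 10° apart, so the second residue counts as correct.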

8.
Here we report an orientation-dependent statistical all-atom potential derived from side-chain packing, named OPUS-PSP. It features a basis set of 19 rigid-body blocks extracted from the chemical structures of all 20 amino acid residues. The potential is generated from the orientation-specific packing statistics of pairs of those blocks in a non-redundant structural database. The purpose of such an approach is to capture the essential elements of orientation dependence in molecular packing interactions. Tests of OPUS-PSP on commonly used decoy sets demonstrate that it significantly outperforms most of the existing knowledge-based potentials in terms of both its ability to recognize native structures and consistency in achieving high Z-scores across decoy sets. As OPUS-PSP excludes interactions among main-chain atoms, its success highlights the crucial importance of side-chain packing in forming native protein structures. Moreover, OPUS-PSP does not explicitly include solvation terms, and thus the potential should perform well when the solvation effect is difficult to determine, such as in membrane proteins. Overall, OPUS-PSP is a generally applicable potential for protein structure modeling, especially for handling side-chain conformations, one of the most difficult steps in high-accuracy protein structure prediction and refinement.

9.
Chung SY  Subbiah S 《Proteins》1999,35(2):184-194
The precision and accuracy of protein structures determined by nuclear magnetic resonance (NMR) spectroscopy depend on the completeness of input experimental data set. Typically, rather than a single structure, an ensemble of up to 20 equally representative conformers is generated and routinely deposited in the Protein Database. There are substantially more experimentally derived restraints available to define the main-chain coordinates than those of the side chains. Consequently, the side-chain conformations among the conformers are more variable and less well defined than those of the backbone. Even when a side chain is determined with high precision and is found to adopt very similar orientations among all the conformers in the ensemble, it is possible that its orientation might still be incorrect. Thus, it would be helpful if there were a method to assess independently the side-chain orientations determined by NMR. Recently, homology modeling by side-chain packing algorithms has been shown to be successful in predicting the side-chain conformations of the buried residues for a protein when the main-chain coordinates and sequence information are given. Since the main-chain coordinates determined by NMR are consistently more reliable than those of the side-chains, we have applied the side-chain packing algorithms to predict side-chain conformations that are compatible with the NMR-derived backbone. Using four test cases where the NMR solution structures and the X-ray crystal structure of the same protein are available, we demonstrate that the side-chain packing method can provide independent validation for the side-chain conformations of NMR structures. Comparison of the side-chain conformations derived by side-chain packing prediction and by NMR spectroscopy demonstrates that when there is agreement between the NMR model and the predicted model, on average 78% of the time the X-ray structure also concurs. 
While the side-chain packing method can confirm the reliable residue conformations in NMR models, more importantly, it can also identify the questionable residue conformations with an accuracy of 60%. This validation method can serve to increase the confidence level for potential users of structural models determined by NMR.

10.
MOTIVATION: The protein side-chain conformation problem is a central problem in proteomics with wide applications in protein structure prediction and design. Computational complexity results show that the problem is hard to solve. Yet, instances from realistic applications are large and demand fast and reliable algorithms. RESULTS: We propose a new global optimization algorithm, which for the first time integrates residue reduction and rotamer reduction techniques previously developed for the protein side-chain conformation problem. We show that the proposed approach dramatically simplifies the topology of the underlying residue graph. Computations show that our algorithm solves problems using only 1-10% of the time required by the mixed-integer linear programming approach available in the literature. In addition, on a set of hard side-chain conformation problems, our algorithm runs 2-78 times faster than SCWRL 3.0, which is widely used for solving these problems. AVAILABILITY: The implementation is available as an online server at http://eudoxus.scs.uiuc.edu/r3.html

11.
Modeling of loops in protein structures   Cited by: 27 (self-citations: 0, citations by others: 27)
Comparative protein structure prediction is limited mostly by the errors in alignment and loop modeling. We describe here a new automated modeling technique that significantly improves the accuracy of loop predictions in protein structures. The positions of all nonhydrogen atoms of the loop are optimized in a fixed environment with respect to a pseudo energy function. The energy is a sum of many spatial restraints that include the bond length, bond angle, and improper dihedral angle terms from the CHARMM-22 force field, statistical preferences for the main-chain and side-chain dihedral angles, and statistical preferences for nonbonded atomic contacts that depend on the two atom types, their distance through space, and separation in sequence. The energy function is optimized with the method of conjugate gradients combined with molecular dynamics and simulated annealing. Typically, the predicted loop conformation corresponds to the lowest energy conformation among 500 independent optimizations. Predictions were made for 40 loops of known structure at each length from 1 to 14 residues. The accuracy of loop predictions is evaluated as a function of thoroughness of conformational sampling, loop length, and structural properties of native loops. When accuracy is measured by local superposition of the model on the native loop, 100, 90, and 30% of 4-, 8-, and 12-residue loop predictions, respectively, had <2 Å RMSD error for the main-chain N, Cα, C, and O atoms; the average accuracies were 0.59 ± 0.05, 1.16 ± 0.10, and 2.61 ± 0.16 Å, respectively. To simulate real comparative modeling problems, the method was also evaluated by predicting loops of known structure in only approximately correct environments with errors typical of comparative modeling without misalignment. When the RMSD distortion of the main-chain stem atoms is 2.5 Å, the average loop prediction error increased by 180, 25, and 3% for 4-, 8-, and 12-residue loops, respectively. 
The accuracy of the lowest energy prediction for a given loop can be estimated from the structural variability among a number of low energy predictions. The relative value of the present method is gauged by (1) comparing it with one of the most successful previously described methods, and (2) describing its accuracy in recent blind predictions of protein structure. Finally, it is shown that the average accuracy of prediction is limited primarily by the accuracy of the energy function rather than by the extent of conformational sampling.

12.
13.
Finding the minimum energy amino acid side-chain conformation is a fundamental problem in both homology modeling and protein design. To address this issue, numerous computational algorithms have been proposed. However, there have been few quantitative comparisons between methods and there is very little general understanding of the types of problems that are appropriate for each algorithm. Here, we study four common search techniques: Monte Carlo (MC) and Monte Carlo plus quench (MCQ); genetic algorithms (GA); self-consistent mean field (SCMF); and dead-end elimination (DEE). Both SCMF and DEE are deterministic, and if DEE converges, it is guaranteed that its solution is the global minimum energy conformation (GMEC). This provides a means to compare the accuracy of SCMF and the stochastic methods. For the side-chain placement calculations, we find that DEE rapidly converges to the GMEC in all the test cases. The other algorithms converge on significantly incorrect solutions; the average fraction of incorrect rotamers for SCMF is 0.12, GA 0.09, and MCQ 0.05. For the protein design calculations, design positions are progressively added to the side-chain placement calculation until the time required for DEE diverges sharply. As the complexity of the problem increases, the accuracy of each method is determined so that the results can be extrapolated into the region where DEE is no longer tractable. We find that both SCMF and MCQ perform reasonably well on core calculations (fraction amino acids incorrect is SCMF 0.07, MCQ 0.04), but fail considerably on the boundary (SCMF 0.28, MCQ 0.32) and surface calculations (SCMF 0.37, MCQ 0.44).
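To illustrate why a converged DEE run yields the GMEC, the following is a minimal sketch of the original dead-end elimination criterion: rotamer r at position i can be discarded if its best-case energy is still worse than the worst-case energy of some alternative rotamer t at the same position. The abstract does not say which DEE variants were used, so this shows only the basic single-rotamer criterion; the function name and data layout are assumptions.

```python
def dee_prune(self_energy, pair_energy):
    """Iteratively apply the basic dead-end elimination criterion.

    self_energy[i][r] is the self energy of rotamer r at position i.
    pair_energy[(i, r, j, s)] is the pairwise energy; missing pairs count
    as 0.0 and either key order is accepted (a hypothetical layout).
    Returns a dict mapping each position to its surviving rotamers.
    """
    alive = {i: set(rots) for i, rots in self_energy.items()}

    def pair(i, r, j, s):
        if (i, r, j, s) in pair_energy:
            return pair_energy[(i, r, j, s)]
        return pair_energy.get((j, s, i, r), 0.0)

    changed = True
    while changed:
        changed = False
        for i in alive:
            others = [j for j in alive if j != i]
            for r in sorted(alive[i]):
                # Best case for r: most favorable partner at every other position.
                best_r = self_energy[i][r] + sum(
                    min(pair(i, r, j, s) for s in alive[j]) for j in others
                )
                for t in sorted(alive[i]):
                    if t == r:
                        continue
                    # Worst case for t: least favorable partner everywhere.
                    worst_t = self_energy[i][t] + sum(
                        max(pair(i, t, j, s) for s in alive[j]) for j in others
                    )
                    if best_r > worst_t:
                        alive[i].discard(r)  # r can never be part of the GMEC
                        changed = True
                        break
    return alive


self_energy = {0: {"a": 0.0, "b": 10.0}, 1: {"x": 0.0, "y": 0.0}}
pair_energy = {
    (0, "a", 1, "x"): -1.0,
    (0, "a", 1, "y"): 1.0,
    (0, "b", 1, "x"): 0.0,
    (0, "b", 1, "y"): 0.0,
}
survivors = dee_prune(self_energy, pair_energy)
# survivors == {0: {"a"}, 1: {"x"}}
```

When every position is reduced to a single rotamer, as in this toy case, the remaining assignment is the global minimum, because each eliminated rotamer was provably worse than an alternative in every possible context.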

14.
The role of crystal packing in determining the observed conformations of amino acid side-chains in protein crystals is investigated by (1) analysis of a database of proteins that have been crystallized in different unit cells (space group or unit cell dimensions) and (2) theoretical predictions of side-chain conformations with the crystal environment explicitly represented. Both of these approaches indicate that the crystal environment plays an important role in determining the conformations of polar side-chains on the surfaces of proteins. Inclusion of the crystal environment permits a more sensitive measurement of the achievable accuracy of side-chain prediction programs, when validating against structures obtained by X-ray crystallography. Our side-chain prediction program uses an all-atom force field and a Generalized Born model of solvation and is thus capable of modeling simple packing effects (i.e. van der Waals interactions), electrostatic effects, and desolvation, which are all important mechanisms by which the crystal environment impacts observed side-chain conformations. Our results are also relevant to the understanding of changes in side-chain conformation that may result from ligand docking and protein-protein association, insofar as the results reveal how side-chain conformations change in response to their local environment.

15.
Liu Z  Jiang L  Gao Y  Liang S  Chen H  Han Y  Lai L 《Proteins》2003,50(1):49-62
The disturbing genetic algorithm, incorporating the disturbing mutation process into the genetic algorithm flow, has been developed to extend the searching space of side-chain conformations and to improve the quality of the rotamer library. Moreover, the growing generation amount idea, simulating the real situation of the natural evolution, is introduced to improve the searching speed. In the calculations using the pseudo energy scoring function of the root mean square deviation, the disturbing genetic algorithm method has been shown to be highly efficient. With the real energy function based on the AMBER force field, the program has been applied to rebuilding side-chain conformations of 25 high-quality crystallographic structures of single-protein and protein-protein complexes. The averaged root mean square deviation of atom coordinates in side-chains and the accuracies of the χ1 and χ1+2 torsion angles are 1.165 Å, 88.2%, and 72.9% for the buried residues, respectively, and 1.493 Å, 79.2%, and 64.7% for all residues, showing that the method is as precise as the program SCWRL, whereas it performs better in the prediction of buried residues and protein-protein interfaces. This method has been successfully used in redesigning the interface of the Barnase-Barstar complex, indicating that it will have extensive application in protein design, protein sequence and structure relationship studies, and research on protein-protein interaction.

16.
The application of all-atom force fields (and explicit or implicit solvent models) to protein homology-modeling tasks such as side-chain and loop prediction remains challenging both because of the expense of the individual energy calculations and because of the difficulty of sampling the rugged all-atom energy surface. Here we address this challenge for the problem of loop prediction through the development of numerous new algorithms, with an emphasis on multiscale and hierarchical techniques. As a first step in evaluating the performance of our loop prediction algorithm, we have applied it to the problem of reconstructing loops in native structures; we also explicitly include crystal packing to provide a fair comparison with crystal structures. In brief, large numbers of loops are generated by using a dihedral angle-based buildup procedure followed by iterative cycles of clustering, side-chain optimization, and complete energy minimization of selected loop structures. We evaluate this method by using the largest test set yet used for validation of a loop prediction method, with a total of 833 loops ranging from 4 to 12 residues in length. Average/median backbone root-mean-square deviations (RMSDs) to the native structures (superimposing the body of the protein, not the loop itself) are 0.42/0.24 Å for 5-residue loops, 1.00/0.44 Å for 8-residue loops, and 2.47/1.83 Å for 11-residue loops. Median RMSDs are substantially lower than the averages because of a small number of outliers; the causes of these failures are examined in some detail, and many can be attributed to errors in assignment of protonation states of titratable residues, omission of ligands from the simulation, and, in a few cases, probable errors in the experimentally determined structures. When these obvious problems in the data sets are filtered out, average RMSDs to the native structures improve to 0.43 Å for 5-residue loops, 0.84 Å for 8-residue loops, and 1.63 Å for 11-residue loops. 
In the vast majority of cases, the method locates energy minima that are lower than or equal to that of the minimized native loop, thus indicating that sampling rarely limits prediction accuracy. The overall results are, to our knowledge, the best reported to date, and we attribute this success to the combination of an accurate all-atom energy function, efficient methods for loop buildup and side-chain optimization, and, especially for the longer loops, the hierarchical refinement protocol.

17.
We combine a new, extremely fast technique to generate a library of low energy structures of an oligopeptide (by using mutually orthogonal Latin squares to sample its conformational space) with a genetic algorithm to predict protein structures. The protein sequence is divided into oligopeptides, and a structure library is generated for each. These libraries are used in a newly defined mutation operator that, together with variation, crossover, and diversity operators, is used in a modified genetic algorithm to make the prediction. Application to five small proteins has yielded near-native structures.

18.
π-π, cation-π, and hydrophobic packing interactions contribute specificity to protein folding and stability to the native state. As a step towards developing improved models of these interactions in proteins, we compare the side-chain packing arrangements in native proteins to those found in compact decoys produced by the Rosetta de novo structure prediction method. We find enrichments in the native distributions for T-shaped and parallel offset arrangements of aromatic residue pairs, in parallel stacked arrangements of cation-aromatic pairs, in parallel stacked pairs involving proline residues, and in parallel offset arrangements for aliphatic residue pairs. We then investigate the extent to which the distinctive features of native packing can be explained using Lennard-Jones and electrostatics models. Finally, we derive orientation-dependent π-π, cation-π, and hydrophobic interaction potentials based on the differences between the native and compact decoy distributions and investigate their efficacy for high-resolution protein structure prediction. Surprisingly, the orientation-dependent potential derived from the packing arrangements of aliphatic side-chain pairs distinguishes the native structure from compact decoys better than the orientation-dependent potentials describing π-π and cation-π interactions.

19.
We present a computational approach for predicting structures of ligand-protein complexes and analyzing binding energy landscapes that combines a Monte Carlo simulated annealing technique, used to determine the ligand bound conformation, with the dead-end elimination algorithm for side-chain optimization of the protein active site residues. Flexible ligand docking and optimization of mobile protein side-chains have been performed to predict structural effects in the V32I/I47V/V82I HIV-1 protease mutant bound with the SB203386 ligand and in the V82A HIV-1 protease mutant bound with the A77003 ligand. The computational structure predictions are consistent with the crystal structures of these ligand-protein complexes. The emerging relationships between ligand docking and side-chain optimization of the active site residues are rationalized based on the analysis of the ligand-protein binding energy landscape. Proteins 33:295–310, 1998. © 1998 Wiley-Liss, Inc.

20.
It is widely believed that the dominant force opposing protein folding is the entropic cost of restricting internal rotations. The energetic changes from restricting side-chain torsional motion are more complex than simply a loss of conformational entropy, however. A second force opposing protein folding arises when a side-chain in the folded state is not in its lowest-energy rotamer, giving rotameric strain. Chi strain energy results from a dihedral angle being shifted from the most stable conformation of a rotamer when a protein folds. We calculated the energy of a side-chain as a function of its dihedral angles in a poly(Ala) helix. Using these energy profiles, we quantify conformational entropy, rotameric strain energy and chi strain energy for all 17 amino acid residues with side-chains in alpha-helices. We can calculate these terms for any amino acid in a helix interior in a protein, as a function of its side-chain dihedral angles, and have implemented this algorithm on a web page. The mean change in rotameric strain energy on folding is 0.42 kcal mol⁻¹ per residue and the mean chi strain energy is 0.64 kcal mol⁻¹ per residue. Loss of conformational entropy opposes folding by a mean of 1.1 kcal mol⁻¹ per residue, and the mean total force opposing restricting a side-chain into a helix is 2.2 kcal mol⁻¹. Conformational entropy estimates alone therefore greatly underestimate the forces opposing protein folding. The introduction of strain when a protein folds should not be neglected when attempting to quantify the balance of forces affecting protein stability. Consideration of rotameric strain energy may help the use of rotamer libraries in protein design and rationalise the effects of mutations where side-chain conformations change.
