Similar articles
 20 similar articles found
1.
Gao C, Stern HA. Proteins 2007, 68(1):67-75
We perform a systematic examination of the ability of several different high-resolution, atomic-detail scoring functions to discriminate native conformations of loops in membrane proteins from non-native but physically reasonable, or "decoy," conformations. Decoys constructed from changing a loop conformation while keeping the remainder of the protein fixed are a challenging test of energy function accuracy. Nevertheless, the best of the energy functions we examined recognized the native structure as lowest in energy around half the time, and consistently chose it as a low-energy structure. This suggests that the best of present energy functions, even without a representation of the lipid bilayer, are of sufficient accuracy to give reasonable confidence in predictions of membrane protein structure. We also constructed homology models for each structure, using other known structures in the same protein family as templates. Homology models were constructed using several scoring functions and modeling programs, but with a comparable sampling effort for each procedure. Our results indicate that the quality of sequence alignment is probably the most important factor in model accuracy for sequence identity from 20-40%; one can expect a reasonably accurate model for membrane proteins when sequence identity is greater than 30%, in agreement with previous studies. Most errors are localized in loop regions, which tend to be found outside the lipid bilayer. For the most discriminative energy functions, it appears that errors are most likely due to lack of sufficient sampling, although it should be stressed that present energy functions are still far from perfectly reliable.
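The decoy-discrimination test described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' protocol: the function names and the energies are made up, and the Z-score is just one common way to summarize how well the native is separated from the decoys.

```python
from statistics import mean, pstdev

def native_rank(native_energy, decoy_energies):
    """1-based rank of the native when all conformations are sorted
    by energy (rank 1 means the native has the lowest energy)."""
    return 1 + sum(1 for e in decoy_energies if e < native_energy)

def native_z_score(native_energy, decoy_energies):
    """Native energy relative to the decoy distribution, in units of
    the decoy standard deviation (more negative = better separated)."""
    spread = pstdev(decoy_energies)
    if spread == 0:
        raise ValueError("decoy energies have zero spread")
    return (native_energy - mean(decoy_energies)) / spread

# Hypothetical energies for one loop target (arbitrary units).
decoys = [-95.0, -101.5, -88.2, -99.0]
print(native_rank(-100.0, decoys))
print(native_z_score(-100.0, decoys))
```

A scoring function that "recognizes the native as lowest in energy" gives rank 1; one that "consistently chooses it as a low-energy structure" gives small ranks and negative Z-scores across targets.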

2.
The conformations of loops are determined by the water-mediated interactions between amino acid residues. Energy functions that describe these interactions can be derived either from physical principles (physics-based energy functions) or from statistical analysis of known protein structures (knowledge-based statistical potentials). It is commonly believed that statistical potentials are appropriate for coarse-grained representations of proteins but are not as accurate as physics-based potentials when atomic resolution is required. Several recent applications of physics-based energy functions to loop selection appear to support this view. In this article, we apply a recently developed DFIRE-based statistical potential to three different loop decoy sets (the RAPPER, Jacobson, and Forrest-Woolf sets). Together with a rotamer library for side-chain optimization, the performance of the DFIRE-based potential on the RAPPER decoy set (385 loop targets) is comparable to that of AMBER/GBSA for short loops (two to eight residues). DFIRE is more accurate for longer loops (9 to 12 residues). A similar trend is observed when comparing DFIRE with another physics-based energy function, OPLS/SGB-NP, on the large Jacobson decoy set (788 loop targets). In the Forrest-Woolf decoy set for the loops of membrane proteins, the DFIRE potential performs substantially better than the combination of the CHARMM force field with several solvation models. The results suggest that a single-term DFIRE statistical energy function can provide accurate loop prediction at a fraction of the computing cost required for more complicated physics-based energy functions. A Web server for academic users has been established for loop selection in the softwares/services section of the Web site http://theory.med.buffalo.edu/.

3.
Protein decoy data sets provide a benchmark for testing scoring functions designed for fold recognition and protein homology modeling problems. It is commonly believed that statistical potentials based on reduced atomic models are better able to discriminate native-like from misfolded decoys than scoring functions based on more detailed molecular mechanics models. Recent benchmark tests on small data sets, however, suggest otherwise. In this work, we report the results of extensive decoy detection tests using an effective free energy function based on the OPLS all-atom (OPLS-AA) force field and the Surface Generalized Born (SGB) model for the solvent electrostatic effects. The OPLS-AA/SGB effective free energy is used as a scoring function to detect native protein folds among a total of 48,832 decoys for 32 different proteins from Park and Levitt's 4-state-reduced, Levitt's local-minima, Baker's ROSETTA all-atom, and Skolnick's decoy sets. All structures are locally minimized without restraints. From an analysis of the individual energy components of the OPLS-AA/SGB energy function for the native and the best-ranked decoy, it is determined that a balance of the terms of the potential is responsible for the minimized energies that most successfully distinguish the native from the misfolded conformations. Different combinations of individual energy terms provide less discrimination than the total energy. The results are consistent with observations that all-atom molecular potentials coupled with intermediate-level solvent dielectric models are competitive with knowledge-based potentials for decoy detection and protein modeling problems such as fold recognition and homology modeling.

4.
Limitations in protein homology modeling often arise from the inability to adequately model loops. In this paper we focus on the selection of loop conformations. We present a complete computational treatment that allows the screening of loop conformations to identify those that best fit a molecular model. The stability of a loop in a protein is evaluated via computations of conformational free energies in solution, i.e., the free energy difference between the reference structure and the modeled one. A thermodynamic cycle is used for calculation of the conformational free energy, in which the total free energy of the reference state (i.e., gas phase) is the CHARMm potential energy. The electrostatic contribution of the solvation free energy is obtained from solving the finite-difference Poisson-Boltzmann equation. The nonpolar contribution is based on a surface area-based expression. We applied this computational scheme to a simple but well-characterized system, the antibody hypervariable loop (complementarity-determining region, CDR). Instead of creating loop conformations, we generated a database of loops extracted from high-resolution crystal structures of proteins, which display geometrical similarities with antibody CDRs. We inserted loops from our database into a framework of an antibody; then we calculated the conformational free energies of each loop. Results show that we successfully identified loops with a "reference-like" CDR geometry, with the lowest conformational free energy in gas phase only. Surprisingly, the solvation energy term plays a confusing role, sometimes discriminating "reference-like" CDR geometry and many times allowing "non-reference-like" conformations to have the lowest conformational free energies (for short loops). Most "reference-like" loop conformations are separated from others by a gap in the gas phase conformational free energy scale. 
Naturally, loops from antibody molecules are found to be the best models for long CDRs (≥ 6 residues), mainly because of a better packing of backbone atoms into the framework of the antibody model.
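The thermodynamic cycle described in this abstract can be written out as a worked expression. This is a sketch under stated assumptions: the symbols are introduced here for illustration, and the linear surface-area form with a coefficient $\gamma$ is a common choice that the abstract does not spell out.

```latex
% G(X): total free energy of conformation X (assumed decomposition)
%   E_gas(X)  : gas-phase CHARMm potential energy
%   G_elec(X) : electrostatic solvation term from the finite-difference
%               Poisson-Boltzmann equation
%   gamma A(X): nonpolar term, assumed proportional to surface area A
\begin{align}
G(X) &= E_{\mathrm{gas}}(X) + G_{\mathrm{elec}}^{\mathrm{PB}}(X) + \gamma\, A(X) \\
\Delta G_{\mathrm{conf}} &= G(X_{\mathrm{model}}) - G(X_{\mathrm{ref}})
\end{align}
```

Ranking loops "in gas phase only" corresponds to keeping just the $E_{\mathrm{gas}}$ term in the difference.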

5.
In this paper we discuss the problem of including solvation free energies in evaluating the relative stabilities of loops in proteins. A conformational search based on a gas-phase potential function is used to generate a large number of trial conformations. As has been found previously, the energy minimization step in this process tends to pack charged and polar side chains against the protein surface, resulting in conformations which are unstable in the aqueous phase. Various solvation models can easily identify such structures. In order to provide a more severe test of solvation models, gas phase conformations were generated in which side chains were kept extended so as to maximize their interaction with the solvent. The free energies of these conformations were compared to that calculated for the crystal structure in three loops of the protein E. coli RNase H, with lengths of 7, 8, and 9 residues. Free energies were evaluated with a finite difference Poisson-Boltzmann (FDPB) calculation for electrostatics and a surface area-based term for nonpolar contributions. These were added to a gas-phase potential function. A free energy function based on atomic solvation parameters was also tested. Both functions were quite successful in selecting, based on a free energy criterion, conformations quite close to the crystal structure for two of the three loops. For one loop, which is involved in crystal contacts, conformations that are quite different from the crystal structure were also selected. A method to avoid precision problems associated with using the FDPB method to evaluate conformational free energies in proteins is described. © 1994 John Wiley & Sons, Inc.

6.
In the prediction of protein structure from amino acid sequence, loops are challenging regions for computational methods. Since loops are often located on the protein surface, they can have significant roles in determining protein functions and binding properties. Loop prediction without the aid of a structural template requires extensive conformational sampling and energy minimization, which are computationally difficult. In this article we present a new de novo loop sampling method, the Parallely filtered Energy Targeted All-atom Loop Sampler (PETALS) to rapidly locate low energy conformations. PETALS explores both backbone and side-chain positions of the loop region simultaneously according to the energy function selected by the user, and constructs a nonredundant ensemble of low energy loop conformations using filtering criteria. The method is illustrated with the DFIRE potential and DiSGro energy function for loops, and shown to be highly effective at discovering conformations with near-native (or better) energy. Using the same energy function as the DiSGro algorithm, PETALS samples conformations with both lower RMSDs and lower energies. PETALS is also useful for assessing the accuracy of different energy functions. PETALS runs rapidly, requiring an average time cost of 10 minutes for a length 12 loop on a single 3.2 GHz processor core, comparable to the fastest existing de novo methods for generating an ensemble of conformations. Proteins 2017; 85:1402–1412. © 2017 Wiley Periodicals, Inc.

7.
We have improved the original Rosetta centroid/backbone decoy set by increasing the number of proteins and frequency of near native models and by building on sidechains and minimizing clashes. The new set consists of 1,400 model structures for 78 different and diverse protein targets and provides a challenging set for the testing and evaluation of scoring functions. We evaluated the extent to which a variety of all-atom energy functions could identify the native and close-to-native structures in the new decoy sets. Of various implicit solvent models, we found that a solvent-accessible surface area-based solvation provided the best enrichment and discrimination of close-to-native decoys. The combination of this solvation treatment with Lennard Jones terms and the original Rosetta energy provided better enrichment and discrimination than any of the individual terms. The results also highlight the differences in accuracy of NMR and X-ray crystal structures: a large energy gap was observed between native and non-native conformations for X-ray structures but not for NMR structures.

8.
Continuum solvent models such as the Generalized-Born and Poisson–Boltzmann methods hold the promise of treating solvation effects efficiently and of enabling rapid scoring of protein structures when combined with physics-based energy functions. Yet a direct comparison of these two approaches on a large protein data set is lacking. Building on our previous work with a scoring function based on a Generalized-Born (GB) solvation model and short molecular-dynamics simulations, we extended the scoring function to use the MM-PBSA method for the solvent effect and compared the two. We benchmarked this scoring function against seven publicly available decoy sets. We found that, somewhat surprisingly, the results of the MM-PBSA approach are comparable to those of the previous GB-based scoring function. We also discuss how the presence of large ligands and ions in some native structures of the decoy sets affects the accuracy of the scoring function.

9.
A refinement protocol based on physics-based techniques established for water-soluble proteins is tested for membrane protein structures. Initial structures were generated by homology modeling and sampled via molecular dynamics simulations in explicit lipid bilayer and aqueous solvent systems. Snapshots from the simulations were selected based on scoring with either knowledge-based or implicit membrane-based scoring functions and averaged to obtain refined models. The protocol resulted in consistent and significant refinement of the membrane protein structures, similar to the performance of refinement methods for soluble proteins. Refinement success was similar between sampling in the presence of lipid bilayers and in aqueous solvent, but the presence of lipid bilayers may benefit the improvement of lipid-facing residues. Scoring with knowledge-based functions (DFIRE and RWplus) was found to be as good as scoring with implicit membrane-based scoring functions, suggesting that differences in internal packing are more important than orientations relative to the membrane during the refinement of membrane protein homology models.

10.
Comparative or homology modeling of a target protein based on sequence similarity to a protein with known structure is widely used to provide structural models of proteins. Depending on the target-template similarity, these model structures may contain regions of limited structural accuracy. In principle, molecular dynamics (MD) simulations can be used to refine protein model structures and also to model loop regions that connect structurally conserved regions, but this approach is limited by the currently accessible simulation time scales. A recently developed biasing potential replica exchange (BP-REMD) method was used to refine loops and complete decoy protein structures at atomic resolution including explicit solvent. In standard REMD simulations, several replicas of a system are run in parallel at different temperatures, allowing exchanges at preset time intervals. In a BP-REMD simulation, replicas are controlled by various levels of a biasing potential to reduce the energy barriers associated with peptide backbone dihedral transitions. The method requires far fewer replicas for efficient sampling than temperature REMD (T-REMD). Application of the approach to several protein loops indicated improved conformational sampling of the backbone dihedral angles of loop residues compared to conventional MD simulations. BP-REMD refinement simulations on several test cases starting from decoy structures deviating significantly from the native structure resulted in final structures in much closer agreement with experiment compared to conventional MD simulations. Proteins 2010. © 2010 Wiley-Liss, Inc.
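The exchange step of standard temperature REMD mentioned in this abstract can be sketched with the usual Metropolis criterion. This is a minimal illustration of T-REMD only, not of the BP-REMD method itself; the function name, the units (kcal/mol and kelvin), and the example energies are assumptions made here.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K)

def exchange_probability(energy_i, temp_i, energy_j, temp_j):
    """Metropolis acceptance probability for swapping the configurations
    of two replicas held at temperatures temp_i and temp_j:
    min(1, exp[(beta_i - beta_j) * (E_i - E_j)])."""
    beta_i = 1.0 / (K_B * temp_i)
    beta_j = 1.0 / (K_B * temp_j)
    delta = (beta_i - beta_j) * (energy_i - energy_j)
    if delta >= 0.0:
        return 1.0  # also avoids overflow in exp for large delta
    return math.exp(delta)

# The colder replica currently holds the higher-energy configuration,
# so the swap is always accepted.
print(exchange_probability(-100.0, 300.0, -120.0, 350.0))
# The reverse situation is accepted only occasionally.
print(exchange_probability(-120.0, 300.0, -100.0, 350.0))
```

In a biasing-potential variant the replicas would differ in their Hamiltonians rather than their temperatures, so the criterion would compare cross-evaluated energies instead; that case is not shown here.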

11.
12.
One of the common methods for assessing energy functions of proteins is selection of native or near-native structures from decoys. This is an efficient but indirect test of the energy functions because decoy structures are typically generated either by sampling procedures or by a separate energy function. As a result, these decoys may not contain the global minimum structure that reflects the true folding accuracy of the energy functions. This paper proposes to assess energy functions by ab initio refolding of fully unfolded terminal segments with secondary structures while keeping the rest of the proteins fixed in their native conformations. Global energy minimization of these short unfolded segments, a challenging yet tractable problem, is a direct test of the energy functions. As an illustrative example, refolding terminal segments is employed to assess two closely related all-atom statistical energy functions, DFIRE (distance-scaled, finite, ideal-gas reference state) and DOPE (discrete optimized protein energy). We found that a simple sequence-position dependence contained in the DOPE energy function leads to an intrinsic bias toward the formation of helical structures. Meanwhile, a finer statistical treatment of short-range interactions yields a significant improvement in the accuracy of segment refolding by DFIRE. The updated DFIRE energy function yields success rates of 100% and 67%, respectively, for its ability to sample and fold fully unfolded terminal segments of 15 proteins to within 3.5 Å global root-mean-squared distance from the corresponding native structures. The updated DFIRE energy function is available as DFIRE 2.0 upon request.

13.
Statistical potential for assessment and prediction of protein structures   Total citations: 2 (self-citations: 0, citations by others: 2)
Protein structures in the Protein Data Bank provide a wealth of data about the interactions that determine the native states of proteins. Using probability theory, we derive an atomic distance-dependent statistical potential from a sample of native structures that does not depend on any adjustable parameters (Discrete Optimized Protein Energy, or DOPE). DOPE is based on an improved reference state that corresponds to noninteracting atoms in a homogeneous sphere with the radius dependent on a sample native structure; it thus accounts for the finite and spherical shape of the native structures. The DOPE potential was extracted from a nonredundant set of 1472 crystallographic structures. We tested DOPE and five other scoring functions by the detection of the native state among six multiple target decoy sets, the correlation between the score and model error, and the identification of the most accurate non-native structure in the decoy set. For all decoy sets, DOPE is the best performing function in terms of all criteria, except for a tie in one criterion for one decoy set. To facilitate its use in various applications, such as model assessment, loop modeling, and fitting into cryo-electron microscopy mass density maps combined with comparative protein structure modeling, DOPE was incorporated into the modeling package MODELLER-8.
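The general idea of a distance-dependent statistical potential can be sketched with the inverse-Boltzmann relation u(r) = -kT ln(N_obs(r) / N_exp(r)). This is a generic illustration, not the DOPE derivation: the expected counts stand in for whatever reference state is chosen, the homogeneous-sphere reference state of DOPE is not implemented, and the function name and counts are made up.

```python
import math

def statistical_potential(observed_counts, expected_counts, kT=1.0):
    """Per-bin potential u(r_k) = -kT * ln(N_obs(r_k) / N_exp(r_k)).

    observed_counts: atom-pair counts per distance bin in native structures.
    expected_counts: counts per bin predicted by the reference state.
    Bins without statistics are returned as None.
    """
    potential = []
    for n_obs, n_exp in zip(observed_counts, expected_counts):
        if n_obs <= 0 or n_exp <= 0:
            potential.append(None)
        else:
            potential.append(-kT * math.log(n_obs / n_exp))
    return potential

# Hypothetical counts for three distance bins of one atom-pair type.
print(statistical_potential([10, 20, 0], [10, 10, 5]))
```

Distances observed more often than the reference state predicts get a negative (favorable) value; the choice of reference state is what distinguishes potentials such as DFIRE and DOPE.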

14.
Flexible loop regions of proteins play a crucial role in many biological functions such as protein–ligand recognition, enzymatic catalysis, and protein–protein association. To date, most computational methods that predict the conformational states of loops only focus on individual loop regions. However, loop regions are often spatially in close proximity to one another and their mutual interactions stabilize their conformations. We have developed a new method, titled CorLps, capable of simultaneously predicting such interacting loop regions. First, an ensemble of individual loop conformations is generated for each loop region. The members of the individual ensembles are combined and are accepted or rejected based on a steric clash filter. After a subsequent side-chain optimization step, the resulting conformations of the interacting loops are ranked by the statistical scoring function DFIRE that originated from protein structure prediction. Our results show that predicting interacting loops with CorLps is superior to sequential prediction of the two interacting loop regions, and our method is comparable in accuracy to single loop predictions. Furthermore, improved predictive accuracy of the top-ranked solution is achieved for 12-residue length loop regions by diversifying the initial pool of individual loop conformations using a quality threshold clustering algorithm. Proteins 2010. © 2010 Wiley-Liss, Inc.

15.
Protein loops are often involved in important biological functions such as molecular recognition, signal transduction, or enzymatic action. The three dimensional structures of loops can provide essential information for understanding molecular mechanisms behind protein functions. In this article, we develop a novel method for protein loop modeling, where the loop conformations are generated by fragment assembly and analytical loop closure. The fragment assembly method reduces the conformational space drastically, and the analytical loop closure method finds the geometrically consistent loop conformations efficiently. We also derive an analytic formula for the gradient of any analytical function of dihedral angles in the space of closed loops. The gradient can be used to optimize various restraints derived from experiments or databases, for example restraints for preferential interactions between specific residues or for preferred backbone angles. We demonstrate that the current loop modeling method outperforms previous methods that employ residue-based torsion angle maps or different loop closure strategies when tested on two sets of loop targets of lengths ranging from 4 to 12. Proteins 2010. © 2010 Wiley-Liss, Inc.

16.
We present loop structure prediction results of the intracellular and extracellular loops of four G-protein-coupled receptors (GPCRs): bovine rhodopsin (bRh), the turkey β1-adrenergic (β1Ar), the human β2-adrenergic (β2Ar) and the human A2a adenosine receptor (A2Ar) in perturbed environments. We used the protein local optimization program, which builds thousands of loop candidates by sampling rotamer states of the loops' constituent amino acids. The candidate loops are discriminated using our physics-based, all-atom energy function, which is based on the OPLS force field with implicit solvent and several correction terms. For relevant cases, explicit membrane molecules are included to simulate the effect of the membrane on loop structure. We also discuss a new sampling algorithm that divides phase space into different regions, allowing more thorough sampling of long loops that greatly improves results. In the first half of the paper, loop prediction is done with the GPCRs' transmembrane domains fixed in their crystallographic positions, while the loops are built one-by-one. Side chains near the loops are also in non-native conformations. The second half describes a full homology model of β2Ar using β1Ar as a template. No information about the crystal structure of β2Ar was used to build this homology model. We are able to capture the architecture of short loops and the very long second extracellular loop, which is key for ligand binding. We believe this is the first successful example of an RMSD-validated, physics-based loop prediction in the context of a GPCR homology model. Proteins 2013. © 2012 Wiley Periodicals, Inc.

17.
Lee MC, Duan Y. Proteins 2004, 55(3):620-634
Recent work has shown the ability of physics-based potentials (e.g., CHARMM and OPLS-AA) and energy minimization to differentiate native protein structures from large ensembles of non-native structures. In this study, we extended previous work by other authors and developed an energy scoring function using a new set of AMBER parameters (also recently developed in our laboratory) in conjunction with molecular dynamics and the Generalized Born solvent model. We evaluated the performance of our new scoring function by examining its ability to distinguish between native and decoy protein structures. Here we present a systematic comparison of our results with those obtained by previous authors using other physics-based potentials. A total of 7 decoy sets, 117 protein sequences, and more than 41,000 structures were evaluated. The results of our study showed that our new scoring function represents a significant improvement over previously published physics-based scoring functions.

18.
Loops in proteins are flexible regions connecting regular secondary structures. They are often involved in protein functions through interacting with other molecules. The irregularity and flexibility of loops make their structures difficult to determine experimentally and challenging to model computationally. Conformation sampling and energy evaluation are the two key components in loop modeling. We have developed a new method for loop conformation sampling and prediction based on a chain growth sequential Monte Carlo sampling strategy, called Distance-guided Sequential chain-Growth Monte Carlo (DiSGro). With an energy function designed specifically for loops, our method can efficiently generate high quality loop conformations with low energy that are enriched with near-native loop structures. The average minimum global backbone RMSD for 1,000 conformations of 12-residue loops is Å, with a lowest energy RMSD of Å, and an average ensemble RMSD of Å. A novel geometric criterion is applied to speed up calculations. The computational cost of generating 1,000 conformations for each of the x loops in a benchmark dataset is only about cpu minutes for 12-residue loops, compared to ca cpu minutes using the FALCm method. Test results on benchmark datasets show that DiSGro performs comparably to or better than previous successful methods, while requiring far less computing time. DiSGro is especially effective in modeling longer loops (– residues).
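The global backbone RMSD used as the accuracy measure in this abstract can be sketched as below. The sketch assumes the usual loop-modeling convention that the rest of the protein is already superimposed, so the loop atoms are compared directly without refitting; the function name and coordinates are illustrative only.

```python
import math

def global_rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally ordered lists of
    (x, y, z) atom coordinates, with no superposition applied."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared / len(coords_a))

# Hypothetical two-atom example: the model is shifted by 1 Å along x.
native = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)]
model = [(1.0, 0.0, 0.0), (2.0, 1.0, 1.0)]
print(global_rmsd(native, model))
```

The "minimum", "lowest energy", and "ensemble" RMSDs quoted for a sampled set are then the smallest value, the value for the lowest-energy conformation, and the mean over all conformations, respectively.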

19.
Loops are regions of nonrepetitive conformation connecting regular secondary structures. We identified 2,024 loops of one to eight residues in length, with acceptable main-chain bond lengths and peptide bond angles, from a database of 223 protein and protein-domain structures. Each loop is characterized by its sequence, main-chain conformation, and relative disposition of its bounding secondary structures as described by the separation between the tips of their axes and the angle between them. Loops, grouped according to their length and type of their bounding secondary structures, were superposed and clustered into 161 conformational classes, corresponding to 63% of all loops. Of these, 109 (51% of the loops) were populated by at least four nonhomologous loops or four loops sharing a low sequence identity. Another 52 classes, including 12% of the loops, were populated by at least three loops of low sequence similarity from three or fewer nonhomologous groups. Loop class suprafamilies resulting from variations in the termini of secondary structures are discussed in this article. Most previously described loop conformations were found among the classes. New classes included a 2:4 type IV hairpin, a helix-capping loop, and a loop that mediates dinucleotide-binding. The relative disposition of bounding secondary structures varies among loop classes, with some classes such as beta-hairpins being very restrictive. For each class, sequence preferences at key residue positions were identified; the residues found more frequently at these conserved positions than in proteins in general were Gly, Asp, Pro, Phe, and Cys. Most of these residues are involved in stabilizing loop conformation, often through a positive phi conformation or secondary structure capping. Identification of helix-capping residues and beta-breakers among the highly conserved positions supported our decision to group loops according to their bounding secondary structures.
Several of the identified loop classes were associated with specific functions, and all of the member loops had the same function; key residues were conserved for this purpose, as is the case for the parvalbumin-like calcium-binding loops. A significant number, but not all, of the member loops of other loop classes had the same function, as is the case for the helix-turn-helix DNA-binding loops. This article provides a systematic and coherent conformational classification of loops, covering a broad range of lengths and all four combinations of bounding secondary structure types, and supplies a useful basis for modelling of loop conformations where the bounding secondary structures are known or reliably predicted.

20.
Zhu J, Zhu Q, Shi Y, Liu H. Proteins 2003, 52(4):598-608
One strategy for ab initio protein structure prediction is to generate a large number of possible structures (decoys) and select the most fitting ones based on a scoring or free energy function. The conformational space of a protein is huge, and chances are rare that any heuristically generated structure will directly fall in the neighborhood of the native structure. It is desirable that, instead of being thrown away, the unfitting decoy structures can provide insights into native structures so prediction can be made progressively. First, we demonstrate that a recently parameterized physics-based effective free energy function based on the GROMOS96 force field and a generalized Born/surface area solvent model is, as several other physics-based and knowledge-based models, capable of distinguishing native structures from decoy structures for a number of widely used decoy databases. Second, we observe a substantial increase in correlations of the effective free energies with the degree of similarity between the decoys and the native structure, if the similarity is measured by the content of native inter-residue contacts in a decoy structure rather than its root-mean-square deviation from the native structure. Finally, we investigate the possibility of predicting native contacts based on the frequency of occurrence of contacts in decoy structures. For most proteins contained in the decoy databases, a meaningful amount of native contacts can be predicted based on plain frequencies of occurrence at a relatively high level of accuracy. Relative to using plain frequencies, overwhelming improvements in sensitivity of the predictions are observed for the 4_state_reduced decoy sets by applying energy-dependent weighting of decoy structures in determining the frequency. There, approximately 80% native contacts can be predicted at an accuracy of approximately 80% using energy-weighted frequencies. The sensitivity of the plain frequency approach is much lower (20% to 40%). 
Such improvements are, however, not observed for the other decoy databases. The rationalization and implications of the results are discussed.
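The contact-frequency idea in this abstract can be sketched as follows. The abstract does not give the form of the energy-dependent weighting, so the Boltzmann-like weight exp(-(E - E_min)/kT) used here is an assumption, as are the function name, the kT value, and the toy data.

```python
import math

def contact_frequencies(decoy_contacts, energies=None, kT=1.0):
    """Frequency of each residue-residue contact across a decoy set.

    decoy_contacts: list of sets of (i, j) residue pairs, one set per decoy.
    energies: optional list of decoy energies; when given, each decoy is
              weighted by exp(-(E - E_min) / kT) instead of equally.
    """
    if energies is None:
        weights = [1.0] * len(decoy_contacts)
    else:
        e_min = min(energies)
        weights = [math.exp(-(e - e_min) / kT) for e in energies]
    total = sum(weights)
    freq = {}
    for contacts, weight in zip(decoy_contacts, weights):
        for pair in contacts:
            freq[pair] = freq.get(pair, 0.0) + weight / total
    return freq

# Three hypothetical decoys; the third is much higher in energy.
decoys = [{(1, 5)}, {(1, 5), (2, 8)}, {(2, 8)}]
print(contact_frequencies(decoys))
print(contact_frequencies(decoys, energies=[0.0, 0.0, 100.0]))
```

Contacts whose frequency exceeds a chosen threshold would be predicted as native; with energy weighting, contacts that appear mainly in high-energy decoys are suppressed.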
