Similar articles
Found 20 similar articles (search time: 93 ms)
1.
Predicting the conformations of loops is a critical aspect of protein comparative (homology) modeling. Despite considerable advances in developing loop prediction algorithms, refining loops in homology models remains challenging. In this work, we use antibodies as a model system to investigate strategies for more robustly predicting loop conformations when the protein model contains errors in the conformations of side chains and protein backbone surrounding the loop in question. Specifically, our test system consists of partial models of antibodies in which the “scaffold” (i.e., the portion other than the complementarity determining region, CDR, loops) retains native backbone conformation, whereas the CDR loops are predicted using a combination of knowledge‐based modeling (H1, H2, L1, L2, and L3) and ab initio loop prediction (H3). H3 is the most variable of the CDRs. Using a previously published method, a test set of 10 shorter H3 loops (5–7 residues) is predicted to an average backbone (N, Cα, C, O) RMSD of 2.7 Å while 11 longer loops (8–9 residues) are predicted to 5.1 Å, thus recapitulating the difficulties in refining loops in models. By contrast, in control calculations predicting the same loops in crystal structures, the same method reconstructs the loops to an average of 0.5 and 1.4 Å for the shorter and longer loops, respectively. We modify the loop prediction method to improve the ability to sample near‐native loop conformations in the models, primarily by reducing the sensitivity of the sampling to the loop surroundings, and allowing the other CDR loops to optimize with the H3 loop. The new method improves the average accuracy significantly to 1.3 Å RMSD and 3.1 Å RMSD for the shorter and longer loops, respectively. Finally, we present results predicting 8–10 residue loops within complete comparative models of five nonantibody proteins. While anecdotal, these mixed, full‐model results suggest our approach is a promising step toward more accurately predicting loops in homology models. Furthermore, while significant challenges remain, our method is a potentially useful tool for predicting antibody structures based on a known Fv scaffold. Proteins 2010. © 2010 Wiley‐Liss, Inc.
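The accuracies quoted in this abstract are backbone RMSD values. As a minimal sketch of how such a number can be computed, the Python function below takes two equal-length lists of (x, y, z) coordinates, for example the N, Cα, C, and O atoms of a predicted loop and of the reference loop. It assumes both structures are already in the same frame of reference (superimposed on the non-loop part of the protein); the function name and inputs are illustrative and not taken from the paper.

```python
import math


def backbone_rmsd(predicted, reference):
    """Root-mean-square deviation between two coordinate lists.

    Both arguments are lists of (x, y, z) tuples in the same order.
    No superposition is performed here; the caller is assumed to have
    aligned the structures beforehand.
    """
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = 0.0
    for atom_p, atom_r in zip(predicted, reference):
        # Sum the squared differences over the x, y, and z components.
        squared += sum((p - r) ** 2 for p, r in zip(atom_p, atom_r))
    return math.sqrt(squared / len(predicted))
```

Whether the loop itself or the surrounding scaffold is used for the superposition changes the reported value, so that choice should be stated alongside any RMSD.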

2.
Template-based methods for predicting protein structure provide models for a significant portion of the protein but often contain insertions or chain ends (InsEnds) of indeterminate conformation. The local structure prediction "problem" entails modeling the InsEnds onto the rest of the protein. A well-known limit involves predicting loops of ≤12 residues in crystal structures. However, InsEnds may contain as many as ~50 amino acids, and the template-based model of the protein itself may be imperfect. To address these challenges, we present a free modeling method for predicting the local structure of loops and large InsEnds in both crystal structures and template-based models. The approach uses single amino acid torsional angle "pivot" moves of the protein backbone with a Cβ level representation. Nevertheless, our accuracy for loops is comparable to existing methods. We also apply a more stringent test, the blind structure prediction and refinement categories of the CASP9 tournament, where we improve the quality of several homology based models by modeling InsEnds as long as 45 amino acids, sizes generally inaccessible to existing loop prediction methods. Our approach ranks as one of the best in the CASP9 refinement category that involves improving template-based models so that they can function as molecular replacement models to solve the phase problem for crystallographic structure determination.

3.
The ability to determine the structure of a protein in solution is a critical tool for structural biology, as proteins in their native state are found in aqueous environments. Using a physical chemistry based prediction protocol, we demonstrate the ability to reproduce protein loop geometries in experimentally derived solution structures. Predictions were run on loops drawn from (1) NMR entries in the Protein Data Bank (PDB), and from (2) the RECOORD database in which NMR entries from the PDB have been standardized and re-refined in explicit solvent. The predicted structures are validated by comparison with experimental distance restraints, a test of structural quality as defined by the WHAT IF structure validation program, root mean square deviation (RMSD) of the predicted loops to the original structural models, and comparison of precision of the original and predicted ensembles. Results show that for the RECOORD ensembles, the predicted loops are consistent with an average of 95%, 91%, and 87% of experimental restraints for the short, medium and long loops respectively. Prediction accuracy is strongly affected by the quality of the original models, with increases in the percentage of experimental restraints violated of 2% for the short loops, and 9% for both the medium and long loops in the PDB derived ensembles. We anticipate the application of our protocol to theoretical modeling of protein structures, such as fold recognition methods, as well as to experimental determination of protein structures, or segments, for which only sparse NMR restraint data are available.

4.
Modeling protein loops using a phi i + 1, psi i dimer database.   Total citations: 1, self-citations: 1, citations by others: 0
We present an automated method for modeling backbones of protein loops. The method samples a database of $\phi_{i+1}$ and $\psi_i$ angles constructed from a nonredundant version of the Protein Data Bank (PDB). The dihedral angles $\phi_{i+1}$ and $\psi_i$ completely define the backbone conformation of a dimer when standard bond lengths, bond angles, and a trans planar peptide configuration are used. For the 400 possible dimers resulting from 20 natural amino acids, a list of allowed $\phi_{i+1}$, $\psi_i$ pairs for each dimer is created by pooling all such pairs from the loop segments of each protein in the nonredundant version of the PDB. Starting from the N-terminus of the loop sequence, conformations are generated by assigning randomly selected pairs of $\phi_{i+1}$, $\psi_i$ for each dimer from the respective pool using standard bond lengths, bond angles, and a trans peptide configuration. We use this database to simulate protein loops of lengths varying from 5 to 11 amino acids in five proteins of known three-dimensional structures. Typically, 10,000-50,000 models are simulated for each protein loop and are evaluated for stereochemical consistency. Depending on the length and sequence of a given loop, 50-80% of the models generated have no stereochemical strain in the backbone atoms. We demonstrate that, when simulated loops are extended to include flanking residues from homologous segments, only very few loops from an ensemble of sterically allowed conformations orient the flanking segments consistent with the protein topology. The presence of near-native backbone conformations for loops from five different proteins suggests the completeness of the dimeric database for use in modeling loops of homologous proteins. Here, we take advantage of this observation to design a method that filters near-native loop conformations from an ensemble of sterically allowed conformations. We demonstrate that our method eliminates the need for a loop-closure algorithm and hence allows for the use of topological constraints of the homologous proteins or disulfide constraints to filter near-native loop conformations.
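The sampling step described in this abstract (for each dimer along the loop, draw one observed pair of backbone dihedral angles from that dimer's pool) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `TOY_POOL` and its angle values are invented placeholders rather than data from the paper, and the sketch stops at torsion assignment; building coordinates and the stereochemical and topological filtering are left out.

```python
import random

# Hypothetical toy pool: dimer (two one-letter residue codes) mapped to
# observed (psi_i, phi_{i+1}) pairs in degrees. A real pool would cover
# all 400 dimers and be derived from loop segments of a nonredundant PDB set.
TOY_POOL = {
    "GA": [(-40.0, -60.0), (150.0, -120.0)],
    "AS": [(-45.0, -65.0), (140.0, -70.0), (160.0, -150.0)],
}


def sample_loop_torsions(sequence, pool, rng):
    """Assign one (psi_i, phi_{i+1}) pair per dimer, N-terminus first."""
    pairs = []
    for i in range(len(sequence) - 1):
        dimer = sequence[i:i + 2]
        if dimer not in pool:
            raise KeyError(f"no dihedral pairs recorded for dimer {dimer!r}")
        pairs.append(rng.choice(pool[dimer]))
    return pairs


def generate_models(sequence, pool, n_models, seed=0):
    """Generate n_models independent torsion assignments for one loop."""
    rng = random.Random(seed)  # seeded so that a run can be reproduced
    return [sample_loop_torsions(sequence, pool, rng) for _ in range(n_models)]
```

For example, `generate_models("GAS", TOY_POOL, 5)` returns five candidate torsion assignments, each with one pair for the G-A dimer and one for the A-S dimer.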

5.
Consistently predicting biopolymer structure at atomic resolution from sequence alone remains a difficult problem, even for small sub-segments of large proteins. Such loop prediction challenges, which arise frequently in comparative modeling and protein design, can become intractable as loop lengths exceed 10 residues and if surrounding side-chain conformations are erased. Current approaches, such as the protein local optimization protocol or kinematic inversion closure (KIC) Monte Carlo, involve stages that coarse-grain proteins, simplifying modeling but precluding a systematic search of all-atom configurations. This article introduces an alternative modeling strategy based on a ‘stepwise ansatz’, recently developed for RNA modeling, which posits that any realistic all-atom molecular conformation can be built up by residue-by-residue stepwise enumeration. When harnessed to a dynamic-programming-like recursion in the Rosetta framework, the resulting stepwise assembly (SWA) protocol enables enumerative sampling of a 12 residue loop at a significant but achievable cost of thousands of CPU-hours. In a previously established benchmark, SWA recovers crystallographic conformations with sub-Angstrom accuracy for 19 of 20 loops, compared to 14 of 20 by KIC modeling with a comparable expenditure of computational power. Furthermore, SWA gives high accuracy results on an additional set of 15 loops highlighted in the biological literature for their irregularity or unusual length. Successes include cis-Pro touch turns, loops that pass through tunnels of other side-chains, and loops of lengths up to 24 residues. Remaining problem cases are traced to inaccuracies in the Rosetta all-atom energy function. In five additional blind tests, SWA achieves sub-Angstrom accuracy models, including the first such success in a protein/RNA binding interface, the YbxF/kink-turn interaction in the fourth ‘RNA-puzzle’ competition. 
These results establish all-atom enumeration as an unusually systematic approach to ab initio protein structure modeling that can leverage high performance computing and physically realistic energy functions to more consistently achieve atomic accuracy.

6.
The D1 protein of the photosystem II (PS II) complex of the microalga Chaetosphaeridium globosum has been theoretically modelled from its sequence by comparative modeling, using the known backbone structure of the D1 protein from the bacterium Thermosynechococcus vulcanus as the template. The model is built with the missing loops and all side chains, which are not resolved in the structure of the template. The structure of the tetramanganese cluster (TMC) and the ligand-forming side chains have been subjected to modeling studies in order to gather more information useful for understanding the water splitting reactions. Earlier models of the TMC have been scrutinized and an insight into the manganese coordination sphere has been provided.

7.
8.
Membrane proteins (MPs) have become a major focus in structure prediction, due to their medical importance. There is, however, a lack of fast and reliable methods that specialize in the modeling of MP loops. Often methods designed for soluble proteins (SPs) are applied directly to MPs. In this article, we investigate the validity of such an approach in the realm of fragment‐based methods. We also examine the differences in membrane and soluble protein loops that might affect accuracy. We test our ability to predict soluble and MP loops with the previously published method FREAD. We show that it is possible to predict accurately the structure of MP loops using a database of MP fragments (0.5–1 Å median root‐mean‐square deviation). The presence of homologous proteins in the database helps prediction accuracy. However, even when homologues are removed, better results are still achieved using fragments of MPs (0.8–1.6 Å) rather than SPs (1–4 Å) to model MP loops. We find that many fragments of SPs have shapes similar to their MP counterparts but have very different sequences; however, they do not appear to differ in their substitution patterns. Our findings may allow further improvements to fragment‐based loop modeling algorithms for MPs. The current version of our proof‐of‐concept loop modeling protocol produces high‐accuracy loop models for MPs and is available as a web server at http://medeller.info/fread. Proteins 2014; 82:175–186. © 2013 Wiley Periodicals, Inc.

9.
High‐resolution homology models are useful in structure‐based protein engineering applications, especially when a crystallographic structure is unavailable. Here, we report the development and implementation of RosettaAntibody, a protocol for homology modeling of antibody variable regions. The protocol combines comparative modeling of canonical complementarity determining region (CDR) loop conformations and de novo loop modeling of CDR H3 conformation with simultaneous optimization of VL‐VH rigid‐body orientation and CDR backbone and side‐chain conformations. The protocol was tested on a benchmark of 54 antibody crystal structures. The median root mean square deviation (rmsd) of the antigen binding pocket comprised of all the CDR residues was 1.5 Å with 80% of the targets having an rmsd lower than 2.0 Å. The median backbone heavy atom global rmsd of the CDR H3 loop prediction was 1.6, 1.9, 2.4, 3.1, and 6.0 Å for very short (4–6 residues), short (7–9), medium (10–11), long (12–14) and very long (17–22) loops, respectively. When the set of ten top‐scoring antibody homology models are used in local ensemble docking to antigen, a moderate‐to‐high accuracy docking prediction was achieved in seven of fifteen targets. This success in computational docking with high‐resolution homology models is encouraging, but challenges still remain in modeling antibody structures for sequences with long H3 loops. This first large‐scale antibody–antigen docking study using homology models reveals the level of “functional accuracy” of these structural models toward protein engineering applications. Proteins 2009; 74:497–514. © 2008 Wiley‐Liss, Inc.

10.
Rohl CA, Strauss CE, Chivian D, Baker D. Proteins 2004, 55(3):656-677
A major limitation of current comparative modeling methods is the accuracy with which regions that are structurally divergent from homologues of known structure can be modeled. Because structural differences between homologous proteins are responsible for variations in protein function and specificity, the ability to model these differences has important functional consequences. Although existing methods can provide reasonably accurate models of short loop regions, modeling longer structurally divergent regions is an unsolved problem. Here we describe a method based on the de novo structure prediction algorithm, Rosetta, for predicting conformations of structurally divergent regions in comparative models. Initial conformations for short segments are selected from the protein structure database, whereas longer segments are built up by using three- and nine-residue fragments drawn from the database and combined by using the Rosetta algorithm. A gap closure term in the potential in combination with modified Newton's method for gradient descent minimization is used to ensure continuity of the peptide backbone. Conformations of variable regions are refined in the context of a fixed template structure using Monte Carlo minimization together with rapid repacking of side-chains to iteratively optimize backbone torsion angles and side-chain rotamers. For short loops, mean accuracies of 0.69, 1.45, and 3.62 Å are obtained for 4, 8, and 12 residue loops, respectively. In addition, the method can provide reasonable models of conformations of longer protein segments: predicted conformations of 3 Å root-mean-square deviation or better were obtained for 5 of 10 examples of segments ranging from 13 to 34 residues. In combination with a sequence alignment algorithm, this method generates complete, ungapped models of protein structures, including regions both similar to and divergent from a homologous structure. This combined method was used to make predictions for 28 protein domains in the Critical Assessment of Protein Structure 4 (CASP 4) and 59 domains in CASP 5, where the method ranked highly among comparative modeling and fold recognition methods. Model accuracy in these blind predictions is dominated by alignment quality, but in the context of accurate alignments, long protein segments can be accurately modeled. Notably, the method correctly predicted the local structure of a 39-residue insertion into a TIM barrel in CASP 5 target T0186.

11.
In spite of the tremendous increase in the rate at which protein structures are being determined, there is still an enormous gap between the numbers of known DNA-derived sequences and the numbers of three-dimensional structures. In order to shed light on the biological functions of the molecules, researchers often resort to comparative molecular modeling. Earlier work has shown that when the sequence alignment is in error, then the comparative model is guaranteed to be wrong. In addition, loops, the sites of insertions and deletions in families of homologous proteins, are exceedingly difficult to model. Thus, many of the current problems in comparative molecular modeling are minor versions of the global protein folding problem. In order to assess objectively the current state of comparative molecular modeling, 13 groups submitted blind predictions of seven different proteins of undisclosed tertiary structure. This assessment shows that where sequence identity between the target and the template structure is high (> 70%), comparative molecular modeling is highly successful. On the other hand, automated modeling techniques and sophisticated energy minimization methods fail to improve upon the starting structures when the sequence identity is low (~30%). Based on these results it appears that insertions and deletions are still major problems. Successfully deducing the correct sequence alignment when the local similarity is low is still difficult. We suggest some minimal testing of submitted coordinates that should be required of authors before papers on comparative molecular modeling are accepted for publication in journals. © 1995 Wiley-Liss, Inc.

12.
G-protein-coupled receptors (GPCRs) play key roles in living organisms. Therefore, it is important to determine their functional structures. The second extracellular loop (ECL2) is a functionally important region of GPCRs, which poses a significant challenge for computational structure prediction methods. In this work, we evaluated CABS, a well-established protein modeling tool, for predicting ECL2 structure in 13 GPCRs. The ECL2s (with between 13 and 34 residues) are predicted in an environment in which the other extracellular loops are fully flexible and the transmembrane domain is fixed in its x-ray conformation. The modeling procedure used theoretical predictions of ECL2 secondary structure and experimental constraints on disulfide bridges. Our approach yielded ensembles of low-energy conformers and the most populated conformers that contained models close to the available x-ray structures. The level of similarity between the predicted models and x-ray structures is comparable to that of other state-of-the-art computational methods. Our results extend other studies by including newly crystallized GPCRs.

14.

Background  

Template-target sequence alignment and loop modeling are key components of protein comparative modeling. Short loops can be predicted with high accuracy using structural fragments from other, not necessarily homologous proteins, or by various minimization methods. For longer loops, multiscale approaches employing coarse-grained de novo modeling techniques should be more effective.

15.
Ab initio modeling of small, medium, and large loops in proteins.   Total citations: 1, self-citations: 0, citations by others: 1
This study presents different procedures for ab initio modeling of peptide loops of different sizes in proteins. Small loops (up to 8–12 residues) were generated by a straightforward procedure with subsequent "averaging" over all the low-energy conformers obtained. The averaged conformer fairly represents the entire set of low-energy conformers, root mean square deviation (RMSD) values being from 1.01 Å for a 4-residue loop to 1.94 Å for an 8-residue loop. Three-dimensional (3D) structures for several medium loops (20–30 residues) and for two large loops (54 and 61 residues) were predicted using residue-residue contact matrices divided into variable parts corresponding to the loops, and into a constant part corresponding to the known core of the protein. For each medium loop, a very limited number of sterically reasonable Cα traces (from 1 to 3) was found; RMSD values ranged from 2.4 to 5.9 Å. Single Cα traces predicted for each of the large loops possessed RMSD values of 4.5 Å. Generally, ab initio loop modeling presented in this work combines elements of computational procedures developed both for protein folding and for peptide conformational analysis.

16.
17.
18.
Li X, Jacobson MP, Friesner RA. Proteins 2004, 55(2):368-382
We have developed a new method for predicting helix positions in globular proteins that is intended primarily for comparative modeling and other applications where high precision is required. Unlike helix packing algorithms designed for ab initio folding, we assume that knowledge is available about the qualitative placement of all helices. However, even among homologous proteins, the corresponding helices can demonstrate substantial differences in positions and orientations, and for this reason, improperly positioned helices can contribute significantly to the overall backbone root-mean-square deviation (RMSD) of comparative models. A helix packing algorithm for use in comparative modeling must obtain high precision to be useful, and for this reason we utilize an all-atom protein force field (OPLS) and a Generalized Born continuum solvent model. To reduce the computational expense associated with using a detailed, physics-based energy function, we have developed new hierarchical and multiscale algorithms for sampling the helices and flanking loops. We validate the method using a test suite of 33 cases, which are drawn from a diverse set of high-resolution crystal structures. The helix positions are reproduced with an average backbone RMSD of 0.6 Å, while the average backbone RMSD of the complete loop-helix-loop region (i.e., the helix with the surrounding loops, which are also repredicted) is 1.3 Å.

19.
In protein structure prediction, it is often the case that a protein segment must be adjusted to connect two fixed segments. This occurs during loop structure prediction in homology modeling as well as in ab initio structure prediction. Several algorithms for this purpose are based on the inverse Jacobian of the distance constraints with respect to dihedral angle degrees of freedom. These algorithms are sometimes unstable and fail to converge. We present an algorithm developed originally for inverse kinematics applications in robotics. In robotics, an end effector in the form of a robot hand must reach for an object in space by altering adjustable joint angles and arm lengths. In loop prediction, dihedral angles must be adjusted to move the C-terminal residue of a segment to superimpose on a fixed anchor residue in the protein structure. The algorithm, referred to as cyclic coordinate descent or CCD, involves adjusting one dihedral angle at a time to minimize the sum of the squared distances between three backbone atoms of the moving C-terminal anchor and the corresponding atoms in the fixed C-terminal anchor. The result is an equation in one variable for the proposed change in each dihedral. The algorithm proceeds iteratively through all of the adjustable dihedral angles from the N-terminal to the C-terminal end of the loop. CCD is suitable as a component of loop prediction methods that generate large numbers of trial structures. It succeeds in closing loops in a large test set 99.79% of the time, and fails occasionally only for short, highly extended loops. It is very fast, closing loops of length 8 in 0.037 sec on average.
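The core CCD step described in this abstract (rotate about one bond at a time by the angle that minimizes the summed squared distance between three moving atoms and their fixed targets) can be sketched in Python as follows. This is a simplified illustration, not the authors' implementation: the chain is a generic list of points in which every bond is treated as rotatable, whereas a real protein backbone would restrict rotation to the phi and psi bonds. All function names are hypothetical.

```python
import math


def sub(a, b): return (a[0] - b[0], a[1] - b[1], a[2] - b[2])
def add(a, b): return (a[0] + b[0], a[1] + b[1], a[2] + b[2])
def scale(a, s): return (a[0] * s, a[1] * s, a[2] * s)
def dot(a, b): return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def rotate(point, origin, axis, angle):
    """Rotate point about the unit axis through origin (Rodrigues formula)."""
    v = sub(point, origin)
    c, s = math.cos(angle), math.sin(angle)
    rotated = add(add(scale(v, c), scale(cross(axis, v), s)),
                  scale(axis, dot(axis, v) * (1.0 - c)))
    return add(origin, rotated)


def closure_error(chain, targets):
    """Sum of squared distances between the last three points and the targets."""
    return sum(dot(sub(m, f), sub(m, f)) for m, f in zip(chain[-3:], targets))


def ccd_close(chain, targets, max_sweeps=200, tol=1e-6):
    """Cyclic coordinate descent over every bond, first bond to last."""
    chain = list(chain)
    for _ in range(max_sweeps):
        for k in range(len(chain) - 2):
            origin = chain[k]
            bond = sub(chain[k + 1], origin)
            axis = scale(bond, 1.0 / math.sqrt(dot(bond, bond)))
            a = b = 0.0
            for m, f in zip(chain[-3:], targets):
                v = sub(m, origin)
                r = sub(v, scale(axis, dot(axis, v)))  # component perpendicular to the axis
                fv = sub(f, origin)
                a += dot(fv, r)
                b += dot(fv, cross(axis, r))
            angle = math.atan2(b, a)  # the one-variable optimum for this bond
            for j in range(k + 2, len(chain)):
                chain[j] = rotate(chain[j], origin, axis, angle)
        if closure_error(chain, targets) < tol:
            break
    return chain


# Toy example: the targets are reachable by a single rotation about bond 1-2.
chain = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (2, 1, 0),
         (2, 1, 1), (3, 1, 1), (3, 2, 1), (4, 2, 1)]
targets = [rotate(p, chain[1], (0.0, 1.0, 0.0), 0.7) for p in chain[-3:]]
closed = ccd_close(chain, targets)
```

Because each step picks the best angle for one bond with the others held fixed, the closure error can never increase from one step to the next, which is consistent with the stability the abstract claims over inverse-Jacobian methods.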

20.
Das B, Meirovitch H. Proteins 2003, 51(3):470-483
A new procedure for optimizing parameters of implicit solvation models introduced by us has been applied successfully first to cyclic peptides and more recently to three surface loops of ribonuclease A (Das and Meirovitch, Proteins 2001;43:303-314) using the simplified model $E_{tot} = E_{FF}(\epsilon = nr) + \sum_i \sigma_i A_i$, where $\sigma_i$ are atomic solvation parameters (ASPs) to be optimized, $A_i$ is the solvent accessible surface area of atom $i$, and $E_{FF}(\epsilon = nr)$ is the AMBER force-field energy of the loop-loop and loop-template interactions with a distance-dependent dielectric constant, $\epsilon = nr$, where $n$ is a parameter. The loop is free to move while the protein template is held fixed in its X-ray structure; an extensive conformational search for energy minimized loop structures is carried out with our local torsional deformation method. The optimal ASPs and $n$ are those for which the structure with the lowest minimized energy $E_{tot}(n, \sigma_i)$ becomes the experimental X-ray structure, or less strictly, the energy gap between these structures is within 2-3 kcal/mol. To check if a set of ASPs can be defined, which is transferable to a large number of loops, we optimize individual sets of ASPs (based on $n = 2$) for 12 surface loops from which an "averaged" best-fit set is defined. This set is then applied to the 12 loops and an independent "test" group of 8 loops leading in most cases to very small RMSD values; thus, this set can be useful for structure prediction of loops in homology modeling. For three loops we also calculate the free energy gaps to find that they are only slightly smaller than their energy counterparts, indicating that only larger $n$ will enable reducing too large gaps. Because of its simplicity, this model allowed carrying out an extensive application of our methodology, providing thereby a large number of benchmark results for comparison with future calculations based on $n > 2$ as well as on more sophisticated solvation models with as yet unknown performance for loops.
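The energy model in this abstract is the sum of a force-field term and a surface-area term weighted by atomic solvation parameters. The Python sketch below shows that arithmetic and the 2-3 kcal/mol acceptance test under stated assumptions: the force-field energy is taken as a precomputed number, ASPs are looked up by a per-atom type label, and every name and numeric value is a hypothetical placeholder rather than a parameter from the paper.

```python
def total_energy(e_ff, asps, atom_types, areas):
    """E_tot = E_FF + sum_i sigma_i * A_i.

    e_ff       -- precomputed force-field energy (kcal/mol)
    asps       -- dict mapping atom type to its solvation parameter sigma
    atom_types -- type label of each atom, in the same order as areas
    areas      -- solvent accessible surface area of each atom (square angstroms)
    """
    if len(atom_types) != len(areas):
        raise ValueError("atom_types and areas must have the same length")
    solvation = sum(asps[t] * a for t, a in zip(atom_types, areas))
    return e_ff + solvation


def within_gap(e_native, e_lowest, max_gap=3.0):
    """True if the native structure lies within max_gap of the lowest energy."""
    return (e_native - e_lowest) <= max_gap


# Placeholder numbers for illustration only.
toy_asps = {"C": 0.01, "O": -0.10}
e_native = total_energy(-100.0, toy_asps, ["C", "O"], [10.0, 20.0])
```

With this layout, optimizing the ASPs amounts to searching over the values in `asps` (and over $n$ inside the force-field term) until `within_gap` holds for the loops in the training set.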
