Similar articles
 Found 20 similar articles (search time: 31 ms)
1.
We describe a web server that provides easy access to the SLoop database of loop conformations connecting elements of protein secondary structure. The loops are classified according to their length, the type of bounding secondary structures and the conformation of the main chain. The current release of the database consists of over 8000 loops of up to 20 residues in length. A loop prediction method, which selects conformers on the basis of the sequence and the positions of the elements of secondary structure, is also implemented. These web pages are freely accessible over the internet at http://www-cryst.bioc.cam.ac.uk/~sloop.

2.
In protein structure prediction, a central problem is defining the structure of a loop connecting two secondary structures. This problem frequently occurs in homology modeling, fold recognition, and in several strategies in ab initio structure prediction. In our previous work, we developed a classification database of structural motifs, ArchDB. The database contains 12,665 clustered loops in 451 structural classes with information about phi-psi angles in the loops and 1492 structural subclasses with the relative locations of the bracing secondary structures. Here we evaluate the extent to which sequence information in the loop database can be used to predict loop structure. Two sequence profiles were used, an HMM profile and a PSSM derived from PSI-BLAST. A jack-knife test was performed by removing homologous loops, as defined by SCOP superfamily, and then predicting against recalculated profiles that take into account only the sequence information. Two scenarios were considered: (1) prediction of structural class, with application in comparative modeling, and (2) prediction of structural subclass, with application in fold recognition and ab initio prediction. For the first scenario, structural class prediction was made directly over loops with X-ray secondary structure assignment, and if we consider the top 20 classes out of 451 possible classes, the best accuracy of prediction is 78.5%. In the second scenario, structural subclass prediction was made over loops using PSI-PRED (Jones, J Mol Biol 1999;292:195-202) secondary structure prediction to define loop boundaries, and if we take into account the top 20 subclasses out of 1492, the best accuracy is 46.7%. Accuracy of loop prediction was also evaluated by means of RMSD calculations.
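
The "top 20 classes" accuracy quoted above counts a prediction as correct when the true class appears anywhere among the 20 best-ranked candidates. A minimal sketch of that measure follows; the function name `top_k_accuracy` and the toy class labels are illustrative assumptions, not taken from the ArchDB work.

```python
def top_k_accuracy(ranked_predictions, true_classes, k):
    """Percentage of loops whose true class is among the k best-ranked predictions.

    ranked_predictions: one list of class labels per loop, best candidate first.
    true_classes: the correct class label for each loop, in the same order.
    """
    if len(ranked_predictions) != len(true_classes) or not true_classes:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(
        1
        for ranked, truth in zip(ranked_predictions, true_classes)
        if truth in ranked[:k]
    )
    return 100.0 * hits / len(true_classes)


# Hypothetical toy data: four loops, three possible classes each.
ranked = [["A", "B", "C"], ["B", "C", "A"], ["C", "A", "B"], ["A", "C", "B"]]
truth = ["B", "A", "C", "B"]
print(top_k_accuracy(ranked, truth, 2))  # true class within the top 2
print(top_k_accuracy(ranked, truth, 3))  # true class within the top 3
```

With k equal to the total number of classes the measure is trivially 100%, so a top-20 figure is only meaningful relative to the 451 classes (or 1492 subclasses) available.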

3.
Loops are regions of nonrepetitive conformation connecting regular secondary structures. We identified 2,024 loops of one to eight residues in length, with acceptable main-chain bond lengths and peptide bond angles, from a database of 223 protein and protein-domain structures. Each loop is characterized by its sequence, main-chain conformation, and relative disposition of its bounding secondary structures as described by the separation between the tips of their axes and the angle between them. Loops, grouped according to their length and type of their bounding secondary structures, were superposed and clustered into 161 conformational classes, corresponding to 63% of all loops. Of these, 109 (51% of the loops) were populated by at least four nonhomologous loops or four loops sharing a low sequence identity. Another 52 classes, including 12% of the loops, were populated by at least three loops of low sequence similarity from three or fewer nonhomologous groups. Loop class suprafamilies resulting from variations in the termini of secondary structures are discussed in this article. Most previously described loop conformations were found among the classes. New classes included a 2:4 type IV hairpin, a helix-capping loop, and a loop that mediates dinucleotide-binding. The relative disposition of bounding secondary structures varies among loop classes, with some classes such as beta-hairpins being very restrictive. For each class, sequence preferences as key residues were identified; those found more frequently at these conserved positions than in proteins in general were Gly, Asp, Pro, Phe, and Cys. Most of these residues are involved in stabilizing loop conformation, often through a positive phi conformation or secondary structure capping. Identification of helix-capping residues and beta-breakers among the highly conserved positions supported our decision to group loops according to their bounding secondary structures. Several of the identified loop classes were associated with specific functions, and all of the member loops had the same function; key residues were conserved for this purpose, as is the case for the parvalbumin-like calcium-binding loops. A significant number, but not all, of the member loops of other loop classes had the same function, as is the case for the helix-turn-helix DNA-binding loops. This article provides a systematic and coherent conformational classification of loops, covering a broad range of lengths and all four combinations of bounding secondary structure types, and supplies a useful basis for modelling of loop conformations where the bounding secondary structures are known or reliably predicted.

4.
One of the most important and challenging tasks in protein modelling is the prediction of loops, as can be seen in the large variety of existing approaches. Loops In Proteins (LIP) is a database that includes all protein segments of a length up to 15 residues contained in the Protein Data Bank (PDB). In this study, the applicability of LIP to loop prediction in the framework of homology modelling is investigated. Searching the database for loop candidates takes less than 1 s on a desktop PC, and ranking them takes a few minutes. This is an order of magnitude faster than most existing procedures. The measure of accuracy is the root mean square deviation (RMSD) with respect to the main-chain atoms after local superposition of target loop and predicted loop. Loops of up to nine residues in length were modelled with a local RMSD <1 Å and those of length up to 14 residues with an accuracy better than 2 Å. The results were compared in detail with a thoroughly evaluated and tested ab initio method published recently and additionally with two further methods for a small loop test set. The LIP method produced very good predictions. In particular, for longer loops it outperformed other methods.
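
The accuracy measure used above, the RMSD over main-chain atoms, can be sketched as follows. This is a minimal illustration that assumes the two loops have already been superposed and that their atoms are listed in the same order; the superposition step itself (for example, a least-squares fit) is not shown, and the coordinates are invented for the example.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equally sized lists of (x, y, z) points.

    Assumes both coordinate sets are already in the same frame (superposed)
    and that atoms are paired by position in the lists.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical main-chain coordinates in Å; the model is shifted 1 Å along z.
native = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (2.0, 1.4, 0.0)]
model = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0), (2.0, 1.4, 1.0)]
print(rmsd(native, model))
```

Whether the loop is superposed on its own atoms (local RMSD) or left in the frame of the surrounding protein changes the value considerably, which is why the abstracts in this list state which convention they use.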

5.
Modeling of loops in protein structures
Comparative protein structure prediction is limited mostly by the errors in alignment and loop modeling. We describe here a new automated modeling technique that significantly improves the accuracy of loop predictions in protein structures. The positions of all nonhydrogen atoms of the loop are optimized in a fixed environment with respect to a pseudo energy function. The energy is a sum of many spatial restraints that include the bond length, bond angle, and improper dihedral angle terms from the CHARMM-22 force field, statistical preferences for the main-chain and side-chain dihedral angles, and statistical preferences for nonbonded atomic contacts that depend on the two atom types, their distance through space, and separation in sequence. The energy function is optimized with the method of conjugate gradients combined with molecular dynamics and simulated annealing. Typically, the predicted loop conformation corresponds to the lowest energy conformation among 500 independent optimizations. Predictions were made for 40 loops of known structure at each length from 1 to 14 residues. The accuracy of loop predictions is evaluated as a function of thoroughness of conformational sampling, loop length, and structural properties of native loops. When accuracy is measured by local superposition of the model on the native loop, 100, 90, and 30% of 4-, 8-, and 12-residue loop predictions, respectively, had <2 Å RMSD error for the main-chain N, Cα, C, and O atoms; the average accuracies were 0.59 ± 0.05, 1.16 ± 0.10, and 2.61 ± 0.16 Å, respectively. To simulate real comparative modeling problems, the method was also evaluated by predicting loops of known structure in only approximately correct environments with errors typical of comparative modeling without misalignment. When the RMSD distortion of the main-chain stem atoms is 2.5 Å, the average loop prediction error increased by 180, 25, and 3% for 4-, 8-, and 12-residue loops, respectively. The accuracy of the lowest energy prediction for a given loop can be estimated from the structural variability among a number of low energy predictions. The relative value of the present method is gauged by (1) comparing it with one of the most successful previously described methods, and (2) describing its accuracy in recent blind predictions of protein structure. Finally, it is shown that the average accuracy of prediction is limited primarily by the accuracy of the energy function rather than by the extent of conformational sampling.

6.
Protein function is intimately linked to protein structure and dynamics, yet experimentally determined structures frequently omit regions within a protein due to indeterminate data, which is often due to protein dynamics. We propose that atomistic molecular dynamics simulations provide a diverse sampling of biologically relevant structures for these missing segments (and beyond) to improve structural modeling and structure prediction. Here we make use of the Dynameomics data warehouse, which contains simulations of representatives of essentially all known protein folds. We developed novel computational methods to efficiently identify, rank and retrieve small peptide structures, or fragments, from this database. We also created a novel data model to analyze and compare large repositories of structural data, such as those contained within the Protein Data Bank and the Dynameomics data warehouse. Our evaluation compares these structural repositories for improving loop predictions and analyzes the utility of our methods and models. Using a standard set of loop structures, containing 510 loops, 30 for each loop length from 4 to 20 residues, we find that the inclusion of Dynameomics structures in fragment‐based methods improves the quality of the loop predictions without being dependent on sequence homology. Depending on loop length, ~25–75% of the best predictions came from the Dynameomics set, resulting in lower main chain root‐mean‐square deviations for all fragment lengths using the combined fragment library. We also provide specific cases where Dynameomics fragments provide better predictions for NMR loop structures than fragments from crystal structures. Online access to these fragment libraries is available at http://www.dynameomics.org/fragments .

7.
Kai Zhu  Tyler Day 《Proteins》2013,81(6):1081-1089
Antibodies have the capability of binding a wide range of antigens due to the diversity of the six loops constituting the complementarity determining region (CDR). Among the six loops, the H3 loop is the most diverse in structure, length, and sequence identity. Prediction of the three‐dimensional structures of antibodies, especially the CDR loops, is an important step in the computational design and engineering of novel antibodies for improved affinity and specificity. Although it has been demonstrated that the conformation of the five non‐H3 loops can be accurately predicted by comparing their sequences against databases of canonical loop conformations, no such connection has been established for H3 loops. In this work, we present the results for ab initio structure prediction of the H3 loop using conformational sampling and energy calculations with the program Prime on a dataset of 53 loops ranging in length from 4 to 22 residues. When the prediction is performed in the crystal environment and including symmetry mates, the median backbone root mean square deviation (RMSD) is 0.5 Å to the crystal structure, with 91% of cases having an RMSD of less than 2.0 Å. When the prediction is performed in a noncrystallographic environment, where the scaffold is constructed by swapping the H3 loops between homologous antibodies, 70% of cases have an RMSD below 2.0 Å. These results show promise for ab initio loop predictions applied to modeling of antibodies. © 2012 Wiley Periodicals, Inc.

8.
J M Chandonia  M Karplus 《Proteins》1999,35(3):293-306
A primary and a secondary neural network are applied to secondary structure and structural class prediction for a database of 681 non-homologous protein chains. A new method of decoding the outputs of the secondary structure prediction network is used to produce an estimate of the probability of finding each type of secondary structure at every position in the sequence. In addition to providing a reliable estimate of the accuracy of the predictions, this method gives a more accurate Q3 (74.6%) than the cutoff method which is commonly used. Use of these predictions in jury methods improves the Q3 to 74.8%, the best available at present. On a database of 126 proteins commonly used for comparison of prediction methods, the jury predictions are 76.6% accurate. An estimate of the overall Q3 for a given sequence is made by averaging the estimated accuracy of the prediction over all residues in the sequence. As an example, the analysis is applied to the target beta-cryptogein, which was a difficult target for ab initio predictions in the CASP2 study; it shows that the prediction made with the present method (62% of residues correct) is close to the expected accuracy (66%) for this protein. The larger database and use of a new network training protocol also improve structural class prediction accuracy to 86%, relative to 80% obtained previously. Secondary structure content is predicted with accuracy comparable to that obtained with spectroscopic methods, such as vibrational or electronic circular dichroism and Fourier transform infrared spectroscopy.
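
Q3, the accuracy figure quoted throughout this abstract, is the percentage of residues assigned to the correct one of three states (helix, strand, coil). A minimal sketch follows; the one-letter state codes H, E and C and the example strings are assumptions made for illustration, not data from the paper.

```python
def q3(observed, predicted):
    """Percentage of residues whose predicted three-state label matches the observed one.

    observed, predicted: equal-length strings of state codes,
    here assumed to be H (helix), E (strand) and C (coil).
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("state strings must be non-empty and of equal length")
    correct = sum(1 for o, p in zip(observed, predicted) if o == p)
    return 100.0 * correct / len(observed)


# Hypothetical ten-residue example with two mispredicted positions.
observed = "HHHHCCEEEE"
predicted = "HHHCCCEEEC"
print(q3(observed, predicted))
```

The per-sequence Q3 estimate described in the abstract differs in that it averages predicted per-residue probabilities of being correct, so it can be computed without knowing the observed structure.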

9.
Li W  Liu Z  Lai L 《Biopolymers》1999,49(6):481-495
A general problem in comparative modeling and protein design is the conformational evaluation of loops with a certain sequence in specific environmental protein frameworks. Loops of different sequences and structures on similar scaffolds are common in the Protein Data Bank (PDB). In order to explore both their structural and sequence diversity, a database of loops connecting similar secondary structure fragments is constructed by searching the database of families of structurally similar proteins and the PDB. A total of 84 loop families having 2-13 residues are found among the well-determined structures of resolution better than 2.5 Å. Eight alpha-alpha, 20 alpha-beta, 19 beta-alpha, and 37 beta-beta families are identified. Every family contains more than 5 loop motifs. In each family, no loops share the same sequence and all the frameworks are well superimposed. Forty-three new loop classes are distinguished in the database. The structural variability of loops in homologous proteins is examined and shown in 44 families. Motif families are characterized with geometric parameters and sequence patterns. The conformations of loops in each family are clustered into subfamilies using the average linkage cluster analysis method. Information such as geometric properties, sequence profile, sequence and structural variability in the loop, structural alignment parameters, sequence similarities, and clustering results is provided. Correlations between the conformation of loops and loop sequence, motif sequence, and global sequence of the PDB chain are examined in order to find how loop structures depend on their sequences and how they are affected by the local and global environment. Strong correlations (R > 0.75) are found in only 24 families. The best R value is 0.98. The database is available through the Internet.

10.
An algorithm has been developed to improve the success rate in the prediction of the secondary structure of proteins by taking into account the predicted class of the proteins. This method has been called the 'double prediction method' and consists of a first prediction of the secondary structure from a new algorithm which uses parameters of the type described by Chou and Fasman, and the prediction of the class of the proteins from their amino acid composition. These two independent predictions allow one to optimize the parameters calculated over the secondary structure database to provide the final prediction of secondary structure. This method has been tested on 59 proteins in the database (i.e. 10,322 residues) and yields 72% success in class prediction, 61.3% of residues correctly predicted for three states (helix, sheet and coil) and a good agreement between observed and predicted contents in secondary structure.

11.
MOTIVATION: As the protein structure database expands, protein loop modeling remains an important and yet challenging problem. Knowledge-based protein loop prediction methods have met with two challenges in methodology development: (1) loop boundaries in protein structures are frequently problematic in constructing length-dependent loop databases for protein loop predictions; (2) knowledge-based modeling of loops of unknown structure requires both aligning a query loop sequence to loop templates and ranking the loop sequence-template matches. RESULTS: We developed a knowledge-based loop prediction method that circumvents the need to construct hierarchically clustered length-dependent loop libraries. The method first predicts local structural fragments of a query loop sequence and then structurally aligns the predicted structural fragments to a set of non-redundant loop structural templates regardless of the loop length. The sequence-template alignments are then quantitatively evaluated with an artificial neural network model trained on a set of predictions with known outcomes. Prediction accuracy benchmarks indicated that the novel procedure provided an alternative approach overcoming the challenges of knowledge-based loop prediction. AVAILABILITY: http://cmb.genomics.sinica.edu.tw

12.
The 2-hydroxycarboxylate transporter (2HCT) family of secondary transporters belongs to a much larger structural class of secondary transporters termed ST3 which contains about 2000 transporters in 32 families. The transporters of the 2HCT family are among the best studied in the class. Here we detect weak sequence similarity between the N- and C-terminal halves of the proteins using a sensitive method which uses a database containing the N- and C-terminal halves of all the sequences in ST3 and involves BLAST searches of each sequence in the database against the whole database. Unrelated families of secondary transporters of the same length and composition were used as controls. The sequence similarity involved major parts of the N- and C-terminal halves and not just a small stretch. The membrane topology of the homologous N- and C-terminal domains was deduced from the experimentally determined topology of the members of the 2HCT family. The domains consist of five transmembrane segments each and have opposite orientations in the membrane. The N terminus of the N-terminal domain is extracellular, while the N terminus of the C-terminal domain is cytoplasmic. The loops between the fourth and fifth transmembrane segment in each domain are well conserved throughout the class and contain a high fraction of residues with small side chains, Gly, Ala and Ser. Experimental work on the citrate transporter CitS in the 2HCT family indicates that the loops are re-entrant or pore loops. The re-entrant loops in the N- and C-terminal domains enter the membrane from opposite sides (trans-re-entrant loops). The combination of inverted membrane topology and trans-re-entrant loops represents a new fold for secondary transporters and resembles the structure of aquaporins and models proposed for Na+/Ca2+ exchangers.

13.
Protein secondary structure predictions and amino acid long range contact map predictions from primary sequence of proteins have been explored to aid in modelling protein tertiary structures. In order to evaluate the usefulness of secondary structure and 3D-residue contact prediction methods to model protein structures we have used the known Q3 (alpha-helix, beta-strands and irregular turns/loops) secondary structure information, along with residue-residue contact information as restraints for MODELLER. We present here results of our modelling studies on the 30 best resolved single domain protein structures of varied lengths. The results show that it is very difficult to obtain useful models even with 100% accurate secondary structure predictions and accurate residue contact predictions for up to 30% of residues in a sequence. The best models that we obtained for proteins of lengths 37, 70, 118, 136 and 193 amino acid residues have RMSDs of 4.17, 5.27, 9.12, 7.89 and 9.69 Å, respectively. The results show that one can obtain better models for proteins that have a high percentage of alpha-helix content. This analysis further shows that the MODELLER restraint optimization program can be useful only if we have truly homologous structure(s) as a template from which it derives numerous restraints, almost identical to the templates used. This analysis also clearly indicates that even if we satisfy several true residue-residue contact distances, up to 30% of their sequence length with fully known secondary structural information, we end up predicting model structures far from their corresponding native structures.

15.
We present a fragment-search based method for predicting loop conformations in protein models. A hierarchical and multidimensional database has been set up that currently classifies 105,950 loop fragments and loop flanking secondary structures. Besides the length of the loops and types of bracing secondary structures the database is organized along four internal coordinates, a distance and three types of angles characterizing the geometry of stem regions. Candidate fragments are selected from this library by matching the length and the types of bracing secondary structures of the query and satisfying the geometrical restraints of the stems. They are subsequently inserted in the query protein framework, where their fit is assessed by the root mean square deviation (r.m.s.d.) of stem regions and by the number of rigid body clashes with the environment. In the final step remaining candidate loops are ranked by a Z-score that combines information on sequence similarity and fit of predicted and observed phi/psi main chain dihedral angle propensities. Confidence Z-score cut-offs were determined for each loop length that identify those predicted fragments that outperform a competitive ab initio method. A web server implements the method, regularly updates the fragment library and performs prediction. Predicted segments are returned, or optionally, these can be completed with side chain reconstruction and subsequently annealed in the environment of the query protein by conjugate gradient minimization. The prediction method was tested on artificially prepared search datasets where all trivial sequence similarities on the SCOP superfamily level were removed. Under these conditions it is possible to predict loops of length 4, 8 and 12 with coverage of 98, 78 and 28% with an r.m.s.d. accuracy of at least 0.22, 1.38 and 2.47 Å, respectively. In a head-to-head comparison on loops extracted from freshly deposited new protein folds the current method outperformed an earlier database search method by a ratio of approximately 5:1.
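
The final ranking step described above combines two sources of evidence into a Z-score. The abstract does not say how the terms are weighted, so the sketch below simply assumes that each score is standardized across the candidate set and the two Z-scores are added; the function names, the candidate names and the scores are all hypothetical.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize values to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All candidates score the same, so none stands out.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def rank_candidates(candidates):
    """Rank (name, sequence_score, dihedral_fit) tuples by the sum of their Z-scores.

    Assumption: higher is better for both scores, and both are weighted equally.
    """
    seq_z = z_scores([c[1] for c in candidates])
    fit_z = z_scores([c[2] for c in candidates])
    combined = [(c[0], s + f) for c, s, f in zip(candidates, seq_z, fit_z)]
    return sorted(combined, key=lambda item: item[1], reverse=True)


# Hypothetical candidate fragments: (name, sequence similarity, phi/psi fit).
candidates = [("frag_a", 10.0, 0.2), ("frag_b", 30.0, 0.9), ("frag_c", 20.0, 0.4)]
ranking = rank_candidates(candidates)
print([name for name, _ in ranking])
```

Standardizing before summing keeps a score with a large numeric range from dominating the ranking; a length-specific cut-off on the combined value, as the abstract describes, could then be applied to the top-ranked candidate.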

16.
In this study, various 400 ps molecular dynamics simulations were conducted to determine the stabilizing effect of O-glycosylation on the secondary structural integrity of the designed alpha-loop-alpha motif, which has the optimal loop length of 7 Gly residues (denoted as N-A16G7A16-C). In general, O-glycosylation stabilizes the structural integrity of the model peptide regardless of the length and position of glycosylation sites because it decreases the opportunity for water molecules to compete for the intramolecular hydrogen bonds. The designed peptide exhibits the highest helicity when residues 11 and 31 are replaced with Ser residues and then O-linked with 3 galactose residues, representing the "face-to-face" glycosylation near the loop. In this case, the loop exhibits an extended conformation and several new hydrogen bonds are observed between the main chain of the loop and the galactose residues, which decreases the fluctuation and increases the stability of the entire peptide. When the glycosylation sites are placed close to the loop, the secondary structural integrity of the alpha-loop-alpha motif increases with the number of galactose residues. In addition, "face-to-face" glycosylation increases the structural integrity of this motif to a greater extent than "back-to-back" glycosylation. However, when the glycosylation sites are placed away from the loop and near the N- and C-termini, no general rule is found for the stabilizing effect.

17.
A common feature of alpha-helices in proteins is a loop at the C-terminal end, with a characteristic hydrogen bond pattern. It is noted that several loops with the same structural features occur independently of alpha-helices; two are even situated at the loop ends of beta-hairpins. The name paperclip is suggested for loops possessing the appropriate hydrogen bonds. A number of features of paperclips are described: they exist in two classes, depending on the number of residues at the loop end; one class is much more common than the other. Two paperclips are found that belong to the common class, except that the main-chain conformation of each is the mirror image of that normally found. The majority of paperclips are shown to have tightly clustered sets of main-chain dihedral angles. These are somewhat similar to, but distinct from, a subgroup of another common family of loops that have been called beta-bulge loops; in the latter, the dihedral angles are also tightly clustered. The high degree of clustering in both cases is likely to be a result of steric constraints associated with hydrogen bond patterns at the ends of loops.

18.
Wong S  Jacobson MP 《Proteins》2008,71(1):153-164
Ligand binding frequently induces significant conformational changes in a protein receptor. Understanding and predicting such conformational changes represent an important challenge for computational biology, including applications to structure-based drug design. We describe an approach to this problem based on the assumption that the holo state is at least transiently populated in the absence of a ligand; this hypothesis has been referred to as "conformational selection." Here, we apply a method that tests this hypothesis on a challenging class of ligand-induced conformational changes, which we refer to as loop latching: the closing of a loop around an active site that sequesters the ligand from solvent. The method uses a combination of replica exchange molecular dynamics and a loop prediction algorithm to generate low-energy loop structures, and docking to select the conformation appropriate for binding a particular ligand. On a test set of six proteins, it yields loop structures including hololike conformations, generally below 2 Å RMSD from the liganded structure, for loops that span up to 15 residues. Docking serves as a stringent test of the predictions. In five of the six cases, the predicted loop conformations improve the ranks of cognate ligands relative to using the apo structure, although the results remain, in most cases, significantly worse than using a holo structure. The poses of the cognate ligands are correct in four of the six test cases, while they are correct for five of the six using a holo structure.

19.
Loops are integral components of protein structures, providing links between elements of secondary structure, and in many cases contributing to catalytic and binding sites. The conformations of short loops are now understood to depend primarily on their amino acid sequences. In contrast, the structural determinants of longer loops involve hydrogen-bonding and packing interactions within the loop and with other parts of the protein. By searching solved protein structures for regions similar in main chain conformation to the antigen-binding loops in immunoglobulins, we identified medium-sized loops of similar structure in unrelated proteins, and compared the determinants of their conformations. For loops that form compact substructures the major determinant of the conformation is the formation of hydrogen bonds to inward-pointing main chain atoms. For loops that have more extended conformations, the major determinant of their structure is the packing of a particular residue or residues against the rest of the protein. The following picture emerges: Medium-sized loops of similar conformation are stabilized by similar interactions. The groups that interact with the loop have very similar spatial dispositions with respect to the loop. However, the residues that provide these interactions may arise from dissimilar parts of the protein: The conformation of the loop requires certain interactions that the protein may provide in a variety of ways.

20.
Loops are the most variable regions of protein structure and are, in general, the least accurately predicted. Their prediction has been approached in two ways, ab initio and database search. In recent years, it has been thought that ab initio methods are more powerful. In light of the continued rapid expansion in the number of known protein structures, we have re‐evaluated FREAD, a database search method, and demonstrate that the power of database search methods may have been underestimated. We found that sequence similarity as quantified by environment specific substitution scores can be used to significantly improve prediction. In fact, FREAD performs appreciably better for an identifiable subset of loops (two thirds of shorter loops and half of the longer loops tested) than the ab initio methods of MODELLER, PLOP, and RAPPER. Within this subset, FREAD's predictive ability is length independent, in general producing results within 2 Å RMSD, compared to an average of over 10 Å for loop length 20 for any of the other tested methods. We also benchmarked the prediction protocols on a set of 212 loops from the model structures in CASP 7 and 8. An extended version of FREAD is able to make predictions for 127 of these, and it gives the best prediction of the methods tested in 61 of these cases. In examining FREAD's ability to predict in the model environment, we found that whole structure quality did not affect the quality of loop predictions. Proteins 2010. © 2009 Wiley‐Liss, Inc.
