Similar articles
1.
Added-value is the additional information that a model carries with respect to the template structure used for model building. Thousands of single-template models, corresponding to proteins of known structure, were analyzed. The accuracy of structure-derived properties, such as residue accessibility, surface area, electrostatic potential, and others, was determined as a function of template:target sequence identity by comparing the models with their corresponding experimental structures. Added-value was determined by comparing the accuracy in models with that from templates. Geometry-dependent properties such as neighborhood of buried residues and accessible surface area showed low added-value. Properties that also depend on the protein sequence, such as presence of polar areas and electrostatic potential, showed high added-value. In general added-value increases when template:target sequence identity decreases, but it is also affected by alignment errors. This study justifies the use of models instead of the use of templates to estimate structure-derived properties of a target protein.

2.
We evaluate 3D models of human nucleoside diphosphate kinase, mouse cellular retinoic acid binding protein I, and human eosinophil neurotoxin that were calculated by MODELLER, a program for comparative protein modeling by satisfaction of spatial restraints. The models have good stereochemistry and are at least as similar to the crystallographic structures as the closest template structures. The largest errors occur in the regions that were not aligned correctly or where the template structures are not similar to the correct structure. These regions correspond predominantly to exposed loops, insertions of any length, and non-conserved side chains. When a template structure with more than 40% sequence identity to the target protein is available, the model is likely to have about 90% of the mainchain atoms modeled with an rms deviation from the X-ray structure of ≈ 1 Å, in large part because the templates are likely to be that similar to the X-ray structure of the target. This rms deviation is comparable to the overall differences between refined NMR and X-ray crystallography structures of the same protein. © 1995 Wiley-Liss, Inc.
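
As an illustration of the rms deviation quoted in this abstract, here is a minimal sketch. It is not MODELLER's own code. It assumes the model and X-ray mainchain coordinates have already been superimposed and are listed in the same atom order; the function name `rmsd` and the toy coordinates are hypothetical.

```python
import math


def rmsd(model, reference):
    """Root-mean-square deviation between two lists of (x, y, z) points.

    Assumption: both structures are already superimposed and atoms are
    paired by index, so no fitting step is performed here.
    """
    if len(model) != len(reference) or not model:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (a - b) ** 2
        for p, q in zip(model, reference)
        for a, b in zip(p, q)
    )
    return math.sqrt(squared / len(model))


# Toy coordinates in angstroms (hypothetical, not taken from the study)
model_atoms = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
xray_atoms = [(0.0, 0.0, 1.0), (1.5, 1.0, 0.0)]
print(rmsd(model_atoms, xray_atoms))  # 1.0
```

In practice an optimal superposition would normally precede this calculation; that step is deliberately left out of the sketch.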

3.
4.
SCWRL and MolIDE are software applications for prediction of protein structures. SCWRL is designed specifically for the task of prediction of side-chain conformations given a fixed backbone usually obtained from an experimental structure determined by X-ray crystallography or NMR. SCWRL is a command-line program that typically runs in a few seconds. MolIDE provides a graphical interface for basic comparative (homology) modeling using SCWRL and other programs. MolIDE takes an input target sequence and uses PSI-BLAST to identify and align templates for comparative modeling of the target. The sequence alignment to any template can be manually modified within a graphical window of the target-template alignment and visualization of the alignment on the template structure. MolIDE builds the model of the target structure on the basis of the template backbone, predicted side-chain conformations with SCWRL and a loop-modeling program for insertion-deletion regions with user-selected sequence segments. SCWRL and MolIDE can be obtained at (http://dunbrack.fccc.edu/Software.php).

5.
Computational small molecule docking into comparative models of proteins is widely used to query protein function and in the development of small molecule therapeutics. We benchmark RosettaLigand docking into comparative models for nine proteins built during CASP8 that contain ligands. We supplement the study with 21 additional protein/ligand complexes to cover a wider space of chemotypes. During a full docking run in 21 of the 30 cases, RosettaLigand successfully found a native-like binding mode among the top ten scoring binding modes. From the benchmark cases we find that careful template selection based on ligand occupancy provides the best chance of success, while overall sequence identity between template and target does not appear to improve results. We also find that binding energy normalized by atom number is often less than −0.4 in native-like binding modes.
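
The normalized binding energy mentioned in this abstract can be sketched as a simple ratio. This is an illustrative assumption rather than RosettaLigand code: the abstract does not say which atoms are counted or which energy units are used, and the function names and the `cutoff` parameter are hypothetical. Because the abstract says the value is "often" below −0.4, the check is a heuristic, not a rule.

```python
def normalized_binding_energy(binding_energy, n_atoms):
    """Binding energy divided by atom count (which atoms to count is an assumption)."""
    if n_atoms <= 0:
        raise ValueError("n_atoms must be positive")
    return binding_energy / n_atoms


def looks_native_like(binding_energy, n_atoms, cutoff=-0.4):
    """Heuristic check: normalized energy below the cutoff reported in the abstract."""
    return normalized_binding_energy(binding_energy, n_atoms) < cutoff


# Hypothetical values, not taken from the benchmark
print(looks_native_like(-12.5, 25))  # True  (-12.5 / 25 = -0.5)
print(looks_native_like(-6.0, 25))   # False (-6.0 / 25 = -0.24)
```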

6.
7.

Background

Knottins are small, diverse and stable proteins with important drug design potential. They can be classified in 30 families which cover a wide range of sequences (1621 sequenced), three-dimensional structures (155 solved) and functions (> 10). Inter-knottin similarity lies mainly between 15% and 40% sequence identity and 1.5 to 4.5 Å backbone deviations, although they all share a tightly knotted disulfide core. This important variability is likely to arise from the highly diverse loops which connect the successive knotted cysteines. The prediction of structural models for all knottin sequences would open new directions for the analysis of interaction sites and provide a better understanding of the structural and functional organization of proteins sharing this scaffold.

Results

We have designed an automated modeling procedure for predicting the three-dimensional structure of knottins. The different steps of the homology modeling pipeline were carefully optimized relative to a test set of knottins with known structures: template selection and alignment, extraction of structural constraints and model building, model evaluation and refinement. After optimization, the accuracy of predicted models was shown to lie between 1.50 and 1.96 Å from native structures at 50% and 10% maximum sequence identity levels, respectively. These average model deviations represent an improvement varying between 0.74 and 1.17 Å over a basic homology modeling derived from a unique template. A database of 1621 structural models for all known knottin sequences was generated and is freely accessible from our web server at http://knottin.cbs.cnrs.fr. Models can also be interactively constructed from any knottin sequence using the structure prediction module Knoter1D3D available from our protein analysis toolkit PAT at http://pat.cbs.cnrs.fr.

Conclusions

This work explores different directions for a systematic homology modeling of a diverse family of protein sequences. In particular, we have shown that the accuracy of the models constructed at a low level of sequence identity can be improved by 1) a careful optimization of the modeling procedure, 2) the combination of multiple structural templates and 3) the use of conserved structural features as modeling restraints.

8.
G-protein coupled receptors (GPCRs) are targets of nearly one third of the drugs on the current pharmaceutical market. Despite their importance in many cellular processes, crystal structures are available for fewer than 20 unique GPCRs of the Rhodopsin-like class. Fortunately, even though involved in different signaling cascades, this large group of membrane proteins has preserved a uniform structure comprising seven transmembrane helices that allows quite reliable comparative modeling. Nevertheless, low sequence similarity between the GPCR family members is still a serious obstacle not only in template selection but also in providing theoretical models of acceptable quality. An additional level of difficulty is the prediction of kinks and bulges in transmembrane helices. Usage of multiple templates and generation of alignments based on sequence profiles may increase the rate of success in difficult cases of comparative modeling in which the sequence similarity between GPCRs is exceptionally low. Here, we present GPCRM, a novel method for fast and accurate generation of GPCR models using averaging of multiple template structures and profile-profile comparison. In particular, GPCRM is the first GPCR structure predictor incorporating two distinct loop modeling techniques, Modeller and Rosetta, together with the filtering of models based on the Z-coordinate. We tested our approach on all unique GPCR structures determined to date and report its performance in comparison with other computational methods targeting the Rhodopsin-like class. We also provide a database of precomputed GPCR models of the human receptors from that class.

Availability

GPCRM server and database: http://gpcrm.biomodellab.eu

9.
Identification and characterization of protein functional surfaces are important for predicting protein function, understanding enzyme mechanism, and docking small compounds to proteins. As the accumulation of protein sequence information far outpaces that of structures, constructing accurate models of protein functional surfaces and identifying their key elements become increasingly important. A promising approach is to build comparative models from sequences using known structural templates such as those obtained from structural genome projects. Here we assess how well this approach works in modeling binding surfaces. By systematically building three-dimensional comparative models of proteins using Modeller, we determine how accurately functional surfaces can be reproduced. We use an alpha shape based pocket algorithm to compute all pockets on the modeled structures, and conduct a large-scale computation of similarity measurements (pocket RMSD and fraction of functional atoms captured) for 26,590 modeled enzyme protein structures. Overall, we find that when the sequence fragment of the binding surfaces has more than 45% identity to that of the template protein, the modeled surfaces have on average an RMSD of 0.5 Å, and contain 48% or more of the binding surface atoms, with nearly all of the important atoms in the signatures of binding pockets captured.

10.
The structural biology of proteins mediating iron-sulfur (Fe-S) cluster assembly is central for understanding several important biological processes. Here we present the NMR structure of the 16-kDa protein YgdK from Escherichia coli, which shares 35% sequence identity with the E. coli protein SufE. The SufE X-ray crystal structure was solved in parallel with the YgdK NMR structure in the Northeast Structural Genomics (NESG) consortium. Both proteins (1) are key components for Fe-S metabolism, (2) exhibit the same distinct fold, and (3) belong to a family of at least 70 prokaryotic and eukaryotic sequence homologs. Accurate homology models were calculated for the YgdK/SufE family based on the YgdK NMR and SufE crystal structures. Both structural templates contributed equally, exemplifying synergy of NMR and X-ray crystallography. SufE acts as an enhancer of the cysteine desulfurase activity of SufS by SufE-SufS complex formation. A homology model of CsdA, a desulfurase encoded in the same operon as YgdK, was built using the X-ray structure of SufS as a template. Protein surface and electrostatic complementarities strongly suggest that YgdK and CsdA likewise form a functional two-component desulfurase complex. Moreover, structural features of YgdK and SufS, which can be linked to their interaction with desulfurases, are conserved in all homology models. It thus appears very likely that all members of the YgdK/SufE family act as enhancers of Suf-S-like desulfurases. The present study exemplifies that "refined" selection of two (or more) targets enables high-quality homology modeling of large protein families.

11.
Liu S, Zhang C, Liang S, Zhou Y. Proteins 2007, 68(3): 636-645
Recognizing the structural similarity without significant sequence identity (called fold recognition) is the key for bridging the gap between the number of known protein sequences and the number of structures solved. Previously, we developed a fold-recognition method called SP(3) which combines sequence-derived sequence profiles, secondary-structure profiles and residue-depth dependent, structure-derived sequence profiles. The use of residue-depth-dependent profiles makes SP(3) one of the best automatic predictors in CASP 6. Because residue depth (RD) and solvent accessible surface area (solvent accessibility) are complementary in describing the exposure of a residue to solvent, we test whether or not incorporation of solvent-accessibility profiles into SP(3) could further increase the accuracy of fold recognition. The resulting method, called SP(4), was tested in SALIGN benchmark for alignment accuracy and Lindahl, LiveBench 8 and CASP7 blind prediction for fold recognition sensitivity and model-structure accuracy. For remote homologs, SP(4) is found to consistently improve over SP(3) in the accuracy of sequence alignment and predicted structural models as well as in the sensitivity of fold recognition. Our result suggests that RD and solvent accessibility can be used concurrently for improving the accuracy and sensitivity of fold recognition. The SP(4) server and its local usage package are available on http://sparks.informatics.iupui.edu/SP4.

12.
Sequences of the ubiquitin-conjugating enzyme (UBC or E2) family were used as a test set to investigate issues associated with the high-throughput comparative modelling of protein structures. A semi-automatic method was initially developed with particular emphasis on producing models of a quality suitable for structural comparison. Structural and sequence features of the E2 family were used to improve the sequence alignment and the quality of the structural templates. Initially, failure to correct for subtle structural inconsistencies between templates led to problems in the comparative analysis of the UBC electrostatic potentials. Modelling of known UBC structures using Modeller 4.0 showed that multiple templates produced, on average, no better models than the use of just one template, as judged by the root-mean-squared deviation between the comparative model and crystal structure backbones. Using four different quality-checking methods, for a given target sequence, it was not possible to distinguish the model most similar to the experimental structure. The UBC models were thus finally built using only the crystal structure template with the highest sequence identity to the target, producing only one model solution. Quality checking was used to reject models with obvious structural anomalies (e.g., bad side-chain packing). The resulting models have been used for a comparison of UBC structural features and of their electrostatic potentials. The work was extended through the development of a fully automated pipeline that identifies E2 sequences in the sequence databases, aligns and models them, and calculates the associated electrostatic potential.

13.
Protein structure determination using nuclear magnetic resonance (NMR) spectroscopy can be both time-consuming and labor intensive. Here we demonstrate how chemical shift threading can permit rapid, robust, and accurate protein structure determination using only chemical shift data. Threading is a relatively old bioinformatics technique that uses a combination of sequence information and predicted (or experimentally acquired) low-resolution structural data to generate high-resolution 3D protein structures. The key motivations behind using NMR chemical shifts for protein threading lie in the fact that they are easy to measure, they are available prior to 3D structure determination, and they contain vital structural information. The method we have developed uses not only sequence and chemical shift similarity but also chemical shift-derived secondary structure, shift-derived super-secondary structure, and shift-derived accessible surface area to generate a high quality protein structure regardless of the sequence similarity (or lack thereof) to a known structure already in the PDB. The method (called E-Thrifty) was found to be very fast (often < 10 min/structure) and to significantly outperform other shift-based or threading-based structure determination methods (in terms of top template model accuracy), with an average TM-score performance of 0.68 (vs. 0.50–0.62 for other methods). Coupled with recent developments in chemical shift refinement, these results suggest that protein structure determination, using only NMR chemical shifts, is becoming increasingly practical and reliable. E-Thrifty is available as a web server at http://ethrifty.ca.

14.
Protein comparative modeling has useful applications in large-scale structural initiatives and in rational design of drug targets in medicinal chemistry. The reliability of a homology model is dependent on the sequence identity between the query and the structural homologue used as a template for modeling. Here, we present a method for the utilization and conservation of important structural features of template structures by providing additional spatial restraints in comparative modeling programs like MODELLER. We show that root mean square deviation at Cα positions between the model and the corresponding experimental structure and the quality of the models can be significantly improved for distantly related systems by utilizing additional spatial restraints of the template structures. We demonstrate the influence of such approaches to homology modeling during distant relationships in understanding functional properties of proteins, such as ligand binding, using cytochrome P450 as an example.

15.
The complexities of X-ray crystallography and NMR spectroscopy for large protein complexes, and the comparative ease of approaches such as electron microscopy mean that low-resolution structures are often available long before their atomic resolution equivalents. To help bridge this gap in knowledge, we present 3SOM: an approach for finding the best fit of atomic resolution structures into lower-resolution density maps through surface overlap maximization. High-resolution templates (i.e. partial structures or models for multi-subunit complexes) and targets (lower-resolution maps) are initially represented as iso-surfaces. The latter are used first in a fast search for transformations that superimpose a significant portion of the target surface onto the template surface, which is quantified as surface overlap. The vast search space is reduced by considering key vectors that capture local surface information. The set of transformations with the highest surface overlap scores are then re-ranked by using more sophisticated scores including cross-correlation. We give a number of examples to illustrate the efficiency of the method and its restrictions. For targets for which partial complexes are available, the speed and performance of the method make it an attractive complement to existing methods, as many different hypotheses can be tested quickly on a single processor.

16.
The accuracy of protein structures, particularly their binding sites, is essential for the success of modeling protein complexes. Computationally inexpensive methodology is required for genome-wide modeling of such structures. For systematic evaluation of potential accuracy in high-throughput modeling of binding sites, a statistical analysis of target-template sequence alignments was performed for a representative set of protein complexes. For most of the complexes, alignments containing all residues of the interface were found. The full interface alignments were obtained even in the case of poor alignments where a relatively small part of the target sequence (as low as 40%) aligned to the template sequence, with a low overall alignment identity (<30%). Although such poor overall alignments might be considered inadequate for modeling of whole proteins, the alignment of the interfaces was strong enough for docking. In the set of homology models built on these alignments, one third of those ranked 1 by a simple sequence identity criterion had RMSD < 5 Å, the accuracy suitable for low-resolution template-free docking. Such models corresponded to multi-domain target proteins, whereas for single-domain proteins the best models had 5 Å < RMSD < 10 Å, the accuracy suitable for less sensitive structure-alignment methods. Overall, ∼50% of complexes with the interfaces modeled by high-throughput techniques had accuracy suitable for meaningful docking experiments. This percentage will grow with the increasing availability of co-crystallized protein-protein complexes.
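
The two RMSD ranges named in this abstract can be written as a small lookup. This is only a sketch of the stated thresholds, not the authors' analysis code: the function name `docking_suitability` and the returned labels are hypothetical, and because the abstract does not say where exactly 5 Å or 10 Å belong, the boundary handling below is an assumption.

```python
def docking_suitability(rmsd_angstrom):
    """Map a model RMSD (in angstroms) to the use case named in the abstract.

    Assumption: a value of exactly 5.0 falls in the second range and a value
    of exactly 10.0 falls outside both; the abstract leaves the boundaries open.
    """
    if rmsd_angstrom < 0:
        raise ValueError("RMSD cannot be negative")
    if rmsd_angstrom < 5.0:
        return "low-resolution template-free docking"
    if rmsd_angstrom < 10.0:
        return "less sensitive structure-alignment methods"
    return "outside the ranges discussed"


# Hypothetical RMSD values for illustration
for value in (3.2, 7.5, 12.0):
    print(value, "->", docking_suitability(value))
```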

17.
Molecular replacement (MR) is widely used for addressing the phase problem in X-ray crystallography. Historically, crystallographers have had limited success using NMR structures as MR search models. Here, we report a comprehensive investigation of the utility of protein NMR ensembles as MR search models, using data for 25 pairs of X-ray and NMR structures solved and refined using modern NMR methods. Starting from NMR ensembles prepared by an improved protocol, FindCore, correct MR solutions were obtained for 22 targets. Based on these solutions, automatic model rebuilding could be done successfully. Rosetta refinement of NMR structures provided MR solutions for another two proteins. We also demonstrate that such properly prepared NMR ensembles and X-ray crystal structures have similar performance when used as MR search models for homologous structures, particularly for targets with sequence identity >40%.

18.
DeWeese-Scott C, Moult J. Proteins 2004, 55(4): 942-961
Experimental protein structures often provide extensive insight into the mode and specificity of small molecule binding, and this information is useful for understanding protein function and for the design of drugs. We have performed an analysis of the reliability with which ligand-binding information can be deduced from computer model structures, as opposed to experimentally derived ones. Models produced as part of the CASP experiments are used. The accuracy of contacts between protein model atoms and experimentally determined ligand atom positions is the main criterion. Only comparative models are included (i.e., models based on a sequence relationship between the protein of interest and a known structure). We find that, as expected, contact errors increase with decreasing sequence identity used as a basis for modeling. Analysis of the causes of errors shows that sequence alignment errors between model and experimental template have the most deleterious effect. In general, good, but not perfect, insight into ligand binding can be obtained from models based on a sequence relationship, provided there are no alignment errors in the model. The results support a structural genomics strategy based on experimental sampling of structure space so that all protein domains can be modeled on the basis of 30% or higher sequence identity.

19.
We have analyzed the pairs of protein structures obtained by X-ray diffraction analysis and nuclear magnetic resonance (X-ray and NMR structures) that display no major differences when superimposed on one another (61 pairs). Analyzing atom-to-atom contacts (contact distances 2–8 Å), it has been found that the NMR structures (compared to the X-ray structures) have more contacts at distances below 3.5 Å and above 5.5 Å. In the case of residue-to-residue contacts, the NMR structures have more contacts at distances below 3 Å and between 4.5 and 6.5 Å. At other distances analyzed, the X-ray structures have more contacts. The difference in the numbers of atom-to-atom and residue-to-residue contacts is greater for buried residues inaccessible to water than for surface residues. Another important difference is related to the number of hydrogen bonds in the main chain: this number is greater in the X-ray structures. The coefficient of correlation between the numbers of hydrogen bonds identified in the structures obtained by both methods is only 32%. If a complete set of NMR models of protein structure is considered, the total number of hydrogen bonds proves to be 1.2 times greater than in the X-ray structures, whereas the correlation coefficient increases to only 65%. We have also demonstrated that α-helices in the NMR structures are more distorted (compared to the ideal α-helix) than those in the X-ray structures. Translated from Molekulyarnaya Biologiya, Vol. 39, No. 1, 2005, pp. 129–138. Original Russian Text Copyright © 2005 by Melnik, Garbuzynskiy, Lobanov, Galzitskaya.

20.
Opioid receptors are the principal targets for opioids, which have been used as analgesics for centuries. Opioid receptors belong to the rhodopsin family of G-protein coupled receptors (GPCRs). In the absence of crystal structures of opioid receptors, 3D homology models have been reported with bovine rhodopsin as a template, though the sequence homology is low. Recently, it has been reported that use of multiple templates results in a better model for a target having low sequence identity with a single template. With the objective of carrying out a comparative study on the structural quality of the 3D models based on single and multiple templates, the homology models for opioid receptors (mu, delta and kappa) were generated using bovine rhodopsin as single template and the recently deposited crystal structures of squid rhodopsin, turkey β-1 and human β-2 adrenoreceptors along with bovine rhodopsin as multiple templates. In this paper we report the results of comparison between the refined 3D models based on multiple sequence alignment (MSA) and models built with bovine rhodopsin as template, using validation programs PROCHECK, PROSA, Verify 3D, Molprobity and docking studies. The results indicate that homology models of mu and kappa with multiple templates are better than those built with only bovine rhodopsin as template, whereas, in many aspects, the homology model of delta opioid receptor with single template is better than the model based on multiple templates. Three nonselective ligands were docked to both the models of mu, delta and kappa opioid receptors using GOLD 3.1. The results of docking complied well with the pharmacophore reported for nonspecific opioid ligands. The comparison of docking results for models with multiple templates and those with single template is discussed in detail. Three selective ligands for each receptor were also docked. As the crystallographic structures are not yet known, this comparison will help in choosing better homology models of opioid receptors for studying ligand receptor interactions to design new potent opioid antagonists.
