Similar articles
 Found 20 similar articles; search took 31 ms
1.
MOTIVATION: The prediction of the regions of homology models that can be 'restrained by' or 'copied from' the basis structures is a vital step in correct model generation, because these regions are the model's most accurate part. However, there is no ideal method for the identification of their limits. In most algorithms their length depends on the number of family members and definitions of secondary structure. RESULTS: The algorithm SCORE steps away from the conventional definitions of the core to identify from large numbers of basis structures those regions that can be considered structurally related to a target sequence. The use of phi, psi constraints to accurately pinpoint the regions that are conserved across a family and environmentally constrained substitution tables to extend these regions allows SCORE to rapidly (generally in under 1 s, an order of magnitude faster than methods such as MODELLER) identify and build the core of homology models from the alignments of the target sequence to the basis structures. The SCORE algorithm was used to build 114 model cores. In only two cases was the core size less than 50% of the structure and all the cores built had an RMSD of 3.7 Å or less to the target structure.
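The phi and psi constraints mentioned in this abstract are backbone torsion angles, each defined by four consecutive backbone atoms. The abstract does not describe how SCORE computes them, so the following is only a minimal sketch of the standard torsion-angle formula; the function name `dihedral` and the tuple-based coordinate format are assumptions made for illustration.

```python
import math


def dihedral(p0, p1, p2, p3):
    """Return the signed torsion angle (degrees) defined by four 3D points.

    For phi the four atoms would be C(i-1), N(i), CA(i), C(i);
    for psi they would be N(i), CA(i), C(i), N(i+1).
    """
    def sub(a, b):
        return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

    def dot(a, b):
        return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

    def cross(a, b):
        return (a[1] * b[2] - a[2] * b[1],
                a[2] * b[0] - a[0] * b[2],
                a[0] * b[1] - a[1] * b[0])

    b0 = sub(p1, p0)          # first bond vector
    b1 = sub(p2, p1)          # central bond vector (rotation axis)
    b2 = sub(p3, p2)          # last bond vector
    n1 = cross(b0, b1)        # normal of the plane p0-p1-p2
    n2 = cross(b1, b2)        # normal of the plane p1-p2-p3
    b1_len = math.sqrt(dot(b1, b1))
    x = dot(n1, n2)
    y = b1_len * dot(b0, n2)
    return math.degrees(math.atan2(y, x))
```

Using `atan2` rather than `acos` keeps the sign of the angle, which is needed to distinguish regions of the Ramachandran plot.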

2.
Lee KW  Briggs JM 《Proteins》2004,54(4):693-704
Aminoacyl-tRNA synthetases (aaRSs) strictly discriminate their cognate amino acids. Some aaRSs accomplish this via proofreading and editing mechanisms. Mursinna and coworkers recently reported that substituting a highly conserved threonine (T252) with an alanine within the editing domain of Escherichia coli leucyl-tRNA synthetase (LeuRS) caused LeuRS to cleave its cognate aminoacylated leucine from tRNA(Leu) (Mursinna et al., Biochemistry 2001;40:5376-5381). To achieve atomic level insight into the role of T252 in LeuRS and the editing reaction of aaRSs, a series of molecular modeling studies including homology modeling and automated docking simulations were carried out. A 3D structure of E. coli LeuRS was constructed via homology modeling using the X-ray structure of Thermus thermophilus LeuRS as a template because the E. coli LeuRS structure is not available from X-ray or NMR studies. However, both the X-ray T. thermophilus and homology-modeled E. coli structures were used in our studies. Amino acid binding sites in the proposed editing domain, which is also called the connective polypeptide 1 (CP1) domain, were investigated by automated docking studies. The root mean square deviation (RMSD) for backbone atoms between the X-ray and homology-modeled structures was 1.18 Å overall and 0.60 Å for the editing (CP1) domain. Automated docking studies of a leucine ligand into the editing domain were performed for both structures: homology structure of E. coli LeuRS and X-ray structure of T. thermophilus LeuRS for comparison. The results of the docking studies suggested that there are two possible amino acid binding sites in the CP1 domain for both proteins. The first site lies near a threonine-rich region that includes the highly conserved T252 residue, which is important for amino acid discrimination. The second site is located in a flexible loop region surrounded by residues E292, A293, M295, A296, and M298. The important T252 residue is at the bottom of the first binding pocket.
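The backbone RMSD values quoted in this abstract follow the usual definition: the square root of the mean squared distance between equivalent atoms. The sketch below assumes the two structures have already been superimposed and that their atoms are listed in the same order; it performs no optimal superposition, and the function name `rmsd` is an illustrative choice rather than anything taken from the paper.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equally long lists of (x, y, z) tuples.

    Assumes the coordinate sets are already superimposed and that
    atom i in coords_a corresponds to atom i in coords_b.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equally long")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))
```

Restricting the input lists to the atoms of one domain gives a per-domain value in the same way the abstract reports a separate figure for the CP1 domain.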

3.
The structural biology of proteins mediating iron-sulfur (Fe-S) cluster assembly is central for understanding several important biological processes. Here we present the NMR structure of the 16-kDa protein YgdK from Escherichia coli, which shares 35% sequence identity with the E. coli protein SufE. The SufE X-ray crystal structure was solved in parallel with the YgdK NMR structure in the Northeast Structural Genomics (NESG) consortium. Both proteins (1) are key components of Fe-S metabolism, (2) exhibit the same distinct fold, and (3) belong to a family of at least 70 prokaryotic and eukaryotic sequence homologs. Accurate homology models were calculated for the YgdK/SufE family based on the YgdK NMR and SufE crystal structures. Both structural templates contributed equally, exemplifying synergy of NMR and X-ray crystallography. SufE acts as an enhancer of the cysteine desulfurase activity of SufS by SufE-SufS complex formation. A homology model of CsdA, a desulfurase encoded in the same operon as YgdK, was built using the X-ray structure of SufS as a template. Protein surface and electrostatic complementarities strongly suggest that YgdK and CsdA likewise form a functional two-component desulfurase complex. Moreover, structural features of YgdK and SufS, which can be linked to their interaction with desulfurases, are conserved in all homology models. It thus appears very likely that all members of the YgdK/SufE family act as enhancers of Suf-S-like desulfurases. The present study exemplifies that "refined" selection of two (or more) targets enables high-quality homology modeling of large protein families.

4.
Protein structure prediction is based mainly on the modeling of proteins by homology to known structures; this knowledge-based approach is the most promising method to date. Although it is used in the whole area of protein research, no general rules concerning the quality and applicability of concepts and procedures used in homology modeling have been put forward yet. Therefore, the main goal of the present work is to provide tools for the assessment of accuracy of modeling at a given level of sequence homology. A large set of known structures from different conformational and functional classes, with various degrees of homology, was selected. Pairwise structure superpositions were performed. Starting with the definition of the structurally conserved regions and determination of topologically correct sequence alignments, we correlated geometrical properties with sequence homology (defined by the 250 PAM Dayhoff Matrix) and identity. It is shown that both the topological differences of the protein backbones and the relative positions of corresponding side chains diverge with decreasing sequence identity. Below 50% identity, the deviation in regions that are structurally not conserved continually increases, thus implying that with decreasing sequence identity modeling has to take into account more and more structurally diverging loop regions that are difficult to predict. © 1993 Wiley-Liss, Inc.
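Sequence identity, the quantity this abstract correlates with structural divergence, can be computed directly from a pairwise alignment. The sketch below is one possible convention, not the one used in the paper: it counts identical residues over columns where neither sequence has a gap. The function name `percent_identity` and the `-` gap symbol are assumptions.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over the gap-free columns of a pairwise alignment.

    Both strings must come from the same alignment and therefore have
    equal length. Other conventions divide by the shorter sequence
    length or by the full alignment length instead.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    aligned_columns = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip insertions and deletions
        aligned_columns += 1
        if res_a == res_b:
            matches += 1
    if aligned_columns == 0:
        return 0.0
    return 100.0 * matches / aligned_columns
```

Because the choice of denominator changes the reported value, a threshold such as the 50% figure above is only comparable between studies that use the same convention.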

5.
Proteins have been classified into families based upon sequence homology. An accurate, systematic comparative model-building procedure for a homologous family of proteins would be very valuable scientifically. This paper presents such a procedure and applies it to the mammalian serine proteases, which are ubiquitous and involved in many important biological functions. Eleven proteins of this family are considered here, including a variety of blood serum, intestinal and pancreatic proteins as well as a closely related bacterial enzyme. The modeling method capitalizes upon the availability of three experimentally determined structures for mammalian serine proteases. These structures show that the molecule is divided into structurally conserved regions, which contain the strong sequence homology, and structurally variable regions, which include all the additions and deletions. We show that by applying this structural distinction to new sequences, erroneous alignments of the sequences are greatly reduced. For each aligned new sequence, the structurally conserved regions can be constructed from any of the known structures. In examining the variable regions, we have found that a variable region that has the same length and residue character in two different known structures usually has the same conformation in both. Thus, when the eight structurally unknown proteins are modeled, most of the variable regions can be constructed directly from the known structures. A minority of the variable regions require more sophisticated analysis to evaluate the relative merits of a small number of possible conformations. Only a very few are so different that modeling by homology is entirely ruled out. We demonstrate, therefore, that by this modeling procedure, the maximum of each of these mammalian serine proteases is constructed directly from the experimentally determined structures and the necessity to build from intuition or from energy considerations is greatly reduced.

6.
We evaluate 3D models of human nucleoside diphosphate kinase, mouse cellular retinoic acid binding protein I, and human eosinophil neurotoxin that were calculated by MODELLER, a program for comparative protein modeling by satisfaction of spatial restraints. The models have good stereochemistry and are at least as similar to the crystallographic structures as the closest template structures. The largest errors occur in the regions that were not aligned correctly or where the template structures are not similar to the correct structure. These regions correspond predominantly to exposed loops, insertions of any length, and non-conserved side chains. When a template structure with more than 40% sequence identity to the target protein is available, the model is likely to have about 90% of the mainchain atoms modeled with an rms deviation from the X-ray structure of ≈ 1 Å, in large part because the templates are likely to be that similar to the X-ray structure of the target. This rms deviation is comparable to the overall differences between refined NMR and X-ray crystallography structures of the same protein. © 1995 Wiley-Liss, Inc.

7.
There have been several studies suggesting that protein structures solved by NMR spectroscopy and X-ray crystallography show significant differences. To understand the origin of these differences, we assembled a database of high-quality protein structures solved by both methods. We also find significant differences between NMR and crystal structures: in the root-mean-square deviations of the Cα atomic positions, identities of core amino acids, backbone and side-chain dihedral angles, and packing fraction of core residues. In contrast to prior studies, we identify the physical basis for these differences by modeling protein cores as jammed packings of amino acid-shaped particles. We find that we can tune the jammed packing fraction by varying the degree of thermalization used to generate the packings. For an athermal protocol, we find that the average jammed packing fraction is identical to that observed in the cores of protein structures solved by X-ray crystallography. In contrast, highly thermalized packing-generation protocols yield jammed packing fractions that are even higher than those observed in NMR structures. These results indicate that thermalized systems can pack more densely than athermal systems, which suggests a physical basis for the structural differences between protein structures solved by NMR and X-ray crystallography.

8.
Cobalamin-dependent methionine synthase is a large enzyme composed of structurally and functionally distinct regions. Recent studies have begun to define the roles of several regions of the protein. In particular, the structure of a 27 kDa cobalamin-binding fragment of the enzyme from Escherichia coli has been determined by X-ray crystallography, and has revealed the motifs and interactions responsible for recognition of the cofactor. The amino acid sequences of several adenosylcobalamin-dependent enzymes, the methylmalonyl coenzyme A mutases and glutamate mutases, show homology with the cobalamin-binding region of methionine synthase and retain conserved residues that are determinants for the binding of the prosthetic group, suggesting that these mutases and methionine synthase share common three-dimensional structures.

9.
SCWRL and MolIDE are software applications for prediction of protein structures. SCWRL is designed specifically for the task of prediction of side-chain conformations given a fixed backbone usually obtained from an experimental structure determined by X-ray crystallography or NMR. SCWRL is a command-line program that typically runs in a few seconds. MolIDE provides a graphical interface for basic comparative (homology) modeling using SCWRL and other programs. MolIDE takes an input target sequence and uses PSI-BLAST to identify and align templates for comparative modeling of the target. The sequence alignment to any template can be manually modified within a graphical window of the target-template alignment and visualization of the alignment on the template structure. MolIDE builds the model of the target structure on the basis of the template backbone, predicted side-chain conformations with SCWRL and a loop-modeling program for insertion-deletion regions with user-selected sequence segments. SCWRL and MolIDE can be obtained at (http://dunbrack.fccc.edu/Software.php).

10.
Lee SY  Zhang Y  Skolnick J 《Proteins》2006,63(3):451-456
The TASSER structure prediction algorithm is employed to investigate whether NMR structures can be moved closer to their corresponding X-ray counterparts by automatic refinement procedures. The benchmark protein dataset includes 61 nonhomologous proteins whose structures have been determined by both NMR and X-ray experiments. Interestingly, by starting from NMR structures, the majority (79%) of TASSER refined models show a structural shift toward their X-ray structures. On average, the TASSER refined models have a root-mean-square-deviation (RMSD) from the X-ray structure of 1.785 Å (1.556 Å) over the entire chain (aligned region), while the average RMSD between NMR and X-ray structures (RMSD(NMR_X-ray)) is 2.080 Å (1.731 Å). For all proteins having a RMSD(NMR_X-ray) >2 Å, the TASSER refined structures show consistent improvement. However, for the 34 proteins with a RMSD(NMR_X-ray) <2 Å, there are only 21 cases (60%) where the TASSER model is closer to the X-ray structure than NMR, which may be due to the inherent resolution of TASSER. We also compare the TASSER models with 12 NMR models in the RECOORD database that have been recalculated recently by Nederveen et al. from original NMR restraints using the newest molecular dynamics tools. In 8 of 12 cases, TASSER models show a smaller RMSD to X-ray structures; in 3 of 12 cases, where RMSD(NMR_X-ray) <1 Å, RECOORD does better than TASSER. These results suggest that TASSER can be a useful tool to improve the quality of NMR structures.

11.
Molecular modeling of proteins is confronted with the problem of finding homologous proteins, especially when few identities remain after the process of molecular evolution. Using even the most recent methods based on sequence identity detection, structural relationships are still difficult to establish with high reliability. As protein structures are more conserved than sequences, we investigated the possibility of using protein secondary structure comparison (observed or predicted structures) to discriminate between related and unrelated protein sequences in the range of 10%-30% sequence identity. Pairwise comparisons of secondary structures have been measured using the structural overlap (Sov) parameter. In this article, we show that if the secondary structure likeness is >50%, most of the pairs are structurally related. Taking into account the secondary structures of proteins that have been detected by BLAST, FASTA, or SSEARCH in the noisy region (with high E-value), we show that distantly related protein sequences (even with <20% identity) can still be identified. This strategy can be used to identify three-dimensional templates in homology modeling by finding unexpected related proteins and to select proteins for experimental investigation in a structural genomic approach, as well as for genome annotation.

12.
As an alternative to X-ray crystallography, nuclear magnetic resonance (NMR) has also emerged as the method of choice for studying both protein structure and dynamics in solution. However, little work using computational models such as Gaussian network model (GNM) and machine learning approaches has focused on NMR-derived proteins to predict the residue flexibility, which is represented by the root mean square deviation (RMSD) with respect to the average structure. We provide a large-scale comparison of computational models, including GNM, parameter-free GNM and several linear regression models using local solvent exposures as inputs, based on a dataset of 1609 protein chains whose structures were resolved by NMR. The result again confirmed that the correlation of GNM outputs with raw RMSD values was better than that using B-factors of X-ray data. Nevertheless, it was also concluded that the parameter-free GNM and the solvent exposure based linear regression models performed worse than GNM when predicting RMSD, contrary to results using X-ray data. The discrepancy of residue flexibility prediction between NMR and X-ray data is likely attributable to a combination of their physical and methodological differences.
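The Gaussian network model named in this abstract represents a protein as Cα nodes joined by identical springs whenever two residues lie within a cutoff distance. Its central object is the Kirchhoff (connectivity) matrix, and predicted residue fluctuations are proportional to the diagonal of that matrix's pseudo-inverse. The sketch below builds only the Kirchhoff matrix; the 7 Å default cutoff and the function name `kirchhoff_matrix` are assumptions, not values reported in the abstract.

```python
import math


def kirchhoff_matrix(ca_coords, cutoff=7.0):
    """Build the GNM Kirchhoff matrix from C-alpha coordinates.

    Off-diagonal entry (i, j) is -1 when residues i and j are within
    `cutoff` angstroms of each other and 0 otherwise; diagonal entry
    (i, i) is the number of contacts of residue i.
    """
    n = len(ca_coords)
    gamma = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(ca_coords[i], ca_coords[j]) <= cutoff:
                gamma[i][j] = -1
                gamma[j][i] = -1
                gamma[i][i] += 1
                gamma[j][j] += 1
    return gamma
```

Every row of the matrix sums to zero, so it is singular; this is why the fluctuation calculation needs a pseudo-inverse rather than an ordinary inverse, a step usually delegated to a linear-algebra library.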

13.
The structure of human protein HSPC034 has been determined by both solution nuclear magnetic resonance (NMR) spectroscopy and X-ray crystallography. Refinement of the NMR structure ensemble, using a Rosetta protocol in the absence of NMR restraints, resulted in significant improvements not only in structure quality, but also in molecular replacement (MR) performance with the raw X-ray diffraction data using MOLREP and Phaser. This method has recently been shown to be generally applicable with improved MR performance demonstrated for eight NMR structures refined using Rosetta (Qian et al., Nature 2007;450:259-264). Additionally, NMR structures of HSPC034 calculated by standard methods that include NMR restraints have improvements in the RMSD to the crystal structure and MR performance in the order DYANA, CYANA, XPLOR-NIH, and CNS with explicit water refinement (CNSw). Further Rosetta refinement of the CNSw structures, perhaps due to more thorough conformational sampling and/or a superior force field, was capable of finding alternative low energy protein conformations that were equally consistent with the NMR data according to the Recall, Precision, and F-measure (RPF) scores. On further examination, the additional MR-performance shortfall for NMR refined structures as compared with the X-ray structure was attributed, in part, to crystal-packing effects, real structural differences, and inferior hydrogen bonding in the NMR structures. A good correlation between a decrease in the number of buried unsatisfied hydrogen-bond donors and improved MR performance demonstrates the importance of hydrogen-bond terms in the force field for improving NMR structures. The superior hydrogen-bond network in Rosetta-refined structures demonstrates that correct identification of hydrogen bonds should be a critical goal of NMR structure refinement. Inclusion of nonbivalent hydrogen bonds identified from Rosetta structures as additional restraints in the structure calculation results in NMR structures with improved MR performance.

14.
We present a novel de novo method to generate protein models from sparse, discretized restraints on the conformation of the main chain and side chain atoms. We focus on Calpha-trace generation, the problem of constructing an accurate and complete model from approximate knowledge of the positions of the Calpha atoms and, in some cases, the side chain centroids. Spatial restraints on the Calpha atoms and side chain centroids are supplemented by constraints on main chain geometry, phi/psi angles, rotameric side chain conformations, and inter-atomic separations derived from analyses of known protein structures. A novel conformational search algorithm, combining features of tree-search and genetic algorithms, generates models consistent with these restraints by propensity-weighted dihedral angle sampling. Models with ideal geometry, good phi/psi angles, and no inter-atomic overlaps are produced with 0.8 Å main chain and, with side chain centroid restraints, 1.0 Å all-atom root-mean-square deviation (RMSD) from the crystal structure over a diverse set of target proteins. The mean model derived from 50 independently generated models is closer to the crystal structure than any individual model, with 0.5 Å main chain RMSD under only Calpha restraints and 0.7 Å all-atom RMSD under both Calpha and centroid restraints. The method is insensitive to randomly distributed errors of up to 4 Å in the Calpha restraints. The conformational search algorithm is efficient, with computational cost increasing linearly with protein size. Issues relating to decoy set generation, experimental structure determination, efficiency of conformational sampling, and homology modeling are discussed.

15.

Background

We have developed the program PERMOL for semi-automated homology modeling of proteins. It is based on restrained molecular dynamics using a simulated annealing protocol in torsion angle space. The main restraints defining the optimal local geometry of the structure are weighted mean dihedral angles and their standard deviations, which are calculated with an algorithm described earlier by Döker et al. (1999, BBRC, 257, 348–350). The overall long-range contacts are established via a small number of distance restraints between atoms involved in hydrogen bonds and backbone atoms of conserved residues. Employing the restraints generated by PERMOL, three-dimensional structures are obtained using standard molecular dynamics programs such as DYANA or CNS.

Results

To test this modeling approach it has been used for predicting the structure of the histidine-containing phosphocarrier protein HPr from E. coli and the structure of the human peroxisome proliferator activated receptor γ (Ppar γ). The divergence between the modeled HPr and the previously determined X-ray structure was comparable to the divergence between the X-ray structure and the published NMR structure. The modeled structure of Ppar γ was also very close to the previously solved X-ray structure with an RMSD of 0.262 nm for the backbone atoms.

Conclusion

In summary, we present a new method for homology modeling capable of producing high-quality structure models. An advantage of the method is that it can be used in combination with incomplete NMR data to obtain reasonable structure models in accordance with the experimental data.
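The Background section of this abstract describes restraints built from weighted mean dihedral angles and their standard deviations. The abstract does not give the algorithm of Döker et al., so the sketch below is not that method. It shows a generic weighted circular mean, which is one common way to average periodic quantities such as dihedral angles; the function name and the use of the circular standard deviation as the spread measure are assumptions.

```python
import math


def weighted_circular_mean(angles_deg, weights):
    """Return (mean, spread) in degrees for weighted periodic angles.

    Angles are averaged as unit vectors so that, for example, 170 and
    -170 degrees average to 180 rather than 0. The spread is the
    circular standard deviation sqrt(-2 ln R), where R is the length
    of the weighted mean vector.
    """
    if len(angles_deg) != len(weights) or not angles_deg:
        raise ValueError("need equally long, non-empty angle and weight lists")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    s = sum(w * math.sin(math.radians(a))
            for a, w in zip(angles_deg, weights)) / total
    c = sum(w * math.cos(math.radians(a))
            for a, w in zip(angles_deg, weights)) / total
    mean = math.degrees(math.atan2(s, c))
    r = min(math.hypot(s, c), 1.0)  # clamp against rounding above 1
    if r > 0.0:
        spread = math.degrees(math.sqrt(-2.0 * math.log(r)))
    else:
        spread = float("inf")  # angles cancel out; the mean is undefined
    return mean, spread
```

In a template-based setting the weights could, for instance, reflect how similar each template is to the target, though that choice is a guess and is not stated in the abstract.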

16.
FlgM proteins, also known as Anti-sigma-28 factor (sigma28), are negative regulators of flagellin synthesis. Recently, a three-dimensional structure of the Aquifex aeolicus sigma28/FlgM complex (PDB code: 1rp3) was determined by X-ray crystallography at 2.3 Å resolution. Furthermore, experimental data on bacterial FlgM, including site-directed mutagenesis and structural characterization by NMR are also available. However, an interpretation of the sequence-structure-function relationships combining X-ray and NMR data with the evolutionary information extracted from the increasing number of FlgM-related sequences annotated in databases is not available. In the present study, we combined database sequence searches and sequence-analysis tools to update the multiple sequence alignment of a previously characterized cluster of orthologs (COG2747) and the PFAM classification of protein domains (PF04316) for the FlgM family. A phylogenetic analysis of 77 protein sequences revealed the presence of at least three major sequence clades within the FlgM family. Besides, we predicted functional residues using a SequenceSpace method. We also generated homology models for Bacillus subtilis and Salmonella typhimurium FlgM proteins, for which sequence-structure-function relationship data are available, and used the docking program ClusPro to hypothesize about the dimer association between FlgM proteins. In conclusion, the analysis presented in this work will be useful in designing new experiments to understand better protein-protein interactions between FlgM, sigma factors, and putative molecules from the flagellar export apparatus. Electronic Supplementary Material is available in the online version of this article at http://link.springer.de/

17.
18.
Plant family 1 UDP-dependent glycosyltransferases (UGTs) catalyze the glycosylation of a plethora of bioactive natural products. In Arabidopsis thaliana, 120 UGT encoding genes have been identified. The crystal-based 3D structures of four plant UGTs have recently been published. Despite low sequence conservation, the UGTs show a highly conserved secondary and tertiary structure. The sugar acceptor and sugar donor substrates of UGTs are accommodated in the cleft formed between the N- and C-terminal domains. Several regions of the primary sequence contribute to the formation of the substrate binding pocket including structurally conserved domains as well as loop regions differing both with respect to their amino acid sequence and sequence length. In this review we provide a detailed analysis of the available plant UGT crystal structures to reveal structural features determining substrate specificity. The high 3D structural conservation of the plant UGTs renders homology modeling an attractive tool for structure elucidation. The accuracy and utility of UGT structures obtained by homology modeling are discussed and quantitative assessments of model quality are performed by modeling of a plant UGT for which the 3D crystal structure is known. We conclude that homology modeling offers a high degree of accuracy. Shortcomings in homology modeling are also apparent with modeling of loop regions remaining as a particularly difficult task.

19.
The three-dimensional solution structure of maize nonspecific lipid transfer protein (nsLTP) obtained by nuclear magnetic resonance (NMR) is compared to the X-ray structure. Although both structures are very similar, some local structural differences are observed in the first and the fourth helices and in several side-chain conformations. These discrepancies arise partly from intermolecular contacts in the crystal lattice. The main characteristic of nsLTP structures is the presence of an internal hydrophobic cavity whose volume was found to vary from 237 to 513 Å³ without major variations in the 15 solution structures. Comparison of crystal and NMR structures shows the existence of another small hollow at the periphery of the protein containing a water molecule in the X-ray structure, which could play an important structural role. A model of the complexed form of maize nsLTP by α-lysopalmitoylphosphatidylcholine was built by docking the lipid inside the protein cavity of the NMR structure. The main structural feature is a hydrogen bond found also in the X-ray structure of the complex maize nsLTP/palmitate between the hydroxyl of Tyr81 and the carbonyl of the lipid. Comparison of 12 primary sequences of nsLTPs emphasizes that all residues delineating the cavities calculated on solution and X-ray structures are conserved, which suggests that this large cavity is a common feature of all compared plant nsLTPs. Furthermore several conserved basic residues seem to be involved in the stabilization of the protein architecture. Proteins 31:160–171, 1998. © 1998 Wiley-Liss, Inc.

20.
Chen S  Jancrick J  Yokota H  Kim R  Kim SH 《Proteins》2004,55(4):785-791
UPF0040 is a family of proteins implicated in bacterial cell division. No structural information is available for proteins of this family. We have determined the crystal structure of a protein from Mycoplasma pneumoniae that belongs to this family using X-ray crystallography. A structural homology search reveals that this protein has a novel fold with no significant similarity to any proteins of known three-dimensional structure. The crystal structures of the protein in three different crystal forms reveal that the protein exists as an octameric ring. The conserved protein residues, including a highly conserved DXXXR motif, are examined on the basis of the crystal structure.


Copyright © 北京勤云科技发展有限公司 (Beijing Qinyun Technology Development Co., Ltd.)  京ICP备09084417号