Similar articles
1.
We have investigated some of the basic principles that influence generation of protein structures using a fragment-based, random insertion method. We tested buildup methods and fragment library quality for accuracy in constructing a set of known structures. The parameters most influential in the construction procedure are bond and torsion angles, with minor inaccuracies in bond angles alone causing >6 Å Cα RMSD for a 150-residue protein. Idealization to a standard set of values corrects this problem, but changes the torsion angles and does not work for every structure. Alternatively, we found using Cartesian coordinates instead of torsion angles did not reduce performance and can potentially increase speed and accuracy. Under conditions simulating ab initio structure prediction, fragment library quality can be suboptimal and still produce near-native structures. Using various clustering criteria, we created a number of libraries and used them to predict a set of native structures based on nonnative fragments. Local Cα RMSD fit of fragments, library size, and takeoff/landing angle criteria weakly influence the accuracy of the models. Based on a fragment's minimal perturbation upon insertion into a known structure, a seminative fragment library was created that produced more accurate structures with fragments that were less similar to native fragments than the other sets. These results suggest that fragments need only contain native-like subsections, which when correctly overlapped, can recreate a native-like model. For fragment-based, random insertion methods used in protein structure prediction and design, our findings help to define the parameters this method needs to generate near-native structures.
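The sensitivity to bond angles reported in this abstract can be illustrated by building a chain atom by atom from internal coordinates. What follows is a minimal sketch, not the authors' buildup code: `place_atom` is a generic internal-to-Cartesian placement, and the bond length (1.5 Å), bond angle (111°), torsion (−60°) and 2° angle error are illustrative assumptions. The drift is measured between two chains that share their first three atoms, without superposition, so it is not a Cα RMSD.

```python
import math

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def unit(a):
    n = math.sqrt(a[0] ** 2 + a[1] ** 2 + a[2] ** 2)
    return (a[0] / n, a[1] / n, a[2] / n)

def place_atom(a, b, c, length, angle, torsion):
    """Return d with |cd| = length, angle(b, c, d) = angle and
    dihedral(a, b, c, d) = torsion (angles in radians)."""
    bc = unit(sub(c, b))
    n = unit(cross(sub(b, a), bc))   # normal of the a-b-c plane
    m = cross(n, bc)                 # completes the local orthonormal frame
    x = -length * math.cos(angle)
    y = length * math.sin(angle) * math.cos(torsion)
    z = length * math.sin(angle) * math.sin(torsion)
    return tuple(c[i] + x * bc[i] + y * m[i] + z * n[i] for i in range(3))

def build_chain(n_atoms, length, angle, torsion):
    """Grow a chain with constant internal coordinates from three seed atoms."""
    chain = [(0.0, 0.0, 0.0), (length, 0.0, 0.0), (length, length, 0.0)]
    while len(chain) < n_atoms:
        a, b, c = chain[-3:]
        chain.append(place_atom(a, b, c, length, angle, torsion))
    return chain

def drift(chain_a, chain_b):
    """Per-atom distance between two chains (no superposition)."""
    return [math.dist(p, q) for p, q in zip(chain_a, chain_b)]

# 450 atoms stands in for the backbone of a 150-residue protein (assumption).
ideal = build_chain(450, 1.5, math.radians(111.0), math.radians(-60.0))
off = build_chain(450, 1.5, math.radians(113.0), math.radians(-60.0))
per_atom = drift(ideal, off)
print(f"drift at atom 4: {per_atom[3]:.3f} A; at atom 450: {per_atom[-1]:.3f} A")
```

The first placed atom differs only by the chord of the 2° error, while later atoms inherit every earlier deviation, which is why errors that look negligible locally can accumulate along the chain.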

2.
Structural genomics projects as well as ab initio protein structure prediction methods provide structures of proteins with no sequence or fold similarity to proteins with known functions. These are often low-resolution structures that may only include the positions of Cα atoms. We present a fast and efficient method to predict DNA-binding proteins from just the amino acid sequences and low-resolution, Cα-only protein models. The method uses the relative proportions of certain amino acids in the protein sequence, the asymmetry of the spatial distribution of certain other amino acids as well as the dipole moment of the molecule. These quantities are used in a linear formula, with coefficients derived from logistic regression performed on a training set, and DNA-binding is predicted based on whether the result is above a certain threshold. We show that the method is insensitive to errors in the atomic coordinates and provides correct predictions even on inaccurate protein models. We demonstrate that the method is capable of predicting proteins with novel binding site motifs and structures solved in an unbound state. The accuracy of our method is close to another, published method that uses all-atom structures, time-consuming calculations and information on conserved residues.
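The "linear formula plus threshold" step described in this abstract has the general shape sketched below. This is only an illustration of a logistic model: the abstract does not name the amino acids, the coefficients or the threshold, so the residue set `"KR"`, the weights, the bias and the feature values are all hypothetical placeholders.

```python
import math

def residue_fraction(sequence, residues):
    """Fraction of positions in `sequence` occupied by any of `residues`."""
    seq = sequence.upper()
    return sum(seq.count(r) for r in residues) / len(seq)

def binding_probability(features, weights, bias):
    """Logistic model: sigmoid of a linear combination of the features."""
    score = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-score))

def predicts_dna_binding(features, weights, bias, threshold=0.5):
    """Call the protein DNA-binding when the probability reaches the threshold."""
    return binding_probability(features, weights, bias) >= threshold

# Hypothetical inputs, for illustration only.
sequence = "MKRKRSTAAGLKRERL"
features = [
    residue_fraction(sequence, "KR"),  # proportion of Lys + Arg (assumed choice)
    0.30,                              # spatial asymmetry measure (made-up value)
    1.20,                              # scaled dipole moment (made-up value)
]
weights = [4.0, 1.5, 0.8]  # placeholders; real values would come from a fitted regression
bias = -3.0
print(binding_probability(features, weights, bias),
      predicts_dna_binding(features, weights, bias))
```

Because the score is linear in the features, small coordinate errors shift the asymmetry and dipole terms only slightly, which is consistent with the robustness the abstract reports.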

3.
Prediction of the three-dimensional structure of human growth hormone   (Cited by: 2; self-citations: 0; citations by others: 2)
F E Cohen, I D Kuntz. Proteins 1987, 2(2):162-166
In recent years, the protein-folding problem has attracted the attention of molecular biologists. Efforts have focused on developing heuristic and energy-based algorithms to predict the three-dimensional structure of a protein from its amino acid sequence. We have applied a series of heuristic algorithms to the sequence of human growth hormone. A family of five structures which are generically right-handed fourfold alpha-helical bundles are found from an investigation of approximately 10^8 structures. A plausible receptor binding site is suggested. Independent crystallographic analysis confirms some aspects of these predictions. These methods only deal with the "core" structure, and conformations of many residues are not defined. Further work is required to identify a unique set of coordinates and to clarify the topological alternatives available to alpha-helical proteins.

4.
We extracted phosphorus atom coordinates from the database of DNA crystal structures and calculated geometrical parameters needed to reproduce the crystal structures in the phosphorus atom representation. Using the geometrical parameters we wrote a piece of software assigning the phosphorus atom coordinates to the DNA of any nucleotide sequence. The software demonstrates non-negligible influence of the primary structure on DNA helicity, which may stand behind the heteronomous double helices of poly(dA).poly(dT) and poly(dG).poly(dC). In addition, the software is so simple that it makes it possible to simulate the "crystal" structures of not only viral DNAs, but also the whole genome of Saccharomyces cerevisiae as well as the DNA of human chromosome 22, which is dozens of megabases in length.

5.
We report a very fast and accurate physics-based method to calculate pH-dependent electrostatic effects in protein molecules and to predict the pK values of individual sites of titration. In addition, a CHARMm-based algorithm is included to construct and refine the spatial coordinates of all hydrogen atoms at a given pH. The present method combines electrostatic energy calculations based on the Generalized Born approximation with an iterative mobile clustering approach to calculate the equilibria of proton binding to multiple titration sites in protein molecules. The use of the GBIM (Generalized Born with Implicit Membrane) CHARMm module makes it possible to model not only water-soluble proteins but membrane proteins as well. The method includes a novel algorithm for preliminary refinement of hydrogen coordinates. Another difference from existing approaches is that, instead of monopeptides, a set of relaxed pentapeptide structures are used as model compounds. Tests on a set of 24 proteins demonstrate the high accuracy of the method. On average, the RMSD between predicted and experimental pK values is close to 0.5 pK units on this data set, and the accuracy is achieved at very low computational cost. The pH-dependent assignment of hydrogen atoms also shows very good agreement with protonation states and hydrogen-bond network observed in neutron-diffraction structures. The method is implemented as a computational protocol in Accelrys Discovery Studio and provides a fast and easy way to study the effect of pH on many important mechanisms such as enzyme catalysis, ligand binding, protein-protein interactions, and protein stability.
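Two quantities in this abstract can be made concrete with a short sketch: the protonation state a pK value implies at a given pH, and the RMSD used to compare predicted with experimental pK values. The code below assumes a single, non-interacting titration site (the Henderson-Hasselbalch relation), which is far simpler than the coupled multi-site equilibria the method itself solves; the pK values in the example are made up.

```python
import math

def protonated_fraction(ph, pk):
    """Henderson-Hasselbalch: protonated fraction of one isolated site."""
    return 1.0 / (1.0 + 10.0 ** (ph - pk))

def rmsd(predicted, experimental):
    """Root-mean-square deviation between two equally long lists of pK values."""
    if len(predicted) != len(experimental):
        raise ValueError("lists must have the same length")
    squared = sum((p - e) ** 2 for p, e in zip(predicted, experimental))
    return math.sqrt(squared / len(predicted))

# Made-up pK values for three hypothetical sites.
predicted_pk = [3.9, 6.8, 10.1]
experimental_pk = [4.3, 6.5, 10.6]
print(f"RMSD = {rmsd(predicted_pk, experimental_pk):.2f} pK units")
for pk in predicted_pk:
    print(f"pK {pk}: {protonated_fraction(7.0, pk):.3f} protonated at pH 7")
```

A site is half protonated when pH equals its pK, so an error of about 0.5 pK units matters most for sites whose pK lies close to the pH of interest.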

6.
A method for generating a complete polypeptide backbone structure from a set of Cα coordinates is presented. Initial trial values of φ and ψ for a selected residue are chosen (essentially from an identification of the conformational region of the virtual-bond backbone, e.g., an α-helical region), and values of φ and ψ for the remaining residues (both towards the N- and C-terminus) are then computed, subject to the constraint that the chain have the same virtual-bond angles and virtual-bond dihedral angles as the given set of Cα coordinates. The conversion from Cα coordinates to full backbone dihedral angles (φ,ψ) involves the solution of a set of algebraic equations relating the virtual-bond angles and virtual-bond dihedral angles to standard peptide geometry and backbone dihedral angles. The procedure has been tested successfully on Cα coordinates taken from standard-geometry full-atom structures of bovine pancreatic trypsin inhibitor (BPTI). Some difficulty was encountered with error-sensitive residues, but on the whole the backbone generation was successful. Application of the method to Cα coordinates for BPTI derived from simplified model calculations (involving nonstandard geometry) showed that such coordinates may be inconsistent with the requirement that φ(Pro) be near −75°. In such a case, i.e., for residues for which the algebraic method failed, a least-squares minimizer was then used in conjunction with the algebraic method; the mean-square deviation of the calculated Cα coordinates from the given ones was minimized by varying the backbone dihedral angles. Thus, these inconsistencies were circumvented and a full backbone structure whose Cα coordinates had an rms deviation of 0.26 Å from the given set of Cα coordinates was obtained.
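The virtual-bond angles and virtual-bond dihedral angles that constrain this procedure are plain geometric functions of consecutive Cα positions. The sketch below computes both with the usual angle and torsion formulas. The function names are my own, the example coordinates are invented, and the algebraic step that converts these quantities to (φ,ψ) is not reproduced here.

```python
import math

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def virtual_bond_angle(a, b, c):
    """Angle at b (degrees) formed by three consecutive C-alpha atoms."""
    u, v = sub(a, b), sub(c, b)
    cos_value = dot(u, v) / math.sqrt(dot(u, u) * dot(v, v))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_value))))

def virtual_bond_dihedral(a, b, c, d):
    """Signed dihedral (degrees) defined by four consecutive C-alpha atoms."""
    b1, b2, b3 = sub(b, a), sub(c, b), sub(d, c)
    n1, n2 = cross(b1, b2), cross(b2, b3)
    y = math.sqrt(dot(b2, b2)) * dot(b1, n2)
    x = dot(n1, n2)
    return math.degrees(math.atan2(y, x))

# Invented coordinates for four consecutive C-alpha atoms (angstroms).
ca = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (5.1, 3.6, 0.0), (8.0, 4.5, 2.3)]
print(virtual_bond_angle(ca[0], ca[1], ca[2]))
print(virtual_bond_dihedral(ca[0], ca[1], ca[2], ca[3]))
```

A chain of N residues yields N−2 virtual-bond angles and N−3 virtual-bond dihedrals, which is the information the rebuilt backbone is required to reproduce.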

7.
We present a new method for predicting the secondary structure of globular proteins based on non-linear neural network models. Network models learn from existing protein structures how to predict the secondary structure of local sequences of amino acids. The average success rate of our method on a testing set of proteins non-homologous with the corresponding training set was 64.3% on three types of secondary structure (alpha-helix, beta-sheet, and coil), with correlation coefficients of Cα = 0.41, Cβ = 0.31 and Ccoil = 0.41. These quality indices are all higher than those of previous methods. The prediction accuracy for the first 25 residues of the N-terminal sequence was significantly better. We conclude from computational experiments on real and artificial structures that no method based solely on local information in the protein sequence is likely to produce significantly better results for non-homologous proteins. The performance of our method on homologous proteins is much better than for non-homologous proteins, but is not as good as simply assuming that homologous sequences have identical structures.
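The two kinds of quality index quoted in this abstract, a three-state success rate and a per-state correlation coefficient, can be computed as sketched below. The sketch assumes the correlation coefficients are Matthews correlation coefficients, which the abstract does not state explicitly; the H/E/C strings are invented.

```python
import math

def q3(predicted, observed):
    """Fraction of residues whose predicted state (H, E or C) matches the observed one."""
    if len(predicted) != len(observed):
        raise ValueError("strings must have the same length")
    return sum(p == o for p, o in zip(predicted, observed)) / len(observed)

def matthews_cc(predicted, observed, state):
    """Matthews correlation coefficient for one state against all others."""
    tp = fp = tn = fn = 0
    for p, o in zip(predicted, observed):
        if p == state and o == state:
            tp += 1
        elif p == state:
            fp += 1
        elif o == state:
            fn += 1
        else:
            tn += 1
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention chosen here: return 0.0 when the coefficient is undefined.
    return (tp * tn - fp * fn) / denominator if denominator else 0.0

# Invented example strings.
observed = "HHHHEECC"
predicted = "HHHCEECC"
print(q3(predicted, observed))
for state in "HEC":
    print(state, round(matthews_cc(predicted, observed, state), 3))
```

Unlike the overall success rate, the per-state coefficient penalizes over-prediction of a common state, which is why the two are reported together.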

8.
Circular dichroism (CD) is a spectroscopic technique commonly used to investigate the structure of proteins. Major secondary structure types, alpha‐helices and beta‐strands, produce distinctive CD spectra. Thus, by comparing the CD spectrum of a protein of interest to a reference set consisting of CD spectra of proteins of known structure, predictive methods can estimate the secondary structure of the protein. Currently available methods, including K2D2, use such experimental CD reference sets, which are very small in size when compared to the number of tertiary structures available in the Protein Data Bank (PDB). Conversely, given a PDB structure, it is possible to predict a theoretical CD spectrum from it. The methodological framework for this calculation was established long ago, but only recently has a convenient implementation called DichroCalc been developed. In this study, we set out to determine whether theoretically derived spectra could be used as a reference set for accurate CD-based predictions of secondary structure. We used DichroCalc to calculate the theoretical CD spectra of a nonredundant set of structures representing most proteins in the PDB, and applied a straightforward approach for predicting protein secondary structure content using these theoretical CD spectra as a reference set. We show that this method improves the predictions, particularly for the wavelength interval between 200 and 240 nm and for beta‐strand content. We have implemented this method, called K2D3, in a publicly accessible web server at http://www.ogic.ca/projects/k2d3. Proteins 2012. © 2011 Wiley Periodicals, Inc.

9.
A knowledge-based approach to the modelling of enzyme-peptide inhibitor complexes is described. Given the structure of an enzyme, and knowledge of its binding site, the method seeks to predict the binding geometry of a peptide ligand. This novel method involves using examples of side-chain packing derived from proteins of known three-dimensional structure to define possible packing arrangements of a peptide inhibitor group to its binding site. A suite of programs, GEMINI, was written and used to predict the packing of pairs of amino acid groups from three inhibitors complexed to their enzymes for which the X-ray structures were available. These included the Phe group of the inhibitor H142 bound to endothiapepsin, the Leu group of CLT complexed to thermolysin and the C-terminus of Gly-L-Tyr bound to carboxypeptidase A. A detailed comparison of the modelled and observed inhibitor coordinates was made. This approach may be extended to modelling other types of protein interactions.

10.
Atomic-resolution structures have had a tremendous impact on modern biological science. Much useful information also has been gleaned by merging and correlating atomic-resolution structural details with lower-resolution (15–40 Å), three-dimensional (3D) reconstructions computed from images recorded with cryo-transmission electron microscopy (cryoTEM) procedures. One way to merge these structures involves reducing the resolution of an atomic model to a level comparable to a cryoTEM reconstruction. A low-resolution density map can be derived from an atomic-resolution structure by retrieving a set of atomic coordinates, editing the coordinate file, computing structure factors from the model coordinates, and computing the inverse Fourier transform of the structure factors. This method is a useful tool for structural studies primarily in combination with 3D cryoTEM reconstructions. It has been used to assess the quality of 3D reconstructions, to determine corrections for the phase-contrast transfer function of the transmission electron microscope, to calibrate the dimensions and handedness of 3D reconstructions, to produce difference maps, to model features in macromolecules or macromolecular complexes, and to generate models to initiate model-based determination of particle orientation and origin parameters for 3D reconstruction.
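The route from coordinates to a low-resolution map (structure factors, then an inverse Fourier transform truncated at the target resolution) can be shown in one dimension. This is a toy sketch under strong assumptions: point atoms on a periodic 1D cell, hypothetical positions and scattering weights, and no temperature factors or 3D symmetry. It is not the procedure or software used in the paper.

```python
import cmath
import math

def structure_factors(positions, weights, cell, h_max):
    """F(h) = sum_j w_j * exp(2*pi*i*h*x_j / cell) for h = -h_max..h_max."""
    return {
        h: sum(w * cmath.exp(2j * math.pi * h * x / cell)
               for x, w in zip(positions, weights))
        for h in range(-h_max, h_max + 1)
    }

def density_map(factors, cell, n_grid):
    """Inverse Fourier synthesis of the density on n_grid evenly spaced points."""
    rho = []
    for k in range(n_grid):
        x = k * cell / n_grid
        total = sum(f * cmath.exp(-2j * math.pi * h * x / cell)
                    for h, f in factors.items())
        # F(-h) is the conjugate of F(h), so the imaginary parts cancel.
        rho.append(total.real / cell)
    return rho

cell = 40.0                    # cell length in angstroms (assumption)
positions = [10.0, 13.0, 25.0] # hypothetical atom positions
weights = [6.0, 7.0, 8.0]      # hypothetical scattering weights
h_max = 2                      # keeps terms down to cell / h_max = 20 angstroms
rho = density_map(structure_factors(positions, weights, cell, h_max), cell, 40)
print([round(value, 3) for value in rho])
```

Lowering `h_max` discards the high-frequency terms, so the atoms at 10 and 13 Å merge into a single broad feature, which is the 1D analogue of matching an atomic model to a 15–40 Å reconstruction.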

11.
A new method is proposed for docking ligands into proteins in cases where an NMR-determined solution structure of a related complex is available. The method uses a set of experimentally determined values for protein–ligand, ligand–ligand, and protein–protein restraints for residues in or near to the binding site, combined with a set of protein–protein restraints involving all the other residues which is taken from the list of restraints previously used to generate the reference structure of a related complex. This approach differs from ordinary docking methods where the calculation uses fixed atomic coordinates from the reference structure rather than the restraints used to determine the reference structure. The binding site residues influenced by replacing the reference ligand by the new ligand were determined by monitoring differences in ¹H chemical shifts. The method has been validated by showing the excellent agreement between structures of L. casei dihydrofolate reductase·trimetrexate calculated by conventional methods using a full experimentally determined set of restraints and those using this new restraint docking method based on an L. casei dihydrofolate reductase·methotrexate reference structure.

12.
Chung SY, Subbiah S. Proteins 1999, 35(2):184-194
The precision and accuracy of protein structures determined by nuclear magnetic resonance (NMR) spectroscopy depend on the completeness of the input experimental data set. Typically, rather than a single structure, an ensemble of up to 20 equally representative conformers is generated and routinely deposited in the Protein Database. There are substantially more experimentally derived restraints available to define the main-chain coordinates than those of the side chains. Consequently, the side-chain conformations among the conformers are more variable and less well defined than those of the backbone. Even when a side chain is determined with high precision and is found to adopt very similar orientations among all the conformers in the ensemble, it is possible that its orientation might still be incorrect. Thus, it would be helpful if there were a method to assess independently the side-chain orientations determined by NMR. Recently, homology modeling by side-chain packing algorithms has been shown to be successful in predicting the side-chain conformations of the buried residues for a protein when the main-chain coordinates and sequence information are given. Since the main-chain coordinates determined by NMR are consistently more reliable than those of the side-chains, we have applied the side-chain packing algorithms to predict side-chain conformations that are compatible with the NMR-derived backbone. Using four test cases where the NMR solution structures and the X-ray crystal structure of the same protein are available, we demonstrate that the side-chain packing method can provide independent validation for the side-chain conformations of NMR structures. Comparison of the side-chain conformations derived by side-chain packing prediction and by NMR spectroscopy demonstrates that when there is agreement between the NMR model and the predicted model, on average 78% of the time the X-ray structure also concurs. While the side-chain packing method can confirm the reliable residue conformations in NMR models, more importantly, it can also identify the questionable residue conformations with an accuracy of 60%. This validation method can serve to increase the confidence level for potential users of structural models determined by NMR.

13.
It has been many years since position-specific residue preference around the ends of a helix was revealed. However, none of the existing secondary structure prediction methods exploited this preference, resulting in low accuracy in predicting the ends of secondary structures. In this study, we collected a relatively large data set consisting of 1860 high-resolution, non-homologous proteins from the PDB, and further analyzed the residue distributions around the ends of regular secondary structures. It was found that there exist position-specific residue preferences (PSRP) around the ends of not only helices but also strands. Based on these unique features, we proposed a novel strategy and developed a tool named E-SSpred that treats the secondary structure as a whole and builds models to predict entire secondary structure segments directly by integrating relevant features. In E-SSpred, the support vector machine (SVM) method is adopted to model and predict the ends of helices and strands according to the unique residue distributions around them. A simple linear discriminant analysis method is applied to model and predict entire secondary structure segments by integrating end-prediction results, tri-peptide composition, and length distribution features of secondary structures, as well as the prediction results of the widely used program PSIPRED. The results of fivefold cross-validation on a widely used data set demonstrate that the accuracy of E-SSpred in predicting ends of secondary structures is about 10% higher than PSIPRED, and the overall prediction accuracy (Q3 value) of E-SSpred (82.2%) is also better than PSIPRED (80.3%). The E-SSpred web server is available at http://bioinfo.hust.edu.cn/bio/tools/E-SSpred/index.html.

14.
Comparing and classifying the three-dimensional (3D) structures of proteins is of crucial importance to molecular biology, from helping to determine the function of a protein to determining its evolutionary relationships. Traditionally, 3D structures are classified into groups of families that closely resemble the grouping according to their primary sequence. However, significant structural similarities exist at multiple levels between proteins that belong to these different structural families. In this study, we propose a new algorithm, CLICK, to capture such similarities. The method optimally superimposes a pair of protein structures independent of topology. Amino acid residues are represented by the Cartesian coordinates of a representative point (usually the Cα atom), side chain solvent accessibility, and secondary structure. Structural comparison is effected by matching cliques of points. CLICK was extensively benchmarked for alignment accuracy on four different sets: (i) 9537 pair-wise alignments between two structures with the same topology; (ii) 64 alignments from set (i) that were considered to constitute difficult alignment cases; (iii) 199 pair-wise alignments between proteins with similar structure but different topology; and (iv) 1275 pair-wise alignments of RNA structures. The accuracy of CLICK alignments was measured by the average structure overlap score and compared with other alignment methods, including HOMSTRAD, MUSTANG, Geometric Hashing, SALIGN, DALI, GANGSTA+, FATCAT, ARTS and SARA. On average, CLICK produces pair-wise alignments that are either comparable or statistically significantly more accurate than all of these other methods. We have used CLICK to uncover relationships between (previously) unrelated proteins. These new biological insights include: (i) detecting hinge regions in proteins where domain or sub-domains show flexibility; (ii) discovering similar small molecule binding sites from proteins of different folds and (iii) discovering topological variants of known structural/sequence motifs. Our method can generally be applied to compare any pair of molecular structures represented in Cartesian coordinates as exemplified by the RNA structure superimposition benchmark.

15.
De novo structure prediction can be defined as a search in conformational space under the guidance of an energy function. The most successful de novo structure prediction methods, such as Rosetta, assemble the fragments from known structures to reduce the search space. Therefore, the fragment quality is an important factor in structure prediction. In our study, a method is proposed to generate a new set of fragments from the lowest energy de novo models. These fragments were subsequently used to predict the next round of models. In a benchmark of 30 proteins, the new set of fragments showed better performance when used to predict de novo structures. The lowest energy model predicted using our method was closer to native structure than Rosetta for 22 proteins. Following a similar trend, the best model among top five lowest energy models predicted using our method was closer to native structure than Rosetta for 20 proteins. In addition, our experiment showed that the C‐alpha root mean square deviation was improved from 5.99 to 5.03 Å on average compared to Rosetta when the lowest energy models were picked as the best predicted models. Proteins 2014; 82:2240–2252. © 2014 Wiley Periodicals, Inc.

16.
Predicted protein residue–residue contacts can be used to build three‐dimensional models and consequently to predict protein folds from scratch. A considerable amount of effort is currently being spent to improve contact prediction accuracy, whereas few methods are available to construct protein tertiary structures from predicted contacts. Here, we present an ab initio protein folding method to build three‐dimensional models using predicted contacts and secondary structures. Our method first translates contacts and secondary structures into distance, dihedral angle, and hydrogen bond restraints according to a set of new conversion rules, and then provides these restraints as input for a distance geometry algorithm to build tertiary structure models. The initially reconstructed models are used to regenerate a set of physically realistic contact restraints and detect secondary structure patterns, which are then used to reconstruct final structural models. This unique two‐stage modeling approach of integrating contacts and secondary structures improves the quality and accuracy of structural models and in particular generates better β‐sheets than other algorithms. We validate our method on two standard benchmark datasets using true contacts and secondary structures. Our method improves TM‐score of reconstructed protein models by 45% and 42% over the existing method on the two datasets, respectively. On the dataset for benchmarking reconstruction methods with predicted contacts and secondary structures, the average TM‐score of best models reconstructed by our method is 0.59, 5.5% higher than the existing method. The CONFOLD web server is available at http://protein.rnet.missouri.edu/confold/. Proteins 2015; 83:1436–1449. © 2015 Wiley Periodicals, Inc.
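The TM-score used to evaluate the reconstructed models has a simple closed form once the model is superimposed on the native structure. The sketch below evaluates that sum for a given list of per-residue distances. It assumes the superposition has already been done and does not perform the search over superpositions that a full TM-score calculation includes; the 0.5 Å floor on `d0` for short chains is an assumption added here to keep the formula well defined.

```python
def tm_d0(target_length):
    """Length-dependent distance scale d0 (angstroms) of the TM-score."""
    if target_length <= 21:
        return 0.5  # assumed floor for short chains
    return 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8

def tm_score(distances, target_length):
    """TM-score for aligned residue pairs at the given distances (angstroms).

    `distances` holds one value per aligned pair; unaligned residues of the
    target contribute nothing, so the score is normalized by target_length.
    """
    d0 = tm_d0(target_length)
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / target_length

# Invented distances for a hypothetical 100-residue target with 80 aligned pairs.
distances = [1.0] * 40 + [3.0] * 30 + [8.0] * 10
print(round(tm_score(distances, 100), 3))
```

Because each term is bounded by 1 and divided by the target length, the score lies between 0 and 1 and is less dominated by a few badly placed residues than an RMSD is.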

17.
A directed conformational search algorithm using the program CONGEN (ref. 3), which samples backbone conformers, is described. The search technique uses information from the partially built structures to direct the search process and is tested on the problem of generating a full set of backbone Cartesian coordinates given only alpha-carbon coordinates. The method has been tested on six proteins of known structure, varying in size and classification, and was able to generate the original backbone coordinates with RMS deviations ranging from 0.30 to 0.87 Å for the alpha-carbons and from 0.5 to 0.99 Å for the backbone atoms. Cis peptide linkages were also correctly identified. The procedure was also applied to two proteins available with only alpha-carbon coordinates in the Brookhaven Protein Data Bank: thioredoxin (SRX) and triacylglycerol acylhydrolase (TGL). All-atom models are proposed for the backbone of both these proteins. In addition, the technique was applied to randomized coordinates of flavodoxin to assess the effects of irregularities in the data on the final RMS. This study represents the first time a deterministic conformational search was used on such a large scale.

18.
We propose a new approach for calculating the three-dimensional (3D) structure of a protein from distance and dihedral angle constraints derived from experimental data. We suggest that such constraints can be obtained from experiments such as tritium planigraphy, chemical or enzymatic cleavage of the polypeptide chain, paramagnetic perturbation of nuclear magnetic resonance (NMR) spectra, measurement of hydrogen-exchange rates, mutational studies, mass spectrometry, and electron paramagnetic resonance. These can be supplemented with constraints from theoretical prediction of secondary structures and of buried/exposed residues. We report here distance geometry calculations to generate the structures of a test protein Staphylococcal nuclease (STN), and the HIV-1 rev protein (REV) of unknown structure. From the available 3D atomic coordinates of STN, we set up simulated data sets consisting of varying number and quality of constraints, and used our group's Self Correcting Distance Geometry (SECODG) program DIAMOD to generate structures. We could generate the correct tertiary fold from qualitative (approximate) as well as precise distance constraints. The root mean square deviations of backbone atoms from the native structure were in the range of 2.0 Å to 8.3 Å, depending on the number of constraints used. We could also generate the correct fold starting from a subset of atoms that are on the surface and those that are buried. When we used data sets containing a small fraction of incorrect distance constraints, the SECODG technique was able to detect and correct them. In the case of REV, we used a combination of constraints obtained from mutagenic data and structure predictions. DIAMOD generated helix-loop-helix models, which, after four self-correcting cycles, populated one family exclusively. The features of the energy-minimized model are consistent with the available data on REV-RNA interaction. Our method could thus be an attractive alternative for calculating protein 3D structures, especially in cases where the traditional methods of X-ray crystallography and multidimensional NMR spectroscopy have been unsuccessful.

19.
Generation of full protein coordinates from limited information, e.g., the Cα coordinates, is an important step in protein homology modeling and structure determination, and molecular dynamics (MD) simulations may prove to be important in this task. We describe a new method, in which the protein backbone is built quickly in a rather crude way and then refined by minimization techniques. Subsequently, the side chains are positioned using extensive MD calculations. The method is tested on two proteins, and results compared to proteins constructed using two other MD-based methods. In the first method, we supplemented an existing backbone building method with a new procedure to add side chains. The second one largely consists of available methodology. The constructed proteins are compared to the corresponding X-ray structures, which became available during this study, and they are in good agreement (backbone RMS values of 0.5–0.7 Å, and all-atom RMS values of 1.5–1.9 Å). This comparative study indicates that extensive MD simulations are able, to some extent, to generate details of the native protein structure, and may contribute to the development of a standardized methodology to predict reliably (parts of) protein structures when only partial coordinate data are available. © 1994 John Wiley & Sons, Inc.

