Similar articles
 Found 20 similar articles; search took 93 ms
1.
VISTRAJ is an application which allows 3D visualization, manipulation and editing of protein conformational space using probabilistic maps of this space called 'trajectory distributions'. Trajectory distributions serve as input to FOLDTRAJ which samples protein structures based on the represented conformational space. VISTRAJ also allows FOLDTRAJ to be used as a tool for homology model creation, and structures may be generated containing post-translationally modified amino acids. AVAILABILITY: Binaries are freely available for non-profit use as part of the FOLDTRAJ package at ftp://ftp.mshri.on.ca/pub/TraDES/foldtraj/.

2.
Protein structure prediction from sequence alone by "brute force" random methods is a computationally expensive problem. Estimates have suggested that it could take all the computers in the world longer than the age of the universe to compute the structure of a single 200-residue protein. Here we investigate the use of a faster version of our FOLDTRAJ probabilistic all-atom protein-structure-sampling algorithm. We have improved the method so that it is now over twenty times faster than originally reported, and capable of rapidly sampling conformational space without lattices. It uses geometrical constraints and a Lennard-Jones type potential for self-avoidance. We have also implemented a novel method to add secondary structure-prediction information to make protein-like amounts of secondary structure in sampled structures. In a set of 100,000 probabilistic conformers of 1VII, 1ENH, and 1PMC generated, the structures with smallest Cα RMSD from native are 3.95, 5.12, and 5.95 Å, respectively. Expanding this test to a set of 17 distinct protein folds, we find that all-helical structures are "hit" by brute force more frequently than beta or mixed structures. For small helical proteins or very small non-helical ones, this approach should have a "hit" close enough to detect with a good scoring function in a pool of several million conformers. By fitting the distribution of RMSDs from the native state of each of the 17 sets of conformers to the extreme value distribution, we are able to estimate the size of conformational space for each. With a 0.5 Å RMSD cutoff, the number of conformers is roughly 2^N where N is the number of residues in the protein. This is smaller than previous estimates, indicating an average of only two possible conformations per residue when sterics are accounted for. 
Our method reduces the effective number of conformations available at each residue by probabilistic bias, without requiring any particular discretization of residue conformational space, and is the fastest method of its kind. With computer speeds doubling every 18 months and parallel and distributed computing becoming more practical, the brute force approach to protein structure prediction may yet have some hope in the near future.
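
To give a feel for the scaling quoted in this abstract (about 2^N conformers for an N-residue protein), the sketch below computes the order of magnitude of the conformer count and, going the other way, the chain length at which a pool of a given size matches 2^N. The function names and the default of two states per residue as a tunable parameter are illustrative assumptions, not part of FOLDTRAJ, and the second function assumes idealized uniform coverage, which real sampling does not achieve.

```python
import math


def log10_conformers(n_residues, states_per_residue=2.0):
    """Return log10 of states_per_residue ** n_residues.

    Working in log space avoids huge integers for long chains.
    """
    return n_residues * math.log10(states_per_residue)


def residues_covered(pool_size, states_per_residue=2.0):
    """Chain length N at which states_per_residue ** N equals pool_size.

    Assumes (hypothetically) that every conformer in the pool is distinct
    and that the pool covers conformational space uniformly.
    """
    return math.log(pool_size, states_per_residue) if states_per_residue != 2.0 \
        else math.log2(pool_size)


if __name__ == "__main__":
    for n in (20, 50, 100, 200):
        print(f"N = {n:3d}: about 10^{log10_conformers(n):.1f} conformers")
    print(f"A pool of 4 million conformers matches 2^N at N = "
          f"{residues_covered(4_000_000):.1f}")
```

Under these assumptions, a pool of a few million conformers corresponds to roughly 2^22 states, which is consistent with the abstract's remark that brute force is only plausible for small proteins.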

3.
罗升  吕强 《生物信息学》2016,14(2):117-122
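In protein structure prediction, sampling means generating the states with the lowest free energy in conformational space. Traditional sampling methods assign values to the degrees of freedom directly. This works well for small numbers of residues, but for protein structures with more than 100 residues the conformational space grows so rapidly that good structures are hard to obtain. This paper introduces the HMC (Hybrid Monte Carlo) sampling method from deep learning, which samples a protein's degrees of freedom according to a probability distribution and can sample protein structures with 100, 200, or even more residues. In addition, distance constraints between residues are added during sampling, so that within a structure up to 75% (40% on average) of residue pairs are improved relative to Rosetta ab initio and satisfy the distance constraints.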
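
As a rough illustration of the Hybrid Monte Carlo idea named in this abstract, the sketch below samples a single torsion-like angle with leapfrog integration and a Metropolis accept/reject step. The von Mises-like target (centre `MU`, concentration `KAPPA`), the step size, and all function names are assumptions made for the example; the paper's actual energy function, degrees of freedom, and distance constraints are not reproduced here.

```python
import math
import random

MU = -1.0     # assumed preferred angle in radians (hypothetical)
KAPPA = 4.0   # assumed concentration of the target distribution (hypothetical)


def log_prob(x):
    """Unnormalized log density of a von Mises-like torsion distribution."""
    return KAPPA * math.cos(x - MU)


def grad_log_prob(x):
    """Derivative of log_prob with respect to x."""
    return -KAPPA * math.sin(x - MU)


def hmc_sample(log_p, grad_log_p, x0, n_samples, step=0.2, n_leapfrog=10, seed=0):
    """Draw n_samples from exp(log_p) with one-dimensional HMC."""
    rng = random.Random(seed)
    x = x0
    samples = []
    for _ in range(n_samples):
        p = rng.gauss(0.0, 1.0)              # resample the momentum
        x_new, p_new = x, p
        # Leapfrog: half momentum step, alternating full steps, half momentum step.
        p_new += 0.5 * step * grad_log_p(x_new)
        for i in range(n_leapfrog):
            x_new += step * p_new
            if i < n_leapfrog - 1:
                p_new += step * grad_log_p(x_new)
        p_new += 0.5 * step * grad_log_p(x_new)
        # Metropolis test on the change in total energy (potential + kinetic).
        h_old = -log_p(x) + 0.5 * p * p
        h_new = -log_p(x_new) + 0.5 * p_new * p_new
        if rng.random() < math.exp(min(0.0, h_old - h_new)):
            x = x_new
        samples.append(x)
    return samples


def circular_mean(angles):
    """Mean direction of a list of angles, in radians."""
    s = sum(math.sin(a) for a in angles)
    c = sum(math.cos(a) for a in angles)
    return math.atan2(s, c)


if __name__ == "__main__":
    draws = hmc_sample(log_prob, grad_log_prob, x0=0.0, n_samples=2000)
    print(f"circular mean of samples: {circular_mean(draws):.2f} rad")
```

The gradient-guided trajectories are what let HMC make large moves that are still accepted often, which is the property that would matter when many coupled degrees of freedom are sampled together.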

4.
D R Ripoll  H A Scheraga 《Biopolymers》1990,30(1-2):165-176
The conformational space of the membrane-bound portion of melittin has been searched using the electrostatically driven Monte Carlo (EDMC) method with the ECEPP/2 (empirical conformational energy program for peptides) algorithm. The former methodology assumes that a polypeptide or protein molecule is driven toward the native structure by the combined action of electrostatic interactions and stochastic conformational changes associated with thermal movements. The algorithm produces a Monte Carlo search in the conformational hyperspace of the polypeptide using electrostatic predictions and a random sampling technique, combined with local minimization of the energy function, to locate low-energy conformations. As a result of 8 test calculations on the 20-residue membrane-bound portion of melittin, starting from six arbitrary and two completely random conformations, the method was able to locate a very low-energy region of the potential with a well-defined structure for the backbone. In all of the cases under study, the method found a cluster of similar low-energy conformations that agree well with the structure deduced from x-ray diffraction experiments and with one computed earlier by the build-up procedure.

5.
Daily MD  Gray JJ 《Proteins》2007,67(2):385-399
Allosteric proteins have been studied extensively in the last 40 years, but so far, no systematic analysis of conformational changes between allosteric structures has been carried out. Here, we compile a set of 51 pairs of known inactive and active allosteric protein structures from the Protein Data Bank. We calculate local conformational differences between the two structures of each protein using simple metrics, such as backbone and side-chain Cartesian displacement, and torsion angle change and rearrangement in residue-residue contacts. Thresholds for each metric arise from distributions of motions in two control sets of pairs of protein structures in the same biochemical state. Statistical analysis of motions in allosteric proteins quantifies the magnitude of allosteric effects and reveals simple structural principles about allostery. For example, allosteric proteins exhibit substantial conformational changes comprising about 20% of the residues. In addition, motions in allosteric proteins show strong bias toward weakly constrained regions such as loops and the protein surface. Correlation functions show that motions communicate through protein structures over distances averaging 10-20 residues in sequence space and 10-20 Å in Cartesian space. Comparison of motions in the allosteric set and a set of 21 nonallosteric ligand-binding proteins shows that nonallosteric proteins also exhibit bias of motion toward weakly constrained regions and local correlation of motion. However, allosteric proteins exhibit twice as much percent motion on average as nonallosteric proteins with ligand-induced motion. These observations may guide efforts to design flexibility and allostery into proteins.

6.
The crystal structures of a number of globular proteins are currently available. An analysis of the distribution of side-chains among different allowed conformations in these proteins has been carried out. The observed conformations of individual residues are discussed on the basis of well-known stereochemical criteria. The population distribution of side-chains in different allowed regions in conformational space can be explained largely on the basis of simple steric considerations. In addition to examining the conformational behaviour of individual residues, some population distributions of conformational angles of general interest involving groups of residues have also been analyzed.

7.
MOTIVATION: Disulfide bonds are primary covalent crosslinks between two cysteine residues in proteins that play critical roles in stabilizing the protein structures and are commonly found in extracytoplasmic or secreted proteins. In protein folding prediction, the localization of disulfide bonds can greatly reduce the search in conformational space. Therefore, there is a great need to develop computational methods capable of accurately predicting disulfide connectivity patterns in proteins that could have potentially important applications. RESULTS: We have developed a novel method to predict disulfide connectivity patterns from protein primary sequence, using a support vector regression (SVR) approach based on multiple sequence feature vectors and predicted secondary structure by the PSIPRED program. The results indicate that our method could achieve a prediction accuracy of 74.4% and 77.9%, respectively, when averaged on proteins with two to five disulfide bridges using 4-fold cross-validation, measured on the protein and cysteine pair on a well-defined non-homologous dataset. We assessed the effects of different sequence encoding schemes on the prediction performance of disulfide connectivity. It has been shown that the sequence encoding scheme based on multiple sequence feature vectors coupled with predicted secondary structure can significantly improve the prediction accuracy, thus enabling our method to outperform most of other currently available predictors. Our work provides a complementary approach to the current algorithms that should be useful in computationally assigning disulfide connectivity patterns and helps in the annotation of protein sequences generated by large-scale whole-genome projects. AVAILABILITY: The prediction web server and Supplementary Material are accessible at http://foo.maths.uq.edu.au/~huber/disulfide

8.
Prediction of natively unfolded regions in protein chains   Cited by: 1 (self-citations: 0, citations by others: 1)
Analysis showed that the globular or natively unfolded state of a protein can be inferred not only from a lower hydrophobicity or a higher charge, but also from the average environment density (average number of close residues located within a certain distance of a given one) of its residues. A database of 6626 protein structures was used to construct a statistical scale of the average number of close residues in globular structures for the 20 amino acids. The portion of false predictions in distinguishing between 80 globular and 90 natively unfolded proteins was 11% with the new scale and 17% with a hydrophobicity scale. The new scale proved suitable for predicting the folded or unfolded state for native proteins or the natively unfolded regions for protein chains. In comparisons with the available algorithms, the new method yielded the highest portion of true predictions (87 and 77% with averaging over residues and over proteins, respectively).
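
The "average environment density" in this abstract is defined as the average number of close residues within a certain distance of a given one. A minimal sketch of that count follows; the 8 Å cutoff, the use of one representative point per residue, and the function names are assumptions for illustration, since the abstract does not state the distance or the atoms used.

```python
import math


def environment_density(coords, cutoff=8.0):
    """For each residue, count the other residues lying within `cutoff`.

    coords: list of (x, y, z) tuples, one representative point per residue
            (for example Cα positions; this choice is an assumption).
    cutoff: distance threshold in Å (assumed value, not from the abstract).
    """
    counts = []
    for i, a in enumerate(coords):
        close = sum(
            1 for j, b in enumerate(coords)
            if j != i and math.dist(a, b) <= cutoff
        )
        counts.append(close)
    return counts


def mean_environment_density(coords, cutoff=8.0):
    """Average number of close residues over the whole chain."""
    counts = environment_density(coords, cutoff)
    return sum(counts) / len(counts)


if __name__ == "__main__":
    # Four points on a straight line, 3.8 Å apart (a toy extended chain).
    chain = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
    print(environment_density(chain))
    print(mean_environment_density(chain))
```

Averaging such counts per amino acid type over many globular structures would give a per-residue scale of the kind the abstract describes; an extended, unfolded chain yields low counts, as in the toy example.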

9.
Despite the increasing number of published protein structures, and the fact that each protein's function relies on its three-dimensional structure, there is limited access to automatic programs used for the identification of critical residues from the protein structure, compared with those based on protein sequence. Here we present a new algorithm based on network analysis applied exclusively on protein structures to identify critical residues. Our results show that this method identifies critical residues for protein function with high reliability and improves automatic sequence-based approaches and previous network-based approaches. The reliability of the method depends on the conformational diversity screened for the protein of interest. We have designed a web site to give access to this software at http://bis.ifc.unam.mx/jamming/. In summary, a new method is presented that relates critical residues for protein function with the most traversed residues in networks derived from protein structures. A unique feature of the method is the inclusion of the conformational diversity of proteins in the prediction, thus reproducing a basic feature of the structure/function relationship of proteins.

10.
We present SimShiftDB, a new program to extract conformational data from protein chemical shifts using structural alignments. The alignments are obtained in searches of a large database containing 13,000 structures and corresponding back-calculated chemical shifts. SimShiftDB makes use of chemical shift data to provide accurate results even in the case of low sequence similarity, and with even coverage of the conformational search space. We compare SimShiftDB to HHSearch, a state-of-the-art sequence-based search tool, and to TALOS, the current standard tool for the task. We show that for a significant fraction of the predicted similarities, SimShiftDB outperforms the other two methods. Particularly, the high coverage afforded by the larger database often allows predictions to be made for residues not involved in canonical secondary structure, where TALOS predictions are both less frequent and more error prone. Thus SimShiftDB can be seen as a complement to currently available methods.

11.
12.
De novo design of the hydrophobic core of ubiquitin.   Cited by: 9 (self-citations: 7, citations by others: 2)
We have previously reported the development and evaluation of a computational program to assist in the design of hydrophobic cores of proteins. In an effort to investigate the role of core packing in protein structure, we have used this program, referred to as Repacking of Cores (ROC), to design several variants of the protein ubiquitin. Nine ubiquitin variants containing from three to eight hydrophobic core mutations were constructed, purified, and characterized in terms of their stability and their ability to adopt a uniquely folded native-like conformation. In general, designed ubiquitin variants are more stable than control variants in which the hydrophobic core was chosen randomly. However, in contrast to previous results with 434 cro, all designs are destabilized relative to the wild-type (WT) protein. This raises the possibility that beta-sheet structures have more stringent packing requirements than alpha-helical proteins. A more striking observation is that all variants, including random controls, adopt fairly well-defined conformations, regardless of their stability. This result supports conclusions from the cro studies that non-core residues contribute significantly to the conformational uniqueness of these proteins while core packing largely affects protein stability and has less impact on the nature or uniqueness of the fold. Concurrent with the above work, we used stability data on the nine ubiquitin variants to evaluate and improve the predictive ability of our core packing algorithm. Additional versions of the program were generated that differ in potential function parameters and sampling of side chain conformers. Reasonable correlations between experimental and predicted stabilities suggest the program will be useful in future studies to design variants with stabilities closer to that of the native protein. 
Taken together, the present study provides further clarification of the role of specific packing interactions in protein structure and stability, and demonstrates the benefit of using systematic computational methods to predict core packing arrangements for the design of proteins.

13.
We have recently developed a computational technique that uses mutually orthogonal Latin square sampling to explore the conformational space of oligopeptides in an exhaustive manner. In this article, we report its use to analyze the conformational spaces of 120 protein loop sequences in proteins, culled from the PDB, having the length ranging from 5 to 10 residues. The force field used did not have any information regarding the sequences or structures that flanked the loop. The results of the analyses show that the native structure of the loop, as found in the PDB falls at one of the low energy points in the conformational landscape of the sequences. Thus, a large portion of the structural determinants of the loop may be considered intrinsic to the sequence, regardless of either adjacent sequences or structures, or the interactions that the atoms of the loop make with other residues in the protein or in neighboring proteins.
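
To show what mutually orthogonal Latin squares (MOLS) offer as a sampling design, the sketch below builds a set of MOLS of prime order n with the standard construction L_k(i, j) = (k·i + j) mod n and turns the n² cells into sample points. The function names are hypothetical, and this is only the combinatorial core; how the published method maps levels to torsion angle values and scores conformations is not reproduced here.

```python
def is_prime(n):
    """Trial-division primality check, adequate for small orders."""
    if n < 2:
        return False
    return all(n % d for d in range(2, int(n ** 0.5) + 1))


def latin_square(n, k):
    """Latin square of prime order n with entries (k*i + j) mod n, 1 <= k < n."""
    return [[(k * i + j) % n for j in range(n)] for i in range(n)]


def are_orthogonal(a, b):
    """True if superimposing a and b yields every ordered pair exactly once."""
    n = len(a)
    pairs = {(a[i][j], b[i][j]) for i in range(n) for j in range(n)}
    return len(pairs) == n * n


def mols_sample_points(n, n_vars):
    """Return n*n sample points for n_vars variables, each with n levels.

    Each point is a tuple of level indices, one per variable. Requires a
    prime n and n_vars <= n - 1 for this simple construction.
    """
    if not is_prime(n):
        raise ValueError("this construction needs a prime order n")
    if not 1 <= n_vars <= n - 1:
        raise ValueError("n_vars must be between 1 and n - 1")
    squares = [latin_square(n, k) for k in range(1, n_vars + 1)]
    return [tuple(sq[i][j] for sq in squares) for i in range(n) for j in range(n)]


if __name__ == "__main__":
    points = mols_sample_points(5, 4)
    print(len(points), "sample points instead of", 5 ** 4, "for a full grid")
```

The design choice is that every pair of variables sees every pair of levels exactly once across the n² points, so pairwise combinations are covered evenly without enumerating the full grid.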

14.
We show that long- and short-range interactions in almost all protein native structures are actually consistent with each other for coarse-grained energy scales; specifically we mean the long-range inter-residue contact energies and the short-range secondary structure energies based on peptide dihedral angles, which are potentials of mean force evaluated from residue distributions observed in protein native structures. This consistency is observed at equilibrium in sequence space rather than in conformational space. Statistical ensembles of sequences are generated by exchanging residues for each of 797 protein native structures with the Metropolis method. It is shown that adding the other category of interaction to either the short- or long-range interactions decreases the means and variances of those energies for essentially all protein native structures, indicating that both interactions consistently work by more-or-less restricting sequence spaces available to one of the interactions. In addition to this consistency, independence by these interaction classes is also indicated by the fact that there are almost no correlations between them when equilibrated using both interactions and significant but small, positive correlations at equilibrium using only one of the interactions. Evidence is provided that protein native sequences can be regarded approximately as samples from the statistical ensembles of sequences with these energy scales and that all proteins have the same effective conformational temperature. Designing protein structures and sequences to be consistent and minimally frustrated among the various interactions is a most effective way to increase protein stability and foldability.

15.
We have performed a statistical analysis of unstructured amino acid residues in protein structures available in the databank of protein structures. Data on the occurrence of disordered regions at the ends and in the middle part of protein chains have been obtained: in the regions near the ends (at distance less than 30 residues from the N- or C-terminus), there are 66% of unstructured residues (38% are near the N-terminus and 28% are near the C-terminus), although these terminal regions include only 23% of the amino acid residues. The frequencies of occurrence of unstructured residues have been calculated for each of 20 types in different positions in the protein chain. It has been shown that relative frequencies of occurrence of unstructured residues of 20 types at the termini of protein chains differ from the ones in the middle part of the protein chain; amino acid residues of the same type have different probabilities to be unstructured in the terminal regions and in the middle part of the protein chain. The obtained frequencies of occurrence of unstructured residues in the middle part of the protein chain have been used as a scale for predicting disordered regions from amino acid sequence using the method (FoldUnfold) previously developed by us. This scale of frequencies of occurrence of unstructured residues correlates with the contact scale (previously developed by us and used for the same purpose) at a level of 95%. Testing the new scale on a database of 427 unstructured proteins and 559 completely structured proteins has shown that this scale can be successfully used for the prediction of disordered regions in protein chains.

16.
Bostick DL  Shen M  Vaisman II 《Proteins》2004,56(3):487-501
A topological representation of proteins is developed that makes use of two metrics: the Euclidean metric for identifying natural nearest neighboring residues via the Delaunay tessellation in Cartesian space and the distance between residues in sequence space. Using this representation, we introduce a quantitative and computationally inexpensive method for the comparison of protein structural topology. The method ultimately results in a numerical score quantifying the distance between proteins in a heuristically defined topological space. The properties of this scoring scheme are investigated and correlated with the standard Cα distance root-mean-square deviation measure of protein similarity calculated by rigid body structural alignment. The topological comparison method is shown to have a characteristic dependence on protein conformational differences and secondary structure. This distinctive behavior is also observed in the comparison of proteins within families of structural relatives. The ability of the comparison method to successfully classify proteins into classes, superfamilies, folds, and families that are consistent with standard classification methods, both automated and human-driven, is demonstrated. Furthermore, it is shown that the scoring method allows for a fine-grained classification on the family, protein, and species level that agrees very well with currently established phylogenetic hierarchies. This fine classification is achieved without requiring visual inspection of proteins, sequence analysis, or the use of structural superimposition methods. Implications of the method for a fast, automated, topological hierarchical classification of proteins are discussed.

17.
Similarity of protein structures has been analyzed using three-dimensional Delaunay triangulation patterns derived from the backbone representation. It has been found that structurally related proteins have a common spatial invariant part, a set of tetrahedrons, mathematically described as a common spatial subgraph volume of the three-dimensional contact graph derived from Delaunay tessellation (DT). Based on this property of protein structures, we present a novel common volume superimposition (TOPOFIT) method to produce structural alignments. Structural alignments are usually evaluated by the number of equivalent (aligned) positions (N(e)) with the corresponding root mean square deviation (RMSD). The superimposition of the DT patterns allows one to uniquely identify a maximal common number of equivalent residues in the structural alignment. In other words, TOPOFIT identifies a feature point on the RMSD/N(e) curve, a topomax point, until which the topologies of two structures correspond to each other, including backbone and interresidue contacts, whereas the growing number of mismatches between the DT patterns occurs at larger RMSD (N(e)) after the topomax point. It has been found that the topomax point is present in all alignments from different protein structural classes; therefore, the TOPOFIT method identifies common, invariant structural parts between proteins. The alignments produced by the TOPOFIT method have a good correlation with alignments produced by other current methods. This novel method opens new opportunities for the comparative analysis of protein structures and for more detailed studies on understanding the molecular principles of tertiary structure organization and functionality. The TOPOFIT method also helps to detect conformational changes, topological differences in variable parts, which are particularly important for studies of variations in active/binding sites and protein classification.
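
The two quantities this abstract uses to judge an alignment, the number of equivalent positions N(e) and the RMSD over them, can be sketched as follows. The function name and inputs are assumptions for illustration, and the coordinates are taken to be already superimposed; finding the optimal superposition, or the Delaunay-based matching that TOPOFIT itself performs, is outside this example.

```python
import math


def alignment_rmsd(coords_a, coords_b):
    """Return (N_e, RMSD) for two equal-length lists of aligned coordinates.

    coords_a, coords_b: lists of (x, y, z) tuples where position i in one
    list is the structural equivalent of position i in the other. The two
    structures are assumed to be superimposed already.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty lists of equal length")
    n_e = len(coords_a)
    squared = sum(math.dist(p, q) ** 2 for p, q in zip(coords_a, coords_b))
    return n_e, math.sqrt(squared / n_e)


if __name__ == "__main__":
    a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
    b = [(0.0, 0.0, 1.0), (3.8, 0.0, 1.0), (7.6, 0.0, 1.0)]
    n_e, rmsd = alignment_rmsd(a, b)
    print(f"N(e) = {n_e}, RMSD = {rmsd:.2f} Å")
```

Recomputing this pair while equivalent positions are added one at a time traces out an RMSD versus N(e) curve of the kind the abstract refers to.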

18.
Sims GE  Kim SH 《Nucleic acids research》2003,31(19):5607-5616
A global conformational space of 6253 dinucleoside monophosphate (DMP) units consisting of RNA and DNA (free and protein/drug-bound) was 'mapped' using high resolution crystal structures cataloged in the Nucleic Acid Database (NDB). The torsion angles of each DMP were clustered in a reduced three-dimensional space using a classical multi-dimensional scaling method. The mapping of the conformational space reveals nine primary clusters which distinguish among the common A-, B- and Z-forms and their various substates, plus five secondary clusters for kinked or bent structures. Conformational relationships and possible transitional pathways among the substates are also examined using the conformational states of DNA and RNA bound with proteins or drugs as potential pathway intermediates.

19.
We consider in this paper the statistical distribution of hydrophobic residues along the length of protein chains. For this purpose we used a binary hydrophobicity scale which assigns hydrophobic residues a value of one and non-hydrophobes a value of zero. The resulting binary sequences are tested for randomness using the standard run test. For the majority of the 5,247 proteins examined, the distribution of hydrophobic residues along a sequence cannot be distinguished from that expected for a random distribution. This suggests that (a) functional proteins may have originated from random sequences, (b) the folding of proteins into compact structures may be much more permissive with less sequence specificity than previously thought, and (c) the clusters of hydrophobic residues along chains which are revealed by hydrophobicity plots are a natural consequence of a random distribution and can be conveniently described by binomial statistics.
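
The procedure in this abstract (binary hydrophobicity encoding followed by a run test) can be sketched with the standard Wald-Wolfowitz runs test under its normal approximation. The set of residues treated as hydrophobic below is an assumption chosen for illustration; the abstract does not say which residues the authors' binary scale marks as hydrophobic.

```python
import math

# Assumed hydrophobic residues (one-letter codes); the paper's scale may differ.
HYDROPHOBIC = set("AVLIMFWC")


def to_binary(sequence):
    """Encode a protein sequence as 1 (hydrophobic) or 0 (other) per residue."""
    return [1 if aa in HYDROPHOBIC else 0 for aa in sequence.upper()]


def count_runs(bits):
    """Number of maximal blocks of identical consecutive values."""
    if not bits:
        return 0
    return 1 + sum(1 for a, b in zip(bits, bits[1:]) if a != b)


def runs_test_z(bits):
    """z-score of the observed run count under the random-order hypothesis.

    Uses the Wald-Wolfowitz mean and variance; |z| well above about 2
    suggests the ordering is not random at roughly the 5% level.
    """
    n1 = sum(bits)
    n2 = len(bits) - n1
    n = n1 + n2
    if n1 == 0 or n2 == 0:
        raise ValueError("both classes must be present")
    mean = 2.0 * n1 * n2 / n + 1.0
    var = 2.0 * n1 * n2 * (2.0 * n1 * n2 - n) / (n * n * (n - 1.0))
    if var == 0.0:
        raise ValueError("sequence too short for the normal approximation")
    return (count_runs(bits) - mean) / math.sqrt(var)


if __name__ == "__main__":
    bits = to_binary("MKVLAAGIVGSEEKRLLD")  # arbitrary toy sequence
    print(count_runs(bits), round(runs_test_z(bits), 2))
```

A strongly positive z means too many runs (residues alternate more than chance would give) and a strongly negative z means too few (hydrophobic residues cluster); values near zero are what the abstract reports for most proteins.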

20.