Similar articles
Found 20 similar articles; search time 15 ms
1.
We investigate the relationship between the flexibility, expressed with B‐factor, and the relative solvent accessibility (RSA) in the context of local, with respect to the sequence, neighborhood and related concepts such as residue depth. We observe that the flexibility of a given residue is strongly influenced by the solvent accessibility of the adjacent neighbors. The mean normalized B‐factor of the exposed residues with two buried neighbors is smaller than that of the buried residues with two exposed neighbors. Inclusion of RSA of the neighboring residues (local RSA) significantly increases correlation with the B‐factor. Correlation between the local RSA and B‐factor is shown to be stronger than the correlation that considers local distance‐ or volume‐based residue depth. We also found that the correlation coefficients between B‐factor and RSA for the 20 amino acids, called flexibility‐exposure correlation index, are strongly correlated with the stability scale that characterizes the average contributions of each amino acid to the folding stability. Our results reveal that the predicted RSA could be used to distinguish between the disordered and ordered residues and that the inclusion of local predicted RSA values helps provide a better contrast between these two types of residues. Prediction models developed based on local actual RSA and local predicted RSA show similar or better results in the context of B‐factor and disorder predictions when compared with several existing approaches. We validate our models using three case studies, which show that this work provides useful clues for deciphering the structure–flexibility–function relation. Proteins 2009. © 2009 Wiley‐Liss, Inc.
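The local RSA idea in this abstract can be illustrated with a short sketch. This is not the authors' implementation: it assumes that local RSA is an unweighted mean over a sequence window of one residue on each side (truncated at the chain termini), and the names `local_rsa` and `pearson` are hypothetical.

```python
import numpy as np


def local_rsa(rsa, half_window=1):
    """Average RSA over a sequence window centred on each residue.

    Assumption: an unweighted mean; windows are truncated at the termini.
    """
    rsa = np.asarray(rsa, dtype=float)
    n = len(rsa)
    out = np.empty(n)
    for i in range(n):
        lo = max(0, i - half_window)
        hi = min(n, i + half_window + 1)
        out[i] = rsa[lo:hi].mean()
    return out


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    return float(np.corrcoef(x, y)[0, 1])
```

With per-residue RSA and normalized B-factors for a chain in hand, comparing `pearson(rsa, bfactor)` with `pearson(local_rsa(rsa), bfactor)` would show whether the neighborhood information helps for that chain.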

2.
Protein flexibility and intrinsic disorder   Total citations: 6, self-citations: 0, citations by others: 6
Comparisons were made among four categories of protein flexibility: (1) low-B-factor ordered regions, (2) high-B-factor ordered regions, (3) short disordered regions, and (4) long disordered regions. Amino acid compositions of the four categories were found to be significantly different from each other, with high-B-factor ordered and short disordered regions being the most similar pair. The high-B-factor (flexible) ordered regions are characterized by a higher average flexibility index, higher average hydrophilicity, higher average absolute net charge, and higher total charge than disordered regions. The low-B-factor regions are significantly enriched in hydrophobic residues and depleted in the total number of charged residues compared to the other three categories. We examined the predictability of the high-B-factor regions and developed a predictor that discriminates between regions of low and high B-factors. This predictor achieved an accuracy of 70% and a correlation of 0.43 with experimental data, outperforming the 64% accuracy and 0.32 correlation of predictors based solely on flexibility indices. To further clarify the differences between short disordered regions and ordered regions, a predictor of short disordered regions was developed. Its relatively high accuracy of 81% indicates considerable differences between ordered and disordered regions. The distinctive amino acid biases of high-B-factor ordered regions, short disordered regions, and long disordered regions indicate that the sequence determinants for these flexibility categories differ from one another, whereas the significantly-greater-than-chance predictability of these categories from sequence suggests that flexible ordered regions, short disorder, and long disorder are, to a significant degree, encoded at the primary structure level.

3.
Adamczak R, Porollo A, Meller J. Proteins 2005, 59(3):467-475
Owing to the use of evolutionary information and advanced machine learning protocols, secondary structures of amino acid residues in proteins can be predicted from the primary sequence with more than 75% per-residue accuracy for the 3-state (i.e., helix, beta-strand, and coil) classification problem. In this work we investigate whether further progress may be achieved by incorporating the relative solvent accessibility (RSA) of an amino acid residue as a fingerprint of the overall topology of the protein. Toward that goal, we developed a novel method for secondary structure prediction that uses predicted RSA in addition to attributes derived from evolutionary profiles. Our general approach follows the 2-stage protocol of Rost and Sander, with a number of Elman-type recurrent neural networks (NNs) combined into a consensus predictor. The RSA is predicted using our recently developed regression-based method that provides real-valued RSA, with the overall correlation coefficients between the actual and predicted RSA of about 0.66 in rigorous tests on independent control sets. Using the predicted RSA, we were able to improve the performance of our secondary structure prediction by up to 1.4% and achieved the overall per-residue accuracy between 77.0% and 78.4% for the 3-state classification problem on different control sets comprising, together, 603 proteins without homology to proteins included in the training. The effects of including solvent accessibility depend on the quality of RSA prediction. In the limit of perfect prediction (i.e., when using the actual RSA values derived from known protein structures), the accuracy of secondary structure prediction increases by up to 4%. We also observed that projecting real-valued RSA into 2 discrete classes with the commonly used threshold of 25% RSA decreases the classification accuracy for secondary structure prediction. 
While the level of improvement of secondary structure prediction may be different for prediction protocols that implicitly account for RSA in other ways, we conclude that an increase in the 3-state classification accuracy may be achieved when combining RSA with a state-of-the-art protocol utilizing evolutionary profiles. The new method is available through a Web server at http://sable.cchmc.org.

4.
As an alternative to X-ray crystallography, nuclear magnetic resonance (NMR) has also emerged as the method of choice for studying both protein structure and dynamics in solution. However, little work using computational models such as the Gaussian network model (GNM) and machine learning approaches has focused on NMR-derived proteins to predict the residue flexibility, which is represented by the root mean square deviation (RMSD) with respect to the average structure. We provide a large-scale comparison of computational models, including GNM, parameter-free GNM and several linear regression models using local solvent exposures as inputs, based on a dataset of 1609 protein chains whose structures were resolved by NMR. The result again confirmed that the correlation of GNM outputs with raw RMSD values was better than that using B-factors of X-ray data. Nevertheless, it was also concluded that the parameter-free GNM and the solvent exposure based linear regression models performed worse than GNM when predicting RMSD, contrary to results using X-ray data. The discrepancy of residue flexibility prediction between NMR and X-ray data is likely attributable to a combination of their physical and methodological differences.
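For readers unfamiliar with the GNM mentioned in this abstract, the standard construction can be sketched as follows. This is a generic illustration rather than the code used in the study: the 7.0 Å cutoff on C-alpha coordinates is a common choice assumed here, the function name `gnm_fluctuations` is hypothetical, and the output is in arbitrary units (no spring constant or temperature scaling is applied).

```python
import numpy as np


def gnm_fluctuations(coords, cutoff=7.0):
    """Relative mean-square fluctuations from a Gaussian network model.

    coords: (N, 3) array of one node per residue (e.g. C-alpha atoms).
    cutoff: contact distance in angstroms (assumed value, not from the text).
    """
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    # Kirchhoff (connectivity) matrix: -1 for contacts, degree on the diagonal.
    contact = dist <= cutoff
    np.fill_diagonal(contact, False)
    kirchhoff = -contact.astype(float)
    np.fill_diagonal(kirchhoff, contact.sum(axis=1))

    # Pseudo-inverse diagonal, built from the non-zero eigenmodes only.
    vals, vecs = np.linalg.eigh(kirchhoff)
    keep = vals > 1e-8
    return ((vecs[:, keep] ** 2) / vals[keep]).sum(axis=1)
```

The returned values would then be correlated with per-residue RMSD (NMR) or B-factors (X-ray) after whatever normalization the comparison calls for.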

5.
Comparison of the dynamics of myoglobin in different crystal forms.   Total citations: 3, self-citations: 0, citations by others: 3
Crystals have been grown of "sperm whale" myoglobin produced in Escherichia coli from a synthetic gene and the structure has been solved to 1.9 Å resolution. Because of a remaining initiator methionine, this protein crystallizes in a different space group from native sperm whale myoglobin. The three-dimensional structure of the synthetic protein is essentially identical to the native sperm whale protein. However, the crystallographic B-factors for parts of the molecule are quite different in the two crystal forms, and provide a measure of the effect of different packing constraints on the flexibility of the protein. The effect of the packing forces is to reduce the mobility of the protein in the regions of contact and thereby introduce differences in mobilities between the two crystal forms. Discrepancies between mobilities calculated from molecular dynamics simulations and crystallography can be reduced by considering the data from both crystal forms.

6.
Protein molecules exhibit varying degrees of flexibility throughout their three-dimensional structures, with some segments showing little mobility while others may be so disordered as to be unresolvable by techniques such as X-ray crystallography. Atomic displacement parameters, or B-factors, from X-ray crystallographic studies give an experimentally determined indication of the degree of mobility in a protein structure. To provide better estimators of amino acid flexibility, we have examined B-factors from a large set of high-resolution crystal structures. Because of the differences among structures, it is necessary to normalize the B-factors. However, many proteins have segments of unusually high mobility, which must be accounted for before normalization can be performed. Accordingly, a median-based method from quality control studies was used to identify outliers. After removal of outliers from, and normalization of, each protein chain, the B-factors were collected for each amino acid in the set. It was found that the distribution of normalized B-factors followed a Gumbel, or extreme value distribution, and the location parameter, or mode, of this distribution was used as an estimator of flexibility for the amino acid. These new parameters have a higher correlation with experimentally determined B-factors than parameters from earlier methods.
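The outlier-removal and normalization step described in this abstract might look roughly like the sketch below. The abstract says only that a median-based method from quality control studies was used, so the specific rule here (a modified z-score based on the median absolute deviation, with a cutoff of 3.5) is an assumption standing in for it, and `normalize_bfactors` is a hypothetical name.

```python
import numpy as np


def normalize_bfactors(bfactors, threshold=3.5):
    """Flag outliers with a median-based rule, then z-score the chain.

    Assumption: outliers are points whose modified z-score,
    0.6745 * (b - median) / MAD, exceeds `threshold` in absolute value.
    Returns (z, keep): z-scores with NaN at outliers, and the inlier mask.
    """
    b = np.asarray(bfactors, dtype=float)
    med = np.median(b)
    mad = np.median(np.abs(b - med))
    if mad == 0:
        keep = np.ones(len(b), dtype=bool)  # no spread: nothing to flag
    else:
        keep = np.abs(0.6745 * (b - med) / mad) <= threshold

    kept = b[keep]
    std = kept.std()
    if std == 0:
        raise ValueError("B-factors have zero variance after outlier removal")

    # Mean and standard deviation come from the inliers only.
    z = (b - kept.mean()) / std
    z[~keep] = np.nan
    return z, keep
```

Normalizing per chain in this way puts B-factors from different structures on a common scale before they are pooled by amino acid type.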

7.
Structure fluctuations and conformational changes accompany all biological processes involving macromolecules. The paper presents a classification of protein residues based on the normalized equilibrium fluctuations of the residue centers of mass in proteins and a statistical analysis of conformation changes in the side-chains upon binding. Normal mode analysis and an elastic network model were applied to a set of protein complexes to calculate the residue fluctuations and develop the residue classification. Comparison with a classification based on normalized B-factors suggests that the B-factors may underestimate protein flexibility in solvent. Our classification shows that protein loops and disordered fragments are enriched with highly fluctuating residues and depleted with weakly fluctuating residues. Strategies for engineering thermostable proteins are discussed. To calculate the dihedral angles distribution functions, the configuration space was divided into cells by a cubic grid. The effect of protein association on the distribution functions depends on the amino acid type and a grid step in the dihedral angles space. The changes in the dihedral angles increase from the near-backbone dihedral angle to the most distant one, for most residues. On average, one fifth of the interface residues change the rotamer state upon binding, whereas the rest of the interface residues undergo local readjustments within the same rotamer.

8.
9.
Adamczak R, Porollo A, Meller J. Proteins 2004, 56(4):753-767
Accurate prediction of relative solvent accessibilities (RSAs) of amino acid residues in proteins may be used to facilitate protein structure prediction and functional annotation. Toward that goal we developed a novel method for improved prediction of RSAs. Contrary to other machine learning-based methods from the literature, we do not impose a classification problem with arbitrary boundaries between the classes. Instead, we seek a continuous approximation of the real-value RSA using nonlinear regression, with several feed forward and recurrent neural networks, which are then combined into a consensus predictor. A set of 860 protein structures derived from the PFAM database was used for training, whereas validation of the results was carefully performed on several nonredundant control sets comprising a total of 603 structures derived from new Protein Data Bank structures and had no homology to proteins included in the training. Two classes of alternative predictors were developed for comparison with the regression-based approach: one based on the standard classification approach and the other based on a semicontinuous approximation with the so-called thermometer encoding. Furthermore, a weighted approximation, with errors being scaled by the observed levels of variability in RSA for equivalent residues in families of homologous structures, was applied in order to improve the results. The effects of including evolutionary profiles and the growth of sequence databases were assessed. In accord with the observed levels of variability in RSA for different ranges of RSA values, the regression accuracy is higher for buried than for exposed residues, with overall 15.3-15.8% mean absolute errors and correlation coefficients between the predicted and experimental values of 0.64-0.67 on different control sets. 
The new method outperforms classification-based algorithms when the real value predictions are projected onto two-class classification problems with several commonly used thresholds to separate exposed and buried residues. For example, classification accuracy of about 77% is consistently achieved on all control sets with a threshold of 25% RSA. A web server that enables RSA prediction using the new method and provides customizable graphical representation of the results is available at http://sable.cchmc.org.
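Projecting real-valued RSA onto a two-class problem, as described here, amounts to thresholding both the actual and the predicted values and counting agreements. The sketch below is illustrative only: treating values at or above the threshold as exposed is an assumption, and the function names are hypothetical.

```python
def to_two_class(rsa_percent, threshold=25.0):
    """Label each residue exposed ("E") or buried ("B") by an RSA threshold.

    Assumption: values equal to the threshold count as exposed.
    """
    return ["E" if value >= threshold else "B" for value in rsa_percent]


def two_class_accuracy(actual_rsa, predicted_rsa, threshold=25.0):
    """Fraction of residues whose actual and predicted classes agree."""
    actual = to_two_class(actual_rsa, threshold)
    predicted = to_two_class(predicted_rsa, threshold)
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    matches = sum(a == p for a, p in zip(actual, predicted))
    return matches / len(actual)
```

Residues whose actual RSA lies close to the threshold are the ones most easily misclassified, which is one reason a regression error and a two-class accuracy can tell different stories about the same predictor.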

10.
Da-Wei Li. Biophysical Journal 2009, 96(8):3074-3081
An all-atom local contact model is described that can be used to predict protein motions underlying isotropic crystallographic B-factors. It uses a mean-field approximation to represent the motion of an atom in a harmonic potential generated by the surrounding atoms resting at their equilibrium positions. Based on a 400-ns molecular dynamics simulation of ubiquitin in explicit water, it is found that each surrounding atom stiffens the spring constant by a term that on average scales exponentially with the interatomic distance. This model combines features of the local density model by Halle and the local contact model by Zhang and Brüschweiler. When applied to a nonredundant set of 98 ultra-high resolution protein structures, an average correlation coefficient of 0.75 is obtained for all atoms. The systematic inclusion of crystal contact contributions and fraying effects is found to enhance the performance substantially. Because the computational cost of the local contact model scales linearly with the number of protein atoms, it is applicable to proteins of any size for the prediction of B-factors of both backbone and side-chain atoms. The model performs as well as or better than several other models tested, such as rigid-body motional models, the local density model, and various forms of the elastic network model. It is concluded that at the currently achievable level of accuracy, collective intramolecular motions are not essential for the interpretation of B-factors.
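The core idea of this abstract, a per-atom spring constant built from exponentially decaying contributions of the surrounding atoms, can be sketched as below. The functional form a·exp(-r/r0) follows the abstract's description, but the parameter values `amplitude` and `decay_length` are placeholders rather than the fitted values from the paper, crystal contacts and fraying corrections are omitted, and the all-pairs distance matrix is used for brevity (a neighbor list would be needed for the linear scaling the abstract mentions). For an isotropic harmonic oscillator, B = 8π²·kT/k.

```python
import numpy as np

KT_KCAL_PER_MOL = 0.593  # thermal energy near 298 K


def local_contact_bfactors(coords, amplitude=1.0, decay_length=2.0,
                           kt=KT_KCAL_PER_MOL):
    """Isotropic B-factors from a mean-field local contact sketch.

    amplitude (kcal/mol/A^2) and decay_length (A) are placeholder
    parameters, not values taken from the paper. Needs at least two atoms.
    """
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    # Each neighbour stiffens the spring by an exponentially decaying term.
    terms = amplitude * np.exp(-dist / decay_length)
    np.fill_diagonal(terms, 0.0)  # an atom does not act on itself
    spring = terms.sum(axis=1)

    # Harmonic oscillator: <u_x^2> = kT / k, and B = 8 * pi^2 * <u_x^2>.
    return 8.0 * np.pi ** 2 * kt / spring
```

Atoms with many close neighbors get a stiff effective spring and a low predicted B-factor, while atoms at the surface or chain ends come out more mobile.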

11.
We compare various predicted mechanical and thermodynamic properties of nine oxidized thioredoxins (TRX) using a Distance Constraint Model (DCM). The DCM is based on a nonadditive free energy decomposition scheme, where entropic contributions are determined from rigidity and flexibility of structure based on distance constraints. We perform averages over an ensemble of constraint topologies to calculate several thermodynamic and mechanical response functions that together yield quantitative stability/flexibility relationships (QSFR). Applied to the TRX protein family, QSFR metrics display a rich variety of similarities and differences. In particular, backbone flexibility is well conserved across the family, whereas cooperativity correlation describing mechanical and thermodynamic couplings between the residue pairs exhibits distinctive features that readily stand out. The diversity in predicted QSFR metrics that describe cooperativity correlation between pairs of residues is largely explained by a global flexibility order parameter describing the amount of intrinsic flexibility within the protein. A free energy landscape is calculated as a function of the flexibility order parameter, and key values are determined where the native‐state, transition‐state, and unfolded‐state are located. Another key value identifies a mechanical transition where the global nature of the protein changes from flexible to rigid. The key values of the flexibility order parameter help characterize how mechanical and thermodynamic response is linked. Variation in QSFR metrics and key characteristics of global flexibility are related to the native state X‐ray crystal structure primarily through the hydrogen bond network. 
Furthermore, comparison of three TRX redox pairs reveals differences in thermodynamic response (i.e., relative melting point) and mechanical properties (i.e., backbone flexibility and cooperativity correlation) that are consistent with experimental data on thermal stabilities and NMR dynamical profiles. The results taken together demonstrate that small‐scale structural variations are amplified into discernible global differences by propagating mechanical couplings through the H‐bond network. Proteins 2009. © 2008 Wiley‐Liss, Inc.

12.
The B-factor from an X-ray crystal structure is a good measure of protein structural flexibility, which plays an important role in different biological processes, such as catalysis, binding and molecular recognition. Understanding the essence of flexibility can be helpful for the further study of protein function. In this study, we attempted to correlate the flexibility of a residue with its interactions with other residues by representing the protein structure as a residue contact network. Several well-established network topological parameters were employed to characterize such interactions. A prediction model was constructed for the B-factor of a residue by using support vector regression (SVR). The Pearson correlation coefficient (CC) was used as the performance measure. CC values were 0.63 and 0.62 for single amino acids and for the whole sequence, respectively. Our results revealed good correlations between B-factors and network topological parameters. This suggests that protein structural flexibility could be well characterized by the inter-amino acid interactions in a protein.

13.
Prediction-based fingerprints of protein-protein interactions   Total citations: 2, self-citations: 0, citations by others: 2
Porollo A, Meller J. Proteins 2007, 66(3):630-645
The recognition of protein interaction sites is an important intermediate step toward identification of functionally relevant residues and understanding protein function, facilitating experimental efforts in that regard. Toward that goal, the authors propose a novel representation for the recognition of protein-protein interaction sites that integrates enhanced relative solvent accessibility (RSA) predictions with high resolution structural data. An observation that RSA predictions are biased toward the level of surface exposure consistent with protein complexes led the authors to investigate the difference between the predicted and actual (i.e., observed in an unbound structure) RSA of an amino acid residue as a fingerprint of interaction sites. The authors demonstrate that RSA prediction-based fingerprints of protein interactions significantly improve the discrimination between interacting and noninteracting sites, compared with evolutionary conservation, physicochemical characteristics, structure-derived and other features considered before. On the basis of these observations, the authors developed a new method for the prediction of protein-protein interaction sites, using machine learning approaches to combine the most informative features into the final predictor. For training and validation, the authors used several large sets of protein complexes and derived from them nonredundant representative chains, with interaction sites mapped from multiple complexes. Alternative machine learning techniques are used, including Support Vector Machines and Neural Networks, so as to evaluate the relative effects of the choice of a representation and a specific learning algorithm. The effects of induced fit and uncertainty of the negative (noninteracting) class assignment are also evaluated. Several representative methods from the literature are reimplemented to enable direct comparison of the results. 
Using rigorous validation protocols, the authors estimated that the new method yields an overall classification accuracy of about 74% and a Matthews correlation coefficient of 0.42, as opposed to up to 70% classification accuracy and up to 0.3 Matthews correlation coefficient for methods that do not utilize RSA prediction-based fingerprints. The new method is available at http://sppider.cchmc.org.
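The Matthews correlation coefficient quoted alongside the accuracy can be computed directly from a confusion matrix. The helper below uses the standard definition; the function name `matthews_cc` and the convention of returning 0.0 when a row or column of the confusion matrix is empty are choices made for this sketch.

```python
import math


def matthews_cc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when the denominator vanishes (a convention assumed here).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom
```

Because interaction sites are typically a minority of surface residues, a predictor can reach a high accuracy by labeling most residues as noninteracting; the Matthews coefficient penalizes that behavior, which is why both numbers are usually reported together.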

14.
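Objective: The flexible motions of proteins are important for many reactions in living organisms, and predicting these motions from a protein's three-dimensional structure is an important problem in the study of protein structure-function relationships. Convolutional neural networks (CNNs) have already been applied successfully in structure-function studies. Methods: Drawing on the ideas of the PointNet method from computer vision research, this study proposes a protein...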

15.
Backbone dynamics of calcium-loaded calbindin D9k have been investigated by two-dimensional proton-detected heteronuclear nuclear magnetic resonance spectroscopy, using a uniformly 15N enriched protein sample. Spin-lattice relaxation rate constants, spin-spin relaxation rate constants, and steady-state [1H]-15N nuclear Overhauser effects were determined for 71 of the 72 backbone amide 15N nuclei. The relaxation parameters were analyzed using a model-free formalism that incorporates the overall rotational correlation time of the molecule, and a generalized order parameter (S2) and an effective internal correlation time for each amide group. Calbindin D9k contains two helix-loop-helix motifs joined by a linker loop at one end of the protein and a beta-type interaction between the two calcium-binding loops at the other end. The amplitude of motions for the calcium-binding loops and the helices are similar, as judged from the average S2 values of 0.83 +/- 0.05 and 0.85 +/- 0.04, respectively. The linker region joining the two calcium-binding subdomains of the molecule has a significantly higher flexibility, as indicated by a substantially lower average S2 value of 0.59 +/- 0.23. For residues in the linker loop and at the C-terminus, the order parameter is further decomposed into separate order parameters for motional processes on two distinct time scales. The effective correlation times are significantly longer for helices I and IV than for helices II and III or for the calcium-binding loops. Residue by residue comparisons reveal correlations of the order parameters with both the crystallographic B-factors and amide proton exchange rates, despite vast differences in the time scales to which these properties are sensitive. 
The order parameters are also utilized to distinguish regions of the NMR-derived three-dimensional structure of calbindin D9k that are poorly defined due to inherently high flexibility, from poorly defined regions with average flexibility but a low density of structural constraints.

16.
Simple coarse-grained models, such as the Gaussian network model, have been shown to capture some of the features of equilibrium protein dynamics. We extend this model by using atomic contacts to define residue interactions and introducing more than one interaction parameter between residues. We use B-factors from 98 ultra-high resolution (

17.
Protein folding rates vary by several orders of magnitude and they depend on the topology of the fold and the size and composition of the sequence. Although recent works show that the rates can be predicted from the sequence, allowing for high‐throughput annotations, they consider only the sequence and its predicted secondary structure. We propose a novel sequence‐based predictor, PFR‐AF, which utilizes solvent accessibility and residue flexibility predicted from the sequence, to improve predictions and provide insights into the folding process. The predictor includes three linear regressions for proteins with two‐state, multistate, and unknown (mixed‐state) folding kinetics. PFR‐AF on average outperforms current methods when tested on three datasets. The proposed approach provides high‐quality predictions in the absence of similarity between the predicted and the training sequences. The PFR‐AF's predictions are characterized by high (between 0.71 and 0.95, depending on the dataset) correlation and the lowest (between 0.75 and 0.9) mean absolute errors with respect to the experimental rates, as measured using out‐of‐sample tests. Our models reveal that for the two‐state chains inclusion of solvent‐exposed Ala may accelerate the folding, while increased content of Ile may reduce the folding speed. We also demonstrate that increased flexibility of coils facilitates faster folding and that proteins with larger content of solvent‐exposed strands may fold at a slower pace. The increased flexibility of the solvent‐exposed residues is shown to elongate folding, which also holds, with a lower correlation, for buried residues. Two case studies are included to support our findings. Proteins 2010. © 2010 Wiley‐Liss, Inc.

18.
Proteins are the active players in performing essential molecular activities throughout biology, and their dynamics has been broadly demonstrated to relate to their mechanisms. The intrinsic fluctuations have often been used to represent their dynamics and then compared to the experimental B-factors. However, proteins do not move in a vacuum and their motions are modulated by solvent that can impose forces on the structure. In this paper, we introduce a new structural concept, which has been called the structural compliance, for the evaluation of the global and local deformability of the protein structure in response to intramolecular and solvent forces. Based on the application of pairwise pulling forces to a protein elastic network, this structural quantity has been computed and sometimes is even found to yield an improved correlation with the experimental B-factors, meaning that it may serve as a better metric for protein flexibility. The inverse of structural compliance, namely the structural stiffness, has also been defined, which shows a clear anticorrelation with the experimental data. Although the present applications are made to proteins, this approach can also be applied to other biomolecular structures such as RNA. This present study considers only elastic network models, but the approach could be applied further to conventional atomic molecular dynamics. Compliance is found to have a slightly better agreement with the experimental B-factors, perhaps reflecting its bias toward the effects of local perturbations, in contrast to mean square fluctuations. The code for calculating protein compliance and stiffness is freely accessible at https://jerniganlab.github.io/Software/PACKMAN/Tutorials/compliance .

19.
20.
Pairwise contact energies for 20 types of residues are estimated self-consistently from the actual observed frequencies of contacts with regression coefficients that are obtained by comparing "input" and predicted values with the Bethe approximation for the equilibrium mixtures of residues interacting. This is premised on the fact that correlations between the "input" and the predicted values are sufficiently high although the regression coefficients themselves can depend to some extent on protein structures as well as interaction strengths. Residue coordination numbers are optimized to obtain the best correlation between "input" and predicted values for the partition energies. The contact energies self-consistently estimated this way indicate that the partition energies predicted with the Bethe approximation should be reduced by a factor of about 0.3 and the intrinsic pairwise energies by a factor of about 0.6. The observed distribution of contacts can be approximated with a small relative error of only about 0.08 as an equilibrium mixture of residues, if many proteins were employed to collect more than 20,000 contacts. Including repulsive packing interactions and secondary structure interactions further reduces the relative errors. These new contact energies are demonstrated by threading to have improved their ability to discriminate native structures from other non-native folds.
