Similar literature
20 similar articles found (search time: 8 ms)
1.
The goal of controlling protein thermostability is tackled here through establishing, by in silico analyses, the relative weight of residue-residue interactions in proteins as a function of temperature. We have designed for that purpose a (melting-) temperature-dependent, statistical distance potential, where the interresidue distances are computed between the side-chain geometric centers or their functional centers. Deriving the potentials separately from proteins of either high or low thermal resistance reveals the interactions that contribute most to stability in different temperature ranges. Thermostabilizing interactions include salt bridges and cation-π interactions (especially those involving arginine), aromatic interactions, and H-bonds between negatively charged and some aromatic residues. In contrast, H-bonds between two polar noncharged residues or between a polar noncharged residue and a negatively charged residue are relatively less stabilizing at high temperatures. An important observation is that it is necessary to consider both repulsive and attractive interactions in overall thermostabilization, as the degree of repulsion may also vary with temperature. These temperature-dependent potentials are not only useful for the identification of meso- and thermostabilizing pair interactions, but also exhibit predictive power, as illustrated by their ability to predict the melting temperature of a protein based on the melting temperature of homologous proteins.
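The abstract above does not give the formula of its potential. As a minimal sketch, assuming the common inverse-Boltzmann form W = -kT ln[P(pair, d) / (P(pair) P(d))], a distance potential could be derived from observed residue-pair counts as follows. The function name, the residue-pair labels, and the toy counts are hypothetical; a temperature-dependent version would presumably repeat the derivation on protein sets grouped by melting temperature.

```python
import math
from collections import Counter

def distance_potential(observations, kT=1.0):
    """Inverse-Boltzmann potential from (residue_pair, distance_bin) observations.

    Assumed form (not taken from the abstract):
        W(pair, bin) = -kT * ln( P(pair, bin) / (P(pair) * P(bin)) )
    Negative values mean the pair is seen at that distance more often than chance.
    """
    joint = Counter(observations)
    pair_counts = Counter(pair for pair, _ in observations)
    bin_counts = Counter(b for _, b in observations)
    total = len(observations)
    potential = {}
    for (pair, b), n in joint.items():
        p_joint = n / total
        p_independent = (pair_counts[pair] / total) * (bin_counts[b] / total)
        potential[(pair, b)] = -kT * math.log(p_joint / p_independent)
    return potential

# Hypothetical toy data: bin 0 = short distance, bin 1 = long distance.
toy = ([("ARG-GLU", 0)] * 30 + [("ARG-GLU", 1)] * 10
       + [("LEU-LEU", 0)] * 10 + [("LEU-LEU", 1)] * 30)
w = distance_potential(toy)
```

In this toy set the ARG-GLU pair is over-represented at short distance, so `w[("ARG-GLU", 0)]` comes out negative (favorable), while `w[("ARG-GLU", 1)]` is positive.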

2.
Organic molecule crystals are becoming more and more important in applications like piezoelectricity, ferroelectricity and pigments. These properties depend on the molecule and on the crystal structure. For this reason much effort is being made to predict the crystal structure of organic molecules. We have developed a new algorithm that differs from other approaches (simulated annealing, Monte Carlo etc.) mainly in three features. First, we analyze just one molecule for proper symmetry operations building up the crystal; second, the program works in a discrete space; and finally the scoring function (energy function) is derived statistically from known crystal structures and tabulated. Our program computes a list of crystal structures weighted according to our scoring function. The new algorithm FlexCryst is currently implemented for the four space groups P1, P-1, P21, and P212121. The three latter space groups are widespread in nature. The algorithm computes structural models of acceptable quality and shows excellent time performance. During our validation we found the experimental structure among the structures proposed by the algorithm in 123 of 129 cases for P1, in 66 of 95 cases for P-1, in 73 of 100 cases for P21, and in 94 of 98 cases for P212121. The performance depends on the space group. In the case of P1 the run time per molecule is about two minutes and increases up to roughly one hour for the space group P21.

3.
Jiang L, Li M, Wen Z, Wang K, Diao Y. 《The protein journal》2006,25(4):241-249
A new method is proposed for the prediction of mitochondrial proteins by the discrete wavelet transform, based on a sequence–scale similarity measurement. This sequence–scale similarity, which reveals more information than other conventional methods, does not rely on subcellular location information and can directly predict protein sequences of different lengths. In our experiments, 499 mitochondrial protein sequences, constituting a mitochondria database, were used as the training dataset, and 681 non-mitochondrial protein sequences were tested. The system can predict these sequences with sensitivity, specificity, accuracy and MCC of 50.30%, 95.74%, 76.53% and 0.54, respectively. Source code of the new program is available on request from the authors.

4.
《Journal of molecular biology》1996,257(5):1112-1126
The stability changes in peptides and proteins caused by the substitution of a single amino acid, which can be measured experimentally by the change in folding free energy, are evaluated here using effective potentials derived from known protein structures. The analysis is focused on mutations of residues that are accessible to the solvent. These represent in total 106 mutations, introduced at different sites in barnase, bacteriophage T4 lysozyme and chymotrypsin inhibitor 2, and in a synthetic helical peptide. Assuming that the mutations do not modify the backbone structure, the changes in folding free energies are computed using various types of database-derived potentials and are compared with the measured ones. Distance-dependent residue-residue potentials are found to be inadequate for estimating the stability changes caused by these mutations, as they are dominated by hydrophobic interactions, which do not play an essential role at the protein surface. In contrast, the potentials based on backbone torsion angle propensities yield quite good results. Indeed, for a subset of 96 out of the 106 mutations, the computed and measured changes in folding free energy correlate with a linear correlation coefficient of 0.87. Moreover, the ten mutations that are excluded from the correlation either seem to cause modifications of the backbone structure or to involve strong hydrophobic interactions, which are atypical for solvent-accessible residues. We find furthermore that raising the ionic strength of the solvent used for measuring the changes in folding free energies improves the correlation, as it tends to mask the electrostatic interactions.
When 44 mutations performed in staphylococcal nuclease and chemotactic protein are added to these 106 mutations, the correlation between measured and computed folding free energy changes remains quite good: the correlation coefficient is 0.86 for 135 out of the 150 mutations. These 44 mutations were first discarded because some of them were suspected to affect the backbone conformation or the denatured state. The success of the backbone torsion potentials in predicting stability changes indicates that the approximations made for deriving these potentials are adequate. It suggests moreover that the local interactions along the chain dominate at the protein surface.
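The linear correlation coefficient quoted above can be computed as a Pearson coefficient between computed and measured free energy changes. The following is a minimal sketch; the two lists are made-up placeholder values, not data from the study.

```python
import math

def pearson_r(xs, ys):
    """Pearson linear correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Placeholder values (hypothetical): computed vs. measured folding free energy changes.
computed = [0.5, 1.0, 1.5, 2.0]
measured = [1.0, 2.0, 3.0, 4.0]
r = pearson_r(computed, measured)
```

Because the coefficient is invariant to scale and offset, potentials expressed in arbitrary energy units can still be compared with measured values in kcal/mol.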

5.
We develop ways to predict the side chain orientations of residues within a protein structure by using several different statistical machine learning methods. Here the side chain orientation of a given residue i is measured by an angle Ω_i between the vector pointing from the center of the protein structure to the Cα_i atom and the vector pointing from the Cα_i atom to the center of its side chain atoms. To predict the Ω_i angles, we construct statistical models by using several different methods such as general linear regression, a regression tree and bagging, a neural network, and a support vector machine. The root mean square errors for the different models range only from 36.67 to 37.60 degrees and the correlation coefficients are all between 30% and 34%. The performances of the different models in the test set are, thus, quite similar, and show the relative predictive power of these models to be significant in comparison with random side chain orientations.
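The orientation angle defined above is the angle between two vectors, so it can be computed directly from three points. The sketch below follows that definition; the function name and the coordinates in the usage lines are hypothetical.

```python
import math

def orientation_angle(protein_center, c_alpha, side_chain_center):
    """Angle in degrees between (protein center -> C-alpha) and (C-alpha -> side-chain center)."""
    v1 = [a - c for a, c in zip(c_alpha, protein_center)]
    v2 = [s - a for s, a in zip(side_chain_center, c_alpha)]
    dot = sum(x * y for x, y in zip(v1, v2))
    norm1 = math.sqrt(sum(x * x for x in v1))
    norm2 = math.sqrt(sum(x * x for x in v2))
    # Clamp to [-1, 1] to guard against rounding just outside acos's domain.
    cosine = max(-1.0, min(1.0, dot / (norm1 * norm2)))
    return math.degrees(math.acos(cosine))

# Hypothetical coordinates: a side chain pointing straight out, sideways, and straight in.
outward = orientation_angle((0, 0, 0), (1, 0, 0), (2, 0, 0))
sideways = orientation_angle((0, 0, 0), (1, 0, 0), (1, 1, 0))
inward = orientation_angle((0, 0, 0), (1, 0, 0), (0, 0, 0))
```

Under this definition an angle near 0 degrees means the side chain points away from the protein center, and an angle near 180 degrees means it points toward the center.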

6.
Distance-dependent statistical potentials are an important class of energy functions extensively used in modeling protein structures and energetics. These potentials are obtained by statistically analyzing the proximity of atoms in all combinatorial amino-acid pairs in proteins with known structures. In model evaluation, the value of a reference state is usually subtracted from the statistical potential for better selectivity. An ideal reference state should include the general chemical properties of polypeptide chains so that only the unique factors stabilizing the native structures are retained after calibration against the reference state. However, reference states available as of this writing rarely model specific chemical constraints of peptide bonds and therefore poorly reflect the behavior of polypeptide chains. In this work, we propose a statistical potential based on unfolded state ensemble (SPOUSE), where the reference state is summarized from the unfolded state ensembles of proteins produced according to the statistical coil model. Due to its better representation of the features of polypeptides, SPOUSE outperforms three of the most widely used distance-dependent potentials not only in native conformation identification, but also in the selection of close-to-native models and in the correlation coefficients between energy and model error. Furthermore, SPOUSE shows promise for further improvement through integration with orientation-dependent side-chain potentials.

7.
Repeat proteins have become increasingly important due to their capability to bind to almost any protein and their potential as an alternative therapy to monoclonal antibodies. In the past decade repeat proteins have been designed to mediate specific protein-protein interactions. The tetratricopeptide and ankyrin repeat proteins are two classes of helical repeat proteins that form different binding pockets to accommodate various partners. It is important to understand the factors that define folding and stability of repeat proteins in order to prioritize the most stable designed repeat proteins to further explore their potential binding affinities. Here we developed distance-dependent statistical potentials using two classes of alpha-helical repeat proteins, tetratricopeptide and ankyrin repeat proteins respectively, and evaluated their efficiency in predicting the stability of repeat proteins. We demonstrated that the repeat-specific statistical potentials based on these two classes of repeat proteins showed superior accuracy compared with non-specific statistical potentials in 1) discriminating correct from incorrect models and 2) ranking the stability of designed repeat proteins. In particular, the statistical scores correlate closely with the equilibrium unfolding free energies of repeat proteins and therefore would serve as a novel tool for quickly prioritizing the designed repeat proteins with high stability. The StaRProtein web server was developed for predicting the stability of repeat proteins.

8.
Finding homologous and orthologous protein sequences is often the first step in evolutionary studies, annotation projects, and experiments of functional complementation. Despite all currently available computational tools, there is a need for easy-to-use tools that provide functional information. Here, a new web application called orthoFind is presented, which allows a quick search for homologous and orthologous proteins given one or more query sequences, allows a recurrent and exhaustive search against reference proteomes, and is able to include user databases. It addresses the protein multidomain problem, searching for homologs with the same domain architecture, and gives a simple functional analysis of the results to help in the annotation process. orthoFind is easy to use and has been shown to provide accurate results with different datasets. Availability: http://www.bioinfocabd.upo.es/orthofind/.

9.
Practical application of genomic-based risk stratification to clinical diagnosis is appealing, yet performance varies widely depending on the disease and genomic risk score (GRS) method. Celiac disease (CD), a common immune-mediated illness, is strongly genetically determined and requires specific HLA haplotypes. HLA testing can exclude diagnosis but has low specificity, providing little information suitable for clinical risk stratification. Using six European cohorts, we provide a proof-of-concept that statistical learning approaches which simultaneously model all SNPs can generate robust and highly accurate predictive models of CD based on genome-wide SNP profiles. The high predictive capacity replicated both in cross-validation within each cohort (AUC of 0.87–0.89) and in independent replication across cohorts (AUC of 0.86–0.9), despite differences in ethnicity. The models explained 30–35% of disease variance and up to ∼43% of heritability. The GRS's utility was assessed in different clinically relevant settings. Comparable to HLA typing, the GRS can be used to identify individuals without CD with ≥99.6% negative predictive value; however, unlike HLA typing, fine-scale stratification of individuals into categories of higher risk for CD can identify those that would benefit from more invasive and costly definitive testing. The GRS is flexible and its performance can be adapted to the clinical situation by adjusting the threshold cut-off. Despite explaining a minority of disease heritability, our findings indicate that a genomic risk score provides clinically relevant information to improve upon current diagnostic pathways for CD and support further studies evaluating the clinical utility of this approach in CD and other complex diseases.
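Two quantities in the abstract above, the AUC and the negative predictive value at a chosen cut-off, can be illustrated with a few lines of code. This is a sketch using the standard definitions only; the scores are hypothetical and the functions are not the authors' pipeline.

```python
def auc(case_scores, control_scores):
    """AUC as the probability that a random case outscores a random control (ties count 0.5)."""
    wins = sum(1.0 if c > k else 0.5 if c == k else 0.0
               for c in case_scores for k in control_scores)
    return wins / (len(case_scores) * len(control_scores))

def negative_predictive_value(case_scores, control_scores, threshold):
    """Fraction of individuals scoring below the threshold who are true controls."""
    true_negatives = sum(1 for k in control_scores if k < threshold)
    false_negatives = sum(1 for c in case_scores if c < threshold)
    return true_negatives / (true_negatives + false_negatives)

# Hypothetical risk scores for three cases and four controls.
cases = [0.9, 0.8, 0.4]
controls = [0.7, 0.3, 0.2, 0.1]
area = auc(cases, controls)
npv_strict = negative_predictive_value(cases, controls, threshold=0.35)
npv_loose = negative_predictive_value(cases, controls, threshold=0.5)
```

Lowering the cut-off in this toy example raises the negative predictive value from 0.75 to 1.0 at the cost of flagging more controls, which is the kind of trade-off the adjustable threshold is meant to expose.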

10.
Disordered regions of proteins are highly abundant in various biological processes, including regulation and signaling, and are also related to cancer, cardiovascular and autoimmune diseases, and neurodegenerative disorders. Hence, recognizing disordered regions in proteins is a critical task. In this paper, we present a new feature encoding technique built from physicochemical properties of residues selected according to the chaotic structure of the related protein sequence. Our feature vector has been tested with various classification algorithms on an up-to-date data set and also compared to other methods. The proposed method shows better classification performance than many methods in terms of accuracy, sensitivity and specificity. Our results suggest that the new method, which links the residues and their physicochemical properties using Lyapunov exponents, is highly effective in the recognition of disordered regions.

11.
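Outer membrane proteins (OMPs) are a class of proteins with important biological functions. Predicting OMPs with bioinformatics methods can help in predicting their secondary and tertiary structures and in discovering new OMPs in genomes. This paper proposes computing three types of features from a protein sequence: amino acid composition, dipeptide composition, and weighted multi-order correlation coefficients of amino acid residue indices. The three feature types are combined, and a support vector machine (SVM) is used to identify OMPs. Recognition results were computed for a variety of combined features covering four residue indices, and the influence of the order and weight of the correlation coefficients on prediction performance is discussed. Ten-fold cross-validation and independent tests on the datasets show that the combined-feature method identifies OMPs and non-OMPs with accuracies of up to 96.96% and 97.33%, respectively, outperforming a number of existing methods. Results of identifying OMPs in five bacterial genomes show that the combined-feature method has high specificity, and its accuracy in identifying OMPs of known structure in the PDB database exceeds 99%. This indicates that the method can serve as an effective tool for screening OMPs within genomes.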
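Two of the three feature types named above, amino acid composition and dipeptide composition, have standard definitions and can be sketched briefly. The third feature (weighted multi-order residue-index correlation coefficients) is not defined in the abstract, so it is left out here. Function names and the toy sequence are hypothetical.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes

def amino_acid_composition(sequence):
    """20-dimensional vector: fraction of each residue type in the sequence."""
    return [sequence.count(a) / len(sequence) for a in AMINO_ACIDS]

def dipeptide_composition(sequence):
    """400-dimensional vector: fraction of each ordered residue pair among adjacent pairs."""
    pairs = [sequence[i:i + 2] for i in range(len(sequence) - 1)]
    return [pairs.count(a + b) / len(pairs) for a, b in product(AMINO_ACIDS, repeat=2)]

# Toy sequence (hypothetical); a real input would be a full protein sequence.
seq = "ACAC"
features = amino_acid_composition(seq) + dipeptide_composition(seq)
```

The concatenated 420-dimensional vector is the kind of fixed-length input an SVM needs, regardless of how long the original sequence is.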

12.
13.

14.
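Prediction of membrane protein folding types based on a fuzzy support vector machine   Total citations: 1 (self-citations: 0; citations by others: 1)
Existing methods that predict membrane protein folding types with a support vector machine (SVM) do not make full use of protein sequence features, and they leave unclassifiable regions when handling multi-class protein classification problems. To address these two problems, the amino acid composition and dipeptide composition of the protein sequence are extracted, weighted multi-order correlation coefficients of amino acid residue indices are computed, and the three feature types are fused into the input feature vector of the classifier. A fuzzy SVM (FSVM) algorithm is used to classify data that a traditional SVM cannot separate. Tests on a non-redundant dataset show that, with the same classification algorithm, the improved feature extraction method outperforms existing feature extraction methods, and that, with the same feature extraction method, FSVM outperforms the traditional SVM. The classification strategy combining the two reaches a prediction accuracy of 96.6% on an independent test dataset, which is better than a number of existing prediction methods, and it can serve as an effective tool for predicting the folding types of membrane proteins and other proteins.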

15.
Antifreeze proteins (AFPs) are a group of proteins that protect organisms from deep freezing temperatures and are expressed in vertebrates, invertebrates, plants, bacteria, and fungi. Nuclear magnetic resonance, X-ray structure, and many spectroscopic studies of AFPs have been instrumental in determining the structure–function relationship. Mutational studies have indicated the importance of hydrophobic residues in ice binding. Various studies have pointed out that the mechanism of AFP action is through its adsorption on the ice surface, which leads to a curved surface, preventing further growth of ice by the “Kelvin effect.” AFPs have potential industrial, medical, and agricultural applications in different fields, such as food technology, preservation of cell lines and organs, cryosurgery, and cold-hardy transgenic plants and animals. However, the applications of AFPs are marred by high cost due to low yield. This review deals with the sources and properties of AFPs from the angle of their application and potential. The possibility of production using different molecular biological techniques, which will help increase the yield, is also dealt with.

16.
17.
A statistical thermodynamics approach is proposed to determine structurally and functionally important residues in native proteins that are involved in energy exchange with a ligand and other residues along an interaction pathway. The structure-function relationships, ligand binding and allosteric activities of ten structures of HLA Class I proteins of the immune system are studied by the Gaussian Network Model. Five of these models are associated with inflammatory rheumatic disease and the remaining five are properly functioning. In the Gaussian Network Model, the protein structures are modeled as an elastic network where the inter-residue interactions are harmonic. Important residues and the interaction pathways in the proteins are identified by focusing on the largest eigenvalue of the residue interaction matrix. Predicted important residues match those known from previous experimental and clinical work. Graph perturbation is used to determine the response of the important residues along the interaction pathway. Differences in response patterns of the two sets of proteins are identified and their relations to disease are discussed.
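As an illustration of the eigenvalue analysis described above, the sketch below builds a Gaussian Network Model connectivity (Kirchhoff) matrix from residue coordinates and finds its largest eigenvalue by power iteration. It assumes that the "residue interaction matrix" is the usual Kirchhoff matrix and that contacts are defined by a 7.0 Å cutoff; neither detail is stated in the abstract, and the coordinates are hypothetical.

```python
import math

def kirchhoff_matrix(coords, cutoff=7.0):
    """GNM connectivity matrix: -1 for residue pairs within the cutoff, degree on the diagonal."""
    n = len(coords)
    gamma = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                gamma[i][j] = gamma[j][i] = -1.0
                gamma[i][i] += 1.0
                gamma[j][j] += 1.0
    return gamma

def largest_eigenpair(matrix, iterations=500):
    """Power iteration for the largest eigenvalue and eigenvector of a symmetric PSD matrix."""
    n = len(matrix)
    # Non-uniform start: the uniform vector is the zero mode of a Kirchhoff matrix.
    v = [float(i + 1) for i in range(n)]
    for _ in range(iterations):
        w = [sum(row[k] * v[k] for k in range(n)) for row in matrix]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    mv = [sum(row[k] * v[k] for k in range(n)) for row in matrix]
    eigenvalue = sum(a * b for a, b in zip(v, mv))  # Rayleigh quotient of the unit vector
    return eigenvalue, v

# Hypothetical chain of four residues spaced 5 Å apart along one axis.
chain = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (10.0, 0.0, 0.0), (15.0, 0.0, 0.0)]
top_value, top_vector = largest_eigenpair(kirchhoff_matrix(chain))
```

Residues with the largest components in `top_vector` are the most tightly coupled ones in this toy chain (the two interior residues), which mirrors how high-frequency modes are typically used to flag candidate key residues.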

18.
19.
Reaction-equilibrium constants determine the metabolite concentrations necessary to drive flux through metabolic pathways. Group-contribution methods offer a way to estimate reaction-equilibrium constants at wide coverage across the metabolic network. Here, we present an updated group-contribution method with 1) additional curated thermodynamic data used in fitting and 2) capabilities to calculate equilibrium constants as a function of temperature. We first collected and curated aqueous thermodynamic data, including reaction-equilibrium constants, enthalpies of reaction, Gibbs free energies of formation, enthalpies of formation, entropy changes of formation of compounds, and proton- and metal-ion-binding constants. Next, we formulated the calculation of equilibrium constants as a function of temperature and calculated the standard entropy change of formation (ΔfS°) using a model based on molecular properties. The median absolute error in estimating ΔfS° was 0.013 kJ/K/mol. We also estimated magnesium binding constants for 618 compounds using a linear regression model validated against measured data. We demonstrate the improved performance of the current method (8.17 kJ/mol in median absolute residual) over the current state-of-the-art method (11.47 kJ/mol) in estimating the 185 new reactions added in this work. The efforts here fill in gaps for thermodynamic calculations under various conditions, specifically different temperatures and metal-ion concentrations. These, to our knowledge, new capabilities empower the study of thermodynamic driving forces underlying the metabolic function of organisms living under diverse conditions.
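The temperature dependence mentioned above can be illustrated with the textbook relations ΔG° = ΔH° - TΔS° and K = exp(-ΔG°/RT). The sketch assumes that ΔH° and ΔS° are constant over the temperature range, which is a simplification and not necessarily the formulation used in the paper; the reaction values in the usage lines are hypothetical.

```python
import math

R_KJ = 8.314462618e-3  # gas constant in kJ/(mol*K)

def equilibrium_constant(delta_h, delta_s, temperature):
    """K at a given temperature (K) from reaction enthalpy (kJ/mol) and entropy (kJ/K/mol).

    Assumes delta_h and delta_s do not themselves change with temperature.
    """
    delta_g = delta_h - temperature * delta_s
    return math.exp(-delta_g / (R_KJ * temperature))

# Hypothetical exothermic reaction with no entropy change.
k_25c = equilibrium_constant(delta_h=-10.0, delta_s=0.0, temperature=298.15)
k_37c = equilibrium_constant(delta_h=-10.0, delta_s=0.0, temperature=310.15)
```

For an exothermic reaction the equilibrium constant falls as the temperature rises, so `k_37c` is smaller than `k_25c` here.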

20.
Repeat proteins have recently attracted much attention as alternative scaffolds to immunoglobulin antibodies due to their unique structural and biophysical features. In particular, repeat proteins show high stability against temperature and chaotropic agents. Despite many studies, the structural features underlying the stability of repeat proteins remain poorly understood. Here we present results from in silico analyses of the factors that affect the stability of repeat proteins. A previously developed repebody structure based on variable lymphocyte receptors (VLRs), which consists of leucine-rich repeat (LRR) modules, was used as the initial structure for the present study. We constructed six additional repebody structures with varying numbers of repeat modules, and those structures were used for molecular dynamics simulations. For these structures, the intramolecular interactions, including backbone H-bonds, van der Waals energy, and hydrophobicity, were investigated, and the radius of gyration, solvent-accessible surface area, ratio of secondary structure, and hydration free energy were also calculated to find the relationship between the number of LRR modules and the stability of the protein. Our results show that the intramolecular interactions lead to a more compact structure and a smaller surface area of the repebodies, which are critical for the stability of repeat proteins. The other features were also compatible with the experimental results. Based on our observations, repebody-5 was proposed as the best structure among all the repebodies in the structure optimization process. The present study demonstrated that our computer-based molecular modeling approach can contribute significantly to the experiment-based protein engineering challenge.
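Of the compactness measures listed above, the radius of gyration is the simplest to compute from coordinates. The sketch below uses the unweighted form (all atoms treated as equal mass), which is an assumption; simulation packages usually apply mass weighting. The coordinates are hypothetical.

```python
import math

def radius_of_gyration(coords):
    """Unweighted radius of gyration: RMS distance of the points from their centroid."""
    n = len(coords)
    centroid = [sum(point[axis] for point in coords) / n for axis in range(3)]
    return math.sqrt(sum(math.dist(point, centroid) ** 2 for point in coords) / n)

# Hypothetical coordinates: the eight corners of a cube with edge length 2, centered at the origin.
cube = [(x, y, z) for x in (-1.0, 1.0) for y in (-1.0, 1.0) for z in (-1.0, 1.0)]
rg = radius_of_gyration(cube)
```

A smaller value for a structure with the same number of atoms indicates tighter packing, which is how the measure relates to the compactness argument in the abstract.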
