Similar literature
20 similar articles found
1.
2.
We have demonstrated earlier that protein microenvironments were conserved around disulfide-bridged cystine motifs with similar functions, irrespective of diversity in protein sequences. Here, cysteine thiol modifications were characterized based on protein microenvironments, secondary structures and specific protein functions. The protein microenvironment around an amino acid was defined as the summation of hydrophobic contributions from the surrounding protein fragments and the solvent molecules present within its first contact shell. Cysteine functions (modifications) were grouped into enzymatic and non-enzymatic classes. The modifications studied were disulfide formation, thioether formation, metal binding, nitrosylation, acylation, selenylation, glutathionylation, sulfenylation, and ribosylation. 1079 enzymatic proteins were reported from high-resolution crystal structures. Protein microenvironments around the cysteine thiol, derived from these crystal structures, were clustered into 3 groups: buried-hydrophobic, intermediate and exposed-hydrophilic. Characterization of cysteine functions was statistically meaningful for the 4 modifications (disulfide formation, thioether formation, sulfenylation, and iron/zinc binding) that have sufficient data in the current dataset. Results showed that protein microenvironment, secondary structure and protein function were conserved for enzymatic cysteine functions, in contrast to the same functions of non-enzymatic cysteines. Disulfide-forming enzymatic cysteines were tightly packed within the intermediate protein microenvironment cluster, had alpha-helical conformation and mostly belonged to the CxxC motif of electron transport proteins. Disulfide-forming non-enzymatic cysteines did not belong to a conserved motif and had variable secondary structures. Similarly, enzymatic thioether-forming cysteines had a conserved microenvironment compared to non-enzymatic cysteines. Based on the compatibility between protein microenvironment and cysteine modifications, more efficient drug molecules could be designed against cysteine-related diseases.

3.
Understanding the coupling specificity between G protein-coupled receptors (GPCRs) and specific classes of G proteins is important for further elucidation of receptor functions within a cell. Increasing information on GPCR sequences and the G protein family would facilitate prediction of the coupling properties of GPCRs. In this study, we describe a novel approach for predicting the coupling specificity between GPCRs and G proteins. This method uses not only GPCR sequences but also the functional knowledge generated by natural language processing, and can achieve 92.2% prediction accuracy by using the C4.5 algorithm. Furthermore, rules related to GPCR-G protein coupling are generated. The combination of sequence analysis and text mining improves the prediction accuracy for GPCR-G protein coupling specificity, and also provides clues for understanding GPCR signaling.

4.
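Protein secondary structure prediction is an important foundation for studying protein tertiary structure, and the way amino acids are encoded has some influence on secondary structure prediction. This paper applies a new combined encoding that joins a group encoding with the position-specific scoring matrix (PSSM). The group encoding proposed here is a new way of encoding amino acids: it encodes each amino acid according to its internal composition and consists of 42 attributes. The Blosum62 evolutionary matrix from the position-specific scoring matrix (PSSM) is combined with the new group encoding to form the new encoding. Experiments were then run separately on two datasets, CB513 and 25pdb. A Bayesian classifier and an autoencoder were both used to test the new encoding, and the results of the two methods on the two datasets were compared. The autoencoder results were clearly 1.65% higher than those of the Bayesian classifier. In these experiments, the autoencoder, which can extract features, achieved better prediction accuracy.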

5.
Wang CC, Chen JH, Yin SH, Chuang WJ. Proteins 2006, 64(1):219-226
Different programs and methods were employed to superimpose protein structures, using members of four very different protein families as test subjects, and the results of these efforts were compared. Algorithms based on human identification of key amino acid residues on which to base the superpositions were nearly always more successful than programs that used automated techniques to identify key residues. Among those programs automatically identifying key residues, MASS could not superimpose all members of some families, but was very efficient with other families. MODELLER, MultiProt, and STAMP had varying levels of success. A genetic algorithm program written for this project did not improve superpositions when results from neighbor-joining and pseudostar algorithms were used as its starting cases, but it always improved superpositions obtained by MODELLER and STAMP. A program entitled PyMSS is presented that includes three superposition algorithms featuring human interaction.

6.
Identification of asparagine (Asn) sites that are prone to deamidation is critical for the development of therapeutic monoclonal antibodies (mAbs). Despite a common chemical degradation pathway, the rates of Asn deamidation can vary dramatically among different sites, and prediction of the sensitive deamidation sites is still challenging. In this study, characterization of Asn deamidation for five IgG1 and five IgG4 mAbs under both normal and stressed conditions revealed dramatic differences in the Asn deamidation rates. A comprehensive analysis of the deamidation sites indicated that the deamidation rate differences could be explained by differences in the local structure conformation, structure flexibility and solvent accessibility. A decision tree was developed to predict the deamidation propensity for all Asn sites in IgG mAbs based on the analysis of these three structural parameters. This decision tree will allow potential Asn deamidation hot spots to be identified early in development.

7.
Cysteine (Cys) is a critically important amino acid, serving a variety of functions within proteins including structural roles, catalysis, and regulation of function through post-translational modifications. Predicting which Cys residues are likely to be reactive is a highly sought-after capability. Few methods are currently available for the task, either based on evaluation of physicochemical features (e.g., pKa and exposure) or based on similarity with known instances. In this study, we developed an algorithm (named HAL-Cy) which blends previous work with novel implementations to distinguish reactive Cys from nonreactive Cys. HAL-Cy has two major components: (i) an energy-based part, rooted in the evaluation of H-bond network contributions, and (ii) a knowledge-based part, composed of different profiling approaches (including a newly developed weighting matrix for sequence profiling). In our evaluations, HAL-Cy provided significantly improved performance, as tested in comparisons with existing approaches. We implemented our algorithm in a web service (Cy-preds), the ultimate product of our work; we provided it with a variety of additional features, tools, and options: Cy-preds is capable of performing fully automated calculations for a thorough analysis of Cys reactivity in proteins, ranging from reactivity predictions (e.g., with HAL-Cy) to functional characterization. We believe it represents an original, effective, and very useful addition to the current array of tools available to scientists involved in redox biology, Cys biochemistry, and structural bioinformatics. Proteins 2016; 84:278–291. © 2015 Wiley Periodicals, Inc.

8.
For many years it has been accepted that the sequence of a protein can specify its three-dimensional structure. However, there has been limited progress in explaining how the sequence dictates its fold, and no attempt to do this computationally without the use of specific structural data has ever succeeded for any protein larger than 100 residues. We describe a method that can predict complex folds up to almost 200 residues using only basic principles that do not include any elements of sequence homology. The method does not simulate the folding chain but generates many thousands of models based on an idealized representation of structure. Each rough model is scored and the best are refined. On a set of five proteins, the correct folds scored well, and when tested on a set of larger proteins, the correct fold was ranked highest for some proteins of more than 150 residues, with others being close topological variants. All other methods that approach this level of success rely on the use of templates or fragments of known structures. Our method is unique in using a database of ideal models based on general packing rules that, in spirit, is closer to an ab initio approach.

9.
Isom DG, Marguet PR, Oas TG, Hellinga HW. Proteins 2011, 79(4):1034-1047
Protein thermodynamic stability is a fundamental physical characteristic that determines biological function. Furthermore, alteration of thermodynamic stability by macromolecular interactions or biochemical modifications is a powerful tool for assessing the relationship between protein structure, stability, and biological function. High-throughput approaches for quantifying protein stability are beginning to emerge that enable thermodynamic measurements on small amounts of material, in short periods of time, and using readily accessible instrumentation. Here we present such a method, fast quantitative cysteine reactivity (fQCR), which exploits the linkage between protein stability, sidechain protection by protein structure, and structural dynamics to characterize the thermodynamic and kinetic properties of proteins. In this approach, the reaction of a protected cysteine with a thiol-reactive fluorogenic indicator is monitored over a gradient of temperatures after a short incubation time. These labeling data can be used to determine the midpoint of thermal unfolding, measure the temperature dependence of protein stability, quantify ligand-binding affinity, and, under certain conditions, estimate folding rate constants. Here, we demonstrate the fQCR method by characterizing these thermodynamic and kinetic properties for variants of Staphylococcal nuclease and E. coli ribose-binding protein engineered to contain single, protected cysteines. These straightforward, information-rich experiments are likely to find applications in protein engineering and functional genomics.

10.
11.
Among protein residues, cysteines are among the most prominent candidates for ROS-mediated and RNS-mediated post-translational modifications, and hydrogen peroxide (H2O2) is the main ROS candidate for inducing cysteine oxidation. The reaction with H2O2 is not common to all cysteine residues, reactivity being a key prerequisite for sensitivity towards H2O2. Indeed, only deprotonated Cys (i.e. the thiolate form, -S⁻) can react with H2O2, leading to sulphenic acid formation (-SOH), which is considered a central player in ROS sensing pathways. However, cysteine sulphenic acids are generally unstable because they can be further oxidized to irreversible forms (sulphinic and sulphonic acids, -SO2H and -SO3H, respectively), or alternatively, they can proceed towards further modifications including disulphide bond formation (-SS-), S-glutathionylation (-SSG) and sulphenamide formation (-SN-). To understand why and how cysteine residues undergo primary oxidation to sulphenic acid, and to explore the stability of cysteine sulphenic acids, a combination of biochemical, structural and computational studies is required. Here, we discuss the current knowledge of the structural determinants of cysteine reactivity and sulphenic acid stability within protein microenvironments.

12.
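To make full use of the main factors that influence soil quality (soil type, land use, lithology, topography, roads and industry type) and to capture accurately the spatial distribution of regional soil quality, mutual information theory was used to screen 13 auxiliary variables (lithology type, land use, soil type, distance to towns, distance to roads, distance to industrial land, distance to rivers, relative elevation, slope, aspect, plan curvature, profile curvature and tangential curvature), and the See5.0 decision tree was then used to predict soil quality in the study area. The results showed that the main factors affecting soil quality in the study area were soil type, land use, lithology type, distance to towns, distance to water bodies, relative elevation, distance to roads and distance to industrial land. The decision tree model that used the factors selected by mutual information as predictors was clearly more accurate than the model that used all factors; in the former, both the decision tree and the decision rules achieved classification accuracy above 80%. By making full use of both continuous and categorical data, the combination of mutual information theory and decision trees not only reduced the number of input parameters of a general decision tree algorithm but also predicted and evaluated regional soil quality grades effectively.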
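The screening step in this abstract ranks auxiliary variables by their mutual information with the soil quality class. The abstract does not give the exact procedure, so the following is only a minimal sketch of the standard discrete mutual information estimate; the function name and the toy values are hypothetical, and continuous variables such as slope are assumed to have been binned beforehand.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Estimate I(X;Y) in bits from paired categorical observations.

    Assumption: both variables are discrete (continuous covariates
    would need binning first, which this sketch does not do).
    """
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("xs and ys must be non-empty and of equal length")
    n = len(xs)
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in count_xy.items():
        # p(x,y) / (p(x) p(y)) simplifies to c * n / (count_x * count_y)
        mi += (c / n) * math.log2(c * n / (count_x[x] * count_y[y]))
    return mi


# Hypothetical toy data: one covariate that tracks the class, one that does not.
quality = ["high", "high", "low", "low"]
land_use = ["forest", "forest", "urban", "urban"]
aspect = ["north", "south", "north", "south"]
informative = mutual_information(land_use, quality)
uninformative = mutual_information(aspect, quality)
```

Variables whose score falls below a chosen cutoff would be dropped before the decision tree is trained; the cutoff itself is a modelling choice that the abstract does not state.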

13.
Lee S, Lee BC, Kim D. Proteins 2006, 62(4):1107-1114
Knowing protein structure and inferring function from it are among the main issues of computational structural biology, and often the first step is studying protein secondary structure. There have been many attempts to predict protein secondary structure content. Previous attempts assumed that the secondary structure content of a protein can be predicted successfully from its amino acid composition. Recent methods achieved remarkable prediction accuracy by using expanded composition information. The overall average error of the most successful method is 3.4%. Here, we demonstrate that even with the simple amino acid composition alone, it is possible to improve the prediction accuracy significantly if evolutionary information is included. The idea is motivated by the observation that evolutionarily related proteins share similar structures. After calculating the homolog-averaged amino acid composition of a protein, which can be easily obtained from the multiple sequence alignment produced by running PSI-BLAST, those 20 numbers are learned by a multiple linear regression, an artificial neural network and a support vector regression. The overall average error of the support vector regression method is 3.3%. It is remarkable that we obtain comparable accuracy without utilizing expanded composition information such as pair-coupled amino acid composition. This work again demonstrates that the amino acid composition is a fundamental characteristic of a protein. It is anticipated that our novel idea can be applied to many areas of protein bioinformatics where amino acid composition information is utilized, such as subcellular localization prediction, enzyme subclass prediction, domain boundary prediction, signal sequence prediction, and prediction of unfolded segments in a protein sequence, to name a few.
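The homolog-averaged composition described above reduces a multiple sequence alignment to 20 numbers. The abstract does not say how the average is weighted, so this sketch assumes an unweighted mean of per-sequence compositions with gaps and non-standard symbols ignored; the function names and the toy alignment are hypothetical.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition(seq):
    """Return the 20 amino acid fractions of one sequence.

    Gap characters and non-standard symbols are skipped.
    """
    counts = Counter(ch for ch in seq.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard residues")
    return [counts[aa] / total for aa in AMINO_ACIDS]


def homolog_averaged_composition(aligned_seqs):
    """Average the per-sequence compositions over all aligned homologs.

    Assumption: every homolog gets equal weight.
    """
    if not aligned_seqs:
        raise ValueError("at least one sequence is required")
    comps = [composition(s) for s in aligned_seqs]
    n = len(comps)
    return [sum(c[i] for c in comps) / n for i in range(len(AMINO_ACIDS))]


# Hypothetical two-sequence alignment; '-' marks a gap.
features = homolog_averaged_composition(["AAC-", "ACCC"])
```

The resulting 20-element vector is the kind of input that a regression model could be trained on to predict helix, sheet and coil fractions.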

14.
15.
The conformational parameters $P_k$ for each amino acid species ($j$ = 1–20) of sequential peptides in proteins are presented as the product of $P_{i,k}$, where $i$ is the number of the sequential residue in the $k$th conformational state ($k$ = α-helix, β-sheet, β-turn, or unordered structure). Since the average parameter for an $n$-residue segment is related to the average probability of finding the segment in the $k$th state, it becomes a geometric mean, $(P_k)_{av} = \left(\prod_{i=1}^{n} P_{i,k}\right)^{1/n}$, with amino acid residue $i$ increasing from 1 to $n$. We then used $\ln (P_k)_{av}$ to convert a multiplicative process to a summation, i.e., $\ln (P_k)_{av} = (1/n)\sum_{i=1}^{n} \ln P_{i,k}$, for ease of operation. This is unlike the popular Chou-Fasman algorithm, which has the flaw of using the arithmetic mean for relative probabilities. The Chou-Fasman algorithm happens to be close to our calculations in many cases, mainly because the difference between their $P_k$ and our $\ln P_k$ is nearly constant for about one-half of the 20 amino acids. When stronger conformation formers and breakers exist, the difference becomes larger and the prediction at the N- and C-terminal ends of an α-helix or β-sheet could differ. If the average conformational parameters of the overlapping segments of any two states are too close for a unique solution, our calculations could lead to a different prediction.
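The contrast between the geometric mean used here and the arithmetic mean attributed to Chou-Fasman can be illustrated with a short sketch. The function names and the parameter values below are hypothetical, chosen only to show a segment containing one strong breaker; they are not the published conformational parameters.

```python
import math


def geometric_mean_parameter(params):
    """Geometric mean of per-residue parameters, computed in log space.

    Equivalent to exp((1/n) * sum(ln P_i)), so the product never has
    to be formed explicitly.
    """
    if not params or any(p <= 0 for p in params):
        raise ValueError("parameters must be a non-empty list of positive numbers")
    return math.exp(sum(math.log(p) for p in params) / len(params))


def arithmetic_mean_parameter(params):
    """Plain arithmetic mean of the per-residue parameters."""
    if not params:
        raise ValueError("parameters must be non-empty")
    return sum(params) / len(params)


# Hypothetical segment: three helix formers and one strong breaker.
segment = [1.5, 1.4, 1.3, 0.4]
geo = geometric_mean_parameter(segment)
ari = arithmetic_mean_parameter(segment)
```

Because the geometric mean never exceeds the arithmetic mean, a single strong breaker pulls the geometric average down more sharply, which is consistent with the abstract's remark that the two approaches diverge when strong formers and breakers are present.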

16.
Lee J, Kim SY, Joo K, Kim I, Lee J. Proteins 2004, 56(4):704-714
A novel method for ab initio prediction of protein tertiary structures, PROFESY (PROFile Enumerating SYstem), is proposed. This method utilizes the secondary structure prediction information of a query sequence and a fragment assembly procedure based on global optimization. Fifteen-residue-long fragment libraries are constructed using the secondary structure prediction method PREDICT, and fragments in these libraries are assembled to generate full-length chains of a query protein. Tertiary structures of 50 to 100 conformations are obtained by minimizing an energy function for proteins, using the conformational space annealing method that enables one to sample diverse low-lying local minima of the energy. We apply PROFESY in benchmark tests to proteins with known structures to demonstrate its feasibility. In addition, we participated in CASP5 and applied PROFESY to four new-fold targets for blind prediction. The results are quite promising, despite the fact that PROFESY was in its early stages of development. In particular, PROFESY successfully provided the best model-one structure for the target T0161.

17.
The Escherichia coli protein YajL (ThiJ) is a member of the DJ-1 superfamily with close homologues in many prokaryotes. YajL also shares 40% sequence identity with human DJ-1, an oncogene and neuroprotective protein whose loss-of-function mutants are associated with certain types of familial, autosomal recessive Parkinsonism. We report the 1.1 Å resolution crystal structure of YajL in a crystal form with two molecules in the asymmetric unit. The structure of YajL is remarkably similar to that of human DJ-1 (0.9 Å Cα RMSD) and both proteins adopt the same dimeric structure. The conserved cysteine residue located in the "nucleophile elbow" is oxidized to either cysteine sulfenic or sulfinic acid in the two molecules in the asymmetric unit, and a mechanism for this oxidation is proposed that may be valid for other proteins in the DJ-1 superfamily as well. Rosenfield difference matrix analysis of the refined anisotropic displacement parameters in the YajL structure reveals significant differences in the intramolecular flexibility of the two non-crystallographic symmetry-related molecules in the asymmetric unit. Lastly, a comparison of the crystal structures of the four different E. coli members of the DJ-1 superfamily indicates that the variable oligomerization in this superfamily is due to a combination of protein-specific insertions into the core fold that form specific interfaces while occluding others, plus optimization of residues in the structurally invariant regions of the core fold that facilitate protein-protein interactions.

18.
19.
We have developed a method to reliably identify partial membrane protein topologies using the consensus of five topology prediction methods. When evaluated on a test set of experimentally characterized proteins, we find that approximately 90% of the partial consensus topologies are correctly predicted in membrane proteins from prokaryotic as well as eukaryotic organisms. Whole-genome analysis reveals that a reliable partial consensus topology can be predicted for approximately 70% of all membrane proteins in a typical bacterial genome and for approximately 55% of all membrane proteins in a typical eukaryotic genome. The average fraction of sequence length covered by a partial consensus topology is 44% for the prokaryotic proteins and 17% for the eukaryotic proteins in our test set, and similar numbers are found when the algorithm is applied to whole genomes. Reliably predicted partial topologies may simplify experimental determinations of membrane protein topology.
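One way to picture a partial consensus topology is as the set of residue positions on which the individual predictors agree. The abstract does not specify the agreement rule, so this sketch assumes per-residue label strings ('i' inside, 'M' membrane, 'o' outside) and, by default, unanimous agreement; the function names, labels and predictions are all hypothetical.

```python
from collections import Counter


def partial_consensus(predictions, min_agree=None):
    """Return a per-residue consensus string, with '?' where methods disagree.

    predictions: equal-length label strings, one per prediction method.
    min_agree: votes needed to accept a label (default: all methods).
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all predictions must have the same length")
    if min_agree is None:
        min_agree = len(predictions)
    consensus = []
    for column in zip(*predictions):
        label, votes = Counter(column).most_common(1)[0]
        consensus.append(label if votes >= min_agree else "?")
    return "".join(consensus)


def coverage(consensus):
    """Fraction of the sequence length covered by the partial consensus."""
    return sum(ch != "?" for ch in consensus) / len(consensus)


# Five hypothetical predictions for a 7-residue toy sequence.
preds = ["iiMMMoo", "iiMMMoo", "iiiMMoo", "iiMMMMo", "iiMMMoo"]
strict = partial_consensus(preds)
relaxed = partial_consensus(preds, min_agree=4)
```

The `coverage` value corresponds in spirit to the "fraction of sequence length covered" quoted in the abstract, although the published method may define consensus regions differently.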

20.
Many proteins need to form oligomers to be functional, so oligomer structures provide important clues to the biological roles of proteins. Prediction of oligomer structures can therefore be a useful tool in the absence of experimentally resolved structures. In this article, we describe the server and human methods that we used to predict oligomer structures in the CASP13 experiment. The performance of the methods on the 42 CASP13 oligomer targets, consisting of 30 homo-oligomers and 12 hetero-oligomers, is discussed. Our server method, Seok-assembly, generated models with an interface contact similarity measure greater than 0.2 as model 1 for 11 homo-oligomer targets when proper templates existed in the database. Model refinement methods such as loop modeling and molecular dynamics (MD)-based overall refinement failed to improve model quality when target proteins had domains not covered by templates or when chains had very small interfaces. In human predictions, additional experimental data such as low-resolution electron microscopy (EM) maps were utilized. EM data could assist oligomer structure prediction by providing a global shape of the complex structure.
