Similar literature
20 similar articles found (search time: 8 ms)
1.
Protein functional sites control most biological processes and are important targets for drug design and protein engineering. To characterize them, the evolutionary trace (ET) ranks the relative importance of residues according to their evolutionary variations. Generally, top-ranked residues cluster spatially to define evolutionary hotspots that predict functional sites in structures. Here, various functions that measure the physical continuity of ET ranks among neighboring residues in the structure, or in the sequence, are shown to inform sequence selection and to improve functional site resolution. This is shown first in 110 proteins, for which the overlap between top-ranked residues and actual functional sites rose by 8% in significance. Then, on a structural proteomic scale, optimized ET led to better 3D structure-function motifs (3D templates) and, in turn, to enzyme function prediction by the Evolutionary Trace Annotation (ETA) method with better sensitivity (40% to 53%) and positive predictive value (93% to 94%). This suggests that the similarity of evolutionary importance among neighboring residues in the sequence and in the structure is a universal feature of protein evolution. In practice, this yields a tool for optimizing sequence selections for comparative analysis and, via ET, for better predictions of functional sites and function. This should prove useful for the efficient mutational redesign of protein function and for pharmaceutical targeting.

2.
3.
Given the massive increase in the number of new sequences and structures, a critical problem is how to integrate these raw data into meaningful biological information. One approach, the Evolutionary Trace, or ET, uses phylogenetic information to rank the residues in a protein sequence by evolutionary importance and then maps those ranked at the top onto a representative structure. If these residues form structural clusters, they can identify functional surfaces such as those involved in molecular recognition. Now that a number of examples have shown that ET can identify binding sites and focus mutational studies on their relevant functional determinants, we ask whether the method can be improved so as to be applicable on a large scale. To address this question, we introduce a new treatment of gaps resulting from insertions and deletions, which streamlines the selection of sequences used as input. We also introduce objective statistics to assess the significance of the total number of clusters and of the size of the largest one. As a result of the novel treatment of gaps, ET performance improves measurably. We find evolutionarily privileged clusters that are significant at the 5% level in 45 out of 46 (98%) proteins drawn from a variety of structural classes and biological functions. In 37 of the 38 proteins for which a protein-ligand complex is available, the dominant cluster contacts the ligand. We conclude that spatial clustering of evolutionarily important residues is a general phenomenon, consistent with the cooperative nature of residues that determine structure and function. In practice, these results suggest that ET can be applied on a large scale to identify functional sites in a significant fraction of the structures in the Protein Data Bank (PDB). This approach to combining raw sequences and structure to obtain detailed insights into the molecular basis of function should prove valuable in the context of the Structural Genomics Initiative.
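The two cluster statistics named in this abstract (the total number of clusters and the size of the largest one) can be illustrated with a minimal sketch. It assumes each residue is given as a 3D coordinate with an ET-style rank (lower means more important) and treats two residues as neighbors when they lie within a distance cutoff. The function name, the 4 Å cutoff, and the top fraction are hypothetical choices for illustration, not the authors' implementation, and the significance test against random residue selections is left out.

```python
import math


def top_ranked_clusters(coords, ranks, top_fraction=0.25, cutoff=4.0):
    """Group the top-ranked residues into spatial clusters.

    coords: list of (x, y, z) tuples, one per residue (hypothetical input).
    ranks:  list of ET-style ranks, lower = more important.
    Returns a list of clusters (sorted residue indices), largest first.
    """
    n_top = max(1, int(len(ranks) * top_fraction))
    top = sorted(range(len(ranks)), key=lambda i: ranks[i])[:n_top]

    # Union-find over the selected residues only.
    parent = {i: i for i in top}

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    # Merge every pair of selected residues closer than the cutoff.
    for pos, a in enumerate(top):
        for b in top[pos + 1:]:
            if math.dist(coords[a], coords[b]) <= cutoff:
                parent[find(a)] = find(b)

    clusters = {}
    for i in top:
        clusters.setdefault(find(i), []).append(i)
    return sorted((sorted(c) for c in clusters.values()), key=len, reverse=True)


# Toy data: eight residues on a line, the four best-ranked are indices 0-3.
coords = [(0, 0, 0), (3, 0, 0), (6, 0, 0), (20, 0, 0),
          (23, 0, 0), (50, 0, 0), (60, 0, 0), (70, 0, 0)]
ranks = [1, 2, 3, 4, 8, 5, 7, 6]
clusters = top_ranked_clusters(coords, ranks, top_fraction=0.5)
print(len(clusters), len(clusters[0]))  # number of clusters, largest cluster size
```

In a real analysis the same two numbers would be compared with their distribution over randomly chosen residue sets of the same size, which is what gives the significance level quoted in the abstract.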

4.
The growing body of experimental and computational data describing how proteins interact with each other has emphasized the multiplicity of protein interactions and the complexity underlying protein surface usage and deformability. In this work, we propose new concepts and methods toward deciphering such complexity. We introduce the notion of interacting region to account for the multiple usage of a protein's surface residues by several partners and for the variability of protein interfaces coming from molecular flexibility. We predict interacting patches by crossing evolutionary, physicochemical and geometrical properties of the protein surface with information coming from complete cross-docking (CC-D) simulations. We show that our predictions match interacting regions well and that the different sources of information are complementary. We further propose an indicator of whether a protein has a few or many partners. Our prediction strategies are implemented in the dynJET2 algorithm and assessed on a new dataset of 262 proteins on which we performed CC-D. The code and the data are available at: http://www.lcqb.upmc.fr/dynJET2/ .

5.
Automatic methods for predicting functionally important residues   Total citations: 9, self-citations: 0, citations by others: 9
Sequence analysis is often the first guide for the prediction of residues in a protein family that may have functional significance. A few methods have been proposed that use the division of protein families into subfamilies to search for positions that could have functional significance for the whole family while also exhibiting the specificity of each subfamily ("Tree-determinant residues"). However, many questions remain unsolved, such as the best division of a protein family into subfamilies or the accurate detection of sequence variation patterns characteristic of different subfamilies. Here we present a systematic study in a significant number of protein families, testing the statistical meaning of the Tree-determinant residues predicted by three different methods that represent the range of available approaches. The first method takes as a starting point a phylogenetic representation of a protein family and, following the principle of Relative Entropy from Information Theory, automatically searches for the optimal division of the family into subfamilies. The second method looks for positions whose mutational behavior is reminiscent of the mutational behavior of the full-length proteins, by directly comparing the corresponding distance matrices. The third method is an automation of the analysis of the distribution of sequences and amino acid positions in the corresponding multidimensional spaces using a vector-based principal component analysis. These three methods have been tested on two non-redundant lists of protein families: one composed of proteins that bind a variety of ligand groups, and the other composed of proteins with annotated functionally relevant sites. In most cases, the residues predicted by the three methods show a clear tendency to be close to bound ligands of biological relevance and to those amino acids described as participants in key aspects of protein function.
These three automatic methods provide a wide range of possibilities for biologists to analyze their families of interest, in a similar way to the one presented here for the family of proteins related to ras-p21.
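The relative entropy idea behind the first method can be illustrated with a minimal sketch. It scores a single alignment column by the Kullback-Leibler divergence between the residue distribution in one subfamily and the distribution in the whole family; a position that is conserved within the subfamily but differs from the rest of the family scores high. The function name and the pseudocount are assumptions made for illustration, and the automatic search for the optimal subfamily division described in the abstract is not reproduced here.

```python
import math
from collections import Counter


def column_relative_entropy(sub_column, family_column, pseudocount=1.0):
    """Relative entropy D(subfamily || family) of one alignment column, in bits.

    sub_column:    residues at this position in one subfamily, e.g. "AAAA".
    family_column: residues at this position in the whole family.
    A pseudocount (an assumption of this sketch) keeps all probabilities positive.
    """
    alphabet = sorted(set(family_column) | set(sub_column))
    sub_counts = Counter(sub_column)
    fam_counts = Counter(family_column)
    sub_total = len(sub_column) + pseudocount * len(alphabet)
    fam_total = len(family_column) + pseudocount * len(alphabet)

    score = 0.0
    for aa in alphabet:
        p = (sub_counts[aa] + pseudocount) / sub_total
        q = (fam_counts[aa] + pseudocount) / fam_total
        score += p * math.log2(p / q)
    return score


# A column split between two subfamilies versus a fully conserved column.
specific = column_relative_entropy("AAAA", "AAAAGGGG")
conserved = column_relative_entropy("LLLL", "LLLLLLLL")
print(specific > conserved)
```

A fully conserved column scores zero because the subfamily and family distributions coincide, so this quantity highlights subfamily-specific positions rather than positions conserved across the whole family.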

6.
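Bioinformatics methods for predicting protein-protein interactions   Total citations: 8, self-citations: 0, citations by others: 8
In the post-genomic era, the research paradigm has shifted from sequence-structure-function to gene expression-system dynamics-physiological function. Building the complete network of protein-protein interactions, the interactome, will deepen our understanding of cell structure and function from a systems perspective and provide a theoretical basis for the discovery of new drug targets and for drug design. A series of experimental methods for the systematic analysis of protein interactions have been established, and in recent years a variety of bioinformatics methods for predicting protein interactions have emerged. These methods are not only a valuable complement to traditional experimental methods but can also extend the predictive scope of the experimental methods. In addition, some important concepts in molecular evolution and molecular biology were established in the course of developing them. This article reviews the principles, evaluation, and open problems of nine bioinformatics methods and analyzes the prospects of the field.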

7.
8.
9.
As a result of rapid advances in genome sequencing, the pace of discovery of new protein sequences has surpassed that of structure and function determination by orders of magnitude. This is also true for metal-binding proteins, that is, proteins that bind one or more metal atoms necessary for their biological function. While metal binding site geometry and composition have been extensively studied, no large-scale investigation of metal-coordinating residue conservation has been pursued so far. In pursuing this analysis, we were able to corroborate anecdotal evidence that certain residues are preferred over others for binding to certain metals. The conservation of most metal-coordinating residues is correlated with residue preference in a statistically significant manner. We also established a statistically significant difference in conservation between metal-coordinating and noncoordinating residues. These results could provide better insight into the functional importance of metal-coordinating residues, possibly aiding metal binding site prediction and design, metal-protein complex structure prediction, drug discovery, as well as model fitting to electron-density maps produced by X-ray crystallography.

10.
11.
The study of novel inhibitors of DNA topoisomerase I (Topo I) has become very promising in cancer chemotherapy. Identifying new drug-binding residues plays an important role in the design and optimization of Topo I inhibitors. The designed compounds may have novel scaffolds and thus may help to overcome the toxicities of current camptothecin (CPT) drugs and provide a solution to cross-resistance with these drugs. Multiple sequence alignments were performed on the eukaryotic DNA topoisomerase I superfamily, and an evolutionary tree was constructed. The Evolutionary Trace method was applied to identify functionally important residues of human Topo I. The class-specific hydrophobic residues Ala351, Met428, and Pro431 were found to be located around the 7- and 9-positions of CPT, indicating that suitable substitution of hydrophobic groups on CPT will increase antitumor activity. The residue Lys436, which is conserved in the superfamily, is of particular interest, and new CPT derivatives designed on the basis of this residue may greatly increase the water solubility of such drugs. The residues Asn352 and Arg364 were also found to be conserved in the superfamily, and their mutation confers CPT resistance. Because our molecular docking studies showed that they make no direct interaction with CPT, they are important drug-binding site residues for the future design of novel non-camptothecin lead compounds. This work provides a strong basis for the design and synthesis of novel, highly potent CPT derivatives and for virtual screening for novel lead compounds.

12.
A common difficulty in post-genomic biology is that large-scale techniques of data collection often strip away information on the biological context of these data. The result is a massive number of disconnected observations on sequence, structure, and function in which underlying patterns and biological meaning are obscured. One solution is to build computational filters that pick out sufficiently few facts, relevant to a query, that their relationship is immediately apparent and experimentally testable. Typically, these filters rely on mathematics and statistics, and on first principles from physics and chemistry. We show here that evolution itself can be used to filter sequence and structure data in order to identify evolutionarily important amino acids. A general property of these residues is that they form clusters in native protein structures and point to regions where mutations have the greatest biological impact. The result is an accurate method of functional site annotation that is scalable for structural proteomics.

13.
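An evolutionary imprint method for genome function prediction   Total citations: 6, self-citations: 1, citations by others: 6
Improving schemes for genome function prediction is an urgent problem in functional genomics. The course of biological evolution leaves corresponding evolutionary imprints in molecular sequences, namely motifs specific to clusters of orthologs. On the basis of this biological fact, a new method for genome function prediction is proposed. First, clusters of orthologs are constructed by evolutionary analysis. Next, the functional motifs of each cluster are identified, which yields a library of specific functional motifs. The function of an unknown gene can then be predicted efficiently and accurately by searching this functional motif library. Tests on five families provide preliminary evidence that the scheme is feasible.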

14.
During protein evolution, protein family members undergo extensive random mutation and a long period of natural selection, which drives functional evolution and the emergence of subfamilies. These evolutionary variation events are recorded in the sequences of protein family members. Therefore, functionally important residues can be identified by studying residue conservation in protein sequence families. Generally, the residues conserved across the family of…

15.
We describe a general, modular method for developing protocols to identify the amino acid residues that most likely define the division of a protein superfamily into two subsets. As one possibility, we use PROBE to gather superfamily members and perform an ungapped alignment. We then use a modified BLOSUM62 substitution matrix to determine the discriminating power of each column of aligned residues. The overall method is particularly useful for predicting amino acids responsible for substrate or binding specificity when no structures are available. We apply our method to three pairs of protein classes in three different superfamilies, and present our results, some of which have been experimentally verified. This approach may accelerate the elucidation of enzymic substrate specificity, which is critical for both mechanistic insights into biocatalysis and ultimate application.
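One way to picture the "discriminating power" of an alignment column is sketched below. It is an illustrative assumption, not the scoring used in the paper: the abstract does not specify how BLOSUM62 was modified or how columns were scored, so this sketch simply takes the mean substitution score of residue pairs within each subset minus the mean score of pairs across the two subsets. The `SCORES` table is a small excerpt of BLOSUM62-style values covering five residues only, and all names are hypothetical.

```python
from itertools import combinations, product

# Small excerpt of BLOSUM62-style scores (illustrative; five residues only).
SCORES = {
    ("A", "A"): 4, ("S", "S"): 4, ("T", "T"): 5, ("D", "D"): 6, ("E", "E"): 5,
    ("A", "S"): 1, ("A", "T"): 0, ("A", "D"): -2, ("A", "E"): -1,
    ("S", "T"): 1, ("S", "D"): 0, ("S", "E"): 0,
    ("T", "D"): -1, ("T", "E"): -1, ("D", "E"): 2,
}


def score(a, b):
    """Symmetric lookup in the excerpt table; raises KeyError for other residues."""
    key = (a, b) if (a, b) in SCORES else (b, a)
    return SCORES[key]


def discriminating_power(col_a, col_b):
    """Mean within-subset similarity minus mean between-subset similarity.

    col_a, col_b: residues of one ungapped alignment column in each subset.
    Each subset needs at least two sequences so that within-subset pairs exist.
    """
    within = [score(x, y)
              for col in (col_a, col_b)
              for x, y in combinations(col, 2)]
    between = [score(x, y) for x, y in product(col_a, col_b)]
    return sum(within) / len(within) - sum(between) / len(between)


# An acidic column in one subset and a hydroxyl column in the other.
print(discriminating_power("DDDE", "SSST"))
# A column that is identical in both subsets does not discriminate.
print(discriminating_power("AAAA", "AAAA"))
```

Columns with a high value are internally consistent within each subset yet dissimilar between subsets, which is the pattern expected of residues that determine substrate or binding specificity.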

16.
Molecular docking is a popular way to screen for novel drug compounds. The method involves aligning small molecules to a protein structure and estimating their binding affinity. To do this rapidly for tens of thousands of molecules requires an effective representation of the binding region of the target protein. This paper presents an algorithm for representing a protein's binding site in a way that is specifically suited to molecular docking applications. Initially the protein's surface is coated with a collection of molecular fragments that could potentially interact with the protein. Each fragment, or probe, serves as a potential alignment point for atoms in a ligand, and is scored to represent that probe's affinity for the protein. Probes are then clustered by accumulating their affinities, where high affinity clusters are identified as being the "stickiest" portions of the protein surface. The stickiest cluster is used as a computational binding "pocket" for docking. This method of site identification was tested on a number of ligand-protein complexes; in each case the pocket constructed by the algorithm coincided with the known ligand binding site. Successful docking experiments demonstrated the effectiveness of the probe representation.
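The probe clustering step described above can be sketched as follows. This is a minimal illustration under stated assumptions: probes are given as a position plus an affinity score where higher means stickier, probes within a hypothetical 2 Å cutoff are linked into the same cluster, and a cluster's score is the sum of its probes' affinities. The function name, cutoff, and linkage rule are not taken from the paper.

```python
import math
from collections import deque


def stickiest_cluster(probes, cutoff=2.0):
    """Return (total_affinity, member_indices) of the highest-scoring cluster.

    probes: list of ((x, y, z), affinity) pairs (hypothetical input format).
    Probes closer than `cutoff` are linked, and linked probes form one cluster.
    """
    unvisited = set(range(len(probes)))
    best = (float("-inf"), [])

    while unvisited:
        seed = unvisited.pop()
        queue, members = deque([seed]), [seed]
        # Breadth-first flood fill over probes within the cutoff distance.
        while queue:
            i = queue.popleft()
            near = [j for j in unvisited
                    if math.dist(probes[i][0], probes[j][0]) <= cutoff]
            for j in near:
                unvisited.remove(j)
                members.append(j)
                queue.append(j)
        # Accumulate the affinities of all probes in this cluster.
        total = sum(probes[m][1] for m in members)
        if total > best[0]:
            best = (total, sorted(members))
    return best


probes = [((0, 0, 0), 1.0), ((1, 0, 0), 2.0), ((2, 0, 0), 1.5),
          ((10, 0, 0), 4.0), ((30, 0, 0), 0.5)]
print(stickiest_cluster(probes))
```

In this toy data the single strongest probe (index 3) loses to the group of three weaker neighbors, which shows why accumulating affinities over a cluster can select a different site than picking the best individual probe.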

17.
The maltose transporter of Escherichia coli is a member of the ATP-binding cassette (ABC) transporter superfamily. The crystal structures of maltose transporter MalK have been determined for distinct conformations in the presence and absence of the ligand ATP, and other interacting proteins. Using the distinct MalK structures, normal mode analysis was performed to understand the dynamic behavior of the system. A network of dynamically important residues was obtained from the normal mode analysis and from the analysis of the effect of point mutations on the normal modes. Our results suggest that the intradomain rotation occurs earlier than the interdomain rotation during the maltose-binding protein (MBP)-driven conformational changes of MalK. We ask whether protein motion and function-driven evolutionary conservation are related. The sequence conservation of MalK was analyzed to derive a network of evolutionarily important residues. There are highly significant correlations between protein sequence and protein dynamics in many regions of the maltose transporter MalK, suggesting a link between protein evolution and dynamics. The significant overlaps between the network of dynamically important residues and the network of evolutionarily important residues form a network of dynamically conserved residues. Proteins 2009. © 2009 Wiley-Liss, Inc.

18.
The degree to which an amino acid site is free to vary is strongly dependent on its structural and functional importance. An amino acid that plays an essential role is unlikely to change over evolutionary time. Hence, the evolutionary rate at an amino acid site is indicative of how conserved this site is and, in turn, allows evaluation of its importance in maintaining the structure/function of the protein. When using probabilistic methods for site-specific rate inference, few alternatives are possible. In this study we use simulations to compare the maximum-likelihood and Bayesian paradigms. We study the dependence of inference accuracy on such parameters as number of sequences, branch lengths, the shape of the rate distribution, and sequence length. We also study the possibility of simultaneously estimating branch lengths and site-specific rates. Our results show that a Bayesian approach is superior to maximum-likelihood under a wide range of conditions, indicating that the prior that is incorporated into the Bayesian computation significantly improves performance. We show that when branch lengths are unknown, it is better first to estimate branch lengths and then to estimate site-specific rates. This procedure was found to be superior to estimating both the branch lengths and site-specific rates simultaneously. Finally, we illustrate the difference between maximum-likelihood and Bayesian methods when analyzing site-conservation for the apoptosis regulator protein Bcl-x(L).
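The contrast between the two paradigms can be illustrated with a deliberately simplified toy model; it is not the phylogenetic likelihood used in the study. The sketch assumes that the number of substitutions observed at a site is known and follows a Poisson distribution with mean equal to the site rate times the total tree length, and that rates have a gamma prior with mean 1. Under these assumptions the maximum-likelihood estimate is the raw count per unit branch length, while the Bayesian posterior mean follows from gamma-Poisson conjugacy. All function names and parameter values are hypothetical.

```python
def ml_rate(substitutions, tree_length):
    """Maximum-likelihood rate under the Poisson toy model: count / exposure."""
    return substitutions / tree_length


def bayes_rate(substitutions, tree_length, alpha=1.0):
    """Posterior mean rate with a Gamma(shape=alpha, rate=alpha) prior (mean 1).

    By conjugacy the posterior is Gamma(alpha + k, alpha + T), whose mean is
    (alpha + k) / (alpha + T) for k substitutions over total tree length T.
    """
    return (alpha + substitutions) / (alpha + tree_length)


# A site with no observed change and a site with many changes, on a short tree.
for k in (0, 6):
    print(k, ml_rate(k, 2.0), round(bayes_rate(k, 2.0), 3))
```

With little data, the maximum-likelihood estimate jumps to extremes (a rate of exactly zero for an unchanged site), whereas the posterior mean is pulled toward the prior mean. This shrinkage is one intuition for why a prior can improve accuracy when few sequences or short branches are available.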

19.
Laccases belong to the family of blue multi-copper oxidases and are capable of oxidizing a wide range of aromatic compounds. As biocatalysts, laccases have industrial applications in paper pulping or bleaching and in hydrocarbon bioremediation. We describe the design of a laccase with a broader substrate spectrum for bioremediation. The application of evolutionary trace (ET) analysis of laccase at the ligand binding site for optimal design of the enzyme is described. To this end, class-specific sites from the ET analysis were mapped onto a known crystal structure of laccase. The analysis revealed 162PHE as a critical residue in structure-function relationship studies.

20.

Copyright © 北京勤云科技发展有限公司 (Beijing Qinyun Technology Development Co., Ltd.)  京ICP备09084417号