Similar literature
 20 similar documents found
1.

Background  

Detection of DNA-binding sites in proteins is of enormous interest for technologies targeting gene regulation and manipulation. We have previously shown that a residue and its sequence neighbor information can be used to predict DNA-binding candidates in a protein sequence. This sequence-based prediction method is applicable even if no sequence homology with a previously known DNA-binding protein is observed. Here we implement a neural network based algorithm to utilize evolutionary information of amino acid sequences in terms of their position specific scoring matrices (PSSMs) for a better prediction of DNA-binding sites.
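A minimal sketch of how a residue and its sequence neighbors might be turned into a fixed-length input vector from a PSSM. The abstract does not give the window size, padding scheme, or network architecture, so the function name `pssm_windows`, the `half_width` parameter, and the zero padding at the sequence ends are all assumptions made for illustration.

```python
def pssm_windows(pssm, half_width=2):
    """Build one feature vector per residue from a sliding PSSM window.

    pssm: list of rows, one row of scores per residue (hypothetical layout).
    half_width: number of neighbors taken on each side (assumed value).
    """
    if not pssm:
        return []
    n_cols = len(pssm[0])
    pad = [0.0] * n_cols  # assumption: positions beyond the ends are zero-padded
    features = []
    for i in range(len(pssm)):
        window = []
        for j in range(i - half_width, i + half_width + 1):
            row = pssm[j] if 0 <= j < len(pssm) else pad
            window.extend(row)
        features.append(window)
    return features
```

Each returned vector has `(2 * half_width + 1) * n_cols` entries and could be passed to any per-residue classifier.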

2.

Background  

RNA-protein interactions are important for a wide range of biological processes. Current computational methods to predict interacting residues in RNA-protein interfaces predominantly rely on sequence data. It is, however, known that interface residue propensity is closely correlated with structural properties. In this paper we systematically study information obtained from sequences and structures and compare their contributions in this prediction problem. In particular, different geometrical and network topological properties of protein structures are evaluated to improve interface residue prediction accuracy.

3.
4.
Due to Ca2+‐dependent binding and the sequence diversity of Calmodulin (CaM) binding proteins, identifying CaM interactions and binding sites in the wet‐lab is tedious and costly. Therefore, computational methods for this purpose are crucial to the design of such wet‐lab experiments. We present an algorithm suite called CaMELS (CalModulin intEraction Learning System) for predicting proteins that interact with CaM as well as their binding sites using sequence information alone. CaMELS offers state‐of‐the‐art accuracy for both CaM interaction and binding site prediction and can aid biologists in studying CaM binding proteins. For CaM interaction prediction, CaMELS uses protein sequence features coupled with a large‐margin classifier. CaMELS models the binding site prediction problem using multiple instance machine learning with a custom optimization algorithm, which allows more effective learning over imprecisely annotated CaM‐binding sites during training. CaMELS has been extensively benchmarked using a variety of data sets, mutagenic studies, proteome‐wide Gene Ontology enrichment analyses and protein structures. Our experiments indicate that CaMELS outperforms simple motif‐based search and other existing methods for interaction and binding site prediction. We have also found that the whole sequence of a protein, rather than just its binding site, is important for predicting its interaction with CaM. Using the machine learning model in CaMELS, we have identified important features of protein sequences for CaM interaction prediction as well as characteristic amino acid sub‐sequences and their relative position for identifying CaM binding sites. Python code for training and evaluating CaMELS together with a webserver implementation is available at the URL: http://faculty.pieas.edu.pk/fayyaz/software.html#camels .

5.
MOTIVATION: Knowledge-based potentials are valuable tools for protein structure modeling and for evaluating the quality of structure predictions obtained by a variety of methods. Such potentials could be significantly enhanced by a proper exploitation of the evolutionary information encoded in related protein sequences. The new potentials could be valuable components of threading algorithms, ab-initio protein structure prediction, comparative modeling and structure modeling based on fragmentary experimental data. RESULTS: A new potential for scoring local protein geometry is designed and evaluated. The approach is based on the similarity of short protein fragments measured by an alignment of their sequence profiles. Sequence specificity of the resulting energy function has been compared with the specificity of simpler potentials using gapless threading and the ability to predict specific geometry of protein fragments. Significant improvement in threading sensitivity and in the ability to generate sequence-specific protein-like conformations has been achieved.

6.
Detecting cis-regulatory binding sites for cooperatively binding proteins (total citations: 1, self-citations: 0, citations by others: 1)
Several methods are available to predict cis-regulatory modules in DNA based on position weight matrices (PWMs). However, the performance of these methods generally depends on a number of additional parameters that cannot be derived from sequences and are difficult to estimate because they have no physical meaning. As the best way to detect cis-regulatory modules is the way in which the proteins recognize them, we developed a new scoring method that utilizes the underlying physical binding model. This method requires no additional parameter to account for multiple binding sites, and the only parameters necessary to model homotypic cooperative interactions are the distances between adjacent protein binding sites in base pairs and the corresponding cooperative binding constants. The heterotypic cooperative binding model requires one more parameter per cooperatively binding protein, which is the concentration multiplied by the partition function of this protein. In a case study on the bacterial ferric uptake regulator, we show that our scoring method for homotypic cooperatively binding proteins significantly outperforms other PWM-based methods where biophysical cooperativity is not taken into account.

7.
Lattice models of proteins were used to examine the role of local propensities in stabilizing the native state of a protein, using techniques drawn from spin-glass theory to characterize the free-energy landscapes. In the strong evolutionary limit, optimal conditions for folding are achieved when the contribution from local interactions to the stability of the native state is small. Further increasing the local interactions rapidly decreases the foldability. © 1995 Wiley-Liss, Inc.

8.
An analysis of the characteristic properties of sugar binding sites was performed on a set of 19 sugar binding proteins. For each site six parameters were evaluated: solvation potential, residue propensity, hydrophobicity, planarity, protrusion and relative accessible surface area. Three of the parameters were found to distinguish the observed sugar binding sites from the other surface patches. These parameters were then used to calculate the probability for a surface patch to be a carbohydrate binding site. The prediction was optimized on a set of 19 non-homologous carbohydrate binding structures and a test prediction was carried out on a set of 40 protein-carbohydrate complexes. The overall accuracy of prediction achieved was 65%. Results were in general better for carbohydrate-binding enzymes than for the lectins, with a rate of success of 87%.

9.
Harris R, Olson AJ, Goodsell DS. Proteins, 2008, 70(4): 1506-1517
We present a method, termed AutoLigand, for the prediction of ligand-binding sites in proteins of known structure. The method searches the space surrounding the protein and finds the contiguous envelope with the specified volume of atoms, which has the largest possible interaction energy with the protein. It uses a full atomic representation, with atom types for carbon, hydrogen, oxygen, nitrogen and sulfur (and others, if desired), and is designed to minimize the need for artificial geometry. Testing on a set of 187 diverse protein-ligand complexes has shown that the method is successful in predicting the location and approximate volume of the binding site in 73% of cases. Additional testing was performed on a set of 96 protein-ligand complexes with crystallographic structures of apo and holo forms, and AutoLigand was able to predict the binding site in 80% of the apo structures.

10.
Chung JL, Wang W, Bourne PE. Proteins, 2006, 62(3): 630-640
A rapid increase in the number of experimentally derived three-dimensional structures provides an opportunity to better understand and subsequently predict protein-protein interactions. In this study, structurally conserved residues were derived from multiple structure alignments of the individual components of known complexes and the assigned conservation score was weighted based on the crystallographic B factor to account for the structural flexibility that will result in a poor alignment. Sequence profile and accessible surface area information was then combined with the conservation score to predict protein-protein binding sites using a Support Vector Machine (SVM). The incorporation of the conservation score significantly improved the performance of the SVM. About 52% of the binding sites were precisely predicted (greater than 70% of the residues in the site were identified); 77% of the binding sites were correctly predicted (greater than 50% of the residues in the site were identified), and 21% of the binding sites were partially covered by the predicted residues (some residues were identified). The results support the hypothesis that in many cases protein interfaces require some residues to provide rigidity to minimize the entropic cost upon complex formation.
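The coverage thresholds quoted in the abstract above (more than 70% of site residues identified for "precise", more than 50% for "correct", at least one residue for "partial") can be sketched as a small labeling function. The function name `classify_site` and the "missed" label are hypothetical, and the labels here are mutually exclusive, which may differ from how the abstract counts its percentages.

```python
def classify_site(true_site, predicted):
    """Label a binding site by the fraction of its residues that were predicted.

    true_site, predicted: iterables of residue identifiers (hypothetical inputs).
    """
    true_site = set(true_site)
    if not true_site:
        raise ValueError("binding site has no residues")
    fraction = len(true_site & set(predicted)) / len(true_site)
    if fraction > 0.70:   # threshold taken from the abstract
        return "precise"
    if fraction > 0.50:   # threshold taken from the abstract
        return "correct"
    if fraction > 0:      # "some residues were identified"
        return "partial"
    return "missed"       # label added here for completeness
```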

11.
The rapidly increasing volume of sequence and structure information available for proteins poses the daunting task of determining their functional importance. Computational methods can prove to be very useful in understanding and characterizing the biochemical and evolutionary information contained in this wealth of data, particularly at functionally important sites. Therefore, we perform a detailed survey of compositional and evolutionary constraints at the molecular and biological function level for a large set of known functionally important sites extracted from a wide range of protein families. We compare the degree of conservation across different functional categories and provide detailed statistical insight to decipher the varying evolutionary constraints at functionally important sites. The compositional and evolutionary information at functionally important sites has been compiled into a library of functional templates. We developed a module that predicts functionally important columns (FIC) of an alignment based on the detection of a significant "template match score" to a library template. Our template match score measures an alignment column's similarity to a library template and combines a term explicitly representing a column's residue composition with various evolutionary conservation scores (information content and position-specific scoring matrix-derived statistics). Our benchmarking studies show good sensitivity/specificity for the prediction of functional sites and high accuracy in attributing correct molecular function type to the predicted sites. This prediction method is based on information derived from homologous sequences and no structural information is required. Therefore, this method could be extremely useful for large-scale functional annotation.
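The abstract above names information content as one of its conservation scores but does not give a formula. As a sketch only, the common Shannon form for an alignment column is shown below, with no background frequencies or pseudocounts; the function name `column_information_content`, the gap character `-`, and the 20-letter alphabet are assumptions, and the published template match score is not reproduced.

```python
import math
from collections import Counter

def column_information_content(column, alphabet_size=20):
    """Shannon information content (bits) of one alignment column.

    column: string or list of residues; '-' is treated as a gap and ignored
    (an assumption, not taken from the abstract).
    """
    residues = [r for r in column if r != "-"]
    if not residues:
        return 0.0
    n = len(residues)
    counts = Counter(residues)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    # Maximum possible entropy minus observed entropy
    return math.log2(alphabet_size) - entropy
```

A fully conserved column scores log2(20), about 4.32 bits, and the score falls as the residue mix becomes more even.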

12.
Summary: Conservation of polypeptide fold and mode of ligand binding is frequently found within proteins of related function. Examples illustrating this phenomenon are taken from NAD linked enzymes, nucleotide binding proteins, polysaccharide binding proteins, heme binding proteins and enzymes with essential Fe-S complexes or zinc atoms.

13.
Highly specific prediction of phosphorylation sites in proteins (total citations: 1, self-citations: 0, citations by others: 1)
SUMMARY: The prediction of significant short functional protein sequences has inherent problems. In predicting phosphorylation sites, the problems arise from the shortness of phosphorylation sites, the difficulty of maintaining many different predefined models of binding sites, and the difficulty of obtaining highly sensitive predictions and predictions with a constant sensitivity and specificity. The algorithm presented in this paper overcomes these problems. The proposed algorithm, PHOSITE, is based on case-based sequence analysis. This enables the prediction of phosphorylation sites with constant specificity and sensitivity. Furthermore, the method not only predicts phosphorylation sites in general but also predicts the most probable type of kinase involved. AVAILABILITY: The tool PHOSITE implementing the presented method can be evaluated at the website http://www.phosite.com.

14.
Summary: The functions of a number of amino acid residues in proteins have been studied by chemical modification techniques and much useful information has been obtained. Methods using dicarbonyl compounds for the modification of arginine residues are the most recent to have been developed. Since their introduction about 10 years ago, they have led to the identification of a large number of enzymes and other proteins that contain arginine residues critical to biological function. These reagents are discussed in terms of their chemical reactivity and mechanisms of action and in relation to the unique chemical properties of the guanidinium group. Butanedione, phenylglyoxal and cyclohexanedione are the most commonly employed arginyl reagents, and their relative advantages are examined. A survey of the functional role of arginine residues in enzymes and other proteins is presented in which nearly 100 examples are cited. The prediction that arginine residues would be found to serve a general role as anionic binding sites in proteins has been validated. The genetic and physiological implications of the selection of arginine for this important function are discussed. This work was supported by Grant-in-Aid GM-15003 from the National Institutes of Health.

15.

Background  

Many integral membrane proteins, like their non-membrane counterparts, form either transient or permanent multi-subunit complexes in order to carry out their biochemical function. Computational methods that provide structural details of these interactions are needed since, despite their importance, relatively few structures of membrane protein complexes are available.

16.
An algorithm for the recognition of prokaryotic ribosomal binding sites is proposed. The parameter library contains weight matrices for mapping of gene starts in various bacterial genomes. Comparison of the ribosome binding starts in different taxonomic groups demonstrates that the signals in Gram-positive bacteria are stronger than in Gram-negative bacteria, in particular Enterobacteria. The recognition matrices are available by e-mail at misha@imb.imb.ac.ru.
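A generic sketch of how a weight matrix is typically slid along a sequence to locate the best-scoring site. The abstract above does not describe its matrices or scoring rule, so the function name `best_pwm_hit`, the per-position dictionary layout, and the additive scoring are assumptions, and any matrix values used with it would be toy numbers rather than the authors' parameters.

```python
def best_pwm_hit(sequence, pwm):
    """Return (start, score) of the highest-scoring window, or None.

    sequence: DNA string.
    pwm: list of dicts mapping base -> score, one dict per motif position
         (hypothetical layout).
    """
    width = len(pwm)
    best = None
    for start in range(len(sequence) - width + 1):
        # Additive score: sum of the per-position weights for this window
        score = sum(pwm[k][sequence[start + k]] for k in range(width))
        if best is None or score > best[1]:
            best = (start, score)
    return best
```

When several windows tie, the leftmost one is returned, and `None` is returned when the sequence is shorter than the matrix.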

17.
18.
Metals play a variety of roles in biological processes, and hence their presence in a protein structure can yield vital functional information. Because the residues that coordinate a metal often undergo conformational changes upon binding, detection of binding sites based on simple geometric criteria in proteins without bound metal is difficult. However, aspects of the physicochemical environment around a metal binding site are often conserved even when this structural rearrangement occurs. We have developed a Bayesian classifier using known zinc binding sites as positive training examples and nonmetal binding regions that nonetheless contain residues frequently observed in zinc sites as negative training examples. In order to allow variation in the exact positions of atoms, we average a variety of biochemical and biophysical properties in six concentric spherical shells around the site of interest. At a specificity of 99.8%, this method achieves 75.5% sensitivity in unbound proteins at a positive predictive value of 73.6%. We also test its accuracy on predicted protein structures obtained by homology modeling using templates with 30%-50% sequence identity to the target sequences. At a specificity of 99.8%, we correctly identify at least one zinc binding site in 65.5% of modeled proteins. Thus, in many cases, our model is accurate enough to identify metal binding sites in proteins of unknown structure for which no high sequence identity homologs of known structure exist. Both the source code and a Web interface are available to the public at http://feature.stanford.edu/metals.
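The three figures reported in the abstract above (specificity, sensitivity, positive predictive value) follow from the standard confusion-matrix definitions, sketched below. The function name `binary_metrics` is hypothetical, and the counts passed to it must come from the reader's own evaluation; the abstract does not report the underlying counts.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute sensitivity, specificity and positive predictive value.

    tp, fp, tn, fn: true/false positive and negative counts.
    A metric is None when its denominator is zero.
    """
    def ratio(numerator, denominator):
        return numerator / denominator if denominator else None

    return {
        "sensitivity": ratio(tp, tp + fn),  # fraction of real sites that were found
        "specificity": ratio(tn, tn + fp),  # fraction of non-sites correctly rejected
        "ppv": ratio(tp, tp + fp),          # fraction of predictions that are real sites
    }
```

Because non-sites usually far outnumber real sites, a very high specificity can still coexist with a noticeably lower positive predictive value.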

19.
Classical molecular interaction potentials, in conjunction with other theoretical techniques, are used to analyze the dependence of the binding sites of representative proteins on the bound ligand. It is found that the bound ligand generally introduces small structural perturbations at the binding site of the protein. However, such small structural changes can lead to important alterations in the recognition pattern of the protein. The impact of these findings on docking procedures is discussed.

20.
