Similar articles
 Found 20 similar articles (search time: 15 ms)
1.
Prediction of RNA binding sites in a protein using SVM and PSSM profile   Total citations: 1 (self-citations: 0, citations by others: 1)
Kumar M  Gromiha MM  Raghava GP 《Proteins》2008,71(1):189-194

2.
Prediction of RNA binding sites in proteins from amino acid sequence   Total citations: 3 (self-citations: 0, citations by others: 3)
RNA-protein interactions are vitally important in a wide range of biological processes, including regulation of gene expression, protein synthesis, and replication and assembly of many viruses. We have developed a computational tool for predicting which amino acids of an RNA binding protein participate in RNA-protein interactions, using only the protein sequence as input. RNABindR was developed using machine learning on a validated nonredundant data set of interfaces from known RNA-protein complexes in the Protein Data Bank. It generates a classifier that captures primary sequence signals sufficient for predicting which amino acids in a given protein are located in the RNA-protein interface. In leave-one-out cross-validation experiments, RNABindR identifies interface residues with >85% overall accuracy. It can be calibrated by the user to obtain either high specificity or high sensitivity for interface residues. RNABindR, implementing a Naive Bayes classifier, performs as well as a more complex neural network classifier (to our knowledge, the only previously published sequence-based method for RNA binding site prediction) and offers the advantages of speed, simplicity and interpretability of results. RNABindR predictions on the human telomerase protein hTERT are in good agreement with experimental data. The availability of computational tools for predicting which residues in an RNA binding protein are likely to contact RNA should facilitate design of experiments to directly test RNA binding function and contribute to our understanding of the diversity, mechanisms, and regulation of RNA-protein complexes in biological systems. (RNABindR is available as a Web tool from http://bindr.gdcb.iastate.edu.)
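The idea of a Naive Bayes classifier over sequence windows, with a user-adjustable threshold that trades specificity against sensitivity, can be sketched as follows. This is a minimal illustration and not the RNABindR implementation: the function names, the `-` padding character, the window half-width, the Laplace smoothing, and the toy training data are all assumptions made for the example.

```python
import math
from collections import defaultdict


def windows(seq, half_width):
    """Yield (index, window) for every residue, padding both ends with '-'."""
    padded = "-" * half_width + seq + "-" * half_width
    for i in range(len(seq)):
        yield i, padded[i:i + 2 * half_width + 1]


def train(examples, half_width):
    """examples: list of (sequence, labels); labels is a string of '1'/'0',
    one character per residue ('1' = interface residue)."""
    class_counts = defaultdict(int)
    # feature_counts[label][(window_position, residue)] -> count
    feature_counts = {"0": defaultdict(int), "1": defaultdict(int)}
    for seq, labels in examples:
        for i, win in windows(seq, half_width):
            label = labels[i]
            class_counts[label] += 1
            for pos, aa in enumerate(win):
                feature_counts[label][(pos, aa)] += 1
    return class_counts, feature_counts


def log_odds(win, model, alphabet_size=21):
    """Log-odds of interface vs. non-interface, with add-one smoothing.
    alphabet_size = 20 amino acids + the padding symbol (an assumption)."""
    class_counts, feature_counts = model
    score = math.log(class_counts["1"] + 1) - math.log(class_counts["0"] + 1)
    for pos, aa in enumerate(win):
        p1 = (feature_counts["1"][(pos, aa)] + 1) / (class_counts["1"] + alphabet_size)
        p0 = (feature_counts["0"][(pos, aa)] + 1) / (class_counts["0"] + alphabet_size)
        score += math.log(p1) - math.log(p0)
    return score


def predict(seq, model, half_width, threshold=0.0):
    """Label each residue '1' when its log-odds exceeds the threshold.
    Raising the threshold favours specificity; lowering it favours sensitivity."""
    return "".join(
        "1" if log_odds(win, model) > threshold else "0"
        for _, win in windows(seq, half_width)
    )


# Toy data (made up for illustration): lysine/arginine labelled as interface.
toy_examples = [("KRKAAA", "111000"), ("AAAKRK", "000111")]
toy_model = train(toy_examples, half_width=0)
```

With the toy model, `predict("KAR", toy_model, half_width=0)` labels the two basic residues as interface, while a large threshold such as `threshold=10.0` suppresses all positive calls.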

3.
Zinc is one of the most abundant catalytic cofactors and also an important structural component of a large number of metalloproteins. Hence, prediction of zinc-binding sites in proteins can be a significant step in annotating the molecular function of a large number of proteins. The majority of existing methods for zinc-binding site prediction are based on a data set of proteins that was compiled nearly a decade ago. Hence, there is a need to develop a zinc-binding site prediction system using current, updated data that include recently added proteins. Herein, we propose a support vector machine-based method, named ZincBinder, for predicting zinc-binding sites in a protein using sequence profile information. The predictor was trained using a fivefold cross-validation approach and achieved 85.37% sensitivity with 86.20% specificity during training. Benchmarking on an independent non-redundant data set, which was not used during training, showed better performance of ZincBinder than existing methods. Executable versions, source code, sample datasets, and usage instructions are available at http://proteininformatics.org/mkumar/znbinder/
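Sensitivity and specificity, the two figures quoted in this abstract, are computed from the per-residue confusion counts. The sketch below shows the standard definitions; the function name and the toy label lists are assumptions for illustration and are not taken from ZincBinder.

```python
def sensitivity_specificity(true_labels, predicted_labels):
    """Return (sensitivity, specificity) for binary labels (1 = binding residue).

    sensitivity = TP / (TP + FN): fraction of true binding residues recovered
    specificity = TN / (TN + FP): fraction of non-binding residues rejected
    """
    tp = fn = tn = fp = 0
    for truth, pred in zip(true_labels, predicted_labels):
        if truth and pred:
            tp += 1
        elif truth and not pred:
            fn += 1
        elif not truth and not pred:
            tn += 1
        else:
            fp += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Toy labels (made up): 3 binding residues, 2 non-binding residues.
toy_truth = [1, 1, 0, 0, 1]
toy_pred = [1, 0, 0, 1, 1]
toy_sens, toy_spec = sensitivity_specificity(toy_truth, toy_pred)
```

Reporting both values matters for binding-site prediction because binding residues are usually a small minority, so either figure alone can look good for a trivial predictor.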

4.
Computational methods designed to predict and visualize ligand-protein binding interactions were used to characterize volatile anesthetic (VA) binding sites and unoccupied pockets within the known structures of VAs bound to serum albumin, luciferase, and apoferritin. We found that the numbers of protein atoms and of methyl hydrogens within approximately 8 Å of a potential ligand binding site are both significantly greater in protein pockets where VAs bind. This computational approach was applied to structures of calmodulin (CaM), which have not been determined in complex with a VA. It predicted that VAs bind to [Ca(2+)](4)-CaM, but not to apo-CaM, which we confirmed with isothermal titration calorimetry. The VA binding sites predicted for the structures of [Ca(2+)](4)-CaM are located in hydrophobic pockets that form when the Ca(2+) binding sites in CaM are saturated. The binding of VAs to these hydrophobic pockets is supported by evidence that halothane predominantly makes contact with aliphatic resonances in [Ca(2+)](4)-CaM (nuclear Overhauser effect) and increases the Ca(2+) affinity of CaM (fluorescence spectroscopy). Our computational analysis and experiments indicate that binding of VA to proteins is consistent with the hydrophobic effect and the Meyer-Overton rule.

5.

Background  

The nucleus, a highly organized organelle, plays an important role in cellular homeostasis. Nuclear proteins are crucial for chromosomal maintenance/segregation, gene expression, RNA processing/export, and many other processes. Several methods have been developed for predicting nuclear proteins in the past. The aim of the present study is to develop a new method for predicting nuclear proteins with higher accuracy.

6.
7.
Mannose is an abundant cell surface monosaccharide and has an important role in many biochemical processes. It binds to a great diversity of receptor proteins. In this study we have employed Random Forest for prediction of mannose binding sites. The mannose binding site is taken to be a sphere around the centroid of the ligand; the sphere is subdivided into different layers, and atom-wise and residue-wise features are extracted for each layer. The method achieves 95.59% accuracy using Random Forest with 10-fold cross-validation. Prediction of mannose binding sites will be quite useful in drug design.
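The layered-sphere representation described here can be sketched as below: compute the ligand centroid, then bin the surrounding protein atoms into concentric shells and count them per shell. This is only an illustration of the idea; the abstract does not give the sphere radius, the number of layers, or the exact features, so `radius`, `n_layers`, the element-count feature, and all names are assumptions.

```python
import math


def centroid(coords):
    """Arithmetic mean of a list of (x, y, z) coordinates."""
    n = len(coords)
    return tuple(sum(c[i] for c in coords) / n for i in range(3))


def layer_counts(ligand_coords, protein_atoms, radius, n_layers):
    """Count protein atoms per element in concentric shells around the ligand centroid.

    protein_atoms: list of (element, (x, y, z)).
    Returns a list of n_layers dicts; layer 0 is the innermost shell.
    Atoms at or beyond `radius` are ignored.
    """
    center = centroid(ligand_coords)
    width = radius / n_layers
    layers = [dict() for _ in range(n_layers)]
    for element, xyz in protein_atoms:
        d = math.dist(center, xyz)
        if d >= radius:
            continue
        # min() guards against floating-point rounding at the outer edge
        idx = min(int(d // width), n_layers - 1)
        layers[idx][element] = layers[idx].get(element, 0) + 1
    return layers


# Toy coordinates (made up): a two-atom "ligand" and three protein atoms.
toy_ligand = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
toy_protein = [("C", (1.0, 0.0, 1.0)), ("N", (1.0, 0.0, 3.5)), ("O", (1.0, 0.0, 9.0))]
toy_layers = layer_counts(toy_ligand, toy_protein, radius=6.0, n_layers=3)
```

The per-layer dictionaries would then be flattened into a fixed-length feature vector before being passed to a classifier such as a Random Forest.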

8.
Heparin is a glycosaminoglycan known to bind bone morphogenetic proteins (BMPs) and the growth and differentiation factors (GDFs) and has strong and variable effects on BMP osteogenic activity. In this paper we report our predictions of the likely heparin binding sites for BMP-2 and 14. The N-terminal sequences upstream of TGF-β-type cysteine-knot domains in BMP-2, 7 and 14 contain the basic residues arginine and lysine, which are key components of the heparin/HS-binding sites, with these residues being highly non-conserved. Importantly, evolutionarily conserved surfaces on the beta sheets are required for interactions with receptors and antagonists. Furthermore, BMP-2 has electropositive surfaces on two sides compared to BMP-7 and BMP-14. Molecular docking simulations suggest the presence of high and low affinity binding sites in dimeric BMP-2. Histidines were found to play a role in the interactions of BMP-2 with heparin; however, a pK(a) analysis suggests that histidines are likely not protonated. This indicates that interactions of BMP-2 with heparin do not require acidic pH. Taken together, non-conserved amino acid residues in the N-terminus and residues protruding from the beta sheet (not overlapping with the receptor binding sites and the dimeric interface) and not C-terminal are found to be important for heparin-BMP interactions.

9.
10.
Kaur H  Raghava GP 《Proteins》2004,55(1):83-90
In this paper a systematic attempt has been made to develop a better method for predicting alpha-turns in proteins. Most of the commonly used approaches in the field of protein structure prediction have been tried in this study, including the statistical approach "Sequence Coupled Model" and the machine learning approaches (i) artificial neural network (ANN), (ii) Weka (Waikato Environment for Knowledge Analysis) classifiers, and (iii) Parallel Exemplar Based Learning (PEBLS). We have also used multiple sequence alignments obtained from PSI-BLAST and secondary structure information predicted by PSIPRED. The training and testing of all methods have been performed on a data set of 193 non-homologous protein X-ray structures using five-fold cross-validation. It has been observed that ANN with multiple sequence alignment and predicted secondary structure information outperforms other methods. Based on our observations we have developed an ANN-based method for predicting alpha-turns in proteins. The main components of the method are two feed-forward back-propagation networks with a single hidden layer. The first sequence-structure network is trained with the multiple sequence alignment in the form of PSI-BLAST-generated position specific scoring matrices. The initial predictions obtained from the first network and PSIPRED predicted secondary structure are used as input to the second structure-structure network to refine the predictions obtained from the first net. The final network yields an overall prediction accuracy of 78.0% and MCC of 0.16. A web server AlphaPred (http://www.imtech.res.in/raghava/alphapred/) has been developed based on this approach.
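The abstract reports both overall accuracy and the Matthews correlation coefficient (MCC). The sketch below shows how the two are computed from the same confusion counts and why they can diverge when the positive class is rare. The counts used are made up for illustration and are not the AlphaPred results; returning 0.0 for an undefined MCC is a common convention, adopted here as an assumption.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all residues that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient in [-1, 1].

    Returns 0.0 when any marginal total is zero (the coefficient is
    undefined there; 0.0 is a common convention).
    """
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


# Illustrative, made-up counts for an imbalanced problem:
# 100 positive residues out of 1000, of which only 10 are found.
toy_counts = dict(tp=10, tn=890, fp=10, fn=90)
toy_accuracy = accuracy(**toy_counts)
toy_mcc = mcc(**toy_counts)
```

With these toy counts the accuracy is 0.90 while the MCC is only about 0.19, which is why MCC is usually quoted alongside accuracy for rare structural classes.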

11.
Due to the structural and functional importance of tight turns, some methods have been proposed to predict gamma-turns, beta-turns, and alpha-turns in proteins. In the past, studies of pi-turns were made, but not a single prediction approach has been developed so far. It will be useful to develop a method for identifying pi-turns in a protein sequence. In this paper, the support vector machine (SVM) method has been introduced to predict pi-turns from the amino acid sequence. The training and testing of this approach is performed with a newly collected data set of 640 non-homologous protein chains containing 1931 pi-turns. Different sequence encoding schemes have been explored in order to investigate their effects on the prediction performance. With multiple sequence alignment and predicted secondary structure, the final SVM model yields a Matthews correlation coefficient (MCC) of 0.556 by a 7-fold cross-validation. A web server implementing the prediction method is available at the following URL: http://210.42.106.80/piturn/.
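A k-fold cross-validation partition of the kind mentioned in this abstract can be sketched as follows. Splitting at the level of whole protein chains, the shuffling, and the fixed seed are assumptions for this example; the abstract only states that 7-fold cross-validation was used on 640 chains.

```python
import random


def k_fold_splits(n_items, k, seed=0):
    """Yield (train_indices, test_indices) for each of k folds.

    Items are shuffled once, then dealt round-robin into k folds, so fold
    sizes differ by at most one and every item is tested exactly once.
    """
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# 640 chains and 7 folds, the figures given in the abstract.
toy_splits = list(k_fold_splits(640, 7))
```

Partitioning by chain rather than by residue keeps residues from the same protein out of both the training and the test set of a fold, which avoids an optimistic bias in the reported score.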

12.

Background  

DNA recognition by proteins is one of the most important processes in living systems. Therefore, understanding the recognition process in general, and identifying mutual recognition sites in proteins and DNA in particular, carries great significance. The sequence and structural dependence of DNA-binding sites in proteins has led to the development of successful machine learning methods for their prediction. However, all existing machine learning methods predict DNA-binding sites, irrespective of their target sequence and hence, none of them is helpful in identifying specific protein-DNA contacts. In this work, we formulate the problem of predicting specific DNA-binding sites in terms of contacts between the residue environments of proteins and the identity of a mononucleotide or a dinucleotide step in DNA. The aim of this work is to take a protein sequence or structural features as inputs and predict for each amino acid residue if it binds to DNA at locations identified by one of the four possible mononucleotides or one of the 10 unique dinucleotide steps. Contact predictions are made at various levels of resolution viz. in terms of side chain, backbone and major or minor groove atoms of DNA.
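The figure of 10 unique dinucleotide steps follows from treating a step and its reverse complement as the same step in double-stranded DNA: of the 16 ordered dinucleotides, 4 are their own reverse complement and the remaining 12 pair up into 6. A short sketch of that count is given below; the choice of the alphabetically smaller string as the canonical name is an assumption made for the example.

```python
from itertools import product

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(step):
    """Reverse complement of a DNA string, e.g. 'AC' -> 'GT'."""
    return "".join(COMPLEMENT[base] for base in reversed(step))


def unique_dinucleotide_steps():
    """Return the dinucleotide steps that remain distinct once a step and
    its reverse complement are treated as the same step."""
    seen = set()
    for first, second in product("ACGT", repeat=2):
        step = first + second
        # Canonical name: the alphabetically smaller of the two strands.
        seen.add(min(step, reverse_complement(step)))
    return sorted(seen)


toy_steps = unique_dinucleotide_steps()
```

Each residue can then be given one binary target per mononucleotide (4 targets) or per unique step (10 targets), which matches the prediction problem as the abstract formulates it.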

13.

Background  

Membrane proteins are estimated to represent about 25% of open reading frames in fully sequenced genomes. However, the experimental study of proteins remains difficult. Considerable efforts have thus been made to develop prediction methods. Most of these were conceived to detect transmembrane helices in polytopic proteins. Alternatively, a membrane protein can be monotopic and anchored via an amphipathic helix inserted in a parallel way to the membrane interface, so-called in-plane membrane (IPM) anchors. This type of membrane anchor is still poorly understood and no suitable prediction method is currently available.

14.
15.
A detailed knowledge of a protein's functional site is an absolute prerequisite for understanding its mode of action at the molecular level. However, the rapid pace at which sequence and structural information is being accumulated for proteins greatly exceeds our ability to determine their biochemical roles experimentally. As a result, computational methods are required which allow for the efficient processing of the evolutionary information contained in this wealth of data, in particular that related to the nature and location of functionally important sites and residues. The method presented here, referred to as conserved functional group (CFG) analysis, relies on a simplified representation of the chemical groups found in amino acid side-chains to identify functional sites from a single protein structure and a number of its sequence homologues. We show that CFG analysis can fully or partially predict the location of functional sites in approximately 96% of the 470 cases tested and that, unlike other methods available, it is able to tolerate wide variations in sequence identity. In addition, we discuss its potential in a structural genomics context, where automation, scalability and efficiency are critical, and an increasing number of protein structures are determined with no prior knowledge of function. This is exemplified by our analysis of the hypothetical protein Ydde_Ecoli, whose structure was recently solved by members of the North East Structural Genomics consortium. Although the proposed active site for this protein needs to be validated experimentally, this example illustrates the scope of CFG analysis as a general tool for the identification of residues likely to play an important role in a protein's biochemical function. Thus, our method offers a convenient solution to rapidly and automatically process the vast amounts of data that are beginning to emerge from structural genomics projects.

16.
17.
18.

Background  

The majority of peptide bonds in proteins are found to occur in the trans conformation. However, for proline residues, a considerable fraction of prolyl peptide bonds adopt the cis form. Proline cis/trans isomerization is known to play a critical role in protein folding, splicing, cell signaling and transmembrane active transport. Accurate prediction of proline cis/trans isomerization in proteins would have many important applications towards the understanding of protein structure and function.

19.
20.
Lu CH  Lin YF  Lin JJ  Yu CS 《PloS one》2012,7(6):e39252
The structure of a protein determines its function and its interactions with other factors. Regions of proteins that interact with ligands, substrates, and/or other proteins tend to be conserved both in sequence and structure, and the residues involved are usually in close spatial proximity. More than 70,000 protein structures are currently found in the Protein Data Bank, and approximately one-third contain metal ions essential for function. Identifying and characterizing metal ion-binding sites experimentally is time-consuming and costly. Many computational methods have been developed to identify metal ion-binding sites, and most use only sequence information. For the work reported herein, we developed a method that uses sequence and structural information to predict the residues in metal ion-binding sites. Six types of metal ion-binding templates (those involving Ca(2+), Cu(2+), Fe(3+), Mg(2+), Mn(2+), and Zn(2+)) were constructed using the residues within 3.5 Å of the center of the metal ion. Using the fragment transformation method, we then compared known metal ion-binding sites with the templates to assess the accuracy of our method. Our method achieved an overall 94.6% accuracy with a true positive rate of 60.5% at a 5% false positive rate and therefore constitutes a significant improvement in metal-binding site prediction.
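Selecting the residues that make up a binding template by a distance cutoff from the metal ion can be sketched as follows. This only illustrates the 3.5 Å cutoff mentioned in the abstract and does not implement the fragment transformation method. Treating a residue as selected when any one of its atoms lies within the cutoff, the input tuple layout, and the toy coordinates are assumptions.

```python
import math


def residues_near_metal(metal_xyz, atoms, cutoff=3.5):
    """Return sorted residue identifiers with at least one atom within
    `cutoff` angstroms of the metal ion centre.

    atoms: iterable of (residue_id, atom_name, (x, y, z)).
    """
    near = set()
    for residue_id, _atom_name, xyz in atoms:
        if math.dist(metal_xyz, xyz) <= cutoff:
            near.add(residue_id)
    return sorted(near)


# Toy coordinates (made up), with the metal ion placed at the origin.
toy_metal = (0.0, 0.0, 0.0)
toy_atoms = [
    ("HIS94", "NE2", (2.1, 0.0, 0.0)),
    ("HIS94", "CE1", (3.0, 0.0, 0.0)),
    ("HIS96", "NE2", (0.0, 2.0, 0.0)),
    ("GLU106", "OE1", (0.0, 0.0, 5.0)),
]
toy_site = residues_near_metal(toy_metal, toy_atoms)
```

In this toy case the two histidines fall inside the cutoff and the glutamate, at 5.0 Å, does not.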
