Similar literature
Found 20 similar articles (search time: 93 ms)
1.
Lu CH, Lin YF, Lin JJ, Yu CS. PLoS One 2012;7(6):e39252
The structure of a protein determines its function and its interactions with other factors. Regions of proteins that interact with ligands, substrates, and/or other proteins tend to be conserved both in sequence and structure, and the residues involved are usually in close spatial proximity. More than 70,000 protein structures are currently found in the Protein Data Bank, and approximately one-third contain metal ions essential for function. Identifying and characterizing metal ion-binding sites experimentally is time-consuming and costly. Many computational methods have been developed to identify metal ion-binding sites, and most use only sequence information. For the work reported herein, we developed a method that uses sequence and structural information to predict the residues in metal ion-binding sites. Six types of metal ion-binding templates (those involving Ca2+, Cu2+, Fe3+, Mg2+, Mn2+, and Zn2+) were constructed using the residues within 3.5 Å of the center of the metal ion. Using the fragment transformation method, we then compared known metal ion-binding sites with the templates to assess the accuracy of our method. Our method achieved an overall 94.6% accuracy with a true positive rate of 60.5% at a 5% false positive rate and therefore constitutes a significant improvement in metal-binding site prediction.
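The template construction step described above (collecting residues within 3.5 Å of the metal ion center) can be illustrated with a minimal Python sketch. The function name, the `(residue_id, xyz)` atom representation, and the toy coordinates are assumptions made for illustration; they are not taken from the paper.

```python
import math

def residues_near_metal(atoms, metal_xyz, cutoff=3.5):
    """Return sorted ids of residues with at least one atom within `cutoff` Å of the metal."""
    near = set()
    for res_id, xyz in atoms:
        if math.dist(xyz, metal_xyz) <= cutoff:
            near.add(res_id)
    return sorted(near)

# Hypothetical toy coordinates (Å); real input would come from a PDB file.
atoms = [
    ("HIS94", (1.0, 0.0, 2.0)),
    ("HIS96", (0.0, 2.1, 0.0)),
    ("GLU106", (3.0, 3.0, 3.0)),
    ("THR199", (6.0, 0.0, 0.0)),
]
zinc = (0.0, 0.0, 0.0)
template_residues = residues_near_metal(atoms, zinc)
print(template_residues)
```

With these toy values only the two histidines fall inside the 3.5 Å sphere; widening the cutoff would pull in more distant residues.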

2.
Identification of functionally important sites (FIS) in proteins is a critical problem and can have profound importance where protein structural information is limited. Machine learning techniques have been very useful in successful classification of many important biological problems. In this paper, we adopt the sparse kernel least squares classifiers (SKLSC) approach for classification and/or prediction of FIS using protein sequence derived features. The SKLSC algorithm was applied to 5435 FIS that have been extracted from 312 reliable alignments for a wide range of protein families. We obtained 68.28% sensitivity and 68.66% specificity for the training dataset and 65.34% sensitivity and 66.88% specificity for the testing dataset. Further, a large scale benchmarking study using alignments of 101 protein families containing 1899 FIS showed that our method achieved an average of ∼70% sensitivity in predicting different types of FIS, such as active sites, metal, ligand or protein binding sites. Our findings also indicate that active sites and metal binding sites are comparatively easier to predict than ligand and protein binding sites. Despite moderate success, our results suggest the usefulness and potential of the SKLSC approach in prediction of FIS using only protein sequence derived information.
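For readers unfamiliar with the two metrics reported above, the following sketch shows how sensitivity and specificity are computed from binary labels. It uses the standard definitions; the function name and the toy labels are illustrative assumptions, and the code assumes both classes occur in `y_true`.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, where 1 marks a functional site."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical toy labels: four true sites, four non-sites.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 0]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(sens, spec)
```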

3.
DNA mismatch repair (MMR) is responsible for correcting replication errors. MutLα, one of the main players in MMR, has been recently shown to harbor an endonuclease/metal-binding activity, which is important for its function in vivo. This endonuclease activity has been confined to the C-terminal domain of the hPMS2 subunit of the MutLα heterodimer. In this work, we identify a striking sequence-structure similarity of hPMS2 to the metal-binding/dimerization domain of the iron-dependent repressor protein family and present a structural model of the metal-binding domain of MutLα. According to our model, this domain of MutLα comprises at least three highly conserved sequence motifs, which are also present in most MutL homologs from bacteria that do not rely on the endonuclease activity of MutH for strand discrimination. Furthermore, based on our structural model, we predict that MutLα is a zinc ion binding protein and confirm this prediction by way of biochemical analysis of zinc ion binding using the full-length and C-terminal domain of MutLα. Finally, we demonstrate that the conserved residues of the metal ion binding domain are crucial for MMR activity of MutLα in vitro.

4.
Kinch LN, Grishin NV. Proteins 2002;48(1):75-84
Nitrogen regulatory (PII) proteins are signal transduction molecules involved in controlling nitrogen metabolism in prokaryotes. PII proteins integrate the signals of intracellular nitrogen and carbon status into the control of enzymes involved in nitrogen assimilation. Using elaborate sequence similarity detection schemes, we show that five clusters of orthologs (COGs) and several small divergent protein groups belong to the PII superfamily and predict their structure to be a (βαβ)2 ferredoxin-like fold. Proteins from the newly emerged PII superfamily are present in all major phylogenetic lineages. The PII homologs are quite diverse, with below random (as low as 1%) pairwise sequence identities between some members of distant groups. Despite this sequence diversity, evidence suggests that the different subfamilies retain the PII trimeric structure important for ligand-binding site formation and maintain a conservation of conservations at residue positions important for PII function. Because most of the orthologous groups within the PII superfamily are composed entirely of hypothetical proteins, our remote homology-based structure prediction provides the only information about them. Analogous to structural genomics efforts, such prediction gives clues to the biological roles of these proteins and allows us to hypothesize about locations of functional sites on model structures or rationalize about available experimental information. For instance, conserved residues in one of the families map in close proximity to each other on the PII structure, allowing for a possible metal-binding site in the proteins coded by the locus known to affect sensitivity to divalent metal ions. The presented analysis pushes the limits of sequence similarity searches and exemplifies one of the extreme cases of reliable sequence-based structure prediction. In conjunction with structural genomics efforts to shed light on protein function, our strategies make it possible to detect homology between highly diverse sequences and are aimed at understanding the most remote evolutionary connections in the protein world.

5.
Zinc is one of the most abundant catalytic cofactors and also an important structural component of a large number of metalloproteins. Hence prediction of zinc-binding sites in proteins can be a significant step in annotating the molecular function of a large number of proteins. The majority of existing methods for zinc-binding site prediction are based on a dataset of proteins that was compiled nearly a decade ago. Hence there is a need to develop a zinc-binding site prediction system using current, updated data that include recently added proteins. Herein, we propose a support vector machine-based method, named ZincBinder, for prediction of zinc-binding sites in a protein using sequence profile information. The predictor was trained using a fivefold cross-validation approach and achieved 85.37% sensitivity with 86.20% specificity during training. Benchmarking on an independent non-redundant dataset, which was not used during training, showed better performance of ZincBinder vis-à-vis existing methods. Executable versions, source code, sample datasets, and usage instructions are available at http://proteininformatics.org/mkumar/znbinder/
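The fivefold cross-validation mentioned above partitions the data into five folds, each used once for testing while the rest are used for training. A minimal sketch of the index splitting follows; it is a generic illustration, not ZincBinder's code, and it omits shuffling and class stratification, which a real pipeline would normally add.

```python
def k_fold_splits(n_samples, k=5):
    """Yield (train_indices, test_indices) for k contiguous, near-equal folds."""
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size

for train, test in k_fold_splits(10, k=5):
    print("train:", train, "test:", test)
```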

6.
He H, McAllister G, Smith TF. Proteins 2002;48(4):654-663
We have constructed, in a completely automated fashion, a new structure template library for threading that represents 358 distinct SCOP folds where each model is mathematically represented as a Hidden Markov model (HMM). Because the large number of models in the library can potentially dilute the prediction measure, a new triage method for fold prediction is employed. In the first step of the triage method, the most probable structural class is predicted using a set of manually constructed, high-level, generalized structural HMMs that represent seven general protein structural classes: all-alpha, all-beta, alpha/beta, alpha+beta, irregular small metal-binding, transmembrane beta-barrel, and transmembrane alpha-helical. In the second step, only those fold models belonging to the determined structural class are selected for the final fold prediction. This triage method gave more predictions as well as more correct predictions compared with a simple prediction method that lacks the initial classification step. Two different schemes of assigning Bayesian model priors are presented and discussed.
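The two-step triage described above can be sketched as follows. The scores stand in for HMM log-likelihoods and are invented for illustration, as are the function name and the small fold-to-class mapping; the sketch only shows the control flow (pick a class first, then restrict the fold search to that class).

```python
def triage_predict(class_scores, fold_scores, fold_to_class):
    """Step 1: choose the best-scoring structural class.
    Step 2: choose the best-scoring fold among folds of that class only."""
    best_class = max(class_scores, key=class_scores.get)
    candidates = {f: s for f, s in fold_scores.items() if fold_to_class[f] == best_class}
    best_fold = max(candidates, key=candidates.get) if candidates else None
    return best_class, best_fold

# Hypothetical scores (higher is better) for one query sequence.
class_scores = {"all-alpha": -120.0, "alpha/beta": -95.5}
fold_scores = {"globin-like": -80.0, "TIM barrel": -90.0, "Rossmann": -85.0}
fold_to_class = {
    "globin-like": "all-alpha",
    "TIM barrel": "alpha/beta",
    "Rossmann": "alpha/beta",
}
print(triage_predict(class_scores, fold_scores, fold_to_class))
```

In this toy case the globally best fold score belongs to a class that was rejected in step 1, so the triage returns a different fold than a flat search over all models would.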

7.
The geometry of metal coordination by proteins is well understood, but the evolution of metal binding sites has been less studied. Here we present a study on a small number of well-documented structural calcium and zinc binding sites, concerning how the geometry diverges between relatives, how often nonrelatives converge towards the same structure, and how often these metal binding sites are lost in the course of evolution. Both calcium and zinc binding site structure is observed to be conserved; structural differences between those atoms directly involved in metal binding in related proteins are typically less than 0.5 Å root mean square deviation, even in distant relatives. Structural templates representing these conserved calcium and zinc binding sites were used to search the Protein Data Bank for cases where unrelated proteins have converged upon the same residue selection and geometry for metal binding. This allowed us to identify six "archetypal" metal binding site structures: two archetypal zinc binding sites, both of which had independently evolved on a large number of occasions, and four diverse archetypal calcium binding sites, where each had evolved independently on only a handful of occasions. We found that it was common for distant relatives of metal-binding proteins to lack metal-binding capacity. This occurred for 13 of the 18 metal binding sites we studied, even though in some of these cases the original metal had been classified as "essential for protein folding." For most of the calcium binding sites studied (seven out of eleven cases), the lack of metal binding in relatives was due to point mutation of the metal-binding residues, whilst for zinc binding sites, lack of metal binding in relatives always involved more extensive changes, with loss of secondary structural elements or loops around the binding site.
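The root mean square deviation used above as the measure of structural difference can be computed as in the sketch below. It assumes the two atom sets are already matched one-to-one and superposed; a real comparison would first find the optimal superposition (for example with the Kabsch algorithm), which is not shown here. The coordinates are invented.

```python
import math

def rmsd(coords_a, coords_b):
    """RMSD (Å) between two equal-length lists of matched 3D points; no superposition is done."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))

# Hypothetical liganding-atom positions in two related binding sites.
site_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
site_b = [(0.0, 0.0, 0.3), (1.0, 0.4, 0.0)]
print(round(rmsd(site_a, site_b), 3))
```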

8.
Goyal K, Mande SC. Proteins 2008;70(4):1206-1218
High throughput structural genomics efforts have been making the structures of proteins available even before their function has been fully characterized. Therefore, methods that exploit the structural knowledge to provide evidence about the functions of proteins would be useful. Such methods would be needed to complement the sequence-based function annotation approaches. The current study describes generation of 3D-structural motifs for metal-binding sites from the known metalloproteins. It then scans all the available protein structures in the PDB database for putative metal-binding sites. Our analysis predicted more than 1000 novel metal-binding sites in proteins using three-residue templates, and more than 150 novel metal-binding sites using four-residue templates. Prediction of a metal-binding site in the yeast protein YDR533c led to the hypothesis that it might function as a metal-dependent amidopeptidase. The structural motifs identified by our method present novel metal-binding sites that reveal newer mechanisms for a few well-known proteins.

9.
BACKGROUND: For many metalloproteins, sequence motifs characteristic of metal-binding sites have not been found or are so short that they would not be expected to be metal-specific. Striking examples of such metalloproteins are those containing Mg2+, one of the most versatile metal cofactors in cellular biochemistry. Even when Mg2+-proteins share insufficient sequence homology to identify Mg2+-specific sequence motifs, they may still share similarity in the Mg2+-binding site structure. However, no structural motifs characteristic of Mg2+-binding sites have been reported. Thus, our aims are (i) to develop a general method for discovering structural patterns/motifs characteristic of ligand-binding sites, given the 3D protein structures, and (ii) to apply it to Mg2+-proteins sharing <30% sequence identity. Our motif discovery method employs structural alphabet encoding to convert 3D structures to the corresponding 1D structural letter sequences, where the Mg2+-structural motifs are identified as recurring structural patterns.
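Once 3D structures are encoded as 1D structural letter sequences, finding recurring patterns reduces to a string problem. The abstract does not say how recurrence is detected, so the sketch below makes an assumption: it counts fixed-length substrings (k-mers) and keeps those present in at least `min_support` sequences. The letter sequences are invented.

```python
from collections import Counter

def recurring_patterns(letter_sequences, k=3, min_support=2):
    """Return {pattern: number of sequences containing it} for k-mers meeting min_support."""
    support = Counter()
    for seq in letter_sequences:
        # Use a set so each pattern is counted at most once per sequence.
        kmers = {seq[i:i + k] for i in range(len(seq) - k + 1)}
        support.update(kmers)
    return {pattern: n for pattern, n in support.items() if n >= min_support}

# Hypothetical structural-letter strings around three binding sites.
encoded_sites = ["abcde", "xbcdy", "zzbcd"]
print(recurring_patterns(encoded_sites, k=3, min_support=3))
```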

10.
Over one-third of protein structures contain metal ions, which are necessary elements in living systems. Traditionally, structural biologists have investigated the properties of metalloproteins (proteins that bind metal ions) by physical means, interpreting the function and reaction mechanisms of enzymes from their structures and from in vitro experiments. However, most proteins have only primary structures (amino acid sequences) available; 3-dimensional structures are not always available. In this paper, a direct analysis method is proposed to predict the metal-binding amino acid residues of a protein from its sequence information alone, using neural networks with sliding window-based feature extraction and biological feature encoding techniques. For four major bulk elements (calcium, potassium, magnesium, and sodium), the proposed method identifies the metal-binding residues with higher than 90% sensitivity and very good accuracy under 5-fold cross validation. With such promising results, it can be extended and used as a powerful methodology for metal-binding characterization of the rapidly increasing number of protein sequences in the future.
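Sliding window-based feature extraction, as named above, gives each residue a fixed-size context centered on it. The sketch below shows only the windowing step; the window size, the padding symbol `X`, and the function name are assumptions, and the paper's biological feature encoding is not reproduced.

```python
def sliding_windows(sequence, window=3, pad="X"):
    """Return one window per residue, centered on it and padded at the sequence ends."""
    if window % 2 == 0:
        raise ValueError("window must be odd so that it has a central residue")
    half = window // 2
    padded = pad * half + sequence + pad * half
    return [padded[i:i + window] for i in range(len(sequence))]

# Each window would then be encoded numerically and fed to the classifier.
print(sliding_windows("MKHC", window=3))
```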

11.
12.
Phospho-Ser/Thr protein phosphatases (PPs) are dinuclear metalloenzymes classed into two large families, PPP and PPM, on the basis of sequence similarity and metal ion dependence. The archetype of the PPM family is the α isoform of human PP2C (PP2Cα), which folds into an α/β domain similar to those of PPP enzymes. The recent structural studies of three bacterial PPM phosphatases, Mycobacterium tuberculosis MtPstP, Mycobacterium smegmatis MspP, and Streptococcus agalactiae STP, confirmed the conservation of the overall fold and dinuclear metal center in the family, but surprisingly revealed the presence of a third conserved metal-binding site in the active site. To gain insight into the roles of the three-metal center in bacterial enzymes, we report structural and metal-binding studies of MtPstP and MspP. The structure of MtPstP in a new trigonal crystal form revealed a fully active enzyme with the canonical dinuclear metal center but without the third metal ion bound to the catalytic site. The absence of metal correlates with a partially unstructured flap segment, indicating that the third manganese ion helps reposition the flap but is dispensable for catalysis. Studies of metal binding to MspP using isothermal titration calorimetry revealed that the three Mn2+-binding sites display distinct affinities, with dissociation constants in the nano- and micromolar range for the two catalytic metal ions and a significantly lower affinity for the third metal-binding site. In agreement, the structure of inactive MspP at acidic pH was determined at atomic resolution and shown to lack the third metal ion in the active site. Structural comparisons of all bacterial phosphatases revealed positional variations in the third metal-binding site that are correlated with the presence of bound substrate and the conformation of the flap segment, supporting a role of this metal ion in assisting enzyme-substrate interactions.

13.
A novel metal-binding site has been identified in the hammerhead ribozyme by 31P NMR. The metal-binding site is associated with the A13 phosphate in the catalytic core of the hammerhead ribozyme and is distinct from any previously identified metal-binding sites. 31P NMR spectroscopy was used to measure the metal-binding affinity for this site and leads to an apparent dissociation constant of 250-570 μM at 25 °C for binding of a single Mg2+ ion. The NMR data also show evidence of a structural change at this site upon metal binding and these results are compared with previous data on metal-induced structural changes in the core of the hammerhead ribozyme. These NMR data were combined with the X-ray structure of the hammerhead ribozyme (Pley HW, Flaherty KM, McKay DB. 1994. Nature 372:68-74) to model RNA ligands involved in binding the metal at this A13 site. In this model, the A13 metal-binding site is structurally similar to the previously identified A(g) metal-binding site and illustrates the symmetrical nature of the tandem G·A base pairs in domain 2 of the hammerhead ribozyme. These results demonstrate that 31P NMR represents an important method for both identification and characterization of metal-binding sites in nucleic acids.
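To give a feel for what an apparent dissociation constant of 250-570 μM means, the sketch below evaluates site occupancy under a simple 1:1 binding model, fraction bound = [Mg2+] / (Kd + [Mg2+]). The model choice, the assumption that free Mg2+ is approximately equal to total Mg2+, and the 1 mM example concentration are illustrative assumptions and are not taken from the paper.

```python
def fraction_bound(metal_conc_um, kd_um):
    """Occupancy of a single site for 1:1 binding; both arguments in micromolar."""
    return metal_conc_um / (kd_um + metal_conc_um)

# Occupancy at a hypothetical 1 mM free Mg2+ for the two ends of the reported Kd range.
for kd in (250.0, 570.0):
    print(kd, round(fraction_bound(1000.0, kd), 3))
```

By definition, the site is half occupied when the free metal concentration equals the dissociation constant.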

14.
Protein functional sites control most biological processes and are important targets for drug design and protein engineering. To characterize them, the evolutionary trace (ET) ranks the relative importance of residues according to their evolutionary variations. Generally, top‐ranked residues cluster spatially to define evolutionary hotspots that predict functional sites in structures. Here, various functions that measure the physical continuity of ET ranks among neighboring residues in the structure, or in the sequence, are shown to inform sequence selection and to improve functional site resolution. This is shown first, in 110 proteins, for which the overlap between top‐ranked residues and actual functional sites rose by 8% in significance. Then, on a structural proteomic scale, optimized ET led to better 3D structure‐function motifs (3D templates) and, in turn, to enzyme function prediction by the Evolutionary Trace Annotation (ETA) method with better sensitivity (40% to 53%) and positive predictive value (93% to 94%). This suggests that the similarity of evolutionary importance among neighboring residues in the sequence and in the structure is a universal feature of protein evolution. In practice, this yields a tool for optimizing sequence selections for comparative analysis and, via ET, for better predictions of functional site and function. This should prove useful for the efficient mutational redesign of protein function and for pharmaceutical targeting.

15.
Babor M, Gerzon S, Raveh B, Sobolev V, Edelman M. Proteins 2008;70(1):208-217
Metal ions are crucial for protein function. They participate in enzyme catalysis, play regulatory roles, and help maintain protein structure. Current tools for predicting metal-protein interactions are based on proteins crystallized with their metal ions present (holo forms). However, a majority of resolved structures are free of metal ions (apo forms). Moreover, metal binding is a dynamic process, often involving conformational rearrangement of the binding pocket. Thus, effective predictions need to be based on the structure of the apo state. Here, we report an approach that identifies transition metal-binding sites in apo forms with a resulting selectivity >95%. Applying the approach to apo forms in the Protein Data Bank and structural genomics initiative identifies a large number of previously unknown, putative metal-binding sites, and their amino acid residues, in some cases providing a first clue to the function of the protein.

16.
DNA–protein interactions are involved in many essential biological activities. Because there is no simple mapping code between DNA base pairs and protein amino acids, the prediction of DNA–protein interactions is a challenging problem. Here, we present a novel computational approach for predicting DNA-binding protein residues and DNA–protein interaction modes without knowing the protein's specific DNA target sequence. Given the structure of a DNA-binding protein, the method first generates an ensemble of complex structures obtained by rigid-body docking with a nonspecific canonical B-DNA. Representative models are subsequently selected through clustering and ranking by their DNA–protein interfacial energy. Analysis of these encounter complex models suggests that the recognition sites for specific DNA binding are usually favorable interaction sites for the nonspecific DNA probe and that nonspecific DNA–protein interaction modes exhibit some similarity to specific DNA–protein binding modes. Although the method requires as input the knowledge that the protein binds DNA, in benchmark tests, it achieves better performance in identifying DNA-binding sites than three previously established methods, which are based on sophisticated machine-learning techniques. We further apply our method to protein structures predicted through modeling and demonstrate that our method performs satisfactorily on protein models whose root-mean-square Cα deviation from the native structure is up to 5 Å. This study provides valuable structural insights into how a specific DNA-binding protein interacts with a nonspecific DNA sequence. The similarity between the specific DNA–protein interaction mode and nonspecific interaction modes may reflect an important sampling step in the search by a DNA-binding protein for its specific DNA targets.

17.
BACKGROUND: Metallochaperone proteins function in the trafficking and delivery of essential, yet potentially toxic, metal ions to distinct locations and particular proteins in eukaryotic cells. The Atx1 protein shuttles copper to the transport ATPase Ccc2 in yeast cells. Molecular mechanisms for copper delivery by Atx1 and similar human chaperones have been proposed, but detailed structural characterization is necessary to elucidate how Atx1 binds metal ions and how it might interact with Ccc2 to facilitate metal ion transfer. RESULTS: The 1.02 Å resolution X-ray structure of the Hg(II) form of Atx1 (HgAtx1) reveals the overall secondary structure, the location of the metal-binding site, the detailed coordination geometry for Hg(II), and specific amino acid residues that may be important in interactions with Ccc2. Metal ion transfer experiments establish that HgAtx1 is a functional model for the Cu(I) form of Atx1 (CuAtx1). The metal-binding loop is flexible, changing conformation to form a disulfide bond in the oxidized apo form, the structure of which has been solved to 1.20 Å resolution. CONCLUSIONS: The Atx1 structure represents the first structure of a metallochaperone protein, and is one of the largest unknown structures solved by direct methods. The structural features of the metal-binding site support the proposed Atx1 mechanism in which facile metal ion transfer occurs between metal-binding sites of the diffusible copper-donor and membrane-tethered copper-acceptor proteins. The Atx1 structural motif represents a prototypical metal ion trafficking unit that is likely to be employed in a variety of organisms for different metal ions.

18.
19.
Brylinski M, Skolnick J. Proteins 2011;79(3):735-751
The rapid accumulation of gene sequences, many of which are hypothetical proteins with unknown function, has stimulated the development of accurate computational tools for protein function prediction with evolution/structure‐based approaches showing considerable promise. In this article, we present FINDSITE‐metal, a new threading‐based method designed specifically to detect metal‐binding sites in modeled protein structures. Comprehensive benchmarks using different quality protein structures show that weakly homologous protein models provide sufficient structural information for quite accurate annotation by FINDSITE‐metal. Combining structure/evolutionary information with machine learning results in highly accurate metal‐binding annotations; for protein models constructed by TASSER, whose average Cα RMSD from the native structure is 8.9 Å, 59.5% (71.9%) of the best of top five predicted metal locations are within 4 Å (8 Å) from a bound metal in the crystal structure. For most of the targets, multiple metal‐binding sites are detected with the best predicted binding site at rank 1 and within the top two ranks in 65.6% and 83.1% of the cases, respectively. Furthermore, for iron, copper, zinc, calcium, and magnesium ions, the binding metal can be predicted with high, typically 70% to 90%, accuracy. FINDSITE‐metal also provides a set of confidence indexes that help assess the reliability of predictions. Finally, we describe the proteome‐wide application of FINDSITE‐metal that quantifies the metal‐binding complement of the human proteome. FINDSITE‐metal is freely available to the academic community at http://cssb.biology.gatech.edu/findsite‐metal/ . Proteins 2011. © 2010 Wiley‐Liss, Inc.
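The "best of top five within 4 Å" criterion quoted above can be read as: a target counts as a success if any of its five highest-ranked predicted metal positions lies within the distance cutoff of the experimentally bound metal. The sketch below encodes that reading; the function name and coordinates are assumptions, and this is not code from FINDSITE‐metal.

```python
import math

def top_n_hit(predicted_sites, true_site, top_n=5, cutoff=4.0):
    """True if any of the first `top_n` ranked predictions is within `cutoff` Å of the true metal."""
    return any(math.dist(p, true_site) <= cutoff for p in predicted_sites[:top_n])

# Hypothetical ranked predictions (best first) and the crystallographic metal position.
ranked_predictions = [(10.0, 0.0, 0.0), (0.0, 0.0, 3.0), (0.0, 5.0, 0.0)]
bound_metal = (0.0, 0.0, 0.0)
print(top_n_hit(ranked_predictions, bound_metal, top_n=5, cutoff=4.0))
print(top_n_hit(ranked_predictions, bound_metal, top_n=1, cutoff=4.0))
```

Averaging this boolean over a benchmark set gives a success percentage of the kind reported in the abstract.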

20.
Mismatch string kernels for discriminative protein classification   (cited 1 time in total: 0 self-citations, 1 citation by others)
MOTIVATION: Classification of protein sequences into functional and structural families based on sequence homology is a central problem in computational biology. Discriminative supervised machine learning approaches provide good performance, but simplicity and computational efficiency of training and prediction are also important concerns. RESULTS: We introduce a class of string kernels, called mismatch kernels, for use with support vector machines (SVMs) in a discriminative approach to the problem of protein classification and remote homology detection. These kernels measure sequence similarity based on shared occurrences of fixed-length patterns in the data, allowing for mutations between patterns. Thus, the kernels provide a biologically well-motivated way to compare protein sequences without relying on family-based generative models such as hidden Markov models. We compute the kernels efficiently using a mismatch tree data structure, allowing us to calculate the contributions of all patterns occurring in the data in one pass while traversing the tree. When used with an SVM, the kernels enable fast prediction on test sequences. We report experiments on two benchmark SCOP datasets, where we show that the mismatch kernel used with an SVM classifier performs competitively with state-of-the-art methods for homology detection, particularly when very few training examples are available. Examination of the highest-weighted patterns learned by the SVM classifier recovers biologically important motifs in protein families and superfamilies.
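The idea of a mismatch kernel (counting shared fixed-length patterns while tolerating up to m substitutions) can be shown with a small brute-force sketch. Note that this is not the mismatch tree computation described in the abstract: it enumerates mismatch neighborhoods explicitly, which is only practical for tiny k, m, and alphabets. Function names and the two-letter alphabet are illustrative assumptions.

```python
from collections import Counter
from itertools import combinations, product

def neighborhood(kmer, m, alphabet):
    """All strings over `alphabet` within Hamming distance <= m of `kmer`."""
    result = {kmer}
    for d in range(1, m + 1):
        for positions in combinations(range(len(kmer)), d):
            for letters in product(alphabet, repeat=d):
                chars = list(kmer)
                for pos, letter in zip(positions, letters):
                    chars[pos] = letter
                result.add("".join(chars))
    return result

def mismatch_features(seq, k, m, alphabet):
    """Feature map: each k-mer in seq adds 1 to every pattern in its mismatch neighborhood."""
    feats = Counter()
    for i in range(len(seq) - k + 1):
        for pattern in neighborhood(seq[i:i + k], m, alphabet):
            feats[pattern] += 1
    return feats

def mismatch_kernel(x, y, k, m, alphabet):
    """Inner product of the two mismatch feature vectors."""
    fx = mismatch_features(x, k, m, alphabet)
    fy = mismatch_features(y, k, m, alphabet)
    return sum(count * fy[pattern] for pattern, count in fx.items())

# Toy two-letter alphabet; protein sequences would use the 20 amino acids.
print(mismatch_kernel("AAB", "ABB", k=2, m=0, alphabet="AB"))
print(mismatch_kernel("AAB", "ABB", k=2, m=1, alphabet="AB"))
```

With m = 0 the kernel reduces to counting exactly shared k-mers; allowing one mismatch raises the similarity because near-identical patterns now contribute.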


Copyright © Beijing Qinyun Technology Development Co., Ltd.  京ICP备09084417号