Similar literature
 Found 20 similar articles (search time: 31 ms)
1.
Structural genomics is on a quest for the structure and function of a significant fraction of gene products. Current efforts are focusing on structure determination of single-domain proteins, which can readily be targeted by X-ray crystallography, NMR spectroscopy and computational homology modeling. However, comprehensive association of gene products with functions also requires systematic determination of more complex protein structures and other biomolecules participating in cellular processes such as nucleic acids, and characterization of biomolecular interactions and dynamics relevant to function. Such NMR investigations are becoming more feasible, not only due to recent advances in NMR methodology, but also because structural genomics is providing valuable structural information and new experimental and computational tools. The measurement of residual dipolar couplings in partially oriented systems and other new NMR methods will play an important role in this synergistic relationship between NMR and structural genomics. Both an expansion in the domain of NMR application, and important contributions to future structural genomics efforts can be anticipated.

2.
Assigning function to structures is an important aspect of structural genomics projects, since they frequently provide structures for uncharacterized proteins. Similarities uncovered by structure alignment can suggest a similar function, even in the absence of sequence similarity. For proteins adopting novel folds or those with many functions, this strategy can fail, but functional clues can still come from comparison of local functional sites involving a few key residues. Here we assess the general applicability of functional site comparison through the study of 157 proteins solved by structural genomics initiatives. For 17, the method bolsters confidence in predictions made based on overall fold similarity. For another 12 with new folds, it suggests functions, including a putative phosphotyrosine binding site in the Archaeal protein Mth1187 and an active site for a ribose isomerase. The approach is applied weekly to all new structures, providing a resource for those interested in using structure to infer function.

3.
The rapid growth in protein structural data and the emergence of structural genomics projects have increased the need for automatic structure analysis and tools for function prediction. Small molecule recognition is critical to the function of many proteins; therefore, determination of ligand binding site similarity is important for understanding ligand interactions and may allow their functional classification. Here, we present a binding sites database (SitesBase) that, given a known protein-ligand binding site, allows rapid retrieval of other binding sites with similar structure independent of overall sequence or fold similarity. However, each match is also annotated with sequence similarity and fold information to aid interpretation of structure and functional similarity. Similarity in ligand binding sites can indicate common binding modes and recognition of similar molecules, allowing potential inference of function for an uncharacterised protein or providing additional evidence of common function where sequence or fold similarity is already known. Alternatively, the resource can provide valuable information for detailed studies of molecular recognition, including structure-based ligand design and understanding ligand cross-reactivity. Here, we show examples of atomic similarity between superfamily or more distant fold relatives as well as between seemingly unrelated proteins. Assignment of unclassified proteins to structural superfamilies is also undertaken and in most cases substantiates assignments made using sequence similarity. Correct assignment is also possible where sequence similarity fails to find significant matches, illustrating the potential use of binding site comparisons for newly determined proteins.

4.
The myosin superfamily is diverse in its structure, kinetic mechanisms and cellular function. The enzymatic activities of most myosins are regulated by some means such as Ca2+ ion binding, phosphorylation or binding of other proteins. In the present review, we discuss the structural basis for the regulation of mammalian myosin 5a and Drosophila myosin 7a. We show that, although both myosins have a folded inactive state in which domains in the myosin tail interact with the motor domain, the details of the regulation of these two myosins differ greatly.

5.
A new approach to the functional classification of protein 3D structures is described with application to some examples from structural genomics. This approach is based on functional site prediction with THEMATICS and POOL. THEMATICS employs calculated electrostatic potentials of the query structure. POOL is a machine learning method that utilizes THEMATICS features and has been shown to predict accurate, precise, highly localized interaction sites. Extension to the functional classification of structural genomics proteins is now described. Predicted functionally important residues are structurally aligned with those of proteins with previously characterized biochemical functions. A 3D structure match at the predicted local functional site then serves as a more reliable predictor of biochemical function than an overall structure match. Annotation is confirmed for a structural genomics protein with the ribulose phosphate binding barrel (RPBB) fold. A putative glucoamylase from Bacteroides fragilis (PDB ID 3eu8) is shown to be, in fact, probably not a glucoamylase. Finally, a structural genomics protein from Streptomyces coelicolor annotated as an enoyl-CoA hydratase (PDB ID 3g64) is shown to be misannotated. Its predicted active site does not match those of the well-characterized enoyl-CoA hydratases of similar structure but rather bears closer resemblance to those of a dehalogenase with similar fold.

6.
Inferring protein functions from structures is a challenging task, as a large number of orphan protein structures from structural genomics projects are now solved without their biochemical functions characterized. For proteins binding to similar substrates or ligands and carrying out similar functions, their binding surfaces are under similar physicochemical constraints, and hence the sets of allowed and forbidden residue substitutions are similar. However, it is difficult to isolate such selection pressure due to protein function from selection pressure due to protein folding, and the evolutionary relationship reflected by global sequence and structure similarities between proteins is often unreliable for inferring protein function. We have developed a method, called pevoSOAR (pocket-based evolutionary search of amino acid residues), for predicting protein functions by uncovering the amino acid residue substitution pattern due to protein function and separating it from the amino acid substitution pattern due to protein folding. We incorporate evolutionary information specific to an individual binding region and match local surfaces on a large scale with millions of precomputed protein surfaces to identify those with similar functions. Our pevoSOAR method also generates a probabilistic model called the computed binding a profile that characterizes protein-binding activities that may involve multiple substrates or ligands. We show that our method can be used to predict enzyme functions with accuracy. Our method can also assess enzyme binding specificity and promiscuity. In an objective large-scale test of 100 enzyme families with thousands of structures, our predictions are found to be sensitive and specific: at the stringent specificity level of 99.98%, we can correctly predict enzyme functions for 80.55% of the proteins. The overall area under the receiver operating characteristic curve measuring the performance of our prediction is 0.955, close to the perfect value of 1.00. The best Matthews coefficient is 86.6%. Our method also works well in predicting the biochemical functions of orphan proteins from structural genomics projects.
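
The sensitivity, specificity and Matthews coefficient quoted in this abstract are all derived from a binary confusion matrix. The following is a minimal sketch of those standard formulas; the function name and the counts are hypothetical and chosen only for illustration, they are not data from the study.

```python
import math

def confusion_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, Matthews correlation coefficient)."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention the coefficient is reported as 0 when a margin is empty.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sensitivity, specificity, mcc

# Hypothetical counts (an assumption for illustration, not from the paper).
sens, spec, mcc = confusion_metrics(tp=80, fp=2, tn=98, fn=20)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} MCC={mcc:.3f}")
```

The Matthews coefficient is often preferred over plain accuracy when positives and negatives are very unequal in number, which is typical of function-prediction benchmarks.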

7.
In all sequenced genomes, a large fraction of predicted genes encodes proteins of unknown biochemical function and up to 15% of the genes with "known" function are mis-annotated. Several global approaches are routinely employed to predict function, including sophisticated sequence analysis, gene expression, protein interaction, and protein structure. In the first coupling of genomics and enzymology, Phizicky and colleagues undertook a screen for specific enzymes using large pools of partially purified proteins and specific enzymatic assays. Here we present an overview of the further developments of this approach, which involve the use of general enzymatic assays to screen individually purified proteins for enzymatic activity. The assays have relaxed substrate specificity and are designed to identify the subclass or sub-subclasses of enzymes (phosphatase, phosphodiesterase/nuclease, protease, esterase, dehydrogenase, and oxidase) to which the unknown protein belongs. Further biochemical characterization of proteins can be facilitated by the application of secondary screens with natural substrates (substrate profiling). We demonstrate here the feasibility and merits of this approach for hydrolases and oxidoreductases, two very broad and important classes of enzymes. Application of general enzymatic screens and substrate profiling can greatly speed up the identification of biochemical function of unknown proteins and the experimental verification of functional predictions produced by other functional genomics approaches.

8.
Function prediction frequently relies on comparing genes or gene products to search for relevant similarities. Because the number of protein structures with unknown function is mushrooming, however, we asked here whether such comparisons could be improved by focusing narrowly on the key functional features of protein structures, as defined by the Evolutionary Trace (ET). Therefore a series of algorithms was built (a) to extract local motifs (3D templates) from protein structures based on ET ranking of residue importance; (b) to assess their geometric and evolutionary similarity to other structures; and (c) to transfer enzyme annotation whenever a plurality was reached across matches. Whereas a prototype had only been 80% accurate and was not scalable, here a speedy new matching algorithm enabled large-scale searches for reciprocal matches and thus raised annotation specificity to 100% in both positive and negative controls of 49 enzymes and 50 non-enzymes, respectively (in one case even identifying an annotation error), while maintaining sensitivity (approximately 60%). Critically, this Evolutionary Trace Annotation (ETA) pipeline requires no prior knowledge of functional mechanisms. It could thus be applied in a large-scale retrospective study of 1218 structural genomics enzymes and reached 92% accuracy. Likewise, it was applied to all 2935 unannotated structural genomics proteins and predicted enzymatic functions in 320 cases: 258 on first pass and 62 more on second pass. Controls and initial analyses suggest that these predictions are reliable. Thus the large-scale evolutionary integration of sequence-structure-function data, here through reciprocal identification of local, functionally important structural features, may contribute significantly to de-orphaning the structural proteome.
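
Two ideas in this abstract, keeping only reciprocal matches and transferring an annotation when a plurality of matches agree, can be sketched generically. The abstract does not describe how the ETA pipeline implements them, so the function names, the data layout and the example labels below are all assumptions made for illustration.

```python
from collections import Counter

def reciprocal_matches(forward, backward):
    """Keep pairs (a, b) where a matches b and b also matches a.

    `forward` and `backward` are hypothetical dicts mapping a structure
    identifier to the identifiers it matched in each search direction.
    """
    return {(a, b)
            for a, targets in forward.items()
            for b in targets
            if a in backward.get(b, ())}

def plurality_annotation(match_labels):
    """Return the label with strictly more votes than any other, else None."""
    top = Counter(match_labels).most_common(2)
    if not top:
        return None
    if len(top) == 1 or top[0][1] > top[1][1]:
        return top[0][0]
    return None  # tie: no plurality, so nothing is transferred

# Hypothetical labels standing in for enzyme annotations of matched structures.
print(plurality_annotation(["3.1.3.1", "3.1.3.1", "2.7.1.1"]))
```

Returning `None` on a tie is a deliberately conservative choice in this sketch: it trades sensitivity for specificity, which is the direction of the trade-off the abstract reports.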

9.
Previous studies have demonstrated that human salivary alpha-amylase specifically binds to the oral bacterium Streptococcus gordonii. This interaction is inhibited by substrates such as starch and maltotriose suggesting that bacterial binding may involve the enzymatic site of amylase. Experiments were performed to determine if amylase bound to the bacterial surface possessed enzymatic activity. It was found that over one-half of the bound amylase was enzymatically active. In addition, bacterial-bound amylase hydrolyzed starch to glucose which was then metabolized to lactic acid by the bacteria. In further studies, the role of amylase's histidine residues in streptococcal binding and enzymatic function was assessed after their selective modification with diethyl pyrocarbonate. DEP-modified amylase showed a marked reduction in both enzymatic and streptococcal binding activities. These effects were diminished when DEP modification occurred in the presence of maltotriose. DEP-modified amylase had a significantly altered secondary structure when compared with native enzyme or amylase modified in the presence of maltotriose. Collectively, these results suggest that human salivary alpha-amylase may possess multiple sites for bacterial binding and enzymatic activity which share structural similarities.

10.
Exploring the structure and function paradigm   Total citations: 3, self-citations: 3, citations by others: 0
Advances in protein structure determination, led by the structural genomics initiatives, have increased the proportion of novel folds deposited in the Protein Data Bank. However, these structures are often not accompanied by functional annotations with experimental confirmation. In this review, we reassess the meaning of structural novelty and examine its relevance to the complexity of the structure-function paradigm. Recent advances in the prediction of protein function from structure are discussed, as well as new sequence-based methods for partitioning large, diverse superfamilies into biologically meaningful clusters. Obtaining structural data for these functionally coherent groups of proteins will allow us to better understand the relationship between structure and function.

11.
The genome sequencing projects and knowledge of the entire protein repertoires of many organisms have prompted new procedures and techniques for the large-scale determination of protein structure, function and interactions. Recently, new work has been carried out on the determination of the function and evolutionary relationships of proteins by experimental structural genomics, and the discovery of protein-protein interactions by computational structural genomics.

12.
Structural biology and structural genomics are expected to produce many three-dimensional protein structures in the near future. Each new structure raises questions about its function and evolution. Correct functional and evolutionary classification of a new structure is difficult for distantly related proteins and error-prone using simple statistical scores based on sequence or structure similarity. Here we present an accurate numerical method for the identification of evolutionary relationships (homology). The method is based on the principle that natural selection maintains structural and functional continuity within a diverging protein family. The problem of different rates of structural divergence between different families is solved by first using structural similarities to produce a global map of folds in protein space and then further subdividing fold neighborhoods into superfamilies based on functional similarities. In a validation test against a classification by human experts (SCOP), 77% of homologous pairs were identified with 92% reliability. The method is fully automated, allowing fast, self-consistent and complete classification of large numbers of protein structures. In particular, the discrimination between analogy and homology of close structural neighbors will lead to functional predictions while avoiding overprediction.

13.
Shin DH, Proudfoot M, Lim HJ, Choi IK, Yokota H, Yakunin AF, Kim R, Kim SH. Proteins, 2008, 70(3): 1000-1009
We have determined the crystal structure of DR1281 from Deinococcus radiodurans. DR1281 is a protein of unknown function with over 170 homologs found in prokaryotes and eukaryotes. To elucidate the molecular function of DR1281, its crystal structure at 2.3 Å resolution was determined and a series of biochemical screens for catalytic activity was performed. The crystal structure shows that DR1281 has two domains, a small alpha domain and a putative catalytic domain formed by a four-layered structure of two beta-sheets flanked by five alpha-helices on both sides. The small alpha domain interacts with other molecules in the asymmetric unit and contributes to the formation of oligomers. The structural comparison of the putative catalytic domain with known structures suggested its biochemical function to be a phosphatase, phosphodiesterase, nuclease, or nucleotidase. Structural analyses with its homologues also indicated that there is a dinuclear center at the interface of two domains formed by Asp8, Glu37, Asn38, Asn65, His148, His173, and His175. An absolute requirement of metal ions for activity has been proved by enzymatic assay with various divalent metal ions. A panel of general enzymatic assays of DR1281 revealed metal-dependent catalytic activity toward model substrates for phosphatases (p-nitrophenyl phosphate) and phosphodiesterases (bis-p-nitrophenyl phosphate). Subsequent secondary enzymatic screens with natural substrates demonstrated significant phosphatase activity toward phosphoenolpyruvate and phosphodiesterase activity toward 2',3'-cAMP. Thus, our structural and enzymatic studies have identified the biochemical function of DR1281 as a novel phosphatase/phosphodiesterase and disclosed key conserved residues involved in metal binding and catalytic activity.

14.
Zhao H, Yang Y, Zhou Y. Nucleic Acids Research, 2011, 39(8): 3017-3025
Mechanistic understanding of many key cellular processes often involves identification of RNA binding proteins (RBPs) and RNA binding sites in two separate steps. Here, they are predicted simultaneously by structural alignment to known protein-RNA complex structures followed by binding assessment with a DFIRE-based statistical energy function. This method achieves 98% accuracy and 91% precision for predicting RBPs and 93% accuracy and 78% precision for predicting RNA-binding amino-acid residues for a large benchmark of 212 RNA binding and 6761 non-RNA binding domains (leave-one-out cross-validation). Additional tests revealed that the method makes no false positive prediction from 311 DNA binding domains but correctly detects six domains binding with both DNA and RNA. In addition, it correctly identified 31 of 75 unbound RNA-binding domains with 92% accuracy and 65% precision for predicted binding residues and achieved an 86% success rate in its application to the SCOP (Structural Classification Of Proteins) RNA binding domain superfamily. It further predicts 25 targets as RBPs in 2076 structural genomics targets: 20 of 25 predicted ones (80%) are putatively RNA binding. The superior performance over existing methods indicates the importance of dividing structures into domains, using a Z-score to measure relative structural similarity, and a statistical energy function to measure protein-RNA binding affinity.
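
A Z-score expresses how far one similarity score lies from a background distribution, in units of that distribution's standard deviation. The abstract does not say which background the authors use, so the sketch below assumes a plain list of background scores; the function name and the numbers are hypothetical.

```python
import statistics

def z_score(value, background):
    """Standardize `value` against a list of background scores."""
    mean = statistics.fmean(background)
    sd = statistics.pstdev(background)  # population standard deviation
    if sd == 0:
        raise ValueError("background scores have zero variance")
    return (value - mean) / sd

# Hypothetical background of alignment scores (assumed, not from the paper).
print(round(z_score(7, [1, 2, 3, 4, 5]), 3))
```

Standardizing in this way makes scores comparable across queries of different size, which is one plausible reading of "relative structural similarity" in the abstract.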

15.
Canaves JM. Proteins, 2004, 56(1): 19-27
Recently, the structures of two proteins belonging to the archease family, TM1083 from Thermotoga maritima and MTH1598 from Methanobacterium thermoautotrophicum, have been solved independently by two Protein Structure Initiative structural genomics pilot centers using X-ray crystallography and NMR, respectively. The archease protein family is a good example of one of the paradoxes of structural genomics: Approximately one third of protein structures produced by structural genomics centers have no known function and are still annotated as "hypothetical proteins" in the Protein Data Bank. In the case of archeases, despite the existence of two protein structures and abundant sequence information, there is still no function assigned to this protein family. Here, our group predicts, based on structural similarity, sequence conservation, and gene context analyses, that members of this protein family might function as chaperones or modulators of proteins involved in DNA/RNA processing. The conservation of genomic context for this protein family is constant from Archaea and Bacteria to humans, and suggests that unannotated open reading frames contiguous to them could be novel RNA/DNA binding proteins.

16.
Structural genomics (SG) has significantly increased the number of novel protein structures of targets with medical relevance. In the protein kinase area, SG has contributed >50% of all novel kinase structures during the past three years and determined more than 30 novel catalytic domain structures. Many of the released structures are inhibitor complexes and a number of them have identified new inhibitor binding modes and scaffolds. In addition, generated reagents, assays, and inhibitor screening data provide a diversity of chemogenomic data that can be utilized for early drug development. Here we discuss the currently available structural data for the kinase family considering novel structures as well as inhibitor complexes. Our analysis revealed that the structural coverage of many kinase families is still rather poor, and inhibitor complexes with diverse inhibitors are only available for a few kinases. However, we anticipate that with the current rate of structure determination and high throughput technologies developed by SG programs these gaps will be closed soon. In addition, the generated reagents will put SG initiatives in a unique position providing data beyond protein structure determination by identifying chemical probes, determining their binding modes and target specificity.

17.
Human MMP-26 (matrix metalloproteinase-26) (also known as endometase or matrilysin-2) is a putative biomarker for human carcinomas of breast, prostate and other cancers of epithelial origin. Calcium modulates protein structure and function and may act as a molecular signal or switch in cells. The relationship between MMPs and calcium has barely been studied and is absent for MMP-26. We have investigated the calcium-binding sites and the role of calcium in MMP-26. MMP-26 has one high-affinity and one low-affinity calcium binding site. High-affinity calcium binding was restored at physiologically low calcium conditions with a calcium-dissociation constant of 63 nM without inducing secondary and tertiary structural changes. High-affinity calcium binding protects MMP-26 against thermal denaturation. Mutants of this site (D165A or E191A) lose enzymatic activity. Low-affinity calcium binding was restored at relatively high calcium concentrations and showed a Kd2 (low-affinity calcium-dissociation constant) value of 120 µM, which was accompanied by reversible recovery of enzymatic activity and by tertiary structural changes, but without secondary structural rearrangements. Mutations at the low-affinity calcium-binding site (C3 site), K189E or D114A, induced enhanced affinity for the Ca2+ ion or an irreversible loss of enzymatic activity triggered by low-affinity calcium binding, respectively. Mutation at a non-calcium-binding site (V184D at C2 site) showed that C2 is not a true calcium-binding site. Observations from homology-modelled mutant structures correlated with these experimental results. A human breast cancer cell line, MDA-MB-231, transfected with wild-type MMP-26 cDNA showed a calcium-dependent invasive potential when compared with controls that were transfected with an inactive form of MMP-26 (E209A). Calcium-independent high invasiveness was observed in the K189E mutant MDA-MB-231 cell line.
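
The two dissociation constants reported above (63 nM and 120 µM) differ by more than three orders of magnitude, which is why the sites fill at very different calcium concentrations. A minimal sketch, assuming simple independent single-site binding and treating the given concentration as free calcium (both are simplifying assumptions, not statements about the study's model):

```python
def fractional_occupancy(free_ligand, kd):
    """Fraction of sites occupied for independent 1:1 binding: L / (L + Kd)."""
    return free_ligand / (free_ligand + kd)

KD_HIGH = 63e-9    # high-affinity site, in mol/L (value from the abstract)
KD_LOW = 120e-6    # low-affinity site, in mol/L (value from the abstract)

# Concentrations below are arbitrary illustration points, in mol/L.
for conc in (1e-9, 63e-9, 1e-6, 120e-6, 1e-3):
    high = fractional_occupancy(conc, KD_HIGH)
    low = fractional_occupancy(conc, KD_LOW)
    print(f"[Ca2+]={conc:.1e} M  high-affinity={high:.3f}  low-affinity={low:.3f}")
```

Under these assumptions each site is half occupied when free calcium equals its dissociation constant, so the high-affinity site is essentially saturated well before the low-affinity site begins to fill.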

18.
Atu4866 is a 79-residue conserved hypothetical protein of unknown function from Agrobacterium tumefaciens. Protein sequence alignments show that it shares ≥60% sequence identity with 20 other hypothetical proteins of bacterial origin. However, the structures and functions of these proteins remain unknown so far. To gain insight into the function of this family of proteins, we have determined the structure of Atu4866 as a target of a structural genomics project using solution NMR spectroscopy. Our results reveal that Atu4866 adopts a streptavidin-like fold featuring a beta-barrel/sandwich formed by eight antiparallel beta-strands. Further structural analysis identified a continuous patch of conserved residues on the surface of Atu4866 that may constitute a potential ligand-binding site.

19.
Advances in structural genomics and protein structure prediction require the design of automatic, fast, objective, and well benchmarked methods capable of comparing and assessing the similarity of low-resolution three-dimensional structures, via experimental or theoretical approaches. Here, a new method for sequence-independent structural alignment is presented that allows comparison of an experimental protein structure with an arbitrary low-resolution protein tertiary model. The heuristic algorithm is given and then used to show that it can describe random structural alignments of proteins with different folds with good accuracy by an extreme value distribution. From this observation, a structural similarity score between two proteins or two different conformations of the same protein is derived from the likelihood of obtaining a given structural alignment by chance. The performance of the derived score is then compared with well established, consensus manual-based scores and data sets. We found that the new approach correlates better than other tools with the gold standard provided by a human evaluator. Timings indicate that the algorithm is fast enough for routine use with large databases of protein models. Overall, our results indicate that the new program (MAMMOTH) will be a good tool for protein structure comparisons in structural genomics applications. MAMMOTH is available from our web site at http://physbio.mssm.edu/~ortizg/.
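
The step from "random alignments follow an extreme value distribution" to "a score derived from the likelihood of a chance alignment" can be illustrated with the Gumbel (type I extreme value) form. This is a sketch under stated assumptions: the abstract does not give the exact distribution parameters or the final score definition, so `mu`, `beta` and the negative-log transform below are hypothetical choices for illustration.

```python
import math

def gumbel_p_value(score, mu, beta):
    """P(S >= score) when random scores follow Gumbel(mu, beta)."""
    z = (score - mu) / beta
    # 1 - exp(-exp(-z)), written with expm1 for accuracy at small p-values.
    return -math.expm1(-math.exp(-z))

def chance_based_similarity(score, mu, beta):
    """Negative log of the chance probability: larger means less likely random."""
    p = gumbel_p_value(score, mu, beta)
    return math.inf if p == 0.0 else -math.log(p)

# Hypothetical location and scale, as if fitted to random alignment scores.
MU, BETA = 0.0, 1.0
for raw in (0.0, 2.0, 5.0):
    print(raw, round(chance_based_similarity(raw, MU, BETA), 3))
```

In practice the location and scale would be fitted to scores from alignments of unrelated folds, and might depend on protein length; that dependence is not modelled here.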

20.
In the search for immunoprotective antigens of the intraerythrocytic Babesia canis rossi parasite, a new cDNA was cloned and sequenced. Protein sequence database searches suggested that the 41-kDa protein belongs to the phosphofructokinase B type family (PFK-B). However, because of the low level sequence identity (< 20%) of the protein both with adenosine and sugar kinases from this family, its structural and functional features were further investigated using molecular modelling and enzymatic assays. The sequence/structure comparison of the protein with the crystal structure of a member of the PFK-B family, Escherichia coli ribokinase (EcRK), suggested that it might also form a stable and active dimer and revealed conservation of the ATP-binding site. However, residues specifically involved in the ribose-binding sites in the EcRK sequence (S and N) were substituted in its sequence (by H and M, respectively), and were suspected of binding adenosine compounds rather than sugar ones. Enzymatic assays using a purified glutathione S-transferase fusion protein revealed that this protein exhibits rapid catalysis of the phosphorylation of adenosine with an apparent Km value of 70 nM, whereas it was inactive on ribose or other carbohydrates. As enzymatic assays confirmed the results of the structure/function analysis indicating a preferential specificity towards adenosine compounds, this new protein of the PFK-B family corresponds to an adenosine kinase from B. canis rossi. It was named BcrAK.
