Similar articles
 20 similar articles found (search time: 15 ms)
1.
The rapid growth in protein structural data and the emergence of structural genomics projects have increased the need for automatic structure analysis and tools for function prediction. Small molecule recognition is critical to the function of many proteins; therefore, determination of ligand binding site similarity is important for understanding ligand interactions and may allow their functional classification. Here, we present a binding sites database (SitesBase) that, given a known protein-ligand binding site, allows rapid retrieval of other binding sites with similar structure independent of overall sequence or fold similarity. However, each match is also annotated with sequence similarity and fold information to aid interpretation of structure and functional similarity. Similarity in ligand binding sites can indicate common binding modes and recognition of similar molecules, allowing potential inference of function for an uncharacterised protein or providing additional evidence of common function where sequence or fold similarity is already known. Alternatively, the resource can provide valuable information for detailed studies of molecular recognition including structure-based ligand design and in understanding ligand cross-reactivity. Here, we show examples of atomic similarity between superfamily or more distant fold relatives as well as between seemingly unrelated proteins. Assignment of unclassified proteins to structural superfamilies is also undertaken and in most cases substantiates assignments made using sequence similarity. Correct assignment is also possible where sequence similarity fails to find significant matches, illustrating the potential use of binding site comparisons for newly determined proteins.

2.
The CsaA protein was first characterized in Bacillus subtilis as a molecular chaperone with export-related activities. Here we report the 2.0 Angstrom-resolution crystal structure of the Thermus thermophilus CsaA protein, designated ttCsaA. Atomic structure and experiments in solution revealed a homodimer as the functional unit. The structure of the ttCsaA monomer is reminiscent of the well known oligonucleotide-binding fold, with the addition of extensions at the N- and C-termini that form an extensive dimer interface. The two identical, large, hydrophobic cavities on the protein surface are likely to constitute the substrate binding sites. The CsaA proteins share essential sequence similarity with the tRNA-binding protein Trbp111. Structure-based sequence analysis suggests a close structural resemblance between these proteins, which may extend to the architecture of the binding sites at the atomic level. These results raise the intriguing possibility that CsaA proteins possess a second, tRNA-binding activity in addition to their export-related function.  相似文献   

3.
Comparisons of protein sequence via cyclic training of Hidden Markov Models (HMMs) in conjunction with alignments of three-dimensional structure, using the Combinatorial Extension (CE) algorithm, reveal two putative EF-hand metal binding domains in acetylcholinesterase. Based on sequence similarity, putative EF-hands are also predicted for the neuroligin family of cell surface proteins. These predictions are supported by experimental evidence. In the acetylcholinesterase crystal structure from Torpedo californica, the first putative EF-hand region binds the Zn2+ found in the heavy metal replacement structure. Further, the interaction of neuroligin 1 with its cognate receptor neurexin depends on Ca2+. Thus, members of the alpha,beta hydrolase fold family of proteins contain potential Ca2+ binding sites, which in some family members may be critical for heterologous cell associations.  相似文献   

4.
Genomics has posed the challenge of determination of protein function from sequence and/or 3-D structure. Functional assignment from sequence relationships can be misleading, and structural similarity does not necessarily imply functional similarity. Proteins in the DJ-1 family, many of which are of unknown function, are examples of proteins with both sequence and fold similarity that span multiple functional classes. THEMATICS (theoretical microscopic titration curves), an electrostatics-based computational approach to functional site prediction, is used to sort proteins in the DJ-1 family into different functional classes. Active site residues are predicted for the eight distinct DJ-1 proteins with available 3-D structures. Placement of the predicted residues onto a structural alignment for six of these proteins reveals three distinct types of active sites. Each type overlaps only partially with the others, with only one residue in common across all six sets of predicted residues. Human DJ-1 and YajL from Escherichia coli have very similar predicted active sites and belong to the same probable functional group. Protease I, a known cysteine protease from Pyrococcus horikoshii, and PfpI/YhbO from E. coli, a hypothetical protein of unknown function, belong to a separate class. THEMATICS predicts a set of residues that is typical of a cysteine protease for Protease I; the prediction for PfpI/YhbO bears some similarity. YDR533Cp from Saccharomyces cerevisiae, of unknown function, and the known chaperone Hsp31 from E. coli constitute a third group with nearly identical predicted active sites. While the first four proteins have predicted active sites at dimer interfaces, YDR533Cp and Hsp31 both have predicted sites contained within each subunit. Although YDR533Cp and Hsp31 form different dimers with different orientations between the subunits, the predicted active sites are superimposable within the monomer structures. 
Thus, the three predicted functional classes form four different types of quaternary structures. The computational prediction of the functional sites for protein structures of unknown function provides valuable clues for functional classification.  相似文献   

5.
BACKGROUND: THP12 is an abundant and extraordinarily hydrophilic hemolymph protein from the mealworm Tenebrio molitor and belongs to a group of small insect proteins with four highly conserved cysteine residues. Despite their sequence homology to odorant-binding proteins and pheromone-binding proteins, the function of these proteins is unclear. RESULTS: The first three-dimensional structure of THP12 has been determined by multidimensional NMR spectroscopy. The protein has a nonbundle helical structure consisting of six alpha helices. The arrangement of the alpha helices has a 'baseball glove' shape. In addition to the hydrophobic core, electrostatic interactions make contributions to the overall stability of the protein. NMR binding studies demonstrated the binding of small hydrophobic ligands to the single hydrophobic groove in THP12. Comparing the structure of THP12 with the predicted secondary structure of homologs reveals a common fold for this new class of insect proteins. A search with the program DALI revealed extensive similarity between the three-dimensional structure of THP12 and the N-terminal domain (residues 1-95) of recoverin, a member of the family of calcium-binding EF-hand proteins. CONCLUSIONS: Although the biological function of this new class of proteins is as yet undetermined, a general role as alpha-helical carrier proteins for small hydrophobic ligands, such as fatty acids or pheromones, is proposed on the basis of NMR-shift perturbation spectroscopy.  相似文献   

6.
The nature of the interaction of insect cuticular proteins and chitin is unknown even though about half of the cuticular proteins sequenced thus far share a consensus region that has been predicted to be the site of chitin binding. We previously predicted the preponderance of beta-pleated sheet in the consensus region and proposed its responsibility for the formation of helicoidal cuticle (Iconomidou et al., Insect Biochem. Mol. Biol. 29 (1999) 285). Consequently, we have also verified experimentally the abundance of antiparallel beta-pleated sheet in the structure of cuticle proteins (Iconomidou et al., Insect Biochem. Mol. Biol. 31 (2001) 877). In this work, based on sequence and secondary structure similarity of cuticle proteins, and especially that of the consensus motif, to that of bovine plasma retinol binding protein (RBP), we propose by homology modelling an antiparallel beta-sheet half-barrel structure as the basic folding motif of cuticle proteins. This folding motif may provide the template for elucidating cuticle protein-chitin interactions in detail and reveal the precise geometrical formation of cuticle's helicoidal architecture. This predicted motif is another example where nature utilizes an almost flat protein surface covered by aromatic side chains to interact with the polysaccharide chains of chitin.  相似文献   

7.
MOTIVATION: A method for recognizing the three-dimensional fold from the protein amino acid sequence based on a combination of hidden Markov models (HMMs) and secondary structure prediction was recently developed for proteins in the Mainly-Alpha structural class. Here, this methodology is extended to Mainly-Beta and Alpha-Beta class proteins. Compared to other fold recognition methods based on HMMs, this approach is novel in that only secondary structure information is used. Each HMM is trained from known secondary structure sequences of proteins having a similar fold. Secondary structure prediction is performed for the amino acid sequence of a query protein. The predicted fold of a query protein is the fold described by the model fitting the predicted sequence the best. RESULTS: After model cross-validation, the success rate on 44 test proteins covering the three structural classes was found to be 59%. On seven fold predictions performed prior to the publication of experimental structure, the success rate was 71%. In conclusion, this approach manages to capture important information about the fold of a protein embedded in the length and arrangement of the predicted helices, strands and coils along the polypeptide chain. When a more extensive library of HMMs representing the universe of known structural families is available (work in progress), the program will allow rapid screening of genomic databases and sequence annotation when fold similarity is not detectable from the amino acid sequence. AVAILABILITY: FORESST web server at http://absalpha.dcrt.nih.gov:8008/ for the library of HMMs of structural families used in this paper. FORESST web server at http://www.tigr.org/ for a more extensive library of HMMs (work in progress). CONTACT: valedf@tigr.org; munson@helix.nih.gov; garnier@helix.nih.gov  相似文献   
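The fold-assignment step described in this abstract (pick the HMM that best fits the predicted secondary structure string) can be illustrated with a minimal sketch. It assumes a discrete HMM per fold over the alphabet H/E/C and uses the standard forward algorithm to compute the likelihood; the function names and the dictionary layout of the models are hypothetical and are not taken from FORESST.

```python
def forward_likelihood(seq, start, trans, emit):
    """Forward algorithm: P(seq | HMM) for a discrete HMM.

    start[s], trans[s][t] and emit[s][symbol] are probabilities.
    Missing entries are treated as probability zero.
    """
    states = list(start)
    # Initialise with the first symbol of the sequence.
    alpha = {s: start[s] * emit[s].get(seq[0], 0.0) for s in states}
    # Propagate the forward variable one symbol at a time.
    for sym in seq[1:]:
        alpha = {
            t: emit[t].get(sym, 0.0)
            * sum(alpha[s] * trans[s].get(t, 0.0) for s in states)
            for t in states
        }
    return sum(alpha.values())


def predict_fold(ss_string, models):
    """Return the name of the model that fits the string best.

    `models` maps a fold name to a (start, trans, emit) tuple
    (a hypothetical layout chosen for this sketch).
    """
    return max(
        models,
        key=lambda name: forward_likelihood(ss_string, *models[name]),
    )
```

For long sequences a real implementation would work in log space or rescale the forward variable to avoid numerical underflow; plain probabilities are used here only to keep the sketch short.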

8.
9.
A cDNA for a type II antifreeze protein was isolated from liver of smelt (Osmerus mordax). The predicted protein sequence is homologous to that from sea raven (Hemitripterus americanus) and both show homology to a family of calcium-dependent lectins. Smelt and sea raven belong to taxonomic orders believed to have diverged prior to Cenozoic glaciation. Thus, type II antifreeze proteins appear to have evolved independently in these fish species from pre-existing calcium-dependent lectins. Sequence alignment of the antifreezes and the lectins suggests that these proteins adopt a similar fold, that the sea raven antifreeze has lost its Ca2+ binding sites, and that the smelt antifreeze has retained one site. Experiments show that smelt antifreeze protein activity is responsive to Ca2+ but that of sea raven antifreeze protein is not. These results suggest that the type II fish antifreeze proteins and calcium-dependent lectins share a common ancestry, related folding structures, and functional similarity.

10.
Brakoulias A, Jackson RM. Proteins, 2004, 56(2): 250-260
A method is described for the rapid comparison of protein binding sites using geometric matching to detect similar three-dimensional structure. The geometric matching detects common atomic features through identification of the maximum common sub-graph or clique. These features are not necessarily evident from sequence or from global structural similarity, giving additional insight into molecular recognition not evident from current sequence or structural classification schemes. Here we use the method to produce an all-against-all comparison of phosphate binding sites in a number of different nucleotide phosphate-binding proteins. The similarity search is combined with clustering of similar sites to allow a preliminary structural classification. Clustering by site similarity produces a classification of binding sites for the 476 representative local environments producing ten main clusters representing half of the representative environments. The similarities make sense in terms of both structural and functional classification schemes. The ten main clusters represent a very limited number of unique structural binding motifs for phosphate. These are the structural P-loop, di-nucleotide binding motif [FAD/NAD(P)-binding and Rossmann-like fold] and FAD-binding motif. Similar classification schemes for nucleotide binding proteins have also been arrived at independently by others using different methods.
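The clique-based geometric matching mentioned in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: each site is assumed to be a list of (atom_type, coordinates) pairs, the distance tolerance `tol` is an arbitrary illustrative value, and the clique search is a plain Bron-Kerbosch with pivoting. All names are hypothetical.

```python
from itertools import combinations
from math import dist


def correspondence_graph(site_a, site_b, tol=0.5):
    """Build the correspondence (association) graph of two sites.

    Each site is a list of (atom_type, (x, y, z)). A node (i, j)
    pairs atom i of site A with a same-type atom j of site B.
    Two nodes are joined when the inter-atom distance in A and
    the matching distance in B agree within `tol` angstroms.
    """
    nodes = [
        (i, j)
        for i, (type_a, _) in enumerate(site_a)
        for j, (type_b, _) in enumerate(site_b)
        if type_a == type_b
    ]
    adj = {n: set() for n in nodes}
    for (i, j), (k, l) in combinations(nodes, 2):
        if i == k or j == l:
            continue  # an atom may be matched only once
        d_a = dist(site_a[i][1], site_a[k][1])
        d_b = dist(site_b[j][1], site_b[l][1])
        if abs(d_a - d_b) <= tol:
            adj[(i, j)].add((k, l))
            adj[(k, l)].add((i, j))
    return adj


def max_clique(adj):
    """Return one maximum clique (Bron-Kerbosch with pivoting)."""
    best = []

    def expand(r, p, x):
        nonlocal best
        if not p and not x:
            if len(r) > len(best):
                best = list(r)
            return
        # Pivot on the vertex with most neighbours among candidates.
        pivot = max(p | x, key=lambda v: len(adj[v] & p))
        for v in list(p - adj[pivot]):
            expand(r + [v], p & adj[v], x & adj[v])
            p.remove(v)
            x.add(v)

    expand([], set(adj), set())
    return sorted(best)
```

The largest clique in the correspondence graph is the largest set of atom pairings whose internal distances are mutually consistent, which is why a clique search recovers the maximum common sub-structure of the two sites. The search is exponential in the worst case, so this sketch is only suitable for small sites.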

11.
In mammals, a family of four lipid binding proteins has been previously defined that includes two lipopolysaccharide binding proteins and two lipid transfer proteins. The first member of this family to have its three-dimensional structure determined is bactericidal/permeability-increasing protein (BPI). Using both the sequence and structure of BPI, along with recently developed sequence-sequence and sequence-structure similarity search methods, we have identified 13 distant members of the family in a diverse set of eukaryotes, including rat, chicken, Caenorhabditis elegans, and Biomphalaria glabrata. Although the sequence similarity between these 13 new members and any of the 4 original members of the BPI family is well below the "twilight zone," their high sequence-structure compatibility with BPI indicates they are likely to share its fold. These findings broaden the BPI family to include a member found in retina and brain, and suggest that a primitive member may have contained only one of the two similar domains of BPI.

12.
The primary sequence of the receptor for L-arabinose or Ara-binding protein (ABP) composed of 306 residues is very different from the D-glucose/D-galactose-binding protein (GGBP) which consists of 309 residues. Nevertheless, superimpositioning of the well-refined high resolution structures of ABP in complex with D-galactose and the GGBP in complex with D-glucose shows very similar structures; 220 of the residues (or about 70%) have a root mean square deviation of 2.0 A. From the superpositioning, nine pairs of continuous segments (consisting of 8-51 residues), mainly alpha-helices and beta-strands that form the core of the two lobes of the bilobate proteins were found to exhibit strong sequence homology. The equivalenced structures and aligned sequences show that many of the polar, as well as aromatic residues, in the sugar-binding sites located in the cleft between the two lobes are highly conserved. Surprisingly, however, the exact mode of binding of the D-galactose in ABP is totally different from that of the D-glucose in GGBP. Using the structurally aligned sequences of the ABP and GGBP as a template, we have matched the sequence of the ribose-binding protein (RBP) which consists of 271 residues with the ABP/GGBP pair. Although the nine aligned segments of all three proteins show little sequence identity, they have significant homology. Four additional segments of RBP were matched only with GGBP, leading to the alignment of about 90% of the RBP sequence with the GGBP sequence. Many of the conserved residues in the binding sites of ABP and GGBP matched with similar residues in RBP. Additional observations indicate that the GGBP/RBP pair is more closely related than the ABP/RBP or ABP/GGBP pair. All three binding proteins, which may have diverged from a common ancestor, serve as primary receptors for bacterial high affinity active transport systems. Moreover, GGBP and RBP, but not ABP, also act as receptors for chemotaxis. 
An exposed site located in one domain, which includes Gly74, for interacting with the trg transmembrane signal transducer that is involved in triggering chemotaxis has been located in the structure of GGBP (Vyas, N.K., Vyas, M.N., and Quiocho, F.A. (1988) Science 242, 1290-1295). Whereas the site is absent in the structure of ABP, it is strongly predicted to be present in RBP which shares the same trg transducer with GGBP. The knowledge-based alignment of RBP further revealed two possible additional peripheral chemotactic sites that show high structural and sequence similarity between GGBP and RBP only. At least one of these sites, together with the one proven to exist in the other domain, could be used by the signal transducer with which both binding proteins interact in a way which the substrate-loaded "closed cleft" structure could be discriminated from the unliganded "open cleft" form by the transducer.  相似文献   

13.
Proteins that share even low sequence homologies are known to adopt similar folds. The beta-propeller structural motif is one such example. Identifying sequences that adopt a beta-propeller fold is useful to annotate protein structure and function. Often, tandem sequence repeats provide the necessary signal for identifying beta-propellers in proteins. In our recent analysis to identify cell surface proteins in archaeal and bacterial genomes, we identified some proteins that contain novel tandem repeats "LVIVD", "RIVW" and "LGxL". In this work, based on protein fold predictions and three-dimensional comparative modeling methods, we predicted that these repeat types fold as beta-propellers. Further, the evolutionary trace analysis of all proteins constituting amino acid sequence repeats in beta-propellers suggests that the novel repeats have diverged from a common ancestor.

14.
TA0095 is a 96-residue hypothetical protein from Thermoplasma acidophilum that exhibits no sequence similarity to any protein of known structure. Also, TA0095 is a member of the COG4004 orthologous group of unknown function found in Archaea bacteria. We determined its three-dimensional structure by NMR methods. The structure displays an alpha/beta two-layer sandwich architecture formed by three alpha-helices and five beta-strands following the order beta1-alpha1-beta2-beta3-beta4-beta5-alpha2-alpha3. Searches for structural homologs indicate that the TA0095 structure belongs to the TBP-like fold, constituting a novel superfamily characterized by an additional C-terminal helix. The TA0095 structure provides a fold common to the COG4004 proteins that will obviously belong to this new superfamily. Most hydrophobic residues conserved in the COG4004 proteins are buried in the structure determined herein, thus underlying their importance for structure stability. Considering that the TA0095 surface shows a large positively charged patch with a high degree of residue conservation within the COG4004 domain, the biological function of TA0095 and the rest of COG4004 proteins might occur through binding a negatively charged molecule. Like other TBP-like fold proteins, the COG4004 proteins might be DNA-binding proteins. The fact that TA0095 is shown to interact with large DNA fragments is in favor of this hypothesis, although nonspecific DNA binding cannot be ruled out.  相似文献   

15.
We present a protein fold recognition method, MANIFOLD, which uses the similarity between target and template proteins in predicted secondary structure, sequence and enzyme code to predict the fold of the target protein. We developed a non-linear ranking scheme in order to combine the scores of the three different similarity measures used. For a difficult test set of proteins with very little sequence similarity, the program predicts the fold class correctly in 34% of cases. This is an over twofold increase in accuracy compared with sequence-based methods such as PSI-BLAST or GenTHREADER, which score 13-14% correct first hits for the same test set. The functional similarity term increases the prediction accuracy by up to 3% compared with using the combination of secondary structure similarity and PSI-BLAST alone. We argue that using functional and secondary structure information can increase the fold recognition beyond sequence similarity.  相似文献   

16.
Agam (Anopheles gambiae) relies on its olfactory system to target human prey, leading eventually to the injection of Plasmodium falciparum, the malaria vector. OBPs (odorant-binding proteins) are the first line of proteins involved in odorant recognition. They interact with olfactory receptors and thus constitute an interesting target for insect control. In the present study, we undertook a large-scale analysis of proteins belonging to the olfactory system of Agam with the aim of preventing insect bites by designing strong olfactory repellents. We determined the three-dimensional structures of several Agam OBPs, either alone or in complex with model compounds. In the present paper, we report the first three-dimensional structure of a member of the C-plus class of OBPs, AgamOBP47, which has a longer sequence than classical OBPs and contains six disulfide bridges. AgamOBP47 possesses a core of six α-helices and three disulfide bridges, similar to the classical OBP fold. Two extra loops and the N- and C-terminal extra segments contain two additional α-helices and are held in conformation by three disulfide bridges. They are located either side of the classical OBP core domain. The binding site of OBP47 is located between the core and the additional domains. Two crevices are observed on opposite sides of OBP47, which are joined together by a shallow channel of sufficient size to accommodate a model of the best-tested ligand. The binding sites of C-plus class OBPs therefore exhibit different characteristics, as compared with classical OBPs, which should lead to markedly diverse functional implications.  相似文献   

17.
Of the membrane proteins of known structure, we found that a remarkable 67% of the water soluble domains are structurally similar to water soluble proteins of known structure. Moreover, 41% of known water soluble protein structures share a domain with an already known membrane protein structure. We also found that functional residues are frequently conserved between extramembrane domains of membrane and soluble proteins that share structural similarity. These results suggest membrane and soluble proteins readily exchange domains and their attendant functionalities. The exchanges between membrane and soluble proteins are particularly frequent in eukaryotes, indicating that this is an important mechanism for increasing functional complexity. The high level of structural overlap between the two classes of proteins provides an opportunity to employ the extensive information on soluble proteins to illuminate membrane protein structure and function, for which much less is known. To this end, we employed structure guided sequence alignment to elucidate the functions of membrane proteins in the human genome. Our results bridge the gap of fold space between membrane and water soluble proteins and provide a resource for the prediction of membrane protein function. A database of predicted structural and functional relationships for proteins in the human genome is provided at sbi.postech.ac.kr/emdmp.  相似文献   

18.
Rigden DJ, Carneiro M. Proteins, 1999, 37(4): 697-708
The study of the plant oncogene rolA has been hampered by a lack of structural information. Here we show that, despite a lack of significant sequence similarity to proteins of known structure, the rolA sequence adopts a known fold; that of the papillomavirus E2 DNA-binding domain. This fold is reliably identified by modern threading programs, which consider predicted secondary structure, but not by others. Although the rolA sequence is only around 16% identical to those of the available template structures, a structural model could be built that performed well against protein structure verification programs. The adopted strategy involved alignment corrections, justified by multiple model building and evaluation, with particular attention paid to the hydrophobic core residues. We find that rolA protein is predicted to resemble the template proteins in two key aspects; existence as a dimer and ability to bind DNA. rolA protein has recently been shown experimentally to possess DNA binding ability. This model predicts Lys 24 and Arg 27 to be involved in sequence-specific interactions and eight other residues to hydrogen-bond phosphate groups of the DNA.  相似文献   

19.
20.
Structure-based prediction of DNA target sites by regulatory proteins (total citations: 15; self-citations: 0; citations by others: 15)
Kono H, Sarai A. Proteins, 1999, 35(1): 114-131
Regulatory proteins play a critical role in controlling complex spatial and temporal patterns of gene expression in higher organism, by recognizing multiple DNA sequences and regulating multiple target genes. Increasing amounts of structural data on the protein-DNA complex provides clues for the mechanism of target recognition by regulatory proteins. The analyses of the propensities of base-amino acid interactions observed in those structural data show that there is no one-to-one correspondence in the interaction, but clear preferences exist. On the other hand, the analysis of spatial distribution of amino acids around bases shows that even those amino acids with strong base preference such as Arg with G are distributed in a wide space around bases. Thus, amino acids with many different geometries can form a similar type of interaction with bases. The redundancy and structural flexibility in the interaction suggest that there are no simple rules in the sequence recognition, and its prediction is not straightforward. However, the spatial distributions of amino acids around bases indicate a possibility that the structural data can be used to derive empirical interaction potentials between amino acids and bases. Such information extracted from structural databases has been successfully used to predict amino acid sequences that fold into particular protein structures. We surmised that the structures of protein-DNA complexes could be used to predict DNA target sites for regulatory proteins, because determining DNA sequences that bind to a particular protein structure should be similar to finding amino acid sequences that fold into a particular structure. Here we demonstrate that the structural data can be used to predict DNA target sequences for regulatory proteins. Pairwise potentials that determine the interaction between bases and amino acids were empirically derived from the structural data. 
These potentials were then used to examine the compatibility between DNA sequences and the protein-DNA complex structure in a combinatorial "threading" procedure. We applied this strategy to the structures of protein-DNA complexes to predict DNA binding sites recognized by regulatory proteins. To test the applicability of this method in target-site prediction, we examined the effects of cognate and noncognate binding, cooperative binding, and DNA deformation on the binding specificity, and predicted binding sites in real promoters and compared with experimental data. These results show that target binding sites for several regulatory proteins are successfully predicted, and our data suggest that this method can serve as a powerful tool for predicting multiple target sites and target genes for regulatory proteins.  相似文献   
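The combinatorial "threading" idea in this abstract can be illustrated with a simplified sketch. It assumes contact-count statistics in place of the spatially resolved potentials the authors describe, converts them to scores of the form -ln(observed/expected), and scores every DNA sequence of a given length against a fixed list of base/amino-acid contacts. The function names, the pseudocount, and the input layout are hypothetical.

```python
from itertools import product
from math import log

BASES = "ACGT"


def derive_potentials(counts, pseudo=1.0):
    """Turn contact counts into -ln(observed/expected) scores.

    `counts[(base, aa)]` is how often the pair was seen in contact.
    The expected count assumes bases and amino acids pair
    independently; `pseudo` avoids taking the log of zero.
    """
    aas = sorted({aa for _, aa in counts})
    n = {
        (b, a): counts.get((b, a), 0) + pseudo
        for b in BASES
        for a in aas
    }
    total = sum(n.values())
    base_tot = {b: sum(n[(b, a)] for a in aas) for b in BASES}
    aa_tot = {a: sum(n[(b, a)] for b in BASES) for a in aas}
    return {
        (b, a): -log(n[(b, a)] * total / (base_tot[b] * aa_tot[a]))
        for b in BASES
        for a in aas
    }


def thread(contacts, potentials, length):
    """Score every DNA sequence of `length` on fixed contacts.

    `contacts` lists (base_position, amino_acid) pairs read from
    a complex structure. Returns (energy, sequence) tuples sorted
    with the most favourable (lowest) energy first.
    """
    scored = []
    for seq in product(BASES, repeat=length):
        energy = sum(potentials[(seq[pos], aa)] for pos, aa in contacts)
        scored.append((energy, "".join(seq)))
    return sorted(scored)
```

Exhaustive enumeration grows as 4^length, so this sketch is practical only for short sites. Because each contact here depends on a single base, the best sequence could also be found position by position; the full enumeration is kept to mirror the threading procedure and to yield a ranked list.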
