Similar literature
 20 similar documents found (search time: 42 ms)
1.
SUMMARY: We provide the scientific community with a web server which gives access to SuMo, a bioinformatic system for finding similarities in arbitrary 3D structures or substructures of proteins. SuMo is based on a unique representation of macromolecules using selected triplets of chemical groups having their own geometry and symmetry, regardless of the restrictive notions of main chain and side chains of amino acids. The heuristic for extracting similar sites was used to drive two major large-scale approaches. First, searching for ligand binding sites on a query structure has been made possible by comparing the structure against each of the ligand binding sites found in the Protein Data Bank (PDB). Second, the reciprocal process, i.e. searching for a given 3D site of interest among the structures of the PDB, is also possible and helps detect cross-reacting targets in drug design projects. AVAILABILITY: The web server is freely accessible to academia through http://sumo-pbil.ibcp.fr and full support is available from MEDIT (http://www.medit.fr). CONTACT: mjambon@burnham.org.

2.
Ligand–protein interactions are essential for biological processes, and precise characterization of protein binding sites is crucial to understanding protein functions. MED‐SuMo is a powerful technology to localize similar local regions on protein surfaces. Its heuristic is based on a 3D representation of macromolecules using specific surface chemical features associating chemical characteristics with geometrical properties. MED‐SMA is an automated and fast method to classify binding sites. It is based on MED‐SuMo technology, which builds a similarity graph, and it uses the Markov Clustering algorithm. Purine binding sites are well studied as drug targets. Here, purine binding sites of the Protein Data Bank (PDB) are classified. Proteins potentially inhibited or activated through the same mechanism are gathered. Results are analyzed according to PROSITE annotations and to carefully refined functional annotations extracted from the PDB. As expected, binding sites associated with related mechanisms are gathered, for example, the Small GTPases. Nevertheless, protein kinases from different Kinome families are also found together, for example, Aurora‐A and CDK2 proteins which are inhibited by the same drugs. Representative examples of different clusters are presented. The effectiveness of the MED‐SMA approach is demonstrated as it gathers binding sites of proteins with similar structure‐activity relationships. Moreover, an efficient new protocol associates structures lacking cocrystallized ligands to the purine clusters, enabling those structures to be associated with a specific binding mechanism. Applications of this classification by binding mode similarity include target‐based drug design and prediction of cross‐reactivity and therefore potential toxic side effects.
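
The abstract above describes clustering a similarity graph with the Markov Clustering (MCL) algorithm. The following is a minimal sketch of generic MCL in plain Python, not the MED-SMA implementation: the inflation value, the iteration count, the added self-loops, and the toy graph are all assumptions made for illustration.

```python
def normalise(m):
    """Scale every column of a square matrix so that it sums to 1."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mcl(adjacency, inflation=2.0, iterations=50):
    """Cluster a weighted, undirected similarity graph with basic MCL.

    adjacency[i][j] is a hypothetical similarity score between sites i and j.
    Returns a sorted list of clusters, each a sorted list of node indices.
    """
    n = len(adjacency)
    # Self-loops keep every column non-empty (a common MCL convention).
    m = [[adjacency[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    m = normalise(m)
    for _ in range(iterations):
        m = matmul(m, m)                                  # expansion
        m = normalise([[v ** inflation for v in row]      # inflation
                       for row in m])
    # Each row that keeps non-zero flow is an attractor; the columns it
    # reaches form one cluster. Identical clusters are merged by the set.
    clusters = set()
    for i in range(n):
        members = frozenset(j for j in range(n) if m[i][j] > 1e-6)
        if members:
            clusters.add(members)
    return sorted(sorted(c) for c in clusters)


# Toy graph: two groups of three mutually similar binding sites.
toy_graph = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
print(mcl(toy_graph))
```

A larger inflation value generally yields finer clusters; for real data a sparse-matrix implementation would be preferable to these dense list-of-lists loops.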

3.
We have recently developed a fast approach to comparisons of 3-dimensional structures. Our method is unique, treating protein structures as collections of unconnected points (atoms) in space. It is completely independent of the amino acid sequence order. It is unconstrained by insertions, deletions, and chain directionality. It matches single, isolated amino acids between 2 different structures strictly by their spatial positioning regardless of their relative sequential position in the amino acid chain. It automatically detects a recurring 3D motif in protein molecules. No predefinition of the motif is required. The motif can be either in the interior of the proteins or on their surfaces. In this work, we describe an enhancement over our previously developed technique, which considerably reduces the complexity of the algorithm. This results in an extremely fast technique. A typical pairwise comparison of 2 protein molecules requires less than 3 s on a workstation. We have scanned the structural database with dozens of probes, successfully detecting structures that are similar to the probe. To illustrate the power of this method, we compare the structure of a trypsin-like serine protease against the structural database. Besides detecting homologous trypsin-like proteases, we automatically obtain 3D, sequence order-independent, active-site similarities with subtilisin-like and sulfhydryl proteases. These similarities equate isolated residues, without conserving the linear order of the amino acids in the chains. The active-site similarities are well known and have been detected by manually inspecting the structures in a time-consuming, laborious procedure. This is the first time such equivalences are obtained automatically from the comparison of full structures. The far-reaching advantages and the implications of our novel algorithm to studies of protein folding, to evolution, and to searches for pharmacophoric patterns are discussed.

4.
Recognition of regions on the surface of one protein that are similar to a binding site of another is crucial for the prediction of molecular interactions and for functional classifications. We first describe a novel method, SiteEngine, that assumes no sequence or fold similarities and is able to recognize proteins that have similar binding sites and may perform similar functions. We achieve high efficiency and speed by introducing a low-resolution surface representation via chemically important surface points, by hashing triangles of physico-chemical properties and by application of hierarchical scoring schemes for a thorough exploration of global and local similarities. We proceed to rigorously apply this method to functional site recognition in three possible ways: first, we search a given functional site on a large set of complete protein structures. Second, a potential functional site on a protein of interest is compared with known binding sites, to recognize similar features. Third, a complete protein structure is searched for the presence of an a priori unknown functional site, similar to known sites. Our method is robust and efficient enough to allow computationally demanding applications such as the first and the third. From the biological standpoint, the first application may identify secondary binding sites of drugs that may lead to side-effects. The third application finds new potential sites on the protein that may provide targets for drug design. Each of the three applications may aid in assigning a function and in classification of binding patterns. We highlight the advantages and disadvantages of each type of search, provide examples of large-scale searches of the entire Protein Data Bank and make functional predictions.
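
The idea of hashing triangles of physico-chemical properties, mentioned in the abstract above, can be illustrated roughly as follows. This is a simplified sketch and not SiteEngine itself: the property labels, the 1.0 Å bin width, the key layout, and the coordinates are assumptions chosen for the example.

```python
from itertools import combinations
from math import dist


def triangle_keys(points, bin_width=1.0):
    """Return rotation- and translation-invariant keys for all triangles.

    points is a list of (label, (x, y, z)) tuples, where label names a
    hypothetical physico-chemical property of a surface point. A key is
    the sorted labels plus the sorted, binned side lengths.
    """
    keys = set()
    for triple in combinations(points, 3):
        labels = tuple(sorted(p[0] for p in triple))
        sides = tuple(sorted(int(dist(p[1], q[1]) // bin_width)
                             for p, q in combinations(triple, 2)))
        keys.add((labels, sides))
    return keys


def shared_triangles(site_a, site_b, bin_width=1.0):
    """Triangle keys present in both sites (candidate local matches)."""
    return triangle_keys(site_a, bin_width) & triangle_keys(site_b, bin_width)


# site_b contains site_a rotated by 90 degrees about z, plus one extra point.
site_a = [("donor", (0.0, 0.0, 0.0)),
          ("acceptor", (4.5, 0.0, 0.0)),
          ("aromatic", (0.0, 2.5, 0.0))]
site_b = [("donor", (0.0, 0.0, 0.0)),
          ("acceptor", (0.0, 4.5, 0.0)),
          ("aromatic", (-2.5, 0.0, 0.0)),
          ("donor", (20.0, 0.0, 0.0))]
print(shared_triangles(site_a, site_b))
```

Because the key depends only on labels and distances, the two sites need not be superposed before lookup. Side lengths close to a bin boundary can fall into different bins, so a practical scheme would also probe neighbouring bins.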

5.
We have developed a method of searching for similar spatial arrangements of atoms around a given chemical moiety in proteins that bind a common ligand. The first step in this method is to consider a set of atoms that closely surround a given chemical moiety. Then, to compare the spatial arrangements of such surrounding atoms in different proteins, they are translated and rotated so that the chemical moieties are superposed on each other. Spatial arrangements of surrounding atoms in a pair of proteins are judged to be similar when there are many corresponding atoms occupying similar spatial positions. Because the method focuses on the arrangements of surrounding atoms, it can detect structural similarities of binding sites in proteins that are dissimilar in their amino acid sequences or in their chain folds. We have applied this method to identify modes of nucleotide base recognition by proteins. An all-against-all comparison of the arrangements of atoms surrounding adenine moieties revealed an unexpected structural similarity between two protein kinases, cAMP-dependent protein kinase (cAPK) and casein kinase-1 (CK1), and D-Ala:D-Ala ligase (DD-ligase) at their adenine-binding sites, despite a lack of similarity in their chain folds. The similar local structure consists of a four-residue segment and three sequentially separated residues. In particular, the four-residue segments of these enzymes were found to have nearly identical conformations in their backbone parts, which are involved in the recognition of adenine. This common local structure was also found in substrate-free three-dimensional structures of other proteins that are similar to DD-ligase in the chain fold and of other protein kinases. As the proteins with different folds were found to share a common local structure, these proteins seem to constitute a remarkable example of convergent evolution for the same recognition mechanism. Received: 9 December 1996 / Accepted: 7 February 1997
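
The comparison step in the abstract above (counting corresponding atoms once the chemical moieties are superposed) might look roughly like the sketch below. It assumes the coordinates are already in the superposed frame; the 1.0 Å cutoff, the greedy one-to-one pairing, and the coordinates are illustrative assumptions, not details taken from the paper.

```python
from math import dist


def count_corresponding_atoms(env_a, env_b, cutoff=1.0):
    """Count one-to-one atom pairs that lie within cutoff of each other.

    env_a and env_b are lists of (x, y, z) coordinates of the atoms that
    surround the moiety in two proteins, expressed in a common frame.
    Pairs are assigned greedily, closest first.
    """
    candidates = sorted(
        (dist(p, q), i, j)
        for i, p in enumerate(env_a)
        for j, q in enumerate(env_b)
        if dist(p, q) <= cutoff
    )
    used_a, used_b = set(), set()
    for _, i, j in candidates:
        if i not in used_a and j not in used_b:
            used_a.add(i)
            used_b.add(j)
    return len(used_a)


# Hypothetical environments: two atoms nearly coincide, one does not.
env_a = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 3.0, 0.0)]
env_b = [(0.2, 0.0, 0.0), (3.0, 0.5, 0.0), (8.0, 8.0, 8.0)]
print(count_corresponding_atoms(env_a, env_b))
```

A fuller version would probably also require matching atom types and would normalise the count by the environment size before calling two sites similar.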

6.
7.
The R3H domain is a conserved sequence motif, identified in over 100 proteins, that is thought to be involved in polynucleotide-binding, including DNA, RNA and single-stranded DNA. In this work the 3D structure of the R3H domain from human Smubp-2 was determined by NMR spectroscopy. It is the first 3D structure determination of an R3H domain. The fold presents a small motif, consisting of a three-stranded antiparallel beta-sheet and two alpha-helices, which is related to the structures of the YhhP protein and the C-terminal domain of the translational initiation factor IF3. The similarities are non-trivial, as the amino acid identities are below 10%. Three conserved basic residues cluster on the same face of the R3H domain and could play a role in nucleic acid recognition. An extended hydrophobic area at a different site of the molecular surface could act as a protein-binding site. A strong correlation between conservation of hydrophobic amino acids and side-chain solvent protection indicates that the structure of the Smubp-2 R3H domain is representative of R3H domains in general.

8.
Protein phosphorylation is widely used in biological regulatory processes. The study of spatial features related to phosphorylation sites is necessary to increase the efficacy of recognition of phosphorylation patterns in protein sequences. Using the data on phosphosites found in amino acid sequences, we mapped these sites onto 3D structures and studied the structural variability of the same sites in different PDB entries related to the same proteins. Solvent accessibility was calculated for the residues known to be phosphorylated. A significant change in accessibility was shown for many sites, but several were found to be buried in all the structures considered. Most phosphosites were found in coil regions. However, a significant portion was located in the structurally stable ordered regions. Comparison of structures with the same sites in modified and unmodified states showed that the region surrounding a site could be significantly shifted due to phosphorylation. Comparison between non‐modified structures (as well as between the modified ones) suggested that phosphorylation stabilizes one of the possible conformations. The local structure around the site could be changed due to phosphorylation, but often the initial conformation of the site's surroundings is not altered within the bounds of a rather large substructure. In this case, we can observe an extensive displacement within a protein domain. Phosphorylation without structural alteration seems to provide the interface for domain‐domain or protein‐protein interactions. Accounting for structural features is important for revealing more specific patterns of phosphorylation. It is also necessary for explaining structural changes as a basis for regulatory processes.

9.
MOTIVATION: With the increasing availability of protein structures, the generation of biologically meaningful 3D patterns from the simultaneous alignment of several protein structures is an exciting prospect: active sites could be better understood, and protein functions and protein 3D structures could be predicted more accurately. Although patterns can already be generated at the fold and topological levels, no system produces high-resolution 3D patterns including atom and cavity positions. To address this challenge, our research focuses on generating patterns from proteins with rigid prosthetic groups. Since these groups are key elements of protein active sites, the generated 3D patterns are expected to be biologically meaningful. RESULTS: In this paper, we present a new approach which allows the generation of 3D patterns from proteins with rigid prosthetic groups. Using 237 protein chains representing proteins containing porphyrin rings, our method was validated by comparing 3D templates generated from homologues with the 3D structure of the proteins they model. Atom positions were predicted reliably: 93% of them had an accuracy of 1.00 Å or less. Moreover, similar results were obtained regarding chemical group and cavity positions. Results also suggested our system could contribute to the validation of 3D protein models. Finally, a 3D template was generated for the active site of human cytochrome P450 CYP17, the 3D structure of which is unknown. Its analysis showed that it is biologically meaningful: our method detected the main patterns of the cytochrome P450 superfamily and the motifs linked to catalytic reactions. The 3D template also suggested the position of a residue, which could be involved in a hydrogen bond with CYP17 substrates, and the shape and location of a cavity. Comparisons with independently generated 3D models supported these hypotheses.
AVAILABILITY: Alignment software (Nestor3D) is available at http://www.kingston.ac.uk/~ku33185/Nestor3D.html

10.
Lectins, a group of proteins that bind to cell surface carbohydrates and play important roles in innate immunity, are widely used experimentally to distinguish cell types and to induce cell proliferation. Eel serum lectins have been useful as anti-H hemagglutinins and also in lectin histochemistry as fucose-binding lectins (fucolectins), but their structures have not been determined. Here we report the primary structures and the sites of synthesis of eel fucolectins. Eel serum fucolectins were separated by two-dimensional gel electrophoresis and sequenced. cDNA cloning, based on the amino acid sequence information, and Northern blot analysis indicated that 1) the fucose-binding lectins are secretory proteins and have unique structures among the lectins, exhibiting only weak similarities to frog pentraxin, horseshoe crab tachylectin-4, and fly fw protein; 2) there are at least seven closely related members; and 3) their messages are abundantly expressed in the liver and at significant levels in the gill and intestine. The lectin-producing hepatic cells were identified by immunostaining; in the gill, exocrine mucous cells were stained, suggesting that serum fucolectins derive from the liver. Using primary culture of eel hepatocytes, the message levels were shown to be increased by lipopolysaccharide, suggesting a role for fucolectins in host defense. SDS-polyacrylamide gel electrophoresis analysis showed that eel fucolectins have an SDS-resistant tetrameric structure consisting of two disulfide-linked dimers.

11.
Rigid-body docking approaches are not sufficient to predict the structure of a protein complex from the unbound (native) structures of the two proteins. Accounting for side chain flexibility is an important step towards fully flexible protein docking. This work describes an approach that allows conformational flexibility for the side chains while keeping the protein backbone rigid. Starting from candidates created by a rigid-docking algorithm, we demangle the side chains of the docking site, thus creating reasonable approximations of the true complex structure. These structures are ranked with respect to the binding free energy. We present two new techniques for side chain demangling. Both approaches are based on a discrete representation of the side chain conformational space by the use of a rotamer library. This leads to a combinatorial optimization problem. For the solution of this problem, we propose a fast heuristic approach and an exact, albeit slower, method that uses branch-and-cut techniques. As a test set, we use the unbound structures of three proteases and the corresponding protein inhibitors. For each of the examples, the highest-ranking conformation produced was a good approximation of the true complex structure.
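
The combinatorial optimization problem that arises from a rotamer library, as described in the abstract above, can be made concrete with a small sketch. It uses plain exhaustive enumeration, which is neither the fast heuristic nor the branch-and-cut method of the paper, and the energy tables are invented numbers used only to show the structure of the problem.

```python
from itertools import product


def best_rotamer_assignment(self_energy, pair_energy):
    """Pick one rotamer per residue so that the total energy is minimal.

    self_energy[i][r] is a hypothetical energy of rotamer r at residue i.
    pair_energy maps (i, r, j, s) with i < j to an interaction energy;
    missing entries count as 0. Returns (total_energy, rotamer_choice).
    """
    n = len(self_energy)
    best = None
    # Enumerate every combination; feasible only for very small problems.
    for choice in product(*(range(len(e)) for e in self_energy)):
        total = sum(self_energy[i][choice[i]] for i in range(n))
        total += sum(pair_energy.get((i, choice[i], j, choice[j]), 0.0)
                     for i in range(n) for j in range(i + 1, n))
        if best is None or total < best[0]:
            best = (total, choice)
    return best


# Two residues with two rotamers each; the individually best rotamers
# (0 for residue 0, 1 for residue 1) clash with each other.
self_energy = [[1.0, 2.0], [0.5, 0.0]]
pair_energy = {(0, 0, 1, 1): 5.0}
print(best_rotamer_assignment(self_energy, pair_energy))
```

The number of combinations grows exponentially with the number of residues, which is why heuristic or branch-and-cut approaches are needed for realistic docking sites.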

12.
MOTIVATION: Existing algorithms for automated protein structure alignment generate contradictory results and are difficult to interpret. An algorithm which can provide a context for interpreting the alignment and uses a simple method to characterize protein structure similarity is needed. RESULTS: We describe a heuristic for limiting the search space for structure alignment comparisons between two proteins, and an algorithm for finding minimal root-mean-squared-distance (RMSD) alignments as a function of the number of matching residue pairs within this limited search space. Our alignment algorithm uses coordinates of alpha-carbon atoms to represent each amino acid residue and requires a total computation time of O(m^3 n^2), where m and n denote the lengths of the protein sequences. This makes our method fast enough for comparisons of moderate-size proteins (fewer than approximately 800 residues) on current workstation-class computers and therefore addresses the need for a systematic analysis of multiple plausible shape similarities between two proteins using a widely accepted comparison metric.

13.
The amino acid sequences of various single- and two-chain lectins from the Leguminosae exhibit striking homologies, which indicate that these proteins have been conserved during evolution. Their predicted secondary structures appear to be very similar to that of Con A and, in addition, amino acids involved in the three major functional features of the Con A protomer (hydrophobic cavity, bivalent cation binding sites and carbohydrate binding site) are well conserved in other single- and two-chain lectins. It is assumed that Vicieae and Leguminosae lectins are, like Con A, three-domain proteins whose amino acid sequences have been slightly modified during evolution, thus appearing as good phylogenetic markers of speciation.

14.
An algorithm is presented to compute a multiple structure alignment for a set of proteins and to generate a consensus (pseudo) protein for the set. The algorithm is a heuristic in that it computes an approximation to the optimal multiple structure alignment that minimizes the sum of the pairwise distances between the protein structures. The algorithm chooses an input protein as the initial consensus and computes a correspondence between the protein structures (which are represented as sets of unit vectors) using an approach analogous to the center-star method for multiple sequence alignment. From this correspondence, a set of rotation matrices (optimal for the given correspondence) is derived to align the structures and derive the new consensus. The process is iterated until the sum of pairwise distances converges. The computation of the optimal rotations is itself an iterative process that both makes use of the current consensus and generates simultaneously a new one. This approach is based on an interesting result that allows the sum of all pairwise distances to be represented compactly as distances to the consensus. Experimental results on several protein families are presented, showing that the algorithm converges quite rapidly.

15.
Collectins are animal calcium-dependent lectins that target the carbohydrate structures on invading pathogens, resulting in the agglutination and enhanced clearance of the microorganism. These proteins form trimers that may assemble into larger oligomers. Each polypeptide chain consists of four regions: a relatively short N-terminal region, a collagen-like region, an alpha-helical coiled-coil, and the lectin domain. Only primary structure data are available for the N-terminal region, while the most important features of the collagen-like region can be derived from its homology with collagen. The structures of the alpha-helical coiled-coil and the lectin domain are known from crystallographic studies of mannan binding protein (MBP) and lung surfactant protein D (SP-D). Carbohydrate binding has been structurally characterized in several complexes between MBP and carbohydrate; all indicate that the major interaction between carbohydrate and collectin is the binding of two adjacent carbohydrate hydroxyl groups to a collectin calcium ion. In addition, these hydroxyl groups hydrogen bond to some of the calcium amino acid ligands. While each collectin trimer contains three such carbohydrate binding sites, deviation from the overall threefold symmetry has been demonstrated for SP-D, which may influence its binding properties. The protein surface between the three binding sites is positively charged in both MBP and SP-D.

16.
We present a new method for protein structure comparison that combines indexing and dynamic programming (DP). The method is based on simple geometric features of triplets of secondary structures of proteins. These features provide indexes to a hash table that allows fast retrieval of similarity information for a query protein. After the query protein is matched with all proteins in the hash table producing a list of putative similarities, the dynamic programming algorithm is used to align the query protein with each protein of this list. Since the pairwise comparison with DP is applied only to a small subset of proteins and, furthermore, DP re-uses information that is already computed and stored in the hash table, the approach is very fast even when searching the entire PDB. We have done extensive experimentation showing that our approach achieves results of quality comparable to that of other existing approaches but is generally faster.

17.
An examination of the binding sites of four carbohydrate binding proteins (Escherichia coli lactose repressor, E. coli arabinose-binding protein, yeast hexokinase A and Concanavalin A) revealed certain similarities of amino acid sequences and residues forming hydrogen bonds and hydrophobic interactions with the bound carbohydrate. These were: (i) Asx-Asx, hydrogen bonding to the pyranose ring oxygen and anomeric-OH group; (ii) Arg-X-X-X-(Ser/Thr), or the reverse sequence, with the Arg hydrogen bonding to the pyranose ring oxygen; (iii) Lys-(Ser/Thr)-X-X-Asp, or the reverse sequence and with interchange of the Lys-(Ser/Thr) positions, with hydrogen bonding of either or both the Lys and Asp residues to the -OH groups at carbons 2, 3, 4 or 6; (iv) a diaromatic sequence with possible hydrophobic interactions to the faces of the pyranose ring structure. An algorithm was devised to search the amino acid sequences of a large number of proteins, those known to bind carbohydrates as well as those without known carbohydrate-binding activities, for the four amino acid sequence criteria. The algorithm incorporated a weighted distance value (WDV) to assess the approximate distance between any two criteria, with the WDV being based on the predicted secondary structure of the protein amino acid sequence. When the algorithm using criteria 1 and 2 plus the WDV was applied to the sequences of 125 proteins, the method indicated the presence of the potential carbohydrate-binding site motif for 42% of proteins with known carbohydrate binding, only 8% of proteins were predicted as false positives, and the accuracy of the method was calculated to be 61.6%. (ABSTRACT TRUNCATED AT 250 WORDS)
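
Criteria (i) and (ii) from the abstract above translate naturally into sequence patterns. The sketch below searches a one-letter amino acid sequence for them with regular expressions; it does not implement the weighted distance value (WDV), the reading of Asx as Asp or Asn (D or N) is the standard convention, and the test sequence is made up.

```python
import re

# Lookahead patterns so that overlapping occurrences are all reported.
CRITERIA = {
    # (i) Asx-Asx, with Asx meaning Asp (D) or Asn (N)
    "asx_asx": re.compile(r"(?=([DN][DN]))"),
    # (ii) Arg-X-X-X-(Ser/Thr), or the reverse sequence
    "arg_x3_ser_thr": re.compile(r"(?=(R...[ST]|[ST]...R))"),
}


def find_motifs(sequence):
    """Return {criterion: [(start_index, matched_text), ...]} for a sequence.

    Indices are 0-based positions in the one-letter sequence string.
    """
    return {
        name: [(m.start(), m.group(1)) for m in pattern.finditer(sequence)]
        for name, pattern in CRITERIA.items()
    }


# Hypothetical sequence fragment used only to exercise the patterns.
print(find_motifs("MKDNARGGLSAT"))
```

In the published method, hits for different criteria were additionally combined through the WDV, which relies on predicted secondary structure and is outside the scope of this sketch.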

18.
The rapid growth in protein structural data and the emergence of structural genomics projects have increased the need for automatic structure analysis and tools for function prediction. Small molecule recognition is critical to the function of many proteins; therefore, determination of ligand binding site similarity is important for understanding ligand interactions and may allow their functional classification. Here, we present a binding sites database (SitesBase) that, given a known protein-ligand binding site, allows rapid retrieval of other binding sites with similar structure independent of overall sequence or fold similarity. However, each match is also annotated with sequence similarity and fold information to aid interpretation of structure and functional similarity. Similarity in ligand binding sites can indicate common binding modes and recognition of similar molecules, allowing potential inference of function for an uncharacterised protein or providing additional evidence of common function where sequence or fold similarity is already known. Alternatively, the resource can provide valuable information for detailed studies of molecular recognition including structure-based ligand design and in understanding ligand cross-reactivity. Here, we show examples of atomic similarity between superfamily or more distant fold relatives as well as between seemingly unrelated proteins. Assignment of unclassified proteins to structural superfamilies is also undertaken and in most cases substantiates assignments made using sequence similarity. Correct assignment is also possible where sequence similarity fails to find significant matches, illustrating the potential use of binding site comparisons for newly determined proteins.

19.
Designing new protein folds requires a method for simultaneously optimizing the conformation of the backbone and the side-chains. One approach to this problem is the use of a parameterized backbone, which allows the systematic exploration of families of structures. We report the crystal structure of RH3, a right-handed, three-helix coiled coil that was designed using a parameterized backbone and detailed modeling of core packing. This crystal structure was determined using another rationally designed feature, a metal-binding site that permitted experimental phasing of the X-ray data. RH3 adopted the intended fold, which has not been observed previously in biological proteins. Unanticipated structural asymmetry in the trimer was a principal source of variation within the RH3 structure. The sequence of RH3 differs from that of a previously characterized right-handed tetramer, RH4, at only one position in each 11 amino acid sequence repeat. This close similarity indicates that the design method is sensitive to the core packing interactions that specify the protein structure. Comparison of the structures of RH3 and RH4 indicates that both steric overlap and cavity formation provide strong driving forces for oligomer specificity.

20.