Similar articles
 20 similar articles found (search time: 31 ms)
1.
We present an efficient method for flexible comparison of protein structures, allowing swiveling motions. In all currently available methodologies developed and applied to the comparisons of protein structures, the molecules are considered to be rigid objects. The method described here extends and generalizes current approaches to searches for structural similarity between molecules by viewing proteins as objects consisting of rigid parts connected by rotary joints. During the matching, the rigid subparts are allowed to be rotated with respect to each other around swiveling points in one of the molecules. This technique straightforwardly detects structural motifs having hinge(s) between their domains. Whereas other existing methods detect hinge-bent motifs by initially finding the matching rigid parts and subsequently merging these together, our method automatically detects recurring substructures, allowing full three-dimensional rotations about their swiveling points. Yet the method is extremely fast, avoiding the time-consuming full conformational space search. Comparison of two protein structures, without a predefinition of the motif, takes only seconds to one minute on a workstation per hinge. Hence, the molecule can be scanned for many potential hinge sites, allowing practically all C(alpha) atoms to be tried as swiveling points. This algorithm provides a highly efficient, fully automated tool. Its complexity is only O(n^2), where n is the number of C(alpha) atoms in the compared molecules. As in our previous methodologies, the matching is independent of the order of the amino acids in the polypeptide chain. Here we illustrate the performance of this highly powerful tool on a large number of proteins exhibiting hinge-bending domain movements. Despite the motions, known hinge-bent domains/motifs that have been assembled and classified are correctly identified. Additional matches are detected as well. 
This approach has been motivated by a technique for model based recognition of articulated objects originating in computer vision and robotics.

2.
In this work, we present an algorithm developed to handle biomolecular structural recognition problems, as part of an interdisciplinary research endeavor of the Computer Vision and Molecular Biology fields. A key problem in rational drug design and in biomolecular structural recognition is the generation of binding modes between two molecules, also known as molecular docking. Geometrical fitness is a necessary condition for molecular interaction. Hence, docking a ligand (e.g., a drug molecule or a protein molecule) to a protein receptor (e.g., an enzyme) involves recognition of molecular surfaces. Conformational transitions by "hinge-bending" involve rotational movements of relatively rigid parts with respect to each other. The generation of docked binding modes between two associating molecules depends on their three-dimensional (3-D) structures and their conformational flexibility. In comparison to the particular case of rigid-body docking, the computational difficulty grows considerably when taking into account the additional degrees of freedom intrinsic to the flexible molecular docking problem. Previous docking techniques have enabled hinge movements only within small ligands. Partial flexibility in the receptor molecule is enabled by a few techniques. Hinge-bending motions of protein receptor domains are not addressed by these methods, although these types of transitions are significant, e.g., in enzyme activity. Our approach allows hinge-induced motions to exist in either the receptor or the ligand molecules of diverse sizes. We allow movements of domains, subdomains, or groups of atoms in either of the associating molecules. We achieve this by adapting a technique developed in Computer Vision and Robotics for the efficient recognition of partially occluded articulated objects. These types of objects consist of rigid parts which are connected by rotary joints (hinges). 
Our method is based on an extension and generalization of the Hough transform and the Geometric Hashing paradigms for rigid object recognition. We show experimental results obtained by the successful application of the algorithm to cases of bound and unbound molecular complexes, yielding fast matching times. While the "correct" molecular conformations of the known complexes are obtained with small RMS distances, additional, predictive good-fitting binding modes are generated as well. We conclude by discussing the algorithm's implications and extensions, as well as its application to investigations of protein structures in Molecular Biology and recognition problems in Computer Vision.

3.
The root mean square deviation (RMSD) and the least RMSD (lRMSD) are two widely used similarity measures in structural bioinformatics. Yet, they stem from global comparisons, possibly obliterating locally conserved motifs. We correct these limitations with the so-called combined RMSD, which mixes independent lRMSD measures, each computed with its own rigid motion. The combined RMSD is relevant in two main scenarios, namely to compare (quaternary) structures based on motifs defined from the sequence (domains and SSE) and to compare structures based on structural motifs yielded by local structural alignment methods. We illustrate the benefits of combined RMSD over the usual RMSD on three problems, namely (a) the assignment of quaternary structures for hemoglobin (scenario #1), (b) the calculation of structural phylogenies (case study: class II fusion proteins; scenario #1), and (c) the analysis of conformational changes based on combined RMSD of rigid structural motifs (case study: one class II fusion protein; scenario #2). Based on these illustrations, we argue that the combined RMSD is a tool of choice to perform positive and negative discrimination of degrees of freedom, with applications to the design of move sets and collective coordinates. Executables to compute combined RMSD are available within the Structural Bioinformatics Library ( http://sbl.inria.fr ).
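The mixing step described in this abstract can be sketched as follows. This is a minimal illustration, assuming that the combined RMSD is the atom-count-weighted quadratic mean of the per-motif lRMSD values; that weighting is an assumption made here, and the function name `combined_rmsd` is hypothetical rather than part of the Structural Bioinformatics Library.

```python
import math


def combined_rmsd(motifs):
    """Combine per-motif lRMSD values into one score.

    motifs: list of (n_atoms, lrmsd) pairs, one per rigid motif, where each
    lrmsd was computed with that motif's own optimal rigid motion.
    Assumption: each motif is weighted by its number of atoms.
    """
    total_atoms = sum(n for n, _ in motifs)
    if total_atoms == 0:
        raise ValueError("at least one motif with atoms is required")
    # Weighted mean of squared deviations, then square root.
    mean_square = sum(n * d * d for n, d in motifs) / total_atoms
    return math.sqrt(mean_square)
```

With this weighting, two motifs of equal size and equal lRMSD give back that same lRMSD, while a large, well-superposed motif keeps a small, poorly superposed one from dominating the score.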

4.
Fischer D. Proteins 2003, 51(3):434-441
To gain a better understanding of the biological role of proteins encoded in genome sequences, knowledge of their three-dimensional (3D) structure and function is required. The computational assignment of folds is becoming an increasingly important complement to experimental structure determination. In particular, fold-recognition methods aim to predict approximate 3D models for proteins bearing no sequence similarity to any protein of known structure. However, fully automated structure-prediction methods can currently produce reliable models for only a fraction of these sequences. Using a number of semiautomated procedures, human expert predictors are often able to produce more and better predictions than automated methods. We describe a novel, fully automatic, fold-recognition meta-predictor, named 3D-SHOTGUN, which incorporates some of the strategies human predictors have successfully applied. This new method is reminiscent of the so-called cooperative algorithms of Computer Vision. The inputs to 3D-SHOTGUN are the top models predicted by a number of independent fold-recognition servers. The meta-predictor consists of three steps: (i) assembly of hybrid models, (ii) confidence assignment, and (iii) selection. We have applied 3D-SHOTGUN to an unbiased test set of 77 newly released protein structures sharing no sequence similarity to proteins previously released. Forty-six correct rank-1 predictions were obtained, 30 of which had scores higher than that of the first incorrect prediction, a significant improvement over the performance of all individual servers. Furthermore, the predicted hybrid models were, on average, more similar to their corresponding native structures than those produced by the individual servers. This opens the possibility of generating more accurate, full-atom homology models for proteins with no sequence similarity to proteins of known structure. 
These improvements represent a step forward toward the wider applicability of fully automated structure-prediction methods at genome scales.

5.
Most proteins comprise several domains and/or participate in functional complexes. Owing to ongoing structural genomic projects, it is likely that it will soon be possible to predict, with reasonable accuracy, the conserved regions of most structural domains. Under these circumstances, it will be important to have methods, based on simple-to-acquire experimental data, that allow building and refining structures of multi-domain proteins or of protein complexes from homology models of the individual domains/proteins. It has been recently shown that small angle X-ray scattering (SAXS) and NMR residual dipolar coupling (RDC) data can be combined to determine the architecture of such objects when the X-ray structures of the domains are known and can be considered as rigid objects. We developed a simple genetic algorithm to achieve the same goal, but by using homology models of the domains considered as deformable objects. We applied it to two model systems, an S1KH bi-domain of the NusA protein and the γS-crystallin protein. Despite its simplicity, our algorithm is able to generate good solutions when driven by SAXS and RDC data.

6.
The labeling of proteins with stable isotopes enhances the NMR method for the determination of 3D protein structures in solution. Stereo-array isotope labeling (SAIL) provides an optimal stereospecific and regiospecific pattern of stable isotopes that yields sharpened lines, spectral simplification without loss of information, and the ability to collect rapidly and evaluate fully automatically the structural restraints required to solve a high-quality solution structure for proteins up to twice as large as those that can be analyzed using conventional methods. Here, we describe a protocol for the preparation of SAIL proteins by cell-free methods, including the preparation of S30 extract and their automated structure analysis using the FLYA algorithm and the program CYANA. Once efficient cell-free expression of the unlabeled or uniformly labeled target protein has been achieved, the NMR sample preparation of a SAIL protein can be accomplished in 3 d. A fully automated FLYA structure calculation can be completed in 1 d on a powerful computer system.

7.
Many of the targets of structural genomics will be proteins with little or no structural similarity to those currently in the database. Therefore, novel function prediction methods that do not rely on sequence or fold similarity to other known proteins are needed. We present an automated approach to predict nucleic-acid-binding (NA-binding) proteins, specifically DNA-binding proteins. The method is based on characterizing the structural and sequence properties of large, positively charged electrostatic patches on DNA-binding protein surfaces, which typically coincide with the DNA-binding sites. Using an ensemble of features extracted from these electrostatic patches, we predict DNA-binding proteins with high accuracy. We show that our method does not rely on sequence or structure homology and is capable of predicting proteins with novel binding motifs and protein structures solved in an unbound state. Our method can also distinguish NA-binding proteins from other proteins that have similar, large positive electrostatic patches on their surfaces, but that do not bind nucleic acids.

8.
Goyal K, Mande SC. Proteins 2008, 70(4):1206-1218
High-throughput structural genomics efforts have been making the structures of proteins available even before their function has been fully characterized. Therefore, methods that exploit the structural knowledge to provide evidence about the functions of proteins would be useful. Such methods would be needed to complement the sequence-based function annotation approaches. The current study describes generation of 3D-structural motifs for metal-binding sites from the known metalloproteins. It then scans all the available protein structures in the PDB database for putative metal-binding sites. Our analysis predicted more than 1000 novel metal-binding sites in proteins using three-residue templates, and more than 150 novel metal-binding sites using four-residue templates. Prediction of a metal-binding site in the yeast protein YDR533c led to the hypothesis that it might function as a metal-dependent amidopeptidase. The structural motifs identified by our method present novel metal-binding sites that reveal newer mechanisms for a few well-known proteins.

9.
We describe the application of a method geared toward structural and surface comparison of proteins. The method is based on the Geometric Hashing Paradigm adapted from Computer Vision. It allows for comparison of any two sets of 3-D coordinates, such as protein backbones, protein core or protein surface motifs, and small molecules such as drugs. Here we apply our method to 4 types of comparisons between pairs of molecules: (1) comparison of the backbones of two protein domains; (2) search for a predefined 3-D Cα motif within the full backbone of a domain; and in particular, (3) comparison of the surfaces of two receptor proteins; and (4) comparison of the surface of a receptor to the surface of a ligand. These aspects complement each other and can contribute toward a better understanding of protein structure and biomolecular recognition. Searches for 3-D surface motifs can be carried out on either receptors or on ligands. The latter may result in the detection of pharmacophoric patterns. If the surfaces of the binding sites of either the receptors or of the ligands are relatively similar, surface superpositioning may aid significantly in the docking problem. Currently, only distance invariants are used in the matching, although additional geometric surface invariants are considered. The speed of our Geometric Hashing algorithm is encouraging, with a typical surface comparison taking only seconds or minutes of CPU time on a SUN 4 SPARC workstation. The direct application of this method to the docking problem is also discussed. We demonstrate the success of this method in its application to two members of the globin family and to two dehydrogenases. © 1993 Wiley-Liss, Inc.
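The use of distance invariants as hash keys can be illustrated with a simplified sketch. It is not the authors' implementation: it assumes that each triplet of points is indexed by its three sorted, quantized side lengths (which do not change under rotation or translation), and it omits the reference-frame voting of full Geometric Hashing. All function names and the `bin_size` parameter are hypothetical.

```python
import math
from collections import defaultdict
from itertools import combinations


def triangle_key(p, q, r, bin_size=1.0):
    """Rigid-motion-invariant key: sorted side lengths, quantized into bins."""
    sides = sorted((math.dist(p, q), math.dist(q, r), math.dist(p, r)))
    return tuple(int(s // bin_size) for s in sides)


def build_table(points, bin_size=1.0):
    """Preprocessing: hash every point triplet of the model by its key."""
    table = defaultdict(list)
    for i, j, k in combinations(range(len(points)), 3):
        key = triangle_key(points[i], points[j], points[k], bin_size)
        table[key].append((i, j, k))
    return table


def matching_triplets(table, query, bin_size=1.0):
    """Recognition: look up every query triplet and collect candidate matches."""
    hits = []
    for i, j, k in combinations(range(len(query)), 3):
        key = triangle_key(query[i], query[j], query[k], bin_size)
        for model_triplet in table.get(key, ()):
            hits.append((model_triplet, (i, j, k)))
    return hits
```

Because side lengths just above and below a bin boundary land in different bins, a practical version would also probe neighboring bins and would verify candidate matches by superposition.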

10.
ABSTRACT: BACKGROUND: Searching for structural motifs across known protein structures can be useful for identifying unrelated proteins with similar function and characterising secondary structures such as beta-sheets. This is infeasible using conventional sequence alignment because linear protein sequences do not contain spatial information. beta-residue motifs are beta-sheet substructures that can be represented as graphs and queried using existing graph indexing methods, however, these approaches are designed for general graphs that do not incorporate the inherent structural constraints of beta-sheets and require computationally-expensive filtering and verification procedures. 3D substructure search methods, on the other hand, allow beta-residue motifs to be queried in a three-dimensional context but at significant computational costs. RESULTS: We developed a new method for querying beta-residue motifs, called BetaSearch, which leverages the natural planar constraints of beta-sheets by indexing them as 2D matrices, thus avoiding much of the computational complexities involved with structural and graph querying. BetaSearch demonstrates faster filtering, verification, and overall query time than existing graph indexing approaches whilst producing comparable index sizes. Compared to 3D substructure search methods, BetaSearch achieves 33 and 240 times speedups over index-based and pairwise alignment-based approaches, respectively. Furthermore, we have presented case-studies to demonstrate its capability of motif matching in sequentially dissimilar proteins and described a method for using BetaSearch to predict beta-strand pairing. CONCLUSIONS: We have demonstrated that BetaSearch is a fast method for querying substructure motifs. The improvements in speed over existing approaches make it useful for efficiently performing high-volume exploratory querying of possible protein substructural motifs or conformations. 
BetaSearch was used to identify a nearly identical beta-residue motif between an entirely synthetic (Top7) and a naturally-occurring protein (Charcot-Leyden crystal protein), as well as identifying structural similarities between biotin-binding domains of avidin, streptavidin and the lipocalin gamma subunit of human C8. AVAILABILITY: The web-interface, source code, and datasets for BetaSearch can be accessed from http://www.csse.unimelb.edu.au/~hohkhkh1/betasearch.

11.
Toward consistent assignment of structural domains in proteins   (total citations: 3; self-citations: 0; citations by others: 3)
The assignment of protein domains from three-dimensional structure is critically important in understanding protein evolution and function, yet little quality assurance has been performed. Here, the differences in the assignment of structural domains are evaluated using six common assignment methods. Three human expert methods (AUTHORS (authors' annotation), CATH and SCOP) and three fully automated methods (DALI, DomainParser and PDP) are investigated by analysis of individual methods against the authors' assignment as well as analysis based on the consensus among groups of methods (only expert, only automatic, combined). The results demonstrate that caution is recommended in using current domain assignments, and indicate where additional work is needed. Specifically, the major factors responsible for conflicting domain assignments between methods, both expert and automatic, are: (1) the definition of very small domains; (2) splitting secondary structures between domains; (3) the size and number of discontinuous domains; (4) closely packed or convoluted domain-domain interfaces; (5) structures with large and complex architectures; and (6) the level of significance placed upon structural, functional and evolutionary concepts in considering structural domain definitions. A web-based resource that focuses on the results of benchmarking and the analysis of domain assignments is available at

12.
There are several different families of repeat proteins. In each, a distinct structural motif is repeated in tandem to generate an elongated structure. The nonglobular, extended structures that result are particularly well suited to present a large surface area and to function as interaction domains. Many repeat proteins have been demonstrated experimentally to fold and function as independent domains. In tetratricopeptide (TPR) repeats, the repeat unit is a helix-turn-helix motif. The majority of TPR motifs occur as three to over 12 tandem repeats in different proteins. The majority of TPR structures in the Protein Data Bank are of isolated domains. Here we present the high-resolution structure of NlpI, the first structure of a complete TPR-containing protein. We show that in this instance the TPR motifs do not fold and function as an independent domain, but are fully integrated into the three-dimensional structure of a globular protein. The NlpI structure is also the first TPR structure from a prokaryote. It is of particular interest because it is a membrane-associated protein, and mutations in it alter septation and virulence.

13.
A few highly charged natural peptide sequences were recently suggested to form stable alpha-helical structures in water. In this article we show that these sequences represent a novel structural motif called "charged single alpha-helix" (CSAH). To obtain reliable candidate CSAH motifs, we developed two conceptually different computational methods capable of scanning large databases: SCAN4CSAH is based on sequence features characteristic of salt-bridge-stabilized single alpha-helices, whereas FT_CHARGE applies Fourier transformation to charges along sequences. Using the consensus of the two approaches, a remarkable number of proteins were found to contain putative CSAH domains. Recombinant fragments (50-60 residues) corresponding to selected hits obtained by both methods (myosin 6, Golgi resident protein GCP60, and M4K4 protein kinase) were produced and shown by circular dichroism spectroscopy to adopt largely alpha-helical structure in water. CSAH segments differ substantially both from coiled-coil and intrinsically disordered proteins, despite the fact that current prediction methods recognize them as either or both. Analysis of the proteins containing the CSAH motif revealed possible functional roles of the corresponding segments. The suggested main functional features include the formation of relatively rigid spacer/connector segments between functional domains as in caldesmon, extension of the lever arm in myosin motors and mediation of transient interactions by promoting dimerization in a range of proteins.
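The idea of applying a Fourier transform to charges along a sequence can be sketched as below. This only loosely illustrates the principle; the abstract does not describe the actual FT_CHARGE scoring, so the charge assignment (+1 for K and R, -1 for D and E, 0 otherwise) and all function names are assumptions made for this example.

```python
import cmath

# Assumed per-residue charges; histidine and termini are ignored here.
CHARGE = {"K": 1, "R": 1, "D": -1, "E": -1}


def charge_signal(seq):
    """Map a one-letter amino acid sequence to a list of charges."""
    return [CHARGE.get(aa, 0) for aa in seq.upper()]


def charge_spectrum(seq):
    """Magnitudes of the discrete Fourier transform for k = 0 .. n//2."""
    x = charge_signal(seq)
    n = len(x)
    return [
        abs(sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n)))
        for k in range(n // 2 + 1)
    ]


def dominant_period(seq):
    """Period (in residues) of the strongest non-constant charge component."""
    spec = charge_spectrum(seq)
    k = max(range(1, len(spec)), key=spec.__getitem__)
    return len(seq) / k
```

For example, a sequence built by repeating four E residues followed by four K residues alternates in charge with a period of eight residues, and `dominant_period` recovers that value; a sequence without any regular charge pattern shows no comparably sharp peak.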

14.
Different types of structures closed into cycles are widespread at all the levels of structural organization of proteins. β-Hairpins, triple-stranded β-sheets, and βαβ-units represent simple structural motifs closed into cycles by systems of hydrogen bonds. Secondary closing of these simple motifs into larger cycles by means of different superhelices, split β-hairpins, or SS-bridges results in formation of complex structural motifs such as abcd-units, φ-motifs, five- and seven-segment α/β-motifs, etc. At the level of tertiary structure many proteins and domains fold into structures closed into cylinders. Apparently, closing the motifs and domains into cycles and cylinders results in formation of more cooperative and stable structures as compared with open ones, and this may be the reason for high frequencies of occurrence of the motifs in proteins.

15.
Knowledge of three-dimensional structure is essential to understand the function of a protein. Although the overall fold is determined by the full details of its sequence, a small group of residues, often called structural motifs, plays a crucial role in determining the protein fold and its stability. Identification of such structural motifs requires a sufficient number of sequence and structural homologs to define conservation and evolutionary information. Unfortunately, many structures in the protein structure databases have no homologous structures or sequences. In this work, we report an SVM method, SMpred, to identify structural motifs from a single protein structure without using sequence and structural homologs. The SMpred method was trained and tested using 132 protein domains containing 581 motifs. SMpred achieved 78.79% accuracy with 79.06% sensitivity and 78.53% specificity. The performance of SMpred was evaluated with MegaMotifBase using 188 proteins containing 1161 motifs. Out of 1161 motifs, SMpred correctly identified 1503 structural motifs reported in MegaMotifBase. Further, we showed that SMpred is a useful approach for length-deviant superfamilies and single-member superfamilies. This result suggests the usefulness of our approach for facilitating the identification of structural motifs in protein structures in the absence of sequence and structural homologs. The dataset and executable for the SMpred algorithm are available at http://www3.ntu.edu.sg/home/EPNSugan/index_files/SMpred.htm.

16.
We present a comprehensive evaluation of a new structure mining method called PB-ALIGN. It is based on the encoding of protein structure as a 1D sequence of a combination of 16 short structural motifs or protein blocks (PBs). PBs are short motifs capable of representing most of the local structural features of a protein backbone. Using a derived PB substitution matrix and a simple dynamic programming algorithm, PB sequences are aligned the same way as amino acid sequences to yield a structure alignment. Alignment of these local features as a sequence of symbols enables fast detection of structural similarities between two proteins. The ability of the method to characterize and align regions beyond regular secondary structures, for example, N and C caps of helices and loops connecting regular structures, puts it a step ahead of existing methods, which strongly rely on secondary structure elements. PB-ALIGN achieved an efficiency of 85% in extracting the true fold from a large database of 7259 SCOP domains and was successful in 82% of cases in identifying true super-family members. In comparison with 13 existing structure comparison/mining methods, PB-ALIGN emerged as the best on the general ability test dataset and was on par with methods like YAKUSA and CE on the nontrivial test dataset. Furthermore, the proposed method performed well when compared to a flexible structure alignment method like FATCAT and outperformed it in processing speed (less than 45 s per database scan). This work also establishes a reliable cut-off value for the demarcation of similar folds. It finally shows that global alignment scores of unrelated structures using PBs follow an extreme value distribution. PB-ALIGN is freely available on a web server called Protein Block Expert (PBE) at http://bioinformatics.univ-reunion.fr/PBE/.
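Aligning PB sequences "the same way as amino acid sequences" amounts to standard global alignment by dynamic programming. The sketch below assumes a Needleman-Wunsch recurrence with a linear gap penalty; the toy scoring function stands in for the derived PB substitution matrix, which the abstract does not give, and the names `global_align_score` and `toy_score` are hypothetical.

```python
def global_align_score(s, t, score, gap=-2):
    """Optimal global alignment score of two PB strings (letters a to p).

    score(a, b) returns the substitution score for aligning blocks a and b;
    gap is the (assumed linear) penalty for aligning a block with a gap.
    """
    # prev[j] holds the best score for aligning s[:i-1] with t[:j].
    prev = [j * gap for j in range(len(t) + 1)]
    for i in range(1, len(s) + 1):
        cur = [i * gap]
        for j in range(1, len(t) + 1):
            cur.append(max(
                prev[j - 1] + score(s[i - 1], t[j - 1]),  # align two blocks
                prev[j] + gap,                            # gap in t
                cur[j - 1] + gap,                         # gap in s
            ))
        prev = cur
    return prev[-1]


def toy_score(a, b):
    """Placeholder substitution scores: +2 for identical blocks, -1 otherwise."""
    return 2 if a == b else -1
```

Only two rows of the dynamic programming table are kept, so memory grows with the length of one sequence; recovering the alignment itself, rather than just its score, would require keeping the full table or traceback pointers.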

17.
A method for simultaneous alignment of multiple protein structures   (total citations: 1; self-citations: 0; citations by others: 1)
Shatsky M, Nussinov R, Wolfson HJ. Proteins 2004, 56(1):143-156
Here, we present MultiProt, a fully automated, highly efficient technique to detect multiple structural alignments of protein structures. MultiProt finds the common geometrical cores between input molecules. To date, most methods for multiple alignment start from the pairwise alignment solutions. This may lead to a small overall alignment. In contrast, our method derives multiple alignments from simultaneous superpositions of input molecules. Further, our method does not require that all input molecules participate in the alignment. Actually, it efficiently detects high-scoring partial multiple alignments for all possible numbers of molecules in the input. To demonstrate the power of MultiProt, we provide a number of case studies. First, we demonstrate known multiple alignments of protein structures to illustrate the performance of MultiProt. Next, we present various biological applications. These include: (1) a partial alignment of hinge-bent domains; (2) identification of functional groups of G-proteins; (3) analysis of binding sites; and (4) protein-protein interface alignment. Some applications preserve the sequence order of the residues in the alignment, whereas others are order-independent. It is their residue sequence order-independence that allows application of MultiProt to derive multiple alignments of binding sites and of protein-protein interfaces, making MultiProt an extremely useful structural tool.

18.
The spectrin family of proteins represents a discrete group of cytoskeletal proteins comprising principally alpha-actinin, spectrin, dystrophin, and homologues and isoforms. They all share three main structural and functional motifs, namely, the spectrin repeat, EF-hands, and a CH domain-containing actin-binding domain. These proteins are variously involved in organisation of the actin cytoskeleton, membrane cytoskeleton architecture, cell adhesion, and contractile apparatus. The highly modular nature of these molecules has been a hindrance to the determination of their complete structures due to the inherent flexibility imparted on the proteins, but has also been an asset, inasmuch as the individual modules were of a size amenable to structural analysis by both crystallographic and NMR approaches. Representative structures of all the major domains shared by spectrin family proteins have now been solved at atomic resolution, including in some cases multiple domains from several family members. High-resolution structures, coupled with lower resolution methods to determine the overall molecular shape of these proteins, allow us for the first time to build complete atomic structures of the spectrin family of proteins.

19.
Eunsung Park, Julian Lee. Proteins 2015, 83(6):1054-1067
Many proteins undergo large-scale motions where relatively rigid domains move against each other. The identification of rigid domains, as well as the hinge residues important for their relative movements, is important for various applications including flexible docking simulations. In this work, we develop a method for protein rigid domain identification based on an exhaustive enumeration of maximal rigid domains, the rigid domains not fully contained within other domains. The computation is performed by mapping the problem to that of finding maximal cliques in a graph. A minimal set of rigid domains is then selected, which covers most of the protein with minimal overlap. In contrast to the results of existing methods that partition a protein into non-overlapping domains using approximate algorithms, the rigid domains obtained from exact enumeration naturally contain overlapping regions, which correspond to the hinges of the inter-domain bending motion. The performance of the algorithm is demonstrated on several proteins. Proteins 2015; 83:1054–1067. © 2015 Wiley Periodicals, Inc.
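The mapping to maximal cliques can be sketched as follows. The graph construction is an assumption made for illustration (two residues are joined when their mutual distance differs by at most `tol` between two conformations); the abstract does not state the authors' exact criterion. The enumeration uses the basic Bron-Kerbosch algorithm without pivoting, and all names are hypothetical.

```python
import math
from itertools import combinations


def rigidity_graph(conf_a, conf_b, tol=0.5):
    """Join residues i and j if their distance is conserved within tol."""
    adj = {i: set() for i in range(len(conf_a))}
    for i, j in combinations(range(len(conf_a)), 2):
        d_a = math.dist(conf_a[i], conf_a[j])
        d_b = math.dist(conf_b[i], conf_b[j])
        if abs(d_a - d_b) <= tol:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def maximal_cliques(adj):
    """Enumerate all maximal cliques (Bron-Kerbosch, no pivoting)."""
    cliques = []

    def expand(r, p, x):
        # r: current clique, p: candidates to add, x: already explored.
        if not p and not x:
            cliques.append(sorted(r))
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return sorted(cliques)
```

For a four-residue chain that bends by 90 degrees at residue 1, this yields the two rigid sets [0, 1] and [1, 2, 3]; the shared residue 1 is the hinge, matching the abstract's remark that overlaps between maximal rigid domains mark hinge regions.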

20.
An J., Wako H., Sarai A. Molecular Biology 2001, 35(6):905-910
An amino acid sequence pattern conserved among a family of proteins is called a motif. It is usually related to the specific function of the family. On the other hand, functions of proteins are realized through their 3D structures. Specific local structures, called structural motifs, are considered to be related to their functions. However, searching for common structural motifs in different proteins is much more difficult than searching for common sequence motifs. In this study, we attempt to convert the information about the structural motifs into a set of one-dimensional digital strings, i.e., a set of codes, to compare them more easily by computer and to investigate their relationship to functions more quantitatively. By applying the Delaunay tessellation to a 3D structure of a protein, we can assign each local structure to a unique code that is defined so as to reflect its structural features. Since a structural motif is defined as a set of the local structures in this paper, the structural motif is represented by a set of the codes. In order to examine the ability of the set of the codes to distinguish differences among the sets of local structures with a given PROSITE pattern that contain both true and false positives, we clustered them by introducing a similarity measure among the sets of the codes. The obtained clustering shows good agreement with other results obtained by direct structural comparison methods such as a superposition method. The structural motifs in homologous proteins are also properly clustered according to their sources. These results suggest that the structural motifs can be well characterized by these sets of the codes, and that the method can be utilized in comparing structural motifs and relating them to function.
