Similar literature
 Found 20 similar documents (search time: 15 ms)
1.
A method for simultaneous alignment of multiple protein structures   Total citations: 1, self-citations: 0, citations by others: 1
Shatsky M  Nussinov R  Wolfson HJ 《Proteins》2004,56(1):143-156
Here, we present MultiProt, a fully automated, highly efficient technique to detect multiple structural alignments of protein structures. MultiProt finds the common geometrical cores between input molecules. To date, most methods for multiple alignment start from the pairwise alignment solutions. This may lead to a small overall alignment. In contrast, our method derives multiple alignments from simultaneous superpositions of input molecules. Further, our method does not require that all input molecules participate in the alignment. In fact, it efficiently detects high-scoring partial multiple alignments for every possible number of input molecules. To demonstrate the power of MultiProt, we provide a number of case studies. First, we demonstrate known multiple alignments of protein structures to illustrate the performance of MultiProt. Next, we present various biological applications. These include: (1) a partial alignment of hinge-bent domains; (2) identification of functional groups of G-proteins; (3) analysis of binding sites; and (4) protein-protein interface alignment. Some applications preserve the sequence order of the residues in the alignment, whereas others are order-independent. It is this independence from residue sequence order that allows application of MultiProt to derive multiple alignments of binding sites and of protein-protein interfaces, making MultiProt an extremely useful structural tool.

2.
While a number of approaches have been geared toward multiple sequence alignments, to date there have been very few approaches to multiple structure alignment and detection of a recurring substructural motif. Among these, none performs both multiple structure comparison and motif detection simultaneously. Further, none considers all structures at the same time, rather than initiating from pairwise molecular comparisons. We present such a multiple structural alignment algorithm. Given an ensemble of protein structures, the algorithm automatically finds the largest common substructure (core) of C(alpha) atoms that appears in all the molecules in the ensemble. The detection of the core and the structural alignment are done simultaneously. Additional structural alignments are also obtained and are ranked by the sizes of the substructural motifs that are present in the entire ensemble. The method is based on the geometric hashing paradigm. As in our previous structural comparison algorithms, it compares the structures in an amino acid sequence order-independent way, and hence the resulting alignment is unaffected by insertions, deletions and protein chain directionality. As such, it can be applied to protein surfaces, protein-protein interfaces and protein cores to find the optimal and suboptimal spatially recurring substructural motifs. There is no predefinition of the motif. We describe the algorithm, demonstrating its efficiency. In particular, we present a range of results for several protein ensembles, with different folds and belonging to the same, or to different, families. Since the algorithm treats molecules as collections of points in three-dimensional space, it can also be applied to other molecules, such as RNA, or drugs.

3.
We report the largest and most comprehensive comparison of protein structural alignment methods. Specifically, we evaluate six publicly available structure alignment programs: SSAP, STRUCTAL, DALI, LSQMAN, CE and SSM by aligning all 8,581,970 protein structure pairs in a test set of 2930 protein domains specially selected from CATH v.2.4 to ensure sequence diversity. We consider an alignment good if it matches many residues, and the two substructures are geometrically similar. Even with this definition, evaluating structural alignment methods is not straightforward. At first, we compared the rates of true and false positives using receiver operating characteristic (ROC) curves with the CATH classification taken as a gold standard. This proved unsatisfactory in that the quality of the alignments is not taken into account: sometimes a method that finds less good alignments scores better than a method that finds better alignments. We correct this intrinsic limitation by using four different geometric match measures (SI, MI, SAS, and GSAS) to evaluate the quality of each structural alignment. With this improved analysis we show that there is a wide variation in the performance of different methods; the main reason for this is that it can be difficult to find a good structural alignment between two proteins even when such an alignment exists. We find that STRUCTAL and SSM perform best, followed by LSQMAN and CE. Our focus on the intrinsic quality of each alignment allows us to propose a new method, called "Best-of-All" that combines the best results of all methods. Many commonly used methods miss 10-50% of the good Best-of-All alignments. By putting existing structural alignments into proper perspective, our study allows better comparison of protein structures. By highlighting limitations of existing methods, it will spur the further development of better structural alignment methods. This will have significant biological implications now that structural comparison has come to play a central role in the analysis of experimental work on protein structure, protein function and protein evolution.

4.
Rai BK  Fiser A 《Proteins》2006,63(3):644-661
A major bottleneck in comparative protein structure modeling is the quality of the input alignment between the target sequence and the template structure. A number of alignment methods are available, but none of these techniques produces consistently good solutions for all cases. Alignments produced by alternative methods may be superior in certain segments but inferior in others when compared to each other; therefore, an accurate solution often requires an optimal combination of them. To address this problem, we have developed a new approach, Multiple Mapping Method (MMM). The algorithm first identifies the alternatively aligned regions from a set of input alignments. These alternatively aligned segments are scored using a composite scoring function, which determines their fitness within the structural environment of the template. The best scoring regions from a set of alternative segments are combined with the core part of the alignments to produce the final MMM alignment. The algorithm was tested on a dataset of 1400 protein pairs using 11 combinations of two to four alignment methods. In all cases MMM showed statistically significant improvement by reducing alignment errors in the range of 3 to 17%. MMM also compared favorably with two alignment meta-servers. The algorithm is computationally efficient; therefore, it is a suitable tool for genome scale modeling studies.

5.
Zhu J  Weng Z 《Proteins》2005,58(3):618-627
We present a novel algorithm named FAST for aligning protein three-dimensional structures. FAST uses a directionality-based scoring scheme to compare the intra-molecular residue-residue relationships in two structures. It employs an elimination heuristic to promote sparseness in the residue-pair graph and facilitate the detection of the global optimum. In order to test the overall accuracy of FAST, we determined its sensitivity and specificity with the SCOP classification (version 1.61) as the gold standard. FAST achieved higher sensitivities than several existing methods (DaliLite, CE, and K2) at all specificity levels. We also tested FAST against 1033 manually curated alignments in the HOMSTRAD database. The overall agreement was 96%. Close inspection of examples from broad structural classes indicated the high quality of FAST alignments. Moreover, FAST is an order of magnitude faster than other algorithms that attempt to establish residue-residue correspondence. Typical pairwise alignments take FAST less than a second on a Pentium III 1.2 GHz CPU. FAST software and a web server are available at http://biowulf.bu.edu/FAST/.

6.
Liu X  Zhao YP  Zheng WM 《Proteins》2008,71(2):728-736
CLEMAPS is a tool for multiple alignment of protein structures. It distinguishes itself from other existing algorithms for multiple structure alignment by the use of conformational letters, which are discretized 3D segmental structural states. A letter corresponds to a cluster of combinations of three angles formed by C(alpha) pseudobonds of four contiguous residues. A substitution matrix called CLESUM is available to measure the similarity between any two such letters. The input 3D structures are first converted to sequences of conformational letters. Each string of a fixed length is then taken as the center seed to search other sequences for neighbors of the seed, which are strings similar to the seed. A seed and its neighbors form a center-star, which corresponds to a fragment set of local structural similarity shared by many proteins. The detection of center-stars using CLESUM is extremely efficient. Local similarity is a necessary, but insufficient, condition for structural alignment. Once center-stars are found, the spatial consistency between any two stars is examined to find consistent star duads using atomic coordinates. Consistent duads are later joined to create a core for multiple alignment, which is further polished to produce the final alignment. The utility of CLEMAPS is tested on various protein structure ensembles.

7.
8.
Protein structure alignment methods are used for the detection of evolutionary and functionally related positions in proteins. A wide array of different methods are available, but the choice of the best method is often not apparent to the user. Several studies have assessed the alignment accuracy and consistency of structure alignment methods, but none of these explicitly considered membrane proteins, which are important targets for drug development and have distinct structural features. Here, we compared 13 widely used pairwise structural alignment methods on a test set of homologous membrane protein structures (called HOMEP3). Each pair of structures was aligned and the corresponding sequence alignment was used to construct homology models. The model accuracy compared to the known structures was assessed using scoring functions not incorporated in the tested structural alignment methods. The analysis shows that fragment-based approaches such as FR-TM-align are the most useful for aligning structures of membrane proteins. Moreover, fragment-based approaches are more suitable for comparison of protein structures that have undergone large conformational changes. Nevertheless, no method was clearly superior to all other methods. Additionally, all methods lack a measure to rate the reliability of a position within a structure alignment. To solve both of these problems, we propose a consensus-type approach, combining alignments from four different methods, namely FR-TM-align, DaliLite, MATT, and FATCAT. Agreement between the methods is used to assign confidence values to each position of the alignment. Overall, we conclude that there remains scope for the improvement of structural alignment methods for membrane proteins. Proteins 2015; 83:1720–1732. © 2015 Wiley Periodicals, Inc.
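
The consensus idea in the abstract above, using agreement between methods as a per-position confidence value, can be illustrated with a minimal sketch. It assumes each method's alignment is given as a list of (residue in structure A, residue in structure B) pairs; the function name, the data layout and the toy pairs are hypothetical and are not taken from the paper.

```python
from collections import Counter

def consensus_confidence(alignments):
    """Map each aligned residue pair to the fraction of methods that report it.

    `alignments` maps a method name to a list of (i, j) pairs, meaning residue i
    of structure A is aligned to residue j of structure B (hypothetical layout).
    """
    counts = Counter()
    for pairs in alignments.values():
        counts.update(set(pairs))  # count each pair at most once per method
    n_methods = len(alignments)
    return {pair: count / n_methods for pair, count in counts.items()}

# Toy input, invented for illustration only.
alignments = {
    "FR-TM-align": [(1, 1), (2, 2), (3, 4)],
    "DaliLite":    [(1, 1), (2, 2), (3, 3)],
    "MATT":        [(1, 1), (2, 2), (3, 4)],
    "FATCAT":      [(1, 1), (2, 3), (3, 4)],
}
confidence = consensus_confidence(alignments)
```

Pairs reported by all four methods receive a confidence of 1.0, while pairs reported by a single method receive 0.25; how the published approach weights or thresholds agreement is not stated in the abstract.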

9.
Analysis of protein structures based on backbone structural patterns known as structural alphabets has been shown to be very useful. Among them, a set of 16 pentapeptide structural motifs known as protein blocks (PBs) has been identified, upon which a backbone model of most protein structures can be built. PBs allow simplification of 3D space onto 1D space in the form of a sequence of PBs. Here, for the first time, substitution probabilities of PBs in a large number of aligned homologous protein structures have been studied and are expressed as a simplified 16 x 16 substitution matrix. The matrix was validated by benchmarking how well it can align sequences of PBs, much like amino acid alignment, to identify structurally equivalent regions in closely or distantly related proteins using a dynamic programming approach. The alignment results obtained are very comparable to those of well-established structure comparison methods like DALI and STAMP. Other interesting applications of the matrix have been investigated. We first show that, in variable regions between two superimposed homologous proteins, one can distinguish between local conformational differences and rigid-body displacement of a conserved motif by comparing the PBs and their substitution scores. Second, we demonstrate, with the example of aspartic proteinases, that PBs can be efficiently used to detect the lobe/domain flexibility in multidomain proteins. Lastly, using protein kinase as an example, we identify regions of conformational variations and rigid body movements in the enzyme as it changes from an inactive state to the active state.
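
Aligning sequences of structural letters with a substitution matrix and dynamic programming, as described above, can be sketched with a plain Needleman-Wunsch global alignment score. The three-letter alphabet, the score values and the linear gap penalty below are placeholders chosen for illustration; they are not the published 16 x 16 PB matrix or the authors' gap scheme.

```python
def global_align_score(seq_a, seq_b, subst, gap=-2):
    """Needleman-Wunsch global alignment score with a linear gap penalty.

    `subst` maps a pair of letters (x, y) to a substitution score.
    """
    rows, cols = len(seq_a) + 1, len(seq_b) + 1
    score = [[0] * cols for _ in range(rows)]
    # First column and first row: a prefix aligned entirely against gaps.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            diagonal = score[i - 1][j - 1] + subst[(seq_a[i - 1], seq_b[j - 1])]
            score[i][j] = max(diagonal,                # align the two letters
                              score[i - 1][j] + gap,   # gap in seq_b
                              score[i][j - 1] + gap)   # gap in seq_a
    return score[-1][-1]

# Hypothetical toy matrix: +3 for identical letters, -1 otherwise.
toy_subst = {(x, y): (3 if x == y else -1) for x in "abc" for y in "abc"}
identical = global_align_score("abc", "abc", toy_subst)
one_gap = global_align_score("abc", "ac", toy_subst)
```

With the toy matrix, two identical three-letter sequences score 9, and deleting the middle letter from one of them costs one match and one gap, giving 4. A traceback over the same table would recover the aligned positions themselves.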

10.
Mooney SD  Liang MH  DeConde R  Altman RB 《Proteins》2005,61(4):741-747
A primary challenge for structural genomics is the automated functional characterization of protein structures. We have developed a sequence-independent method called S-BLEST (Structure-Based Local Environment Search Tool) for the annotation of previously uncharacterized protein structures. S-BLEST encodes the local environment of an amino acid as a vector of structural property values. It has been applied to all amino acids in a nonredundant database of protein structures to generate a searchable structural resource. Given a query amino acid from an experimentally determined or modeled structure, S-BLEST quickly identifies similar amino acid environments using a K-nearest neighbor search. In addition, the method gives an estimation of the statistical significance of each result. We validated S-BLEST on X-ray crystal structures from the ASTRAL 40 nonredundant dataset. We then applied it to 86 crystallographically determined proteins in the Protein Data Bank (PDB) with unknown function and with no significant sequence neighbors in the PDB. S-BLEST was able to associate 20 proteins with at least one local structural neighbor and identify the amino acid environments that are most similar between those neighbors.
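
The K-nearest neighbor search over environment vectors mentioned above can be sketched as a brute-force scan. This is an illustration under assumptions: the Euclidean metric, the three-component vectors and the `env_*` identifiers are invented here, and the abstract does not say which distance, features or index structure S-BLEST actually uses.

```python
import heapq
import math

def k_nearest(query, database, k):
    """Return the k database entries closest to `query` as (name, distance) pairs.

    `database` maps an environment identifier to a feature vector of the same
    length as `query`. Euclidean distance is an assumption of this sketch.
    """
    scored = ((math.dist(query, vector), name) for name, vector in database.items())
    return [(name, distance) for distance, name in heapq.nsmallest(k, scored)]

# Hypothetical feature vectors, invented for illustration only.
database = {
    "env_1": (0.0, 0.0, 0.0),
    "env_2": (1.0, 0.0, 0.0),
    "env_3": (0.0, 2.0, 0.0),
    "env_4": (5.0, 5.0, 5.0),
}
hits = k_nearest((0.1, 0.0, 0.0), database, k=2)
```

For this query the two closest entries are `env_1` and `env_2`, in that order. A linear scan is adequate for a toy example; a search over every residue in a structure database would normally rely on a spatial index.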

11.
Structural comparison of the prokaryotic ribosomal proteins L7/L12 and L30   Total citations: 7, self-citations: 0, citations by others: 7
The structures of two prokaryotic ribosomal proteins, the carboxy-terminal half of L7/L12 from Escherichia coli (L12CTF) and L30 from Bacillus stearothermophilus, display a remarkably similar fold in which alpha-helices pack onto one side of an antiparallel, three-stranded, beta-pleated sheet. A detailed comparison of the structures by least-squares methods reveals that more than two-thirds of the alpha carbons can be superimposed with a root mean square distance of 2.33 Å. The principal difference is an extra alpha-helix in L12CTF. The sequences of the proteins display a distinct conservation in regions which are crucial to the common fold, in particular the hydrophobic core. It is proposed that the similarity is a result of divergent evolution.
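
The root mean square distance quoted above is computed over matched alpha-carbon pairs. The sketch below shows only that final step and assumes the two coordinate sets have already been superimposed; the least-squares fitting that finds the optimal rotation and translation is not shown, and the coordinates are made up for illustration.

```python
import math

def rmsd(coords_a, coords_b):
    """Root mean square distance between two equally long lists of 3D points.

    Assumes the point sets are already superimposed and listed in matching order.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(math.dist(p, q) ** 2 for p, q in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))

# Hypothetical coordinates: the second set is the first shifted by 1 along z.
set_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
set_b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
value = rmsd(set_a, set_b)
```

Every pair in the toy data is exactly one unit apart, so the result is 1.0 in whatever unit the coordinates use (ångströms for crystal structures).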

12.
Knowledge-based model building of proteins: concepts and examples.   Total citations: 8, self-citations: 6, citations by others: 2
We describe how to build protein models from structural templates. Methods to identify structural similarities between proteins in cases of significant, moderate to low, or virtually absent sequence similarity are discussed. The detection and evaluation of structural relationships is emphasized as a central aspect of protein modeling, distinct from the more technical aspects of model building. Computational techniques to generate and complement comparative protein models are also reviewed. Two examples, P-selectin and gp39, are presented to illustrate the derivation of protein model structures and their use in experimental studies.

13.
A novel method has been developed for acquiring the correct alignment of a query sequence against remotely homologous proteins by extracting structural information from profiles of multiple structure alignment. A systematic search algorithm combined with a group of score functions based on sequence information and structural information has been introduced in this procedure. A limited number of top solutions (15,000) with high scores were selected as candidates for further examination. On a test set comprising 301 proteins from 75 protein families with sequence identity less than 30%, the proportion of proteins with a completely correct alignment as first candidate was improved to 39.8% by our method, whereas the typical performance of existing sequence-based alignment methods was only between 16.1% and 22.7%. Furthermore, multiple candidates for possible alignment were provided in our approach, which dramatically increased the possibility of finding the correct alignment, such that completely correct alignments were found amongst the top-ranked 1000 candidates in 88.3% of the proteins. With the assistance of a sequence database, completely correct alignment solutions were achieved amongst the top 1000 candidates in 94.3% of the proteins. From such a limited number of candidates, it would become possible to identify more correct alignments using a more time-consuming but more powerful method with more detailed structural information, such as side-chain packing and energy minimization. The results indicate that the novel alignment strategy could be helpful for extending the application of highly reliable methods for fold identification and homology modeling to a huge number of homologous proteins of low sequence similarity. Details of the methods, together with the results and implications for future development, are presented.

14.
Rahat O  Yitzhaky A  Schreiber G 《Proteins》2008,71(2):621-630
Protein-protein interaction networks have come to be a buzzword associated with nets containing edges that represent a pair of interacting proteins (e.g. hormone-receptor, enzyme-inhibitor, antigen-antibody, and a subset of multichain biological machines). Yet, each such interaction composes its own unique network, in which vertices represent amino acid residues, and edges represent atomic contacts. Recent studies have shown that analyses of the data encapsulated in these detailed networks may impact predictions of structure-function correlation. Here, we study homologous families of protein-protein interfaces, which share the same fold but vary in sequence. In this context, we address what properties of the network are shared among relatives with different sequences (and hence different atomic interactions) and which are not. Herein, we develop the general mathematical framework needed to compare the modularity of homologous networks. We then apply this analysis to the structural data of a few interface families, including hemoglobin alpha-beta, growth hormone-receptor, and serine protease-inhibitor. Our results suggest that interface modularity is an evolutionarily conserved property. Hence, protein-protein interfaces can be clustered down to a few modules, with the boundaries being evolutionarily conserved along homologous complexes. This suggests that protein engineering of protein-protein binding sites may be simplified by varying each module, but retaining the overall modularity of the interface.

15.
G Vriend  C Sander 《Proteins》1991,11(1):52-58
We present a fully automatic algorithm for three-dimensional alignment of protein structures and for the detection of common substructures and structural repeats. Given two proteins, the algorithm first identifies all pairs of structurally similar fragments and subsequently clusters into larger units pairs of fragments that are compatible in three dimensions. The detection of similar substructures is independent of insertion/deletion penalties and can be chosen to be independent of the topology of loop connections and to allow for reversal of chain direction. Using distance geometry filters and other approximations, the algorithm, implemented in the WHAT IF program, is so fast that structural comparison of a single protein with the entire database of known protein structures can be performed routinely on a workstation. The method reproduces known non-trivial superpositions such as plastocyanin on azurin. In addition, we report surprising structural similarity between ubiquitin and a (2Fe-2S) ferredoxin.

16.
We have characterized the thermodynamic stability of the SH3 domain from the Saccharomyces cerevisiae Abp1p protein and found it to be relatively low compared to most other SH3 domains, with a Tm of 60 °C and a ΔGu of 3.08 kcal/mol. Analysis of a large alignment of SH3 domains led to the identification of atypical residues at eight positions in the wild-type Abp1p SH3 domain sequence that were subsequently replaced by the residue seen most frequently at that position in the alignment. Three of the eight mutants constructed in this way displayed increases in Tm ranging from 8 to 15 °C with concomitant increases in ΔGu of up to 1.4 kcal/mol. The effects of these substitutions on folding thermodynamics and kinetics were entirely additive, and a mutant containing all three was dramatically stabilized with a Tm greater than 90 °C and a ΔGu more than double that of the wild-type domain. The folding rate of this hyperstable mutant was 10-fold faster than wild-type, while its unfolding rate was fivefold slower. All of the stabilized mutants were still able to bind a target peptide with wild-type affinity. We have analyzed the stabilizing amino acid substitutions isolated in this study and several other similar sequence alignment based studies. In approximately 25% of cases, increased stability can be explained by enhanced propensity of the substituted residue for the local backbone conformation at the mutagenized site.

17.
Qi Y  Grishin NV 《Proteins》2005,58(2):376-388
Protein structure classification is necessary to comprehend the rapidly growing structural data for better understanding of protein evolution and sequence-structure-function relationships. Thioredoxins are important proteins that ubiquitously regulate cellular redox status and various other crucial functions. We define the thioredoxin-like fold using the structure consensus of thioredoxin homologs and consider all circular permutations of the fold. The search for thioredoxin-like fold proteins in the PDB identified 723 protein domains. These domains are grouped into eleven evolutionary families based on combined sequence, structural, and functional evidence. Analysis of the protein-ligand structure complexes reveals two major active site locations for the thioredoxin-like proteins. Comparison to existing structure classifications reveals that our thioredoxin-like fold group is broader and more inclusive, unifying proteins from five SCOP folds, five CATH topologies and seven DALI domain dictionary globular folding topologies. Considering these structurally similar domains together sheds new light on the relationships between sequence, structure, function and evolution of thioredoxins.

18.
19.
20.
We apply a simple method for aligning protein sequences on the basis of a 3D structure, on a large scale, to the proteins in the scop classification of fold families. This allows us to assess, understand, and improve our automatic method against an objective, manually derived standard, a type of comprehensive evaluation that has not yet been possible for other structural alignment algorithms. Our basic approach directly matches the backbones of two structures, using repeated cycles of dynamic programming and least-squares fitting to determine an alignment minimizing coordinate difference. Because of its simplicity, our method can be readily modified to take into account additional features of protein structure such as the orientation of side chains or the location-dependent cost of opening a gap. Our basic method, augmented by such modifications, can find reasonable alignments for all but 1.5% of the known structural similarities in scop, i.e., all but 32 of the 2,107 superfamily pairs. We discuss the specific protein structural features that make these 32 pairs so difficult to align and show how our procedure effectively partitions the relationships in scop into different categories, depending on what aspects of protein structure are involved (e.g., depending on whether or not consideration of side-chain orientation is necessary for proper alignment). We also show how our pairwise alignment procedure can be extended to generate a multiple alignment for a group of related structures. We have compared these alignments in detail with corresponding manual ones culled from the literature. We find good agreement (to within 95% for the core regions), and detailed comparison highlights how particular protein structural features (such as certain strands) are problematic to align, giving somewhat ambiguous results. With these improvements and systematic tests, our procedure should be useful for the development of scop and the future classification of protein folds.
