Similar literature
Found 20 similar articles (search time: 15 ms)
1.
The comparison between two protein structures is important for understanding molecular function. In particular, comparing protein surfaces to measure their similarity poses a further challenge that is useful for studying molecular evolution, docking, and drug design. This paper presents an algorithm, called the BetaSuperposer, which evaluates the similarity between the surfaces of two structures using the beta-shape, a geometric structure derived from the Voronoi diagram of the molecule. The algorithm performs iterations of mix-and-match between the beta-shapes of two structures to find the optimal superposition, from which a similarity measure is computed; each mix-and-match step attempts to solve an NP-hard problem. The devised heuristic algorithm, based on the assignment problem formulation, quickly produces a good superposition and an assessment of similarity. The BetaSuperposer was fully implemented and benchmarked against two popular programs, Dali and Click, using the SCOP models. The BetaSuperposer is freely available to the public from the Voronoi Diagram Research Center (http://voronoi.hanyang.ac.kr).

2.
How to compare the structures of an ensemble of protein conformations is a fundamental problem in structural biology. As has been previously observed, the widely used RMSD measure due to Kabsch, in which a rigid‐body superposition minimizing the least‐squares positional deviations is performed, has its drawbacks when comparing and visualizing a set of flexible protein structures. Here, we develop a method, fleximatch, of protein structure comparison that takes flexibility into account. Based on a distance matrix measure of flexibility, a weighted superposition of distance matrices rather than of atomic coordinates is performed. Subsequently, this allows a consistent determination of (a) a superposition of structures for visualization, (b) a partitioning of the protein structure into rigid molecular components (core atoms), and (c) an atomic mobility measure. The method is suitable for highlighting both particularly flexible and rigid parts of a protein from structures derived from NMR, X‐ray diffraction or molecular simulation. Proteins 2015; 83:820–826. © 2015 Wiley Periodicals, Inc.

3.
R B Russell, G J Barton. Proteins 1992, 14(2):309-323
An algorithm is presented for the accurate and rapid generation of multiple protein sequence alignments from tertiary structure comparisons. A preliminary multiple sequence alignment is performed using sequence information, which then determines an initial superposition of the structures. A structure comparison algorithm is applied to all pairs of proteins in the superimposed set and a similarity tree is calculated. Multiple sequence alignments are then generated by following the tree from the branches to the root. At each branchpoint of the tree, a structure-based sequence alignment and coordinate transformations are output, with the multiple alignment of all structures output at the root. The algorithm, encoded in STAMP (STructural Alignment of Multiple Proteins), is shown to give alignments in good agreement with published structural accounts within the dehydrogenase fold domains, globins, and serine proteinases. In order to reduce the need for visual verification, two similarity indices are introduced to determine the quality of each generated structural alignment. Sc quantifies the global structural similarity between pairs or groups of proteins, whereas Pij' provides a normalized measure of the confidence in the alignment of each residue. STAMP alignments have the quality of each alignment characterized by Sc and Pij' values and thus provide a reproducible resource for studies of residue conservation within structural motifs.

4.
Shah SB, Sahinidis NV. PloS one 2012, 7(5):e37493
Protein structure alignment is the problem of determining an assignment between the amino-acid residues of two given proteins in a way that maximizes a measure of similarity between the two superimposed protein structures. By identifying geometric similarities, structure alignment algorithms provide critical insights into protein functional similarities. Existing structure alignment tools adopt a two-stage approach to structure alignment by decoupling and iterating between the assignment evaluation and structure superposition problems. We introduce a novel approach, SAS-Pro, which addresses the assignment evaluation and structure superposition simultaneously by formulating the alignment problem as a single bilevel optimization problem. The new formulation does not require the sequentiality constraints, thus generalizing the scope of the alignment methodology to include non-sequential protein alignments. We employ derivative-free optimization methodologies to search for the global optimum of the highly nonlinear and non-differentiable RMSD function encountered in the proposed model. Alignments obtained with SAS-Pro have better RMSD values and larger lengths than those obtained from other alignment tools. For non-sequential alignment problems, SAS-Pro leads to alignments with a high degree of similarity to known reference alignments. The source code of SAS-Pro is available for download at http://eudoxus.cheme.cmu.edu/saspro/SAS-Pro.html.

5.
Geometric objects are often represented approximately in terms of a finite set of points in three-dimensional Euclidean space. In this paper, we extend this representation to what we call labeled point clouds. A labeled point cloud is a finite set of points, where each point is associated not only with a position in three-dimensional space, but also with a discrete class label that represents a specific property. This type of model is especially suitable for modeling biomolecules such as proteins and protein binding sites, where a label may represent an atom type or a physico-chemical property. Proceeding from this representation, we address the question of how to compare two labeled point clouds in terms of their similarity. Using fuzzy modeling techniques, we develop a suitable similarity measure as well as an efficient evolutionary algorithm to compute it. Moreover, we consider the problem of establishing an alignment of the structures in the sense of a one-to-one correspondence between their basic constituents. From a biological point of view, alignments of this kind are of great interest, since mutually corresponding molecular constituents offer important information about evolution and heredity, and can also serve as a means to explain a degree of similarity. In this paper, we therefore develop a method for computing pairwise or multiple alignments of labeled point clouds. To this end, we proceed from an optimal superposition of the corresponding point clouds and construct an alignment that agrees as closely as possible with the neighborhood structure established by this superposition. We apply our methods to the structural analysis of protein binding sites.

6.
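The main problem in superposing three-dimensional protein structures is that some amino acid residues of the target proteins may be missing, yet most multiple-structure superposition methods require complete amino acid sequences. The common practice is to delete the missing residues outright, which makes the superposition inaccurate. Because homologous proteins have similar structures, a region missing from one protein structure may be present in another, homologous structure. On this basis, this paper proposes a new, simple, and effective method for protein structure superposition with missing data (ITEMDM). The method computes the superposition iteratively in the presence of missing data, using an optimized least-squares algorithm combined with the singular value decomposition (SVD) of a matrix to obtain the rotation matrix and translation vector. The method successfully superposed proteins of the cytochrome C family and proteins from the standard Fischer's database (67 protein pairs), and was compared with other methods. Numerical experiments show that the algorithm has the following advantages: (1) compared with the THESEUS algorithm, it runs faster and needs fewer iterations; (2) compared with the PSSM algorithm, its results are more accurate and its running time is shorter. The results indicate that the method superposes three-dimensional protein structures with missing data more effectively.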
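The least-squares step mentioned in this abstract, obtaining a rotation matrix and translation vector through an SVD, can be illustrated with the classical Kabsch procedure. The following is only a minimal sketch for two complete, equally sized coordinate sets; it assumes NumPy is available, the function name `kabsch_superpose` is hypothetical, and it does not reproduce the iterative missing-data handling of ITEMDM.

```python
import numpy as np

def kabsch_superpose(P, Q):
    """Return (R, t, rmsd) so that P @ R.T + t fits Q in the least-squares sense."""
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    p_mean = P.mean(axis=0)
    q_mean = Q.mean(axis=0)
    # Cross-covariance of the centered coordinate sets (3x3).
    H = (P - p_mean).T @ (Q - q_mean)
    U, _, Vt = np.linalg.svd(H)
    # Guard against an improper rotation (reflection).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = q_mean - R @ p_mean
    fitted = P @ R.T + t
    rmsd = float(np.sqrt(np.mean(np.sum((fitted - Q) ** 2, axis=1))))
    return R, t, rmsd
```

Applied to a structure and a rigidly rotated and shifted copy of it, the sketch should recover the rotation and translation and report an RMSD close to zero.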

7.
Large-scale genome sequencing and structural genomics projects generate numerous sequences and structures for 'hypothetical' proteins without functional characterizations. Detection of homology to experimentally characterized proteins can provide functional clues, but the accuracy of homology-based predictions is limited by the paucity of tools for quantitative comparison of the diverging residues responsible for functional divergence. SURF'S UP! is a web server for the analysis of functional relationships in protein families, as inferred from a comparison of protein surface maps. It assigns a numerical score to the similarity between patterns of physicochemical features (charge, hydrophobicity) on the compared protein surfaces. It allows recognizing clusters of proteins that have similar surfaces, hence presumably similar functions. The server takes as input a set of protein coordinates and returns files with "spherical coordinates" of proteins in PDB format and their graphical presentation, a matrix with values of mutual similarities between the surfaces, and the unrooted tree that represents the clustering of similar surfaces, calculated by the neighbor-joining method. SURF'S UP! facilitates the comparative analysis of physicochemical features of the surface, which are the key determinants of protein function. By concentrating on coarse surface features, SURF'S UP! can work with models obtained from comparative modelling. Although it is designed to analyse the conservation among homologs, it can also be used to compare surfaces of non-homologous proteins with different three-dimensional folds, as long as a functionally meaningful structural superposition is supplied by the user. Another valuable characteristic of our method is the lack of initial assumptions about the functional features to be compared. SURF'S UP! is freely available for academic researchers at http://asia.genesilico.pl/surfs_up/.

8.
MOTIVATION: Geometric representations of proteins and ligands, including atom volumes, atom-atom contacts and solvent accessible surfaces, can be used to characterize interactions between and within proteins, ligands and solvent. Voronoi algorithms permit quantification of these properties by dividing structures into cells with a one-to-one correspondence with constituent atoms. As there is no generally accepted measure of atom-atom contacts, a continuous analytical representation of inter-atomic contacts will be useful. Improved geometric algorithms will also be helpful in increasing the speed and accuracy of iterative modeling algorithms. RESULTS: We present computational methods based on the Voronoi procedure that provide rapid and exact solutions to solvent accessible surfaces, volumes, and atom contacts within macromolecules. Furthermore, we define a measure of atom-atom contact that is consistent with the calculation of solvent accessible surfaces, allowing the integration of solvent accessibility and inter-atomic contacts into a continuous measure. The speed and accuracy of the algorithm are compared to existing methods for calculating solvent accessible surfaces and volumes. The presented algorithm has a reduced execution time and greater accuracy compared to numerical and approximate analytical surface calculation algorithms, and a reduced execution time and similar accuracy compared to existing Voronoi procedures for calculating atomic surfaces and volumes.

9.
Zhao N, Pang B, Shyu CR, Korkin D. PloS one 2011, 6(5):e19554
Interactions between proteins play a key role in many cellular processes. Studying protein-protein interactions that share similar interaction interfaces may shed light on their evolution and could be helpful in elucidating the mechanisms behind stability and dynamics of the protein complexes. When two complexes share structurally similar subunits, the similarity of the interaction interfaces can be found through a structural superposition of the subunits. However, an accurate detection of similarity between the protein complexes containing subunits of unrelated structure remains an open problem. Here, we present an alignment-free machine learning approach to measure interface similarity. The approach relies on the feature-based representation of protein interfaces and does not depend on the superposition of the interacting subunit pairs. Specifically, we develop an SVM classifier of similar and dissimilar interfaces and derive a feature-based interface similarity measure. Next, the similarity measure is applied to a set of 2,806×2,806 binary complex pairs to build a hierarchical classification of protein-protein interactions. Finally, we explore case studies of similar interfaces from each level of the hierarchy, considering cases when the subunits forming interactions are either homologous or structurally unrelated. The analysis has suggested that the positions of charged residues in the homologous interfaces are not necessarily conserved and may exhibit more complex conservation patterns.

10.
The problem of finding an optimal structural alignment for a pair of superimposed proteins is often amenable to the Smith-Waterman dynamic programming algorithm, which runs in time proportional to the product of lengths of the sequences being aligned. While the quadratic running time is acceptable for computing a single alignment of two fixed protein structures, the time complexity becomes a bottleneck when running the Smith-Waterman routine multiple times in order to find a globally optimal superposition and alignment of the input proteins. We present a subquadratic running time algorithm capable of computing an alignment that optimizes one of the most widely used measures of protein structure similarity, defined as the number of pairs of residues in two proteins that can be superimposed under a predefined distance cutoff. The algorithm presented in this article can be used to significantly improve the speed-accuracy tradeoff in a number of popular protein structure alignment methods.
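For orientation, the similarity measure described in this abstract (the number of residue pairs that lie within a distance cutoff after superposition) can be optimized over sequential alignments with a plain quadratic dynamic program. The sketch below is that quadratic baseline only, not the subquadratic algorithm the abstract announces; the function name and the toy coordinates are assumptions made for illustration.

```python
import math

def max_pairs_within_cutoff(a, b, cutoff):
    """Largest number of order-preserving residue pairs (a[i], b[j]) within `cutoff`.

    `a` and `b` are lists of 3D coordinates of two already superimposed chains.
    Runs in O(len(a) * len(b)) time, i.e. the quadratic baseline.
    """
    n, m = len(a), len(b)
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            close = math.dist(a[i - 1], b[j - 1]) <= cutoff
            dp[i][j] = max(
                dp[i - 1][j],                             # leave a[i-1] unaligned
                dp[i][j - 1],                             # leave b[j-1] unaligned
                dp[i - 1][j - 1] + (1 if close else 0),   # pair a[i-1] with b[j-1]
            )
    return dp[n][m]
```

Running this routine once per candidate superposition is what makes the quadratic cost a bottleneck in the setting the abstract describes.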

11.
Protein structural alignment for detection of maximally conserved regions (cited by 3: 0 self-citations, 3 by others)
An algorithm for comparison of homologous protein structures and for study of conformational changes in proteins has been developed. The method is based on identification of pieces of the two molecules that have similar shapes, as determined by the local conformation of the polypeptide chain. Pieces that superpose within a specified tolerance are assembled into domains based on similar transformations for superposition. The result is sets of pieces that represent conserved structural elements and conserved spatial relationships between structural elements within the proteins being compared. A similarity criterion based on maximum distance rather than on root mean square deviation reduces bias by outliers. The utility of the method is demonstrated by using examples from the protein kinase family.

12.

Background

Conventional superposition methods use an ordinary least squares (LS) fit for structural comparison of two different conformations of the same protein. The main problem with the LS fit is that it is sensitive to outliers, i.e. large displacements between the superimposed structures.

Results

To overcome this problem, we present a new algorithm to overlap two protein conformations by their atomic coordinates using a robust statistics technique: least median of squares (LMS). In order to effectively approximate the LMS optimization, the forward search technique is utilized. Our algorithm can automatically detect and superimpose the rigid core regions of two conformations with small or large displacements. In contrast, most existing superposition techniques strongly depend on the initial LS estimate for the entire atom sets of the proteins. They may fail on structural superposition of two conformations with large displacements. The presented LMS fit can be considered as an alternative and complementary tool for structural superposition.

Conclusion

The proposed algorithm is robust and does not require any prior knowledge of the flexible regions. Furthermore, we show that the LMS fit can be extended to multiple level superposition between two conformations with several rigid domains. Our fit tool has produced successful superpositions when applied to proteins for which two conformations are known. The binary executable program for Windows platform, tested examples, and database are available from https://engineering.purdue.edu/PRECISE/LMSfit.
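The contrast drawn above between the least-squares fit and the least median of squares can be made concrete by evaluating both criteria on the same pair of already superimposed coordinate sets. This is a minimal sketch of the two objective functions only; it does not implement the forward search used to approximate the LMS optimum, and the function names are hypothetical.

```python
from statistics import mean, median

def squared_residuals(a, b):
    """Squared distance between each pair of corresponding atoms."""
    return [sum((x - y) ** 2 for x, y in zip(p, q)) for p, q in zip(a, b)]

def ls_criterion(a, b):
    """Least-squares objective: mean squared deviation (dominated by outliers)."""
    return mean(squared_residuals(a, b))

def lms_criterion(a, b):
    """Least-median-of-squares objective: median squared deviation (robust)."""
    return median(squared_residuals(a, b))

# Four atoms of a rigid core that nearly coincide, plus one atom displaced by 10 units.
core_a = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0), (4, 0, 0)]
core_b = [(0.1, 0, 0), (1.2, 0, 0), (2, 0, 0), (3, 0.1, 0), (4, 10, 0)]
```

In this toy case the single displaced atom inflates the LS value to about 20, while the LMS value stays near 0.01, which is the robustness property the abstract relies on.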

13.
Evaluation and improvements in the automatic alignment of protein sequences (cited by 6: 0 self-citations, 6 by others)
The accuracy of protein sequence alignment obtained by applying a commonly used global sequence comparison algorithm is assessed. Alignments based on the superposition of the three-dimensional structures are used as a standard for testing the automatic, sequence-based methods. Alignments obtained from the global comparison of five pairs of homologous protein sequences studied gave 54% agreement overall for residues in secondary structures. Including information about the secondary structure of one of the proteins, in order to limit the number of gaps inserted in regions of secondary structure, improved this figure to 68%. A similarity score of greater than six standard deviation units suggests that an alignment which is greater than 75% correct within secondary structural regions can be obtained automatically for the pair of sequences.

14.
Inferring protein interactions from phylogenetic distance matrices (cited by 2: 0 self-citations, 2 by others)
Finding the interacting pairs of proteins between two different protein families whose members are known to interact is an important problem in molecular biology. We developed and tested an algorithm that finds optimal matches between two families of proteins by comparing their distance matrices. A distance matrix provides a measure of the sequence similarity of proteins within a family. Since the protein sets of interest may have dozens of proteins each, the use of an efficient approximate solution is necessary. Therefore, the approach we have developed consists of a Metropolis Monte Carlo optimization algorithm which explores the search space of possible matches between two distance matrices. We demonstrate that by using this algorithm we are able to accurately match chemokines and chemokine receptors as well as the TGF-beta family of ligands and their receptors.
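A Metropolis Monte Carlo search over matchings between two distance matrices, as outlined in this abstract, might look like the sketch below. The cost function (sum of squared differences between matched entries), the swap move, the fixed temperature, and all names are assumptions chosen for illustration; the abstract does not specify these details.

```python
import math
import random

def mismatch(d1, d2, perm):
    """Sum of squared differences between d1[i][j] and the matched entry of d2."""
    n = len(perm)
    return sum(
        (d1[i][j] - d2[perm[i]][perm[j]]) ** 2
        for i in range(n)
        for j in range(i + 1, n)
    )

def metropolis_match(d1, d2, steps=20000, temperature=50.0, seed=0):
    """Search for a permutation matching rows of d1 to rows of d2 (equal sizes)."""
    rng = random.Random(seed)
    n = len(d1)
    perm = list(range(n))
    rng.shuffle(perm)
    cost = mismatch(d1, d2, perm)
    best_perm, best_cost = perm[:], cost
    for _ in range(steps):
        i, j = rng.sample(range(n), 2)
        perm[i], perm[j] = perm[j], perm[i]          # propose swapping two matches
        new_cost = mismatch(d1, d2, perm)
        accept = new_cost <= cost or rng.random() < math.exp((cost - new_cost) / temperature)
        if accept:
            cost = new_cost
            if cost < best_cost:
                best_perm, best_cost = perm[:], cost
        else:
            perm[i], perm[j] = perm[j], perm[i]      # reject: undo the swap
    return best_perm, best_cost
```

The temperature controls how often worse matchings are accepted; a real application would likely tune it, or anneal it, to the scale of the distance values.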

15.
MOTIVATION: Protein structure comparison is a fundamental problem in structural biology and bioinformatics. Two-dimensional maps of distances between residues in the structure contain sufficient information to restore the 3D representation, while maps of contacts reveal characteristic patterns of interactions between secondary and super-secondary structures and are very attractive for visual analysis. The overlap of 2D maps of two structures can be easily calculated, providing a sensitive measure of protein structure similarity. PROTMAP2D is a software tool for calculation of contact and distance maps based on user-defined criteria, quantitative comparison of pairs or series of contact maps (e.g. alternative models of the same protein, model versus native structure, different trajectories from molecular dynamics simulations, etc.) and visualization of the results. AVAILABILITY: PROTMAP2D for Windows / Linux / MacOSX is freely available for academic users from http://genesilico.pl/protmap2d.htm
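To illustrate the idea of contact maps and their overlap, the sketch below builds a contact map from one representative coordinate per residue and compares two maps. The 8.0 distance cutoff, the minimum sequence separation, and the use of the Jaccard index as the overlap score are illustrative assumptions; they are not taken from PROTMAP2D.

```python
import math

def contact_map(coords, cutoff=8.0, min_separation=3):
    """Set of residue index pairs (i, j), j >= i + min_separation, within `cutoff`."""
    n = len(coords)
    return {
        (i, j)
        for i in range(n)
        for j in range(i + min_separation, n)
        if math.dist(coords[i], coords[j]) <= cutoff
    }

def contact_overlap(map_a, map_b):
    """Jaccard index of two contact maps: shared contacts over all contacts."""
    union = map_a | map_b
    return len(map_a & map_b) / len(union) if union else 1.0

# A toy hairpin and a variant in which the last residue has moved away.
hairpin = [(0, 0, 0), (4, 0, 0), (8, 0, 0), (12, 0, 0),
           (12, 4, 0), (8, 4, 0), (4, 4, 0), (0, 4, 0)]
variant = hairpin[:-1] + [(0, 10, 0)]
```

Because both maps are indexed by residue number, this comparison needs no superposition, which is what makes it convenient for comparing alternative models of the same protein.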

16.
Protein structure analysis is a very important research topic in the molecular biology of the post-genomic era. The root mean square deviation (RMSD) is the most frequently used measure for comparing two protein three-dimensional (3-D) structures. In this paper, we deal with two fundamental problems related to the RMSD. We first deal with a problem called the "range RMSD query" problem. Given an aligned pair of structures, the problem is to compute the RMSD between two aligned substructures of them without gaps. This problem has many applications in protein structure analysis. We propose a linear-time preprocessing algorithm that enables constant-time RMSD computation. Next, we consider a problem called the "substructure RMSD query" problem, which is a generalization of the above range RMSD query problem. It is the problem of computing the RMSD between any substructures of two unaligned structures without gaps. Based on the algorithm for the range RMSD problem, we propose an O(nm) preprocessing algorithm that enables constant-time RMSD computation, where n and m are the lengths of the given structures. Moreover, we propose an O(nm log r / r)-time and O(nm/r)-space preprocessing algorithm that enables O(r) query, where r is an arbitrary integer such that 1 ≤ r ≤ min(n, m). We also show that our strategy works for another measure called the unit-vector root mean square deviation (URMSD), which is a variant of the RMSD.
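One way to obtain constant-time range RMSD queries after linear-time preprocessing is to store prefix sums of the coordinates, their squared norms, and their outer products, and then apply the Kabsch formula to the resulting 3x3 covariance matrix. The sketch below follows that idea under the assumption that NumPy is available; the class name is hypothetical and the abstract does not confirm that this is exactly the authors' construction.

```python
import numpy as np

class RangeRMSD:
    """Prefix sums over two aligned structures X, Y (each n x 3)."""

    def __init__(self, X, Y):
        X = np.asarray(X, dtype=float)
        Y = np.asarray(Y, dtype=float)
        self.sx = np.vstack([np.zeros((1, 3)), np.cumsum(X, axis=0)])
        self.sy = np.vstack([np.zeros((1, 3)), np.cumsum(Y, axis=0)])
        self.sxx = np.concatenate([[0.0], np.cumsum((X ** 2).sum(axis=1))])
        self.syy = np.concatenate([[0.0], np.cumsum((Y ** 2).sum(axis=1))])
        outer = np.einsum("ni,nj->nij", X, Y)   # per-residue x_i y_i^T
        self.sxy = np.concatenate([np.zeros((1, 3, 3)), np.cumsum(outer, axis=0)])

    def query(self, i, j):
        """RMSD after optimal superposition of residues i..j-1 (half-open range)."""
        L = j - i
        sx = self.sx[j] - self.sx[i]
        sy = self.sy[j] - self.sy[i]
        # Centered covariance and centered sum of squared norms.
        C = (self.sxy[j] - self.sxy[i]) - np.outer(sx, sy) / L
        G = (self.sxx[j] - self.sxx[i] - sx @ sx / L) + (self.syy[j] - self.syy[i] - sy @ sy / L)
        s = np.linalg.svd(C, compute_uv=False)
        d = 1.0 if np.linalg.det(C) >= 0 else -1.0   # exclude reflections
        e = G - 2.0 * (s[0] + s[1] + d * s[2])
        return float(np.sqrt(max(e, 0.0) / L))
```

Each query touches only a fixed number of 3-vectors and 3x3 matrices, so its cost does not depend on the length of the queried range.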

17.
The identification of protein biochemical functions based on their three-dimensional structures is strongly required in the post-genome-sequencing era. We have developed a new method to identify and predict protein biochemical functions using the similarity information of molecular surface geometries and electrostatic potentials on the surfaces. Our prediction system consists of a similarity search method based on a clique search algorithm and the molecular surface database eF-site (electrostatic surface of functional-site in proteins). Using this system, functional sites similar to those of phosphoenolpyruvate carboxykinase were detected in several mononucleotide-binding proteins, which have different folds. We also applied our method to a hypothetical protein, MJ0226 from Methanococcus jannaschii, and detected the mononucleotide binding site from the similarity to other proteins having different folds.

18.
19.
Fast, efficient, and reliable algorithms for pairwise alignment of protein structures are in ever-increasing demand for analyzing the rapidly growing data on protein structures. CLePAPS is a tool developed for this purpose. It distinguishes itself from other existing algorithms by the use of conformational letters, which are discretized 3D segmental structural states. A letter corresponds to a cluster of combinations of the three angles formed by Cα pseudobonds of four contiguous residues. A substitution matrix called CLESUM is available to measure the similarity between any two such letters. CLePAPS regards an aligned fragment pair (AFP) as an ungapped string pair with a high sum of pairwise CLESUM scores. Using CLESUM scores as the similarity measure, CLePAPS searches for AFPs by simple string comparison. The transformation which best superimposes a highly similar AFP can be used to superimpose the structure pairs under comparison. A high-scoring AFP which is consistent with several other AFPs determines an initial alignment. CLePAPS then joins consistent AFPs, guided by their similarity scores, to extend the alignment by several "zoom-in" iteration steps. A follow-up refinement produces the final alignment. CLePAPS does not implement dynamic programming. The utility of CLePAPS is tested on various protein structure pairs.

20.
A measure of protein structure similarity is calculated from the matching of pairs of secondary structure elements between two proteins. The interaction of each pair was estimated from their axial line segments and combined with other geometric features to produce an optimal discrimination between intrafamily and interfamily relationships. The matching used a fast bipartite graph-matching algorithm that avoids the computational complexity of searching for the full subgraph isomorphism between the two sets of interactions. The main algorithm used was the "stable marriage" algorithm, which works on the ranked "preferences" of one interaction for another. The method takes 1/10 of a second for a typical comparison, making it suitable as a fast pre-filter for slower, more exhaustive approaches. An application to protein structure classification is described.
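The "stable marriage" matching named in this abstract is commonly solved with the Gale-Shapley proposal procedure. The sketch below is a generic version of that procedure for two equally sized sets with complete preference lists; how preferences are derived from secondary structure interactions is not shown, and the labels in the example are hypothetical.

```python
def stable_marriage(prefs_a, prefs_b):
    """Gale-Shapley matching.

    prefs_a[a] lists members of side B from most to least preferred, and
    prefs_b[b] does the same for side A. Returns a dict mapping each a to its b.
    """
    # rank_b[b][a] is the position of a in b's list (lower is better).
    rank_b = {b: {a: r for r, a in enumerate(p)} for b, p in prefs_b.items()}
    next_choice = {a: 0 for a in prefs_a}   # index of the next b each a proposes to
    engaged_to = {}                         # b -> a
    free = list(prefs_a)
    while free:
        a = free.pop()
        b = prefs_a[a][next_choice[a]]
        next_choice[a] += 1
        current = engaged_to.get(b)
        if current is None:
            engaged_to[b] = a               # b was unmatched: accept
        elif rank_b[b][a] < rank_b[b][current]:
            engaged_to[b] = a               # b prefers the new proposer
            free.append(current)
        else:
            free.append(a)                  # b rejects a; a proposes again later
    return {a: b for b, a in engaged_to.items()}

# Hypothetical interactions p1, p2 in one protein and q1, q2 in the other.
prefs_p = {"p1": ["q1", "q2"], "p2": ["q1", "q2"]}
prefs_q = {"q1": ["p2", "p1"], "q2": ["p1", "p2"]}
```

The procedure needs at most one proposal per (a, b) pair, which is consistent with its use as a fast pre-filter before more exhaustive comparisons.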
