Similar articles
20 similar articles found (search time: 0 ms)
1.
Damm KL, Carlson HA. Biophysical Journal 2006, 90(12):4558-4573
Many proteins contain flexible structures such as loops and hinged domains. A simple root mean square deviation (RMSD) alignment of two different conformations of the same protein can be skewed by the difference between the mobile regions. To overcome this problem, we have developed a novel method to overlay two protein conformations by their atomic coordinates using a Gaussian-weighted RMSD (wRMSD) fit. The algorithm is based on the Kabsch least-squares method and determines an optimal transformation between two molecules by calculating the minimal weighted deviation between the two coordinate sets. Unlike other techniques that choose subsets of residues to overlay, all atoms are included in the wRMSD overlay. Atoms that barely move between the two conformations will have a greater weighting than those that have a large displacement. Our superposition tool has produced successful alignments when applied to proteins for which two conformations are known. The transformation calculation is heavily weighted by the coordinates of the static region of the two conformations, highlighting the range of flexibility in the overlaid structures. Lastly, we show how wRMSD fits can be used to evaluate predicted protein structures. Comparing a predicted fold to its experimentally determined target structure is another case of comparing two protein conformations of the same sequence, and the degree of alignment directly reflects the quality of the prediction.
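
The Gaussian-weighted fit described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the weight formula `exp(-d**2 / c)`, the scale factor `c = 5.0` and the simple re-weighting loop are all assumptions chosen for illustration. The only part taken from the abstract is the idea of a Kabsch least-squares fit in which atoms that barely move receive larger weights.

```python
import numpy as np

def weighted_kabsch(P, Q, w):
    """Rotation R and translation t minimising sum_i w_i * |R p_i + t - q_i|^2."""
    w = w / w.sum()
    p_c, q_c = w @ P, w @ Q                 # weighted centroids
    P0, Q0 = P - p_c, Q - q_c
    H = (P0 * w[:, None]).T @ Q0            # 3x3 weighted covariance matrix
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))  # guard against an improper rotation
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    return R, q_c - R @ p_c

def gaussian_wrmsd_fit(P, Q, c=5.0, n_iter=50, tol=1e-8):
    """Iteratively re-weighted fit; c (in squared length units) is an assumed scale."""
    w = np.ones(len(P))                     # start from an ordinary RMSD fit
    prev = np.inf
    for _ in range(n_iter):
        R, t = weighted_kabsch(P, Q, w)
        d2 = np.sum((P @ R.T + t - Q) ** 2, axis=1)
        w = np.exp(-d2 / c)                 # small displacement -> weight near 1
        wrmsd = np.sqrt(np.sum(w * d2) / np.sum(w))
        if abs(prev - wrmsd) < tol:
            break
        prev = wrmsd
    return R, t, w, wrmsd
```

Calling `gaussian_wrmsd_fit(P, Q)` on two `(N, 3)` coordinate arrays with matched atom order returns the transformation, the final per-atom weights and the weighted RMSD. Atoms in a displaced loop or domain end up with weights close to zero, so the overlay is driven by the static region.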

2.
Protein structures are routinely compared by their root-mean-square deviation (RMSD) in atomic coordinates after optimal rigid body superposition. What is not so clear is the significance of different RMSD values, particularly above the customary arbitrary cutoff for obvious similarity of 2–3 Å. Our earlier work argued for an intrinsic cutoff for protein similarity that varied with the number of residues in the polypeptide chains being compared. Here we introduce a new measure, ρ, of structural similarity based on RMSD that is independent of the sizes of the molecules involved, or of any other special properties of molecules. When ρ is less than 0.4–0.5, protein structures are visually recognized to be obviously similar, but the mathematically pleasing intrinsic cutoff of ρ>1.0 corresponds to overall similarity in folding motif at a level not usually recognized until smoothing of the polypeptide chain path makes it striking. When the structures are scaled to unit radius of gyration and equal principal moments of inertia, the comparisons are even more universal, since they are no longer obscured by differences in overall size and ellipticity. With increasing chain length, the distribution of ρ for pairs of random structures is skewed to higher values, but the value for the best 1% of the comparisons rises only slowly with the number of residues. This level is close to an intrinsic cutoff between similar and dissimilar comparisons, namely the maximal scaled ρ possible for the two structures to be more similar to each other than one is to the other's mirror image. The intrinsic cutoff is independent of the number of residues or points being compared. For proteins having fewer than 100 residues, the 1% ρ falls below the intrinsic cutoff, so that for very small proteins, geometrically significant similarity can often occur by chance. We believe these ideas will be helpful in judging success in NMR structure determination and protein folding modeling.
© 1995 Wiley-Liss, Inc.

3.
Shih ES, Hwang MJ. Proteins 2004, 56(3):519-527
Comparison of two protein structures often results in not only a global alignment but also a number of distinct local alignments; the latter, referred to as alternative alignments, are however usually ignored in existing protein structure comparison analyses. Here, we used a novel method of protein structure comparison to extensively identify and characterize the alternative alignments obtained for structure pairs of a fold classification database. We showed that all alternative alignments can be classified into one of just a few types, and used this classification to illustrate the potential of alternative alignments for identifying recurring protein substructures, including the internal structural repeats of a protein. Furthermore, we showed that among the alternative alignments obtained, permuted alignments, which included both circular and scrambled permutations, are as prevalent as topological alignments. These results demonstrated that the so far largely unattended alternative alignments of protein structures have implications and applications for research on protein classification and evolution.

4.
By using an unsupervised cluster analyzer, we have identified a local structural alphabet composed of 16 folding patterns of five consecutive Cα ("protein blocks"). The dependence that exists between successive blocks is explicitly taken into account. A Bayesian approach based on the relation protein block-amino acid propensity is used for prediction and leads to a success rate close to 35%. Sharing sequence windows associated with certain blocks into "sequence families" improves the prediction accuracy by 6%. This prediction accuracy exceeds 75% when keeping the first four predicted protein blocks at each site of the protein. In addition, two different strategies are proposed: the first one defines the number of protein blocks in each site needed for respecting a user-fixed prediction accuracy, and alternatively, the second one defines the different protein sites to be predicted with a user-fixed number of blocks and a chosen accuracy. This last strategy applied to the ubiquitin conjugating enzyme (alpha/beta protein) shows that 91% of the sites may be predicted with a prediction accuracy larger than 77% considering only three blocks per site. The prediction strategies proposed improve our knowledge about sequence-structure dependence and should be very useful in ab initio protein modelling.

5.
Quantitative criteria characterizing the regularity of Cα-backbones in protein structures are presented. The technique is based on the Fourier remapping of the Cartesian coordinates of the Cα-chain. The Fourier spectra identify the hidden periodicities and symmetries in protein structures, while the integral regularity is assessed via the spectral structural entropies. The formal unification of digitizing and the similarities in statistics for the random counterparts allow the study of direct correlations between the distribution of physico-chemical characteristics along the amino acid sequence and the spatial conformation of the polypeptide chain. Significant correlations are found for both hydrophobicity and side-chain volumes, though, as expected, the effects for hydrophobicity turn out to be substantially stronger. The scheme is illustrated with a set of 120 protein structures comprising representatives from the main superfamilies and superfolds.
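
A minimal sketch of the kind of calculation this abstract describes is given here: a discrete Fourier transform of the Cα coordinates along the chain, followed by an entropy of the resulting spectrum. The abstract does not define its "spectral structural entropy", so the Shannon entropy of the normalised power spectrum used here is an assumption, as are the function names.

```python
import cmath
import math

def power_spectrum(coords):
    """Summed x, y, z Fourier power at chain frequencies k = 1 .. N//2.

    coords is a list of (x, y, z) tuples ordered along the chain.
    """
    n = len(coords)
    # Subtract the centroid so that the k = 0 term carries no power.
    means = [sum(c[a] for c in coords) / n for a in range(3)]
    spectrum = []
    for k in range(1, n // 2 + 1):
        power = 0.0
        for a in range(3):
            f = sum((c[a] - means[a]) * cmath.exp(-2j * math.pi * k * m / n)
                    for m, c in enumerate(coords))
            power += abs(f) ** 2
        spectrum.append(power)
    return spectrum

def spectral_entropy(spectrum):
    """Shannon entropy of the normalised spectrum (an assumed definition)."""
    total = sum(spectrum)
    probs = [p / total for p in spectrum if p > 0]
    return -sum(p * math.log(p) for p in probs)
```

Under this definition a strictly periodic chain concentrates its power in a single frequency and has an entropy near zero, whereas an irregular chain spreads its power over many frequencies and has a higher entropy.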

6.
Fourier methods are applied to the pairwise comparison of Cα-backbones in protein structures. The technique makes it possible to assess both the general similarity and the main origins of resemblance (coincident periodicities, similarity of fragments, or large-scale semblance of folding). Analogous methods can be extended to the study of correlations between the structural characteristics of the Cα-backbone of one protein and the distribution of physico-chemical parameters along the primary amino acid sequence of another. Finally, we discuss the problem of clustering pairwise data into a tree-like hierarchical system.

7.
A systematic method has been developed for comparing the backbone conformations of proteins (Remington & Matthews, 1978). Two proteins are compared by successively optimizing the agreement between all possible segments of a chosen length from one protein, and all possible segments of the same length from the other protein. The method reveals any similarities between the two proteins, and provides an estimate of the statistical significance of any given structure agreement that is obtained. The method has been tested in a number of cases, including comparisons of the dehydrogenases and of the pancreatic and bacterial serine proteases. These examples were chosen to test the ability of the comparison method to detect structural similarities in the presence of large insertions and deletions. The results suggest that the detection of the “nucleotide binding fold” in the dehydrogenases is at the limit of the capability of the comparison technique in its original form, although it may be possible to generalize the method to allow for insertions and deletions in proteins. The results of many protein comparisons, made with different probe lengths, are summarized. For medium and long probe lengths, the average value of the structural agreement does not depend very much on the type of protein being compared. The average value of the structure agreement increases with the square root of the probe length, but for probe lengths above about 40 residues, the standard deviation is independent of probe length. From these observations it is possible to construct a generalized probability diagram to evaluate the significance of any structure agreement that might be obtained in comparing two proteins.

8.

Background  

The structure of proteins may change as a result of the inherent flexibility of some protein regions. We develop and explore probabilistic machine learning methods for predicting a continuum secondary structure, i.e. assigning probabilities to the conformational states of a residue. We train our methods using data derived from high-quality NMR models.

9.
A new method is presented for evaluating the quality of protein structures obtained by NMR. This method exploits the dependence between measurable chemical properties of a protein, namely pKa values of acidic residues, and protein structure. The accurate and fast empirical computational method employed by the PROPKA program allows the user to test the ability of a given structure to reproduce known pKa values, which in turn can be used as a criterion for the selection of more accurate structures. We demonstrate the feasibility of this novel idea for a series of proteins for which both NMR and X-ray structures, as well as pKa values of all ionizable residues, have been determined. For the 17 NMR ensembles used in this study, this criterion is shown to be effective in eliminating a large number of NMR structure ensemble members.

10.
This paper describes two computer programs designed to assist in the comparison of protein structures. LOPAL (LOoP ALignment) applies a dynamic programming algorithm to the comparison of regions of protein three-dimensional (3D) structure and gives a similarity score and suggested sequence alignment with that score. SCAMP (Structure Comparison and Alignment of Multiple Proteins) is an interactive graphics program for the Evans and Sutherland PS300 graphics terminal that allows the simultaneous display, manipulation and pairwise least-squares fitting of up to nine independent structures. Together, LOPAL and SCAMP provide an integrated system for characterizing structural similarities in proteins with the aim of improving the accuracy of predicted protein structures. An application of these programs to loop regions in the immunoglobulin constant domains is illustrated.
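
The dynamic programming step mentioned in this abstract can be illustrated with a generic global-alignment sketch. This is not LOPAL's algorithm, whose scoring scheme the abstract does not specify. The function name, the linear gap penalty and the caller-supplied `score` function are assumptions; in a structural setting, `score` could compare any per-residue descriptor.

```python
def global_alignment(a, b, score, gap=-1.0):
    """Needleman-Wunsch-style alignment of two descriptor sequences.

    Returns the best total score and the list of aligned index pairs (i, j).
    """
    n, m = len(a), len(b)
    S = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        S[i][0] = i * gap
    for j in range(1, m + 1):
        S[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            S[i][j] = max(S[i - 1][j - 1] + score(a[i - 1], b[j - 1]),
                          S[i - 1][j] + gap,    # a[i-1] aligned to a gap
                          S[i][j - 1] + gap)    # b[j-1] aligned to a gap
    # Trace back from the corner to recover which positions were matched.
    pairs, i, j = [], n, m
    while i > 0 and j > 0:
        if S[i][j] == S[i - 1][j - 1] + score(a[i - 1], b[j - 1]):
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif S[i][j] == S[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    return S[n][m], pairs[::-1]
```

For example, with secondary-structure strings and a match/mismatch score of +1/-1, `global_alignment("HHHEEE", "HHEEE", lambda x, y: 1.0 if x == y else -1.0)` aligns five positions and opens one gap.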

11.
The comparison between two protein structures is important for understanding molecular function. In particular, the comparison of protein surfaces to measure their similarity provides another challenge useful for studying molecular evolution, docking, and drug design. This paper presents an algorithm, called the BetaSuperposer, which evaluates the similarity between the surfaces of two structures using the beta-shape, a geometric structure derived from the Voronoi diagram of the molecule. The algorithm performs iterations of mix-and-match between the beta-shapes of two structures for the optimal superposition from which a similarity measure is computed, where each mix-and-match step attempts to solve an NP-hard problem. The devised heuristic algorithm based on the assignment problem formulation quickly produces a good superposition and an assessment of similarity. The BetaSuperposer was fully implemented and benchmarked against popular programs, Dali and Click, using the SCOP models. The BetaSuperposer is freely available to the public from the Voronoi Diagram Research Center (http://voronoi.hanyang.ac.kr).

12.
13.
Comparing and classifying the three-dimensional (3D) structures of proteins is of crucial importance to molecular biology, from helping to determine the function of a protein to determining its evolutionary relationships. Traditionally, 3D structures are classified into groups of families that closely resemble the grouping according to their primary sequence. However, significant structural similarities exist at multiple levels between proteins that belong to these different structural families. In this study, we propose a new algorithm, CLICK, to capture such similarities. The method optimally superimposes a pair of protein structures independent of topology. Amino acid residues are represented by the Cartesian coordinates of a representative point (usually the Cα atom), side chain solvent accessibility, and secondary structure. Structural comparison is effected by matching cliques of points. CLICK was extensively benchmarked for alignment accuracy on four different sets: (i) 9537 pair-wise alignments between two structures with the same topology; (ii) 64 alignments from set (i) that were considered to constitute difficult alignment cases; (iii) 199 pair-wise alignments between proteins with similar structure but different topology; and (iv) 1275 pair-wise alignments of RNA structures. The accuracy of CLICK alignments was measured by the average structure overlap score and compared with other alignment methods, including HOMSTRAD, MUSTANG, Geometric Hashing, SALIGN, DALI, GANGSTA(+), FATCAT, ARTS and SARA. On average, CLICK produces pair-wise alignments that are either comparable to or statistically significantly more accurate than those of all of these other methods. We have used CLICK to uncover relationships between (previously) unrelated proteins.
These new biological insights include: (i) detecting hinge regions in proteins where domains or sub-domains show flexibility; (ii) discovering similar small molecule binding sites from proteins of different folds; and (iii) discovering topological variants of known structural/sequence motifs. Our method can generally be applied to compare any pair of molecular structures represented in Cartesian coordinates, as exemplified by the RNA structure superimposition benchmark.

15.
Comparative sequence analyses, including such fundamental bioinformatics techniques as similarity searching, sequence alignment and phylogenetic inference, have become a mainstay for researchers studying type 1 Human Immunodeficiency Virus (HIV-1) genome structure and evolution. Implicit in comparative analyses is an underlying model of evolution, and the chosen model can significantly affect the results. In general, evolutionary models describe the probabilities of replacing one amino acid character with another over a period of time. Most widely used evolutionary models for protein sequences have been derived from curated alignments of hundreds of proteins, usually based on mammalian genomes. It is unclear to what extent these empirical models are generalizable to a very different organism, such as HIV-1, the most extensively sequenced organism in existence. We developed a maximum likelihood model fitting procedure for a collection of HIV-1 alignments sampled from different viral genes, and inferred two empirical substitution models, suitable for describing between- and within-host evolution. Our procedure pools the information from multiple sequence alignments, and the provided software implementation can be run efficiently in parallel on a computer cluster. We describe how the inferred substitution models can be used to generate scoring matrices suitable for alignment and similarity searches. Our models had a consistently superior fit relative to the best existing models and to parameter-rich data-driven models when benchmarked on independent HIV-1 alignments, demonstrating evolutionary biases in amino-acid substitution that are unique to HIV, and that are not captured by the existing models. The scoring matrices derived from the models showed a marked difference from common amino-acid scoring matrices. The use of an appropriate evolutionary model recovered a known viral transmission history, whereas a poorly chosen model introduced phylogenetic error.
We argue that our model derivation procedure is immediately applicable to other organisms with extensive sequence data available, such as Hepatitis C and Influenza A viruses.

16.
A probabilistic measure for alignment-free sequence comparison (total citations: 3; self-citations: 0; citations by others: 3)
MOTIVATION: Alignment-free sequence comparison methods are still in the early stages of development compared to those of alignment-based sequence analysis. In this paper, we introduce a probabilistic measure of similarity between two biological sequences without alignment. The method is based on the concept of comparing the similarity/dissimilarity between two constructed Markov models. RESULTS: The method was tested against six DNA sequences, which are the thrA, thrB and thrC genes of the threonine operons from Escherichia coli K-12 and from Shigella flexneri; and one random sequence having the same base composition as thrA from E. coli. These results were compared with those obtained from the CLUSTAL W algorithm (alignment-based) and the chaos game representation (alignment-free). The method was further tested against a more complex set of 40 DNA sequences and compared with other existing sequence similarity measures (alignment-free). AVAILABILITY: All datasets and computer codes written in MATLAB are available upon request from the first author.
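
The idea of comparing two sequences through Markov models built from each of them can be sketched as follows. The abstract does not give the paper's actual measure, so everything specific here is an assumption: first-order models, a pseudocount of 1, the function names, and a symmetrised difference of average log-likelihoods as the dissimilarity.

```python
import math

ALPHABET = "ACGT"

def transition_model(seq, pseudo=1.0):
    """First-order Markov transition probabilities estimated from one sequence."""
    counts = {x: {y: pseudo for y in ALPHABET} for x in ALPHABET}
    for x, y in zip(seq, seq[1:]):
        counts[x][y] += 1
    model = {}
    for x in ALPHABET:
        total = sum(counts[x].values())
        model[x] = {y: counts[x][y] / total for y in ALPHABET}
    return model

def avg_log_likelihood(seq, model):
    """Mean log-probability per transition of seq under the given model."""
    steps = list(zip(seq, seq[1:]))
    return sum(math.log(model[x][y]) for x, y in steps) / len(steps)

def markov_dissimilarity(a, b):
    """Assumed measure: how much worse each sequence fits the other's model."""
    ma, mb = transition_model(a), transition_model(b)
    return (avg_log_likelihood(a, ma) - avg_log_likelihood(a, mb)
            + avg_log_likelihood(b, mb) - avg_log_likelihood(b, ma))
```

The value is zero when a sequence is compared with itself and grows as the transition statistics of the two sequences diverge. No alignment is computed at any point, which is the property the abstract emphasises.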

17.
In this article, we develop a quantitative comparison method for two arbitrary protein structures. This method uses a root‐mean‐square deviation characterization and employs a series expansion of the protein's shape function in terms of the Wigner‐D functions to define a new criterion, which is called a “similarity value.” We further demonstrate that the expansion coefficients for the shape function obtained with the help of the Wigner‐D functions correspond to structure factors. Our method addresses the common problem of comparing two proteins with different numbers of atoms. We illustrate it with a worked example. Proteins 2014; 82:2756–2769. © 2014 Wiley Periodicals, Inc.

18.
19.
Summary: A fast dynamic programming algorithm for the spatial superposition of protein structures without prior knowledge of an initial alignment has been developed. The program was applied to serine proteases, hemoglobins, cytochromes c, small copper-binding proteins, and lysozymes. In most cases the existing structural homology could be detected in a completely unbiased way. The results of the method presented are in general agreement with other studies. Applying our method, the different alignment results obtained by other authors for serine proteases and cytochromes c can be classified in terms of different alignment parameters such as gap penalties or cut-off length. Limitations of the method are discussed.

20.
A 1173-base pair cDNA encoding bovine cellular retinaldehyde-binding protein (CRALBP) was cloned from a bovine retinal cDNA expression library using as probes both anti-CRALBP polyclonal and monoclonal antibodies. The amino acid sequence deduced from the cDNA corresponds exactly to that determined by direct analysis of NH2-terminally acetylated bovine CRALBP (Crabb, J. W., Johnson, C. M., Carr, S. A., Armes, L. G., and Saari, J. C. (1988) J. Biol. Chem. 263, 18678-18687). Nick-translated bovine CRALBP cDNA probes were then used to clone from a human retinal cDNA library a 1317-base pair cDNA encoding human CRALBP. Bovine and human CRALBP are 92% identical in amino acid sequence and not related to any other known protein sequence. Both the bovine and human proteins contain 316 residues and have calculated molecular weights of 36,378 and 36,347, respectively, exclusive of the NH2-terminal blocking groups. The CRALBP cDNA clones should prove valuable as tools for studying the physiological role of the protein in vision and visual disorders.
