Similar articles
 Found 20 similar articles; search took 15 ms
1.
Abstract

A new, simple quantitative representation of the three-dimensional structure of globular proteins is proposed which is useful for comparison of distantly related problems, computer sorting of large sets of conformations, and searching for structurally similar domains in a protein database. The folding course of the polypeptide backbone is approximated by a set of successive vectors corresponding to the elements of regular secondary structure (e.g. α-helices, strands of β-sheets) and non-regular segments. The parameters specifying the spatial organization of segments in this vector model are internal coordinates, namely the lengths of the vectors and the planar and dihedral angles. The proposed quantitative representation makes it possible to circumvent the problem of insertions/deletions and to avoid the best-superposition stage during protein comparison. The representation was applied to the comparison of the three-dimensional structures of the scorpion toxins Centruroides sculpturatus Ewing v-3, Buthus eupeus M9 and I5A, which have different chain lengths and low sequence similarity.
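The internal coordinates named in this abstract can be illustrated with a minimal sketch. It assumes each segment is already given by its endpoints (the abstract does not say how segment axes are fitted), takes the planar angle as the angle between successive direction vectors, and uses the IUPAC-style sign convention for the dihedral angle; the function name `internal_coordinates` is hypothetical.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _norm(a):
    return math.sqrt(_dot(a, a))

def internal_coordinates(points):
    """Return (lengths, planar angles, dihedral angles) of a vector chain.

    Assumption: segment i runs from points[i] to points[i + 1].
    Angles are in degrees.
    """
    vecs = [_sub(q, p) for p, q in zip(points, points[1:])]
    lengths = [_norm(v) for v in vecs]

    # Planar angle between each pair of successive vectors.
    planar = []
    for u, v in zip(vecs, vecs[1:]):
        c = _dot(u, v) / (_norm(u) * _norm(v))
        planar.append(math.degrees(math.acos(max(-1.0, min(1.0, c)))))

    # Dihedral angle defined by each triple of successive vectors.
    dihedral = []
    for b1, b2, b3 in zip(vecs, vecs[1:], vecs[2:]):
        n1 = _cross(b1, b2)
        n2 = _cross(b2, b3)
        y = _norm(b2) * _dot(b1, n2)
        x = _dot(n1, n2)
        dihedral.append(math.degrees(math.atan2(y, x)))
    return lengths, planar, dihedral

# Three mutually perpendicular unit segments.
chain = [(0, 1, 0), (0, 0, 0), (1, 0, 0), (1, 0, 1)]
lengths, planar, dihedral = internal_coordinates(chain)
```

Because only lengths and angles are stored, two chains can be compared without first superposing them, which is the property the abstract emphasizes.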

2.
R M Sweet, Biopolymers 1986, 25(8):1565-1577
Short segments of polypeptide, from a protein for which the primary sequence but not the three-dimensional structure is known, are compared to a library of known structures. The basis of comparison is the probability with which residues in the unknown segment might substitute through evolution for residues in segments of known structure. In test cases, segments from known structures that are similar in sequence to those from a protein treated as unknown are often found to be similar in three-dimensional structure to one another and to the true structure of the “unknown” segment. This provides a basis for prediction of the local configuration (secondary structure) of polypeptides.

3.
Cytochrome b is an integral membrane protein, which forms the core of the ubiquinol-cytochrome c oxidoreductase (cytochrome bc1) complex. A computer-aided three-dimensional modeling procedure was carried out in four steps. First, the candidate hydrophobic helices were searched for throughout the protein primary sequence by a computer procedure based upon the method of Eisenberg; second, a helical secondary structure was imposed on the transmembrane peptides; third, the helical segments at a lipid-water interface were oriented; and finally, the possible interactions between helices with similar properties were investigated. This procedure enabled the identification of nine hydrophobic segments, of which eight are membrane-spanning helices while one has amphipathic properties. Three hydrophilic receptor-binding domains were also identified. Based upon their hydrophobicity profiles, the transmembrane helices could be associated in pairs inside the lipid bilayer. In our folding model proposed for cytochrome b, all mutation sites are not only located on the same side of the membrane but are also in close proximity in the three-dimensional structure. Inhibitor resistance mutational sites which were recently characterized (di Rago, J.-P., and Colson, A.-M. (1988) J. Biol. Chem. 263, 12564-12570) have been located on this model. Moreover, the receptor-binding domains and the mutation sites are close neighbors in the three-dimensional spatial representation.

4.
5.
Simple and concise representations of protein-folding patterns provide powerful abstractions for visualizing, comparing, classifying, searching and aligning structural data. Structures are often abstracted by replacing standard secondary structural features (that is, helices and strands of sheet) by vectors or linear segments. Relying solely on standard secondary structure may result in a significant loss of structural information. Further, traditional methods of simplification crucially depend on the consistency and accuracy of external methods to assign secondary structures to protein coordinate data. Although many methods exist to identify secondary structure automatically, the imprecision of definitions, along with errors and inconsistencies in experimental structure data, drastically limits their applicability to generating reliable simplified representations, especially for structural comparison. This article introduces a mathematically rigorous algorithm to delineate protein structure using the elegant statistical and inductive inference framework of minimum message length (MML). Our method generates consistent and statistically robust piecewise linear explanations of protein coordinate data, resulting in a powerful and concise representation of the structure. The delineation is completely independent of the approaches of using hydrogen-bonding patterns or inspecting local substructural geometry that the current methods use. Indeed, as is common with applications of the MML criterion, this method is free of parameters and thresholds, in striking contrast to the existing programs which are often beset by them. The analysis of results over a large number of proteins suggests that the method produces consistent delineation of structures that encompasses, among others, the segments corresponding to standard secondary structure. AVAILABILITY: http://www.csse.monash.edu.au/~karun/pmml.

6.
A systematic method has been developed for comparing the backbone conformations of proteins (Remington & Matthews, 1978). Two proteins are compared by successively optimizing the agreement between all possible segments of a chosen length from one protein, and all possible segments of the same length from the other protein. The method reveals any similarities between the two proteins, and provides an estimate of the statistical significance of any given structure agreement that is obtained. The method has been tested in a number of cases, including comparisons of the dehydrogenases and of the pancreatic and bacterial serine proteases. These examples were chosen to test the ability of the comparison method to detect structural similarities in the presence of large insertions and deletions. The results suggest that the detection of the “nucleotide binding fold” in the dehydrogenases is at the limit of the capability of the comparison technique in its original form, although it may be possible to generalize the method to allow for insertions and deletions in proteins. The results of many protein comparisons, made with different probe lengths, are summarized. For medium and long probe lengths, the average value of the structural agreement does not depend very much on the type of protein being compared. The average value of the structure agreement increases with the square root of the probe length, but for probe lengths above about 40 residues, the standard deviation is independent of probe length. From these observations it is possible to construct a generalized probability diagram to evaluate the significance of any structure agreement that might be obtained in comparing two proteins.

7.
Two new methods for the visualization of structural similarity in proteins with known three-dimensional structures are presented. They are based on the degree of equivalence of α-carbon pairs in two proteins. The quantitative measure for residue equivalence is the comparison score generated using the sequence and structure alignment method of Taylor and Orengo, which is based on the comparison of interatomic distances (and other properties that can be defined on a residue basis). The first method uses information on corresponding α-carbon positions to display vectors joining these structurally equivalent residues. These vectors can be defined as target constraints, and their minimization “bends” the two proteins toward a common average structure. In the average structure the corresponding residues virtually superpose, while insertions and deletions become clearly visible. The second method uses the comparison scores to perform a weighted least-squares fit of the two structures. It is further used to color code the two structures according to the score value, i.e., their similarity, on a continuous scale from red to blue. Examples of the methods for the comparison of flavodoxin, chemotaxis Y protein and L-arabinose-binding protein are given.

8.
9.
The technique of model-building a protein of known sequence but unknown tertiary structure from the structures of homologous proteins is probably the most reliable means so far of mapping from primary to tertiary structure. A key step towards the realization of this aim is to develop ways of aligning three-dimensional structures of homologous proteins, thereby deriving the rules useful for protein modelling. We have developed a generalized differential-geometric representation of protein local conformation for use in a protein comparison program which aligns protein sequences on the basis of their sequence and conformational knowledge. Because the differential-geometric distance measure between local conformations is independent of the coordinate frame and retains chirality information, the comparison program is easily implemented, relatively rational and reasonably fast. The utility of this program for aligning closely and distantly related homologous proteins is demonstrated by multiple alignment of globins, serine proteinases and aspartic proteinase domains. In particular, the method achieves a rational alignment between the mammalian and microbial serine proteinases when compared with many published alignment programs.

10.
In this paper, we present an approach based on the Burrows–Wheeler transform to compare protein sequences. The strings representing amino acid sequences do not reflect the chemical and physical properties well, and it is very hard to extract any key features by reading these long character strings directly. The use of the Burrows–Wheeler similarity distribution needs a suitable representation which can reflect some interesting properties of the proteins. For the comparison of primary protein sequences we convert the sequences into digital codes using the Ponnuswamy hydrophobicity index, and for the comparison of protein structures we adjust the topology of protein structure strings, which are a simple but useful representation of the secondary structure of proteins, to match the Burrows–Wheeler similarity distribution. Finally, some experiments show that the approach proposed in this paper is a powerful and useful tool for the comparison of proteins.
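For readers unfamiliar with the transform this abstract builds on, here is a minimal sketch of the Burrows–Wheeler transform itself. It uses the naive sort-all-rotations construction and an assumed `$` end marker; it does not implement the paper's similarity distribution or the Ponnuswamy encoding, whose details are not given in the abstract.

```python
def bwt(text, sentinel="$"):
    """Burrows-Wheeler transform via sorted rotations (naive, O(n^2 log n)).

    Assumption: `sentinel` does not occur in `text` and sorts before
    every character of `text`.
    """
    if sentinel in text:
        raise ValueError("sentinel must not occur in the input")
    s = text + sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    # The transform is the last column of the sorted rotation matrix.
    return "".join(rot[-1] for rot in rotations)

print(bwt("banana"))  # annb$aa
```

The transform tends to group identical symbols that share a following context, which is what makes it attractive as a basis for comparing encoded sequences.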

11.
A fast search algorithm to reveal similar polypeptide backbone structural motifs in proteins is proposed. It is based on the vector representation of a polypeptide chain fold in which the elements of regular secondary structures are approximated by linear segments (Abagyan and Maiorov, J. Biomol. Struct. Dyn. 5, 1267-1279 (1988)). The algorithm permits insertions and deletions in the polypeptide chain fragments to be compared. The fast search algorithm implemented in the FASEAR program is used for collecting beta alpha beta supersecondary structure units in a number of alpha/beta proteins of the Brookhaven Data Bank. Variation of the geometrical parameters specifying the backbone chain fold is estimated. It appears that the conformation of the majority of the fragments, although almost all of them are right-handed, is quite different from that of standard beta alpha beta units. Apart from searching for a specific type of secondary structure motif, the algorithm allows new recurrent folding patterns in proteins to be identified automatically. It may be of particular interest for the development of the tertiary template approach to prediction of protein three-dimensional structure, as well as for constructing artificial polypeptides with goal-oriented conformation.

12.
We consider the problem of identifying common three-dimensional substructures between proteins. Our method is based on comparing the shape of the alpha-carbon backbone structures of the proteins in order to find three-dimensional (3D) rigid motions that bring portions of the geometric structures into correspondence. We propose a geometric representation of protein backbone chains that is compact yet allows for similarity measures that are robust against noise and outliers. This representation encodes the structure of the backbone as a sequence of unit vectors, defined by each adjacent pair of alpha-carbons. We then define a measure of the similarity of two protein structures based on the root mean squared (RMS) distance between corresponding orientation vectors of the two proteins. Our measure has several advantages over measures that are commonly used for comparing protein shapes, such as the minimum RMS distance between the 3D positions of corresponding atoms in two proteins. A key advantage is that this new measure behaves well for identifying common substructures, in contrast with position-based measures where the nonmatching portions of the structure dominate the measure. At the same time, it avoids the quadratic space and computational difficulties associated with methods based on distance matrices and contact maps. We show applications of our approach to detecting common contiguous substructures in pairs of proteins, as well as the more difficult problem of identifying common protein domains (i.e., larger substructures that are not necessarily contiguous along the protein chain).
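The unit-vector encoding and RMS measure described in this abstract can be sketched as follows. This is an illustration under simplifying assumptions: both chains have the same number of alpha-carbons, residues correspond one-to-one in order, and the chains are already in a common orientation, so the search over rigid motions that the abstract mentions is not performed. The function names are hypothetical.

```python
import math

def unit_vectors(ca_coords):
    """Unit vector between each adjacent pair of alpha-carbon positions."""
    vectors = []
    for p, q in zip(ca_coords, ca_coords[1:]):
        d = [b - a for a, b in zip(p, q)]
        n = math.sqrt(sum(x * x for x in d))
        vectors.append(tuple(x / n for x in d))
    return vectors

def orientation_rms(ca_a, ca_b):
    """RMS distance between corresponding orientation vectors.

    Assumption: equal-length chains, identity correspondence, and no
    rotation applied to either chain.
    """
    ua, ub = unit_vectors(ca_a), unit_vectors(ca_b)
    if len(ua) != len(ub):
        raise ValueError("chains must have the same number of residues")
    total = sum(sum((x - y) ** 2 for x, y in zip(u, v))
                for u, v in zip(ua, ub))
    return math.sqrt(total / len(ua))

a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0)]
b = [(x + 10.0, y - 5.0, z + 2.0) for x, y, z in a]  # translated copy of a
rms_translated = orientation_rms(a, b)
```

Since only directions are compared, a pure translation leaves the measure at zero (up to rounding), and a mismatched region affects only its own terms in the sum rather than shifting every atom position.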

13.
Abstract

A fast search algorithm to reveal similar polypeptide backbone structural motifs in proteins is proposed. It is based on the vector representation of a polypeptide chain fold in which the elements of regular secondary structures are approximated by linear segments (Abagyan and Maiorov, J. Biomol. Struct. Dyn. 5, 1267–1279 (1988)). The algorithm permits insertions and deletions in the polypeptide chain fragments to be compared. The fast search algorithm implemented in the FASEAR program is used for collecting βαβ supersecondary structure units in a number of α/β proteins of the Brookhaven Data Bank. Variation of the geometrical parameters specifying the backbone chain fold is estimated. It appears that the conformation of the majority of the fragments, although almost all of them are right-handed, is quite different from that of standard βαβ units. Apart from searching for a specific type of secondary structure motif, the algorithm allows new recurrent folding patterns in proteins to be identified automatically. It may be of particular interest for the development of the tertiary template approach to prediction of protein three-dimensional structure, as well as for constructing artificial polypeptides with goal-oriented conformation.

14.
The assumption that homologous segments in different proteins may share a similar conformation is applied to the prediction of secondary structures in proteins. Segments of a target protein are compared, without allowing any gaps, against a number of reference proteins of known three-dimensional structure to find homologous sequences, and then a conformational state (alpha, beta or coil) for each residue of the protein is predicted by looking at the secondary structure of the corresponding homologous segments. This prediction is done in a statistical rather than 'deterministic' way, by assigning the most probable conformational state among the homologous data to each residue site of the target protein. A test application to 22 sample proteins yields 60% correctness on average, a better value than those of two other existing methods. Joint prediction combining three methods into one is shown to increase the reliability up to 70%, when only the regions predicted identically by the three methods are taken into account. Application of the present method to 10 proteins of unknown structure is demonstrated.
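The voting scheme in this abstract can be sketched as below. The abstract does not give the homology score, window length or threshold, so this sketch assumes a plain identity count over a fixed-length gapless window and defaults unvoted residues to coil; `predict_states` and its parameters are hypothetical.

```python
from collections import Counter

def predict_states(target, references, window=5, min_identity=3):
    """Predict a state (h, e or c) per residue by voting over gapless matches.

    Assumptions: `references` is a list of (sequence, states) pairs of equal
    length; a window is treated as homologous when it shares at least
    `min_identity` identical residues with the target window.
    """
    votes = [Counter() for _ in target]
    for seq, states in references:
        for i in range(len(target) - window + 1):
            seg = target[i:i + window]
            for j in range(len(seq) - window + 1):
                matches = sum(a == b for a, b in zip(seg, seq[j:j + window]))
                if matches >= min_identity:
                    # Each homologous window votes for every residue it covers.
                    for k in range(window):
                        votes[i + k][states[j + k]] += 1
    # Most frequent state wins; residues without votes default to coil.
    return "".join(v.most_common(1)[0][0] if v else "c" for v in votes)

refs = [("AAAAA", "hhhhh"), ("GGGGG", "eeeee")]
print(predict_states("AAAAAGGGGG", refs, window=5, min_identity=5))  # hhhhheeeee
```

Taking the most frequent state at each site is the "statistical rather than deterministic" assignment the abstract refers to.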

15.
In order to compare different genome sequences, an alignment-free method has been proposed. First, we present a new graphical representation of DNA sequences without degeneracy, which is conducive to intuitive comparison of sequences. Then, a new numerical characterization based on the representation is introduced to quantitatively depict the intrinsic nature of genome sequences, and is treated as a 10-dimensional vector in the mathematical space. Alignment-free comparison of sequences is performed by computing the distances between the vectors of the corresponding numerical characterizations, which define the evolutionary relationship. Two data sets of DNA sequences were constructed to assess the performance on sequence comparison. The results illustrate the validity of the method. The new numerical characterization provides a powerful tool for genome comparison.

16.
We investigated protein sequence/structure correlation by constructing a space of protein sequences, based on methods developed previously for constructing a space of protein structures. The space is constructed by using a representation of the amino acids as vectors of 10 property factors that encode almost all of their physical properties. Each sequence is represented by a distribution of overlapping sequence fragments. A distance between any two sequences can be calculated. By attaching a weight to each factor, intersequence distances can be varied. We optimize the correlation between corresponding distances in the sequence and structure spaces. The optimal correlation between the sequence and structure spaces is significantly better than that which results from correlating randomly generated sequences, having the overall composition of the data base, with the structure space. However, sets of randomly generated sequences, each of which approximates the composition of the real sequence it replaces, produce correlations with the structure space that are as good as that observed for the actual protein sequences. A connection is proposed with previous studies of the protein folding code. It is shown that the most important property factors for the correlation of the sequence and structure spaces are related to helix/bend preference, side chain bulk, and beta-structure preference.

17.
Material remains of ancestor nucleotides and proteins are largely unavailable, thus sequence comparison among homologous genes in present-day organisms forms the core of current knowledge of molecular evolution. Variation in protein three-dimensional structure is a basis for functional diversity. To study the evolution of three-dimensional structures in related proteins would significantly improve our understanding of protein evolution and function. A protein may contain ancestor conformations that have been allosterically suppressed by evolutionarily additive structures. Using monoclonal antibody probes to detect such conformation in proteins after removing the suppressor structure, our study demonstrates three-dimensional structure evidence for the evolutionary relationship between troponin I and troponin T, two subunits of the troponin complex in the Ca2+-regulatory system of striated muscle, and among their muscle type-specific isoforms. The experimental data show the feasibility of detecting evolutionarily suppressed history-telling structural states in proteins by removing conformational modulator segments added during evolution. In addition to identifying structural modifications that were critical to the emergence of diverged proteins, investigating this novel mode of evolution will help us to understand the origin and functional potential of protein structures.

18.
The three-dimensional structure of the native unliganded form of the Leu/Ile/Val-binding protein (Mr = 36,700), an essential component of the high-affinity active transport system for the branched aliphatic amino acids in Escherichia coli, has been determined and further refined to a crystallographic R-factor of 0.17 at 2.4 Å resolution. The entire structure consists of 2710 non-hydrogen atoms from the complete sequence of 344 residues and 121 ordered water molecules. Bond lengths and angle distances in the refined model have root-mean-square deviations from ideal values of 0.05 Å and 0.10 Å, respectively. The overall shape of the protein is a prolate ellipsoid with dimensions of 35 Å × 40 Å × 70 Å. The protein consists of two distinct globular domains linked by three short peptide segments which, though widely separated in the sequence, are proximal in the tertiary structure and form the base of the deep cleft between the two domains. Although each domain is built from polypeptide segments located in both the amino (N) and the carboxy (C) terminal halves, both domains exhibit very similar supersecondary structures, consisting of a central beta-sheet of seven strands flanked on either side by two or three helices. The two domains are far apart from each other, leaving the cleft wide open by about 18 Å. The cleft has a depth of about 15 Å and a base of about 14 Å × 16 Å. Refining independently the structure of native Leu/Ile/Val-binding protein crystals soaked in a solution containing L-leucine at 2.8 Å resolution (R-factor = 0.15), we have been able to locate and characterize an initial, major portion of the substrate-binding site of the Leu/Ile/Val-binding protein. The binding of the L-leucine substrate does not alter the native crystal structure, and the L-leucine is lodged in a crevice on the wall of the N-domain, which is in the inter-domain cleft. The L-leucine is held in place primarily by hydrogen-bonding of its alpha-ammonium and alpha-carboxylate groups with main-chain peptide units and hydroxyl side-chain groups; there are no salt-linkages. The charges on the leucine zwitterion are stabilized by hydrogen-bond dipoles. The side-chain of the L-leucine substrate lies in a depression lined with non-polar residues, including Leu77, which confers specificity to the site by stacking with the side-chain of the leucine substrate. (ABSTRACT TRUNCATED AT 400 WORDS)

19.

Background  

Recently a new class of methods for fast protein structure comparison has emerged. We call the methods in this class projection methods as they rely on a mapping of protein structure into a high-dimensional vector space. Once the mapping is done, the structure comparison is reduced to distance computation between corresponding vectors. As structural similarity is approximated by distance between projections, the success of any projection method depends on how well its mapping function is able to capture the salient features of protein structure. There is no agreement on what constitutes a good projection technique and the three currently known projection methods utilize very different approaches to the mapping construction, both in terms of what structural elements are included and how this information is integrated to produce a vector representation.

20.