Similar literature
1.
The technique of model-building a protein of known sequence but unknown tertiary structure from the structures of homologous proteins is probably the most reliable means so far of mapping from primary to tertiary structure. A key step towards realizing this aim is to develop ways of aligning the three-dimensional structures of homologous proteins, thereby deriving rules useful for protein modelling. We have developed a generalized differential-geometric representation of protein local conformation for use in a protein comparison program that aligns protein sequences on the basis of their sequence and conformational knowledge. Because the differential-geometric distance measure between local conformations is independent of the coordinate frame and retains chirality information, the comparison program is easily implemented, relatively rational and reasonably fast. The utility of this program for aligning closely and distantly related homologous proteins is demonstrated by multiple alignment of globins, serine proteinases and aspartic proteinase domains. In particular, the method achieves a rational alignment between the mammalian and microbial serine proteinases when compared with many published alignment programs.
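
As a rough illustration of why a measure built from local angles can be independent of the coordinate frame while still carrying chirality, the following Python sketch computes a virtual bond angle and a signed virtual torsion angle from consecutive backbone points. The function names and the use of plain bond and torsion angles are assumptions made for illustration; the abstract does not specify the paper's actual representation.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def bond_angle(p0, p1, p2):
    """Angle at p1 (radians) between the directions p1->p0 and p1->p2."""
    u, v = _sub(p0, p1), _sub(p2, p1)
    cos_angle = _dot(u, v) / math.sqrt(_dot(u, u) * _dot(v, v))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, cos_angle)))

def torsion_angle(p0, p1, p2, p3):
    """Signed dihedral angle (radians) defined by four consecutive points.

    The sign flips under mirror reflection, so it encodes handedness.
    """
    b0, b1, b2 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b0, b1), _cross(b1, b2)
    y = _dot(_cross(n1, n2), b1) / math.sqrt(_dot(b1, b1))
    x = _dot(n1, n2)
    return math.atan2(y, x)
```

Both quantities depend only on differences between points, so rotating or translating the whole chain leaves them unchanged, whereas reflecting the chain in a plane leaves the bond angle unchanged and negates the torsion angle.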

2.
An approach is described for modelling the three-dimensional structure of a protein from the tertiary structures of several homologous proteins that have been determined by X-ray analysis. A method is developed for the simultaneous superposition of several protein molecules and for the calculation of an 'average structure' or 'framework'. Investigation of the convergence properties of this method, in the case of both weighted and unweighted least squares, demonstrates that both give a unique answer and the latter is robust for an homologous family of proteins. Multi-dimensional scaling is used to subgroup the proteins with respect to structural homology. The framework calculated on the basis of the family of homologous proteins, or of an appropriate subgroup, is used to align fragments of the known protein structures of high sequence homology with the unknown. This alignment provides a basis for model building the tertiary structure. Different techniques for using the framework to model the mainchain of various globins and an immunoglobulin domain in the structurally conserved regions are investigated.

3.
A large-scale purification scheme was developed for lipopolysaccharide-free protein P, the phosphate-starvation-inducible outer-membrane porin from Pseudomonas aeruginosa. This highly purified protein P was used to successfully form hexagonal crystals in the presence of n-octyl-beta-glucopyranoside. Amino-acid analysis indicated that protein P had a similar composition to other bacterial outer membrane proteins, containing a high percentage (50%) of hydrophilic residues. The amino-terminal sequence of this protein, although not homologous to either outer membrane protein, PhoE or OmpF, of Escherichia coli, was found to have an analogous protein-folding pattern. Protein P in the native trimer form was capable of maintaining a stable functional trimer after proteinase cleavage. This suggested the existence of a strongly associated tertiary and quaternary structure. Circular dichroism studies confirmed these results in that a large proportion of the protein structure was determined to be beta-sheet and resistant to acid pH and heating in 0.1% sodium dodecyl sulphate.

4.
An object-oriented database system has been developed which is being used to store protein structure data. The database can be queried using the logic programming language Prolog or the query language Daplex. Queries retrieve information by navigating through a network of objects which represent the primary, secondary and tertiary structures of proteins. Routines written in both Prolog and Daplex can integrate complex calculations with the retrieval of data from the database, and can also be stored in the database for sharing among users. Thus object-oriented databases are better suited to prototyping applications and answering complex queries about protein structure than relational databases. This system has been used to find loops of varying length and anchor positions when modelling homologous protein structures.

5.
Predictions of tertiary structures of proteins from their amino acid sequences are facilitated greatly when the structures of homologous proteins are known. On this basis, structural features of Escherichia coli ornithine transcarbamoylase (OTCase) were investigated by site-directed mutagenesis experiments based on the known tertiary structure of the catalytic (c) chain of E. coli aspartate transcarbamoylase (ATCase). In ATCase, each c chain is composed of two globular domains connected by two interdomain helices, one of which is near the C-terminus and is critical for the in vivo folding of the chains and their assembly into trimers. Each active site is located at the interface between two chains and requires the participation of residues from each of the adjacent chains. OTCase, a trimeric enzyme, has been proposed to be similar in structure to the ATCase trimer on the basis of sequence identity (32%), the nature of the reaction catalyzed by the enzyme, and secondary structure predictions. As shown here, analysis of OTCase and ATCase sequences revealed extensive evolutionary conservation in portions corresponding to the ATCase active site and the C-terminal helix. Truncations and substitutions within the predicted C-terminal helix of OTCase had effects on activity and thermal stability strikingly similar to those caused by analogous alterations in ATCase. Similarly, substitutions at either of two conserved residues, Ser 55 and Lys 86, in the proposed active site of OTCase had deleterious effects parallel to those caused by the analogous ATCase substitutions. Hybrid trimers composed of chains from both of these relatively inactive OTCase mutants exhibited dramatically increased activity, as predicted for shared active sites located at the chain interfaces. These results strongly support the hypothesis that the tertiary and quaternary structures of the two enzymes are similar.

6.
A cysteine proteinase that possibly participates in the degradation of phaseolin, the main storage protein of kidney bean (Phaseolus vulgaris L. cv. Moldavian), was isolated from germinating kidney bean seeds and partially characterized. According to its properties it may be classified as a member of a group of homologous cysteine proteinases A, also present in germinating seeds of a number of other plants. The proteinases of this group hydrolyze storage proteins to short peptides. Similarly, the kidney bean proteinase hydrolyzes vicilin, the reserve protein of vetch (Vicia sativa). However, its action on phaseolin is limited to the cleavage of subunits into two approximately equal parts and to the splitting off of a small number of short peptides. An explanation of the resistance of phaseolin to the action of this proteinase is proposed on the basis of the differences between its structure and that of other homologous 7S proteins.

7.
Insight into the functions and interactions of proteins may be gained by correlating a variety of types of experimental data (including kinetics, spectroscopy and biophysical measurements, among others) with three-dimensional structural models displayed and manipulated using interactive computer graphics. Although tertiary structures have been determined for a large number of proteins, one limiting factor in structure-function studies is the lack of availability of the structural coordinates of specific proteins for which other types of detailed experimental data are known. However, as the database of known structures grows, it becomes more and more likely that the structure of a closely related protein will be available. Here we present a method for predicting structures by (1) careful alteration of a known structure of a homologous, functionally analogous protein followed by (2) energy minimization to optimize the predicted structure. This method provides a rapid and effective solution to the initial problem of obtaining a working structure for modeling studies.

8.
In previous papers, a method of protein tertiary structure recognition was described based on the construction of an associative memory Hamiltonian, which encoded the amino acid sequence and the Cα coordinates of a set of database proteins. Using molecular dynamics with simulated annealing, the ability of the Hamiltonian to recall the structure of a protein in the memory database was successfully demonstrated, as long as the total number of database proteins did not exceed a characteristic value, called the capacity of the Hamiltonian, equal to 0.5N to 0.7N, where N is the number of amino acid residues in the protein to be recalled. In this paper, we describe the development of additional methods to increase the capacity of the Hamiltonian, including use of a more complete representation of the protein backbone and the incorporation of contextual information into the Hamiltonian through the use of secondary structure prediction. In addition, we further extend the ability of associative memory models to predict the tertiary structures of proteins not present in the protein data set, by making the Hamiltonian invariant with respect to biological symmetries that represent site mutations and insertions and deletions. The ability of the Hamiltonian to generalize from homologous proteins to an unknown protein in the presence of other unrelated proteins in the data set is demonstrated.

9.
We have recently reported the first complete amino acid sequence of an iron-containing superoxide dismutase. The iron enzyme is thought to be closely homologous to the manganese-containing superoxide dismutases. The availability of complete amino acid sequence information for four manganese superoxide dismutases and the crystal structures for two iron and two manganese superoxide dismutases prompted us to investigate the degree of homology between the two proteins at various levels. We report that it is not possible to clearly distinguish the two proteins on the basis of their secondary or tertiary structures. It would appear that a small number of single-site substitutions are responsible for conferring distinguishing properties on the two proteins. Substitution of glycine 77 and glutamine 154 by a glutamine and an alanine, respectively, in Photobacterium leiognathi iron superoxide dismutase may distinguish the kinetic and other particular properties of this protein from the manganese protein (and other iron superoxide dismutases). Furthermore, the primary structures of the iron and manganese proteins do not appear to have any homology with any other known amino acid sequence.

10.
Protein secondary structure predictions and amino acid long-range contact map predictions from the primary sequence of proteins have been explored to aid in modelling protein tertiary structures. In order to evaluate the usefulness of secondary structure and 3D residue contact prediction methods for modelling protein structures, we have used the known Q3 (alpha-helix, beta-strands and irregular turns/loops) secondary structure information, along with residue-residue contact information, as restraints for MODELLER. We present here the results of our modelling studies on the 30 best-resolved single-domain protein structures of varied lengths. The results show that it is very difficult to obtain useful models even with 100% accurate secondary structure predictions and accurate residue contact predictions for up to 30% of residues in a sequence. The best models that we obtained for proteins of lengths 37, 70, 118, 136 and 193 amino acid residues have RMSDs of 4.17, 5.27, 9.12, 7.89 and 9.69, respectively. The results show that one can obtain better models for proteins with a high percentage of alpha-helix content. This analysis further shows that the MODELLER restraint optimization program can be useful only if we have truly homologous structure(s) as a template, from which it derives numerous restraints almost identical to the templates used. This analysis also clearly indicates that even if we satisfy several true residue-residue contact distances, for up to 30% of the sequence length, with fully known secondary structure information, we end up predicting model structures far from their corresponding native structures.
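
For readers unfamiliar with the RMSD figures quoted above, the following Python sketch shows how a root-mean-square deviation between a model and a native structure can be computed. It assumes the two coordinate sets list the same atoms in the same order and have already been superposed; the function name is hypothetical, and the abstract does not say which atoms or superposition procedure the authors used.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equal-length lists of (x, y, z).

    Assumes the structures are already superposed and atoms are paired by index.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))
```

Because the deviations are squared before averaging, a few badly placed residues can dominate the value, which is one reason a model with correct secondary structure can still show a large RMSD.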

12.
Computational protein design is a reverse procedure of protein folding and structure prediction, where constructing structures from evolutionarily related proteins has been demonstrated to be the most reliable method for protein 3-dimensional structure prediction. Following this spirit, we developed a novel method to design new protein sequences based on evolutionarily related protein families. For a given target structure, a set of proteins having a similar fold is identified from the PDB library by structural alignments. A structural profile is then constructed from the protein templates and used to guide the conformational search of amino acid sequence space, where physicochemical packing is accommodated by single-sequence based solvation, torsion angle, and secondary structure predictions. The method was tested on a computational folding experiment based on a large set of 87 protein structures covering different fold classes, which showed that the evolution-based design significantly enhances the foldability and biological functionality of the designed sequences compared to the traditional physics-based force field methods. Without using homologous proteins, the designed sequences can be folded with an average root-mean-square deviation of 2.1 Å to the target. As a case study, the method is extended to redesign all 243 structurally resolved proteins in the pathogenic bacterium Mycobacterium tuberculosis, which is the second leading cause of death from infectious disease. On a smaller scale, five sequences were randomly selected from the design pool and subjected to experimental validation. The results showed that all the designed proteins are soluble with distinct secondary structure and three have well-ordered tertiary structure, as demonstrated by circular dichroism and NMR spectroscopy.
Together, these results demonstrate a new avenue in computational protein design that uses knowledge of evolutionary conservation from protein structural families to engineer new protein molecules of improved fold stability and biological functionality.

13.
A data collection which merges protein structural and sequence information is described. Structural superpositions amongst proteins with similar main-chain fold were performed or collected from the literature. Sequences taken from the protein primary structure databases were associated with the multiple structural alignments provided that they shared at least 50% residue identity with one of the structural sequences and at least 50% of the structural sequence residues were alignable. Such restrictions allow reasonable confidence that the primary sequences share the conformation of the tertiary structural templates, except in the less conserved loop regions. Multiple structural superpositions were collected for 38 familial groups containing a total of 209 tertiary structures; 45 structures had no superposable mates and were used individually. Other information is also provided, such as main-chain and side-chain conformational angles, secondary structural assignments and the like. Wedding the primary and tertiary structural data resulted in an 8-fold increase of data bank sequence entries over those associated with the known three-dimensional architectures alone.
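
The two inclusion thresholds described above can be sketched in Python as follows. The gap character, the choice to measure identity over aligned (non-gap) positions, and the function names are all assumptions for illustration; the abstract does not give the exact definitions the authors used.

```python
def alignment_stats(structural, candidate, gap="-"):
    """Return (identity, coverage) for two equal-length aligned strings.

    identity: identical residues / positions where both sequences have a residue
    coverage: aligned positions / residues in the structural sequence
    """
    if len(structural) != len(candidate):
        raise ValueError("aligned sequences must have equal length")
    structural_residues = aligned = identical = 0
    for s, c in zip(structural, candidate):
        if s == gap:
            continue  # insertion relative to the structural sequence
        structural_residues += 1
        if c != gap:
            aligned += 1
            if s == c:
                identical += 1
    if structural_residues == 0:
        raise ValueError("structural sequence contains no residues")
    identity = identical / aligned if aligned else 0.0
    return identity, aligned / structural_residues

def passes_criteria(structural, candidate, min_identity=0.5, min_coverage=0.5):
    """Apply the two 50% thresholds (defaults follow the abstract)."""
    identity, coverage = alignment_stats(structural, candidate)
    return identity >= min_identity and coverage >= min_coverage
```

Under these assumed definitions, a candidate that matches the structural sequence perfectly over only half of its length still passes, while a full-length candidate with 25% identity does not.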

14.
Panchenko AR, Madej T. Proteins, 2004, 57(3): 539-547
Two proteins are considered to have a similar fold if sufficiently many of their secondary structure elements are positioned similarly in space and are connected in the same order. Such a common structural scaffold may arise due to either divergent or convergent evolution. The intervening unaligned regions ("loops") between the superimposable helices and strands can exhibit a wide range of similarity and may offer clues to the structural evolution of folds. One might argue that more closely related proteins differ less in their nonconserved loop regions than distantly related proteins and, at the same time, the degree of variability in the loop regions in structurally similar but unrelated proteins is higher than in homologs. Here we introduce a new measure for structural (dis)similarity in loop regions that is based on the concept of the Hausdorff metric. This measure is used to gauge protein relatedness and is tested on a benchmark of homologous and analogous protein structures. It has been shown that the new measure can distinguish homologous from analogous proteins with the same or higher accuracy than the conventional measures that are based on comparing proteins in structurally aligned regions. We argue that this result can be attributed to the higher sensitivity of the Hausdorff (dis)similarity measure in detecting particularly evident dissimilarities in structures and draw some conclusions about evolutionary relatedness of proteins in the most populated protein folds.
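
The Hausdorff metric mentioned above can be illustrated with a short Python sketch over two sets of 3D points, for example loop atom coordinates from two superposed structures. This is the textbook definition only; how the paper adapts it into a loop (dis)similarity score is not stated in the abstract, and the function names are hypothetical.

```python
import math

def directed_hausdorff(points_a, points_b):
    """Largest distance from any point in A to its nearest point in B."""
    return max(min(math.dist(p, q) for q in points_b) for p in points_a)

def hausdorff(points_a, points_b):
    """Symmetric Hausdorff distance between two non-empty point sets."""
    if not points_a or not points_b:
        raise ValueError("both point sets must be non-empty")
    return max(directed_hausdorff(points_a, points_b),
               directed_hausdorff(points_b, points_a))
```

Because the distance is set by the single worst-matched point, one loop residue that strays far from every residue of the other structure determines the whole value, which fits the abstract's remark about sensitivity to particularly evident dissimilarities.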

15.
Several studies based on the known three-dimensional (3-D) structures of proteins show that two homologous proteins with insignificant sequence similarity could adopt a common fold and may perform the same or similar biochemical functions. Hence, it is appropriate to use similarities in the 3-D structure of proteins rather than amino acid sequence similarities in modelling the evolution of distantly related proteins. Here we present an assessment of using 3-D structures in modelling the evolution of homologous proteins. Using a dataset of 108 protein domain families of known structure with at least 10 members per family, we present a comparison of the extent of structural and sequence dissimilarities among pairs of proteins, which are inputs into the construction of phylogenetic trees. We find that the correlation between the structure-based dissimilarity measures and the sequence-based dissimilarity measures is usually good if the sequence similarity among the homologues is about 30% or more. For protein families with low sequence similarity among the members, the correlation coefficient between the sequence-based and the structure-based dissimilarities is poor. In these cases the structure-based dendrogram clusters proteins with the most similar biochemical functional properties better than the sequence-similarity based dendrogram. In multi-domain protein families and disulphide-rich protein families the correlation coefficient for the match of sequence-based and structure-based dissimilarity (SDM) measures can be poor though the sequence identity could be higher than 30%. Hence it is suggested that protein evolution is best modelled using 3-D structures if the sequence similarities (SSM) of the homologues are very low.
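
The comparison described above amounts to correlating two lists of pairwise dissimilarities for the same protein pairs. The Python sketch below computes a Pearson correlation coefficient for that purpose; the abstract does not state which correlation coefficient was used, so the choice of Pearson and the function name are assumptions.

```python
import math

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers,
    e.g. sequence-based and structure-based dissimilarities per protein pair."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)
```

A value near 1 would mean the two dissimilarity measures rank protein pairs alike, so either could feed tree construction; a value near 0 corresponds to the low-similarity families for which the abstract favours structure-based measures.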

16.
Intron boundaries were extracted from genomic data and mapped onto single-domain human and murine protein structures taken from the Protein Data Bank. A first analysis of this set of proteins shows that intron boundaries prefer to be in non-regular secondary structure elements, while avoiding alpha-helices and beta-strands. This fact alone suggests an evolutionary model in which introns are constrained by protein structure, particularly by tertiary structure contacts. In addition, in silico recombination experiments of a subset of these proteins together with their homologues, including those in different species, show that introns have a tendency to occur away from artificial crossover hot spots. Altogether, these findings support a model in which genes can preferentially harbour introns in less constrained regions of the protein fold they code for. In the light of these findings, we discuss some implications for protein modelling and design.

17.
The secondary and tertiary structures of interferon were predicted from four homologous amino acid sequences. Three methods of secondary structure prediction gave differing results that were interpreted to suggest that there might be four α-helices that are important in the tertiary fold. The validity of this interpretation was assessed by the application of the methods to predict the secondary structures of two proteins known to consist of four α-helices. A possible tertiary model for interferon is then proposed in which the four α-helices pack into a right-handed bundle similar to that observed in several known protein structures. This model was shown to be stereochemically feasible by an α-helix docking algorithm. One of the resultant structures is shown to be compatible with the known disulphide linkages in interferon. Certain residues that are conserved between the different sequences lie near each other in our model and these residues might form a functional site. In the absence of a crystal structure for interferon, a predicted tertiary model will help further structural and functional studies.

18.
Alexander PA, Rozak DA, Orban J, Bryan PN. Biochemistry, 2005, 44(43): 14045-14054
To better understand how amino acid sequences specify unique tertiary folds, we have used random mutagenesis and phage display selection to evolve proteins with a high degree of sequence identity but different tertiary structures (homologous heteromorphs). The starting proteins in this evolutionary process were the IgG binding domains of streptococcal protein G (G(B)) and staphylococcal protein A (A(B)). These nonhomologous domains are similar in size and function but have different folds. G(B) has an alpha/beta fold, and A(B) is a three-helix bundle (3-alpha). IgG binding function is used to select for mutant proteins which retain the correct tertiary structure as the level of sequence identity is increased. A detailed thermodynamic analysis of the folding reactions and binding reactions for a pair of homologous heteromorphs (59% identical) is presented. High-resolution NMR structures of the pair are presented by He et al. [(2005) Biochemistry 44, 14055-14061]. Because the homologous but heteromorphic proteins are identical at most positions in their sequence, their essential folding signals must reside in the positions of nonidentity. Further, the thermodynamic linkage between folding and binding is used to assess the propensity of one sequence to adopt two unique folds.

19.
Hydrophobic cluster analysis (HCA) [15] is a very efficient method to analyse and compare protein sequences. Despite its effectiveness, this method is not widely used because it relies in part on the experience and training of the user. In this article, detailed guidelines on the use of HCA are presented and include discussions on: the definition of the hydrophobic clusters and their relationships with secondary and tertiary structures; the length of the clusters; the amino acid classification used for HCA; the HCA plot programs; and the working strategies. Various procedures for the analysis of a single sequence are presented: structural segmentation, structural domains and secondary structure evaluation. Like most sequence analysis methods, HCA is more efficient when several homologous sequences are compared. Procedures for the detection and alignment of distantly related proteins by HCA are described through several published examples along with 2 previously unreported cases: the beta-glucosidase from Ruminococcus albus is clearly related to the beta-glucosidases from Clostridium thermocellum and Hansenula anomala although they display a reverse organization of their constitutive domains; the alignment of the sequence of human GTPase activating protein with that of the Crk oncogene is presented. Finally, the pertinence of HCA in the identification of important residues for structure/function as well as in the preparation of homology modelling is discussed.

20.
Protein folding involves the formation of secondary structural elements from the primary sequence and their association into tertiary assemblies. The relation of this primary sequence to a specific folded protein structure remains a central question in structural biology. An increasing body of evidence suggests that variations in homologous sequence ranging from point mutations to substantial insertions or deletions can yield stable proteins with markedly different folds. Here we report the structural characterization of domain IV (D4) and ΔD4 (polypeptides with 222 and 160 amino acids, respectively) that differ by virtue of an N-terminal deletion of 62 amino acids (28% of the overall D4 sequence). The high-resolution crystal structures of the monomeric D4 and the dimeric ΔD4 reveal substantially different folds despite an overall conservation of secondary structure. These structures show that the formation of tertiary structures, even in extended polypeptide sequences, can be highly context dependent, and they serve as a model for structural plasticity in protein isoforms.
