Similar literature (20 articles found)
1.
18th Sir Hans Krebs lecture. Knowledge-based protein modelling and design (total citations: 12; self-citations: 0; citations by others: 12)
A systematic technique for protein modelling that is applicable to the design of drugs, peptide vaccines and novel proteins is described. Our approach is knowledge-based, depending on the structures of homologous or analogous proteins and more generally on a relational database of protein three-dimensional structures. The procedure simultaneously aligns the known tertiary structures, selects fragments from the structurally conserved regions on the basis of sequence homology, aligns these with the 'average structure' or 'framework', builds on the loops selected from homologous proteins or a wider database, substitutes sidechains and energy-minimises the resultant model. Applications to modelling a homologous structure, tissue plasminogen activator on the basis of another serine proteinase, and to modelling an analogous protein, HIV viral proteinase on the basis of aspartic proteinases, are described. The converse problem of ab initio design is also addressed: this involves the selection of an amino acid sequence to give a particular tertiary structure, in this case a symmetrical domain of two Greek-key motifs.

2.
Tao Y  Julian RR 《Biochemistry》2012,51(8):1796-1802
A simple mass spectrometry-based method for examining protein structure, called SNAPP (selective noncovalent adduct protein probing), is used to evaluate the structural consequences of point mutations in naturally occurring sequence variants from different species. SNAPP monitors changes in the attachment of noncovalent adducts to proteins as a function of structural state. Mutations that perturb the electrostatic surface structure of a protein affect noncovalent attachment and are easily observed with SNAPP. Mutations that do not alter the tertiary structure or electrostatic surface structure yield similar results by SNAPP. For example, bovine, porcine, and human insulin all have very similar backbone structures and no basic or acidic residue mutations, and the SNAPP distributions for all three proteins are very similar. In contrast, four variants of cytochrome c (cytc) have varying degrees of sequence homology, which are reflected in the observed SNAPP distributions. Bovine and pigeon cytc have several basic or acidic residue substitutions relative to horse cytc, but the SNAPP distributions for all three proteins are similar. This suggests that these mutations do not significantly influence the protein surface structure. On the other hand, yeast cytc has the least sequence homology and exhibits a unique, though related, SNAPP distribution. Even greater differences are observed for lysozyme. Hen and human lysozyme have identical tertiary structures but significant variations in the locations of numerous basic and acidic residues. The SNAPP distributions are quite distinct for the two forms of lysozyme, suggesting significant differences in the surface structures. In summary, SNAPP experiments are relatively easy to perform, require minimal sample consumption, and provide a facile route for comparison of protein surface structure between highly homologous proteins.

3.
We have recently reported the first complete amino acid sequence of an iron-containing superoxide dismutase. The iron enzyme is thought to be closely homologous to the manganese-containing superoxide dismutases. The availability of complete amino acid sequence information for four manganese superoxide dismutases and the crystal structures for two iron and two manganese superoxide dismutases prompted us to investigate the degree of homology between the two proteins at various levels. We report that it is not possible to clearly distinguish the two proteins on the basis of their secondary or tertiary structures. It would appear that a small number of single site substitutions are responsible for conferring distinguishing properties between the two proteins. Substitution of glycine 77 and glutamine 154 by a glutamine and an alanine respectively in Photobacterium leiognathi iron superoxide dismutase may distinguish the kinetic and other particular properties of this protein from the manganese protein (and other iron superoxide dismutases). Furthermore the primary structure of both the iron and manganese proteins does not appear to have any homology with any other known amino acid sequence.

4.
The total number of protein-protein complex structures currently available in the Protein Data Bank (PDB) is six times smaller than the total number of tertiary structures in the PDB, which limits the power of homology-based approaches to complex structure modeling. We present a threading-recombination approach, COTH, to boost the protein complex structure library by combining tertiary structure templates with complex alignments. The query sequences are first aligned to complex templates using a modified dynamic programming algorithm, guided by ab initio binding-site predictions. The monomer alignments are then shifted to the multimeric template framework by structural alignments. COTH was tested on 500 nonhomologous dimeric proteins and successfully detected correct templates in 50% of the cases after homologous templates were excluded, significantly outperforming conventional homology modeling algorithms. It also shows a higher accuracy in interface modeling than rigid-body docking of unbound structures from ZDOCK, although with lower coverage. These data demonstrate new avenues to model complex structures from nonhomologous templates.
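
As an illustration of the dynamic programming step mentioned above, the following is a minimal sketch of textbook Needleman-Wunsch global alignment scoring. It is not COTH's modified algorithm: the binding-site guidance is not reproduced, and the function name and the match, mismatch and gap values are assumptions chosen for the example.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment score with a linear gap penalty.

    The scoring values are illustrative assumptions, not COTH parameters.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]

    # Aligning a prefix against the empty string costs one gap per residue.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    # Each cell takes the best of: align the two residues, or open a gap in
    # either sequence.
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + pair,
                score[i - 1][j] + gap,
                score[i][j - 1] + gap,
            )
    return score[-1][-1]
```

A position-specific term, such as a bonus for columns predicted to lie in a binding site, could be added to `pair`; how COTH actually weights such predictions is not stated in the abstract.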

5.
C Sander  R Schneider 《Proteins》1991,9(1):56-68
The database of known protein three-dimensional structures can be significantly increased by the use of sequence homology, based on the following observations. (1) The database of known sequences, currently at more than 12,000 proteins, is two orders of magnitude larger than the database of known structures. (2) The currently most powerful method of predicting protein structures is model building by homology. (3) Structural homology can be inferred from the level of sequence similarity. (4) The threshold of sequence similarity sufficient for structural homology depends strongly on the length of the alignment. Here, we first quantify the relation between sequence similarity, structure similarity, and alignment length by an exhaustive survey of alignments between proteins of known structure and report a homology threshold curve as a function of alignment length. We then produce a database of homology-derived secondary structure of proteins (HSSP) by aligning to each protein of known structure all sequences deemed homologous on the basis of the threshold curve. For each known protein structure, the derived database contains the aligned sequences, secondary structure, sequence variability, and sequence profile. Tertiary structures of the aligned sequences are implied, but not modeled explicitly. The database effectively increases the number of known protein structures by a factor of five to more than 1800. The results may be useful in assessing the structural significance of matches in sequence database searches, in deriving preferences and patterns for structure prediction, in elucidating the structural role of conserved residues, and in modeling three-dimensional detail by homology.

6.
We present an automated method incorporated into a software package, FOLDER, to fold a protein sequence on a given three-dimensional (3D) template. Starting with the sequence alignment of a family of homologous proteins, tertiary structures are modeled using the known 3D structure of one member of the family as a template. Homologous interatomic distances from the template are used as constraints. For nonhomologous regions in the model protein, the lower and the upper bounds for the interatomic distances are imposed by steric constraints and the globular dimensions of the template, respectively. Distance geometry is used to embed an ensemble of structures consistent with these distance bounds. Structures are selected from this ensemble based on minimal distance error criteria, after a penalty function optimization step. These structures are then refined using energy optimization methods. The method is tested by simulating the alpha-chain of horse hemoglobin using the alpha-chain of human hemoglobin as the template and by comparing the generated models with the crystal structure of the alpha-chain of horse hemoglobin. We also test the packing efficiency of this method by reconstructing the atomic positions of the interior side chains beyond C beta atoms of a protein domain from a known 3D structure. In both test cases, models retain the template constraints and any additionally imposed constraints while the packing of the interior residues is optimized with no short contacts or bond deformations. To demonstrate the use of this method in simulating structures of proteins with nonhomologous disulfides, we construct a model of murine interleukin (IL)-4 using the NMR structure of human IL-4 as the template. The resulting geometry of the nonhomologous disulfide in the model structure for murine IL-4 is consistent with standard disulfide geometry.
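
Distance geometry methods commonly tighten the distance bounds before embedding. The sketch below shows one standard preprocessing step, triangle-inequality smoothing of the upper bounds; it is an illustration only, and whether FOLDER performs this step in this form is an assumption. The function name and the input layout (a symmetric matrix of upper bounds with zeros on the diagonal) are hypothetical.

```python
def smooth_upper_bounds(upper):
    """Tighten upper distance bounds using the triangle inequality.

    `upper[i][j]` is the largest allowed distance between atoms i and j.
    For every intermediate atom k, d(i, j) <= d(i, k) + d(k, j), so any
    upper bound larger than upper[i][k] + upper[k][j] can be lowered.
    The triple loop is the Floyd-Warshall shortest-path scheme.
    """
    n = len(upper)
    u = [row[:] for row in upper]  # work on a copy; the input is untouched
    for k in range(n):
        for i in range(n):
            for j in range(n):
                via_k = u[i][k] + u[k][j]
                if via_k < u[i][j]:
                    u[i][j] = via_k
    return u
```

For three atoms with bounds of 1.5 between neighbours and a loose bound of 10.0 between the end atoms, smoothing lowers the loose bound to 3.0. Lower bounds need a separate, related pass that is not shown here.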

7.
The technique of model-building a protein of known sequence but unknown tertiary structure from the structures of homologous proteins is probably the most reliable means so far of mapping from primary to tertiary structure. A key step towards this aim is to develop ways of aligning three-dimensional structures of homologous proteins, thereby deriving rules useful for protein modelling. We have developed a generalized differential-geometric representation of protein local conformation for use in a protein comparison program which aligns protein sequences on the basis of their sequence and conformational knowledge. Because the differential-geometric distance measure between local conformations is independent of the coordinate frame and retains chirality information, the comparison program is easily implemented, relatively rational and reasonably fast. The utility of this program for aligning closely and distantly related homologous proteins is demonstrated by multiple alignment of globins, serine proteinases and aspartic proteinase domains. In particular, the method achieves a rational alignment between the mammalian and microbial serine proteinases when compared with many published alignment programs.

8.
Homology modeling is a powerful technique that greatly increases the value of experimental structure determination by using the structural information of one protein to predict the structures of homologous proteins. We have previously described a method of homology modeling by satisfaction of spatial restraints (Li et al., Protein Sci 1997;6:956-970). The Homology Modeling Automatically (HOMA) web site is a new tool that uses this method to predict the 3D structure of a target protein based on the sequence alignment of the target protein to a template protein and the structure coordinates of the template. The user is presented with the resulting models, together with an extensive structure validation report providing critical assessments of the quality of the resulting homology models. The homology modeling method employed by HOMA was assessed and validated using twenty-four groups of homologous proteins. Using HOMA, homology models were generated for 510 proteins, including 264 proteins modeled with correct folds and 246 modeled with incorrect folds. Accuracies of these models were assessed by superimposition on the corresponding experimentally determined structures. A subset of these results was compared with parallel studies of modeling accuracy using several other automated homology modeling approaches. Overall, HOMA provides prediction accuracies similar to other state-of-the-art homology modeling methods. We also provide an evaluation of several structure quality validation tools in assessing the accuracy of homology models generated with HOMA. This study demonstrates that Verify3D (Luthy et al., Nature 1992;356:83-85) and ProsaII (Sippl, Proteins 1993;17:355-362) are most sensitive in distinguishing between homology models with correct or incorrect folds. For homology models that have the correct fold, the steric conformational energy (including primarily the Van der Waals energy), MolProbity clashscore (Word et al., Protein Sci 2000;9:2251-2259), and the PROCHECK G-factors (Laskowski et al., J Biomol NMR 1996;8:477-486) provide sensitive and consistent methods for assessing accuracy and can distinguish between homology models of higher and lower accuracy. As demonstrated in the accompanying paper (Bhattacharya et al., accompanying paper), combinations of these scores for models generated with HOMA provide a basis for distinguishing low from high accuracy models.

9.
The primary and secondary structure of human plasma apolipoprotein A-I and apolipoprotein E-3 have been analyzed to further our understanding of the secondary and tertiary conformation of these proteins and the structure and function of plasma lipoprotein particles. The primary sequences of these proteins were analyzed with computer programs: (a) to identify repeated patterns within these proteins on the basis of conservative substitutions and similarities within the physicochemical properties of each residue; (b) for local averaging, hydrophobic moment, and Fourier analysis of the physicochemical properties; and (c) for secondary structure prediction of each protein carried out using homology, statistical, and information theory based methods. Circular dichroism was used to study purified lipid-protein complexes of each protein and quantitate the secondary structure in a lipid environment. The data from these analyses were integrated into a single secondary structure prediction to derive a model of each protein. The sequence homology within apolipoproteins A-I, E-3, and A-IV is used to derive a consensus sequence for two 11 amino acid repeating sequences in this family of proteins.

10.
The pattern of residue substitution in divergently evolving families of globular proteins is highly variable. At each position in a fold there are constraints on the identities of amino acids from both the three-dimensional structure and the function of the protein. To characterize and quantify the structural constraints, we have made a comparative analysis of families of homologous globular proteins. Residues are classified according to amino acid type, secondary structure, accessibility of the sidechain, and existence of hydrogen bonds from sidechain to other sidechains or peptide carbonyl or amide functions. There are distinct patterns of substitution especially where residues are both solvent inaccessible and hydrogen bonded through their sidechains. The patterns of residue substitution can be used to construct templates or to identify 'key' residues if one or more structures are known. Conversely, analysis of conservation and substitution across a large family of aligned sequences in terms of substitution profiles can allow prediction of tertiary environment or indicate a functional role. Similar analyses can be used to test the validity of putative structures if several homologous sequences are available.
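
The kind of tabulation described above can be sketched as follows: substitutions are counted separately for each structural environment across the columns of an alignment. This is a simplified illustration, not the authors' procedure. The function name, the use of one environment label per alignment column, and the labels themselves ("buried", "exposed") are assumptions.

```python
from collections import Counter
from itertools import combinations


def substitution_counts(alignment, environments):
    """Count residue pairs observed in each structural environment.

    `alignment` is a list of equal-length aligned sequences ('-' marks a gap).
    `environments` gives one hypothetical environment label per column.
    Returns {environment: Counter({(residue_a, residue_b): count})}, with each
    pair stored in sorted order so that (A, G) and (G, A) are pooled.
    """
    counts = {}
    for col, env in enumerate(environments):
        # Gaps carry no residue identity, so they are skipped.
        residues = [seq[col] for seq in alignment if seq[col] != "-"]
        table = counts.setdefault(env, Counter())
        for x, y in combinations(residues, 2):
            table[tuple(sorted((x, y)))] += 1
    return counts
```

For example, with the alignment `["ALK", "AIK", "GLR"]` and labels `["buried", "buried", "exposed"]`, the buried table records two A/G and two I/L pairs, and the exposed table records two K/R pairs. Normalising such counts by residue frequencies would be the next step towards a substitution profile.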

11.
The similarity between average distance maps (ADMs; Kikuchi et al., 1988a), that is, predicted contact maps, of two proteins with homologous tertiary structures is examined. The shapes of ADMs are compared by superposing the ADMs of two homologous proteins. We also compare the shapes of the actual contact maps for each pair of proteins. For each pair of maps, we search for the optimal superposition mode, the one in which the two proteins appear most similar. It is concluded that two ADMs are also similar when the actual tertiary structures of the two proteins are similar. A criterion for similarity of maps is also proposed. The possible application of this method to detecting weak homology between protein structures is discussed.
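
To make the idea of comparing contact maps concrete, here is a minimal sketch that builds an actual contact map from residue coordinates and scores the overlap of two maps. It does not compute an ADM, which is predicted from sequence, and it does not search over superposition modes. The 8.0 distance cutoff and the Jaccard-style overlap score are assumptions, not the similarity criterion proposed in the paper.

```python
import math


def contact_map(coords, cutoff=8.0):
    """Boolean matrix: True where two residues lie within `cutoff` of each other.

    `coords` is a list of (x, y, z) tuples, one per residue (e.g. C-alpha atoms).
    """
    n = len(coords)
    return [
        [math.dist(coords[i], coords[j]) <= cutoff for j in range(n)]
        for i in range(n)
    ]


def map_overlap(map_a, map_b):
    """Shared contacts divided by contacts present in either map.

    Only residue pairs with i < j are counted, so the diagonal is ignored and
    each pair is counted once. Both maps must have the same size.
    """
    shared = either = 0
    n = len(map_a)
    for i in range(n):
        for j in range(i + 1, n):
            if map_a[i][j] or map_b[i][j]:
                either += 1
                if map_a[i][j] and map_b[i][j]:
                    shared += 1
    # Two maps with no contacts at all are treated as identical.
    return shared / either if either else 1.0
```

An overlap of 1.0 means the two maps contain exactly the same contacts. Maps of different lengths would first need an alignment or an offset search, which is the role the superposition step plays in the abstract.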

12.
The secondary and tertiary structures of interferon were predicted from four homologous amino acid sequences. Three methods of secondary structure prediction gave differing results that were interpreted to suggest that there might be four α-helices that are important in the tertiary fold. The validity of this interpretation was assessed by the application of the methods to predict the secondary structures of two proteins known to consist of four α-helices. A possible tertiary model for interferon is then proposed in which the four α-helices pack into a right-handed bundle similar to that observed in several known protein structures. This model was shown to be stereochemically feasible by an α-helix docking algorithm. One of the resultant structures is shown to be compatible with the known disulphide linkages in interferon. Certain residues that are conserved between the different sequences lie near each other in our model and these residues might form a functional site. In the absence of a crystal structure for interferon, a predicted tertiary model will help further structural and functional studies.

13.
This paper describes a novel computer graphics tool for predicting protein structures. The method is based on structural profiles, which are plots of hydrophobicity, parameters used for secondary structure prediction, or other residue-specific traits against sequence number. Similar structural profiles can indicate similar tertiary structures, even in the absence of sequence homology. The profiles of reference proteins with known structure can be used for prediction. In the method presented here, structural profiles are compared by interactive computer graphics, using the program Multiplot. As a test, a structural profile comparison of several proteins known to have similar 3D structures is presented. Comparison of structural profiles detects similar folding of the two domains of rhodanese, which was not easily detected by sequence homology.
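
A structural profile of the kind described above can be sketched as a sliding-window average of a per-residue property. The example below uses the Kyte-Doolittle hydropathy scale and a Pearson correlation to compare two profiles numerically. Both choices are assumptions made for illustration: the abstract does not name a scale, and Multiplot compares profiles visually rather than by a correlation score.

```python
import math

# Kyte-Doolittle hydropathy values (assumed scale for this sketch).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=7):
    """Mean hydropathy of each full window along the sequence.

    Returns len(sequence) - window + 1 values, or an empty list if the
    sequence is shorter than the window.
    """
    values = [KYTE_DOOLITTLE[aa] for aa in sequence]
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


def profile_correlation(p, q):
    """Pearson correlation between two equal-length profiles."""
    n = len(p)
    mean_p, mean_q = sum(p) / n, sum(q) / n
    cov = sum((x - mean_p) * (y - mean_q) for x, y in zip(p, q))
    var_p = sum((x - mean_p) ** 2 for x in p)
    var_q = sum((y - mean_q) ** 2 for y in q)
    if var_p == 0 or var_q == 0:
        return 0.0  # a flat profile has no shape to correlate
    return cov / math.sqrt(var_p * var_q)
```

Profiles of proteins with different lengths would need to be aligned or shifted against each other before a correlation is meaningful, which is the part the interactive graphics handle in the original method.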

14.
15.
Fold assignments for proteins from the Escherichia coli genome are carried out using BASIC, a profile-profile alignment algorithm recently tested on fold recognition benchmarks and on the Mycoplasma genitalium genome, and PSI BLAST, the newest generation of the de facto standard in homology search algorithms. The fold assignments are followed by automated modeling and the resulting three-dimensional models are analyzed for possible function prediction. Close to 30% of the proteins encoded in the E. coli genome can be recognized as homologous to a protein family with known structure. Most of these homologies (23% of the entire genome) can be recognized by both the PSI BLAST and BASIC algorithms, but the latter recognizes an additional 260 homologies. Previous estimates suggested that only 10-15% of E. coli proteins can be characterized this way. This dramatic increase in the number of recognized homologies between E. coli proteins and structurally characterized protein families is partly due to the rapid increase of the database of known protein structures, but mostly it is due to the significant improvement in prediction algorithms. Knowing protein structure adds a new dimension to our understanding of its function and the predictions presented here can be used to predict function for uncharacterized proteins. Several examples, analyzed in more detail in this paper, include the DPS protein protecting DNA from oxidative damage (predicted to be homologous to ferritin with iron ion acting as a reducing agent) and the ahpC/tsa family of proteins, which provides resistance to various oxidizing agents (predicted to be homologous to glutathione peroxidase).

16.
The microbial rhodopsins (MR) are homologous to putative chaperone and retinal-binding proteins of fungi. These proteins comprise a coherent family that we have termed the MR family. We have used modeling techniques to predict the structure of one of the putative yeast chaperone proteins, YRO2, based on homology with bacteriorhodopsins (BR). Availability of the structure allowed depiction of conserved residues that are likely to be of functional significance. The results lead us to predict an extracellular protein folding function and a transmembrane proton transport pathway. We suggest that protein folding is energized by a novel mechanism involving the proton motive force. We further show that MR family proteins are distantly related to a family of fungal, animal and plant proteins that includes the human lysosomal cystine transporter (LCT; cystinosin), mutations in which cause cystinosis. Sequence and phylogenetic analyses of both the MR family and the LCT family are reported. Proteins in both families are of the same approximate size, exhibit seven putative transmembrane alpha-helical spanners (TMSs) and show limited sequence similarity. We show that the LCT family arose by an internal gene duplication event and that TMSs 1-3 are homologous to TMSs 5-7. Although the same could not be demonstrated statistically for MR family members, homology with the LCT family suggests (but does not prove) a common evolutionary pathway. Thus, TMSs 1-3 and 5-7 in both LCT and MR family members may share a common origin, accounting for their shared structural features.

17.
The hyperthermophilic archaeon Archaeoglobus fulgidus contains an L-Ala dehydrogenase (AlaDH, EC 1.4.1.1) that is not homologous to known bacterial dehydrogenases and appears to represent a previously unrecognized archaeal group of NAD-dependent dehydrogenases. The gene (GenBank; TIGR AF1665) was annotated initially as an ornithine cyclodeaminase (OCD) on the basis of strong homology with the mu crystallin/OCD protein family. We report the structure of the NAD-bound AF1665 AlaDH (AF-AlaDH) at 2.3 Å in a C2 crystal form with the 70 kDa dimer in the asymmetric unit, as the first structural representative of this family. Consistent with its lack of homology to bacterial AlaDH proteins, which are mostly hexameric, the archaeal dimer has a novel structure. Although both types of AlaDH enzyme include a Rossmann-type NAD-binding domain, the arrangement of strands in the C-terminal half of this domain is novel, and the other (catalytic) domain in the archaeal protein has a new fold. The active site presents a cluster of conserved Arg and Lys side-chains over the pro-R face of the cofactor. In addition, the best ordered of the 338 water molecules in the structure is positioned well for mechanistic interaction. The overall structure and active site are compared with other dehydrogenases, including the AlaDH from Phormidium lapideum. Implications for the catalytic mechanism and for the structures of homologs are considered. The archaeal AlaDH represents an ancient and previously undescribed subclass of Rossmann-fold proteins that includes bacterial ornithine and lysine cyclodeaminases, marsupial lens proteins and, in man, a thyroid hormone-binding protein that exhibits 30% sequence identity with AF1665.

18.
Hydrophobic cluster analysis (HCA) [15] is a very efficient method to analyse and compare protein sequences. Despite its effectiveness, this method is not widely used because it relies in part on the experience and training of the user. In this article, detailed guidelines as to the use of HCA are presented and include discussions on: the definition of the hydrophobic clusters and their relationships with secondary and tertiary structures; the length of the clusters; the amino acid classification used for HCA; the HCA plot programs; and the working strategies. Various procedures for the analysis of a single sequence are presented: structural segmentation, structural domains and secondary structure evaluation. Like most sequence analysis methods, HCA is more efficient when several homologous sequences are compared. Procedures for the detection and alignment of distantly related proteins by HCA are described through several published examples along with 2 previously unreported cases: the beta-glucosidase from Ruminococcus albus is clearly related to the beta-glucosidases from Clostridium thermocellum and Hansenula anomala although they display a reverse organization of their constitutive domains; the alignment of the sequence of human GTPase activating protein with that of the Crk oncogene is presented. Finally, the pertinence of HCA in the identification of important residues for structure/function as well as in the preparation of homology modelling is discussed.

19.
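Detailed knowledge of a protein's tertiary structure helps in understanding its biological function. With progress in plant genome research, more than 50 plant metallothionein-like (MT-L) genes have been identified. To date, however, only a few MT-L proteins have been purified and no structures have been reported, so a method for analysing the structural features of this class of proteins is needed. In this study, based on known structural data for mammalian MTs, we derived interatomic distance constraints for the CXC and CXXC motifs and for the metal-sulfur cluster structures, and used a distance geometry algorithm to compute possible conformations of the target proteins. Structures with significantly smaller objective function values and low conformational energy were then selected by statistical analysis as the predicted structures of the cysteine-rich regions of these proteins. This established a structure prediction method suited to the cysteine-rich regions of plant metallothionein-like proteins. The method correctly predicted the structure of blue crab MT, whose structure is known, which indicates that it is feasible. The method was also used to predict the structure of the cysteine-rich regions of a rapeseed MT-L protein.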

20.
The complete primary structures of two variant specific glycoproteins (VSGs) of the nannomonad Trypanosoma (N.) congolense are presented. These coat proteins subserve the function of antigenic variation. The secondary structure potentials of both VSGs have been calculated. The amino acid sequences and secondary structure potentials of these VSGs have been compared with the primary structures and secondary structure potentials of several Trypanosoma brucei complex VSGs. In homologous regions, the T. brucei complex VSGs show a pattern of sharply contrasting secondary structure potentials. It has been suggested previously that this pattern gives rise to different folding structures in different members of this polygene protein family. Thus, different short regions of the polypeptide sequence are exposed as antigenic "caps" on the solvent-exposed surface of intact trypanosomes. A sharply contrasting secondary structure potential pattern is also found in regions of the two T. congolense VSGs. However, there is little homology of primary structure between each of the two T. congolense VSGs and any member of the T. brucei complex VSG polygene family whose primary structure has been determined.
