Similar articles
 20 similar articles found
1.
PALI is a database of structure-based sequence alignments and phylogenetic relationships derived from the three-dimensional structures of homologous proteins. The database enables grouping of pairs of homologous protein structures on the basis of their sequence identity calculated from the structure-based alignment. PALI also enables association of a new sequence with a family and automatic generation of a dendrogram combining the query sequence and homologous protein structures.
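The grouping by sequence identity described above can be illustrated with a minimal sketch. It is not PALI's own code: the function names, the choice to count only columns where both sequences have a residue, and the 10-point bin width are all assumptions made for illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where both sequences have a residue.

    Assumption: gaps are written as '-' and gapped columns are ignored.
    Other denominators (e.g. alignment length) are also in common use.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns in which neither sequence has a gap.
    pairs = [(x, y) for x, y in zip(aligned_a, aligned_b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)


def group_pairs_by_identity(alignments, bin_width=10):
    """Group named pairwise alignments into identity bins keyed by the bin's lower bound."""
    bins = {}
    for name, (a, b) in alignments.items():
        pid = percent_identity(a, b)
        # Clamp so that 100% identity falls into the top bin.
        lower = min(int(pid // bin_width) * bin_width, 100 - bin_width)
        bins.setdefault(lower, []).append(name)
    return bins


# Hypothetical toy alignments, not taken from PALI.
toy = {"pair1": ("ACDE", "ACFE"), "pair2": ("AC-DE", "ACGDE")}
print(group_pairs_by_identity(toy))
```

Here `pair1` has three identical residues in four aligned columns, and `pair2` is identical in every ungapped column, so the two pairs land in different bins.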

2.
We describe a method to identify protein domain boundaries from sequence information alone, based on the assumption that hydrophobic residues cluster together in space. SnapDRAGON is a suite of programs developed to predict domain boundaries based on the consistency observed in a set of alternative ab initio three-dimensional (3D) models generated for a given protein multiple sequence alignment. This is achieved by running a distance geometry-based folding technique in conjunction with a 3D-domain assignment algorithm. The overall accuracy of our method in predicting the number of domains for a non-redundant data set of 414 multiple alignments, representing 185 single and 231 multiple-domain proteins, is 72.4%. Using domain linker regions observed in the tertiary structures associated with each query alignment as the standard of truth, inter-domain boundary positions are delineated with an accuracy of 63.9% for proteins comprising continuous domains only, and 35.4% for proteins with discontinuous domains. Overall, domain boundaries are delineated with an accuracy of 51.8%. The prediction accuracy values are independent of the pair-wise sequence similarities within each of the alignments. These results demonstrate the capability of our method to delineate domains in protein sequences associated with a wide variety of structural domain organisations.

3.
The key reaction of protein synthesis, peptidyl transfer, is catalysed in all living organisms by the ribosome, an advanced and highly efficient molecular machine. During the last decade, extensive X-ray crystallographic and NMR studies of the three-dimensional structure of ribosomal proteins, ribosomal RNA components and their complexes with ribosomal proteins, and of several translation factors in different functional states have taken us to a new level of understanding of the mechanism of function of the protein synthesis machinery. Among the remarkable new features revealed by structural studies is the mimicry of the tRNA molecule by elongation factor G, ribosomal recycling factor and the eukaryotic release factor 1. Several other translation factors, for which three-dimensional structures are not yet known, are also expected to show some form of tRNA mimicry. The efforts of several crystallographic and biochemical groups have resulted in the determination by X-ray crystallography of the structures of the 30S and 50S subunits at moderate resolution, and of the structure of the 70S ribosome both by X-ray crystallography and cryo-electron microscopy (EM). In addition, low-resolution cryo-EM models of the ribosome with different translation factors and tRNA have been obtained. The new ribosomal models allowed for the first time a clear identification of the functional centres of the ribosome and of the binding sites for tRNA and ribosomal proteins with known three-dimensional structure. The new structural data have opened the way for the design of new experiments aimed at a deeper understanding, at an atomic level, of the dynamics of the system.

4.
We have been developing FAMSBASE, a protein homology-modeling database of whole ORFs predicted from genome sequences. The latest update of FAMSBASE (), which is based on the protein three-dimensional (3D) structures released by November 2003, contains modeled 3D structures for 368,724 open reading frames (ORFs) derived from genomes of 276 species, namely 17 archaebacterial, 130 eubacterial, 18 eukaryotic and 111 phage genomes. Those 276 genomes are predicted to have 734,193 ORFs in total, and the current FAMSBASE contains protein 3D structures for approximately 50% of the ORF products. However, cases in which a modeled 3D structure covers an entire ORF product are rare. When the portion of an ORF covered by a modeled 3D structure is compared across the three kingdoms of life, approximately 60% of the ORFs in archaebacteria and eubacteria have modeled 3D structures covering almost the entire amino acid sequence, whereas the percentage falls to about 30% in eukaryotes. When annual differences in the number of ORFs with a modeled 3D structure are calculated, the fraction of modeled 3D structures of soluble proteins has increased by 5% for archaebacteria and by 7% for eubacteria in the last 3 years. Assuming that this rate is maintained and that determination of 3D structures for predicted disordered regions is unattainable, whole soluble protein model structures of prokaryotes without the putative disordered regions will be in hand within 15 years. For eukaryotic proteins, they will be in hand within 25 years. The 3D structures we will have at those times are not the 3D structures of the entire proteins encoded in single ORFs, but the 3D structures of separate structural domains. Measuring or predicting the spatial arrangement of structural domains in an ORF will then be an upcoming issue for structural genomics.
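The headline figures in this abstract can be checked with a few lines of arithmetic. The sketch below only restates numbers quoted above; the variable names are illustrative, and it does not attempt to reproduce the 15-year and 25-year extrapolations, whose inputs are not given here.

```python
# Figures quoted in the abstract.
modeled_orfs = 368_724
total_orfs = 734_193
genomes = {"archaebacterial": 17, "eubacterial": 130, "eukaryotic": 18, "phage": 111}

# Share of predicted ORFs that have a modeled 3D structure.
coverage_percent = 100.0 * modeled_orfs / total_orfs
total_genomes = sum(genomes.values())

print(f"genomes: {total_genomes}")
print(f"ORFs with a modeled structure: {coverage_percent:.1f}%")
```

The genome counts sum to the stated 276, and the coverage works out to roughly half of all ORFs, consistent with the "approximately 50%" in the text.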

5.
Domains are the evolutionary units that comprise proteins, and most proteins are built from more than one domain. Domains can be shuffled by recombination to create proteins with new arrangements of domains. Using structural domain assignments, we examined the combinations of domains in the proteins of 131 completely sequenced organisms. We found two-domain and three-domain combinations that recur in different protein contexts with different partner domains. The domains within these combinations have a particular functional and spatial relationship. These units are larger than individual domains and we term them "supra-domains". Amongst the supra-domains, we identified some 1400 (1203 two-domain and 166 three-domain) combinations that are statistically significantly over-represented relative to the occurrence and versatility of the individual component domains. Over one-third of all structurally assigned multi-domain proteins contain these over-represented supra-domains. This means that investigation of the structural and functional relationships of the domains forming these popular combinations would be particularly useful for an understanding of multi-domain protein function and evolution as well as for genome annotation. These and other supra-domains were analysed for their versatility, duplication, their distribution across the three kingdoms of life and their functional classes. By examining the three-dimensional structures of several examples of supra-domains in different biological processes, we identify two basic types of spatial relationships between the component domains: the combined function of the two domains is such that either the geometry of the two domains is crucial and there is a tight constraint on the interface, or the precise orientation of the domains is less important and they are spatially separate. Frequently, the role of the supra-domain becomes clear only once the three-dimensional structure is known. Since this is the case for only a quarter of the supra-domains, we provide a list of the most important unknown supra-domains as potential targets for structural genomics projects.

6.
Characterizing the three-dimensional structure of macromolecules is central to understanding their function. Traditionally, structures of proteins and their complexes have been determined using experimental techniques such as X-ray crystallography, NMR, or cryo-electron microscopy, applied individually or in an integrative manner. Meanwhile, however, computational methods for protein structure prediction have been improving their accuracy, gradually, then suddenly, with the breakthrough advance by AlphaFold2, whose models of monomeric proteins are often as accurate as experimental structures. This breakthrough foreshadows a new era of computational methods that can build accurate models for most monomeric proteins. Here, we envision how such accurate modeling methods can combine with experimental structural biology techniques, enhancing integrative structural biology. We highlight the challenges that arise when considering multiple structural conformations, protein complexes, and polymorphic assemblies. These challenges will motivate further developments, both in modeling programs and in methods to solve experimental structures, towards better and quicker investigation of structure–function relationships.

7.
Three-dimensional structures of only a handful of membrane proteins have been solved, in contrast to the thousands of structures of water-soluble proteins. Difficulties in crystallization have inhibited the determination of the three-dimensional structure of membrane proteins by X-ray crystallography and have spotlighted the critical need for alternative approaches to membrane protein structure. A new approach to the three-dimensional structure of membrane proteins has been developed and tested on the integral membrane protein, bacteriorhodopsin, the crystal structure of which had previously been determined. An overlapping series of 13 peptides, spanning the entire sequence of bacteriorhodopsin, was synthesized, and the structures of these peptides were determined by NMR in dimethylsulfoxide solution. These structures were assembled into a three-dimensional construct by superimposing the overlapping sequences at the ends of each peptide. Onto this construct were written all the distance and angle constraints obtained from the individual solution structures along with a limited number of experimental inter-helical distance constraints, and the construct was subjected to simulated annealing. A three-dimensional structure, determined exclusively by the experimental constraints, emerged that was similar to the crystal structure of this protein. This result suggests an alternative approach to the acquisition of structural information for membrane proteins consisting of helical bundles.

8.
Lipoate scavenging from the human host is essential for malaria parasite survival. Scavenged lipoate is covalently attached to three parasite proteins: the H‐protein and the E2 subunits of branched chain amino acid dehydrogenase (BCDH) and α‐ketoglutarate dehydrogenase (KDH). We show mitochondrial localization for the E2 subunits of BCDH and KDH, similar to previously localized H‐protein, demonstrating that all three lipoylated proteins reside in the parasite mitochondrion. The lipoate ligase 1, LipL1, has been shown to reside in the mitochondrion and it catalyses the lipoylation of the H‐protein; however, we show that LipL1 alone cannot lipoylate BCDH or KDH. A second mitochondrial protein with homology to lipoate ligases, LipL2, does not show ligase activity and is not capable of lipoylating any of the mitochondrial substrates. Instead, BCDH and KDH are lipoylated through a novel mechanism requiring both LipL1 and LipL2. This mechanism is sensitive to redox conditions where BCDH and KDH are exclusively lipoylated under strong reducing conditions in contrast to the H‐protein which is preferentially lipoylated under less reducing conditions. Thus, malaria parasites contain two different routes of mitochondrial lipoylation, an arrangement that has not been described for any other organism.

9.
The identification of protein biochemical functions based on their three-dimensional structures is much needed in the post-genome-sequencing era. We have developed a new method to identify and predict protein biochemical functions using the similarity information of molecular surface geometries and electrostatic potentials on the surfaces. Our prediction system consists of a similarity search method based on a clique search algorithm and the molecular surface database eF-site (electrostatic surface of functional-site in proteins). Using this system, functional sites similar to those of phosphoenolpyruvate carboxykinase were detected in several mononucleotide-binding proteins, which have different folds. We also applied our method to a hypothetical protein, MJ0226 from Methanococcus jannaschii, and detected the mononucleotide binding site from the similarity to other proteins having different folds.

10.
Paramagnetic metal ions generate pseudocontact shifts (PCSs) in nuclear magnetic resonance spectra that are manifested as easily measurable changes in chemical shifts. Metals can be incorporated into proteins through metal binding tags, and PCS data constitute powerful long-range restraints on the positions of nuclear spins relative to the coordinate system of the magnetic susceptibility anisotropy tensor (Δχ-tensor) of the metal ion. We show that three-dimensional structures of proteins can reliably be determined using PCS data from a single metal binding site combined with backbone chemical shifts. The program PCS-ROSETTA automatically determines the Δχ-tensor and metal position from the PCS data during the structure calculations, without any prior knowledge of the protein structure. The program can determine structures accurately for proteins of up to 150 residues, offering a powerful new approach to protein structure determination that relies exclusively on readily measurable backbone chemical shifts and easily discriminates between correctly and incorrectly folded conformations.
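For orientation, the dependence of a PCS on the position of the nuclear spin is commonly written as below. This is the standard textbook expression, not something quoted from the abstract, and the symbol names are the conventional ones rather than the authors' notation: r, θ and φ are the polar coordinates of the nuclear spin in the principal axis frame of the Δχ-tensor centred on the metal ion, and Δχ_ax and Δχ_rh are the axial and rhombic components of the tensor.

```latex
\Delta\delta^{\mathrm{PCS}}
  = \frac{1}{12\pi r^{3}}
    \left[
      \Delta\chi_{\mathrm{ax}}\left(3\cos^{2}\theta - 1\right)
      + \frac{3}{2}\,\Delta\chi_{\mathrm{rh}}\,\sin^{2}\theta\,\cos 2\varphi
    \right]
```

The 1/r³ falloff is what makes PCSs usable as long-range restraints, and the angular terms are why the tensor orientation and metal position have to be determined together with the structure.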

11.
Protein biotinylation and lipoylation are post-translational modifications in which biotin or lipoic acid is covalently attached to specific proteins containing biotin/lipoyl attachment domains. All the currently reported natural proteins containing biotin/lipoyl attachment domains are multidomain proteins and can only be modified by either biotin or lipoic acid in vivo. We have identified a single domain protein with 73 amino acid residues from Bacillus subtilis strain 168, and it can be both biotinylated and lipoylated in Escherichia coli. The protein is therefore named biotin/lipoyl attachment protein (BLAP). This is the first report that a natural single domain protein exists as both a biotin and lipoic acid receptor. The solution structure of apo-BLAP showed that it adopts a typical fold of the biotin/lipoyl attachment domain. The structure of biotinylated BLAP revealed that the biotin moiety is covalently attached to the side chain of Lys(35), and the bicyclic ring of biotin is folded back and immobilized on the protein surface. The biotin moiety immobilization is mainly due to an interaction between the biotin ureido ring and the indole ring of Trp(12). The NMR study also indicated that the lipoyl group of the lipoylated BLAP is immobilized on the protein surface in a fashion similar to the biotin moiety in the biotinylated protein.

12.
Overview of structural genomics: from structure to function   Total citations: 7, self-citations: 0, citations by others: 7
The unprecedented increase in the number of new protein sequences arising from genomics and proteomics highlights directly the need for methods to rapidly and reliably determine the molecular and cellular functions of these proteins. One such approach, structural genomics, aims to delineate the total repertoire of protein folds, thereby providing three-dimensional portraits for all proteins in a living organism and to infer molecular functions of the proteins. The goal of obtaining protein structures on a genomic scale has motivated the development of high-throughput technologies for macromolecular structure determination, which have begun to produce structures at a greater rate than previously possible. These new structures have revealed many unexpected functional and evolutionary relationships that were hidden at the sequence level.

13.
Sistla RK, K V B, Vishveshwara S. Proteins, 2005, 59(3): 616-626
We present a novel method for the identification of structural domains and domain interface residues in proteins by a graph spectral method. This method converts the three-dimensional structure of the protein into a graph by using atomic coordinates from the PDB file. Domain definitions are obtained by constructing either a protein backbone graph or a protein side-chain graph. The graph is constructed based on the interactions between amino acid residues in the three-dimensional structure of the protein. The spectral parameters of such a graph contain information regarding the domains and subdomains in the protein structure. This is based on the fact that the interactions among amino acids are more numerous within a domain than across domains. This is evident in the spectra of the protein backbone and the side-chain graphs, thus differentiating the structural domains from one another. Further, residues that occur at the interface of two domains can also be easily identified from the spectra. This method is simple, elegant, and robust. Moreover, a single numeric computation yields both the domain definitions and the interface residues.
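The idea that a graph spectrum separates densely connected domains can be sketched with a generic spectral bisection. This is an assumption-laden illustration, not the authors' algorithm: it uses the graph Laplacian and the eigenvector of its second-smallest eigenvalue (the Fiedler vector), found by shifted power iteration, on a hypothetical contact graph of eight residues. The function names and the toy edge list are invented for the example.

```python
import math


def fiedler_vector(n, edges, iterations=500):
    """Approximate the Laplacian eigenvector of the second-smallest eigenvalue.

    Assumes a connected, undirected graph with nodes numbered 0..n-1.
    """
    neighbours = [set() for _ in range(n)]
    for i, j in edges:
        neighbours[i].add(j)
        neighbours[j].add(i)
    degree = [len(nb) for nb in neighbours]
    shift = 2.0 * max(degree)  # upper bound on the largest Laplacian eigenvalue

    # Start from a ramp; the constant component is removed on every pass.
    x = [float(i) for i in range(n)]
    for _ in range(iterations):
        mean = sum(x) / n
        x = [v - mean for v in x]  # project out the constant eigenvector
        # y = (shift * I - L) x, with (L x)_i = degree_i * x_i - sum of neighbour values
        y = [shift * x[i] - (degree[i] * x[i] - sum(x[j] for j in neighbours[i]))
             for i in range(n)]
        norm = math.sqrt(sum(v * v for v in y))
        x = [v / norm for v in y]
    return x


def split_by_sign(vector):
    """Assign nodes to two groups according to the sign of their vector component."""
    negative = [i for i, v in enumerate(vector) if v < 0]
    non_negative = [i for i, v in enumerate(vector) if v >= 0]
    return negative, non_negative


# Hypothetical contact graph: two fully connected groups of four residues
# joined by a single contact between residues 3 and 4.
toy_edges = [(i, j) for i in range(4) for j in range(i + 1, 4)]
toy_edges += [(i, j) for i in range(4, 8) for j in range(i + 1, 8)]
toy_edges.append((3, 4))

print(split_by_sign(fiedler_vector(8, toy_edges)))
```

On this toy graph the sign pattern of the vector separates residues 0 to 3 from residues 4 to 7. A real application would first derive the edges from PDB coordinates, for instance with a distance cutoff, which is left out here.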

14.
WW domains are small globular protein interaction modules found in a wide spectrum of proteins. They recognize their target proteins by binding specifically to short linear peptide motifs that are often proline-rich. To infer the determinants of the ligand binding propensities of WW domains, we analyzed 42 WW domains. We built models of the 3D structures of the WW domains and their peptide complexes by comparative modeling supplemented with experimental data from peptide library screens. The models provide new insights into the orientation and position of the peptide in structures of WW domain-peptide complexes that have not yet been determined experimentally. From a protein interaction property similarity analysis (PIPSA) of the WW domain structures, we show that electrostatic potential is a distinguishing feature of WW domains and we propose a structure-based classification of WW domains that expands the existing ligand-based classification scheme. Application of the comparative molecular field analysis (CoMFA), GRID/GOLPE and comparative binding energy (COMBINE) analysis methods permitted the derivation of quantitative structure-activity relationships (QSARs) that aid in identifying the specificity-determining residues within WW domains and their ligand-recognition motifs. Using these QSARs, a new group-specific sequence feature of WW domains that target arginine-containing peptides was identified. Finally, the QSAR models were applied to the design of a peptide to bind with greater affinity than the known binding peptide sequences of the yRSP5-1 WW domain. The prediction was verified experimentally, providing validation of the QSAR models and demonstrating the possibility of rationally improving peptide affinity for WW domains. The QSAR models may also be applied to the prediction of the specificity of WW domains with uncharacterized ligand-binding properties.

15.
An approach is described for modelling the three-dimensional structure of a protein from the tertiary structures of several homologous proteins that have been determined by X-ray analysis. A method is developed for the simultaneous superposition of several protein molecules and for the calculation of an 'average structure' or 'framework'. Investigation of the convergence properties of this method, in the case of both weighted and unweighted least squares, demonstrates that both give a unique answer and the latter is robust for a homologous family of proteins. Multi-dimensional scaling is used to subgroup the proteins with respect to structural homology. The framework calculated on the basis of the family of homologous proteins, or of an appropriate subgroup, is used to align fragments of the known protein structures of high sequence homology with the unknown. This alignment provides a basis for model building the tertiary structure. Different techniques for using the framework to model the mainchain of various globins and an immunoglobulin domain in the structurally conserved regions are investigated.

16.
Structural bioinformatics of membrane proteins is still in its infancy, and the picture of their fold space is only beginning to emerge. Because only a handful of three-dimensional structures are available, sequence comparison and structure prediction remain the main tools for investigating sequence-structure relationships in membrane protein families. Here we present a comprehensive analysis of the structural families corresponding to α-helical membrane proteins with at least three transmembrane helices. The new version of our CAMPS database (CAMPS 2.0) covers nearly 1300 eukaryotic, prokaryotic, and viral genomes. Using an advanced classification procedure, which is based on high-order hidden Markov models and considers both sequence similarity as well as the number of transmembrane helices and loop lengths, we identified 1353 structurally homogeneous clusters roughly corresponding to membrane protein folds. Only 53 clusters are associated with experimentally determined three-dimensional structures, and for these clusters CAMPS is in reasonable agreement with structure-based classification approaches such as SCOP and CATH. We therefore estimate that ~1300 structures would need to be determined to provide a sufficient structural coverage of polytopic membrane proteins. CAMPS 2.0 is available at http://webclu.bio.wzw.tum.de/CAMPS2.0/.

17.
Referee: Dr. Ruth Nussinov, SAIC Frederick, Bldg. 469, Room 151, Frederick, MD 21702-1201

Hyperthermophilic organisms grow optimally close to the boiling point of water. As a consequence, their macromolecules must be much more thermostable than those from mesophilic species. Here, proteins from hyperthermophiles and mesophiles are compared with respect to their thermodynamic and kinetic stabilities. The known differences in amino acid sequences and three-dimensional structures between intrinsically thermostable and thermolabile proteins are summarized, and the crucial role of electrostatic interactions for protein stability at high temperatures is highlighted. Successful attempts to increase the thermostability of proteins, based either on rational design or on directed evolution, are presented. The relationship between the high thermostability of enzymes from hyperthermophiles and their low catalytic activity at room temperature is discussed. Not all proteins from hyperthermophiles are thermostable enough to retain their structures and functions at the high physiological temperatures. It is shown how this shortcoming can be overcome by extrinsic factors such as large molecular chaperones and small compatible solutes. Finally, the potential of thermostable enzymes for biotechnology is discussed.

18.
Cells use the post‐translational modification ADP‐ribosylation to control a host of biological activities. In some pathogenic bacteria, an operon‐encoded mono‐ADP‐ribosylation cycle mediates response to host‐induced oxidative stress. In this system, reversible mono‐ADP‐ribosylation of a lipoylated target protein represses oxidative stress response. An NAD+‐dependent sirtuin catalyzes the single ADP‐ribose (ADPr) addition, while a linked macrodomain‐containing protein removes the ADPr. Here we report the crystal structure of the sirtuin‐linked macrodomain protein from Staphylococcus aureus, SauMacro (also known as SAV0325), to 1.75‐Å resolution. The monomeric SauMacro bears a previously unidentified Zn2+‐binding site that putatively aids in substrate recognition and catalysis. An amino‐terminal three‐helix bundle motif unique to this class of macrodomain proteins provides a structural scaffold for the Zn2+ site. Structural features of the enzyme further indicate that a cleft proximal to the Zn2+‐binding site appears well suited for ADPr binding, while a deep hydrophobic channel in the protein core is suitable for binding the lipoate of the lipoylated protein target.

19.
The gap between the number of protein sequences and protein structures is increasing rapidly, exacerbated by the completion of numerous genome projects now flooding into public databases. To fill this gap, comparative protein modelling is widely considered the most accurate technique for predicting the three-dimensional shape of proteins. High-throughput, automatic protein modelling should considerably increase our access to protein structures other than those determined by experimental techniques such as X-ray crystallography and NMR (nuclear magnetic resonance) spectroscopy. The uses for these complete three-dimensional models are growing rapidly, ranging from guiding site-directed mutagenesis experiments to protein-protein interaction predictions. In recognition of this, a number of very useful comparative modelling servers have begun to emerge on the Web. Molecular biologists now have a powerful web-based toolkit to construct models, assess their accuracy, and use them to explain and predict experiments. There is, however, still much to do by those engaged in algorithmic development if comparative modelling is to compete on an equal footing with experimental protein structure determination techniques.

20.
Fold assignments for proteins from the Escherichia coli genome are carried out using BASIC, a profile-profile alignment algorithm recently tested on fold recognition benchmarks and on the Mycoplasma genitalium genome, and PSI BLAST, the newest generation of the de facto standard in homology search algorithms. The fold assignments are followed by automated modeling, and the resulting three-dimensional models are analyzed for possible function prediction. Close to 30% of the proteins encoded in the E. coli genome can be recognized as homologous to a protein family with known structure. Most of these homologies (23% of the entire genome) can be recognized by both the PSI BLAST and BASIC algorithms, but the latter recognizes an additional 260 homologies. Previous estimates suggested that only 10-15% of E. coli proteins can be characterized this way. This dramatic increase in the number of recognized homologies between E. coli proteins and structurally characterized protein families is partly due to the rapid increase of the database of known protein structures, but mostly it is due to the significant improvement in prediction algorithms. Knowing a protein's structure adds a new dimension to our understanding of its function, and the predictions presented here can be used to predict function for uncharacterized proteins. Several examples, analyzed in more detail in this paper, include the DPS protein protecting DNA from oxidative damage (predicted to be homologous to ferritin with iron ion acting as a reducing agent) and the ahpC/tsa family of proteins, which provides resistance to various oxidizing agents (predicted to be homologous to glutathione peroxidase).
