Similar literature
 Found 20 similar articles (search time: 31 ms)
1.
Proteins form arguably the most significant link between genotype and phenotype. Understanding the relationship between protein sequence and structure, and applying this knowledge to predict function, is difficult. One way to investigate these relationships is by considering the space of protein folds and how one might move from fold to fold through similarity, or potential evolutionary relationships. The many individual characterisations of fold space presented in the literature can tell us a lot about how well the current Protein Data Bank represents protein fold space, how convergence and divergence may affect protein evolution, how proteins affect the whole of which they are part, and how proteins themselves function. A synthesis of these different approaches and viewpoints seems the most likely way to further our knowledge of protein structure evolution and thus, facilitate improved protein structure design and prediction.

2.
The genomes of over 60 organisms from all three kingdoms of life are now entirely sequenced. In many respects, the inventory of proteins used in different kingdoms appears surprisingly similar. However, eukaryotes differ from other kingdoms in that they use many long proteins, and have more proteins with coiled-coil helices and with regions abundant in regular secondary structure. Particular structural domains are used in many pathways. Nevertheless, one domain tends to occur only once in one particular pathway. Many proteins do not have close homologues in different species (orphans) and there could even be folds that are specific to one species. This view implies that protein fold space is discrete. An alternative model suggests that structure space is continuous and that modern proteins evolved by aggregating fragments of ancient proteins. Either way, after having harvested proteomes by applying standard tools, the challenge now seems to be to develop better methods for comparative proteomics.

3.
Lu HM, Liang J. Proteins 2008, 70(2): 442-449
To study protein nascent chain folding during biosynthesis, we investigate the folding behavior of models of hydrophobic and polar (HP) chains at growing length using both a two-dimensional square lattice model and an optimized three-dimensional 4-state discrete off-lattice model. After enumerating all possible sequences and conformations of HP heteropolymers up to length N = 18 and N = 15 in two and three-dimensional space, respectively, we examine changes in adopted structure, stability, and tolerance to single point mutation as the nascent chain grows. In both models, we find that stable model proteins have fewer folded nascent chains during growth, and often will only fold after reaching full length. For the few occasions where partial chains of stable proteins fold, these partial conformations on average are very similar to the corresponding parts of the final conformations at full length. Conversely, we find that sequences with fewer stable nascent chains and sequences with native-like folded nascent chains are more stable. In addition, these stable sequences in general can have many more point mutations and still fold into the same conformation as the wild type sequence. Our results suggest that stable proteins are less likely to be trapped in metastable conformations during biosynthesis, and are more resistant to point mutations. Our results also imply that less stable proteins will require the assistance of chaperones and other factors during nascent chain folding. Taken together with other reported studies, it seems that cotranslational folding may not be a general mechanism of in vivo protein folding for small proteins, and in vitro folding studies are still relevant for understanding how proteins fold biologically.
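
The enumeration idea in this abstract can be illustrated with a minimal sketch of the two-dimensional square-lattice HP model. It assumes the textbook HP energy (each non-bonded H-H nearest-neighbour contact counts equally) and treats a chain as "folded" when its best-scoring conformation is unique up to mirror image; the abstract does not state the authors' exact energy function or folding criterion, so both are assumptions, and all function names are hypothetical. Exhaustive enumeration grows quickly with chain length, so the sketch is only practical for short chains.

```python
# Minimal sketch of the 2D square-lattice HP model (assumed textbook form,
# not the authors' implementation).

MOVES = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def conformations(n):
    """Yield every self-avoiding walk of n beads on the square lattice.

    The first bond is fixed along +x, which removes the four rotations
    but NOT the mirror image, so bent walks appear twice.
    """
    if n == 1:
        yield ((0, 0),)
        return

    def extend(path, occupied):
        if len(path) == n:
            yield tuple(path)
            return
        x, y = path[-1]
        for dx, dy in MOVES:
            nxt = (x + dx, y + dy)
            if nxt not in occupied:
                path.append(nxt)
                occupied.add(nxt)
                yield from extend(path, occupied)
                path.pop()
                occupied.remove(nxt)

    start = [(0, 0), (1, 0)]
    yield from extend(start, set(start))


def hh_contacts(seq, conf):
    """Count non-bonded H-H nearest-neighbour contacts in one conformation."""
    index_at = {pos: i for i, pos in enumerate(conf)}
    count = 0
    for i, (x, y) in enumerate(conf):
        if seq[i] != "H":
            continue
        # Look only in +x and +y so that each lattice pair is counted once.
        for dx, dy in ((1, 0), (0, 1)):
            j = index_at.get((x + dx, y + dy))
            if j is not None and seq[j] == "H" and abs(i - j) > 1:
                count += 1
    return count


def ground_state(seq):
    """Return (max H-H contacts, number of conformations reaching it)."""
    best, degeneracy = -1, 0
    for conf in conformations(len(seq)):
        c = hh_contacts(seq, conf)
        if c > best:
            best, degeneracy = c, 1
        elif c == best:
            degeneracy += 1
    return best, degeneracy


def nascent_profile(seq, min_len=4):
    """Ground state of every N-terminal prefix, mimicking chain growth."""
    return [(n,) + ground_state(seq[:n]) for n in range(min_len, len(seq) + 1)]
```

Under these assumptions, a prefix whose degeneracy is 2 (one conformation plus its mirror image) and whose contact count is above zero would count as a folded nascent chain; `nascent_profile` lists that information for each growing length.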

4.
Proteins have many functions and predicting these is still one of the major challenges in theoretical biophysics and bioinformatics. Foremost amongst these functions is the need to fold correctly thereby allowing the other genetically dictated tasks that the protein has to carry out to proceed efficiently. In this work, some earlier algorithms for predicting protein domain folds are revisited and they are compared with more recently developed methods. In dealing with intractable problems such as fold prediction, when different algorithms show convergence onto the same result there is every reason to take all algorithms into account such that a consensus result can be arrived at. In this work it is shown that the application of different algorithms in protein structure prediction leads to results that do not converge as such but rather they collude in a striking and useful way that has never been considered before.

5.
Hafumi Nishi, Motonori Ota. Proteins 2010, 78(6): 1563-1574
Despite similarities in their sequence and structure, there are a number of homologous proteins that adopt various oligomeric states. Comparisons of these homologous protein pairs, in terms of residue substitutions at the protein–protein interfaces, have provided fundamental characteristics that describe how proteins interact with each other. We have prepared a dataset composed of pairs of related proteins with different homo‐oligomeric states. Using the protein complexes, the interface residues were identified, and using structural alignments, the shadow‐interface residues have been defined as the surface residues that align with the interface residues. Subsequently, we investigated residue substitutions between the interfaces and the shadow interfaces. Based on the degree of the contributions to the interactions, the aligned sites of the interfaces and shadow interfaces were divided into primary and secondary sites; the primary sites are the focus of this work. The primary sites were further classified into two groups (i.e. exposed and buried) based on the degree to which the residue is buried within the shadow interfaces. Using these classifications, two simple mechanisms that mediate the oligomeric states were identified. In the primary‐exposed sites, the residues on the shadow interfaces are replaced by more hydrophobic or aromatic residues, which are physicochemically favored at protein–protein interfaces. In the primary‐buried sites, the residues on the shadow interfaces are replaced by larger residues that protrude into other proteins. These simple rules are satisfied in 23 out of 25 Structural Classification of Proteins (SCOP) families with a different‐oligomeric‐state pair, and thus represent a basic strategy for modulating protein associations and dissociations. Proteins 2010. © 2009 Wiley‐Liss, Inc.

6.
Nucleotides are among the most extensively exploited chemical moieties in nature and, as a part of a handful of different protein ligands, nucleotides play key roles in energy transduction, enzymatic catalysis and regulation of protein function. We have previously reported that in many proteins with different folds and functions a distinctive adenine-binding motif is involved in the recognition of the Watson-Crick edge of adenine. Here, we show that many proteins do have clear structural motifs that recognize adenosine (and some other nucleotides and nucleotide analogs) not only through the Watson-Crick edge, but also through the sugar and Hoogsteen edges. Each of the three edges of adenosine has a donor-acceptor-donor (DAD) pattern that is often recognized by proteins via a complementary acceptor-donor-acceptor (ADA) motif, whereby three distinct hydrogen bonds are formed: two conventional N-H...O and N-H...N hydrogen bonds, and one weak C-H...O hydrogen bond. The local conformation of the adenine-binding loop is βββ or ββα and reflects the mode of nucleotide binding. Additionally, we report 21 proteins from five different folds that simultaneously recognize both the sugar edge and the Watson-Crick edge of adenine. In these proteins a unique β-loop-β supersecondary structure grasps an adenine-containing ligand between two identical adenine-binding motifs as part of the βαβ-loop-β fold.

7.
8.
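Classification and recognition of doubly wound proteins   Cited by: 1 (self-citations: 0, citations by others: 1)
Protein fold recognition is an important part of protein structure research. The doubly wound fold is a structurally typical and common fold type among α/β proteins. Seventy-nine typical doubly wound proteins with sequence identity below 25%, drawn from 22 families, were selected as the training set and clustered hierarchically using RMSD as the criterion; a structure-alignment-based profile hidden Markov model (profile-HMM) was then built for each cluster. With 9,505 samples from Astral 1.65 with sequence identity below 95% as the test set, the overall recognition sensitivity was 93.9%, the specificity 82.1%, and the MCC 0.876. The results show that, for fold types with many members for which a single unified model cannot be built, building a separate model for each cluster can achieve recognition with high accuracy.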
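
The three reported figures can be related to a confusion matrix with a short sketch. The abstract does not define its terms, so the function below (a hypothetical helper, with made-up counts in the usage note) returns both common readings of "specificity": the true-negative rate TN/(TN+FP) and the precision TP/(TP+FP), which some fold-recognition papers also call specificity. The MCC formula is the standard Matthews correlation coefficient.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Return sensitivity, true-negative rate, precision and MCC.

    tp, fp, tn, fn are the counts of true positives, false positives,
    true negatives and false negatives from a binary fold-recognition test.
    """
    sensitivity = tp / (tp + fn)
    true_negative_rate = tn / (tn + fp)
    precision = tp / (tp + fp)
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention MCC is taken as 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denominator if denominator else 0.0
    return {
        "sensitivity": sensitivity,
        "true_negative_rate": true_negative_rate,
        "precision": precision,
        "mcc": mcc,
    }
```

For illustrative counts `tp=90, fp=10, tn=890, fn=10`, this gives a sensitivity and precision of 0.9 and an MCC of about 0.889. As a hedged observation rather than a statement about the paper, the geometric mean of the reported 93.9% and 82.1% is about 0.878, close to the reported MCC of 0.876, which is what one would expect if "specificity" there means TP/(TP+FP) and negatives dominate the test set.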

9.
Animal toxins are small proteins built on the basis of a few disulfide bonded frameworks. Because of their high variability in sequence and biologic function, these proteins are now used as templates for protein engineering. Here we report the extensive characterization of the structure and dynamics of two toxin folds, the "three-finger" fold and the short alpha/beta scorpion fold found in snake and scorpion venoms, respectively. These two folds have a very different architecture; the short alpha/beta scorpion fold is highly compact, whereas the "three-finger" fold is a beta structure presenting large flexible loops. First, the crystal structure of the snake toxin alpha was solved at 1.8-Å resolution. Then, long molecular dynamics simulations (10 ns) in water boxes of the snake toxin alpha and the scorpion charybdotoxin were performed, starting either from the crystal or the solution structure. For both proteins, the crystal structure is stabilized by more hydrogen bonds than the solution structure, and the trajectory starting from the X-ray structure is more stable than the trajectory started from the NMR structure. The trajectories started from the X-ray structure are in agreement with the experimental NMR and X-ray data about the protein dynamics. Both proteins exhibit fast motions with an amplitude correlated to their secondary structure. In contrast, slower motions are essentially only observed in toxin alpha. The regions submitted to rare motions during the simulations are those that exhibit millisecond time-scale motions. Lastly, the structural variations within each fold family are described. The localization and the amplitude of these variations suggest that the regions presenting large-scale motions should be those tolerant to large insertions or deletions.

10.
Baoqiang Cao, Ron Elber. Proteins 2010, 78(4): 985-1003
We investigate small sequence adjustments (of one or a few amino acids) that induce large conformational transitions between distinct and stable folds of proteins. Such transitions are intriguing from evolutionary and protein‐design perspectives. They make it possible to search for ancient protein structures or to design protein switches that flip between folds and functions. A network of sequence flow between protein folds is computed for representative structures of the Protein Data Bank. The computed network is dense: on average, each structure is connected to tens of other folds. Proteins that attract sequences from a higher than expected number of neighboring folds are more likely to be enzymes and to adopt an alpha/beta fold. The large number of connections between folds may reflect the need of enzymes to adjust their structures for alternative substrates. The network of the Cro family is discussed, and we speculate that capacity is an important factor (but not the only one) that determines protein evolution. The experimentally observed flip from all alpha to alpha + beta fold is examined by the network tools. A kinetic model for the transition of sequences between the folds (with only protein stability in mind) is proposed. Proteins 2010. © 2009 Wiley‐Liss, Inc.

11.
We report herein the NMR structure of Tm0979, a structural proteomics target from Thermotoga maritima. The Tm0979 fold consists of four beta/alpha units, which form a central parallel beta-sheet with strand order 1234. The first three helices pack toward one face of the sheet and the fourth helix packs against the other face. The protein forms a dimer by adjacent parallel packing of the fourth helices sandwiched between the two beta-sheets. This fold is very interesting from several points of view. First, it represents the first structure determination for the DsrH family of conserved hypothetical proteins, which are involved in oxidation of intracellular sulfur but have no defined molecular function. Based on structure and sequence analysis, possible functions are discussed. Second, the fold of Tm0979 most closely resembles YchN-like folds; however, the proteins that adopt these folds differ in secondary structural elements and quaternary structure. Comparison of these proteins provides insight into possible mechanisms of evolution of quaternary structure through a simple mechanism of hydrophobicity-changing mutations of one or two residues. Third, the Tm0979 fold is found to be similar to flavodoxin-like folds and beta/alpha barrel proteins, and may provide a link between these very abundant folds and putative ancestral half-barrel proteins.

12.
Kifer I, Nussinov R, Wolfson HJ. Proteins 2011, 79(6): 1759-1773
The pathways by which proteins fold into their specific native structure are still an unsolved mystery. Currently, many methods for protein structure prediction are available, and most of them tackle the problem by relying on the vast amounts of data collected from known protein structures. These methods are often not concerned with the route the protein follows to reach its final fold. This work is based on the premise that proteins fold in a hierarchical manner. We present FOBIA, an automated method for predicting a protein structure. FOBIA consists of two main stages: the first finds matches between parts of the target sequence and independently folding structural units using profile-profile comparison. The second assembles these units into a 3D structure by searching and ranking their possible orientations toward each other using a docking-based approach. We have previously reported an application of an initial version of this strategy to homology-based targets. Since then we have considerably enhanced our method's abilities to allow it to address the more difficult template-based target category. This allows us to now apply FOBIA to the template-based targets of CASP8 and to show that it is both very efficient and promising. Our method can provide an alternative for template-based structure prediction, and in particular, the docking-based ranking technique presented here can be incorporated into any profile-profile comparison based method.

13.
The Sm and Sm-like proteins are conserved in all three domains of life and have emerged as important players in many different RNA-processing reactions. Their proposed role is to mediate RNA-RNA and/or RNA-protein interactions. In marked contrast to eukaryotes, bacteria appear to contain only one distinct Sm-like protein belonging to the Hfq family of proteins. Similarly, there are generally only one or two subtypes of Sm-related proteins in archaea, but at least one archaeon, Methanococcus jannaschii, encodes a protein that is related to Hfq. This archaeon does not contain any gene encoding a conventional archaeal Sm-type protein, suggesting that Hfq proteins and archaeal Sm-homologs can complement each other functionally. Here, we report the functional characterization of M. jannaschii Hfq and its crystal structure at 2.5 Å resolution. The protein forms a hexameric ring. The monomer fold, as well as the overall structure of the complex, is similar to that found for the bacterial Hfq proteins. However, clear differences are seen in the charge distribution on the distal face of the ring, which is unusually negative in M. jannaschii Hfq. Moreover, owing to a very short N-terminal alpha-helix, the overall diameter of the archaeal Hfq hexamer is significantly smaller than its bacterial counterparts. Functional analysis reveals that Escherichia coli and M. jannaschii Hfqs display very similar biochemical and biological properties. It thus appears that the archaeal and bacterial Hfq proteins are largely functionally interchangeable.

14.
A significant proportion of bacteria express two or more chaperonin genes. Chaperonins are a group of molecular chaperones, defined by sequence similarity, required for the folding of some cellular proteins. Chaperonin monomers have a mass of c. 60 kDa, and are typically found as large protein complexes containing 14 subunits arranged in two rings. The mechanism of action of the Escherichia coli GroEL protein has been studied in great detail. It acts by binding to unfolded proteins and enabling them to fold in a protected environment where they do not interact with any other proteins. GroEL can assist the folding of many proteins of different sizes, sequences, and structures, and homologues from many different bacteria can functionally replace GroEL in E. coli. What then are the functions of multiple chaperonins? Do they provide a mechanism for cells to increase their general chaperoning ability, or have they become specialized to take on specific novel cellular roles? Here I will review the genetic, biochemical, and phylogenetic evidence that has a bearing on this question, and show that there is good evidence for at least some specificity of function in multiple chaperonin genes.

15.
For many years it has been accepted that the sequence of a protein can specify its three-dimensional structure. However, there has been limited progress in explaining how the sequence dictates its fold and no attempt to do this computationally without the use of specific structural data has ever succeeded for any protein larger than 100 residues. We describe a method that can predict complex folds up to almost 200 residues using only basic principles that do not include any elements of sequence homology. The method does not simulate the folding chain but generates many thousands of models based on an idealized representation of structure. Each rough model is scored and the best are refined. On a set of five proteins, the correct folds score well and when tested on a set of larger proteins, the correct fold was ranked highest for some proteins of more than 150 residues, with others being close topological variants. All other methods that approach this level of success rely on the use of templates or fragments of known structures. Our method is unique in using a database of ideal models based on general packing rules that, in spirit, is closer to an ab initio approach.

16.
YibK is a 160 residue homodimeric protein belonging to the SPOUT class of methyltransferases. Proteins in this group all display a unique topological feature; the backbone polypeptide chain folds to form a deep trefoil knot. Such knotted structures were completely unpredicted, it being thought impossible for a protein to fold efficiently in this way. However, they are becoming more common and there are now a growing number of examples in the Protein Data Bank. These intriguing knotted structures represent a new and significant challenge in the field of protein folding. Here, we present an initial characterisation of the folding of YibK, one of the smallest knotted proteins to be identified. This is the first detailed folding study on a knotted protein to be reported. We have established conditions under which the protein can be denatured reversibly in vitro using urea, thereby showing that molecular chaperones are not required for the efficient folding of this protein. A series of equilibrium unfolding experiments were performed over a 400-fold range of protein concentration. Both secondary and tertiary structural probes show a single, protein concentration-dependent unfolding transition, and data are most consistent with a three-state equilibrium denaturation model involving a monomeric intermediate. Thermodynamic parameters obtained from the fit of the data to this model indicate that the intermediate is a stable species with appreciable secondary and tertiary structure; whether the topological knot remains in the intermediate state is still to be shown. Together, these results demonstrate that, despite its complex knotted structure, YibK is able to fold efficiently and behaves remarkably similarly to other dimeric proteins under equilibrium conditions.

17.
Abeln S, Deane CM. Proteins 2005, 60(4): 690-700
We review fold usage on completed genomes to explore protein structure evolution. The patterns of presence or absence of folds on genomes give us insights into the relationships between folds, the age of different folds and how we have arrived at the set of folds we see today. We examine the relationships between different measures which describe protein fold usage, such as the number of copies of a fold per genome, the number of families per fold, and the number of genomes a fold occurs on. We obtained these measures of fold usage by searching for the structural domains on 157 completed genome sequences from all three kingdoms of life. In our comparisons of these measures we found that bacteria have relatively more distinct folds on their genomes than archaea. Eukaryotes were found to have many more copies of a fold on their genomes. If we separate out the different fold classes, the alpha/beta class has relatively fewer distinct folds on large genomes, more copies of a fold on bacteria and more folds occurring in all three kingdoms simultaneously. These results possibly indicate that most alpha/beta folds originated earlier than other folds. The expected power law distribution is observed for copies of a fold per genome and we found a similar distribution for the number of families per fold. However, a more complicated distribution appears for fold occurrence across genomes, which strongly depends on fold class and kingdom. We also show that there is not a clear relationship between the three measures of fold usage. A fold which occurs on many genomes does not necessarily have many copies on each genome. Similarly, folds with many copies do not necessarily have many families or vice versa.

18.
The dominant view in protein science is that a three-dimensional (3-D) structure is a prerequisite for protein function. In contrast to this dominant view, there are many counterexample proteins that fail to fold into a 3-D structure, or that have local regions that fail to fold, and yet carry out function. Protein without fixed 3-D structure is called intrinsically disordered. Motivated by anecdotal accounts of higher rates of sequence evolution in disordered protein than in ordered protein we are exploring the molecular evolution of disordered proteins. To test whether disordered protein evolves more rapidly than ordered protein, pairwise genetic distances were compared between the ordered and the disordered regions of 26 protein families having at least one member with a structurally characterized region of disorder of 30 or more consecutive residues. For five families, there were no significant differences in pairwise genetic distances between ordered and disordered sequences. The disordered region evolved significantly more rapidly than the ordered region for 19 of the 26 families. The functions of these disordered regions are diverse, including binding sites for protein, DNA, or RNA and also including flexible linkers. The functions of some of these regions are unknown. The disordered regions evolved significantly more slowly than the ordered regions for the two remaining families. The functions of these more slowly evolving disordered regions include sites for DNA binding. More work is needed to understand the underlying causes of the variability in the evolutionary rates of intrinsically ordered and disordered protein.

19.
Although most proteins conform to the classical one‐structure/one‐function paradigm, an increasing number of proteins with dual structures and functions have been discovered. In response to cellular stimuli, such proteins undergo structural changes sufficiently dramatic to remodel even their secondary structures and domain organization. This “fold‐switching” capability fosters protein multi‐functionality, enabling cells to establish tight control over various biochemical processes. Accurate predictions of fold‐switching proteins could both suggest underlying mechanisms for uncharacterized biological processes and reveal potential drug targets. Recently, we developed a prediction method for fold‐switching proteins using structure‐based thermodynamic calculations and discrepancies between predicted and experimentally determined protein secondary structure (Porter and Looger, Proc Natl Acad Sci U S A 2018; 115:5968–5973). Here we seek to leverage the negative information found in these secondary structure prediction discrepancies. To do this, we quantified secondary structure prediction accuracies of 192 known fold‐switching regions (FSRs) within solved protein structures found in the Protein Data Bank (PDB). We find that the secondary structure prediction accuracies for these FSRs vary widely. Inaccurate secondary structure predictions are strongly associated with fold‐switching proteins compared to equally long segments of non‐fold‐switching proteins selected at random. These inaccurate predictions are enriched in helix‐to‐strand and strand‐to‐coil discrepancies. Finally, we find that most proteins with inaccurate secondary structure predictions are underrepresented in the PDB compared with their alternatively folded cognates, suggesting that unequal representation of fold‐switching conformers within the PDB could be an important cause of inaccurate secondary structure predictions. These results demonstrate that inconsistent secondary structure predictions can serve as a useful preliminary marker of fold switching.
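
One way to quantify the kind of discrepancy described here is a per-residue comparison of predicted and observed secondary-structure strings. The sketch below assumes a three-state alphabet (H = helix, E = strand, C = coil) and reads a "helix-to-strand" discrepancy as observed helix predicted as strand; the abstract specifies neither the alphabet nor the direction, so both are assumptions, and the function names are hypothetical.

```python
from collections import Counter


def ss_accuracy(predicted, observed):
    """Fraction of residues whose predicted state matches the observed one."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed strings differ in length")
    if not observed:
        raise ValueError("empty secondary-structure string")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


def ss_discrepancies(predicted, observed):
    """Count mismatches, keyed as (observed state, predicted state)."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed strings differ in length")
    return Counter((o, p) for p, o in zip(predicted, observed) if p != o)
```

For an observed segment `"HHHHEEEECC"` and a prediction `"HHEEEEECCC"`, the accuracy is 0.7, with two `("H", "E")` mismatches and one `("E", "C")` mismatch. A segment with unusually low accuracy would, on the abstract's argument, be a candidate for closer inspection rather than a confirmed fold switcher.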

20.
The Z-score of a protein is defined as the energy separation between the native fold and the average of an ensemble of misfolds in the units of the standard deviation of the ensemble. The Z-score is often used as a way of testing the knowledge-based potentials for their ability to recognize the native fold from other alternatives. However, it is not known what range of values the Z-scores should have if one had a correct potential. Here, we offer an estimate of Z-scores extracted from calorimetric measurements of proteins. The energies obtained from these experimental data are compared with those from computer simulations of a lattice model protein. It is suggested that the Z-scores calculated from different knowledge-based potentials are generally too small in comparison with the experimental values.
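
The definition in the first sentence translates directly into a few lines of code. This is a minimal sketch with a hypothetical function name; it assumes the sign convention Z = (E_native − mean(E_misfolds)) / sd(E_misfolds), so a well-separated native state has a negative Z, and it uses the population standard deviation of the misfold ensemble. The abstract fixes neither choice, and some authors report the magnitude instead.

```python
import statistics


def z_score(native_energy, misfold_energies):
    """Energy gap between native fold and misfold ensemble, in ensemble SDs.

    native_energy: energy of the native fold under some potential.
    misfold_energies: energies of the misfolded (decoy) conformations.
    """
    mean = statistics.fmean(misfold_energies)
    # Population standard deviation of the ensemble (an assumed convention).
    sd = statistics.pstdev(misfold_energies)
    if sd == 0:
        raise ValueError("misfold ensemble has zero energy spread")
    return (native_energy - mean) / sd
```

With a native energy of -10.0 and misfold energies of -1.0 and 1.0 (mean 0, standard deviation 1), the function returns -10.0; in this convention, the abstract's "too small" would correspond to values of smaller magnitude than the calorimetric estimates.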
