Similar articles
Found 20 similar articles; search took 31 ms.
1.
Proteins that share even low sequence homology are known to adopt similar folds. The beta-propeller structural motif is one such example. Identifying sequences that adopt a beta-propeller fold is useful for annotating protein structure and function. Often, tandem sequence repeats provide the necessary signal for identifying beta-propellers in proteins. In our recent analysis to identify cell surface proteins in archaeal and bacterial genomes, we identified some proteins that contain novel tandem repeats "LVIVD", "RIVW" and "LGxL". In this work, based on protein fold predictions and three-dimensional comparative modeling methods, we predicted that these repeat types fold as beta-propellers. Further, the evolutionary trace analysis of all proteins constituting amino acid sequence repeats in beta-propellers suggests that the novel repeats have diverged from a common ancestor.
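As a rough illustration of how the repeat motifs named in this abstract could be located in a sequence, the following minimal sketch scans a protein string for "LVIVD", "RIVW" and "LGxL". It assumes that "x" stands for any residue; the function name `find_repeats` and the toy sequence are hypothetical and are not taken from the paper, whose actual detection method is not described here.

```python
import re

# Motifs named in the abstract. Reading "x" as "any residue" is an assumption.
MOTIFS = {"LVIVD": "LVIVD", "RIVW": "RIVW", "LGxL": "LG.L"}


def find_repeats(sequence):
    """Return {motif name: list of 0-based start positions} for one sequence."""
    hits = {}
    for name, pattern in MOTIFS.items():
        # A zero-width lookahead reports overlapping occurrences as well.
        regex = re.compile(f"(?={pattern})")
        hits[name] = [match.start() for match in regex.finditer(sequence)]
    return hits


# Hypothetical toy sequence, not taken from any genome.
TOY_SEQUENCE = "MKLVIVDAAGRIVWTTLGALPPLVIVDKK"
print(find_repeats(TOY_SEQUENCE))
```

A plain pattern scan like this only flags candidate repeats; it says nothing about whether the region actually folds as a beta-propeller.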

2.
What are the selective pressures on protein sequences during evolution? Amino acid residues may be highly conserved for functional or structural (stability) reasons. Theoretical studies have proposed that residues involved in the folding nucleus may also be highly conserved. To test this we are using an experimental "fold approach" to the study of protein folding. This compares the folding and stability of a number of proteins that share the same fold, but have no common amino acid sequence or biological activity. The fold selected for this study is the immunoglobulin-like beta-sandwich fold, which is a fold that has no specifically conserved function. Four model proteins are used from two distinct superfamilies that share the immunoglobulin-like fold, the fibronectin type III and immunoglobulin superfamilies. Here, the fold approach and protein engineering are used to question the role of a highly conserved tyrosine in the "tyrosine corner" motif that is found ubiquitously and exclusively in Greek key proteins. In the four model beta-sandwich proteins characterised here, the tyrosine is the only residue that is absolutely conserved at equivalent sites. By mutating this position to phenylalanine, we show that the tyrosine hydroxyl is not required to nucleate folding in the immunoglobulin superfamily, whereas it is involved to some extent in early structure formation in the fibronectin type III superfamily. The tyrosine corner is important for stability: mutation to phenylalanine costs between 1.5 and 3 kcal mol⁻¹. We propose that the high level of conservation of the tyrosine is related to the structural restraints of the loop connecting the beta-sheets, representing an evolutionary "cul-de-sac".

3.
Internal symmetry is commonly observed in the majority of fundamental protein folds. Meanwhile, sufficient evidence suggests that nascent polypeptide chains of proteins have the potential to start the co-translational folding process and this process allows mRNA to contain additional information on protein structure. In this paper, we study the relationship between gene sequences and protein structures from the viewpoint of symmetry to explore how gene sequences code for structural symmetry in proteins. We found that, for a set of two-fold symmetric proteins from the left-handed beta-helix fold, intragenic symmetry always exists in their corresponding gene sequences. Meanwhile, codon usage bias and local mRNA structure might be involved in modulating translation speed for the formation of structural symmetry: a major decrease of local codon usage bias in the middle of the codon sequence can be identified as a common feature; and major or consecutive decreases in local mRNA folding energy near the boundaries of the symmetric substructures can also be observed. The results suggest that gene duplication and fusion may be an evolutionarily conserved process for this protein fold. In addition, the usage of rare codons and the formation of higher-order secondary structure near the boundaries of symmetric substructures might have coevolved as conserved mechanisms to slow down translation elongation and to facilitate effective folding of symmetric substructures. These findings provide valuable insights into our understanding of the mechanisms of translation and its evolution, as well as the design of proteins via symmetric modules.

4.
Recognition of protein fold from amino acid sequence is a challenging task. The structure and stability of proteins from different folds are mainly dictated by inter-residue interactions. In our earlier work, we successfully used medium- and long-range contacts for predicting protein folding rates, discriminating globular and membrane proteins and distinguishing protein structural classes. In this work, we analyze the role of inter-residue interactions in commonly occurring folds of globular proteins in order to understand their folding mechanisms. In the medium-range contacts, the globin fold and four-helical bundle proteins have more contacts than the DNA-RNA fold, although they all belong to the all-alpha class. In long-range contacts, only the ribonuclease fold prefers the 4-10 range and the other folding types prefer the range 21-30 in alpha/beta class proteins. Further, the preferred residues and residue pairs influenced by these different folds are discussed. The information about the preference of medium- and long-range contacts exhibited by the 20 amino acid residues can be effectively used to predict the folding type of each protein.

5.
Recent protein design experiments have demonstrated that proteins can migrate between folds through the accumulation of substitution mutations without visiting disordered or nonfunctional points in sequence space. To explore the biophysical mechanism underlying such transitions we use a three-letter continuous protein model with seven atoms per amino acid to provide realistic sequence-structure and sequence-function mappings through explicit simulation of the folding and interaction of model sequences. We start from two 16-amino-acid sequences folding into an α-helix and a β-hairpin, respectively, each of which has a preferred binding partner with 35 amino acids. We identify a mutational pathway between the two folds, which features a sharp fold switch. By contrast, we find that the transition in function is smooth. Moreover, the switch in preferred binding partner does not coincide with the fold switch. Discovery of new folds in evolution might therefore be facilitated by following fitness slopes in sequence space underpinned by binding-induced conformational switching.

6.
BACKGROUND: Are folding pathways conserved in protein families? To test this explicitly and ask to what extent structure specifies folding pathways requires comparison of proteins with a common fold. Our strategy is to choose members of a highly diverse protein family with no conservation of function and little or no sequence identity, but with structures that are essentially the same. The immunoglobulin-like fold is one of the most common structural families, and is subdivided into superfamilies with no detectable evolutionary or functional relationship. RESULTS: We compared the folding of a number of immunoglobulin-like proteins that have a common structural core and found a strong correlation between folding rate and stability. The results suggest that the folding pathways of these immunoglobulin-like proteins share common features. CONCLUSIONS: This study is the first to compare the folding of structurally related proteins that are members of different superfamilies. The most likely explanation for the results is that interactions that are important in defining the structure of immunoglobulin-like proteins are also used to guide folding.

7.
Using a recently developed program (SCOPmap) designed to automatically assign new protein structures to existing evolutionary-based classification schemes, we identify an evolutionarily conserved domain (EDD) common to three different folds: mannose transporter EIIA domain (EIIA-man), dihydroxyacetone kinase (Dak), and DegV. Several lines of evidence support unification of these three folds into a single superfamily: statistically significant sequence similarity detected by PSI-BLAST; "closed structural grouping" using DALI Z-scores (each protein inside a group finds all other group members with scores higher than those to proteins outside the group) that includes only these proteins sharing a unique alpha-helical hairpin at the C-terminus and excludes all other proteins with similar topology; similar domain fusions connect Dak and DegV, and genomic neighborhood organizations connect Dak and EIIA-man. Finally, both Dak and EIIA-man perform similar phosphotransfer reactions, suggesting a phosphotransferase activity for the DegV-like family of proteins, whose function other than lipid binding revealed in the crystal structure remains unknown.

9.
Hegyi H, Lin J, Greenbaum D, Gerstein M. Proteins, 2002, 47(2):126-141
We conducted a structural genomics analysis of the folds and structural superfamilies in the first 20 completely sequenced genomes by focusing on the patterns of fold usage and trying to identify structural characteristics of typical and atypical folds. We assigned folds to sequences using PSI-BLAST, run with a systematic protocol to reduce the amount of computational overhead. On average, folds could be assigned to about a fourth of the ORFs in the genomes and about a fifth of the amino acids in the proteomes. More than 80% of all the folds in the SCOP structural classification were identified in one of the 20 organisms, with worm and E. coli having the largest number of distinct folds. Folds are particularly effective at comprehensively measuring levels of gene duplication, because they group together even very remote homologues. Using folds, we find the average level of duplication varies depending on the complexity of the organism, ranging from 2.4 in M. genitalium to 32 for the worm, values significantly higher than those observed based purely on sequence similarity. We rank the common folds in the 20 organisms, finding that the top three are the P-loop NTP hydrolase, the ferredoxin fold, and the TIM-barrel, and discuss in detail the many factors that affect and bias these rankings. We also identify atypical folds that are "unique" to one of the organisms in our study and compare the characteristics of these folds with the most common ones. We find that common folds tend to be more multifunctional and associated with more regular, "symmetrical" structures than the unique ones. In addition, many of the unique folds are associated with proteins involved in cell defense (e.g., toxins). We analyze specific patterns of fold occurrence in the genomes by associating some of them with instances of horizontal transfer and others with gene loss. In particular, we find three possible examples of transfer between archaea and bacteria and six between eukarya and bacteria. We make available our detailed results at http://genecensus.org/20.

10.
The information required to generate a protein structure is contained in its amino acid sequence, but how three-dimensional information is mapped onto a linear sequence is still incompletely understood. Multiple structure alignments of similar protein structures have been used to investigate conserved sequence features but contradictory results have been obtained, due, in large part, to the absence of objective criteria to be used in the construction of sequence profiles and in the quantitative comparison of alignment results. Here, we report a new procedure for multiple structure alignment and use it to construct structure-based sequence profiles for similar proteins. The definition of "similar" is based on the structural alignment procedure and on the protein structural distance (PSD) described in paper I of this series, which offers an objective measure for protein structure relationships. Our approach is tested in two well-studied groups of proteins: serine proteases and Ig-like proteins. It is demonstrated that the quality of a sequence profile generated by a multiple structure alignment is quite sensitive to the PSD used as a threshold for the inclusion of proteins in the alignment. Specifically, if the proteins included in the aligned set are too distant in structure from one another, there will be a dilution of information and patterns that are relevant to a subset of the proteins are likely to be lost. In order to understand better how the same three-dimensional information can be encoded in seemingly unrelated sequences, structure-based sequence profiles are constructed for subsets of proteins belonging to nine superfolds. We identify patterns of relatively conserved residues in each subset of proteins. It is demonstrated that the most conserved residues are generally located in the regions where tertiary interactions occur and that are relatively conserved in structure. Nevertheless, the conservation patterns are relatively weak in all cases studied, indicating that structure-determining factors that do not require a particular sequential arrangement of amino acids, such as secondary structure propensities and hydrophobic interactions, are important in encoding protein fold information. In general, we find that similar structures can fold without having a set of highly conserved residue clusters or a well-conserved sequence profile; indeed, in some cases there is no apparent conservation pattern common to structures with the same fold. Thus, when a group of proteins exhibits a common and well-defined sequence pattern, it is more likely that these sequences have a close evolutionary relationship rather than the similarities having arisen from the structural requirements of a given fold.

11.
The functional significance of evolutionarily conserved motifs/patterns of short regions in proteins is well documented. Although a large number of sequences are conserved, only a small fraction of these are invariant across several organisms. Here, we have examined the structural features of the functionally important peptide sequences, which have been found invariant across diverse bacterial genera. Ramachandran angles (φ, ψ) have been used to analyze the conformation, folding patterns and geometrical location (buried/exposed) of these invariant peptides in different crystal structures harboring these sequences. The analysis indicates that the peptides preferred a single conformation in different protein structures, with the exception of only a few longer peptides that exhibited some conformational variability. In addition, the variability of conformation occurs mainly due to flipping of peptide units about the virtual Cα...Cα bond. However, for a given invariant peptide, the folding patterns are found to be similar in almost all cases. Moreover, such peptides are found to be buried in the protein core. Thus, we conclude that these invariant peptides are structurally important for the proteins, since they acquire unique structures across different proteins and can act as structural determinants (SD) of the proteins. The location of these SD peptides on the protein chain indicated that most of them are clustered towards the N-terminal and middle regions of the protein, with the C-terminal region exhibiting low preference. Another feature that emerges from this study is that some of these SD peptides can also play the roles of "fold boundaries" or "hinge nucleus" in the protein structure. The study indicates that these SD peptides may act as chain-reversal signatures, guiding the proteins to adopt appropriate folds. In some cases the invariant signature peptides may also act as folding nuclei (FN) of the proteins.

12.
One still cannot predict the 3D fold of a protein from its amino acid sequence, mainly because of errors in the energy estimates underlying the prediction. However, a recently developed theory [1] shows that, given a set of homologs (i.e., chains that share the same 3D fold despite numerous mutations), one can average the potential of each interaction over the homologs and thus predict the common 3D fold of a protein family even when a correct fold prediction for an individual sequence is impossible because the energies are known only approximately. This theoretical conclusion has been verified by simulation of the energy spectra of simplified models of protein chains [2], and further investigation of these simplified models shows that their true "native" fold can be found by folding a chain in which each interaction potential is averaged over the homologs. Finally, the applicability of the "homolog-averaging" approach is tested by recognition of real protein 3D structures. Both the gapless threading of sequences onto the known protein folds [3] and the more practically important gapped threading (which allows one to consider not only the known 3D structures but also folds more or less similar to them) show a significant increase in selectivity of the native chain fold recognition.
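The idea behind homolog averaging can be sketched with a toy calculation. The example below is a simplification under stated assumptions: it averages whole-chain energy estimates per candidate fold rather than each interaction potential, and the numbers, the names `best_fold` and `homolog_averaged_fold`, and the choice of fold 0 as the shared native fold are all hypothetical rather than taken from the cited work.

```python
def best_fold(energies):
    """Return the index of the lowest (most favourable) energy in a list."""
    return min(range(len(energies)), key=lambda i: energies[i])


def homolog_averaged_fold(estimates):
    """Average each fold's energy over all homologs, then pick the lowest."""
    n_homologs = len(estimates)
    n_folds = len(estimates[0])
    averaged = [
        sum(row[fold] for row in estimates) / n_homologs
        for fold in range(n_folds)
    ]
    return best_fold(averaged)


# Hypothetical energy estimates: rows are homologs, columns are candidate folds.
# Fold 0 is taken to be the shared native fold; the first row carries an error
# large enough to make fold 1 look better for that sequence alone.
ESTIMATES = [
    [-9.0, -10.5, -7.0],
    [-12.0, -8.0, -9.5],
    [-11.0, -9.0, -8.0],
]

print([best_fold(row) for row in ESTIMATES])   # per-sequence choices
print(homolog_averaged_fold(ESTIMATES))        # choice after averaging
```

In this toy data the first sequence alone would be assigned the wrong fold, while the averaged energies favour the fold shared by the family, which is the qualitative effect the abstract describes.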

13.
We report herein the NMR structure of Tm0979, a structural proteomics target from Thermotoga maritima. The Tm0979 fold consists of four beta/alpha units, which form a central parallel beta-sheet with strand order 1234. The first three helices pack toward one face of the sheet and the fourth helix packs against the other face. The protein forms a dimer by adjacent parallel packing of the fourth helices sandwiched between the two beta-sheets. This fold is of interest from several points of view. First, it represents the first structure determination for the DsrH family of conserved hypothetical proteins, which are involved in oxidation of intracellular sulfur but have no defined molecular function. Based on structure and sequence analysis, possible functions are discussed. Second, the fold of Tm0979 most closely resembles YchN-like folds; however, the proteins that adopt these folds differ in secondary structural elements and quaternary structure. Comparison of these proteins provides insight into possible mechanisms of evolution of quaternary structure through a simple mechanism of hydrophobicity-changing mutations of one or two residues. Third, the Tm0979 fold is found to be similar to flavodoxin-like folds and beta/alpha barrel proteins, and may provide a link between these very abundant folds and putative ancestral half-barrel proteins.

14.
MOTIVATION: Protein design aims to identify sequences compatible with a given protein fold but incompatible with any alternative folds. To select the correct sequences and to guide the search process, a design scoring function is critically important. Such a scoring function should be able to characterize the global fitness landscape of many proteins simultaneously. RESULTS: To find optimal design scoring functions, we introduce two geometric views and propose a formulation using a mixture of non-linear Gaussian kernel functions. We aim to solve a simplified protein sequence design problem. Our goal is to distinguish each native sequence for a major portion of representative protein structures from a large number of alternative decoy sequences, each a fragment from proteins of different folds. Our scoring function discriminates perfectly a set of 440 native proteins from 14 million sequence decoys. We show that no linear scoring function can succeed in this task. In a blind test of unrelated proteins, our scoring function misclassifies only 13 native proteins out of 194. This compares favorably with about three to four times more misclassifications when optimal linear functions reported in the literature are used. We also discuss how to develop protein folding scoring functions.

15.
Most homologous pairs of proteins have no significant sequence similarity to each other and are not identified by direct sequence comparison or profile-based strategies. However, multiple sequence alignments of low similarity homologues typically reveal a limited number of positions that are well conserved despite diversity of function. It may be inferred that conservation at most of these positions is the result of the importance of the contribution of these amino acids to the folding and stability of the protein. As such, these amino acids and their relative positions may define a structural signature. We demonstrate that extraction of this fold template provides the basis for the sequence database to be searched for patterns consistent with the fold, enabling identification of homologs that are not recognized by global sequence analysis. The fold template method was developed to address the need for a tool that could comprehensively search the midnight and twilight zones of protein sequence similarity without reliance on global statistical significance. Manual implementations of the fold template method were performed on three folds: immunoglobulin, c-lectin and TIM barrel. Following proof of concept of the template method, an automated version of the approach was developed. This automated fold template method was used to develop fold templates for 10 of the more populated folds in the SCOP database. The fold template method developed three-dimensional structural motifs or signatures that were able to return a diverse collection of proteins, while maintaining a low false positive rate. Although the results of the manual fold template method were more comprehensive than those of the automated fold template method, the diversity of the results from the automated fold template method surpassed that of current methods that rely on statistical significance to infer evolutionary relationships among divergent proteins.

16.

Background

Members of the cupin superfamily exhibit large variations in their sequences, functions, organization of domains, quaternary associations and the nature of the bound metal ion, despite having a conserved β-barrel structural scaffold. Here, an attempt has been made to understand structure-function relationships among the members of this diverse superfamily and identify the principles governing functional diversity. The cupin superfamily also contains proteins whose structures are available through world-wide structural genomics initiatives but which are characterized as “hypothetical”. We have explored the feasibility of obtaining clues to functions of such proteins by means of comparative analysis with cupins of known structure and function.

Methodology/Principal Findings

A 3-D structure-based phylogenetic approach was undertaken. Interestingly, a dendrogram generated solely on the basis of structural dissimilarity measure at the level of domain folds was found to cluster functionally similar members. This clustering also reflects an independent evolution of the two domains in bicupins. Close examination of structural superposition of members across various functional clusters reveals structural variations in regions that not only form the active site pocket but are also involved in interaction with another domain in the same polypeptide or in the oligomer.

Conclusions/Significance

Structure-based phylogeny of cupins can aid identification of the functions of cupin-fold proteins whose function is as yet unknown. This approach can be extended to other proteins with a common fold that show high evolutionary divergence. This approach is expected to have an influence on function annotation in structural genomics initiatives.

17.
BACKGROUND: Do proteins that have the same structure fold by the same pathway even when they are unrelated in sequence? To address this question, we are comparing the folding of a number of different immunoglobulin-like proteins. Here, we present a detailed protein engineering phi value analysis of the folding pathway of TI I27, an immunoglobulin domain from human cardiac titin. RESULTS: TI I27 folds rapidly via a kinetic intermediate that is destabilized by most mutations. The transition state for folding is remarkably native-like in terms of solvent accessibility. We use phi value analysis to map this transition state and show that it is highly structured; only a few residues close to the N-terminal region of the protein remain completely unfolded. Interestingly, most mutations cause the transition state to become less native-like. This anti-Hammond behavior can be used as a novel means of obtaining additional structural information about the transition state. CONCLUSIONS: The residues that are involved in nucleating the folding of TI I27 are structurally equivalent to the residues that form the folding nucleus in an evolutionarily unrelated fibronectin type III protein. These residues form part of the common structural core of Ig-like domains. The data support the hypothesis that interactions essential for defining the structure of these beta sandwich proteins are also important in nucleation of folding.

18.
One of the key questions in protein folding is whether polypeptide chains require unique nucleation sites to fold to the native state. In order to identify possible essential polypeptide segments for folding, we have performed a complete circular permutation analysis of a protein in which the natural termini are in close proximity. As a model system, we used the disulfide oxidoreductase DsbA from Escherichia coli, a monomeric protein of 189 amino acid residues. To introduce new termini at all possible positions in its polypeptide chain, we generated a library of randomly circularly permuted dsbA genes and screened for active circularly permuted variants in vivo. A total of 51 different active variants were identified. The new termini were distributed over about 70% of the polypeptide chain, with the majority of them occurring within regular secondary structures. New termini were not found in approximately 30% of the DsbA sequence, which essentially corresponds to four alpha-helices of DsbA. Introduction of new termini into these "forbidden segments" by directed mutagenesis yielded proteins with altered overall folds and strongly reduced catalytic activities. In contrast, all active variants analysed so far show structural and catalytic properties comparable with those of DsbA wild-type. We suggest that random circular permutation allows identification of contiguous structural elements in a protein that are essential for folding and stability.
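For readers unfamiliar with circular permutation, the sequence-level operation can be sketched as follows. This is an illustrative sketch only: it joins the native termini directly and ignores any linker or cloning details, and the function names `circular_permutant` and `all_permutants` are hypothetical, not part of the study's protocol.

```python
def circular_permutant(sequence, new_start):
    """Join the native termini and reopen the chain so that the residue at
    0-based index new_start becomes the new N-terminus."""
    if not 0 < new_start < len(sequence):
        raise ValueError("new_start must lie strictly inside the sequence")
    return sequence[new_start:] + sequence[:new_start]


def all_permutants(sequence):
    """Every circular permutant that differs from the original chain."""
    return [circular_permutant(sequence, i) for i in range(1, len(sequence))]


# A short placeholder string stands in for a real protein sequence.
print(circular_permutant("ABCDE", 2))
print(len(all_permutants("A" * 189)))
```

For a chain of 189 residues, as in DsbA, this enumeration gives 188 possible new N-termini, which is the space a random permutation library samples.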

19.
Subbian E, Yabuta Y, Shinde U. Biochemistry, 2004, 43(45):14348-14360
Subtilisin E (SbtE) is a member of the ubiquitous superfamily of serine proteases called subtilases and serves as a model for understanding propeptide-mediated protein folding mechanisms. Unlike most proteins that adopt thermodynamically stable conformations, the native state of SbtE is trapped in a kinetically stable conformation. While kinetic stability offers distinct functional advantages to the native state, the constraints that dictate the selection between kinetic and thermodynamic folding and stability remain unknown. Using highly conserved subtilases, we demonstrate that adaptive evolution of sequence dictates selection of folding pathways. Intracellular and extracellular serine proteases (ISPs and ESPs, respectively) constitute two subfamilies within the family of subtilases that have highly conserved sequences, structures, and catalytic activities. Our studies on the folding pathways of subtilisin E (SbtE), an ESP, and its homologue intracellular serine protease 1 (ISP1), an ISP, show that although topology, contact order, and hydrophobicity that drive protein folding reactions are conserved, ISP1 and SbtE fold through significantly different pathways and kinetics. While SbtE absolutely requires the propeptide to fold into a kinetically trapped conformer, ISP1 folds to a thermodynamically stable state more than 1 million times faster and independently of a propeptide. Furthermore, kinetics establish that ISP1 and SbtE fold through different intermediate states. An evolutionary analysis of folding constraints in subtilases suggests that observed differences in folding pathways may be mediated through positive selection of specific residues that map mostly onto the protein surface. Together, our results demonstrate that closely related subtilases can fold through distinct pathways and mechanisms, and suggest that fine sequence details can dictate the choice between kinetic and thermodynamic folding and stability.

20.
Statistical analysis of protein folding rates has been done for 84 proteins with available experimental data. A surprising result is that the proteins with multi-state kinetics from the size range of 50–100 amino acid residues (a.a.) fold as fast as proteins with two-state kinetics from the same size range. At the same time, the proteins with two-state kinetics from the size range 101–151 a.a. fold faster than those from the size range 50–100 a.a. Moreover, within groups of structural homologs in the 50–100 a.a. size range, proteins with multi-state kinetics unexpectedly tend to fold faster than those with two-state kinetics. The protein folding for six proteins with a ferredoxin-like fold and with a similar size has been modeled using Monte Carlo simulations and dynamic programming. Good correlation between experimental folding rates, some structural parameters, and the number of Monte Carlo steps has been obtained. It is shown that a protein with multi-state kinetics actually folds three times faster than its structural homologs.
