Similar literature
 Found 20 similar documents (search time: 31 ms)
1.
The evolutionary adaptations of thermophilic water‐soluble proteins required for maintaining stability at high temperature have been extensively investigated. Little is known about the adaptations in membrane proteins, however. Here, we compare many properties of mesophilic and thermophilic membrane protein structures, including side‐chain burial, packing, hydrogen bonding, transmembrane kinks, loop lengths, hydrophobicity, and other sequence features. Most of these properties are quite similar between mesophiles and thermophiles although we observe a slight increase in side‐chain burial and possibly a slight decrease in the frequency of transmembrane kinks in thermophilic membrane protein structures. The most striking difference is the increased hydrophobicity of thermophilic transmembrane helices, possibly reflecting more stringent hydrophobicity requirements for membrane partitioning at high temperature. In agreement with prior work examining transmembrane sequences, we find that thermophiles have an increase in small residues (Gly, Ala, Ser, and Val) and a strong suppression of Cys. We also find a relative dearth of most strongly polar residues (Asp, Asn, Glu, Gln, and Arg). These results suggest that in thermophiles, there is significant evolutionary pressure to offload destabilizing polar amino acids, to decrease the entropy cost of side chain burial, and to eliminate thermally sensitive amino acids.
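The composition comparison in this abstract (small residues, Cys, and strongly polar residues) can be illustrated with a short sketch. The residue groups are the ones named in the abstract; the helix sequences and the function name `group_fractions` are invented here for illustration and are not data or code from the study.

```python
from collections import Counter

SMALL = set("GASV")   # small residues named in the abstract: Gly, Ala, Ser, Val
POLAR = set("DNEQR")  # strongly polar residues named in the abstract


def group_fractions(sequences):
    """Return the fraction of residues in each group, pooled over all sequences."""
    counts = Counter("".join(sequences).upper())
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no residues supplied")
    return {
        "small": sum(counts[a] for a in SMALL) / total,
        "polar": sum(counts[a] for a in POLAR) / total,
        "cys": counts["C"] / total,
    }


# Hypothetical transmembrane helix sequences, invented for illustration only.
meso_helices = ["LLCVIFNLTQWLA", "ILDMCFLRTYLLI"]
thermo_helices = ["LAGVIAGLSVALA", "IVGALSVLAGVLL"]

meso = group_fractions(meso_helices)
thermo = group_fractions(thermo_helices)
print("mesophile:", meso)
print("thermophile:", thermo)
```

A real analysis would pool helices from many structures and test whether the differences are statistically significant; this sketch only shows the bookkeeping.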

2.
This paper concerns the branch of computational biology that deals with predicting and analyzing the secondary structure of proteins. Although traditional methods use a simple amino acid composition to predict the secondary structure content, hydrophobicity has recently been found to improve the results in this and several related prediction tasks. To this end, we propose and analyze the advantages of two new hydrophobicity index-based scales that incorporate information about long-range interactions along the protein sequence and contrast them with currently used raw hydrophobic index values. We also compare three leading hydrophobicity indices, i.e., Eisenberg's, Fauchere-Pliska's, and Cid's, using the proposed scales. The analysis is performed using fuzzy cognitive maps that quantify the strength of relation between the hydrophobicity scales/indices and the protein content values. A set of empirical tests that involve generation of fuzzy cognitive map models for a set of 200 low homology proteins has been performed. The results show that the secondary structure content along the protein sequence is characterized by about 2.5 times stronger relation with the two proposed hydrophobicity scales when compared with the currently used raw index values. The new scales exhibit stronger relation irrespective of the applied hydrophobicity indices. Analysis of different scales shows the superiority of Eisenberg's hydrophobicity index when used with the new scales. In contrast, the Fauchere-Pliska index is found to perform better than the two other indices when using raw hydrophobic index values that disregard the long-range interactions.

3.
A critical step in the folding pathway of globular proteins is the formation of a tightly packed hydrophobic core. Several mutational studies have addressed the question of whether tight packing interactions are present during the rate-limiting step of folding. In some of these investigations, substituted side chains have been assumed to form native-like interactions in the transition state when the folding rates of mutant proteins correlate with their native-state stabilities. Alternatively, it has been argued that side chains participate in nonspecific hydrophobic collapse when the folding rates of mutant proteins correlate with side-chain hydrophobicity. In a reanalysis of published data, we have found that folding rates often correlate similarly well, or poorly, with both native-state stability and side-chain hydrophobicity, and it is therefore not possible to select an appropriate transition state model based on these one-parameter correlations. We show that this ambiguity can be resolved using a two-parameter model in which side chain burial and the formation of all other native-like interactions can occur asynchronously. Notably, the model agrees well with experimental data, even for positions where the one-parameter correlations are poor. We find that many side chains experience a previously unrecognized type of transition state environment in which specific, native-like interactions are formed, but hydrophobic burial dominates. Implications of these results for the design and analysis of protein folding studies are discussed.

4.
We present a coarse-grained approach for modeling the thermodynamic stability of single-domain globular proteins in concentrated aqueous solutions. Our treatment derives effective protein-protein interactions from basic structural and energetic characteristics of the native and denatured states. These characteristics, along with the intrinsic (i.e., infinite dilution) thermodynamics of folding, are calculated from elementary sequence information using a heteropolymer collapse theory. We integrate this information into Reactive Canonical Monte Carlo simulations to investigate the connections between protein sequence hydrophobicity, protein-protein interactions, protein concentration, and the thermodynamic stability of the native state. The model predicts that sequence hydrophobicity can affect how protein concentration impacts native-state stability in solution. In particular, low hydrophobicity proteins are primarily stabilized by increases in protein concentration, whereas high hydrophobicity proteins exhibit richer nonmonotonic behavior. These trends appear qualitatively consistent with the available experimental data. Although factors such as pH, salt concentration, and protein charge are also important for protein stability, our analysis suggests that some of the nontrivial experimental trends may be driven by a competition between destabilizing hydrophobic protein-protein attractions and entropic crowding effects.

5.
The ability to consistently distinguish real protein structures from computationally generated model decoys is not yet a solved problem. One route to distinguish real protein structures from decoys is to delineate the important physical features that specify a real protein. For example, it has long been appreciated that the hydrophobic cores of proteins contribute significantly to their stability. We used two sources to obtain datasets of decoys to compare with real protein structures: submissions to the biennial Critical Assessment of protein Structure Prediction competition, in which researchers attempt to predict the structure of a protein only knowing its amino acid sequence, and also decoys generated by 3DRobot, which have user‐specified global root‐mean‐squared deviations from experimentally determined structures. Our analysis revealed that both sets of decoys possess cores that do not recapitulate the key features that define real protein cores. In particular, the model structures appear more densely packed (because of energetically unfavorable atomic overlaps), contain too few residues in the core, and have improper distributions of hydrophobic residues throughout the structure. Based on these observations, we developed a feed‐forward neural network, which incorporates key physical features of protein cores, to predict how well a computational model recapitulates the real protein structure without knowledge of the structure of the target sequence. By identifying the important features of protein structure, our method is able to rank decoy structures with similar accuracy to that obtained by state‐of‐the‐art methods that incorporate many additional features. The small number of physical features makes our model interpretable, emphasizing the importance of protein packing and hydrophobicity in protein structure prediction.

6.
We use highly efficient transition-matrix Monte Carlo simulations to determine equilibrium unfolding curves and fluid phase boundaries for solutions of coarse-grained globular proteins. The model we analyze derives the intrinsic stability of the native state and protein-protein interactions from basic information about protein sequence using heteropolymer collapse theory. It predicts that solutions of low hydrophobicity proteins generally exhibit a single liquid phase near their midpoint temperatures for unfolding, while solutions of proteins with high sequence hydrophobicity display the type of temperature-inverted, liquid-liquid transition associated with aggregation processes of proteins and other amphiphilic molecules. The phase transition occurring in solutions of the most hydrophobic protein we study extends below the unfolding curve, creating an immiscibility gap between a dilute, mostly native phase and a concentrated, mostly denatured phase. The results are qualitatively consistent with the solution behavior of hemoglobin (HbA) and its sickle variant (HbS), and they suggest that a liquid-liquid transition resulting in significant protein denaturation should generally be expected on the phase diagram of high-hydrophobicity protein solutions. The concentration fluctuations associated with this transition could be a driving force for the nonnative aggregation that can occur below the midpoint temperature.

7.
The stability of thermophilic proteins has been viewed from different perspectives and there is yet no unified principle to understand this stability. It would be valuable to reveal the most important interactions for designing thermostable proteins for such applications as industrial protein engineering. In this work, we have systematically analyzed the importance of various interactions by computing different parameters such as surrounding hydrophobicity, inter‐residue interactions, ion‐pairs and hydrogen bonds. The importance of each interaction has been determined by its predicted relative contribution in thermophiles versus the same contribution in mesophilic homologues based on a dataset of 373 protein families. We predict that hydrophobic environment is the major factor for the stability of thermophilic proteins and found that 80% of thermophilic proteins analyzed showed higher hydrophobicity than their mesophilic counterparts. Ion pairs, hydrogen bonds, and interaction energy are also important and favored in 68%, 50%, and 62% of thermophilic proteins, respectively. Interestingly, thermophilic proteins with decreased hydrophobic environments display a greater number of hydrogen bonds and/or ion pairs. The systematic elimination of mesophilic proteins based on surrounding hydrophobicity, interaction energy, and ion pairs/hydrogen bonds, led to correctly identifying 95% of the thermophilic proteins in our analyses. Our analysis was also applied to another, more refined set of 102 thermophilic–mesophilic pairs, which again identified hydrophobicity as a dominant property in 71% of the thermophilic proteins. Further, the notion of surrounding hydrophobicity, which characterizes the hydrophobic behavior of residues in a protein environment, has been applied to the three‐dimensional structures of elongation factor‐Tu proteins and we found that the thermophilic proteins are enriched with a hydrophobic environment. 
The results obtained in this work highlight the importance of hydrophobicity as the dominating characteristic in the stability of thermophilic proteins, and we anticipate this will be useful in our attempts to engineer thermostable proteins. © Proteins 2013. © 2012 Wiley Periodicals, Inc.
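The abstract does not define "surrounding hydrophobicity". One common formulation, assumed here, sums the hydrophobicity indices of all residues whose Cα atoms lie within a fixed radius of the residue in question. The 8 Å cutoff, the toy coordinates, and the index values below are assumptions made for this sketch, not parameters taken from the study.

```python
import math


def surrounding_hydrophobicity(coords, h_index, cutoff=8.0):
    """For each residue, sum the hydrophobicity index of every other residue
    whose C-alpha atom lies within `cutoff` angstroms (assumed definition)."""
    if len(coords) != len(h_index):
        raise ValueError("coords and h_index must have the same length")
    result = []
    for i, ci in enumerate(coords):
        total = 0.0
        for j, cj in enumerate(coords):
            if i != j and math.dist(ci, cj) <= cutoff:
                total += h_index[j]
        result.append(total)
    return result


# Toy data: four C-alpha positions on a line, 5 angstroms apart,
# with made-up hydrophobicity indices.
coords = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (10.0, 0.0, 0.0), (15.0, 0.0, 0.0)]
h_index = [1.0, 2.0, 3.0, 4.0]
hp = surrounding_hydrophobicity(coords, h_index)
print(hp)  # [2.0, 4.0, 6.0, 3.0]
```

Comparing the per-residue values between a thermophilic protein and its mesophilic homologue would then be a matter of running the same function on both structures with the same index scale.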

8.
9.
PAS domains are widespread in archaea, bacteria, and eukaryota, and play important roles in various functions. In this study, we aim to explore the functional evolutionary relationships among proteins in the PAS domain superfamily in view of the sequence‐structure‐dynamics‐function relationship. We collected protein sequences and crystal structure data from the RCSB Protein Data Bank for members of the PAS domain superfamily belonging to three biological functions (nucleotide binding, photoreceptor activity, and transferase activity). Protein sequences were aligned and then used to select sequence‐conserved residues and build a phylogenetic tree. Three‐dimensional structure alignment was also applied to obtain structure‐conserved residues. The protein dynamics were analyzed using an elastic network model (ENM) and validated by molecular dynamics (MD) simulation. The results showed that proteins with the same function could be grouped by sequence similarity, and proteins in different functional groups displayed statistically significant differences in their vibrational patterns. Interestingly, in all three functional groups, conserved amino acid residues identified by sequence and structure conservation analysis generally have lower fluctuations than other residues. In addition, the fluctuation of conserved residues in each biological function group was strongly correlated with the corresponding biological function. This research suggests a direct connection in which protein sequences are related to various functions through structural dynamics. This is a new attempt to delineate the functional evolution of proteins using the integrated information of sequence, structure, and dynamics.

10.
Combinatorial experiments provide new ways to probe the determinants of protein folding and to identify novel folding amino acid sequences. These types of experiments, however, are complicated both by enormous conformational complexity and by large numbers of possible sequences. Therefore, a quantitative computational theory would be helpful in designing and interpreting these types of experiments. Here, we present and apply a statistically based, computational approach for identifying the properties of sequences compatible with a given main-chain structure. Protein side-chain conformations are included in an atom-based fashion. Calculations are performed for a variety of similar backbone structures to identify sequence properties that are robust with respect to minor changes in main-chain structure. Rather than specific sequences, the method yields the likelihood of each of the amino acids at preselected positions in a given protein structure. The theory may be used to quantify the characteristics of sequence space for a chosen structure without explicitly tabulating sequences. To account for hydrophobic effects, we introduce an environmental energy that is consistent with other simple hydrophobicity scales and show that it is effective for side-chain modeling. We apply the method to calculate the identity probabilities of selected positions of the immunoglobulin light chain-binding domain of protein L, for which many variant folding sequences are available. The calculations compare favorably with the experimentally observed identity probabilities.

11.
Recent ab initio folding simulations for a limited number of small proteins have corroborated a previous suggestion that atomic burial information obtainable from sequence could be sufficient for tertiary structure determination when combined with sequence‐independent geometrical constraints. Here, we use simulations parameterized by native burials to investigate the required amount of information in a diverse set of globular proteins comprising different structural classes and a wide size range. Burial information is provided by a potential term pushing each atom towards one among a small number L of equiprobable concentric layers. An upper bound for the required information is provided by the minimal number of layers Lmin still compatible with correct folding behavior. We obtain Lmin between 3 and 5 for seven small to medium proteins with 50 ≤ Nr ≤ 110 residues while for a larger protein with Nr = 141 we find that L ≥ 6 is required to maintain native stability. We additionally estimate the usable redundancy for a given LLmin from the burial entropy associated with the largest folding‐compatible fraction of “superfluous” atoms, for which the burial term can be turned off or target layers can be chosen randomly. The estimated redundancy for small proteins with L = 4 is close to 0.8. Our results are consistent with the above‐average quality of burial predictions used in previous simulations and indicate that the fraction of approachable proteins could increase significantly with even a mild, plausible, improvement on sequence‐dependent burial prediction or on sequence‐independent constraints that augment the detectable redundancy during simulations. Proteins 2016; 84:515–531. © 2016 Wiley Periodicals, Inc.
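The idea of assigning atoms to a small number L of equiprobable concentric layers can be sketched as follows. The sketch assumes that layers are defined by ranking atoms by distance from the geometric centre and cutting the ranking into L equally populated groups; the abstract does not spell out the exact construction, and the function names are hypothetical. With equiprobable layers, specifying one layer per atom carries log2(L) bits.

```python
import math


def burial_layers(coords, n_layers):
    """Assign each atom to one of n_layers equally populated concentric layers
    (0 = innermost), ranked by distance from the geometric centre."""
    n = len(coords)
    if not 1 <= n_layers <= n:
        raise ValueError("need 1 <= n_layers <= number of atoms")
    centre = tuple(sum(c[k] for c in coords) / n for k in range(3))
    # Rank atoms from most buried to most exposed.
    order = sorted(range(n), key=lambda i: math.dist(coords[i], centre))
    layers = [0] * n
    for rank, atom in enumerate(order):
        layers[atom] = rank * n_layers // n
    return layers


def bits_per_atom(n_layers):
    """Information needed to name one of n_layers equiprobable layers."""
    return math.log2(n_layers)


# Toy example: four atoms on a line, two near the centre and two far from it.
atoms = [(-3.0, 0.0, 0.0), (-1.0, 0.0, 0.0), (1.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
print(burial_layers(atoms, 2))  # [1, 0, 0, 1]
print(bits_per_atom(4))         # 2.0
```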

12.
Utilizing concepts of protein building blocks, we propose a de novo computational algorithm that is similar to combinatorial shuffling experiments. Our goal is to engineer new naturally occurring folds with low homology to existing proteins. A selected protein is first partitioned into its building blocks based on their compactness, degree of isolation from the rest of the structure, and hydrophobicity. Next, the protein building blocks are substituted by fragments taken from other proteins with overall low sequence identity, but with a similar hydrophobic/hydrophilic pattern and a high structural similarity. These criteria ensure that the designed protein has a similar fold, low sequence identity, and a good hydrophobic core compared with its native counterpart. Here, we have selected two proteins for engineering, protein G B1 domain and ubiquitin. The two engineered proteins share approximately 20% and approximately 25% amino acid sequence identities with their native counterparts, respectively. The stabilities of the engineered proteins are tested by explicit water molecular dynamics simulations. The algorithm implements a strategy of designing a protein using relatively stable fragments, with a high population time. Here, we have selected the fragments by searching for local minima along the polypeptide chain using the protein building block model. Such an approach provides a new method for engineering new proteins with similar folds and low homology.

13.
14.
Identifying features that effectively represent the energetic contribution of an individual interface residue to the interactions between proteins remains problematic. Here, we present several new features and show that they are more effective than conventional features. By combining the proposed features with conventional features, we develop a predictive model for interaction hot spots. Initially, 54 multifaceted features, composed of different levels of information including structure, sequence and molecular interaction information, are quantified. Then, to identify the best subset of features for predicting hot spots, feature selection is performed using a decision tree. Based on the selected features, a predictive model for hot spots is created using support vector machine (SVM) and tested on an independent test set. Our model shows better overall predictive accuracy than previous methods such as the alanine scanning methods Robetta and FOLDEF, and the knowledge-based method KFC. Subsequent analysis yields several findings about hot spots. As expected, hot spots have a larger relative surface area burial and are more hydrophobic than other residues. Unexpectedly, however, residue conservation displays a rather complicated tendency depending on the types of protein complexes, indicating that this feature is not good for identifying hot spots. Of the selected features, the weighted atomic packing density, relative surface area burial and weighted hydrophobicity are the top 3, with the weighted atomic packing density proving to be the most effective feature for predicting hot spots. Notably, we find that hot spots are closely related to π–related interactions, especially π · · · π interactions.

15.
Domains are the main structural and functional units of larger proteins. They tend to be contiguous in primary structure and can fold and function independently. It has been observed that 10–20% of all encoded proteins contain duplicated domains and the average pairwise sequence identity between them is usually low. In the present study, we have analyzed the structural similarity between domain repeats of proteins with known structures available in the Protein Data Bank using structure-based inter-residue interaction measures such as the number of long-range contacts, surrounding hydrophobicity, and pairwise interaction energy. We used the RADAR program to detect the repeats in a protein sequence, which were further validated using Pfam domain assignments. The sequence identity between the repeats in domains ranges from 20 to 40% and their secondary structural elements are well conserved. The number of long-range contacts, surrounding hydrophobicity calculations and pairwise interaction energy of the domain repeats clearly reveal the conservation of 3-D structure environment in the repeats of domains. The proportions of mainchain–mainchain hydrogen bonds and hydrophobic interactions are also highly conserved between the repeats. The present study has suggested that the computation of these structure-based parameters will give better clues about the tertiary environment of the repeats in domains. The folding rates of individual domains in the repeats predicted using the long-range order parameter indicate that the predicted folding rates correlate well with most of the experimentally observed folding rates for the analyzed independently folded domains.
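Counting long-range contacts, one of the structure-based measures mentioned above, can be sketched in a few lines. The sketch assumes a contact means two Cα atoms within 8 Å and that "long-range" means a sequence separation of at least five residues; both thresholds are assumptions for illustration, since the abstract does not state the values used.

```python
import math


def long_range_contacts(coords, cutoff=8.0, min_separation=5):
    """Count residue pairs (i, j) with j - i >= min_separation whose C-alpha
    atoms are within `cutoff` angstroms (thresholds are assumed values)."""
    count = 0
    n = len(coords)
    for i in range(n):
        for j in range(i + min_separation, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                count += 1
    return count


# Toy chain: residues 0 and 6 are close in space, all others are far apart.
chain = [(0.0, 0.0, 0.0)]
chain += [(100.0 * k, 0.0, 0.0) for k in range(1, 6)]
chain += [(1.0, 0.0, 0.0)]
print(long_range_contacts(chain))  # 1
```

Running the same count on each repeat of a domain and comparing the totals is one simple way to check whether the repeats share a similar three-dimensional environment.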

16.
Homology modelling of the human eIF-5A protein has been performed by using a multiple predictions strategy. As the sequence identity between the target and the template proteins is nearly 30%, which is lower than the commonly used threshold to apply with confidence the homology modelling method, we developed a specific predictive scheme by combining different sequence analyses and predictions, as well as model validation by comparison to structural experimental information. The target sequence has been used to find homologues within sequence databases and a multiple alignment has been created. Secondary structure for each single protein has been predicted and compared on the basis of the multiple sequence alignment, in order to evaluate and adjust carefully any gap. Therefore, comparative modelling has been applied to create the model of the protein on the basis of the optimized sequence alignment. The quality of the model has been checked by computational methods and the structural features have been compared to experimental information, giving us a good validation of the reliability of the model and its correspondence to the protein structure in solution. Last, the model was deposited in the Protein Data Bank to be accessible for studies on the structure-function relationships of the human eIF-5A.

17.
The three-dimensional structures of leucine-rich repeat (LRR)-containing proteins from five different families were previously predicted based on the crystal structure of the ribonuclease inhibitor, using an approach that combined homology-based modeling, structure-based sequence alignment of LRRs, and several rational assumptions. The structural models have been produced based on very limited sequence similarity, which, in general, cannot yield trustworthy predictions. Recently, the protein structures from three of these five families have been determined. In this report we estimate the quality of the modeling approach by comparing the models with the experimentally determined structures. The comparison suggests that the general architecture, curvature, "interior/exterior" orientations of side chains, and backbone conformation of the LRR structures can be predicted correctly. On the other hand, the analysis revealed that, in some cases, it is difficult to predict correctly the twist of the overall super-helical structure. Taking into consideration the conclusions from these comparisons, we identified a new family of bacterial LRR proteins and present its structural model. The reliability of the LRR protein modeling suggests that it would be informative to apply similar modeling approaches to other classes of solenoid proteins.

18.
If it is assumed that the primary sequence determines the three-dimensional folded structure of a protein, then the regular folding patterns, such as alpha-helix, beta-sheet, and other ordered patterns in the three-dimensional structure must correspond to the periodic distribution of the physical properties of the amino acids along the primary sequence. An AutoRegressive Moving Average (ARMA) model method of spectral analysis is applied to analyze protein sequences represented by the hydrophobicity of their amino acids. The results for several membrane proteins of known structures indicate that the periodic distribution of hydrophobicity of the primary sequence is closely related to the regular folding patterns in a protein's three-dimensional structure. We also applied the method to the transmembrane regions of acetylcholine receptor alpha subunit and Shaker potassium channel for which no atomic resolution structure is available. This work is an extension of our analysis of globular proteins by a similar method.
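The link between regular folding patterns and periodicity in a hydrophobicity profile can be illustrated with a plain Fourier periodogram. This is a simpler stand-in and not the ARMA spectral estimator used in the study. The synthetic profile, the candidate periods, and the function names are assumptions made for this sketch.

```python
import cmath
import math


def power_at_period(profile, period):
    """Periodogram power of a mean-centred profile at a period given in residues."""
    n = len(profile)
    mean = sum(profile) / n
    freq = 1.0 / period
    s = sum((h - mean) * cmath.exp(-2j * math.pi * freq * k)
            for k, h in enumerate(profile))
    return abs(s) ** 2 / n


def dominant_period(profile, periods):
    """Return the candidate period with the largest periodogram power."""
    return max(periods, key=lambda p: power_at_period(profile, p))


# Synthetic hydrophobicity profile with a helix-like period of 3.6 residues.
profile = [math.cos(2 * math.pi * k / 3.6) for k in range(36)]
candidates = [2.0, 3.0, 3.6, 5.0, 7.0]
best = dominant_period(profile, candidates)
print(best)  # 3.6
```

A period near 3.6 residues is what an amphipathic alpha-helix would produce, while a period near 2 is typical of a beta-strand with alternating exposed and buried side chains.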

19.
Proper protein localization is essential for critical cellular processes, including vesicle‐mediated transport and protein translocation. Tail‐anchored (TA) proteins are integrated into organellar membranes via the C‐terminus, orienting the N‐terminus towards the cytosol. Localization of TA proteins occurs posttranslationally and is governed by the C‐terminus, which contains the integral transmembrane domain (TMD) and targeting sequence. Targeting of TA proteins is dependent on the hydrophobicity of the TMD as well as the length and composition of flanking amino acid sequences. We previously identified an unusual homologue of elongator protein, Elp3, in the apicomplexan parasite Toxoplasma gondii as a TA protein targeting the outer mitochondrial membrane. We sought to gain further insight into TA proteins and their targeting mechanisms using this early‐branching eukaryote as a model. Our bioinformatics analysis uncovered 59 predicted TA proteins in Toxoplasma, 9 of which were selected for follow‐up analyses based on representative features. We identified novel TA proteins that traffic to specific organelles in Toxoplasma, including the parasite endoplasmic reticulum, mitochondrion, and Golgi apparatus. Domain swap experiments elucidated that targeting of TA proteins to these specific organelles was strongly influenced by the TMD sequence, including charge of the flanking C‐terminal sequence.

20.
Hydropathy plots or window averages over local stretches of the sequence of residue hydrophobicity have revealed patterns related to various protein tertiary structural features. This has enabled identification of regions of the sequence that are at the surface or within the interior of globular soluble proteins, regions located within the lipid bilayer of transmembrane proteins, portions of the sequence that characterize repeating motifs, as well as motifs that usefully characterize different protein structural families. This, therefore, provides one example of the generally expressed maxim that "sequence determines structure". On the other hand, a number of previous investigations have shown the rapidly varying values of residue hydrophobicity along the sequence to be distributed approximately randomly. So one might question just how much of the sequence actually determines structure. It is, therefore, of interest to extract that part of this rapidly varying distribution of residue hydrophobicity that is responsible for the longer wavelength variations that correlate with protein tertiary structural features and to determine their prevalence within the entire distribution. This is accomplished by a finite Fourier analysis of the sequence of residue hydrophobicity and of a new measure of residue distance from the protein interior. Calculations are performed on a number of globins, immunoglobulins, cuprodoxins, and papain-like structures. The spectral power of the Fourier amplitudes of the frequencies extracted, whose inverse transforms underlie the windowed values of residue hydrophobicity is shown to be a small fraction of the total power of the hydrophobicity distribution and thereby consistent with a distribution that might appear to be predominantly random. 
The wide range of sequence identity between proteins having the same fold, all exhibiting similar small fractions of power amplitude that correlate with the longer wavelength inside-to-outside excursions of the amino acid residues, supports the general contention that close sequence identity is an expression of a close evolutionary relationship rather than an expression of structural similarity. Practical implications of the present analysis for protein structure prediction and engineering are also described.
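The two operations discussed in this abstract, window-averaging a hydrophobicity profile and measuring how much of its Fourier power sits at long wavelengths, can be sketched as follows. The sketch uses the Kyte-Doolittle hydropathy scale as an assumed choice of scale (the abstract does not name one); the window size, the harmonic cutoff, and the example sequence are likewise hypothetical.

```python
import cmath
import math

# Kyte-Doolittle hydropathy values (assumed scale for this sketch).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq):
    """Map a one-letter amino acid sequence to per-residue hydropathy values."""
    return [KD[a] for a in seq.upper()]


def window_average(values, window=7):
    """Sliding-window mean, as used for a hydropathy plot."""
    if not 1 <= window <= len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]


def low_frequency_power_fraction(values, max_harmonic):
    """Fraction of the mean-removed spectral power in harmonics 1..max_harmonic
    (and their mirror images) of the finite Fourier transform."""
    n = len(values)
    if not 1 <= max_harmonic < n / 2:
        raise ValueError("max_harmonic must satisfy 1 <= max_harmonic < n/2")
    mean = sum(values) / n
    centred = [v - mean for v in values]

    def power(m):
        return abs(sum(c * cmath.exp(-2j * math.pi * m * k / n)
                       for k, c in enumerate(centred))) ** 2

    total = sum(power(m) for m in range(1, n))
    if total == 0.0:
        return 0.0
    low = sum(power(m) + power(n - m) for m in range(1, max_harmonic + 1))
    return low / total


# Hypothetical 30-residue sequence, invented for illustration only.
seq = "MKTLLILAVVAAALAGSDEKRNQTSPGWYF"
profile = hydropathy_profile(seq)
smoothed = window_average(profile, window=7)
fraction = low_frequency_power_fraction(profile, max_harmonic=3)
print(len(smoothed), round(fraction, 3))
```

A small value of `fraction` on real sequences would correspond to the abstract's point that the long-wavelength component carries only a minor share of the total power.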
