Similar literature
Found 20 similar articles (search time: 15 ms)
1.
β‐sheets often have one face packed against the core of the protein and the other facing solvent. Mutational studies have indicated that the solvent‐facing residues can contribute significantly to protein stability, and that the preferred amino acid at each sequence position is dependent on the precise structure of the protein backbone and the identity of the neighboring amino acids. This suggests that the most advantageous methods for designing β‐sheet surfaces will be approaches that take into account the multiple energetic factors at play including side chain rotamer preferences, van der Waals forces, electrostatics, and desolvation effects. Here, we show that the protein design software Rosetta, which models these energetic factors, can be used to dramatically increase protein stability by optimizing interactions on the surfaces of small β‐sheet proteins. Two design variants of the β‐sandwich protein from tenascin were made with 7 and 14 mutations respectively on its β‐sheet surfaces. These changes raised the thermal midpoint for unfolding from 45°C to 64°C and 74°C. Additionally, we tested an empirical approach based on increasing the number of potential salt bridges on the surfaces of the β‐sheets. This was not a robust strategy for increasing stability, as three of the four variants tested were unfolded.

2.
Many existing derivations of knowledge-based statistical pair potentials invoke the quasichemical approximation to estimate the expected side-chain contact frequency if there were no amino acid pair-specific interactions. At first glance, the quasichemical approximation that treats the residues in a protein as being disconnected and expresses the side-chain contact probability as being proportional to the product of the mole fractions of the pair of residues would appear to be rather severe. To investigate the validity of this approximation, we introduce two new reference states in which no specific pair interactions between amino acids are allowed, but in which the connectivity of the protein chain is retained. The first estimates the expected number of side-chain contacts by treating the protein as a Gaussian random coil polymer. The second, more realistic reference state includes the effects of chain connectivity, secondary structure, and chain compactness by estimating the expected side-chain contact probability by placing the sequence of interest in each member of a library of structures of comparable compactness to the native conformation. The side-chain contact maps are not allowed to readjust to the sequence of interest, i.e., the side chains cannot repack. This situation would hold rigorously if all amino acids were the same size. Both reference states effectively permit the factorization of the side-chain contact probability into sequence-dependent and structure-dependent terms. Then, because the sequence distribution of amino acids in proteins is random, the quasichemical approximation to each of these reference states is shown to be excellent. Thus, the range of validity of the quasichemical approximation is determined by the magnitude of the side-chain repacking term, which is, at present, unknown.
Finally, the performance of these two sets of pair interaction potentials as well as side-chain contact fraction-based interaction scales is assessed by inverse folding tests both without and with allowing for gaps.
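
The quasichemical reference state described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, and the log-odds form of the potential (in units of kT) is a common convention assumed here rather than something the abstract specifies.

```python
from collections import Counter
from math import log


def quasichemical_expected(sequence):
    """Expected contact fraction for each unordered residue pair.

    Under the quasichemical approximation the residues are treated as
    disconnected, so the contact probability of a pair is proportional
    to the product of the two mole fractions.
    """
    counts = Counter(sequence)
    n = len(sequence)
    mole_fraction = {aa: c / n for aa, c in counts.items()}
    residues = sorted(mole_fraction)
    expected = {}
    for i, a in enumerate(residues):
        for b in residues[i:]:
            # An unordered pair of different residues can arise as (a, b)
            # or (b, a), hence the factor of 2.
            weight = 1 if a == b else 2
            expected[(a, b)] = weight * mole_fraction[a] * mole_fraction[b]
    return expected


def pair_potential(observed_contacts, expected):
    """Assumed log-odds potential: -ln(observed fraction / expected fraction).

    observed_contacts maps sorted residue pairs to contact counts.
    """
    total = sum(observed_contacts.values())
    return {
        pair: -log((count / total) / expected[pair])
        for pair, count in observed_contacts.items()
        if count > 0
    }
```

With this convention a pair observed exactly as often as the reference state predicts gets a potential of zero, and pairs observed more often than expected get negative (favorable) values.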

3.

Background

Formation of the folding nucleus of globular proteins starts with the mutual interaction of a group of hydrophobic amino acids whose close contacts allow subsequent formation and stability of the 3D structure. These early steps can be predicted by simulation of the folding process through a Monte Carlo (MC) coarse-grained model in a discrete space. We previously defined MIRs (Most Interacting Residues) as the set of residues presenting a large number of non-covalent neighbour interactions during such simulation. MIRs are good candidates to define the minimal number of residues giving rise to a given fold instead of another one, although their proportion is rather high, typically 15-20% of the sequences. Having in mind experiments with two sequences of very high levels of sequence identity (up to 90%) but different folds, we combined the MIR method, which takes sequence as single input, with the “fuzzy oil drop” (FOD) model that requires a 3D structure, in order to estimate the residues coding for the fold. FOD assumes that a globular protein follows an idealised 3D Gaussian distribution of hydrophobicity density, with the maximum in the centre and minima at the surface of the “drop”. If the actual local density of hydrophobicity around a given amino acid is as high as the ideal one, then this amino acid is assigned to the core of the globular protein, and it is assumed to follow the FOD model. Therefore one obtains a distribution of the amino acids of a protein according to their agreement or disagreement with the FOD model.

Results

We compared and combined MIR and FOD methods to define the minimal nucleus, or keystone, of two populated folds: immunoglobulin-like (Ig) and flavodoxins (Flav). The combination of these two approaches defines some positions both predicted as a MIR and assigned as accordant with the FOD model. It is shown here that for these two folds, the intersection of the predicted sets of residues significantly differs from random selection. It reduces the number of selected residues by each individual method and allows a reasonable agreement with experimentally determined key residues coding for the particular fold. In addition, the intersection of the two methods significantly increases the specificity of the prediction, providing a robust set of residues that constitute the folding nucleus.

4.
The standard genetic code is known to be robust to translation errors and point mutations. We studied how small modifications of the standard code affect its robustness. The robustness was assessed in terms of a proper stability function, the negative variations of which correspond to a more robust code. The fraction of more robust codes obtained under small modifications appeared to be unexpectedly high, about 0.1-0.4 depending on the choice of stability function and code modifications, yet significantly lower than the corresponding fraction in the random codes (about a half). In this sense the standard code ought to be considered distinctly non-random in accordance with previous observations. The distribution of the negative variations of the stability function revealed a very abrupt drop beyond one standard deviation, much sharper than for a Gaussian distribution or for the random codes with the same number of codons in the sets coding for amino acids or stop-codons. This behavior holds for both the standard code as a whole and its binary NRN-NYN, NWN-NSN, and NMN-NKN blocks. Previously, it has been proved that such binary block structure is necessary for the robustness of a code and is inherent to the standard genetic code. The modifications of the standard code corresponding to more robust coding may be related to the different variants of the code. These effects may also contribute to the rates of replacements of amino acids. The observed features demonstrate the joint impact of random factors and natural selection during evolution of the genetic code.

5.
An algorithm was derived to relate the amino acid sequence of a collagen triple helix to its thermal stability. This calculation is based on the triple helical stabilization propensities of individual residues and their intermolecular and intramolecular interactions, as quantitated by melting temperature values of host-guest peptides. Experimental melting temperature values of a number of triple helical peptides of varying length and sequence were successfully predicted by this algorithm. However, predicted T(m) values are significantly higher than experimental values when there are strings of oppositely charged residues or concentrations of like charges near the terminus. Application of the algorithm to collagen sequences highlights regions of unusually high or low stability, and these regions often correlate with biologically significant features. The prediction of stability from sequence indicates an understanding of the major forces maintaining this protein motif. The use of highly favorable KGE and KGD sequences is seen to complement the stabilizing effects of imino acids in modulating stability and may become dominant in the collagenous domains of bacterial proteins that lack hydroxyproline. The effect of single amino acid mutations in the X and Y positions can be evaluated with this algorithm. An interactive collagen stability calculator based on this algorithm is available online.

6.
Expanded understanding of the factors that direct polypeptide ion fragmentation can lead to improved specificity in the use of tandem mass spectrometry for the identification and characterization of proteins. Like the fragmentation of peptide cations, the dissociation of whole protein cations shows several preferred cleavages, the likelihood for which is parent ion charge dependent. While such cleavages are often observed, they are far from universally observed, despite the presence of the residues known to promote them. Furthermore, cleavages at residues not noted to be common in a variety of proteins can be dominant for a particular protein or protein ion charge state. Motivated by the ability to study a small protein, turkey ovomucoid third domain, for which a variety of single amino acid variants are available, the effects of changing the identity of one amino acid in the protein sequence on its dissociation behavior were examined. In particular, changes in amino acids associated with C-terminal aspartic acid cleavage and N-terminal proline cleavage were emphasized. Consistent with previous studies, the product ion spectra were found to be dependent upon the parent ion charge state. Furthermore, the fraction of possible C-terminal aspartic acid cleavages observed to occur for this protein was significantly larger than the fraction of possible N-terminal proline cleavages. In fact, very little N-terminal proline cleavage was noted for the wild-type protein despite the presence of three proline residues in the protein. The addition/removal of proline and aspartic acids was studied along with changes in selected residues adjacent to proline residues. Evidence for inhibition of proline cleavage by the presence of nearby basic residues was noted, particularly if the basic residue was likely to be protonated.

7.
The region of the colicin E1 polypeptide that interacts with immunity protein has been localized to a 168-residue COOH-terminal peptide. This is the length of a proteolytically generated peptide fragment of colicin E1 against which imm+ function can be demonstrated in osmotically shocked cells. The role of particular amino acids of the COOH-terminal peptide in the expression of the immune phenotype was studied. Chemical modification showed that the two histidine residues (His 427 and His 440) and the single cysteine residue (Cys 505) present in the COOH-terminal peptide were not necessary for the colicin-immunity protein interaction. The immunity protein was localized in the cytoplasmic membrane fraction, consistent with previous work of others on the colicin Ia immunity protein and the prediction from the immunity protein amino acid sequence that it is a hydrophobic protein. The distribution of hydrophobic residues along the immunity polypeptide was calculated.

8.
Helix-helix interactions are important for the folding, stability, and function of membrane proteins. Here, two independent and complementary methods are used to investigate the nature and distribution of amino acids that mediate helix-helix interactions in membrane and soluble alpha-bundle proteins. The first method characterizes the packing density of individual amino acids in helical proteins based on the van der Waals surface area occluded by surrounding atoms. We have recently used this method to show that transmembrane helices pack more tightly, on average, than helices in soluble proteins. These studies are extended here to characterize the packing of interfacial and noninterfacial amino acids and the packing of amino acids in the interfaces of helices that have either right- or left-handed crossing angles, and either parallel or antiparallel orientations. We show that the most abundant tightly packed interfacial residues in membrane proteins are Gly, Ala, and Ser, and that helices with left-handed crossing angles are more tightly packed on average than helices with right-handed crossing angles. The second method used to characterize helix-helix interactions involves the use of helix contact plots. We find that helices in membrane proteins exhibit a broader distribution of interhelical contacts than helices in soluble proteins. Both helical membrane and soluble proteins make use of a general motif for helix interactions that relies mainly on four residues (Leu, Ala, Ile, Val) to mediate helix interactions in a fashion characteristic of left-handed helical coiled coils. However, a second motif for mediating helix interactions is revealed by the high occurrence and high average packing values of small and polar residues (Ala, Gly, Ser, Thr) in the helix interfaces of membrane proteins. 
Finally, we show that there is a strong linear correlation between the occurrence of residues in helix-helix interfaces and their packing values, and discuss these results with respect to membrane protein structure prediction and membrane protein stability.

9.
Most protein-based affinity chromatography media are very sensitive towards alkaline treatment, which is a preferred method for regeneration and removal of contaminants from the purification devices in industrial applications. In a previous study, we concluded that a simple and straightforward strategy consisting of replacing asparagine residues could improve the stability towards alkaline conditions. In this study, we have shown the potential of this rationale by stabilizing an IgG-binding domain of streptococcal protein G, i.e. the C2 domain. In order to analyze the contribution of the different amino acids to the alkaline sensitivity of the domain we used a single point mutation strategy. Amino acids known to be susceptible towards high pH, asparagine and glutamine, were substituted for less-alkali-susceptible residues. In addition, aspartic acid residues were mutated to evaluate if the stability could be further increased. The stability of the different C2 variants was subsequently analyzed by exposing them to NaOH. The obtained results reveal that the most sensitive amino acid towards alkaline conditions in the structure of C2 is Asn36. The double mutant, C2(N7,36A), was found to be the most stable mutant constructed. In addition to the increased alkaline stability and also very important for potential use as an affinity ligand, this mutated variant also retains the secondary structure, as well as the affinity to the Fc fragment of IgG.

10.
The sequences of four-alpha-helical bundle proteins are characterized by a pattern of hydrophilic and hydrophobic amino acids which is repeated every seven residues. At each position of the heptad repeat there are specific constraints on the amino acid properties which result from the topology of the tertiary motif. These constraints give rise to patterns of amino acid distribution which are distinct from those of other proteins. The distributions in each of the heptad positions have been determined by a statistical analysis of structural and sequence data derived from seven families of aligned protein sequences. The constitution of each position is dominated by a very small number of different amino acids, with the core positions consisting overwhelmingly of Leu and Ala. The positional preferences of the individual amino acids can be generally interpreted in terms of residue properties and topological constraints. The potential for four-alpha-helix bundle folding is reflected primarily in the pattern of residue occurrence in the heptad and not in the overall amino acid composition of the protein. Possible applications of this analysis in structure predictions, sequence alignments and in the rational design and engineering of four-alpha-helical bundle proteins are discussed.

11.
Characteristic sequential residue environment of amino acids in proteins
The occurrence of all di- and tripeptide segments of proteins was counted in a large data base containing about 119 000 residues. It was found that the abundance of the amino acids does not determine the frequency of the various di- and tripeptide segments. In addition, the frequency of the various tripeptides cannot be predicted from the observed pair-frequency values. The pair-frequency distribution of amino acids is highly asymmetrical, pairs formed from identical residues are generally preferred and amino acids cannot be clustered on the basis of their first neighbour preferences. These data indicate the existence of general short range regularities in the primary structure of proteins. The consequences of these short range regularities were studied by comparing Chou-Fasman parameters with analogous parameters determined from the results of conformational energy calculations of single amino acids. This comparison shows that Chou-Fasman parameters carry significant information about the environment of each amino acid. The success of the Chou-Fasman's prediction and the properties of the pair and triplet distribution of the amino acid residues suggest that every amino acid has a characteristic sequential residue environment in proteins. The observed preferences could be invoked, for example, in protein design or in the study of the evolutionary relationship of proteins.
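
The comparison between observed dipeptide frequencies and those expected from amino acid abundance alone could be computed along these lines. This is a rough sketch under stated assumptions: the function name is hypothetical, and the observed/expected ratio is one simple way to express the deviation, not necessarily the statistic used in the paper.

```python
from collections import Counter


def dipeptide_ratios(sequences):
    """Observed / expected frequency for every ordered dipeptide.

    The expectation assumes independence: the frequency of the ordered
    pair (a, b) is the product of the abundances of a and b.
    """
    single = Counter()
    pairs = Counter()
    for seq in sequences:
        single.update(seq)
        # Ordered neighbouring pairs, so (a, b) and (b, a) stay distinct.
        pairs.update(zip(seq, seq[1:]))
    n_residues = sum(single.values())
    n_pairs = sum(pairs.values())
    ratios = {}
    for (a, b), observed in pairs.items():
        expected = (single[a] / n_residues) * (single[b] / n_residues) * n_pairs
        ratios[(a, b)] = observed / expected
    return ratios
```

Keeping the pairs ordered makes the asymmetry mentioned in the abstract visible: the ratio for (a, b) need not equal the ratio for (b, a). Ratios far from 1 indicate pairs that abundance alone does not explain.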

12.
In this work, we study the first passage statistics of amino acid primary sequences, that is the probability of observing an amino acid for the first time at a certain number of residues away from a fixed amino acid. By using this rich mathematical framework, we are able to capture the background distribution for an organism, and infer lengths at which the first passage has a probability that differs from what is expected. While many features of an organism's genome are due to natural selection, others are related to amino acid chemistry and the environment in which an organism lives, constraining the randomness of genomes upon which selection can further act. We therefore use this approach to infer amino acid correlations, and then study how these correlations vary across a wide range of organisms under a wide range of optimal growth temperatures. We find a nearly universal exponential background distribution, consistent with the idea that most amino acids are globally uncorrelated from other amino acids in genomes. When we are able to extract significant correlations, these correlations are reliably dependent on optimal growth temperature, across phylogenetic boundaries. Some of the correlations we extract, such as the enhanced probability of finding, for the first time, a cysteine three residues away from a cysteine or glutamic acid two residues away from an arginine, likely relate to thermal stability. However, other correlations, likely appearing on alpha helical surfaces, have a less clear physicochemical interpretation and may relate to thermal stability or unusual metabolic properties of organisms that live in a high temperature environment.
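
The first passage statistic defined in this abstract can be illustrated with a short sketch. The function names are hypothetical, and the geometric baseline is the standard result for independent residues, used here as an assumed stand-in for the exponential background the authors report.

```python
from collections import Counter


def first_passage_distribution(sequence, start, target):
    """Empirical distribution of first passage distances.

    For each occurrence of `start`, record the number of residues to the
    first later occurrence of `target`; occurrences with no later
    `target` are skipped.
    """
    counts = Counter()
    for i, residue in enumerate(sequence):
        if residue != start:
            continue
        j = sequence.find(target, i + 1)
        if j != -1:
            counts[j - i] += 1
    total = sum(counts.values())
    return {d: c / total for d, c in sorted(counts.items())} if total else {}


def geometric_background(p, distance):
    """First passage probability at `distance` if residues were independent.

    p is the overall frequency of the target residue; the target must be
    absent for distance - 1 positions and then present.
    """
    return p * (1 - p) ** (distance - 1)
```

Distances at which the empirical probability departs clearly from the geometric baseline are the candidates for the correlations discussed in the abstract, such as a cysteine three residues after a cysteine.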

13.
Burioni R, Cassi D, Cecconi F, Vulpiani A. Proteins, 2004, 55(3): 529-535
We present an analysis of the effects of global topology on the structural stability of folded proteins in thermal equilibrium with a heat bath. For a large class of single domain proteins, we computed the harmonic spectrum within the Gaussian Network Model (GNM) and determined their spectral dimension, a parameter describing the low frequency behavior of the density of modes. We found a surprisingly strong correlation between the spectral dimension and the number of amino acids in the protein. Considering that larger spectral dimension values relate to more topologically compact folded states, our results indicate that, for a given temperature and length of protein, the folded structure corresponds to a less compact folding, one compatible with thermodynamic stability.

14.
A Monte Carlo simulation based sequence design method is proposed to investigate the role of site-directed point mutations in protein misfolding. Site-directed point mutations are incorporated in the designed sequences of selected proteins. While most mutated sequences correctly fold to their native conformation, some of them stabilize in other nonnative conformations and thus misfold/unfold. The results suggest that a critical number of hydrophobic amino acid residues must be present in the core of the correctly folded proteins, whereas proteins misfold/unfold if this number of hydrophobic residues falls below the critical limit. A protein can accommodate only a particular number of hydrophobic residues at the surface, provided a large number of hydrophilic residues are present at the surface and critical hydrophobicity of the core is preserved. Some surface sites are observed to be equally sensitive toward site-directed point mutations as the core sites. Point mutations with highly polar and charged amino acids increase the misfold/unfold propensity of proteins. Substitution of natural amino acids at sites with different numbers of nonbonded contacts suggests that both amino acid identity and its respective site-specificity determine the stability of a protein. A clash-match method is developed to calculate the number of matching and clashing interactions in the mutated protein sequences. While misfolded/unfolded sequences have a higher number of clashing and a lower number of matching interactions, the correctly folded sequences have a lower number of clashing and a higher number of matching interactions. These results are valid for different SCOP classes of proteins.

15.
A suite of FORTRAN programs, PREF, is described for calculating preference functions from the data base of known protein structures and for comparing smoothed profiles of sequence-dependent preferences in proteins of unknown structure. Amino acid preferences for a secondary structure are considered as functions of a sequence environment. Sequence environment of amino acid residue in a protein is defined as an average over some physical, chemical, or statistical property of its primary structure neighbors. The frequency distribution of sequence environments in the data base of soluble protein structures is approximately normal for each amino acid type of known secondary conformation. An analytical expression for the dependence of preferences on sequence environment is obtained after each frequency distribution is replaced by corresponding Gaussian function. The preference for the α-helical conformation increases for each amino acid type with the increase of sequence environment of buried solvent-accessible surface areas. We show that a set of preference functions based on buried surface area is useful for predicting folding motifs in α-class proteins and in integral membrane proteins. The prediction accuracy for helical residues is 79% for 5 integral membrane proteins and 74% for 11 α-class soluble proteins. Most residues found in transmembrane segments of membrane proteins with known α-helical structure are predicted to be indeed in the helical conformation because of very high middle helix preferences. Both extramembrane and transmembrane helices in the photosynthetic reaction center M and L subunits are correctly predicted. We point out in the discussion that our method of conformational preference functions can identify what physical properties of the amino acids are important in the formation of particular secondary structure elements. © 1993 John Wiley & Sons, Inc.

16.
Distribution of accessible surfaces of amino acids in globular proteins
Lawrence C, Auger I, Mannella C. Proteins, 1987, 2(2): 153-161

17.
The covalent structure of rat ribosomal protein L7 was determined in part from the sequence of nucleotides in a recombinant cDNA and in part from the sequence of amino acids in portions of the protein. The complementary analyses supplemented and confirmed each other. Ribosomal protein L7 contains 258 amino acids and has a molecular weight of 30,040. The protein has an unusual and striking structural feature near the NH2 terminus: five tandem repeats of a sequence of 12 residues. Rat L7 appears to be related to ribosomal protein L7 from the moderate halophile Vibrio costicola and perhaps to L30 from Bacillus stearothermophilus, to L7 from the moderate halophile NRCC 41227, and to L22 from Nicotiana tabacum chloroplast. In addition, there is a sequence of 24 amino acids in rat protein L7 that may be related to segments of the same number of residues in Escherichia coli ribosomal proteins S10, S15, L9, and L22.

18.
Miyazawa S, Jernigan R L. Proteins, 1999, 36(3): 347-356
Short-range interactions for secondary structures of proteins are evaluated as potentials of mean force from the observed frequencies of secondary structures in known protein structures which are assumed to have an equilibrium distribution with the Boltzmann factor of secondary structure energies. A secondary conformation at each residue position in a protein is described by a tripeptide, including one nearest neighbor on each side. The secondary structure potentials are approximated as additive contributions from neighboring residues along the sequence. These are part of an empirical potential to provide a crude estimate of protein conformational energy at a residue level. Unlike previous works, interactions are decoupled into intrinsic potentials of residues, potentials of backbone-backbone interactions, and of side chain-backbone interactions. Also interactions are decoupled into one-body, two-body, and higher order interactions between peptide backbone and side chain and between backbones. These decouplings are essential to correctly evaluate the total secondary structure energy of a protein structure without overcounting interactions. Each interaction potential is evaluated separately by taking account of the correlation in the amino acid order of protein sequences. Interactions among side chains are neglected, because of the relatively limited number of protein structures. Proteins 1999;36:347-356. Published 1999 Wiley-Liss, Inc.

19.
20.
Recent studies indicate that a fraction of the information contained in an amino acid sequence may be sufficient for specifying a native protein structure. An earlier alanine-scanning experiment conducted on bovine pancreatic trypsin inhibitor (BPTI; 58 residues) suggested that if cumulative mutations have additive effects on protein stability, a native protein structure could be built from BPTI sequences that contained many alanine residues distributed throughout the protein. To test this hypothesis, we designed and produced six BPTI mutants containing from 21 to 29 alanine residues. We found that the melting temperature of mutants containing up to 27 alanine residues (48% of the total number of residues) could be predicted quite well by the sum of the change in melting temperature for the single mutations. Additionally, these same mutants folded into a native-like structure, as judged by their cooperative thermal denaturation curves and heteronuclear multiple quantum correlation (HMQC) NMR spectra. A BPTI mutant containing 22 alanine residues was further shown by 2D and 3D-NMR to fold into a structure very similar to that of native BPTI, and to be a functional trypsin inhibitor. These results provide insight into the extent to which native protein structure and function can be achieved with a highly simplified amino acid sequence.
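
The additivity hypothesis tested in this abstract amounts to a simple sum, sketched below. The function name, the mutation labels, and every numerical value are hypothetical placeholders chosen for illustration; none of them is taken from the paper.

```python
def predicted_melting_temperature(tm_reference, delta_tm, mutations):
    """Additive model: reference Tm plus the sum of single-mutation shifts.

    tm_reference: melting temperature of the reference protein.
    delta_tm: maps each single mutation to its measured change in Tm.
    mutations: the mutations combined in the multiple mutant.
    """
    return tm_reference + sum(delta_tm[m] for m in mutations)


# Hypothetical single-mutation shifts in degrees Celsius (placeholders).
example_delta_tm = {"mutA": -2.0, "mutB": -1.5, "mutC": 0.5}
example_tm = predicted_melting_temperature(
    60.0, example_delta_tm, ["mutA", "mutB", "mutC"]
)
```

Comparing such a prediction with the measured melting temperature of the multiple mutant indicates how far the single-mutation effects remain additive.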
