Similar literature
Found 20 similar articles (search time: 546 ms)
1.
Because the space of folded protein structures is highly degenerate, with recurring secondary and tertiary motifs, methods for representing protein structure in terms of collective physically relevant coordinates are of great interest. By collapsing structural diversity to a handful of parameters, such methods can be used to delineate the space of designable structures (i.e., conformations that can be stabilized with a large number of sequences), a crucial task for de novo protein design. We first demonstrate this on natural α-helical coiled coils using the Crick parameterization. We show that over 95% of known coiled-coil structures are within 1-Å Cα root mean square deviation of a Crick-ideal backbone. Derived parameters show that the natural geometric space of coiled coils is highly restricted and can be represented by “allowed” conformations amidst a potential continuum of conformers. Allowed structures have (1) restricted axial offsets between helices, which differ starkly between parallel and anti-parallel structures; (2) preferred superhelical radii, which depend linearly on the oligomerization state; (3) pronounced radius-dependent a- and d-position amino acid propensities; and (4) discrete angles of rotation of helices about their axes, which are surprisingly independent of oligomerization state or orientation. In all, we estimate the space of designable coiled-coil structures to be reduced at least 160-fold relative to the space of geometrically feasible structures. To extend the benefits of structural parameterization to other systems, we developed a general mathematical framework for parameterizing arbitrary helical structures, which reduces to the Crick parameterization as a special case. The method is successfully validated on a set of non-coiled-coil helical bundles, frequent in channels and transporter proteins, which show significant helix bending but not supercoiling.
Programs for coiled-coil parameter fitting and structure generation are provided via a web interface at http://www.gevorggrigoryan.com/cccp/, and code for generalized helical parameterization is available upon request.
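To make the idea of a Crick-ideal backbone concrete, the sketch below generates Cα coordinates for a single helix of an ideal coiled coil from a handful of parameters. It uses one commonly quoted form of the Crick equations; the function name, the parameter names, and all default values (superhelical radius, helical radius, angular frequencies, rise per residue) are illustrative assumptions and are not taken from the abstract or from the CCCP programs, whose conventions may differ.

```python
import math


def crick_backbone(n_res, r0=5.0, r1=2.26, omega0=math.radians(-4.0),
                   omega1=math.radians(102.9), phi0=0.0, phi1=0.0, rise=1.51):
    """Return Calpha (x, y, z) tuples for one helix of an ideal coiled coil.

    r0, r1   : superhelical and minor-helix radii in angstroms (assumed values)
    omega0/1 : superhelical and minor-helix frequencies in radians per residue
    phi0/1   : starting phases in radians
    rise     : rise per residue along the helix path in angstroms (assumed value)
    """
    # Pitch angle of the superhelix, from sin(alpha) = r0 * omega0 / rise.
    alpha = math.asin(r0 * omega0 / rise)
    coords = []
    for t in range(n_res):
        a0 = omega0 * t + phi0  # superhelical phase of residue t
        a1 = omega1 * t + phi1  # minor-helix phase of residue t
        x = (r0 * math.cos(a0) + r1 * math.cos(a0) * math.cos(a1)
             - r1 * math.cos(alpha) * math.sin(a0) * math.sin(a1))
        y = (r0 * math.sin(a0) + r1 * math.sin(a0) * math.cos(a1)
             + r1 * math.cos(alpha) * math.cos(a0) * math.sin(a1))
        z = rise * math.cos(alpha) * t - r1 * math.sin(alpha) * math.sin(a1)
        coords.append((x, y, z))
    return coords
```

Calling `crick_backbone(28)` yields 28 points that wind around the z axis; setting `r1=0.0` collapses them onto the superhelical path itself, which is a quick sanity check on the geometry.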

2.
Arodź T  Płonka PM 《Proteins》2012,80(7):1780-1790
Inspection of structure changes in proteins borne by altering their sequences brings understanding of physics, functioning and evolution of existing proteins, and helps engineer modified ones. On single amino acid substitutions, the most frequent mutation type, shifts in backbone conformation are typically small, raising doubts about whether and how such minor modifications could drive evolutionary divergence. Here, we report that the distribution of magnitudes of structure change on such substitutions is heavy-tailed: whereas protein structures are robust to most substitutions, changes much larger than average occur with raised odds compared to what would be expected for an exponential distribution with the same mean. This nonexponential behavior allows for reconciling the apparent contradiction between the observed conservation of protein structures and the substantial evolutionary plasticity implied in their diversity. The presence of the heavy tail in the distribution promotes structure divergence, facilitating exploration of new functionality, and conformations within folds, as well as exploration of structure space for new folds.
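The comparison against an exponential distribution with the same mean can be illustrated with a small helper. For an exponential distribution with mean μ, the probability of exceeding kμ is exp(-k), so the ratio of the observed tail fraction to exp(-k) indicates how heavy the tail is. This is only a sketch of that comparison; the function name `tail_excess` and its inputs are hypothetical and do not reproduce the authors' statistical analysis.

```python
import math


def tail_excess(changes, k):
    """Ratio of the observed tail fraction to the exponential expectation.

    changes : magnitudes of structure change (any non-negative numbers)
    k       : tail threshold expressed in multiples of the sample mean
    Values above 1 mean large changes are more frequent than an
    exponential distribution with the same mean would predict.
    """
    mean = sum(changes) / len(changes)
    observed = sum(1 for c in changes if c > k * mean) / len(changes)
    expected = math.exp(-k)  # P(X > k * mean) for an exponential distribution
    return observed / expected
```

A ratio that keeps growing as `k` increases is the signature of a heavier-than-exponential tail.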

3.
Hu X  Kuhlman B 《Proteins》2006,62(3):739-748
Loss of side-chain conformational entropy is an important force opposing protein folding, and the relative preferences of the amino acids for being buried or solvent exposed may be partially determined by which amino acids lose more side-chain entropy when placed in the core of a protein. To investigate these preferences, we have incorporated explicit modeling of side-chain entropy into the protein design algorithm, RosettaDesign. In the standard version of the program, the energy of a particular sequence for a fixed backbone depends only on the lowest energy side-chain conformations that can be identified for that sequence. In the new model, the free energy of a single amino acid sequence is calculated by evaluating the average energy and entropy of an ensemble of structures generated by Monte Carlo sampling of amino acid side-chain conformations. To evaluate the impact of including explicit side-chain entropy, sequences were designed for 110 native protein backbones with and without the entropy model. In general, the differences between the two sets of sequences are modest, with the largest changes being observed for the longer amino acids: methionine and arginine. Overall, the identity between the designed sequences and the native sequences does not increase with the addition of entropy, unlike what is observed when other key terms are added to the model (hydrogen bonding, Lennard-Jones energies, and solvation energies). These results suggest that side-chain conformational entropy has a relatively small role in determining the preferred amino acid at each residue position in a protein.
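The difference between scoring only the lowest-energy conformation and scoring an ensemble can be sketched with the relation F = ⟨E⟩ - TS. The example below assumes a small, fully enumerated list of side-chain conformation energies with Boltzmann weights, which stands in for the Monte Carlo sampling described in the abstract; the function name, the units (kcal/mol), and the temperature default are assumptions, and this is not RosettaDesign code.

```python
import math

K_B = 0.0019872  # Boltzmann constant in kcal/(mol*K)


def ensemble_free_energy(energies, temperature=298.0):
    """Return (mean energy, entropy, free energy) of a Boltzmann ensemble.

    energies : energies of the enumerated conformations, in kcal/mol
    """
    kt = K_B * temperature
    e_min = min(energies)
    # Shift by the minimum energy for numerical stability before exponentiating.
    weights = [math.exp(-(e - e_min) / kt) for e in energies]
    z = sum(weights)
    probs = [w / z for w in weights]
    mean_e = sum(p * e for p, e in zip(probs, energies))
    entropy = -K_B * sum(p * math.log(p) for p in probs if p > 0.0)
    return mean_e, entropy, mean_e - temperature * entropy
```

With several near-degenerate conformations the free energy falls below the lowest single-conformation energy, which is the effect an explicit entropy term adds to a design score.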

4.
Li H  Tang C  Wingreen NS 《Proteins》2002,49(3):403-412
We study the designability of all compact 3 × 3 × 3 and 6 × 6 lattice-protein structures using the Miyazawa-Jernigan (MJ) matrix. The designability of a structure is the number of sequences that design the structure, i.e., sequences that have that structure as their unique lowest-energy state. Previous studies of hydrophobic-polar (HP) models showed a wide distribution of structure designabilities. Recently, questions were raised concerning the use of a two-letter (HP) code in such studies. Here, we calculate designabilities using all 20 amino acids, with empirically determined interaction potentials (MJ matrix) and compare with HP model results. We find good qualitative agreement between the two models. In particular, highly designable structures in the HP model are also highly designable in the MJ model, and vice versa, with the associated sequences having enhanced thermodynamic stability.

5.
6.
M J Sippl  S Weitckus 《Proteins》1992,13(3):258-271
We present an approach which can be used to identify native-like folds in a data base of protein conformations in the absence of any sequence homology to proteins in the data base. The method is based on a knowledge-based force field derived from a set of known protein conformations. A given sequence is mounted on all conformations in the data base and the associated energies are calculated. Using several conformations and sequences from the globin family we show that the native conformation is identified correctly. In fact the resolution of the force field is high enough to discriminate between a native fold and several closely related conformations. We then apply the procedure to several globins of known sequence but unknown three dimensional structure. The homology of these sequences to globins of known structures in the data base ranges from 49 to 17%. With one exception we find that for all globin sequences one of the known globin folds is identified as the most favorable conformation. These results are obtained using a force field derived from a data base devoid of globins of known structure. We briefly discuss useful applications in protein structural research and future development of our approach.

7.
Recently we developed methods for the construction of knowledge-based mean fields from a data base of known protein structures. As shown previously, this approach can be used to calculate ensembles of probable conformations for short fragments of polypeptide chains. Here we develop procedures for the assembly of short fragments to complete three-dimensional models of polypeptide chains. The amino acid sequence of a given protein is decomposed into all possible overlapping fragments of a given length, and an ensemble of probable conformations is calculated for each fragment. The fragments are assembled to a complete model by choosing appropriate conformations from the individual ensembles and by averaging over equivalent angles. Finally a consistent model is obtained by rebuilding the conformation from the average angles. From the average angles the local variability of the structure can be calculated, which is a useful criterion for the reliability of the model. The procedure is applied to the calculation of the local backbone conformations of myoglobin and lysozyme whose structures have been solved by X-ray analysis and thymosin beta 4, a polypeptide of 43 amino acid residues whose structure was recently investigated by NMR spectroscopy. We demonstrate that substantial fractions of the calculated local backbone conformations are similar to the experimentally determined structures.
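Averaging over equivalent angles from overlapping fragments needs care because dihedral angles are periodic: 170° and -170° should average to about 180°, not 0°. The sketch below shows one standard way to do this, a circular mean together with a circular variance as a measure of local variability. The abstract does not specify which averaging scheme the authors used, so the function name and the choice of circular statistics are assumptions.

```python
import math


def circular_mean_deg(angles):
    """Return (mean angle in degrees, circular variance in [0, 1]).

    angles : dihedral angles in degrees proposed for the same backbone
             position by different overlapping fragments
    """
    s = sum(math.sin(math.radians(a)) for a in angles)
    c = sum(math.cos(math.radians(a)) for a in angles)
    mean = math.degrees(math.atan2(s, c))
    # Length of the mean resultant vector: 1 means all angles agree.
    resultant = math.hypot(s, c) / len(angles)
    return mean, 1.0 - resultant
```

A variance near 0 marks positions where the fragments agree, and a variance near 1 marks positions where the averaged angle is poorly determined.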

8.
Emberly EG  Miller J  Zeng C  Wingreen NS  Tang C 《Proteins》2002,47(3):295-304
Using an off-lattice model, we fully enumerate folded conformations of polypeptide chains of up to N = 19 monomers. Structures are found to differ markedly in designability, defined as the number of sequences with that structure as a unique lowest-energy conformation. We find that designability is closely correlated with the pattern of surface exposure of the folded structure. For longer chains, complete enumeration of structures is impractical. Instead, structures can be randomly sampled, and relative designability estimated either from designability within the random sample, or directly from surface-exposure pattern. We compare the surface-exposure patterns of those structures identified as highly designable to the patterns of naturally occurring proteins.

9.
A reduced representation model, which has been described in previous reports, was used to predict the folded structures of proteins from their primary sequences and random starting conformations. The molecular structure of each protein has been reduced to its backbone atoms (with ideal fixed bond lengths and valence angles) and each side chain approximated by a single virtual united-atom. The coordinate variables were the backbone dihedral angles phi and psi. A statistical potential function, which included local and nonlocal interactions and was computed from known protein structures, was used in the structure minimization. A novel approach, employing the concepts of genetic algorithms, has been developed to simultaneously optimize a population of conformations. With the information of primary sequence and the radius of gyration of the crystal structure only, and starting from randomly generated initial conformations, I have been able to fold melittin, a protein of 26 residues, with high computational convergence. The computed structures have a root mean square error of 1.66 Å (distance matrix error = 0.99 Å) on average to the crystal structure. Similar results for avian pancreatic polypeptide inhibitor, a protein of 36 residues, are obtained. Application of the method to apamin, an 18-residue polypeptide with two disulfide bonds, shows that it folds apamin to native-like conformations with the correct disulfide bonds formed.

10.
In recent years, there have been significant advances in the field of computational protein design including the successful computational design of enzymes based on backbone scaffolds from experimentally solved structures. It is likely that large-scale sampling of protein backbone conformations will become necessary as further progress is made on more complicated systems. Removing the constraint of having to use scaffolds based on known protein backbones is a potential method of solving the problem. With this application in mind, we describe a method to systematically construct a large number of de novo backbone structures from idealized topological forms in a top-down hierarchical approach. The structural properties of these novel backbone scaffolds were analyzed and compared with a set of high-resolution experimental structures from the Protein Data Bank (PDB). It was found that the Ramachandran plot distribution and relative γ- and β-turn frequencies were similar to those found in the PDB. The de novo scaffolds were sequence designed with RosettaDesign, and the energy distributions and amino acid compositions were comparable with the results for redesigned experimentally solved backbones. Proteins 2010. © 2009 Wiley-Liss, Inc.

11.
Baoqiang Cao  Ron Elber 《Proteins》2010,78(4):985-1003
We investigate small sequence adjustments (of one or a few amino acids) that induce large conformational transitions between distinct and stable folds of proteins. Such transitions are intriguing from evolutionary and protein-design perspectives. They make it possible to search for ancient protein structures or to design protein switches that flip between folds and functions. A network of sequence flow between protein folds is computed for representative structures of the Protein Data Bank. The computed network is dense: on average, each structure is connected to tens of other folds. Proteins that attract sequences from a higher than expected number of neighboring folds are more likely to be enzymes and alpha/beta fold. The large number of connections between folds may reflect the need of enzymes to adjust their structures for alternative substrates. The network of the Cro family is discussed, and we speculate that capacity is an important factor (but not the only one) that determines protein evolution. The experimentally observed flip from all alpha to alpha + beta fold is examined by the network tools. A kinetic model for the transition of sequences between the folds (with only protein stability in mind) is proposed. Proteins 2010. © 2009 Wiley-Liss, Inc.

12.
A number of investigators have addressed the issue of why certain protein structures are especially common by considering structure designability, defined as the number of sequences that would successfully fold into any particular native structure. One such approach, based on foldability, suggested that structures could be classified according to their maximum possible foldability and that this optimal foldability would be highly correlated with structure designability. Other approaches have focused on computing the designability of lattice proteins written with reduced two-letter amino acid alphabets. These different approaches suggested contrasting characteristics of the most designable structures. This report compares the designability of lattice proteins over a wide range of amino acid alphabets and foldability requirements. While all alphabets have a wide distribution of protein designabilities, the form of the distribution depends on how protein "viability" is defined. Furthermore, under increasing foldability requirements, the changes in designability for all alphabets are in good agreement with the previous conclusions of the foldability approach. Most importantly, it was noticed that those structures that were highly designable for the two-letter amino acid alphabets are not especially designable with higher-letter alphabets.

13.
14.
Protein-DNA interactions are crucial for many biological processes. Attempts to model these interactions have generally taken the form of amino acid-base recognition codes or purely sequence-based profile methods, which depend on the availability of extensive sequence and structural information for specific structural families, neglect side-chain conformational variability, and lack generality beyond the structural family used to train the model. Here, we take advantage of recent advances in rotamer-based protein design and the large number of structurally characterized protein-DNA complexes to develop and parameterize a simple physical model for protein-DNA interactions. The model shows considerable promise for redesigning amino acids at protein-DNA interfaces, as design calculations recover the amino acid residue identities and conformations at these interfaces with accuracies comparable to sequence recovery in globular proteins. The model shows promise also for predicting DNA-binding specificity for fixed protein sequences: native DNA sequences are selected correctly from pools of competing DNA substrates; however, incorporation of backbone movement will likely be required to improve performance in homology modeling applications. Interestingly, optimization of zinc finger protein amino acid sequences for high-affinity binding to specific DNA sequences results in proteins with little or no predicted specificity, suggesting that naturally occurring DNA-binding proteins are optimized for specificity rather than affinity. When combined with algorithms that optimize specificity directly, the simple computational model developed here should be useful for the engineering of proteins with novel DNA-binding specificities.

15.
Hue Sun Chan  Ken A. Dill 《Proteins》1996,24(3):335-344
Proteins fold to unique compact native structures. Perhaps other polymers could be designed to fold in similar ways. The chemical nature of the monomer “alphabet” determines the “energy matrix” of monomer interactions, which defines the folding code, the relationship between sequence and structure. We study two properties of energy matrices using two-dimensional lattice models: uniqueness, the number of sequences that fold to only one structure, and encodability, the number of folds that are unique lowest-energy structures of certain monomer sequences. For the simplest model folding code, involving binary sequences of H (hydrophobic) and P (polar) monomers, only a small fraction of sequences fold uniquely, and not all structures can be encoded. Adding strong repulsive interactions results in a folding code with more sequences folding uniquely and more designable folds. Some theories suggest that the quality of a folding code depends only on the number of letters in the monomer alphabet, but we find that the energy matrix itself can be at least as important as the size of the alphabet. Certain multi-letter codes, including some with 20 letters, may be less physical or protein-like than codes with smaller numbers of letters because they neglect correlations among inter-residue interactions, treat only maximally compact conformations, or add arbitrary energies to the energy matrix.
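Uniqueness and encodability as defined above can be computed by brute force for very short chains. The sketch below enumerates self-avoiding walks on the square lattice (one per rotation and mirror class), scores every binary HP sequence with an energy of -1 per non-bonded H-H contact, and counts sequences with a single lowest-energy conformation and the conformations that such sequences encode. The energy function, the symmetry handling, and all function names are assumptions chosen for illustration; they are not the authors' code or their full set of energy matrices.

```python
from itertools import product


def conformations(n):
    """Self-avoiding walks of n monomers, one per rotation/mirror class."""
    walks = []

    def grow(path, turned):
        if len(path) == n:
            walks.append(tuple(path))
            return
        x, y = path[-1]
        for dx, dy in ((1, 0), (0, 1), (-1, 0), (0, -1)):
            if not turned and dy == -1:
                continue  # break mirror symmetry: the first turn goes up
            nxt = (x + dx, y + dy)
            if nxt not in path:
                grow(path + [nxt], turned or dy != 0)

    grow([(0, 0), (1, 0)], False)  # fixing the first bond removes rotations
    return walks


def contacts(walk):
    """Pairs of monomers that are lattice neighbours but not chain neighbours."""
    pos = {p: i for i, p in enumerate(walk)}
    pairs = []
    for i, (x, y) in enumerate(walk):
        for nb in ((x + 1, y), (x, y + 1)):  # each contact is seen once
            j = pos.get(nb)
            if j is not None and abs(i - j) > 1:
                pairs.append((min(i, j), max(i, j)))
    return pairs


def uniqueness_and_encodability(n):
    """Return (uniquely folding sequences, encodable folds, total folds)."""
    contact_lists = [contacts(w) for w in conformations(n)]
    unique_seqs, encoded = 0, set()
    for seq in product("HP", repeat=n):
        energies = [-sum(seq[i] == "H" and seq[j] == "H" for i, j in cl)
                    for cl in contact_lists]
        e_min = min(energies)
        if energies.count(e_min) == 1:  # a single lowest-energy conformation
            unique_seqs += 1
            encoded.add(energies.index(e_min))
    return unique_seqs, len(encoded), len(contact_lists)
```

The cost grows exponentially in both sequences and conformations, so this is practical only for short chains; swapping in a different energy function inside `uniqueness_and_encodability` is the place to experiment with other folding codes.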

16.
The question of how best to compare and classify the (three-dimensional) structures of proteins is one of the most important unsolved problems in computational biology. To help tackle this problem, we have developed a novel shape-density superposition algorithm called 3D-Blast which represents and superposes the shapes of protein backbone folds using the spherical polar Fourier correlation technique originally developed by us for protein docking. The utility of this approach is compared with several well-known protein structure alignment algorithms using receiver-operator-characteristic plots of queries against the “gold standard” CATH database. Despite being completely independent of protein sequences and using no information about the internal geometry of proteins, our results from searching the CATH database show that 3D-Blast is highly competitive compared to current state-of-the-art protein structure alignment algorithms. A novel and potentially very useful feature of our approach is that it allows an average or “consensus” fold to be calculated easily for a given group of protein structures. We find that using consensus shapes to represent entire fold families also gives very good database query performance. We propose that using the notion of consensus fold shapes could provide a powerful new way to index existing protein structure databases, and that it offers an objective way to cluster and classify all of the currently known folds in the protein universe. Proteins 2012. © 2011 Wiley Periodicals, Inc.

17.
The rational design of loops and turns is a key step towards creating proteins with new functions. We used a computational design procedure to create new backbone conformations in the second turn of protein L. The Protein Data Bank was searched for alternative turn conformations, and sequences optimal for these turns in the context of protein L were identified using a Monte Carlo search procedure and an energy function that favors close packing. Two variants containing 12 and 14 mutations were found to be as stable as wild-type protein L. The crystal structure of one of the variants has been solved at a resolution of 1.9 Å, and the backbone conformation in the second turn is remarkably close to that of the in silico model (1.1 Å RMSD) while it differs significantly from that of wild-type protein L (the turn residues are displaced by an average of 7.2 Å). The folding rates of the redesigned proteins are greater than that of the wild-type protein and in contrast to wild-type protein L the second beta-turn appears to be formed at the rate-limiting step in folding.

18.
19.
We use flexible backbone protein design to explore the sequence and structure neighborhoods of naturally occurring proteins. The method samples sequence and structure space in the vicinity of a known sequence and structure by alternately optimizing the sequence for a fixed protein backbone using rotamer-based sequence search, and optimizing the backbone for a fixed amino acid sequence using atomic-resolution structure prediction. We find that such a flexible backbone design method better recapitulates protein family sequence variation than sequence optimization on fixed backbones or randomly perturbed backbone ensembles for ten diverse protein structures. For the SH3 domain, the backbone structure variation in the family is also better recapitulated than in randomly perturbed backbones. The potential application of this method as a model of protein family evolution is highlighted by a concerted transition to the amino acid sequence in the structural core of one SH3 domain starting from the backbone coordinates of a homologous structure.

20.
While ab initio modeling of protein structures is not routine, certain types of proteins are more straightforward to model than others. Proteins with short repetitive sequences typically exhibit repetitive structures. These repetitive sequences can be more amenable to modeling if some information is known about the predominant secondary structure or other key features of the protein sequence. We have successfully built models of a number of repetitive structures with novel folds using knowledge of the consensus sequence within the sequence repeat and an understanding of the likely secondary structures that these may adopt. Our methods for achieving this success are reviewed here.
