Similar articles
20 similar articles found.
1.
A number of investigators have addressed the issue of why certain protein structures are especially common by considering structure designability, defined as the number of sequences that would successfully fold into any particular native structure. One such approach, based on foldability, suggested that structures could be classified according to their maximum possible foldability and that this optimal foldability would be highly correlated with structure designability. Other approaches have focused on computing the designability of lattice proteins written with reduced two-letter amino acid alphabets. These different approaches suggested contrasting characteristics of the most designable structures. This report compares the designability of lattice proteins over a wide range of amino acid alphabets and foldability requirements. While all alphabets have a wide distribution of protein designabilities, the form of the distribution depends on how protein "viability" is defined. Furthermore, under increasing foldability requirements, the change in designabilities for all alphabets is in good agreement with the previous conclusions of the foldability approach. Most importantly, structures that were highly designable for the two-letter amino acid alphabets were found not to be especially designable with higher-letter alphabets.

2.
Proteins exhibit a nonuniform distribution of structures. A number of models have been advanced to explain this observation by considering the distribution of designabilities, that is, the fraction of all sequences that could successfully fold into any particular structure. It has been postulated that more designable structures should be more common, although the exact nature of this relationship has not been addressed. We find that the nonuniform distribution of protein structures found in nature can be explained by the interplay of evolution and population dynamics with the designability distribution. The relative frequency of different structures has a greater-than-linear dependence on designability, making the distribution of observed protein structures more uneven than the distribution of designabilities. The distribution of structures is also affected by additional factors such as the topology of the sequence space and the similarity of other structures.

3.
Emberly EG, Miller J, Zeng C, Wingreen NS, Tang C. Proteins 2002, 47(3): 295-304
Using an off-lattice model, we fully enumerate folded conformations of polypeptide chains of up to N = 19 monomers. Structures are found to differ markedly in designability, defined as the number of sequences with that structure as a unique lowest-energy conformation. We find that designability is closely correlated with the pattern of surface exposure of the folded structure. For longer chains, complete enumeration of structures is impractical. Instead, structures can be randomly sampled, and relative designability estimated either from designability within the random sample, or directly from surface-exposure pattern. We compare the surface-exposure patterns of those structures identified as highly designable to the patterns of naturally occurring proteins.

4.
Miller J, Zeng C, Wingreen NS, Tang C. Proteins 2002, 47(4): 506-512
Despite the variety of protein sizes, shapes, and backbone configurations found in nature, the design of novel protein folds remains an open problem. Within simple lattice models it has been shown that not all structures are equally suitable for design. Rather, certain structures are distinguished by unusually high designability, that is, the number of amino acid sequences for which they represent the unique lowest-energy state; sequences associated with such structures possess both robustness to mutation and thermodynamic stability. Here we report that highly designable backbone conformations also emerge in a realistic off-lattice model. The highly designable conformations of a chain of 23 amino acids are identified and found to be remarkably insensitive to model parameters. Although some of these conformations correspond closely to known natural protein folds, such as the zinc finger and the helix-turn-helix motifs, others do not resemble known folds and may be candidates for novel fold design.

5.
6.
Rashin AA, Rashin AH. Proteins 2007, 66(2): 321-341
Two-dimensional lattice protein models were studied in two approximations of the conformational equilibrium to elucidate the role of surface hydrophobic groups in their stabilities. We demonstrate that stability of any compactly folded sequence is determined by its ability to "flip-flop" (refold) into alternative compact structures. The degree of stability required for folded sequences determines the average numbers of surface hydrophobic groups in stable lattice structures, which are in good agreement with ratios of core to surface hydrophobic groups in real proteins. However, the average destabilization of the native structure per surface hydrophobic group is small (0-0.25 kcal/mol), often disagrees with the free energies derived from the ratios of core to surface hydrophobic groups in the same structures, and has a combinatorial entropic nature independent of the strength of structure-stabilizing interactions. This suggests that the free energies derived from the core to surface ratios of hydrophobic groups in real proteins have little to do with folding thermodynamics. On average, sequences with highly stable native structures are the least hydrophobic. The results suggest that in designing novel stable proteins hydrophobic groups on the surface should be avoided to reduce the possibility of flip-flopping. The average stability of highly designable structures is never higher than that of some low-designability structures, contrary to the accepted view. In the equilibrium approximation with alternative compact and partially unfolded structures, the requirement of high stability selects a unique 5 × 5 structure formed by only a few sequences, suggesting much stronger sequence selectivity than commonly thought.

7.
Hue Sun Chan, Ken A. Dill. Proteins 1996, 24(3): 335-344
Proteins fold to unique compact native structures. Perhaps other polymers could be designed to fold in similar ways. The chemical nature of the monomer “alphabet” determines the “energy matrix” of monomer interactions, which defines the folding code, the relationship between sequence and structure. We study two properties of energy matrices using two-dimensional lattice models: uniqueness, the number of sequences that fold to only one structure, and encodability, the number of folds that are unique lowest-energy structures of certain monomer sequences. For the simplest model folding code, involving binary sequences of H (hydrophobic) and P (polar) monomers, only a small fraction of sequences fold uniquely, and not all structures can be encoded. Adding strong repulsive interactions results in a folding code with more sequences folding uniquely and more designable folds. Some theories suggest that the quality of a folding code depends only on the number of letters in the monomer alphabet, but we find that the energy matrix itself can be at least as important as the size of the alphabet. Certain multi-letter codes, including some with 20 letters, may be less physical or protein-like than codes with smaller numbers of letters because they neglect correlations among inter-residue interactions, treat only maximally compact conformations, or add arbitrary energies to the energy matrix.
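
The designability definition used across these abstracts (the number of sequences that have a given structure as their unique lowest-energy conformation) can be illustrated with a small brute-force sketch. It assumes the simplest HP energy matrix (each non-bonded H-H lattice contact contributes -1, everything else 0) on a 2D square lattice; the function names are hypothetical and it is not the code of any paper listed here. It is practical only for very short chains.

```python
from itertools import product

# Unit steps on the 2D square lattice.
STEPS = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def self_avoiding_walks(n):
    """Yield every self-avoiding walk of n monomers with the first bond fixed to +x."""
    def extend(path, occupied):
        if len(path) == n:
            yield tuple(path)
            return
        x, y = path[-1]
        for dx, dy in STEPS:
            nxt = (x + dx, y + dy)
            if nxt not in occupied:
                path.append(nxt)
                occupied.add(nxt)
                yield from extend(path, occupied)
                path.pop()
                occupied.remove(nxt)

    start = [(0, 0), (1, 0)]
    yield from extend(start, set(start))


def conformations(n):
    """Walks with rotations (fixed first bond) and mirror images removed."""
    kept = []
    for walk in self_avoiding_walks(n):
        # Keep a walk only if its first step off the x-axis goes up (or it is straight).
        first_turn = next((y for _, y in walk if y != 0), 0)
        if first_turn >= 0:
            kept.append(walk)
    return kept


def contacts(walk):
    """Pairs (i, j) of monomers that are lattice neighbours but not bonded."""
    index = {pos: i for i, pos in enumerate(walk)}
    pairs = set()
    for i, (x, y) in enumerate(walk):
        for dx, dy in STEPS:
            j = index.get((x + dx, y + dy))
            if j is not None and j > i + 1:
                pairs.add((i, j))
    return frozenset(pairs)


def designability(n):
    """Map each contact set to the number of HP sequences it uniquely minimizes."""
    contact_sets = [contacts(w) for w in conformations(n)]
    counts = {}
    for seq in product("HP", repeat=n):
        # Assumed HP energy: -1 per H-H contact.
        energies = [-sum(seq[i] == "H" and seq[j] == "H" for i, j in c)
                    for c in contact_sets]
        best = min(energies)
        if energies.count(best) == 1:  # unique ground state
            winner = contact_sets[energies.index(best)]
            counts[winner] = counts.get(winner, 0) + 1
    return counts
```

For a four-monomer chain there are five conformations after symmetry removal, and only the U-shaped one (contact between the two chain ends) is designable: it is the unique ground state of the four sequences with H at both ends.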

8.
With the aim of studying the relationship between protein sequences and their native structures, we adopted vectorial representations for both sequence and structure. The structural representation was based on the principal eigenvector of the fold's contact matrix (PE). As has been recently shown, the latter encodes sufficient information for reconstructing the whole contact matrix. The sequence was represented through a hydrophobicity profile (HP), using a generalized hydrophobicity scale that we obtained from the principal eigenvector of a residue-residue interaction matrix and denoted the interactivity scale. Using this novel scale, we defined the optimal HP of a protein fold and, by means of stability arguments, predicted it to be strongly correlated with the PE of the fold's contact matrix. This prediction was confirmed through an evolutionary analysis, which showed that the PE correlates with the HP of each individual sequence adopting the same fold and, even more strongly, with the average HP of this set of sequences. Thus, protein sequences evolve in such a way that their average HP is close to the optimal one, implying that neutral evolution can be viewed as a kind of motion in sequence space around the optimal HP. Our results indicate that the correlation coefficient between N-dimensional vectors constitutes a natural metric in the vectorial space in which we represent both protein sequences and protein structures, which we call vectorial protein space. In this way, we define a unified framework for sequence-to-sequence, sequence-to-structure and structure-to-structure alignments. We show that the interactivity scale is nearly optimal both for the comparison of sequences to sequences and sequences to structures.
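
The comparison described in this abstract (principal eigenvector of a contact matrix versus a per-residue hydrophobicity profile, scored by a correlation coefficient) can be sketched in a few lines. This is a toy illustration under stated assumptions: `contact_matrix` and `toy_profile` are made-up values, not data from the paper, and the power iteration on A + I is a generic numerical choice rather than the authors' method.

```python
import math


def principal_eigenvector(matrix, iterations=500):
    """Power iteration on (A + I); the shift avoids oscillation on bipartite contact graphs."""
    n = len(matrix)
    v = [1.0] * n
    for _ in range(iterations):
        w = [v[i] + sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length vectors."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


# Hypothetical six-residue fold: symmetric 0/1 contact matrix built from a contact list.
pairs = [(0, 2), (0, 3), (1, 3), (2, 4), (3, 5), (0, 4)]
contact_matrix = [[0] * 6 for _ in range(6)]
for i, j in pairs:
    contact_matrix[i][j] = contact_matrix[j][i] = 1

# Hypothetical hydrophobicity profile (arbitrary units), one value per residue.
toy_profile = [0.9, 0.1, 0.6, 0.8, 0.5, 0.2]

pe = principal_eigenvector(contact_matrix)
print("PE:", [round(x, 3) for x in pe])
print("correlation with profile:", round(pearson(pe, toy_profile), 3))
```

Residues with many contacts get large eigenvector components, so a profile that places hydrophobic residues at well-connected positions yields a high correlation.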

9.
Understanding the evolution of biopolymers is a key element in rationalizing their structures and functions. Simple exact models (SEMs) are well-positioned to address general principles of evolution as they permit the exhaustive enumeration of both sequence and structure (conformational) spaces. The physics-based models of the complete mapping between genotypes and phenotypes afforded by SEMs have proven valuable for gaining insight into how adaptation and selection operate among large collections of sequences and structures. This study compares the properties of evolutionary landscapes of a variety of SEMs to delineate robust predictions and possible model-specific artifacts. Among the models studied, the ruggedness of evolutionary landscape is significantly model-dependent; those derived from more protein-like models appear to be smoother. We found that a common practice of restricting protein structure space to maximally compact lattice conformations results in (i.e., "designs in") many encodable (designable) structures that are not otherwise encodable in the corresponding unrestrained structure space. This discrepancy is especially severe for model potentials that seek to mimic the major role of hydrophobic interactions in protein folding. In general, restricting conformations to be maximally compact leads to larger changes in the model genotype-phenotype mapping than a moderate shifting of reference state energy of the model potential function to allow for more specific encoding via the "designing out" effects of repulsive interactions. Despite these variations, the superfunnel paradigm applies to all SEMs we have tested: For a majority of neutral nets across different models, there exists a funnel-like organization of native stabilities for the sequences in a neutral net encoding for the same structure, and the thermodynamically most stable sequence is also the most robust against mutation.

10.
Using a triangular lattice model to study the designability of protein folding, we overcame the parity problem of the previous cubic lattice model and enumerated all the sequences and compact structures on a simple two-dimensional triangular lattice model of size 4 + 5 + 6 + 5 + 4. We used two types of amino acids, hydrophobic and polar, to make up the sequences, and obtained 2^23 + 2^12 different sequences, excluding the reverse symmetry sequences. The total number of distinct compact structures was 219,093, excluding reflection symmetry in the self-avoiding path of length 24 on the triangular lattice model. Based on this model, we applied a fast search algorithm by constructing a cluster tree. The algorithm decreased the computation by computing the objective energy of non-leaf nodes. The parallel experiments showed that the fast tree search algorithm yielded an exponential speed-up in the model of size 4 + 5 + 6 + 5 + 4. Designability analysis was performed to understand the search result.

11.
Because the space of folded protein structures is highly degenerate, with recurring secondary and tertiary motifs, methods for representing protein structure in terms of collective physically relevant coordinates are of great interest. By collapsing structural diversity to a handful of parameters, such methods can be used to delineate the space of designable structures (i.e., conformations that can be stabilized with a large number of sequences), a crucial task for de novo protein design. We first demonstrate this on natural α-helical coiled coils using the Crick parameterization. We show that over 95% of known coiled-coil structures are within 1-Å Cα root mean square deviation of a Crick-ideal backbone. Derived parameters show that natural geometric space of coiled coils is highly restricted and can be represented by “allowed” conformations amidst a potential continuum of conformers. Allowed structures have (1) restricted axial offsets between helices, which differ starkly between parallel and anti-parallel structures; (2) preferred superhelical radii, which depend linearly on the oligomerization state; (3) pronounced radius-dependent a- and d-position amino acid propensities; and (4) discrete angles of rotation of helices about their axes, which are surprisingly independent of oligomerization state or orientation. In all, we estimate the space of designable coiled-coil structures to be reduced at least 160-fold relative to the space of geometrically feasible structures. To extend the benefits of structural parameterization to other systems, we developed a general mathematical framework for parameterizing arbitrary helical structures, which reduces to the Crick parameterization as a special case. The method is successfully validated on a set of non-coiled-coil helical bundles, frequent in channels and transporter proteins, which show significant helix bending but not supercoiling.
Programs for coiled-coil parameter fitting and structure generation are provided via a web interface at http://www.gevorggrigoryan.com/cccp/, and code for generalized helical parameterization is available upon request.

12.
Analysis of increasingly saturated sequence databases has shown that gene family sizes are highly skewed, with many families being small and few containing many far-diverged homologs. Additionally, recently published results have identified a structural determinant of mutational plasticity, designability, that correlates strongly with gene family size. In this paper, we explore the possible links between the two observations, examining the possible effect of designability on duplication and divergence. We show that designability has the inverse of the expected relationship with strength of selection: more designable domains, which should have more mutational plasticity, evolve more slowly. However, we also present evidence that recently duplicated genes have a variable probability of locus fixation that is correlated with strength of selection. As expected, paralogs under stronger evolutionary pressure have a lower failure rate. Finally, we show that the probability of pseudogene formation from gene duplication can be directly tied to the designability and functional flexibility of the family. We present evidence that gene families with higher designability have diverged farther because of a lower probability of pseudogenization. Additionally, mutational plasticity may play an integral role by influencing the pseudogenization rate. Either way, we show that considering the failure rate of duplications is integral to understanding the determinants and dynamics of molecular evolution.

13.
In this study, we address the issue of performing meaningful pK(a) calculations using homology-modeled three-dimensional (3D) structures and analyze the possibility of using the calculated pK(a) values to detect structural defects in the models. For this purpose, the 3D structure of each member of five large protein families of bacterial nucleoside monophosphate kinases (NMPK) was modeled by means of a homology-based approach. Further, we performed pK(a) calculations for each model and for the template X-ray structures. Each bacterial NMPK family used in the study comprised on average 100 members, providing a pool of sequences and 3D models large enough for reliable statistical analysis. It was shown that pK(a) values of titratable groups that are highly conserved within a family tend to be conserved among the models too. We demonstrated that homology-modeled structures with sequence identity larger than 35% and gap percentile smaller than 10% can be used for meaningful pK(a) calculations. In addition, it was found that some highly conserved titratable groups either exhibit large pK(a) fluctuations among the models or have pK(a) values shifted by several pH units with respect to the pK(a) calculated for the X-ray structure. We demonstrated that such cases usually indicate structural errors associated with the model. Thus, we argue that pK(a) calculations can be used for assessing the quality of the 3D models by monitoring fluctuations of the pK(a) values for highly conserved titratable residues within large sets of homologous proteins.

14.
We report on the biochemical and structural properties of a putative P-type H(+)-ATPase, MJ1226p, from the anaerobic hyperthermophilic archaeon Methanococcus jannaschii. An efficient heterologous expression system was developed in Saccharomyces cerevisiae, and a four-step purification protocol, using n-dodecyl β-D-maltoside, led to a homogeneous detergent-solubilized protein fraction with a yield of over 2 mg of protein per liter of culture. The three-dimensional structure of the purified detergent-solubilized protein obtained at 2.4 nm resolution by electron microscopy showed a dimeric organization in which the size and the shape of each monomer were compatible with the reported structures of P-type ATPases. The purified MJ1226p ATPase was inactive at 40 °C and was active at elevated temperature, reaching high specific activity, up to 180 µmol of P(i) min(-1) mg(-1) at 95 °C. Maximum ATPase activity was observed at pH 4.2 and required up to 200 mM monovalent salts. The ATPase activity was stable for several days upon storage at 65 °C and was highly resistant to urea and guanidine hydrochloride. The protein formed catalytic phosphoenzyme intermediates from MgATP or P(i), a functional characteristic specific to P-type ATPases. The highly purified, homogeneous, stable, and active MJ1226p ATPase provides a new model for further structure-function studies of P-type ATPases.

15.
A blinded study to assess the state of the art in three-dimensional structure modeling of the variable region (Fv) of antibodies was conducted. Nine unpublished high-resolution x-ray Fab crystal structures covering a wide range of antigen-binding site conformations were used as a benchmark to compare Fv models generated by four structure prediction methodologies. The methodologies included two homology modeling strategies independently developed by CCG (Chemical Computing Group) and Accelrys Inc, and two fully automated antibody modeling servers: PIGS (Prediction of ImmunoGlobulin Structure), based on the canonical structure model, and Rosetta Antibody Modeling, based on homology modeling and Rosetta structure prediction methodology. The benchmark structure sequences were submitted to Accelrys and CCG, and a set of models for each of the nine antibody structures was generated. PIGS and Rosetta models were obtained using the default parameters of the servers. In most cases, we found good agreement between the models and x-ray structures. The average rmsd (root mean square deviation) values calculated over the backbone atoms between the models and structures were fairly consistent, around 1.2 Å. Average rmsd values of the framework and hypervariable loops with canonical structures (L1, L2, L3, H1, and H2) were close to 1.0 Å. H3 prediction yielded rmsd values around 3.0 Å for most of the models. Quality assessment of the models and the relative strengths and weaknesses of the methods are discussed. We hope this initiative will serve as a model of scientific partnership and look forward to future antibody modeling assessments. Proteins 2011; © 2011 Wiley-Liss, Inc.

16.
In this work, we discovered a fundamental connection between selection for protein stability and emergence of preferred structures of proteins. Using a standard exact three-dimensional lattice model we evolve sequences starting from random ones and determine the exact native structure after each mutation. Acceptance of mutations is biased to select for stable proteins. We found that certain structures, "wonderfolds", are independently discovered numerous times as native states of stable proteins in many unrelated runs of selection. The strong dependence of lattice fold usage on the structural determinant of designability quantitatively reproduces uneven fold usage in natural proteins. Diversity of sequences that fold into wonderfold structures gives rise to superfamilies, i.e. sets of dissimilar sequences that fold into the same or very similar structures. The present work establishes a model of pre-biotic structure selection, which identifies dominant structural patterns emerging upon optimization of proteins for survival in a hot environment. Convergently discovered pre-biotic initial superfamilies with wonderfold structures could have served as a seed for subsequent biological evolution involving gene duplications and divergence.

17.
Linear models are typically used to analyze multivariate longitudinal data. With these models, estimating the covariance matrix is not easy because the covariance matrix should account for complex correlated structures: the correlation between responses at each time point, the correlation within separate responses over time, and the cross-correlation between different responses at different times. In addition, the estimated covariance matrix should satisfy the positive definiteness condition, and it may be heteroscedastic. However, in practice, the structure of the covariance matrix is assumed to be homoscedastic and highly parsimonious, such as exchangeable or autoregressive with order one. These assumptions are too strong and result in inefficient estimates of the effects of covariates. Several studies have been conducted to solve these restrictions using modified Cholesky decomposition (MCD) and linear covariance models. However, modeling the correlation between responses at each time point is not easy because there is no natural ordering of the responses. In this paper, we use MCD and hypersphere decomposition to model the complex correlation structures for multivariate longitudinal data. We observe that the estimated covariance matrix using the decompositions is positive-definite and can be heteroscedastic and that it is also interpretable. The proposed methods are illustrated using data from a nonalcoholic fatty liver disease study.
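
For reference, the two decompositions named in this abstract are usually written as follows in the covariance-modeling literature. This is a sketch of the standard single-response forms, not the paper's own notation: the symbols Σ, T, D, R, L and θ are assumptions introduced here, and the multivariate version in the paper may use block matrices instead.

```latex
% Modified Cholesky decomposition (MCD) of a covariance matrix \Sigma
T \Sigma T^{\top} = D,
\qquad
\Sigma^{-1} = T^{\top} D^{-1} T
% T: unit lower-triangular, below-diagonal entries -\phi_{tj}
%    (coefficients from regressing y_t on its predecessors y_1, ..., y_{t-1})
% D: diagonal, with positive innovation variances

% Hypersphere decomposition of a correlation matrix R = L L^{\top},
% L lower-triangular with angles \theta_{ij} \in (0, \pi)
l_{11} = 1, \qquad
l_{i1} = \cos\theta_{i1}, \qquad
l_{ij} = \cos\theta_{ij} \prod_{k=1}^{j-1} \sin\theta_{ik} \quad (1 < j < i), \qquad
l_{ii} = \prod_{k=1}^{i-1} \sin\theta_{ik}
```

Both forms make positive definiteness automatic: any positive diagonal D and any angles in (0, π) give a valid matrix, so the regression coefficients, variances and angles can be modeled without constraints.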

18.
Kappa-conotoxin RIIIJ is a conopeptide that inhibits voltage-gated potassium channels; however, its detailed folding structures have yet to be studied. With the advance in computing power, it is possible to use the HP model to analyze all its possible folding structures. In this study, the amino acid sequences of kappa-conotoxin RIIIJ and its four mutants were converted into ten HP sequences according to the normalized hydrophobicity index. All 282,429,536,481 possible folding structures for each HP sequence were found using the 2-dimensional HP model, and the detailed folding structures at the native state were studied. The results showed that kappa-conotoxin RIIIJ had 180 and 90 folding structures at its native state, with minimal energy of -9 and -10, at pH 2 and pH 7, respectively; its mutation (6-8) TPP -> SLN increased the numbers of folding structures to 456 and 564 at pH 2 and pH 7, whereas its mutations (6-11) TPPKKH -> SLNLRL, (9-11) KKH -> LRL, and (10-11) KH -> RL decreased the numbers of folding structures to 60, 30 and 90 at both pH levels, respectively. Thereafter, the normalized hydrophobicity index was employed to distinguish those native states, and attempts were made to explain the effect of the mutations on potassium channels in terms of the number of folding structures and numerical native states.

19.
Although the hydrophobic-polar (HP) model was proposed a decade ago, it has been applied to almost no real-case studies because of its computational cost. In this study, a 2D HP model was applied to study the folding structures of M-lycotoxin-Hc1a, an antimicrobial peptide, in order to get a full picture of its numerous folding structures. The normalised hydrophobicity index was used to convert M-lycotoxin-Hc1a and its six mutants into HP sequences, then the 2D HP model was used to compute all the possible folding structures (3^24 = 282,429,536,481), and finally the normalised hydrophobicity index was used to distinguish the native state. The results showed that M-lycotoxin-Hc1a had 6 and 138 folding structures at its native state with the minimal energy of -13 at pH 2 and pH 7 when glycine served as a hydrophobic amino acid. When glycine served as a polar amino acid, M-lycotoxin-Hc1a had 12 and 54 folding structures at its native state with the minimal energy of -12 and -13 at pH 2 and pH 7, respectively. This study advanced the knowledge of how to apply the HP model to real-life studies, and of how the mutations influenced the folding structures of M-lycotoxin-Hc1a, their native states and minimal energy at different pH levels.
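
The figure 282,429,536,481 quoted in this abstract and the previous one is exactly 3 to the power 24. One plausible reading, which is an assumption here and not stated in the abstracts, is that the chain is encoded as 24 relative moves (left, straight, right) on the square lattice, so the figure counts move strings before overlapping chains are discarded. The short sketch below checks the arithmetic and shows, for small cases, how many such strings are actually self-avoiding; the function names are hypothetical.

```python
from itertools import product


def is_self_avoiding(moves):
    """Follow relative moves (L, S, R) after a fixed first bond along +x."""
    x, y, dx, dy = 1, 0, 1, 0          # current position and heading
    seen = {(0, 0), (1, 0)}
    for move in moves:
        if move == "L":                # rotate heading by +90 degrees
            dx, dy = -dy, dx
        elif move == "R":              # rotate heading by -90 degrees
            dx, dy = dy, -dx
        x, y = x + dx, y + dy
        if (x, y) in seen:
            return False
        seen.add((x, y))
    return True


def count_self_avoiding(n_moves):
    """Brute-force count of self-avoiding move strings of the given length."""
    return sum(is_self_avoiding(m) for m in product("LSR", repeat=n_moves))


# Size of the raw search space for 24 free moves.
print(3 ** 24)

# For short chains, compare raw strings with self-avoiding ones.
for k in range(1, 8):
    print(k, 3 ** k, count_self_avoiding(k))
```

Because the raw count grows as 3^k while the self-avoiding count grows more slowly, pruning overlapping chains during enumeration, rather than afterwards, is the usual way to keep such exhaustive searches tractable.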

20.
Tang Y, Goger MJ, Raleigh DP. Biochemistry 2006, 45(22): 6940-6946
The villin headpiece subdomain (HP36) is the smallest naturally occurring protein that folds cooperatively. The protein folds on a microsecond time scale. Its small size and very rapid folding have made it a popular target for biophysical studies of protein folding. Temperature-dependent one-dimensional (1D) NMR studies of the full-length protein together with CD and 1D NMR studies of the 21-residue peptide fragment (HP21) derived from HP36 have shown that there is significant structure in the unfolded state of HP36 and have demonstrated that HP21 is a good model of these interactions. Here, we characterized the model peptide HP21 in detail by two-dimensional NMR. Strongly upfield shifted C(alpha) protons, the magnitude of the 3J(NH,alpha) coupling constants, and the pattern of backbone-backbone and backbone-side chain NOEs indicate that the ensemble of structures populated by HP21 contains alpha-helical structure and native as well as non-native hydrophobic contacts. The hydrogen-bonded secondary structure inferred from the NOEs is, however, not sufficient to confer significant protection against amide H-D exchange. These studies indicate that there is significant secondary structure and hydrophobic clustering in the unfolded state of HP36. The implications for the folding of HP36 are discussed.
