Similar articles
 20 similar articles found; search took 31 ms
1.
Kamat AP  Lesk AM 《Proteins》2007,66(4):869-876
Comparing and classifying protein folding patterns allows organizing the known structures and enumerating possible protein structural patterns including those not yet observed. We capture the essence of protein folding patterns in a concise tableau representation based on the order and contact patterns of secondary structures: helices and strands of sheet. The tableaux are intelligible to both humans and computers. They provide a database, derived from the Protein Data Bank, mineable in studies of protein architecture. Using this database, we have: (i) determined statistical properties of secondary structure contacts in an unbiased set of protein domains from ASTRAL, (ii) observed that in 98% of cases, the tableau is a faithful representation of the folding pattern as classified in SCOP, (iii) demonstrated that to a large extent the local structure of proteins indicates their complete folding topology, and (iv) studied the use of the representation for fold identification.

2.
Smith CR  Mateljevic N  Bowler BE 《Biochemistry》2002,41(31):10173-10181
The conformational constraints on protein denatured states are of prime importance in modulating early events in protein folding. Although structural studies have demonstrated residual structure in protein denatured states, much remains poorly understood with regard to the conformational properties of this state. Here, we investigate topological effects on loop formation probabilities in denatured iso-1-cytochrome c by comparing histidine-heme binding affinities for histidines on the N- versus the C-terminal side of the heme. For histidines N-terminal to the heme (preceding cysteine 14), the polypeptide emerges from the edge of the heme and must simply fold over to bind to the heme. For histidines C-terminal to the heme (following histidine 18), the polypeptide emerges from the back side of the heme and must wrap around the heme for the histidine to bind to the heme. Thus, the steric constraints on this wrap-around topology are expected to be much more demanding than for the heme-edge topology of the N-terminal histidines. Evaluation of loop formation probabilities in 3 M guanidine hydrochloride, conditions that fully denature the variants studied, demonstrates that N-terminal histidine-heme loop formation is 10-25-fold more favorable than C-terminal histidine-heme loop formation, for similar loop sizes. A two-dimensional square lattice model indicates that excluded volume is important in this topological preference. These data provide direct evidence that denatured state topology affects contact probability, and thus probable folding pathways, in a disordered protein.

3.
Accompanying recent advances in determining RNA secondary structure is the growing appreciation for the importance of relatively simple topological constraints, encoded at the secondary structure level, in defining the overall architecture, folding pathways, and dynamic adaptability of RNA. A new view is emerging in which tertiary interactions do not define RNA 3D structure, but rather help select specific conformers from an already narrow, topologically pre-defined conformational distribution. Studies are providing fundamental insights into the nature of these topological constraints, how they are encoded by the RNA secondary structure, and how they interplay with other interactions, breathing new meaning into RNA secondary structure. New approaches have been developed that take advantage of topological constraints in determining RNA backbone conformation based on secondary structure and a limited set of other, easily accessible constraints. Topological constraints are also providing a much-needed framework for rationalizing and describing RNA dynamics and structural adaptation. Finally, studies suggest that topological constraints may play important roles in steering RNA folding pathways. Here, we review recent advances in our understanding of topological constraints encoded by the RNA secondary structure.

4.
The protein folding problem represents one of the most challenging problems in computational biology. Distance constraints and topology predictions can be highly useful for the folding problem in reducing the conformational space that must be searched by deterministic algorithms to find a protein structure of minimum conformational energy. We present a novel optimization framework for predicting topological contacts and generating interhelical distance restraints between hydrophobic residues in alpha-helical globular proteins. It should be emphasized that since the model does not make assumptions about the form of the helices, it is applicable to all alpha-helical proteins, including helices with kinks and irregular helices. This model aims at enhancing the ASTRO-FOLD protein folding approach of Klepeis and Floudas (Journal of Computational Chemistry 2003;24:191-208), which finds the structure of global minimum conformational energy via a constrained nonlinear optimization problem. The proposed topology prediction model was evaluated on 26 alpha-helical proteins ranging from 2 to 8 helices and 35 to 159 residues, and the best identified average interhelical distances corresponding to the predicted contacts fell below 11 Å in all 26 of these systems. Given the positive results of applying the model to several protein systems, the importance of interhelical hydrophobic-to-hydrophobic contacts in determining the folding of alpha-helical globular proteins is highlighted.

5.
Computational de novo protein structure prediction is limited to small proteins of simple topology. The present work explores an approach to extend beyond the current limitations through assembling protein topologies from idealized α-helices and β-strands. The algorithm performs a Monte Carlo Metropolis simulated annealing folding simulation. It optimizes a knowledge-based potential that analyzes radius of gyration, β-strand pairing, secondary structure element (SSE) packing, amino acid pair distance, amino acid environment, contact order, secondary structure prediction agreement and loop closure. Discontinuation of the protein chain favors sampling of non-local contacts and thereby creation of complex protein topologies. The folding simulation is accelerated through exclusion of flexible loop regions further reducing the size of the conformational search space. The algorithm is benchmarked on 66 proteins with lengths between 83 and 293 amino acids. For 61 out of these proteins, the best SSE-only models obtained have an RMSD100 below 8.0 Å and recover more than 20% of the native contacts. The algorithm assembles protein topologies with up to 215 residues and a relative contact order of 0.46. The method is tailored to be used in conjunction with low-resolution or sparse experimental data sets which often provide restraints for regions of defined secondary structure.
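The abstract above names a Monte Carlo Metropolis simulated annealing search but gives no implementation details. The following is only a generic sketch of that technique on a toy one-dimensional problem, not the authors' method: the names `metropolis_accept`, `anneal`, `score` and `propose`, the geometric cooling schedule and all parameter values are assumptions made for illustration.

```python
import math
import random


def metropolis_accept(delta_e, temperature, rng=random):
    """Metropolis criterion: always accept improvements, and accept a
    worsening move with probability exp(-delta_e / temperature)."""
    if delta_e <= 0:
        return True
    return rng.random() < math.exp(-delta_e / temperature)


def anneal(score, propose, state, t_start=10.0, t_end=0.1, steps=1000, seed=0):
    """Minimize `score` by simulated annealing (hypothetical helper).

    `propose(state, rng)` must return a new candidate state.
    The geometric cooling schedule is an assumption, not taken from the source.
    """
    rng = random.Random(seed)
    current, current_score = state, score(state)
    best, best_score = current, current_score
    for step in range(steps):
        # Temperature decays geometrically from t_start to t_end.
        t = t_start * (t_end / t_start) ** (step / max(steps - 1, 1))
        candidate = propose(current, rng)
        candidate_score = score(candidate)
        if metropolis_accept(candidate_score - current_score, t, rng):
            current, current_score = candidate, candidate_score
            if current_score < best_score:
                best, best_score = current, current_score
    return best, best_score


# Toy usage: a one-dimensional "energy" with its minimum at x = 3.
toy_score = lambda x: (x - 3) ** 2
toy_propose = lambda x, rng: x + rng.choice([-1, 1])
best, best_score = anneal(toy_score, toy_propose, state=20)
```

In a real folding simulation the state would be an arrangement of secondary structure elements and the score a knowledge-based potential; the acceptance rule is the only part this sketch shares with such a method.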

6.
Feng H  Takei J  Lipsitz R  Tjandra N  Bai Y 《Biochemistry》2003,42(43):12461-12465
Structures of intermediates and transition states in protein folding are usually characterized by amide hydrogen exchange and protein engineering methods and interpreted on the basis of the assumption that they have native-like conformations. We were able to stabilize and determine the high-resolution structure of a partially unfolded intermediate that exists after the rate-limiting step of a four-helix bundle protein, Rd-apocyt b562, by multidimensional NMR methods. The intermediate has partial native-like secondary structure and backbone topology, consistent with our earlier native state hydrogen exchange results. However, non-native hydrophobic interactions exist throughout the structure. These and other results in the literature suggest that non-native hydrophobic interactions may occur generally in partially folded states. This can alter the interpretation of mutational protein engineering results in terms of native-like side chain interactions. In addition, since the intermediate exists after the rate-limiting step and Rd-apocyt b562 folds very rapidly (k_f ≈ 10^4 s^-1), these results suggest that non-native hydrophobic interactions, in the absence of topological misfolding, are repaired too rapidly to slow folding and cause the accumulation of folding intermediates. More generally, these results illustrate an approach for determining the high-resolution structure of folding intermediates.

7.
Comparing and classifying the three-dimensional (3D) structures of proteins is of crucial importance to molecular biology, from helping to determine the function of a protein to determining its evolutionary relationships. Traditionally, 3D structures are classified into groups of families that closely resemble the grouping according to their primary sequence. However, significant structural similarities exist at multiple levels between proteins that belong to these different structural families. In this study, we propose a new algorithm, CLICK, to capture such similarities. The method optimally superimposes a pair of protein structures independent of topology. Amino acid residues are represented by the Cartesian coordinates of a representative point (usually the Cα atom), side chain solvent accessibility, and secondary structure. Structural comparison is effected by matching cliques of points. CLICK was extensively benchmarked for alignment accuracy on four different sets: (i) 9537 pair-wise alignments between two structures with the same topology; (ii) 64 alignments from set (i) that were considered to constitute difficult alignment cases; (iii) 199 pair-wise alignments between proteins with similar structure but different topology; and (iv) 1275 pair-wise alignments of RNA structures. The accuracy of CLICK alignments was measured by the average structure overlap score and compared with other alignment methods, including HOMSTRAD, MUSTANG, Geometric Hashing, SALIGN, DALI, GANGSTA+, FATCAT, ARTS and SARA. On average, CLICK produces pair-wise alignments that are either comparable to or statistically significantly more accurate than those of all of these other methods. We have used CLICK to uncover relationships between (previously) unrelated proteins. These new biological insights include: (i) detecting hinge regions in proteins where domains or sub-domains show flexibility; (ii) discovering similar small molecule binding sites from proteins of different folds; and (iii) discovering topological variants of known structural/sequence motifs. Our method can generally be applied to compare any pair of molecular structures represented in Cartesian coordinates, as exemplified by the RNA structure superimposition benchmark.

8.
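Liu Chaoyang (刘超洋)  Zhuang Wenying (庄文颖) 《Mycosystema (菌物学报)》2011,30(6):912-919
The influence of ribosomal small-subunit secondary structure on fungal phylogenetic analysis was investigated. A comparison of phylogenetic trees constructed by different methods showed that analyses incorporating secondary structure information produced more reasonable topologies than traditional methods. Besides being used to optimize the sequence alignment, secondary structure information also needs to be integrated into the nucleotide substitution model; appropriate sequence alignment methods, evolutionary models and tree-building algorithms help to reveal the relationships among taxa more accurately.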

9.
Protein folding experiments demonstrate that the folding behaviors of many proteins can be roughly classified into two types: two-state kinetics and multi-state kinetics. Although the two types of protein folding kinetics have been observed for a long time, what determines the folding type of a protein is still largely unclear. The present work performed a comparative study based on a dataset of 43 two-state and 42 multi-state folders at different levels of proteins' intrinsic properties, from the simplest, sequence length, to native structure topology. The results show that a protein's amino acid composition and its long-range interaction-based topological complexity, rather than its secondary structure content, are the major determinants of protein folding type. Furthermore, a sequence-based folding type prediction achieved an accuracy of more than 80%. These findings imply that there is no clear boundary between secondary and tertiary structure formation during the protein folding process and support the existence of a continuum of folding mechanisms between the two ends of hierarchic and nucleation folding scenarios.

10.
Thermodynamic folding algorithms and structure probing experiments are commonly used to determine the secondary structure of RNAs. Here we propose a formal framework to reconcile information from both prediction algorithms and probing experiments. The thermodynamic energy parameters are adjusted using 'pseudo-energies' to minimize the discrepancy between prediction and experiment. Our framework differs from related approaches that used pseudo-energies in several key aspects. (i) The energy model is only changed when necessary and no adjustments are made if prediction and experiment are consistent. (ii) Pseudo-energies remain biophysically interpretable and hold positional information where experiment and model disagree. (iii) The whole thermodynamic ensemble of structures is considered, thus allowing mixtures of suboptimal structures to be reconstructed from seemingly contradictory data. (iv) The noise of the energy model and the experimental data is explicitly modeled, leading to an intuitive weighting factor through which the problem can be seen as folding with 'soft' constraints of different strength. We present an efficient algorithm to iteratively calculate pseudo-energies within this framework and demonstrate how this approach can be used in combination with SHAPE chemical probing data to improve secondary structure prediction. We further demonstrate that the pseudo-energies correlate with biophysical effects that are known to affect RNA folding such as chemical nucleotide modifications and protein binding.

11.
Three different approaches to improve tertiary fold prediction using the genetic algorithm are discussed: (i) refinement of the search strategy, (ii) combination of prediction and experiment and (iii) inclusion of experimental data as selection criteria into the genetic algorithm. Examples from our current work are presented for refined strategies against crowding in solution space, definition of domain boundaries and secondary structure in combination with experiment, and direct incorporation of experimentally known distance constraints into the fitness function. Electronic Supplementary Material available.

12.
The Escherichia coli nucleoid is maintained in its folded highly condensed state by constraints which involve RNA and protein. We have developed a rapid sedimentation assay to determine the state of folding of the membrane-free nucleoid. An approximate measure of the stability of the nucleoids under various conditions can then be estimated by measuring the temperature at which the nucleoids unfold. Using ethidium and gamma irradiation (which removes the negative supercoiling of the native nucleoid) as probes, it can be shown that there are two types of constraint involved in the condensation of the nucleoid. One of these constraints is destabilized by ethidium but stabilized by negative supercoiling; the second constraint is unaffected by both ethidium and negative supercoiling. Several models can be proposed: (i) a DNA·RNA duplex, (ii) a double-strand DNA (dsDNA)·RNA triplex, (iii) DNA-protein interactions, (iv) a topological knot with RNA, and (v) a DNA tetraplex. The topological knot model is not consistent with the data and many combinations of the others can be excluded. If RNA is involved in both constraints then RNA·DNA duplexes and dsDNA·RNA triplexes are involved in stabilizing the nucleoid.

13.
Recent studies have shown that basic steric and connectivity constraints encoded at the secondary structure level are key determinants of 3D structure and dynamics in simple two-way RNA junctions. However, the role of these topological constraints in higher order RNA junctions remains poorly understood. Here, we use a specialized coarse-grained molecular dynamics model to directly probe the thermodynamic contributions of topological constraints in defining the 3D architecture and dynamics of transfer RNA (tRNA). Topological constraints alone restrict tRNA's allowed conformational space by over an order of magnitude and strongly discriminate against formation of non-native tertiary contacts, providing a sequence independent source of folding specificity. Topological constraints also give rise to long-range correlations between the relative orientation of tRNA's helices, which in turn provides a mechanism for encoding thermodynamic cooperativity between distinct tertiary interactions. These aspects of topological constraints make it such that only several tertiary interactions are needed to confine tRNA to its native global structure and specify functionally important 3D dynamics. We further show that topological constraints are conserved across tRNA's different naturally occurring secondary structures. Taken together, our results emphasize the central role of secondary-structure-encoded topological constraints in defining RNA 3D structure, dynamics and folding.

14.
There are constraints on a protein sequence/structure for it to adopt a particular fold. These constraints could be either a local signature involving particular sequences or arrangements of secondary structure or a global signature involving features along the entire chain. To search systematically for protein fold signatures, we have explored the use of Inductive Logic Programming (ILP). ILP is a machine learning technique which derives rules from observation and encoded principles. The derived rules are readily interpreted in terms of concepts used by experts. For 20 populated folds in SCOP, 59 rules were found automatically. The accuracy of these rules, which is defined as the number of true positives plus true negatives over the total number of examples, is 74% (cross-validated value). Further analysis was carried out for 23 signatures covering 30% or more positive examples of a particular fold. The work showed that signatures of protein folds exist; about half of the rules discovered automatically coincide with the level of fold in the SCOP classification. Other signatures correspond to homologous family and may be the consequence of a functional requirement. Examination of the rules shows that many correspond to established principles published in specific literature. However, in general, the list of signatures is not part of standard biological databases of protein patterns. We find that the length of the loops makes an important contribution to the signatures, suggesting that this is an important determinant of the identity of protein folds. With the expansion in the number of determined protein structures, stimulated by structural genomics initiatives, there will be an increased need for automated methods to extract principles of protein folding from coordinates.
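The accuracy measure defined in the abstract above (true positives plus true negatives over the total number of examples) can be written out as a short function. The function name `accuracy` and the confusion-matrix counts in the usage line are hypothetical and chosen only so that the arithmetic is easy to follow; they are not data from the study.

```python
def accuracy(tp, tn, fp, fn):
    """Accuracy = (true positives + true negatives) / total examples."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("accuracy is undefined for an empty example set")
    return (tp + tn) / total


# Hypothetical counts for illustration only (not taken from the paper).
example_accuracy = accuracy(tp=37, tn=37, fp=13, fn=13)
```

Because this measure counts true negatives, it can look high on strongly imbalanced sets even when few positives are recovered, which is presumably why the abstract also reports coverage of positive examples.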

15.
Several methods have been developed for identifying more or less complex RNA structures in a genome. All these methods are based on the search for conserved primary and secondary sub-structures. In this paper, we present a simple formal representation of a helix, which is a combination of sequence and folding constraints, as a constrained regular expression. This representation allows us to develop a well-founded algorithm that searches for all approximate matches of a helix in a genome. The algorithm is based on an alignment graph constructed from several copies of a pushdown automaton, arranged one on top of another. This is a first attempt to take advantage of the possibilities of pushdown automata in the context of approximate matching. The worst-case time complexity is O(krpn), where k is the error threshold, n the size of the genome, p the size of the secondary expression, and r its number of union symbols. We then extend the algorithm to search for pseudo-knots and secondary structures containing an arbitrary number of helices.

16.
Many RNA molecules exert their biological function only after folding to unique three-dimensional structures. For long, noncoding RNA molecules, the complexity of finding the native topology can be a major impediment to correct folding to the biologically active structure. An RNA molecule may fold to a near-native structure but not be able to continue to the correct structure due to a topological barrier such as crossed strands or incorrectly stacked helices. Achieving the native conformation thus requires unfolding and refolding, resulting in a long-lived intermediate. We investigate the role of topology in the folding of two phylogenetically related catalytic group I introns, the Twort and Azoarcus group I ribozymes. The kinetic models describing the Mg2+-mediated folding of these ribozymes were previously determined by time-resolved hydroxyl (⋅OH) radical footprinting. Two intermediates formed by parallel pathways were resolved for each RNA. These data and analytical ultracentrifugation compaction analyses are used herein to constrain coarse-grained models of these folding intermediates as we investigate the role of nonnative topology in dictating the lifetime of the intermediates. Starting from an ensemble of unfolded conformations, we folded the RNA molecules by progressively adding native constraints to subdomains of the RNA defined by the ⋅OH time-progress curves to simulate folding through the different kinetic pathways. We find that nonnative topologies (arrangement of helices) occur frequently in the folding simulations despite using only native constraints to drive the reaction, and that the initial conformation, rather than the folding pathway, is the major determinant of whether the RNA adopts nonnative topology during folding. From these analyses we conclude that biases in the initial conformation likely determine the relative flux through parallel RNA folding pathways.

17.
To investigate the relationships between protein topology, amino acid sequence and folding mechanisms, the folding transition state of the Sso7d protein has been characterised both experimentally and theoretically. Although the Sso7d protein has a similar topology to that of the SH3 domains, the structure of its transition state is different from that of the alpha-spectrin and src SH3 domains previously studied. The folding algorithm, Fold-X, including an energy function with specific sequence features, accounts for these differences and reproduces in good agreement the set of experimental φ‡-U values obtained for the three proteins. Our analysis shows that taking into account sequence features underlying protein topology is critical for an accurate prediction of the folding process.

18.
Functionally homologous RNA sequences can substantially diverge in their primary sequences, but it can be reasonably assumed that they are related in their higher-degree structures. The problem of finding such structures while satisfying, as far as possible, the free-energy-minimization criterion is considered here in two aspects. Firstly, a quantitative measure of the folding consensus among secondary structures is defined, translating each structure into a linear representation and using the correlation theorem to compare them. Secondly, an algorithm is presented for the parallel search for secondary structures according to the free-energy-minimization criterion, but with a filtering action on the basis of the folding consensus measure. The method is tested on groups of RNA sequences different in origin and in function, for which proposals of homologous secondary structures based on experimental data exist. A comparison of the results with a blank consisting of a search on the basis of free energy minimization alone is always performed. In these tests the method shows its ability to obtain, from different sequences, secondary structures characterized by a high folding consensus measure even when structures of lower free energy that are not homologous are possible. Two applications are also shown. The first demonstrates the transfer of experimental data available for one sequence to a functionally related and therefore homologous one. The second application is the possibility of using a topological probe in the search for precise structural motifs.

19.
Although the folding rates of proteins have been studied extensively, both experimentally and theoretically, and many native state topological parameters have been proposed to correlate with or predict these rates, unfolding rates have received much less attention. Moreover, unfolding rates have generally been thought either to not relate to native topology in the same manner as folding rates, perhaps depending on different topological parameters, or to be more difficult to predict. Using a dataset of 108 proteins including two-state and multistate folders, we find that both unfolding and folding rates correlate strongly, and comparably well, with well-established measures of native topology, the absolute contact order and the long range order, with correlation coefficient values of 0.75 or higher. In addition, compared to folding rates, the absolute values of unfolding rates vary more strongly with native topology, have a larger range of values, and correlate better with thermodynamic stability. Similar trends are observed for subsets of different protein structural classes. Taken together, these results suggest that choosing a scaffold for protein engineering may require a compromise between a simple topology that will fold sufficiently quickly but also unfold quickly, and a complex topology that will unfold slowly and hence have kinetic stability, but fold slowly. These observations, together with the established role of kinetic stability in determining resistance to thermal and chemical denaturation as well as proteases, have important implications for understanding fundamental aspects of protein unfolding and folding and for protein engineering and design.
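The abstract above relies on two native-topology measures, the absolute contact order and the long range order, without defining them. The sketch below uses the definitions as they are commonly given in the literature: absolute contact order as the mean sequence separation over all native contacts, and long range order as the number of contacts whose sequence separation exceeds a threshold, divided by chain length. The function names, the threshold of 12 residues, the way contacts are supplied (a precomputed list of residue index pairs) and the toy data are all assumptions; the paper's own contact cutoffs are not stated in the abstract.

```python
def absolute_contact_order(contacts):
    """Mean sequence separation |i - j| over all contacting residue pairs."""
    if not contacts:
        raise ValueError("at least one contact is required")
    return sum(abs(i - j) for i, j in contacts) / len(contacts)


def relative_contact_order(contacts, length):
    """Absolute contact order normalized by chain length."""
    return absolute_contact_order(contacts) / length


def long_range_order(contacts, length, min_separation=12):
    """Contacts separated by more than `min_separation` residues, per residue.

    The default threshold of 12 is an assumption based on common usage.
    """
    long_range = sum(1 for i, j in contacts if abs(i - j) > min_separation)
    return long_range / length


# Toy contact list (hypothetical residue index pairs) for a 50-residue chain.
toy_contacts = [(1, 5), (2, 20), (10, 40)]
toy_length = 50
aco = absolute_contact_order(toy_contacts)        # (4 + 18 + 30) / 3
lro = long_range_order(toy_contacts, toy_length)  # 2 long-range contacts / 50
```

In practice the contact list would be derived from a structure file with a distance cutoff, and the choice of cutoff and atom set changes the numerical values, so published values should only be compared when computed the same way.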

20.
Joshi S  Rana S  Wangikar P  Durani S 《Biopolymers》2006,83(2):122-134
Artificial proteins potentially barrier-free in the folding kinetics are approached computationally under the guidance of protein-folding theories. The smallest and fastest folding globular protein triple-helix-bundle (THB) is so modified as to minimize or eliminate its presumed barriers in folding speed. As the barriers may reside in the ordering of either secondary or tertiary structure, the elements of both secondary and tertiary structure in the protein are targeted for prenucleation with suitable stereochemically constrained amino acid residues. The required elements of topology and sequence for the THB are optimized independently; first the topology is optimized with simulated annealing in polypeptides of highly simplified alphabet; next, the sequence in side chains is optimized using the standard inverse design methods. The resultant three best-adapted THBs, variable in topology and distinctive in sequences, are assessed by comparing them with a few benchmark proteins. The results of mainly molecular dynamics (MD) comparisons, undertaken in explicit water at different temperatures, show that the designed sequences are favorably placed against the chosen benchmarks as THB proteins potentially thermostable in the native folds. Folding simulation experiments with MD establish that the designed sequences are rapid in the folding of individual helices, but not in the evolution of tertiary structure; energetic cum topological frustrations remain but could be the artifacts of the starting conformations that were chosen in the THBs in the folding simulations. Overall, a practical high-throughput approach for de novo protein design has been developed that may have fruitful application for any type of tertiary structure.
