Similar articles
20 similar articles found.
1.
Proteins are active, flexible machines that perform a range of different functions. Innovative experimental approaches may now provide limited partial information about conformational changes along motion pathways of proteins. There is therefore a need for computational approaches that can efficiently incorporate prior information into motion prediction schemes. In this paper, we present PathRover, a general setup designed for the integration of prior information into the motion planning algorithm of rapidly exploring random trees (RRT). Each suggested motion pathway comprises a sequence of low-energy clash-free conformations that satisfy an arbitrary number of prior information constraints. These constraints can be derived from experimental data or from expert intuition about the motion. The incorporation of prior information is very straightforward and significantly narrows down the vast search in the typically high-dimensional conformational space, leading to dramatic reduction in running time. To allow the use of state-of-the-art energy functions and conformational sampling, we have integrated this framework into Rosetta, an accurate protocol for diverse types of structural modeling. The suggested framework can serve as an effective complementary tool for molecular dynamics, Normal Mode Analysis, and other prevalent techniques for predicting motion in proteins. We applied our framework to three different model systems. We show that a limited set of experimentally motivated constraints may effectively bias the simulations toward diverse predicates in an outright fashion, from distance constraints to enforcement of loop closure. In particular, our analysis sheds light on mechanisms of protein domain swapping and on the role of different residues in the motion.
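
The idea of growing a rapidly exploring random tree while rejecting nodes that violate prior constraints can be illustrated with a minimal sketch. This is not PathRover or Rosetta code: the 2D "conformational space", the `in_corridor` predicate (standing in for clash checks, energy thresholds, and prior-information constraints), and all parameter values are assumptions made for illustration only.

```python
import math
import random

def in_corridor(p, half_width=2.0):
    """Hypothetical prior constraint: stay in the box and near the diagonal y = x."""
    x, y = p
    return 0.0 <= x <= 10.0 and 0.0 <= y <= 10.0 and abs(y - x) <= half_width

def rrt(start, goal, is_valid, step=0.5, goal_tol=0.5, max_iters=5000, seed=0):
    """Grow a tree from start; return a list of points reaching goal, or None."""
    rng = random.Random(seed)
    nodes = [start]
    parent = [None]  # parent[i] is the index of the node that node i was grown from
    for _ in range(max_iters):
        # Sample a random target, with a 10% bias toward the goal.
        if rng.random() < 0.1:
            target = goal
        else:
            target = (rng.uniform(0.0, 10.0), rng.uniform(0.0, 10.0))
        # Extend the nearest tree node by at most one step toward the target.
        i_near = min(range(len(nodes)), key=lambda i: math.dist(nodes[i], target))
        near = nodes[i_near]
        d = math.dist(near, target)
        if d == 0.0:
            continue
        t = min(step, d) / d
        new = (near[0] + t * (target[0] - near[0]),
               near[1] + t * (target[1] - near[1]))
        # Nodes that violate the constraints are simply never added to the tree.
        if not is_valid(new):
            continue
        nodes.append(new)
        parent.append(i_near)
        if math.dist(new, goal) <= goal_tol:
            # Walk the parent links back to the root to recover the pathway.
            path, i = [], len(nodes) - 1
            while i is not None:
                path.append(nodes[i])
                i = parent[i]
            return path[::-1]
    return None

path = rrt((1.0, 1.0), (9.0, 9.0), in_corridor)
```

Because rejected samples never enter the tree, every point on the returned path satisfies the constraint by construction, which is the sense in which constraints narrow the search.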

2.

Background

Despite computational challenges, elucidating conformations that a protein system assumes under physiologic conditions for the purpose of biological activity is a central problem in computational structural biology. While these conformations are associated with low energies in the energy surface that underlies the protein conformational space, few existing conformational search algorithms focus on explicitly sampling low-energy local minima in the protein energy surface.

Methods

This work proposes a novel probabilistic search framework, PLOW, that explicitly samples low-energy local minima in the protein energy surface. The framework combines algorithmic ingredients from evolutionary computation and computational structural biology to effectively explore the subspace of local minima. A greedy local search maps a conformation sampled in conformational space to a nearby local minimum. A perturbation move jumps out of a local minimum to obtain a new starting conformation for the greedy local search. The process repeats in an iterative fashion, resulting in a trajectory-based exploration of the subspace of local minima.

Results and conclusions

The analysis of PLOW's performance shows that, by navigating only the subspace of local minima, PLOW is able to sample conformations near a protein's native structure, either more effectively or as well as state-of-the-art methods that focus on reproducing the native structure for a protein system. Analysis of the actual subspace of local minima shows that PLOW samples this subspace more effectively than a naive sampling approach. Additional theoretical analysis reveals that the perturbation function employed by PLOW is key to its ability to sample a diverse set of low-energy conformations. This analysis also suggests directions for further research and novel applications for the proposed framework.
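
The loop described under Methods (a greedy local search that maps a point to a nearby minimum, followed by a perturbation that jumps out of it) can be sketched on a toy problem. The following is only an illustration of that general pattern, not the PLOW implementation: the 1D `energy` function, the step and jump sizes, and the accept-if-not-worse rule are all assumptions.

```python
import math
import random

def energy(x):
    # Toy rugged 1D "energy surface" (hypothetical, not a protein energy function).
    return 0.1 * (x - 2.0) ** 2 + math.sin(5.0 * x)

def greedy_local_search(x, step=0.01, max_steps=10000):
    """Walk downhill in fixed steps until neither neighbor has lower energy."""
    e = energy(x)
    for _ in range(max_steps):
        best_x, best_e = x, e
        for cand in (x - step, x + step):
            ce = energy(cand)
            if ce < best_e:
                best_x, best_e = cand, ce
        if best_x == x:
            break  # x is a local minimum on this step grid
        x, e = best_x, best_e
    return x, e

def iterated_local_search(x0, n_iters=100, jump=1.5, seed=0):
    """Return the trajectory of accepted local minima as (x, energy) pairs."""
    rng = random.Random(seed)
    x, e = greedy_local_search(x0)
    trajectory = [(x, e)]
    for _ in range(n_iters):
        # Perturbation move: jump away from the current minimum, then re-minimize.
        x_new, e_new = greedy_local_search(x + rng.uniform(-jump, jump))
        # Assumed acceptance rule: keep the new minimum only if it is not worse.
        if e_new <= e:
            x, e = x_new, e_new
            trajectory.append((x, e))
    return trajectory

trajectory = iterated_local_search(-3.0)
```

Every entry of `trajectory` is a local minimum, so the search moves only through the subspace of minima; a different acceptance rule (for example a Metropolis criterion) would trade some of this greediness for diversity.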

3.
4.
Multiscale modeling has a long history of use in structural biology, as computational biologists strive to overcome the time- and length-scale limits of atomistic molecular dynamics. Contemporary machine learning techniques, such as deep learning, have promoted advances in virtually every field of science and engineering and are revitalizing the traditional notions of multiscale modeling. Deep learning has found success in various approaches for distilling information from fine-scale models, such as building surrogate models and guiding the development of coarse-grained potentials. However, perhaps its most powerful use in multiscale modeling is in defining latent spaces that enable efficient exploration of conformational space. This confluence of machine learning and multiscale simulation with modern high-performance computing promises a new era of discovery and innovation in structural biology.

5.
We present a novel de novo method to generate protein models from sparse, discretized restraints on the conformation of the main chain and side chain atoms. We focus on Calpha-trace generation, the problem of constructing an accurate and complete model from approximate knowledge of the positions of the Calpha atoms and, in some cases, the side chain centroids. Spatial restraints on the Calpha atoms and side chain centroids are supplemented by constraints on main chain geometry, phi/psi angles, rotameric side chain conformations, and inter-atomic separations derived from analyses of known protein structures. A novel conformational search algorithm, combining features of tree-search and genetic algorithms, generates models consistent with these restraints by propensity-weighted dihedral angle sampling. Models with ideal geometry, good phi/psi angles, and no inter-atomic overlaps are produced with 0.8 Å main chain and, with side chain centroid restraints, 1.0 Å all-atom root-mean-square deviation (RMSD) from the crystal structure over a diverse set of target proteins. The mean model derived from 50 independently generated models is closer to the crystal structure than any individual model, with 0.5 Å main chain RMSD under only Calpha restraints and 0.7 Å all-atom RMSD under both Calpha and centroid restraints. The method is insensitive to randomly distributed errors of up to 4 Å in the Calpha restraints. The conformational search algorithm is efficient, with computational cost increasing linearly with protein size. Issues relating to decoy set generation, experimental structure determination, efficiency of conformational sampling, and homology modeling are discussed.
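
The observation that a mean model can lie closer to the reference than any individual model follows from averaging out independent errors, and a small numerical sketch makes this concrete. The code below is not the authors' method: it assumes the models are already superposed on a common frame, uses a made-up helix-like trace as the "crystal structure", and simulates 50 models by adding independent Gaussian noise.

```python
import math
import random

def rmsd(a, b):
    """RMSD between two equal-length lists of (x, y, z) points, without superposition."""
    assert len(a) == len(b)
    sq = sum((p - q) ** 2 for u, v in zip(a, b) for p, q in zip(u, v))
    return math.sqrt(sq / len(a))

def mean_model(models):
    """Average the coordinates atom by atom across all models."""
    n_models = len(models)
    n_atoms = len(models[0])
    return [tuple(sum(m[i][k] for m in models) / n_models for k in range(3))
            for i in range(n_atoms)]

# Hypothetical reference: 30 points on a helix-like curve (arbitrary units).
reference = [(2.3 * math.cos(1.745 * i), 2.3 * math.sin(1.745 * i), 1.5 * i)
             for i in range(30)]

# Simulated models: the reference plus independent Gaussian noise per coordinate.
rng = random.Random(0)
models = [[tuple(c + rng.gauss(0.0, 1.0) for c in atom) for atom in reference]
          for _ in range(50)]

individual = [rmsd(m, reference) for m in models]
averaged = rmsd(mean_model(models), reference)
```

With independent errors the averaged deviation shrinks roughly as one over the square root of the number of models, so `averaged` ends up well below `min(individual)`. Real models have correlated errors and must first be superposed, so the gain in practice is smaller than in this idealized case.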

6.
A new method for the analysis of NMR data in terms of the solution structure of proteins has been developed. The method consists of two steps: first a systematic search of the conformational space to define the region allowed by the initial set of experimental constraints, and second, the narrowing of this region by the introduction of additional constraints and optional refinement procedures. The search of the conformational space is guided by heuristics to make it computationally feasible. The method is therefore called the heuristic refinement method and is coded in an expert system called PROTEAN. The paper describes the validation of the first step of the method using an artificial NMR data set generated from the known crystal structure of sperm whale carbon monoxymyoglobin. It is shown that the initial search procedure yields a low-resolution structure of the myoglobin molecule, accurately reproducing its main topological features, and that the precision of the structure depends on the quality of the initial data set.

7.

Background

Elucidating the native structure of a protein molecule from its sequence of amino acids, a problem known as de novo structure prediction, is a long standing challenge in computational structural biology. Difficulties in silico arise due to the high dimensionality of the protein conformational space and the ruggedness of the associated energy surface. The issue of multiple minima is a particularly troublesome hallmark of energy surfaces probed with current energy functions. In contrast to the true energy surface, these surfaces are weakly-funneled and rich in comparably deep minima populated by non-native structures. For this reason, many algorithms seek to be inclusive and obtain a broad view of the low-energy regions through an ensemble of low-energy (decoy) conformations. Conformational diversity in this ensemble is key to increasing the likelihood that the native structure has been captured.

Methods

We propose an evolutionary search approach to address the multiple-minima problem in decoy sampling for de novo structure prediction. Two population-based evolutionary search algorithms are presented that follow the basic approach of treating conformations as individuals in an evolving population. Coarse graining and molecular fragment replacement are used to efficiently obtain protein-like child conformations from parents. Potential energy is used both to bias parent selection and determine which subset of parents and children will be retained in the evolving population. The effect on the decoy ensemble of sampling minima directly is measured by additionally mapping a conformation to its nearest local minimum before considering it for retainment. The resulting memetic algorithm thus evolves not just a population of conformations but a population of local minima.

Results and conclusions

Results show that both algorithms are effective in terms of sampling conformations in proximity of the known native structure. The additional minimization is shown to be key to enhancing sampling capability and obtaining a diverse ensemble of decoy conformations, circumventing premature convergence to sub-optimal regions in the conformational space, and approaching the native structure with proximity that is comparable to state-of-the-art decoy sampling methods. The results are shown to be robust and valid when using two representative state-of-the-art coarse-grained energy functions.

8.
For a variety of problems in structural biology, low-resolution maps generated by electron microscopy imaging are often interpreted with the help of various flexible-fitting computational algorithms. In this work, we systematically analyze the quality of final models of various proteins obtained via molecular dynamics flexible fitting (MDFF) by varying the map-resolution, strength of structural restraints, and the steering forces. We find that MDFF can be extended to understand conformational changes in lower-resolution maps if larger structural restraints and lower steering forces are used to prevent overfitting. We further show that the capabilities of MDFF can be extended by combining it with an enhanced conformational sampling method, temperature-accelerated molecular dynamics (TAMD). Specifically, either TAMD can be used to generate better starting configurations for MDFF fitting or TAMD-assisted MDFF (TAMDFF) can be performed to accelerate conformational search in atomistic simulations.

9.
MOTIVATION: Sampling the conformational space is a fundamental step for both ligand- and structure-based drug design. However, the rational organization of different molecular conformations still remains a challenge. In fact, for drug design applications, the sampling process provides a redundant conformation set whose thorough analysis can be intensive, or even prohibitive. We propose a statistical approach based on cluster analysis aimed at rationalizing the output of methods such as Monte Carlo, genetic, and reconstruction algorithms. Although some software already implements clustering procedures, at present, a universally accepted protocol is still missing. RESULTS: We integrated hierarchical agglomerative cluster analysis with a clusterability assessment method and a user independent cutting rule, to form a global protocol that we implemented in a MATLAB metalanguage program (AClAP). We tested it on the conformational space of a quite diverse set of drugs generated via Metropolis Monte Carlo simulation, and on the poses we obtained by reiterated docking runs performed by four widespread programs. In our tests, AClAP proved to remarkably reduce the dimensionality of the original datasets at a negligible computational cost. Moreover, when applied to the outcomes of many docking programs together, it was able to point to the crystallographic pose. AVAILABILITY: AClAP is available at the "AClAP" section of the website http://www.scfarm.unibo.it.
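
Hierarchical agglomerative clustering, the core step named in this abstract, can be sketched in a few lines. This is not AClAP (which is a MATLAB program with its own clusterability test and cutting rule): the sketch below assumes average linkage, replaces the user-independent cutting rule with a fixed hypothetical distance `cutoff`, and uses made-up 2D points in place of conformations or docking poses.

```python
import math

def average_linkage(points, cutoff):
    """Merge the two closest clusters repeatedly until the closest pair exceeds cutoff.

    Returns a list of clusters, each a list of indices into points.
    """
    clusters = [[i] for i in range(len(points))]

    def linkage(c1, c2):
        # Average pairwise distance between the members of two clusters.
        total = sum(math.dist(points[i], points[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = linkage(clusters[a], clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        if best[0] > cutoff:
            break  # assumed stopping rule: a fixed distance threshold
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters

# Hypothetical "poses": three well-separated groups of 2D points.
poses = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
         (5.0, 5.0), (5.1, 4.9),
         (9.0, 0.0), (9.2, 0.1), (8.9, -0.1)]
clusters = average_linkage(poses, cutoff=1.0)
```

Here the eight points collapse into three clusters, which is the kind of reduction of a redundant conformation set that the abstract describes. The hard part in practice is choosing where to cut without user input, which this sketch deliberately sidesteps.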

10.
The rapidly increasing wealth of structural information on RNA and knowledge of its varying roles in biology have facilitated the study of RNA structure using computational methods. Here, we present a new method to describe RNA structure based on nucleotide doublets, where a doublet is any two nucleotides in a structure. We restrict our search to doublets that are close together in space, but not necessarily in sequence, and obtain doublet libraries of various sizes by clustering a large set of doublets taken from a data set of high-resolution RNA structures. We demonstrate that these libraries are able to both capture structural features present in RNA and fit local RNA structure with a high level of accuracy. Libraries ranging in size from ten to 100 doublets are examined, and a detailed analysis shows that a library with as few as 30 doublets is sufficient to capture the most common structural features, while larger libraries would be more appropriate for accurate modeling. We anticipate many uses for these libraries, from annotation to structure refinement and prediction.

11.
The three-dimensional structure of a protein is a key determinant of its biological function. Given the cost and time required to acquire this structure through experimental means, computational models are necessary to complement wet-lab efforts. Many computational techniques exist for navigating the high-dimensional protein conformational search space, which is explored for low-energy conformations that comprise a protein's native states. This work proposes two strategies to enhance the sampling of conformations near the native state. An enhanced fragment library with greater structural diversity is used to expand the search space in the context of fragment-based assembly. To manage the increased complexity of the search space, only a representative subset of the sampled conformations is retained to further guide the search towards the native state. Our results make the case that these two strategies greatly enhance the sampling of the conformational space near the native state. A detailed comparative analysis shows that our approach performs as well as state-of-the-art ab initio structure prediction protocols.

12.
13.
The rapidly expanding body of available genomic and protein structural data provides a rich resource for understanding protein dynamics with biomolecular simulation. While computational infrastructure has grown rapidly, simulations on an omics scale are not yet widespread, primarily because software infrastructure to enable simulations at this scale has not kept pace. It should now be possible to study protein dynamics across entire (super)families, exploiting both available structural biology data and conformational similarities across homologous proteins. Here, we present a new tool for enabling high-throughput simulation in the genomics era. Ensembler takes any set of sequences—from a single sequence to an entire superfamily—and shepherds them through various stages of modeling and refinement to produce simulation-ready structures. This includes comparative modeling to all relevant PDB structures (which may span multiple conformational states of interest), reconstruction of missing loops, addition of missing atoms, culling of nearly identical structures, assignment of appropriate protonation states, solvation in explicit solvent, and refinement and filtering with molecular simulation to ensure stable simulation. The output of this pipeline is an ensemble of structures ready for subsequent molecular simulations using computer clusters, supercomputers, or distributed computing projects like Folding@home. Ensembler thus automates much of the time-consuming process of preparing protein models suitable for simulation, while allowing scalability up to entire superfamilies. A particular advantage of this approach can be found in the construction of kinetic models of conformational dynamics—such as Markov state models (MSMs)—which benefit from a diverse array of initial configurations that span the accessible conformational states to aid sampling. 
We demonstrate the power of this approach by constructing models for all catalytic domains in the human tyrosine kinase family, using all available kinase catalytic domain structures from any organism as structural templates. Ensembler is free and open source software licensed under the GNU General Public License (GPL) v2. It is compatible with Linux and OS X. The latest release can be installed via the conda package manager, and the latest source can be downloaded from https://github.com/choderalab/ensembler.

14.
Zheng W, Brooks BR. Biophysical Journal, 2006, 90(12): 4327-4336
Recently we have developed a normal-modes-based algorithm that predicts the direction of protein conformational changes given the initial state crystal structure together with a small number of pairwise distance constraints for the end state. Here we significantly extend this method to accurately model both the direction and amplitude of protein conformational changes. The new protocol implements a multistep search in the conformational space that is driven by iteratively minimizing the error of fitting the given distance constraints and simultaneously enforcing the restraint of low elastic energy. At each step, an incremental structural displacement is computed as a linear combination of the lowest 10 normal modes derived from an elastic network model, whose eigenvectors are reoriented to correct for the distortions caused by the structural displacements in the previous steps. We test this method on a list of 16 pairs of protein structures for which relatively large conformational changes are observed (root mean square deviation >3 angstroms), using up to 10 pairwise distance constraints selected by a fluctuation analysis of the initial state structures. This method has achieved a near-optimal performance in almost all cases, and in many cases the final structural models lie within a root mean square deviation of 1 to 2 angstroms from the native end state structures.

15.
Thompson J, Baker D. Proteins, 2011, 79(8): 2380-2388
Prediction of protein structures from sequences is a fundamental problem in computational biology. Algorithms that attempt to predict a structure from sequence primarily use two sources of information. The first source is physical in nature: proteins fold into their lowest energy state. Given an energy function that describes the interactions governing folding, a method for constructing models of protein structures, and the amino acid sequence of a protein of interest, the structure prediction problem becomes a search for the lowest energy structure. Evolution provides an orthogonal source of information: proteins of similar sequences have similar structure, and therefore proteins of known structure can guide modeling. The relatively successful Rosetta approach takes advantage of the first, but not the second source of information during model optimization. Following the classic work by Andrej Sali and colleagues, we develop a probabilistic approach to derive spatial restraints from proteins of known structure using advances in alignment technology and the growth in the number of structures in the Protein Data Bank. These restraints define a region of conformational space that is high-probability, given the template information, and we incorporate them into Rosetta's comparative modeling protocol. The combined approach performs considerably better on a benchmark based on previous CASP experiments. Incorporating evolutionary information into Rosetta is analogous to incorporating sparse experimental data: in both cases, the additional information eliminates large regions of conformational space and increases the probability that energy-based refinement will home in on the deep energy minimum at the native state.

16.
Abstract

In order to investigate the relationship between the bioactive conformation of a peptide and its set of thermodynamically accessible structures in solution, the conformational profile of the tetrapeptide Ac-Pro-Ala-Pro-Tyr-OH was characterized by computational methods. Search of the conformational space was performed within the molecular mechanics framework using the AMBER4.0 force field with an effective dielectric constant of 80. Unique structures of the peptide were compared with its bioactive conformation for the protein Streptomyces griseus Protease A, as taken from the crystal structure of the enzyme-peptide complex. The results show that the bound conformation is close to one of the unique conformations characterized in the conformational search of the isolated peptide. Moreover, the lowest energy minimum characterized in the conformational search exhibits large deviations when compared to the bound conformation of the crystal structure.

17.
18.
19.
Side-chain flexibility of ligand-binding sites needs to be considered in the rational design of novel inhibitors. We have developed a method to generate conformational ensembles that efficiently sample local side-chain flexibility from a single crystal structure. The rotamer-based approach is tested here for the S1' pocket of human collagenase-1 (MMP-1), which is known to undergo conformational changes in multiple side-chains upon binding of certain inhibitors. First, a raw ensemble consisting of a large number of conformers of the S1' pocket was generated using an exhaustive search of rotamer combinations on a template crystal structure. A combination of principal component analysis and fuzzy clustering was then employed to successfully identify a core ensemble consisting of a low number of representatives from the raw ensemble. The core ensemble contained geometrically diverse conformers of stable nature, as indicated in several cases by a relative energy lower than that of the minimised template crystal structure. Through comparisons with X-ray crystallography and NMR structural data we show that the core ensemble occupied a conformational space similar to that observed under experimental conditions. The synthetic inhibitor RS-104966 is known to induce a conformational change in the side-chains of the S1' pocket of MMP-1 and could not be docked in the template crystal structure. However, the experimental binding mode was reproduced successfully using members of the core ensemble as the docking target, establishing the usefulness of the method in drug design.
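
The first stage described here, an exhaustive search over rotamer combinations that yields a raw ensemble, amounts to enumerating a Cartesian product and discarding unacceptable combinations. The sketch below shows only that enumeration under stated assumptions: the residue labels, the three chi1 angles per residue, and the `clashes` rule are hypothetical, and the later principal component analysis and fuzzy clustering steps are not shown.

```python
from itertools import product

# Hypothetical rotamer library: three chi1 angles (degrees) per pocket residue.
ROTAMERS = {
    "res_A": (-60, 60, 180),
    "res_B": (-60, 60, 180),
    "res_C": (-60, 60, 180),
}

def clashes(combo):
    """Assumed toy steric rule: two neighbouring residues both at chi1 = 60 clash."""
    return any(a == 60 and b == 60 for a, b in zip(combo, combo[1:]))

def raw_ensemble(rotamers):
    """Enumerate every rotamer combination and keep the clash-free ones."""
    names = list(rotamers)
    choices = (rotamers[name] for name in names)
    return [dict(zip(names, combo))
            for combo in product(*choices)
            if not clashes(combo)]

ensemble = raw_ensemble(ROTAMERS)
```

Of the 27 possible combinations, 22 survive the toy clash rule. The number of combinations grows exponentially with the number of flexible residues, which is why a reduction step to a small core ensemble is needed afterwards.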

20.
An algorithm for locating the region in conformational space containing the global energy minimum of a polypeptide is described. Distances are used as the primary variables in the minimization of an objective function that incorporates both energetic and distance-geometric terms. The latter are obtained from geometry and energy functions, rather than nuclear magnetic resonance experiments, although the algorithm can incorporate distances from nuclear magnetic resonance data if desired. The polypeptide is generated originally in a space of high dimensionality. This has two important consequences. First, all interatomic distances are initially at their energetically most favorable values; i.e. the polypeptide is initially at a global minimum-energy conformation, albeit a high-dimensional one. Second, the relaxation of dimensionality constraints in the early stages of the minimization removes many potential energy barriers that exist in three dimensions, thereby allowing a means of escaping from three-dimensional local minima. These features are used in an algorithm that produces short trajectories of three-dimensional minimum-energy conformations. A conformation in the trajectory is generated by allowing the previous conformation in the trajectory to evolve in a high-dimensional space before returning to three dimensions. The resulting three-dimensional structure is taken to be the next conformation in the trajectory, and the process is iterated. This sequence of conformations results in a limited but efficient sampling of conformational space. Results for test calculations on Met-enkephalin, a pentapeptide with the amino acid sequence H-Tyr-Gly-Gly-Phe-Met-OH, are presented. A tight cluster of conformations (in three-dimensional space) is found with ECEPP energies (Empirical Conformational Energy Program for Peptides) lower than any previously reported. 
This cluster of conformations defines a region in conformational space in which the global-minimum-energy conformation of enkephalin appears to lie.
