Similar literature
20 similar articles found.
1.
《Biophysical journal》2021,120(22):5124-5135
Intrinsically disordered proteins and flexible regions in multidomain proteins display substantial conformational heterogeneity. Characterizing the conformational ensembles of these proteins in solution typically requires combining one or more biophysical techniques with computational modeling or simulations. Experimental data can be used either to assess the accuracy of a computational model or to refine the model so that it agrees better with the experimental data. In both cases, one generally needs a so-called forward model (i.e., an algorithm to calculate experimental observables from individual conformations or ensembles). In many cases, this involves one or more parameters that need to be set, and it is not always trivial to determine the optimal values or to understand the impact of the choice of parameters. For example, in the case of small-angle x-ray scattering (SAXS) experiments, many forward models include parameters that describe the contribution of the hydration layer and displaced solvent to the background-subtracted experimental data. Often, one also needs to fit a scale factor and a constant background to the SAXS data, but for the ensemble as a whole rather than for each conformation. Here, we present a protocol to dissect the effect of the free parameters on the calculated SAXS intensities and to identify a reliable set of values. We have implemented this procedure in our Bayesian/maximum entropy framework for ensemble refinement and demonstrate the results on four intrinsically disordered proteins and a protein with three domains connected by flexible linkers. Our results show that the resulting ensembles can depend on the parameters used for solvent effects and suggest that these should be chosen carefully. We also find a set of parameters that works robustly across all proteins.
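The ensemble-level fit of a scale factor and a constant background mentioned in this abstract can be illustrated with a small weighted least-squares sketch. This is not the authors' implementation: the function name `fit_scale_and_background`, the array layout, and the choice of ordinary weighted least squares are all assumptions made for illustration. It assumes per-conformer intensities have already been calculated by some forward model.

```python
import numpy as np


def fit_scale_and_background(i_calc, i_exp, sigma, weights=None):
    """Fit I_exp(q) ~ scale * <I_calc(q)> + background by weighted least squares.

    i_calc  : (n_conformers, n_q) intensities from a forward model (assumed given)
    i_exp   : (n_q,) experimental intensities
    sigma   : (n_q,) experimental errors
    weights : optional (n_conformers,) ensemble weights; uniform if omitted
    """
    i_calc = np.atleast_2d(np.asarray(i_calc, dtype=float))
    i_exp = np.asarray(i_exp, dtype=float)
    sigma = np.asarray(sigma, dtype=float)

    n_conf = i_calc.shape[0]
    if weights is None:
        weights = np.full(n_conf, 1.0 / n_conf)
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()

    # Average over the ensemble first: the fit is done once for the whole
    # ensemble, not separately for each conformation.
    i_avg = weights @ i_calc

    # Columns of the design matrix: averaged intensity and a constant term,
    # both divided by sigma so that lstsq minimizes chi-squared.
    design = np.column_stack([i_avg, np.ones_like(i_avg)]) / sigma[:, None]
    target = i_exp / sigma
    (scale, background), *_ = np.linalg.lstsq(design, target, rcond=None)

    residuals = (scale * i_avg + background - i_exp) / sigma
    chi2_red = float(np.sum(residuals**2) / (i_exp.size - 2))
    return float(scale), float(background), chi2_red
```

Passing non-uniform `weights` (for example, weights obtained from a reweighting step) re-averages the intensities before the fit, which is one plausible way to keep the two fitted parameters consistent with the current ensemble.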

2.
Disordered states of proteins include the biologically functional intrinsically disordered proteins and the unfolded states of normally folded proteins. In recent years, ensemble‐modeling strategies using various experimental measurements as restraints have emerged as powerful means for structurally characterizing disordered states. However, these methods are still in their infancy compared with the structural determination of folded proteins. Here, we have addressed several issues important to ensemble modeling using our ENSEMBLE methodology. First, we assessed how calculating ensembles containing different numbers of conformers affects their structural properties. We find that larger ensembles have very similar properties to smaller ensembles fit to the same experimental restraints, thus allowing a considerable speed improvement in our calculations. In addition, we analyzed the contributions of different experimental restraints to the structural properties of calculated ensembles, enabling us to make recommendations about the experimental measurements that should be made for optimal ensemble modeling. The effects of different restraints, most significantly from chemical shifts, paramagnetic relaxation enhancements and small‐angle X‐ray scattering, but also from other data, underscore the importance of utilizing multiple sources of experimental data. Finally, we validate our ENSEMBLE methodology using both cross‐validation and synthetic experimental restraints calculated from simulated ensembles. Our results suggest that secondary structure and molecular size distribution can generally be modeled very accurately, whereas the accuracy of calculated tertiary structure is dependent on the number of distance restraints used. Proteins 2012. © 2011 Wiley Periodicals, Inc.

3.
The roles of unfolded states of proteins in normal folding and in diseases involving aggregation, as well as the prevalence and regulatory functions of intrinsically disordered proteins, have become increasingly recognized. The structural representation of these disordered states as ensembles of interconverting conformers can therefore provide critical insights. Experimental methods can be used to probe ensemble-averaged structural properties of disordered states and computational approaches generate representative ensembles of conformers using experimental restraints. In particular, NMR and small-angle X-ray scattering provide quantitative data that can readily be incorporated into calculations. These techniques have gleaned structural information about denatured, unfolded and intrinsically disordered proteins. The use of experimental data in different computational approaches, including ensemble molecular dynamics simulations and algorithms that assign populations to pregenerated conformers, has highlighted the presence of both local and long-range structure, and the occurrence of native-like and non-native interactions in unfolded and denatured states. Analysis of the resulting ensembles has suggested important implications of this fluctuating structure for folding, aggregation and binding.

4.
Computational protein and drug design generally require accurate modeling of protein conformations. This modeling typically starts with an experimentally determined protein structure and considers possible conformational changes due to mutations or new ligands. The DEE/A* algorithm provably finds the global minimum‐energy conformation (GMEC) of a protein assuming that the backbone does not move and the sidechains take on conformations from a set of discrete, experimentally observed conformations called rotamers. DEE/A* can efficiently find the overall GMEC for exponentially many mutant sequences. Previous improvements to DEE/A* include modeling ensembles of sidechain conformations and either continuous sidechain or backbone flexibility. We present a new algorithm, DEEPer (Dead‐End Elimination with Perturbations), that combines these advantages and can also handle much more extensive backbone flexibility and backbone ensembles. DEEPer provably finds the GMEC or, if desired by the user, all conformations and sequences within a specified energy window of the GMEC. It includes the new abilities to handle arbitrarily large backbone perturbations and to generate ensembles of backbone conformations. It also incorporates the shear, an experimentally observed local backbone motion never before used in design. Additionally, we derive a new method to accelerate DEE/A*‐based calculations, indirect pruning, that is particularly useful for DEEPer. In 67 benchmark tests on 64 proteins, DEEPer consistently identified lower‐energy conformations than previous methods did, indicating more accurate modeling. Additional tests demonstrated its ability to incorporate larger, experimentally observed backbone conformational changes and to model realistic conformational ensembles. These capabilities provide significant advantages for modeling protein mutations and protein–ligand interactions. Proteins 2013. © 2012 Wiley Periodicals, Inc.

5.
Ensembles are a well-established machine learning paradigm, leading to accurate and robust models, predominantly applied to predictive modeling tasks. Ensemble models comprise a finite set of diverse predictive models whose combined output is expected to yield an improved predictive performance as compared to an individual model. In this paper, we propose a new method for learning ensembles of process-based models of dynamic systems. The process-based modeling paradigm employs domain-specific knowledge to automatically learn models of dynamic systems from time-series observational data. Previous work has shown that ensembles based on sampling observational data (i.e., bagging and boosting) significantly improve the predictive performance of process-based models. However, this improvement comes at the cost of a substantial increase in the computational time needed for learning. To address this problem, the paper proposes a method that aims at efficiently learning ensembles of process-based models while maintaining their accurate long-term predictive performance. This is achieved by constructing ensembles by sampling domain-specific knowledge instead of sampling data. We apply the proposed method to, and evaluate its performance on, a set of problems of automated predictive modeling in three lake ecosystems using a library of process-based knowledge for modeling population dynamics. The experimental results identify the optimal design decisions regarding the learning algorithm. The results also show that the proposed ensembles yield significantly more accurate predictions of population dynamics as compared to individual process-based models. Finally, while their predictive performance is comparable to that of ensembles obtained with the state-of-the-art methods of bagging and boosting, they are substantially more efficient.

6.
Mathematical models are powerful tools for epidemiology and can be used to compare control actions. However, different models and model parameterizations may provide different predictions of outcomes. In other fields of research, ensemble modeling has been used to combine multiple projections. We explore the possibility of applying such methods to epidemiology by adapting Bayesian techniques developed for climate forecasting. We exemplify the implementation with single model ensembles based on different parameterizations of the Warwick model run for the 2001 United Kingdom foot and mouth disease outbreak and compare the efficacy of different control actions. This allows us to investigate the effect that discrepancy among projections based on different modeling assumptions has on the ensemble prediction. A sensitivity analysis showed that the choice of prior can have a pronounced effect on the posterior estimates of quantities of interest, in particular for ensembles with large discrepancy among projections. However, by using a hierarchical extension of the method we show that prior sensitivity can be circumvented. We further extend the method to include a priori beliefs about different modeling assumptions and demonstrate that the effect of this can have different consequences depending on the discrepancy among projections. We propose that the method is a promising analytical tool for ensemble modeling of disease outbreaks.

7.
In nature, proteins partake in numerous protein–protein interactions that mediate their functions. Moreover, proteins have been shown to be physically stable in multiple structures, induced by cellular conditions, small ligands, or covalent modifications. Understanding how protein sequences achieve this structural promiscuity at the atomic level is a fundamental step in the drug design pipeline and a critical question in protein physics. One way to investigate this subject is to computationally predict protein sequences that are compatible with multiple states, i.e., multiple target structures or binding to distinct partners. The goal of engineering such proteins has been termed multispecific protein design. We develop a novel computational framework to efficiently and accurately perform multispecific protein design. This framework utilizes recent advances in probabilistic graphical modeling to predict sequences with low energies in multiple target states. Furthermore, it is also geared to specifically yield positional amino acid probability profiles compatible with these target states. Such profiles can be used as input to randomly bias high‐throughput experimental sequence screening techniques, such as phage display, thus providing an alternative avenue for elucidating the multispecificity of natural proteins and the synthesis of novel proteins with specific functionalities. We prove the utility of such multispecific design techniques in better recovering amino acid sequence diversities similar to those resulting from millions of years of evolution. We then compare the approaches of prediction of low energy ensembles and of amino acid profiles and demonstrate their complementarity in providing more robust predictions for protein design. Proteins 2010. © 2009 Wiley‐Liss, Inc.

8.
9.
Nanni L  Lumini A 《Amino acids》2009,36(3):409-416
The focus of this work is the use of ensembles of classifiers for predicting HIV protease cleavage sites in proteins. Due to the complex relationships in the biological data, several recent works show that often ensembles of learning algorithms outperform stand-alone methods. We show that the fusion of approaches based on different encoding models can be useful for improving the performance of this classification problem. In particular, in this work four different feature encodings for peptides are described and tested. An extensive evaluation on a large dataset according to a blind testing protocol is reported which demonstrates how different feature extraction methods and classifiers can be combined for obtaining a robust and reliable system. The comparison with other stand-alone approaches allows quantifying the performance improvement obtained by the ensembles proposed in this work.

10.
Shehu A  Clementi C  Kavraki LE 《Proteins》2006,65(1):164-179
Characterizing protein flexibility is an important goal for understanding the physical-chemical principles governing biological function. This paper presents a Fragment Ensemble Method to capture the mobility of a protein fragment such as a missing loop and its extension into a Protein Ensemble Method to characterize the mobility of an entire protein at equilibrium. The underlying approach in both methods is to combine a geometric exploration of conformational space with a statistical mechanics formulation to generate an ensemble of physical conformations on which thermodynamic quantities can be measured as ensemble averages. The Fragment Ensemble Method is validated by applying it to characterize loop mobility in both instances of strongly stable and disordered loop fragments. In each instance, fluctuations measured over generated ensembles are consistent with data from experiment and simulation. The Protein Ensemble Method captures the mobility of an entire protein by generating and combining ensembles of conformations for consecutive overlapping fragments defined over the protein sequence. This method is validated by applying it to characterize flexibility in ubiquitin and protein G. Thermodynamic quantities measured over the ensembles generated for both proteins are fully consistent with available experimental data. On these proteins, the method recovers nontrivial data such as order parameters, residual dipolar couplings, and scalar couplings. Results presented in this work suggest that the proposed methods can provide insight into the interplay between protein flexibility and function.

11.
In recent years, significant effort has been devoted to predicting protein functions from protein interaction data generated by high-throughput techniques. However, predicting protein functions correctly and reliably still remains a challenge. Recently, many computational methods have been proposed for predicting protein functions. Among these methods, clustering-based methods are the most promising. The existing methods, however, mainly focus on protein relationship modeling and on prediction algorithms that statically predict functions from the clusters that are related to the unannotated proteins. In fact, the clustering itself is a dynamic process and the function prediction should take this dynamic feature of clustering into consideration. Unfortunately, this dynamic feature of clustering is ignored in the existing prediction methods. In this paper, we propose an innovative progressive clustering-based prediction method to trace the functions of relevant annotated proteins across all clusters that are generated through the progressive clustering of proteins. A set of prediction criteria is proposed to predict functions of unannotated proteins from all relevant clusters and traced functions. The method was evaluated on real protein interaction datasets and the results demonstrated the effectiveness of the proposed method compared with representative existing methods.

12.
When accounting for structural fluctuations or measurement errors, a single rigid structure may not be sufficient to represent a protein. One approach to solve this problem is to represent the possible conformations as a discrete set of observed conformations, an ensemble. In this work, we follow a different, richer approach, and introduce a framework for estimating probability density functions in very high dimensions, and then apply it to represent ensembles of folded proteins. This proposed approach combines techniques such as kernel density estimation, maximum likelihood, cross-validation, and bootstrapping. We present the underlying theoretical and computational framework and apply it to artificial data and protein ensembles obtained from molecular dynamics simulations. We compare the results with those obtained experimentally, illustrating the potential and advantages of this representation.
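How kernel density estimation, maximum likelihood and cross-validation can fit together is sketched below in a deliberately simple form. It is an illustration only and not the framework described in the abstract: it assumes an isotropic Gaussian kernel with a single bandwidth, and the function names `loo_log_likelihood` and `select_bandwidth` are hypothetical. The bandwidth is chosen by maximizing the leave-one-out log-likelihood of the samples.

```python
import numpy as np


def loo_log_likelihood(samples, bandwidth):
    """Leave-one-out log-likelihood of a Gaussian KDE with one shared bandwidth."""
    x = np.asarray(samples, dtype=float)
    if x.ndim == 1:
        x = x[:, None]
    n, d = x.shape

    # Pairwise squared distances between all samples.
    sq_dist = np.sum((x[:, None, :] - x[None, :, :]) ** 2, axis=-1)

    # Log of the isotropic Gaussian kernel for every pair of samples.
    log_kernel = (-0.5 * sq_dist / bandwidth**2
                  - 0.5 * d * np.log(2.0 * np.pi * bandwidth**2))

    # Leave-one-out: a sample must not contribute to its own density estimate.
    np.fill_diagonal(log_kernel, -np.inf)

    # Numerically stable log-sum-exp over the remaining n - 1 samples.
    row_max = log_kernel.max(axis=1, keepdims=True)
    log_density = (row_max[:, 0]
                   + np.log(np.exp(log_kernel - row_max).sum(axis=1))
                   - np.log(n - 1))
    return float(log_density.sum())


def select_bandwidth(samples, candidates):
    """Return the candidate bandwidth with the highest leave-one-out log-likelihood."""
    scores = [loo_log_likelihood(samples, h) for h in candidates]
    return candidates[int(np.argmax(scores))], scores
```

A bandwidth that is too small gives each held-out sample almost no density, while one that is too large flattens the estimate, so the leave-one-out score penalizes both extremes. A bootstrapping step, which the abstract also mentions, is not shown here.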

13.
Partially folded and denatured proteins can give important insights into protein folding, misfolding, and aggregation. Such non-native states of proteins are, however, very difficult to characterise in detail as they are dynamic, heterogeneous systems comprising ensembles of interconverting conformers. This article describes methods that produce models for non-native proteins in atomic detail. A variety of molecular dynamics based protocols are discussed together with some recent procedures that include restraints from experimental data. These models provide an important framework for interpreting experimental data from studies of non-native states using nuclear magnetic resonance spectroscopy, fluorescence, circular dichroism, and small angle scattering techniques.

14.
Mechanosensitive channels allow bacteria to respond to osmotic stress by opening a nanometer-sized pore in the cellular membrane. Although the underlying mechanism has been thoroughly studied on the basis of individual channels, the behavior of channel ensembles has yet to be elucidated. This work reveals that mechanosensitive channels of large conductance (MscL) exhibit a tendency to spatially cluster, and demonstrates the functional relevance of clustering. We evaluated the spatial distribution of channels in a lipid bilayer using patch-clamp electrophysiology, fluorescence and atomic force microscopy, and neutron scattering and reflection techniques, coupled with mathematical modeling of the mechanics of a membrane crowded with proteins. The results indicate that MscL forms clusters under a wide range of conditions. MscL is closely packed within each cluster but is still active and mechanosensitive. However, the channel activity is modulated by the presence of neighboring proteins, indicating membrane-mediated protein-protein interactions. Collectively, these results suggest that MscL self-assembly into channel clusters plays an osmoregulatory functional role in the membrane.

15.
Despite the importance of intracellular signaling networks, there is currently no consensus regarding the fundamental nature of the protein complexes such networks employ. One prominent view involves stable signaling machines with well-defined quaternary structures. The combinatorial complexity of signaling networks has led to an opposing perspective, namely that signaling proceeds via heterogeneous pleiomorphic ensembles of transient complexes. Since many hypotheses regarding network function rely on how we conceptualize signaling complexes, resolving this issue is a central problem in systems biology. Unfortunately, direct experimental characterization of these complexes has proven technologically difficult, while combinatorial complexity has prevented traditional modeling methods from approaching this question. Here we employ rule-based modeling, a technique that overcomes these limitations, to construct a model of the yeast pheromone signaling network. We found that this model exhibits significant ensemble character while generating reliable responses that match experimental observations. To contrast the ensemble behavior, we constructed a model that employs hierarchical assembly pathways to produce scaffold-based signaling machines. We found that this machine model could not replicate the experimentally observed combinatorial inhibition that arises when the scaffold is overexpressed. This finding provides evidence against the hierarchical assembly of machines in the pheromone signaling network and suggests that machines and ensembles may serve distinct purposes in vivo. In some cases, e.g. core enzymatic activities like protein synthesis and degradation, machines assembled via hierarchical energy landscapes may provide functional stability for the cell. In other cases, such as signaling, ensembles may represent a form of weak linkage, facilitating variation and plasticity in network evolution. 
The capacity of ensembles to signal effectively will ultimately shape how we conceptualize the function, evolution and engineering of signaling networks.

16.
17.
In recent years, much effort has been devoted to understanding the three-dimensional (3D) organization of the genome and how genomic structure mediates nuclear function. The development of experimental techniques that combine DNA proximity ligation with high-throughput sequencing, such as Hi-C, has substantially improved our knowledge about chromatin organization. Numerous experimental advancements, not only utilizing DNA proximity ligation but also high-resolution genome imaging (DNA tracing), have required theoretical modeling to determine the structural ensembles consistent with such data. These 3D polymer models of the genome provide an understanding of the physical mechanisms governing genome architecture. Here, we present an overview of the recent advances in modeling the ensemble of 3D chromosomal structures by employing the maximum entropy approach combined with polymer physics. Particularly, we discuss the minimal chromatin model (MiChroM) along with the “maximum entropy genomic annotations from biomarkers associated with structural ensembles” (MEGABASE) model, which have been remarkably successful in the accurate modeling of chromosomes consistent with both Hi-C and DNA-tracing data.

18.
19.
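With the construction and development of synchrotron radiation facilities and the emergence and refinement of various modeling methods, small-angle X-ray scattering (SAXS) has gradually become an important tool in structural biology. SAXS can be used to study the structures and conformational changes of biological macromolecules in solution, as well as dynamic processes such as protein assembly and folding. This article reviews the basic principles of SAXS, the commonly used research techniques and modeling methods, and their applications.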

20.