Similar Articles
Found 20 similar articles (search time: 968 ms)
1.

Background

Mathematical modeling is a powerful tool for analyzing, and ultimately designing, biochemical networks. However, estimating the parameters that appear in biochemical models is a significant challenge: parameter estimation typically involves expensive function evaluations and noisy data, making it difficult to obtain optimal solutions quickly. Moreover, biochemical models often have many local extrema, which further complicates parameter estimation. To address these challenges, we developed Dynamic Optimization with Particle Swarms (DOPS), a novel hybrid meta-heuristic that combines multi-swarm particle swarm optimization with dynamically dimensioned search (DDS). DOPS uses multi-swarm particle swarm optimization to generate candidate solution vectors, the best of which is then greedily updated using dynamically dimensioned search.

Results

We tested DOPS using classic optimization test functions, biochemical benchmark problems and real-world biochemical models. We performed T = 25 trials with N = 4000 function evaluations per trial, and compared the performance of DOPS with other commonly used meta-heuristics such as differential evolution (DE), simulated annealing (SA) and dynamically dimensioned search (DDS). On average, DOPS outperformed the other common meta-heuristics on the optimization test functions, benchmark problems and a real-world model of the human coagulation cascade.

Conclusions

DOPS is a promising meta-heuristic for estimating biochemical model parameters in relatively few function evaluations. DOPS source code is available for download under an MIT license at http://www.varnerlab.org.
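The two-stage search this abstract describes, a particle swarm proposing candidates and dynamically dimensioned search greedily refining the best one, can be sketched as follows. This is a minimal single-swarm illustration, not the published DOPS: the test function, hyper-parameters, switching schedule, and evaluation bookkeeping are all illustrative assumptions.

```python
import math
import random

def sphere(x):
    # classic optimization test function: global minimum 0 at the origin
    return sum(xi * xi for xi in x)

def dds_step(best, best_f, f, bounds, frac, sigma=0.2):
    # dynamically dimensioned search: perturb a randomly chosen subset of
    # dimensions (the subset shrinks as the search progresses) and accept
    # the candidate greedily
    cand = list(best)
    dims = [i for i in range(len(best)) if random.random() < frac]
    if not dims:
        dims = [random.randrange(len(best))]
    for i in dims:
        lo, hi = bounds[i]
        cand[i] = min(hi, max(lo, cand[i] + sigma * (hi - lo) * random.gauss(0, 1)))
    cand_f = f(cand)
    return (cand, cand_f) if cand_f < best_f else (best, best_f)

def dops_sketch(f, bounds, n_particles=10, n_evals=4000, w=0.7, c1=1.5, c2=1.5):
    dim = len(bounds)
    pos = [[random.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [list(p) for p in pos]
    pbest_f = [f(p) for p in pos]
    g = min(range(n_particles), key=lambda k: pbest_f[k])
    gbest, gbest_f = list(pbest[g]), pbest_f[g]
    evals = n_particles
    while evals < n_evals:
        for k in range(n_particles):
            for i in range(dim):
                lo, hi = bounds[i]
                vel[k][i] = (w * vel[k][i]
                             + c1 * random.random() * (pbest[k][i] - pos[k][i])
                             + c2 * random.random() * (gbest[i] - pos[k][i]))
                pos[k][i] = min(hi, max(lo, pos[k][i] + vel[k][i]))
            fk = f(pos[k])
            evals += 1
            if fk < pbest_f[k]:
                pbest[k], pbest_f[k] = list(pos[k]), fk
                if fk < gbest_f:
                    gbest, gbest_f = list(pos[k]), fk
        # hand the swarm's best vector to a greedy DDS refinement step; the
        # perturbed fraction decays with the evaluation budget, as in DDS
        frac = max(0.05, 1.0 - math.log(evals) / math.log(n_evals))
        gbest, gbest_f = dds_step(gbest, gbest_f, f, bounds, frac)
        evals += 1
    return gbest, gbest_f
```

The greedy DDS update only ever replaces the incumbent with a strictly better candidate, so the refinement stage cannot undo progress made by the swarm.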

2.

Background  

Ordinary differential equations (ODEs) are an important tool for describing the dynamics of biological systems. However, for ODE models to be useful, their parameters must first be calibrated. Parameter estimation, that is, finding parameter values given experimental data, is an inference problem that can be treated systematically through a Bayesian framework.

3.

Background

Ordinary differential equations (ODEs) are often used to understand biological processes. Since ODE-based models usually contain many unknown parameters, parameter estimation is an important step toward a deeper understanding of the process. Parameter estimation is often formulated as a least-squares optimization problem in which all experimental data points are treated as equally important. However, this equal-weight formulation ignores possible differences in the relative importance of the data points and may lead to misleading estimation results. We therefore propose introducing weights that account for the relative importance of the data points when formulating the least-squares optimization problem. Each weight is defined by the uncertainty of one data point given the other data points: if a data point can be accurately inferred from the other data, its uncertainty is low and its importance is low; conversely, if a data point can hardly be inferred from the other data, its uncertainty is high and it carries more information for estimating the parameters.

Results

G1/S transition models with 6 and 12 parameters, and a MAPK module with 14 parameters, were used to test the weighted formulation. In each case, evenly spaced experimental data points were used. The weights calculated in these models showed similar patterns: high weights for data points in dynamic regions and low weights for data points in flat regions. We developed a sampling algorithm to evaluate the weighted formulation and demonstrated that it reduced the redundancy in the data. For the G1/S transition model with 12 parameters, we also examined unevenly spaced experimental data points, strategically sampled to place more measurement points where the weights were relatively high and fewer where they were relatively low. This analysis showed that the proposed weights can be used for designing measurement time points.

Conclusions

Giving each data point a weight according to its importance relative to the other data points is an effective way to improve the robustness of parameter estimation by reducing redundancy in the experimental data.
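The weighted formulation above can be sketched in a few lines. The weight definition here is an illustrative proxy, not the paper's exact construction: a point that its neighbours predict poorly by linear interpolation (a dynamic region) gets a high weight, while a point that interpolation recovers almost exactly (a flat region) gets a low one.

```python
import numpy as np

def interpolation_weights(t, y):
    # illustrative proxy for uncertainty-based weights: the weight of an
    # interior point is its absolute deviation from the linear interpolation
    # of its two neighbours; endpoints receive the mean interior weight
    w = np.empty(len(y))
    for i in range(1, len(y) - 1):
        y_hat = np.interp(t[i], [t[i - 1], t[i + 1]], [y[i - 1], y[i + 1]])
        w[i] = abs(y[i] - y_hat)
    w[0] = w[-1] = w[1:-1].mean()
    return w / w.sum()              # normalise the weights to sum to one

def weighted_sse(params, model, t, y, w):
    # the weighted least-squares objective replacing the equal-weight one
    r = y - model(t, params)
    return float(np.sum(w * r * r))
```

For an exponential decay sampled evenly, this assigns high weights to the curved early region and low weights to the flat tail, matching the pattern the abstract reports.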

4.

Background  

When creating mechanistic mathematical models of biological signaling processes, it is tempting to include as many known biochemical interactions as possible in one large model. For the JAK-STAT, MAP kinase, and NF-κB pathways a great deal of biological insight is available, and as a consequence, large mathematical models have emerged. For large models, the question arises whether the unknown model parameters can be uniquely determined by parameter estimation from measured data. Systematic approaches to answering this question are indispensable, since the uniqueness of model parameter values is essential for predictive mechanistic modeling.

5.

Background

Translating a known metabolic network into a dynamic model requires reasonable guesses of all enzyme parameters. In Bayesian parameter estimation, model parameters are described by a posterior probability distribution, which scores the potential parameter sets, showing how well each of them agrees with the data and with the prior assumptions made.

Results

We compute posterior distributions of kinetic parameters within a Bayesian framework, based on the integration of kinetic, thermodynamic, metabolic, and proteomic data. The structure of the metabolic system (i.e., stoichiometries and enzyme regulation) needs to be known, and the reactions are modelled by convenience kinetics with thermodynamically independent parameters. The parameter posterior is computed in two separate steps: a first posterior summarises the available data on enzyme kinetic parameters; an improved second posterior is obtained by integrating metabolic fluxes, concentrations, and enzyme concentrations for one or more steady states. The data can be heterogeneous, incomplete, and uncertain, and the posterior is approximated by a multivariate log-normal distribution. We apply the method to a model of the threonine synthesis pathway: the integration of metabolic data has little effect on the marginal posterior distributions of individual model parameters. Nevertheless, it leads to strong correlations between the parameters in the joint posterior distribution, which greatly improve the model predictions in subsequent Monte Carlo simulations.

Conclusion

We present a standardised method to translate metabolic networks into dynamic models. To determine the model parameters, evidence from various experimental data is combined and weighted using Bayesian parameter estimation. The resulting posterior parameter distribution describes a statistical ensemble of parameter sets; the parameter variances and correlations can account for missing knowledge, measurement uncertainties, or biological variability. The posterior distribution can be used to sample model instances and to obtain probabilistic statements about the model's dynamic behaviour.
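The abstract's key effect, that integrating metabolic data barely narrows individual marginals yet introduces strong parameter correlations, can be reproduced in a minimal sketch under strong simplifying assumptions: parameters are log-normal, so we work with x = ln θ, treat the first-step (kinetic-data) posterior as a Gaussian prior on x, and fold in a linearised steady-state measurement y = Ax + noise in closed form. The matrices and numbers below are illustrative, not from the paper.

```python
import numpy as np

def gaussian_update(mu0, cov0, A, y, cov_y):
    # closed-form posterior of x ~ N(mu0, cov0) after observing y = A x + e,
    # e ~ N(0, cov_y); written as a Kalman-style update
    mu0, cov0 = np.asarray(mu0, float), np.asarray(cov0, float)
    A, y, cov_y = np.asarray(A, float), np.asarray(y, float), np.asarray(cov_y, float)
    S = A @ cov0 @ A.T + cov_y            # innovation covariance
    K = cov0 @ A.T @ np.linalg.inv(S)     # gain
    mu = mu0 + K @ (y - A @ mu0)
    cov = cov0 - K @ A @ cov0
    return mu, cov
```

A single flux-like measurement of a sum of two log-parameters leaves both marginal variances broad while making the joint posterior strongly negatively correlated, which is exactly the kind of correlation that improves downstream Monte Carlo predictions.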

6.

Background

The investigation of network dynamics is a major issue in systems and synthetic biology. One essential step in such investigations is estimating the parameters of the model that expresses the biological phenomena. Various techniques for parameter optimization have been devised and implemented in both free and commercial software. While the computational time for parameter estimation has been greatly reduced by improved calculation algorithms and the advent of high-performance computers, the accuracy of parameter estimation has not been adequately addressed.

Results

We propose a new approach to parameter optimization that uses differential elimination to estimate kinetic parameter values with a high degree of accuracy. First, we use differential elimination, an algebraic approach for rewriting a system of differential equations into an equivalent system, to derive constraints between the kinetic parameters from the differential equations. Second, we estimate the kinetic parameters by adding these constraints to the objective function of the standard parameter optimization method, alongside the usual error term of the squared difference between measured and estimated data. To evaluate the method, we performed a simulation study using the objective function with and without the new constraints: the parameters in two models, one of linear and one of non-linear equations, were estimated with a genetic algorithm (GA) and particle swarm optimization (PSO), under the assumption that only one molecule in each model can be measured. The introduction of the new constraints was dramatically effective: GA and PSO with the new constraints successfully estimated the kinetic parameters in the simulated models with a high degree of accuracy, while the conventional GA and PSO methods without them frequently failed.

Conclusions

Introducing new constraints into the objective function via differential elimination dramatically improved the estimation accuracy of parameter optimization methods. The performance of our approach was illustrated by simulations of parameter optimization for two models of linear and non-linear equations, which included unmeasured molecules, using two types of optimization techniques. Our method is thus a promising development in parameter optimization.
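The augmented objective described above, data misfit plus constraint terms, can be written as a simple penalty function. The constraint used in the test below is a hypothetical placeholder relation between two parameters, not one actually derived by differential elimination.

```python
import numpy as np

def penalised_objective(params, t, y_meas, model, constraints, lam=10.0):
    # objective of the form described above: the usual squared-error term plus
    # quadratic penalties for algebraic constraints between kinetic parameters
    # (as would be derived by differential elimination); each entry of
    # `constraints` is a function that returns zero when its constraint holds
    residual = y_meas - model(t, params)
    sse = float(np.sum(residual ** 2))
    penalty = sum(c(params) ** 2 for c in constraints)
    return sse + lam * penalty
```

Any derivative-free optimizer such as a GA or PSO can then minimise this function directly; parameter sets that fit the single measured molecule but violate the eliminated relations are penalised.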

7.
Parameter estimation constitutes a major challenge in dynamic modeling of metabolic networks. Here we examine, via computational simulations, the influence of system nonlinearity and the nature of available data on the distribution and predictive capability of identified model parameters. Simulated methionine cycle metabolite concentration data (both with and without corresponding flux data) was inverted to identify model parameters consistent with it. Thousands of diverse parameter families were found to be consistent with the data to within moderate error, with most of the parameter values spanning over 1000-fold ranges irrespective of whether flux data was included. Due to strong correlations within the extracted parameter families, model predictions were generally reliable despite the broad ranges found for individual parameters. Inclusion of flux data, by strengthening these correlations, resulted in substantially more reliable flux predictions. These findings suggest that, despite the difficulty of extracting biochemically accurate model parameters from system level data, such data may nevertheless prove adequate for driving the development of predictive dynamic metabolic models.

8.
Folly WS. PLoS ONE 2011, 6(9): e24414

Background

Comparative and predictive analyses of suicide data from different countries are difficult to perform due to varying approaches and the lack of comparative parameters.

Methodology/Principal Findings

A simple model (the Threshold Bias Model) was tested for comparative and predictive analyses of suicide rates by age. The model comprises a six-parameter distribution that was applied to the USA suicide rates by age for the years 2001 and 2002. The parameter values obtained for these years were then linearly extrapolated to estimate the values for 2003. The calculated distributions agreed reasonably well with the aggregate data. The model was also used to determine the age above which suicide rates become statistically observable in the USA, Brazil and Sri Lanka.

Conclusions/Significance

The Threshold Bias Model has considerable potential applications in demographic studies of suicide. Moreover, since the model can be used to predict the evolution of suicide rates based on information extracted from past data, it will be of great interest to suicidologists and other researchers in the field of mental health.

9.

Background

One important preprocessing step in the analysis of microarray data is background subtraction. In high-density oligonucleotide arrays this is recognized as a crucial step for the global performance of the data analysis from raw intensities to expression values.

Results

We propose here an algorithm for background estimation based on a model in which the cost function is quadratic in a set of fitting parameters such that minimization can be performed through linear algebra. The model incorporates two effects: 1) Correlated intensities between neighboring features in the chip and 2) sequence-dependent affinities for non-specific hybridization fitted by an extended nearest-neighbor model.

Conclusion

The algorithm has been tested on 360 GeneChips from publicly available data of recent expression experiments. The algorithm is fast and accurate. Strong correlations between the fitted values for different experiments as well as between the free-energy parameters and their counterparts in aqueous solution indicate that the model captures a significant part of the underlying physical chemistry.
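The central computational point of this abstract is that a cost function quadratic in the fitting parameters reduces minimisation to linear algebra. In the actual algorithm, neighbour-intensity correlations and nearest-neighbour sequence affinities would enter as columns of the design matrix; how those columns are built is model-specific, so this sketch only shows the core least-squares step.

```python
import numpy as np

def fit_background(X, y):
    # when the cost ||X b - y||^2 is quadratic in the fitting parameters b,
    # minimisation reduces to solving the normal equations; lstsq does this
    # in a numerically robust way (no explicit X^T X is formed)
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    return b
```

This is why the method is fast: one matrix factorisation replaces iterative nonlinear optimisation.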

10.

Background

Current methods of analyzing Affymetrix GeneChip® microarray data require the estimation of probe set expression summaries, followed by the application of statistical tests to determine which genes are differentially expressed. The S-Score algorithm described by Zhang and colleagues is an alternative method that allows tests of hypotheses directly from probe-level data. It is based on an error model in which the detected signal is proportional to the probe pair signal for highly expressed genes, but approaches a background level (rather than 0) for genes with low levels of expression. This model is used to calculate relative changes in probe pair intensities, converting probe signals into multiple measurements with equalized errors, which are summed over a probe set to form the S-Score. Assuming no expression differences between chips, the S-Score follows a standard normal distribution, allowing hypotheses to be tested directly. Using spike-in and dilution datasets, we validated the S-Score method against comparisons of gene expression utilizing the more recently developed methods RMA, dChip, and MAS5.

Results

The S-Score showed excellent sensitivity and specificity in detecting low-level changes in gene expression. Rank ordering of S-Score values reflected known fold-change values more accurately than the other algorithms did.

Conclusion

The S-Score method, by utilizing probe-level data directly, offers significant advantages over comparisons that use only probe set expression summaries.
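The construction the abstract describes, per-probe-pair differences scaled to equalised errors and summed into an approximately standard-normal score, can be sketched as follows. This is an illustrative simplification: the error floor, the scaling constant `gamma`, and the exact error model differ in the published algorithm.

```python
import numpy as np

def s_score(pm_a, pm_b, background, gamma=0.1):
    # per-probe-pair error assumed proportional to signal for bright probes
    # but approaching a background floor for dim ones; each difference is
    # scaled to roughly unit variance before summing over the probe set
    err = gamma * np.sqrt(pm_a ** 2 + pm_b ** 2 + background ** 2)
    z = (pm_a - pm_b) / err
    return float(np.sum(z) / np.sqrt(len(z)))   # ~N(0, 1) when nothing changes
```

Because the score is approximately N(0, 1) under the null, a threshold such as |S| > 3 gives a direct hypothesis test without first computing expression summaries.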

11.

Background

This article describes classical and Bayesian interval estimation of genetic susceptibility based on random samples with pre-specified numbers of unrelated cases and controls.

Results

Frequencies of genotypes in cases and controls can be estimated directly from retrospective case-control data. On the other hand, genetic susceptibility, defined as the expected proportion of cases among individuals with a particular genotype, depends on the population proportion of cases (prevalence). Given this design, prevalence is an external parameter, and hence susceptibility cannot be estimated from the observed data alone. Interval estimation of susceptibility that can incorporate uncertainty in prevalence values is explored from both classical and Bayesian perspectives. The similarity between classical and Bayesian interval estimates, in terms of frequentist coverage probabilities for this problem, allows an appealing interpretation of classical intervals as bounds for genetic susceptibility. In addition, both the asymptotic classical and the Bayesian interval estimates have comparable average length. These interval estimates serve as a very good approximation to the "exact" (finite-sample) Bayesian interval estimates. Extension from genotypic to allelic susceptibility intervals shows a dependency on phenotype-induced deviations from Hardy-Weinberg equilibrium.

Conclusions

The suggested classical and Bayesian interval estimates appear to perform reasonably well. The exact Bayesian interval estimation method is generally recommended for genetic susceptibility; however, the asymptotic classical and approximate Bayesian methods are adequate for sample sizes of at least 50 cases and controls.
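The point estimate underlying these intervals follows directly from Bayes' rule: genotype frequencies in cases and controls come from the retrospective data, while prevalence is the external parameter the abstract highlights. A minimal sketch:

```python
def susceptibility(p_g_case, p_g_control, prevalence):
    # expected proportion of cases among carriers of a genotype, via Bayes'
    # rule; p_g_case and p_g_control are the genotype frequencies estimated
    # from the case-control sample, prevalence is supplied externally
    num = p_g_case * prevalence
    return num / (num + p_g_control * (1.0 - prevalence))
```

Interval estimation then amounts to propagating uncertainty in all three inputs, for example by sampling prevalence from a prior distribution, which is where the classical and Bayesian treatments diverge.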

12.

Key message

The calibration data for genomic prediction should represent the full genetic spectrum of a breeding program. Data heterogeneity is minimized by connecting data sources through highly related test units.

Abstract

One of the major challenges of genome-enabled prediction in plant breeding lies in the optimum design of the population employed in model training. With highly interconnected breeding cycles staggered in time, the choice of data for model training is not straightforward. We used cross-validation and independent validation to assess the performance of genome-based prediction within and across genetic groups, testers, locations, and years. The study comprised data for 1,073 and 857 doubled haploid lines evaluated as testcrosses in two years. Testcrosses were phenotyped for grain dry matter yield and content and genotyped with 56,110 single nucleotide polymorphism markers. Predictive abilities strongly depended on the relatedness of the doubled haploid lines in the estimation set to those on which prediction accuracy was assessed. For scenarios with strong population heterogeneity, it was advantageous to perform predictions within a priori defined genetic groups until higher connectivity through related test units was achieved. Differences between group means had a strong effect on predictive abilities obtained with both cross-validation and independent validation. Predictive abilities across subsequent cycles of selection and years were only slightly reduced compared with predictive abilities obtained by cross-validation within the same year. We conclude that the optimum data set for model training in genome-enabled prediction should represent the full genetic and environmental spectrum of the respective breeding program. Data heterogeneity can be reduced by experimental designs that maximize the connectivity between data sources through common or highly related test units.

13.

Background  

In order to improve understanding of metabolic systems there have been attempts to construct S-system models from time courses. Conventionally, non-linear curve-fitting algorithms have been used for modelling, because of the non-linear properties of parameter estimation from time series. However, the huge iterative calculations required have hindered the development of large-scale metabolic pathway models. To solve this problem we propose a novel method involving power-law modelling of metabolic pathways from the Jacobian of the targeted system and the steady-state flux profiles by linearization of S-systems.

14.

Background

Biochemical equilibria are usually modeled iteratively: given one or a few fitted models, if there is a lack of fit or overfitting, a new model with additional or fewer parameters is fitted, and the process is repeated. The problem with this approach is that different analysts can propose and select different models and thus extract different binding parameter estimates from the same data. An alternative is to first generate a comprehensive standardized list of plausible models, and to then fit them exhaustively, or semi-exhaustively.

Results

A framework is presented in which equilibria are modeled as pairs (g, h), where g = 0 maps total reactant concentrations (system inputs) into free reactant concentrations (system states), which h then maps into expected values of measurements (system outputs). By letting dissociation constants Kd be either freely estimated, infinity, zero, or equal to other Kd, and by letting undamaged protein fractions be either freely estimated or 1, many g models are formed. A standard space of g models for ligand-induced protein dimerization equilibria is given. Coupled to an h model, the resulting (g, h) were fitted to dTTP-induced R1 dimerization data (R1 is the large subunit of ribonucleotide reductase). Models with the fewest parameters were fitted first. Thereafter, upon fitting a batch, the next batch of models (with one more parameter) was fitted only if the current batch yielded a model that was better (based on the Akaike Information Criterion) than the best model in the previous batch (with one less parameter). Within batches, models were fitted in parallel. This semi-exhaustive approach yielded the same best models as an exhaustive model space fit, but in approximately one-fifth the time.

Conclusion

Comprehensive model space based biochemical equilibrium model selection methods are realizable. Their significance to systems biology as mappings of data into mathematical models warrants their development.
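The batch-by-batch stopping rule described above can be sketched as follows. This is a minimal illustration: models within a batch are fitted sequentially here rather than in parallel, and the AIC form assumes Gaussian errors.

```python
import math

def semi_exhaustive_select(batches, fit):
    # batches are ordered by increasing parameter count; `fit` returns
    # (sse, n_params, n_obs) for one model; a batch with one more parameter
    # is fitted only if the current batch produced a new best AIC
    def aic(sse, k, n):
        # Gaussian-error AIC, up to an additive constant
        return n * math.log(sse / n) + 2 * k
    best_model, best_aic = None, float("inf")
    for batch in batches:
        improved = False
        for m in batch:
            sse, k, n = fit(m)
            score = aic(sse, k, n)
            if score < best_aic:
                best_model, best_aic, improved = m, score, True
        if not improved:
            break   # an extra parameter no longer pays for itself
    return best_model, best_aic
```

The time saving the abstract reports comes from the `break`: once a batch fails to beat the best AIC of the previous batch, all larger model batches are skipped.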

15.
Analytical solutions were developed, based on the Green's function method, to describe heat transfer in tissue including the effects of blood perfusion. These one-dimensional transient solutions were used with a simple parameter estimation technique and experimental measurements of temperature and heat flux at the surface of simulated tissue. It was demonstrated how such surface measurements, taken during step changes in the surface thermal conditions, can be used to estimate the values of three important parameters: blood perfusion (w_b), thermal contact resistance (R"), and core temperature of the tissue (T_core). The new models were tested against finite-difference solutions of thermal events on the surface to show the validity of the analytical solution. Simulated data were used to demonstrate the response of the model in predicting optimal parameters from noisy temperature and heat flux measurements. Finally, the analytical model and simple parameter estimation routine were used with actual experimental data from perfusion in phantom tissue, and the model provided a very good match with the data curves. This is the first time that all three of these important parameters (w_b, R", and T_core) have been estimated simultaneously from a single set of thermal measurements at the surface of tissue.

16.
The response of the lower limb to dynamic, transient torsional loading applied at the foot has been measured for a male test subject. The dynamic loading was provided by a computer controlled pneumatic system which applied single haversine (i.e. half cycle of a sine wave) axial moment pulses of variable amplitude (0-100 Nm) and duration (50-600 ms). Potentiometers measured the absolute rotations of the three leg segments. Test variables included rotation direction, weight bearing and joint flexion. Two approaches were explored for specifying parameters (i.e. inertia, damping, stiffness) of a three degree-of-freedom dynamic system model which best duplicated the measured response. One approach involved identification of linear parameters by means of optimization while the other approach entailed estimation. Parameter estimates, which included non-linear, asymmetric stiffness functions, were derived from the literature. The optimization was undertaken so as to identify parameter dependence on test variables. Results indicate that parameter values are influenced by test variables. Results also indicate that the non-linear, estimated model better approximates the experimental data than the linear, identified model. In addition to identifying parameters of a three degree-of-freedom model, parameters were also identified for a single degree-of-freedom model where the motion variable was intended to indicate the rotation of the in vivo knee. It is concluded that the simpler model offers good accuracy in predicting both magnitude and time of occurrence of peak knee axial rotations. Model motion fails to track the measured knee rotation subsequent to the peak, however.

17.

Background

It was recently shown that the treatment effect of an antibody can be described by a consolidated parameter which includes the reaction rates of the receptor-toxin-antibody kinetics and the relative concentration of reacting species. As a result, any given value of this parameter determines an associated range of antibody kinetic properties and its relative concentration in order to achieve a desirable therapeutic effect. In the current study we generalize the existing kinetic model by explicitly taking into account the diffusion fluxes of the species.

Results

A refined model of receptor-toxin-antibody (RTA) interaction is studied numerically. The protective properties of an antibody against a given toxin are evaluated for a spherical cell placed into a toxin-antibody solution. The parameters selected for the numerical simulation correspond approximately to practically relevant values reported in the literature, with significant ranges of variation to demonstrate different regimes of intracellular transport.

Conclusions

The proposed refinement of the RTA model may become important for the consistent evaluation of the protective potential of an antibody and for estimating the time period during which the application of this antibody is most effective. It can be a useful tool for the in vitro selection of potential protective antibodies for progression to in vivo evaluation.

18.
An important aspect of systems biology research is the so-called "reverse engineering" of cellular metabolic dynamics from measured input-output data. This allows researchers to estimate and validate both the pathway's structure as well as the kinetic constants. In this paper, the recently published 'Proximate Parameter Tuning' (PPT) method for the identification of biochemical networks is analyzed. In particular, it is shown that the described PPT algorithm is essentially equivalent to a sequential linear programming implementation of a constrained optimization problem. The corresponding objective function consists of two parts: the first emphasises the data fitting, where a residual 1-norm is used, and the second emphasises the proximity of the calculated parameters to the specified nominal values, using an ∞-norm. The optimality properties of the PPT algorithm's solution as well as its geometric interpretation are analyzed. The concept of an optimal parameter locus is applied for the exploration of the entire family of optimal solutions, and an efficient implementation of the parameter locus is developed. Parallels are drawn with 1-norm parameter deviation regularization, which attempts to fit the data with a minimal number of parameters. Finally, a small example is used to illustrate all of these properties.
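For a model that is linear in the parameters (an assumption made here purely for illustration; PPT itself handles the nonlinear case via sequential linearisation), the two-part objective above is a single linear program: minimise ||Ap − y||₁ + μ·||p − p0||_∞. The sketch below encodes the 1-norm with per-residual slack variables and the ∞-norm with one scalar bound.

```python
import numpy as np
from scipy.optimize import linprog

def ppt_lp(A, y, p0, mu=0.1):
    # LP reading of the PPT objective: decision vector is [p, r, s] where
    # r bounds |A p - y| elementwise (1-norm part) and the scalar s bounds
    # |p - p0| elementwise (inf-norm proximity part)
    m, n = A.shape
    c = np.concatenate([np.zeros(n), np.ones(m), [mu]])
    I_m, I_n = np.eye(m), np.eye(n)
    ones_n = np.ones((n, 1))
    A_ub = np.block([
        [A,    -I_m,                np.zeros((m, 1))],
        [-A,   -I_m,                np.zeros((m, 1))],
        [I_n,   np.zeros((n, m)),  -ones_n],
        [-I_n,  np.zeros((n, m)),  -ones_n],
    ])
    b_ub = np.concatenate([y, -y, p0, -p0])
    bounds = [(None, None)] * n + [(0, None)] * m + [(0, None)]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    return res.x[:n]
```

Sweeping μ traces out the optimal parameter locus the abstract mentions: small μ favours data fit, large μ pins the parameters to their nominal values.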

19.

Background

In order to reduce the time and effort needed to develop microbial strains with better capabilities of producing desired bioproducts, genome-scale metabolic simulations have proven useful in identifying gene knockout and amplification targets. Constraints-based flux analysis has successfully been employed for such simulations, but is limited in its ability to properly describe the complex nature of biological systems. Gene knockout simulations are relatively straightforward to implement, simply by constraining the flux values of the target reaction to zero, but the identification of reliable gene amplification targets is rather difficult. Here, we report a new algorithm which incorporates physiological data into a model to improve the model's prediction capabilities and to capitalize on the relationships between genes and metabolic fluxes.

Results

We developed an algorithm, flux variability scanning based on enforced objective flux (FVSEOF) with grouping reaction (GR) constraints, in an effort to identify gene amplification targets by considering reactions that co-carry flux values based on physiological omics data via "GR constraints". This method scans changes in the variabilities of metabolic fluxes in response to an artificially enforced objective flux of product formation. The gene amplification targets predicted using this method were validated by comparing the predicted effects with previous experimental results obtained for the production of shikimic acid and putrescine in Escherichia coli. Moreover, new gene amplification targets for further enhancing putrescine production were validated through experiments involving the overexpression of each identified target gene under condition-controlled batch cultivation.

Conclusions

FVSEOF with GR constraints allows the identification of gene amplification targets for the metabolic engineering of microbial strains to enhance the production of desired bioproducts. The algorithm was validated through experiments on the enhanced production of putrescine in E. coli, in addition to the comparison with previously reported experimental data. The FVSEOF strategy with GR constraints will be generally useful for developing industrially important microbial strains with enhanced capabilities for producing chemicals of interest.

20.
Stable isotope-assisted metabolic flux analysis (MFA) is a powerful method to estimate carbon flow and partitioning in metabolic networks. At its core, MFA is a parameter estimation problem wherein the fluxes and metabolite pool sizes are model parameters that are estimated, via optimization, to account for measurements of steady-state or isotopically-nonstationary isotope labeling patterns. As MFA problems advance in scale, they require efficient computational methods for fast and robust convergence. The structure of the MFA problem enables it to be cast as an equality-constrained nonlinear program (NLP), where the equality constraints are constructed from the MFA model equations, and the objective function is defined as the sum of squared residuals (SSR) between the model predictions and a set of labeling measurements. This NLP can be solved by using an algebraic modeling language (AML) that offers state-of-the-art optimization solvers for robust parameter estimation and superior scalability to large networks. When implemented in this manner, the optimization is performed with no distinction between state variables and model parameters. During each iteration of such an optimization, the system state is updated instead of being calculated explicitly from scratch, and this occurs concurrently with improvement in the model parameter estimates. This optimization approach starkly contrasts with traditional “shooting” methods where the state variables and model parameters are kept distinct and the system state is computed afresh during each iteration of a stepwise optimization. Our NLP formulation uses the MFA modeling framework of Wiechert et al. [1], which is amenable to incorporation of the model equations into an NLP. The NLP constraints consist of balances on either elementary metabolite units (EMUs) or cumomers. In this formulation, both the steady-state and isotopically-nonstationary MFA (inst-MFA) problems may be solved as an NLP. 
For the inst-MFA case, the ordinary differential equation (ODE) system describing the labeling dynamics is transcribed into a system of algebraic constraints for the NLP using collocation. This large-scale NLP may be solved efficiently using an NLP solver implemented in an AML. In our implementation, we used the reduced-gradient solver CONOPT, implemented in the General Algebraic Modeling System (GAMS). The NLP framework is particularly advantageous for inst-MFA, scaling well to large networks with many free parameters and having more robust convergence properties than the shooting methods that compute the system state and sensitivities at each iteration. Additionally, this NLP approach supports the use of tandem-MS data for both steady-state and inst-MFA when the cumomer framework is used. We assembled eiFlux, a software package written in Python and GAMS, that uses the NLP approach and supports both steady-state and inst-MFA. We demonstrate the effectiveness of the NLP formulation on several examples, including a genome-scale inst-MFA model, to highlight the scalability and robustness of this approach. In addition to typical inst-MFA applications, we expect that this framework and our associated software, eiFlux, will be particularly useful for applying inst-MFA to complex MFA models, such as those developed for eukaryotes (e.g. algae) and co-cultures with multiple cell types.
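The contrast the abstract draws, state variables and parameters optimised together under equality constraints, versus shooting methods that re-solve the state at every iteration, can be shown on a toy one-reaction model. Everything here is illustrative (the model, the numbers, and the use of SciPy's SLSQP in place of CONOPT/GAMS), not taken from eiFlux.

```python
import numpy as np
from scipy.optimize import minimize

def solve_mfa_nlp(x_meas, v_in=1.0):
    # equality-constrained NLP in the spirit described above: the state x and
    # the parameter k are *both* decision variables, the steady-state model
    # equation dx/dt = v_in - k*x = 0 enters as an equality constraint rather
    # than being solved afresh each iteration, and the objective is the SSR
    # against the single measurement of x
    def ssr(z):
        x, k = z
        return (x - x_meas) ** 2
    steady_state = {"type": "eq", "fun": lambda z: v_in - z[1] * z[0]}
    res = minimize(ssr, x0=[1.0, 1.0], constraints=[steady_state], method="SLSQP")
    return res.x
```

In the real formulation the constraint set is the full system of EMU or cumomer balances (plus collocation equations for inst-MFA), but the structure is the same: one NLP over states and parameters jointly.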
