Similar literature (20 results)
1.
In quantitative biology, observed data are fitted to a model that captures the essence of the system under investigation in order to obtain estimates of the parameters of the model, as well as their standard errors and interactions. The fitting is best done by the method of maximum likelihood, though least-squares fits are often used as an approximation because the calculations are perceived to be simpler. Here Brian Williams and Chris Dye argue that the method of maximum likelihood is generally preferable to least squares, giving the best estimates of the parameters for data with any given error distribution, and that the calculations are no more difficult than for least-squares fitting. They offer a relatively simple explanation of the method and describe its implementation using examples from leishmaniasis epidemiology.
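The contrast between the two fitting criteria can be made concrete with a short sketch. The example below is not taken from the paper; it fits a hypothetical exponential decline in case counts either by ordinary least squares or by maximizing a Poisson likelihood, the error model most natural for count data. All names and values are invented.

```python
# Minimal sketch: least-squares vs maximum-likelihood fit of an exponential
# trend to Poisson-distributed counts (illustrative data, not the paper's).
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
t = np.arange(10)                                    # e.g. years of surveillance
true_a, true_b = 200.0, 0.3
counts = rng.poisson(true_a * np.exp(-true_b * t))   # hypothetical case counts

def model(params, t):
    a, b = params
    return a * np.exp(-b * t)

# Ordinary least squares: minimise squared residuals (implicitly assumes
# constant-variance Gaussian errors).
def sse(params):
    return np.sum((counts - model(params, t)) ** 2)

# Maximum likelihood for Poisson errors: minimise the negative log-likelihood.
def neg_log_lik(params):
    mu = model(params, t)
    if np.any(mu <= 0):
        return np.inf
    return np.sum(mu - counts * np.log(mu))

ls_fit = minimize(sse, x0=[100, 0.1], method="Nelder-Mead")
ml_fit = minimize(neg_log_lik, x0=[100, 0.1], method="Nelder-Mead")
print("least squares:", ls_fit.x)
print("max likelihood:", ml_fit.x)
```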

2.
Aims: Accurate forecasts of ecosystem states are critical for improving natural resource management and climate change mitigation. Assimilating observed data into models is an effective way to reduce uncertainties in ecological forecasting. However, the influence of measurement errors on parameter estimation and forecasted state changes has not been carefully examined. This study analyzed the parameter identifiability of a process-based ecosystem carbon cycle model, the sensitivity of parameter estimates and model forecasts to the magnitudes of measurement errors, and the information contributions of the assimilated data to model forecasts with a data assimilation approach.
Methods: We applied a Markov Chain Monte Carlo method to assimilate eight biometric data sets into the Terrestrial ECOsystem model. The data were observations of foliage biomass, wood biomass, fine root biomass, microbial biomass, litter fall, litter, soil carbon and soil respiration, collected at the Duke Forest free-air CO2 enrichment facilities from 1996 to 2005. Three levels of measurement error were assigned to these data sets by halving and doubling their original standard deviations.
Important findings: Results showed that fewer than half of the 30 parameters could be constrained, even though the observations were extensive and the model was relatively simple. Higher measurement errors led to higher uncertainties in parameter estimates and forecasted carbon (C) pool sizes. The long-term predictions of the slow turnover pools were affected less by the measurement errors than those of fast turnover pools. Assimilated data contributed less information for the pools with long residence times in long-term forecasts. These results indicate that the residence times of C pools played a key role in regulating the propagation of errors from measurements to model forecasts in a data assimilation system. Improving the estimation of parameters of slow turnover C pools is the key to better forecasting of long-term ecosystem C dynamics.
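As a toy illustration of the assimilation machinery (not the Terrestrial ECOsystem model itself), the sketch below uses a random-walk Metropolis sampler to estimate the turnover rate of a single hypothetical carbon pool from noisy pool observations; all names and numbers are assumptions.

```python
# Simplified data assimilation sketch: Metropolis-Hastings estimation of a
# one-pool turnover rate from noisy observations (illustrative only).
import numpy as np

rng = np.random.default_rng(1)
years = np.arange(30)
true_k, inputs, c0 = 0.05, 5.0, 80.0          # turnover rate (1/yr), input flux, initial pool

def simulate(k):
    c, pool = np.empty(len(years)), c0
    for i in range(len(years)):
        pool = pool + inputs - k * pool        # simple first-order pool dynamics
        c[i] = pool
    return c

obs_sd = 5.0                                   # assumed measurement error
obs = simulate(true_k) + rng.normal(0, obs_sd, len(years))

def log_post(k):
    if not (0.001 < k < 1.0):                  # flat prior on a plausible range
        return -np.inf
    resid = obs - simulate(k)
    return -0.5 * np.sum((resid / obs_sd) ** 2)

# Random-walk Metropolis over the single parameter k.
chain, k_cur = [], 0.2
lp_cur = log_post(k_cur)
for _ in range(20000):
    k_prop = k_cur + rng.normal(0, 0.01)
    lp_prop = log_post(k_prop)
    if np.log(rng.uniform()) < lp_prop - lp_cur:
        k_cur, lp_cur = k_prop, lp_prop
    chain.append(k_cur)

chain = np.array(chain[5000:])                 # discard burn-in
print(f"posterior mean k = {chain.mean():.3f} +/- {chain.std():.3f}")
```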

3.
4.
Analysis of data in terms of the sum of two rectangular hyperbolas is frequently required in solute uptake studies. Four methods for such analysis have been compared. Three are based on least-squares fitting whereas the fourth (partition method I) is an extension of a single hyperbola fitting procedure based on non-parametric statistics. The four methods were tested using data sets which had been generated with two primary types of random, normal error in the dependent variable: one of constant error variance and the other of constant coefficient of variation. The methods were tested on further data sets which were obtained by incorporating single 10% bias errors at different positions in the original two sets. Partition method I consistently gave good estimates for the four parameters defining the double hyperbola and was highly insensitive to the bias errors. The least-squares procedures performed well under conditions satisfying the least-squares assumptions regarding error distribution, but frequently gave poor estimates when these assumptions did not hold. Our conclusion is that in view of the errors inherent in many solute uptake experiments it would usually be preferable to analyse data by a method such as partition method I rather than to rely on a least-squares procedure.
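For readers who want to try the least-squares route, a minimal sketch is below. It illustrates an ordinary least-squares procedure, not partition method I; the concentrations and parameter values are invented.

```python
# Least-squares fit of the sum of two rectangular hyperbolas (illustrative data).
import numpy as np
from scipy.optimize import curve_fit

def double_hyperbola(s, vmax1, km1, vmax2, km2):
    return vmax1 * s / (km1 + s) + vmax2 * s / (km2 + s)

rng = np.random.default_rng(2)
s = np.array([0.5, 1, 2, 5, 10, 25, 50, 100, 250, 500], dtype=float)  # concentrations
v_true = double_hyperbola(s, 10.0, 2.0, 40.0, 150.0)
v_obs = v_true * (1 + rng.normal(0, 0.05, s.size))   # constant coefficient of variation

popt, pcov = curve_fit(double_hyperbola, s, v_obs, p0=[5, 1, 20, 100], maxfev=10000)
perr = np.sqrt(np.diag(pcov))
for name, val, err in zip(["Vmax1", "Km1", "Vmax2", "Km2"], popt, perr):
    print(f"{name} = {val:.2f} +/- {err:.2f}")
```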

5.

Background  

Experimental results are commonly fitted by determining parameter values of suitable mathematical expressions. If a relationship exists between different data sets, the accuracy of the parameters obtained can be increased by incorporating this relationship in the fitting process instead of fitting the recordings separately.
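A minimal sketch of such a global (simultaneous) fit, under the assumption that two hypothetical recordings are exponential decays sharing the same rate constant, is:

```python
# Global fit: two data sets share a decay rate k but have their own amplitudes,
# so k is estimated from both recordings at once (illustrative data only).
import numpy as np
from scipy.optimize import least_squares

rng = np.random.default_rng(3)
t = np.linspace(0, 5, 40)
y1 = 3.0 * np.exp(-1.2 * t) + rng.normal(0, 0.05, t.size)
y2 = 7.0 * np.exp(-1.2 * t) + rng.normal(0, 0.05, t.size)

def residuals(params):
    a1, a2, k = params
    return np.concatenate([y1 - a1 * np.exp(-k * t),
                           y2 - a2 * np.exp(-k * t)])

fit = least_squares(residuals, x0=[1.0, 1.0, 0.5])
print("a1, a2, shared k:", fit.x)
```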

6.
1. State space models are starting to replace simpler time series models in analyses of the temporal dynamics of populations that are not perfectly censused. By simultaneously modelling both the dynamics and the observations, consistent estimates of population dynamical parameters may be obtained. For many data sets, the distribution of observation errors is unknown and error models are typically chosen in an ad hoc manner. 2. To investigate the influence of the choice of observation error on inferences, we analyse the dynamics of a replicated time series of red kangaroo surveys using a state space model with linear state dynamics. Surveys were performed through aerial counts, and Poisson, overdispersed Poisson, normal and log-normal distributions may all be adequate for modelling observation errors for the data. We fit each of these to the data and compare them using AIC. 3. The state space models were fitted with maximum likelihood methods using a recent importance sampling technique that relies on the Kalman filter. The method relaxes the assumption of Gaussian observation errors required by the basic Kalman filter. Matlab code for fitting linear state space models with Poisson observations is provided. 4. The ability of AIC to identify the correct observation model was investigated in a small simulation study. For the parameter values used in the study, without replicated observations, the correct observation distribution could sometimes be identified, but model selection was prone to misclassification. On the other hand, when observations were replicated, the correct distribution could typically be identified. 5. Our results illustrate that inferences may differ markedly depending on the observation distributions used, suggesting that choosing an adequate observation model can be critical. Model selection and simulations show that for the models and parameter values in this study, a suitable observation model can typically be identified if observations are replicated. Model selection and replication of observations, therefore, provide a potential solution when the observation distribution is unknown.
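The paper supplies Matlab code; purely as an illustration of the Gaussian building block behind such fits, the Python sketch below computes and maximizes the Kalman-filter likelihood for a scalar linear state space model. The importance-sampling extension for Poisson observation errors is not shown, and all parameter values are invented.

```python
# Kalman-filter log-likelihood for a scalar AR(1)-plus-noise state space model,
# maximized by direct search (illustrative sketch only).
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(11)
T, a_true, q_true, r_true = 100, 0.9, 0.05, 0.1
x = np.zeros(T)
for t in range(1, T):
    x[t] = a_true * x[t - 1] + rng.normal(0, np.sqrt(q_true))   # latent log-abundance
y = x + rng.normal(0, np.sqrt(r_true), T)                        # noisy survey on log scale

def neg_log_lik(params):
    a, log_q, log_r = params
    q, r = np.exp(log_q), np.exp(log_r)
    m, P, ll = 0.0, 1.0, 0.0
    for t in range(T):
        m_pred, P_pred = a * m, a * a * P + q       # predict
        S = P_pred + r                              # innovation variance
        v = y[t] - m_pred                           # innovation
        ll += -0.5 * (np.log(2 * np.pi * S) + v * v / S)
        K = P_pred / S                              # update
        m, P = m_pred + K * v, (1 - K) * P_pred
    return -ll

fit = minimize(neg_log_lik, x0=[0.5, np.log(0.1), np.log(0.1)], method="Nelder-Mead")
a_hat, q_hat, r_hat = fit.x[0], np.exp(fit.x[1]), np.exp(fit.x[2])
print(f"a = {a_hat:.2f}, process var = {q_hat:.3f}, obs var = {r_hat:.3f}")
```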

7.
Assessing animal population growth curves is an essential feature of field studies in ecology and wildlife management. We used five models to assess population growth rates with a number of sets of population growth rate data. A 'generalized' logistic curve provides a better model than do four other popular models. Use of difference equations for fitting was checked by comparing that method with direct fitting of the analytical (integrated) solution for three of the models. Fits to field data indicate that estimates of the asymptote, K, from the 'generalized' logistic and the ordinary logistic agree well enough to support using estimates of K from the ordinary logistic on data that cannot be satisfactorily fitted with the generalized logistic. Akaike's information criterion is widely used, often in its small-sample version, AICc. Our study of five models indicated a bias in the AICc criterion, so we recommend checking results with estimates of variance about regression for fitted models. Fitting growth curves provides a valuable supplement to, and check on, computer models of populations.
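A minimal sketch of the kind of comparison involved, fitting the ordinary and a Richards-type 'generalized' logistic to invented census data and scoring them with AICc (Gaussian residuals assumed; for brevity the parameter count here omits the residual variance):

```python
# Fit ordinary and generalized (Richards-type) logistic curves and compare AICc.
import numpy as np
from scipy.optimize import curve_fit

def logistic(t, K, N0, r):
    return K / (1 + (K / N0 - 1) * np.exp(-r * t))

def gen_logistic(t, K, N0, r, theta):
    # Solution of dN/dt = r*N*(1 - (N/K)**theta); reduces to the logistic at theta = 1.
    return K / (1 + ((K / N0) ** theta - 1) * np.exp(-r * theta * t)) ** (1 / theta)

def aicc(y, yhat, n_par):
    n = y.size
    rss = np.sum((y - yhat) ** 2)
    aic = n * np.log(rss / n) + 2 * n_par          # Gaussian AIC up to a constant
    return aic + 2 * n_par * (n_par + 1) / (n - n_par - 1)

rng = np.random.default_rng(4)
t = np.arange(0, 20, dtype=float)
y = logistic(t, 500, 20, 0.6) + rng.normal(0, 10, t.size)   # invented census data

p1, _ = curve_fit(logistic, t, y, p0=[400, 10, 0.5], maxfev=10000)
p2, _ = curve_fit(gen_logistic, t, y, p0=[400, 10, 0.5, 1.0], maxfev=10000)
print("ordinary logistic AICc:", aicc(y, logistic(t, *p1), 3))
print("generalized logistic AICc:", aicc(y, gen_logistic(t, *p2), 4))
```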

8.
Codon-based substitution models have been widely used to identify amino acid sites under positive selection in comparative analysis of protein-coding DNA sequences. The nonsynonymous/synonymous substitution rate ratio (dN/dS, denoted ω) is used as a measure of selective pressure at the protein level, with ω > 1 indicating positive selection. Statistical distributions are used to model the variation in ω among sites, allowing a subset of sites to have ω > 1 while the rest of the sequence may be under purifying selection with ω < 1. An empirical Bayes (EB) approach is then used to calculate posterior probabilities that a site comes from the site class with ω > 1. Current implementations, however, use the naive EB (NEB) approach and fail to account for sampling errors in maximum likelihood estimates of model parameters, such as the proportions and ω ratios for the site classes. In small data sets lacking information, this approach may lead to unreliable posterior probability calculations. In this paper, we develop a Bayes empirical Bayes (BEB) approach to the problem, which assigns a prior to the model parameters and integrates over their uncertainties. We compare the new and old methods on real and simulated data sets. The results suggest that in small data sets the new BEB method does not generate false positives as did the old NEB approach, while in large data sets it retains the good power of the NEB approach for inferring positively selected sites.

9.
A sensitivity analysis based on weighted least-squares regression is presented to evaluate alternative methods for fitting lumped-parameter models to respiratory impedance data. The goal is to maintain parameter accuracy simultaneously with practical experiment design. The analysis focuses on predicting parameter uncertainties using a linearized approximation for joint confidence regions. Applications are with four-element parallel and viscoelastic models for 0.125- to 4-Hz data and a six-element model with separate tissue and airway properties for input and transfer impedance data from 2 to 64 Hz. The form of the criterion function was evaluated by comparing parameter uncertainties when data are fit as magnitude and phase, dynamic resistance and compliance, or real and imaginary parts of input impedance. The proper choice of weighting can make all three criterion variables comparable. For the six-element model, parameter uncertainties were predicted when both input impedance and transfer impedance are acquired and fit simultaneously. A fit to both data sets from 4 to 64 Hz could reduce parameter estimate uncertainties considerably from those achievable by fitting either alone. For the four-element models, use of an independent, but noisy, measure of static compliance was assessed as a constraint on model parameters. This may allow acceptable parameter uncertainties for a minimum frequency of 0.275-0.375 Hz rather than 0.125 Hz, reducing the data acquisition requirement from a 16-s breath-holding period to one of 5.33-8 s. These results are approximations, and the impact of using the linearized approximation for the confidence regions is discussed.
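The linearized uncertainty approximation referred to here can be sketched generically: at the weighted least-squares solution, the parameter covariance is approximated from the Jacobian of the weighted residuals. The toy model and values below are invented and are not the respiratory impedance models of the paper.

```python
# Linearized (asymptotic) parameter uncertainties: cov(p) ~ (J^T J)^{-1} for the
# Jacobian J of weighted residuals at the weighted least-squares solution.
import numpy as np
from scipy.optimize import least_squares

def model(p, f):
    # toy impedance-magnitude-like curve, purely illustrative
    R, C = p
    return np.sqrt(R ** 2 + (1.0 / (2 * np.pi * f * C)) ** 2)

rng = np.random.default_rng(5)
f = np.linspace(0.125, 4.0, 30)                  # frequency grid (Hz)
sigma = 0.05 * model([2.0, 0.1], f)              # assumed per-point measurement SD
y = model([2.0, 0.1], f) + rng.normal(0, sigma)

def wres(p):
    return (y - model(p, f)) / sigma             # weighted residuals

fit = least_squares(wres, x0=[1.0, 0.05])
J = fit.jac                                      # Jacobian of weighted residuals
cov = np.linalg.inv(J.T @ J)                     # linearized covariance matrix
perr = np.sqrt(np.diag(cov))
print("R, C:", fit.x, "+/-", perr)
```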

10.
Patterns that resemble strongly skewed size distributions are frequently observed in ecology. A typical example is the distribution of tree stem diameters. Empirical tests of ecological theories predicting their parameters have been conducted, but the results are difficult to interpret because the statistical methods applied to fit such decaying size distributions vary. In addition, binning of field data as well as measurement errors can bias parameter estimates. Here, we compare three different methods for parameter estimation: the common maximum likelihood estimation (MLE) and two modified types of MLE correcting for binning of observations or random measurement errors. We test whether three typical frequency distributions, namely the power-law, negative exponential and Weibull distributions, can be precisely identified, and how parameter estimates are biased when observations are additionally either binned or contain measurement error. We show that uncorrected MLE already loses the ability to discern functional form and parameters at relatively small levels of uncertainty. The modified MLE methods that consider such uncertainties (either binning or measurement error) are comparatively much more robust. We conclude that it is important to reduce binning of observations where possible and to quantify observation accuracy in empirical studies when fitting strongly skewed size distributions. In general, modified MLE methods that correct for binning or measurement errors can be applied to ensure reliable results.
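The binning correction amounts to replacing the per-observation density in the likelihood with the probability mass falling in each size class. A minimal sketch for a negative exponential distribution of hypothetical stem diameters (all values invented) is:

```python
# Maximum likelihood for binned observations: each class count contributes the
# log of the probability mass in its bin (negative exponential example).
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(6)
diam = rng.exponential(scale=12.0, size=2000)          # hypothetical stem diameters (cm)
edges = np.arange(0, 100, 5)                           # 5-cm diameter classes
counts, _ = np.histogram(diam, bins=edges)

def neg_log_lik(lam):
    cdf = 1.0 - np.exp(-lam * edges)                   # exponential CDF at class edges
    p = np.diff(cdf)                                   # probability mass per class
    p = np.clip(p / p.sum(), 1e-12, None)              # renormalise over observed range
    return -np.sum(counts * np.log(p))

fit = minimize_scalar(neg_log_lik, bounds=(1e-3, 1.0), method="bounded")
midpoint_mean = np.average((edges[:-1] + edges[1:]) / 2, weights=counts)
print("naive mean of class midpoints:", midpoint_mean)
print("binned-MLE scale 1/lambda:", 1.0 / fit.x)
```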

11.
A method of fluorescence anisotropy decay analysis is described in this work. The transient anisotropy r_ex(t) measured in a photon-counting pulse fluorimeter is fitted by a nonlinear least-squares procedure to the ratio of convolutions of the apparatus response function g(t) with sums of appropriate exponential functions. This method takes rigorous account of the apparatus response function and is applicable to any shape of the latter, as well as to any values of the fluorescence decay times and correlation times. The performance of the method has been tested with data simulated from measured response functions corresponding to an air lamp and a high-pressure nitrogen lamp. The statistical standard errors of the anisotropy decay parameters were found to be smaller than the standard errors previously calculated for the moment method. A systematic error Δτ in the fluorescence decay time entailed an error Δθ in the correlation time such that Δθ/θ < Δτ/τ. With this method, good fits to experimental data have been achieved conveniently and accurately.

12.
Model-independent methods for reconstructing the nitroxide spin probe angular distribution of labeled oriented biological assemblies from electron spin resonance (ESR) spectra were investigated. We found that accurate probe angular distribution information could be obtained from the simultaneous consideration of a series of ESR spectra originating from a sample at differing tilt angles relative to the Zeeman magnetic field. Using simulated tilt series data sets, we developed consistent criteria for judging the reliability of the simulated fit to the data as a function of the free spectral parameters and thereby increased the significance of the model-independent reconstruction of the probe angular distribution derived from the fit. We also enhanced the angular resolution measurable with the model-independent methodology by increasing the rank of the order parameters that can be reliably deduced from a spectrum. This enhancement allows us to accurately deduce higher-resolution features of the spin probe distribution. Finally, we investigated the usefulness of fitting the tilt series data in multiple data sets, such that tilt series data from many identical sample preparations are fitted simultaneously. This method proved useful in rapidly reducing a large amount of data by eliminating redundant computations in the application of the enhanced model-independent analysis to identical sets of tilt series data. We applied the methodology developed here to ESR spectra from probe-labeled muscle fibers to study the orientation of myosin cross-bridges in fibers. This application is described in the accompanying paper.

13.
This article extends the study of the comparison between linear and nonlinear forms of the two widely used kinetic models, namely, pseudo-first-order and pseudo-second-order, by considering the binary biosorption of the basic dyes methylene blue and safranin onto pretreated rice husk in a batch system. The present investigation showed that nonlinear forms of pseudo-first-order and pseudo-second-order models were more suitable than the linear forms for fitting the experimental data. The sorption process was found to follow the pseudo-second-order kinetics. The results suggest that it is not appropriate to use the linear method in determining the kinetic parameters of a particular kinetic model. The nonlinear method is a better way to obtain the kinetic parameters than the linear method and thus it should be primarily adopted to determine the kinetic parameters.
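To make the linear/nonlinear contrast concrete, the sketch below fits an invented pseudo-second-order data set both by nonlinear regression on q_t = k*qe^2*t/(1 + k*qe*t) and via the common linearized form t/q_t = 1/(k*qe^2) + t/qe. All numbers are hypothetical, not the paper's measurements.

```python
# Nonlinear vs linearized fitting of the pseudo-second-order kinetic model.
import numpy as np
from scipy.optimize import curve_fit

def pso(t, qe, k):
    return k * qe ** 2 * t / (1 + k * qe * t)

rng = np.random.default_rng(7)
t = np.array([5, 10, 20, 30, 60, 90, 120, 180], dtype=float)   # contact time (min)
q_obs = pso(t, 25.0, 0.004) + rng.normal(0, 0.5, t.size)       # uptake (mg/g), invented

# Nonlinear fit of the untransformed model.
popt, _ = curve_fit(pso, t, q_obs, p0=[20, 0.01])

# Linearized form: t/q = 1/(k*qe^2) + t/qe, fitted by ordinary regression.
slope, intercept = np.polyfit(t, t / q_obs, 1)
qe_lin, k_lin = 1 / slope, slope ** 2 / intercept

print("nonlinear:  qe = %.2f, k = %.4f" % tuple(popt))
print("linearized: qe = %.2f, k = %.4f" % (qe_lin, k_lin))
```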

14.
Errors in the experimental baseline used to normalize dynamic light scattering data can seriously affect the size distribution resulting from the data analysis. A revised method, which incorporates the characteristics of this error into the size distribution algorithm CONTIN (Ruf 1989), is tested with experimental data of high statistical accuracy obtained from a sample of phospholipid vesicles. It is shown that the various commonly used ways of accumulating and normalizing dynamic light scattering data are associated with rather different normalization errors. As a consequence a variety of solutions differing in modality, as well as in width, are obtained on carrying out data analysis in the common way. It is demonstrated that a single monomodal solution is retrieved from all these data sets when the new method is applied, which in addition provides the corresponding baseline errors quantitatively. Furthermore, stable solutions are obtainable with data of lower statistical accuracy which results from measurements of shorter duration. The use of an additional parameter in data inversion reduces the occurrence of spurious peaks. This stabilizing effect is accompanied by larger uncertainties in the width of the size distribution. It is demonstrated that these uncertainties are reduced by nearly a factor of two on using the normalization error function instead of the ‘dust term’ option for the analysis of noisy data sets.

15.
A convenient method for evaluation of biochemical reaction rate coefficients and their uncertainties is described. The motivation for developing this method was the complexity of existing statistical methods for analysis of biochemical rate equations, as well as the shortcomings of linear approaches, such as Lineweaver-Burk plots. The nonlinear least-squares method provides accurate estimates of the rate coefficients and their uncertainties from experimental data. Linearized methods that involve inversion of data are unreliable since several important assumptions of linear regression are violated. Furthermore, when linearized methods are used, there is no basis for calculation of the uncertainties in the rate coefficients. Uncertainty estimates are crucial to studies involving comparisons of rates for different organisms or environmental conditions. The spreadsheet method uses weighted least-squares analysis to determine the best-fit values of the rate coefficients for the integrated Monod equation. Although the integrated Monod equation is an implicit expression of substrate concentration, weighted least-squares analysis can be employed to calculate approximate differences in substrate concentration between model predictions and data. An iterative search routine in a spreadsheet program is utilized to search for the best-fit values of the coefficients by minimizing the sum of squared weighted errors. The uncertainties in the best-fit values of the rate coefficients are calculated by an approximate method that can also be implemented in a spreadsheet. The uncertainty method can be used to calculate single-parameter (coefficient) confidence intervals, degrees of correlation between parameters, and joint confidence regions for two or more parameters. Example sets of calculations are presented for acetate utilization by a methanogenic mixed culture and trichloroethylene cometabolism by a methane-oxidizing mixed culture. An additional advantage of application of this method to the integrated Monod equation compared with application of linearized methods is the economy of obtaining rate coefficients from a single batch experiment or a few batch experiments rather than having to obtain large numbers of initial rate measurements. However, when initial rate measurements are used, this method can still be used with greater reliability than linearized approaches.
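The same idea can be sketched outside a spreadsheet. The code below is a simplified stand-in, not the authors' spreadsheet: it fits the no-growth form of the integrated Monod equation, Ks*ln(S0/S) + (S0 - S) = k*X*t, by weighted least squares, solving the implicit equation numerically for the predicted substrate concentration at each time. The data and values are invented.

```python
# Weighted least-squares fit of the (no-growth) integrated Monod equation,
# which is implicit in substrate concentration S (illustrative sketch only).
import numpy as np
from scipy.optimize import brentq, least_squares

S0, X = 20.0, 5.0                                   # initial substrate, biomass (assumed known)
t_obs = np.array([0.5, 1, 2, 3, 4, 6, 8], dtype=float)

def predict_S(times, k, Ks):
    out = []
    for t in times:
        g = lambda S: Ks * np.log(S0 / S) + (S0 - S) - k * X * t
        lo = 1e-12
        out.append(brentq(g, lo, S0) if g(lo) > 0 else lo)   # S ~ 0 if substrate exhausted
    return np.array(out)

rng = np.random.default_rng(8)
true_k, true_Ks = 1.2, 4.0
S_obs = predict_S(t_obs, true_k, true_Ks) * (1 + rng.normal(0, 0.03, t_obs.size))
sd = 0.03 * np.maximum(S_obs, 0.5)                  # assumed per-point standard deviations

def wres(p):
    k, Ks = p
    return (S_obs - predict_S(t_obs, k, Ks)) / sd   # weighted errors in S

fit = least_squares(wres, x0=[1.0, 1.0], bounds=([1e-3, 1e-3], [10, 50]))
J = fit.jac
cov = np.linalg.inv(J.T @ J)                        # approximate parameter covariance
print("k, Ks:", fit.x, "+/-", np.sqrt(np.diag(cov)))
```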

16.
The double Michaelis-Menten equation describes the reaction kinetics of two independent, saturable uptake mechanisms. The use of this equation to describe drug uptake has been reported several times in the literature, and several methods have been published to fit the equation to data. So far, however, confidence intervals on the fitted kinetic parameters have not been provided. We present a grid-search method for fitting the double Michaelis-Menten equation to kinetic uptake data, and a Monte-Carlo procedure for estimating confidence intervals on the fitted parameters. We show that the fitting problem is extremely ill-conditioned, and that very accurate data are required before any confidence can be placed in the fitted parameters.
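A Monte-Carlo procedure of the kind described can be sketched as follows (an illustrative stand-in rather than the authors' grid-search code; data and parameter values are invented): synthetic data sets are generated from the fitted curve with the estimated noise level, refitted, and percentile confidence intervals are read off the refitted parameters.

```python
# Parametric Monte-Carlo confidence intervals for a double Michaelis-Menten fit.
import numpy as np
from scipy.optimize import curve_fit

def double_mm(s, v1, k1, v2, k2):
    return v1 * s / (k1 + s) + v2 * s / (k2 + s)

rng = np.random.default_rng(9)
s = np.geomspace(0.1, 1000, 12)                     # invented concentration series
y = double_mm(s, 5.0, 1.0, 30.0, 200.0) + rng.normal(0, 0.5, s.size)

p_hat, _ = curve_fit(double_mm, s, y, p0=[3, 0.5, 20, 100], maxfev=20000)
resid_sd = np.std(y - double_mm(s, *p_hat), ddof=4)

boot = []
for _ in range(500):
    y_sim = double_mm(s, *p_hat) + rng.normal(0, resid_sd, s.size)
    try:
        p_b, _ = curve_fit(double_mm, s, y_sim, p0=p_hat, maxfev=20000)
        boot.append(p_b)
    except RuntimeError:                            # some synthetic sets fail to converge
        continue

boot = np.array(boot)
lo, hi = np.percentile(boot, [2.5, 97.5], axis=0)
for name, l, h in zip(["V1", "K1", "V2", "K2"], lo, hi):
    print(f"{name}: 95% CI [{l:.2f}, {h:.2f}]")
```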

17.
Deng X, Geng H, Ali H. Biosystems 2005, 81(2): 125-136
Reverse-engineering of gene networks using linear models often results in an underdetermined system because of excessive unknown parameters. In addition, the practical utility of linear models has remained unclear. We address these problems by developing an improved method, EXpression Array MINing Engine (EXAMINE), to infer gene regulatory networks from time-series gene expression data sets. EXAMINE takes advantage of sparse graph theory to overcome the excessive-parameter problem with an adaptive-connectivity model and fitting algorithm. EXAMINE also guarantees that the most parsimonious network structure will be found with its incremental adaptive fitting process. Compared to previous linear models, where a fully connected model is used, EXAMINE reduces the number of parameters by O(N), thereby increasing the chance of recovering the underlying regulatory network. The fitting algorithm increments the connectivity during the fitting process until a satisfactory fit is obtained. We performed a systematic study to explore the data mining ability of linear models. A guideline for using linear models is provided: if the system is small (3-20 elements), more than 90% of the regulation pathways can be determined correctly. For a large-scale system, either clustering is needed or it is necessary to integrate information in addition to the expression profiles. Coupled with the clustering method, we applied EXAMINE to rat central nervous system (CNS) development data with 112 genes. We were able to efficiently generate regulatory networks with statistically significant pathways that have been predicted previously.

18.
We develop a theoretical foundation for a time-series analysis method suitable for revealing the spectrum of diffusion coefficients in mixed Brownian systems, for which no prior knowledge of particle distinction is required. This method is directly relevant for particle tracking in biological systems, in which diffusion processes are often nonuniform. We transform Brownian data onto the logarithmic domain, in which the coefficients for individual modes of diffusion appear as distinct spectral peaks in the probability density. We refer to the method as the logarithmic measure of diffusion, or simply as the logarithmic measure. We provide a general protocol for deriving analytical expressions for the probability densities on the logarithmic domain. The protocol is applicable for any number of spatial dimensions with any number of diffusive states. The analytical form can be fitted to data to reveal multiple diffusive modes. We validate the theoretical distributions and benchmark the accuracy and sensitivity of the method by extracting multimodal diffusion coefficients from two-dimensional Brownian simulations of polydisperse filament bundles. Bundling the filaments allows us to control the system nonuniformity and hence quantify the sensitivity of the method. By exploiting the anisotropy of the simulated filaments, we generalize the logarithmic measure to rotational diffusion. By fitting the analytical forms to simulation data, we confirm the method’s theoretical foundation. An error analysis in the single-mode regime shows that the proposed method is comparable in accuracy to the standard mean-squared displacement approach for evaluating diffusion coefficients. For the case of multimodal diffusion, we compare the logarithmic measure against other, more sophisticated methods, showing that both model selectivity and extraction accuracy are comparable for small data sets. Therefore, we suggest that the logarithmic measure, as a method for multimodal diffusion coefficient extraction, is ideally suited for small data sets, a condition often confronted in the experimental context. Finally, we critically discuss the proposed benefits of the method and its information content.

19.
The boundary line model was proposed to interpret biological data sets in which one variable is a biological response (e.g. crop yield) to an independent variable (e.g. available water content of the soil). The upper (or lower) boundary on a plot of the dependent variable (ordinate) against the independent variable (abscissa) represents the limiting response of the dependent variable to the value of the independent variable. Although the concept has been widely used, the methods proposed to define the boundary line have been criticized for their ad hoc nature and lack of theoretical basis. In this article, we present a novel method for fitting the boundary line to a set of data. The method uses a censored probability distribution to interpret the data structure. The parameters of the distribution (and hence the boundary line parameters) are fitted using maximum likelihood, and related confidence intervals are deduced. The method is demonstrated using both simulated and real data sets.

20.
The method of generalized least squares (GLS) is used to assess the variance function for isothermal titration calorimetry (ITC) data collected for the 1:1 complexation of Ba²⁺ with 18-crown-6 ether. In the GLS method, the least squares (LS) residuals from the data fit are themselves fitted to a variance function, with iterative adjustment of the weighting function in the data analysis to produce consistency. The data are treated in a pooled fashion, providing 321 fitted residuals from 35 data sets in the final analysis. Heteroscedasticity (nonconstant variance) is clearly indicated. Data error terms proportional to q_i and q_i/v are well defined statistically, where q_i is the heat from the ith injection of titrant and v is the injected volume. The statistical significance of the variance function parameters is confirmed through Monte Carlo calculations that mimic the actual data set. For the data in question, which fall mostly in the range q_i = 100-2000 µcal, the contributions to the data variance from the terms in q_i² typically exceed the background constant term for q_i > 300 µcal and v < 10 µl. Conversely, this means that in reactions with q_i much less than this, heteroscedasticity is not a significant problem. Accordingly, in such cases the standard unweighted fitting procedures provide reliable results for the key parameters, K and ΔH°, and their statistical errors. These results also support an important earlier finding: in most ITC work on 1:1 binding processes, the optimal number of injections is 7-10, which is a factor of 3 smaller than the current norm. For high-q reactions, where weighting is needed for optimal LS analysis, tips are given for using the weighting option in the commercial software commonly employed to process ITC data.
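The iterative scheme described (fit the data, fit the squared residuals to a variance function, re-weight, repeat until consistent) can be sketched generically. The binding-type curve and variance function below are placeholders, not the ITC model or the variance function estimated in the paper.

```python
# Generic GLS sketch: iterate between a weighted data fit and a fit of the
# squared residuals to a variance function (illustrative model and data).
import numpy as np
from scipy.optimize import curve_fit

def model(x, a, b):
    return a * x / (b + x)                      # stand-in binding-type curve

def var_fn(yhat, c0, c1):
    return c0 + c1 * yhat ** 2                  # heteroscedastic variance: constant + proportional

rng = np.random.default_rng(10)
x = np.linspace(0.1, 10, 60)
y_true = model(x, 8.0, 2.0)
y = y_true + rng.normal(0, np.sqrt(0.05 + 0.01 * y_true ** 2))

w = np.ones_like(x)                             # start unweighted
for _ in range(10):
    p, _ = curve_fit(model, x, y, p0=[5, 1], sigma=1 / np.sqrt(w), absolute_sigma=False)
    r2 = (y - model(x, *p)) ** 2                # squared residuals
    v, _ = curve_fit(var_fn, model(x, *p), r2, p0=[0.01, 0.01],
                     bounds=([0, 0], [np.inf, np.inf]))
    w_new = 1.0 / np.maximum(var_fn(model(x, *p), *v), 1e-9)
    if np.allclose(w_new, w, rtol=1e-3):        # weights have converged
        break
    w = w_new

print("model parameters:", p)
print("variance function (c0, c1):", v)
```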
