Similar documents (20 results)
1.
Inferring pH from diatoms: a comparison of old and new calibration methods
Two new methods for inferring pH from diatoms are presented. Both are based on the observation that the relationships between diatom taxa and pH are often unimodal. The first method is maximum likelihood calibration based on Gaussian logit response curves of taxa against pH. The second is weighted averaging: in a lake with a particular pH, taxa with an optimum close to the lake pH will be most abundant, so an intuitively reasonable estimate of the lake pH is a weighted average of the pH optima of the species present. Optima and tolerances of diatom taxa were estimated from contemporary pH and proportional diatom counts in littoral-zone samples from 97 pristine soft-water lakes and pools in Western Europe. The optima showed a strong relation with Hustedt's pH preference groups. The two new methods were then compared with existing calibration methods on the basis of differences between inferred and observed pH in a test set of 62 additional samples taken between 1918 and 1983. The methods were ranked in order of performance as follows (in brackets: the standard error of inferred pH, in pH units): maximum likelihood (0.63) > weighted averaging (0.71) = multiple regression using pH groups (0.71) = the Gasse & Tekaia method (0.71) > Renberg & Hellberg's Index B (0.83) » multiple regression using taxa (2.2). These standard errors are larger than those usually obtained from surface-sediment samples; the difference may be due to seasonal variation and to the effects of other factors such as humus content. The maximum likelihood method is statistically rigorous and can in principle be extended to allow for additional environmental factors, but it is computationally intensive. The weighted averaging approach is a good approximation to the maximum likelihood method and is recommended as a practical and robust alternative.
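The weighted-averaging calibration step described above is simple enough to sketch in a few lines of Python. The taxon names, pH optima, and abundances below are invented for illustration, not taken from the study:

```python
def wa_infer_ph(abundances, optima):
    """Weighted-averaging calibration: the inferred lake pH is the
    abundance-weighted average of the pH optima of the taxa present."""
    total = sum(abundances.values())
    return sum(abundances[t] * optima[t] for t in abundances) / total

# Hypothetical taxa with pH optima estimated from a training set
optima = {"Eunotia": 5.2, "Tabellaria": 5.8, "Achnanthes": 7.1}
# Proportional counts of those taxa in one littoral-zone sample
abundances = {"Eunotia": 0.5, "Tabellaria": 0.3, "Achnanthes": 0.2}
inferred = wa_infer_ph(abundances, optima)  # abundance-weighted estimate, about 5.76 here
```

In practice the optima themselves come from a WA regression over the training lakes (the abundance-weighted mean pH of the lakes in which each taxon occurs), and a deshrinking step is usually applied; both are omitted from this sketch.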

2.
1. State space models are starting to replace simpler time series models in analyses of temporal dynamics of populations that are not perfectly censused. By simultaneously modelling both the dynamics and the observations, consistent estimates of population dynamical parameters may be obtained. For many data sets the distribution of observation errors is unknown, and error models are typically chosen in an ad hoc manner. 2. To investigate the influence of the choice of observation error on inferences, we analyse the dynamics of a replicated time series of red kangaroo surveys using a state space model with linear state dynamics. Surveys were performed through aerial counts, and Poisson, overdispersed Poisson, normal and log-normal distributions may all be adequate for modelling observation errors for the data. We fit each of these to the data and compare them using AIC. 3. The state space models were fitted with maximum likelihood methods using a recent importance sampling technique that relies on the Kalman filter. The method relaxes the assumption of Gaussian observation errors required by the basic Kalman filter. Matlab code for fitting linear state space models with Poisson observations is provided. 4. The ability of AIC to identify the correct observation model was investigated in a small simulation study. For the parameter values used in the study, the correct observation distribution could sometimes be identified without replicated observations, but model selection was prone to misclassification. When observations were replicated, on the other hand, the correct distribution could typically be identified. 5. Our results illustrate that inferences may differ markedly depending on the observation distribution used, suggesting that choosing an adequate observation model can be critical. Model selection and simulations show that, for the models and parameter values in this study, a suitable observation model can typically be identified if observations are replicated. Model selection and replication of observations therefore provide a potential solution when the observation distribution is unknown.
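The AIC comparison of candidate observation models can be illustrated with a minimal sketch. The count series below is invented, the candidate set is reduced to plain Poisson versus normal (the paper also considers overdispersed Poisson and log-normal), and both models are fitted by maximum likelihood to the raw counts rather than through a state space filter:

```python
import math

def aic(loglik, n_params):
    """Akaike information criterion: 2k - 2 ln L, lower is better."""
    return 2 * n_params - 2 * loglik

def loglik_poisson(ys, lam):
    return sum(y * math.log(lam) - lam - math.lgamma(y + 1) for y in ys)

def loglik_normal(ys, mu, sd):
    return sum(-0.5 * math.log(2 * math.pi * sd * sd)
               - (y - mu) ** 2 / (2 * sd * sd) for y in ys)

counts = [3, 5, 4, 6, 2, 5, 4, 3]                       # illustrative survey counts
lam = sum(counts) / len(counts)                          # Poisson MLE (1 parameter)
mu = lam                                                 # normal MLEs (2 parameters)
var = sum((y - lam) ** 2 for y in counts) / len(counts)
aic_pois = aic(loglik_poisson(counts, lam), 1)
aic_norm = aic(loglik_normal(counts, mu, var ** 0.5), 2)
```

For this deliberately underdispersed series (variance well below the mean) the normal model attains the lower AIC despite its extra parameter, which is the kind of discrimination between observation models the study examines.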

3.
We describe a FORTRAN computer program for fitting the logistic distribution function (formula: see text), where x represents dose or time, to dose-response data. The program determines both weighted least squares and maximum likelihood estimates for the parameters alpha and beta. It also calculates the standard errors of alpha and beta under both estimation methods, as well as the median lethal dose (LD50) and its standard error. Dose-response curves found by both fitting methods can be plotted, as well as the 95% confidence bands for these lines.
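Since the FORTRAN program itself is not reproduced here, the maximum-likelihood part can be sketched in Python as a plain gradient ascent on the binomial log-likelihood of the two-parameter logistic P(x) = 1/(1 + exp(-(alpha + beta*x))). The dose-response data are invented, and the program's actual algorithms (weighted least squares, its ML iteration, the standard-error formulas) may well differ:

```python
import math

def fit_logistic(doses, n, r, iters=50000, lr=0.01):
    """ML fit of P(x) = 1/(1 + exp(-(a + b*x))) to grouped dose-response
    data (n[i] subjects, r[i] responders at dose doses[i]) by gradient
    ascent on the binomial log-likelihood (concave, so this converges)."""
    a, b, total = 0.0, 0.0, float(sum(n))
    for _ in range(iters):
        ga = gb = 0.0
        for x, ni, ri in zip(doses, n, r):
            p = 1.0 / (1.0 + math.exp(-(a + b * x)))
            ga += ri - ni * p         # d logL / d alpha
            gb += (ri - ni * p) * x   # d logL / d beta
        a += lr * ga / total
        b += lr * gb / total
    return a, b, -a / b               # alpha, beta, and LD50 = -alpha/beta

alpha, beta, ld50 = fit_logistic([1, 2, 3, 4, 5], [20] * 5, [2, 6, 10, 14, 18])
```

For these symmetric toy data the fitted LD50 lands at dose 3, the dose at which half the subjects respond.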

4.
Response curves are presented for 15 species of vascular plants, bryophytes, and lichens relative to the water-table gradient in a boreal Norwegian mire. The gradient is scaled in two ways: position of the median water-table in cm below the surface of the bottom layer, and in units of compositional turnover (rescaled by DCA). The two scalings produce quite different curves above the maximum water-level, which corresponds to a median water-level 10–14 cm below the surface of the bottom layer. The significant drop in compositional turnover above this level can be ascribed to vertical differences in water-table having a lesser effect on plants in aerated hummocks than on plants in temporarily water-logged hollows. Skewness of response curves was reduced by gradient rescaling, most strongly after removal of the influence of other gradients. It is argued that response curves will generally be Gaussian if: (1) the response is with respect to a dominant ecological factor, (2) the range of variation along the gradient is sufficient, (3) the distribution of samples is adequate, and (4) the gradient is scaled in units of compositional turnover.

5.
The invertebrate fauna has been surveyed in twenty-one unlimed, generally acidic river systems in Norway. The data consist of 180 samples and 127 invertebrate taxa with associated water chemistry data (pH, calcium, acid neutralizing capacity, total aluminium, and conductivity). Multivariate numerical methods are used to quantify the relationships between aquatic invertebrates and water chemistry. Detrended canonical correspondence analysis (DCCA) shows one dominant axis of variation with high correlations for pH and aluminium. DCCA axis 2 is significantly correlated with calcium. The ability of invertebrates to predict pH is explored by means of weighted averaging (WA) regression and calibration and weighted averaging partial least squares regression (WA-PLS). The performance of the methods is reported in terms of the root mean square error of prediction (RMSEP) of (observed pH − inferred pH). Bootstrapping and leave-one-out jackknifing are used as cross-validation procedures. The predictive abilities of invertebrates are good (bootstrap RMSEP for WA = 0.309 pH units). Comparison with diatom studies shows that invertebrates are as good predictors of modern pH as diatoms. The jackknife RMSEP shows that WA-PLS improves the predictive abilities. Indicator taxa for pH are found by Gaussian regression. Anisoptera, Agrypnia obsoleta, Leptophlebia marginata, Sialis lutaria, and Zygoptera have significant sigmoidal curves whose abundances increase with decreasing pH. Cyrnus flavidus shows a significant unimodal response and has an estimated optimum in the acid part of the gradient. Isoperla spp. and Ostracoda show significant sigmoidal responses where abundances increase with increasing pH. Amphinemura borealis, Diura nanseni, Isoperla grammatica, I. obscura, and Siphonoperla burmeisteri show significant unimodal responses and have high pH optima. Many taxa do not have statistically significant unimodal or sigmoidal curves but are found by WA to be characteristic of either high or low pH. These results suggest that a combined use of Gaussian regression and direct gradient analysis is needed to get a full overview of potential indicator taxa.
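The leave-one-out (jackknife) RMSEP used above to score the calibration methods can be sketched for plain WA regression and calibration. The abundance matrix and pH values are toy data; WA-PLS, bootstrapping, and deshrinking are omitted:

```python
import math

def wa_optima(abund, env):
    """WA regression: each taxon's optimum is the abundance-weighted
    mean of the environmental values of the training samples."""
    n_sp = len(abund[0])
    opt = []
    for j in range(n_sp):
        w = sum(row[j] for row in abund)
        opt.append(sum(row[j] * e for row, e in zip(abund, env)) / w)
    return opt

def wa_infer(sample, opt):
    """WA calibration: inferred value is the optimum-weighted average."""
    return sum(a * o for a, o in zip(sample, opt)) / sum(sample)

def rmsep_loo(abund, env):
    """Leave-one-out RMSEP of (observed - inferred)."""
    errs = []
    for i in range(len(env)):
        opt = wa_optima(abund[:i] + abund[i + 1:], env[:i] + env[i + 1:])
        errs.append(env[i] - wa_infer(abund[i], opt))
    return math.sqrt(sum(e * e for e in errs) / len(errs))

# Toy training set: 4 samples x 2 taxa, with observed pH per sample
abund = [[0.9, 0.1], [0.6, 0.4], [0.4, 0.6], [0.1, 0.9]]
env = [4.0, 5.0, 6.0, 7.0]
rmsep = rmsep_loo(abund, env)
```

Each fold refits the taxon optima without the held-out sample, so the error estimate is honest in the same sense as the jackknife RMSEP reported in the study.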

6.
Doubling time has been widely used to represent the growth pattern of cells. A traditional method for finding the doubling time is to work with gray-scaled cells, for which a log-transformed scale is used. As an alternative statistical method, the log-linear model was recently proposed, in which actual cell numbers are used instead of the transformed gray-scaled cells. In this paper, I extend the log-linear model and propose the extended log-linear model. This model is designed for extra-Poisson variation, where the log-linear model produces a less appropriate estimate of the doubling time. Moreover, I compare the statistical properties of the gray-scaled method, the log-linear model, and the extended log-linear model. For this purpose, I perform a Monte Carlo simulation study with three data-generating models: the additive error model, the multiplicative error model, and the overdispersed Poisson model. The simulation study shows that the gray-scaled method depends strongly on the normality assumption for the gray-scaled cells; hence, this method is appropriate when the error model is multiplicative with log-normally distributed errors. However, it is less efficient for other types of error distributions, especially when the error model is additive or the errors follow the Poisson distribution, and the estimated standard error for the doubling time is then inaccurate. The log-linear model was found to be efficient when the errors follow the Poisson distribution or nearly so, but its efficiency decreased, relative to the extended log-linear model, as overdispersion increased. When the error model is additive or multiplicative with Gamma-distributed errors, the log-linear model is more efficient than the gray-scaled method. The extended log-linear model performs well overall for all three data-generating models. A loss of efficiency of the extended log-linear model is observed only when the error model is multiplicative with log-normally distributed errors, where the gray-scaled method is appropriate; even in this case, however, the extended log-linear model is more efficient than the log-linear model.
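The traditional gray-scaled baseline the paper compares against amounts to ordinary least squares on log-transformed counts, with the doubling time recovered from the fitted slope. This is a sketch of that baseline with invented counts; the log-linear and extended log-linear models discussed above would instead be fitted as Poisson or quasi-Poisson regressions on the raw counts:

```python
import math

def doubling_time_ols(times, counts):
    """Gray-scaled estimate: OLS regression of ln(count) on time gives a
    growth rate b per unit time; the doubling time is ln(2) / b."""
    n = len(times)
    y = [math.log(c) for c in counts]
    tbar, ybar = sum(times) / n, sum(y) / n
    b = (sum((t - tbar) * (yi - ybar) for t, yi in zip(times, y))
         / sum((t - tbar) ** 2 for t in times))
    return math.log(2) / b

# Cells doubling once per time unit: the estimate recovers 1.0 exactly
dt = doubling_time_ols([0, 1, 2, 3], [100, 200, 400, 800])
```

As the abstract notes, this estimator implicitly assumes multiplicative log-normal errors; for additive or Poisson errors its standard error is unreliable.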

7.
In this paper, we consider selection based on the best predictor of animal additive genetic values in Gaussian linear mixed models, threshold models, Poisson mixed models, and log-normal frailty models for survival data (including models with time-dependent covariates with associated fixed or random effects). In the different models, expressions are given for prediction error variance, accuracy of selection, and expected response to selection on the additive genetic scale and on the observed scale; where closed forms cannot be found, unbiased estimates are given instead. The expressions given for non-Gaussian traits are generalisations of the well-known formulas for Gaussian traits and reflect, for Poisson mixed models and frailty models for survival data, the hierarchical structure of the models. In general, the ratio of the additive genetic variance to the total variance in the Gaussian part of the model (the heritability on the normally distributed level of the model), or a generalised version of heritability, plays a central role in these formulas.

8.
Question: Species optima or indicator values are frequently used to predict environmental variables from species composition. The present study focuses on whether predictions can be improved by using species environmental amplitudes instead of single values representing species optima. Location: Semi-natural, deciduous hardwood forests of northwestern Germany. Methods: Based on a data set of 558 relevés, species responses (presence/absence) to pH were modelled with Huisman-Olff-Fresco (HOF) regression models. Species amplitudes were derived from the response curves using three different methods. To predict pH from vegetation, a maximum amplitude overlap method was applied. For comparison, predictions from several established methods, i.e. maximum likelihood using present and absent species, maximum likelihood using present species only, mean weighted averages, and mean Ellenberg indicator values, were calculated. The predictive success (squared Pearson's r and root mean square error of prediction) was evaluated using an independent data set of 151 relevés. Results: Predictions based on amplitudes defined by the threshold maximising Cohen's κ yield the best results of all amplitude definitions (R² = 0.75, RMSEP = 0.52). Provided there is an even distribution of the environmental variable, amplitudes defined by predicted probability exceeding prevalence are also suitable (R² = 0.76, RMSEP = 0.55). The prediction success is comparable to maximum likelihood (present species only) and, after rescaling, to mean weighted averages. Predicted values show good linearity against observed pH values, as opposed to the curvilinear relationship of mean Ellenberg indicator values; transformation or rescaling of the predicted values is not required. Conclusions: Species amplitudes given by a minimum and maximum boundary for each species can be used to efficiently predict environmental variables from species composition. The predictive success is superior to mean Ellenberg indicator values and comparable to mean indicator values based on species weighted averages.
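The maximum amplitude overlap idea can be sketched in a few lines: for each candidate pH value on a grid, count how many species present at a site have that value inside their amplitude interval, and predict the value with the greatest overlap. The species names and amplitude boundaries below are invented, and the study's actual implementation (three amplitude definitions, tie handling) may differ:

```python
def predict_overlap(amplitudes, present, grid):
    """Predict the candidate value contained in the amplitudes [lo, hi]
    of the greatest number of species present at the site."""
    def overlap(v):
        return sum(lo <= v <= hi for sp, (lo, hi) in amplitudes.items()
                   if sp in present)
    return max(grid, key=overlap)   # first grid value with maximal overlap

# Hypothetical pH amplitudes derived from HOF response curves
amplitudes = {"Oxalis": (3.5, 5.5), "Stellaria": (5.0, 6.5),
              "Mercurialis": (6.0, 7.5)}
grid = [3.0 + 0.1 * i for i in range(50)]
pred = predict_overlap(amplitudes, {"Oxalis", "Stellaria"}, grid)
```

With these toy amplitudes both present species overlap only on the interval [5.0, 5.5], so the prediction falls there.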

9.
Aim: The goals of this study were to: (1) compare water conductivity and pH as proxy measures of mineral richness in relation to mollusc assemblages in fens; (2) examine the patterns of mollusc species richness along the gradient of mineral richness based on these factors; (3) model species-response curves and analyse calcicole-calcifuge behaviour of molluscs; and (4) compare the results with those from other studies of non-marine mollusc ecology. Location: Altogether, 135 treeless spring-fen sites were sampled within the Western Carpathians (eastern Czech Republic, north-west Slovakia and southern Poland; the overall extent of the study area was 12,000 km²). Methods: Mollusc communities were recorded quantitatively from a homogeneous area of 16 m². Water conductivity and pH were measured in the field. The patterns of local species diversity along selected gradients, and species-response curves, were modelled using generalized linear models (GLM) and generalized additive models (GAM), both with the Poisson distribution. Results: When the most acid sites (practically free of molluscs) were excluded, conductivity expressed the sites' mineral richness and base saturation across the entire gradient, in contrast to pH; in the base-rich sites, pH did not correlate with mineral richness. A unimodal response of local species diversity to mineral richness (expressed as conductivity) was found. In the extremely mineral-rich, tufa-forming sites (conductivity > 600 μS cm⁻¹) a decrease in species diversity was encountered. Response curves of the most common species showed clear differentiation of their niches. Significant models of either unimodal or monotonic form were fitted for 18 of the 30 species analysed. Species showed five types of calcicole-calcifuge behaviour: (1) a decreasing monotonic response curve and a preference for the really acid sites; (2) a skewed unimodal response curve with the optimum shifted towards the slightly acid sites; (3) a symmetrical unimodal response curve with the optimum in the base-rich sites, with no or slight tufa precipitation; (4) a skewed unimodal response curve with the optimum shifted to the more mineral-rich sites; and (5) an increasing monotonic response curve, with the optimum in the extremely base-rich sites with strong tufa precipitation. Main conclusions: Conductivity is the only reliable proxy measure of mineral richness across the entire gradient, within the confines of this study. This information is of great ecological significance in studies of fen mollusc communities. Species richness does not increase with increasing mineral richness along the entire gradient: only a few species are able to dwell in the extremely base-rich sites. The five types of calcicole-calcifuge behaviour seen in species living in fens have a wider application: data published so far suggest they are also applicable to mollusc communities in other habitats.

10.
The characteristic sequences PS01095 and PS00232 of family 18 chitinases were numerically encoded using the Z-scale method, and the resulting data set was analysed by stepwise regression to build a mathematical model relating chitinase characteristic sequences to optimal pH. The best predictive performance was obtained with a model correlation coefficient of R = 0.964 (significance level P < 0.001). The mean absolute percentage error of the fitted pH values was 0.05%, and the model also predicted well, with a mean absolute prediction error of 0.26 pH units, outperforming a support vector machine model based on chitinase amino acid composition.

11.
We propose a state space model for analyzing equally or unequally spaced longitudinal count data with serial correlation. With a log link function, the mean of the Poisson response variable is a nonlinear function of the fixed and random effects. The random effects are assumed to be generated from a Gaussian first-order autoregression (AR(1)); in this case, the mean of the observations has a log-normal distribution. We use a combination of linear and nonlinear methods to take advantage of the Gaussian process embedded in a nonlinear function. The state space model uses a modified Kalman filter recursion to estimate the mean and variance of the AR(1) random error given the previous observations, and the marginal likelihood is approximated by numerically integrating out the AR(1) random error. Simulation studies with different sets of parameters show that the state space model performs well. The model is applied to epileptic seizure data and primary care visits data. Missing and unequally spaced observations are handled naturally with this model.
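The data-generating side of such a model can be sketched as follows: a latent Gaussian AR(1) process drives the log-mean, and counts are Poisson given that mean. All parameter values are illustrative, and the paper's full model also includes fixed covariate effects, which are reduced here to a single intercept:

```python
import math
import random

def simulate_poisson_ar1(n, beta0=1.0, phi=0.6, sigma=0.3, seed=1):
    """Simulate the latent process and observations:
    u_t = phi * u_{t-1} + e_t,  e_t ~ N(0, sigma^2)   (Gaussian AR(1) state)
    y_t ~ Poisson(exp(beta0 + u_t))                   (log-link observation)"""
    rng = random.Random(seed)
    u, ys = 0.0, []
    for _ in range(n):
        u = phi * u + rng.gauss(0.0, sigma)
        lam = math.exp(beta0 + u)
        # Poisson draw by inversion of the CDF (adequate for modest lam)
        k, p = 0, math.exp(-lam)
        cum, target = p, rng.random()
        while cum < target and k < 1000:
            k += 1
            p *= lam / k
            cum += p
        ys.append(k)
    return ys

counts = simulate_poisson_ar1(2000)
```

Fitting such a model is the hard part (the modified Kalman recursion and the numerical integration described above); simulation like this is how the paper's performance studies generate test data.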

12.
Abstract. Generalized additive, generalized linear, and classification tree models were developed to predict the distribution of 20 species of chaparral and coastal sage shrubs within the southwest ecoregion of California. Mapped explanatory variables included bioclimatic attributes related to primary environmental regimes: average annual precipitation, minimum temperature of the coldest month, maximum temperature of the warmest month, and topographically distributed potential solar insolation of the wettest quarter (winter) and of the growing season (spring). Slope angle (related to soil depth) and the geographic coordinates of each observation were also tested for significance. Models were parameterized and evaluated based on species presence/absence data from 906 plots surveyed on National Forest lands. Although all variables were significant in at least one of the species' models, models based only on the bioclimatic variables predicted species presence with 3–26% error. While error would undoubtedly be greater if the models were evaluated using independent data, the results indicate that these models are useful for predictive mapping, i.e. for interpolating species distribution data within the ecoregion. All three methods produced models with similar accuracy for a given species: GAMs were useful for exploring the shape of the response functions, GLMs allowed those response functions to be parameterized and their significance tested, and classification trees, while sometimes difficult to interpret, yielded the lowest prediction errors (lower by 3–5%).

13.
We present a method for characterizing the free-energy and affinity distributions of a heterogeneous population of molecules interacting with a homogeneous population of ligands, by deriving expressions for the moments as functions of experimental binding curve characteristics and then constructing the distribution as an expansion over a Gaussian basis set. Although the method provides the complete distribution in principle, in practice it is restricted by experimental noise, inaccuracies in data fitting, and the severity with which the distribution deviates from a Gaussian. Limitations imposed by experimental inaccuracies and by the requirement of an appropriate analytic function for data fitting were evaluated by Monte Carlo simulations of binding experiments with various degrees of error in the data: a distribution was assumed, binding curves with random errors were generated, and the technique was applied in order to determine the extent to which the characteristics of the assumed distribution could be recovered. Typical inaccuracies in the first two moments fell within experimental error, whereas inaccuracies in the third and fourth were generally larger than the standard deviations in the data. The accuracy of these higher-order moments was invariant for experimental errors ranging from 2 to 10% and may thus be limited, within this range, primarily by the curve-fitting procedure. The other aspect of the problem, accurate inference of the distribution, is limited in part by inaccuracies in the moments but more importantly by the extent to which the distribution deviates from a Gaussian. The extensive statistical literature on the problem of inference enables the delineation of specific criteria for estimating the efficiency of construction, as well as for deciding whether certain features of the inferred distribution, such as bimodality, are artifacts of the procedure. In spite of the limitations of the method, the results indicate that the mean and standard deviation are obtainable with greater accuracy than by a Sipsian analysis. This difference is particularly important when the distribution is narrow and width detection is beyond the sensitivity of the Sips plot. The method should be more accurate than the latter as an assay for homogeneity as well as for characterizing the moments, though equally easy to apply.

14.
A new method based on Taylor dispersion has been developed that enables an analyte gradient to be titrated over a ligand-coated surface for kinetic/affinity analysis of interactions from a minimal number of injections. Taylor dispersion injections (TDi) generate concentration ranges spanning more than four orders of magnitude and enable the analyte diffusion coefficient to be reliably estimated as a fitted parameter when fitting binding interaction models. A numerical model based on finite element analysis, Monte Carlo simulations, and statistical profiling was used to compare the Taylor dispersion method with standard fixed concentration injections (FCI) in terms of parameter correlation, linearity of parameter error space, and global versus local model fitting. A dramatic decrease in parameter correlations was observed for TDi curves relative to curves from standard fixed concentration injections when surface saturation was achieved. In FCI the binding progress is recorded with respect to injection time, whereas in TDi a second time dependency encoded in the analyte gradient increases the resolving power. This greatly lowers the dependence of all parameters on each other and on experimental interferences. When model parameters were fitted locally, the performance of TDi remained comparable to global model fitting, whereas fixed concentration binding response curves yielded unreliable parameter estimates.

15.
Schafer DW. Biometrics 2001, 57(1):53-61.
This paper presents an EM algorithm for semiparametric likelihood analysis of linear, generalized linear, and nonlinear regression models with measurement errors in explanatory variables. A structural model is used in which probability distributions are specified for (a) the response and (b) the measurement error. A distribution is also assumed for the true explanatory variable but is left unspecified and is estimated by nonparametric maximum likelihood. For various types of extra information about the measurement error distribution, the proposed algorithm makes use of available routines that would be appropriate for likelihood analysis of (a) and (b) if the true x were available. Simulations suggest that the semiparametric maximum likelihood estimator retains a high degree of efficiency relative to the structural maximum likelihood estimator based on correct distributional assumptions and can outperform maximum likelihood based on an incorrect distributional assumption. The approach is illustrated on three examples with a variety of structures and types of extra information about the measurement error distribution.

16.
When the observed data are contaminated with errors, standard two-sample testing approaches that ignore measurement errors may produce misleading results, including a higher type-I error rate than the nominal level. To tackle this inconsistency, a nonparametric test is proposed for testing equality of two distributions when the observed contaminated data follow the classical additive measurement error model. The proposed test takes into account the presence of errors in the observed data, and the test statistic is defined in terms of the (deconvoluted) characteristic functions of the latent variables. The proposed method is applicable to a wide range of scenarios, as no parametric restrictions are imposed on either the distribution of the underlying latent variables or the distribution of the measurement errors. The asymptotic null distribution of the test statistic is derived; it is given by an integral of a squared Gaussian process with a complicated covariance structure. For data-based calibration of the test, a new nonparametric bootstrap method is developed under the two-sample measurement error framework and its validity is established. The finite-sample performance of the proposed test is investigated through simulation studies, and the results show superior performance of the proposed method over standard tests that exhibit inconsistent behavior. Finally, the proposed method was applied to real data sets from the National Health and Nutrition Examination Survey. An R package, MEtest, is available through CRAN.

17.
In quantitative biology, observed data are fitted to a model that captures the essence of the system under investigation in order to obtain estimates of the parameters of the model, as well as their standard errors and interactions. The fitting is best done by the method of maximum likelihood, though least-squares fits are often used as an approximation because the calculations are perceived to be simpler. Here Brian Williams and Chris Dye argue that the method of maximum likelihood is generally preferable to least squares, giving the best estimates of the parameters for data with any given error distribution, and that the calculations are no more difficult than for least-squares fitting. They offer a relatively simple explanation of the method and describe its implementation using examples from leishmaniasis epidemiology.
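A one-line illustration of the point being argued: for a location parameter, maximising the likelihood under a Gaussian error model reproduces the least-squares answer (the mean), while a Laplace (double-exponential) error model yields the least-absolute-deviation answer (the median), so the "best" estimate depends on the assumed error distribution. The data are invented and include one outlier to make the contrast visible:

```python
import statistics

data = [1.0, 2.0, 2.5, 3.0, 9.0]       # illustrative data with one outlier
mle_gaussian = statistics.mean(data)     # argmin of sum of squared residuals
mle_laplace = statistics.median(data)    # argmin of sum of absolute residuals
```

The two maximum-likelihood estimates differ by a full unit here; under Gaussian errors maximum likelihood and least squares coincide, which is why least squares is only an approximation when the true error distribution is something else.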

18.
A method of fluorescence anisotropy decay analysis is described in this work. The transient anisotropy r_ex(t) measured in a photon-counting pulse fluorimeter is fitted, by a nonlinear least-squares procedure, to the ratio of convolutions of the apparatus response function g(t) with sums of appropriate exponential functions. This method takes the apparatus response function rigorously into account and is applicable to any shape of the latter, as well as to any values of the fluorescence decay times and correlation times. The performance of the method has been tested with data simulated from measured response functions corresponding to an air lamp and a high-pressure nitrogen lamp. The statistical standard errors of the anisotropy decay parameters were found to be smaller than the standard errors previously calculated for the moment method. A systematic error δτ in the fluorescence decay time entailed an error Δθ in the correlation time such that Δθ/θ < δτ/τ. With this method, good fits of experimental data have been achieved very conveniently and accurately.

19.
Repeatability (more precisely the common measure of repeatability, the intra‐class correlation coefficient, ICC) is an important index for quantifying the accuracy of measurements and the constancy of phenotypes. It is the proportion of phenotypic variation that can be attributed to between‐subject (or between‐group) variation. As a consequence, the non‐repeatable fraction of phenotypic variation is the sum of measurement error and phenotypic flexibility. There are several ways to estimate repeatability for Gaussian data, but there are no formal agreements on how repeatability should be calculated for non‐Gaussian data (e.g. binary, proportion and count data). In addition to point estimates, appropriate uncertainty estimates (standard errors and confidence intervals) and statistical significance for repeatability estimates are required regardless of the types of data. We review the methods for calculating repeatability and the associated statistics for Gaussian and non‐Gaussian data. For Gaussian data, we present three common approaches for estimating repeatability: correlation‐based, analysis of variance (ANOVA)‐based and linear mixed‐effects model (LMM)‐based methods, while for non‐Gaussian data, we focus on generalised linear mixed‐effects models (GLMM) that allow the estimation of repeatability on the original and on the underlying latent scale. We also address a number of methods for calculating standard errors, confidence intervals and statistical significance; the most accurate and recommended methods are parametric bootstrapping, randomisation tests and Bayesian approaches. We advocate the use of LMM‐ and GLMM‐based approaches mainly because of the ease with which confounding variables can be controlled for. Furthermore, we compare two types of repeatability (ordinary repeatability and extrapolated repeatability) in relation to narrow‐sense heritability. 
This review serves as a collection of guidelines and recommendations for biologists to calculate repeatability and heritability from both Gaussian and non‐Gaussian data.
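Of the three approaches for Gaussian data, the ANOVA-based one is the easiest to sketch. For equal-sized groups, repeatability is ICC = (MS_between − MS_within) / (MS_between + (n − 1) · MS_within). The measurements below are invented, and the LMM/GLMM estimation, non-Gaussian extensions, and uncertainty estimates discussed above are beyond this sketch:

```python
def icc_anova(groups):
    """One-way ANOVA-based repeatability (ICC) for k groups (subjects)
    of n repeated measurements each:
    ICC = (MS_b - MS_w) / (MS_b + (n - 1) * MS_w)."""
    k, n = len(groups), len(groups[0])
    grand = sum(sum(g) for g in groups) / (k * n)
    means = [sum(g) / n for g in groups]
    ms_b = n * sum((m - grand) ** 2 for m in means) / (k - 1)
    ms_w = sum((x - m) ** 2
               for g, m in zip(groups, means) for x in g) / (k * (n - 1))
    return (ms_b - ms_w) / (ms_b + (n - 1) * ms_w)

# Three subjects measured twice; measurement noise is small relative to
# between-subject differences, so repeatability is close to 1
r = icc_anova([[10.0, 10.1], [20.0, 19.9], [30.0, 30.2]])
```

When within-group (residual) variation dominates instead, the same formula returns a value near zero, reflecting a phenotype that is mostly measurement error or flexibility.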

20.
Pennello GA, Devesa SS, Gail MH. Biometrics 1999, 55(3):774-781.
Commonly used methods for depicting geographic variation in cancer rates are based on rankings. They identify where the rates are high and low but indicate neither the magnitude of the rates nor their variability. Yet such measures of variability may be useful in suggesting which types of cancer warrant further analytic studies of localized risk factors. We consider a mixed effects model in which the logarithm of the mean Poisson rate is additive in fixed stratum effects (e.g., age effects) and in logarithms of random relative risk effects associated with geographic areas. These random effects are assumed to follow a gamma distribution with unit mean and variance 1/alpha, similar to Clayton and Kaldor (1987, Biometrics 43, 671-681). We present maximum likelihood and method-of-moments estimates with standard errors for inference on alpha^(-1/2), the relative risk standard deviation (RRSD). The moment estimates rely on only the first two moments of the Poisson and gamma distributions but have larger standard errors than the maximum likelihood estimates. We compare these estimates with other measures of variability. Several examples suggest that the RRSD estimates have advantages compared to other measures of variability.
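A sketch of the method-of-moments side, with invented observed/expected counts (the paper's actual estimator may weight areas differently): under O_i ~ Poisson(E_i·θ_i) with θ_i ~ Gamma(mean 1, variance 1/alpha), the variance of the standardized ratio is Var(O_i/E_i) = 1/E_i + 1/alpha, so 1/alpha can be estimated by subtracting the mean Poisson contribution from the sample variance of the observed-to-expected ratios:

```python
def rrsd_moments(obs, exp):
    """Method-of-moments RRSD = alpha^(-1/2): subtract the average Poisson
    variance contribution mean(1/E_i) from the sample variance of O_i/E_i,
    truncating at zero when the ratios are less variable than Poisson."""
    ratios = [o / e for o, e in zip(obs, exp)]
    m = sum(ratios) / len(ratios)
    var = sum((r - m) ** 2 for r in ratios) / (len(ratios) - 1)
    inv_alpha = max(var - sum(1.0 / e for e in exp) / len(exp), 0.0)
    return inv_alpha ** 0.5

# Areas whose rates match expectation exactly imply zero extra-Poisson spread
print(rrsd_moments([10, 20, 30], [10.0, 20.0, 30.0]))
```

This uses only the first two moments of the Poisson and gamma distributions, which is the source of both its simplicity and its larger standard errors relative to maximum likelihood.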
