Similar articles (20 results)
1.
Clinical prediction models play a key role in risk stratification, therapy assignment and many other fields of medical decision making. Before they can enter clinical practice, their usefulness has to be demonstrated using systematic validation. Methods to assess their predictive performance have been proposed for continuous, binary, and time-to-event outcomes, but the literature on validation methods for discrete time-to-event models with competing risks is sparse. The present paper tries to fill this gap and proposes new methodology to quantify discrimination, calibration, and prediction error (PE) for discrete time-to-event outcomes in the presence of competing risks. In our case study, the goal was to predict the risk of ventilator-associated pneumonia (VAP) attributed to Pseudomonas aeruginosa in intensive care units (ICUs). Competing events are extubation, death, and VAP due to other bacteria. The aim of this application is to validate complex prediction models developed in previous work on more recently available validation data.  相似文献   
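
For orientation, the quantities such predictions are built from can be written as the discrete-time cause-specific hazard and the resulting cumulative incidence function (a standard formulation, not necessarily the paper's notation):

\[
\lambda_j(t \mid x) = P(T = t,\ \varepsilon = j \mid T \ge t,\ x), \qquad
F_j(t \mid x) = \sum_{s \le t} \lambda_j(s \mid x) \prod_{u < s}\Bigl(1 - \sum_k \lambda_k(u \mid x)\Bigr),
\]

where \(\varepsilon\) records which of the competing events (here extubation, death, VAP due to other bacteria, or P. aeruginosa VAP) occurred at time \(T\); discrimination, calibration, and prediction error are then assessed on the predicted \(F_j(t \mid x)\) at the horizons of interest.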

2.
Inferring the demographic history of species and their populations is crucial to understand their contemporary distribution, abundance and adaptations. The high computational overhead of likelihood‐based inference approaches severely restricts their applicability to large data sets or complex models. In response to these restrictions, approximate Bayesian computation (ABC) methods have been developed to infer the demographic past of populations and species. Here, we present the results of an evaluation of the ABC‐based approach implemented in the popular software package diyabc using simulated data sets (mitochondrial DNA sequences, microsatellite genotypes and single nucleotide polymorphisms). We simulated population genetic data under five different simple, single‐population models to assess the model recovery rates as well as the bias and error of the parameter estimates. The ability of diyabc to recover the correct model was relatively low (0.49): 0.6 for the simplest models and 0.3 for the more complex models. The recovery rate improved significantly when reducing the number of candidate models from five to three (from 0.57 to 0.71). Among the parameters of interest, the effective population size was estimated at a higher accuracy compared to the timing of events. Increased amounts of genetic data did not significantly improve the accuracy of the parameter estimates. Some gains in accuracy and decreases in error were observed for scaled parameters (e.g., Neμ) compared to unscaled parameters (e.g., Ne and μ). We concluded that diyabc ‐based assessments are not suited to capture a detailed demographic history, but might be efficient at capturing simple, major demographic changes.  相似文献   
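
As background, a minimal rejection-ABC sketch in Python (illustrative only — diyabc uses far richer simulators and summary statistics; the crude Poisson simulator, tolerance, and all numeric settings below are assumptions made for the example):

```python
# Rejection ABC toy example: estimate theta = 4*Ne*mu from the number of
# segregating sites S, with a deliberately crude simulator
# (S ~ Poisson(theta * a_n), the Watterson expectation) and one summary statistic.
import numpy as np

rng = np.random.default_rng(0)
n_seq = 20
a_n = np.sum(1.0 / np.arange(1, n_seq))      # harmonic constant for n = 20 sequences
theta_true = 5.0
S_obs = rng.poisson(theta_true * a_n)        # "observed" summary statistic

n_sims, tol = 100_000, 2
theta_prior = rng.uniform(0, 20, n_sims)     # flat prior on theta
S_sim = rng.poisson(theta_prior * a_n)       # one simulated data set per prior draw
accepted = theta_prior[np.abs(S_sim - S_obs) <= tol]

print(f"posterior mean ~ {accepted.mean():.2f}, "
      f"95% interval ~ {np.percentile(accepted, [2.5, 97.5]).round(2)}")
```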

3.
Reliable and accurate information on animal abundance is fundamental for the conservation and management of wildlife. Recently, a number of innovative devices (such as camera traps) have been widely used in field surveys and have largely improved survey efficiency. However, these devices often constitute noninstantaneous point surveys, resulting in the multiple counts of the same animal individuals within a single sampling occasion (i.e., false-positive errors). Many commonly-used statistical models do not explicitly account for the false-positive error, with its effects on estimates being poorly understood. Here, I tested the performance of the commonly-used Poisson-binomial N-mixture and the Royle-Nichols model in the presence of both false-positive and negative errors (i.e., individuals in a population might not be detected). I also implemented the Poisson-Poisson mixture model in the Bayesian framework to evaluate its reliability. The results of the simulation using random walks based on Ornstein-Uhlenbeck processes showed that the Poisson-binomial model was not robust to false-positive errors. In comparison, the Royle-Nichols and Poisson-Poisson models provided reasonable estimates of the number of animals whose home range included the survey point. However, the number of animals whose home range included the survey point is inherently influenced by the size of animal home ranges, and thus cannot be used as a surrogate of animal density. Although the N-mixture and Royle-Nichols models are widely used, their utility might be restricted by this limitation. In conclusion, studies should clearly define the objective of surveys and carefully consider whether the models used are valid.  相似文献   
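
To make the Poisson-binomial N-mixture structure concrete, a minimal sketch of its likelihood and a maximum-likelihood fit (illustrative only; it ignores false positives, animal movement, and the home-range issue discussed above, and all settings are arbitrary):

```python
# Poisson-binomial N-mixture model: latent site abundance N ~ Poisson(lambda),
# repeated counts y | N ~ Binomial(N, p); fit by marginalising over N.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import poisson, binom

rng = np.random.default_rng(1)
n_sites, n_visits, lam_true, p_true = 100, 4, 3.0, 0.5
N = rng.poisson(lam_true, n_sites)                          # latent abundance per site
y = rng.binomial(N[:, None], p_true, (n_sites, n_visits))   # repeated counts

def negloglik(theta, y, n_max=60):
    lam = np.exp(theta[0])                                  # log scale keeps lambda > 0
    p = 1.0 / (1.0 + np.exp(-theta[1]))                     # logit scale keeps 0 < p < 1
    ns = np.arange(n_max + 1)
    prior = poisson.pmf(ns, lam)                            # P(N = n)
    ll = 0.0
    for yi in y:
        lik_n = np.prod(binom.pmf(yi[:, None], ns[None, :], p), axis=0)  # P(y_i | N = n)
        ll += np.log(np.sum(lik_n * prior) + 1e-300)        # marginalise over N
    return -ll

fit = minimize(negloglik, x0=[0.0, 0.0], args=(y,), method="Nelder-Mead")
print("lambda-hat:", np.exp(fit.x[0]).round(2), "p-hat:", (1 / (1 + np.exp(-fit.x[1]))).round(2))
```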

4.
The selection of the most appropriate model for an ecological risk assessment depends on the application, the data and resources available, the knowledge base of the assessor, the relevant endpoints, and the extent to which the model deals with uncertainty. Since ecological systems are highly variable and our knowledge of model input parameters is uncertain, it is important that models include treatments of uncertainty and variability, and that results are reported in this light. In this paper we discuss treatments of variation and uncertainty in a variety of population models. In ecological risk assessments, the risk relates to the probability of an adverse event in the context of environmental variation. Uncertainty relates to ignorance about parameter values, e.g., measurement error and systematic error. An assessment of the full distribution of risks, under variability and parameter uncertainty, will give the most comprehensive and flexible endpoint. In this paper we present the rationale behind probabilistic risk assessment, identify the sources of uncertainty relevant for risk assessment and provide an overview of a range of population models. While all of the models reviewed have some utility in ecology, some have more comprehensive treatments of uncertainty than others. We identify the models that allow probabilistic assessments and sensitivity analyses, and we offer recommendations for further developments that aim towards more comprehensive and reliable ecological risk assessments for populations.  相似文献   

5.
CoMFA and CoMSIA studies on fluorinated hexahydropyrimidine derivatives
3D-QSAR models of a series of fluorinated hexahydropyrimidine derivatives with cytotoxic activities have been developed using CoMFA and CoMSIA. These models provide a better understanding of the mechanism of action and structure–activity relationship of these compounds. Using leave-one-out (LOO) cross-validation, the best predictive CoMFA model was achieved with 3 as the optimum number of components, giving a non-cross-validated r2 value of 0.978, a standard error of estimate of 0.059, and an F value of 144.492. Similarly, the best predictive CoMSIA model was derived with 4 components, an r2 value of 0.999, an F value of 4381.143, and a standard error of estimate of 0.011. The above models will guide the design and synthesis of novel hexahydropyrimidines with enhanced potency and selectivity.
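
For readers unfamiliar with the LOO statistic referred to above, a small sketch of how a cross-validated r2 (commonly called q2) is computed (illustrative; a plain linear model stands in for the PLS regression actually used in CoMFA/CoMSIA, and the data are synthetic):

```python
# Leave-one-out cross-validated r^2 (q^2) = 1 - PRESS / total sum of squares.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import LeaveOneOut

rng = np.random.default_rng(3)
X = rng.normal(size=(25, 4))                                        # stand-in descriptors
y = X @ np.array([1.0, -0.5, 0.3, 0.0]) + rng.normal(0, 0.2, 25)    # synthetic activities

press = 0.0                                        # predictive residual sum of squares
for train, test in LeaveOneOut().split(X):
    model = LinearRegression().fit(X[train], y[train])
    press += (y[test][0] - model.predict(X[test])[0]) ** 2

q2 = 1 - press / np.sum((y - y.mean()) ** 2)       # cross-validated r^2
print("q2 =", round(q2, 3))
```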

6.
The development of color vision models has allowed the appraisal of color vision independently of human experience. These models are now widely used in ecology and evolution studies. However, in common scenarios of color measurement, color vision models may generate spurious results. Here I present a guide to color vision modeling (Chittka (1992, Journal of Comparative Physiology A, 170, 545) color hexagon, Endler & Mielke (2005, Journal of the Linnean Society, 86, 405) model, and the linear and log-linear receptor noise limited models (Vorobyev & Osorio 1998, Proceedings of the Royal Society B, 265, 351; Vorobyev et al. 1998, Journal of Comparative Physiology A, 183, 621)) using a series of simulations, present a unified framework that extends and generalizes current models, and provide an R package to facilitate the use of color vision models. When the specific requirements of each model are met, between-model results are qualitatively and quantitatively similar. However, under many common scenarios of color measurement, models may generate spurious values. For instance, models that log-transform data and use relative photoreceptor outputs are prone to generate spurious outputs when the stimulus photon catch is smaller than the background photon catch; and models may generate unrealistic predictions when the background is chromatic (e.g. leaf reflectance) and the stimulus is an achromatic low-reflectance spectrum. Nonetheless, despite differences, all three models are founded on a similar set of assumptions. Based on that, I provide a new formulation that accommodates and extends models to any number of photoreceptor types, offers flexibility to build user-defined models, and allows users to easily adjust chromaticity diagram sizes to account for changes when using different numbers of photoreceptors.
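
As a point of reference for the receptor-noise-limited models mentioned above, the chromatic distance for a trichromat is usually written as follows (my transcription of the standard Vorobyev–Osorio form, not code from the R package):

\[
\Delta S^2 \;=\; \frac{e_1^2\,(\Delta f_3-\Delta f_2)^2 + e_2^2\,(\Delta f_3-\Delta f_1)^2 + e_3^2\,(\Delta f_1-\Delta f_2)^2}{(e_1 e_2)^2 + (e_1 e_3)^2 + (e_2 e_3)^2},
\]

where \(\Delta f_i\) is the signal of photoreceptor \(i\) (in the log-linear version, the natural log of the stimulus-to-background quantum-catch ratio) and \(e_i\) is the noise of that receptor channel.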

7.
Model-free analysis is a technique commonly used within the field of NMR spectroscopy to extract atomic resolution, interpretable dynamic information on multiple timescales from the R1, R2, and steady-state NOE. Model-free approaches employ two disparate areas of data analysis, the discipline of mathematical optimisation, specifically the minimisation of a χ2 function, and the statistical field of model selection. By searching through a large number of model-free minimisations, which were set up using synthetic relaxation data whereby the true underlying dynamics is known, certain model-free models have been identified to, at times, fail. This has been characterised as either the internal correlation times, τe, τf, or τs, or the global correlation time parameter, local τm, heading towards infinity, the result being that the final parameter values are far from the true values. In a number of cases the minimised χ2 value of the failed model is significantly lower than that of all other models and, hence, will be the model which is chosen by model selection techniques. If these models are not removed prior to model selection the final model-free results could be far from the truth. By implementing a series of empirical rules involving inequalities these models can be specifically isolated and removed. Model-free analysis should therefore consist of three distinct steps: model-free minimisation, model-free model elimination, and finally model-free model selection. Failure has also been identified to affect the individual Monte Carlo simulations used within error analysis. Each simulation involves an independent randomised relaxation data set and model-free minimisation, thus simulations suffer from exactly the same types of failure as model-free models. Therefore, to prevent these outliers from causing a significant overestimation of the errors the failed Monte Carlo simulations need to be culled prior to calculating the parameter standard deviations.
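
For context, the simplest of the model-free models referred to above is the original Lipari–Szabo spectral density (written here as a reminder of what S2, τe, and τm mean; the extended models add the slower τs/τf terms):

\[
J(\omega) \;=\; \frac{2}{5}\left[\frac{S^2\,\tau_m}{1+(\omega\tau_m)^2} + \frac{(1-S^2)\,\tau}{1+(\omega\tau)^2}\right], \qquad \frac{1}{\tau} = \frac{1}{\tau_m} + \frac{1}{\tau_e},
\]

and the fitted R1, R2, and NOE values are combinations of J(ω) evaluated at a handful of fixed frequencies.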

8.
Most approaches to species delimitation to date have considered divergence-only models. Although these models are appropriate for allopatric speciation, their failure to incorporate many of the population-level processes that drive speciation, such as gene flow (e.g., in sympatric speciation), places an unnecessary limit on our collective understanding of the processes that produce biodiversity. To consider these processes while inferring species boundaries, we introduce the R-package delimitR and apply it to identify species boundaries in the reticulate taildropper slug (Prophysaon andersoni). Results suggest that secondary contact is an important mechanism driving speciation in this system. By considering process, we both avoid erroneous inferences that can be made when population-level processes such as secondary contact drive speciation but only divergence is considered, and gain insight into the process of speciation in terrestrial slugs. Further, we apply delimitR to three published empirical datasets and find results corroborating previous findings. Finally, we evaluate the performance of delimitR using simulation studies, and find that error rates are near zero when comparing models that include lineage divergence and gene flow for three populations with a modest number of Single Nucleotide Polymorphisms (SNPs; 1500) and moderate divergence times (<100,000 generations). When we apply delimitR to a complex model set (i.e., including divergence, gene flow, and population size changes), error rates are moderate (∼0.15; 10,000 SNPs), and, when present, misclassifications occur among highly similar models.  相似文献   

9.
Aim Spatial autocorrelation is a frequent phenomenon in ecological data and can affect estimates of model coefficients and inference from statistical models. Here, we test the performance of three different simultaneous autoregressive (SAR) model types (spatial error = SARerr, lagged = SARlag and mixed = SARmix) and common ordinary least squares (OLS) regression when accounting for spatial autocorrelation in species distribution data using four artificial data sets with known (but different) spatial autocorrelation structures. Methods We evaluate the performance of SAR models by examining spatial patterns in model residuals (with correlograms and residual maps), by comparing model parameter estimates with true values, and by assessing their type I error control with calibration curves. We calculate a total of 3240 SAR models and illustrate how the best models [in terms of minimum residual spatial autocorrelation (minRSA), maximum model fit (R2), or Akaike information criterion (AIC)] can be identified using model selection procedures. Results Our study shows that the performance of SAR models depends on model specification (i.e. model type, neighbourhood distance, coding styles of spatial weights matrices) and on the kind of spatial autocorrelation present. SAR model parameter estimates might not be more precise than those from OLS regressions in all cases. SARerr models were the most reliable SAR models and performed well in all cases (independent of the kind of spatial autocorrelation induced and whether models were selected by minRSA, R2 or AIC), whereas OLS, SARlag and SARmix models showed weak type I error control and/or unpredictable biases in parameter estimates. Main conclusions SARerr models are recommended for use when dealing with spatially autocorrelated species distribution data. SARlag and SARmix might not always give better estimates of model coefficients than OLS, and can thus generate bias. Other spatial modelling techniques should be assessed comprehensively to test their predictive performance and accuracy for biogeographical and macroecological research.  相似文献   
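
For readers who want the model forms behind the abbreviations, the three SAR variants are conventionally specified as follows (standard forms with a spatial weights matrix W; this is my summary, not an excerpt from the paper):

\begin{align*}
\text{SAR}_{\text{err}} &: \quad Y = X\beta + u, \qquad u = \lambda W u + \varepsilon,\\
\text{SAR}_{\text{lag}} &: \quad Y = \rho W Y + X\beta + \varepsilon,\\
\text{SAR}_{\text{mix}} &: \quad Y = \rho W Y + X\beta + W X \gamma + \varepsilon,
\end{align*}

where λ and ρ are spatial autoregression coefficients and ε is independent error.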

10.
Aim: Studies have typically employed species–area relationships (SARs) from sample areas to fit either the power relationship or the logarithmic (exponential) relationship. However, the plots from empirical data often fall between these models. This article proposes two complementary and hybrid models as solutions to the controversy regarding which model best fits sample-area SARs. Methods: The two models (equations not reproduced in this listing) express SA, the number of species in an area A, in terms of z, b, c1 and c2, predetermined parameters found by calculation, and d and n, parameters to be fitted. The number of parameters is reduced from six to two by fixing the model at either end of the scale window of the data set, a step that is justified by the condition that the error or the bias, or both, in the first and the last data points is negligible. The new hybrid models as well as the power model and the logarithmic model are fitted to 10 data sets. Results: The two proposed models fit well not only to Arrhenius' and Gleason's data sets, but also to the other six data sets. They also provide a good fit to data sets that follow a sigmoid (or triphasic) shape in log–log space and to data sets that do not fall between the power model and the logarithmic model. The log-transformation of the dependent variable, S, does not affect the curve fit appreciably, although it enhances the performance of the new models somewhat. Main conclusions: Sample-area SARs have previously been shown to be convex upward, convex downward (concave), sigmoid and inverted sigmoid in log–log space. The new hybrid models successfully describe data sets with all these curve shapes, and should therefore also produce good fits to what are termed triphasic SARs.
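
For reference, the two classical endpoint models named above are conventionally written as:

\[
S_A = c\,A^{z} \quad \text{(power, Arrhenius)}, \qquad S_A = c + z\,\log A \quad \text{(logarithmic, Gleason)},
\]

with the hybrid models of the paper lying between these two forms.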

11.
We study an epidemiological model which assumes that the susceptibility after a primary infection is r times the susceptibility before a primary infection. For r = 0 (r = 1) this is the SIR (SIS) model. For r > 1 + (μ/α) this model shows backward bifurcations, where μ is the death rate and α is the recovery rate. We show for the first time that for such models we can give an expression for the minimum effort required to eradicate the infection if we concentrate on control measures affecting the transmission rate constant β. This eradication effort is explicitly expressed in terms of α, r, and μ. As in models without backward bifurcation it can be interpreted as a reproduction number, but not necessarily as the basic reproduction number. We define the relevant reproduction numbers for this purpose. The eradication effort can be estimated from the endemic steady state. The classical basic reproduction number R0 is smaller than the eradication effort for r > 1 + (μ/α) and equal to the effort for other values of r. The method we present is relevant to the whole class of compartmental models with backward bifurcation. Dedicated to Karl Peter Hadeler on the occasion of his 70th birthday.
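
One standard way to write such a model (in proportions, with births balancing deaths at rate μ; this parameterisation is an illustration consistent with the description above, not necessarily the authors' exact system):

\begin{align*}
S' &= \mu - \beta S I - \mu S,\\
I' &= \beta S I + r\,\beta R I - (\alpha + \mu)\, I,\\
R' &= \alpha I - r\,\beta R I - \mu R,
\end{align*}

so that r = 0 removes reinfection (SIR), r = 1 makes recovered individuals as susceptible as naive ones (SIS-like behaviour), and r > 1 + μ/α is the backward-bifurcation regime discussed above.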

12.
Summary: We discuss the issue of identifiability of models for multiple dichotomous diagnostic tests in the absence of a gold standard (GS) test. Data arise as multinomial or product-multinomial counts depending upon the number of populations sampled. Models are generally posited in terms of population prevalences, test sensitivities and specificities, and test dependence terms. It is commonly believed that if the degrees of freedom in the data meet or exceed the number of parameters in a fitted model then the model is identifiable. Goodman (1974, Biometrika 61, 215–231) established long ago that this is not the case. We discuss currently available models for multiple tests and argue in favor of an extension of a model that was developed by Dendukuri and Joseph (2001, Biometrics 57, 158–167). Subsequently, we further develop Goodman's technique, and make geometric arguments to give further insight into the nature of models that lack identifiability. We present illustrations using simulated and real data.
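
The bookkeeping behind the "commonly believed" criterion is easy to state. With T dichotomous tests applied in P populations, the data provide P(2^T − 1) degrees of freedom, while a conditional-independence model has P prevalences plus T sensitivities and T specificities:

\[
\underbrace{P\,(2^{T}-1)}_{\text{degrees of freedom}} \quad \text{versus} \quad \underbrace{P + 2T}_{\text{parameters}} .
\]

For example, two tests in one population give 3 degrees of freedom for 5 parameters, while three tests give 7 for 7; adding test-dependence terms increases the parameter count further, and, as Goodman showed, a nonnegative difference does not by itself guarantee identifiability.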

13.
14.
Abstract

Intermolecular interaction is investigated for an isomeric pair of fluoropropanes, CH3CF2CF3 (HFC-245cb, CB) and CH2FCF2CHF2 (HFC-245ca, CA). CB has a larger dipole moment than CA. This may suggest that CB has a larger intermolecular attractive interaction than CA; the reverse is, however, found from the experimental data: normal boiling point, critical temperature, and heat of vaporization. Systematic ab initio calculations have been performed for both the CB dimer and the CA dimer and confirm that the former has a smaller attractive interaction than the latter.

On the basis of these calculations, analytic functions have been constructed as the pair potential models for the two isomers. Each of these models has 11 Lennard-Jones and Coulomb interaction sites in the molecule. The present models can explain why the CB dimer has a smaller attractive interaction than the CA dimer; they can easily be extended to a series of fluoropropanes, making systematic molecular simulation studies possible.
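
The generic form of such a site–site pair potential, for orientation (the standard Lennard-Jones-plus-Coulomb expression; the actual site parameters of the CA/CB models are not reproduced here):

\[
U_{12} \;=\; \sum_{i \in 1}\sum_{j \in 2}\left\{ 4\varepsilon_{ij}\!\left[\left(\frac{\sigma_{ij}}{r_{ij}}\right)^{12}-\left(\frac{\sigma_{ij}}{r_{ij}}\right)^{6}\right] + \frac{q_i q_j}{4\pi\varepsilon_0 r_{ij}} \right\},
\]

summed over the 11 interaction sites of each molecule, with r_{ij} the site–site distance, σ and ε the Lennard-Jones parameters, and q the partial charges.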

15.
We present an approach to estimate gross primary production (GPP) using a remotely sensed biophysical vegetation product (fraction of absorbed photosynthetically active radiation, FAPAR) from the European Commission Joint Research Centre (JRC) in conjunction with GPP estimates from eddy covariance measurement towers in Europe. By analysing the relationship between the cumulative growing season FAPAR and annual GPP by vegetation type, we find that the former can be used to accurately predict the latter. The root mean square error of prediction is of the order of 250 gC m−2 yr−1. The cumulative growing season FAPAR integrates over a number of effects relevant for GPP such as the length of the growing season, the vegetation's response to environmental conditions and the amount of light harvested that is available for photosynthesis. We corroborate the proposed GPP estimate (denoted FAPAR-based productivity assessment + land cover, FPA+LC) on the continental scale with results from the MOD17+ radiation-use efficiency model, an artificial neural network up-scaling approach (ANN) and the Lund–Potsdam–Jena managed Land biosphere model (LPJmL). The closest agreement of the mean spatial GPP pattern among the four models is between FPA+LC and ANN (R2 = 0.74). At least some of the discrepancy between FPA+LC and the other models results from biases of meteorological forcing fields for MOD17+, ANN and LPJmL. Our analysis further implies that meteorological information is to a large degree redundant for GPP estimation when using the JRC-FAPAR. A major advantage of the FPA+LC approach presented in this paper lies in its simplicity and in the fact that it requires no additional meteorological input driver data that commonly introduce substantial uncertainty. We find that results from different data-oriented models may be robust enough to evaluate process-oriented models regarding the mean spatial pattern of GPP, while there is too little consensus among the diagnostic models for such a purpose regarding inter-annual variability.
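
The core of the FPA+LC idea — one linear relation between annual GPP and cumulative growing-season FAPAR per vegetation type — can be sketched in a few lines (synthetic numbers; slopes, classes, and noise levels are placeholders, not values from the study):

```python
# Per-vegetation-type linear fit of annual GPP on cumulative growing-season FAPAR.
import numpy as np

rng = np.random.default_rng(4)
slopes = {"forest": 4.0, "grassland": 2.5, "cropland": 3.0}    # placeholder slopes

for veg, slope_true in slopes.items():
    fapar_sum = rng.uniform(50, 250, 40)                       # cumulative FAPAR per site-year
    gpp = slope_true * fapar_sum + rng.normal(0, 200, 40)      # annual GPP, gC m-2 yr-1
    slope, intercept = np.polyfit(fapar_sum, gpp, 1)           # vegetation-type-specific fit
    pred = slope * fapar_sum + intercept
    rmse = np.sqrt(np.mean((gpp - pred) ** 2))
    print(f"{veg}: slope = {slope:.2f}, RMSE = {rmse:.0f} gC m-2 yr-1")
```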

16.
Musculoskeletal modelling is a methodology used to investigate joint contact forces during a movement. High accuracy in the estimation of the hip or knee joint contact forces can be obtained with subject-specific models. However, construction of subject-specific models remains time-consuming and expensive. The purpose of this systematic review of the literature was to identify what alterations can be made to generic (i.e. literature-based, without any subject-specific measurement other than body size and weight) musculoskeletal models to obtain a better estimation of the joint contact forces. The impact of these alterations on the accuracy of the estimated joint contact forces was appraised. The systematic search yielded 141 articles, and 24 papers were included in the review. Different strategies of alterations were found: skeletal and joint model (e.g. number of degrees of freedom, knee alignment), muscle model (e.g. Hill-type muscle parameters, level of muscular redundancy), and optimisation problem (e.g. objective function, design variables, constraints). All these alterations had an impact on joint contact force accuracy, thus demonstrating the potential for improving the model predictions without necessarily involving costly and time-consuming medical imaging. However, due to discrepancies in the reported evidence about this impact and despite the high quality of the reviewed studies, it was not possible to highlight any trend defining which alteration had the largest impact.
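
To make the "optimisation problem" alterations concrete, most such models resolve muscular redundancy with a static optimisation of roughly this form (a generic textbook statement, not taken from any particular reviewed paper):

\[
\min_{F_1,\dots,F_m}\; \sum_{i=1}^{m}\left(\frac{F_i}{F_i^{\max}}\right)^{p}
\quad\text{subject to}\quad \sum_{i=1}^{m} r_i F_i = M_{\text{joint}},\qquad 0 \le F_i \le F_i^{\max},
\]

after which the joint contact force follows from the equilibrium of muscle and external forces; the reviewed alterations change the objective function (e.g. the exponent p), the design variables, or the constraints.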

17.
Golden‐cheeked Warblers (Setophaga chrysoparia) are endangered songbirds that breed exclusively in the Ashe juniper (Juniperus ashei) and oak (Quercus spp.) woodlands of central Texas. Despite being the focus of numerous studies, we still know little about the size of the range‐wide breeding population and how density varies across the spectrum of juniper co‐dominated woodlands. Models that have been tested and shown to be accurate are needed to help develop management and conservation guidelines. We evaluated the accuracy and bias of density estimates from binomial mixture models, the dependent double‐observer method, and distance sampling by comparing them to actual densities determined by intensive territory monitoring on plots in the Balcones Canyonlands Preserve, Austin, Texas. We found that the binomial mixture models consistently overestimated density by 1.1–3.2 times (actual density = 0.07–0.46 males/ha), and the other two models overestimated by 1.1–29.8 times at low density and underestimated by 0.5–0.9 times at high density plots (actual density = 0.01–0.46 males/ha). The magnitude of error for all models was greatest at sites with few or no birds (<0.15 males/ha), with model performance improving as actual density increased. These non‐linear relationships indicate a lack of sensitivity with respect to true changes in density. Until systematic evaluation demonstrates that models such as those we tested provide accurate and unbiased density estimates for a given species over space and time, we recommend additional field tests to validate model‐based estimates. Continued model validation and refinement of point‐count methods are needed until accurate estimates are obtained across the density spectrum for Golden‐cheeked Warblers and other songbird species.  相似文献   

18.
Sightability models are binary logistic-regression models used to estimate and adjust for visibility bias in wildlife-population surveys. Like many models in wildlife and ecology, sightability models are typically developed from small observational datasets with many candidate predictors. Aggressive model-selection methods are often employed to choose a best model for prediction and effect estimation, despite evidence that such methods can lead to overfitting (i.e., selected models may describe random error or noise rather than true predictor–response curves) and poor predictive ability. We used moose (Alces alces) sightability data from northeastern Minnesota (2005–2007) as a case study to illustrate an alternative approach, which we refer to as degrees-of-freedom (df) spending: sample-size guidelines are used to determine an acceptable level of model complexity and then a pre-specified model is fit to the data and used for inference. For comparison, we also constructed sightability models using Akaike's Information Criterion (AIC) step-down procedures and model averaging (based on a small set of models developed using df-spending guidelines). We used bootstrap procedures to mimic the process of model fitting and prediction, and to compute an index of overfitting, expected predictive accuracy, and model-selection uncertainty. The index of overfitting increased 13% when the number of candidate predictors was increased from three to eight and a best model was selected using step-down procedures. Likewise, model-selection uncertainty increased when the number of candidate predictors increased. Model averaging (based on R = 30 models with 1–3 predictors) effectively shrunk regression coefficients toward zero and produced similar estimates of precision to our 3-df pre-specified model. As such, model averaging may help to guard against overfitting when too many predictors are considered (relative to available sample size). The set of candidate models will influence the extent to which coefficients are shrunk toward zero, which has implications for how one might apply model averaging to problems traditionally approached using variable-selection methods. We often recommend the df-spending approach in our consulting work because it is easy to implement and it naturally forces investigators to think carefully about their models and predictors. Nonetheless, similar concepts should apply whether one is fitting 1 model or using multi-model inference. For example, model-building decisions should consider the effective sample size, and potential predictors should be screened (without looking at their relationship to the response) for missing data, narrow distributions, collinearity, potentially overly influential observations, and measurement errors (e.g., via logical error checks). © 2011 The Wildlife Society.  相似文献   

19.
Robust and reliable covariance estimates play a decisive role in finance and many other applications. An important class of estimators is based on factor models. Here, we show by extensive Monte Carlo simulations that covariance matrices derived from the statistical Factor Analysis model exhibit a systematic error, which is similar to the well-known systematic error of the spectrum of the sample covariance matrix. Moreover, we introduce the Directional Variance Adjustment (DVA) algorithm, which diminishes the systematic error. In a thorough empirical study for the US, European, and Hong Kong stock markets, we show that our proposed method leads to improved portfolio allocation.
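
For background, a minimal sketch of the kind of statistical factor-model covariance estimate being discussed (principal-component factors plus a diagonal of specific variances; a generic illustration, not the DVA algorithm):

```python
# Factor-model covariance: Sigma ~ B B' + diag(psi), with B from the top-k
# principal components of the sample covariance.
import numpy as np

rng = np.random.default_rng(2)
T, p, k = 500, 50, 3
B_true = rng.normal(0, 1, (p, k))
factors = rng.normal(0, 1, (T, k))
X = factors @ B_true.T + rng.normal(0, 0.5, (T, p))      # simulated return panel

S = np.cov(X, rowvar=False)                              # sample covariance (p x p)
vals, vecs = np.linalg.eigh(S)                           # eigenvalues in ascending order
top = np.argsort(vals)[::-1][:k]
B_hat = vecs[:, top] * np.sqrt(vals[top])                # loadings of the k PCA factors
psi = np.clip(np.diag(S) - np.sum(B_hat**2, axis=1), 1e-6, None)   # specific variances
Sigma_fm = B_hat @ B_hat.T + np.diag(psi)                # factor-model covariance

print("condition number, sample vs factor model:",
      round(np.linalg.cond(S), 1), "vs", round(np.linalg.cond(Sigma_fm), 1))
```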

20.
Menggang Yu & Bin Nan (2010). Biometrics 66(2), 405–414.
Summary: In large cohort studies, it often happens that some covariates are expensive to measure and hence measured only on a validation set. On the other hand, relatively cheap but error-prone measurements of the covariates are available for all subjects. The regression calibration (RC) estimation method (Prentice, 1982, Biometrika 69, 331–342) is popular for analyzing such data and has been applied to the Cox model by Wang et al. (1997, Biometrics 53, 131–145) under normal measurement error and rare disease assumptions. In this article, we consider the RC estimation method for the semiparametric accelerated failure time model with covariates subject to measurement error. Asymptotic properties of the proposed method are investigated under a two-phase sampling scheme for validation data that are selected via stratified random sampling, resulting in neither independent nor identically distributed observations. We show that the estimates converge to some well-defined parameters. In particular, unbiased estimation is feasible under additive normal measurement error models for normal covariates and under Berkson error models. The proposed method performs well in finite-sample simulation studies. We also apply the proposed method to a depression mortality study.
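
A minimal sketch of the regression-calibration idea (deliberately simplified: no censoring, simple random validation sampling, and an ordinary linear model for log survival time standing in for the AFT machinery; all numbers are arbitrary):

```python
# Regression calibration: replace the error-prone covariate W by E[X | W],
# estimated on the validation subset, then fit the outcome model.
import numpy as np

rng = np.random.default_rng(0)
n, beta_true = 5000, 0.8
x = rng.normal(0, 1, n)                      # expensive "true" covariate
w = x + rng.normal(0, 0.7, n)                # cheap, error-prone surrogate
log_t = beta_true * x + rng.normal(0, 1, n)  # log survival time (no censoring here)

val = rng.choice(n, 500, replace=False)      # validation subset with x measured
a, b = np.polyfit(w[val], x[val], 1)         # calibration model: E[X | W] ~ a*W + b
x_cal = a * w + b                            # calibrated covariate for everyone

naive = np.polyfit(w, log_t, 1)[0]           # attenuated slope from the surrogate
rc = np.polyfit(x_cal, log_t, 1)[0]          # regression-calibration slope
print("naive:", round(naive, 3), " RC:", round(rc, 3), " true:", beta_true)
```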
